Agentic warfare and the role of the human
Will those who wield the most advanced artificial intelligence (AI) dominate the future of warfare? That is the implicit wager behind growing investments in military AI across capitals from Washington to Beijing. According to The Brookings Institution, the United States tripled its spending on AI from 2022 to 2023. Most of it came from the Pentagon, and that spending continues to increase.
The vision is one of “agentic warfare,” in which autonomous systems, powered by increasingly capable AI, take on critical battlefield roles, from surveillance and targeting to decision support and, perhaps one day, command. In this scenario, human involvement becomes not just optional, but marginal.
This techno-determinist outlook has gained traction among defence planners and Silicon Valley entrepreneurs, including Alex Wang, co-founder and CEO of Scale AI, a key provider of training data to OpenAI, Google, Microsoft, and Meta. Writing in The Economist on March 4, Wang notes, “With AI agents at the helm, battle strategies will adapt in real time to capitalise on enemy weaknesses — moving from first strike to decisive victory before technologically inferior forces even grasp that the game is under way.”
Scale AI has secured a multimillion-dollar contract with the US Department of Defense (DOD), joining companies like Anduril and Microsoft in the Pentagon’s Thunderforge project. According to the DOD’s Defense Innovation Unit, Thunderforge aims to “provide AI-assisted planning capabilities, decision support tools, and automated workflows, enabling military planners to navigate evolving operational environments.” The project is designed to ensure that critical decisions in future conflict scenarios can be made at so-called “machine speed.”
But warfare is not just a contest of machines. It is a fundamentally human enterprise, shaped by judgement, culture, politics, and ethics. To suggest that agentic systems alone will determine outcomes is to indulge in a form of technological mysticism. Worse, it risks creating systems we do not fully understand, deploying them in contexts we cannot fully control.
Shifting to agentic AI in warfare
The idea of agentic warfare deserves greater scrutiny. At its core are technological developments that capture a shift to a new generation of AI — one that goes beyond today’s familiar and widely used tools, such as ChatGPT, which operate within fixed parameters and await user prompts.
Agentic AI systems are designed not only to respond, but to take action on their own to reach a goal or objective. A generative tool might use a traveller’s preferences to suggest the best time to visit Italy; an agentic system would proceed to book the flights, reserve the hotel, and adjust the itinerary.
In a military context, agentic AI would not simply assist commanders, but would influence or even make battlefield decisions, needing humans only for final approval. Before such a change occurs, it is important to look closely at what this kind of technology means for the future of war and to determine how humans will stay in control.
Such a technological shift is currently being researched and tested for use in defence environments. Already, swarms of drones are being designed to coordinate autonomously. AI is increasingly used to filter intelligence, prioritize threats, and even suggest courses of action in command centres. Future iterations may go further, reasoning over incomplete data, anticipating adversarial behaviour, and proposing adaptive strategies in real time.
This autonomous functioning raises concerns. Key among them is the question of control: who is accountable when a system makes a mistake, causes an escalation in conflict, or acts unpredictably? How do we ensure that the actions of AI agents align with human intent, particularly under conditions of uncertainty, deception, or adversarial interference?
These concerns are not new. Discussions about autonomous weapons systems under the United Nations Convention on Certain Conventional Weapons (CCW) began in 2014, well before large language models and reinforcement-learning agents captured the public imagination. For years, these deliberations were confined to a narrow circle of diplomats, civil society advocates, and a handful of technologists. I have followed them closely since 2015, and while the conversation has expanded significantly, particularly with the rise of responsible AI frameworks, it has struggled to keep pace with technical change.
Much of the focus has been on “meaningful human control,” a principle that requires clear commitment from states. Is control meaningful if a human supervises an autonomous drone fleet but cannot intervene in real time? Is it meaningful if an operator approves a targeting decision made by an opaque neural network whose reasoning they cannot grasp? These are not theoretical dilemmas. They are the daily design choices of engineers and the policy puzzles of defence bureaucrats.
The problem is exacerbated by a gap between technical and diplomatic communities. AI researchers speak of alignment, reward hacking, and emergent behaviour. Diplomats speak of norms, accountability, and humanitarian law. Rarely do these vocabularies intersect. Yet they must, because the risks of agentic warfare are not confined to coding errors or rogue drones. They extend to strategic stability, alliance cohesion, and the moral legitimacy of force.
Alignment challenges
In AI research, alignment refers to the process of ensuring that a system’s actions remain consistent with human goals and values. But achieving alignment is difficult, especially when systems operate in dynamic environments, learn from complex data, or interact with other agents. Misalignment can take subtle forms; an agent optimizing for proxy metrics could ignore uncommon situations (“rare edge cases”) or learn to deceive its evaluators. In military contexts, these failures can have lethal consequences.
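The proxy-metric problem can be made concrete with a toy illustration. The sketch below is not drawn from any real system: the scenario names, frequencies, harm weights, and the two policies are invented assumptions, chosen only to show how a system that scores best on an average-accuracy proxy can still be the worse choice once rare but costly cases are counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    frequency: float      # hypothetical share of all encounters
    harm_if_wrong: float  # hypothetical cost of an error, arbitrary units


# Assumed for illustration: 99% routine cases, 1% rare but costly cases.
SCENARIOS = [
    Scenario("routine", 0.99, 1.0),
    Scenario("rare_edge_case", 0.01, 1000.0),
]

# Hypothetical probability that each policy acts correctly in each scenario.
POLICIES = {
    "proxy_optimized": {"routine": 0.99, "rare_edge_case": 0.0},
    "cautious": {"routine": 0.95, "rare_edge_case": 0.95},
}


def proxy_score(policy: dict) -> float:
    """Average accuracy across encounters: the proxy an evaluator might track."""
    return sum(s.frequency * policy[s.name] for s in SCENARIOS)


def expected_harm(policy: dict) -> float:
    """Frequency-weighted cost of errors: closer to what humans care about."""
    return sum(
        s.frequency * (1.0 - policy[s.name]) * s.harm_if_wrong for s in SCENARIOS
    )


best_by_proxy = max(POLICIES, key=lambda name: proxy_score(POLICIES[name]))
best_by_harm = min(POLICIES, key=lambda name: expected_harm(POLICIES[name]))

if __name__ == "__main__":
    for name, policy in POLICIES.items():
        print(name, round(proxy_score(policy), 4), round(expected_harm(policy), 4))
    print("chosen by proxy:", best_by_proxy)
    print("chosen by expected harm:", best_by_harm)
```

Under these assumed numbers, the policy that never handles the rare case correctly wins on average accuracy, while the more cautious policy carries far lower expected harm. The point is only that the choice of metric, not the quality of the optimization, determines which behaviour is rewarded.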
Even more worrying is the phenomenon of “alignment faking,” in which a system appears compliant during testing but behaves differently in deployment. This is of particular concern for discussions on responsible military AI, which focus on ensuring that systems are properly tested. Large language models already exhibit behaviours that shift depending on prompt framing, task phrasing, or oversight cues. As models become more agentic, capable of planning, retaining memory, and modifying themselves, the risk of emergent power-seeking behaviour grows. While these risks are still the subject of active debate in AI safety circles, they should not be ignored in military contexts, in which the cost of failure is war or conflict escalation.
Geopolitical dynamics raise the stakes. The strategic rivalry between the United States and China will likely continue. Both powers are investing heavily in military applications of AI, from logistics and decision-support to electronic warfare and autonomous platforms. While there is some bilateral dialogue, there is little trust, limited transparency, and no binding agreement on the responsible use of AI in military settings.
Bringing together policy and technical knowledge
What, then, can we do?
First, we must resist the idea that agentic warfare is a foregone conclusion. Rather, it is a choice that can be guided, shaped, and constrained by policy, law, and ethics. The role of the human must not be treated as a legacy constraint but as a core design principle. Systems must be built for oversight, auditability, and intervention, not just speed and scale.
Second, we need more dialogue among AI researchers, military planners, ethicists, and diplomats. Misalignment is both a technical and a governance problem. Building systems that reflect human intent requires an understanding of what that intent is, how the intent is expressed, and how it can be enforced across organizational and national boundaries.
Third, international frameworks must evolve. The CCW, while valuable as an incubator of ideas, has struggled to deliver concrete outcomes. The future of other initiatives, such as the US-led Political Declaration on Responsible Military Use of Artificial Intelligence and Autonomy, is currently uncertain under the administration of Donald Trump. The multi-stakeholder Responsible AI in the Military Domain (REAIM) process offers a pathway for knowledge-building between the policy and technical communities. However, it is neither as widely representative nor as likely to serve as a forum for an official agreement. Discussions at the United Nations General Assembly have sought to extend engagement to more states and could provide a venue for a more relevant framework. But even the preliminary discussions have faced pushback from some major states, which argue that the CCW is the appropriate forum for these talks.
Still, all these forums can contribute to norm-building and the confidence-building measures necessary to pave the way for clear commitments. However, to do more than produce consensus statements, they need to move toward mechanisms for verification, incident reporting, and cooperative risk reduction.
Finally, we must invest in the human element — not only in engineers and analysts, but in diplomats, ethicists, and civil society actors who can provide independent scrutiny.
There is no denying that AI will change warfare. But whether it serves or supplants human values remains up to humans. The future of conflict is not yet written in code. It is being negotiated, debated, and designed in real time.
Published in The Ploughshares Monitor Summer 2025
