Flux Working Paper No. 20

Accountability Is Not a Reason to Keep Humans in the Process

Ken Ruto · Flux (FluxImpact) · June 2026 · 8 min
Abstract

For well-specified, high-volume, rule-based processes, this paper argues, human involvement does not produce accountability so much as a bottleneck. It frames the reflex to keep a human in the loop as the institutionalisation of distrust in systems never designed well enough to deserve trust, and argues that better-designed, auditable systems make that reflex obsolete.

Keywords: accountability, automation, human-in-the-loop, process design, agentic AI

I want to make an argument that I expect most readers to reject, at least initially. I want to make it precisely because the initial rejection is instructive — it reveals an assumption so deeply embedded in how organisations think about process design that it is rarely examined.

The argument is this: for well-specified, high-volume, rule-based processes, human involvement does not produce accountability. It produces a bottleneck, and occasionally a point of failure that a well-specified automated system would not have. The instinct to keep humans in the loop for these processes is not caution. It is the institutionalisation of distrust in systems that were never designed well enough to deserve trust — and it has been, for most of the history of organisational life, the correct response to that distrust. But it is no longer the only possible response, because we now have the capability to build systems that deserve trust.

The Accountability Argument Examined

The standard argument for keeping humans in the loop is that accountability requires a human. When a human makes a decision — approves a transaction, signs a document, reviews a compliance output — there is a named person responsible for the quality of that decision. When software makes the decision, there is no named person. Accountability diffuses into the architecture of the system, the quality of the training data, the decisions of the engineers who built it.

This argument is correct in the cases where it applies. When the decision involves genuine judgment — the kind that depends on context, precedent, and interpretive skill that resists specification — a named human accountable for the outcome is not merely a governance requirement. It is a substantive input to the quality of the decision. A judge, a doctor, a senior manager approving a strategic exception — these are not bureaucratic checkpoints. They are nodes in the process where genuine expertise is applied, and where accountability is meaningful because it tracks a judgment that was actually made.

But the accountability argument is applied to a much broader category of decisions than the one it describes accurately. It is used to justify human involvement in processes where the decisions are not matters of judgment — where the policy is clear, the rules are specified, and the human's role is to check the request against rules that could be encoded. In these cases, the named human is not making a judgment. They are executing a process. And when they make errors — which they do, with the regularity that human attention and memory permit — those errors are not failures of judgment. They are failures of execution that a well-specified automated system would not produce.

The Error Rate Problem

Organisations that keep humans in the loop for rule-based processes rarely measure the error rate of the humans performing the rules check. This is a significant omission, because the error rate is often the clearest evidence that human involvement is not producing the accountability it is supposed to provide.

A finance officer reviewing expense reports against policy is, in practice, a human performing a pattern-matching task under time pressure with incomplete attention to each individual case. The approval rate is high because most requests are compliant. The error rate (requests that violate policy and are approved anyway, or requests that are compliant and are rejected) is non-trivial, and it is higher during high-volume periods, at the end of the month, and for reviewers managing large volumes across many submitters. [1]

A compliance officer reviewing NGO constitutions against a statutory checklist is performing a task for which the checklist is known and the document is available. The error rate — provisions that are missing but not flagged, provisions that are present but incorrectly assessed — is a function of the reviewer's familiarity with the specific requirements, the quality of their attention over the course of a long document, and their experience with the specific language patterns that do and do not satisfy each requirement.

In both cases, the named human is providing accountability in the formal sense: there is a person who can be asked "did you review this?" and who will say yes. The accountability is real in the process record. It is not always real in the substance. The human reviewed it — and the human may have missed things that a systematic process would not have missed.
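The two error types described in this section can be measured once a sample of past decisions has been re-audited. What follows is a minimal sketch, not a description of any existing tool: the `ReviewedCase` record, the `reviewer_error_rates` function, and the sample figures are all hypothetical, and it assumes a later audit has established which cases were actually compliant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedCase:
    """One audited case: what the reviewer decided vs. what policy required."""
    approved_by_reviewer: bool   # the human reviewer's decision
    compliant_with_policy: bool  # ground truth from a later audit (assumed available)


def reviewer_error_rates(cases: list[ReviewedCase]) -> dict[str, float]:
    """Return the false-approval, false-rejection, and overall error rates."""
    if not cases:
        raise ValueError("need at least one audited case")
    non_compliant = [c for c in cases if not c.compliant_with_policy]
    compliant = [c for c in cases if c.compliant_with_policy]
    # Non-compliant requests that were approved anyway.
    false_approvals = sum(c.approved_by_reviewer for c in non_compliant)
    # Compliant requests that were rejected.
    false_rejections = sum(not c.approved_by_reviewer for c in compliant)
    return {
        "false_approval_rate": false_approvals / len(non_compliant) if non_compliant else 0.0,
        "false_rejection_rate": false_rejections / len(compliant) if compliant else 0.0,
        "overall_error_rate": (false_approvals + false_rejections) / len(cases),
    }


# Made-up audit sample: 90 compliant claims (2 wrongly rejected)
# and 10 non-compliant claims (3 wrongly approved).
sample = (
    [ReviewedCase(True, True)] * 88
    + [ReviewedCase(False, True)] * 2
    + [ReviewedCase(True, False)] * 3
    + [ReviewedCase(False, False)] * 7
)
rates = reviewer_error_rates(sample)
```

Reporting the two rates separately matters because a high approval rate can hide a high false-approval rate: in the made-up sample the overall error rate is low while nearly a third of the non-compliant claims get through.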

What Systematic Accountability Looks Like

I want to distinguish between two kinds of accountability.

The first is accountability as responsibility: a named person is responsible for the outcome. If something goes wrong, that person is accountable. This is the kind of accountability that the "keep humans in the loop" argument invokes.

The second is accountability as reliability: the process that produced the outcome is specified, documented, and auditable. If something goes wrong, the reason can be identified in the process record. The failure can be diagnosed and the process corrected. This is the kind of accountability that a well-specified automated system provides.

For well-specified, rule-based processes, the second kind of accountability is superior to the first. Not because named responsibility is unimportant, but because named responsibility for a rule-based process is a weak accountability mechanism: it tracks whether a person was present in the process, not whether the process produced the correct outcome. An automated process that fails is diagnosable and correctable. A human process that fails is explainable but harder to fix: "the reviewer missed it" is an account of what happened, not a solution to the structural problem that the reviewer could miss it again.

PBOMaster's constitution analyser provides systematic accountability. Every check it runs is documented — the specific provision checked, the text of the constitution assessed against it, the determination reached. If the analyser reaches a wrong determination, the error is visible in the audit trail and can be corrected. The correction applies to every future analysis. There is no named human accountable for the analysis. There is a documented process accountable for every step of it.
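The shape of such an audit trail can be sketched in a few lines. This is an illustration only and not PBOMaster's actual implementation: the `Check` and `AuditEntry` types, the `analyse` function, and the two keyword rules are hypothetical, and real statutory checks would be far more detailed than a substring match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Check:
    """A single statutory requirement and the rule that tests for it."""
    provision: str
    is_satisfied: Callable[[str], bool]


@dataclass(frozen=True)
class AuditEntry:
    provision: str      # the provision checked
    assessed_text: str  # the constitution text assessed against it ("" if none matched)
    determination: str  # "present" or "missing"


def analyse(clauses: list[str], checks: list[Check]) -> list[AuditEntry]:
    """Run every check and record exactly one audit entry per check."""
    trail = []
    for check in checks:
        # First clause that satisfies the rule, or None if no clause does.
        match = next((c for c in clauses if check.is_satisfied(c)), None)
        trail.append(AuditEntry(
            provision=check.provision,
            assessed_text=match if match is not None else "",
            determination="present" if match is not None else "missing",
        ))
    return trail


# Hypothetical checklist, for illustration only.
CHECKS = [
    Check("dissolution clause", lambda c: "dissolution" in c.lower()),
    Check("annual general meeting", lambda c: "annual general meeting" in c.lower()),
]

constitution = [
    "The organisation shall hold an Annual General Meeting each year.",
    "Members may resign by written notice.",
]
trail = analyse(constitution, CHECKS)
for entry in trail:
    print(f"{entry.provision}: {entry.determination}")
```

Because every determination is stored next to the rule and the text that produced it, a wrong result points to a specific rule, and fixing that rule in `CHECKS` changes every later run.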

This is better accountability for this task than the alternative of paying a lawyer sixty thousand shillings to do a review that produces a written opinion but not a systematic record of every check performed.

The Right Scope for Human Oversight

I am not arguing that humans should be removed from every process. The argument is about scope.

Humans should be in the loop for decisions that involve genuine judgment, interpretive skill, and context-sensitivity that cannot be specified in advance; for the exception cases that fall outside the parameters of a well-specified rule set; for the policy design process itself, which determines what the automated system will enforce; and for the governance oversight function that determines whether the automated system is performing correctly over time.

Humans should not be in the loop for the individual instances of a well-specified, high-volume rule-checking process: the routine expense approvals, constitution checks, data validation passes, and document routing decisions that together constitute the administrative bulk of most organisations' operations. This is not because these decisions don't matter, but because systematic automation is a more reliable mechanism for ensuring they are made correctly than individual human attention applied at high volume.
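The scope distinction can be expressed as a routing rule: instances the rule set covers are decided automatically, and anything outside it is escalated to a person. The sketch below assumes a hypothetical expense policy; the `ExpenseClaim` type, the `LIMITS` table, and the `route` function are illustrative names, not part of any system described here.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto_approve"
    AUTO_REJECT = "auto_reject"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class ExpenseClaim:
    category: str
    amount: float
    has_receipt: bool


# Hypothetical policy: per-category spending limits. Categories not listed
# here fall outside the specified rule set and therefore go to a person.
LIMITS = {"travel": 20_000.0, "meals": 3_000.0}


def route(claim: ExpenseClaim) -> Route:
    """Decide routine cases by rule; escalate anything the rules do not cover."""
    limit = LIMITS.get(claim.category)
    if limit is None:
        return Route.HUMAN_REVIEW  # exception case: no rule applies
    if not claim.has_receipt or claim.amount > limit:
        return Route.AUTO_REJECT   # clear rule violation
    return Route.AUTO_APPROVE      # clearly within policy
```

In this sketch a claim such as `ExpenseClaim("meals", 2_500.0, True)` is approved without human attention, while a claim in an unlisted category is sent for review. The human effort is concentrated on the cases where the policy itself is silent.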

The organisations that understand this distinction will design their AI deployments differently from the ones that don't. They will not ask "how can AI assist our people in doing these reviews?" They will ask "what is the most reliable mechanism for ensuring that these rules are applied correctly?" For many processes, the answer is: build a system that applies the rules, audit the system's performance, and free your people for the work that actually requires them.

The Implication for Trust

There is a deeper issue behind the accountability argument that I want to name. The insistence on human oversight for well-specified processes is, at its root, an expression of distrust in software — and historically, it has been the correct expression. Most of the software that organisations have used to manage compliance, approvals, and process enforcement has been genuinely unreliable: misconfigured, poorly maintained, opaque in its operation, and difficult to audit when something goes wrong.

The correct response to unreliable software is to keep humans in the loop. Not because humans are more reliable than software in principle, but because a human who is wrong can be questioned and corrected, while software that is wrong often cannot be diagnosed without specialist intervention.

But the basis of this distrust has changed. The software that now exists for well-specified compliance and process tasks is more auditable, more configurable, and more transparent in its operation than the enterprise software that preceded it. When the system is wrong, the error is visible and correctable. The appropriate response to this change is to calibrate human oversight to the actual reliability of the current system — not to maintain the same level of oversight that was justified by the unreliability of previous systems.
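One way to make that calibration concrete is to audit a sample of the system's decisions and compare a conservative estimate of its error rate with the measured human baseline. The sketch below uses the Wilson score interval, a standard interval for a proportion; the function names, the 95% confidence level (z = 1.96), and the decision rule are assumptions made for illustration, not a method this paper prescribes.

```python
import math


def wilson_upper_bound(errors: int, sample_size: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for an error proportion."""
    if sample_size <= 0 or not 0 <= errors <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= errors <= sample_size")
    p = errors / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    margin = z * math.sqrt(
        p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)
    ) / denom
    return centre + margin


def oversight_can_be_reduced(errors: int, sample_size: int,
                             human_error_rate: float) -> bool:
    """True only if even the pessimistic estimate beats the human baseline."""
    return wilson_upper_bound(errors, sample_size) < human_error_rate
```

The sample size matters as much as the error count. With a hypothetical human baseline of 5%, zero errors in 100 audited decisions clears the bar, but zero errors in 20 does not, because 20 clean decisions are too few to rule out a much higher underlying error rate.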

Accountability is not a reason to keep humans in the process. It is a reason to design processes that are accountable. For well-specified tasks, a well-designed automated system is more accountable — in the substantive sense — than a human review chain that produces named responsibility without systematic reliability.


  1. On error rates in manual compliance and approval processes, see Association of Certified Fraud Examiners (ACFE), Report to the Nations on Occupational Fraud and Abuse (various years); specific statistics on approved-but-non-compliant expense claims remain to be verified. See also the academic literature on human error rates in rule-based tasks, in particular work on cognitive load and decision fatigue in compliance roles.

Provenance
Flux Working Paper No. 20 · Ken Ruto, Flux (FluxImpact)
Published 9 Jun 2026
Content hash (SHA-256): 450a7b283250b237… · build 81caba6
DOI: pending deposit
About the author
Ken Ruto

Founder of Flux. Building vertical AI-powered SaaS for Africa's institutions, and writing the thesis behind every bet. kenruto.fluximpact.org
