BlueContext

Opinions on Agentic Software Engineering

Re-inventing Software Engineering - now with agents?

Having moved from academia into industry with a focus on agentic software development, I quickly noticed one thing: a remarkable number of articles, guides, and framework docs are rediscovering practices that the software engineering field spent the better part of four decades learning the hard way (not that academia is immune to this).

TL;DR:

Specs: Is spec-driven development repeating the original waterfall mistake, treating the spec as a fixed input to the agent rather than a living document the agent should help revise?
Narrow, focused agents: Decomposing work into single-responsibility units is one of the oldest principles in software engineering, and it applies equally to modules, teams, and now agents.
Documentation: The challenge is an old one: giving a context-limited participant enough structured information to act correctly without overwhelming it.
Retrospectives: Stepping back to improve both the agents and the process of working with them is exactly what retrospectives and process improvement programmes were designed for.

(Disclaimer: these are just a few examples that caught my attention, and I’m sure there are many other valid ways to draw parallels to past work.)

Re-inventing the waterfall

Consider first the push toward writing specifications before building, now spreading as "spec-driven development." The historical irony here runs deep. In his 1956 paper Production of Large Computer Programs, Benington described a nine-phase process for the SAGE air-defense system in which specifications preceded coding. In his 1983 retrospective, however, he explicitly warned that treating those specs as fixed and handed down was "misleading and dangerous," and that the right approach was to build a working prototype and let it drive revision. Royce, whose 1970 paper Managing the Development of Large Software Systems became the blueprint for the waterfall model, was making exactly the same argument: he warned that the naive sequential implementation of his model "is risky and invites failure," and advocated iterative development and customer involvement throughout. Both men watched the industry take their work and turn a blind eye to precisely the part that made it useful.

Current spec-driven agentic development appears to be repeating this exact mistake. The focus is almost entirely on providing the agent with a well-formed task list to implement against — a fixed contract handed downward. What is largely absent from the conversation is the other half of the equation: what role should the agent play in revising those specifications as implementation surfaces assumptions that turn out to be wrong? And how do changes discovered during implementation get reflected back into the living specification, so that the document doesn't immediately drift from reality? Without answers to those questions, spec-driven agent workflows risk becoming the new waterfall — complete with the same enthusiastic misreading of authors who were actually arguing for something far more dynamic.
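One way to picture the missing half of the loop is a specification that accepts revision proposals from the agent instead of only handing tasks downward. The following is a minimal Python sketch, not a description of any existing tool: the names `LivingSpec`, `Amendment`, `propose`, and `accept` are all hypothetical, and the approval step is assumed to be a human or review gate.

```python
from dataclasses import dataclass, field


@dataclass
class Amendment:
    """A change to one requirement, proposed during implementation."""
    requirement_id: str
    new_text: str
    reason: str       # what the implementation revealed
    proposed_by: str  # e.g. "agent" or "human"


@dataclass
class LivingSpec:
    """Hypothetical spec that is revised as wrong assumptions surface."""
    requirements: dict[str, str]
    pending: list[Amendment] = field(default_factory=list)
    history: list[Amendment] = field(default_factory=list)

    def propose(self, amendment: Amendment) -> None:
        # The agent reports a mismatch instead of silently coding around it.
        if amendment.requirement_id not in self.requirements:
            raise KeyError(f"unknown requirement: {amendment.requirement_id}")
        self.pending.append(amendment)

    def accept(self, amendment: Amendment) -> None:
        # An approved amendment changes the spec text; the reason is kept in
        # the history so the document does not drift from the code.
        self.pending.remove(amendment)
        self.requirements[amendment.requirement_id] = amendment.new_text
        self.history.append(amendment)


spec = LivingSpec({"R1": "Export reports as CSV."})
change = Amendment(
    requirement_id="R1",
    new_text="Export reports as CSV or JSON.",
    reason="Downstream service only accepts JSON.",
    proposed_by="agent",
)
spec.propose(change)
spec.accept(change)
print(spec.requirements["R1"])
```

The point of the sketch is the direction of flow: the spec is an input to the agent, but amendments flow back up, and each one carries the reason it was needed.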

Conway Was Here First

The same pattern of rediscovery appears elsewhere. The current guidance around keeping agent tasks narrow and focused restates Parnas (1972): his paper On the Criteria to Be Used in Decomposing Systems into Modules argued that each unit of work should hide exactly one design decision, and that comprehensibility and reliability both depend on that discipline. Conway had already made the organisational complement to this point in How Do Committees Invent? (Datamation, 1968), observing that systems tend to mirror the communication structure of the teams that build them, and that clear separation of responsibilities is therefore a property of how work is organised, not just how code is written. Both arguments apply to a team of specialised agents with the same force as to a team of specialised engineers.

Writing Things Down, Again

The current enthusiasm for structured project documentation and for organised knowledge bases that agents can consult resembles Architecture Decision Records, and the idea has older roots still. Parnas, Clements and Weiss's 1984 paper The Modular Structure of Complex Systems introduced the Module Guide as a hierarchically structured document allowing both designers and maintainers to find what they need without reading everything, much like Fred Brooks's Project Workbook concept from The Mythical Man-Month (1975). The idea that documentation must capture not just what was decided but why (the rationale behind architectural choices) was formally argued by Perry and Wolf in Foundations for the Study of Software Architecture (ACM SIGSOFT Software Engineering Notes, 1992), a concern that resurfaced two decades later in Nygard's Architecture Decision Records (2011) and surfaces again today in the push to populate agent knowledge bases with structured context. The problem being solved is identical across all three moments: how do you give a context-limited participant, whether a new hire or a stateless AI, enough structured information to act correctly? The answer in 1975 was a well-organised, hierarchical documentation artifact. The answer in 2025 is a CLAUDE.md file pointing to a well-organised, hierarchical collection of markdown files.
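To make the parallel concrete, here is a minimal Python sketch of a decision record that stores the rationale alongside the decision, plus a lookup that lets a context-limited reader fetch only the entries relevant to one area. The fields title, status, context, decision, and consequences follow the headings of Nygard's record format; the `topic` path, the `DecisionRecord` class, the `records_for` function, and the sample entries are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    title: str
    status: str        # e.g. "accepted" or "superseded"
    context: str       # why the question came up (the rationale)
    decision: str      # what was decided
    consequences: str  # what follows from the decision
    topic: str         # hypothetical hierarchical path, e.g. "storage/cache"


def records_for(topic: str, records: list[DecisionRecord]) -> list[DecisionRecord]:
    """Return accepted records filed under `topic` or any of its sub-topics."""
    base = topic.rstrip("/")
    prefix = base + "/"
    return [
        r for r in records
        if r.status == "accepted"
        and (r.topic == base or r.topic.startswith(prefix))
    ]


# Sample entries, invented for this example.
records = [
    DecisionRecord("Use PostgreSQL", "accepted", "Need relational queries.",
                   "Adopt PostgreSQL.", "Ops must run a database.",
                   "storage/database"),
    DecisionRecord("Cache in Redis", "superseded", "Slow report pages.",
                   "Add a Redis cache.", "Extra service to operate.",
                   "storage/cache"),
    DecisionRecord("REST over gRPC", "accepted", "Browser clients.",
                   "Expose a REST API.", "No streaming support.",
                   "api"),
]

relevant = records_for("storage", records)
print([r.title for r in relevant])  # prints ['Use PostgreSQL']
```

A reader working on storage, human or agent, gets one current record with its reasoning attached, rather than the whole archive.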

Learning From How We Work, Not Just What We Build

Even the emerging practice of periodically reviewing how agents are being directed (revisiting prompt rules, refining task structures, updating the conventions in a CLAUDE.md after things go wrong) maps directly onto the Agile retrospective, formalised by Derby and Larsen in Agile Retrospectives: Making Good Teams Great (2006) and embedded in the Scrum framework, which Schwaber and Sutherland first presented in 1995 and later codified in the Scrum Guide. The deeper tradition behind it is the software process improvement movement, exemplified by the NASA Software Engineering Laboratory's quarter-century programme of capturing experience across projects and feeding it back into revised processes, documented in Basili et al.'s Lessons Learned from 25 Years of Process Improvement (ICSE, 2002). So what can retrospectives and process improvement programmes teach us about improving the way we improve agents?

So, nothing new then?

None of this is to say that agents change nothing. They change everything, but along a different axis. The principles of good software engineering remain what they were; what has changed is the rate at which they are applied, and the rate at which violations of them spread. A human developer who receives a poorly decomposed task or a frozen specification will lose hours or days before the mismatch becomes visible. An agent operating at machine speed, generating thousands of lines of output across dozens of files in minutes, will have propagated every assumption in that flawed specification deep into the codebase before anyone has had a chance to review the first commit. This is not an argument against agents; it is an argument for taking the underlying engineering discipline far more seriously than the current discourse suggests. As others have already argued, the human is still the bottleneck, no longer in coding but in steering intent and validating results.

And this is also why I expect broader rediscovery of software engineering processes (as I will lay out in my next post).

About BlueContext

We are a young university spin-off helping organizations modernize their engineering processes, unlock AI agent potential, and ship faster with confidence.
