
The Myth of Agentic Code Understanding – A Technical Explanation


Recently, the industry has grown comfortable calling any automated interaction with code agentic. If a system can read source files, generate summaries, and run without human intervention, it is labelled intelligent—sometimes even autonomous.

This framing is not just inaccurate. It is technically wrong.

 

At system scale, understanding code is not a language problem. It is a truth problem.

 

No amount of prompting, context expansion, or narrative coherence can compensate for the absence of a deterministic definition of reality. Without an explicit, machine-verifiable model of structure and behaviour, there is nothing for an “agent” to reason over, nothing it can be accountable to, and nothing that can be trusted when change occurs.

This technical companion paper makes a deliberately unfashionable argument: before intelligence, there must be determinism.

It is structured as a Q&A technical paper and is designed to challenge current thinking.

 

Why table and column extraction is structurally insufficient

 

Tables and columns describe interfaces, not behaviour.

They tell you:

  • what structures exist
  • where data enters and exits

They do not capture:

  • expressions
  • conditional logic
  • derivations
  • cross‑step semantics

At system scale, behaviour — not structure — defines impact.

A change in how a value is derived is almost always more significant than a change in where it is stored.

Any approach that stops at tables and columns is blind to:

  • transformation logic
  • propagation effects
  • semantic reuse
  • compounding impact across processes

This is not a tooling limitation.

It is a modelling failure.

 

The hidden problem of cross‑code resolution

 

Single‑file understanding is not the challenge.

System‑level understanding requires the ability to:

  • link logic across files
  • resolve reuse and indirection
  • track propagation paths
  • understand intermediate transformations

This is where most LLM‑based approaches quietly fail.

Because cross‑code resolution is not a language problem. It is a structural resolution problem.

The answer is not “more context” or “better prompting”.

The answer is a graph.

 

Why graphs are not optional

 

Graphs are not a visual convenience. They are the only representation that makes dependency explicit and traversable.

A deterministic lineage graph provides:

  • explicit nodes (tables, columns, expressions)
  • explicit edges (read, write, derive, depend)
  • directionality
  • transitive closure

This enables operations that narrative output fundamentally cannot:

  • exhaustive upstream and downstream traversal
  • root‑and‑branch impact analysis
  • completeness checking
  • structural diffing across versions

A graph can be fully walked. A language model cannot know whether it has seen everything.

That distinction matters.
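To make that distinction concrete, here is a minimal sketch in Python of what such a graph makes possible. Every name here (LineageGraph, the node identifiers, the edge labels) is illustrative rather than a reference to any specific product; the point is that upstream and downstream traversal become exhaustive, mechanical operations.

```python
from collections import defaultdict

class LineageGraph:
    """A minimal deterministic lineage graph: explicit nodes, typed, directed edges."""

    def __init__(self):
        self.fwd = defaultdict(set)   # node -> {(edge_type, downstream node)}
        self.rev = defaultdict(set)   # node -> {(edge_type, upstream node)}

    def add_edge(self, source, edge_type, target):
        self.fwd[source].add((edge_type, target))
        self.rev[target].add((edge_type, source))

    def _walk(self, start, adjacency):
        """Transitive closure: every node reachable from `start` via `adjacency`."""
        seen, stack = set(), [start]
        while stack:
            for _, nxt in adjacency[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def downstream(self, node):
        return self._walk(node, self.fwd)   # everything this node feeds

    def upstream(self, node):
        return self._walk(node, self.rev)   # everything this node depends on

# Hypothetical lineage: a derived column feeding a downstream report column.
g = LineageGraph()
g.add_edge("src.customers.dob", "derive", "stage.customers.age")
g.add_edge("stage.customers.age", "read", "mart.risk.age_band")

print(g.downstream("src.customers.dob"))  # {'stage.customers.age', 'mart.risk.age_band'}
print(g.upstream("mart.risk.age_band"))   # {'stage.customers.age', 'src.customers.dob'}
```

The traversal is a fixed-point computation over explicit edges: it terminates, and when it terminates the answer is complete by construction. There is no equivalent operation over narrative output.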

 

Why an LLM on top of code is not agentic

 

An LLM placed on top of code — even with tooling and automation — remains a probabilistic language model.

It:

  • generates plausible continuations
  • smooths over gaps
  • optimises for coherence

It does not:

  • enumerate all execution paths
  • guarantee completeness
  • signal when something is missing
  • act with authority

Wrapping this output in workflows does not change its nature.

Automation is not agency.

An agent, by definition, is authorised to act.

Authorisation requires:

  • a stable definition of reality
  • explicit boundaries
  • predictable outcomes
  • accountability

A probabilistic model cannot supply those properties.

This is why an LLM alone is, at best, an assistant — not an agent.

 

What “agentic” actually means (technically)

 

From a systems perspective, agentic does not mean:

  • autonomous text generation
  • chaining prompts
  • or running without human input

Agentic means:

  • goal‑directed behaviour
  • operating within defined authority
  • making decisions based on trusted state
  • producing actions that can be audited and replayed

This requires a substrate that is:

  • deterministic
  • explicit
  • complete
  • machine‑verifiable

Without that substrate, there is nothing for an agent to be responsible to.

 

Determinism as an explicit contract

 

Determinism is not a philosophical stance.

It is an engineering contract.

A deterministic system explicitly states:

  • what is guaranteed
  • what is not inferred
  • what can be reproduced
  • what can be compared

This makes outputs machine‑trustworthy, not just human‑readable.

It allows systems to answer questions like:

  • “Has anything structurally changed?”
  • “Is this impact identical to last time?”
  • “What is newly introduced vs previously known?”

No probabilistic system can offer those guarantees.
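As an illustrative sketch (the triple format and function names are assumptions, not any product's API), those questions reduce to mechanical operations once the graph has a canonical form: sort the edges, hash the result, and set-difference two versions.

```python
import hashlib

def fingerprint(edges):
    """Canonical fingerprint of a lineage graph.

    `edges` is an iterable of (source, edge_type, target) triples.
    Sorting fixes the ordering, so the same structure always hashes
    identically regardless of the order in which it was discovered.
    """
    canonical = "\n".join(f"{src}|{etype}|{dst}" for src, etype, dst in sorted(edges))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def structural_diff(old_edges, new_edges):
    """What is newly introduced vs previously known."""
    old, new = set(old_edges), set(new_edges)
    return {"added": new - old, "removed": old - new}

v1 = [("src.a", "derive", "stage.b"), ("stage.b", "read", "mart.c")]
v2 = v1 + [("stage.b", "derive", "mart.d")]

print(fingerprint(v1) == fingerprint(list(reversed(v1))))  # True: order-insensitive
print(structural_diff(v1, v2))  # {'added': {('stage.b', 'derive', 'mart.d')}, 'removed': set()}
```

"Has anything structurally changed?" becomes a fingerprint comparison; "what is newly introduced?" becomes a set difference. Both are exact, repeatable, and cheap.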

 

The correct division of labour

 

Safe, scalable systems separate responsibilities cleanly:

  • Deterministic systems
    • resolve structure and truth
    • define reality
    • constrain action
  • AI systems (including LLMs)
    • interpret known structure
    • explain behaviour
    • assist navigation
    • propose actions within bounds

When this boundary is respected:

  • AI becomes powerful
  • systems remain stable
  • automation becomes defensible

When it is not:

  • errors scale silently
  • confidence outpaces correctness
  • trust erodes gradually

This pattern repeats across every large estate.
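A minimal sketch of that boundary, with every name hypothetical: the deterministic layer owns the set of known facts, and anything the AI proposes is validated against that set before it is allowed to become an action.

```python
# Hypothetical boundary between the two layers: the resolved graph is the
# authority on what exists; the LLM may only propose, never assert.

KNOWN_COLUMNS = {"stage.customers.age", "mart.risk.age_band"}  # from the deterministic graph

def gate_proposal(proposal: dict) -> dict:
    """Validate an AI-proposed action against deterministic state.

    Proposals that reference structure the graph has never resolved are
    rejected with an explicit, auditable reason rather than acted upon.
    """
    unknown = set(proposal["targets"]) - KNOWN_COLUMNS
    if unknown:
        return {"accepted": False, "reason": f"unknown targets: {sorted(unknown)}"}
    return {"accepted": True, "action": proposal}

# A real column passes; a plausible-sounding invented one stops at the boundary.
print(gate_proposal({"action": "deprecate", "targets": ["mart.risk.age_band"]}))
print(gate_proposal({"action": "deprecate", "targets": ["mart.risk.age_band_v2"]}))
```

The design choice is the important part: the AI is free to be useful inside the gate, and structurally incapable of acting outside it.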

 

The Three Models the Industry Is Using — and Why Only One Scales

 

When organisations talk about “AI‑driven code understanding”, they are usually referring to one of three models — even if they don’t name them explicitly.

Understanding why two of these fail is essential to understanding why determinism and graphs are not optional.

 

Model 1: Human‑Centric Code Reading (The Manual Model)

 

This is the traditional approach:

  • Hire skilled engineers or consultants
  • Have them read, interpret, and document the code
  • Produce lineage artefacts, impact assessments, or migration plans

This model has two defining characteristics:

  1. It is slow
    Large codebases routinely take days or weeks per system slice.
  2. It provides no guarantees
    • Coverage depends on time and human judgment
    • Results are not repeatable
    • Two analysts will not produce identical outputs

The key limitation is not effort — it is non‑determinism.

Human understanding cannot be replayed, diffed, or mechanically validated.

At scale, this becomes operationally and economically unsustainable.

 

Model 2: LLM‑Centric Code Interpretation (The Probabilistic Model)

 

This is the increasingly popular alternative:

  • Provide code to an LLM
  • Ask it to extract tables, columns, or summaries
  • Automate the process
  • Treat the output as “gold”

This model feels radically faster — and superficially more modern.

But technically, it has a hard ceiling.

LLMs are probabilistic language models. They:

  • do not execute code
  • do not follow execution paths
  • do not enumerate branches
  • do not resolve runtime semantics
  • do not signal what they missed

This is especially important for SAS‑based systems.

No LLM will ever reliably:

  • resolve %INCLUDE chains across environments
  • expand nested macros deterministically
  • follow execution‑time branch logic
  • track dynamic variable creation and reassignment
  • reconstruct actual execution paths from logs

These are not tuning problems. They are model‑class limitations.

Automating this output does not make it agentic. It only makes probabilistic interpretation faster.

 

Why This Is Commonly (and Incorrectly) Called “Agentic”

The confusion arises because:

  • the output looks coherent
  • the process is automated
  • the system runs without human input

But autonomy is not agency.

An agent must be able to:

  • act within defined authority
  • reason over trusted state
  • produce outcomes that are explainable and repeatable

A probabilistic model operating directly on code cannot satisfy those conditions.

It can describe.
It cannot guarantee.

 

Model 3: Deterministic Graph‑Based Resolution (The Structural Model)

 

The third model takes a fundamentally different approach:

  • Resolve execution semantics, not surface text
  • Expand macros deterministically
  • Follow execution branches explicitly
  • Materialise tables, columns, and expressions
  • Build a complete dependency graph
  • Produce repeatable, auditable outputs

This model is harder.

It requires:

  • language‑specific execution modelling
  • structural parsing
  • explicit handling of ambiguity
  • clear contracts about what is and is not provable

But it has properties the other two never can:

  • identical inputs → identical outputs
  • full upstream/downstream traversal
  • provable completeness
  • machine‑verifiable change detection

This is why the difference is not “seconds vs days”.

It is determinism vs interpretation.
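As a toy illustration of that difference (a sketch under strong simplifying assumptions, not how any production resolver works), even %INCLUDE expansion shows the shape of the contract: every reference is resolved deterministically or fails loudly, and nothing is guessed. Real SAS resolution must also handle filerefs, autocall macros, and environment-dependent paths, which is exactly why it is hard.

```python
import re
from pathlib import Path

# Simplified: only the %INCLUDE 'path'; form, resolved relative to the including file.
INCLUDE_RE = re.compile(r"%include\s+['\"](?P<path>[^'\"]+)['\"]\s*;", re.IGNORECASE)

def expand_includes(path: Path, seen=frozenset()) -> str:
    """Deterministically expand a %INCLUDE chain.

    The same inputs always produce the same expansion; cycles and
    missing files raise explicit errors instead of being smoothed over.
    """
    resolved = path.resolve()
    if resolved in seen:
        raise ValueError(f"cyclic %INCLUDE: {resolved}")
    seen = seen | {resolved}

    out = []
    for line in resolved.read_text().splitlines():
        m = INCLUDE_RE.search(line)
        if m:
            target = (resolved.parent / m.group("path")).resolve()
            if not target.exists():
                raise FileNotFoundError(f"unresolvable %INCLUDE in {resolved}: {target}")
            out.append(expand_includes(target, seen))
        else:
            out.append(line)
    return "\n".join(out)
```

The parsing is trivial; the contract is not. Identical inputs always expand to identical text, and anything unresolvable surfaces as an explicit failure rather than a plausible continuation.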

 

Why the Time Comparison Matters — but Is Not the Point

 

It is tempting to reduce this to performance:

  • seconds for deterministic resolution
  • days or weeks for manual analysis

But speed is not the real differentiator.

The real difference is what happens after the result is produced.

Only deterministic, graph‑based outputs can be:

  • reused safely
  • compared across versions
  • fed into automation
  • placed under governance
  • used as a control plane for AI

Everything else is a one‑off explanation.

 

The hidden human patch layer (how “agentic” failures get masked)

 

A predictable pattern is emerging in many AI programmes:

  1. A Large Language Model produces a plausible output.
  2. Humans review it, fix what’s missing or wrong.
  3. Only the final corrected output is visible to stakeholders.
  4. The system is described as “agentic” because the workflow is automated.

This creates a dangerous illusion: the agent appears reliable because humans are silently compensating for its gaps.

The cost is not only labour. The cost is trust:

  • teams cannot distinguish “the system worked” from “humans corrected it,”
  • quality becomes dependent on invisible effort,
  • and drift accumulates until it surfaces during change (refactor, migration, integration).

 

Why this is not agentic (even if the workflow is automated)

 

An LLM-on-code system is still a probabilistic text generator, not an authority on system structure. The fact that it runs in an automated pipeline does not change the model class.

A useful definition of “agentic” has two non-negotiables:

  • it must be able to act (not just talk), and
  • it must do so under governance, oversight, and deterministic guardrails.

This is explicitly captured in Agentic AI Customer Message 2026, which defines agentic systems as combining autonomy with human oversight and governance by design, and balancing LLMs with deterministic guardrails.

So if an “agent” is producing outcomes that require silent human correction, what you have is not agency—it is human decisioning with AI-assisted drafting.

 

The “silent failure” mechanism (why refactors and integrations are where trust collapses)

 

The most damaging failures in code understanding do not arrive as errors. They arrive as omissions:

  • missing dependencies,
  • misinterpreted branches,
  • incomplete propagation across processes.

Those omissions often remain hidden until change is introduced—exactly the moment organisations are trying to accelerate: refactoring, transpilation, upgrades, integration, and audit readiness.

This is why the controlled approach described in D1 - From Deterministic Lineage Graphs to Controlled AI Responses makes a blunt point: controlling an LLM is not primarily prompt engineering; it requires a finite answer space, deterministic ordering, template-bound output, and a hard stop on inference—because the lineage itself must remain stable and provable.

The key insight from D1 - From Deterministic Lineage Graphs to Controlled AI Responses is worth stating directly:

“The lineage never changes. Only the narration can.”

If the lineage appears to “change” between runs, the system is not discovering new truth—it is drifting.
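A minimal sketch of what template-bound output means in practice (the template and field names are illustrative only): the facts are computed deterministically from the graph, and the language layer may only fill slots, never add claims.

```python
# Illustrative template-bound narration: finite answer space, deterministic
# ordering, and no way for the narrator to introduce a fact.

TEMPLATE = ("Column {column} is derived from {n} upstream column(s): {sources}. "
            "No other dependencies exist in the resolved graph.")

def narrate(column: str, upstream: set) -> str:
    """Render lineage facts as prose without inference.

    `upstream` comes from the deterministic graph; sorting fixes the
    ordering, so identical lineage always yields identical narration.
    """
    sources = ", ".join(sorted(upstream)) or "none"
    return TEMPLATE.format(column=column, n=len(upstream), sources=sources)

print(narrate("mart.risk.age_band", {"stage.customers.age", "src.customers.dob"}))
```

An LLM can still rephrase the rendered sentence for readability, but under this contract the lineage itself never changes; only the narration can.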

 

The competitive leap: why deterministic truth is the unlock (and where value compounds)

 

Once an organisation embeds deterministic, canonical truth as a graph, two things happen:

  1. Manual validation stops being the hidden cost centre.
    The system can be tested, diffed, replayed, and challenged mechanically.
  2. AI becomes safe to deploy at scale, because it is no longer the authority on facts.
    It becomes an interface to proven structure.

That’s the progression described in A0 - Lineage to Infrastructure - Building Truth Before Intelligence: determinism enables infrastructure; infrastructure enables governance at scale; only then does intelligence become safe.

If your ‘agent’ needs humans to quietly correct it, what you’ve built isn’t autonomy—it’s a hidden manual process with an AI front-end.

 

Why this addresses the “all AI is LLM” concern

 

By naming and dissecting the hidden human patch layer, this argument does not merely criticise LLMs; it protects AI from reputational damage by drawing a clean distinction:

  • LLM-as-authority → drift + omissions + hidden human repair → “AI doesn’t work”
  • Deterministic truth + LLM-as-interface → stable facts + auditable outputs → “AI scales safely”

 

Why This Matters for “Agentic” Systems

 

Agentic systems require authority.

Authority requires:

  • a stable definition of reality
  • a boundary the agent cannot invent
  • an artefact that can be audited

That boundary is not language. It is structure.

Until deterministic graphs exist, there is nothing for an agent to act within.

Calling LLM‑driven automation “agentic” before that foundation exists is not ambitious.

It is premature.

 

Final Technical Position

 

This is not an argument against LLMs.

It is an argument against misplacing them.

LLMs are excellent at:

  • explanation
  • summarisation
  • guidance

They are not capable of:

  • defining execution truth
  • resolving system‑level behaviour
  • acting as authorities on structure

Agentic systems begin after determinism — not instead of it.

The hard path was never fashionable. But it is the only one that has ever scaled.

 

Next: The Minimum Deterministic Substrate - What Must Be True Before AI Is Allowed to Act - Link

 

The Full Series

 

  1. Determinism, Probability, and the Cost of Getting This Wrong - Link
  2. Why probabilistic language models are being mistaken for agents — and why systems expose the flaw - Link
  3. Stop Calling It Agentic: You’ve Just Automated an LLM - Link
  4. The Myth of Agentic Code Understanding – A Technical Explanation - Link
  5. The Minimum Deterministic Substrate - What Must Be True Before AI Is Allowed to Act - Link
  6. Determinism Is the Forgotten Path to Success: Why the hard path is often the only one that actually scales – Link
  7. The Broken Escalator, Deterministic Lineage, and the Problem of Grounded Truth in AI - Link
  8. When Probabilistic Systems (LLMs) Pretend to Be Deterministic: A Lineage Case Study – Link