Senior engineers and the tacit context problem
In his discussion of why AI agents can struggle where seasoned engineers succeed, Philipp Schmid points to a simple but stubborn problem: what a human designer takes for granted, such as the real-world implications of a deleteItem function or how state is tracked, remains invisible to an agent that sees only a function signature and a docstring. That gap helps explain why even veteran engineers often find handoffs to autonomous tooling brittle in practice.
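To make the gap concrete, here is a minimal Python sketch of two versions of the same tool. Everything in it is an assumption for illustration: the `Inventory` store, the snake_case function names, and the soft-delete behavior are hypothetical and do not come from the talk. The point is only that the second docstring and return value state what the first leaves in the designer's head.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical in-memory store standing in for a real backend."""
    items: dict[str, str] = field(default_factory=dict)
    trash: dict[str, str] = field(default_factory=dict)


def delete_item_terse(store: Inventory, item_id: str) -> None:
    """Delete an item."""
    # All an agent learns: a name, two parameters, three words of description.
    store.items.pop(item_id, None)


def delete_item_explicit(store: Inventory, item_id: str) -> str:
    """Move an item to the trash (soft delete).

    Context an agent cannot infer from the signature alone:
    - The item stays recoverable in the trash; nothing is destroyed here.
    - Calling this with an unknown id changes nothing and is not an error.
    - The returned text describes the resulting state so the agent can
      plan its next step.
    """
    if item_id not in store.items:
        return f"No item with id {item_id!r} exists; nothing was changed."
    store.trash[item_id] = store.items.pop(item_id)
    return f"Item {item_id!r} moved to trash; it can still be restored."
```

Whether a real deleteItem is a soft or hard delete is exactly the kind of fact that has to be written down somewhere the agent can read it.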
The four shifts that reshape tooling for agents
- Text replaces structured state: Engineers are used to keeping state in structured data, but an agent often works best from prose or other unstructured context. That mismatch makes it hard for the agent to keep a coherent thread across long-running tasks.
- Errors are inputs, not restart triggers: A failure should not force a full restart after minutes of execution. Once an agent has been running for a while, an error should be fed back as information that shapes its next decision.
- Evals replace unit tests: The key question is how often the system works in real-world use, not whether a fixed input always yields a fixed output. Effectiveness is judged by repeatable behavior under varied conditions.
- Build to delete: Architect with the expectation that the agent will be rebuilt or replaced as models evolve; tooling should accommodate that lifecycle rather than lock you into a single version.
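The "errors are inputs" shift can be sketched as a small loop in which a tool failure is recorded as text in the running history instead of aborting the run. This is a minimal sketch under stated assumptions: `flaky_lookup`, `scripted_policy`, and `run_agent` are hypothetical names, and the scripted policy merely stands in for a model that reads the history and picks the next argument.

```python
from typing import Callable


def flaky_lookup(query: str) -> str:
    """Hypothetical tool: rejects empty queries, succeeds otherwise."""
    if not query.strip():
        raise ValueError("query must not be empty")
    return f"result for {query!r}"


def run_step(tool: Callable[[str], str], arg: str, history: list[str]) -> None:
    """Call a tool and record the outcome, success or failure, as text."""
    try:
        history.append(f"OK: {tool(arg)}")
    except Exception as exc:
        # The error becomes an observation in the history, not a crash.
        history.append(f"ERROR: {type(exc).__name__}: {exc}")


def scripted_policy(history: list[str]) -> str:
    """Stand-in for a model: corrects its argument after seeing an error."""
    if history and history[-1].startswith("ERROR"):
        return "agent tooling"
    return ""  # the first attempt is deliberately invalid


def run_agent(max_steps: int = 3) -> list[str]:
    """Run until a step succeeds or the step budget is exhausted."""
    history: list[str] = []
    for _ in range(max_steps):
        run_step(flaky_lookup, scripted_policy(history), history)
        if history[-1].startswith("OK"):
            break
    return history


for line in run_agent():
    print(line)
```

The work done before the failure stays in `history`, so the next decision builds on it rather than starting over.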
What this means for teams and organizations
The talk frames a broader industry shift: tools designed to empower both humans and AI must tolerate iteration, partial information, and evolving models. As AI agents become more common in business and specialized domains, the challenge is not just technical polish but also governance, safety, and workflow design that matches how humans actually work. In the corporate arena, leaders are increasingly focused on redesigning work itself to accommodate agents, a theme that has surfaced in coverage of executives embedding agents into real teams and decision cycles (Fortune). The tooling ecosystem is also maturing toward hybrids that serve both people and agents, as shown by reporting on unified command-line interfaces built for humans and AI agents alike (InfoQ). And the scale of agent deployment in specific domains, such as legal tech with dozens of agents, underscores how quickly these tools are crowding the space (Artificial Lawyer).
Sources & further reading
- AI Engineer (YouTube video description) — Provides the core claims of Philipp Schmid about implicit context and the four shifts in agent design.
- InfoQ – Google Workspace CLI — Shows industry tooling trends toward supporting both humans and AI agents, reflecting the shift in tooling design.
- Fortune — Illustrates the broader claim that redesigning work around AI agents is a major organizational challenge.
- Artificial Lawyer — Demonstrates the scale of agent proliferation in specialized domains, echoing the article’s theme of tooling expansion.
Definitions
- Implicit context: Tacit knowledge and assumptions that humans bring to design and operation, which are not encoded in explicit artifacts like code or docs and are therefore not visible to AI agents.
- Tool schemas and docstrings: Interfaces that describe functions (names, signatures, and descriptions) but do not capture real-world usage context; agents may see only these, not the broader workflow.
- Evals vs unit tests: A testing mindset focused on real-world effectiveness over fixed inputs, emphasizing ongoing evaluation under varied conditions.
- Build to delete: A design principle that assumes components (like AI agents) will be rebuilt or replaced as models improve, requiring tooling that supports iteration.
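The "evals vs unit tests" definition above can be illustrated with a small pass-rate harness. This is a sketch under stated assumptions, not a real evaluation framework: `flaky_agent` is a hypothetical nondeterministic system that answers correctly with an assumed probability of 0.8, and the three test cases are invented for illustration.

```python
import random
from typing import Callable


def flaky_agent(task: str, rng: random.Random) -> str:
    """Hypothetical nondeterministic system: usually right, sometimes not."""
    return task.upper() if rng.random() < 0.8 else "unknown"


def pass_rate(
    system: Callable[[str, random.Random], str],
    cases: list[tuple[str, str]],
    trials: int,
    seed: int = 0,
) -> float:
    """Run every case `trials` times and return the fraction that passed."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    passed = 0
    for task, expected in cases:
        for _ in range(trials):
            if system(task, rng) == expected:
                passed += 1
    return passed / (len(cases) * trials)


cases = [("alpha", "ALPHA"), ("beta", "BETA"), ("gamma", "GAMMA")]
rate = pass_rate(flaky_agent, cases, trials=50)
# A unit test asserts one exact output; an eval reports how often the
# system succeeds and is typically compared against a chosen threshold.
print(f"pass rate: {rate:.2f}")
```

A unit test on `flaky_agent` would fail intermittently and say little; the pass rate over many trials and varied cases is the number that tracks how often the system works.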