Agentic Engineering: What Comes After Vibe Coding

2026-04-06 · by AI Gens Team · [building-in-ai]
#Agentic Engineering #Vibe Coding #AI Code Generation #Developer Tools #Cursor #Software Quality

Twelve months. That's all it took for "vibe coding" to go from a tweet to Collins Dictionary's Word of the Year — and then to obsolescence.

Andrej Karpathy coined the term in February 2025, describing a development style where you "fully give in to the vibes, embrace exponentials, and forget that the code even exists." You tell the AI what you want. It writes the code. You don't read it. You don't need to understand it. You just... vibe.

The idea resonated instantly. Within months, 25% of Y Combinator's Winter 2025 batch had codebases that were 95% or more AI-generated. Cursor reached a $9.9 billion valuation. OpenAI reached a reported $3 billion deal to acquire Windsurf. The entire developer tooling ecosystem reorganized around the premise that AI would write the code and humans would provide the direction.

Then reality hit.

By February 2026 — exactly one year later — Karpathy himself declared vibe coding "passé." In its place, he proposed something more nuanced: agentic engineering — orchestrating AI agents with oversight and scrutiny.

That twelve-month arc, from surrender to discipline, contains everything you need to understand about where AI-assisted development is actually heading.

The Vibe Coding Hangover

The data on vibe-coded applications tells a consistent story: impressive speed, concerning quality.

CodeRabbit's analysis of 470 GitHub pull requests found that AI co-authored code contains 1.7x more major issues and 2.74x more security vulnerabilities than human-written code. These aren't edge cases or nitpicks — they're the kind of issues that cause production outages, data breaches, and silent data corruption.

Lovable, one of the most popular AI app generators, produced 1,645 apps in a study period. 170 of them — roughly 10% — had identifiable security issues. Not bugs. Security vulnerabilities. In generated code that users were deploying without review.

"Development hell" became the recurring phrase among teams trying to maintain vibe-coded applications. The code works. Until it doesn't. And when it breaks, no one on the team understands why it was written that way, because no one wrote it — the AI did, and no one read it.

The pattern repeated across the industry: fast prototype, impressive demo, gradual accumulation of technical debt, then a reckoning when the application needs to scale, handle edge cases, or pass a security audit.

Vibe coding's promise — that you could build without understanding — turned out to be vibe coding's trap. You can build without understanding. You just can't maintain without understanding. And maintenance is where software spends 80% of its life.

Why Vibe Coding Worked (Briefly)

Before we bury vibe coding entirely, it's worth understanding why it worked as well as it did, for as long as it did.

The prototype-to-demo pipeline accelerated dramatically. Ideas that used to take weeks to prototype could be functional in hours. For founders validating concepts, this was genuinely transformative. You could test ten ideas in the time it previously took to test one.

Non-technical founders could build. For the first time, people with domain expertise but no programming background could create functional software. This unlocked a wave of domain-specific applications built by people who deeply understood the problem space — even if they couldn't write a for loop.

Boilerplate disappeared. The tedious, repetitive code that every application needs — authentication flows, CRUD operations, form validation, API integrations — could be generated reliably. Developers were freed from the parts of coding that nobody enjoys.

Learning accelerated. Junior developers could see working implementations of concepts they were studying. The gap between "I understand the theory" and "I can see it working" collapsed.

These benefits are real and lasting. They didn't disappear when vibe coding's limitations became apparent. What disappeared was the illusion that these benefits were sufficient for production software.

The Karpathy Pivot

When Karpathy proposed agentic engineering as vibe coding's successor, he wasn't rejecting AI-assisted development. He was maturing it.

The key distinction: vibe coding is about surrender — giving up control to the AI and accepting whatever comes back. Agentic engineering is about leverage — using AI's capabilities while maintaining human judgment at critical decision points.

In practice, agentic engineering means:

Setting up specialized AI agents for specific tasks rather than using one general-purpose AI for everything. A code generation agent. A testing agent. A security review agent. A documentation agent. Each optimized for its domain, each producing output that can be reviewed independently.

Maintaining architectural control. The human defines the system architecture, the component boundaries, the data flow, and the integration patterns. AI agents work within that architecture, not around it. The human decides what to build; the AI helps build it.

Reviewing output critically. Not blindly accepting generated code, but not line-by-line reviewing every semicolon either. The skill is knowing what to review: security-sensitive code, data handling logic, architectural decisions, error handling, and edge cases. Let the AI write the boilerplate. Read the parts that matter.

Intervening at decision points. AI agents can generate options, but humans choose between them at architectural forks, trade-off decisions, and anywhere the choice has long-term implications. The AI proposes; the human disposes.
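The workflow described above can be sketched in miniature. Everything here is illustrative: the `call_model` stub stands in for a real model API, the agent prompts are invented, and the human gate is reduced to a callback.

```python
from dataclasses import dataclass
from typing import Callable

def call_model(system_prompt: str, task: str) -> str:
    # Hypothetical stub; in practice this wraps your model provider's API.
    role = system_prompt.split(":")[0]
    return f"[{role} output for: {task}]"

@dataclass
class Agent:
    """A specialized agent: one role, one prompt, independently reviewable output."""
    name: str
    system_prompt: str

    def run(self, task: str) -> str:
        return call_model(self.system_prompt, task)

# One agent per domain rather than one general-purpose AI for everything.
codegen = Agent("codegen", "Code generator: implement within the given architecture.")
tester = Agent("tester", "Test writer: produce edge-case tests for the diff.")
security = Agent("security", "Security reviewer: flag authn/authz and data-handling risks.")

def pipeline(task: str, approve: Callable[[str], bool]) -> dict:
    """Run the agents in sequence; a human decision point gates the result."""
    code = codegen.run(task)
    results = {
        "code": code,
        "tests": tester.run(code),
        "security_review": security.run(code),
    }
    # The human intervenes here: the AI proposes, the human disposes.
    results["approved"] = approve(results["security_review"])
    return results
```

The point of the structure is that each agent's output can be inspected on its own, and the approval step is an explicit function a person owns rather than an implicit "accept whatever comes back."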

This is a fundamentally different mental model from vibe coding's. Vibe coding said: "Trust the AI." Agentic engineering says: "Use the AI, but verify."

The Decision Framework

Not everything needs the same level of human oversight. The value of agentic engineering is knowing when to let the AI run and when to take the wheel.

High AI Autonomy (Let the AI Lead)

Prototyping and exploration. When you're exploring ideas, testing concepts, or building throwaway prototypes, maximum AI autonomy makes sense. Speed matters more than quality. If the prototype fails, you throw it away. This is where vibe coding's instincts are correct.

Boilerplate and scaffolding. Authentication flows, CRUD APIs, form components, database schemas for well-understood patterns. These are solved problems with well-known solutions. AI generates them reliably, and the downside of imperfection is low.

Test generation. AI is surprisingly good at generating test cases, including edge cases that humans might miss. The risk is low because tests are verification tools — a bad test fails visibly, unlike bad production code that fails silently.

Documentation. Generating API documentation, code comments, README files, and onboarding guides from existing code. AI excels here because the task is descriptive, not generative.
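To make the test-generation point concrete, here is the kind of edge-case suite an agent will readily produce on request. The `slugify` function and its tests are invented for illustration; the thing to notice is that a bad test in this set would fail visibly the moment it runs.

```python
import re

def slugify(title: str) -> str:
    """Toy function under test: lowercase, hyphenate, drop other characters."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# The kind of edge cases an AI agent tends to enumerate unprompted:
def test_basic():
    assert slugify("Hello World") == "hello-world"

def test_punctuation_collapses():
    assert slugify("C++ & Rust!") == "c-rust"

def test_empty_and_whitespace():
    assert slugify("") == ""
    assert slugify("   ") == ""

def test_leading_trailing_separators():
    assert slugify("--Already--Slugged--") == "already-slugged"
```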

High Human Oversight (Keep Control)

Architecture decisions. How components connect. Where data lives. What scales independently. These decisions have long-term consequences that compound — a bad architectural choice made by AI becomes progressively more expensive to fix.

Security implementation. Authentication, authorization, encryption, input validation, secret management. The CodeRabbit data showing 2.74x more security vulnerabilities in AI code makes this non-negotiable. Every security-related code path needs human review.

Data handling and privacy. How personal data flows through the system, what gets stored, what gets logged, what gets shared with third parties. Privacy violations are existential risks, and AI consistently underperforms on the nuanced judgment these decisions require.

Production readiness. Error handling, graceful degradation, monitoring, alerting, rollback procedures. The difference between a demo and a production system is entirely in how it handles failure, and AI-generated code notoriously ignores failure modes.
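A concrete instance of "looks right, is wrong" in the security category: a token check of the sort code generators plausibly emit, next to the constant-time version a human security review would insist on. Both functions are illustrative, not taken from any real codebase.

```python
import hmac

def verify_token_naive(provided: str, expected: str) -> bool:
    # Looks correct and passes every functional test, but string equality
    # short-circuits on the first mismatched byte, leaking timing information.
    return provided == expected

def verify_token(provided: str, expected: str) -> bool:
    # What review replaces it with: a constant-time comparison.
    return hmac.compare_digest(provided.encode(), expected.encode())
```

The two functions return identical results on every input, which is exactly why no test suite catches the difference; only a reviewer who knows what to look for does.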

Collaborative Zone (Both)

Testing strategy. AI generates test cases, humans define what needs testing and review the coverage. Humans ask: "What could go wrong?" AI asks: "Let me generate 50 test cases for that scenario."

Performance optimization. AI identifies bottlenecks and proposes solutions; humans evaluate trade-offs (memory vs. speed, complexity vs. performance, cost vs. latency).

Code review. AI tools flag issues, suggest improvements, and check for common patterns. Humans evaluate whether the suggestions make sense in context, whether the flagged issues are real problems, and whether the "improvements" actually improve anything.
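One mechanical slice of this division of labor, deciding which changed files require mandatory human review, can be sketched as path-based triage. The patterns below are hypothetical examples, not a recommended ruleset:

```python
from fnmatch import fnmatch

# Illustrative rules: security-sensitive paths always get a human reviewer;
# everything else can go through AI review first.
MUST_REVIEW = ["*auth*", "*payment*", "*migration*", "*crypto*", "*.sql"]

def review_tier(path: str) -> str:
    """Return 'human' for files that need mandatory human review, else 'ai-first'."""
    if any(fnmatch(path.lower(), pattern) for pattern in MUST_REVIEW):
        return "human"
    return "ai-first"
```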

The Quality Tax

There's a hidden cost to vibe coding that most teams discover too late: the quality tax.

When AI generates code without human oversight, it produces code that works but isn't good. The patterns are inconsistent. The error handling is superficial. The naming conventions vary. The abstractions are either too deep or too shallow. Unneeded dependencies get imported. Configuration is hardcoded where it should be parameterized.

None of these individually are catastrophic. But they accumulate. And they create a codebase that becomes progressively harder to modify, debug, and extend. Every change requires understanding code that no one on the team wrote or reviewed. Every bug requires tracing through logic that no one consciously designed.

This is the quality tax: the ongoing cost of maintaining code that was generated fast but not generated well. For a prototype that lives for two weeks, the tax is zero. For a production system that lives for two years, the tax compounds to the point where it's cheaper to rewrite than to maintain.
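One recurring line item in that tax is hardcoded configuration. A minimal sketch, with invented names and defaults, of the shape unreviewed output often takes and the parameterized form a review pushes it toward:

```python
import os
from dataclasses import dataclass

def connect_settings_naive() -> dict:
    # Typical unreviewed output: environment baked in at the call site,
    # so staging, production, and tests all share one hardcoded value.
    return {"host": "db.internal", "port": 5432, "timeout": 30}

@dataclass
class DbConfig:
    """After review: configuration parameterized and sourced from the environment."""
    host: str
    port: int
    timeout: int

    @classmethod
    def from_env(cls) -> "DbConfig":
        return cls(
            host=os.environ.get("DB_HOST", "localhost"),
            port=int(os.environ.get("DB_PORT", "5432")),
            timeout=int(os.environ.get("DB_TIMEOUT", "30")),
        )
```

Neither version is wrong on day one; the difference only shows up months later, when the hardcoded variant has been copied into a dozen files.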

Agentic engineering addresses this directly. By maintaining human oversight at critical points — architecture, security, data handling, production readiness — it captures AI's speed advantage without accumulating the quality tax that makes vibe-coded applications unmaintainable.

How AI Gens Ventures Are Built

At AI Gens, we adopted agentic engineering principles not because we read about them, but because we lived through the vibe coding hangover ourselves.

Our early ventures experimented with maximum AI code generation. The speed was intoxicating. The demo readiness was unprecedented. And then the production deployments started, and we encountered exactly the problems the data predicts: security issues, maintenance nightmares, and "development hell" when trying to evolve features that AI had generated.

Now, every AI Gens venture follows a discipline:

Architecture is human-designed. Before any AI touches code, a human architect defines the system boundaries, data model, integration points, and scaling strategy. This document becomes the constraints within which AI agents operate.

Security is human-reviewed. Every authentication flow, every authorization check, every data access pattern is reviewed by a human with security expertise. We've seen too many AI-generated "authentication" implementations that look right and are terrifyingly wrong.

AI handles the volume. Within the human-defined architecture, AI agents generate implementation code, tests, documentation, and boilerplate. This is where the speed advantage is enormous and the risk is manageable.

Production readiness is human-verified. Before any deployment, humans verify error handling, monitoring, alerting, and graceful degradation. The question isn't "does it work?" — it's "what happens when it doesn't work?"

This discipline adds time to initial development — roughly 20-30% more than pure vibe coding. But it eliminates the quality tax that makes vibe-coded applications progressively more expensive to maintain. Over a twelve-month product lifecycle, agentic engineering is dramatically cheaper than vibe coding. The upfront investment in oversight pays for itself within the first production incident it prevents.

The Skill Shift

The evolution from vibe coding to agentic engineering represents a skill shift that every builder needs to internalize.

Vibe coding skills: Prompt writing. Tool selection. Speed of iteration. Comfort with not understanding the code. Ability to evaluate output by running it rather than reading it.

Agentic engineering skills: System architecture. Security awareness. Quality judgment. The ability to read AI-generated code critically — not every line, but the lines that matter. Understanding when AI output is good enough and when it's dangerously wrong. Managing multiple AI agents as a workflow rather than using one AI as a magic box.

The second set is harder. It requires the kind of engineering judgment that comes from having built, maintained, and debugged production systems. This is why Karpathy — who has deep engineering experience — could see the limitations of vibe coding clearly while developers with less experience saw only the magic.

And it's why, paradoxically, the rise of AI code generation makes experienced engineers more valuable, not less. Anyone can prompt an AI to write code. Only experienced engineers know which parts of the output to trust, which to rewrite, and which to throw away entirely.

That judgment — the ability to orchestrate AI with discipline — is what separates building fast from building something that lasts. It's the difference between a prototype and a product. Between a demo and a company.

Between vibing and engineering.