HN: LLM Agents in Production - Real-World Deployments

Q1 2026 saw a shift from agent demos to production deployments, with the HN community sharing candid lessons.

The Agent Hype Cycle

Year	Dominant Narrative
2023	“Agents are the future!”
2024	“Agents don’t work reliably”
2025	“Agents work for specific tasks”
2026	“Agents in production - here’s what broke”

Production Use Cases

What Works in Production

Use Case	Success Rate	Key Success Factor
Web research	78%	Tool reliability
Data extraction	85%	Structured output
Code review	72%	Context quality
Customer support	68%	Escalation paths
Code generation	81%	Narrow scope

What Still Breaks

HN practitioners reported consistent failure modes:

Multi-step reasoning: Accuracy degrades past 10 steps
Tool calling: API inconsistencies cause 15-30% failures
Context management: Memory truncation issues
Error recovery: Agents don’t handle edge cases well
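
One way to see why long chains degrade: if every step succeeds independently with the same probability, end-to-end success is that probability raised to the number of steps. This is a simplifying assumption for illustration, not a model taken from the thread, and the 0.95 per-step rate below is made up.

```python
def chain_success(per_step_success: float, steps: int) -> float:
    """End-to-end success rate, assuming independent steps with equal odds."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # 0.95 is a hypothetical per-step rate, chosen only to show the decay.
    for steps in (1, 5, 10, 20):
        print(f"{steps:>2} steps: {chain_success(0.95, steps):.2f}")
```

Under these assumptions a 95% per-step rate leaves roughly 60% end-to-end success at 10 steps, which is one plausible reading of why task decomposition and checkpoints matter.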

Architecture Patterns

Successful Production Architectures

┌─────────────────────────────────────────┐
│         Orchestrator Agent             │
├─────────────────────────────────────────┤
│  ┌─────────┐  ┌─────────┐  ┌─────────┐ │
│  │ Tool 1  │  │ Tool 2  │  │ Tool N  │ │
│  └─────────┘  └─────────┘  └─────────┘ │
├─────────────────────────────────────────┤
│         Memory / State Store            │
├─────────────────────────────────────────┤
│         Human-in-the-Loop Checkpoints   │
└─────────────────────────────────────────┘
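
The layers in the diagram could be wired together roughly as follows. This is a minimal sketch, not an implementation reported on HN: the `Orchestrator` class, the `approve` callback, and the tool names are all hypothetical, and the "state store" is just an in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]


@dataclass
class Orchestrator:
    tools: Dict[str, Tool]                     # Tool 1 .. Tool N
    approve: Callable[[str, str], bool]        # human-in-the-loop checkpoint
    irreversible: frozenset = frozenset()      # tool names that need approval
    state: List[Tuple[str, str, str]] = field(default_factory=list)  # memory / state store

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        """Execute (tool_name, argument) steps in order, recording each one."""
        results = []
        for tool_name, arg in plan:
            if tool_name not in self.tools:
                raise KeyError(f"unknown tool: {tool_name}")
            if tool_name in self.irreversible and not self.approve(tool_name, arg):
                self.state.append((tool_name, arg, "REJECTED"))
                break  # stop the plan when the reviewer declines
            output = self.tools[tool_name](arg)
            self.state.append((tool_name, arg, output))
            results.append(output)
        return results


if __name__ == "__main__":
    agent = Orchestrator(
        tools={"search": lambda q: f"results for {q}", "send_email": lambda body: "sent"},
        approve=lambda name, arg: False,  # stand-in for a real review step
        irreversible=frozenset({"send_email"}),
    )
    print(agent.run([("search", "agent failures"), ("send_email", "summary")]))
    print(agent.state)
```

Keeping the approval check inside the orchestrator, rather than inside each tool, means a new irreversible tool only needs to be added to one set.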

Key Patterns

Task decomposition: Break complex tasks into agent-sized chunks
Checkpointing: Human approval for irreversible actions
Fallback agents: When the primary agent fails, escalate to a human
Verification loops: Test outputs before proceeding
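
A verification loop with a human fallback might look like the sketch below. The names `run_with_verification`, `EscalateToHuman`, and `is_valid_json_object` are hypothetical, and the JSON check is just one example of a verifier. The idea is to retry a bounded number of times and hand off to a person when nothing passes.

```python
import json
from typing import Callable, Optional


class EscalateToHuman(Exception):
    """Raised when no attempt produces output that passes verification."""


def run_with_verification(
    attempt: Callable[[int], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    """Call `attempt` until `verify` accepts its output, then return it."""
    last_error: Optional[Exception] = None
    for n in range(1, max_attempts + 1):
        try:
            output = attempt(n)
        except Exception as exc:  # the tool or model call failed outright
            last_error = exc
            continue
        if verify(output):
            return output
    raise EscalateToHuman(
        f"no verified output after {max_attempts} attempts"
    ) from last_error


def is_valid_json_object(text: str) -> bool:
    """Example verifier: accept only text that parses to a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    # Canned replies stand in for real agent output.
    replies = ["not json", '{"status": "ok"}']
    print(run_with_verification(lambda n: replies[n - 1], is_valid_json_object))
```

The caller decides what escalation means in practice, for example opening a ticket or paging a reviewer.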

Cost Analysis

Production agent costs (median reported):

Task Type	Tokens Used	Cost/Task	Human Time Saved
Research	150K	$0.45	45 min
Coding	80K	$0.24	30 min
Writing	40K	$0.12	20 min
Analysis	200K	$0.60	60 min
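
The arithmetic behind the table can be checked directly. The sketch below derives the token price each row implies and the cost per hour of human time saved. The `monthly_cost` helper and its task volume are hypothetical additions for projecting spend at scale.

```python
# Rows copied from the table: (task type, tokens used, cost per task in USD, minutes saved)
ROWS = [
    ("Research", 150_000, 0.45, 45),
    ("Coding", 80_000, 0.24, 30),
    ("Writing", 40_000, 0.12, 20),
    ("Analysis", 200_000, 0.60, 60),
]


def usd_per_million_tokens(tokens: int, cost_usd: float) -> float:
    """Token price implied by one row of the table."""
    return cost_usd / tokens * 1_000_000


def usd_per_hour_saved(cost_usd: float, minutes_saved: float) -> float:
    """Agent spend needed to save one hour of human time."""
    return cost_usd / minutes_saved * 60


def monthly_cost(cost_per_task: float, tasks_per_day: int, days: int = 30) -> float:
    """Projected spend; the task volume is an assumption supplied by the caller."""
    return cost_per_task * tasks_per_day * days


if __name__ == "__main__":
    for name, tokens, cost, minutes in ROWS:
        print(
            f"{name:<9} ${usd_per_million_tokens(tokens, cost):.2f}/M tokens, "
            f"${usd_per_hour_saved(cost, minutes):.2f} per hour saved"
        )
    # 1,000 research tasks per day is a hypothetical volume.
    print(f"Research at 1,000 tasks/day: ${monthly_cost(0.45, 1000):,.2f}/month")
```

Every row works out to the same implied rate of $3.00 per million tokens, so the per-task differences in the table come entirely from token volume.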

Lessons Learned

Top HN recommendations:

Start narrow: “Our best agent does one thing excellently”
Measure everything: “We didn’t know agents failed 20% of the time until we logged it”
Human escalation: “The agent knows when it doesn’t know - use that”
Cost monitoring: “Agents can be surprisingly expensive at scale”
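
The "measure everything" advice can start very small. Below is a minimal sketch of an outcome log that tracks failure rates per task type. `OutcomeLog` is a hypothetical name, and a real deployment would more likely write to an existing metrics or logging system than keep counts in memory.

```python
from collections import Counter, defaultdict


class OutcomeLog:
    """Counts successes and failures per task type."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    def record(self, task_type: str, succeeded: bool) -> None:
        self._counts[task_type]["ok" if succeeded else "failed"] += 1

    def failure_rate(self, task_type: str) -> float:
        counts = self._counts.get(task_type)
        if not counts:
            raise ValueError(f"no runs recorded for {task_type!r}")
        total = counts["ok"] + counts["failed"]
        return counts["failed"] / total


if __name__ == "__main__":
    log = OutcomeLog()
    # Illustrative data only: 8 successes and 2 failures.
    for succeeded in [True] * 8 + [False] * 2:
        log.record("web_research", succeeded)
    print(f"web_research failure rate: {log.failure_rate('web_research'):.0%}")
```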

Keith Kitchen
