Extended Context 2.0: Agents Need Architecture, Not Just Note-Taking

[Image: Silhouette of K in Blade Runner 2049 looking at the city from a rooftop, symbolizing knowledge architecture for AI agents. Caption: Observing the city]

This guide is an improved version of what I proposed in "Extended Context. Agents also need to take notes" and "Guide: Developing Quality Software Assisted by AI Agents".

When editors with autonomous AI agents arrived (Cursor, GitHub Copilot, Claude Code), many of us bought into the promise: "write the perfect prompt and the agent will generate the complete feature in one shot". Reality was different: without structure or constraints, we ended up spending more time fixing bugs than writing code.

"Vibe-coding", trusting the agent to "get the vibe" of your project, works well for quick prototypes, but in real, complex projects it creates more problems than solutions:

  • Lost decisions: Why did we choose this architecture two weeks ago?
  • Constant hallucinations: AI invents APIs, mixes frameworks, ignores constraints
  • Monolithic work: Impossible to divide features into trackable tasks
  • Continuous improvisation: Each session starts from scratch

A few months ago I documented a solution (v1): a structured workflow (analysis → solution → backlog → execution → saved progress) with four context files that turned vibe-coding into trackable engineering. It worked, but there was friction: manually repeating the same prompts in each session was painful.

This guide presents the evolution of that system (v2). It leverages Cursor's rules system to automate prompts and converts the workflow into reusable protocols. But while designing it, a universal pattern emerged: separating reusable knowledge (rules) from temporary memory. This goes beyond saving you from writing the same prompts: you can reuse knowledge across different projects and keep an organized record of everything you do. And the pattern works with any AI tool, not just Cursor.

Let's see how to build that architecture.

Problem with v1

v1 used a four-file system in context/:

  • 01-expert.md - Expert profile
  • 02-analysis.md - Code analysis
  • 03-plan.md - Implementation plan
  • 04-backlog.md - Pending tasks

Limitations in Real Projects

Mixed knowledge: 01-expert.md combines general principles (KISS, DRY) with project-specific constraints (stack, versions, critical dependencies). It's impossible to reuse the principles between projects without dragging along irrelevant configuration.

Duplication across projects: Each project replicates common knowledge. If you work on three WordPress projects, the same security guidance (sanitize_*, nonces) is copied three times; when you update a best practice, you have to sync it by hand.

No history management: v1 gives you three options, all bad:

  • Overwrite 02-analysis.md → lose previous decisions
  • Accumulate analyses in the same file → grows uncontrollably, impossible to track when each decision was made
  • Create multiple files (02-analysis-feature-X.md) → mess, no consistent naming

In all cases: Why did you reject architecture X two weeks ago? There's no systematic way to retrieve it.

No modularity: Switching from WordPress to React requires rewriting the entire 01-expert.md. You can't maintain common modules (principles, patterns) and only swap the technical expert.

v1 had two fundamental problems: (1) mixed knowledge in a single file (01-expert.md), and (2) manual execution: each action required writing out the complete prompt ("analyze this file and save to 02-analysis.md", "read 04-backlog.md, execute tasks 1-3, update state"). v2 solves both: it separates knowledge into modular rules/ and automates execution through invocable protocols in rules/utils/ (@analysis, @execute).

Solution: Rules/Memory Separation

The solution replicates how our brain works: we separate permanent knowledge (language, math, skills) from episodic memory (what we did yesterday, decisions from last week). But also, our knowledge isn't monolithic: we have specialized modules (JavaScript, WordPress, KISS principles) that we activate based on context.

v2 implements this dual architecture:

  1. Knowledge/memory separation: Reusable rules vs temporary memory
  2. Knowledge modularization: Experts, guidelines, and executable protocols

.cursor/
├── rules/          # Reusable and modular knowledge
│   ├── experts/    # Technical roles (WordPress, React, Python...)
│   ├── guidelines/ # Principles and constraints (KISS, project-specific)
│   └── utils/      # Executable protocols (analysis, backlog, commit)
└── memory/         # Temporary experiences
    └── YYYYMMDD-VV-description.md

Key Concept: alwaysApply

The rules/memory system needs an activation mechanism: which knowledge does Cursor load automatically, and which do you invoke on demand? Without it, you'd have 100 rules competing for limited context.

The alwaysApply flag solves this:

  • alwaysApply: true → Cursor loads them automatically in every conversation
    • Use for: Base experts, critical project guidelines
    • Example: wordpress-classic-themes.mdc, constraints-balneario.mdc
  • alwaysApply: false → User invokes them with @ as needed
    • Use for: On-demand protocols, specific tools
    • Example: @analysis, @backlog
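
Besides alwaysApply, Cursor's rule frontmatter also accepts a description and glob patterns so a rule can auto-attach when matching files enter the conversation. A minimal sketch (field names follow Cursor's .mdc rule format; the values are hypothetical, and the exact globs syntax may vary by version):

---
description: WordPress theme security conventions
globs: **/*.php
alwaysApply: false
---
# Rule content here

This guide relies only on alwaysApply, but the extra fields help when you want a rule scoped to part of a codebase.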

Rules: Reusable Knowledge

Experts - Technical/domain expert profiles:

rules/experts/wordpress-classic-themes.mdc

---
alwaysApply: true
---
# WordPress Classic Expert
PHP + vanilla JS specialist. WordPress core APIs, theme dev.

## Stack
Backend: PHP 7.4+, WP Core APIs | Frontend: Vanilla JS, HTML5, CSS3

## Core Patterns
- Security: sanitize_*, esc_*, nonces
- Performance: transients (12h), conditional enqueue
- Translation: __(), _e() with text domain

Guidelines - Code principles and project constraints:

rules/guidelines/constraints-balneario.mdc

---
alwaysApply: true
---
# Balneario Theme
Text domain: themefront v0.1.2
Stack: Node | SASS 1.67.0 | PHP 7.4+ | jQuery 3.x
Build: npm run watch:sass (never edit compiled CSS)

Critical: ACF PRO (18 groups), WooCommerce, WPML
Core Systems: PDF Generator (pdf-generator.php), WooCommerce integration

rules/guidelines/kiss.mdc

---
alwaysApply: true
---
# KISS Principle
- Direct solutions, avoid over-engineering
- Simple patterns, minimal dependencies
- Single responsibility per function

Rule: If you can't explain it in 2-3 sentences, it's too complex.

The KISS principle is especially important in AI-assisted development: it's easier to ask the agent to add complexity to a simple solution than to spend time working out what's excessive in an over-engineered proposal (LLMs tend to generate more complex code than necessary).

Utils - Invokable protocols:

rules/utils/analysis.mdc → Analyzes code/folders and saves to memory

---
alwaysApply: false
---
# Analysis Protocol
Trigger: @file/@folder or "analyze"
1. Acknowledge → 2. Analyze → 3. Save to .cursor/memory/YYYYMMDD-VV-*.md → 4. Confirm

rules/utils/solution.mdc → Proposes solution and saves to memory

---
alwaysApply: false
---
# Solution Protocol (KISS)
Trigger: User requests solution
1. Acknowledge → 2. Propose simplest solution → 3. Save to memory → 4. Confirm
Principles: Simplicity > complexity, proven patterns > novel

rules/utils/backlog.mdc → Converts solution into atomic trackable tasks

---
alwaysApply: false
---
# Backlog Protocol
Solution → atomic tasks → persistent backlog (YYYYMMDD-VV-backlog-*.md)

Format: [Phase.Task#] ⏳/🔄/✅/⚠️ Title
> What to do | Date completed | Work done

Rules: 2-4h max, sequential, testable
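
To make the format concrete, a generated backlog entry might look like this (task titles and dates are hypothetical; the real file follows whatever solution it was generated from):

[1.1] ✅ Create invoice settings page
> Add an admin options page for invoice numbering | 2025-10-15 | Page and default options added
[1.2] ⏳ Hook invoice generation to order completion
> Generate the invoice when WooCommerce marks an order as completed | - | -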

rules/utils/execute.mdc → Executes backlog tasks and updates state

---
alwaysApply: false
---
# Task Execution Protocol
Load backlog → parse scope → execute → update status → stop at boundary
Input: [1.1], [2.3,2.4], Phase 2, continue

rules/utils/commit.mdc → Generates conventional commits and updates backlog

---
alwaysApply: false
---
# Commit Protocol
Analyze staged → generate conventional commit → approve → execute → update backlog
Format: [JIRA-XXX] type(scope): subject (English, imperative, max 72 chars)
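
As an illustration, a commit produced by this protocol could look like the following (the JIRA key and scope are hypothetical):

[BAL-399] feat(invoicing): generate invoice on order completion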

Memory: Historical Record

The utils protocols automatically generate files in memory/ following the naming scheme YYYYMMDD-VV-description.md (VV = daily sequence: 01, 02, 03...):

memory/
├── 20250919-01-analisis-pdf-generator.md
├── 20250919-02-sistema-pdfs-emails-finalizado.md
├── 20251015-01-analisis-completo-balneario-theme.md
├── 20251015-02-solucion-facturacion-automatica-399.md
└── 20251015-03-backlog-facturacion-automatica-399.md

Advantages: chronological ordering, multiple entries/day, traceability, visible relationships.

This solves v1's three-bad-options problem: the chronological naming (YYYYMMDD-VV) accumulates history without losing control, keeps multiple files organized under a consistent scheme, and preserves decisions instead of overwriting them.

Workflow in Cursor

Note: Commands with @ invoke specific rules. For example, @analysis loads rules/utils/analysis.mdc without needing to write the full path. This is part of Cursor's native syntax.

This section describes the general workflow for AI-assisted development: separating reusable knowledge from temporary memory, following the cycle analysis → solution → backlog → execution → commit → continue. We use Cursor and its rules system as an implementation example, but the workflow is applicable to any AI tool capable of injecting context.

Below you'll see how the workflow works in practice. If you later want to implement it in your projects with Cursor, check the Implementation and Configuration guide below.

Feature Development: Automatic Invoicing

1. Analysis

@analysis checkout.php

→ Generates: 20251015-01-analisis-checkout.md

2. Solution

@solution
Implement automatic WooCommerce invoicing

→ Generates: 20251015-02-solucion-facturacion.md

3. Backlog

@memory/20251015-02-solucion-facturacion
@backlog

→ Generates: 20251015-03-backlog-facturacion.md

4. Execution

@memory/20251015-03-backlog-facturacion
@execute [1.1]

→ Updates: [1.1] ⏳ → ✅

5. Commit

@memory/20251015-03-backlog-facturacion
@commit

→ Creates commit + updates backlog

6. Continue work (new session)

@memory/20251015-03-backlog-facturacion
Resume work

→ Loads context + shows next tasks

Implementation and Configuration

1. Install the system (5 min)

This system is available at github.com/arinspunk/extended-context.

Two ways to configure:

Option A - Contribute to shared repo (recommended if you want to participate in maintenance):

cd your-project
git clone https://github.com/arinspunk/extended-context.git .cursor

Maintains connection to the repo to receive updates and contribute improvements. Update whenever you want: cd .cursor && git pull

Option B - Snapshot for your project (recommended for daily work):

git clone https://github.com/arinspunk/extended-context.git temp
mkdir -p your-project/.cursor
cp -r temp/rules temp/memory your-project/.cursor/
rm -rf temp

Independent copy that you customize freely without affecting the original repo, ideal for versioning together with your project code.

Migration from v1: Use Option B to integrate with your existing structure.
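
If you're migrating, a rough mapping of the v1 files onto the v2 structure looks like this (adapt it to your own content):

context/01-expert.md   → split into rules/experts/*.mdc (reusable) + rules/guidelines/constraints-{project}.mdc (project-specific)
context/02-analysis.md → memory/YYYYMMDD-VV-analysis-*.md (one file per analysis)
context/03-plan.md     → memory/YYYYMMDD-VV-solution-*.md
context/04-backlog.md  → memory/YYYYMMDD-VV-backlog-*.md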

2. Initial Configuration (15 min)

Prerequisites:

  • Cursor installed and project open
  • rules/memory system cloned to .cursor/ (previous step completed)

Step 1: Identify your stack (2 min)

Review rules/experts/ and activate the expert that matches your project:

  • WordPress → wordpress-classic-themes.mdc
  • React → react-typescript.mdc (if exists)
  • Python Backend → python-fastapi.mdc (if exists)

If your stack doesn't exist: copy the closest expert and adapt it.
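
For example, if you work with React and that expert doesn't exist yet, a starting point could be a hypothetical rules/experts/react-typescript.mdc modeled on the WordPress one:

---
alwaysApply: true
---
# React + TypeScript Expert
SPA specialist. React 18, TypeScript, modern tooling.

## Stack
Frontend: React 18, TypeScript 5 | Build: Vite | Tests: Vitest

## Core Patterns
- Components: function components + hooks, no class components
- State: local state first, shared store only when needed
- Types: strict mode, avoid any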

Step 2: Activate project constraints (3 min)

  1. Copy rules/guidelines/constraints-template.mdc to constraints-{project}.mdc
  2. Complete:
    • Technical stack (specific versions)
    • Critical dependencies (libraries that CANNOT be changed)
    • Legacy systems you must respect
  3. Change alwaysApply: false → true

Step 3: Activate base principles (1 min)

Activate kiss.mdc by changing alwaysApply: false → true

Step 4: Verify loading (1 min)

Open Cursor chat (Cmd+L) and type:

What experts and guidelines do you have active?

Cursor should list the files with alwaysApply: true.

Step 5: First execution (8 min)

Test the complete workflow with a real file:

@analysis src/components/Header.tsx

If it correctly generates memory/YYYYMMDD-01-*.md, the system is operational.

Note for other tools: This system is tool-agnostic. The rules/memory conceptual separation can be implemented in any tool capable of injecting context (GitHub Copilot, Claude Code, etc.).
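
For instance, with Claude Code you could point a project-level CLAUDE.md at the same files (a minimal sketch; the layout is the one from this guide, the wording is up to you):

# Project knowledge
- Read the rules in .cursor/rules/experts/ and .cursor/rules/guidelines/ before working.
- The protocols in .cursor/rules/utils/ describe how to analyze, plan, execute, and commit.
- Save analyses, solutions, and backlogs to .cursor/memory/YYYYMMDD-VV-description.md.

GitHub Copilot offers a comparable entry point with repository custom instructions; the principle is the same: always load the reusable rules, and reference memory files on demand.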

Once the system is configured, the real work begins: feeding the rules with project-specific knowledge and your team's usual practices, while memory records the work done with the AI (analyses, solution proposals, tasks, and progress). In practice, this transforms how the AI generates code: solutions stay coherent with the project architecture and scale with it. It also documents decisions, which makes later debugging and knowledge transfer between developers much easier.

Conclusion

Vibe-coding works for prototypes, but in real projects it creates more problems than solutions: lost decisions, constant hallucinations, untrackable work. v1 solved this with a structured workflow, but required manually repeating prompts in each session.

v2 eliminates that friction by automating the protocols and, in doing so, reveals a universal pattern: separating reusable knowledge (rules) from temporary memory. This separation, the same one your brain makes between skills and experiences, turns any AI tool into a system where knowledge persists, evolves, and is reused systematically.

It doesn't eliminate context window limitations, but transforms them into a structured framework for building software in a trackable and scalable way.

Next Steps

  1. If you use Cursor: Clone the repo and configure your first project
  2. If you use another tool: Adapt the rules/memory separation to your system
  3. Share your implementation: Created experts for your stack? Contribute to the repo
  4. Share results: Improved the workflow? Document your findings to collectively refine the system