Agent Memory Architecture

How Mio remembers

Five memory layers, two background processes, zero effort.
How a personal AI should handle memory — and why most don't.

The baseline: two markdown files

Most terminal-based open-source personal AI agents store their entire memory in two markdown files on your hard drive. This is the standard approach.

MEMORY.md: Long-term facts. The LLM writes preferences and important context here. Loaded at session start.
memory/2025-03-08.md: Daily append-only log. Running notes, day-to-day context. Today + yesterday loaded at session start.
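
As a rough illustration of the baseline described above, the sketch below builds the list of files such an agent might load at session start and concatenates whatever exists. The function names and the `root` directory layout are assumptions made for this example, not the behavior of any specific agent.

```python
from datetime import date, timedelta
from pathlib import Path


def session_start_files(root: Path, today: date) -> list[Path]:
    """Return the memory files a baseline agent would try to load (hypothetical layout)."""
    yesterday = today - timedelta(days=1)
    return [
        root / "MEMORY.md",                               # long-term facts
        root / "memory" / f"{today.isoformat()}.md",      # today's log
        root / "memory" / f"{yesterday.isoformat()}.md",  # yesterday's log
    ]


def load_context(root: Path, today: date) -> str:
    """Concatenate the files that exist; the full text goes into the prompt."""
    parts = [
        path.read_text(encoding="utf-8")
        for path in session_start_files(root, today)
        if path.exists()
    ]
    return "\n\n".join(parts)
```

Because every existing file is read in full, the prompt grows with the log, which is the context cost discussed below.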

No background memory processes — writing happens inline

No automatic preference extraction or confidence thresholds

No reflection or consolidation across sessions

No multi-device — memory lives on one machine

No structured data — everything is freeform markdown

No semantic search by default — files loaded as raw text

This simplicity is a real strength for power users who want full control. Two files you can open in any text editor. But when your daily log grows, the entire file eats context tokens. When you switch devices, the memory doesn't follow. When you text your AI from your phone, it has no idea who you are.

Mio's five memory layers

Modeled after human cognition — from volatile working memory to permanent life facts.

How Mio thinks (layers listed in order of increasing durability)
1. Conversation History (Working Memory): current session
2. Recent Memories (Short-Term Memory): rolling, always current
3. User Preferences (Semantic Memory): permanent until changed
4. Long-Term Memory (Episodic Memory): permanent, exportable
5. Skill Data (Procedural Memory): permanent, user-owned, exportable
1. Conversation History (Working Memory)

The last ~10 messages are loaded from the database on every request. History is managed server-side: the API reads it from the database rather than accepting it from the client, and truncates it to stay within the token budget.

Lifespan: Current session
Analogy: What you're actively thinking about right now
JSON
[
  { "role": "user", "content": "Log my lunch — grilled chicken salad, ~450 cal" },
  { "role": "assistant", "content": "Logged! That puts you at 1,280 cal today..." },
  { "role": "user", "content": "How am I tracking this week?" }
]
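
The loading rule above (a cap of about ten messages, then truncation to a token budget) could be sketched as follows. This is a minimal illustration under stated assumptions: `Message`, `working_memory`, the 2,000-token default, and the four-characters-per-token estimate are all invented for the example and are not Mio's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    """Very rough stand-in for a tokenizer: about 4 characters per token."""
    return max(1, len(text) // 4)


def working_memory(
    history: list[Message], max_messages: int = 10, token_budget: int = 2000
) -> list[Message]:
    """Keep the newest messages that fit both the message cap and the token budget."""
    kept: list[Message] = []
    used = 0
    for msg in reversed(history[-max_messages:]):  # walk newest to oldest
        cost = estimate_tokens(msg.content)
        if used + cost > token_budget:
            break  # older messages are dropped first
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Walking from newest to oldest means that when the budget runs out, it is the oldest context that gets dropped.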

2. Recent Memories (Short-Term Memory)

Five structured fields updated after every conversation turn by a background Haiku call. A rolling summary that always reflects Mio's latest understanding of you.

Lifespan: Rolling, always current
Analogy: What you were doing 5 minutes ago
JSON
{
  "purpose_and_context": "Helping user track nutrition for a cut phase",
  "current_state": "User logged 3 meals today, 1,280 cal so far",
  "key_learnings": "Prefers simple logging — photo or one-liner",
  "approach_and_patterns": "Responds well to daily summaries at 8pm",
  "tools_and_resources": "meals table, body_measurements table"
}
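
One way to keep this record at exactly five fields is to merge each summarizer update through a fixed field list. The sketch below is an assumption about how that could look: the field names come from the example above, while `apply_summary` and the rule that blank values keep the previous text are hypothetical choices made for illustration.

```python
RECENT_MEMORY_FIELDS = (
    "purpose_and_context",
    "current_state",
    "key_learnings",
    "approach_and_patterns",
    "tools_and_resources",
)


def apply_summary(current: dict[str, str], update: dict[str, str]) -> dict[str, str]:
    """Merge a summarizer update into the rolling record.

    Unknown keys are dropped, and blank or missing values leave the old
    text in place, so the result always has exactly the five fields.
    """
    merged: dict[str, str] = {}
    for field in RECENT_MEMORY_FIELDS:
        new_value = str(update.get(field) or "").strip()
        merged[field] = new_value or current.get(field, "")
    return merged
```

Overwriting field by field, rather than appending, is what keeps the summary rolling instead of growing with every turn.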

3. User Preferences (Semantic Memory)

Life facts extracted with high confidence: name, timezone, family members, dietary restrictions. Each candidate needs ≥75% confidence AND must match a canonical key. A server-side allowlist of keys prevents hallucinated categories from being stored.

Lifespan: Permanent until changed
Analogy: Knowing your own name (facts that rarely change)
JSON
{
  "personal.name": "Bora",
  "personal.family": "Wife: Christina, Son: Kai (3yo)",
  "location.timezone": "America/New_York",
  "health.diet": "No red meat, intermittent fasting 16:8",
  "health.goal": "Cut to 180lbs by April",
  "preferences.units": "Imperial for weight, Celsius for temp"
}
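
The two acceptance conditions (confidence of at least 75% and a key on the allowlist) can be illustrated with a small filter. This is a sketch under assumptions: the allowlist here is just the six keys from the example above, and the candidate shape (`key`, `value`, `confidence`) is hypothetical.

```python
# Assumed allowlist: only the keys shown in the example above.
CANONICAL_KEYS = {
    "personal.name", "personal.family", "location.timezone",
    "health.diet", "health.goal", "preferences.units",
}
MIN_CONFIDENCE = 0.75


def accept_candidates(candidates: list[dict], stored: dict[str, str]) -> dict[str, str]:
    """Return a copy of the stored preferences updated with accepted candidates."""
    result = dict(stored)
    for candidate in candidates:
        if candidate["key"] not in CANONICAL_KEYS:
            continue  # not on the allowlist, possibly a hallucinated category
        if candidate["confidence"] < MIN_CONFIDENCE:
            continue  # below the 75% threshold
        result[candidate["key"]] = candidate["value"]
    return result
```

With this gate, a candidate such as `hobbies.favorite_color` is rejected even at 99% confidence, and a low-confidence timezone guess cannot overwrite the stored one.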

4. Long-Term Memory (Episodic Memory)

Durable learnings are stored as vectors (LanceDB) in your private Arca vault, as files in S3 that you can export or delete at any time. They are extracted during "sleep-time reflection": a cron-triggered Sonnet call that reviews the last 24 hours, identifies patterns across sessions, and writes durable facts to a memories table. Retrieval is by semantic search.

Lifespan: Permanent, exportable
Analogy: "I remember when you said you were building a fitness app"
JSON
{
  "category": "personal",
  "content": "User's son Kai started pre-K in September. User often asks about school pickup schedule on Tuesdays and Thursdays.",
  "confidence": 0.92,
  "created_at": "2025-10-14T04:00:00Z"
}
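
To show what "retrieved via semantic search" means in practice, here is a plain-Python sketch that ranks memories by cosine similarity. It deliberately does not use the LanceDB API; the three-dimensional vectors are toy values chosen by hand, and `search` is a hypothetical helper. A real system would get vectors from an embedding model and delegate the ranking to the vector store.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], memories: list[dict], k: int = 2) -> list[dict]:
    """Return the k memories whose vectors are closest to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m["vector"]), reverse=True)
    return ranked[:k]


# Toy vectors for illustration only.
MEMORIES = [
    {"content": "Son Kai started pre-K in September", "vector": [0.9, 0.1, 0.0]},
    {"content": "User is building a fitness app", "vector": [0.0, 0.2, 0.9]},
    {"content": "School pickup questions on Tue/Thu", "vector": [0.8, 0.3, 0.1]},
]
```

A query vector close to the "school" direction, such as `[1.0, 0.2, 0.0]`, surfaces the two school-related memories and leaves the fitness-app memory out of the context.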

5. Skill Data (Procedural Memory)

Every skill is two things: a skill file that teaches the agent how to use it, and a data table that holds your actual data. Together they form a self-describing database — stored in your private Arca vault as files in S3 that you own, export, and delete.

You can create skills for anything through conversation: calorie tracking, workouts, recipes, journal entries, bookmarks, weight logs. Two types — tabular skills (DuckDB) for structured data you can query with SQL, and vector skills (LanceDB) for text content you can search by meaning.

How a skill works
meals.skill: Skill file with instructions for the agent
name: meals
type: data-table
Schema
food VARCHAR
calories INTEGER
protein_g INTEGER
meal_type VARCHAR
created_at TIMESTAMPTZ
Relationships

Related to exercises table — net calories = SUM(meals.calories) - SUM(exercises.calories_burned)

Notes

User is in America/New_York timezone. Weekly calorie budget is 14,000 calories running Monday–Sunday.

Agent reads skill file, then queries the table
meals table: DuckDB data table holding your actual data
SQL
SELECT food, calories, protein_g, meal_type
FROM meals
WHERE timezone('America/New_York',
  CAST(created_at AS TIMESTAMPTZ))
  >= CURRENT_DATE - INTERVAL 7 DAYS
ORDER BY created_at DESC;
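
The relationship noted in the skill file (net calories = SUM(meals.calories) - SUM(exercises.calories_burned)) can be sketched as a query across both tables. The example uses Python's built-in `sqlite3` as a stand-in for DuckDB so it runs without extra dependencies. The `exercises` columns other than `calories_burned`, the sample rows, and the `net_calories` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the DuckDB tables
conn.executescript("""
    CREATE TABLE meals (food TEXT, calories INTEGER, created_at TEXT);
    CREATE TABLE exercises (name TEXT, calories_burned INTEGER, created_at TEXT);
    -- Made-up sample rows, for illustration only.
    INSERT INTO meals VALUES
        ('grilled chicken salad', 450, '2025-03-08T12:30:00'),
        ('oatmeal', 330, '2025-03-08T08:00:00'),
        ('salmon bowl', 500, '2025-03-07T19:00:00');
    INSERT INTO exercises VALUES
        ('run', 300, '2025-03-08T07:00:00');
""")


def net_calories(conn: sqlite3.Connection, since: str) -> int:
    """SUM(meals.calories) - SUM(exercises.calories_burned) from `since` onward."""
    row = conn.execute(
        """
        SELECT
            (SELECT COALESCE(SUM(calories), 0) FROM meals WHERE created_at >= :since)
          - (SELECT COALESCE(SUM(calories_burned), 0) FROM exercises WHERE created_at >= :since)
        """,
        {"since": since},
    ).fetchone()
    return row[0]
```

Each total is computed in its own subquery instead of joining the two tables, since a join would repeat meal rows once per exercise row and inflate the sums.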
Tabular Skills (DuckDB)

Structured data with typed columns. Query with SQL — aggregations, filters, joins across skills. For anything with numbers: calories, weight, finances, workouts.

Vector Skills (LanceDB)

Text content with semantic search. Find entries by meaning, not keywords. For anything with prose: journal entries, recipes, meeting notes, bookmarks.

Lifespan: Permanent, user-owned, exportable
Analogy: Knowing how to ride a bike (accumulated knowledge you act on)

Two memory processes

You never have to say “remember this.” Mio already does.

Real-Time Summarization

After every conversation turn

Model: Haiku (fast, cheap)
Updates: Recent Memories + Preference Candidates
Latency: Non-blocking (fire and forget)

A background Haiku call runs after every turn. It summarizes the conversation into five structured fields and proposes preference candidates — each requiring ≥75% confidence to be accepted.
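
The "non-blocking, fire and forget" behavior might look like the following asyncio sketch, in which the reply is returned before the summary finishes. Everything here is a placeholder: `summarize_turn` simulates the background model call with a short sleep, and `handle_turn`, `recent_memories`, and `demo` are hypothetical names, not Mio's API.

```python
import asyncio

recent_memories: dict[str, str] = {}
_background: set[asyncio.Task] = set()


async def summarize_turn(user_msg: str, reply: str) -> None:
    """Placeholder for the background summarizer call."""
    await asyncio.sleep(0.01)  # simulates model latency
    recent_memories["current_state"] = f"Last discussed: {user_msg}"


async def handle_turn(user_msg: str) -> str:
    reply = f"Got it: {user_msg}"  # placeholder for the main model's answer
    task = asyncio.create_task(summarize_turn(user_msg, reply))
    _background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(_background.discard)
    return reply  # returned without waiting for the summary


async def demo() -> tuple[str, bool, dict[str, str]]:
    reply = await handle_turn("Log my lunch")
    pending_at_reply = "current_state" not in recent_memories
    await asyncio.gather(*_background)  # only so the demo can observe the result
    return reply, pending_at_reply, dict(recent_memories)
```

Running `asyncio.run(demo())` shows that the summary is still pending when the reply comes back, which is why the user never waits on the memory write.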

Sleep-Time Reflection

Cron — every ~24 hours

Model: Sonnet (stronger, sees patterns)
Reviews: Last 24h across all sessions
Produces: Durable learnings → your Arca vault

Like how your brain consolidates memories during sleep. A Sonnet call reviews the entire day's conversations, identifies cross-session patterns, and writes durable facts to a memories table in your private Arca vault.
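
The first step of such a reflection job, collecting the last 24 hours of messages across all sessions, could be sketched like this. The message shape (`session_id`, `created_at`, `content`) and the `reflection_batch` helper are assumptions for illustration. The model call that turns the batch into durable facts is omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def reflection_batch(
    messages: list[dict], now: datetime, window_hours: int = 24
) -> dict[str, list[str]]:
    """Group messages from the last `window_hours` by session id, oldest first."""
    cutoff = now - timedelta(hours=window_hours)
    by_session: dict[str, list[str]] = defaultdict(list)
    for msg in sorted(messages, key=lambda m: m["created_at"]):
        if cutoff <= msg["created_at"] <= now:
            by_session[msg["session_id"]].append(msg["content"])
    return dict(by_session)
```

Grouping by session keeps each conversation readable while still putting every session from the day in front of the reviewing model at once, which is what makes cross-session patterns visible.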

Side by side

Two different philosophies. Terminal-based agents are transparent and simple — two files you can open in any text editor. Mio mirrors how human cognition works.

| Aspect | Terminal-Based Agents | Mio |
|---|---|---|
| Storage | 2 markdown files on disk | Supabase (JSONB) + Arca vault (LanceDB vectors + DuckDB tables in S3) |
| Memory types | 2 (daily log + MEMORY.md) | 5 distinct layers (working → short-term → semantic → episodic → procedural) |
| Structure | Freeform text, no schema | Typed layers: validated JSON, categorized vectors, SQL tables |
| Retrieval | Entire files loaded as raw text into context | Semantic search, SQL queries, compact JSON injection |
| Memory writing | LLM manually writes to files during session | Background AI: Haiku per turn, Sonnet every 24h |
| Preference extraction | Manual: LLM appends to a file | Automatic with ≥75% confidence + canonical key allowlist |
| Consolidation | None | Sleep-time reflection: Sonnet reviews 24h cross-session |
| Context bloat | Files grow → entire file loaded → eats context | Only relevant memories surface via semantic search |
| Cross-device | No: each machine has its own files | Yes: same memory across all clients |
| Surfaces | Terminal only | Web, iOS, macOS, SMS, WhatsApp, voice, email |
| Data querying | LLM parses markdown text | SQL for structured data + semantic search for text |
| Data ownership | Fully local (your machine, your files) | Your Arca vault: files in S3 you can export or delete anytime |

The multi-surface problem

A terminal-based agent's memory lives on one machine. Text your AI from your phone — it doesn't know you. Switch laptops — memory gone. This is fine for a developer tool that lives in the terminal. It doesn't work for a personal AI that should know you everywhere.

Structured beats freeform for data

“What were my total calories this week?” With a terminal-based agent, the LLM searches markdown text and guesses. In Mio, it runs a SQL query against a structured table and gives you an exact number. Structured data means real answers, not approximations.

Memory should be automatic

You don't consciously decide which memories to keep. Your brain does it in the background — consolidating while you sleep, strengthening patterns, discarding noise. Mio works the same way. Two background processes handle all memory writing. You just talk.

Your memories, your files

Every memory Mio creates lives in your private Arca vault — LanceDB for vectors, DuckDB for structured tables, both stored as files in S3. Export them, query them directly, or delete everything with one click. No vendor lock-in, no data hostage situations. If you leave, your data leaves with you.

An AI that actually knows you.

Five memory layers. Two background processes. Every surface.
