Agent Memory Architecture
Five memory layers, two background processes, zero effort.
How a personal AI should handle memory — and why most don't.
Most terminal-based open-source personal AI agents store their entire memory in two markdown files on your hard drive. This is the standard approach.
✗ No background memory processes — writing happens inline
✗ No automatic preference extraction or confidence thresholds
✗ No reflection or consolidation across sessions
✗ No multi-device — memory lives on one machine
✗ No structured data — everything is freeform markdown
✗ No semantic search by default — files loaded as raw text
This simplicity is a real strength for power users who want full control. Two files you can open in any text editor. But the whole file is loaded as raw text, so as your daily log grows, it consumes more and more context tokens. When you switch devices, the memory doesn't follow. When you text your AI from your phone, it has no idea who you are.
Modeled after human cognition — from volatile working memory to permanent life facts.
Working Memory
The last ~10 messages, loaded from the database on every request. History is managed server-side: the API loads it from the DB rather than trusting the client, then truncates it to stay within token budgets.
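As a rough sketch of that loading step, the snippet below keeps the newest messages that fit a budget. The `Message` type, the `load_working_memory` name, the 2,000-token default, and the four-characters-per-token estimate are all assumptions for illustration, not Mio's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); a real system
    # would use the model's own tokenizer.
    return max(1, len(text) // 4)


def load_working_memory(
    history: list[Message],
    max_messages: int = 10,
    token_budget: int = 2000,
) -> list[Message]:
    """Return the most recent messages that fit the budget, oldest first."""
    recent = history[-max_messages:]
    kept: list[Message] = []
    used = 0
    # Walk backwards so the newest messages survive when the budget runs out.
    for msg in reversed(recent):
        cost = estimate_tokens(msg.content)
        if used + cost > token_budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    return kept
```

Because the walk starts from the newest message, a tight budget drops the oldest context first.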
[
{ "role": "user", "content": "Log my lunch — grilled chicken salad, ~450 cal" },
{ "role": "assistant", "content": "Logged! That puts you at 1,280 cal today..." },
{ "role": "user", "content": "How am I tracking this week?" }
]
Short-Term Memory
Five structured fields updated after every conversation turn by a background Haiku call. A rolling summary that always reflects Mio's latest understanding of you.
{
"purpose_and_context": "Helping user track nutrition for a cut phase",
"current_state": "User logged 3 meals today, 1,280 cal so far",
"key_learnings": "Prefers simple logging — photo or one-liner",
"approach_and_patterns": "Responds well to daily summaries at 8pm",
"tools_and_resources": "meals table, body_measurements table"
}
Semantic Memory
Life facts extracted with high confidence: name, timezone, family members, dietary restrictions. Each candidate needs ≥75% confidence and must match a canonical key. A server-side allowlist of those keys prevents hallucinated categories.
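The two gates can be sketched in a few lines of Python. The candidate shape (`key`, `value`, `confidence`) and the `accept_candidates` name are assumptions, and the allowlist here simply reuses the keys from the sample on this page; the real canonical list is not given.

```python
# Assumed allowlist: the keys that appear in the sample on this page.
CANONICAL_KEYS = {
    "personal.name",
    "personal.family",
    "location.timezone",
    "health.diet",
    "health.goal",
    "preferences.units",
}
MIN_CONFIDENCE = 0.75


def accept_candidates(candidates, allowlist=CANONICAL_KEYS, threshold=MIN_CONFIDENCE):
    """Keep only candidates that pass both the allowlist and the confidence gate."""
    accepted = {}
    for candidate in candidates:
        if candidate["key"] not in allowlist:
            continue  # unknown category, possibly hallucinated by the model
        if candidate["confidence"] < threshold:
            continue  # not confident enough to store as a life fact
        accepted[candidate["key"]] = candidate["value"]
    return accepted
```

Checking the allowlist on the server means a model that invents a new category cannot write it, however confident it claims to be.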
{
"personal.name": "Bora",
"personal.family": "Wife: Christina, Son: Kai (3yo)",
"location.timezone": "America/New_York",
"health.diet": "No red meat, intermittent fasting 16:8",
"health.goal": "Cut to 180lbs by April",
"preferences.units": "Imperial for weight, Celsius for temp"
}
Episodic Memory
Durable learnings stored as vectors (LanceDB) in your private Arca vault, as files in S3 that you can export or delete at any time. They are extracted during "sleep-time reflection": a cron-triggered Sonnet call that reviews the last 24 hours, identifies patterns across sessions, and writes durable facts to a memories table. They are retrieved via semantic search.
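To make "retrieved via semantic search" concrete, here is a toy nearest-neighbour lookup in plain Python. It is a stand-in for what a vector store such as LanceDB does at scale, not its API; the `search` function and the `vector` field are hypothetical, and real vectors would come from an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(memories, query_vector, k=3):
    """Return the k memories whose vectors are most similar to the query."""
    ranked = sorted(
        memories,
        key=lambda m: cosine(m["vector"], query_vector),
        reverse=True,
    )
    return ranked[:k]
```

Ranking by similarity rather than keyword overlap is what lets a question about "school pickup" find a memory that mentions pre-K.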
{
"category": "personal",
"content": "User's son Kai started pre-K in September. User often asks about school pickup schedule on Tuesdays and Thursdays.",
"confidence": 0.92,
"created_at": "2025-10-14T04:00:00Z"
}
Procedural Memory
Every skill is two things: a skill file that teaches the agent how to use it, and a data table that holds your actual data. Together they form a self-describing database — stored in your private Arca vault as files in S3 that you own, export, and delete.
You can create skills for anything through conversation: calorie tracking, workouts, recipes, journal entries, bookmarks, weight logs. Two types — tabular skills (DuckDB) for structured data you can query with SQL, and vector skills (LanceDB) for text content you can search by meaning.
food VARCHAR
calories INTEGER
protein_g INTEGER
meal_type VARCHAR
created_at TIMESTAMPTZ
Related to exercises table: net calories = SUM(meals.calories) - SUM(exercises.calories_burned)
User is in America/New_York timezone. Weekly calorie budget is 14,000 calories running Monday–Sunday.
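One way to picture the skill file and data table pairing is a small structure that can emit both the table definition and the context handed to the agent. The `Skill` class and its method names are hypothetical; only the column names, types, and notes come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    table: str
    columns: dict[str, str]  # column name -> SQL type
    notes: list[str] = field(default_factory=list)

    def create_table_sql(self) -> str:
        """DDL for the data table that holds the user's rows."""
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in self.columns.items())
        return f"CREATE TABLE IF NOT EXISTS {self.table} ({cols});"

    def prompt_context(self) -> str:
        """Text that teaches the agent how to use the table."""
        lines = [f"Table {self.table}:"]
        lines += [f"  {name} {sql_type}" for name, sql_type in self.columns.items()]
        lines += [f"Note: {note}" for note in self.notes]
        return "\n".join(lines)


meals = Skill(
    table="meals",
    columns={
        "food": "VARCHAR",
        "calories": "INTEGER",
        "protein_g": "INTEGER",
        "meal_type": "VARCHAR",
        "created_at": "TIMESTAMPTZ",
    },
    notes=[
        "User is in America/New_York timezone.",
        "Weekly calorie budget is 14,000 calories running Monday-Sunday.",
    ],
)
```

Keeping schema and notes in one place is what makes the table self-describing: the same object defines the storage and explains it to the model.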
SELECT food, calories, protein_g, meal_type
FROM meals
WHERE timezone('America/New_York',
               CAST(created_at AS TIMESTAMPTZ))
      >= CURRENT_DATE - INTERVAL 7 DAYS
ORDER BY created_at DESC;
Structured data with typed columns. Query with SQL (aggregations, filters, joins across skills). For anything with numbers: calories, weight, finances, workouts.
Text content with semantic search. Find entries by meaning, not keywords. For anything with prose: journal entries, recipes, meeting notes, bookmarks.
You never have to say “remember this.” Mio already does.
After every conversation turn
A background Haiku call runs after every turn. It summarizes the conversation into five structured fields and proposes preference candidates — each requiring ≥75% confidence to be accepted.
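A minimal sketch of the "reply first, summarize in the background" pattern, using asyncio. `summarize_turn` is a hypothetical stand-in for the background model call, and the `Session` class, the echo reply, and the `drain` helper are assumptions for illustration; only the five field names come from this page.

```python
import asyncio

SUMMARY_FIELDS = (
    "purpose_and_context",
    "current_state",
    "key_learnings",
    "approach_and_patterns",
    "tools_and_resources",
)


async def summarize_turn(previous: dict, user_msg: str, reply: str) -> dict:
    """Hypothetical stand-in for the background model call."""
    await asyncio.sleep(0)  # simulate yielding to the event loop for I/O
    updated = dict(previous)
    updated["current_state"] = f"Last user message: {user_msg}"
    return updated


class Session:
    def __init__(self) -> None:
        self.summary = {name: "" for name in SUMMARY_FIELDS}
        self._tasks: set = set()

    async def _refresh_summary(self, user_msg: str, reply: str) -> None:
        self.summary = await summarize_turn(self.summary, user_msg, reply)

    async def handle_turn(self, user_msg: str) -> str:
        reply = f"Echo: {user_msg}"  # placeholder for the main model response
        # Schedule the summary update without making the user wait for it.
        task = asyncio.create_task(self._refresh_summary(user_msg, reply))
        self._tasks.add(task)  # hold a reference so the task is not dropped
        task.add_done_callback(self._tasks.discard)
        return reply

    async def drain(self) -> None:
        """Wait for pending background work (useful in tests or at shutdown)."""
        await asyncio.gather(*self._tasks)
```

The reply returns before the summary is refreshed, so memory writing adds no latency to the conversation itself.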
Cron — every ~24 hours
Like how your brain consolidates memories during sleep. A Sonnet call reviews the entire day's conversations, identifies cross-session patterns, and writes durable facts to a memories table in your private Arca vault.
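The daily job can be pictured as two steps: select the last 24 hours, then look for patterns that span sessions. In this sketch `reflect` is a toy, hypothetical stand-in for the model call (it only flags topics seen in more than one session), and the `topic` and `session_id` fields are assumed for illustration.

```python
from datetime import datetime, timedelta, timezone


def select_last_day(messages, now, window=timedelta(hours=24)):
    """Keep only messages created inside the reflection window."""
    cutoff = now - window
    return [m for m in messages if m["created_at"] >= cutoff]


def reflect(messages, now):
    """Toy stand-in for the model: flag topics that recur across sessions."""
    sessions_by_topic = {}
    for m in messages:
        sessions_by_topic.setdefault(m["topic"], set()).add(m["session_id"])
    return [
        {
            "category": "pattern",
            "content": f"Recurring topic across sessions: {topic}",
            "created_at": now.isoformat(),
        }
        for topic, sessions in sorted(sessions_by_topic.items())
        if len(sessions) >= 2
    ]


def run_reflection(messages, memories, now=None):
    """One scheduled run: window the day, reflect, append durable facts."""
    now = now or datetime.now(timezone.utc)
    new_facts = reflect(select_last_day(messages, now), now)
    memories.extend(new_facts)
    return new_facts
```

A scheduler such as cron would call `run_reflection` once a day; the function itself holds no schedule.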
Two different philosophies. Terminal-based agents are transparent and simple — two files you can open in any text editor. Mio mirrors how human cognition works.
A terminal-based agent's memory lives on one machine. Text your AI from your phone — it doesn't know you. Switch laptops — memory gone. This is fine for a developer tool that lives in the terminal. It doesn't work for a personal AI that should know you everywhere.
“What were my total calories this week?” With a terminal-based agent, the LLM searches markdown text and guesses. In Mio, it runs a SQL query against a structured table and gives you an exact number. Structured data means real answers, not approximations.
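To illustrate the difference, the sketch below answers that question with an aggregate query. It uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs anywhere; the rows are made-up sample data and the `total_calories` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meals (food TEXT, calories INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO meals VALUES (?, ?, ?)",
    [
        # Made-up sample rows; ISO-8601 strings sort chronologically.
        ("grilled chicken salad", 450, "2025-10-13T12:30:00"),
        ("oatmeal", 380, "2025-10-14T08:00:00"),
        ("salmon bowl", 620, "2025-10-15T19:00:00"),
        ("pasta", 700, "2025-10-06T19:00:00"),  # previous week, excluded below
    ],
)


def total_calories(conn, start, end):
    """Sum calories for meals with start <= created_at < end."""
    row = conn.execute(
        "SELECT COALESCE(SUM(calories), 0) FROM meals "
        "WHERE created_at >= ? AND created_at < ?",
        (start, end),
    ).fetchone()
    return row[0]


# Monday-to-Sunday week, expressed as a half-open date range.
week_total = total_calories(conn, "2025-10-13", "2025-10-20")
remaining = 14_000 - week_total  # weekly budget from the skill notes
```

The database does the arithmetic, so the answer does not depend on the model reading and adding numbers from free text.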
You don't consciously decide which memories to keep. Your brain does it in the background — consolidating while you sleep, strengthening patterns, discarding noise. Mio works the same way. Two background processes handle all memory writing. You just talk.
Every memory Mio creates lives in your private Arca vault — LanceDB for vectors, DuckDB for structured tables, both stored as files in S3. Export them, query them directly, or delete everything with one click. No vendor lock-in, no data hostage situations. If you leave, your data leaves with you.
Five memory layers. Two background processes. Every surface.