Agent Memory Architecture

How Mio remembers

Five memory layers, two background processes, zero effort.
How a personal AI should handle memory — and why most don't.

The baseline: two markdown files

Most terminal-based open-source personal AI agents store their entire memory in two markdown files on your hard drive. This is the standard approach.

MEMORY.md: Long-term facts. The LLM writes preferences and important context here. Loaded at session start.

memory/2025-03-08.md: Daily append-only log. Running notes, day-to-day context. Today's and yesterday's logs are loaded at session start.

✗ No background memory processes — writing happens inline

✗ No automatic preference extraction or confidence thresholds

✗ No reflection or consolidation across sessions

✗ No multi-device — memory lives on one machine

✗ No structured data — everything is freeform markdown

✗ No semantic search by default — files loaded as raw text

This simplicity is a real strength for power users who want full control. Two files you can open in any text editor. But when your daily log grows, the entire file eats context tokens. When you switch devices, the memory doesn't follow. When you text your AI from your phone, it has no idea who you are.
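To make the baseline concrete, here is a minimal sketch of session-start loading, assuming the file layout described above. The function name `load_baseline_memory` and the `root` directory are illustrative, not taken from any particular agent.

```python
from datetime import date, timedelta
from pathlib import Path


def load_baseline_memory(root: Path, today: date) -> str:
    """Concatenate MEMORY.md plus today's and yesterday's daily logs."""
    paths = [root / "MEMORY.md"]
    for day in (today, today - timedelta(days=1)):
        paths.append(root / "memory" / f"{day.isoformat()}.md")
    # Files are read whole: nothing is filtered, ranked, or summarized.
    parts = [p.read_text(encoding="utf-8") for p in paths if p.exists()]
    return "\n\n".join(parts)
```

Because every file is read in full, the context cost grows with the size of the log, which is the bloat problem this section describes.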

Mio's five memory layers

Modeled after human cognition — from volatile working memory to permanent life facts.

How Mio thinks

The five layers, in order of increasing durability:

Conversation History (Working Memory): current session

Recent Memories (Short-Term Memory): rolling, always current

User Preferences (Semantic Memory): permanent until changed

Long-Term Memory (Episodic Memory): permanent, exportable

Skill Data (Procedural Memory): permanent, user-owned, exportable

Conversation History

Working Memory

The last ~10 messages are loaded from the database on every request. History is managed server-side: the API loads it from the database, not from the client, and truncates it to stay within the token budget.

Lifespan: Current session

Analogy: What you're actively thinking about right now

JSON

[
  { "role": "user", "content": "Log my lunch — grilled chicken salad, ~450 cal" },
  { "role": "assistant", "content": "Logged! That puts you at 1,280 cal today..." },
  { "role": "user", "content": "How am I tracking this week?" }
]
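One way the working-memory window could be built is sketched below. The four-characters-per-token estimate, the default budget of 2,000 tokens, and the function names are assumptions for illustration; the source only states that roughly the last 10 messages are loaded and truncated to a token budget.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def build_history(rows: list[dict], max_messages: int = 10,
                  token_budget: int = 2000) -> list[dict]:
    """Return the newest messages that fit both limits, oldest first."""
    recent = rows[-max_messages:] if max_messages > 0 else []
    window: list[dict] = []
    used = 0
    # Walk backwards from the newest message so the oldest are dropped first.
    for row in reversed(recent):
        cost = estimate_tokens(row["content"])
        if used + cost > token_budget:
            break
        window.append(row)
        used += cost
    window.reverse()
    return window
```

Here `rows` stands for the messages already fetched from the database in chronological order, in the same shape as the JSON example above.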

Recent Memories

Short-Term Memory

Five structured fields, updated after every conversation turn by a background Haiku call. Together they form a rolling summary that always reflects Mio's latest understanding of you.

Lifespan: Rolling, always current

Analogy: What you were doing 5 minutes ago

JSON

{
  "purpose_and_context": "Helping user track nutrition for a cut phase",
  "current_state": "User logged 3 meals today, 1,280 cal so far",
  "key_learnings": "Prefers simple logging — photo or one-liner",
  "approach_and_patterns": "Responds well to daily summaries at 8pm",
  "tools_and_resources": "meals table, body_measurements table"
}
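The five-field record can be modeled as a small typed structure. The sketch below is hypothetical: the field names come from the example above, but the merge rule (overwrite a field only when the summarizer returns a non-empty string, drop unknown keys) is an assumption, not documented behavior.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RecentMemories:
    purpose_and_context: str = ""
    current_state: str = ""
    key_learnings: str = ""
    approach_and_patterns: str = ""
    tools_and_resources: str = ""


def apply_summary(current: RecentMemories, summary: dict) -> RecentMemories:
    """Overwrite known fields with non-empty strings; ignore everything else."""
    allowed = {f.name for f in fields(RecentMemories)}
    updates = {
        key: value.strip()
        for key, value in summary.items()
        if key in allowed and isinstance(value, str) and value.strip()
    }
    # replace() returns a new record, so the previous summary stays intact.
    return replace(current, **updates)
```

Keeping the record immutable makes each turn's update an explicit replacement, which fits the idea of a rolling summary.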

User Preferences

Semantic Memory

Life facts extracted with high confidence: name, timezone, family members, dietary restrictions. Each candidate needs ≥75% confidence and must match a canonical key; a server-side allowlist of keys rejects hallucinated categories.

Lifespan: Permanent until changed

Analogy: Knowing your own name; facts that don't change

JSON

{
  "personal.name": "Bora",
  "personal.family": "Wife: Christina, Son: Kai (3yo)",
  "location.timezone": "America/New_York",
  "health.diet": "No red meat, intermittent fasting 16:8",
  "health.goal": "Cut to 180lbs by April",
  "preferences.units": "Imperial for weight, Celsius for temp"
}
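The two-part gate described above (confidence threshold plus key allowlist) could look roughly like this. The candidate shape (`key`, `value`, `confidence`) and the contents of `CANONICAL_KEYS` are assumptions; the keys are simply the ones shown in the example.

```python
# Hypothetical allowlist, populated with the keys from the example above.
CANONICAL_KEYS = frozenset({
    "personal.name", "personal.family", "location.timezone",
    "health.diet", "health.goal", "preferences.units",
})
MIN_CONFIDENCE = 0.75


def accept_candidates(candidates: list[dict], stored: dict[str, str]) -> dict[str, str]:
    """Return a new preference map with every passing candidate applied."""
    updated = dict(stored)
    for c in candidates:
        # Both conditions must hold: known key AND confidence at or above 75%.
        if c["key"] in CANONICAL_KEYS and c["confidence"] >= MIN_CONFIDENCE:
            updated[c["key"]] = c["value"]
    return updated
```

A confident candidate with an unknown key is rejected just like a low-confidence one, so the extraction model cannot invent new categories.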

Long-Term Memory

Episodic Memory

Durable learnings stored as vectors (LanceDB) in your private Arca vault, as files in S3 that you can export or delete at any time. They are extracted during "sleep-time reflection": a cron-triggered Sonnet call that reviews the last 24 hours, identifies patterns across sessions, and writes durable facts to a memories table. They are retrieved via semantic search.

Lifespan: Permanent, exportable

Analogy: "I remember when you said you were building a fitness app"

JSON

{
  "category": "personal",
  "content": "User's son Kai started pre-K in September. User often asks about school pickup schedule on Tuesdays and Thursdays.",
  "confidence": 0.92,
  "created_at": "2025-10-14T04:00:00Z"
}
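To illustrate retrieval by similarity, the sketch below ranks stored memories against a query with cosine similarity. It deliberately swaps in a toy word-count vector for a real embedding model and a plain list for LanceDB, so it only matches shared words, not meaning; all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(memories: list[dict], query: str, top_k: int = 3) -> list[dict]:
    """Rank stored memories by similarity to the query, best first."""
    q = embed(query)
    scored = [(cosine(embed(m["content"]), q), m) for m in memories]
    scored = [pair for pair in scored if pair[0] > 0]  # drop non-matches
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:top_k]]
```

Only the top-ranked memories would be injected into context, rather than the whole store.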

Skill Data

Procedural Memory

Every skill is two things: a skill file that teaches the agent how to use it, and a data table that holds your actual data. Together they form a self-describing database — stored in your private Arca vault as files in S3 that you own, export, and delete.

You can create skills for anything through conversation: calorie tracking, workouts, recipes, journal entries, bookmarks, weight logs. Two types — tabular skills (DuckDB) for structured data you can query with SQL, and vector skills (LanceDB) for text content you can search by meaning.

How a skill works

meals.skill: Skill file with instructions for the agent

name: meals
type: data-table

Schema

food VARCHAR

calories INTEGER

protein_g INTEGER

meal_type VARCHAR

created_at TIMESTAMPTZ

Relationships

Related to exercises table — net calories = SUM(meals.calories) - SUM(exercises.calories_burned)

Notes

User is in America/New_York timezone. Weekly calorie budget is 14,000 calories running Monday–Sunday.
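As a rough illustration of the "self-describing" idea, the sketch below models a skill file as a small data structure that can produce both the table definition and the instructions an agent reads. The `Skill` class and its methods are hypothetical; the actual skill file format is not specified here beyond the fields shown above.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    type: str
    schema: dict[str, str]                      # column name -> SQL type
    notes: list[str] = field(default_factory=list)

    def create_table_sql(self) -> str:
        """Render the schema as a CREATE TABLE statement."""
        cols = ",\n  ".join(f"{col} {sql_type}" for col, sql_type in self.schema.items())
        return f"CREATE TABLE IF NOT EXISTS {self.name} (\n  {cols}\n);"

    def prompt_context(self) -> str:
        """Render the instructions an agent would read before querying."""
        cols = ", ".join(f"{c} ({t})" for c, t in self.schema.items())
        return "\n".join([f"Table {self.name}: {cols}", *self.notes])


meals = Skill(
    name="meals",
    type="data-table",
    schema={
        "food": "VARCHAR", "calories": "INTEGER", "protein_g": "INTEGER",
        "meal_type": "VARCHAR", "created_at": "TIMESTAMPTZ",
    },
    notes=["User is in America/New_York timezone.",
           "Weekly calorie budget is 14,000 calories running Monday-Sunday."],
)
```

Calling `meals.create_table_sql()` yields the DDL for the data table, while `meals.prompt_context()` yields the text that tells the agent which columns and conventions to use.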

Agent reads skill file, then queries the table

meals table: DuckDB data table holding your actual data

SQL

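-- Meals logged in the last 7 days, newest first.
-- created_at is already TIMESTAMPTZ per the schema, so no cast is needed;
-- timezone() converts it to local wall-clock time in New York.
SELECT food, calories, protein_g, meal_type
FROM meals
WHERE timezone('America/New_York', created_at)
  >= CURRENT_DATE - INTERVAL '7 days'
ORDER BY created_at DESC;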
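To show why a typed table gives exact answers, here is a self-contained sketch that totals a week of calories. It uses Python's built-in `sqlite3` as a stand-in for DuckDB, stores timestamps as UTC ISO strings, and inserts made-up sample rows; none of this reflects Mio's actual storage code.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def calories_since(conn: sqlite3.Connection, since: datetime) -> int:
    """Sum calories for meals logged at or after `since` (UTC)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(calories), 0) FROM meals WHERE created_at >= ?",
        (since.isoformat(),),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE meals (food TEXT, calories INTEGER, protein_g INTEGER, "
    "meal_type TEXT, created_at TEXT)"
)
now = datetime(2025, 3, 8, 18, 0, tzinfo=timezone.utc)
sample_rows = [  # illustrative data only
    ("Grilled chicken salad", 450, 42, "lunch", now - timedelta(hours=5)),
    ("Oatmeal", 300, 10, "breakfast", now - timedelta(days=2)),
    ("Pasta", 700, 25, "dinner", now - timedelta(days=9)),
]
conn.executemany(
    "INSERT INTO meals VALUES (?, ?, ?, ?, ?)",
    [(f, c, p, m, t.isoformat()) for f, c, p, m, t in sample_rows],
)
# Only the two meals inside the 7-day window are counted.
weekly_total = calories_since(conn, now - timedelta(days=7))
```

The result is computed by the database engine from typed columns, so the model does not have to add up numbers found in free text.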

Tabular Skills (DuckDB)

Structured data with typed columns. Query with SQL — aggregations, filters, joins across skills. For anything with numbers: calories, weight, finances, workouts.

Vector Skills (LanceDB)

Text content with semantic search. Find entries by meaning, not keywords. For anything with prose: journal entries, recipes, meeting notes, bookmarks.

Lifespan: Permanent, user-owned, exportable

Analogy: Knowing how to ride a bike; accumulated knowledge you act on

Two memory processes

You never have to say “remember this.” Mio already does.

Real-Time Summarization

After every conversation turn

Model: Haiku (fast, cheap)

Updates: Recent Memories + Preference Candidates

Latency: Non-blocking (fire and forget)

A background Haiku call runs after every turn. It summarizes the conversation into five structured fields and proposes preference candidates — each requiring ≥75% confidence to be accepted.
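The non-blocking, fire-and-forget pattern can be sketched with `asyncio`. The summarizer below is a placeholder that sleeps instead of calling a model, and `handle_turn`, `summarize_turn`, and the `store` dict are hypothetical names used only for this example.

```python
import asyncio

# Strong references to in-flight tasks so they are not garbage-collected.
_background: set[asyncio.Task] = set()


async def summarize_turn(store: dict, user_msg: str, reply: str) -> None:
    """Placeholder for the summarization call; here it just records the turn."""
    await asyncio.sleep(0.01)  # simulates model latency
    store["current_state"] = f"Last turn: {user_msg!r} -> {reply!r}"


async def handle_turn(store: dict, user_msg: str) -> str:
    reply = f"Logged: {user_msg}"  # placeholder for the main model's answer
    task = asyncio.create_task(summarize_turn(store, user_msg, reply))
    _background.add(task)
    task.add_done_callback(_background.discard)
    return reply  # returned before the summary has finished


async def demo() -> tuple[str, bool, bool]:
    store: dict = {}
    reply = await handle_turn(store, "grilled chicken salad")
    updated_before = "current_state" in store
    await asyncio.gather(*_background)  # only the demo waits for the task
    return reply, updated_before, "current_state" in store
```

The user-facing reply never waits on the summary, so the extra model call adds no latency to the conversation itself.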

Sleep-Time Reflection

Cron — every ~24 hours

Model: Sonnet (stronger, sees patterns)

Reviews: Last 24h across all sessions

Produces: Durable learnings → your Arca vault

Like how your brain consolidates memories during sleep. A Sonnet call reviews the entire day's conversations, identifies cross-session patterns, and writes durable facts to a memories table in your private Arca vault.
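A reflection job of this kind needs to gather the day's messages across sessions before handing them to the reviewing model. The sketch below covers only that gathering step under assumed message fields (`session_id`, `content`, `created_at`); the prompt wording is invented for illustration and the model call itself is left out.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def recent_by_session(messages: list[dict], now: datetime,
                      window: timedelta = timedelta(hours=24)) -> dict[str, list[str]]:
    """Group message contents from the review window by session id."""
    sessions: dict[str, list[str]] = defaultdict(list)
    for m in messages:
        age = now - m["created_at"]
        if timedelta(0) <= age <= window:
            sessions[m["session_id"]].append(m["content"])
    return dict(sessions)


def build_reflection_prompt(sessions: dict[str, list[str]]) -> str:
    """Render one transcript block per session for the reviewing model."""
    blocks = []
    for session_id, contents in sorted(sessions.items()):
        lines = "\n".join(f"- {c}" for c in contents)
        blocks.append(f"Session {session_id}:\n{lines}")
    header = "Identify patterns that recur across these sessions and list durable facts."
    return header + "\n\n" + "\n\n".join(blocks)
```

Presenting all sessions in one prompt is what lets the reviewer notice patterns that no single conversation would reveal.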

Side by side

Two different philosophies. Terminal-based agents are transparent and simple — two files you can open in any text editor. Mio mirrors how human cognition works.

| Aspect | Terminal-Based Agents | Mio |
| --- | --- | --- |
| Storage | 2 markdown files on disk | Supabase (JSONB) + Arca vault (LanceDB vectors + DuckDB tables in S3) |
| Memory types | 2 (daily log + MEMORY.md) | 5 distinct layers (working → short-term → semantic → episodic → procedural) |
| Structure | Freeform text, no schema | Typed layers: validated JSON, categorized vectors, SQL tables |
| Retrieval | Entire files loaded as raw text into context | Semantic search, SQL queries, compact JSON injection |
| Memory writing | LLM manually writes to files during session | Background AI: Haiku per turn, Sonnet every 24h |
| Preference extraction | Manual: LLM appends to a file | Automatic with ≥75% confidence + canonical key allowlist |
| Consolidation | None | Sleep-time reflection: Sonnet reviews 24h cross-session |
| Context bloat | Files grow → entire file loaded → eats context | Only relevant memories surface via semantic search |
| Cross-device | No: each machine has its own files | Yes: same memory across all clients |
| Surfaces | Terminal only | Web, iOS, macOS, SMS, WhatsApp, voice, email |
| Data querying | LLM parses markdown text | SQL for structured data + semantic search for text |
| Data ownership | Fully local (your machine, your files) | Your Arca vault: files in S3 you can export or delete anytime |

The multi-surface problem

A terminal-based agent's memory lives on one machine. Text your AI from your phone — it doesn't know you. Switch laptops — memory gone. This is fine for a developer tool that lives in the terminal. It doesn't work for a personal AI that should know you everywhere.

Structured beats freeform for data

“What were my total calories this week?” With a terminal-based agent, the LLM searches markdown text and guesses. In Mio, it runs a SQL query against a structured table and gives you an exact number. Structured data means real answers, not approximations.

Memory should be automatic

You don't consciously decide which memories to keep. Your brain does it in the background — consolidating while you sleep, strengthening patterns, discarding noise. Mio works the same way. Two background processes handle all memory writing. You just talk.

Your memories, your files

Every memory Mio creates lives in your private Arca vault — LanceDB for vectors, DuckDB for structured tables, both stored as files in S3. Export them, query them directly, or delete everything with one click. No vendor lock-in, no data hostage situations. If you leave, your data leaves with you.

An AI that actually knows you.

Five memory layers. Two background processes. Every surface.
