Devoured - April 28, 2026
Agent Memory Patterns (7 minute read)

This article explains three practical patterns for adding editable memory to AI agents: files for data, memory blocks for learnable prompts, and skills for context-specific instructions.

What: A technical guide describing three types of mutable memory for AI agents: Files (hierarchical data storage), Memory Blocks (flat key-value stores embedded in prompts), and Skills (indexed files with metadata that load on-demand based on context).
Why it matters: Understanding these patterns helps developers build more capable agents that can learn from experience, maintain preferences, and access specialized knowledge when needed, rather than relying on static system prompts.
Takeaway: Start with memory blocks for agent preferences and behavior, add skills for specialized tasks with progressive disclosure, and version everything with Git to track evolution and enable rollbacks.
Deep dive
  • Files provide hierarchical data storage for agents and require three core functions: explore (ls/grep), read (cat), and write; they can be implemented as actual files, database records, or cloud storage blobs
  • Memory blocks are flat key-value stores injected into the system or user prompt, acting as a learnable system prompt that stores agent behavior, preferences, and identity information that cannot be ignored
  • Placing memory blocks in the user prompt is slightly better than in the system prompt because it causes less prompt cache invalidation and lower token cost when the agent updates blocks
  • Skills are directories containing a SKILL.md file with name and description metadata at the top, enabling progressive disclosure where specialized knowledge loads only when contextually relevant
  • Editable skills serve as an experience cache where agents can record learnings from investigations, making that knowledge available with better triggering than plain files
  • For observability, developers should construct reference graphs showing which files cite which others, comparing expected vs actual access patterns to detect when navigation becomes too random
  • Memory blocks should be kept under 5,000 characters (ideally under 500), as larger blocks tend to confuse agents
  • Search indexes are valuable but add maintenance overhead as an additional data asset to manage
  • Git versioning for all memory types provides checkpoints, evolution tracking, and rollback capabilities, though branching and merging haven't proven successful yet
  • Knowledge graphs and SQL-backed data models generally fail because LLM weights don't know about custom schemas—LLMs reason in token space, not structured schemas
  • Issue trackers (like Chainlink or Asana) work surprisingly well as searchable work queues, with some agents using "interest" backlogs to autonomously explore and connect ideas overnight
  • Append-only logs (like events.jsonl) give agents grounded memory of what they actually did by recording every tool call and message as minified JSON objects
Decoder
  • System prompt: Instructions given to an LLM at the start of every conversation that define its behavior and capabilities
  • Prompt cache invalidation: When cached prompt data becomes outdated and must be reprocessed, increasing token usage and costs
  • Progressive disclosure: A design pattern where information is revealed only when contextually needed rather than shown all at once
  • Context window: The maximum amount of text (measured in tokens) an LLM can process at once
  • Token: The basic unit of text that LLMs process, roughly corresponding to word fragments
Original article

Agent Memory Patterns

Say you get asked to "add memory" to an agent. What does that mean? How do you do it?

There are three common kinds of mutable memory:

  1. Files
  2. Memory blocks
  3. Skills

If you don't need the agent to learn, then you're looking in the wrong place: you don't need memory. That said, this post might still be useful if you're just using agents, like a coding agent.

Files are for data & knowledge

Everything in this post needs to satisfy the following functions:

  1. Explore to find items — ls, find, grep, or equivalent tools
  2. Read an item — cat, or some ReadFile tool
  3. Write an item — pipe, sed, or some WriteFile tool

For files, all of that seems fairly obvious. Files can be complicated, but those are the parts that matter for files to work as agent memory. Files don't have to be literal files. If they are, you can provide a Bash tool (or PowerShell) that gives the agent cool Linux utilities for navigating the filesystem, reading parts of files, etc.

But also, you can absolutely use database records or S3 blobs. As long as:

  1. Each file has a hierarchical path, to enable exploring, but also so that files form a key-value store
  2. Content is long free-form text. We don't care too much about file structure or validation, but please do give the agent space to work.
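
Under those constraints, a minimal sketch might look like the following (the `FileMemory` class and its method names are my own, not from the article):

```python
import fnmatch


class FileMemory:
    """File-style agent memory: hierarchical paths over a flat store.

    The backing store is a dict here, but could equally be real files,
    database records, or S3 blobs."""

    def __init__(self):
        self._store = {}  # path -> long free-form text

    def explore(self, pattern="*"):
        """ls/grep equivalent: list paths matching a glob pattern."""
        return sorted(p for p in self._store if fnmatch.fnmatch(p, pattern))

    def read(self, path):
        """cat / ReadFile equivalent."""
        return self._store[path]

    def write(self, path, content):
        """WriteFile equivalent: no schema or validation enforced."""
        self._store[path] = content


mem = FileMemory()
mem.write("notes/apis/auth.md", "Use the X-Token header.")
mem.write("notes/apis/rate-limits.md", "60 req/min per key.")
mem.explore("notes/apis/*")  # both paths, sorted
```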

Memory blocks are a learnable system prompt

Memory blocks are just a flat key-value store. Except the key isn't used for looking things up, it's just used for writing. All memory blocks are included inline in the system prompt, or user prompt.

Where to put it?

  • System prompt — this one's easier in a lot of systems. But can cause cache invalidation (higher token cost) when the agent calls WriteBlock.
  • User prompt (prepend) — This also works, it's still highly visible to the LLM, and it causes less prompt cache invalidation issues.

Either is fine. User prompt is slightly better, I guess.
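
A sketch of the user-prompt-prepend option; the `<memory>` wrapper and helper names here are my own invention, not a standard:

```python
def render_blocks(blocks):
    """Render memory blocks as text to prepend to the user prompt.

    blocks: list of (key, value) pairs, already in sort order.
    The key only labels the block for the LLM; it isn't used for lookup."""
    lines = ["<memory>"]
    for key, value in blocks:
        lines.append(f"[{key}]\n{value}")
    lines.append("</memory>")
    return "\n".join(lines)


def build_user_prompt(blocks, user_message):
    # Prepending to the user message keeps the system prompt byte-stable,
    # so its prompt cache survives block updates.
    return render_blocks(blocks) + "\n\n" + user_message


prompt = build_user_prompt(
    [("identity", "You are a careful research agent."),
     ("preferences", "User prefers terse answers.")],
    "Summarize yesterday's findings.")
```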

Required tools:

  • WriteBlock(key, value [, sort_order]) — I like including a sort_order, because we know order does matter, so let the agent control it too. Not a huge deal though.

Optional tools:

  • ListBlocks()
  • ReadBlock(key)

Theoretically you don't need these because the blocks are already in the prompt, but I've noticed that coding agents will always try to insert them, and agents will always call them, every time. So, whatever that means.
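
A minimal in-memory sketch of all three tools (`BlockStore` is a hypothetical class, not from any particular framework):

```python
class BlockStore:
    """Flat key-value memory blocks with an optional sort order."""

    def __init__(self):
        self._blocks = {}  # key -> (sort_order, value)

    def write_block(self, key, value, sort_order=0):
        """Upsert a block; the agent controls ordering via sort_order."""
        self._blocks[key] = (sort_order, value)

    def list_blocks(self):
        """Keys in prompt order: by sort_order, then key for stability."""
        return [k for k, _ in sorted(self._blocks.items(),
                                     key=lambda kv: (kv[1][0], kv[0]))]

    def read_block(self, key):
        return self._blocks[key][1]


store = BlockStore()
store.write_block("preferences", "Terse answers.", sort_order=2)
store.write_block("identity", "Research agent.", sort_order=1)
store.list_blocks()  # → ["identity", "preferences"]
```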

What goes into blocks?

Blocks are a learnable system prompt. Put stuff in there that tends to go into the system prompt — behavior, preferences, identity, character, etc.

Since it's in the prompt, the agent can't look away or ignore it. So you may want to promote content from a file to a block when you need guaranteed visibility, i.e. when you don't want to risk the agent forgetting to read the file.

Skills are indexed files

Skills are a combination of files & memory blocks. They're files, literally, but they also are represented in the system prompt.

It's just a directory with a SKILL.md file:

the-skill/
  SKILL.md
  important-concept-1.md
  helper-script.py
  worksheet.csv

The SKILL.md is generally just a plain markdown file, but it has a few special lines at the top:

---
name: the-skill
description: what it does and when to use it
---

The description is the critical part. Both name and description go into the system prompt, but the description is the trigger. It encourages the agent to use the skill in the right circumstances.
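
A rough sketch of how a harness might read that header and build the system-prompt index (hand-rolled parsing for illustration; a real implementation would use a YAML parser):

```python
def parse_skill_header(text):
    """Extract name/description metadata from the top of a SKILL.md.

    Expects the '---' ... '---' block shown above."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the header block
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def skills_index(skill_metas):
    """Render the name + description lines that go into the system prompt."""
    return "\n".join(f"- {m['name']}: {m['description']}" for m in skill_metas)


header = ("---\n"
          "name: the-skill\n"
          "description: what it does and when to use it\n"
          "---\n\nBody of the skill...")
meta = parse_skill_header(header)
```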

Do you need a Skill tool?

Not really. Claude Code has a Skill(name) tool, but functionally it's the same as the agent reading the-skill/SKILL.md with a regular Read tool. The benefits are harness-side: lazy-loading the SKILL.md content (so it only enters the context window when invoked), telemetry, and permission scoping.

If you skip the dedicated tool, just tell the agent in the system prompt: "When a skill matches, read its SKILL.md before doing the thing." Works fine.

What goes into skills?

Data or instructions that are only needed in certain circumstances. Honestly "skill" is actually a really good name for them.

The key phrase is progressive disclosure — skills unfold as needed. The agent reads files as it deems necessary. Typically you'll include file references in the SKILL.md file, like "Read important-concept-1.md when you need to…". There's nothing special, no notation, it's just hints for the agent.

Scripts and data are nice too. Obviously scripts are only useful if you enable a Bash tool, but scripts especially can act like an agent optimizer. Like, sure, the agent can probably figure out how to string together all the headers to authenticate to your weird API, or you can just make a script for it and skip the LLM.

Editable skills

Most people think of skills as being immutable programs of English. Sure, they're useful when used like that, but they're even more useful when you allow your agent to change them.

A great way to use skills is as an experience cache. At the end of a long investigation or research, have the agent record the experience in a skill. Next time, it just reads the skill! Could you use files for this? Yes, but the description field in the system prompt makes it more likely to be used at the right time.
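
A sketch of what recording an experience as a skill could look like; the `save_experience_skill` helper and its slug rules are my own assumptions, not from the article:

```python
import re
import tempfile
from pathlib import Path


def save_experience_skill(root, name, description, body):
    """Record an investigation's learnings as an editable skill.

    Writes <root>/<slug>/SKILL.md with the frontmatter that gets the
    description into the system prompt for future triggering."""
    slug = re.sub(r"[^a-z0-9-]+", "-", name.lower()).strip("-")
    skill_dir = Path(root) / slug
    skill_dir.mkdir(parents=True, exist_ok=True)
    content = f"---\nname: {slug}\ndescription: {description}\n---\n\n{body}\n"
    path = skill_dir / "SKILL.md"
    path.write_text(content)
    return path


root = tempfile.mkdtemp()
path = save_experience_skill(
    root,
    "Flaky Test Triage",
    "How to diagnose flaky CI tests; use when a test fails intermittently",
    "1. Re-run the test in isolation.\n2. Check for shared state.")
```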

Observability

How do you know when the agent is using memory well?

For files & skills, you can start at the entry point and construct a graph of which files reference which other files:

  1. For each file
  2. Search for the file name
  3. Pair "file referenced from" → file

Then compare against reality. Find all the times those files were accessed in that order versus not. If they're referenced randomly, that means the agent needs to use Search or ListFiles tools to navigate. That might mean your files or skills are becoming too unwieldy.
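
Steps 1–3 above can be sketched as a naive substring scan (a hypothetical in-memory version; the actual access traces to compare against would come from your harness's telemetry):

```python
def reference_graph(files):
    """Build 'which file mentions which other file' edges.

    files: dict of path -> text content. An edge (a, b) means a's text
    contains b's filename, i.e. a references b."""
    names = {path: path.rsplit("/", 1)[-1] for path in files}
    edges = set()
    for src, text in files.items():
        for dst, name in names.items():
            if dst != src and name in text:
                edges.add((src, dst))
    return edges


files = {
    "skill/SKILL.md": "Read important-concept-1.md when stuck.",
    "skill/important-concept-1.md": "Details here.",
}
edges = reference_graph(files)
# → {("skill/SKILL.md", "skill/important-concept-1.md")}
```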

Also, you should monitor memory block size & count. Definitely keep them under 5000, probably under 500 characters. When the blocks get too big, they tend to confuse the agent.

Unfortunately, given the nature of agents, there's not that much you can do for observability. But these two things do tend to be useful to monitor.

Search index

Is a search index a good idea? Yes absolutely. It's just annoying.

Seriously, it adds a data asset that needs to be maintained. Most of the time that's not a huge deal, but when it is, it is. Your call.

Git is an agent database

I highly recommend versioning files, and ideally also skills & memory blocks. In open-strix I store memory blocks in YAML so they version and diff cleanly.

Versioning gives you checkpoints and lets you see evolution. It also lets you rollback or let the agent discover when a bad change was made. I've tried to use branching and merging, but not successfully.
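
A sketch of checkpointing a memory directory with Git, assuming `git` is on PATH; the `commit_memory` helper is illustrative, not from open-strix:

```python
import subprocess
import tempfile
from pathlib import Path


def commit_memory(repo_dir, message):
    """Checkpoint the whole memory directory as one commit."""
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message, "--allow-empty"],
                   cwd=repo_dir, check=True, capture_output=True)


# One-time setup in a throwaway repo for this example.
repo = tempfile.mkdtemp()
subprocess.run(["git", "init", "-q"], cwd=repo, check=True)
subprocess.run(["git", "config", "user.email", "agent@example.com"],
               cwd=repo, check=True)
subprocess.run(["git", "config", "user.name", "agent"], cwd=repo, check=True)

# The agent updates a block, the harness commits the checkpoint.
Path(repo, "blocks.yaml").write_text("identity: research agent\n")
commit_memory(repo, "memory: update identity block")

log = subprocess.run(["git", "log", "--oneline"], cwd=repo,
                     capture_output=True, text=True, check=True).stdout
```

Rollback then falls out for free: `git log` shows the memory's evolution, and `git checkout <sha> -- blocks.yaml` restores a known-good state.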

Bad ideas

Knowledge graphs and other writable data models, e.g. backed by SQL, tend to not work very well because the LLM's weights don't know about their schemas. Most people talk themselves into knowledge graphs because they have structure and historically structure has been good. But the only structure LLMs need is tokens. They reason just fine in token space.

Good (but weirder) ideas

I've discovered that some types of generic data structures can be very useful for agents, for special purposes.

Issue trackers are oddly useful. I've been using chainlink, which is an issue tracker specifically for agents, but I've heard Asana also works fine. Probably any issue tracker would work. An issue tracker gives you a searchable work queue.

I've added an interest backlog to all of my agents now. Any time they come across something weird, interesting, or annoying they can create an issue and tag it interest. Then, during the night while I sleep they work through the backlog. This has led to multiple agents making connections between ideas & things I hadn't discovered yet, and generally coming up with fresh ideas that feel honestly novel.

Also, an append-only log is super useful. I have an events.jsonl file that goes into all of my agents. The agent harness records every single event that happens, like tool calls and messages, and appends each one as a minified JSON object to the events.jsonl file. It's not writable memory in the normal sense, but the agent can read it to give grounded answers about what it actually did.
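
A minimal sketch of such a log; the event shapes below are invented for illustration:

```python
import json
import tempfile
import time
from pathlib import Path


def append_event(log_path, event):
    """Append one harness event as a single minified JSON line."""
    record = {"ts": event.get("ts", time.time()), **event}
    with open(log_path, "a") as f:
        f.write(json.dumps(record, separators=(",", ":")) + "\n")


def read_events(log_path):
    """Replay the log so the agent can ground answers in what it did."""
    with open(log_path) as f:
        return [json.loads(line) for line in f]


log = Path(tempfile.mkdtemp()) / "events.jsonl"
append_event(log, {"ts": 1, "type": "tool_call", "tool": "ReadFile",
                   "args": {"path": "notes.md"}})
append_event(log, {"ts": 2, "type": "message", "role": "assistant",
                   "text": "Done."})
events = read_events(log)
```

Because each line is a complete JSON object, the agent can grep the file for a tool name or replay the whole thing in order; nothing is ever rewritten in place.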

Conclusion

Editable memory is extremely powerful. I highly recommend trying it out. Hopefully this helped.