How to Build a Universal AI Memory System with Postgres and MCP
What You'll Learn
- How It Works
- Set Up Postgres with pgvector
- Build the Ingestion Pipeline
- Connect an Input Source (Slack Example)
- Build the MCP Server
- Wire MCP Into Your AI Tools
Guide Curriculum
Architecture and Data Foundation
- How It Works (1 min)
- Set Up Postgres with pgvector (1 min)
Ingestion: Getting Notes In
- Build the Ingestion Pipeline (2 min)
- Connect an Input Source (Slack Example) (1 min)
Retrieval: Querying From Any AI
- Build the MCP Server (2 min)
- Wire MCP Into Your AI Tools (1 min)
- Query Your Brain from Any AI (1 min)
- Conclusion and Next Steps (1 min)
About the Author
Hiram Clark is the founder of vybecoding.ai and editor of every guide and news article published on the site. He reviews all AI-drafted content for accuracy before publication and is personally accountable for factual errors. He works hands-on with the AI development tools, workflows, and infrastructure covered here.
Module 1. Architecture and Data Foundation
1.1 How It Works
This module sets up the mental model for the whole system and then lays the foundation: a Postgres database with native vector storage. Once the architecture is clear and the database is ready to hold embeddings, every later step plugs into it.
The system has three moving parts:
- Input — you drop a note somewhere easy (like a Slack channel or a simple web form)
- Processing — a background service extracts meaning, generates a vector embedding, pulls out metadata (people mentioned, topics, action items), and writes everything to Postgres
- Retrieval — an MCP (Model Context Protocol) server sits in front of Postgres and answers semantic search queries from any AI tool that supports MCP
The result: "Hey, search my brain for notes about people considering a career transition" works identically in Claude, ChatGPT, or Cursor — because they're all hitting the same database.
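To make the three parts concrete, here is a minimal in-memory sketch of the same flow: a note goes in, gets embedded and stored, and is later retrieved by similarity. Everything in it is a hypothetical stand-in (`ToyBrain`, `toyEmbed`, and the tiny `VOCAB` list are not part of the guide's real code). The toy embedding only counts keywords, whereas a real embedding model captures meaning, so treat this as an illustration of the data flow rather than of semantic search quality.

```typescript
// In-memory stand-in for the three parts: input -> processing -> retrieval.
// All names here are hypothetical scaffolding for illustration only.

type MemoryRow = { rawText: string; embedding: number[]; createdAt: Date };

// Toy embedding: counts how often each vocabulary word appears.
// A real system would call an embedding model instead.
const VOCAB = ["career", "job", "database", "index", "meeting"];

function toyEmbed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // one side has no vocabulary words
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class ToyBrain {
  private rows: MemoryRow[] = [];

  // Processing: embed the note and store it alongside the raw text.
  ingest(rawText: string): void {
    this.rows.push({ rawText, embedding: toyEmbed(rawText), createdAt: new Date() });
  }

  // Retrieval: embed the query and rank stored notes by similarity.
  search(query: string, limit = 5): string[] {
    const queryEmbedding = toyEmbed(query);
    return this.rows
      .map((row) => ({ row, score: cosineSimilarity(queryEmbedding, row.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, limit)
      .map((scored) => scored.row.rawText);
  }
}

// Input: two notes dropped into the system, then one query.
const toyBrain = new ToyBrain();
toyBrain.ingest("Sam is thinking about a career change and a new job");
toyBrain.ingest("Add an index to the database before the next meeting");
const topHit = toyBrain.search("career", 1)[0];
```

In the real system, Postgres with pgvector takes the place of the `rows` array and the MCP server takes the place of `search`.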
1.2 Set Up Postgres with pgvector
Step 1 — Set Up Postgres with pgvector
You need Postgres with the pgvector extension, which adds a native vector column type for storing embeddings.
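```bash
# Install pgvector (Ubuntu/Debian)
sudo apt install postgresql-16-pgvector

# Or use Docker for local dev.
# pgvector/pgvector is the project's current image; ankane/pgvector is the older name.
docker run -d \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  pgvector/pgvector:pg16
```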
Connect and enable the extension:
```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memories (
  id           SERIAL PRIMARY KEY,
  raw_text     TEXT NOT NULL,
  embedding    VECTOR(1536),   -- OpenAI ada-002 dimensions
  people       TEXT[],
  topics       TEXT[],
  type         TEXT,           -- 'note', 'decision', 'action-item', etc.
  action_items TEXT[],
  created_at   TIMESTAMPTZ DEFAULT NOW()
);

-- Index for fast approximate nearest-neighbor search.
-- ivfflat builds its clusters from existing rows, so recall is better
-- if the index is created (or rebuilt) after some data has been loaded.
CREATE INDEX ON memories USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
```
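The `lists = 100` value is a tuning knob, not a constant. The pgvector documentation suggests starting with roughly rows / 1000 lists for tables up to about a million rows and sqrt(rows) beyond that, with the query-time `ivfflat.probes` setting starting near sqrt(lists). The helper below is a small sketch of that rule of thumb; the function name `suggestIvfflatParams` is hypothetical, and the numbers are starting points to measure against, not guarantees.

```typescript
// Rule-of-thumb starting values for an ivfflat index, following the
// guidance in the pgvector documentation. Hypothetical helper name.
type IvfflatParams = { lists: number; probes: number };

function suggestIvfflatParams(rowCount: number): IvfflatParams {
  const lists =
    rowCount <= 1000000
      ? Math.max(1, Math.round(rowCount / 1000)) // up to ~1M rows: rows / 1000
      : Math.round(Math.sqrt(rowCount));         // beyond that: sqrt(rows)
  // Higher probes improves recall at the cost of speed; sqrt(lists) is a start.
  const probes = Math.max(1, Math.round(Math.sqrt(lists)));
  return { lists, probes };
}

// About 100,000 notes lands on the lists = 100 used in the index above.
const personalBrainParams = suggestIvfflatParams(100000);
```

For a personal notes table with only a few thousand rows, an exact scan without any index is often fast enough, so the index matters mainly as the table grows.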
Module 2. Ingestion: Getting Notes In
2.1 Build the Ingestion Pipeline
With the database in place, this module covers how notes actually get into it: a pipeline that embeds and tags each note, and a Slack bot that feeds the pipeline. Together they form the "input" and "processing" halves of the system.
Step 2 — Build the Ingestion Pipeline
When a note comes in, you need to: embed it, extract metadata, store it. This is the "5 seconds later" step from the video.
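One detail worth planning for in the "extract metadata" step: the metadata comes back from a language model as free-form JSON, so keys can be missing or carry the wrong type. A minimal sketch of a normalizer that could sit between the model call and the database write is shown here. The names `normalizeMetadata` and `toStringArray` are hypothetical, and defaulting a missing `type` to `"note"` is an assumption rather than something the guide specifies.

```typescript
// Defensive normalization of LLM-produced metadata before it is stored.
// Hypothetical helper; the "note" default is an assumption.
type NoteMetadata = {
  people: string[];
  topics: string[];
  type: string;
  action_items: string[];
};

// Keep only non-empty strings; anything else becomes an empty array.
function toStringArray(value: unknown): string[] {
  if (!Array.isArray(value)) return [];
  return value.filter(
    (item): item is string => typeof item === "string" && item.trim() !== ""
  );
}

function normalizeMetadata(rawJson: string): NoteMetadata {
  let parsed: Record<string, unknown> = {};
  try {
    const candidate: unknown = JSON.parse(rawJson);
    if (typeof candidate === "object" && candidate !== null && !Array.isArray(candidate)) {
      parsed = candidate as Record<string, unknown>;
    }
  } catch {
    // Malformed JSON: fall through and use empty defaults.
  }
  return {
    people: toStringArray(parsed.people),
    topics: toStringArray(parsed.topics),
    type: typeof parsed.type === "string" && parsed.type !== "" ? parsed.type : "note",
    action_items: toStringArray(parsed.action_items),
  };
}

// A partially valid response still yields a row that is safe to insert.
const cleanedMeta = normalizeMetadata('{"people": ["Ana", 3], "type": "decision"}');
```

With something like this in place, a single malformed model response degrades to an untagged note instead of a failed insert.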
```ts
// ingest.ts
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const db = new Pool({ connectionString: process.env.DATABASE_URL });

// Exported so slack-listener.ts can import it
export async function ingestNote(rawText: string) {
  // 1. Generate embedding
  const embeddingRes = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: rawText,
  });
  const embedding = embeddingRes.data[0].embedding;

  // 2. Extract metadata with a fast LLM call
  const metaRes = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content: `Extract metadata from the note. Return JSON with keys:
people (string[]), topics (string[]), type (string), action_items (string[])`,
      },
      { role: "user", content: rawText },
    ],
  });
  // Fall back to an empty object if the model returns no content
  const meta = JSON.parse(metaRes.choices[0].message.content ?? "{}");

  // 3. Store everything. pgvector accepts the embedding as a "[x,y,...]" string.
  await db.query(
    `INSERT INTO memories (raw_text, embedding, people, topics, type, action_items)
     VALUES ($1, $2::vector, $3, $4, $5, $6)`,
    [
      rawText,
      JSON.stringify(embedding),
      meta.people ?? [],
      meta.topics ?? [],
      meta.type ?? null,
      meta.action_items ?? [],
    ]
  );
  console.log("Stored:", meta);
}
```
2.2 Connect an Input Source (Slack Example)
Step 3 — Connect an Input Source (Slack Example)
The simplest input is a Slack bot that listens for messages in a private #brain channel:
```ts
// slack-listener.ts
import { App } from "@slack/bolt";
import { ingestNote } from "./ingest";

const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
});

// Fires for every message in any channel the bot has been added to,
// so invite the bot only to the private #brain channel.
app.message(async ({ message, say }) => {
  if ("text" in message && message.text) {
    await ingestNote(message.text);
    await say("Stored in your brain.");
  }
});

// Start the HTTP receiver on port 3000
(async () => await app.start(3000))();
```
Now you can type a note in Slack and it's in the database within seconds.
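Because the listener reacts to every message the bot can see, it may be worth filtering before ingesting. The sketch below is one way to do that under stated assumptions: `shouldIngest` and `IncomingMessage` are hypothetical names, the channel ID would come from your own configuration, and the fields used (`channel`, `text`, `subtype`, `bot_id`) are the ones Slack message events carry.

```typescript
// Decide whether a Slack message event should be stored as a note.
// Hypothetical helper; only the fields it needs are typed here.
type IncomingMessage = {
  channel: string;
  text?: string;
  subtype?: string; // set on edits, joins, file shares and similar events
  bot_id?: string;  // set when the message was posted by a bot
};

function shouldIngest(message: IncomingMessage, brainChannelId: string): boolean {
  if (message.channel !== brainChannelId) return false; // not the #brain channel
  if (message.subtype !== undefined) return false;      // not a plain user message
  if (message.bot_id !== undefined) return false;       // skip bot posts
  return typeof message.text === "string" && message.text.trim().length > 0;
}

// Example: a plain note in the configured channel passes the filter.
const acceptsPlainNote = shouldIngest(
  { channel: "C_BRAIN", text: "Ana is weighing a move into product" },
  "C_BRAIN"
);
```

Inside the `app.message` handler, a call like this would wrap the existing `ingestNote` call, so stray messages never reach the embedding or metadata steps.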
Module 3. Retrieval: Querying From Any AI
3.1 Build the MCP Server
This final module covers the "retrieval" half: building an MCP server that exposes your database, wiring it into your AI tools, querying it in natural language, and extending the system. By the end you can ask any AI tool to search your shared memory.
Step 4 — Build the MCP Server
MCP (Model Context Protocol) is how AI tools like Claude and Cursor query external data. You expose your Postgres database as an MCP tool called search_brain.
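Under the hood, an MCP tool call is a JSON-RPC 2.0 message. As a rough sketch of what travels between the AI tool and the server, the snippet below builds a `tools/call` request for `search_brain` and a matching text result. The builder functions and type names are hypothetical and exist only to show the message shapes; in practice the SDK constructs and parses these messages for you.

```typescript
// Shapes of the JSON-RPC messages behind an MCP tool call.
// Builder names are hypothetical; the SDK normally handles this layer.
type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

type ToolCallResponse = {
  jsonrpc: "2.0";
  id: number;
  result: { content: { type: "text"; text: string }[] };
};

// What the AI tool sends when it decides to search the brain.
function buildSearchBrainCall(id: number, query: string, limit = 5): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "search_brain", arguments: { query, limit } },
  };
}

// What the server sends back: matching rows serialized as one text block.
function buildTextResult(id: number, rows: unknown[]): ToolCallResponse {
  return {
    jsonrpc: "2.0",
    id,
    result: { content: [{ type: "text", text: JSON.stringify(rows, null, 2) }] },
  };
}

const exampleCall = buildSearchBrainCall(1, "career transition");
const exampleReply = buildTextResult(exampleCall.id, [{ raw_text: "Sam wants a new role" }]);
```

The response reuses the request's `id`, which is how the client pairs each answer with the call that produced it.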
```ts
// mcp-server.ts
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI();
const db = new Pool({ connectionString: process.env.DATABASE_URL });

const server = new Server(
  { name: "brain-mcp", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

// setRequestHandler takes a request schema from the SDK, not a method-name string
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "search_brain",
    description: "Search your persistent memory for relevant notes",
    inputSchema: {
      type: "object",
      properties: {
        query: { type: "string", description: "What to search for" },
        limit: { type: "number", default: 5 },
      },
      required: ["query"],
    },
  }],
}));

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  if (req.params.name !== "search_brain") {
    throw new Error(`Unknown tool: ${req.params.name}`);
  }
  const { query, limit = 5 } = (req.params.arguments ?? {}) as {
    query: string;
    limit?: number;
  };

  // Embed the query, then find nearest neighbors in Postgres
  const embRes = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: query,
  });
  const queryEmbedding = embRes.data[0].embedding;

  // <=> is pgvector's cosine distance, so 1 - distance is cosine similarity
  const result = await db.query(
    `SELECT raw_text, people, topics, type, action_items, created_at,
            1 - (embedding <=> $1::vector) AS similarity
     FROM memories
     ORDER BY embedding <=> $1::vector
     LIMIT $2`,
    [JSON.stringify(queryEmbedding), limit]
  );

  return {
    content: [{
      type: "text",
      text: JSON.stringify(result.rows, null, 2),
    }],
  };
});

// Top-level await requires this file to run as an ES module
const transport = new StdioServerTransport();
await server.connect(transport);
```
3.2 Wire MCP Into Your AI Tools
Step 5 — Wire MCP Into Your AI Tools
Add the MCP server to each tool's config file. For Claude Code, edit .mcp.json in your project:
```json
{
  "mcpServers": {
    "brain": {
      "command": "node",
      "args": ["/path/to/mcp-server.js"],
      "env": {
        "DATABASE_URL": "postgresql://localhost:5432/brain",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```
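For Cursor, add the same block to .cursor/mcp.json. ChatGPT is configured through its own MCP settings rather than a project file, and it may expect a remotely reachable server instead of a local command, so check its current documentation before assuming the same block applies. The goal is one server shared by every tool.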
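Since each tool launches the server with the `env` values from its own config file, a missing variable tends to surface only as a tool that silently fails to appear. A small startup check can make that failure obvious. This is a sketch with hypothetical helper names (`missingEnvVars`, `assertEnv`); the environment is passed in as a plain object so the check is easy to test, and in the server you would pass `process.env`.

```typescript
// Fail fast when required configuration is missing.
// Hypothetical helpers; pass process.env as the first argument in the server.
type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: string[]): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

function assertEnv(env: Env, required: string[]): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }
}

// Example: the database URL is set but the API key was forgotten.
const stillMissing = missingEnvVars(
  { DATABASE_URL: "postgresql://localhost:5432/brain" },
  ["DATABASE_URL", "OPENAI_API_KEY"]
);
```

If you add logging around a check like this, write it to stderr: with the stdio transport, stdout carries the protocol messages themselves.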
3.3 Query Your Brain from Any AI
Step 6 — Query Your Brain from Any AI
Once wired up, you can use natural language in any AI session:
Hey, search my brain for notes about people considering a career transition.
Search my brain for architecture decisions I made in the last two weeks.
Find any action items related to the reorg conversation.
The AI calls search_brain, gets the top vector matches from Postgres, and surfaces them directly in the conversation — with full context, not just a keyword hit.
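One caveat: a prompt such as "in the last two weeks" implies a date filter, while `search_brain` as written ranks by vector similarity alone. A possible extension is to let the tool accept optional filters and build the SQL from them. The sketch below shows only the query-building part; `buildSearchQuery`, `sinceDays`, and the `type` filter are hypothetical additions, not parameters the guide's tool currently accepts.

```typescript
// Build a parameterized similarity query with optional filters.
// Hypothetical extension of search_brain; sinceDays and type are assumptions.
type SearchFilters = { limit?: number; sinceDays?: number; type?: string };
type SqlQuery = { text: string; values: unknown[] };

function buildSearchQuery(embedding: number[], filters: SearchFilters = {}): SqlQuery {
  // $1 is always the query embedding, sent as a pgvector text literal
  const values: unknown[] = [JSON.stringify(embedding)];
  const where: string[] = [];

  if (filters.sinceDays !== undefined) {
    values.push(filters.sinceDays);
    where.push(`created_at >= NOW() - make_interval(days => $${values.length}::int)`);
  }
  if (filters.type !== undefined) {
    values.push(filters.type);
    where.push(`type = $${values.length}`);
  }
  values.push(filters.limit ?? 5);
  const limitParam = `$${values.length}`;

  const text = [
    "SELECT raw_text, people, topics, type, action_items, created_at,",
    "       1 - (embedding <=> $1::vector) AS similarity",
    "FROM memories",
    where.length > 0 ? `WHERE ${where.join(" AND ")}` : "",
    "ORDER BY embedding <=> $1::vector",
    `LIMIT ${limitParam}`,
  ]
    .filter((line) => line !== "")
    .join("\n");

  return { text, values };
}

// "Architecture decisions from the last two weeks" as structured filters.
const recentDecisions = buildSearchQuery([0.1, 0.2], { sinceDays: 14, type: "decision" });
```

The resulting `text` and `values` could be passed straight to `db.query`. All user-influenced values stay in the parameter list, so nothing is interpolated into the SQL string except placeholder numbers.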
3.4 Conclusion and Next Steps
You now have the skeleton of a universal AI memory system: notes go in through Slack (or any input), get embedded and tagged automatically, and any AI can retrieve them by meaning rather than exact words.
Where to take this next:
- Add a web UI: a simple form at localhost:3000 as an alternative to Slack for quick notes
- Auto-ingest from other sources: calendar events, email drafts, GitHub PR descriptions
- Add a forget tool: an MCP endpoint to delete or redact memories
- Add per-user namespacing: if you're building this for a team, scope memories by user_id
- Use Supabase instead of self-hosted Postgres: they have pgvector built in and a free tier that covers personal use
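As one illustration of these extensions, here is a rough sketch of what a forget tool might look like: a tool definition in the same shape as `search_brain`, plus a function that validates the arguments and produces a parameterized delete. The names `forget_memory` and `buildForgetQuery` are hypothetical, and the sketch assumes `search_brain` is changed to also return each row's `id`, which the version in this guide does not do.

```typescript
// Hypothetical "forget" tool: definition plus argument validation.
// Assumes search results expose the row id so the AI can refer to it.
const forgetTool = {
  name: "forget_memory",
  description: "Permanently delete one stored note by its id",
  inputSchema: {
    type: "object",
    properties: {
      id: { type: "number", description: "The id of the memory to delete" },
    },
    required: ["id"],
  },
};

type DeleteQuery = { text: string; values: number[] };

function buildForgetQuery(args: unknown): DeleteQuery {
  const id = (args as { id?: unknown } | null)?.id;
  // Accept only positive whole numbers, since ids come from a SERIAL column
  if (typeof id !== "number" || Math.floor(id) !== id || id <= 0) {
    throw new Error("forget_memory requires a positive integer id");
  }
  return { text: "DELETE FROM memories WHERE id = $1", values: [id] };
}

const forgetExample = buildForgetQuery({ id: 42 });
```

Because deletion is irreversible, a softer variant that sets a hypothetical `deleted_at` column instead of removing the row may be a safer first version.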
The core insight is simple: AI tools are only as useful as the context they carry. Give them a shared, persistent, semantically-searchable database and every session feels like picking up where you left off.