iMemory is a context layer for LLM applications. Store, organize, and retrieve the right facts at the right time — across sessions, projects, and models.
No credit card required · Works with OpenAI, Anthropic, Gemini
Plays nicely with the models you already use
The problem
Every prompt is a fresh start. You pad the context with history, hit token limits, watch quality drop, and burn money on tokens the model doesn't need to see.
Stuffing chat history into every request wastes tokens and slows responses.
Users repeat themselves across sessions. The model never learns who they are.
Notes, docs, and decisions live in five different tools — none of which the LLM can see.
Features
Everything you need to capture, retrieve, and govern the knowledge that powers your AI features.
Store facts, preferences, and history. Retrieved automatically on every call.
Hybrid vector + keyword search surfaces the most relevant context, not all of it.
Organize memory by user, project, or agent. No bleed between tenants.
Drop-in MCP server, SDKs for TypeScript and Python, and an HTTP API.
Row-level security, encryption at rest, and a full audit trail.
Edge-deployed retrieval keeps your agent loop snappy.
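To make "hybrid vector + keyword search" concrete, here is a minimal sketch of one common way to merge two ranked result lists: reciprocal rank fusion. This illustrates the general technique and is not a description of iMemory's internal ranking. The function name `fuseRankings`, the `RankedId` type, and the constant `k = 60` are assumptions made for this example.

```typescript
// Hypothetical names for illustration; not part of the iMemory SDK.
type RankedId = string;

// Reciprocal rank fusion: every list adds 1 / (k + rank) to an item's score,
// with rank starting at 1. Items ranked well in both lists rise to the top.
function fuseRankings(
  vectorHits: RankedId[],
  keywordHits: RankedId[],
  k = 60,
): RankedId[] {
  const scores = new Map<RankedId, number>();
  for (const hits of [vectorHits, keywordHits]) {
    hits.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, highest first, and return only the ids.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" and "a" appear in both lists, so they outrank "d" and "c".
const fused = fuseRankings(["a", "b", "c"], ["b", "d", "a"]);
console.log(fused);
```

Rank-based fusion avoids having to normalize vector similarity and keyword scores onto the same scale, which is why it is a popular default for hybrid search.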
How it works
Push facts, messages, or documents into iMemory with a single SDK call. We chunk, embed, and index automatically.
Query by user, topic, or natural language. We return the smallest set of tokens your model needs to answer well.
Splice retrieved context into your prompt or use our middleware. Works with chat, tools, and agent loops.
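One way to picture "the smallest set of tokens" in the retrieve step is a greedy pass that keeps the most relevant chunks until a token budget is used up. The sketch below is an illustration under stated assumptions, not iMemory's actual selection logic: the `Chunk` shape, `estimateTokens`, and `packContext` are hypothetical, and the estimate of four characters per token is a rough heuristic.

```typescript
// Hypothetical shape for a retrieved chunk; not the iMemory SDK's type.
interface Chunk {
  text: string;
  score: number; // higher means more relevant
}

// Rough estimate of about 4 characters per token for English text.
// A real tokenizer for the target model would be more accurate.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keep the highest-scoring chunks that still fit in the budget.
function packContext(chunks: Chunk[], maxTokens: number): string {
  const ranked = [...chunks].sort((a, b) => b.score - a.score);
  const kept: string[] = [];
  let used = 0;
  for (const chunk of ranked) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > maxTokens) continue; // skip chunks that would overflow
    kept.push(chunk.text);
    used += cost;
  }
  return kept.join("\n");
}

const packed = packContext(
  [
    { text: "Deadline is Friday.", score: 0.9 },
    { text: "User prefers morning meetings and short written summaries.", score: 0.5 },
    { text: "Timezone: UTC+2.", score: 0.7 },
  ],
  10,
);
console.log(packed);
```

With a budget of 10 estimated tokens, the two short, high-scoring facts fit and the longer, lower-scoring one is left out.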
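import OpenAI from "openai";
import { iMemory } from "imemory";

// The OpenAI client reads OPENAI_API_KEY from the environment
const openai = new OpenAI();
const memory = new iMemory({ namespace: "user_42" });

// Example input; in your app this comes from the user
const userMessage = "When is the next milestone due?";

// Recall what matters
const context = await memory.recall("project deadline preferences");

// Inject into your LLM call
const reply = await openai.chat.completions.create({
  model: "gpt-5",
  messages: [
    { role: "system", content: context },
    { role: "user", content: userMessage },
  ],
});

// Remember what's new (content can be null, so fall back to an empty string)
await memory.remember(reply.choices[0].message.content ?? "");
Pricing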
For teams shipping AI features.
For regulated and high-scale workloads.
Spin up your first memory store in under a minute. No credit card, no setup calls.