The first thing people do when something’s slow is add a cache. The second thing they do is add Redis. I’ve watched teams spin up a whole Redis instance to cache a config object that changes once a day and fits in a kilobyte.
You don’t always need Redis. Sometimes a plain Map in your process is faster, simpler, and one fewer thing that can page you at 3am. The trick is knowing which situation you’re in.
I’ve shipped both. I’ve also picked the wrong one and regretted it. Here’s how I think about it now.
The simplest cache that works
Before Redis, before any library, this is the whole thing:
const cache = new Map<string, { value: unknown; expires: number }>()

function get<T>(key: string): T | undefined {
  const hit = cache.get(key)
  if (!hit) return undefined
  if (Date.now() > hit.expires) {
    cache.delete(key)
    return undefined
  }
  return hit.value as T
}

function set(key: string, value: unknown, ttlMs: number) {
  cache.set(key, { value, expires: Date.now() + ttlMs })
}
No network call. No serialization. No connection pool. The lookup is a hash map access in your own process’s memory, which is about as fast as a cache read gets. For a lot of problems, this is the right answer and you can stop reading.
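In practice I rarely call get and set by hand. I wrap them in a read-through helper: check the cache, and on a miss call a loader and store what it returns. Here's a minimal sketch of that idea. The names getOrLoad, ttlCache and loadConfig, and the injectable now parameter, are mine for illustration; the Map has the same shape as above and is repeated only so the block runs on its own.

```typescript
type Entry = { value: unknown; expires: number }

const ttlCache = new Map<string, Entry>()

// Read-through: return the cached value if it is still fresh, otherwise call
// `load`, store the result for `ttlMs`, and return it.
// `now` is injectable only so the expiry logic is easy to test.
function getOrLoad<T>(
  key: string,
  ttlMs: number,
  load: () => T,
  now: () => number = Date.now,
): T {
  const hit = ttlCache.get(key)
  if (hit && now() <= hit.expires) return hit.value as T
  const value = load()
  ttlCache.set(key, { value, expires: now() + ttlMs })
  return value
}

// Hypothetical loader standing in for a file read or an HTTP call.
let loads = 0
function loadConfig(): { region: string } {
  loads++
  return { region: "eu-west-1" }
}

const first = getOrLoad("config", 60_000, loadConfig)
const second = getOrLoad("config", 60_000, loadConfig) // served from the Map
```

The second call never touches the loader, which is the whole point: the expensive thing runs once per TTL window, not once per request.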
When in-memory actually wins
In-memory caching wins when three things are true, and they’re true more often than people admit.
The data is the same for every user. Feature flags, config, a list of supported countries, exchange rates you fetch once a minute. If every request wants the same value, caching it per-process is perfect. You’re not duplicating much, and a stale copy on one server for a few seconds isn’t a problem.
The cache can be wrong for a little while. Most caches can. If your homepage shows a product count that’s 30 seconds behind, nobody dies. In-memory caches with a short TTL are great here.
You’d have to hit Redis on the hot path anyway. This is the one people miss. A Redis lookup is a network round trip. Fast, sure, often under a millisecond inside the same data center. But it’s not free, and it’s not zero-risk. If you’re caching something read on every single request, an in-memory layer in front of Redis cuts real latency. I’ve put a 5-second in-memory cache in front of a Redis cache and watched p99 drop, because the hot keys never left the process.
The mental model: in-memory is a cache that lives where the work happens. No hop. That locality is the entire advantage, and it’s a big one.
Where in-memory falls apart
I’m not selling you a silver bullet. In-memory has real failure modes, and they’re the same ones I wrote about when I got a rate limiter wrong.
It doesn’t survive a restart. Deploy and the cache is empty. Usually fine. Sometimes a thundering herd of cold requests all miss at once and stampede your database. If a cold cache can knock over your DB, you need to think harder.
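One way to "think harder" about a cold-cache stampede is to coalesce concurrent misses, so that many simultaneous requests for the same key share one load. This is a sketch of that technique (often called single-flight), not something from the setup described in this post. The names singleFlight, inFlight and fakeDbLookup are hypothetical.

```typescript
// Loads currently in progress, keyed by cache key.
const inFlight = new Map<string, Promise<unknown>>()

// If a load for `key` is already running, join it instead of starting another.
function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const pending = inFlight.get(key)
  if (pending) return pending as Promise<T>

  const p = load().finally(() => {
    // Clear the slot whether the load succeeded or failed, so a failure
    // is retried on the next call instead of being remembered.
    inFlight.delete(key)
  })
  inFlight.set(key, p)
  return p
}

// Hypothetical stand-in for a slow database query.
let dbCalls = 0
async function fakeDbLookup(id: string): Promise<string> {
  dbCalls++
  return `user-${id}`
}

// Three concurrent misses for the same key share a single lookup.
const results = Promise.all(
  [1, 2, 3].map(() => singleFlight("user:42", () => fakeDbLookup("42"))),
)
```

This only coalesces within one process. Four freshly deployed workers can still send four queries for the same key, but that is four instead of four thousand.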
It doesn’t scale past one process. Run four workers and you have four copies, each with its own idea of the truth. For read-only config that’s fine. For anything you write to, like a session count or a rate-limit counter, four independent copies is a bug. I covered exactly this in the rate-limiter post: two workers both see count: 99 and both let the request through.
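To see why independent copies are a bug for counters, here's a small simulation. It isn't the exact race from the rate-limiter post, just the same root cause: each "worker" below is a stand-in for a separate process with its own memory, and makeWorker, LIMIT and the alternating load balancer are all made up for the example.

```typescript
const LIMIT = 100

// Each worker keeps its own private counts, like a per-process Map would.
function makeWorker() {
  const counts = new Map<string, number>()
  return {
    allow(user: string): boolean {
      const next = (counts.get(user) ?? 0) + 1
      counts.set(user, next)
      return next <= LIMIT
    },
  }
}

const workerA = makeWorker()
const workerB = makeWorker()

// A load balancer alternates 200 requests from one user across two workers.
let allowed = 0
for (let i = 0; i < 200; i++) {
  const worker = i % 2 === 0 ? workerA : workerB
  if (worker.allow("user:1")) allowed++
}
// Each worker only ever saw 100 requests, so all 200 got through.
```

The limit was 100 and the user made 200 requests without a single rejection. Every worker was correct by its own count and the system as a whole was wrong.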
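It grows forever if you let it. A Map with no eviction is a memory leak with a TTL field that nobody enforces. The lazy delete on read (like above) only removes keys that get read again after they expire. A key that’s written once and never requested again stays in the Map forever. You need either a periodic sweep or a real eviction policy. Reach for an LRU library here. lru-cache is boring and correct and I use it without thinking.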
import { LRUCache } from "lru-cache"

const cache = new LRUCache<string, User>({
  max: 5000, // cap the entries, evict the least recently used
  ttl: 60_000, // and expire after a minute
})
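max is the line that saves you. It caps the number of entries no matter what your traffic does. Note that it counts entries, not bytes. If your values vary a lot in size, lru-cache also accepts maxSize together with a sizeCalculation function.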
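If you want to see what an entry cap does mechanically, here's a toy LRU built on a plain Map. To be clear, this is not how lru-cache is implemented and I wouldn't ship it; TinyLru is a made-up name and the sketch leaves out TTLs entirely. It relies on one thing I'm sure of: a Map iterates its keys in insertion order.

```typescript
class TinyLru<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly max: number

  constructor(max: number) {
    this.max = max
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.max) {
      // Map iterates in insertion order, so the first key is the
      // least recently used one.
      const oldest = this.entries.keys().next().value as K
      this.entries.delete(oldest)
    }
  }

  get size(): number {
    return this.entries.size
  }
}

const lru = new TinyLru<string, number>(2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a") // touch "a", so "b" is now the oldest
lru.set("c", 3) // over the cap: "b" is evicted
```

However many keys you throw at it, the Map never holds more than max entries. That bound is what a bare Map with TTLs doesn't give you.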
When you actually need Redis
Redis earns its keep the moment state has to be shared or survive.
Multiple processes need the same value. A rate-limit counter, a job lock, a session that any worker might serve. The instant correctness depends on every process agreeing, you need state that lives outside any one process. That’s Redis (or similar). Atomic INCR is the whole reason Redis works for counters like this: the increment happens on the server, so every worker gets back a distinct count.
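Here's roughly what a shared counter looks like in code: a fixed-window rate limiter built on INCR. Treat it as a sketch under assumptions. The CounterStore interface is modeled on the incr(key) and expire(key, seconds) methods as ioredis exposes them, so check your own client. allowRequest and the rate: key prefix are names I made up, and FakeStore is an in-memory stand-in so the example runs without a server.

```typescript
// The two commands this sketch needs (shapes modeled on ioredis).
interface CounterStore {
  incr(key: string): Promise<number>
  expire(key: string, seconds: number): Promise<number>
}

// Fixed-window limiter: at most `limit` requests per `windowSeconds`.
async function allowRequest(
  store: CounterStore,
  userId: string,
  limit: number,
  windowSeconds: number,
): Promise<boolean> {
  const key = `rate:${userId}`
  // INCR runs atomically on the server, so two workers can never both
  // read 99 and both write 100. Each caller gets its own count back.
  const count = await store.incr(key)
  if (count === 1) {
    // The first hit in a window starts the clock.
    await store.expire(key, windowSeconds)
  }
  return count <= limit
}

// In-memory stand-in for tests and demos. It ignores expiry.
class FakeStore implements CounterStore {
  private counts = new Map<string, number>()

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1
    this.counts.set(key, next)
    return next
  }

  async expire(_key: string, _seconds: number): Promise<number> {
    return 1
  }
}
```

One caveat: INCR and EXPIRE are two separate calls here. If a worker dies between them, the key never expires. For anything that matters I'd run both in a transaction or a small Lua script so they succeed or fail together.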
The cache must survive a restart. Anything where a cold start causes a stampede or a security gap. Brute-force protection that resets on deploy is a hole, not a cache.
The data is big or per-user at scale. Caching one config object in memory is nothing. Caching a million users' sessions in every process is how you OOM. Push that to Redis and let each process hold a connection, not a copy.
You need pub/sub or expiry events or any of the actual Redis features. At that point you’re not really caching, you’re using a data store, and that’s fine. Use the right tool.
The setup I actually run
For most services I run both, layered, and it’s not as complicated as it sounds.
async function getUser(id: string): Promise<User> {
  // L1: in-process, microseconds, dies on restart
  const local = memCache.get<User>(id)
  if (local !== undefined) return local

  // L2: redis, shared across workers, survives restart
  const cached = await redis.get(`user:${id}`)
  if (cached) {
    const user = JSON.parse(cached) as User
    memCache.set(id, user, 5_000) // promote into L1 briefly
    return user
  }

  // miss: hit the source, fill both
  const user = await db.users.findById(id)
  // ioredis-style arguments; node-redis takes an options object ({ PX: 60_000 }) instead
  await redis.set(`user:${id}`, JSON.stringify(user), "PX", 60_000)
  memCache.set(id, user, 5_000)
  return user
}
The in-memory layer has a short TTL on purpose. Five seconds. Long enough to absorb the hot path, short enough that a stale value can’t hang around. Redis is the source of cache truth across workers. The database is the source of actual truth. Three layers, each doing the one job it’s good at.
That short L1 TTL is the part people skip, and it’s the part that makes the whole thing safe. You get the speed of local without committing to local being right for long.
How I actually decide
I ask one question first: does anything other than this process need to agree on this value?
If no, start in memory. Add a max and a TTL and move on. Don’t add infrastructure to cache a config blob.
If yes, you need shared state, and that’s Redis. Then ask the second question: is this read so often that even Redis is on the hot path? If yes, put a short-lived in-memory layer in front of it.
That’s the whole decision tree. Two questions. Most of the time the answer to the first one is “no” and you just saved yourself an entire service to operate.
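If it helps to see the two questions written down as code, here's the same tree as a function. It's a toy: chooseCache, CachePlan and the option names are made up for this post, and real decisions have more nuance than two booleans.

```typescript
type CachePlan = "memory" | "redis" | "memory-in-front-of-redis"

function chooseCache(opts: {
  sharedAcrossProcesses: boolean // does anything else need to agree on this value?
  readOnHotPath: boolean // is it read so often that even Redis is on the hot path?
}): CachePlan {
  // Question 1: nobody else needs to agree, so keep it local.
  if (!opts.sharedAcrossProcesses) return "memory"
  // Question 2: shared state goes in Redis, with a short-lived local layer
  // in front if the reads are hot enough to justify it.
  return opts.readOnHotPath ? "memory-in-front-of-redis" : "redis"
}

const configBlob = chooseCache({ sharedAcrossProcesses: false, readOnHotPath: true })
const rateLimit = chooseCache({ sharedAcrossProcesses: true, readOnHotPath: false })
const hotSession = chooseCache({ sharedAcrossProcesses: true, readOnHotPath: true })
```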
Caching isn’t about picking Redis or memory. It’s about knowing what has to be shared and what has to survive, and then putting each piece of data where it belongs. Reach for memory by default, because it’s faster and there’s nothing to run. Reach for Redis the moment correctness depends on every process agreeing. And if something’s read on every request, there’s no shame in using both. The fastest cache is still the one that doesn’t leave the process.