Building a Redis clone in Rust: TTL, pattern matching, and active expiration
Phase 2 left me with a working key-value store, but with a known issue: expired keys would stick around in memory forever if nobody accessed them. This phase fixes that, and adds a few more commands along the way.
The main additions are:
- EXPIRE, TTL, PERSIST for managing key lifetimes
- KEYS for finding keys by pattern
- Active expiration, a background task that actually cleans up expired keys
The TTL commands
These three commands are easy to implement because the existing StoredValue structure already tracks expires_at:
```rust
pub struct StoredValue {
    pub data: Vec<u8>,
    pub expires_at: Option<Instant>,
}
```
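The is_expired helper used throughout these snippets is a one-liner. My guess at its shape, with the struct repeated so the snippet stands alone:

```rust
use std::time::Instant;

pub struct StoredValue {
    pub data: Vec<u8>,
    pub expires_at: Option<Instant>,
}

impl StoredValue {
    // A key without a deadline never expires; otherwise compare against now.
    pub fn is_expired(&self) -> bool {
        match self.expires_at {
            Some(deadline) => Instant::now() >= deadline,
            None => false,
        }
    }
}
```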
EXPIRE
EXPIRE key seconds sets a timeout on a key. One Redis behavior I wanted to match: negative or zero seconds should delete the key immediately.
```rust
pub async fn expire(&self, key: &str, seconds: i64) -> i64 {
    let mut write_guard = self.data.write().await;

    // Handle negative/zero seconds - delete the key
    if seconds <= 0 {
        if let Some(value) = write_guard.get(key)
            && !value.is_expired()
        {
            write_guard.remove(key);
            return 1;
        }
        write_guard.remove(key); // Clean up if expired
        return 0;
    }

    // Set expiration on existing non-expired key
    if let Some(value) = write_guard.get_mut(key) {
        if value.is_expired() {
            write_guard.remove(key);
            return 0;
        }
        value.expires_at = Some(Instant::now() + Duration::from_secs(seconds as u64));
        1
    } else {
        0
    }
}
```
The return value follows Redis semantics: 1 if the timeout was set (or key was deleted for negative values), 0 if the key doesn’t exist.
TTL
TTL key returns how many seconds until a key expires. Redis has specific return values: -2 if the key doesn’t exist, -1 if the key exists but has no expiration, otherwise the remaining seconds.
```rust
pub async fn ttl(&self, key: &str) -> i64 {
    let read_guard = self.data.read().await;
    if let Some(value) = read_guard.get(key) {
        if value.is_expired() {
            drop(read_guard);
            self.data.write().await.remove(key);
            return -2;
        }
        match value.expires_at {
            Some(expires_at) => {
                let now = Instant::now();
                if expires_at > now {
                    (expires_at - now).as_secs() as i64
                } else {
                    -2
                }
            }
            None => -1,
        }
    } else {
        -2
    }
}
```
PERSIST
PERSIST key removes the expiration from a key, making it permanent again:
```rust
pub async fn persist(&self, key: &str) -> i64 {
    let mut write_guard = self.data.write().await;
    if let Some(value) = write_guard.get_mut(key) {
        if value.is_expired() {
            write_guard.remove(key);
            return 0;
        }
        if value.expires_at.is_some() {
            value.expires_at = None;
            1
        } else {
            0
        }
    } else {
        0
    }
}
```
Glob pattern matching for KEYS
The KEYS pattern command finds all keys matching a glob pattern. Redis supports * (any sequence), ? (single character), and character classes like [abc] or [a-z]. I went with just * and ? for now.
The naive recursive solution is actually pretty clean:
```rust
fn glob_match(pattern: &str, text: &str) -> bool {
    let pattern: Vec<char> = pattern.chars().collect();
    let text: Vec<char> = text.chars().collect();
    glob_match_recursive(&pattern, &text, 0, 0)
}

fn glob_match_recursive(pattern: &[char], text: &[char], pi: usize, ti: usize) -> bool {
    // Base case: pattern exhausted
    if pi == pattern.len() {
        return ti == text.len();
    }

    match pattern[pi] {
        '*' => {
            // Try matching * with 0 or more characters
            for i in ti..=text.len() {
                if glob_match_recursive(pattern, text, pi + 1, i) {
                    return true;
                }
            }
            false
        }
        '?' => {
            // Match exactly one character
            if ti < text.len() {
                glob_match_recursive(pattern, text, pi + 1, ti + 1)
            } else {
                false
            }
        }
        c => {
            // Match literal character
            if ti < text.len() && text[ti] == c {
                glob_match_recursive(pattern, text, pi + 1, ti + 1)
            } else {
                false
            }
        }
    }
}
```
The * wildcard is the interesting one. It tries matching zero characters first, then one, then two, and so on. If any of those paths leads to a complete match, we’re done. This backtracking can go exponential in the worst case (patterns with several *s), but for typical Redis key patterns like user:* or session:???, it’s fast enough.
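As a sanity check, each arm of the matcher collapses to a single boolean expression; this condensed form behaves identically to the version above and is reproduced in full so it runs standalone:

```rust
fn glob_match(pattern: &str, text: &str) -> bool {
    let pattern: Vec<char> = pattern.chars().collect();
    let text: Vec<char> = text.chars().collect();
    glob_match_recursive(&pattern, &text, 0, 0)
}

fn glob_match_recursive(pattern: &[char], text: &[char], pi: usize, ti: usize) -> bool {
    if pi == pattern.len() {
        return ti == text.len();
    }
    match pattern[pi] {
        // * matches zero or more characters: try every split point
        '*' => (ti..=text.len()).any(|i| glob_match_recursive(pattern, text, pi + 1, i)),
        // ? consumes exactly one character
        '?' => ti < text.len() && glob_match_recursive(pattern, text, pi + 1, ti + 1),
        // A literal character must match exactly
        c => ti < text.len() && text[ti] == c && glob_match_recursive(pattern, text, pi + 1, ti + 1),
    }
}
```

Cases worth checking: `glob_match("user:*", "user:1001")` and `glob_match("*", "")` should hold, while `glob_match("?", "")` should not — `?` must consume a character, `*` may match nothing.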
The KEYS implementation itself just iterates through all keys and filters:
```rust
pub async fn keys(&self, pattern: &str) -> Vec<String> {
    let read_guard = self.data.read().await;
    let mut matching_keys = Vec::new();
    let mut expired_keys = Vec::new();

    for (key, value) in read_guard.iter() {
        if value.is_expired() {
            expired_keys.push(key.clone());
        } else if glob_match(pattern, key) {
            matching_keys.push(key.clone());
        }
    }
    drop(read_guard);

    // Clean up expired keys
    if !expired_keys.is_empty() {
        let mut write_guard = self.data.write().await;
        for key in expired_keys {
            write_guard.remove(&key);
        }
    }

    matching_keys
}
```
This is why Redis warns against using KEYS in production: it scans every key in the database. The SCAN command exists for this reason, but that’s a future problem.
Active expiration
This was the main reason for Phase 3. In Phase 2, I mentioned that lazy expiration alone means keys that are never accessed stay in memory forever. Redis solves this with periodic sampling. I say “Redis solves this”, but the Redis docs have changed over the years and no longer spell out the current algorithm. Based on articles like this X Engineering Blog post and this HackerNoon post, though, this is the generally agreed-upon approach Redis at least used to take.
The algorithm:
- Every 100ms, sample up to 20 random keys
- Delete any that are expired
- If more than 25% were expired, repeat immediately
If you’re finding a lot of expired keys, there are probably more, so keep going. If most keys are fine, stop sampling and wait for the next cycle.
```rust
pub fn start_active_expiration(store: Store) -> tokio::task::JoinHandle<()> {
    tokio::spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_millis(100));
        loop {
            interval.tick().await;
            store.expire_random_keys().await;
        }
    })
}
```
```rust
async fn expire_random_keys(&self) {
    const SAMPLE_SIZE: usize = 20;
    const EXPIRY_THRESHOLD: f64 = 0.25;

    loop {
        let keys_to_check: Vec<String> = {
            let read_guard = self.data.read().await;
            if read_guard.is_empty() {
                return;
            }
            read_guard.keys().take(SAMPLE_SIZE).cloned().collect()
        };
        if keys_to_check.is_empty() {
            return;
        }

        let mut expired_count = 0;
        let mut expired_keys = Vec::new();
        {
            let read_guard = self.data.read().await;
            for key in &keys_to_check {
                if let Some(value) = read_guard.get(key)
                    && value.is_expired()
                {
                    expired_keys.push(key.clone());
                    expired_count += 1;
                }
            }
        }

        if !expired_keys.is_empty() {
            let mut write_guard = self.data.write().await;
            for key in expired_keys {
                write_guard.remove(&key);
            }
        }

        // If less than 25% were expired, stop
        let ratio = expired_count as f64 / keys_to_check.len() as f64;
        if ratio < EXPIRY_THRESHOLD {
            return;
        }
        // Otherwise, continue sampling
    }
}
```
One thing I’m not doing that Redis does: actual random sampling. I’m just taking the first 20 keys from the iterator, which in Rust’s HashMap is pseudo-random due to the hash function, but not truly random¹. For a learning project, this is fine. For production, you’d want proper random sampling to avoid pathological cases where certain keys never get checked.
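If I wanted uniform samples without pulling in the rand crate, reservoir sampling over the key iterator is one stdlib-only option. A sketch with a toy xorshift generator (the names here, like sample_keys, are mine, not from the project):

```rust
// Toy xorshift64 RNG so the sketch stays dependency-free.
// A real implementation would use the `rand` crate instead.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

// Reservoir sampling: after one pass, each key is in the
// reservoir with equal probability k/n.
fn sample_keys<'a>(
    keys: impl Iterator<Item = &'a String>,
    k: usize,
    rng: &mut XorShift64,
) -> Vec<String> {
    let mut reservoir: Vec<String> = Vec::with_capacity(k);
    for (i, key) in keys.enumerate() {
        if i < k {
            // Fill the reservoir with the first k keys.
            reservoir.push(key.clone());
        } else {
            // Replace a random slot with probability k / (i + 1).
            let j = (rng.next_u64() % (i as u64 + 1)) as usize;
            if j < k {
                reservoir[j] = key.clone();
            }
        }
    }
    reservoir
}
```

Note this still walks every key each cycle, so it fixes the bias, not the cost; Redis reportedly samples directly from its internal dict structure to avoid the full scan.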
The background task is started when the server starts:
```rust
pub async fn run(&self) -> Result<()> {
    let _expiration_handle = Store::start_active_expiration(self.store.clone());

    loop {
        let (socket, addr) = self.listener.accept().await?;
        // ... handle connection
    }
}
```
The JoinHandle is stored but not awaited. The task runs independently in the background.
Testing active expiration
Integration testing active expiration is awkward because it involves timing. We need to verify that keys expire without accessing them; otherwise we’d just be testing lazy expiration.
```rust
#[test]
fn test_redis_cli_active_expiration_without_access() {
    // ... setup ...

    // Create keys with SHORT TTL (2 seconds)
    for i in 1..=3 {
        let key = format!("noaccess{}", i);
        run_redis_cli(&["SETEX", &key, "2", &value]).unwrap();
    }

    // Verify keys exist using KEYS (not GET!)
    let result = run_redis_cli(&["KEYS", "noaccess*"]);
    // ... assert keys exist ...

    // CRITICAL: Do NOT access these keys with GET
    // This ensures deletion is via active expiration, not lazy deletion

    // Wait for TTL + buffer for background task
    std::thread::sleep(std::time::Duration::from_secs(3));

    // Verify keys are gone WITHOUT having accessed them
    let result = run_redis_cli(&["KEYS", "noaccess*"]);
    // ... assert keys are gone ...
}
```
The comment matters: if we used GET to check if the keys expired, we’d trigger lazy expiration and wouldn’t be testing the background task at all.
Benchmarks
I extended the benchmark script to cover the new commands:
TTL commands (50 clients):
| Command | Redis (req/s) | Rudis (req/s) | Ratio |
|---|---|---|---|
| TTL | 202,429 | 132,626 | 0.65x |
| EXPIRE | 205,761 | 134,048 | 0.65x |
| PERSIST | 179,211 | 132,979 | 0.74x |
KEYS scaling:
| Command | Redis (req/s) | Rudis (req/s) | Ratio |
|---|---|---|---|
| KEYS * (avg) | 7,692 | 3,571 | 0.39x |
| KEYS pattern (avg) | 12,500 | 4,762 | 0.30x |
The TTL commands are in the same ballpark as Phase 2’s GET/SET, about 65-75% of Redis speed. That’s consistent with the general overhead of Tokio’s async runtime vs Redis’s custom event loop.
The KEYS performance is notably worse at 30-40% of Redis. Pattern matching adds overhead, and my recursive glob matcher isn’t optimized. Redis likely uses a more efficient algorithm, possibly with early termination optimizations. Since KEYS isn’t meant for production use anyway (it blocks the entire database), I’m not worried about optimizing it now.
Things I learned
HashMap iteration order isn’t random. When sampling keys for active expiration, I’m taking the first N keys from the iterator. This depends on Rust’s HashMap using SipHash by default. It’s “random enough” for my purposes, but a real implementation would use proper random sampling.
Testing timing-dependent behavior is tricky. The active expiration tests need to sleep for a few seconds and hope the background task runs. Flaky tests waiting to happen. In a more serious project, I’d inject a clock abstraction to make this deterministic.
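For the record, the injection could look something like this (a hypothetical sketch, none of these types exist in the project; for async timers, Tokio’s own tokio::time::pause and advance test utilities give similar control):

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

// Hypothetical clock abstraction: the store would hold an Arc<dyn Clock>
// and call clock.now() wherever it currently calls Instant::now().
trait Clock: Send + Sync {
    fn now(&self) -> Instant;
}

// Production clock: defers to the real time source.
struct SystemClock;
impl Clock for SystemClock {
    fn now(&self) -> Instant {
        Instant::now()
    }
}

// Test clock: time only moves when the test says so.
struct MockClock(Mutex<Instant>);
impl MockClock {
    fn advance(&self, d: Duration) {
        *self.0.lock().unwrap() += d;
    }
}
impl Clock for MockClock {
    fn now(&self) -> Instant {
        *self.0.lock().unwrap()
    }
}
```

A test could then set a 2-second TTL, call advance(Duration::from_secs(3)), and assert the key is gone: no sleeping, no flakes.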
Glob matching looks simple until you try to do it right. The recursive solution is correct, but there’s a reason libraries like glob exist. Character classes, escaping, and edge cases add up. I’m leaving full glob support for a future phase.
What’s next
Phase 3 wraps up the core TTL functionality. The data store now properly manages key lifetimes through both lazy deletion on access and active background cleanup.
Phase 3.5 (if I get to it) would add full glob pattern support for KEYS: character classes like [abc], negation with [^abc], and ranges like [a-z]. These are documented in the README as future work.
Phase 4 is persistence, RDB snapshots and AOF logging. That’s where I’ll need to serialize the entire store to disk and handle recovery on startup.
Code is here: github.com/aleksandar-had/rudis
Footnotes

1. From the Rust Book: “iterating over a hash map happens in an arbitrary order”