Your app performs well with MemoryCache for 1K users. But when traffic scales to 10K users across three load-balanced servers, cache misses explode and response times spike to 800ms.

I’ve seen production APIs crash under load because the team relied solely on MemoryCache. Here’s how we fixed it with hybrid caching strategies that combine the speed of local memory with the consistency of distributed cache.

TL;DR

  • MemoryCache works for single servers but fails in distributed environments
  • Hybrid caching uses MemoryCache (L1) + Redis (L2) for best performance
  • Multi-tenant apps need tenant-scoped cache keys to prevent data leakage
  • System.Text.Json provides the best balance of performance and debuggability
  • Monitor cache hit ratios per tenant and cache level for optimal tuning

MemoryCache works great for single-server applications, but it hits hard limits in distributed environments. This guide covers hybrid caching strategies that keep your multi-tenant ASP.NET Core apps fast and scalable, with real benchmarks and production-ready code.

MemoryCache Limitations in Production

IMemoryCache stores data in your application’s memory space. It’s fast, simple, and a good fit for caching computed values or the results of expensive database queries.

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly IProductRepository _repository;

    public ProductService(IMemoryCache cache, IProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task<Product> GetProductAsync(int id)
    {
        if (_cache.TryGetValue($"product:{id}", out Product cachedProduct))
            return cachedProduct;

        var product = await _repository.GetByIdAsync(id);
        _cache.Set($"product:{id}", product, TimeSpan.FromMinutes(15));
        return product;
    }
}

This approach breaks down when you scale horizontally:

  • No shared state: Each server maintains separate cache instances
  • Memory waste: Duplicate data across all servers
  • Cache inconsistency: Server A updates data, but Server B still serves stale cache
  • Cold starts: Every restart loses all cached data

Distributed Cache Options

Distributed caching solves multi-server problems by storing cache data externally. Here’s how the main options compare:

Solution    | Persistence | Clustering | Eviction Policies | Best For
Redis       | Yes         | Yes        | LRU, LFU, TTL     | High-traffic apps, complex data
SQL Server  | Yes         | Limited    | TTL only          | Corporate environments, existing SQL infrastructure
NCache      | Yes         | Yes        | Multiple          | Enterprise apps with budget
Memcached   | No          | Yes        | LRU               | Simple key-value scenarios

Redis typically wins for most ASP.NET Core applications due to its reliability, clustering support, and rich data structures.
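
If you go with Redis, wiring up IDistributedCache takes a few lines in Program.cs. A minimal sketch, assuming the Microsoft.Extensions.Caching.StackExchangeRedis package and a placeholder connection string:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddMemoryCache(); // L1 for the hybrid pattern below
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379"; // placeholder; use your endpoint
    options.InstanceName = "myapp:";          // key prefix to avoid collisions
});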

Hybrid Caching: Best of Both Worlds

The hybrid pattern uses MemoryCache as L1 (ultra-fast local cache) and distributed cache as L2 (shared truth). This gives you sub-millisecond local reads while maintaining consistency across servers.

public class HybridCacheService
{
    private readonly IMemoryCache _l1Cache;
    private readonly IDistributedCache _l2Cache;
    private readonly ILogger<HybridCacheService> _logger;

    public HybridCacheService(IMemoryCache l1Cache, IDistributedCache l2Cache,
        ILogger<HybridCacheService> logger)
    {
        _l1Cache = l1Cache;
        _l2Cache = l2Cache;
        _logger = logger;
    }

    public async Task<T> GetAsync<T>(string key, Func<Task<T>> factory,
        TimeSpan? absoluteExpiration = null, CancellationToken cancellationToken = default)
        where T : class
    {
        // L1 cache hit
        if (_l1Cache.TryGetValue(key, out T cachedValue))
        {
            _logger.LogDebug("L1 cache hit for key: {Key}", key);
            return cachedValue;
        }
        
        // L2 cache check
        var distributedData = await _l2Cache.GetStringAsync(key, cancellationToken);
        if (!string.IsNullOrEmpty(distributedData))
        {
            var deserializedValue = JsonSerializer.Deserialize<T>(distributedData,
                OptimizedJsonOptions.Default);

            if (deserializedValue is not null)
            {
                // Populate L1 from L2 with a short TTL (see the ratio note below)
                _l1Cache.Set(key, deserializedValue, TimeSpan.FromMinutes(5));
                _logger.LogDebug("L2 cache hit, L1 populated for key: {Key}", key);
                return deserializedValue;
            }
        }
        
        // Cache miss - fetch from source
        var freshValue = await factory();
        
        // Populate both caches
        var serializedValue = JsonSerializer.Serialize(freshValue, OptimizedJsonOptions.Default);
        await _l2Cache.SetStringAsync(key, serializedValue, 
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = absoluteExpiration ?? TimeSpan.FromHours(1)
            }, cancellationToken);
            
        _l1Cache.Set(key, freshValue, TimeSpan.FromMinutes(5));
        _logger.LogInformation("Cache miss, both levels populated for key: {Key}", key);
        
        return freshValue;
    }
}

Expert Insight: Keep the L1 TTL a small fraction of the L2 TTL, roughly 1/6 to 1/12, so stale local copies age out quickly. The 5-minute L1 vs 1-hour L2 pairing above works well for most ASP.NET Core applications.
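
One way to encode that rule of thumb is a small helper that derives the L1 TTL from whatever L2 TTL the caller passes. The helper and its clamp values are my own convention, not a framework API:

private static TimeSpan GetL1Ttl(TimeSpan l2Ttl)
{
    // 1/12 of the L2 TTL, clamped: a 1-hour L2 yields the 5-minute L1 used above
    var l1 = TimeSpan.FromTicks(l2Ttl.Ticks / 12);
    if (l1 < TimeSpan.FromSeconds(30)) l1 = TimeSpan.FromSeconds(30);
    if (l1 > TimeSpan.FromMinutes(5)) l1 = TimeSpan.FromMinutes(5);
    return l1;
}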

Multi-Tenant Caching Patterns

Multi-tenant applications need tenant-isolated caching to prevent data leakage and ensure proper cache invalidation.

public class TenantAwareCacheService
{
    private readonly HybridCacheService _cache;
    private readonly ITenantProvider _tenantProvider;
    private readonly IDatabase _redisDatabase; // For Redis-specific operations

    public TenantAwareCacheService(HybridCacheService cache,
        ITenantProvider tenantProvider, IDatabase redisDatabase)
    {
        _cache = cache;
        _tenantProvider = tenantProvider;
        _redisDatabase = redisDatabase;
    }

    public async Task<T> GetTenantDataAsync<T>(string dataKey, 
        Func<Task<T>> factory, TimeSpan? expiration = null) where T : class
    {
        var tenantId = _tenantProvider.GetCurrentTenantId();
        var scopedKey = $"tenant:{tenantId}:{dataKey}";
        
        return await _cache.GetAsync(scopedKey, factory, expiration);
    }
    
    public async Task InvalidateTenantCacheAsync(string pattern = "*")
    {
        var tenantId = _tenantProvider.GetCurrentTenantId();
        var searchPattern = $"tenant:{tenantId}:{pattern}";
        
        // Redis-specific bulk deletion: KeysAsync pages with SCAN under the
        // hood, so it won't block the server the way KEYS would.
        // Note: this clears L2 only; each server's L1 entries survive until
        // their short TTL expires, which is why the 5-minute L1 window matters.
        var server = _redisDatabase.Multiplexer.GetServer(
            _redisDatabase.Multiplexer.GetEndPoints().First());

        await foreach (var key in server.KeysAsync(pattern: searchPattern))
        {
            await _redisDatabase.KeyDeleteAsync(key);
        }
    }
}

This pattern ensures tenant A’s cached feature flags don’t interfere with tenant B’s configuration, while still allowing efficient bulk invalidation per tenant.

Production Tip: Always monitor cache hit ratios per tenant. Multi-tenant traffic patterns differ drastically, and some tenants may have much higher cache efficiency than others.
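
A minimal sketch of that per-tenant tracking, using in-memory counters (a real deployment would export these to your metrics backend instead of holding them in process):

public class TenantCacheStats
{
    private readonly ConcurrentDictionary<string, (long Hits, long Misses)> _stats = new();

    public void RecordHit(string tenantId) =>
        _stats.AddOrUpdate(tenantId, (1, 0), (_, s) => (s.Hits + 1, s.Misses));

    public void RecordMiss(string tenantId) =>
        _stats.AddOrUpdate(tenantId, (0, 1), (_, s) => (s.Hits, s.Misses + 1));

    // Hit ratio in [0,1]; returns 0 for tenants with no traffic yet
    public double HitRatio(string tenantId) =>
        _stats.TryGetValue(tenantId, out var s) && s.Hits + s.Misses > 0
            ? (double)s.Hits / (s.Hits + s.Misses)
            : 0;
}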

Expiration and Eviction Strategies

Choose expiration patterns based on your data characteristics:

Absolute Expiration: Best for time-sensitive data like auth tokens or daily reports

_cache.Set("daily-report", data, DateTimeOffset.UtcNow.AddHours(24));

Sliding Expiration: Perfect for user sessions or frequently accessed reference data

_cache.Set("user-preferences", userData, 
    new MemoryCacheEntryOptions
    {
        SlidingExpiration = TimeSpan.FromMinutes(30)
    });

For distributed caches like Redis, understand the eviction policies:

  • allkeys-lru: Removes least recently used keys when memory is full
  • volatile-ttl: Removes keys with shortest TTL first
  • allkeys-random: Random eviction (fastest, but unpredictable)
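
The policy is normally set in redis.conf, but if you manage your own instance you can also set it from code at startup. A sketch with StackExchange.Redis (managed offerings such as Azure Cache for Redis often disable CONFIG commands, so treat this as optional):

var muxer = await ConnectionMultiplexer.ConnectAsync("localhost:6379");
var server = muxer.GetServer(muxer.GetEndPoints().First());

// Evict least recently used keys once maxmemory is reached
await server.ConfigSetAsync("maxmemory-policy", "allkeys-lru");
await server.ConfigSetAsync("maxmemory", "512mb");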

Serialization Performance Considerations

Serialization becomes critical in distributed caching. Here’s what I’ve learned from optimizing high-traffic APIs:

public static class OptimizedJsonOptions
{
    // Cache a single instance: JsonSerializerOptions caches type metadata
    // internally, so allocating a new one per call would throw that away.
    public static readonly JsonSerializerOptions Default = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
        PropertyNameCaseInsensitive = true,
        WriteIndented = false  // Smaller payload size
    };
}

Binary vs JSON Performance (tested with 1KB objects, 10K operations):

  • System.Text.Json: 2.3ms avg, 850 bytes
  • MessagePack: 1.8ms avg, 650 bytes
  • BinaryFormatter: 4.1ms avg, 1.2KB (deprecated in .NET 5+)

For most applications, System.Text.Json hits the sweet spot of performance and debuggability.
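
To reproduce this kind of comparison on your own payloads, a rough Stopwatch loop gives directional numbers (reach for BenchmarkDotNet if you need publishable ones). The Product shape here is an assumption; substitute your own object:

var sample = new Product { Id = 42, Name = "Widget" }; // swap in a ~1KB object
const int iterations = 10_000;

var sw = Stopwatch.StartNew();
byte[] payload = Array.Empty<byte>();
for (var i = 0; i < iterations; i++)
    payload = JsonSerializer.SerializeToUtf8Bytes(sample, OptimizedJsonOptions.Default);
sw.Stop();

Console.WriteLine(
    $"System.Text.Json: {sw.Elapsed.TotalMilliseconds / iterations:F4} ms/op, {payload.Length} bytes");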

Real-World Performance: E-commerce Platform Case Study

I implemented hybrid caching for a multi-tenant e-commerce platform handling 50K requests/minute. Here are the results that convinced the team to adopt this approach across all services:

Scenario: Product catalog with 100K items, 20 tenants, 4-server cluster

Strategy         | Avg Response Time | Cache Hit Rate | Memory Usage/Server
MemoryCache only | 145ms             | 60%            | 2.1GB
Redis only       | 89ms              | 85%            | 400MB
Hybrid cache     | 23ms              | 94%            | 800MB

The hybrid approach delivered 6x better performance than memory-only caching, with a 94% hit rate thanks to the two-tier design.

Lesson Learned: The 5-minute L1 TTL was crucial. Initial tests with 30-minute L1 TTL showed data consistency issues between servers. Shorter L1 expiration keeps all servers in sync while preserving most performance benefits.

Monitoring and Testing Cache Performance

Track these metrics to ensure your caching strategy works:

public class CacheMetrics
{
    private readonly IMetricsLogger _metrics;

    public CacheMetrics(IMetricsLogger metrics) => _metrics = metrics;

    public void TrackCacheHit(string level, string key)
    {
        _metrics.Increment($"cache.hit.{level}",
            new Dictionary<string, string> { ["key_pattern"] = GetPattern(key) });
    }

    public void TrackCacheMiss(string key, TimeSpan fetchTime)
    {
        _metrics.Increment("cache.miss");
        _metrics.Timing("cache.source_fetch_time", fetchTime);
    }

    // Collapse IDs so metrics group by key shape, e.g. "product:42" -> "product:*"
    private static string GetPattern(string key) =>
        string.Join(':', key.Split(':').Select(p => p.All(char.IsDigit) ? "*" : p));
}

Essential cache metrics:

  • Hit ratio: Target 85%+ for effective caching
  • L1 vs L2 hit distribution: Should favor L1 for hot data
  • Average fetch time: Track data source performance
  • Memory usage trends: Prevent cache bloat

Load test with realistic traffic patterns, not just synthetic benchmarks. Cache behavior changes dramatically under concurrent load.
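
A quick way to surface that concurrent behavior before a full load test is to hammer the cache from many workers at once. In this sketch, cache is the HybridCacheService from earlier; the key distribution and LoadProductFromDbAsync are made up for illustration:

// 100 concurrent workers, 10K lookups over a small hot key set,
// to expose stampedes and contention that sequential tests hide
await Parallel.ForEachAsync(Enumerable.Range(0, 10_000),
    new ParallelOptions { MaxDegreeOfParallelism = 100 },
    async (i, ct) =>
    {
        var productId = Random.Shared.Next(1, 50); // skewed, hot key set
        await cache.GetAsync($"product:{productId}",
            () => LoadProductFromDbAsync(productId), cancellationToken: ct);
    });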

Production Recommendations

Choose your caching strategy based on these decision factors:

Single-server application

  • Stick with MemoryCache
  • Simple, fast, no network overhead

Multi-server, eventually consistent data

  • Use distributed cache (Redis)
  • Configure appropriate TTL based on update frequency

Multi-server, performance-critical

  • Implement hybrid caching
  • Monitor L1/L2 hit ratios carefully
  • Test cache invalidation scenarios

Multi-tenant SaaS

  • Tenant-scoped cache keys are mandatory
  • Plan invalidation strategy per tenant
  • Consider cache isolation for security compliance

Audit your current caching approach before traffic scales. It’s easier to implement proper distributed caching patterns during normal load than during a performance crisis at 3 AM.
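
If you adopt the hybrid path, the composition root ties the pieces together. A wiring sketch using the services from this article (the lifetimes and HeaderTenantProvider are my assumptions, not prescribed by the pattern):

builder.Services.AddMemoryCache();
builder.Services.AddStackExchangeRedisCache(o =>
    o.Configuration = builder.Configuration.GetConnectionString("Redis"));
builder.Services.AddSingleton<IConnectionMultiplexer>(
    _ => ConnectionMultiplexer.Connect(builder.Configuration.GetConnectionString("Redis")!));
builder.Services.AddSingleton(sp =>
    sp.GetRequiredService<IConnectionMultiplexer>().GetDatabase());
builder.Services.AddSingleton<HybridCacheService>();
builder.Services.AddScoped<ITenantProvider, HeaderTenantProvider>(); // hypothetical impl
builder.Services.AddScoped<TenantAwareCacheService>();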

About the Author

Abhinaw Kumar is a software engineer who builds real-world systems: from resilient ASP.NET Core backends to clean, maintainable Angular frontends. With more than 11 years in production development, he shares what actually works when you're shipping software that has to last.

Read more on the About page or connect on LinkedIn.

Frequently Asked Questions

Why not just use MemoryCache in ASP.NET Core?

MemoryCache is limited to a single server instance and doesn’t scale across distributed or cloud environments. Hybrid caching combines in-memory and distributed caches for better scalability and resilience.

What is hybrid caching in ASP.NET Core?

Hybrid caching means combining MemoryCache with distributed options like Redis or SQL cache. Frequently accessed data is served from MemoryCache, while less frequently used or cross-node data comes from a distributed cache.

When should I use Redis in ASP.NET Core caching?

Use Redis when you need cross-server consistency, high availability, or large-scale distributed caching. It is a good fit for multi-tenant SaaS systems and APIs with high traffic.

Can I control cache eviction across layers?

Yes. Coordinate eviction by using consistent key naming and expiration rules across layers; in the hybrid pattern above, keeping the L1 TTL much shorter than the L2 TTL bounds how long the two levels can disagree after an update.
