Back to blog
March 2026

How I reduced API latency using aggressive caching

Most APIs don't feel slow because of logic. They feel slow because they keep fetching the same data again and again. While building CodePersona, I depended heavily on external APIs like GitHub's. The problem was simple: every request meant fresh API calls, which made responses inconsistent and sometimes painfully slow.

This created a bad user experience. Profiles did not load instantly, and I was creeping toward API rate limits. The real issue was repeatedly fetching data that rarely changes. So instead of optimizing computation, I focused on reducing how often I fetched data.

API Caching Strategy

The approach was simple: cache aggressively. First, I cached API responses after the initial request, so instead of hitting GitHub every time, most requests were served directly from cache. Second, I added a time-based expiry. Each cached response had a TTL, so the data stayed reasonably fresh without unnecessary calls. Third, I started storing processed data instead of raw responses, which removed the need to recompute anything on every request.
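The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not CodePersona's actual code: `TTLCache`, `fetch_profile`, and `fetch_fn` are hypothetical names, and `fetch_fn` stands in for whatever real GitHub API call you'd make.

```python
import time

class TTLCache:
    """Minimal in-memory cache with a per-entry time-to-live."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

def fetch_profile(username, cache, fetch_fn):
    """Serve from cache when fresh; otherwise fetch, process, and cache.

    `fetch_fn(username)` is a placeholder for the real API call.
    """
    cached = cache.get(username)
    if cached is not None:
        return cached  # step 1: most requests never reach the API
    raw = fetch_fn(username)
    # Step 3: store processed data, not the raw response, so nothing
    # is recomputed on later requests.
    processed = {"login": raw["login"], "repos": raw["public_repos"]}
    cache.set(username, processed)  # step 2: entry expires after the TTL
    return processed
```

The TTL is the knob that trades freshness for API calls: a longer TTL means fewer upstream requests but staler data, so pick it based on how often the underlying data actually changes.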

The difference was immediate. Response times dropped significantly, API call volume fell, and the product started feeling instant instead of reactive. A simple shift in thinking helped: most performance issues are not about the speed of your code. They are about how often you depend on slow systems. If your system depends on external APIs, caching is not optional. It is the system.