April 4, 2026

Infrastructure Lessons: Scaling a Node.js API for 100k+ Users in SA

Scaling an API from 1,000 "loyal users" to 100,000+ "impatient customers" is the point where technical debt becomes technical bankruptcy. In the South African context, we face unique challenges: latency to international data centers, high cloud costs (the AWS Cape Town region carries premium pricing), and inconsistent mobile connections.

After scaling several Node.js applications to the 100k user mark, here are the non-negotiable architectural lessons I’ve learned.

1. The "Single Thread" Myth and Horizontal Scaling

Node.js is single-threaded, but your infrastructure shouldn't be.

  • Lesson: Stop trying to "tweak" a single massive EC2 instance. Move to a containerized approach (Docker) on Amazon ECS (Elastic Container Service) or EKS (Kubernetes) in the af-south-1 (Cape Town) region.
  • Horizontal Scaling: Your API should be "stateless." Any data (like user sessions) must live in a centralized Redis store, not in the application's memory. This allows you to spin up 5 or 50 instances of your API behind a load balancer without "losing" user state.
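The "stateless" rule above boils down to one habit: session reads and writes always go to a shared store, never to a variable in the process. A minimal sketch of that shape, with a Map standing in for the Redis client so the example is self-contained (in production you would use a real client such as ioredis, with the same async get/set pattern; the function names here are illustrative, not from any library):

```javascript
// A Map stands in for Redis so this runs standalone. Every API instance
// would point at the same Redis, so any instance can serve any request.
const sessionStore = new Map();

async function saveSession(sessionId, data, ttlSeconds) {
  // With Redis this would be roughly:
  // await redis.set(`sess:${sessionId}`, JSON.stringify(data), 'EX', ttlSeconds);
  sessionStore.set(sessionId, { data, expires: Date.now() + ttlSeconds * 1000 });
}

async function loadSession(sessionId) {
  // With Redis: JSON.parse(await redis.get(`sess:${sessionId}`)) or null.
  const entry = sessionStore.get(sessionId);
  if (!entry || entry.expires < Date.now()) return null; // missing or expired
  return entry.data;
}
```

Because no instance holds session state in its own memory, the load balancer can route a user's next request to any of your 5 or 50 instances.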

2. The Database is Always the Bottleneck

Your Node.js app can handle thousands of requests per second, but your database (PostgreSQL/MongoDB) will likely buckle first.

  • Read-Write Splitting: Use a "Primary" database for writes and "Read Replicas" for the heavy lifting of fetching data.
  • Connection Pooling: Use a tool like PgBouncer for PostgreSQL. Node.js can easily exhaust database connections if you aren't careful.
  • Indexing: If your WHERE clauses aren't indexed, your R200k-a-month database will perform like a R200-a-month one.
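Read-write splitting can live in a thin wrapper around your connection pools: SELECTs go to the replica pool, everything else to the primary. A sketch of that routing, with stub pools standing in for pg.Pool instances (the primary/replica hosts and the makeRouter helper are illustrative assumptions, not a library API):

```javascript
// Route queries to the primary or a read replica based on the statement.
// In production, `primary` and `replica` would be `pg.Pool` instances
// configured with different hosts; stubs keep this example self-contained.
function makeRouter(primary, replica) {
  return {
    query(sql, params) {
      const isRead = /^\s*select\b/i.test(sql);
      return (isRead ? replica : primary).query(sql, params);
    },
  };
}

// Stub pools that record which one was hit.
const hits = [];
const primary = { query: () => { hits.push('primary'); return Promise.resolve(); } };
const replica = { query: () => { hits.push('replica'); return Promise.resolve(); } };

const db = makeRouter(primary, replica);
db.query('SELECT * FROM users WHERE id = $1', [1]); // routed to the replica
db.query('UPDATE users SET name = $1 WHERE id = $2', ['Thabo', 1]); // primary
```

This keeps the split in one place instead of scattering "which pool?" decisions across every handler; transactions and read-after-write paths should still be pinned to the primary.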

3. Caching at the Edge

For SA users, every millisecond counts.

  • Redis for Everything: Cache frequently accessed data (like product lists or user profiles) in Redis.
  • CloudFront (CDN): Even for dynamic APIs, you can use CloudFront to cache responses at the "edge" in Johannesburg or Cape Town. This reduces the "Round Trip Time" (RTT) significantly.
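The Redis caching above is usually the "cache-aside" pattern: check the cache, fall back to the real query, then populate the cache with a TTL. A self-contained sketch, again with a Map standing in for Redis (with a real client you'd use GET and SET with an EX expiry; the `cached` helper is my naming, not a library function):

```javascript
// Cache-aside with a TTL. A Map stands in for Redis so this runs standalone.
const cache = new Map();

async function cached(key, ttlMs, loader) {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value; // cache hit
  const value = await loader(); // cache miss: hit the database
  cache.set(key, { value, expires: Date.now() + ttlMs });
  return value;
}

// Usage: the second call within the TTL never touches the database.
// const products = await cached('products:all', 30_000, fetchProductsFromDb);
```

Short TTLs (tens of seconds) on hot endpoints like product lists often cut database load dramatically while keeping data acceptably fresh.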

4. Rate Limiting and Security (The "Bot" Problem)

Once you hit 100k users, you will be attacked. Whether it's malicious scrapers or just poorly written client-side loops, you must protect your API.

  • Implementation: Use a library like express-rate-limit or, better yet, handle rate limiting at the API Gateway level (AWS API Gateway or Nginx).
  • Security: Ensure you are using Helmet.js for basic header security and that your JWT (JSON Web Token) strategy is robust and revocable.
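Under the hood, libraries like express-rate-limit implement a fixed-window counter: each client key (typically an IP) gets a counter that resets every window. A self-contained sketch of that idea (the makeLimiter name and shape are illustrative, not the library's API; a multi-instance deployment would keep these counters in Redis so all instances share them):

```javascript
// Fixed-window rate limiter: allow at most `max` hits per `windowMs` per key.
function makeLimiter({ windowMs, max }) {
  const hitsPerKey = new Map();
  return function allow(key, now = Date.now()) {
    const windowStart = now - (now % windowMs); // current window's start
    const entry = hitsPerKey.get(key);
    if (!entry || entry.windowStart !== windowStart) {
      hitsPerKey.set(key, { windowStart, count: 1 }); // new window for this key
      return true;
    }
    entry.count += 1;
    return entry.count <= max; // false => respond 429 Too Many Requests
  };
}
```

In an Express middleware you would call allow(req.ip) and respond with HTTP 429 when it returns false.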

5. Monitoring: You Can't Fix What You Can't See

Logging to a text file doesn't work at scale.

  • The "Holy Trinity": You need Datadog or New Relic for APM (Application Performance Monitoring), Sentry for error tracking, and ELK Stack (Elasticsearch, Logstash, Kibana) for logs.
  • Alerting: Set up alerts for "5xx Errors > 1% over 5 minutes." Don't wait for your customers to tell you your API is down.
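For logs to be searchable in Elasticsearch, they need to be structured JSON (one object per line), not free text. Libraries like pino produce exactly this, much faster; this tiny sketch (logLine is my own helper, not a library function) just shows the shape the ELK pipeline expects:

```javascript
// Emit one JSON object per line so Logstash/Elasticsearch can index
// fields (level, route, status) directly, without regex parsing.
function logLine(level, msg, fields = {}) {
  return JSON.stringify({ level, msg, time: new Date().toISOString(), ...fields });
}

console.log(logLine('error', 'upstream timeout', { route: '/v1/orders', status: 504 }));
```

Once every log line carries a status field, the "5xx errors > 1% over 5 minutes" alert becomes a simple aggregation query instead of a grep.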

Conclusion

Scaling is an iterative process. You don't build for 100k users on day one, but you must build with the plan to get there. By focusing on stateless architecture, local infrastructure (Cape Town region), and aggressive caching, you can ensure your SA-based API remains fast and reliable as you grow.
