Tips before going live...

Here is another extract from one of the chapters of the new edition of REST API Development with Node.js. I hope you enjoy it :)

There are certain aspects to consider when defining the architecture of your production environment, although they might depend on the type of project you’re creating.

High availability

If you’re going after high availability, you’re basically asking for your platform or system to stay functional in the face of disaster or, put more simply, when technical problems cause some of your modules to start failing.

Don’t get me wrong: this subject alone is quite big and could fill several chapters. I’m just giving a very high-level introduction and describing a couple of techniques that might come in handy when you start thinking about going live: load balancing your web servers and thinking about zone availability. Let me go into a bit more detail on each.

Load balancers

In any successful application that starts seeing heavy use, incoming traffic will become a problem if you’re not prepared for it.

This traffic will start by overloading your servers, and when that happens you’ll have to scale either vertically or horizontally. Vertical scaling means adding more resources to your servers (memory or disk, for instance), but that (obviously) has its limits; eventually this strategy will not be enough.

Horizontal scaling, however, means adding more computing power by adding more computers (or in this case, servers). This is the better option if you know your traffic can keep growing beyond the capacity of a single server, but it also adds extra problems.

When you have several web servers for your client-facing application, how do you distribute the load among them? This is where load balancers come in.

Load balancers are products (software you install on a server, dedicated appliances, or managed cloud services) that, if properly configured, distribute incoming traffic across your array of servers. Some common options are ELB from Amazon, the products from F5 Networks, and Nginx (yes, the web server can also be configured as a load balancer). These load balancers work by distributing incoming traffic based on a predefined set of rules, such as:

  • Round robin: each new request goes to the next server in the list, cycling back to the first one after the last.

  • Least connected: the next incoming request goes to the server with the fewest active connections.

  • IP hash function: a hash of the client’s IP address determines which server receives the request, so the same client keeps landing on the same server as long as the pool doesn’t change.
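To make the three rules above a bit more concrete, here is a minimal sketch of each one in plain JavaScript. The server names, the function names and the toy hash are all assumptions made up for this illustration; real load balancers implement these strategies internally and expose them through configuration rather than code like this.

```javascript
// Hypothetical server pool; the names are placeholders, not real hosts.
const servers = ["api-1", "api-2", "api-3"];

// Round robin: hand out servers in order, wrapping back to the first.
function createRoundRobin(pool) {
  let next = 0;
  return function pick() {
    const server = pool[next];
    next = (next + 1) % pool.length;
    return server;
  };
}

// Least connected: pick the server with the fewest active connections.
// `activeConnections` maps server name -> current connection count.
function pickLeastConnected(pool, activeConnections) {
  return pool.reduce((best, server) =>
    (activeConnections[server] ?? 0) < (activeConnections[best] ?? 0) ? server : best
  );
}

// IP hash: derive a stable index from the client's IP address.
// This toy hash is for illustration only; real balancers use their own.
function pickByIpHash(pool, clientIp) {
  let hash = 0;
  for (const char of clientIp) {
    hash = (hash * 31 + char.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit int
  }
  return pool[hash % pool.length];
}
```

Calling `createRoundRobin(servers)` returns a function that yields `api-1`, `api-2`, `api-3` and then `api-1` again, while `pickByIpHash` returns the same server every time it sees the same address.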

One caveat of using load balancers for web applications is that if your servers are stateful, you might run into a problem if you forget about sticky sessions. But of course, you’re building a RESTful API, which by definition is stateless, so this shouldn’t be a problem, should it?

Just in case, let’s review:

When dealing with user sessions, the web server creates an in-memory object that lives on the actual server handling your requests. This is all fine and dandy until you have multiple servers working in parallel that are, in theory, interchangeable with one another. The problem is that each server keeps its own in-memory version of your session, and you can’t really synchronize that data between all of them. In the end, as shown in the image below, each server holds only a partial version of the overall session information, which for the most part renders it useless to you.

Diagram showing classic problem with a fragmented session

Enter sticky sessions: a way of telling your balancer to keep a client associated with the same server after the first connection, so that the server-side session can be retrieved and updated on every request. This is a very well-known technique, and most load balancers have a way of dealing with this particular challenge.
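As a rough sketch of the idea, a sticky router only needs to remember which server each client was first sent to. Everything here is hypothetical: `createStickyRouter` is a made-up name, `clientId` stands in for whatever the balancer uses to recognize a client (a cookie value, for example), and the first assignment simply uses round robin.

```javascript
// Sticky routing: remember which server each client was first sent to.
// `clientId` is an assumption standing in for a cookie value or similar.
function createStickyRouter(pool) {
  const assignments = new Map(); // clientId -> server
  let next = 0;

  return function route(clientId) {
    if (!assignments.has(clientId)) {
      // First contact: assign a server using plain round robin.
      assignments.set(clientId, pool[next]);
      next = (next + 1) % pool.length;
    }
    return assignments.get(clientId);
  };
}

const route = createStickyRouter(["api-1", "api-2"]);
route("alice"); // first request: assigned to "api-1"
route("bob");   // first request: assigned to "api-2"
route("alice"); // later request: still "api-1"
```

Note that the assignment table is itself state living on the balancer, which is one reason real products tend to push it into a cookie or derive it from the client’s address instead.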

You could also solve this by extracting your session management into a separate, common service, such as a database (see the next diagram), so every server would be able to access and update session data independently.

Session information extracted into a common database

The only caveat with this approach is the extra work required to get the servers working with the database (especially compared to the out-of-the-box features you get with in-memory sessions).
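Here is a minimal sketch of that second approach, assuming a shared store that every server reads from and writes to. The `SharedSessionStore` and `ApiServer` classes are invented for this example, and the `Map` is only a stand-in for a real external database, which would normally be accessed asynchronously.

```javascript
// Stand-in for a shared database; a real setup would use an external store.
class SharedSessionStore {
  constructor() {
    this.sessions = new Map();
  }
  get(sessionId) {
    return this.sessions.get(sessionId) ?? {};
  }
  set(sessionId, data) {
    this.sessions.set(sessionId, data);
  }
}

// Hypothetical server that keeps no session state of its own.
class ApiServer {
  constructor(name, store) {
    this.name = name;
    this.store = store;
  }

  // Merge `changes` into the session and record which server handled it.
  handle(sessionId, changes) {
    const session = { ...this.store.get(sessionId), ...changes, lastServer: this.name };
    this.store.set(sessionId, session);
    return session;
  }
}

const store = new SharedSessionStore();
const serverA = new ApiServer("api-1", store);
const serverB = new ApiServer("api-2", store);

serverA.handle("session-42", { user: "alice" });
const session = serverB.handle("session-42", { cart: ["book"] });
// `session` now holds both `user` and `cart`, although two different servers wrote to it.
```

Because neither server owns the session, the balancer is free to send each request anywhere, at the cost of one extra round trip to the store.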

Zone availability

Another way of achieving high availability in your application, especially if you’re deploying to a cloud service, is to have your deployment either spread or duplicated across multiple regions.

A very unpredictable and hard-to-overcome problem when you host your applications with a third-party provider is that you don’t have any control over their infrastructure: if it fails, for any reason, you’re affected, whether you like it or not.

None of the major cloud providers will agree to a 100% uptime SLA, so no matter who you’re paying your hosting bills to, they will eventually fail to deliver. That is when multi-zone deployments come in handy: downtime in the cloud usually affects an entire geographical zone, so the only way around the problem is to have your deployments replicated or spread across several zones, effectively reducing the chances that one of these geographical problems will affect you.
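To get a feel for how much a second zone can help, here is a back-of-the-envelope sketch. It assumes every zone has the same availability and that zones fail independently of each other, which is a simplification (real outages can be correlated), and the 99.9% figure is just an example number, not any provider’s actual SLA.

```javascript
// Probability that at least one of `zoneCount` zones is up, assuming each
// zone is available with probability `zoneAvailability` and that zones
// fail independently of each other (a simplifying assumption).
function combinedAvailability(zoneAvailability, zoneCount) {
  return 1 - Math.pow(1 - zoneAvailability, zoneCount);
}

// Example figure only: one zone at 99.9% versus two such zones.
const singleZone = combinedAvailability(0.999, 1); // roughly 0.999
const twoZones = combinedAvailability(0.999, 2);   // roughly 0.999999
```

Under those assumptions, the chance of every zone being down at once shrinks from one in a thousand to one in a million.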

This is a very common practice with managed database services; they usually also let you pick which regions to replicate the data to, so in case of an outage you’ll still get your data, even if it’s a little bit slower. Look at the image below: in it you’ll see a simple example of the main differences between these two types of architectures.

Simple diagram showing the main differences between a replicated and a spread architecture.

Both options take advantage of multi-zone setups, and they effectively have you covered if something were to happen to some of the zones in use. That being said, there are some core differences, and deciding which one is the best match for your project is completely up to you.

Let’s quickly look at some pros and cons of each case.



Replicated model

Pros:

  • Easier deployment: everything deploys as one single block.

  • You only need one load balancer to pick the right zone based on availability (at the very least; you could also use other criteria, such as latency).

  • Simpler client-platform communication.

Cons:

  • Less flexibility when it comes to deployments and balancing strategies.

  • If you’re dealing with in-memory sessions, you need to configure sticky sessions or use the common database.

Spread model

Pros:

  • Extra flexibility when it comes to zone-dependent deployments.

  • Stronger fault tolerance: if API #1 fails, you don’t lose the current transaction on API #2.

Cons:

  • A more complex deployment plan than the replicated model.

  • More complex client-platform communication, since you’d most likely end up needing a load balancer configured for each pair of APIs.
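For the replicated model, the single balancer’s job of picking a zone by availability, with latency as a tie-breaker, could look roughly like this. The zone objects, their field names and the `pickZone` function are assumptions made up for the example; a real balancer would gather health and latency data through its own checks.

```javascript
// Pick a zone for the next request: only healthy zones qualify, and among
// those the lowest latency wins. The zone objects are hypothetical.
function pickZone(zones) {
  const healthy = zones.filter((zone) => zone.healthy);
  if (healthy.length === 0) return null; // every zone is down
  return healthy.reduce((best, zone) => (zone.latencyMs < best.latencyMs ? zone : best));
}

const zones = [
  { name: "zone-a", healthy: false, latencyMs: 20 },
  { name: "zone-b", healthy: true, latencyMs: 80 },
  { name: "zone-c", healthy: true, latencyMs: 45 },
];
const chosen = pickZone(zones); // zone-c: zone-a would be faster but is unavailable
```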

In the end, neither option is better or worse; you'll have to decide based on your needs and the resources you have available.

That's it! I hope you enjoyed it! Let me know what you thought about this bit of future chapter 9 from the upcoming 2nd edition of REST API Development with Node.js. As always, if you'd like to know more about the books or me, use the contact form in the footer of the page.
Thanks for reading!

