Threads in Node 10.5.0: a practical intro
A few days ago, version 10.5.0 of Node.js was released and one of the main features it contained was the addition of initial (and experimental) thread support.
This is interesting, especially coming from a runtime that has always prided itself on not needing threads thanks to its fantastic async I/O. So why would we need threads in Node?
The quick and simple answer is: to have it excel in the only area where Node has suffered in the past, which is dealing with heavy, CPU-intensive computations. This is mainly why Node.js is not strong in areas such as AI, Machine Learning, Data Science and similar. There are a lot of efforts in progress to solve that, but Node still doesn't perform as well there as it does when, for instance, deploying microservices.
So I’m going to try and simplify the technical documentation provided by the initial PR and the official docs into a more practical and simple set of examples. Hopefully that’ll be enough to get you started.
So how do we use the new Threads module?
To start with, you’ll need to require the module called “worker_threads”.
Note that this will only work if you use the --experimental-worker flag when executing the script, otherwise the module will not be found.
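As a minimal sketch, loading the module looks like this. The file name threads.js in the comment is a hypothetical one used for illustration. On Node 10.5.0 the script has to be started with the flag; later Node releases expose the module without it.

```javascript
// Run with: node --experimental-worker threads.js
// (threads.js is a hypothetical file name)
const { Worker, isMainThread } = require('worker_threads');

// isMainThread is true when the script was started directly by node,
// and false when the same code is running inside a worker.
console.log('Worker class available:', typeof Worker === 'function');
console.log('Running in the main thread:', isMainThread);
```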
Notice how the flag refers to workers and not threads. This is how they’re going to be referenced throughout the documentation: worker threads or simply workers.
If you’ve used multi-processing in the past, you’ll see a lot of similarities with this approach, but if you haven’t, don’t worry, I’ll explain as much as I can.
What can you do with them?
As I mentioned before, worker threads are meant for CPU-intensive tasks. Using them for I/O would be a waste of resources: according to the official documentation, the internal mechanism Node provides to handle async I/O is much more efficient than using a worker thread for that, so… don’t bother.
Let’s start with a simple example of how you would go about creating a worker and using it.
Example 1:
The above example will simply output a set of lines showing incremental counters, which increase their values at different speeds.
Let’s break it down:
The code inside the IF statement creates 2 worker threads. Their code is taken from the same file, thanks to the __filename parameter passed. Right now workers need the full path to the file and can’t handle relative paths, which is why this value is used.
The 2 workers are sent a value as a global parameter, in the form of the workerData attribute you see as part of the second argument. That value can then be accessed through a constant with the same name (see how the constant is created in the first line of the file and used later on in the last line).
This example is one of the most basic things you can do with this module, but it’s not really that fun, is it? Let’s look at another example.
Let’s try now to do some “heavy” computation while at the same time, doing some async stuff in the main thread.
This time around, we’re requesting the homepage for Google.com and at the same time, sorting a randomly generated array of 1 million numbers. This is going to take a few seconds, so it’s perfect for us to show how well this behaves. We’re also going to measure the time it takes for the worker thread to perform the sorting and we’re going to send that value (along with the first sorted value) to the main thread, where we’ll display the results.
The main takeaway from this example is the communication between threads.
The main thread can receive messages from a worker through the on method of the worker instance. The events we can listen to are the ones shown in the code. The message event is triggered whenever the worker sends a message using the parentPort.postMessage method. You can also send a message to the worker’s code by calling postMessage on your worker instance and catching it inside the worker using the parentPort object.
In case you’re wondering, the code for the helper module I used is here, although there is nothing noteworthy about it.
Let’s now look at a very similar example, but with cleaner code, giving you a final idea of how you could structure your worker thread’s code.
As a final example, I’m going to stick to the same functionality, but showing you how you could clean it up a bit and have a more maintainable version.
And your thread code can be inside another file, such as:
Breaking this one down, we see:
Main thread and worker threads now have their code inside different files. This is easier to maintain and extend.
The startWorker function returns the new instance, allowing you to send messages to it later if you want to.
You no longer need to worry if your main thread’s code is actually the main thread (we removed the main IF statement).
You can see in the worker’s code how you would receive a message from the main thread, allowing for two-way asynchronous communication.
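Putting those points together, a sketch of that structure might look like this. The signature of startWorker is an assumption, as is the worker's file name. The worker source is written to a temporary file only so the sketch runs as a single script; in a real project it would simply live in its own file next to the main one.

```javascript
const { Worker } = require('worker_threads');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumption: in a real project this source lives in its own file
// (for example worker.js). It is a string here only to keep the
// sketch runnable as one script.
const workerSource = `
const { parentPort } = require('worker_threads');

// Sort the list sent by the main thread and report back, then exit.
parentPort.once('message', (list) => {
  const start = Date.now();
  list.sort((a, b) => a - b);
  parentPort.postMessage({ val: list[0], timeDiff: Date.now() - start });
});
`;

// Start a worker from an absolute path, wire up its events, return it.
function startWorker(file, onMessage) {
  const worker = new Worker(file);
  worker.on('message', onMessage);
  worker.on('error', console.error);
  worker.on('exit', (code) => {
    if (code !== 0) console.error(new Error(`Worker stopped with exit code ${code}`));
  });
  return worker;
}

// Write the worker code to a temp directory and get its absolute path.
const workerDir = fs.mkdtempSync(path.join(os.tmpdir(), 'worker-'));
const workerFile = path.join(workerDir, 'worker.js');
fs.writeFileSync(workerFile, workerSource);

const myWorker = startWorker(workerFile, (msg) => {
  console.log('First value is:', msg.val);
  console.log('Took:', msg.timeDiff / 1000, 'seconds');
});

// The returned instance lets the main thread send work to the worker.
myWorker.postMessage(Array.from({ length: 1000000 }, () => Math.random() * 10000));
```

Because the main file never runs inside a worker, it needs no isMainThread check, and the worker file only has to deal with parentPort.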
That is going to be it for this article. I hope you got enough to understand how to get started and play around with this new module. Remember that:
This is still highly experimental and things explained here can change in future releases
Have fun! Play around, report bugs and suggest improvements, this is just starting!
See you on the next one!