We recently hit a major performance problem with our nodejs application that took some investigation, firstly a brief overview of nodejs as I now understand it (which I didn’t before debugging this problem).
- Your application code runs in a single thread. Your code will block all your other code from running.
- Four IO threads are used to run IO operations asynchronously from your code.
- One more additional thread is used as the event loop (epoll)
This means that at any given time we can expect nodejs to be running with 6 threads in total.
Back to our application… We found that while a long running database query was executing IO could completely lock up, including unrelated operations like file read/writes. During a problem phase I used gdb to inspect the state of the process and this is what I saw:-
(gdb) info threads
Id Target Id Frame
6 Thread 0x7ffff7fe4700 (LWP 23141) “SignalSender” 0x00007ffff7023420 in sem_wait ()
5 Thread 0x7ffff7f52700 (LWP 23153) “node” 0x00007ffff702418d in read () from /lib/x86_64-linux-gnu/libpthread.so.0
4 Thread 0x7ffff7f11700 (LWP 23156) “node” 0x00007ffff7023cec in __lll_lock_wait ()
3 Thread 0x7ffff7ed0700 (LWP 23159) “node” 0x00007ffff7023cec in __lll_lock_wait ()
2 Thread 0x7ffff7e8f700 (LWP 23160) “node” 0x00007ffff7023cec in __lll_lock_wait ()
* 1 Thread 0x7ffff7fe6720 (LWP 23140) “node” 0x00007ffff6d670d3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
Let’s see what each thread is and what’s doing:-
- Thread: #1: From the epoll_wait() call we can deduce this is the nodejs event loop.
- Thread: #5: By running a backtrace on the thread it was clear to see this is an active database query to our database server. This is obviously an IO thread.
- Threads: #2-#4: These threads are waiting for a lock to be released before they run. These are obviously also IO threads.
- Thread: #6: This is the thread for running our application code (also clear from performing a backtrace on the thread).
From that state we can see that all the IO threads are consumed with work so any future IO request must sit and wait for an IO thread to become free. This is why we are seeing all the IO block.
The question is why are we seeing 3 IO threads being needlessly “wasted” just to sit there and wait for a lock to release?
When we perform a backtrace on all of these 3 threads we see that they are blocked within the database library itself and thus we can finally understand the big flaw with our application design and the reason for this performance problem.
Our application is using a single database connection so it’s clear that this is what’s happening:-
- A database query gets “dispatched” to an IO thread to run and does so.
- Further database work comes in, gets “dispatched” to the remaining IO threads but they can’t run because the single database connection is already in use so it sits there, within the database library, waiting for the connection to become free.
- With 4+ pieces of database work we have filled all of the IO threads so any other IO work must sit and wait for an IO thread to become free.
This is quite a fundamental design flaw with our application.
Fortunately the solution is rather simple and we’ve refactored the application to use a database connection pool (courteousy of node-pool [generic-pool]) so that we now have more than one database connection to work with so that our database queries can run in parallel and not block waiting for each other.
With nodejs v0.10+ we can also take that one step further. In this version libeio as been replaced with a threadpool implementation within libuv. This introduces a new environment variable that allows you to increase the number of IO threads beyond 4 abcpaperwriter.com by setting the environment variable:-
UV_THREADPOOL_SIZE (Range: 4 – 128, Default: 4).
We set this to around 1.5 * CPUs (in our case this was 12.)
Now we have more IO threads to work with and more database connections available so that queries can run in parallel.
Is there anything else we can do? Yes! To ensure that DB work cannot fully consume the IO threads and block file reading/writing we set the maximum pool size to be one less than the UV_THREADPOOL_SIZE. This means that only UV_THREADPOOL_SIZE – 1 database operations will ever occur in parallel which leaves one IO thread available at all times for our other IO operations (which are minimal) like file reading/writing.