Archive for the ‘Coding’ Category

nodejs and blocked IO

July 7th, 2014

The problem

We recently hit a major performance problem with our nodejs application that took some investigation. First, a brief overview of nodejs as I now understand it (which I didn’t before debugging this problem).

Background

  • Your application code runs in a single thread. Your code will block all your other code from running.
  • Four[1] IO threads are used to run IO operations asynchronously from your code.
  • One additional thread is used as the event loop (epoll).

 

This means that at any given time we can expect nodejs to be running with 6 threads in total.

Diagnosis

Back to our application… We found that while a long-running database query was executing, IO could completely lock up, including unrelated operations like file reads/writes. During a problem phase I used gdb to inspect the state of the process and this is what I saw:-

(gdb) info threads
  Id   Target Id         Frame
  6    Thread 0x7ffff7fe4700 (LWP 23141) "SignalSender" 0x00007ffff7023420 in sem_wait ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
  5    Thread 0x7ffff7f52700 (LWP 23153) "node" 0x00007ffff702418d in read ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
  4    Thread 0x7ffff7f11700 (LWP 23156) "node" 0x00007ffff7023cec in __lll_lock_wait ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
  3    Thread 0x7ffff7ed0700 (LWP 23159) "node" 0x00007ffff7023cec in __lll_lock_wait ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
  2    Thread 0x7ffff7e8f700 (LWP 23160) "node" 0x00007ffff7023cec in __lll_lock_wait ()
       from /lib/x86_64-linux-gnu/libpthread.so.0
* 1    Thread 0x7ffff7fe6720 (LWP 23140) "node" 0x00007ffff6d670d3 in epoll_wait ()
       from /lib/x86_64-linux-gnu/libc.so.6

Let’s see what each thread is and what it’s doing:-

  • Thread: #1: From the epoll_wait() call we can deduce this is the nodejs event loop.
  • Thread: #5: A backtrace on this thread made it clear that this is an active database query to our database server. This is obviously an IO thread.
  • Threads: #2-#4: These threads are waiting for a lock to be released before they run. These are obviously also IO threads.
  • Thread: #6: This is the thread for running our application code (also clear from performing a backtrace on the thread).

 

From that state we can see that all the IO threads are consumed with work so any future IO request must sit and wait for an IO thread to become free. This is why we are seeing all the IO block.

The question is why are we seeing 3 IO threads being needlessly “wasted” just to sit there and wait for a lock to release?

When we perform a backtrace on all of these 3 threads we see that they are blocked within the database library itself and thus we can finally understand the big flaw with our application design and the reason for this performance problem.

Our application is using a single database connection so it’s clear that this is what’s happening:-

  1. A database query gets “dispatched” to an IO thread to run and does so.
  2. Further database work comes in and gets “dispatched” to the remaining IO threads, but it can’t run because the single database connection is already in use, so each piece of work sits there, within the database library, waiting for the connection to become free.
  3. With 4+ pieces of database work we have filled all of the IO threads so any other IO work must sit and wait for an IO thread to become free.
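
The three steps above can be reproduced with a toy model. This is only a sketch and not how libuv actually schedules work: it assumes every task is queued at time 0 in FIFO order and that a query waiting on the connection lock keeps its IO thread occupied. The function name and its parameters (fileReadStartTime, threads, connections, dbQueries, queryTime) are made up for illustration.

```javascript
// Toy model of the starvation described above (not real libuv behaviour).
// Hypothetical scenario: `dbQueries` queries are queued at time 0, followed
// by one file read. Returns the time at which the file read gets a thread.
function fileReadStartTime({ threads, connections, dbQueries, queryTime }) {
  const threadFreeAt = new Array(threads).fill(0);   // when each IO thread is next idle
  const connFreeAt = new Array(connections).fill(0); // when each DB connection is next idle
  const earliest = (times) => times.indexOf(Math.min(...times));

  for (let i = 0; i < dbQueries; i++) {
    const t = earliest(threadFreeAt);
    const c = earliest(connFreeAt);
    // The query takes the thread as soon as the thread is idle, but may then
    // sit inside the database library until the connection is released.
    const start = Math.max(threadFreeAt[t], connFreeAt[c]);
    const end = start + queryTime;
    threadFreeAt[t] = end;
    connFreeAt[c] = end;
  }

  // The file read runs as soon as any IO thread is idle.
  return Math.min(...threadFreeAt);
}

console.log(fileReadStartTime({ threads: 4, connections: 1, dbQueries: 3, queryTime: 10 })); // 0
console.log(fileReadStartTime({ threads: 4, connections: 1, dbQueries: 4, queryTime: 10 })); // 10
```

With three queries one IO thread is still idle, so the file read starts immediately. With four, it has to wait for the first query to finish even though three of the threads are doing nothing but waiting on a lock.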

 

This is quite a fundamental design flaw with our application.

The Solution

Fortunately the solution is rather simple: we’ve refactored the application to use a database connection pool (courtesy of node-pool [generic-pool]). We now have more than one database connection to work with, so our database queries can run in parallel and not block waiting for each other.
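
The idea behind a connection pool can be sketched in a few lines. This is not generic-pool’s API: SimplePool, acquire and release are names invented here, and the “connections” are plain stand-in objects rather than real database handles.

```javascript
// Minimal illustrative pool: hands out up to `max` resources and queues
// further requests in JavaScript until a resource is released.
class SimplePool {
  constructor(create, max) {
    this.create = create; // factory for a new resource (hypothetical)
    this.max = max;       // upper bound on resources created
    this.size = 0;        // resources created so far
    this.idle = [];       // resources ready to be handed out
    this.waiting = [];    // callbacks waiting for a resource
  }

  acquire(callback) {
    if (this.idle.length > 0) {
      callback(this.idle.pop());
    } else if (this.size < this.max) {
      this.size++;
      callback(this.create());
    } else {
      this.waiting.push(callback);
    }
  }

  release(resource) {
    const next = this.waiting.shift();
    if (next) {
      next(resource); // hand the resource straight to the oldest waiter
    } else {
      this.idle.push(resource);
    }
  }
}

// Usage with stand-in connection objects.
let nextId = 0;
const pool = new SimplePool(() => ({ id: ++nextId }), 2);
const granted = [];
pool.acquire((conn) => granted.push(conn)); // creates connection 1
pool.acquire((conn) => granted.push(conn)); // creates connection 2
pool.acquire((conn) => granted.push(conn)); // pool is full, so this waits
pool.release(granted[0]);                   // the waiter now receives connection 1
console.log(granted.map((conn) => conn.id)); // [ 1, 2, 1 ]
```

The point of the design is that a request beyond the pool’s limit waits in a JavaScript queue rather than inside the database library on an IO thread.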

With nodejs v0.10+ we can also take that one step further. In this version libeio has been replaced with a threadpool implementation within libuv. This introduces a new environment variable that allows you to increase the number of IO threads beyond 4:-

 UV_THREADPOOL_SIZE (Range: 4 – 128, Default: 4).

We set this to around 1.5 * CPUs (in our case this was 12).

Now we have more IO threads to work with and more database connections available so that queries can run in parallel.

Is there anything else we can do? Yes! To ensure that DB work cannot fully consume the IO threads and block file reading/writing we set the maximum pool size to be one less than the UV_THREADPOOL_SIZE. This means that only UV_THREADPOOL_SIZE – 1 database operations will ever occur in parallel which leaves one IO thread available at all times for our other IO operations (which are minimal) like file reading/writing.
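
The sizing rules above can be written down as a small helper. This is a sketch only: planSizes and its field names are invented for illustration, and the 8-CPU example is an assumption chosen because it reproduces the figure of 12 mentioned earlier.

```javascript
const os = require("os");

// Rule of thumb from this post: about 1.5 IO threads per CPU, kept inside
// the 4-128 range quoted above, with one thread held back for non-DB IO.
function planSizes(cpuCount) {
  const threadPoolSize = Math.min(128, Math.max(4, Math.round(cpuCount * 1.5)));
  return { threadPoolSize, dbPoolMax: threadPoolSize - 1 };
}

console.log(planSizes(8)); // { threadPoolSize: 12, dbPoolMax: 11 }
console.log(planSizes(os.cpus().length)); // values for the current machine
```

The first number would be supplied through the environment when starting the process (for example UV_THREADPOOL_SIZE=12 node app.js, where app.js is a placeholder name), and the second would be used as the maximum size of the database connection pool.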

Coding, Tech

Introducing SSPKD

March 13th, 2012

SSPKD is a system to securely distribute your SSH public keys to multiple hosts.

Security is achieved through the use of your GPG key to sign authorized_keys file updates which are verified on each recipient machine before an update to the authorized_keys file takes place.

If an invalid signature is presented then the update is not performed, so if your ‘central’ sspkd server were compromised an attacker would be unable to simply add their SSH key and push it onto all your other hosts.

SSPKD is currently an alpha-release and is available from github: http://github.com/kmdm/sspkd

Coding, Linux, Tech

PHP: Shamir’s Secret Sharing Class

August 19th, 2010

Having searched on Google, I wasn’t able to find a PHP implementation of Shamir’s secret sharing; I was only able to find a Perl implementation.

So for anyone else who’s after the same thing, I’ve created a (simple) PHP class which implements Shamir’s secret sharing, which you can download below. It’s strongly based on the Perl implementation above and is also licensed under the GNU GPL.

(It’s missing some error-condition checks in the ::recover() method.)

There’s a simple test case in the class file which’ll explain its usage.

[Download Shamir’s Secret Sharing class]

Coding

Resetting unix passwords from a webpage

October 8th, 2009

Been a while since my last post due to being extremely busy here over the past few months.

Recently we came across a little problem where one of our customers had a dedicated mailserver (courier, exim4) but had no ability to change their own passwords.

So I came up with two scripts, one bash (to do the actual password reset) and one php (as a front-end to the bash script). These two scripts are designed to be used in tandem to provide adequate input validation and security.

You can find these two scripts here: chpasswd.sh (backend bash script) and chpasswd.txt (frontend php script).

Please note these are simple scripts: they lack both error reporting and styling / CSS. However, they should be functional, and because they su to the user and avoid a setuid root script they should be secure.

If you implement these you are strongly advised to also implement some anti-bruteforce code, or at the very least restrict access to internal use only.

Coding, Linux

PHP app Licensing Faux-pas

March 10th, 2009

One of the PHP web applications we use in the office requires a license to work; it locks its license to the hostname and IP address of the server on which it’s run. When the license key is first entered it phones home to set the hostname and IP address on the licensing server and then stores a valid hash of the license.

We recently restructured our network and changed the IP address, which caused the app to complain that the license was invalid. Deleting the licensing file caused it to talk with the licensing server again, but it was no use: the licensing server still held the old IP address, which was no longer correct. At this point I sent an e-mail off to their customer support team to get the information changed.

However, I couldn’t resist taking a quick peek under the hood. To their credit the app is largely open source and readable, except for the code that manages the license, which is encrypted. I removed the license file and fired up wireshark, which logged the following conversation with their license server (anonymised to protect the guilty):-

GET /XXXXXX.php?license_key=Base64String&host_name=Base64String&host_ip=Base64String

Which generated the following reply:-

license-key|host-name|old-ip-address

On the face of it this seems quite easy to attack. Given that the app sends the current hostname/IP to the licensing server, it’d be a trivial PHP script to send back what we assume the app would want to see:-

$key = base64_decode(urldecode($_GET['license_key']));
$host = base64_decode(urldecode($_GET['host_name']));
$ip = base64_decode(urldecode($_GET['host_ip']));

echo "$key|$host|$ip";

All that remains is to set the script up on our server and add an entry to our /etc/hosts file so that the licensing server’s domain name now points at our server. Once that was done and the license file removed, I hit refresh and, surprise surprise, the app accepted the license response and things continued as normal.

This is a particularly weak scheme since it doesn’t even run over SSL, so capturing the traffic with wireshark is trivial. The other fundamental problem is that the class which converses with the licensing server is not encrypted, so that would represent another point of attack which wouldn’t require setting up a fake licensing server – just hijack the response methods.

To their credit the license key itself is validated using a hash to determine what level of features you have access to but once you’ve bought one key you are then able to apply the methods above to copy the key to as many different locations as you wish.

Any comments identifying the application in question will be removed or censored. This is not an aid to bypassing licensing requirements; it is a discussion of the security implications of how this particular method was implemented.

Coding

Web App config files

January 21st, 2009

So like most people, when I write my web applications I create a config.inc.php of some description and put that in the directory with the application. However, when it comes to rolling out the app this causes problems due to the config differences between the live and development environments, so the config file needs to be changed once the new version has been put live.

This isn’t ideal, so the first obvious thing to do is to move the config file out of the web application directory to somewhere else (for example: /etc/webapp.conf). Now when a version of the web app is released you simply have to untar the web app tarball into the correct place and everything will just work without any post-install manual tweaking.

The problem with this approach is that it doesn’t work for multi-developer environments where a developer might need to tweak the config file for a new feature they’re testing, but the web app is now looking at the /etc/webapp.conf file no matter what developer it is.

The solution I then came up with was another config file, say called .dev-config.inc.php, which is (optionally) kept in the same directory as the web app but excluded from your version control software so it’s never checked in. Then you simply change the web app to check for this file and, if it doesn’t exist, load the main config file, so now a developer can create a config file for just their tree. For example:-

 
if(file_exists(dirname(__FILE__)."/.dev-config.inc.php")) {
    require_once(dirname(__FILE__)."/.dev-config.inc.php");
} else {
    require("/etc/webapp.conf");
}

Now once we have this set up everything works: developers can change the config files if they need to (e.g. database DSNs) and the live system can be deployed by simply untar’ing a tarball.

The next thing I noticed was that if I need to add a new config file option I need to add it in all the config files, even if it’s just a system config option that won’t change no matter what environment it’s in. This is a bit of a chore given I’m just a lazy developer…

So the next change I make to the config file system is to reinstate the config.inc.php in the web app directory with one key difference: it now follows the pattern below and is responsible for including our main config file:-

if(file_exists(dirname(__FILE__)."/.dev-config.inc.php")) {
    require_once(dirname(__FILE__)."/.dev-config.inc.php");
} else {
    require("/etc/webapp.conf");
}

$config = array(
    "A_CONFIG_OPTION"=>TRUE,
    "IS_THIS_OVERKILL"=>"MAYBE"
);

foreach($config as $const=>$value)
    if(!defined($const))
        DEFINE($const, $value);

Now this config file will first load our main web app config file (either /etc/webapp.conf or .dev-config.inc.php) and then fill in any missing (default/system) config options that didn’t need to be changed in the per-environment config settings. Should one of them need to change, simply define it in the per-environment file and this config file will not set it.

Questions
1. Is this over the top?
2. How does everyone else handle config files?
3. Do their methods solve the issues presented above?

Coding

PHP, PDO & Nested Transactions

December 2nd, 2008

I’ve been using PDO as my database library and it works reasonably well (as long as you remember it’s not a full blown database abstraction library), however recently I needed to use nested transactions to ensure that the database remains consistent while doing a series of SQL statements.

Unfortunately PDO does not support nested transactions although PostgreSQL and MySQL do. I decided to extend the PDO class to support nested transactions while also using PDO to keep track of the first transaction. I came up with the following class (released under the GNU General Public License, Version 3):-

class MyPDO extends PDO {
    // Database drivers that support SAVEPOINTs.
    protected static $savepointTransactions = array("pgsql", "mysql");

    // The current transaction level.
    protected $transLevel = 0;

    protected function nestable() {
        return in_array($this->getAttribute(PDO::ATTR_DRIVER_NAME),
                        self::$savepointTransactions);
    }

    public function beginTransaction() {
        if(!$this->nestable() || $this->transLevel == 0) {
            parent::beginTransaction();
        } else {
            $this->exec("SAVEPOINT LEVEL{$this->transLevel}");
        }

        $this->transLevel++;
    }

    public function commit() {
        $this->transLevel--;

        if(!$this->nestable() || $this->transLevel == 0) {
            parent::commit();
        } else {
            $this->exec("RELEASE SAVEPOINT LEVEL{$this->transLevel}");
        }
    }

    public function rollBack() {
        $this->transLevel--;

        if(!$this->nestable() || $this->transLevel == 0) {
            parent::rollBack();
        } else {
            $this->exec("ROLLBACK TO SAVEPOINT LEVEL{$this->transLevel}");
        }
    }
}

This code will only attempt to use the SAVEPOINT code if you’re using a database driver that supports it (it should probably also version check the database server). This means that in your code you can do things like:-

$pdo = new MyPDO(DB_DSN, DB_USER, DB_PASS);
$pdo->beginTransaction();
try {
    $pdo->exec(...);
    $pdo->exec(...);

    $pdo->beginTransaction();
    try {
        $pdo->exec(...);
        $pdo->exec(...);
        $pdo->exec(...);
        $pdo->commit();
    } catch(PDOException $e) {
        // If this statement fails, rollback...
        // NOTE: This will only rollback statements made in the
        //       inner try { block and not the outer one.
        $pdo->rollBack();
    }

    $pdo->commit();
} catch (PDOException $e) {
    $pdo->rollBack();
}

NB: I’ve tweaked the code slightly when transferring it to my blog and I haven’t tested it, so there could be some minor errors – please leave comments if you spot any. Thanks!

Coding

PHP and PDF Templates

November 26th, 2008

Recently I had to write a class that could generate PDF payslips. Going for the easy option I wanted to take the existing template to re-use it and overlay the text from the database onto the template rather than redraw the entire payslip layout (which is just simply time consuming in PHP).

My chosen PDF library is FPDF so my first attempt was to create a graphic (png/jpeg) of the existing template and use the Image() function in FPDF to add the image to the PDF. However, this didn’t work as well as I’d have hoped since the image became very blurry. I believe this is due to the image being output at 72dpi onto the PDF when it actually needs greater resolution to be usable.

Obviously a better solution was needed. Enter FPDI, which allows you to load a PDF file as a template. So now all I needed to do was use the payslip PDF as a template and just overlay the text onto the template. FPDI makes this extremely simple – just include the required class file and do the following:-

$pdf = new FPDI();
$count = $pdf->setSourceFile("payslip-template.pdf");
$template = $pdf->importPage(1, "/MediaBox");
$pdf->addPage();
$pdf->useTemplate($template);

Now all you have to do is populate your content into/over the template and output it as normal.

One caveat I did notice was that FPDI seemed to be sending the files out with the mime type “application/x-download” rather than “application/pdf” which caused my browser to not know what to do with the file if I chose to open it rather than save it. My solution was to reset the Content-type header after outputting the PDF by taking advantage of existing output buffering in my application:-

ob_clean();
$pdf->Output("payslip.pdf", "D");
header("Content-type: application/pdf");
exit;

This works because output buffering holds back the response, so we can still change the headers before anything is sent. If you do not currently have output buffering enabled in your code, change the “ob_clean();” line to “ob_start();”.

FPDI is released under the Apache Software License, Version 2.0 which is compatible with the GNU GPL Version 3.

Coding