Monday 10 April 2017

Bad performance on combination of streams

I want to stream the results of a PostgreSQL query to a client via a websocket.

Data is fetched from the database using pg-promise and pg-query-stream. To stream data via a websocket I use socket.io-stream.

Individually, all components perform quite well. However, when I pipe the pg-query-stream to the socket.io-stream, performance drops drastically.

I've started with:

var QueryStream = require('pg-query-stream');
var ss = require('socket.io-stream');

// db is the pg-promise database object; socket is the connected socket.io socket

// Query with a lot of results
var qs = new QueryStream('SELECT...');

db.stream(qs, s => {
        // s is the readable stream of result rows (objects)
        var socketStream = ss.createStream({objectMode: true});
        ss(socket).emit('data', socketStream);
        s.pipe(socketStream);
    })
    .then(data => {
        console.log('Total rows processed:', data.processed,
            'Duration in milliseconds:', data.duration);
    });

I have tried to use non-object streams:

var JSONStream = require('JSONStream');
var socketStream = ss.createStream();
ss(socket).emit('data', socketStream);
s.pipe(JSONStream.stringify()).pipe(socketStream);

Or:

var socketStream = ss.createStream();
ss(socket).emit('data', socketStream);
s.pipe(JSONStream.stringify(false)).pipe(socketStream);

With each of these variants, it takes roughly one minute to query and transfer the data.

The query results can be written to a file within one second:

s.pipe(fs.createWriteStream('temp.txt'));

And that file can be transmitted within one second:

var socketStream = ss.createStream();
fs.createReadStream('temp.txt').pipe(socketStream);

So somehow, these streams don't seem to combine well.
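One way to narrow this down is to compare how many chunks, and of what size, each source hands to the socket stream. The sketch below is only a diagnostic aid built on Node's stream module: ChunkStats and createMeter are hypothetical names, and it assumes the data passing through is text or Buffers (so it belongs after JSONStream.stringify, not directly on the object stream).

```javascript
const { Transform } = require('stream');

// Hypothetical helper: running totals for the chunks seen so far.
class ChunkStats {
  constructor() {
    this.chunks = 0;
    this.bytes = 0;
  }

  // Count one chunk (string or Buffer) and its size in bytes.
  record(chunk) {
    this.chunks += 1;
    this.bytes += Buffer.byteLength(chunk);
  }

  // Average chunk size in bytes, or 0 if nothing was recorded.
  average() {
    return this.chunks === 0 ? 0 : this.bytes / this.chunks;
  }
}

// Pass-through stream that forwards data unchanged and logs the totals
// once the source has ended.
function createMeter(label) {
  const stats = new ChunkStats();
  const meter = new Transform({
    transform(chunk, encoding, callback) {
      stats.record(chunk);
      callback(null, chunk);
    },
    flush(callback) {
      console.log(label, 'chunks:', stats.chunks,
        'average bytes per chunk:', stats.average());
      callback();
    }
  });
  meter.stats = stats;
  return meter;
}
```

Placing createMeter('query') between JSONStream.stringify(false) and socketStream, and createMeter('file') between fs.createReadStream('temp.txt') and socketStream, would show whether the two sources deliver very different chunk sizes.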

As a silly experiment, I've tried placing something in between:

var socketStream = ss.createStream();
ss(socket).emit('data', socketStream);
var zip = zlib.createGzip();
var unzip = zlib.createGunzip();
s.pipe(JSONStream.stringify(false)).pipe(zip).pipe(unzip).pipe(socketStream);

And suddenly data can be queried and transferred within one second...

Unfortunately this is not going to work as my final solution. It would waste too much CPU. What is causing performance to degrade on this combination of streams? How can this be fixed?
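One possible explanation, which nothing above confirms, is that the gzip/gunzip round trip helps only because it regroups many small chunks into fewer, larger ones before they reach the socket stream. If that is the case, a plain buffering stream might give the same effect without the compression cost. The following is a minimal sketch under that assumption; ChunkBatcher, createBatchStream and the 16 KB default are hypothetical choices, not part of any of the libraries used here.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: collects small chunks and releases them as one
// Buffer once at least `minBytes` have accumulated.
class ChunkBatcher {
  constructor(minBytes) {
    this.minBytes = minBytes;
    this.parts = [];
    this.size = 0;
  }

  // Returns a combined Buffer when the threshold is reached, else null.
  add(chunk) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    this.parts.push(buf);
    this.size += buf.length;
    return this.size >= this.minBytes ? this.flush() : null;
  }

  // Returns whatever is pending (or null if nothing is) and resets state.
  flush() {
    if (this.size === 0) return null;
    const out = Buffer.concat(this.parts, this.size);
    this.parts = [];
    this.size = 0;
    return out;
  }
}

// Wraps the batcher in a Transform so it can sit in a pipe() chain.
function createBatchStream(minBytes = 16 * 1024) {
  const batcher = new ChunkBatcher(minBytes);
  return new Transform({
    transform(chunk, encoding, callback) {
      const out = batcher.add(chunk);
      if (out) this.push(out);
      callback();
    },
    flush(callback) {
      // Emit the remainder when the source ends.
      const out = batcher.flush();
      if (out) this.push(out);
      callback();
    }
  });
}
```

It would replace the zip and unzip steps in the chain, as in s.pipe(JSONStream.stringify(false)).pipe(createBatchStream()).pipe(socketStream). Whether this matches the speed of the zlib experiment would have to be measured.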



via Bart
