Tuesday, 4 April 2017

Trouble unzipping .json.gz file in Node.js using zlib

I'm downloading an AWS S3 object to my local Node.js server with the following code:

var fs = require('fs');
var http = require('http');

var url = "http://s3.amazonaws.com/cloudfront.s3post.cf/s3posts.json.gz";
var dest = "./s3posts.json.gz";

// Stream the HTTP response body into dest, then call cb once the file is closed.
var download = function(url, dest, cb) {
    var file = fs.createWriteStream(dest);
    http.get(url, function(response) {
        response.pipe(file);
        file.on('finish', function() {
            file.close(cb);
        });
    });
};

download(url, dest, function() {
    console.log('Download complete');
});

This successfully downloads the .json.gz object. I then try to unzip it using zlib:

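var zlib = require('zlib');

// Read the compressed file, decompress it, and write the result to disk.
var gunzip = zlib.createGunzip();
var rstream = fs.createReadStream('./s3posts.json.gz');
var wstream = fs.createWriteStream('./s3posts.json');
rstream.pipe(gunzip).pipe(wstream);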

However, this throws an error, and the .json file that gets created is empty:

events.js:163
      throw er; // Unhandled 'error' event
      ^

Error: unexpected end of file
    at Zlib._handle.onerror (zlib.js:355:17)

Oddly, if I run only the download code and then unzip the file manually with gunzip s3posts.json.gz in the terminal, the resulting JSON file has the expected content and my app runs successfully.

I'm not sure why I can unzip the file manually but not programmatically with zlib. It would be really helpful if someone could point out what mistake I'm making.
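
One possible explanation, assuming both snippets run in the same script, is that the read stream is opened while the download is still being written, so gunzip sees a truncated file. This is a guess, not something the error output confirms. The sketch below wraps the decompression in a hypothetical helper, gunzipFile (not part of zlib), built on stream.pipeline (Node.js 10 or later), which also routes any stream error to a callback so it does not surface as an unhandled 'error' event.

```javascript
var fs = require('fs');
var zlib = require('zlib');
var stream = require('stream');

// Hypothetical helper: decompress the gzip file at src into dest.
// cb receives an error if any stage fails, or no argument on success.
function gunzipFile(src, dest, cb) {
    stream.pipeline(
        fs.createReadStream(src),
        zlib.createGunzip(),
        fs.createWriteStream(dest),
        cb
    );
}
```

Calling gunzipFile('./s3posts.json.gz', './s3posts.json', callback) from inside the download callback, rather than at the top level of the script, would ensure the compressed file is complete before it is read.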

The S3 object has the following metadata, in case it's relevant:

Cache-Control: max-age=31536000,no-transform,public
Content-Encoding: gzip
Content-Type: application/json
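
Node's http module does not decode response bodies, so with Content-Encoding: gzip the saved file should still be raw gzip data. A quick way to check is to look for the gzip magic bytes (0x1f 0x8b) at the start of the file. The helper name looksLikeGzip below is hypothetical, and the check only inspects the header; it does not prove the file is complete.

```javascript
var fs = require('fs');

// Hypothetical helper: true if the file starts with the gzip magic bytes 0x1f 0x8b.
function looksLikeGzip(path) {
    var fd = fs.openSync(path, 'r');
    try {
        var header = Buffer.alloc(2);
        // Read the first two bytes of the file into header.
        var bytesRead = fs.readSync(fd, header, 0, 2, 0);
        return bytesRead === 2 && header[0] === 0x1f && header[1] === 0x8b;
    } finally {
        fs.closeSync(fd);
    }
}
```

Running looksLikeGzip('./s3posts.json.gz') after the download completes would indicate whether the file on disk is gzip data at all.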



via Anish Sana
