Sunday, 30 April 2017

Concat streamed data into single file or string while extracting zip files with Node

I am using the yauzl Node module to extract XML files from a zip file. While reading entries, I log entry.fileName and the chunk content as a string inside the readStream.on('data') callback. However, some files arrive split into multiple parts. These parts share the same fileName, and each one contains part of the file's content.

  const yauzl = require('yauzl')

  let options = {
    lazyEntries: true
  }
  yauzl.open(params.pathToZip, options, function(err, zipfile) {
    if (err) throw err
    zipfile.readEntry()
    zipfile.on("error", function(err) {
      throw err
    })
    zipfile.on("entry", function(entry) {
      // Directory entries end with "/"; skip them and request the next entry
      if (/\/$/.test(entry.fileName)) {
        zipfile.readEntry()
        return
      }

      zipfile.openReadStream(entry, (err, readStream) => {
        if (err) throw err
        // 'data' fires once per chunk, so a larger file is logged several times
        readStream.on('data', data => {
          console.log(entry.fileName)
          console.log(data.toString())
        })
      })
      zipfile.readEntry()
    })
    zipfile.once("end", function() {
      console.log('END EVENT CALLBACK')
      zipfile.close()
    })
  })

Is there a built-in method that would concatenate these strings (knowing that they share the same filename) into a single file or string that could then be parsed as one?

The log looks something like this when substr() is applied, as in console.log(data.toString().substr(0,10)) (only the first few characters of each chunk matter for illustration):

TH/CJM/CJM00083_en.xml
<hotel des
TH/CJM/CJM0007V_en.xml
<hotel des
TH/CJM/CJM0007V_en.xml
vel.com/HH
TH/CJM/YYY1RHVV_en.xml
<hotel des
TH/CJM/YYY1RHVV_en.xml
om/hotels/
TH/CJM/YYY0DJJ2_en.xml
<hotel des
TH/CJM/CJM0005P_en.xml
<hotel des

As the logs show, a filename is repeated whenever that file's data is split across several chunks.
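
One possible way to get a single string per file is to gather the chunks of each read stream and join them only when the stream emits 'end'. The sketch below is a minimal illustration of that idea, not a yauzl feature: collectStream is a hypothetical helper name, and the demo stream built with Readable.from only stands in for the stream that zipfile.openReadStream would hand to its callback.

```javascript
const { Readable } = require('stream')

// Hypothetical helper (not part of yauzl): collects every chunk of a
// readable stream and resolves with the full content as one string.
function collectStream(readStream) {
  return new Promise((resolve, reject) => {
    const chunks = []
    // Each 'data' event delivers one Buffer; keep them in arrival order
    readStream.on('data', chunk => chunks.push(chunk))
    readStream.on('error', reject)
    // Join the raw bytes first and decode once, so a multi-byte UTF-8
    // character split across two chunks is still decoded correctly
    readStream.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')))
  })
}

// Stand-in for a zip entry stream that arrives in two chunks
const demo = Readable.from([Buffer.from('<hotel des'), Buffer.from('cription/>')])
collectStream(demo).then(xml => console.log(xml))
```

Inside the openReadStream callback this would presumably be used as collectStream(readStream).then(xml => { /* parse xml for entry.fileName */ }), so that each filename is handled once with its complete content.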



via Kunok
