Wednesday, 31 May 2017

Getting the widths of a lot of images causes ENOTFOUND in Node.js

I have several sources of image lists (Flickr, images stored on S3, Imgur, etc.).

I want to get the dimensions of these images.

I use Node and https://github.com/nodeca/probe-image-size to go over each URL, get the width of the image, and count how many images fall into each width bucket (rounded down to a multiple of 10) with the following code:

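    const _ = require('lodash');
    const probe = require('probe-image-size');

    // Runs inside an async function; image_urls is the list of URLs.
    const probes = [];
    const widthes = {};

    // Start a probe for every URL at once.
    _.forEach(image_urls, url => {
      probes.push(probe(url));
    });
    const results = await Promise.all(probes);
    _.forEach(results, result_of_image => {
      // Round the width down to the nearest multiple of 10.
      const width = Math.floor(result_of_image.width / 10) * 10;
      if (!widthes[width]) {
        widthes[width] = 1;
      } else {
        widthes[width]++;
      }
    });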

Even though all the URLs are accessible, I sometimes get getaddrinfo ENOTFOUND with the following stack trace:

    at ClientRequest.req.once.err (/image_script/node_modules/got/index.js:73:21)
    at Object.onceWrapper (events.js:293:19)
    at emitOne (events.js:101:20)
    at ClientRequest.emit (events.js:191:7)
    at TLSSocket.socketErrorListener (_http_client.js:358:9)
    at emitOne (events.js:96:13)
    at TLSSocket.emit (events.js:191:7)
    at connectErrorNT (net.js:1031:8)
    at _combinedTickCallback (internal/process/next_tick.js:80:11)
    at process._tickDomainCallback (internal/process/next_tick.js:128:9)

I suspect that, because the URL list is very large (thousands of entries) and every request is started at once, Node uses up the system's resources and things stop working properly. This is only a guess.

Is there a better way to do the above, or a way to give Node some kind of connection pool?
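
One way to test that guess would be to cap how many probes are in flight at a time instead of starting them all together. The following is only a minimal sketch of that idea, not something taken from probe-image-size: mapWithLimit and countWidths are hypothetical helper names, and failures are recorded per URL so that one ENOTFOUND does not reject the whole batch.

```javascript
// Hypothetical helper (not part of any library): run fn over items with
// at most `limit` calls in flight at any moment.
async function mapWithLimit(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker keeps pulling the next unstarted item until none are left.
  async function worker() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await fn(items[i]) };
      } catch (err) {
        // Keep the error next to its item instead of failing everything.
        results[i] = { ok: false, error: err };
      }
    }
  }

  const workers = [];
  for (let w = 0; w < Math.min(limit, items.length); w++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results;
}

// Hypothetical helper: count successful results in width buckets of 10.
function countWidths(results) {
  const widths = {};
  for (const r of results) {
    if (!r.ok) continue; // skip URLs whose probe failed
    const bucket = Math.floor(r.value.width / 10) * 10;
    widths[bucket] = (widths[bucket] || 0) + 1;
  }
  return widths;
}
```

With the code above this might be called as mapWithLimit(image_urls, 20, url => probe(url)) followed by countWidths(results). The limit of 20 is an arbitrary starting point to tune, not a recommended value.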



via Nick Ginanto
