I have several lists of image URLs from different sources (Flickr, images stored on S3, Imgur, etc.).
I want to get the dimensions of these images.
I use Node and https://github.com/nodeca/probe-image-size to probe each URL, read the image's width, and count how many images fall at each width (rounded down to a multiple of 10) with the following code:
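// Assumes this runs inside an async function and image_urls is already defined.
const _ = require('lodash');
const probe = require('probe-image-size');

// Start one probe per URL; all requests are issued at the same time.
const probes = [];
_.forEach(image_urls, url => {
  probes.push(probe(url));
});
const results = await Promise.all(probes);

// Count images per width bucket, rounding each width down to a multiple of 10.
const widthes = {};
_.forEach(results, result_of_image => {
  const width = Math.floor(result_of_image.width / 10) * 10;
  widthes[width] = (widthes[width] || 0) + 1;
});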
Even though all the URLs are accessible, I sometimes get getaddrinfo ENOTFOUND
with the following stack:
at ClientRequest.req.once.err (/image_script/node_modules/got/index.js:73:21)
at Object.onceWrapper (events.js:293:19)
at emitOne (events.js:101:20)
at ClientRequest.emit (events.js:191:7)
at TLSSocket.socketErrorListener (_http_client.js:358:9)
at emitOne (events.js:96:13)
at TLSSocket.emit (events.js:191:7)
at connectErrorNT (net.js:1031:8)
at _combinedTickCallback (internal/process/next_tick.js:80:11)
at process._tickDomainCallback (internal/process/next_tick.js:128:9)
I suspect that because the URL list is very large (in the thousands), Node uses up all of the system's resources and things stop working properly (this is a guess).
Is there a better way to do the above, or a way to give Node some kind of connection pool?
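One way to test that guess is to cap how many probes are in flight at once instead of starting them all together. The following is a minimal sketch under that assumption, not a confirmed fix for the ENOTFOUND errors. The helper names mapWithLimit and countWidths are hypothetical, and the limit of 20 in the usage comment is an arbitrary value chosen for illustration. Failed probes are recorded rather than rejecting the whole batch, so one bad lookup does not discard the other results.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each runner keeps claiming the next unprocessed index until none are left.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i]) };
      } catch (err) {
        // Record the failure instead of rejecting the whole batch.
        results[i] = { ok: false, error: err };
      }
    }
  }

  const runners = [];
  for (let k = 0; k < Math.min(limit, items.length); k++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Count widths in buckets of 10, skipping probes that failed.
function countWidths(results) {
  const widthes = {};
  for (const r of results) {
    if (!r.ok) continue;
    const width = Math.floor(r.value.width / 10) * 10;
    widthes[width] = (widthes[width] || 0) + 1;
  }
  return widthes;
}

// Usage inside an async function, assuming `probe` is probe-image-size
// and 20 is an arbitrary starting limit:
//   const results = await mapWithLimit(image_urls, 20, url => probe(url));
//   const widthes = countWidths(results);
```

If the errors disappear at a low limit and come back as the limit is raised, that would support the idea that the number of simultaneous lookups and connections is the cause.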
via Nick Ginanto