Wednesday, 24 May 2017

Error: connect ETIMEDOUT when scraping

I have a function that:
1. Gets an array of 3000+ distinct 'id' values from the MongoDB documents in collection foo.
2. Sends a GET request for each id to fetch its 'resp' object, and stores that object in another collection (test).

router.get('/', (req, res) => {

    var collection = db.get().collection('foo');
    var collection2 = db.get().collection('test');
    // Collect the distinct ids (count: 3000+), then request each one.
    collection.distinct('id', (err, idArr) => {
        idArr.forEach(id => {
            let url = 'https://externalapi.io/id=' + id;
            request(url, (error, response, body) => {
                if (error) {
                    console.log(error);
                } else {
                    // Parse the response body and store it in the second collection.
                    const resp = JSON.parse(body);
                    collection2.insert(resp);
                }
            });
        });
    });
});

Node Error Log:

[0] events.js:163
[0]       throw er; // Unhandled 'error' event
[0]       ^
[0]
[0] Error: connect ETIMEDOUT [EXT URL REDACTED]
[0]     at Object.exports._errnoException (util.js:1050:11)
[0]     at exports._exceptionWithHostPort (util.js:1073:20)
[0]     at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1093:14)

I am using simple-rate-limiter to stay under the API's rate limit (25cps):

const limit = require("simple-rate-limiter");
const request = limit(require("request")).to(20).per(1000);
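
For scale, here is a rough sketch of what a 20-per-1000 ms setting implies for 3000 ids. It assumes calls are released in batches of up to 20 per window, which is only a simplified model and may not match how simple-rate-limiter actually schedules calls. The helper name earliestStartMs is hypothetical.

```javascript
// Earliest start time (ms) of the call at zero-based position `index`,
// ASSUMING at most `perWindow` calls are released per `windowMs` window.
// This is a simplified model, not simple-rate-limiter's real scheduler.
function earliestStartMs(index, perWindow, windowMs) {
  return Math.floor(index / perWindow) * windowMs;
}

// Position of the last of 3000 ids under the 20-per-1000 ms setting.
const lastStart = earliestStartMs(2999, 20, 1000);
console.log("last request starts after about " + lastStart / 1000 + " s");
```

Under this model the run lasts roughly two and a half minutes. The limiter only spaces out when requests start, so slow responses can still pile up in flight.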

But somewhere between 300 and 1700 requests in, I receive this error, which crashes Node on the command line. How can I handle this error so that my app does not crash?

I have tried several error-handling approaches, but none of them caught the connect ETIMEDOUT error.
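
One direction that might be worth trying, sketched under assumptions rather than confirmed as the fix: wrap each request in a promise, cap how many are in flight at once, and record a failure per id instead of letting it propagate. The names mapWithConcurrency and fakeFetch are hypothetical. fakeFetch only simulates a GET request that sometimes fails with ETIMEDOUT, so the sketch runs without the real API.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight.
// A rejected call is recorded instead of thrown, so one ETIMEDOUT
// does not stop the remaining ids.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function runner() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i]) };
      } catch (err) {
        results[i] = { ok: false, error: err };
      }
    }
  }

  // Start up to `limit` runners that pull from the shared index.
  const runners = [];
  for (let k = 0; k < Math.min(limit, items.length); k++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Hypothetical stand-in for the real GET request:
// every id divisible by 3 "times out".
function fakeFetch(id) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (id % 3 === 0) {
        const err = new Error("connect ETIMEDOUT");
        err.code = "ETIMEDOUT";
        reject(err);
      } else {
        resolve({ id });
      }
    }, 1);
  });
}

// Usage: failures are collected per id and can be logged or retried later.
mapWithConcurrency([1, 2, 3, 4, 5, 6], 2, fakeFetch).then((results) => {
  const failed = results.filter((r) => !r.ok);
  console.log(failed.length + " of " + results.length + " requests failed");
});
```

In the real route, the worker would be a promise wrapper around the actual GET request, and the ids that failed could be retried after the first pass. Whether this removes the crash depends on where the unhandled 'error' event is emitted, which the log above does not show.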



via Moshe
