I'm scraping a certain website using Node's https
module, like so:
https.request({}, function (res) {
    gzip = zlib.createGunzip();
    res.pipe(gzip);
    output = gzip;
    ...
});
Using Firefox or Chrome, the page's HTML contains this:
brouck%C3%A8re
However, in the string I get from the response object
(an http.IncomingMessage), the url-encoded part turns into an invalid character:
brouck�re
Why is it not staying url-encoded? I'm not decoding it anywhere in my flow.
I'm concatenating the data, but setting the encoding correctly:
output.on('data', function gotData(data) {
    body += data.toString('utf-8');
});
So what's going on here?
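One thing worth checking, sketched below, is whether the raw bytes are really UTF-8. If the body contained the plain ASCII text %C3%A8, no decoding step could alter it, so the server is probably sending a raw è and the browser is only displaying it percent-encoded. The sketch assumes, without confirmation from the site, that the page is ISO-8859-1; decodeBody is a hypothetical helper, not part of Node. It also shows a separate pitfall of calling toString() on each chunk: a multi-byte character can be split across two chunks.

```javascript
// Hypothetical helper (not from the original post): join the raw chunks
// first, then decode once with the charset the server actually used.
function decodeBody(chunks, encoding) {
  return Buffer.concat(chunks).toString(encoding);
}

// 'brouck\u00e8re' is "brouckère"; the escape avoids source-encoding issues.
const word = 'brouck\u00e8re';

// Assumption: the server sends ISO-8859-1, where 'è' is the single byte 0xE8.
const latin1Chunks = [Buffer.from(word, 'latin1')];
const wrongCharset = decodeBody(latin1Chunks, 'utf8');   // 0xE8 is invalid UTF-8 here
const rightCharset = decodeBody(latin1Chunks, 'latin1'); // decodes cleanly

// Separate pitfall: a UTF-8 'è' (0xC3 0xA8) split across two chunks.
const utf8Bytes = Buffer.from(word, 'utf8');
const splitChunks = [utf8Bytes.subarray(0, 7), utf8Bytes.subarray(7)];
const perChunk = splitChunks
  .map(function (chunk) { return chunk.toString('utf8'); })
  .join('');                                             // decoded chunk by chunk
const decodedOnce = decodeBody(splitChunks, 'utf8');     // decoded after joining

console.log(JSON.stringify([wrongCharset, rightCharset, perChunk, decodedOnce]));
```

In the real flow this would mean pushing each data chunk onto an array inside the 'data' handler and decoding in the 'end' handler, using whatever charset the Content-Type header or the page's meta tag declares.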
via skerit