I am trying to scrape some Wikipedia pages with my Node.js app, using jsdom. Here is an example of what I'm doing:
jsdom.env({
  url: "https://en.wikipedia.org/wiki/Bill_Gates",
  features: {
    FetchExternalResources: ['script'],
    ProcessExternalResources: ['script'],
    SkipExternalResources: false,
  },
  done: function (err, window) {
    if (err) {
      console.log("Error: ", err);
      return;
    }
    var paras = window.document.querySelectorAll('p');
    console.log("Paras: ", paras);
  }
});
The weird thing is that querySelectorAll('p') returns a NodeList of empty elements:
Paras: NodeList {
  '0': HTMLParagraphElement {},
  '1': HTMLParagraphElement {},
  '2': HTMLParagraphElement {},
  '3': HTMLParagraphElement {},
  '4': HTMLParagraphElement {},
  '5': HTMLParagraphElement {},
  '6': HTMLParagraphElement {},
  '7': HTMLParagraphElement {},
  ...
  '62': HTMLParagraphElement {} }
Any idea what the problem could be? Thanks!
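One possible explanation, offered as a guess rather than a confirmed diagnosis: console.log prints only an object's own enumerable properties, while DOM element objects typically expose their data (textContent, innerHTML and so on) through getters on the prototype, so an element can print as {} and still hold text. The sketch below illustrates that effect without jsdom. FakeParagraph and paragraphTexts are hypothetical names made up for the demonstration, and it is an assumption that jsdom's elements behave the same way in this setup.

```javascript
const util = require('util');

// Hypothetical stand-in for a DOM element (not part of jsdom). The text is
// kept in a WeakMap and exposed only through a prototype getter, so each
// instance has no own enumerable properties.
const textStore = new WeakMap();

class FakeParagraph {
  constructor(text) {
    textStore.set(this, text);
  }
  get textContent() {
    return textStore.get(this);
  }
}

// Hypothetical helper: collect the text of every element in an
// array-like list such as a NodeList.
function paragraphTexts(nodeList) {
  return Array.from(nodeList, function (p) {
    return p.textContent;
  });
}

// Array-like object playing the role of the NodeList from querySelectorAll.
const fakeParas = {
  0: new FakeParagraph('First paragraph.'),
  1: new FakeParagraph('Second paragraph.'),
  length: 2,
};

// This is the string console.log would print for the element itself.
const inspected = util.inspect(fakeParas[0]);
const texts = paragraphTexts(fakeParas);

console.log(inspected); // FakeParagraph {}
console.log(texts);     // the two paragraph strings
```

If that is what is happening here, calling paragraphTexts(paras) inside the done callback, or simply logging paras[0].textContent, should show the actual paragraph text instead of {}.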
via Randy