Sunday, 11 June 2017

How to get rid of the Server header in a GET request using CasperJS (scraping Nike)

It looks like when a website sees "name": "Server", "value": "AkamaiGHost" in the headers, it doesn't want me to access it. I am "forbidden" from entering. How do I get rid of the Server header?

self.thenOpen("http://" + link, {headers: {"Server": "Apache/2.4.1 (Unix)"}}, function (opened) {
    // Log the headers of the response that came back for this page
    console.log("headers :", JSON.stringify(opened.headers, null, 4));
    casper.wait(5000, function () {
        // Take the first 400,000 characters of the rendered HTML, lowercased
        // (relies on jQuery being available on the page)
        var html = casper.evaluate(function () {
            return $("html")[0].outerHTML.substring(0, 400000).toLowerCase();
        });
        // console.log(html)
        var result = stringDetector(["shoe", "shoes", "sneaker", "sneakers"], html, link, self);
        if (result) {
            outSide.push({name: result, htmlLength: html.length});
        }
    });
});

I tried passing {headers : {"Server": "Apache/2.4.1 (Unix)"}} to help me out, but AkamaiGHost still shows up for the Nike and Converse websites. For other websites, when I log the headers to the console, other servers show up.

I'm guessing the problem is that "Phantom" uses AkamaiGHost (maybe) and Nike detects PhantomJS. How do I stop them from detecting Casper?
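One thing worth checking here: Server is a response header, filled in by whatever machine answers the request (AkamaiGHost is how Akamai's edge servers identify themselves), so sending a Server value in the request is not expected to change it. What the script does control are the request headers, such as User-Agent, which by default identifies PhantomJS. Below is a minimal sketch of that idea, not a confirmed fix: buildRequestHeaders, RESPONSE_ONLY, BROWSER_UA, the Accept-Language value and the example URL are all hypothetical names and values chosen for illustration, and a site may still detect the headless browser by other means.

```javascript
// Header names that only make sense in a response. Sending them in a
// request does not change what the server reports about itself.
var RESPONSE_ONLY = ["server", "set-cookie", "etag", "age"];

// Assumed browser-like User-Agent string (illustrative value only).
var BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) " +
    "AppleWebKit/537.36 (KHTML, like Gecko) " +
    "Chrome/58.0.3029.110 Safari/537.36";

// Hypothetical helper: start from a User-Agent, then copy over any extra
// headers except the response-only ones listed above.
function buildRequestHeaders(extra, userAgent) {
    var headers = { "User-Agent": userAgent };
    Object.keys(extra || {}).forEach(function (name) {
        if (RESPONSE_ONLY.indexOf(name.toLowerCase()) === -1) {
            headers[name] = extra[name];
        }
    });
    return headers;
}

// "Server" is dropped; "Accept-Language" and "User-Agent" are kept.
var requestHeaders = buildRequestHeaders(
    { "Server": "Apache/2.4.1 (Unix)", "Accept-Language": "en-US,en;q=0.8" },
    BROWSER_UA
);

// This part only runs under CasperJS/PhantomJS, where the global
// `phantom` object exists; elsewhere it is skipped.
if (typeof phantom !== "undefined") {
    var casper = require("casper").create({
        // Also set the User-Agent at page level so sub-requests use it.
        pageSettings: { userAgent: BROWSER_UA }
    });
    casper.start();
    casper.thenOpen("http://www.nike.com", { headers: requestHeaders }, function (response) {
        this.echo("status: " + response.status);
        this.echo("headers: " + JSON.stringify(response.headers, null, 4));
    });
    casper.run();
}
```

With this sketch the Server entry in the logged response headers would still read AkamaiGHost for sites served through Akamai; the point is only to see whether the status changes once the request no longer announces PhantomJS.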



via jack blank
