Tuesday, 4 April 2017

How to use the correct selectors for scraping URLs?

If I want to scrape the title, description, and date from the following HTML using Node.js, what should my selectors be? I tried the Node.js code shown after the markup, but I don't know where I am going wrong.

<header class="entry-header">
    <h2 class="entry-title">
        <a href="http://www.raittude.in/the-spirit-of-horizon/" title="The Spirit of Horizon">The Spirit of Horizon</a>
    </h2><!-- .entry-title -->
    </header>


    <div class="entry-content clearfix">
        <p>If you&#8217;re an RAITian, you must have surely heard about the HORIZON, all the amazing events; the marvellous Fashion show, Footloose dance competition and of course, the Live-in concerts and grand Mainstage music festivals. But Horizon is much more than</p>
<table class="rw-rating-table rw-ltr rw-left rw-no-labels"><tr><td><nobr>&nbsp;</nobr></td><td><div class="rw-left"><div class="rw-ui-container rw-class-blog-post rw-urid-12680" data-img="http://www.raittude.in/wp-content/uploads/2017/02/10367577_709706035810460_4479719323869272611_n-300x200.jpg"></div></div></td></tr></table>    </div>

    <footer class="entry-meta-bar clearfix"><div class="entry-meta clearfix">
      <span class="by-author author vcard"><a class="url fn n" href="http://www.raittude.in/author/pratiksha_p/">Pratiksha Padhi</a></span>

      <span class="date"><a href="http://www.raittude.in/the-spirit-of-horizon/" title="10:35 pm" rel="bookmark"><time class="entry-date published" datetime="2017-02-02T22:35:10+00:00">February 2, 2017</time><time class="updated" datetime="2017-02-02T22:35:54+00:00">February 2, 2017</time></a></span>
               <span class="category"><a href="http://www.raittude.in/category/events-and-festivals/" rel="category tag">Events and Fests</a></span>

Node.js:

$('a.entry_title').each(function () {
    json.url.push($(this).attr('title'));
});

$('time.entry-date published').each(function () {
    json.date.push($(this).text());
});

$('p.entry-content clearfix').each(function () {
    json.description.push($(this).text());
});
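
One likely reason the three loops collect nothing is how the selector strings are read, assuming `$` is a jQuery-style selector function such as the one cheerio provides. A space inside a selector means "descendant element", not "second class", and the class in the markup is `entry-title` (with a hyphen) on the `h2`, not on the `a`. The sketch below uses a hypothetical helper, `describeSelector`, which is not part of cheerio or jQuery and handles only tag names, classes, and the descendant combinator. It prints how each selector breaks down, first for the attempts and then for selectors that appear to match the markup shown.

```javascript
// Hypothetical helper for illustration only: it understands tag names, classes
// and the descendant (whitespace) combinator. Real selector engines do far more.
function describeSelector(selector) {
  return selector
    .trim()
    .split(/\s+/) // whitespace separates ancestor from descendant
    .map((part) => {
      const [tag, ...classes] = part.split('.'); // "tag.class1.class2"
      return { tag: tag || '*', classes };
    });
}

// Selectors from the question.
const attempts = [
  'a.entry_title',             // class is "entry-title" and sits on the <h2>
  'time.entry-date published', // looks for a <published> element inside <time>
  'p.entry-content clearfix',  // looks for a <clearfix> element inside <p>
];

// Selectors that seem to fit the markup shown (an assumption based on that snippet).
const corrected = [
  'h2.entry-title a',          // the link inside the title heading
  'time.entry-date.published', // one <time> carrying both classes
  'div.entry-content p',       // the paragraph inside the content div
];

for (const selector of [...attempts, ...corrected]) {
  console.log(selector, '=>', JSON.stringify(describeSelector(selector)));
}
```

If that reading is right, the loops would presumably use `$('h2.entry-title a')` with `.attr('href')` for the URL (or `.text()` for the title), `$('time.entry-date.published')` for the date, and `$('div.entry-content p')` for the description.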



via Prachi Vaity
