Wednesday 26 April 2017

How can I get the image src, title and description from this HTML using cheerio?

I am trying to extract some content from a website using Node.js with cheerio. I want to extract the following:

  1. The "This is my sample title text" text.
  2. The "Here will be my description content." text.
  3. The image src.

Here is the HTML:

     <body>
     <div class="detail_loop">
         <img class="imfast" data-original="http://www.example.com/wp-content/uploads/2017/03/imageurl-250x150.jpg" title=""
              align="left" width="250" height="150"
              src="http://www.example.com/wp-content/uploads/2017/03/imageurl-250x150.jpg" style="display: block;">
         <h2>
             <a href="http://www.example.com/2017/04/576487/" rel="bookmark">This is my sample title text</a>
         </h2>
         Here will be my description content.
         <div class="clear"></div>
         <div class="send_loop" style="display: none;">
             <a href="http://www.example.com/2017/04/576487//#respond" target="_blank">
                 <div class="send_com">
                     <div class="send_bubb">
                         <div class="count">
                             0
                         </div>
                     </div>
                 </div>
             </a>
             <a href="https://www.facebook.com/sendr.php?u=http://www.example.com/2017/04/576487/" target="_blank">
                 <div class="send_fb">
                     <div class="send_bubb">
                         <div class="count">
                             send
                         </div>
                     </div>
                 </div>
             </a>
             <a href="https://twitter.com/send?url=http://www.example.com/2017/04/576487/&amp;text=this is sample title;hashtags=example"
                target="_blank">
                 <div class="send_tt">
                     <div class="send_bubb">
                         <div class="count">
                             Tweet
                         </div>
                     </div>
                 </div>
             </a>
             <div class="clear"></div>
         </div>
         <div class="clear"></div>
         <div class="detail_loop_dvd"></div>
         <div class="clear"></div>
     </div>
     </body>



via Rahul Subedi
