Friday 14 April 2017

Headless Chrome rendering full page

The problem with the current headless Chrome is that there is no API to render the full page; you only get the "window" whose size you set with a CLI parameter.

I am using the chrome-remote-interface module. This is the capture example:

const fs = require('fs');
const CDP = require('chrome-remote-interface');

CDP({ port: 9222 }, client => {

    // extract domains
    const {Network, Page} = client;

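    // once the page has loaded, wait 5 s for late content, then capture
    Page.loadEventFired(() => {
        const startTime = Date.now();
        setTimeout(() => {
            Page.captureScreenshot()
            .then(v => {
                // v.data holds the PNG as a base64 string
                const filename = `screenshot-${Date.now()}.png`;
                fs.writeFileSync(filename, v.data, 'base64');
                console.log(`Image saved as ${filename}`);
                console.log(`image success in: ${Date.now() - startTime}ms`);
            })
            .catch(err => {
                console.error(`ERROR: ${err.message}`);
            })
            // close the connection whether or not the capture succeeded
            .then(() => client.close());
        }, 5e3);
    });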
    // enable events then start!
    Promise.all([
        // Network.enable(),
        Page.enable()
    ]).then(() => {
        return Page.navigate({url: 'https://google.com'});
    }).catch((err) => {
        console.error(`ERROR: ${err.message}`);
        client.close();
    });
}).on('error', (err) => {
    console.error('Cannot connect to remote endpoint:', err);
});

To render the full page, one slower, hacky solution would be partial rendering: set a fixed viewport height, scroll through the page, and take a screenshot every X pixels. The problem is how to drive the scrolling. Would it be better to inject custom JS, or is it doable through the Chrome remote interface?
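
One way the scroll-and-capture idea might look is sketched below. It drives the scrolling through the remote interface by evaluating small JS expressions with Runtime.evaluate, so it is really a mix of both options. The helper names scrollOffsets and captureTiles are hypothetical, and the sketch assumes the page height is stable after load and that the viewport height passed in matches the window size given on the command line.

```javascript
// Hypothetical helper: scroll positions needed to cover a page of
// pageHeight pixels with a viewport of viewportHeight pixels.
function scrollOffsets(pageHeight, viewportHeight) {
    const offsets = [];
    for (let y = 0; y < pageHeight; y += viewportHeight) {
        offsets.push(y);
    }
    return offsets;
}

// Hypothetical helper: scroll through the page and capture one
// screenshot per step. `client` is a connected chrome-remote-interface
// client, used after the page has finished loading.
async function captureTiles(client, viewportHeight) {
    const {Runtime, Page} = client;

    // read the full document height from inside the page
    const {result} = await Runtime.evaluate({
        expression: 'document.documentElement.scrollHeight',
        returnByValue: true
    });

    const tiles = [];
    for (const y of scrollOffsets(result.value, viewportHeight)) {
        // drive the scrolling by evaluating JS in the page
        await Runtime.evaluate({expression: `window.scrollTo(0, ${y})`});
        const {data} = await Page.captureScreenshot();
        tiles.push({y, data}); // data is a base64-encoded PNG
    }
    return tiles;
}
```

The tiles would still have to be stitched into one image afterwards. The browser clamps the final scroll position to the bottom of the page, so the last tile can overlap the previous one, and a short delay after each scroll may be needed on pages that load content lazily.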



via Risto Novik
