
@vheidari
Last active May 9, 2023 16:41

Revisions

  1. vheidari revised this gist Apr 10, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Puppeteer.md
    Original file line number Diff line number Diff line change
    @@ -181,7 +181,7 @@ import * as fs from 'node:fs/promises';
    const page = await browser.newPage();

    // go to this address
    await page.goto('https://www.amazon.com/Desktop-Processor-12-Thread-Unlocked-Motherboard/dp/B0972FHS7J/ref=sr_1_14?keywords=amd%2Bmotherboard&qid=1680889742&sr=8-14&th=1');
    await page.goto('https://www.amazon.com/Desktop-Processor-12-Thread-Unlocked-Motherboard/dp/B0972FHS7J');

    // set viewport size, width and height
    await page.setViewport({width: 1980, height: 1080});
  2. vheidari revised this gist Apr 9, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Puppeteer.md
    Original file line number Diff line number Diff line change
    @@ -1,5 +1,5 @@
    # Puppeteer
    In this article I wanna introduce `Puppeteer` as a tools that help us to do something cool like `Web Scraping` or `Automation` some task. `Puppeteer` helps developer up and run a google chromium browser throught command line tools this google chromium is headless browser that that actining like real world browser. `Puppeteer` Api helps developer to do anyting that a user could do with it's browser. for example :
    In this article I wanna introduce `Puppeteer` as a tools that help us to do something cool like `Web Scraping` or `Automation` some task. `Puppeteer` helps developer up and run a google chromium browser throught command line tools this google chromium is headless browser that acting like real world browser. `Puppeteer` Api helps developer to do anyting that a user could do with it's browser. for example :

    - we could open and new page or new tab
    - we could select any element from DOM with it's api
  3. vheidari revised this gist Apr 9, 2023. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion Puppeteer.md
    Original file line number Diff line number Diff line change
    @@ -163,7 +163,7 @@ import puppeteer from 'puppeteer';

    ```

    ## `Puppeteer` Example 04 -> how we could download list of images with puppeteer
    ## `Puppeteer` Example 04 -> how we could download an images with puppeteer
    In this example we will attempt to extract first product image from the amazon.com website then save it on the hard disk.

    ```javascript
  4. vheidari created this gist Apr 9, 2023.
    227 changes: 227 additions & 0 deletions Puppeteer.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,227 @@
    # Puppeteer
    In this article I want to introduce `Puppeteer`, a tool that helps us do cool things like web scraping or automating tasks. `Puppeteer` lets a developer launch a Chromium browser from the command line. By default this browser is headless, yet it behaves like a real-world browser. The `Puppeteer` API lets a developer do almost anything a user could do with their browser. For example:

    - we can open a new page or a new tab
    - we can select any element from the DOM with its API
    - we can type into and select input elements and manipulate their values
    - we can select a button and click on it
    - we can create a PDF from the current page
    - we can create a screenshot of the current page
    - ...


    Here is the official description of `Puppeteer`:

    > "Puppeteer" is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full (non-headless) Chrome/Chromium.

    ## Install `Puppeteer`

    To install this tool, follow the instructions below. First, create a `package.json` file with this command:

    ```bash
    npm init
    ```
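    Then use the command below to install `Puppeteer`:

    ```bash
    npm install puppeteer --save
    ```
    Once `npm` has installed all dependencies, open `package.json` and add `"type": "module"` to it as a key/value pair. This makes Node.js treat `.js` files as ES modules, which the `import` statements in the examples below require.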

    ```json
    {
      "name": "project-name",
      "version": "1.0.0",
      "description": "",
      "type": "module",
      "main": "app.js",
      "scripts": {
        "run": "node app.js"
      },
      "author": "",
      "license": "ISC",
      "dependencies": {
        "puppeteer": "^19.8.3"
      }
    }
    ```
    OK, once all the steps above are done, we are ready to implement our first example.


    ## `Puppeteer` Example 01 -> Take a screenshot
    In this example we will learn how to take a screenshot of a web page and save it to the hard disk.

    ```javascript

    // import the puppeteer package
    import puppeteer from 'puppeteer';

    (async () => {
      // launch a browser
      const browser = await puppeteer.launch();

      // create a new page
      const page = await browser.newPage();

      // go to https://developer.mozilla.org/en-US/
      await page.goto('https://developer.mozilla.org/en-US/');

      // set the viewport size (width and height)
      await page.setViewport({width: 1980, height: 1080});

      // take a full-page screenshot and save it to disk
      await page.screenshot({path: 'mozilla-dev-center.png', fullPage: true});

      // close the browser
      await browser.close();
    })();
    ```
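
    The examples in this article call `browser.close()` as their last step, so an error thrown earlier (a failed navigation, a selector that never appears) would leave the Chromium process running. Below is a minimal sketch of one way to guard against that. `withBrowser` is a hypothetical helper name, not part of the Puppeteer API, and its `launch` argument is assumed to be a function such as `() => puppeteer.launch()`.

    ```javascript
    // Hypothetical helper (not part of Puppeteer): runs `task` with a browser
    // and always closes the browser afterwards, even if `task` throws.
    // `launch` is assumed to be a function that resolves to an object
    // with an async close() method, e.g. () => puppeteer.launch().
    async function withBrowser(launch, task) {
      const browser = await launch();
      try {
        // return whatever the task produces
        return await task(browser);
      } finally {
        // runs on success and on failure
        await browser.close();
      }
    }
    ```

    With Puppeteer this could be used as `await withBrowser(() => puppeteer.launch(), async (browser) => { /* open a page, take the screenshot */ });`.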


    ## `Puppeteer` Example 02 -> how to read the bitcoin price from CMC
    In this example we read the bitcoin price from the `CoinMarketCap` website. We will learn how to select an element and how to extract data from it.

    ```javascript

    // import the puppeteer package
    import puppeteer from 'puppeteer';

    (async () => {
      // launch a browser
      const browser = await puppeteer.launch();

      // create a new page
      const page = await browser.newPage();

      // go to https://coinmarketcap.com/currencies/bitcoin/
      await page.goto('https://coinmarketcap.com/currencies/bitcoin/');

      // set the viewport size (width and height)
      await page.setViewport({width: 1980, height: 1080});

      // wait for the price element and store its handle in bitcoinElement
      const bitcoinElement = await page.waitForSelector('.priceValue>span');

      // extract the price text from bitcoinElement with the evaluate method
      const bitcoinPrice = await bitcoinElement.evaluate(el => el.textContent);

      // print the bitcoin price
      console.log("bitcoin price on the cmc : " + bitcoinPrice);

      // close the browser
      await browser.close();
    })();
    ```
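
    `textContent` returns a string, so the price has to be converted before it can be compared or used in arithmetic. The sketch below assumes the text looks like `$28,123.45` (a currency symbol, comma thousands separators, a dot as decimal point). The actual format on the page may differ, and `parsePrice` is a hypothetical helper name.

    ```javascript
    // Hypothetical helper: converts a price string such as "$28,123.45"
    // into a number. Assumes "," is a thousands separator and "." is the
    // decimal point. Returns null when no number can be read.
    function parsePrice(text) {
      // keep only digits and the decimal point
      const cleaned = String(text).replace(/[^0-9.]/g, '');
      if (cleaned === '') return null;
      const value = Number(cleaned);
      return Number.isNaN(value) ? null : value;
    }
    ```

    In Example 02 it could be applied to the extracted text, for example `parsePrice(bitcoinPrice)`.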

    ## `Puppeteer` Example 03 -> how to select a form input and type inside it
    Example 03 shows how to interact with an HTML form and type text into it.

    ```javascript

    // import the puppeteer package
    import puppeteer from 'puppeteer';

    (async () => {
      // launch a browser
      const browser = await puppeteer.launch();

      // create a new page
      const page = await browser.newPage();

      // go to https://github.com/
      await page.goto('https://github.com/');

      // set the viewport size (width and height)
      await page.setViewport({width: 1980, height: 1080});

      // wait for the search input, selected through input[name="q"]
      const searchBox = await page.waitForSelector('input[name="q"]');

      // type "puppeteer" into the input element with the type method
      await searchBox.type('puppeteer');

      // take a screenshot of the page to confirm the text was typed
      await page.screenshot({path: 'github-searchbox.png', fullPage: true});

      // close the browser
      await browser.close();
    })();
    ```

    ## `Puppeteer` Example 04 -> how we could download an image with puppeteer
    In this example we will attempt to extract the first product image from the amazon.com website and save it to the hard disk.

    ```javascript

    // import the puppeteer package and the promise-based fs API
    import puppeteer from 'puppeteer';
    import * as fs from 'node:fs/promises';

    (async () => {
      // launch a browser
      const browser = await puppeteer.launch();

      // create a new page
      const page = await browser.newPage();

      // go to the product page
      await page.goto('https://www.amazon.com/Desktop-Processor-12-Thread-Unlocked-Motherboard/dp/B0972FHS7J/ref=sr_1_14?keywords=amd%2Bmotherboard&qid=1680889742&sr=8-14&th=1');

      // set the viewport size (width and height)
      await page.setViewport({width: 1980, height: 1080});

      // wait a fixed 10 seconds to give the page time to load completely
      await page.waitForTimeout(10000);

      // take a screenshot of the product page
      await page.screenshot({path: 'amazon.png', fullPage: true});

      // select the image through its id
      const getLandingImage = await page.waitForSelector('#landingImage');

      // read the image URL inside the browser with evaluate and return it to the Node.js environment
      const landingImageUrl = await getLandingImage.evaluate(x => x.src);

      // navigate to the image URL; goto resolves to the HTTP response for that request
      const imageResponse = await page.goto(landingImageUrl);

      // write the image to disk with the fs API and the response's buffer method
      await fs.writeFile(landingImageUrl.split("/").pop(), await imageResponse.buffer());

      // log the image URL to the terminal
      console.log(landingImageUrl);

      // take a screenshot of the image page to confirm everything worked
      await page.screenshot({path: 'amazon-landing-image.png', fullPage: true});

      // close the browser
      await browser.close();
    })();
    ```
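
    `landingImageUrl.split("/").pop()` keeps any query string in the file name and yields an empty string when the URL ends with `/`. Below is a small sketch of a more defensive alternative that uses the standard `URL` class, available as a global in Node.js. `fileNameFromUrl` and the `image.jpg` fallback are hypothetical names chosen for illustration.

    ```javascript
    // Hypothetical helper: derives a file name from the path of a URL,
    // ignoring the query string and fragment. Falls back to `fallback`
    // when the path has no last segment (for example, it ends with "/").
    function fileNameFromUrl(url, fallback = 'image.jpg') {
      // URL is a global in Node.js; it throws on an invalid URL
      const { pathname } = new URL(url);
      const lastSegment = pathname.split('/').pop();
      return lastSegment || fallback;
    }
    ```

    In Example 04 the result could be passed to `fs.writeFile` in place of `landingImageUrl.split("/").pop()`.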





    ## Good Examples and Resources
    https://github.com/puppeteer/puppeteer/tree/main/examples