@gopalkumar315
Forked from jakzal/crawler-edit.php
Created December 25, 2015 19:36
Revisions

  1. @jakzal jakzal revised this gist Apr 1, 2015. 1 changed file with 3 additions and 2 deletions.
    5 changes: 3 additions & 2 deletions crawler-edit.php
    @@ -24,8 +24,9 @@

     // remove all h2 nodes inside .content
     $crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    -    $node = $crawler->getNode(0);
    -    $node->parentNode->removeChild($node);
    +    foreach ($crawler as $node) {
    +        $node->parentNode->removeChild($node);
    +    }
     });

     // output .content nodes with h2 removed
  2. @jakzal jakzal created this gist Apr 1, 2015.
    34 changes: 34 additions & 0 deletions crawler-edit.php
    @@ -0,0 +1,34 @@
    +<?php
    +<<<CONFIG
    +packages:
    +    - "symfony/dom-crawler: ~2.3"
    +    - "symfony/css-selector: ~2.3"
    +CONFIG;
    +
    +use Symfony\Component\DomCrawler\Crawler;
    +
    +$html = <<<HTML
    +<html>
    +    <div class="content">
    +        <h2 class="gamma">Excerpt</h2>
    +        <p>...content html...</p>
    +    </div>
    +    <div class="content">
    +        <h2 class="gamma">Excerpt</h2>
    +        <p>...more content html...</p>
    +    </div>
    +</html>
    +HTML;
    +
    +$crawler = new Crawler($html, 'http://localhost');
    +
    +// remove all h2 nodes inside .content
    +$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    +    $node = $crawler->getNode(0);
    +    $node->parentNode->removeChild($node);
    +});
    +
    +// output .content nodes with h2 removed
    +$crawler->filter('html .content')->each(function (Crawler $crawler) {
    +    echo $crawler->html();
    +});
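
For comparison, the same removal can be sketched with PHP's built-in DOM extension alone, without the Symfony packages. This is not part of the gist: the helper name `stripContentHeadings` and the sample markup passed to it are hypothetical, and the XPath expression is a hand-written stand-in for the CSS selector `.content h2`. It relies on the same `$node->parentNode->removeChild($node)` call the gist uses.

```php
<?php
// Hypothetical helper (not part of the gist): removes every <h2> nested in an
// element with class "content" and returns the remaining .content markup.
function stripContentHeadings($html)
{
    // Collect libxml parse warnings quietly instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // XPath stand-in for the CSS class selector ".content".
    $content = '//*[contains(concat(" ", normalize-space(@class), " "), " content ")]';

    // Copy the matches into an array before mutating the tree.
    foreach (iterator_to_array($xpath->query($content . '//h2')) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize each remaining .content element.
    $out = '';
    foreach ($xpath->query($content) as $element) {
        $out .= $doc->saveHTML($element) . "\n";
    }

    return $out;
}

echo stripContentHeadings(
    '<html><body><div class="content"><h2 class="gamma">Excerpt</h2><p>text</p></div></body></html>'
);
```

The Crawler version stays shorter because `symfony/css-selector` translates the CSS selector for you; the sketch above only trades that convenience for having no dependencies.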