@OrigamiEngineer
Forked from AO8/table_writer.py
Created December 23, 2020 15:47
Revisions

  1. @AO8 AO8 revised this gist Jun 21, 2018. 1 changed file with 1 addition and 2 deletions.
    3 changes: 1 addition & 2 deletions table_writer.py
    Original file line number Diff line number Diff line change
    @@ -1,5 +1,4 @@
    # Extra indent on page 88 leads to incorrect printout of rows, fixed below
    # Use of try / finally on page 88 not as Pythonic as a "with" context manager, suggested below
    # Adapted from example in "Web Scraping with Python, 2nd Edition" by Ryan Mitchell.

    import csv
    from urllib.request import urlopen
  2. @AO8 AO8 created this gist Jun 19, 2018.
    20 changes: 20 additions & 0 deletions table_writer.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,20 @@
    # Extra indent on page 88 leads to incorrect printout of rows, fixed below
    # Use of try / finally on page 88 not as Pythonic as a "with" context manager, suggested below

    import csv
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    html = urlopen("http://en.wikipedia.org/wiki/"
                   "Comparison_of_text_editors")
    soup = BeautifulSoup(html, "html.parser")
    table = soup.findAll("table", {"class":"wikitable"})[0]
    rows = table.findAll("tr")

    with open("editors.csv", "wt+", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            csv_row = []
            for cell in row.findAll(["td", "th"]):
                csv_row.append(cell.get_text())
            writer.writerow(csv_row)
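
The two comments at the top of table_writer.py can be illustrated with a small standard-library sketch. It makes no network request and does not use BeautifulSoup. The function names and the `sample_table` data are hypothetical placeholders, not part of the gist or the book. The sketch uses `lineterminator="\n"` only to keep the in-memory output easy to read.

```python
import csv
import io
import os
import tempfile

# Hypothetical stand-in for the scraped table: each inner list is one row of cells.
sample_table = [["name", "value"], ["a", "1"]]


def write_try_finally(path, table):
    """Explicit close in a finally clause (the style the gist comment calls less Pythonic)."""
    f = open(path, "wt+", newline="")
    try:
        csv.writer(f).writerows(table)
    finally:
        f.close()


def write_with(path, table):
    """Context manager: the file is closed when the with block exits."""
    with open(path, "wt+", newline="") as f:
        csv.writer(f).writerows(table)


def render(table, extra_indent):
    """Build CSV text in memory; extra_indent=True mimics writerow inside the inner loop."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in table:
        csv_row = []
        for cell in row:
            csv_row.append(cell)
            if extra_indent:
                writer.writerow(csv_row)  # runs once per cell: partial rows are written
        if not extra_indent:
            writer.writerow(csv_row)  # runs once per row: only the complete row is written
    return out.getvalue()


def read_text(path):
    with open(path, "rt", newline="") as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "try_finally.csv")
path_b = os.path.join(tmp_dir, "with.csv")
write_try_finally(path_a, sample_table)
write_with(path_b, sample_table)

same_file_contents = read_text(path_a) == read_text(path_b)
buggy_text = render(sample_table, extra_indent=True)
fixed_text = render(sample_table, extra_indent=False)
print(same_file_contents)
print(repr(buggy_text))
print(repr(fixed_text))
```

Both writer functions produce identical files, so the choice between them is one of style and safety rather than output. With the extra indent, every row is emitted repeatedly as it grows, one line per cell.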