Created
November 20, 2018 10:38
Python: get a particular piece of text from a web page
# Requires BeautifulSoup4: pip install beautifulsoup4
# Run with: python getLastRevised.py
# Example:
#   ~$ python getLastRevised.py
#   URL: 'http://iq.ul.com/ul/cert.aspx?ULID=234547' Last Revised: 2017-05-11
from urllib.request import urlopen  # Python 3 replacement for the Python 2-only urllib2

from bs4 import BeautifulSoup

url = "http://iq.ul.com/ul/cert.aspx?ULID=234547"
content = urlopen(url)
soup = BeautifulSoup(content, "html.parser")
# The "Last Revised" date is the text of the last <td> inside the element with id="FooterTable".
ttag = soup.find_all(id="FooterTable")
tds = ttag[0].find_all('td')
date = tds[-1].get_text()
print("URL: '{}' Last Revised: {}".format(url, date))
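The script depends on a live page and on BeautifulSoup4, which makes the extraction rule ("text of the last td inside the element with id FooterTable") hard to check offline. The following is a minimal sketch of the same rule using only the standard library's `html.parser`, which is a swapped-in technique and not the script's own method. The names `LastCellParser` and `last_cell_text` are hypothetical, and the sample markup is invented for illustration and is not the real page's HTML. It assumes reasonably well-formed markup in which every non-void tag is closed.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class LastCellParser(HTMLParser):
    """Collect the text of every <td> inside the element with a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # nesting depth inside the target element (0 = outside)
        self.in_td = False  # True while inside a <td> of the target element
        self.cells = []     # text of each <td> seen so far

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag == "td":
                self.in_td = True
                self.cells.append("")
        elif dict(attrs).get("id") == self.table_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td:
            self.cells[-1] += data


def last_cell_text(html, table_id="FooterTable"):
    """Return the stripped text of the last <td>, or None if none was found."""
    parser = LastCellParser(table_id)
    parser.feed(html)
    parser.close()
    return parser.cells[-1].strip() if parser.cells else None


# Invented sample markup (an assumption, not the real page's structure).
sample = """
<table id="Other"><tr><td>ignored</td></tr></table>
<table id="FooterTable">
  <tr><td>Copyright</td><td>Last Revised</td><td> 2017-05-11 </td></tr>
</table>
"""
print(last_cell_text(sample))
```

Returning `None` when no matching cell exists avoids the `IndexError` that `ttag[0]` or `tds[-1]` would raise in the script if the page layout changed.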