@multivac61
Created September 30, 2020 16:15
Revisions

  1. multivac61 created this gist Sep 30, 2020.
    23 changes: 23 additions & 0 deletions ruv_scrape.py
    @@ -0,0 +1,23 @@
    from bs4 import BeautifulSoup
    from requests_html import HTMLSession
    import re

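    url = ''  # Paste the URL of a ruv.is radio player page here before running
    assert 'https://www.ruv.is/utvarp/spila' in url, 'url must point to a ruv.is/utvarp/spila page'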

    with HTMLSession() as s:
        resp = s.get(url).html
        resp.render()  # Run the page's JavaScript so the player markup is present
        # Use the episode title from the rendered page as the output filename
        mp3_filename = BeautifulSoup(resp.html, 'html.parser').find('h2', {'class': 't-h2 title ls-100'}).string + '.mp3'
        # Take the first MP3 URL on the CDN host; the dots are escaped so they match literally
        mp3_link, *_ = re.findall(r'https://ruv-rod\.secure\.footprint\.net.*?mp3', resp.html)

        download = s.get(mp3_link)

        if download.status_code == 200:
            print(f"Downloading link {mp3_link} into {mp3_filename}")
            with open(mp3_filename, 'wb') as f:
                f.write(download.content)
        else:
            print(f"Download Failed For File {mp3_link}")
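
The script above unpacks the first regex match directly and uses the page title verbatim as a filename, so it raises if no MP3 link is found and can fail if the title contains characters such as `/` or `:`. A minimal sketch of how those two steps might be made more defensive, using only the standard library. The helper names `extract_mp3_link` and `safe_filename`, the `episode` fallback name, and the tightened URL pattern are assumptions for illustration, not part of the original gist:

```python
import re
from typing import Optional

# Same CDN host as in ruv_scrape.py. The character class stops the match at
# quotes, angle brackets, and whitespace so it cannot run across HTML attributes.
MP3_PATTERN = re.compile(r'https://ruv-rod\.secure\.footprint\.net[^\s"<>]*?\.mp3')


def extract_mp3_link(html: str) -> Optional[str]:
    """Return the first matching MP3 URL in the page source, or None if absent."""
    match = MP3_PATTERN.search(html)
    return match.group(0) if match else None


def safe_filename(title: Optional[str], default: str = 'episode') -> str:
    """Turn a page title into an .mp3 filename usable on common filesystems."""
    # Replace characters that are reserved on Windows or POSIX, plus control characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', '_', title or '').strip(' .')
    # Fall back to a default name when the title is missing or empty.
    return (cleaned or default) + '.mp3'
```

In the script, `extract_mp3_link(resp.html)` could replace the `re.findall` unpacking (with a check for `None` before downloading), and `safe_filename(...)` could wrap the `.string` value from the `h2` lookup, which is `None` when the element has nested tags.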