@undefinedzain
Created May 17, 2017 23:26

Revisions

  1. undefinedzain created this gist May 17, 2017.
    32 changes: 32 additions & 0 deletions synonim_scraping.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,32 @@
    import requests
    from bs4 import BeautifulSoup
    import sys

    if len(sys.argv) == 1:
        print('Please input at least one word, e.g. python synonim_scraping.py bisa')
    else:
        kata = sys.argv[1]  # the word to look up

        headers = {'User-Agent': 'Mozilla/5.0'}
        payload = {'q': kata}

        session = requests.Session()
        resp = session.post('http://www.persamaankata.com/search.php', headers=headers, data=payload)

        html_element = BeautifulSoup(resp.content, 'lxml')

        if len(html_element.find_all('ul')) > 0:

            all_ul = html_element.find_all('ul')[0]  # Synonyms only; ul[1] holds the antonyms

            all_ul.find_all('div', {'class': 'word_thesaurus'})  # result is not used

            a = all_ul.find_all('a')  # each link in the list is one synonym

            synonim_array = []  # not used yet; the words are only printed

            for sinonim in a:
                print(sinonim.text)

        else:
            print('Please input a valid word')
    5 changes: 5 additions & 0 deletions usage.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,5 @@
    # Usage: run one of the commands below

    python synonim_scraping.py baik
    OR
    python synonim_scraping.py tampan
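
The extraction step in synonim_scraping.py (take the first `ul` and read the text of every `a` inside it) can be tried without hitting the site. The sketch below is an assumption-laden stand-in, not the gist's method: it swaps BeautifulSoup for the standard library's `html.parser` so it runs with no extra packages, and the names `FirstListLinks`, `extract_synonyms`, and the `sample` markup are hypothetical, made up for illustration rather than copied from persamaankata.com.

```python
from html.parser import HTMLParser


class FirstListLinks(HTMLParser):
    """Collect the text of every <a> inside the first <ul> of a page."""

    def __init__(self):
        super().__init__()
        self.ul_depth = 0     # > 0 while inside the first <ul> (nested lists count)
        self.done = False     # True once the first <ul> has been closed
        self.in_link = False  # True while inside an <a> within that list
        self.words = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == 'ul':
            self.ul_depth += 1
        elif tag == 'a' and self.ul_depth > 0:
            self.in_link = True
            self.words.append('')

    def handle_endtag(self, tag):
        if self.done:
            return
        if tag == 'ul' and self.ul_depth > 0:
            self.ul_depth -= 1
            if self.ul_depth == 0:
                self.done = True
        elif tag == 'a':
            self.in_link = False

    def handle_data(self, data):
        if self.in_link and not self.done:
            self.words[-1] += data


def extract_synonyms(html):
    """Return the link texts found in the first <ul>, or [] if there is none."""
    parser = FirstListLinks()
    parser.feed(html)
    return [w.strip() for w in parser.words if w.strip()]


# Hypothetical markup: the first list stands in for synonyms, the second for antonyms.
sample = (
    '<ul><li><a href="#">bagus</a></li><li><a href="#">elok</a></li></ul>'
    '<ul><li><a href="#">buruk</a></li></ul>'
)
print(extract_synonyms(sample))
```

Returning a list instead of printing inside the loop would also give the unused `synonim_array` in the script a purpose, since the caller can then decide how to display or store the words.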