@joswr1ght
Created March 13, 2024 17:25

Revisions

  1. joswr1ght created this gist Mar 13, 2024.
    30 changes: 30 additions & 0 deletions nltk-patch.py
    @@ -0,0 +1,30 @@
    # NLTK makes the assumption that users are online when importing the library.
    # This is partly to automate the download of corpus files and other assets,
    # but if those files already exist then offline mode is problematic. `import nltk`
    # will still work, but it takes a while to time out, producing errors:
    #
    # [nltk_data] Error loading averaged_perceptron_tagger: <urlopen error
    # [nltk_data] [Errno -3] Temporary failure in name resolution>
    # [nltk_data] Error loading punkt: <urlopen error [Errno -3] Temporary
    # [nltk_data] failure in name resolution>
    # [nltk_data] Error loading stopwords: <urlopen error [Errno -3]
    # [nltk_data] Temporary failure in name resolution>
    #
    # To fix this, you must supply the NLTK files in an offline mode (e.g., download
    # them once and supply them in the appropriate directory, such as `/root/nltk_data`).
    # Then, in your code, override the `nltk.download` method with one that returns True.
    import nltk

    # Define a local function that returns True
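    def mock_nltk_download(*args, **kwargs):
        # Fall back to a placeholder if no resource name was passed positionally
        resource = args[0] if args else "(unspecified)"
        print(f"Skipping NLTK download for: {resource}")
        return True  # Return True to mimic successful download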

    # Patch the nltk.download function to use the locally defined function
    nltk.download = mock_nltk_download

    # Need to patch nltk.download before importing ChatBot to avoid online NLTK
    # file check, so import it here.
    from chatterbot import ChatBot

    # ...
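
The patch above only helps if the data files are already on disk, so it can be worth checking for them before relying on it. The following is a minimal sketch, not part of the original gist: `RESOURCE_DIRS` and `find_missing_resources` are hypothetical names, and the mapping of each resource to its `nltk_data` subdirectory is an assumption based on the usual NLTK data layout for the three resources named in the error messages. It uses only the standard library and treats either an unpacked directory or a `.zip` archive as present.

```python
from pathlib import Path

# Assumed mapping of resource name -> category subdirectory inside nltk_data.
# These are the three resources named in the error messages above.
RESOURCE_DIRS = {
    "punkt": "tokenizers",
    "stopwords": "corpora",
    "averaged_perceptron_tagger": "taggers",
}


def find_missing_resources(data_dir, resources=RESOURCE_DIRS):
    """Return the sorted names of resources not found under data_dir.

    A resource counts as present if either its unpacked directory or its
    .zip archive exists, e.g. corpora/stopwords or corpora/stopwords.zip.
    """
    root = Path(data_dir)
    missing = []
    for name, category in resources.items():
        unpacked = root / category / name
        archive = root / category / f"{name}.zip"
        if not (unpacked.exists() or archive.exists()):
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    # /root/nltk_data is the example directory from the comments above.
    missing = find_missing_resources("/root/nltk_data")
    if missing:
        print(f"Missing NLTK resources: {', '.join(missing)}")
    else:
        print("All listed NLTK resources are present.")
```

If the check reports missing resources, the mocked `nltk.download` will still return True, but later lookups of those resources are likely to fail, so copy the files into place first.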