Created August 7, 2016 12:21
Convert Dynalist flavored OPML to Org Mode
#!/usr/bin/env python
import sys

import bs4


def convert_element(lines, level=1):
    """Recursively render <outline> elements as Org Mode headlines."""
    result = ''
    for line in lines:
        # Skip text nodes and every tag that is not an <outline>.
        if not isinstance(line, bs4.element.Tag) or \
           line.name != 'outline':
            continue
        # One '*' per nesting level, followed by the headline text.
        result += '*' * level + ' ' + line.attrs.get('text', '') + '\n'
        if 'note' in line.attrs:
            result += line.attrs['note'].replace('\r', '\n') + '\n'
        result += convert_element(line.children, level + 1)
    return result


def convert_file(path):
    # Close the input file as soon as it has been parsed.
    with open(path) as f:
        root = bs4.BeautifulSoup(f, "lxml")
    return convert_element(root.select('html body opml')[0])


def main():
    # 'foo.opml' -> 'foo.org' (drops the last four characters, keeps the dot).
    output_path = sys.argv[1][:-4] + 'org'
    result = convert_file(sys.argv[1])
    with open(output_path, 'w') as f:
        f.write(result)


if __name__ == '__main__':
    main()
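To see what the conversion produces without installing `bs4` or `lxml`, here is a minimal sketch of the same recursion built on the standard library's `xml.etree.ElementTree` instead. It is not the script above, only an illustration of the idea. The names `SAMPLE_OPML`, `outline_to_org` and `opml_to_org` are made up for this example, and the sample document assumes the note is stored in a `note` attribute, as the script above expects; a real Dynalist export may look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; real Dynalist exports may use other attributes.
SAMPLE_OPML = """
<opml version="2.0">
  <head><title>Demo</title></head>
  <body>
    <outline text="Project" note="Top-level note">
      <outline text="Task A"/>
      <outline text="Task B"/>
    </outline>
  </body>
</opml>
"""


def outline_to_org(element, level=1):
    """Render the direct <outline> children of element as Org headlines."""
    result = ''
    for child in element.findall('outline'):
        # One '*' per nesting level, followed by the headline text.
        result += '*' * level + ' ' + child.get('text', '') + '\n'
        note = child.get('note')
        if note is not None:
            result += note.replace('\r', '\n') + '\n'
        # Nested outlines become deeper headlines.
        result += outline_to_org(child, level + 1)
    return result


def opml_to_org(opml_text):
    """Convert an OPML document given as a string to Org Mode text."""
    root = ET.fromstring(opml_text)
    body = root.find('body')
    return '' if body is None else outline_to_org(body)


if __name__ == '__main__':
    print(opml_to_org(SAMPLE_OPML), end='')
```

For the sample document this prints a `* Project` headline, its note on the following line, and two `**` child headlines.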
I have extended this script to convert inline Markdown formatting (links, bold text, and so on) to Org Mode format.
The conversion is done by `pandoc` via `pypandoc` (you have to install both, see here).

Warning: This script is extremely slow and takes ages to finish if you have a decent amount of content in your Dynalist files. The reason is that `pypandoc` spawns a new process for every invocation (i.e. it is not a binding via a C API or similar), and we have to call `pypandoc` for every headline and for every note (I tried several approaches to avoid this, but there was no easy way).

I have used the following command with GNU `parallel` to run several instances of the script in parallel:

    find . -name "*.opml" | parallel --bar --eta python convert.py {}

I'm using Linux, so I don't know if these dependencies are available on macOS or Windows.
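If GNU `parallel` is not available, roughly the same fan-out can be sketched in Python itself with the standard library's `concurrent.futures`. This is only an illustration under stated assumptions: `convert_all` is a made-up helper, and `convert_one` stands for whatever per-file function you use (it is not part of the script). Threads should be enough here, because each worker spends most of its time waiting for an external `pandoc` process.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def convert_all(root_dir, convert_one, max_workers=4):
    """Apply convert_one to every *.opml file below root_dir, in parallel.

    convert_one is a hypothetical callable that takes a pathlib.Path and
    converts that single file. Returns a dict mapping each path to the
    value convert_one returned for it.
    """
    # Same file selection as: find . -name "*.opml"
    paths = sorted(Path(root_dir).rglob('*.opml'))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the input order and re-raises a worker's exception
        # when its result is consumed.
        results = list(pool.map(convert_one, paths))
    return dict(zip(paths, results))
```

A call such as `convert_all('.', my_convert_function)` would then take the place of the `find ... | parallel ...` pipeline, with `max_workers` playing the role of the number of parallel jobs.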
Here is the altered script: