@SHDShim
Last active November 13, 2024 08:16
Pandoc - LaTeX to WORD conversion

LaTeX to WORD conversion

Pandoc can be installed in Anaconda with

conda install pandoc

or with Homebrew (brew). For brew, switch to the brew environment first.

swt_brew # custom command in .bash_profile

Then go to the folder that contains the .tex file to convert and run the following command:

pandoc -s FeSH-JGR.tex -F pandoc-crossref  --citeproc -f latex -o FeSH-JGR.docx --bibliography=FeSH.bib

For me, the command works only with the brew version of pandoc. Change the file names as appropriate.

Note that the order of pandoc-crossref and citeproc should not change: the pandoc-crossref filter comes before --citeproc. The command may need to be run multiple times (unconfirmed).
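To avoid retyping the file names, the command can be wrapped in a small shell function. This is only a sketch: the function name tex2docx and the PANDOC override variable are my own inventions, not part of pandoc. The output name is derived from the .tex name.

```shell
# Sketch: wrap the LaTeX-to-docx command so only the file names change.
# tex2docx and PANDOC are hypothetical names chosen for this example.
tex2docx() {
    local tex="$1" bib="$2"
    # The pandoc-crossref filter stays before --citeproc, as in the command above.
    ${PANDOC:-pandoc} -s "$tex" -F pandoc-crossref --citeproc -f latex \
        -o "${tex%.tex}.docx" --bibliography="$bib"
}
```

Usage: tex2docx FeSH-JGR.tex FeSH.bib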

pandoc handles bibunits and \input{}. Note that line-number referencing and citing do not work with pandoc.

Remove TrackChanges

If the file contains TrackChanges markup, the commands can be removed with:

python ../acceptchanges.py -c -n --infile=0-main.tex --outfile=0-main-no-track.tex

Note that acceptchanges.py works only with Python 2.7. It can be found under the TrackChanges folder, or it comes with the LaTeX package.

acceptchanges.py does not automatically process \input{} or \include{} files, so every associated file has to be run through acceptchanges.py individually.

For the command above to work, copy the entire LaTeX folder into the /trackchanges-0.7.0/PythonPackage/ folder.
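Because each \input{} or \include{} file has to be processed separately, a loop can save some typing. The sketch below is an assumption about how one might do it, not something shipped with TrackChanges: the function name accept_all and the ACCEPT_CMD variable are hypothetical. It writes the cleaned files to a separate folder under their original names, so \input{} paths in the main file stay valid there.

```shell
# Sketch: run acceptchanges.py on every .tex file in a folder, one at a time.
# ACCEPT_CMD is a hypothetical variable; point it at your Python 2.7 and script path.
ACCEPT_CMD="${ACCEPT_CMD:-python ../acceptchanges.py}"

accept_all() {
    local src="$1" out="$2" f
    mkdir -p "$out"
    for f in "$src"/*.tex; do
        [ -e "$f" ] || continue    # folder has no .tex files
        # Same flags as the single-file command above; file name is kept.
        $ACCEPT_CMD -c -n --infile="$f" --outfile="$out/$(basename "$f")"
    done
}
```

Usage: accept_all . no-track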

Remove comments and organize for submission

In the base environment, I installed arxiv_latex_cleaner (https://github.com/google-research/arxiv-latex-cleaner).

arxiv_latex_cleaner Hydrogen_FeS_paper/ --im_size 500

This creates an organized package in a new folder, Hydrogen_FeS_paper_arXiv.

arxiv_latex_cleaner seems to know how to deal with \include{} and \input{}. It also deals well with bibunits.

Word to LaTeX conversion

pandoc -t latex -f docx in.docx -o out.tex

LaTeX to html conversion

pandoc text-main.tex -f latex -t html -s -o text-main.html --mathjax --bibliography=B30.bib --citeproc

For a better HTML result:

  • convert all PDF figures to PNG figures
  • avoid \multicolumn, which is not compatible with pandoc
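The figure conversion can be scripted. The sketch below assumes pdftoppm (from poppler) is installed. The function names, the PDFTOPPM override variable, and the 300 dpi resolution are my own choices for illustration. The second function rewrites \includegraphics references that spell out the .pdf extension.

```shell
# Sketch: convert every PDF figure in a folder to PNG with pdftoppm (poppler).
# Function names and PDFTOPPM are hypothetical; 300 dpi is an arbitrary choice.
pdf_figures_to_png() {
    local dir="${1:-.}" f
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue    # folder has no PDF files
        # -singlefile writes <prefix>.png from the first page only.
        ${PDFTOPPM:-pdftoppm} -png -singlefile -r 300 "$f" "${f%.pdf}"
    done
}

# Rewrite \includegraphics{...pdf} references to .png (stdin to stdout).
pdf_refs_to_png() {
    sed 's/\(\\includegraphics[^}]*\)\.pdf}/\1.png}/g'
}
```

Usage: pdf_figures_to_png figs, then pdf_refs_to_png < text-main.tex > text-main-png.tex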