@SHDShim
Last active November 13, 2024 08:16
Pandoc - LaTeX to WORD conversion

LaTeX to WORD conversion

Pandoc can be installed in Anaconda with

conda install pandoc

or with Homebrew:

brew install pandoc

I strongly recommend using the brew version. For brew, switch to the brew environment first.

swt_brew # custom command in .bash_profile

Then go to the folder where you have .tex files to convert. Run the following command:

pandoc  --citeproc -f latex -s FeSH-JGR.tex -o FeSH-JGR.docx --bibliography=FeSH.bib

Change the file names appropriately.

  • For me, the command works only with the brew version of pandoc.
  • Note that the order of pandoc-crossref and citeproc should not be changed.
  • You may need to run the command multiple times (unconfirmed).
  • Pandoc knows how to deal with bibunits and \input{}.
  • Line-number referencing and citing do not work with pandoc.
  • Do not use -F pandoc-crossref and --citeproc at the same time; doing so repeats the figure and table caption titles.
  • Renumbering of figures and tables (for example, S1 for supplementary content) is ignored entirely, so be careful if this matters.

Remove TrackChanges

If you have track-change commands from the TrackChanges package (not the changes package), they can be removed with:

python ../acceptchanges.py -c -n --infile=0-main.tex --outfile=0-main-no-track.tex
  • acceptchanges.py works only with Python 2.7.
  • acceptchanges.py can be found in the TrackChanges folder; it also comes with the LaTeX package.
  • acceptchanges.py does not automatically handle \input{} or \include{} files, so I have to run it on every associated file individually.
  • For the command above to work, copy the entire LaTeX folder into the /trackchanges-0.7.0/PythonPackage/ folder.
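Because every \input{} and \include{} file has to be processed separately, it can help to list those files first and generate one acceptchanges.py command per file. The sketch below is an assumption-laden helper, not part of TrackChanges: the function names are hypothetical, the regex only catches plain `\input{...}` and `\include{...}` calls, and the `-no-track` output suffix simply mirrors the example above.

```python
import re
from pathlib import Path

# Matches \input{name} and \include{name}. This does not match
# \includegraphics{...} because "{" must follow the command name directly.
_INCLUDE_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def find_included_files(tex_source):
    """Return the .tex files referenced by \\input or \\include, in order."""
    names = []
    for line in tex_source.splitlines():
        # Keep only the text before the first unescaped %, so that
        # commented-out includes are skipped.
        code = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for name in _INCLUDE_RE.findall(code):
            name = name.strip()
            if not name.endswith(".tex"):
                name += ".tex"
            names.append(name)
    return names


def acceptchanges_commands(tex_files, script="../acceptchanges.py"):
    """Build one acceptchanges.py command per file (flags as in the text)."""
    commands = []
    for name in tex_files:
        path = Path(name)
        out = path.with_name(path.stem + "-no-track" + path.suffix)
        commands.append([
            "python",  # must resolve to Python 2.7 for acceptchanges.py
            script, "-c", "-n",
            f"--infile={path}", f"--outfile={out}",
        ])
    return commands
```

Note that writing included files under new `-no-track` names means the \input{} and \include{} references in the main file would have to be updated to match; whether to rename or overwrite is left open here.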

Remove comments and organize files for submission

In the base Anaconda environment, I installed arxiv_latex_cleaner (https://github.com/google-research/arxiv-latex-cleaner).

arxiv_latex_cleaner Hydrogen_FeS_paper/ --im_size 500
  • This creates an organized package in a new folder, Hydrogen_FeS_paper_arXiv.
  • arxiv_latex_cleaner seems to know how to deal with \include{} and \input{}. It also deals well with bibunits.

Word to LaTeX conversion

pandoc -t latex -f docx in.docx -o out.tex

LaTeX to html conversion

pandoc -F pandoc-crossref  --citeproc --mathjax -f latex -t html -s FeSH-JGR.tex -o FeSH-JGR.html --bibliography=FeSH.bib

For a better HTML result:

  • Convert all PDF figures to PNG.
  • \multicolumn is not compatible with pandoc.
  • Figure labels are handled properly, but table labels are not. I tried everything, but so far the command above is the best. Removing either -F pandoc-crossref or --citeproc from it messes up the citations.
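The PDF-to-PNG step can be scripted. The notes above do not say which converter was used, so the sketch below assumes `pdftoppm` from poppler-utils, which is only one possible choice; the function name `pdf_to_png_commands` and the 300 dpi default are also assumptions.

```python
from pathlib import Path


def pdf_to_png_commands(pdf_files, dpi=300):
    """Build one pdftoppm command per PDF figure.

    pdftoppm appends ".png" to the output prefix itself, so the prefix
    is the PDF path without its extension. With -singlefile only the
    first page is converted, which suits single-page figures.
    """
    commands = []
    for name in pdf_files:
        pdf = Path(name)
        commands.append([
            "pdftoppm", "-png", "-singlefile", "-r", str(dpi),
            str(pdf), str(pdf.with_suffix("")),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(..., check=True)`. The figure references in the .tex file still need to point at the .png files afterwards.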