@kpym
Last active February 1, 2021 18:09

Revisions

  1. kpym revised this gist Feb 1, 2021. 2 changed files with 5 additions and 12 deletions.
    6 changes: 3 additions & 3 deletions README.md
    Original file line number Diff line number Diff line change
    @@ -8,9 +8,9 @@ This script needs the following tools :

    ```bash
    > ./pdf2bw.sh
    pdf2bw.sh file.pdf [options]
    Usage: pdf2bw.sh file.pdf [options]
    Result: file_bw.pdf
    Options:
    -r, --resolution <int> : the resolution in dpi
    -t, --thershold <0-100> : the black/white thershold in percent
    -r, --resolution <int> : the resolution in dpi [default 200]
    -t, --thershold <0-100> : the black/white thershold in percent [default 77]
    ```
    11 changes: 2 additions & 9 deletions pdf2bw.sh
    @@ -33,18 +33,11 @@ if [ -z "$base" ]; then
    echo "Usage: $0 file.pdf [options]"
    echo "Result: file_bw.pdf"
    echo "Options:"
    echo " -r, --resolution <int> : the resolution in dpi"
    echo " -t, --thershold <0-100> : the black/white thershold in percent"
    echo " -r, --resolution <int> : the resolution in dpi [default 200]"
    echo " -t, --thershold <0-100> : the black/white thershold in percent [default 77]"
    exit 1
    fi

    # Check for mutool
    if ! command -v mutool &> /dev/null
    then
    echo "Error: can't find mutool (MuPDF)"
    exit 1
    fi

    # Check for ImageMagisk = magick on Windows, convert on Mac/Linux
    if command -v magick &> /dev/null
    then
  2. kpym revised this gist Feb 1, 2021. 1 changed file with 9 additions and 2 deletions.
    11 changes: 9 additions & 2 deletions pdf2bw.sh
    @@ -4,6 +4,8 @@
    # Convert PDF to black and white CCITT Groupe 4 using ImageMagisk
    # and add blure filter with bbe or sed (if available).

    # ============== SET PARAMETERS =====================================

    # convert arguments to variables
    while [ -n "$1" ]; do
    key="$1"
    @@ -26,7 +28,7 @@ shift # past argument
    esac
    done

    # Get the pdf base
    # Print the usage message (if needed)
    if [ -z "$base" ]; then
    echo "Usage: $0 file.pdf [options]"
    echo "Result: file_bw.pdf"
    @@ -59,7 +61,6 @@ else
    fi
    fi


    # Set the output resolution in dpi
    if [ -z "$resolution" ]; then
    resolution=200
    @@ -72,10 +73,14 @@ if [ -z "$thershold" ]; then
    fi
    echo "Thersnold set to ${thershold}%."

    # ===================== CONVERT =====================================

    # Do the actual conversion
    echo "Converting ${base}.pdf to ccitt black and white pdf."
    $im -density $resolution "${base}.pdf" -threshold "${thershold}%" -type bilevel -compress Group4 "${base}_temp.pdf"

    # ==================== ADD BLUR =====================================

    # Use bbe or sed (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null; then
    echo "Add interpolation (blur) using bbe : ${base}_temp.pdf -> ${base}_bw.pdf"
    @@ -89,6 +94,8 @@ else
    mv "${base}_temp.pdf" "${base}_bw.pdf"
    fi

    # ====================== CLEAR ======================================

    # Clear temporary files
    if [ -f "${base}_temp.pdf" ]; then
    echo delete temp pdf
  3. kpym revised this gist Feb 1, 2021. 2 changed files with 2 additions and 2 deletions.
    2 changes: 1 addition & 1 deletion README.md
    @@ -11,6 +11,6 @@ This script needs the following tools :
    pdf2bw.sh file.pdf [options]
    Result: file_bw.pdf
    Options:
    -r, --resolution] <int> : the resolution in dpi
    -r, --resolution <int> : the resolution in dpi
    -t, --thershold <0-100> : the black/white thershold in percent
    ```
    2 changes: 1 addition & 1 deletion pdf2bw.sh
    @@ -31,7 +31,7 @@ if [ -z "$base" ]; then
    echo "Usage: $0 file.pdf [options]"
    echo "Result: file_bw.pdf"
    echo "Options:"
    echo " -r, --resolution] <int> : the resolution in dpi"
    echo " -r, --resolution <int> : the resolution in dpi"
    echo " -t, --thershold <0-100> : the black/white thershold in percent"
    exit 1
    fi
  4. kpym revised this gist Feb 1, 2021. 2 changed files with 16 additions and 27 deletions.
    3 changes: 1 addition & 2 deletions README.md
    @@ -1,9 +1,8 @@
    # Description

    This script needs the following tools :
    - `mutool` (from [MuPDF](https://www.mupdf.com/)) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - [ImageMagick](https://imagemagick.org), `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white and compress them with `CCITT Groupe 4` compression.
    - Use [bbe](https://sourceforge.net/projects/bbe-/) or [sed](https://www.gnu.org/software/sed) (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").
    - Use [bbe](https://sourceforge.net/projects/bbe-/) or [sed](https://www.gnu.org/software/sed) (if present) to add interpolation option to all CCITT streams. This makes the pages more pleasing to the eye (less "pixelated").

    # Usage

    40 changes: 15 additions & 25 deletions pdf2bw.sh
    @@ -1,8 +1,8 @@
    #!/bin/sh

    # https://gist.github.com/kpym/dc74bdc78a192add4ca4b3853fb3473d

    # Convert PDF to black and white CCITT Groupe 4 one. Use MuPDF, ImageMagisk and bbe (if available).
    # Convert PDF to black and white CCITT Groupe 4 using ImageMagisk
    # and add blure filter with bbe or sed (if available).

    # convert arguments to variables
    while [ -n "$1" ]; do
    @@ -72,35 +72,25 @@ if [ -z "$thershold" ]; then
    fi
    echo "Thersnold set to ${thershold}%."

    # Convert the pdf pages to set of png images
    echo "extract pages: ${base}.pdf to temp_pageX.png..."
    mutool draw -o "temp_page%03d.png" -r$resolution "${base}.pdf"

    # Compress all pages using CCITT Groupe 4
    for f in temp_page*.png; do
    printf "convert: $f -> ${f%.*}.pdf..."
    $im "$f" -threshold "${thershold}%" -type bilevel -compress Group4 "${f%.*}.pdf"
    echo "done"
    done

    # Collect all pages in a single pdf
    echo "join pages: temp_pageX.pdf -> temp_page_all.pdf"
    mutool merge -o "temp_page_all.pdf" temp_page*.pdf
    # Do the actual conversion
    echo "Converting ${base}.pdf to ccitt black and white pdf."
    $im -density $resolution "${base}.pdf" -threshold "${thershold}%" -type bilevel -compress Group4 "${base}_temp.pdf"

    # Use bbe or sed (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null; then
    echo "Add interpolation (blur) using bbe : temp_page_all.pdf -> ${base}_bw.pdf"
    bbe -b "/BitsPerComponent 1/:18" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_bw.pdf"
    echo "Add interpolation (blur) using bbe : ${base}_temp.pdf -> ${base}_bw.pdf"
    bbe -b "/BitsPerComponent 1/:18" -e "A /Interpolate\32true" "${base}_temp.pdf" -o "${base}_bw.pdf"
    elif command -v sed &> /dev/null; then
    echo "Add interpolation (blur) using sed : temp_page_all.pdf -> ${base}_bw.pdf"
    sed -b -e 's;/BitsPerComponent 1;/BitsPerComponent 1/Interpolate true;' temp_page_all.pdf > "${base}_bw.pdf"
    echo "Add interpolation (blur) using sed : ${base}_temp.pdf -> ${base}_bw.pdf"
    sed -b -e 's;/BitsPerComponent 1;/BitsPerComponent 1/Interpolate true;' "${base}_temp.pdf" > "${base}_bw.pdf"
    else
    echo "Can't find bbe nor sed. The images will look very sharp."
    echo "Move: temp_page_all.pdf -> ${base}_bw.pdf"
    mv temp_page_all.pdf "${base}_bw.pdf"
    echo "Move: ${base}_temp.pdf -> ${base}_bw.pdf"
    mv "${base}_temp.pdf" "${base}_bw.pdf"
    fi

    # Clear temporary files
    echo delete page files
    rm temp_page*.png
    rm temp_page*.pdf
    if [ -f "${base}_temp.pdf" ]; then
    echo delete temp pdf
    rm "${base}_temp.pdf"
    fi
  5. kpym revised this gist Feb 1, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion pdf2bw.sh
    @@ -87,7 +87,7 @@ done
    echo "join pages: temp_pageX.pdf -> temp_page_all.pdf"
    mutool merge -o "temp_page_all.pdf" temp_page*.pdf

    # Use bbe (if present) to add interpolation option to CCITT strames
    # Use bbe or sed (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null; then
    echo "Add interpolation (blur) using bbe : temp_page_all.pdf -> ${base}_bw.pdf"
    bbe -b "/BitsPerComponent 1/:18" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_bw.pdf"
  6. kpym revised this gist Feb 1, 2021. 2 changed files with 8 additions and 8 deletions.
    4 changes: 1 addition & 3 deletions README.md
    @@ -3,9 +3,7 @@
    This script needs the following tools :
    - `mutool` (from [MuPDF](https://www.mupdf.com/)) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - [ImageMagick](https://imagemagick.org), `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white and compress them with `CCITT Groupe 4` compression.
    - Use [bbe](https://sourceforge.net/projects/bbe-/) (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").

    Remark : probably `sed` can be used instead of `bbe` (I didn't check).
    - Use [bbe](https://sourceforge.net/projects/bbe-/) or [sed](https://www.gnu.org/software/sed) (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").

    # Usage

    12 changes: 7 additions & 5 deletions pdf2bw.sh
    @@ -88,12 +88,14 @@ echo "join pages: temp_pageX.pdf -> temp_page_all.pdf"
    mutool merge -o "temp_page_all.pdf" temp_page*.pdf

    # Use bbe (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null
    then
    echo "Add interpolation (blur) : temp_page_all.pdf -> ${base}_bw.pdf"
    bbe -b ":/BitsPerComponent 1/" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_bw.pdf"
    if command -v bbe &> /dev/null; then
    echo "Add interpolation (blur) using bbe : temp_page_all.pdf -> ${base}_bw.pdf"
    bbe -b "/BitsPerComponent 1/:18" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_bw.pdf"
    elif command -v sed &> /dev/null; then
    echo "Add interpolation (blur) using sed : temp_page_all.pdf -> ${base}_bw.pdf"
    sed -b -e 's;/BitsPerComponent 1;/BitsPerComponent 1/Interpolate true;' temp_page_all.pdf > "${base}_bw.pdf"
    else
    echo "Can't find bbe command. The images will look very sharp."
    echo "Can't find bbe nor sed. The images will look very sharp."
    echo "Move: temp_page_all.pdf -> ${base}_bw.pdf"
    mv temp_page_all.pdf "${base}_bw.pdf"
    fi
  7. kpym revised this gist Feb 1, 2021. 2 changed files with 6 additions and 6 deletions.
    2 changes: 1 addition & 1 deletion README.md
    @@ -12,7 +12,7 @@ Remark : probably `sed` can be used instead of `bbe` (I didn't check).
    ```bash
    > ./pdf2bw.sh
    pdf2bw.sh file.pdf [options]
    Result: file_xs.pdf
    Result: file_bw.pdf
    Options:
    -r, --resolution] <int> : the resolution in dpi
    -t, --thershold <0-100> : the black/white thershold in percent
    10 changes: 5 additions & 5 deletions pdf2bw.sh
    @@ -29,7 +29,7 @@ done
    # Get the pdf base
    if [ -z "$base" ]; then
    echo "Usage: $0 file.pdf [options]"
    echo "Result: file_xs.pdf"
    echo "Result: file_bw.pdf"
    echo "Options:"
    echo " -r, --resolution] <int> : the resolution in dpi"
    echo " -t, --thershold <0-100> : the black/white thershold in percent"
    @@ -90,12 +90,12 @@ mutool merge -o "temp_page_all.pdf" temp_page*.pdf
    # Use bbe (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null
    then
    echo "Add interpolation (blur) : temp_page_all.pdf -> ${base}_xs.pdf"
    bbe -b ":/BitsPerComponent 1/" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_xs.pdf"
    echo "Add interpolation (blur) : temp_page_all.pdf -> ${base}_bw.pdf"
    bbe -b ":/BitsPerComponent 1/" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_bw.pdf"
    else
    echo "Can't find bbe command. The images will look very sharp."
    echo "Move: temp_page_all.pdf -> ${base}_xs.pdf"
    mv temp_page_all.pdf "${base}_xs.pdf"
    echo "Move: temp_page_all.pdf -> ${base}_bw.pdf"
    mv temp_page_all.pdf "${base}_bw.pdf"
    fi

    # Clear temporary files
  8. kpym revised this gist Feb 1, 2021. 1 changed file with 5 additions and 3 deletions.
    8 changes: 5 additions & 3 deletions README.md
    @@ -1,9 +1,11 @@
    # Description

    This script needs the following tools :
    - `mutool` (from MuPDF) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - ImageMagick, `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white and compress them with `CCITT Groupe 4` compression.
    - Use `bbe` (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").
    - `mutool` (from [MuPDF](https://www.mupdf.com/)) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - [ImageMagick](https://imagemagick.org), `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white and compress them with `CCITT Groupe 4` compression.
    - Use [bbe](https://sourceforge.net/projects/bbe-/) (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").

    Remark : probably `sed` can be used instead of `bbe` (I didn't check).

    # Usage

  9. kpym revised this gist Feb 1, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion pdf2bw.sh
    @@ -74,7 +74,7 @@ echo "Thersnold set to ${thershold}%."

    # Convert the pdf pages to set of png images
    echo "extract pages: ${base}.pdf to temp_pageX.png..."
    mutool draw -o "temp_page%d.png" -r$resolution "${base}.pdf"
    mutool draw -o "temp_page%03d.png" -r$resolution "${base}.pdf"

    # Compress all pages using CCITT Groupe 4
    for f in temp_page*.png; do
  10. kpym revised this gist Jan 30, 2021. 2 changed files with 62 additions and 18 deletions.
    17 changes: 15 additions & 2 deletions README.md
    @@ -1,4 +1,17 @@
    # Description

    This script needs the following tools :
    - `mutool` (from MuPDF) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - ImageMagick, `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white (with thershold 77%) and compress them with `CCITT Groupe 4` compression.
    - Use `bbe` (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").
    - ImageMagick, `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white and compress them with `CCITT Groupe 4` compression.
    - Use `bbe` (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").

    # Usage

    ```bash
    > ./pdf2bw.sh
    pdf2bw.sh file.pdf [options]
    Result: file_xs.pdf
    Options:
    -r, --resolution] <int> : the resolution in dpi
    -t, --thershold <0-100> : the black/white thershold in percent
    ```
    63 changes: 47 additions & 16 deletions pdf2bw.sh
    @@ -1,5 +1,41 @@
    #!/bin/sh

    # https://gist.github.com/kpym/dc74bdc78a192add4ca4b3853fb3473d

    # Convert PDF to black and white CCITT Groupe 4 one. Use MuPDF, ImageMagisk and bbe (if available).

    # convert arguments to variables
    while [ -n "$1" ]; do
    key="$1"
    shift # past argument
    case $key in
    -r|--resolution)
    resolution="$1"
    shift # past value
    ;;
    -t|--threshold)
    thershold="$1"
    shift # past value
    ;;
    *.pdf) # the pdf
    base="${key%.pdf}" # The pdf name without the extension .pdf
    ;;
    *) # unknown option
    echo "Error: Unknown parameter $key"
    base=""
    esac
    done

    # Get the pdf base
    if [ -z "$base" ]; then
    echo "Usage: $0 file.pdf [options]"
    echo "Result: file_xs.pdf"
    echo "Options:"
    echo " -r, --resolution] <int> : the resolution in dpi"
    echo " -t, --thershold <0-100> : the black/white thershold in percent"
    exit 1
    fi

    # Check for mutool
    if ! command -v mutool &> /dev/null
    then
    @@ -23,32 +59,27 @@ else
    fi
    fi

    # Get the pdf filename
    if [ -z "$1" ]; then
    echo "Usage: $0 file.pdf [resolution]"
    echo "Result: file_xs.pdf"
    exit 1
    fi

    # Set the output resolution in ppp
    if [ -z "$2" ]; then
    # Set the output resolution in dpi
    if [ -z "$resolution" ]; then
    resolution=200
    else
    resolution=$2
    fi
    echo "Used resolution to $resolution."
    echo "Resolution set to ${resolution}dpi."

    # The pdf name without the extension .pdf
    base="${1%.*}"
    # Set the black and white thershold
    if [ -z "$thershold" ]; then
    thershold=77
    fi
    echo "Thersnold set to ${thershold}%."

    # Convert the pdf pages to set of png images
    echo "extract pages: $1 to temp_pageX.png..."
    mutool draw -o "temp_page%d.png" -r$resolution "$1"
    echo "extract pages: ${base}.pdf to temp_pageX.png..."
    mutool draw -o "temp_page%d.png" -r$resolution "${base}.pdf"

    # Compress all pages using CCITT Groupe 4
    for f in temp_page*.png; do
    printf "convert: $f -> ${f%.*}.pdf..."
    $im "$f" -threshold 77% -type bilevel -compress Group4 "${f%.*}.pdf"
    $im "$f" -threshold "${thershold}%" -type bilevel -compress Group4 "${f%.*}.pdf"
    echo "done"
    done

  11. kpym created this gist Jan 30, 2021.
    4 changes: 4 additions & 0 deletions README.md
    @@ -0,0 +1,4 @@
    This script needs the following tools :
    - `mutool` (from MuPDF) to convert all pages to `.png` files, and to merge back all compressed pages to single pdf.
    - ImageMagick, `convert` (Mac/Linux) or `magick` (Windows), to convert all `.png` pages to black and white (with thershold 77%) and compress them with `CCITT Groupe 4` compression.
    - Use `bbe` (if present) to add interpolation option to all CCITT streams to make the pages more pleasing to the eye (less "pixelated").
    73 changes: 73 additions & 0 deletions pdf2bw.sh
    @@ -0,0 +1,73 @@
    #!/bin/sh

    # Check for mutool
    if ! command -v mutool &> /dev/null
    then
    echo "Error: can't find mutool (MuPDF)"
    exit 1
    fi

    # Check for ImageMagisk = magick on Windows, convert on Mac/Linux
    if command -v magick &> /dev/null
    then
    im=magick
    echo "Use magick (probably on Windows)"
    else
    if command -v convert &> /dev/null
    then
    im=convert
    echo "Use convert (hope is not Windows)"
    else
    echo "Error: can'n find ImageMagick (magick or convert)."
    exit 1
    fi
    fi

    # Get the pdf filename
    if [ -z "$1" ]; then
    echo "Usage: $0 file.pdf [resolution]"
    echo "Result: file_xs.pdf"
    exit 1
    fi

    # Set the output resolution in ppp
    if [ -z "$2" ]; then
    resolution=200
    else
    resolution=$2
    fi
    echo "Used resolution to $resolution."

    # The pdf name without the extension .pdf
    base="${1%.*}"

    # Convert the pdf pages to set of png images
    echo "extract pages: $1 to temp_pageX.png..."
    mutool draw -o "temp_page%d.png" -r$resolution "$1"

    # Compress all pages using CCITT Groupe 4
    for f in temp_page*.png; do
    printf "convert: $f -> ${f%.*}.pdf..."
    $im "$f" -threshold 77% -type bilevel -compress Group4 "${f%.*}.pdf"
    echo "done"
    done

    # Collect all pages in a single pdf
    echo "join pages: temp_pageX.pdf -> temp_page_all.pdf"
    mutool merge -o "temp_page_all.pdf" temp_page*.pdf

    # Use bbe (if present) to add interpolation option to CCITT strames
    if command -v bbe &> /dev/null
    then
    echo "Add interpolation (blur) : temp_page_all.pdf -> ${base}_xs.pdf"
    bbe -b ":/BitsPerComponent 1/" -e "A /Interpolate\32true" temp_page_all.pdf -o "${base}_xs.pdf"
    else
    echo "Can't find bbe command. The images will look very sharp."
    echo "Move: temp_page_all.pdf -> ${base}_xs.pdf"
    mv temp_page_all.pdf "${base}_xs.pdf"
    fi

    # Clear temporary files
    echo delete page files
    rm temp_page*.png
    rm temp_page*.pdf
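
Two details in the revisions above may be worth a second look. The usage text advertises `--thershold`, while the `case` branch only matches `--threshold`. The script also starts with `#!/bin/sh` but tests for tools with `&> /dev/null`, which is bash syntax: a strictly POSIX shell such as dash reads it as `&` (run in the background) followed by a separate `> /dev/null`, so the test no longer reflects whether the tool exists. The following is a minimal sketch of how the option parsing and the ImageMagick lookup could be written portably. The function names `parse_args` and `find_imagemagick`, and the choice to accept both spellings of the threshold option, are assumptions made for illustration and are not part of the gist.

```sh
#!/bin/sh
# Sketch only: parse_args and find_imagemagick are hypothetical helper names,
# not functions from the original pdf2bw.sh.

# Parse "file.pdf [-r dpi] [-t percent]" into the variables
# base, resolution and threshold. Returns non-zero on bad input.
parse_args() {
  base=""
  resolution=200   # default taken from the usage text
  threshold=77     # default taken from the usage text
  while [ "$#" -gt 0 ]; do
    key="$1"
    shift          # past the argument
    case "$key" in
      -r|--resolution)
        [ "$#" -gt 0 ] || { echo "Error: $key needs a value" >&2; return 1; }
        resolution="$1"
        shift      # past the value
        ;;
      # Assumption: accept the correct spelling and the one shown in the usage text.
      -t|--threshold|--thershold)
        [ "$#" -gt 0 ] || { echo "Error: $key needs a value" >&2; return 1; }
        threshold="$1"
        shift      # past the value
        ;;
      *.pdf)
        base="${key%.pdf}"   # the pdf name without the .pdf extension
        ;;
      *)
        echo "Error: Unknown parameter $key" >&2
        return 1
        ;;
    esac
  done
  # Succeed only if a pdf file was given.
  [ -n "$base" ]
}

# Pick the ImageMagick command: magick if present, otherwise convert.
# Uses the POSIX redirection ">/dev/null 2>&1" instead of the bash-only "&>".
find_imagemagick() {
  if command -v magick >/dev/null 2>&1; then
    im=magick
  elif command -v convert >/dev/null 2>&1; then
    im=convert
  else
    echo "Error: can't find ImageMagick (magick or convert)." >&2
    return 1
  fi
}
```

In a script, these would typically be called as `parse_args "$@" || exit 1` and `find_imagemagick || exit 1` before the conversion step. Returning early on an unknown parameter, instead of clearing `base` and continuing, avoids the case where a later `*.pdf` argument silently sets `base` again.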