
@MuhammadSawalhy
Last active July 25, 2022 19:03

Revisions

  1. MuhammadSawalhy revised this gist Jul 25, 2022. 2 changed files with 30 additions and 20 deletions.
    15 changes: 12 additions & 3 deletions my-story.md
    Original file line number Diff line number Diff line change
    @@ -1,3 +1,5 @@
    ## Plan A

    1. list images with command: `pdfimages -j -png file.pdf img`
    2. run **list-code-images.py** to find the dark theme code images
    3. invert these images to make them light theme code:
    @@ -6,10 +8,17 @@
    convert $f -channel RGB -negate inversed/$f
    done
    ```
    4. find a way to replace images in a pdf with code, but in the
    end I ended up using PyMuPDF to invert the dark theme code images
    and save them in the same position using **replace-images.py**
    4. find a way to replace images in a pdf with code (but I gave up here)
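Steps 2 and 3 of Plan A could also be done in one pass with Pillow instead of ImageMagick. The sketch below is only an illustration under assumptions, not a script from this gist: the `images/` and `inversed/` directory names follow the commands above, the function names are made up here, and negating each channel as `255 - value` is assumed to match what `convert -channel RGB -negate` does for 8-bit images.

```python
from pathlib import Path

DARK_BG = (12, 12, 12)  # background colour the gist's scripts test for


def negate_channel(value):
    """Invert one 8-bit channel value (0 becomes 255, 255 becomes 0)."""
    return 255 - value


def invert_dark_images(src_dir="images", dst_dir="inversed"):
    """Write an inverted copy of every dark-theme PNG in src_dir to dst_dir."""
    # Pillow is imported here so negate_channel stays usable without it.
    from PIL import Image

    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(src_dir).glob("*.png")):
        with Image.open(path) as im:
            rgb = im.convert("RGB")
            # Sample the bottom-centre pixel, as list-code-images.py does.
            sample = rgb.getpixel((rgb.width // 2, rgb.height - 1))
            if sample == DARK_BG:
                # Image.point applies the function to every channel value.
                rgb.point(negate_channel).save(out / path.name)
                written.append(path.name)
    return written
```

Calling `invert_dark_images()` returns the names of the files it wrote, which plays the role of `file.code-images.txt` in the shell loop.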

    ## Plan B

    I ended up using PyMuPDF to invert the dark theme code images and save them in the same position using **replace-images.py**.

    You will need to install these packages:

    ```bash
    pip install PyMuPDF rich
    ```

    ------------

    35 changes: 18 additions & 17 deletions replace-images.py
    @@ -1,20 +1,21 @@
    import fitz
    from rich import inspect
    from os import listdir
    # This creates the Document object doc
    doc: fitz.Document = fitz.open("file.pdf")
    for page in doc:
        for img in page.get_images():
            xref = img[-2]
            xref_number = img[0]
            pix = fitz.Pixmap(doc, xref)
            bg_color = pix.pixel(pix.width - 1, int(pix.height / 2))
            if bg_color == (12,12,12):
                pix.invert_irect()
                rect = page.get_image_bbox(xref)
                page.insert_image(rect, pixmap=pix, keep_proportion=False)

    # doc.save(filename=r"file.new.pdf", clean=True)
    # doc.save(filename=r"file.new.pdf", clean=True, garbage=4)
    # without deflate_images=1 the file size is 112MB, but now it is just 12MB
    doc.save(filename=r"file.new.pdf", clean=True, deflate=4, deflate_images=1, deflate_fonts=1)
    doc.close()
    for file in listdir("./files"):
        doc: fitz.Document = fitz.open(f"./files/{file}")
        for page in doc:
            for img in page.get_images(full=True):
                xref = img[0]
                pix = fitz.Pixmap(doc, xref)
                bg_color = pix.pixel(pix.width - 1, int(pix.height / 2))
                if bg_color == (12,12,12):
                    pix.invert_irect()
                    rect = page.get_image_bbox(img)
                    page.insert_image(rect, pixmap=pix, keep_proportion=False)

        # doc.save(filename=r"file.new.pdf", clean=True)
        # doc.save(filename=r"file.new.pdf", clean=True, garbage=4)
        # without deflate_images=1 the file size is 112MB, but now it is just 12MB
        doc.save(filename=f"./processed-files/{file}", clean=True, deflate=4, deflate_images=1, deflate_fonts=1)
        doc.close()
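The background test in the script compares one pixel against exactly `(12, 12, 12)`, so an image whose background is off by one level (for example after lossy compression) would be skipped. A possible variation is a per-channel tolerance. This is a sketch, not part of the gist: the function name and the default `tolerance` of 8 are hypothetical choices.

```python
def is_dark_background(pixel, target=(12, 12, 12), tolerance=8):
    """Return True if the first three channels of `pixel` are each
    within `tolerance` of the matching channel in `target`."""
    rgb = tuple(pixel[:3])  # ignore an alpha channel if one is present
    if len(rgb) < 3:
        return False  # grayscale or otherwise unexpected pixel layout
    return all(abs(c - t) <= tolerance for c, t in zip(rgb, target))
```

In the loop this would stand in for the equality check, as in `if is_dark_background(bg_color):`. Converting the pixel to a tuple first also makes the check independent of whether the pixel arrives as a list or a tuple.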
  2. MuhammadSawalhy revised this gist Jul 25, 2022. No changes.
  3. MuhammadSawalhy revised this gist Mar 29, 2022. 2 changed files with 8 additions and 2 deletions.
    9 changes: 8 additions & 1 deletion my-story.md
    @@ -8,4 +8,11 @@
    ```
    4. find a way to replace images in a pdf with code, but in the
    end I ended up using PyMuPDF to invert the dark theme code images
    and save them in the same position using **replace-images.py**
    and save them in the same position using **replace-images.py**


    ------------

    ![image](https://user-images.githubusercontent.com/42011920/160681311-be009615-1e24-4a95-b847-78353bf53927.png)

    Alhamdulillah, all images are replaced. This is only an illusion of replacement, because the new images are placed on top of the old ones, which are still in the file. I think it may be possible to really replace them using [Document.update_object](https://pymupdf.readthedocs.io/en/latest/document.html#Document.update_object) or [Document.update_stream](https://pymupdf.readthedocs.io/en/latest/document.html#Document.update_stream), both provided by the PyMuPDF package.
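Besides the two methods linked above, later PyMuPDF releases provide `Page.replace_image`, which swaps the image object stored at a given xref instead of drawing a new image over it. The gist does not use it; the sketch below assumes a PyMuPDF version that has this method (to my knowledge 1.21.0 or newer), and the helper and function names are hypothetical. Because the replacement applies to the xref, each xref is processed only once even if several pages show the same image.

```python
def new_xrefs(image_entries, seen):
    """Yield the xref (first item) of each entry not already in `seen`,
    recording it in `seen` so later pages skip it."""
    for entry in image_entries:
        xref = entry[0]
        if xref not in seen:
            seen.add(xref)
            yield xref


def replace_dark_images(src, dst, dark_bg=(12, 12, 12)):
    """Invert dark-background images in place and save the result to dst."""
    # Imported lazily so new_xrefs can be used without PyMuPDF installed.
    import fitz  # PyMuPDF

    doc = fitz.open(src)
    seen = set()
    for page in doc:
        for xref in new_xrefs(page.get_images(full=True), seen):
            pix = fitz.Pixmap(doc, xref)
            # Same sampling point as replace-images.py: right edge, mid height.
            sample = pix.pixel(pix.width - 1, pix.height // 2)
            if tuple(sample[:3]) == dark_bg:
                pix.invert_irect()
                page.replace_image(xref, pixmap=pix)
    # garbage=4 asks PyMuPDF to drop objects that are no longer referenced.
    doc.save(dst, garbage=4, deflate=True)
    doc.close()
```

A call such as `replace_dark_images("file.pdf", "file.new.pdf")` would mirror the file names used in the first version of the script.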
    1 change: 0 additions & 1 deletion replace-images.py
    @@ -1,4 +1,3 @@

    import fitz
    from rich import inspect
    # This creates the Document object doc
  4. MuhammadSawalhy revised this gist Mar 29, 2022. 3 changed files with 18 additions and 5 deletions.
    2 changes: 1 addition & 1 deletion list-code-images.py
    @@ -8,4 +8,4 @@
            x = math.floor(im.size[0]/2)
            px = im.getpixel((x,-1))
            if px == (12,12,12):
                print(img)
                print(img)
    11 changes: 11 additions & 0 deletions my-story.md
    @@ -0,0 +1,11 @@
    1. list images with command: `pdfimages -j -png file.pdf img`
    2. run **list-code-images.py** to find the dark theme code images
    3. invert these images to make them light theme code:
    ```bash
    for f in `cat file.code-images.txt`; do
        convert $f -channel RGB -negate inversed/$f
    done
    ```
    4. find a way to replace images in a pdf with code, but in the
    end I ended up using PyMuPDF to invert the dark theme code images
    and save them in the same position using **replace-images.py**
    10 changes: 6 additions & 4 deletions replace-images.py
    @@ -1,19 +1,21 @@

    import fitz
    from rich import inspect
    # This creates the Document object doc
    doc: fitz.Document = fitz.open("file.pdf")
    for page in doc:
        for img in page.get_images():
            xref = img[0]
            xref = img[-2]
            xref_number = img[0]
            pix = fitz.Pixmap(doc, xref)
            bg_color = pix.pixel(pix.width - 1, int(pix.height / 2))
            if bg_color == (12,12,12):
                pix.invert_irect()
                rect = page.get_image_bbox(img[-2])
                rect = page.get_image_bbox(xref)
                page.insert_image(rect, pixmap=pix, keep_proportion=False)

    # doc.save(filename=r"file.new.pdf", clean=True)
    # doc.save(filename=r"file.new.pdf", clean=True, garbage=4)
    # without deflate_images=1 the file size is 112MB, but now it is just 12MB
    doc.save(filename=r"file.new.pdf", clean=True, deflate=4, deflate_images=1, deflate_fonts=1)

    doc.close()
    doc.close()
  5. MuhammadSawalhy revised this gist Mar 29, 2022. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion replace-images.py
    @@ -12,6 +12,8 @@
                rect = page.get_image_bbox(img[-2])
                page.insert_image(rect, pixmap=pix, keep_proportion=False)

    doc.save(filename=r"file.new.pdf")
    # doc.save(filename=r"file.new.pdf", clean=True)
    # doc.save(filename=r"file.new.pdf", clean=True, garbage=4)
    doc.save(filename=r"file.new.pdf", clean=True, deflate=4, deflate_images=1, deflate_fonts=1)

    doc.close()
  6. MuhammadSawalhy created this gist Mar 29, 2022.
    11 changes: 11 additions & 0 deletions list-code-images.py
    @@ -0,0 +1,11 @@
    import math
    import glob
    from PIL import Image

    for img in glob.glob("images/*.png"):
        # for img in ["imageroot-014.png"]:
        with Image.open(img) as im:
            x = math.floor(im.size[0]/2)
            px = im.getpixel((x,-1))
            if px == (12,12,12):
                print(img)
    17 changes: 17 additions & 0 deletions replace-images.py
    @@ -0,0 +1,17 @@
    import fitz
    from rich import inspect
    # This creates the Document object doc
    doc: fitz.Document = fitz.open("file.pdf")
    for page in doc:
        for img in page.get_images():
            xref = img[0]
            pix = fitz.Pixmap(doc, xref)
            bg_color = pix.pixel(pix.width - 1, int(pix.height / 2))
            if bg_color == (12,12,12):
                pix.invert_irect()
                rect = page.get_image_bbox(img[-2])
                page.insert_image(rect, pixmap=pix, keep_proportion=False)

    doc.save(filename=r"file.new.pdf")

    doc.close()