We have been working with PDF files since 1999 and developed complex software to display [PDF](https://blog.idrsolutions.com/what-is-a-pdf/) files. We have learnt a lot about the PDF file format in that time and share our knowledge in the articles below.
XinyuIDR revised this gist Nov 27, 2024.
There are also a large number of technical terms used with PDF so we have created a [Glossary of Terms](https://blog.idrsolutions.com/glossary-of-pdf-terms/) with all the keywords.
If you are interested in using our software to display your PDF documents (we can rasterize them, [convert them to HTML5](https://blog.idrsolutions.com/why-convert-pdf-documents-to-html/) or SVG, or provide a complete Java PDF Viewer), why not [set up a call](https://www.idrsolutions.com/contact-us) with us and see if we can help?
Here is an overview of the topics covered in this article:
- [Make your own PDF file manually](https://blog.idrsolutions.com/understanding-the-pdf-file-format/#makeyourown)
## Quick Tutorials:
How to solve common PDF tasks in Java with our software
### BuildVu
[How to convert a PDF file into HTML](https://blog.idrsolutions.com/how-to-convert-pdf-to-html-in-java-tutorial/)
[How to convert a PDF file into SVG](https://blog.idrsolutions.com/how-to-convert-pdf-files-to-svg/)
### JDeli
[How to convert an image into PDF file](https://blog.idrsolutions.com/how-to-convert-an-image-to-a-pdf-in-java/)
### JPedal
[How to convert a PDF file to an image](https://blog.idrsolutions.com/how-to-convert-a-pdf-to-image-in-java/)
[How to rasterize PDF files](https://blog.idrsolutions.com/how-to-rasterize-pdf-files/)
[How to search a PDF file](https://blog.idrsolutions.com/how-to-search-a-pdf-file-in-java/)
[How to print a PDF file](https://blog.idrsolutions.com/how-to-print-pdf-files-from-java/)
[How to access PDF metadata](https://blog.idrsolutions.com/how-to-access-pdf-metadata-in-java/)
[How to extract text from PDF files](https://blog.idrsolutions.com/how-to-extract-text-from-pdf-files-in-java/)
[How to extract structured text from PDF files](https://blog.idrsolutions.com/how-to-extract-structured-text-from-pdf-files/)
[How to create or edit Annotations in a PDF file](https://blog.idrsolutions.com/how-to-create-or-edit-pdf-annotations)
[How to extract images from a PDF file](https://blog.idrsolutions.com/how-to-extract-images-from-pdf-in-java/)
[How to extract clipped Images from a PDF file](https://blog.idrsolutions.com/how-to-extract-clipped-images-from-pdf-file-in-java/)
[How to copy bookmarks from one PDF to another](https://blog.idrsolutions.com/how-to-copy-bookmarks-from-one-pdf-to-another/)
[How to find PDF page size](https://blog.idrsolutions.com/how-to-find-pdf-page-size-in-java/)
[How to view PDF files](https://blog.idrsolutions.com/how-to-view-pdf-files-in-java/)
[How to extract PDF file form data](https://blog.idrsolutions.com/how-to-extract-pdf-file-form-data-in-java/)
[How to split a PDF file in Java](https://blog.idrsolutions.com/how-to-split-pdf-files-in-java/)
[How to remove a page from a PDF file in Java](https://blog.idrsolutions.com/how-to-remove-a-page-from-a-pdf-file-in-java/)
## Guides:
[Top 9 PDF file questions with answers for developers](https://blog.idrsolutions.com/top-9-pdf-file-questions-with-answers-for-developers/)
[What is the PDF file format?](https://blog.idrsolutions.com/what-is-the-pdf-file-format/)
[What Java Developers need to know about PDF Files?](https://blog.idrsolutions.com/what-java-developers-need-to-know-about-pdf-files/)
## Frequently Asked Questions:
Questions developers often ask us
[Why can’t I just open and edit a PDF File?](https://blog.idrsolutions.com/why-cant-i-just-open-and-edit-a-pdf-file/)
[How do I find out the PDF version used?](https://blog.idrsolutions.com/how-do-i-find-out-the-pdf-version-used/)
[What is a PDF renderer?](https://blog.idrsolutions.com/what-is-a-pdf-renderer/)
[What is a tagged PDF?](https://blog.idrsolutions.com/what-is-tagged-pdf/)
[How big is a PDF Page in bytes?](https://blog.idrsolutions.com/how-big-is-a-pdf-page-size-in-bytes/)
[What does an OCR PDF file contain?](https://blog.idrsolutions.com/what-does-ocr-pdf-file-contain/)
[What is PDF Pagesize? CropBox, MediaBox, ArtBox, BleedBox, TrimBox?](https://blog.idrsolutions.com/what-is-pdf-pagesize/)
[How to calculate PDF Page Size in Inches or Centimetres?](https://blog.idrsolutions.com/how-to-calculate-pdf-page-size-in-inches-or-centimetres/)
[Why is my PDF Producer showing in Chinese?](https://blog.idrsolutions.com/why-is-my-pdf-producer-showing-up-in-chinese-or-all-the-adventure-of-the-wrongly-encoded-textstream/)
[How to Embed PDF files in HTML Web Pages](https://blog.idrsolutions.com/how-to-embed-pdf-files-in-html-web-pages/)
[How to Compare PDF files](https://blog.idrsolutions.com/how-to-compare-pdf-files/)
[How to handle corrupt PDF files](https://blog.idrsolutions.com/how-to-handle-corrupt-pdf-files/)
## The PDF File itself:
This section covers the actual file format and how it works
[How to view PDF objects](https://blog.idrsolutions.com/how-to-view-pdf-objects/)
[How to read a PDF file](https://blog.idrsolutions.com/how-to-read-a-pdf-file/)
[Where do your PDF objects start in a PDF file?](https://blog.idrsolutions.com/where-do-your-pdf-objects-start-in-a-pdf-file/)
[Understanding the PDF file format – Text, shapes and images](https://blog.idrsolutions.com/understanding-the-pdf-file-format-text-shapes-and-images/)
[What are PDF Object Streams?](https://blog.idrsolutions.com/what-are-pdf-object-streams/)
[Multiple Trailers in a PDF File](https://blog.idrsolutions.com/multiple-trailers-in-a-pdf-file/)
[What are PDF Xref tables?](https://blog.idrsolutions.com/what-are-pdf-xref-tables/)
[Understanding PDF Text Objects](https://blog.idrsolutions.com/understanding-pdf-text-objects/)
[How does a decodeArray work on Images?](https://blog.idrsolutions.com/how-does-decodearray-work/)
[What is a PDF Dictionary?](https://blog.idrsolutions.com/what-is-a-pdf-dictionary/)
[What is a Linearized PDF File?](https://blog.idrsolutions.com/what-is-a-linearized-pdf/)
[What are Form XObjects?](https://blog.idrsolutions.com/what-are-form-xobjects/)
[How are stacks used in PDF files?](https://blog.idrsolutions.com/how-are-stacks-used-in-pdf-files/)
[How to identify a PDF File](https://blog.idrsolutions.com/how-to-identify-a-pdf-file/)
[No Startxref found in last 1024 bytes?](https://blog.idrsolutions.com/no-startxref-found-in-last-1024-bytes-opening-file-what-does-this-error-message-mean-with-a-pdf-file/)
[How to Embed your own data in PDF files](https://blog.idrsolutions.com/how-to-embed-your-own-data-in-pdf-files/)
[Why writing a PDF parser is such a challenging task (Part 234)](https://blog.idrsolutions.com/why-writing-a-pdf-parser-is-such-a-challenging-task-part-234/)
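As a small illustration of the file-level topics listed above, a PDF file normally begins with a `%PDF-x.y` header. The following is a minimal sketch rather than a description of how our software works: the class name `PdfHeaderSketch`, the method `headerVersion` and the choice to search only the first 1024 bytes are assumptions made for this example. The header version is also only a first hint, because newer files can override it with a `/Version` entry in the document catalog.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PdfHeaderSketch {

    // Matches headers such as %PDF-1.4 or %PDF-2.0
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    // Returns the version declared in the header, or null if no header is found.
    // Only the start of the data is inspected (1024 bytes is an assumption of this sketch).
    public static String headerVersion(byte[] firstBytes) {
        int len = Math.min(firstBytes.length, 1024);
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding
        String head = new String(firstBytes, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: a header followed by the start of an object
        byte[] pdfLike = "%PDF-1.7\n1 0 obj\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] notPdf = "GIF89a".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("PDF-like data: " + headerVersion(pdfLike));
        System.out.println("Other data: " + headerVersion(notPdf));
    }
}
```

In practice you would pass in the first bytes read from a real file instead of the sample arrays.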
## Images in PDF:
This section explores image-related topics in the PDF file format
[How are images stored in a PDF file?](https://blog.idrsolutions.com/how-images-are-stored-in-pdf/)
[What are Blend Modes in PDF files?](https://blog.idrsolutions.com/what-are-blend-modes-in-pdf/)
[What are PDF Image Masks?](https://blog.idrsolutions.com/what-are-image-masks/)
[How to calculate PDF Image DPI?](https://blog.idrsolutions.com/how-to-calculate-pdf-image-dpi/)
[How to extract Raw JPEG Images from a PDF File?](https://blog.idrsolutions.com/how-to-extract-raw-jpeg-images-from-a-pdf-file/)
[How do Filter and DecodeParms Objects change a PDF Image?](https://blog.idrsolutions.com/filter-and-decodeparms-objects-for-a-pdf-image/)
## Color handling in PDF:
Color support inside PDF files is very powerful and complex.
[How does Color work in PDF files?](https://blog.idrsolutions.com/how-does-color-work-in-pdf-files/)
[How does image color depth work in PDF files?](https://blog.idrsolutions.com/how-does-image-color-depth-work-in-pdf-files/)
[What is an Indexed Colorspace in a PDF file?](https://blog.idrsolutions.com/what-is-an-indexed-colorspace-in-a-pdf-file/)
[Why is white a special color in PDF Files?](https://blog.idrsolutions.com/why-is-white-a-special-color-in-pdf-files/)
[What are ICCBased Colorspaces?](https://blog.idrsolutions.com/what-are-iccbased-colorspaces-in-pdf-files/)
## Text in PDF:
How text is stored in, displayed in and extracted from a PDF file
[How is text stored in a PDF file?](https://blog.idrsolutions.com/how-is-text-stored-in-a-pdf-file/)
[Why is PDF text extraction problematic?](https://blog.idrsolutions.com/why-is-pdf-text-extraction-problematic/)
[What is Unicode?](https://blog.idrsolutions.com/beginners-introduction-unicode/)
[What text format and style information is in a PDF file?](https://blog.idrsolutions.com/what-text-format-and-style-information-in-a-pdf-file/)
[How to find out if a PDF file contains ‘structured content’](https://blog.idrsolutions.com/how-to-find-out-if-a-pdf-file-has-structured-content/)
[What does the ActualText dictionary tag do?](https://blog.idrsolutions.com/what-does-the-actualtext-dictionary-tag-do/)
[How do PDF Text Coordinates work?](https://blog.idrsolutions.com/how-do-pdf-text-coordinates-work/)
[How are carriage returns, spaces and other gaps defined in a PDF file?](https://blog.idrsolutions.com/how-are-carriage-returns-spaces-and-other-gaps-defined/)
[PDF Mystery – What is the correct value for a Text Field?](https://blog.idrsolutions.com/pdf-mystery-what-is-the-correct-value-for-a-text-field/)
[PDF Text extraction – Why can I not extract text from a PDF file?](https://blog.idrsolutions.com/why-can-i-not-extract-text-from-this-pdf-file/)
[How are text links defined in a PDF file?](https://blog.idrsolutions.com/how-are-text-links-defined-in-a-pdf-file/)
[How are Text spaces created in a PDF file?](https://blog.idrsolutions.com/how-are-text-spaces-created-in-a-pdf-file)
## Fonts in PDF:
PDF files can use three different font technologies for display
[Introductory PDF font tutorial](https://blog.idrsolutions.com/introductory-pdf-font-tutorial/)
[Introduction to PDF Font Technologies](https://blog.idrsolutions.com/pdf-font-technologies/)
[How are Embedded CMAP tables defined in a PDF File?](https://blog.idrsolutions.com/how-are-embedded-cmap-tables-in-pdf-file/)
[What are CID Fonts?](https://blog.idrsolutions.com/what-are-cid-fonts/)
[What are subsetted fonts in PDF files?](https://blog.idrsolutions.com/what-are-subsetted-fonts-in-pdf-files/)
[Where do PDF viewers get font data for non-embedded fonts?](https://blog.idrsolutions.com/where-do-pdf-viewers-get-font-data-for-non-embedded-fonts/)
[Problems caused by Arial fonts in PDF files](https://blog.idrsolutions.com/problems-caused-by-arial-font-in-pdf-files/)
[How does TrueType Hinting work?](https://blog.idrsolutions.com/how-does-truetype-hinting-work/)
[Why are CID Fonts far more complicated than non-CID Fonts?](https://blog.idrsolutions.com/why-are-cid-fonts-far-more-complicated-than-non-cid-fonts/)
## PDF Forms, Annotations & Interactive Elements:
PDF files can contain interactive elements with Forms and Annotations
[What are PDF Forms?](https://blog.idrsolutions.com/what-are-pdf-forms/)
[What are AcroForms?](https://blog.idrsolutions.com/what-are-acroforms/)
[What are XFA Forms?](https://blog.idrsolutions.com/what-are-xfa-forms/)
[How do PDF files add interactive elements?](https://blog.idrsolutions.com/how-do-pdf-files-add-interactive-elements/)
[How do Layers work in a PDF file?](https://blog.idrsolutions.com/how-do-layers-work-in-a-pdf-file/)
[Is it possible to extract flattened form data from a PDF file?](https://blog.idrsolutions.com/is-it-possible-to-extract-flattened-form-data-from-a-pdf-file/)
[What is PDF Form Flattening?](https://blog.idrsolutions.com/what-is-pdf-form-flattening/)
[How to display PDF forms in a browser](https://blog.idrsolutions.com/how-to-display-pdf-forms-in-a-browser/)
## PDF File Encryption:
PDF files can have their content protected using encryption.
[How are PDF files protected?](https://blog.idrsolutions.com/how-are-pdf-files-protected/)
[Overview of Security Features offered by the PDF file format](https://blog.idrsolutions.com/brief-overview-of-security-features-offered-by-the-pdf-file-format/)
[How are PDF files password protected?](https://blog.idrsolutions.com/how-are-pdf-files-password-protected/)
[How to create your own test certificates and keys for signing PDF files](https://blog.idrsolutions.com/how-to-create-your-own-test-certificates-and-keys-for-signing-pdf-files/)
## PDF compression:
PDF files use CCITT, DCT, Flate, LZW and other forms of compression to reduce their size.
[What is CCITT compression?](https://blog.idrsolutions.com/what-is-ccitt-compression//)
[How to Convert CCITT data to TIFF image](https://blog.idrsolutions.com/how-to-convert-ccitt-data-to-tiff/)
[What is the best option to compress a PDF?](https://blog.idrsolutions.com/what-is-the-best-compression-format-for-pdf/)
[How does CCITT compress image data?](https://blog.idrsolutions.com/how-does-ccitt-compress-image-data/)
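To give a feel for one of the compression types named above, Flate is the zlib/deflate algorithm, which the standard `java.util.zip` classes implement. The sketch below is only an illustration under stated assumptions: the class name `FlateSketch`, its helper methods and the sample content stream are made up for this example and are not taken from our software. It compresses a repetitive block of bytes and inflates it again.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateSketch {

    // Compress bytes with zlib/deflate, the algorithm behind the FlateDecode filter
    public static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverse the compression; bad or truncated data is reported as an unchecked exception
    public static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or unsupported Flate data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Invalid Flate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical sample: a text-drawing command repeated many times
        String content = "BT /F1 12 Tf 72 720 Td (Hello World) Tj ET\n".repeat(50);
        byte[] raw = content.getBytes(StandardCharsets.US_ASCII);
        byte[] packed = deflate(raw);
        System.out.println("raw bytes: " + raw.length);
        System.out.println("deflated bytes: " + packed.length);
        System.out.println("round trip ok: " + Arrays.equals(raw, inflate(packed)));
    }
}
```

Repetitive data such as page content streams tends to shrink well with Flate. Image-specific schemes such as CCITT and DCT work differently and are not covered by this sketch.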
## Make your own PDF file manually with our ‘Hello World’ coding example
One of our developers bravely set out to write the ‘Hello World’ tutorial of PDF files, creating a PDF file from scratch manually, in a text editor. Follow the series:
[Part 1: PDF Objects and Data Types](https://blog.idrsolutions.com/make-your-own-pdf-file-part-1-pdf-objects-and-data-types/)
[Part 2: Structure of a PDF file](https://blog.idrsolutions.com/make-your-own-pdf-file-part-2-structure-of-a-pdf-file/)
[Part 2.5: Create a non-working PDF](https://blog.idrsolutions.com/make-your-own-pdf-part-2b-create-your-own-non-working-pdf/)
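The series above builds the file by hand in a text editor. For readers who would rather generate the same kind of file from Java, here is a rough sketch of a one-page ‘Hello World’ PDF. It is not the code from the series: the class name `HelloPdfSketch`, the object layout, the US Letter page size and the output file name `hello.pdf` are all assumptions chosen for this example. It also assumes the message is plain ASCII with no parentheses or backslashes, since those would need escaping inside a PDF string.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

public class HelloPdfSketch {

    // Builds a minimal single-page PDF; message must be ASCII without ( ) or \
    public static byte[] build(String message) {
        String stream = "BT /F1 24 Tf 72 720 Td (" + message + ") Tj ET";
        String[] objects = {
            "<< /Type /Catalog /Pages 2 0 R >>",
            "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
                + "/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
            "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
            "<< /Length " + stream.length() + " >>\nstream\n" + stream + "\nendstream"
        };

        // Everything is ASCII, so character positions equal byte offsets
        StringBuilder pdf = new StringBuilder("%PDF-1.4\n");
        int[] offsets = new int[objects.length];
        for (int i = 0; i < objects.length; i++) {
            offsets[i] = pdf.length(); // byte offset where object i + 1 starts
            pdf.append(i + 1).append(" 0 obj\n").append(objects[i]).append("\nendobj\n");
        }

        // Cross-reference table: each entry is exactly 20 bytes long
        int xrefOffset = pdf.length();
        pdf.append("xref\n0 ").append(objects.length + 1).append('\n');
        pdf.append("0000000000 65535 f \n");
        for (int offset : offsets) {
            pdf.append(String.format(Locale.ROOT, "%010d 00000 n \n", offset));
        }

        // Trailer points at the catalog; startxref points at the xref table
        pdf.append("trailer\n<< /Size ").append(objects.length + 1).append(" /Root 1 0 R >>\n");
        pdf.append("startxref\n").append(xrefOffset).append("\n%%EOF\n");
        return pdf.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Files.write(Paths.get("hello.pdf"), build("Hello World"));
    }
}
```

Opening the resulting file in a text editor shows the header, objects, xref table and trailer, and is a convenient way to compare a generated file with one written by hand.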