A detailed guide to PDF compression using Python
ConvertAPI provides a wide variety of converters. One of our most popular is the PDF Compression API, which can reduce a document's size by up to 99% while maintaining the same visual clarity. You can compress PDF documents easily with Python: we created an SDK library for Python so that you don't have to make any explicit HTTP calls yourself. In this tutorial, we will go through the steps needed to use Python for PDF compression.
PDF document compression algorithm
Our PDF Compression API applies multiple techniques, each with configurable options, to reduce the size of a PDF. These include optimizing the PDF structure, linearizing the document, and subsetting embedded fonts so that only the characters actually used are included in the PDF assets. It also lets you remove objects from the final PDF, such as alternative images, unused fonts, duplicate elements, annotations, and bookmarks. The PDF compressor can preserve the PDF/A format and can compress password-protected documents. All of this is easier than it sounds with our ConvertAPI library for Python.
How to use Python to compress a PDF document?
Compressing PDF documents using Python is super simple. Follow these steps to reduce a document's size programmatically:
- Install ConvertAPI library
- Get your API secret key
- Set up the conversion parameters
- Copy-paste the code into your project
Install ConvertAPI library for Python
The first thing you want to do is to install our Python library into your project. Here you have two options. You can install it using pip:
pip install --upgrade convertapi
Or you can install it from the library's source code on GitHub by running the following in the source directory:
python setup.py install
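Whichever option you choose, it can help to confirm that Python can actually find the library before you run a conversion. The following is a minimal sketch using only the standard library; the helper name `is_installed` is ours for illustration and is not part of the ConvertAPI library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("convertapi"):
        print("convertapi is available")
    else:
        print("convertapi is missing; run: pip install --upgrade convertapi")
```

The check only looks the module up and does not import it, so it is safe to run in any environment.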
Sign up for a free account
Secondly, please sign up for a free account on the ConvertAPI website to retrieve your API secret key.
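Once you have the secret, you may prefer not to hardcode it in your source files. One common approach is to read it from an environment variable. The sketch below assumes a variable named `CONVERT_API_SECRET`; both that name and the `load_api_secret` helper are our own illustrative choices, not something the library requires.

```python
import os


def load_api_secret(env=None, var_name="CONVERT_API_SECRET"):
    """Read the API secret from an environment mapping, failing loudly if absent.

    `var_name` is an assumed, hypothetical variable name; use whatever
    name fits your deployment.
    """
    env = os.environ if env is None else env
    secret = env.get(var_name, "").strip()
    if not secret:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your API secret"
        )
    return secret


# Typical use, once the library is installed:
#   import convertapi
#   convertapi.api_secret = load_api_secret()
```

Failing early with a clear message tends to be easier to debug than a rejected API call later on.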
Set up PDF compression parameters
Once you have installed the library and found your API secret, using the Python library for PDF compression is super simple. You can set up all conversion parameters and test the compression result using our PDF Compression API interactive demo page.
Get your auto-generated code snippet
As soon as you are happy with the conversion result, please find an auto-generated code snippet at the bottom of the page. All parameters in the code snippet are generated dynamically based on your choices, so there is no more coding involved - copy-paste the code snippet into your project, and you are good to go!
Code example
An extended example of PDF compression in Python might look something like this:
import convertapi

convertapi.api_secret = 'your-api-secret'

convertapi.convert('compress', {
    # Source document
    'File': '/path/to/large.pdf',
    # Color image compression and downsampling
    'ColorImageCompression': 'zip',
    'ColorImageQuality': '70',
    'ColorImageDownsample': 'true',
    'ColorImageThresholdDpi': '150',
    'ColorImageResampleDpi': '100',
    # Objects to remove from the final PDF
    'RemoveBookmarks': 'true',
    'RemoveAnnotations': 'true',
    'RemoveForms': 'true',
    'RemovePageLabels': 'true',
    'RemoveLayers': 'true',
    'RemoveArticleThreads': 'true',
    'RemoveNamedDestinations': 'true',
    'RemoveEmbeddedFiles': 'false',
    'RemovePieceInformation': 'false',
    # Font and structure optimization
    'UnembedBaseFonts': 'true',
    'SubsetEmbeddedFonts': 'true',
    'CreateObjectStreams': 'false',
    'Linearize': 'true'
}, from_format = 'pdf').save_files('/path/to/dir')
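The snippet above passes every option as a string, including booleans and numbers. If you would rather keep native Python values in your own code, a small helper can build the same kind of dictionary. This is only a sketch: `build_compress_params` and `to_api_value` are hypothetical helpers of ours, and the option names are simply the ones shown in the example above.

```python
def to_api_value(value):
    """Convert a native Python value to the string form used in the snippet."""
    # bool must be checked explicitly so True becomes 'true', not 'True'.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_compress_params(file, **options):
    """Build a parameter dictionary for the 'compress' converter.

    `file` is passed through untouched (it may be a path, a URL, or a
    stream); every other option is converted to a string.
    """
    params = {"File": file}
    for name, value in options.items():
        params[name] = to_api_value(value)
    return params


params = build_compress_params(
    "/path/to/large.pdf",
    ColorImageQuality=70,
    Linearize=True,
    RemoveEmbeddedFiles=False,
)
print(params)
```

The resulting dictionary can then be passed as the second argument to `convertapi.convert('compress', ..., from_format = 'pdf')` in the same way as the hand-written one.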
Advanced Python PDF compression options
Using the ConvertAPI library for Python, you can convert local files from your disk as well as remote files accessible by a public URL, or even pass a file stream to gain additional performance benefits.
Compress a local PDF file
To compress a local PDF file stored on your machine, specify the path to the file and the destination folder where you want to store your result like so:
import convertapi

convertapi.api_secret = 'your-api-secret'

# Compress a local PDF and save the result into the destination folder
convertapi.convert('compress', {
    'File': '/path/to/sample.pdf'
}, from_format = 'pdf').save_files('/path/to/dir')
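After compressing a local file, you may want to check how much space was actually saved. The sketch below compares the two file sizes on disk using only the standard library; `size_reduction_percent` is an illustrative helper of ours, and the paths you pass in are your own original and compressed files.

```python
import os


def size_reduction_percent(original_path, compressed_path):
    """Return how much smaller the compressed file is, as a percentage."""
    original = os.path.getsize(original_path)
    compressed = os.path.getsize(compressed_path)
    if original == 0:
        raise ValueError("original file is empty")
    # A negative result means the output is larger than the input.
    return (original - compressed) / original * 100.0
```

For example, a 1,000-byte original and a 250-byte result give a 75% reduction. Actual savings depend heavily on the document's content and the options you choose.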
Compress a remote PDF accessible by a URL
If you want to compress a remote file hosted on a server, please ensure it is publicly accessible by a URL. The URL's response must be a PDF file with the appropriate "application/pdf" content type set in the header.
import convertapi

convertapi.api_secret = 'your-api-secret'

# Compress a publicly accessible remote PDF and save the result locally
convertapi.convert('compress', {
    'File': 'https://cdn.convertapi.com/cara/testfiles/document-large.pdf'
}, from_format = 'pdf').save_files('/path/to/dir')
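If a remote conversion fails, it may be worth checking the content type the server reports for your URL. The following sketch sends a `HEAD` request with the standard library and inspects the `Content-Type` header. Both helper names are ours for illustration, and it assumes the server answers `HEAD` requests, which not every host does.

```python
import urllib.request


def is_pdf_content_type(header_value):
    """Return True if a Content-Type header value denotes application/pdf."""
    if not header_value:
        return False
    # Ignore optional parameters such as '; charset=binary'.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


def url_serves_pdf(url, timeout=10):
    """Issue a HEAD request and check the reported content type."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_pdf_content_type(response.headers.get("Content-Type"))
```

Calling `url_serves_pdf(...)` with your own URL performs a real network request, so treat a failure there as a hint to investigate rather than as a definitive answer.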
Compress a file stream
For large file processing, consider using file streams, which can increase performance significantly. Here is an example of how to pass a file stream to our converters:
import io
import tempfile

import convertapi

convertapi.api_secret = 'your-api-secret'

# io.FileIO opens the file as a raw binary stream
with io.FileIO("path\\to\\file", 'r') as file_stream:
    # Use the 'compress' converter, as in the other examples in this guide
    result = convertapi.convert('compress', { 'File': file_stream }, from_format = 'pdf')
    saved_files = result.save_files(tempfile.gettempdir())
    print("The PDF saved to %s" % saved_files)
You can find more examples of advanced file conversion techniques in our GitHub examples repo.
Conclusion
ConvertAPI makes your Python PDF compression easy. Simply install our library and use the auto-generated code snippets from our PDF Compression API page. You can set up a fully customizable PDF compression level and improve performance with significantly reduced PDF sizes. Check out our ConvertAPI Python library on GitHub, and feel free to contribute if you have an idea of how to make it even better!