PDF to Text API Overview
Convert text-based or scanned PDF documents to plain text files, extract text from a PDF, and apply OCR to scanned documents before conversion.
Instant Text Extraction
Quickly pull plain text from any PDF document with customizable settings.
Accurate Parsing
Retains text structure while removing non-textual elements.
Works with Scanned PDFs
Supports OCR for image-based and scanned PDFs.
Lightweight Output
Get clean, minimalistic TXT files ready for further processing.
Custom OCR Settings
Select OCR engine, language, and OCR mode to get the best results.
Privacy First
Your data is handled in compliance with ISO 27001, GDPR, and HIPAA.
Customizable Parameters
Fine-tune your automation with these powerful conversion options.
File
File. Supported formats: .pdf. The file to be converted. The value can be a URL or file content.
Password
String. Sets the password to open protected documents.
PageRange
String. Default: 1-2000. Sets the page range. Example: 1-10 or 1,2,5.
OcrMode
Collection. Default: auto. Defines how OCR is applied during conversion. Auto performs OCR only when needed. Force applies OCR to all pages. Never disables OCR entirely.
OcrLanguage
Collection. Default: auto. Configures the OCR language for text recognition. If auto-detection fails, specify the language manually.
Values: auto ar ca zh da nl en fi fr de el ko it ja no pl pt ro ru sl es sv tr ua th
IncludeFormatting
Bool. Default: False. Preserves formatting while extracting text. Only works when the RemoveHeadersFooters and RemoveFootnotes properties are disabled.
SplitPages
Bool. Default: False. Splits each page into a separate result file.
RemoveHeadersFooters
Bool. Default: False. Removes headers and footers from the document.
RemoveFootnotes
Bool. Default: False. Removes footnotes from the document.
RemoveTables
Bool. Default: False. Removes tables from the document.
StoreFile
Bool. Default: False. When the StoreFile parameter is set to True, your converted file is written to ConvertAPI’s encrypted, temporary storage and made available via a time-limited secure download URL, valid for up to 3 hours. After this period, the file is permanently deleted.
When StoreFile is set to False, conversion happens entirely in-memory. The raw file bytes are streamed back in the API response without touching disk or external storage, ensuring maximum security and zero persistence so that only you can access the content.
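The parameter rules above can be checked on the client before a request is sent. The following Python sketch is illustrative only: the helper names `expand_page_range` and `check_options` are hypothetical, the lowercase spellings `force` and `never` are assumed from the documented `auto` default, and accepting mixed input such as `1-3,7` is an assumption that the service itself may not share.

```python
import re

# Values taken from the parameter list; "force" and "never" spellings are assumed.
OCR_MODES = {"auto", "force", "never"}
OCR_LANGUAGES = {
    "auto", "ar", "ca", "zh", "da", "nl", "en", "fi", "fr", "de", "el", "ko", "it",
    "ja", "no", "pl", "pt", "ro", "ru", "sl", "es", "sv", "tr", "ua", "th",
}

# One comma-separated part: a single page ("5") or a span ("1-10").
_RANGE_PART = re.compile(r"^(\d+)(?:-(\d+))?$")


def expand_page_range(page_range: str) -> list[int]:
    """Expand a PageRange string such as "1-10" or "1,2,5" into page numbers."""
    pages = []
    for part in page_range.split(","):
        match = _RANGE_PART.match(part.strip())
        if not match:
            raise ValueError(f"invalid page range part: {part!r}")
        start = int(match.group(1))
        end = int(match.group(2) or start)
        if start < 1 or end < start:
            raise ValueError(f"invalid page span: {part!r}")
        pages.extend(range(start, end + 1))
    return pages


def check_options(options: dict) -> dict:
    """Validate conversion options locally and return them unchanged."""
    if str(options.get("OcrMode", "auto")).lower() not in OCR_MODES:
        raise ValueError("OcrMode must be auto, force, or never")
    if str(options.get("OcrLanguage", "auto")).lower() not in OCR_LANGUAGES:
        raise ValueError("unsupported OcrLanguage")
    if "PageRange" in options:
        expand_page_range(options["PageRange"])  # raises on malformed input
    # IncludeFormatting only works when both removal options are disabled.
    if options.get("IncludeFormatting") and (
        options.get("RemoveHeadersFooters") or options.get("RemoveFootnotes")
    ):
        raise ValueError(
            "IncludeFormatting requires RemoveHeadersFooters and "
            "RemoveFootnotes to be disabled"
        )
    return options
```

For example, `check_options({"PageRange": "1-10", "OcrMode": "auto", "IncludeFormatting": True})` passes, while adding `"RemoveFootnotes": True` to the same options raises a `ValueError`.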
Integrate within minutes
Automate PDF to Text conversion with our simple REST API.
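As a rough illustration of what such an integration might look like, the Python sketch below assembles a POST request from the parameters described on this page, using only the standard library. The endpoint URL, the bearer-token header, and the flat JSON body are all placeholders assumed for this example, not the documented contract; consult the official API reference for the actual endpoint, authentication scheme, and request format.

```python
import json
import urllib.request

# Hypothetical placeholder endpoint; replace with the one from the API reference.
ENDPOINT = "https://api.example.com/convert/pdf/to/txt"


def build_request(file_url: str, token: str, **options) -> urllib.request.Request:
    """Build (but do not send) a conversion request for a PDF at file_url."""
    # Assumed body shape: parameter names from this page mapped to their values.
    payload = {"File": file_url, **options}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "https://example.com/sample.pdf",
    "YOUR_TOKEN",
    PageRange="1-10",
    OcrMode="auto",
    StoreFile=True,
)

# Sending is left commented out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     body = response.read()
```

Keeping request construction separate from sending makes the parameters easy to inspect and test before any document leaves your system.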
Businesses trust us
Highest rated File Conversion API on major B2B software listing platforms: Capterra, G2, and Trustpilot.
"ConvertAPI has been a game-changer for our document automation workflows. Their conversion accuracy and API reliability are unmatched in the industry for over 7 years."
"ConvertAPI is a reliable, cost-effective solution with a proven track record of stability. It has grown significantly in maturity, adopting enterprise-grade practices over the years."
"We've integrated ConvertAPI across our entire document processing platform. The performance is exceptional and the support team is always responsive. Highly recommended!"
Enterprise-Grade Security
We ensure that all document processing is handled securely in the cloud, adhering to industry-leading standards like ISO 27001, GDPR, and HIPAA. To enhance security even further, we can ensure that no files or data are stored on our servers and that they never leave your country.