API reference

Package

pdfrest

Top-level package for the pdfrest client library.

AsyncPdfRestClient

Bases: _AsyncApiClient

Asynchronous client for interacting with the pdfRest API.

__init__(*, api_key=None, base_url=None, timeout=None, headers=None, http_client=None, transport=None, concurrency_limit=DEFAULT_FILE_INFO_CONCURRENCY, max_retries=DEFAULT_MAX_RETRIES)

Create an asynchronous pdfRest client.

query_pdf_info(file, *, queries=ALL_PDF_INFO_QUERIES, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Query pdfRest for metadata describing a PDF document asynchronously.

summarize_text(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Summarize the textual content of a PDF, Markdown, or text document.

Always requests JSON output and returns the inline summary response defined in the pdfRest API reference.

summarize_text_to_file(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Summarize a document and return the result as a downloadable file.

convert_to_markdown(file, *, pages=None, page_break_comments=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Convert a PDF to Markdown and return a file-based response.

ocr_pdf(file, *, languages='English', pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Perform OCR on a PDF to make text searchable and extractable.

translate_pdf_text(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Translate the textual content of a PDF, Markdown, or text document and return the result as inline JSON.

translate_pdf_text_to_file(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Translate textual content and receive a file-based response.

extract_images(file, *, pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Extract embedded images from a PDF.

extract_pdf_text(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Extract text content from a PDF and return parsed JSON results.

extract_pdf_text_to_file(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Extract text content from a PDF and return a file-based response.

preview_redactions(file, *, redactions, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously generate a PDF redaction preview.

apply_redactions(file, *, rgb_color=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously apply PDF redactions.

add_text_to_pdf(file, *, text_objects, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously insert text blocks into a PDF.

add_image_to_pdf(file, *, image, x, y, page, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously insert an image into a PDF.

up(*, extra_headers=None, extra_query=None, extra_body=None, timeout=None) async

Call the /up health endpoint asynchronously and return server metadata.
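A health call like this is often wrapped in a small readiness probe. The sketch below assumes only that the client has an awaitable `up()` method, as in the signature above. `FakeClient` and `is_up` are hypothetical names used so the example runs without network access. The fake returns an opaque object because the fields of the real response are not listed here.

```python
import asyncio


class FakeClient:
    """Stand-in exposing only an async up() method, for illustration."""

    def __init__(self, healthy):
        self._healthy = healthy

    async def up(self):
        if not self._healthy:
            raise RuntimeError("service unavailable")
        return object()  # placeholder for the real response object


async def is_up(client):
    """Return True if client.up() completes without raising."""
    try:
        await client.up()
    except Exception:
        # Broad catch for the sketch; real code would likely narrow this
        # to the library's own base exception.
        return False
    return True


if __name__ == "__main__":
    print(asyncio.run(is_up(FakeClient(healthy=True))))
```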

split_pdf(file, *, page_groups=None, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously split a PDF into one or more PDF files.

merge_pdfs(sources, *, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously merge multiple PDFs (or page subsets) into a single PDF.

zip_files(files, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously compress one or more files into a zip archive.

unzip_file(file, *, password=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously extract files from a zip archive.

convert_to_excel(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PDF to an Excel spreadsheet.

convert_to_powerpoint(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PDF to a PowerPoint presentation.

convert_xfa_to_acroforms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert an XFA PDF to an AcroForm-enabled PDF.

convert_to_word(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PDF to a Word document.

import_form_data(file, data_file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously import form data from a data file into a PDF.

export_form_data(file, *, data_format, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously export form data from a PDF into a data file.

data_format support depends on the detected form type:

- AcroForm PDFs: xfdf, fdf, xml
- XFA PDFs: xfd, xdp, xml
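The format rules above can be captured in a small lookup table for validating a choice before making a request. This is a sketch: the keys `"acroform"` and `"xfa"` and the helper `check_data_format` are hypothetical names for this example, and only the format lists come from the text above.

```python
# Formats listed in this reference, keyed by labels chosen for this sketch.
SUPPORTED_DATA_FORMATS = {
    "acroform": frozenset({"xfdf", "fdf", "xml"}),
    "xfa": frozenset({"xfd", "xdp", "xml"}),
}


def check_data_format(form_type, data_format):
    """Return the normalized data_format, or raise ValueError if unsupported."""
    try:
        allowed = SUPPORTED_DATA_FORMATS[form_type.lower()]
    except KeyError:
        raise ValueError(f"unknown form type: {form_type!r}") from None
    normalized = data_format.lower()
    if normalized not in allowed:
        raise ValueError(
            f"{data_format!r} is not supported for {form_type} forms; "
            f"expected one of {sorted(allowed)}"
        )
    return normalized
```

Note that xml is the only format listed for both form types.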

flatten_pdf_forms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously flatten form fields in a PDF.

add_permissions_password(file, *, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously add a permissions password and optional restrictions to a PDF.

change_permissions_password(file, *, current_permissions_password, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously rotate the permissions password and optionally update restrictions.

remove_permissions_password(file, *, current_permissions_password, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously remove permissions restrictions from a PDF.

add_open_password(file, *, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously encrypt a PDF with a new open password.

change_open_password(file, *, current_open_password, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously rotate the open password for an encrypted PDF.

remove_open_password(file, *, current_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously decrypt a PDF by removing its open password.

compress_pdf(file, *, compression_level, profile=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously compress a PDF.

add_attachment_to_pdf(file, *, attachment, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously attach an uploaded file to a PDF.

blank_pdf(*, page_size='letter', page_count=1, page_orientation=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously create a blank PDF with configurable size, count, and orientation.

flatten_transparencies(file, *, output=None, quality='medium', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously flatten transparent objects in a PDF.

linearize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously linearize a PDF for optimized fast web view.

flatten_annotations(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously flatten annotations into the PDF content.

flatten_layers(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously flatten all layers in a PDF.

rasterize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously rasterize a PDF into a flattened bitmap-based PDF.

convert_office_to_pdf(file, *, output=None, compression='lossy', downsample=300, tagged_pdf=False, locale=None, page_size=None, page_margin=None, page_orientation=None, web_layout=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a Microsoft Office file to PDF.

convert_postscript_to_pdf(file, *, output=None, compression='lossy', downsample=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PostScript or EPS file to PDF.

convert_email_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert an RFC822 email file to PDF.

convert_image_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a supported image file to PDF.

convert_html_to_pdf(file, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert an uploaded HTML file to PDF.

convert_url_to_pdf(url, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert HTML content from one URL to PDF.

convert_to_pdfa(file, *, output_type, output=None, rasterize_if_errors_encountered=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PDF to a specified PDF/A version.

convert_to_pdfx(file, *, output_type, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert a PDF to a specified PDF/X version.

convert_to_png(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert one or more pdfRest files to PNG images.

convert_to_bmp(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert one or more pdfRest files to BMP images.

convert_to_gif(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert one or more pdfRest files to GIF images.

convert_to_jpeg(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', jpeg_quality=75, extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert one or more pdfRest files to JPEG images.

convert_to_tiff(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None) async

Asynchronously convert one or more pdfRest files to TIFF images.

PdfRestClient

Bases: _SyncApiClient

Synchronous client for interacting with the pdfRest API.

__init__(*, api_key=None, base_url=None, timeout=None, headers=None, http_client=None, transport=None, max_retries=DEFAULT_MAX_RETRIES)

Create a synchronous pdfRest client.
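Because every constructor argument is keyword-only, settings can be assembled as a dictionary and unpacked into the call. The sketch below builds such a dictionary from environment-style settings. The variable names (`PDFREST_API_KEY`, `PDFREST_BASE_URL`, `PDFREST_MAX_RETRIES`) are hypothetical choices for this example, and treating `max_retries` as an integer is an assumption.

```python
from __future__ import annotations

import os
from typing import Any, Mapping


def client_kwargs_from_env(env: Mapping[str, str] | None = None) -> dict[str, Any]:
    """Collect constructor keyword arguments from environment-style settings.

    The variable names read here are hypothetical, chosen for this sketch.
    """
    source = os.environ if env is None else env
    kwargs: dict[str, Any] = {}
    if source.get("PDFREST_API_KEY"):
        kwargs["api_key"] = source["PDFREST_API_KEY"]
    if source.get("PDFREST_BASE_URL"):
        kwargs["base_url"] = source["PDFREST_BASE_URL"]
    if source.get("PDFREST_MAX_RETRIES"):
        # Assumes max_retries is an integer count.
        kwargs["max_retries"] = int(source["PDFREST_MAX_RETRIES"])
    return kwargs
```

With the package installed, the result could be unpacked as `PdfRestClient(**client_kwargs_from_env())`. Arguments that are left out keep their documented defaults.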

up(*, extra_headers=None, extra_query=None, extra_body=None, timeout=None)

Call the /up health endpoint and return server metadata.

query_pdf_info(file, *, queries=ALL_PDF_INFO_QUERIES, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Query pdfRest for metadata describing a PDF document.

summarize_text(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Summarize the textual content of a PDF, Markdown, or text document.

Always requests JSON output and returns the inline summary response defined in the pdfRest API reference.

summarize_text_to_file(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Summarize a document and return the result as a downloadable file.

convert_to_markdown(file, *, pages=None, page_break_comments=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to Markdown and return a file-based response.

ocr_pdf(file, *, languages='English', pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Perform OCR on a PDF to make text searchable and extractable.

translate_pdf_text(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Translate the textual content of a PDF, Markdown, or text document and return the result as inline JSON.

translate_pdf_text_to_file(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Translate textual content and receive a file-based response.

extract_images(file, *, pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Extract embedded images from a PDF.

extract_pdf_text(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Extract text content from a PDF and return parsed JSON results.

extract_pdf_text_to_file(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Extract text content from a PDF and return a file-based response.

preview_redactions(file, *, redactions, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Generate a PDF redaction preview with annotated redaction rectangles.

apply_redactions(file, *, rgb_color=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Apply previously previewed redactions and return the final redacted PDF.

add_text_to_pdf(file, *, text_objects, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Insert one or more text blocks into a PDF.

add_image_to_pdf(file, *, image, x, y, page, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Insert an image into a single page of a PDF.

split_pdf(file, *, page_groups=None, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Split a PDF into one or more PDF files based on the provided page groups.

merge_pdfs(sources, *, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Merge multiple PDFs (or page subsets) into a single PDF file.

zip_files(files, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Compress one or more files into a zip archive.

unzip_file(file, *, password=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Extract files from a zip archive.

convert_to_excel(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to an Excel spreadsheet.

convert_to_powerpoint(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to a PowerPoint presentation.

convert_xfa_to_acroforms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert an XFA PDF to an AcroForm-enabled PDF.

convert_to_word(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to a Word document.

import_form_data(file, data_file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Import form data from a data file into an existing PDF with form fields.

export_form_data(file, *, data_format, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Export form data from a PDF into an external data file.

data_format support depends on the detected form type:

- AcroForm PDFs: xfdf, fdf, xml
- XFA PDFs: xfd, xdp, xml

flatten_pdf_forms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Flatten form fields in a PDF so they are no longer editable.

add_permissions_password(file, *, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Add a permissions password and optional restrictions to a PDF.

change_permissions_password(file, *, current_permissions_password, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Rotate the permissions password and optionally update restrictions.

add_open_password(file, *, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Encrypt a PDF with a new open password.

change_open_password(file, *, current_open_password, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Rotate the open password for an encrypted PDF.

remove_open_password(file, *, current_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Decrypt a PDF by removing its open password.

remove_permissions_password(file, *, current_permissions_password, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Remove permissions restrictions from a PDF.

compress_pdf(file, *, compression_level, profile=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Compress a PDF using preset or custom compression profiles.

add_attachment_to_pdf(file, *, attachment, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Attach an uploaded file to a PDF.

blank_pdf(*, page_size='letter', page_count=1, page_orientation=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Create a blank PDF with configurable size, count, and orientation.

flatten_transparencies(file, *, output=None, quality='medium', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Flatten transparent objects in a PDF.

linearize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Linearize a PDF for optimized fast web view.

flatten_annotations(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Flatten annotations into the PDF content.

flatten_layers(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Flatten all layers in a PDF into a single layer.

rasterize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Rasterize a PDF into a flattened bitmap-based PDF.

convert_office_to_pdf(file, *, output=None, compression='lossy', downsample=300, tagged_pdf=False, locale=None, page_size=None, page_margin=None, page_orientation=None, web_layout=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a Microsoft Office file (Word, Excel, PowerPoint) to PDF.

convert_postscript_to_pdf(file, *, output=None, compression='lossy', downsample=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PostScript or EPS file to PDF.

convert_email_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert an RFC822 email file to PDF.

convert_image_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a supported image file to PDF.

convert_html_to_pdf(file, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert an uploaded HTML file to PDF.

convert_url_to_pdf(url, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert HTML content from one URL to PDF.

convert_to_pdfa(file, *, output_type, output=None, rasterize_if_errors_encountered=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to a specified PDF/A version.

convert_to_pdfx(file, *, output_type, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert a PDF to a specified PDF/X version.

convert_to_png(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert one or more pdfRest files to PNG images.

convert_to_bmp(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert one or more pdfRest files to BMP images.

convert_to_gif(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert one or more pdfRest files to GIF images.

convert_to_jpeg(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', jpeg_quality=75, extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert one or more pdfRest files to JPEG images.

convert_to_tiff(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)

Convert one or more pdfRest files to TIFF images.

PdfRestApiError

Bases: PdfRestError

Raised when the pdfRest API returns an unsuccessful response.

PdfRestAuthenticationError

Bases: PdfRestApiError

Raised when authentication with the pdfRest API fails.

PdfRestConfigurationError

Bases: PdfRestError

Raised when the client is misconfigured (for example, missing API key).

PdfRestDeleteError

Bases: PdfRestError

Raised when an individual file cannot be deleted.

PdfRestError

Bases: Exception

Base exception for all pdfrest client errors.

PdfRestErrorGroup

Bases: ExceptionGroup

Group of PdfRestError exceptions produced by the pdfrest library.

PdfRestRequestError

Bases: PdfRestError

Raised when the request fails before receiving a response.

PdfRestTimeoutError

Bases: PdfRestError

Raised when a request to pdfRest exceeds the configured timeout.

PdfRestTransportError

Bases: PdfRestError

Raised when a transport-level error occurs while communicating with pdfRest.
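The base classes listed for these exceptions imply an order for `except` clauses: the most specific class must come first, or a broader handler will catch it instead. The sketch below defines local stand-in classes that mirror the documented hierarchy so it runs without the package. In real code the classes would be imported from `pdfrest`. The `describe` helper and its messages are hypothetical.

```python
# Local stand-ins mirroring the documented base classes.
class PdfRestError(Exception):
    pass


class PdfRestApiError(PdfRestError):
    pass


class PdfRestAuthenticationError(PdfRestApiError):
    pass


class PdfRestTimeoutError(PdfRestError):
    pass


def describe(exc):
    """Map an exception to a short hint, checking subclasses first."""
    try:
        raise exc
    except PdfRestAuthenticationError:
        return "check the API key"
    except PdfRestApiError:
        return "the API rejected the request"
    except PdfRestTimeoutError:
        return "retry later or raise the timeout"
    except PdfRestError:
        return "other client error"
```

If the `PdfRestApiError` clause came first, an authentication failure would never reach its own handler.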

UpResponse

Bases: BaseModel

Response payload returned by the /up health endpoint.

translate_httpx_error(exc)

Convert an httpx exception into a library-specific exception.
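The mapping this function actually applies is not given in this reference. As a sketch of the general pattern, the example below translates exceptions through an ordered table of type pairs. It uses Python built-in exceptions as stand-ins for httpx errors and local stand-ins for the library classes. The table contents and the name `translate_error` are assumptions made for illustration only.

```python
# Local stand-ins for the library's exception classes.
class PdfRestError(Exception):
    pass


class PdfRestTimeoutError(PdfRestError):
    pass


class PdfRestTransportError(PdfRestError):
    pass


# Hypothetical mapping; checked in order, so list specific types first.
_TRANSLATIONS = (
    (TimeoutError, PdfRestTimeoutError),
    (ConnectionError, PdfRestTransportError),
)


def translate_error(exc):
    """Return a library-style exception that wraps the original error."""
    for source_type, target_type in _TRANSLATIONS:
        if isinstance(exc, source_type):
            translated = target_type(str(exc))
            break
    else:
        # No table entry matched: fall back to the base class.
        translated = PdfRestError(str(exc))
    # Keep the original exception reachable for debugging.
    translated.__cause__ = exc
    return translated
```

Setting `__cause__` has the same effect as `raise ... from exc`, so the original error still appears in tracebacks.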