API reference
Package
pdfrest
Top-level package for the pdfrest client library.
AsyncPdfRestClient
Bases: _AsyncApiClient
Asynchronous client for interacting with the pdfRest API.
__init__(*, api_key=None, base_url=None, timeout=None, headers=None, http_client=None, transport=None, concurrency_limit=DEFAULT_FILE_INFO_CONCURRENCY, max_retries=DEFAULT_MAX_RETRIES)
Create an asynchronous pdfRest client.
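The `concurrency_limit` argument suggests that the asynchronous client caps how many operations run at once, but this reference does not say how. The following is a minimal sketch of the general technique using `asyncio.Semaphore`, not the library's implementation. `gather_limited`, `fetch_info`, and `demo` are hypothetical names introduced only for illustration.

```python
import asyncio


async def gather_limited(coros, limit):
    """Run coroutines concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        # Each coroutine waits for a free slot before it starts.
        async with semaphore:
            return await coro

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(run(c) for c in coros))


async def fetch_info(file_id, state):
    """Hypothetical stand-in for a network call; records peak concurrency."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # simulate I/O latency
    state["active"] -= 1
    return f"info:{file_id}"


def demo(limit=2, count=5):
    """Run `count` fake calls under `limit` and report the observed peak."""
    state = {"active": 0, "peak": 0}
    results = asyncio.run(
        gather_limited([fetch_info(i, state) for i in range(count)], limit)
    )
    return results, state["peak"]
```

Calling `demo(limit=2, count=5)` returns the five results in input order, with the recorded peak never exceeding the limit.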
query_pdf_info(file, *, queries=ALL_PDF_INFO_QUERIES, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Query pdfRest for metadata describing a PDF document asynchronously.
summarize_text(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Summarize the textual content of a PDF, Markdown, or text document.
Always requests JSON output and returns the inline summary response defined in the pdfRest API reference.
summarize_text_to_file(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Summarize a document and return the result as a downloadable file.
convert_to_markdown(file, *, pages=None, page_break_comments=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Convert a PDF to Markdown and return a file-based response.
ocr_pdf(file, *, languages='English', pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Perform OCR on a PDF to make text searchable and extractable.
translate_pdf_text(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Translate the textual content of a PDF, Markdown, or text document and return a JSON response.
translate_pdf_text_to_file(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Translate textual content and receive a file-based response.
extract_images(file, *, pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Extract embedded images from a PDF.
extract_pdf_text(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Extract text content from a PDF and return parsed JSON results.
extract_pdf_text_to_file(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Extract text content from a PDF and return a file-based response.
preview_redactions(file, *, redactions, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously generate a PDF redaction preview.
apply_redactions(file, *, rgb_color=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously apply PDF redactions.
add_text_to_pdf(file, *, text_objects, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously insert text blocks into a PDF.
add_image_to_pdf(file, *, image, x, y, page, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously insert an image into a PDF.
up(*, extra_headers=None, extra_query=None, extra_body=None, timeout=None)
async
Call the /up health endpoint asynchronously and return server metadata.
split_pdf(file, *, page_groups=None, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously split a PDF into one or more PDF files.
merge_pdfs(sources, *, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously merge multiple PDFs (or page subsets) into a single PDF.
zip_files(files, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously compress one or more files into a zip archive.
unzip_file(file, *, password=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously extract files from a zip archive.
convert_to_excel(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PDF to an Excel spreadsheet.
convert_to_powerpoint(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PDF to a PowerPoint presentation.
convert_xfa_to_acroforms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert an XFA PDF to an AcroForm-enabled PDF.
convert_to_word(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PDF to a Word document.
import_form_data(file, data_file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously import form data from a data file into a PDF.
export_form_data(file, *, data_format, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously export form data from a PDF into a data file.
data_format support depends on detected form type:
- AcroForm PDFs: xfdf, fdf, xml
- XFA PDFs: xfd, xdp, xml
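Because the accepted `data_format` values differ by form type, it can help to validate the choice before making a request. The sketch below encodes only the two lists given above. The `"acroform"` and `"xfa"` keys and the `check_export_format` helper are hypothetical and not part of the library. How the form type is detected is outside the scope of this sketch.

```python
# Formats listed in the reference above, keyed by a hypothetical form-type label.
SUPPORTED_EXPORT_FORMATS = {
    "acroform": ("xfdf", "fdf", "xml"),
    "xfa": ("xfd", "xdp", "xml"),
}


def check_export_format(form_type, data_format):
    """Return the normalized format, or raise ValueError if unsupported."""
    kind = form_type.strip().lower()
    fmt = data_format.strip().lower()
    if kind not in SUPPORTED_EXPORT_FORMATS:
        raise ValueError(f"unknown form type: {form_type!r}")
    allowed = SUPPORTED_EXPORT_FORMATS[kind]
    if fmt not in allowed:
        raise ValueError(
            f"{fmt!r} is not supported for {kind} forms; choose one of {allowed}"
        )
    return fmt
```

The returned string could then be passed as the `data_format` argument of `export_form_data`.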
flatten_pdf_forms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously flatten form fields in a PDF.
add_permissions_password(file, *, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously add a permissions password and optional restrictions to a PDF.
change_permissions_password(file, *, current_permissions_password, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously rotate the permissions password and optionally update restrictions.
remove_permissions_password(file, *, current_permissions_password, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously remove permissions restrictions from a PDF.
add_open_password(file, *, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously encrypt a PDF with a new open password.
change_open_password(file, *, current_open_password, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously rotate the open password for an encrypted PDF.
remove_open_password(file, *, current_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously decrypt a PDF by removing its open password.
compress_pdf(file, *, compression_level, profile=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously compress a PDF.
add_attachment_to_pdf(file, *, attachment, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously attach an uploaded file to a PDF.
blank_pdf(*, page_size='letter', page_count=1, page_orientation=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously create a blank PDF with configurable size, count, and orientation.
flatten_transparencies(file, *, output=None, quality='medium', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously flatten transparent objects in a PDF.
linearize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously linearize a PDF for optimized fast web view.
flatten_annotations(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously flatten annotations into the PDF content.
flatten_layers(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously flatten all layers in a PDF.
rasterize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously rasterize a PDF into a flattened bitmap-based PDF.
convert_office_to_pdf(file, *, output=None, compression='lossy', downsample=300, tagged_pdf=False, locale=None, page_size=None, page_margin=None, page_orientation=None, web_layout=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a Microsoft Office file to PDF.
convert_postscript_to_pdf(file, *, output=None, compression='lossy', downsample=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PostScript or EPS file to PDF.
convert_email_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert an RFC822 email file to PDF.
convert_image_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a supported image file to PDF.
convert_html_to_pdf(file, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert an uploaded HTML file to PDF.
convert_url_to_pdf(url, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert HTML content from one URL to PDF.
convert_to_pdfa(file, *, output_type, output=None, rasterize_if_errors_encountered=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PDF to a specified PDF/A version.
convert_to_pdfx(file, *, output_type, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert a PDF to a specified PDF/X version.
convert_to_png(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert one or more pdfRest files to PNG images.
convert_to_bmp(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert one or more pdfRest files to BMP images.
convert_to_gif(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert one or more pdfRest files to GIF images.
convert_to_jpeg(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', jpeg_quality=75, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert one or more pdfRest files to JPEG images.
convert_to_tiff(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
async
Asynchronously convert one or more pdfRest files to TIFF images.
PdfRestClient
Bases: _SyncApiClient
Synchronous client for interacting with the pdfRest API.
__init__(*, api_key=None, base_url=None, timeout=None, headers=None, http_client=None, transport=None, max_retries=DEFAULT_MAX_RETRIES)
Create a synchronous pdfRest client.
up(*, extra_headers=None, extra_query=None, extra_body=None, timeout=None)
Call the /up health endpoint and return server metadata.
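A health check is a natural first call when wiring up a client. The sketch below wraps `up()` in a small helper that reports success or failure instead of raising. It relies only on the keyword-only signature documented above. `service_is_up` and `FakeClient` are hypothetical, and the placeholder payload is not the real `UpResponse`.

```python
def service_is_up(client, timeout=None):
    """Return (True, payload) if client.up() succeeds, else (False, error)."""
    try:
        payload = client.up(timeout=timeout)
    except Exception as exc:  # broad on purpose: this sketch does not import pdfrest
        return False, exc
    return True, payload


class FakeClient:
    """Hypothetical stand-in exposing only the documented up() signature."""

    def __init__(self, fail=False):
        self.fail = fail

    def up(self, *, extra_headers=None, extra_query=None, extra_body=None, timeout=None):
        if self.fail:
            raise RuntimeError("service unreachable")
        return {"status": "OK"}  # placeholder payload, not the real UpResponse
```

In real code the helper would receive a `PdfRestClient` instance, and the `except` clause could be narrowed to `PdfRestError`.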
query_pdf_info(file, *, queries=ALL_PDF_INFO_QUERIES, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Query pdfRest for metadata describing a PDF document.
summarize_text(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Summarize the textual content of a PDF, Markdown, or text document.
Always requests JSON output and returns the inline summary response defined in the pdfRest API reference.
summarize_text_to_file(file, *, target_word_count=400, summary_format='overview', pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Summarize a document and return the result as a downloadable file.
convert_to_markdown(file, *, pages=None, page_break_comments=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to Markdown and return a file-based response.
ocr_pdf(file, *, languages='English', pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Perform OCR on a PDF to make text searchable and extractable.
translate_pdf_text(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Translate the textual content of a PDF, Markdown, or text document and return a JSON response.
translate_pdf_text_to_file(file, *, output_language, pages=None, output_format='markdown', output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Translate textual content and receive a file-based response.
extract_images(file, *, pages=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Extract embedded images from a PDF.
extract_pdf_text(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Extract text content from a PDF and return parsed JSON results.
extract_pdf_text_to_file(file, *, pages=None, full_text='document', preserve_line_breaks=False, word_style=False, word_coordinates=False, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Extract text content from a PDF and return a file-based response.
preview_redactions(file, *, redactions, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Generate a PDF redaction preview with annotated redaction rectangles.
apply_redactions(file, *, rgb_color=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Apply previously previewed redactions and return the final redacted PDF.
add_text_to_pdf(file, *, text_objects, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Insert one or more text blocks into a PDF.
add_image_to_pdf(file, *, image, x, y, page, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Insert an image into a single page of a PDF.
split_pdf(file, *, page_groups=None, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Split a PDF into one or more PDF files based on the provided page groups.
merge_pdfs(sources, *, output_prefix=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Merge multiple PDFs (or page subsets) into a single PDF file.
zip_files(files, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Compress one or more files into a zip archive.
unzip_file(file, *, password=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Extract files from a zip archive.
convert_to_excel(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to an Excel spreadsheet.
convert_to_powerpoint(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to a PowerPoint presentation.
convert_xfa_to_acroforms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert an XFA PDF to an AcroForm-enabled PDF.
convert_to_word(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to a Word document.
import_form_data(file, data_file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Import form data from a data file into an existing PDF with form fields.
export_form_data(file, *, data_format, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Export form data from a PDF into an external data file.
data_format support depends on detected form type:
- AcroForm PDFs: xfdf, fdf, xml
- XFA PDFs: xfd, xdp, xml
flatten_pdf_forms(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Flatten form fields in a PDF so they are no longer editable.
add_permissions_password(file, *, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Add a permissions password and optional restrictions to a PDF.
change_permissions_password(file, *, current_permissions_password, new_permissions_password, restrictions=None, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Rotate the permissions password and optionally update restrictions.
add_open_password(file, *, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Encrypt a PDF with a new open password.
change_open_password(file, *, current_open_password, new_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Rotate the open password for an encrypted PDF.
remove_open_password(file, *, current_open_password, current_permissions_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Decrypt a PDF by removing its open password.
remove_permissions_password(file, *, current_permissions_password, current_open_password=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Remove permissions restrictions from a PDF.
compress_pdf(file, *, compression_level, profile=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Compress a PDF using preset or custom compression profiles.
add_attachment_to_pdf(file, *, attachment, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Attach an uploaded file to a PDF.
blank_pdf(*, page_size='letter', page_count=1, page_orientation=None, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Create a blank PDF with configurable size, count, and orientation.
flatten_transparencies(file, *, output=None, quality='medium', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Flatten transparent objects in a PDF.
linearize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Linearize a PDF for optimized fast web view.
flatten_annotations(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Flatten annotations into the PDF content.
flatten_layers(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Flatten all layers in a PDF into a single layer.
rasterize_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Rasterize a PDF into a flattened bitmap-based PDF.
convert_office_to_pdf(file, *, output=None, compression='lossy', downsample=300, tagged_pdf=False, locale=None, page_size=None, page_margin=None, page_orientation=None, web_layout=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a Microsoft Office file (Word, Excel, PowerPoint) to PDF.
convert_postscript_to_pdf(file, *, output=None, compression='lossy', downsample=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PostScript or EPS file to PDF.
convert_email_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert an RFC822 email file to PDF.
convert_image_to_pdf(file, *, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a supported image file to PDF.
convert_html_to_pdf(file, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert an uploaded HTML file to PDF.
convert_url_to_pdf(url, *, output=None, compression='lossy', downsample=300, page_size='letter', page_margin='1.0in', page_orientation='portrait', web_layout='desktop', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert HTML content from one URL to PDF.
convert_to_pdfa(file, *, output_type, output=None, rasterize_if_errors_encountered=False, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to a specified PDF/A version.
convert_to_pdfx(file, *, output_type, output=None, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert a PDF to a specified PDF/X version.
convert_to_png(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert one or more pdfRest files to PNG images.
convert_to_bmp(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert one or more pdfRest files to BMP images.
convert_to_gif(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert one or more pdfRest files to GIF images.
convert_to_jpeg(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', jpeg_quality=75, extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert one or more pdfRest files to JPEG images.
convert_to_tiff(files, *, output_prefix=None, page_range=None, resolution=300, color_model='rgb', smoothing='none', extra_query=None, extra_headers=None, extra_body=None, timeout=None)
Convert one or more pdfRest files to TIFF images.
PdfRestApiError
PdfRestAuthenticationError
PdfRestConfigurationError
Bases: PdfRestError
Raised when the client is misconfigured (for example, missing API key).
PdfRestDeleteError
PdfRestError
Bases: Exception
Base exception for all pdfrest client errors.
PdfRestErrorGroup
Bases: ExceptionGroup
Group of PdfRestError exceptions raised by the pdfrest client library.
PdfRestRequestError
Bases: PdfRestError
Raised when the request fails before receiving a response.
PdfRestTimeoutError
Bases: PdfRestError
Raised when a request to pdfRest exceeds the configured timeout.
PdfRestTransportError
Bases: PdfRestError
Raised when a transport-level error occurs while communicating with pdfRest.
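Since the exceptions above share `PdfRestError` as their base, handlers should list specific subclasses before the base class. The sketch below uses local stand-in classes that mirror only the bases documented here, so it runs without the package installed. With the real library these names would presumably be imported from `pdfrest`. `describe_failure` and its hint strings are hypothetical.

```python
class PdfRestError(Exception):
    """Stand-in for the documented base exception."""


class PdfRestConfigurationError(PdfRestError):
    """Stand-in: client misconfigured (for example, missing API key)."""


class PdfRestRequestError(PdfRestError):
    """Stand-in: request failed before a response was received."""


class PdfRestTimeoutError(PdfRestError):
    """Stand-in: request exceeded the configured timeout."""


class PdfRestTransportError(PdfRestError):
    """Stand-in: transport-level communication failure."""


def describe_failure(exc):
    """Map an exception to a short, hypothetical handling hint."""
    try:
        raise exc
    except PdfRestConfigurationError:
        return "fix configuration"
    except PdfRestTimeoutError:
        return "retry with a longer timeout"
    except (PdfRestRequestError, PdfRestTransportError):
        return "check network"
    except PdfRestError:
        # Catch-all for any other library error; must come last.
        return "generic pdfrest failure"
```

Placing the `PdfRestError` clause first would shadow every more specific handler, which is why it comes last.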
UpResponse
Bases: BaseModel
Response payload returned by the /up health endpoint.
translate_httpx_error(exc)
Convert an httpx exception into a library-specific exception.