Enables text extraction from PDFs through a Document Loader plugin backed by the PDF Parser PHP library. Other Drupal modules can then register and use PDF parsing in their document processing workflows.

Features

  • Extracts text from PDFs for use through Document Loader
  • Minimal dependencies: pure PHP, with no external web service required
  • Retrieves metadata from the PDF (page count, author, etc.)

Available Inputs

  • PDF: A document referenced by a file URI

Available Outputs

  • Text: Plain text content from the PDF
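
To make the input and output above concrete, here is a minimal sketch of what text and metadata extraction could look like. It assumes that "PDF Parser" refers to the smalot/pdfparser Composer package and that Composer's autoloader is already loaded, as it is inside Drupal. The function name example_pdf_summary() is hypothetical and is not part of this module or of Document Loader.

```php
<?php

use Smalot\PdfParser\Parser;

/**
 * Hypothetical helper: returns plain text and basic metadata for a PDF.
 *
 * Assumes the smalot/pdfparser library is installed and autoloadable.
 *
 * @param string $path
 *   Local filesystem path to the PDF.
 *
 * @return array
 *   Keys: 'text' (string), 'page_count' (int), 'author' (string or NULL).
 */
function example_pdf_summary(string $path): array {
  $parser = new Parser();
  // parseFile() throws an exception if the file cannot be read or parsed.
  $pdf = $parser->parseFile($path);
  // getDetails() returns the document metadata as an associative array.
  $details = $pdf->getDetails();

  return [
    'text' => $pdf->getText(),
    'page_count' => count($pdf->getPages()),
    // Not every PDF sets an author, so fall back to NULL.
    'author' => $details['Author'] ?? NULL,
  ];
}
```

In a Drupal context, a stream wrapper URI such as public://report.pdf would first need to be resolved to a local path, for example with \Drupal::service('file_system')->realpath($uri). How this module performs that step internally is not documented here, so treat the sketch as an illustration of the underlying library only.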

Post-Installation

Visit the Document Loader configuration page to confirm that PDF Parser is listed as available.

Additional Requirements

Install with Composer to ensure you have all the required dependencies:

composer require drupal/document_loader_pdfparser

None.

Similar projects
