Document Loader provides a consistent, plugin-driven API for ingesting documents from multiple sources and normalizing them into reusable output formats. It standardizes discovery, configuration, and execution of document loaders so other modules can focus on their own features instead of wiring bespoke ingestion logic.

Features

  • Attribute-based plugins for registering loaders and loader types
  • Runtime discovery via Drupal plugin managers with cache support
  • Configurable defaults that map loader types to concrete loader plugins
  • Common input/output interfaces to keep transport details decoupled
  • Reusable input/output types for standard formats (JSON, CSV, Markdown, etc.)
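
The pattern behind these features can be sketched in a few lines. The example below is a minimal, framework-independent illustration written in plain Python rather than Drupal PHP, because this page does not show the module's real classes. Every name in it (LoaderInput, LoaderOutput, loader, PlainTextLoader, LoaderManager) is hypothetical and is not part of the Document Loader API. A decorator stands in for attribute-based registration, a manager caches discovered definitions, and a defaults map resolves a loader type to a concrete loader plugin.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# All names here are hypothetical; none come from the Document Loader module.


@dataclass
class LoaderInput:
    source: str  # e.g. a file path or URL


@dataclass
class LoaderOutput:
    format: str  # e.g. "markdown" or "json"
    content: str


# Registry filled at import time, standing in for attribute-based discovery.
_REGISTRY: Dict[str, dict] = {}


def loader(plugin_id: str, loader_type: str):
    """Register a loader class under an id and the loader type it serves."""
    def decorate(cls):
        _REGISTRY[plugin_id] = {"class": cls, "type": loader_type}
        return cls
    return decorate


@loader(plugin_id="plain_text", loader_type="text")
class PlainTextLoader:
    def load(self, data: LoaderInput) -> LoaderOutput:
        # A real loader would read data.source; this one only echoes it.
        return LoaderOutput(format="markdown", content=f"Loaded {data.source}")


class LoaderManager:
    def __init__(self, defaults: Dict[str, str]):
        self.defaults = defaults  # maps loader type -> plugin id
        self._definitions: Optional[Dict[str, dict]] = None  # discovery cache

    def definitions(self) -> Dict[str, dict]:
        # Discover once, then serve the cached copy on later calls.
        if self._definitions is None:
            self._definitions = dict(_REGISTRY)
        return self._definitions

    def load(self, loader_type: str, data: LoaderInput) -> LoaderOutput:
        plugin_id = self.defaults[loader_type]
        definition = self.definitions()[plugin_id]
        return definition["class"]().load(data)


manager = LoaderManager(defaults={"text": "plain_text"})
result = manager.load("text", LoaderInput(source="notes.txt"))
print(result.format, "|", result.content)
```

The calling code only names a loader type and passes a common input object, so swapping the concrete loader is a change to the defaults map rather than to the caller.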

Recommended Modules

Module                     Document Types
PDF Parser                 PDF
AI File To Text            Word, Spreadsheet, CSV, Text, Markdown
Plugin API                 HTTP/HTTPS API Calls
Webpage                    Remote Web Pages
AI Simple PDF To Text      Deprecated by AI File To Text
Document Loader: Parquet   Parquet