Overview

A document transformation service that uses LangChain's document loaders to convert files of various formats into structured text content. It supports PDF, HTML, CSV, and general text sources, each handled by specialized loaders.

Service Information

| Property | Value |
| --- | --- |
| Service Name | Langchain transformer |
| Status | Enabled |
| Compatible Nodes | Transform Content |

Key Features

Inputs

| Input | Type | Required | Description |
| --- | --- | --- | --- |
| Content | Content selection / Import Set | Yes | Documents to transform (PDF, HTML, CSV, text files) |

Parameters

Transform Source Selection

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Transform source | Choice | Yes | File type: "PDF", "HTML", "CSV", "Text" |

Service Selection by Source Type

PDF Sources

| Service | Description | Best For |
| --- | --- | --- |
| PyMuPDF | Fast, accurate PDF text extraction | Most PDFs, especially with complex layouts |
| PDFMiner | Detailed text positioning and layout analysis | PDFs requiring precise text location |
| PyPDF | Simple, lightweight PDF reader | Basic text extraction from simple PDFs |
| PDFPlumber | Table and structured data extraction | PDFs with tables and forms |
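
For readers who want to reproduce the PDF options outside the service, the sketch below shows one way the four services listed above might map onto LangChain's community document loaders. The mapping itself, the `PDF_LOADERS` name, and the `load_pdf` helper are illustrative assumptions, not the service's actual implementation. It also assumes the `langchain-community` package and the relevant PDF backend (for example `pymupdf` or `pdfplumber`) are installed.

```python
import importlib

# Assumed mapping from the service names in this document to loader classes
# in langchain_community.document_loaders. The service's real wiring is not
# documented here, so treat this as a guess.
PDF_LOADERS = {
    "PyMuPDF": "PyMuPDFLoader",
    "PDFMiner": "PDFMinerLoader",
    "PyPDF": "PyPDFLoader",
    "PDFPlumber": "PDFPlumberLoader",
}


def load_pdf(path, service="PyMuPDF"):
    """Load a PDF with the chosen loader and return a list of Documents.

    Hypothetical helper: the loader module is imported lazily so that the
    service name can be validated even when LangChain is not installed.
    """
    if service not in PDF_LOADERS:
        raise ValueError(
            f"Unknown PDF service {service!r}; expected one of {sorted(PDF_LOADERS)}"
        )
    module = importlib.import_module("langchain_community.document_loaders")
    loader_cls = getattr(module, PDF_LOADERS[service])
    # Each loader takes a file path and exposes .load(), which returns
    # Document objects carrying page_content and metadata.
    return loader_cls(path).load()


# Example usage (requires a real file and the installed backend):
# docs = load_pdf("report.pdf", service="PDFPlumber")
# print(docs[0].page_content[:200])
```

Keeping the choice in a lookup table mirrors the "Service Selection by Source Type" layout: switching services changes only a string, not the calling code.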

HTML Sources