A document transformation service that uses LangChain's document loaders to convert files into structured text content. It supports PDF, HTML, CSV, and plain-text input, each handled by a specialized loader.

| Property | Value |
|---|---|
| Service Name | Langchain transformer |
| Status | Enabled |
| Compatible Nodes | Transform Content |

| Input | Type | Required | Description |
|---|---|---|---|
| Content | Content selection / Import Set | Yes | Documents to transform (PDF, HTML, CSV, text files) |

| Parameter | Type | Required | Description |
|---|---|---|---|
| Transform source | Choice | Yes | File type: "PDF", "HTML", "CSV", "Text" |

| Service | Description | Best For |
|---|---|---|
| PyMuPDF | Fast, accurate PDF text extraction | Most PDFs, especially with complex layouts |
| PDFMiner | Detailed text positioning and layout analysis | PDFs requiring precise text location |
| PyPDF | Simple, lightweight PDF reader | Basic text extraction from simple PDFs |
| PDFPlumber | Table and structured data extraction | PDFs with tables and forms |
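
The PDF services in the table above correspond to loader classes in LangChain's community package. The following is a minimal sketch of how a service name could be resolved to such a loader; it is not the service's own implementation. It assumes `langchain_community` and the matching PDF backend are installed, and the names `PDF_LOADERS`, `loader_class_name`, and `load_pdf` are hypothetical helpers introduced only for illustration.

```python
import importlib

# Service names from the table, mapped to loader class names in
# langchain_community.document_loaders.
PDF_LOADERS = {
    "PyMuPDF": "PyMuPDFLoader",
    "PDFMiner": "PDFMinerLoader",
    "PyPDF": "PyPDFLoader",
    "PDFPlumber": "PDFPlumberLoader",
}


def loader_class_name(service: str) -> str:
    """Return the loader class name for a PDF service name (hypothetical helper)."""
    try:
        return PDF_LOADERS[service]
    except KeyError:
        known = ", ".join(sorted(PDF_LOADERS))
        raise ValueError(f"Unknown PDF service {service!r}; expected one of: {known}") from None


def load_pdf(path: str, service: str = "PyMuPDF"):
    """Load a PDF with the chosen service and return a list of Document objects.

    The import is deferred so the mapping can be inspected without
    langchain_community being installed.
    """
    module = importlib.import_module("langchain_community.document_loaders")
    loader_cls = getattr(module, loader_class_name(service))
    return loader_cls(path).load()
```

Calling `load_pdf("report.pdf", service="PDFPlumber")` would, under these assumptions, return one `Document` per page with the extracted text in `page_content`. Each loader needs its own backend package installed, for example `pymupdf` for PyMuPDF or `pdfplumber` for PDFPlumber.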