The Built-in segmenter is HyperFlow's content segmentation service. It provides format-aware segmentation methods for Text, PDF, HTML, and Markdown content, each suited to a particular document type and use case.
| Property | Value |
|---|---|
| Service Name | Built-in segmenter |
| Status | Enabled |
| Compatible Nodes | Segment Content |
| Parameter | Type | Default | Description |
|---|---|---|---|
| Segmentation source | choice | - | Content type selection (PDF, HTML, Text, or Markdown) |
| Service | choice | varies | The segmentation method for the selected content type |
Note: The text segmentation options in this service provide fast, basic text splitting. For semantically aware or recursive text splitting, consider using the Langchain segmenter.
Creates overlapping text segments of fixed size, useful for maintaining context across boundaries.
| Parameter | Type | Default | Description |
|---|---|---|---|
| Window size | number | 1024 | Size of each text window in characters |
| Overlap | number | 64 | Number of overlapping characters between windows |
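How it works: Moves a window of `Window size` characters through the text, advancing each step by the window size minus the overlap (960 characters with the defaults), so that consecutive segments share `Overlap` characters at their boundary.
Best for: General text where consistent chunk sizes matter, embedding generation, and cases where preserving context across segment boundaries is critical.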
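The windowing behavior described above can be illustrated with a minimal Python sketch. This is not HyperFlow's implementation: the function name `sliding_window_segments` and its parameter names are hypothetical, and it assumes the window advances by `window_size - overlap` characters per step, with a shorter final segment when the text does not divide evenly.

```python
def sliding_window_segments(text: str, window_size: int = 1024, overlap: int = 64) -> list[str]:
    """Split text into fixed-size character windows that overlap at their boundaries.

    Hypothetical helper for illustration only; defaults mirror the
    documented Window size (1024) and Overlap (64) parameters.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be non-negative and smaller than window_size")

    step = window_size - overlap  # assumed stride between window starts
    segments = []
    for start in range(0, len(text), step):
        segments.append(text[start:start + window_size])
        # Stop once a window has reached the end of the text, so no
        # trailing segment is emitted that contains only overlap.
        if start + window_size >= len(text):
            break
    return segments


# Small example: windows of 4 characters sharing 1 character.
example = sliding_window_segments("abcdefghij", window_size=4, overlap=1)
# example == ['abcd', 'defg', 'ghij']
```

Under these assumptions, a 3,000-character input with the default settings yields four segments starting at offsets 0, 960, 1920, and 2880, the last of which is shorter than the window size.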