High-performance streaming file processor.
Provides constant-memory file processing for workloads from 1,000 to more than 50,000 files. All I/O uses fixed-size buffers, so memory usage does not grow with file size or transaction count.
§Performance targets
- Time to first result: < 2 ms
- Throughput: >= 50,000 files/second
- Memory: O(1) per file, achieved by streaming
§Architecture
Files are processed through a pipeline of StreamProcessor stages.
Each stage reads from a buffered input, transforms in a fixed-size
buffer, and writes to a buffered output. No file is ever fully loaded
into memory unless it fits within the buffer size.
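The read, transform, and write cycle described above can be sketched as follows. This is a minimal illustration, not the module's actual `StreamProcessor` implementation: the function `run_stage`, its closure-based `transform` parameter, and the constant `SKETCH_BUFFER_SIZE` are hypothetical names chosen for the example, and the 8 KB size simply mirrors the documented default.

```rust
use std::io::{self, Read, Write};

/// Hypothetical buffer size for this sketch (8 KB, matching the documented default).
const SKETCH_BUFFER_SIZE: usize = 8 * 1024;

/// Streams `input` to `output` through one fixed-size buffer, applying
/// `transform` in place to each chunk. Returns the number of bytes written.
/// Memory use is bounded by the buffer, regardless of input length.
fn run_stage<R: Read, W: Write>(
    mut input: R,
    mut output: W,
    mut transform: impl FnMut(&mut [u8]),
) -> io::Result<u64> {
    let mut buf = [0u8; SKETCH_BUFFER_SIZE];
    let mut total = 0u64;
    loop {
        let n = match input.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        transform(&mut buf[..n]);
        output.write_all(&buf[..n])?;
        total += n as u64;
    }
    output.flush()?;
    Ok(total)
}
```

Because the sketch is generic over `Read` and `Write`, stages could in principle be chained by wrapping files in `BufReader` and `BufWriter`; whether the real pipeline does so is not stated here.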
Structs§
- BatchResult - Result of processing a batch of files.
Constants§
- MAX_BATCH_SIZE - Maximum number of files to process in a single batch. Bounds memory for directory listings per Power of Ten Rule 2.
- STREAM_BUFFER_SIZE - Default buffer size for streaming I/O (8 KB). Aligned to typical filesystem block size for optimal throughput.
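One way a batch limit can bound the memory used for a directory listing is to pull at most a fixed number of paths from an iterator at a time. The sketch below assumes this approach; the function `next_batch` and the constant `SKETCH_MAX_BATCH_SIZE` are hypothetical, and the value 1,000 is a placeholder, since the actual value of `MAX_BATCH_SIZE` is not given in this page.

```rust
use std::path::PathBuf;

/// Placeholder limit for this sketch; the real `MAX_BATCH_SIZE` value is not shown here.
const SKETCH_MAX_BATCH_SIZE: usize = 1_000;

/// Pulls at most `SKETCH_MAX_BATCH_SIZE` paths from `entries`.
/// Remaining paths stay in the iterator for the next call, so the
/// in-memory listing never exceeds the fixed upper bound.
fn next_batch<I: Iterator<Item = PathBuf>>(entries: &mut I) -> Vec<PathBuf> {
    entries.by_ref().take(SKETCH_MAX_BATCH_SIZE).collect()
}
```

Calling `next_batch` repeatedly until it returns an empty vector walks the whole listing with a fixed loop bound per batch.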
Functions§
- benchmark_throughput - Returns the throughput of a no-op pipeline to measure overhead.
- process_batch - Processes a batch of files through a streaming pipeline.
- stream_copy - Copies a single file using buffered streaming I/O.
- stream_hash - Hashes a file using streaming I/O with constant memory.
- stream_lines - Processes a file by reading line-by-line with constant memory.
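To illustrate how a hash can be computed with constant memory, the sketch below feeds fixed-size chunks into a running 64-bit FNV-1a state. The hash algorithm used by `stream_hash` is not stated in this page, so FNV-1a is an assumption made for the example, and `hash_reader` and `fnv1a_update` are hypothetical names.

```rust
use std::io::{self, Read};

/// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds `bytes` into a running FNV-1a state and returns the new state.
fn fnv1a_update(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hashes everything readable from `reader` using one 8 KB buffer.
/// Only the buffer and the 8-byte state are held in memory.
fn hash_reader<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; 8 * 1024];
    let mut hash = FNV_OFFSET_BASIS;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(hash), // end of input
            Ok(n) => hash = fnv1a_update(hash, &buf[..n]),
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Because the state is updated byte by byte, the result does not depend on where the chunk boundaries fall, which is the property that makes streaming safe for this kind of hash.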