Compression
Streaming zlib compression and decompression utilities supporting GZIP, ZLIB, and DEFLATE formats.
All compression operates in streaming mode using zero-copy ByteView chunks.
Note
The default gzip level used by the writer pipeline (dftracer_aggregator,
dftracer_organize, parallel writers) is 1 (fastest); previous
releases defaulted to Z_DEFAULT_COMPRESSION (which zlib maps to level 6).
Override it per call with the compression_level field on
ManualStreamingCompressorUtility.
Note
The build defaults to zlib-ng (compat ABI) when the DFTRACER_USE_ZLIB_NG
CMake option is ON (the default), falling back to madler/zlib if
zlib-ng cannot be added. The compressor sources are unchanged: the same
deflate/inflate symbols are linked against whichever backend was
selected at configure time.
#include <dftracer/utils/utilities/compression/zlib/streaming_compressor_utility.h>
#include <dftracer/utils/utilities/compression/zlib/streaming_decompressor_utility.h>
#include <dftracer/utils/utilities/compression/zlib/types.h>
Types
// Compression format
enum class CompressionFormat : int32_t {
    DEFLATE_RAW = -15,      // Raw deflate (no header/trailer)
    ZLIB        = 15,       // zlib format
    GZIP        = 15 + 16,  // gzip format (default)
    AUTO        = 15 + 32,  // Auto-detect gzip/zlib
};

// Decompression format
enum class DecompressionFormat : int32_t {
    DEFLATE_RAW = -15,
    ZLIB        = 15,
    GZIP        = 15 + 16,
    AUTO        = 15 + 32   // Auto-detect (default)
};
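The enum values above appear to follow zlib's windowBits convention: a negative value selects raw deflate, adding 16 selects the gzip wrapper, and adding 32 requests header auto-detection. The sketch below decodes a value under that assumption. `Wrapper`, `WindowBits`, and `decode_window_bits` are hypothetical names used only for illustration; they are not part of the dftracer API.

```cpp
#include <cassert>
#include <cstdint>

// Copied from the Types listing so the sketch is self-contained.
enum class CompressionFormat : std::int32_t {
    DEFLATE_RAW = -15,
    ZLIB        = 15,
    GZIP        = 15 + 16,
    AUTO        = 15 + 32,
};

// Hypothetical helper types (not part of dftracer).
enum class Wrapper { Raw, Zlib, Gzip, AutoDetect };

struct WindowBits {
    Wrapper wrapper;   // framing implied by the value
    int log2_window;   // base-2 logarithm of the LZ77 window size
};

// Decode a windowBits-style integer, assuming zlib's encoding:
//   -15..-8 raw deflate, 8..15 zlib, 24..31 gzip, 40..47 auto-detect.
inline WindowBits decode_window_bits(std::int32_t bits) {
    assert(bits >= -15 && bits <= 47);
    if (bits < 0)        return {Wrapper::Raw, -bits};
    if (bits >= 32 + 8)  return {Wrapper::AutoDetect, bits - 32};
    if (bits >= 16 + 8)  return {Wrapper::Gzip, bits - 16};
    return {Wrapper::Zlib, bits};
}

inline WindowBits decode_window_bits(CompressionFormat format) {
    return decode_window_bits(static_cast<std::int32_t>(format));
}
```

Under this reading, every format in the listing uses the maximum 32 KiB window (log2 = 15) and the formats differ only in framing.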
ManualStreamingCompressorUtility
Zero-copy streaming compression using AsyncGenerator<ByteView>.
Yields compressed chunks as ByteView references into an internal buffer.
// The snippet uses co_await, so it must run inside a coroutine.
ManualStreamingCompressorUtility compressor(Z_DEFAULT_COMPRESSION,
                                            CompressionFormat::GZIP);

// Compress input chunks, yielding zero-copy ByteView results
auto gen = compressor.compress(input_bytes);
while (auto view = co_await gen.next()) {
    co_await writer.process(*view);
}

// Finalize to flush remaining compressed data
auto fin = compressor.finalize_stream();
while (auto view = co_await fin.next()) {
    co_await writer.process(*view);
}

// Query statistics
std::size_t bytes_in = compressor.total_bytes_in();
std::size_t bytes_out = compressor.total_bytes_out();
double ratio = compressor.compression_ratio();
Buffered Compression
Writer pipelines (parallel writer, perfetto trace writer, organize group
writers) buffer compressed payloads and flush at a configurable
flush_threshold. The threshold is computed by
compute_writer_sizing() from the detected filesystem layout
(LayoutInfo): on Lustre/GPFS the threshold is sized to the PFS stripe
so each compressed flush fits one stripe; on local FS it is max(default,
stripe_size). Buffer capacity is always flush_threshold +
buffer_headroom.
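The sizing rule described above can be sketched as follows. This is an illustration only: `FsKind`, `LayoutSketch`, `WriterSizingSketch`, and `sketch_writer_sizing` are hypothetical stand-ins, and the real LayoutInfo and compute_writer_sizing() may have different fields and signatures. The sketch also assumes that "sized to the PFS stripe" means the threshold equals the stripe size, and that an undetected stripe size (0) falls back to the default.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the detected filesystem layout.
enum class FsKind { Local, Lustre, Gpfs };

struct LayoutSketch {
    FsKind kind;
    std::size_t stripe_size;  // bytes; 0 means "not detected" (assumption)
};

struct WriterSizingSketch {
    std::size_t flush_threshold;  // flush once this many compressed bytes are buffered
    std::size_t buffer_capacity;  // always flush_threshold + buffer_headroom
};

inline WriterSizingSketch sketch_writer_sizing(const LayoutSketch& layout,
                                               std::size_t default_threshold,
                                               std::size_t buffer_headroom) {
    assert(default_threshold > 0);
    const bool is_pfs =
        layout.kind == FsKind::Lustre || layout.kind == FsKind::Gpfs;

    std::size_t threshold;
    if (is_pfs && layout.stripe_size > 0) {
        // Parallel filesystem: one compressed flush fits one stripe.
        threshold = layout.stripe_size;
    } else {
        // Local filesystem (or unknown stripe): max(default, stripe_size).
        threshold = std::max(default_threshold, layout.stripe_size);
    }
    return {threshold, threshold + buffer_headroom};
}
```

For example, with a 1 MiB Lustre stripe, a 4 MiB default, and 64 KiB of headroom, the sketch yields a 1 MiB threshold and a buffer of 1 MiB + 64 KiB; the same inputs on a local filesystem yield a 4 MiB threshold.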
StreamingDecompressorUtility
Streaming decompression with zlib stream reuse and lazy initialization.
StreamingDecompressorUtility decompressor;

for (const auto& compressed_chunk : compressed_data) {
    auto decompressed = decompressor.process(compressed_chunk);
    for (const auto& chunk : decompressed) {
        process(chunk);
    }
}
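DecompressionFormat::AUTO relies on the first bytes of the stream to tell gzip from zlib. The sketch below shows what such a check looks like, based on the header layouts in RFC 1952 (gzip) and RFC 1950 (zlib). It is not dftracer code: `DetectedWrapper` and `sniff_wrapper` are hypothetical names, and in practice the detection is left to the zlib backend.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type (not part of dftracer).
enum class DetectedWrapper { Gzip, Zlib, Unknown };

// Inspect the first two bytes of a compressed stream.
inline DetectedWrapper sniff_wrapper(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size < 2) return DetectedWrapper::Unknown;

    // gzip (RFC 1952): magic bytes 0x1f 0x8b.
    if (data[0] == 0x1f && data[1] == 0x8b) return DetectedWrapper::Gzip;

    // zlib (RFC 1950): CMF low nibble is the method (8 = deflate),
    // high nibble is CINFO (<= 7), and the 16-bit value CMF*256 + FLG
    // must be a multiple of 31.
    const unsigned cmf = data[0];
    const unsigned flg = data[1];
    const bool method_ok = (cmf & 0x0fu) == 8u;
    const bool window_ok = (cmf >> 4) <= 7u;
    const bool check_ok  = ((cmf << 8) | flg) % 31u == 0u;
    if (method_ok && window_ok && check_ok) return DetectedWrapper::Zlib;

    return DetectedWrapper::Unknown;
}
```

A raw deflate stream has no header, so it cannot be recognized this way; that is consistent with AUTO covering only gzip and zlib while DEFLATE_RAW has to be requested explicitly.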