Parallel¶
Namespace: dftracer::utils::utilities::fileio::parallel
-
struct LayoutInfo¶
-
class ParallelWriter¶
Parallel file writer interface. Concrete implementations (striped, sharded) hide the on-disk layout. For gzip output, the caller must feed standalone gzip members so that chunks stay valid at any offset.
Public Functions
-
virtual ~ParallelWriter() = default¶
-
virtual coro::CoroTask<int> open(std::string path, std::size_t num_workers, bool gzip_extension, CoroScope *scope) = 0¶
Create/truncate backing storage.
scope may be null for layouts that don’t spawn internal coroutines; padded-striped requires a non-null scope that outlives close().
-
virtual coro::CoroTask<int> write_header(ByteView data) = 0¶
Prologue, written before any worker chunk.
-
virtual coro::CoroTask<int> write_chunk(std::size_t worker_idx, ByteView data) = 0¶
Striped: placed at an atomic offset. Sharded: appended to shard N.
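The "atomic offset" placement for the striped layout can be pictured with a minimal sketch. This is not the library's implementation: `OffsetAllocator`, `reserve`, and `end` are hypothetical names, and the sketch only assumes that each chunk reserves its byte range with a single atomic add before it is written.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a striped writer's offset bookkeeping.
// Assumption: one shared counter tracks the next free byte in the file.
class OffsetAllocator {
public:
    // `start` would be the number of bytes already taken by the header.
    explicit OffsetAllocator(std::uint64_t start = 0) : next_(start) {}

    // Reserve `len` bytes and return the start of the reserved range.
    // fetch_add returns the previous value, so concurrent callers
    // receive disjoint, back-to-back ranges without a lock.
    std::uint64_t reserve(std::size_t len) {
        return next_.fetch_add(len, std::memory_order_relaxed);
    }

    // First byte past everything reserved so far.
    std::uint64_t end() const {
        return next_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> next_;
};
```

Under this scheme a worker would call `reserve(data.size())` and then write its chunk at the returned offset, so chunks land in reservation order rather than in worker order.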
Epilogue, written after all workers drain.
-
virtual std::vector<std::string> output_paths() const = 0¶
One entry for striped; N entries (read order) for sharded.
-
inline virtual std::span<const MemberSpan> member_layout() const¶
Member offsets recorded by write_chunk, sorted by ascending offset. The returned span is owned by the writer and stays valid until the writer is destroyed. Must be called after close() (no concurrent writes). Empty for layouts that don’t expose member boundaries.
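As an illustration of the "sorted by ascending offset" guarantee, the sketch below orders spans that were recorded in completion order and checks that they tile a region without gaps. `RecordedSpan`, `sorted_layout`, and `is_contiguous` are hypothetical names; the field names `offset` and `length` are assumed from the MemberSpan description rather than taken from its definition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of a layout entry: byte offset plus length.
struct RecordedSpan {
    std::uint64_t offset;
    std::uint64_t length;
};

// Return the spans ordered by ascending file offset.
std::vector<RecordedSpan> sorted_layout(std::vector<RecordedSpan> spans) {
    std::sort(spans.begin(), spans.end(),
              [](const RecordedSpan& a, const RecordedSpan& b) {
                  return a.offset < b.offset;
              });
    return spans;
}

// True if the sorted spans sit back to back, beginning at `start`.
bool is_contiguous(const std::vector<RecordedSpan>& sorted,
                   std::uint64_t start) {
    std::uint64_t expected = start;
    for (const RecordedSpan& s : sorted) {
        if (s.offset != expected) return false;
        expected += s.length;
    }
    return true;
}
```

A consumer that walks members in file order could run a check like `is_contiguous` after close() as a sanity test, assuming the layout in use does not pad between members.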
-
inline virtual std::optional<MemberSpan> last_member(std::size_t) const¶
Span of the most recent write_chunk(worker_idx, ...) call on this worker. The caller must invoke it immediately after co_await write_chunk() returns; subsequent calls overwrite it. For sharded layouts the offset is shard-local; remap with shard_base_offsets() after close.
-
inline virtual std::vector<std::uint64_t> shard_base_offsets() const¶
Per-worker base offset to add to a shard-local MemberSpan.offset to get the merged-file offset. Empty by default (no remap needed for single-stream layouts). Call after close().
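The remap described above amounts to one addition per span. The sketch below is a hedged illustration only: `Span`, `bases_from_sizes`, and `to_merged` are hypothetical names, and it assumes the shards are concatenated in worker order, so that each base offset is the total size of the shards before it. How the library actually computes the bases is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of a layout entry: byte offset plus length.
struct Span {
    std::uint64_t offset;
    std::uint64_t length;
};

// Assumption: shard i starts where shards 0..i-1 end in the merged file,
// so the bases are an exclusive prefix sum of the shard sizes.
std::vector<std::uint64_t> bases_from_sizes(
    const std::vector<std::uint64_t>& shard_sizes) {
    std::vector<std::uint64_t> bases;
    bases.reserve(shard_sizes.size());
    std::uint64_t running = 0;
    for (std::uint64_t size : shard_sizes) {
        bases.push_back(running);
        running += size;
    }
    return bases;
}

// Convert a shard-local span to a merged-file span.
// An empty `bases` means a single-stream layout: no remap is needed.
Span to_merged(Span local, std::size_t worker_idx,
               const std::vector<std::uint64_t>& bases) {
    if (bases.empty()) return local;
    local.offset += bases.at(worker_idx);  // throws if worker_idx is out of range
    return local;
}
```

With shard sizes of 100, 250, and 40 bytes, a span at shard-local offset 10 in worker 1 would map to merged offset 110 under these assumptions.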
-
struct MemberSpan¶
Per-write_chunk layout entry: byte offset + length of one independently decompressible gzip member (or raw chunk for non-gzip layouts).
-
struct WriterConfig¶
-
struct WriterSizing¶