Command-Line Tools
==================

DFTracer Utils provides several command-line utilities for working with DFTracer trace files and compressed archives.

.. _cli-shared-flags:

Shared CLI Flags
----------------

Most tools wire in a common set of argument schemas defined in ``src/dftracer/utils/binaries/common_cli.h``. The flags below have identical semantics across every binary that exposes the relevant schema and are not repeated in each tool's section.

**Pipeline** (``PipelineArgs``)

- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)
- ``--io-threads `` - Number of I/O threads (default: number of CPU cores)
- ``--time-profiling`` - Print stage timing breakdown to stderr

**Indexing** (``IndexingArgs``)

- ``--index-dir `` - Directory for ``.dftindex`` stores
- ``--checkpoint-size `` - Checkpoint size for gzip indexing in bytes (default: 33554432 B / 32 MB)
- ``-f, --force`` - Force index recreation

**Query** (``QueryArgs``)

- ``--query `` - Query DSL filter (e.g., ``'cat == "POSIX" and dur > 1000'``)

**Watchdog** (``WatchdogArgs``)

- ``--disable-watchdog`` - Disable watchdog for hang detection
- ``--watchdog-global-timeout `` - Watchdog global timeout for pipeline execution in seconds (0 = no timeout, default: 0)
- ``--watchdog-task-timeout `` - Watchdog default task timeout in seconds (0 = no timeout, default: 0)
- ``--watchdog-interval `` - Watchdog check interval in seconds (default: 1)
- ``--watchdog-warning-threshold `` - Watchdog long-running task warning threshold in seconds (default: 300)
- ``--watchdog-idle-timeout `` - Watchdog idle timeout in seconds (0 = use default, default: 300)
- ``--watchdog-deadlock-timeout `` - Watchdog deadlock timeout in seconds (0 = use default, default: 600)

**Inputs** (``DirectoryArgs`` / ``FilesArgs``)

- ``-d, --directory `` - Directory containing trace files
- ``--files `` - Trace files (``.pfw``, ``.pfw.gz``)

dftracer_reader
---------------

**Description:** DFTracer
utility for reading and indexing compressed files (GZIP, TAR.GZ)

**Usage:**

.. code-block:: bash

   dftracer_reader [OPTIONS] file

**Arguments:**

- ``file`` - Compressed file to process (GZIP, TAR.GZ) [required]

**Options:**

- ``-i, --index `` - Index file to use (default: auto-generated in temp directory)
- ``-s, --start `` - Start position in bytes (default: -1)
- ``-e, --end `` - End position in bytes (default: -1)
- ``-c, --checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``-f, --force-rebuild`` - Force rebuild of index even if it exists
- ``--check`` - Check if index is valid
- ``--read-buffer-size `` - Size of the read buffer in bytes (default: 1MB)
- ``--mode `` - Set the reading mode: bytes, line_bytes, or lines (default: bytes)
- ``--index-dir `` - Directory to store index files (default: system temp directory)

**Example:**

.. code-block:: bash

   # Read bytes 100-200 from a compressed file
   dftracer_reader --start 100 --end 200 trace.pfw.gz

   # Read in line mode
   dftracer_reader --mode lines --start 1 --end 100 trace.pfw.gz

   # Build index with custom checkpoint size
   dftracer_reader --checkpoint-size 20971520 trace.pfw.gz

dftracer_info
-------------

**Description:** Display metadata and index information for DFTracer compressed files

**Usage:**

.. code-block:: bash

   dftracer_info [OPTIONS]

**Options:**

- ``--files `` - Compressed files to inspect (GZIP, TAR.GZ)
- ``-d, --directory `` - Directory containing files to inspect
- ``--query `` - Query type: ``summary`` (aggregate all files, default) or ``detailed`` (per-file output)
- ``-v, --verbose`` - Show detailed information including index details
- ``-f, --force-rebuild`` - Force rebuild index files
- ``-c, --checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--index-dir `` - Directory to store index files (default: system temp directory)
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Show info for files in a directory
   dftracer_info -d ./logs

   # Show info for specific files with verbose output
   dftracer_info --files trace1.pfw.gz trace2.pfw.gz -v

   # Per-file detailed output
   dftracer_info -d ./traces --query detailed

   # Analyze with 4 threads
   dftracer_info --executor-threads 4 -d ./traces

dftracer_merge
--------------

**Description:** Merge DFTracer .pfw or .pfw.gz files into a single JSON array file using pipeline processing

**Usage:**

.. code-block:: bash

   dftracer_merge [OPTIONS]

**Options:**

- ``-d, --directory `` - Directory containing .pfw or .pfw.gz files (default: .)
- ``-o, --output `` - Output file path (should have .pfw extension) (default: combined.pfw)
- ``-f, --force`` - Override existing output file and force index recreation
- ``-c, --compress`` - Compress output file with gzip
- ``-v, --verbose`` - Enable verbose mode
- ``-g, --gzip-only`` - Process only .pfw.gz files
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)
- ``--index-dir `` - Directory to store index files (default: system temp directory)

**Example:**

.. code-block:: bash

   # Merge all .pfw/.pfw.gz files in current directory
   dftracer_merge -o merged.pfw

   # Merge files from specific directory with compression
   dftracer_merge -d ./logs -o output.pfw -c

   # Merge with parallel processing and verbose output
   dftracer_merge -d ./traces -o combined.pfw --executor-threads 8 -v

dftracer_split
--------------

**Description:** Split DFTracer traces into equal-sized chunks using pipeline processing

**Usage:**

.. code-block:: bash

   dftracer_split [OPTIONS]

**Options:**

- ``-n, --app-name `` - Application name for output files (default: app)
- ``-d, --directory `` - Input directory containing .pfw or .pfw.gz files (default: .)
- ``-o, --output `` - Output directory for split files (default: ./split)
- ``-s, --chunk-size `` - Chunk size in MB (default: 4)
- ``-f, --force`` - Override existing files and force index recreation
- ``-c, --compress`` - Compress output files with gzip (default: true)
- ``-v, --verbose`` - Enable verbose mode
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)
- ``--index-dir `` - Directory to store index files (default: system temp directory)
- ``--verify`` - Verify output chunks match input by comparing event IDs

**Example:**

.. code-block:: bash

   # Split files into 4MB chunks
   dftracer_split -d ./logs -o ./split_output

   # Split with 10MB chunks and custom app name
   dftracer_split -d ./traces -s 10 -n myapp -o ./chunks

   # Split without compression and verify output
   dftracer_split -d ./data -c false --verify -o ./output

dftracer_event_count
--------------------

**Description:** Count valid events in DFTracer .pfw or .pfw.gz files using pipeline processing

**Usage:**

.. code-block:: bash

   dftracer_event_count [OPTIONS]

**Options:**

- ``-d, --directory `` - Directory containing .pfw or .pfw.gz files (default: .)
- ``-f, --force`` - Force index recreation
- ``-c, --checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)
- ``--index-dir `` - Directory to store index files (default: system temp directory)

**Example:**

.. code-block:: bash

   # Count events in current directory
   dftracer_event_count

   # Count events in specific directory with 8 threads
   dftracer_event_count -d ./traces --executor-threads 8

   # Force index rebuild
   dftracer_event_count -d ./logs -f

dftracer_pgzip
--------------

**Description:** Parallel gzip compression for DFTracer .pfw files

**Usage:**

.. code-block:: bash

   dftracer_pgzip [OPTIONS]

**Options:**

- ``-d, --directory `` - Directory containing .pfw files (default: .)
- ``-v, --verbose`` - Enable verbose output
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Compress all .pfw files in current directory
   dftracer_pgzip

   # Compress files in specific directory with verbose output
   dftracer_pgzip -d ./logs -v

   # Compress with 16 threads
   dftracer_pgzip -d ./traces --executor-threads 16

dftracer_server
---------------

**Description:** HTTP server for querying and streaming DFTracer trace data via REST API

**Usage:**

.. code-block:: bash

   dftracer_server [OPTIONS] --directory

**Options:**

- ``-b, --bind `` - Bind address (default: 0.0.0.0)
- ``-p, --port `` - Listen port (default: 8080)
- ``-d, --directory `` - Directory containing trace files [required]
- ``--index-dir `` - Directory for bloom/checkpoint index files (default: same as --directory)
- ``--executor-threads `` - Number of worker threads (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Start server on default port 8080
   dftracer_server -d ./traces

   # Start server on custom port with specific bind address
   dftracer_server -b 127.0.0.1 -p 9000 -d ./traces

   # Start with custom index directory and thread count
   dftracer_server -d ./traces --index-dir /var/cache/dftracer_indexes --executor-threads 8

dftracer_stats
--------------

**Description:** Compute event statistics with bloom filter acceleration and detailed distribution analysis

**Usage:**

.. code-block:: bash

   dftracer_stats [OPTIONS]

**Options:**

- ``-d, --directory `` - Directory containing .pfw or .pfw.gz files (default: .)
- ``--files `` - Explicit list of trace files
- ``--index-dir `` - Directory to store index files (default: system temp directory)
- ``--report `` - Report type: summary, categories, names, pid_tids, time_range, duration, top-names, top-categories, detailed (default: summary)
- ``--top-n `` - Top N entries to show in detailed report (0=all, default: 10)
- ``--top-n-pid-tid `` - Top N PID:TID pairs to show (default: 10)
- ``--query `` - Query DSL filter (e.g., ``'cat == "POSIX" and dur > 1000'``)
- ``--group-by `` - Group-by dimensions: name, cat, pid, tid, fhash, hhash, pid_tid (default: name for detailed)
- ``--json`` - Output in JSON format
- ``--no-auto-index`` - Disable automatic bloom index building
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Summary statistics
   dftracer_stats -d ./traces

   # Top operations and categories
   dftracer_stats -d ./traces --report categories

   # Detailed duration distribution per operation
   dftracer_stats -d ./traces --report detailed --group-by name --top-n 20

   # Filter to POSIX operations only
   dftracer_stats -d ./traces --report duration --query 'cat == "POSIX"'

dftracer_view
-------------

**Description:** Extract filtered subsets of trace data using query-based filtering with chunk pruning

**Usage:**

.. code-block:: bash

   dftracer_view [OPTIONS]

**Options:**

- ``--files `` - Trace files to process (.pfw, .pfw.gz)
- ``-d, --directory `` - Directory containing trace files
- ``--preset `` - Predefined view: io, compute, dlio
- ``--recipe `` - Custom view JSON file path
- ``--save-recipe `` - Save the constructed view to a JSON file
- ``--query `` - Query DSL filter (e.g., ``'cat == "POSIX" and dur > 1000'``)
- ``--time-range `` - Timestamp filter in microseconds (e.g., 1000000,2000000)
- ``--min-duration `` - Minimum event duration in microseconds
- ``--max-duration `` - Maximum event duration in microseconds
- ``-o, --output `` - Output file path (default: stdout)
- ``--stream`` - Stream matching events to stdout as NDJSON
- ``--no-metadata`` - Exclude metadata events (ph=M) from output
- ``--index-dir `` - Directory where .idx index files are stored
- ``--no-auto-index`` - Disable automatic bloom index building for files missing .idx
- ``--checkpoint-size `` - Checkpoint size for auto-indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Extract I/O operations
   dftracer_view --preset io -d ./traces -o io_events.pfw

   # Custom query: POSIX read/write operations
   dftracer_view -d ./traces --query 'cat == "POSIX" and name in ["read", "write"]' -o posix_rw.pfw

   # Time-filtered view with output streaming
   dftracer_view -d ./traces --time-range 1000000,5000000 --stream

dftracer_index
--------------

**Description:** Build per-chunk bloom filter indices for efficient chunk-skipping queries

**Usage:**

.. code-block:: bash

   dftracer_index [OPTIONS]

**Options:**

- ``-d, --directory `` - Input directory containing .pfw or .pfw.gz files (default: .)
- ``--dimensions `` - Comma-separated extra dimensions to index from args (e.g., args.level,args.mode)
- ``-f, --force`` - Force index recreation even if already built
- ``--checkpoint-size `` - Checkpoint size for gzip indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of worker threads for parallel processing (default: number of CPU cores)
- ``--index-dir `` - Directory to store index files (default: same as data files)
- ``--expected-entries `` - Expected entries per chunk for bloom filter sizing (default: 1024)
- ``--false-positive-rate `` - Bloom filter false positive rate (default: 0.01)
- ``--read-batch-size `` - Batch read size in MB for stream processing (default: 4)
- ``--manifest`` - Also build manifest tables in .idx (per-checkpoint event line routing)
- ``--rebuild-summaries`` - Rebuild ``ROOT_*`` aggregated summaries after ingest. Off by default; ``ROOT_*`` CFs are only consumed by summary tools such as ``dftracer_info``. Bloom-filter chunk-skipping queries do not require them.

This binary also accepts the shared :ref:`cli-shared-flags` (Pipeline, Watchdog, Indexing).

**Example:**

.. code-block:: bash

   # Build bloom indices for all traces
   dftracer_index -d ./traces

   # Build with custom dimensions and force rebuild
   dftracer_index -d ./traces --dimensions "args.level,args.io.size" --force

   # Build manifest indices for reorganization
   dftracer_index -d ./traces --manifest

dftracer_aggregator
-------------------

**Description:** Aggregate DFTracer events into time-series counters using streaming coroutine pipeline

The aggregator can emit three logical row types:

- regular event rows from non-counter trace events
- profile-counter rows from ``ph="C"`` events whose category is not ``sys``
- system-counter rows from ``ph="C"`` events whose category is ``sys``

With ``--format arrow``, these are distinguished by the ``batch_type`` column. The Arrow output always includes the base columns ``batch_type``, ``cat``, ``name``, ``pid``, ``tid``, ``hhash``, ``fhash``, ``time_bucket``, ``count``, ``dur_total``, ``dur_min``, ``dur_max``, ``dur_mean``, ``dur_std``, ``size_total``, ``size_min``, ``size_max``, ``size_mean``, ``size_std``, ``ts``, and ``te``. Each field listed in ``--metric-fields`` adds ``_total``, ``_min``, ``_max``, ``_mean``, and ``_std``.

**Usage:**

.. code-block:: bash

   dftracer_aggregator [OPTIONS]

**Options:**

- ``-d, --directory `` - Input directory containing .pfw or .pfw.gz files (default: .)
- ``-o, --output `` - Output file path for aggregated counters (default: aggregated_output.json)
- ``-t, --time-interval `` - Time interval in milliseconds for bucketing (default: 5000)
- ``-g, --group-keys `` - Comma-separated extra group keys from args (e.g., epoch,step,level)
- ``-m, --metric-fields `` - Comma-separated custom metric fields from args (e.g., iter_count,num_events)
- ``--query `` - Query DSL filter (e.g., ``'cat == "POSIX" and dur > 1000'``)
- ``-f, --force`` - Force index recreation
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of executor threads for parallel processing (default: number of CPU cores)
- ``--index-dir `` - Directory to store index files (default: system temp directory)
- ``--compress`` - Compress output using gzip
- ``--compression-level <0-9>`` - Gzip compression level (default: 6)
- ``--boundary-events `` - Boundary event configuration: event_name:value_field:output_name
- ``--no-track-process-parents`` - Disable tracking of process parent relationships from fork/spawn
- ``--chunk-size `` - Target chunk size in MB for parallel processing (default: 4)
- ``--read-batch-size `` - Batch read size in MB for stream processing (default: 4)
- ``--event-format `` - Perfetto event format: counter, async, regular (default: counter)
- ``--compute-percentiles`` - Enable percentile/quantile computation using DDSketch
- ``--percentiles `` - Comma-separated percentiles to compute (e.g., 0.25,0.5,0.75,0.90)
- ``--relative-accuracy `` - Relative accuracy for DDSketch percentile estimation (default: 0.01)
- ``--format `` - Output format: ``json`` (default, Perfetto trace) or ``arrow`` (``.arrows`` IPC file). Arrow format requires ``DFTRACER_UTILS_ENABLE_ARROW_IPC=ON`` at build time.

**Example:**

.. code-block:: bash

   # Basic aggregation with 1-second (1000ms) buckets
   dftracer_aggregator -d ./traces -o agg.json -t 1000

   # Aggregation with percentiles and compression
   dftracer_aggregator -d ./traces -o agg.json --compute-percentiles --compress

   # Query-filtered aggregation with custom metrics from args
   dftracer_aggregator -d ./traces --query 'cat == "POSIX"' \
       -m "iter_count,epoch"

   # Output as Arrow IPC file (readable by pyarrow, polars, DuckDB)
   dftracer_aggregator -d ./traces -o agg.arrows --format arrow

   # Stream profile/system counters as Perfetto counter events
   dftracer_aggregator -d ./traces --event-format counter

**Reading Arrow IPC output:**

.. code-block:: python

   # pyarrow
   import pyarrow.ipc as ipc
   reader = ipc.open_file("agg.arrows")
   table = reader.read_all()
   df = table.to_pandas()

   # polars
   import polars as pl
   df = pl.read_ipc("agg.arrows")

   # DuckDB
   import duckdb
   result = duckdb.sql("SELECT * FROM 'agg.arrows'")

dftracer_gen_dlio_config
------------------------

**Description:** Generate a DLIO YAML configuration directly from a directory of raw DFTracer traces. The tool indexes the inputs, aggregates them into the internal ``AGGREGATION`` column family (DDSketch forced on), fits per-component distributions, refines ``max_bound`` against an internal barrier simulator, and emits a DLIO ``train.computation_time`` + ``reader.preprocess_time`` block. The user does not need to run ``dftracer_aggregator`` separately.

Required input event names: ``cat=dataloader`` with ``name=fetch.block`` / ``fetch.iter``, and ``cat=data`` with ``name=preprocess`` / ``item``. The tool exits non-zero with an explanatory message if no DLIO events are present.

**Usage:**

.. code-block:: bash

   dftracer_gen_dlio_config [OPTIONS] -o

**Options:**

- ``-d, --directory `` - Input directory containing .pfw or .pfw.gz traces (default: .)
- ``-o, --output `` - Output path for the DLIO YAML config [required]
- ``--max-bound-percentile `` - Initial max_bound percentile, 0-100 (default: 95)
- ``--simulation-iterations `` - Max simulator iterations for percentile refinement (default: 5)
- ``--target-e2e-error `` - Target relative E2E error to declare convergence (default: 0.05)
- ``--target-cdf-similarity `` - Target fetch_block CDF similarity (default: 0.90)
- ``--patience `` - Early-stop after this many iterations without improvement (default: 10)
- ``--epsilon `` - Base step size for percentile adjustment (default: 1.0)
- ``--momentum `` - Momentum factor in [0, 1) (default: 0.9)
- ``--min-percentile `` - Floor on max_bound percentile during optimization (default: 50)
- ``--num-workers `` - DataLoader worker count for the simulator (default: 8)
- ``--prefetch-factor `` - DataLoader prefetch factor (default: 2)
- ``--seed `` - Base seed for simulator and sampler (default: 42)
- ``--max-samples-per-entry `` - Cap on synthesized samples per aggregation entry; 0 disables (default: 100)
- ``-t, --time-interval `` - Aggregation time interval in ms (default: 5000)
- ``--index-dir `` - Directory for the shared index store (default: system temp dir)
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--executor-threads `` - Number of executor threads for parallel processing
- ``-f, --force`` - Force index recreation

**Distribution pool:**

Each component is fit as the lowest-BIC choice among {Normal, Lognormal, Gamma, Exponential, Weibull, Gaussian Mixture (K=2), Gaussian Mixture (K=3)}. Mixture candidates are only considered when the sample count is at least 20.

**Example:**

.. code-block:: bash

   # Generate config from a directory of raw traces
   dftracer_gen_dlio_config -d ./traces -o dlio_config.yaml

   # Refine harder against the simulator with a tighter convergence target
   dftracer_gen_dlio_config -d ./traces -o dlio_config.yaml \
       --simulation-iterations 20 --target-e2e-error 0.02 --patience 5

   # Reuse a shared index directory across runs to skip re-indexing
   dftracer_gen_dlio_config -d ./traces -o dlio_config.yaml \
       --index-dir /var/cache/dftracer/idx

**Output schema:**

.. code-block:: yaml

   train:
     computation_time:
       type:
       # single distribution: per-family params (mean/stdev, mu/sigma,
       #   shape/scale, rate)
       # mixture: n_components + components: [{weight, params: {type, ...}}]
       max_bound:
   reader:
     preprocess_time:
       # same structure

**Comparing against an external generator:**

``scripts/compare_dlio_yamls.py`` diffs two DLIO YAMLs with a tolerance check on parameters and a two-sample Kolmogorov-Smirnov check on samples drawn from each fit. Run via ``uv run scripts/compare_dlio_yamls.py --python --cpp `` (the inline PEP-723 metadata installs ``pyyaml`` and ``numpy`` automatically). Same model family + small KS = the two YAMLs would produce indistinguishable DLIO sample streams.

dftracer_organize
-----------------

**Description:** Reorganize traces by routing events to query-based groups with provenance tracking

**Usage:**

.. code-block:: bash

   dftracer_organize [OPTIONS] --output --groups

**Options:**

- ``--files `` - Input trace files (.pfw, .pfw.gz)
- ``-d, --directory `` - Directory containing trace files
- ``-o, --output `` - Output directory [required]
- ``--groups `` - Query groups: ``'io:cat == "POSIX"'`` ``'compute:cat == "APP"'`` [required]
- ``--chunk-size `` - Target chunk size in MB for output files (default: 256)
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--index-dir `` - Directory for sidecar files
- ``-f, --force`` - Force rebuild of indices
- ``--no-compress`` - Write plain .pfw instead of .pfw.gz
- ``--executor-threads `` - Worker threads (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Separate I/O and compute operations
   dftracer_organize -d ./traces -o ./organized \
       --groups 'io:cat == "POSIX"' 'compute:cat == "APP"'

   # Create multiple semantic views
   dftracer_organize -d ./traces -o ./views \
       --groups 'read:name == "read"' 'write:name == "write"' 'other:'

   # Keep uncompressed output
   dftracer_organize -d ./traces -o ./plain --groups "all:" --no-compress

dftracer_reconstruct
--------------------

**Description:** Reconstruct original traces from reorganized files using provenance tracking in .pidx sidecars

**Usage:**

.. code-block:: bash

   dftracer_reconstruct [OPTIONS] --directory --output

**Options:**

- ``-d, --directory `` - Directory containing reorganized files [required]
- ``-o, --output `` - Output directory [required]
- ``--index-dir `` - Directory for sidecar files
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``--no-compress`` - Write plain .pfw instead of .pfw.gz
- ``--executor-threads `` - Worker threads (default: number of CPU cores)

**Example:**

.. code-block:: bash

   # Reconstruct from reorganized directory
   dftracer_reconstruct -d ./organized -o ./reconstructed

   # Reconstruct without compression
   dftracer_reconstruct -d ./views -o ./reconstructed --no-compress

dftracer_replay
---------------

**Description:** Replay I/O operations from DFTracer trace files with timing and filtering support

**Usage:**

.. code-block:: bash

   dftracer_replay [OPTIONS]

**Options:**

- ``inputs`` - Trace files (.pfw, .pfw.gz) or directories containing trace files [required]
- ``--no-timing`` - Ignore original timing and execute as fast as possible
- ``--dry-run`` - Parse and analyze traces without executing operations
- ``--dftracer-mode`` - Use DFTracer sleep-based replay (sleep for operation duration instead of doing actual I/O)
- ``--no-sleep`` - When used with --dftracer-mode, disable sleep calls for maximum speed
- ``--verbose`` - Enable verbose output and detailed statistics
- ``-r, --recursive`` - Recursively search directories for trace files
- ``--use-call-tree`` - Build and use call tree structure for hierarchical replay
- ``--hierarchical-replay`` - Replay operations respecting parent-child call hierarchy (requires --use-call-tree)
- ``--respect-call-hierarchy`` - Replay child nodes immediately after parent (requires --use-call-tree and --hierarchical-replay)
- ``--filter-pid `` - Only replay events from specific PID(s) (comma-separated)
- ``--exclude-pid `` - Exclude events from specific PID(s) (comma-separated)
- ``--filter-tid `` - Only replay events from specific TID(s) (comma-separated)
- ``--exclude-tid `` - Exclude events from specific TID(s) (comma-separated)
- ``--filter-function `` - Only replay specific function(s) (comma-separated, e.g., read,write,open)
- ``--exclude-function `` - Exclude specific function(s) (comma-separated)
- ``--filter-category `` - Only replay specific category/categories (comma-separated, e.g., POSIX,storage)
- ``--exclude-category `` - Exclude specific category/categories (comma-separated)
- ``--start-timestamp `` - Only replay events after this timestamp (microseconds)
- ``--end-timestamp `` - Only replay events before this timestamp (microseconds)
- ``--min-size `` - Only replay operations with size >= this value (bytes)
- ``--max-size `` - Only replay operations with size <= this value (bytes)
- ``--sample-rate `` - Sample rate for replay (0.0-1.0, 1.0=all events, 0.1=10%)
- ``--sample-seed `` - Random seed for sampling (for reproducibility)
- ``--max-events `` - Maximum number of events to replay (0=unlimited)

**Example:**

.. code-block:: bash

   # Replay with original timing
   dftracer_replay ./traces/rank_0.pfw.gz

   # Dry-run analysis of trace file
   dftracer_replay ./traces/rank_0.pfw.gz --dry-run --verbose

   # Replay only POSIX read operations (the directory is passed as a positional input)
   dftracer_replay ./traces -r --filter-category POSIX --filter-function read

For detailed usage, see :doc:`utilities/replay`.

dftracer_tar
------------

**Description:** Index and analyze TAR.GZ archives containing DFTracer trace data

**Usage:**

.. code-block:: bash

   dftracer_tar [OPTIONS]

**Options:**

- ``file`` - TAR.GZ file to process [required]
- ``-i, --index `` - Index file to use (auto-generated if not specified)
- ``-c, --checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)
- ``-f, --force-rebuild`` - Force rebuild index
- ``--list-files`` - List all files in the TAR archive
- ``--info`` - Show archive information
- ``--build-only`` - Only build the index, don't perform other operations

**Example:**

.. code-block:: bash

   # Show archive information
   dftracer_tar trace_archive.tar.gz --info

   # List files in archive
   dftracer_tar trace_archive.tar.gz --list-files

   # Build index for fast access
   dftracer_tar trace_archive.tar.gz --build-only

dftracer_gen_fake_trace
-----------------------

**Description:** Generate realistic synthetic DFTracer traces for testing bloom filter indexing

**Usage:**

.. code-block:: bash

   dftracer_gen_fake_trace [OPTIONS] --output-dir

**Options:**

- ``-o, --output-dir `` - Output directory for trace files [required]
- ``-p, --num-processes `` - Number of ranks (default: 8)
- ``-H, --num-hosts `` - Number of hosts (default: 4)
- ``-e, --num-epochs `` - Training epochs (default: 500)
- ``-s, --steps-per-epoch `` - Steps per epoch (default: 1000)
- ``--checkpoint-every `` - Checkpoint every N epochs (default: 5)
- ``--validation-every `` - Validate every N epochs (default: 2)
- ``--num-train-files `` - Training data shards (default: 8)
- ``--num-val-files `` - Validation data shards (default: 2)
- ``--step-duration-ms `` - Base step duration in milliseconds (default: 100)
- ``--seed `` - Random seed for duration jitter (default: 42)
- ``--verify`` - After generation, build bloom indices and run queries to verify chunk-skipping works
- ``--checkpoint-size `` - Gzip checkpoint size in bytes for indexing (default: 2 MB)

**Example:**

.. code-block:: bash

   # Generate synthetic traces for 4 ranks
   dftracer_gen_fake_trace -o ./traces -p 4

   # Generate with verification of bloom filters
   dftracer_gen_fake_trace -o ./traces -p 8 -H 2 --verify

   # Generate with custom training parameters
   dftracer_gen_fake_trace -o ./traces -e 100 -s 500 --checkpoint-every 10

dftracer_call_tree
------------------

**Description:** Build and analyze call trees from DFTracer trace files for hierarchical structure analysis

**Usage:**

.. code-block:: bash

   dftracer_call_tree [OPTIONS]

**Options:**

- ``inputs`` - Trace files (.pfw, .pfw.gz) or directories containing trace files [required]
- ``-r, --recursive`` - Recursively search directories for trace files
- ``--pattern `` - File pattern for trace files (default: ``*.pfw.gz``)
- ``-o, --output `` - Output file path for serialized call tree (auto-generated from input if not specified)
- ``--json`` - Also save call tree in JSON (Chrome Tracing) format
- ``--text `` - Export call tree to text file
- ``--max-depth `` - Maximum depth for tree printing (0=unlimited, default: 0)
- ``--analyze`` - Perform detailed analysis (call patterns, timing, critical path)
- ``-v, --verbose`` - Enable verbose output
- ``--stats-only`` - Only print statistics, skip tree traversal
- ``--no-save`` - Don't save output files, only print analysis

**Example:**

.. code-block:: bash

   # Build call tree from directory
   dftracer_call_tree ./traces --analyze

   # Export to JSON and text formats
   dftracer_call_tree ./traces --json --text tree.txt

   # Analyze with detailed statistics
   dftracer_call_tree ./traces --analyze --verbose --max-depth 5

dftracer_comparator
-------------------

**Description:** Compare DFTracer trace metrics between a baseline and a variant run. Produces a hierarchical tree table showing per-category and per-operation deltas with Cohen's d significance classification.

**Usage:**

.. code-block:: bash

   dftracer_comparator [OPTIONS]

**Options:**

- ``--baseline `` - Baseline trace file or directory [required unless --config]
- ``--variant `` - Variant trace file or directory [required unless --config]
- ``--config `` - JSON config file for hierarchical comparison (replaces --baseline/--variant)
- ``--query `` - Query DSL filter (default: ``'cat == "POSIX" OR cat == "STDIO"'``)
- ``--group-by `` - Comma-separated group keys (default: cat,name)
- ``--format `` - Output format: ``table`` (default) or ``json``
- ``-t, --time-interval `` - Time interval in milliseconds for bucketing (default: 5000)
- ``--threshold `` - Hide changes below this percentage (default: 0.0)
- ``--no-color`` - Disable ANSI color output
- ``--executor-threads `` - Number of parallel threads (default: auto)
- ``--index-dir `` - Directory for index sidecar files (default: system temp)
- ``--force`` - Force index rebuild
- ``--checkpoint-size `` - Checkpoint size for indexing in bytes (default: 33554432 B / 32 MB)

**Example:**

.. code-block:: bash

   # Quick comparison of two trace files
   dftracer_comparator --baseline run_v1.pfw.gz --variant run_v2.pfw.gz

   # Compare directories with 1-second buckets
   dftracer_comparator --baseline ./traces_v1 --variant ./traces_v2 -t 1000

   # JSON output for programmatic consumption
   dftracer_comparator --baseline run_v1.pfw.gz --variant run_v2.pfw.gz --format json

   # Filter to specific operations
   dftracer_comparator --baseline a.pfw.gz --variant b.pfw.gz \
       --query 'cat == "POSIX" AND name == "write"'

   # Hierarchical comparison via JSON config
   dftracer_comparator --config compare.json

**Output columns:**

- **Baseline / Variant** - Metric values for each side
- **Delta** - Absolute difference (variant - baseline)
- **Pct** - Percentage change
- **Sig** - Cohen's d significance: ``NEGLIGIBLE``, ``SMALL``, ``MEDIUM``, ``LARGE``

**JSON config format:**

.. code-block:: json

   {
     "baseline": "./traces_v1",
     "variant": "./traces_v2",
     "defaults": {
       "time_interval_ms": 5000,
       "threshold_pct": 1.0,
       "percentiles": [0.5, 0.95, 0.99]
     },
     "nodes": [
       {
         "name": "POSIX I/O",
         "query": "cat == \"POSIX\"",
         "children": [
           {"name": "reads", "query": "name == \"read\""},
           {"name": "writes", "query": "name == \"write\""}
         ]
       }
     ]
   }

dftracer_aggregator_mpi
-----------------------

**Description:** MPI driver for the distributed-SST aggregator. Each rank produces per-rank aggregation SSTs; rank 0 bulk-ingests and the ranks jointly write the final gzip JSON output. Requires the build to be configured with ``DFTRACER_UTILS_ENABLE_MPI=ON``.

The pipeline is structured as a five-task DAG executed inside the standard ``Pipeline`` runtime:

``scan -> phase_a -> phase_b -> phase_c -> merge``

- **scan** - Cooperative gzip-member pre-scan, ``Allgatherv`` of the member map, and deterministic Longest-Processing-Time (LPT) assignment of work units to ranks.
- **phase_a** - Each rank runs the distributed-SST indexer + aggregation visitor on its slice and writes SSTs (and ``tracker.bin``) to its rank staging directory. SSTs are optionally moved to a shared-FS staging root for the coordinator.
- **phase_b** - Rank 0 ``Gatherv`` of artifact lists and a single ``IndexDatabase::bulk_ingest`` + tracker merge.
- **phase_c** - Each rank writes a shard-prefixed Perfetto gzip JSON slice using ``PerfettoTraceWriterUtility``.
- **merge** - Parallel ``pwrite`` on Lustre-striped output or serial concatenation otherwise.

**Usage:**

.. code-block:: bash

   mpirun -n dftracer_aggregator_mpi [OPTIONS]

**Options:**

- ``-d, --directory `` - Input directory containing .pfw or .pfw.gz files (default: ``.``)
- ``-o, --output `` - Output gzip JSON path. ``.gz`` is appended if missing (default: ``aggregated_output.json.gz``)
- ``-t, --time-interval `` - Time interval in milliseconds for bucketing (default: 5000)
- ``--staging-dir `` - Per-rank SST staging root. Defaults to ``/_staging``; each rank writes to ``/rank_``.
- ``--shared-staging `` - Shared-FS staging root. When set and different from ``--staging-dir``, each rank moves its SSTs and ``tracker.bin`` from the (node-local) staging dir to ``/rank_`` before the coordinator ingest. Required for multi-node runs where ``--staging-dir`` points at node-local NVMe.
- ``--keep-staging`` - Keep per-rank SST staging dirs after a successful ingest

This binary also accepts the shared :ref:`cli-shared-flags` (Pipeline and Indexing schemas). Per-rank ``--executor-threads`` / ``--io-threads`` are automatically scaled down by the detected processes-per-node count so co-located ranks do not oversubscribe cores.

**Example:**

.. code-block:: bash

   # 16 ranks on one node, node-local staging
   mpirun -n 16 dftracer_aggregator_mpi -d ./traces -o agg.json.gz

   # Multi-node run with shared staging on Lustre
   mpirun -n 64 dftracer_aggregator_mpi -d /lustre/traces \
       --staging-dir /local/nvme/_staging \
       --shared-staging /lustre/scratch/_staging \
       -o /lustre/out/agg.json.gz

dftracer_call_tree_mpi
----------------------

**Description:** MPI driver for parallel call-tree construction. Each rank owns a slice of PIDs, emits a Chrome Tracing JSON shard, and rank 0 merges the shards. Wraps the ``MPICallTreeBuilder`` engine (``discover_pids -> build -> hierarchy -> write -> merge`` coro phases). Requires ``DFTRACER_UTILS_ENABLE_MPI=ON``.

**Usage:**

..
code-block:: bash mpirun -n dftracer_call_tree_mpi [OPTIONS] **Options:** - ``input`` - Input directory containing trace files [required] - ``-o, --output `` - Output JSON path (default: ``call_tree.pfw``) - ``--staging-dir `` - Shared-FS staging root for per-rank shards (default: ``.shards/``) - ``--gzip`` - gzip the merged output (``.gz`` appended if needed) - ``-v, --verbose`` - Verbose progress logging - ``--keep-staging`` - Keep per-rank shard files after merge This binary also accepts the shared :ref:`cli-shared-flags` (Pipeline); per-rank thread counts are scaled down by the detected processes-per-node count. **Example:** .. code-block:: bash # 32 ranks across nodes; gzip merged output mpirun -n 32 dftracer_call_tree_mpi ./traces -o call_tree.pfw --gzip
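
The per-rank thread scaling described for the MPI binaries can be pictured with a short Python sketch. It assumes the requested thread count is divided evenly among the ranks sharing a node, with a floor of one thread per rank. The documentation above does not state the exact rule, so the function name ``scale_threads`` and the rounding behavior are hypothetical and only illustrate why co-located ranks avoid oversubscribing cores.

```python
def scale_threads(requested: int, procs_per_node: int) -> int:
    """Return a hypothetical per-rank thread budget.

    Assumption (not taken from the DFTracer sources): the requested
    count is split evenly across the ranks on a node, and every rank
    keeps at least one thread.
    """
    if requested < 1:
        raise ValueError("requested must be >= 1")
    if procs_per_node < 1:
        raise ValueError("procs_per_node must be >= 1")
    # Integer division shares the budget; max() enforces the floor of 1.
    return max(1, requested // procs_per_node)


if __name__ == "__main__":
    # Show how a 32-thread request shrinks as more ranks share a node.
    for ppn in (1, 4, 16, 64):
        print(f"procs_per_node={ppn:>2} -> threads per rank={scale_threads(32, ppn)}")
```

Under this assumed rule, the total number of worker threads on a node stays close to the requested value regardless of how many ranks are launched there, which is the behavior the shared Pipeline flags are described as providing.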