DFTracer Comparator

Namespace: dftracer::utils::utilities::composites::dft::comparator

struct CollapsedMetrics

Per-(category, operation) collapsed metrics aggregated across processes and time windows.

Within each time window: max across processes for count, transfer size, and bandwidth; mean for duration and size. Across windows: mean +/- stdev of those per-window values.

Public Members

AggregationMetrics merged = {0.01}

DDSketch merged across all windows for percentile queries.
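To show why a DDSketch can be merged across windows and still answer percentile queries, here is a minimal sketch of the general technique. It is not the AggregationMetrics implementation: TinyDDSketch and its methods are hypothetical names, and reading the `{0.01}` initializer as a relative accuracy of 1% is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <map>

// Hypothetical, simplified DDSketch for positive values only.
class TinyDDSketch {
public:
    // alpha is the target relative accuracy (assumed meaning of 0.01).
    explicit TinyDDSketch(double alpha)
        : gamma_((1.0 + alpha) / (1.0 - alpha)), log_gamma_(std::log(gamma_)) {}

    void add(double x) {
        assert(x > 0.0);
        // Bucket i covers the range (gamma^(i-1), gamma^i].
        const int idx = static_cast<int>(std::ceil(std::log(x) / log_gamma_));
        ++buckets_[idx];
        ++count_;
    }

    // Merging adds bucket counts; both sketches must share the same alpha.
    void merge(const TinyDDSketch& other) {
        for (const auto& [idx, n] : other.buckets_) buckets_[idx] += n;
        count_ += other.count_;
    }

    // Walk buckets in ascending order until the requested rank is covered.
    double quantile(double q) const {
        assert(count_ > 0 && q >= 0.0 && q <= 1.0);
        const double rank = q * static_cast<double>(count_ - 1);
        std::uint64_t seen = 0;
        for (const auto& [idx, n] : buckets_) {
            seen += n;
            if (static_cast<double>(seen) > rank) {
                // Representative value with relative error at most alpha.
                return 2.0 * std::pow(gamma_, idx) / (gamma_ + 1.0);
            }
        }
        return 0.0;  // not reached when count_ > 0
    }

private:
    double gamma_;
    double log_gamma_;
    std::map<int, std::uint64_t> buckets_;
    std::uint64_t count_ = 0;
};
```

Because merging only sums bucket counts, per-window sketches can be combined in any order before p50, p95, or p99 is queried.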

double count_mean = 0.0

Mean of per-window max event counts.

double dur_mean_of_means = 0.0

Mean of per-window mean durations.

double dur_stdev_of_means = 0.0

Stdev of per-window mean durations.

double size_mean_of_means = 0.0

Mean of per-window mean sizes.

double size_stdev_of_means = 0.0

Stdev of per-window mean sizes.

double xfer_mean = 0.0

Mean of per-window max transfer sizes.

double xfer_stdev = 0.0

Stdev of per-window max transfer sizes.

double bw_mean = 0.0

Mean of per-window max bandwidths.

double bw_stdev = 0.0

Stdev of per-window max bandwidths.

std::size_t num_windows = 0

Number of time windows contributing to these statistics.
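The two-stage collapse behind pairs such as xfer_mean / xfer_stdev and bw_mean / bw_stdev can be sketched as follows. This is an illustration only: MeanStdev and collapse() are hypothetical names, and the use of the sample standard deviation (N - 1 denominator) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one mean/stdev pair of CollapsedMetrics.
struct MeanStdev {
    double mean = 0.0;
    double stdev = 0.0;
    std::size_t num_windows = 0;
};

// windows[w][p] holds one metric value for process p in time window w.
inline MeanStdev collapse(const std::vector<std::vector<double>>& windows) {
    // Stage 1: max across processes within each window.
    std::vector<double> maxima;
    for (const auto& procs : windows) {
        if (procs.empty()) continue;  // windows without data contribute nothing
        maxima.push_back(*std::max_element(procs.begin(), procs.end()));
    }

    // Stage 2: mean and stdev of the per-window maxima.
    MeanStdev out;
    out.num_windows = maxima.size();
    if (maxima.empty()) return out;
    double sum = 0.0;
    for (double m : maxima) sum += m;
    out.mean = sum / static_cast<double>(maxima.size());
    if (maxima.size() > 1) {  // stdev stays 0 with a single window
        double sq = 0.0;
        for (double m : maxima) sq += (m - out.mean) * (m - out.mean);
        out.stdev = std::sqrt(sq / static_cast<double>(maxima.size() - 1));
    }
    return out;
}
```

For three windows with per-process values {1, 3}, {5, 2}, and {4, 4}, the per-window maxima are 3, 5, and 4, giving a mean of 4 and a sample stdev of 1.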

struct ColumnWidths

Dynamically computed column widths for aligned table output.

Public Members

int left = 6

Left column (metric/node name).

int baseline = 8

Baseline value column.

int variant = 7

Variant value column.

int delta = 5

Delta column.

int change = 6

Percentage change column.
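A rough sketch of how such widths might be derived: each column grows to fit its widest cell, measured in display cells rather than bytes. The helper names are hypothetical, and treating every UTF-8 code point as one cell wide is a simplifying assumption (wide and combining characters are ignored).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Approximate display width: count UTF-8 code points by skipping
// continuation bytes (those of the form 10xxxxxx).
inline int display_width(const std::string& s) {
    int width = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++width;
    }
    return width;
}

// A column is as wide as its header or its widest cell, whichever is larger.
inline int column_width(const std::string& header,
                        const std::vector<std::string>& cells) {
    int width = display_width(header);
    for (const auto& cell : cells) width = std::max(width, display_width(cell));
    return width;
}

// Pad with spaces up to the given display width (never truncates).
inline std::string pad_right(const std::string& s, int width) {
    const int pad = width - display_width(s);
    return pad > 0 ? s + std::string(static_cast<std::size_t>(pad), ' ') : s;
}
```

Measuring bytes instead would misalign any row that contains box-drawing characters, since each of those takes three bytes but one cell.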

struct ComparisonConfig

Top-level configuration for the comparison pipeline.

Constructed from CLI arguments via from_cli() or loaded from a JSON file via from_json_file(). Call resolve() after construction to propagate defaults down the node tree.

Public Functions

void resolve()

Resolve inheritance: compose queries and propagate defaults down the node tree. Must be called before using the config.

Public Members

std::string baseline

Baseline trace file or directory path.

std::string variant

Variant trace file or directory path.

ComparisonDefaults defaults

Default settings inherited by all nodes.

std::vector<ComparisonNode> nodes

Hierarchical comparison tree (top-level nodes).

std::vector<std::string> output_paths

Output file paths (CLI only); the format of each is detected from its file extension.

std::string format = "table"

Output format: “table” or “json”.

bool no_color = false

Disable ANSI color in table output.

std::size_t executor_threads = 0

Number of parallel threads (0 = auto-detect).

std::size_t checkpoint_size = 0

Checkpoint size for index building (0 = default).

std::string baseline_index_dir

Directory for baseline .dftindex store (empty = co-located).

std::string variant_index_dir

Directory for variant .dftindex store (empty = co-located).

bool force_rebuild = false

Force rebuild of existing indexes.

Public Static Functions

static std::optional<ComparisonConfig> from_json_file(const std::string &path, std::string &error)

Parse a JSON config file. Returns nullopt and sets error on failure.

static ComparisonConfig from_cli(const std::string &baseline, const std::string &variant, const std::string &query, const std::string &group_by_str)

Build config from CLI arguments (quick mode without JSON). Creates a single root node with the given query and group-by.

struct ComparisonDefaults

Default settings inherited by all comparison nodes unless overridden.

Public Members

std::vector<std::string> metrics = {"count", "duration", "size", "transfer_size", "bandwidth"}

Metric groups to compare (count, duration, size, transfer_size, bandwidth).

std::vector<double> percentiles = {0.50, 0.95, 0.99}

Percentiles to compute from DDSketch (e.g. p50, p95, p99).

double threshold_pct = 0.0

Hide changes below this percentage in the output.

double time_interval_ms = 5000.0

Time bucket width in milliseconds for aggregation windows.
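As an illustration of fixed-width time bucketing, the sketch below maps an event timestamp to a window index. window_index() is a hypothetical helper; timestamps in microseconds and an explicit origin are assumptions, not documented behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map a timestamp (microseconds) to its aggregation window, where windows are
// time_interval_ms wide and window 0 starts at origin_us.
inline std::int64_t window_index(double ts_us, double origin_us,
                                 double time_interval_ms) {
    assert(time_interval_ms > 0.0);
    const double width_us = time_interval_ms * 1000.0;  // ms -> us
    return static_cast<std::int64_t>(std::floor((ts_us - origin_us) / width_us));
}
```

With the default of 5000.0 ms, an event at 4,999,999 us falls in window 0 and an event at 5,000,000 us in window 1.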

std::string sort_by = "regression"

Sort order for groups: “regression” sorts worst regressions first.

struct ComparisonNode

A node in the hierarchical comparison tree.

Each node carries a query that is AND’d with its parent’s query, and optional per-node overrides for metrics, percentiles, and threshold. Children inherit from their parent unless they override.

Public Members

std::string name

Display name for this node (e.g. “POSIX I/O”, “reads”).

std::string query

Additional query filter, AND’d with the parent’s composed query.

std::vector<std::string> group_by

Extra group-by keys beyond the default (cat, name).

std::optional<std::vector<std::string>> metrics

Per-node metric override (nullopt = inherit from parent).

std::optional<std::vector<double>> percentiles

Per-node percentile override (nullopt = inherit from parent).

std::optional<double> threshold_pct

Per-node threshold override (nullopt = inherit from parent).

std::optional<std::string> sort_by

Per-node sort override (nullopt = inherit from parent).

std::vector<ComparisonNode> children

Child nodes forming the comparison hierarchy.

std::string composed_query

Full query including all ancestor queries, populated by resolve().

std::vector<std::string> resolved_metrics

Resolved metrics after inheritance, populated by resolve().

std::vector<double> resolved_percentiles

Resolved percentiles after inheritance, populated by resolve().

double resolved_threshold_pct = 0.0

Resolved threshold after inheritance, populated by resolve().

std::string resolved_sort_by

Resolved sort order after inheritance, populated by resolve().
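The inheritance rule (a nullopt override inherits from the parent, a query is AND'd with the parent's composed query) can be sketched with simplified stand-in types. SketchNode, compose(), and resolve_node() are hypothetical; only two of the inheritable settings are modelled, and the "(parent) and (child)" query syntax is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for ComparisonNode.
struct SketchNode {
    std::string name;
    std::string query;
    std::optional<std::vector<std::string>> metrics;  // nullopt = inherit
    std::optional<double> threshold_pct;              // nullopt = inherit
    std::vector<SketchNode> children;
    // Populated by resolve_node().
    std::string composed_query;
    std::vector<std::string> resolved_metrics;
    double resolved_threshold_pct = 0.0;
};

// Assumed query syntax; the real composition may differ.
inline std::string compose(const std::string& parent, const std::string& child) {
    if (parent.empty()) return child;
    if (child.empty()) return parent;
    return "(" + parent + ") and (" + child + ")";
}

// Depth-first pass: each node resolves against its parent's resolved values,
// then hands its own resolved values down to its children.
inline void resolve_node(SketchNode& node, const std::string& parent_query,
                         const std::vector<std::string>& parent_metrics,
                         double parent_threshold) {
    node.composed_query = compose(parent_query, node.query);
    node.resolved_metrics = node.metrics.value_or(parent_metrics);
    node.resolved_threshold_pct = node.threshold_pct.value_or(parent_threshold);
    for (auto& child : node.children) {
        resolve_node(child, node.composed_query, node.resolved_metrics,
                     node.resolved_threshold_pct);
    }
}
```

Passing the resolved values (rather than the raw overrides) down the recursion means a grandchild inherits an override set on its parent, not the top-level defaults.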

struct ComparisonOutput

Top-level output of the comparison pipeline.

Public Functions

common::arrow::ArrowExportResult to_arrow() const

Flatten the tree into a single Arrow record batch.

Columns: node_path, metric_group, metric_name, baseline, variant, baseline_stdev, variant_stdev, delta, pct_change, cohens_d, significance, is_regression. Rows with all-zero values are skipped.
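Flattening a result tree into rows keyed by node_path might look roughly like this. The types are simplified stand-ins, plain structs replace the Arrow record batch, only a few of the listed columns are modelled, and the "/" path separator is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for MetricComparison and NodeResult.
struct Metric {
    std::string name;
    double baseline;
    double variant;
};
struct Node {
    std::string name;
    std::vector<Metric> metrics;
    std::vector<Node> children;
};

// One flattened output row (subset of the documented columns).
struct Row {
    std::string node_path;
    std::string metric_name;
    double baseline;
    double variant;
    double delta;
};

// Depth-first walk: emit one row per metric, then recurse into children.
inline void flatten(const Node& node, const std::string& parent_path,
                    std::vector<Row>& rows) {
    const std::string path =
        parent_path.empty() ? node.name : parent_path + "/" + node.name;
    for (const auto& m : node.metrics) {
        if (m.baseline == 0.0 && m.variant == 0.0) continue;  // skip all-zero rows
        rows.push_back({path, m.name, m.baseline, m.variant, m.variant - m.baseline});
    }
    for (const auto& child : node.children) flatten(child, path, rows);
}
```

Each row carries its full path, so the hierarchy can be recovered from the flat table by grouping on node_path.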

Public Members

std::string baseline_path

Baseline file or directory path.

std::string variant_path

Variant file or directory path.

std::size_t baseline_file_count = 0

Number of baseline trace files processed.

std::size_t variant_file_count = 0

Number of variant trace files processed.

TraceMetadata baseline_meta

Metadata extracted from baseline traces.

TraceMetadata variant_meta

Metadata extracted from variant traces.

std::vector<NodeResult> nodes

Top-level comparison tree results.

double execution_time_ms = 0.0

Total pipeline execution time in milliseconds.

class ComparisonUtility : public dftracer::utils::utilities::Utility<ComparisonUtilityInput, ComparisonUtilityOutput>

Joins baseline and variant aggregation outputs, builds the hierarchical comparison tree (root -> categories -> operations), and computes deltas with Cohen’s d significance classification.

Public Functions

coro::CoroTask<ComparisonUtilityOutput> process(const ComparisonUtilityInput &input) override

Run the comparison pipeline.

struct ComparisonUtilityInput

Input to ComparisonUtility::process().

Public Members

std::vector<ComparisonVisitorPair> visitors

Visitor pairs, one per flattened node in the tree.

ComparisonNode root_node

Root node of the comparison tree (for hierarchy reconstruction).

std::size_t baseline_file_count = 0

Number of baseline trace files (for metadata).

std::size_t variant_file_count = 0

Number of variant trace files (for metadata).

struct ComparisonUtilityOutput

Output from ComparisonUtility::process().

Public Members

NodeResult result

Hierarchical comparison result tree.

bool success = false

Whether the comparison completed successfully.

struct ComparisonVisitorPair

Paired baseline/variant aggregation outputs for a single comparison node.

Public Members

EventAggregatorOutput baseline

Aggregation output for the baseline run.

EventAggregatorOutput variant

Aggregation output for the variant run.

ComparisonNode node

Resolved config node for this visitor.

struct FormatterOptions

Options controlling the visual appearance of rendered output.

Public Members

bool use_color = true

Enable ANSI color escape codes in table output.

bool use_unicode = true

Use Unicode box-drawing characters for tree branches.

struct GroupComparison

Comparison of all metrics for a single (category, operation) group.

Public Members

std::string label

Group label (e.g. “POSIX/read”, empty for summary).

bool baseline_present = true

Whether this group exists in the baseline.

bool variant_present = true

Whether this group exists in the variant.

std::vector<MetricComparison> metrics

Per-metric comparisons for this group.

double worst_pct_change = 0.0

Worst percentage change across all metrics (for regression sorting).
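One plausible way worst_pct_change could drive threshold filtering and "regression" sorting is sketched below. The exact rules are not documented here, so everything in this block is an assumption: Group and rank_groups() are hypothetical, "worst" is taken to mean the change with the largest magnitude, and filtering is applied per group.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Simplified stand-in for GroupComparison.
struct Group {
    std::string label;
    std::vector<double> pct_changes;  // one entry per metric
    double worst_pct_change = 0.0;
};

inline std::vector<Group> rank_groups(std::vector<Group> groups,
                                      double threshold_pct) {
    // Worst change = signed value with the largest magnitude (assumption).
    for (auto& g : groups) {
        g.worst_pct_change = 0.0;
        for (double p : g.pct_changes) {
            if (std::fabs(p) > std::fabs(g.worst_pct_change)) g.worst_pct_change = p;
        }
    }
    // Hide groups whose worst change is below the threshold.
    groups.erase(std::remove_if(groups.begin(), groups.end(),
                                [&](const Group& g) {
                                    return std::fabs(g.worst_pct_change) < threshold_pct;
                                }),
                 groups.end());
    // Largest magnitude first; stable_sort keeps input order for ties.
    std::stable_sort(groups.begin(), groups.end(),
                     [](const Group& a, const Group& b) {
                         return std::fabs(a.worst_pct_change) >
                                std::fabs(b.worst_pct_change);
                     });
    return groups;
}
```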

struct MetricComparison

Comparison of a single metric between baseline and variant.

Public Members

std::string metric_name

Metric name (e.g. “count”, “dur_mean”, “dur_p50”, “size”).

double baseline_value = 0.0

Baseline value (mean across time windows).

double variant_value = 0.0

Variant value (mean across time windows).

double baseline_stdev = 0.0

Baseline standard deviation across time windows (0 if N<=1).

double variant_stdev = 0.0

Variant standard deviation across time windows (0 if N<=1).

double delta = 0.0

Absolute difference (variant - baseline).

double pct_change = 0.0

Percentage change (e.g. 15.3 for +15.3%).

double cohens_d = 0.0

Cohen’s d effect size.

Significance significance = Significance::NEGLIGIBLE

Classified significance level.

bool is_regression = false

True if performance got worse (higher duration, lower bandwidth).
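The derived fields can be illustrated with a small sketch. It uses the textbook Cohen's d with an equal-weight pooled standard deviation and Cohen's conventional cut-offs (0.2, 0.5, 0.8). Whether the comparator uses this pooling or these thresholds is an assumption, and Level, Delta, and compare() are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical effect-size labels; the real Significance enum may differ.
enum class Level { Negligible, Small, Medium, Large };

struct Delta {
    double delta = 0.0;
    double pct_change = 0.0;
    double cohens_d = 0.0;
    Level level = Level::Negligible;
    bool is_regression = false;
};

// higher_is_worse: true for duration-like metrics, false for bandwidth-like.
inline Delta compare(double base, double var, double base_sd, double var_sd,
                     bool higher_is_worse) {
    Delta r;
    r.delta = var - base;
    // Assumption: report 0% when the baseline is 0 to avoid dividing by zero.
    r.pct_change = base != 0.0 ? 100.0 * r.delta / base : 0.0;

    // Pooled stdev assuming both runs contribute equally many windows.
    const double pooled = std::sqrt((base_sd * base_sd + var_sd * var_sd) / 2.0);
    r.cohens_d = pooled > 0.0 ? r.delta / pooled : 0.0;

    const double mag = std::fabs(r.cohens_d);
    if (mag >= 0.8) r.level = Level::Large;
    else if (mag >= 0.5) r.level = Level::Medium;
    else if (mag >= 0.2) r.level = Level::Small;

    r.is_regression = higher_is_worse ? r.delta > 0.0 : r.delta < 0.0;
    return r;
}
```

For a duration that moves from 100 to 115 with a stdev of 10 on both sides, this gives delta = 15, pct_change = 15, and d = 1.5, which the conventional cut-offs label a large effect.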

struct NodeResult

Result for a single node in the comparison tree.

Public Members

std::string name

Node name from the config.

std::string composed_query

Full composed query for this node.

std::vector<std::string> group_by

Group-by keys used.

std::vector<GroupComparison> groups

Per-group comparisons (empty if summary-only).

GroupComparison summary

Aggregate summary across all groups (always present).

std::vector<NodeResult> children

Child node results.

struct TraceMetadata

Metadata extracted from a trace run (baseline or variant).

Public Members

std::size_t file_count = 0

Number of trace files.

std::size_t process_count = 0

Number of unique process IDs.

std::size_t thread_count = 0

Number of unique thread IDs.

double total_bytes = 0.0

Total bytes transferred (I/O operations only).

double total_io_time_us = 0.0

Sum of all event durations in microseconds.

double makespan_us = 0.0

Wall-clock makespan in microseconds (max ts - min ts).
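A minimal sketch of how these fields could be derived from a list of events. Event, Meta, and summarize() are hypothetical; counting thread IDs globally rather than per process is an assumption, and file_count and total_bytes are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical minimal event record.
struct Event {
    int pid;
    int tid;
    double ts_us;   // start timestamp in microseconds
    double dur_us;  // duration in microseconds
};

// Subset of TraceMetadata.
struct Meta {
    std::size_t process_count = 0;
    std::size_t thread_count = 0;
    double total_io_time_us = 0.0;
    double makespan_us = 0.0;
};

inline Meta summarize(const std::vector<Event>& events) {
    Meta m;
    if (events.empty()) return m;
    std::set<int> pids;
    std::set<int> tids;
    double min_ts = events.front().ts_us;
    double max_ts = min_ts;
    for (const auto& e : events) {
        pids.insert(e.pid);
        tids.insert(e.tid);
        m.total_io_time_us += e.dur_us;
        min_ts = std::min(min_ts, e.ts_us);
        max_ts = std::max(max_ts, e.ts_us);
    }
    m.process_count = pids.size();
    m.thread_count = tids.size();
    m.makespan_us = max_ts - min_ts;  // max ts - min ts, as documented
    return m;
}
```

The summed duration can exceed the makespan when events from different processes overlap in time.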

class TreeTableFormatter

Renders ComparisonOutput as a text tree table or JSON.

The tree table uses dynamic column alignment with UTF-8 display width awareness. Nodes are rendered hierarchically with tree branch characters, and metrics are shown as leaves under each node.

Public Functions

explicit TreeTableFormatter(FormatterOptions options = {})

Construct a formatter with the given display options.

void render(std::FILE *out, const ComparisonOutput &output) const

Render the comparison output as a tree table to a FILE stream.

std::string render_json(const ComparisonOutput &output) const

Render the comparison output as a JSON string.
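The hierarchical rendering with tree branch characters, including an ASCII fallback when Unicode is disabled, can be sketched as below. TreeNode and render_tree() are hypothetical, and the specific branch glyphs are an assumption about what the formatter emits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal tree; the real formatter walks NodeResult.
struct TreeNode {
    std::string name;
    std::vector<TreeNode> children;
};

// Append one line per node to `out`, prefixing children with branch glyphs.
inline void render_tree(const TreeNode& node, bool use_unicode, std::string& out,
                        const std::string& prefix = "", bool is_root = true,
                        bool is_last = true) {
    const std::string tee = use_unicode ? "├─ " : "|- ";    // middle child
    const std::string elbow = use_unicode ? "└─ " : "`- ";  // last child
    const std::string pipe = use_unicode ? "│  " : "|  ";   // ancestor continues
    const std::string blank = "   ";                        // ancestor finished

    if (is_root) {
        out += node.name + "\n";
    } else {
        out += prefix + (is_last ? elbow : tee) + node.name + "\n";
    }
    // Children of a last child get blank padding; others keep the vertical bar.
    const std::string child_prefix =
        is_root ? "" : prefix + (is_last ? blank : pipe);
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        render_tree(node.children[i], use_unicode, out, child_prefix, false,
                    i + 1 == node.children.size());
    }
}
```

In ASCII mode, a root with children a (which has one child a1) and b renders as four lines: `root`, `|- a`, `|  `- a1`, and `` `- b``.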