DFTracer Comparator¶
Namespace: dftracer::utils::utilities::composites::dft::comparator
-
struct CollapsedMetrics¶
Per-(category, operation) collapsed metrics aggregated across processes and time windows.
Within each time window, values are reduced to the maximum across processes. Across windows, the per-window maxima are summarized as mean +/- stdev.
Public Members
-
AggregationMetrics merged = {0.01}¶
DDSketch merged across all windows for percentile queries.
-
double count_mean = 0.0¶
Mean of per-window max event counts.
-
double dur_mean_of_means = 0.0¶
Mean of per-window mean durations.
-
double dur_stdev_of_means = 0.0¶
Stdev of per-window mean durations.
-
double size_mean_of_means = 0.0¶
Mean of per-window mean sizes.
-
double size_stdev_of_means = 0.0¶
Stdev of per-window mean sizes.
-
double xfer_mean = 0.0¶
Mean of per-window max transfer sizes.
-
double xfer_stdev = 0.0¶
Stdev of per-window max transfer sizes.
-
double bw_mean = 0.0¶
Mean of per-window max bandwidths.
-
double bw_stdev = 0.0¶
Stdev of per-window max bandwidths.
-
std::size_t num_windows = 0¶
Number of time windows contributing to these statistics.
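The collapsing rule described for this struct (max across processes within a window, then mean and stdev across windows) can be sketched as follows. This is a minimal illustration and not the library's implementation: WindowStats, collapse, and the nested-vector input layout are hypothetical names, and the use of the sample (N-1) standard deviation is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: mean and stdev of the per-window maxima.
struct WindowStats {
    double mean = 0.0;
    double stdev = 0.0;
    std::size_t num_windows = 0;
};

// windows[w][p] holds one metric value (e.g. bandwidth) for process p in
// time window w. Assumption: sample stdev (N-1), reported as 0 when N <= 1.
inline WindowStats collapse(const std::vector<std::vector<double>>& windows) {
    std::vector<double> maxima;
    for (const auto& per_process : windows) {
        assert(!per_process.empty());  // each window needs at least one value
        maxima.push_back(*std::max_element(per_process.begin(), per_process.end()));
    }
    WindowStats out;
    out.num_windows = maxima.size();
    if (maxima.empty()) return out;

    double sum = 0.0;
    for (double v : maxima) sum += v;
    out.mean = sum / static_cast<double>(maxima.size());

    if (maxima.size() > 1) {
        double squared = 0.0;
        for (double v : maxima) squared += (v - out.mean) * (v - out.mean);
        out.stdev = std::sqrt(squared / static_cast<double>(maxima.size() - 1));
    }
    return out;
}
```

Fields such as bw_mean and bw_stdev would correspond to the mean and stdev members here, computed once per metric.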
-
struct ColumnWidths¶
Dynamically computed column widths for aligned table output.
-
struct ComparisonConfig¶
Top-level configuration for the comparison pipeline.
Constructed from CLI arguments via from_cli() or loaded from a JSON file via from_json_file(). Call resolve() after construction to propagate defaults down the node tree.
Public Functions
-
void resolve()¶
Resolve inheritance: compose queries and propagate defaults down the node tree. Must be called before using the config.
Public Members
-
std::string baseline¶
Baseline trace file or directory path.
-
std::string variant¶
Variant trace file or directory path.
-
ComparisonDefaults defaults¶
Default settings inherited by all nodes.
-
std::vector<ComparisonNode> nodes¶
Hierarchical comparison tree (top-level nodes).
-
std::vector<std::string> output_paths¶
Output file paths (CLI only; the format is detected from the file extension).
-
std::string format = "table"¶
Output format: “table” or “json”.
-
bool no_color = false¶
Disable ANSI color in table output.
-
std::size_t executor_threads = 0¶
Number of parallel threads (0 = auto-detect).
-
std::size_t checkpoint_size = 0¶
Checkpoint size for index building (0 = default).
-
std::string baseline_index_dir¶
Directory for baseline .dftindex store (empty = co-located).
-
std::string variant_index_dir¶
Directory for variant .dftindex store (empty = co-located).
-
bool force_rebuild = false¶
Force rebuild of existing indexes.
Public Static Functions
-
static std::optional<ComparisonConfig> from_json_file(const std::string &path, std::string &error)¶
Parse a JSON config file. Returns nullopt and sets error on failure.
-
static ComparisonConfig from_cli(const std::string &baseline, const std::string &variant, const std::string &query, const std::string &group_by_str)¶
Build config from CLI arguments (quick mode without JSON). Creates a single root node with the given query and group-by.
-
struct ComparisonDefaults¶
Default settings inherited by all comparison nodes unless overridden.
Public Members
-
std::vector<std::string> metrics = {"count", "duration", "size", "transfer_size", "bandwidth"}¶
Metric groups to compare (count, duration, size, transfer_size, bandwidth).
-
std::vector<double> percentiles = {0.50, 0.95, 0.99}¶
Percentiles to compute from DDSketch (e.g. p50, p95, p99).
-
double threshold_pct = 0.0¶
Hide changes below this percentage in the output.
-
double time_interval_ms = 5000.0¶
Time bucket width in milliseconds for aggregation windows.
-
std::string sort_by = "regression"¶
Sort order for groups: “regression” sorts worst regressions first.
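To show how percentile queries over a mergeable sketch can work, here is a small DDSketch-style structure for positive values. It is a sketch under stated assumptions, not the AggregationMetrics implementation: MiniSketch and its methods are hypothetical, and reading the 0.01 initializer as a 1% relative accuracy is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>

// Minimal DDSketch-style quantile sketch for positive values.
// Assumption: alpha is the relative accuracy (0.01 = 1%).
class MiniSketch {
public:
    explicit MiniSketch(double alpha)
        : gamma_((1.0 + alpha) / (1.0 - alpha)), log_gamma_(std::log(gamma_)) {}

    void add(double x) {
        assert(x > 0.0);  // this sketch handles positive values only
        // Bucket i covers the interval (gamma^(i-1), gamma^i].
        ++buckets_[static_cast<int>(std::ceil(std::log(x) / log_gamma_))];
        ++count_;
    }

    // Merging adds bucket counts; both sketches must use the same alpha.
    void merge(const MiniSketch& other) {
        for (const auto& kv : other.buckets_) buckets_[kv.first] += kv.second;
        count_ += other.count_;
    }

    // Estimate of the value at quantile q in [0, 1].
    double quantile(double q) const {
        assert(count_ > 0 && q >= 0.0 && q <= 1.0);
        const double rank = q * static_cast<double>(count_ - 1);
        std::size_t seen = 0;
        for (const auto& kv : buckets_) {
            seen += kv.second;
            if (static_cast<double>(seen) > rank) return representative(kv.first);
        }
        return representative(buckets_.rbegin()->first);
    }

private:
    // Point inside the bucket whose relative error to any member is <= alpha.
    double representative(int index) const {
        return 2.0 * std::pow(gamma_, index) / (gamma_ + 1.0);
    }

    double gamma_;
    double log_gamma_;
    std::map<int, std::size_t> buckets_;
    std::size_t count_ = 0;
};
```

Because merging only adds bucket counts, per-window sketches can be combined in any order before p50, p95, or p99 are queried.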
-
struct ComparisonNode¶
A node in the hierarchical comparison tree.
Each node carries a query that is AND’d with its parent’s query, and optional per-node overrides for metrics, percentiles, and threshold. Children inherit from their parent unless they override.
Public Members
-
std::string name¶
Display name for this node (e.g. “POSIX I/O”, “reads”).
-
std::string query¶
Additional query filter, AND’d with the parent’s composed query.
-
std::vector<std::string> group_by¶
Extra group-by keys beyond the default (cat, name).
-
std::optional<std::vector<std::string>> metrics¶
Per-node metric override (nullopt = inherit from parent).
-
std::optional<std::vector<double>> percentiles¶
Per-node percentile override (nullopt = inherit from parent).
-
std::optional<double> threshold_pct¶
Per-node threshold override (nullopt = inherit from parent).
-
std::optional<std::string> sort_by¶
Per-node sort override (nullopt = inherit from parent).
-
std::vector<ComparisonNode> children¶
Child nodes forming the comparison hierarchy.
-
std::string composed_query¶
Full query including all ancestor queries, populated by resolve().
-
std::vector<std::string> resolved_metrics¶
Resolved metrics after inheritance, populated by resolve().
-
std::vector<double> resolved_percentiles¶
Resolved percentiles after inheritance, populated by resolve().
-
double resolved_threshold_pct = 0.0¶
Resolved threshold after inheritance, populated by resolve().
-
std::string resolved_sort_by¶
Resolved sort order after inheritance, populated by resolve().
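The inheritance performed by resolve() can be pictured with the following sketch, which handles only the query and the threshold. CfgNode, compose, and resolve_node are hypothetical stand-ins, and joining queries as "(parent) and (child)" is an assumption; the real query syntax may differ.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in holding only the fields this sketch needs.
struct CfgNode {
    std::string query;
    std::optional<double> threshold_pct;  // nullopt = inherit from parent
    std::vector<CfgNode> children;
    std::string composed_query;           // filled in by resolve_node()
    double resolved_threshold_pct = 0.0;  // filled in by resolve_node()
};

// Assumption: queries are AND'd by wrapping both sides in parentheses.
inline std::string compose(const std::string& parent, const std::string& child) {
    if (parent.empty()) return child;
    if (child.empty()) return parent;
    return "(" + parent + ") and (" + child + ")";
}

// Walk the tree top-down: compose the query with the parent's composed
// query and fall back to the parent's resolved value when no override is set.
inline void resolve_node(CfgNode& node, const std::string& parent_query,
                         double parent_threshold) {
    node.composed_query = compose(parent_query, node.query);
    node.resolved_threshold_pct = node.threshold_pct.value_or(parent_threshold);
    for (CfgNode& child : node.children) {
        resolve_node(child, node.composed_query, node.resolved_threshold_pct);
    }
}
```

Metrics, percentiles, and sort order would follow the same value_or pattern, with the top-level call seeded from the defaults.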
-
struct ComparisonOutput¶
Top-level output of the comparison pipeline.
Public Functions
-
common::arrow::ArrowExportResult to_arrow() const¶
Flatten the tree into a single Arrow record batch.
Columns: node_path, metric_group, metric_name, baseline, variant, baseline_stdev, variant_stdev, delta, pct_change, cohens_d, significance, is_regression. Rows with all-zero values are skipped.
Public Members
-
std::string baseline_path¶
Baseline file or directory path.
-
std::string variant_path¶
Variant file or directory path.
-
std::size_t baseline_file_count = 0¶
Number of baseline trace files processed.
-
std::size_t variant_file_count = 0¶
Number of variant trace files processed.
-
TraceMetadata baseline_meta¶
Metadata extracted from baseline traces.
-
TraceMetadata variant_meta¶
Metadata extracted from variant traces.
-
std::vector<NodeResult> nodes¶
Top-level comparison tree results.
-
double execution_time_ms = 0.0¶
Total pipeline execution time in milliseconds.
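The flattening that to_arrow() describes (one row per metric, all-zero rows skipped) can be illustrated without Arrow by collecting plain rows. Metric, TreeNode, Row, and flatten are hypothetical, only a few of the listed columns are modeled, and joining node names with '/' to form node_path is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins with only the fields this sketch needs.
struct Metric {
    std::string name;
    double baseline = 0.0;
    double variant = 0.0;
};

struct TreeNode {
    std::string name;
    std::vector<Metric> summary;
    std::vector<TreeNode> children;
};

struct Row {
    std::string node_path;
    std::string metric_name;
    double baseline;
    double variant;
    double delta;
};

// Depth-first walk that appends one row per non-zero metric.
// Assumption: node_path joins ancestor names with '/'.
inline void flatten(const TreeNode& node, const std::string& parent_path,
                    std::vector<Row>& rows) {
    const std::string path =
        parent_path.empty() ? node.name : parent_path + "/" + node.name;
    for (const Metric& m : node.summary) {
        if (m.baseline == 0.0 && m.variant == 0.0) continue;  // skip all-zero rows
        rows.push_back({path, m.name, m.baseline, m.variant, m.variant - m.baseline});
    }
    for (const TreeNode& child : node.children) flatten(child, path, rows);
}
```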
-
class ComparisonUtility : public dftracer::utils::utilities::Utility<ComparisonUtilityInput, ComparisonUtilityOutput>¶
Joins baseline and variant aggregation outputs, builds the hierarchical comparison tree (root -> categories -> operations), and computes deltas with Cohen’s d significance classification.
Public Functions
-
coro::CoroTask<ComparisonUtilityOutput> process(const ComparisonUtilityInput &input) override¶
Run the comparison pipeline.
-
struct ComparisonUtilityInput¶
Input to ComparisonUtility::process().
Public Members
-
std::vector<ComparisonVisitorPair> visitors¶
Visitor pairs, one per flattened node in the tree.
-
ComparisonNode root_node¶
Root node of the comparison tree (for hierarchy reconstruction).
-
std::size_t baseline_file_count = 0¶
Number of baseline trace files (for metadata).
-
std::size_t variant_file_count = 0¶
Number of variant trace files (for metadata).
-
struct ComparisonUtilityOutput¶
Output from ComparisonUtility::process().
Public Members
-
NodeResult result¶
Hierarchical comparison result tree.
-
bool success = false¶
Whether the comparison completed successfully.
-
struct ComparisonVisitorPair¶
Paired baseline/variant aggregation outputs for a single comparison node.
Public Members
-
EventAggregatorOutput baseline¶
Aggregation output for the baseline run.
-
EventAggregatorOutput variant¶
Aggregation output for the variant run.
-
ComparisonNode node¶
Resolved config node for this visitor.
-
struct FormatterOptions¶
Options controlling the visual appearance of rendered output.
-
struct GroupComparison¶
Comparison of all metrics for a single (category, operation) group.
Public Members
-
std::string label¶
Group label (e.g. “POSIX/read”, empty for summary).
-
bool baseline_present = true¶
Whether this group exists in the baseline.
-
bool variant_present = true¶
Whether this group exists in the variant.
-
std::vector<MetricComparison> metrics¶
Per-metric comparisons for this group.
-
double worst_pct_change = 0.0¶
Worst percentage change across all metrics (for regression sorting).
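One way worst_pct_change could drive the "regression" sort order, combined with the threshold_pct filter, is sketched below. Group and visible_sorted are hypothetical, and two points are assumptions: that a larger worst_pct_change means a worse regression, and that the threshold is applied to the absolute value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for a group with its precomputed worst change.
struct Group {
    std::string label;
    double worst_pct_change = 0.0;
};

// Drop groups whose change is below the threshold, then list the
// largest changes first. stable_sort keeps ties in their input order.
inline std::vector<Group> visible_sorted(std::vector<Group> groups,
                                         double threshold_pct) {
    assert(threshold_pct >= 0.0);
    groups.erase(std::remove_if(groups.begin(), groups.end(),
                                [threshold_pct](const Group& g) {
                                    return std::fabs(g.worst_pct_change) < threshold_pct;
                                }),
                 groups.end());
    std::stable_sort(groups.begin(), groups.end(),
                     [](const Group& a, const Group& b) {
                         return a.worst_pct_change > b.worst_pct_change;
                     });
    return groups;
}
```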
-
struct MetricComparison¶
Comparison of a single metric between baseline and variant.
Public Members
-
std::string metric_name¶
Metric name (e.g. “count”, “dur_mean”, “dur_p50”, “size”).
-
double baseline_value = 0.0¶
Baseline value (mean across time windows).
-
double variant_value = 0.0¶
Variant value (mean across time windows).
-
double baseline_stdev = 0.0¶
Baseline standard deviation across time windows (0 if N<=1).
-
double variant_stdev = 0.0¶
Variant standard deviation across time windows.
-
double delta = 0.0¶
Absolute difference (variant - baseline).
-
double pct_change = 0.0¶
Percentage change (e.g. 15.3 for +15.3%).
-
double cohens_d = 0.0¶
Cohen’s d effect size.
-
Significance significance = Significance::NEGLIGIBLE¶
Classified significance level.
-
bool is_regression = false¶
True if performance got worse (higher duration, lower bandwidth).
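The derived fields above can be computed from the two means and standard deviations roughly as follows. This is a sketch under assumptions, not the library's code: Level, Comparison, and compare are hypothetical; only NEGLIGIBLE is named in this reference, so SMALL, MEDIUM, and LARGE are assumed names; the pooled standard deviation assumes equal sample sizes; and the 0.2 / 0.5 / 0.8 cut-offs are Cohen's conventional ones, which the library may not use.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical enum: only NEGLIGIBLE appears in the reference;
// the other level names are assumed.
enum class Level { NEGLIGIBLE, SMALL, MEDIUM, LARGE };

struct Comparison {
    double delta = 0.0;
    double pct_change = 0.0;
    double cohens_d = 0.0;
    Level level = Level::NEGLIGIBLE;
    bool is_regression = false;
};

// higher_is_worse: true for duration-like metrics, false for bandwidth-like ones.
inline Comparison compare(double base, double base_sd, double var, double var_sd,
                          bool higher_is_worse) {
    assert(base_sd >= 0.0 && var_sd >= 0.0);
    Comparison c;
    c.delta = var - base;
    c.pct_change = base != 0.0 ? 100.0 * c.delta / base : 0.0;

    // Pooled standard deviation for two equal-sized samples (assumption).
    const double pooled = std::sqrt((base_sd * base_sd + var_sd * var_sd) / 2.0);
    c.cohens_d = pooled > 0.0 ? c.delta / pooled : 0.0;

    // Conventional effect-size cut-offs; the library's thresholds may differ.
    const double d = std::fabs(c.cohens_d);
    c.level = d < 0.2   ? Level::NEGLIGIBLE
              : d < 0.5 ? Level::SMALL
              : d < 0.8 ? Level::MEDIUM
                        : Level::LARGE;

    c.is_regression = higher_is_worse ? c.delta > 0.0 : c.delta < 0.0;
    return c;
}
```

A zero pooled deviation (for example, a single time window on each side) yields cohens_d = 0 here, so the change is classified as negligible regardless of its size.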
-
struct NodeResult¶
Result for a single node in the comparison tree.
Public Members
-
std::string name¶
Node name from the config.
-
std::string composed_query¶
Full composed query for this node.
-
std::vector<std::string> group_by¶
Group-by keys used.
-
std::vector<GroupComparison> groups¶
Per-group comparisons (empty if summary-only).
-
GroupComparison summary¶
Aggregate summary across all groups (always present).
-
std::vector<NodeResult> children¶
Child node results.
-
struct TraceMetadata¶
Metadata extracted from a trace run (baseline or variant).
Public Members
-
std::size_t file_count = 0¶
Number of trace files.
-
std::size_t process_count = 0¶
Number of unique process IDs.
-
std::size_t thread_count = 0¶
Number of unique thread IDs.
-
double total_bytes = 0.0¶
Total bytes transferred (I/O operations only).
-
double total_io_time_us = 0.0¶
Sum of all event durations in microseconds.
-
double makespan_us = 0.0¶
Wall-clock makespan in microseconds (max ts - min ts).
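As an illustration of how these fields relate to raw events, the following sketch derives the counts, the summed duration, and the makespan from a list of events. Event, Meta, and summarize are hypothetical, and the event field names are assumptions; the makespan follows the max ts - min ts definition given above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical event record; field names are assumptions.
struct Event {
    std::uint64_t pid;
    std::uint64_t tid;
    double ts_us;   // start timestamp in microseconds
    double dur_us;  // duration in microseconds
};

struct Meta {
    std::size_t process_count = 0;
    std::size_t thread_count = 0;
    double total_io_time_us = 0.0;
    double makespan_us = 0.0;
};

inline Meta summarize(const std::vector<Event>& events) {
    Meta m;
    if (events.empty()) return m;

    std::set<std::uint64_t> pids;
    std::set<std::uint64_t> tids;
    double min_ts = events.front().ts_us;
    double max_ts = min_ts;
    for (const Event& e : events) {
        assert(e.dur_us >= 0.0);
        pids.insert(e.pid);
        tids.insert(e.tid);
        m.total_io_time_us += e.dur_us;
        min_ts = std::min(min_ts, e.ts_us);
        max_ts = std::max(max_ts, e.ts_us);
    }
    m.process_count = pids.size();
    m.thread_count = tids.size();
    m.makespan_us = max_ts - min_ts;  // max ts - min ts
    return m;
}
```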
-
class TreeTableFormatter¶
Renders ComparisonOutput as an ASCII tree table or JSON.
The tree table uses dynamic column alignment with UTF-8 display width awareness. Nodes are rendered hierarchically with tree branch characters, and metrics are shown as leaves under each node.
Public Functions
-
explicit TreeTableFormatter(FormatterOptions options = {})¶
Construct a formatter with the given display options.
-
void render(std::FILE *out, const ComparisonOutput &output) const¶
Render the comparison output as a tree table to a FILE stream.
-
std::string render_json(const ComparisonOutput &output) const¶
Render the comparison output as a JSON string.
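Column alignment with UTF-8 awareness matters because tree branch characters occupy several bytes but only one column. The sketch below shows the idea in its simplest form; display_width and pad_right are hypothetical helpers, and the assumption that every code point is one column wide means wide (e.g. CJK) and combining characters are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Approximate display width: count UTF-8 code points by skipping
// continuation bytes (bit pattern 10xxxxxx).
// Assumption: each code point occupies exactly one column.
inline std::size_t display_width(const std::string& s) {
    std::size_t width = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++width;
    }
    return width;
}

// Pad with spaces up to the requested number of display columns.
// Strings that are already wide enough are returned unchanged.
inline std::string pad_right(const std::string& s, std::size_t columns) {
    const std::size_t width = display_width(s);
    return width >= columns ? s : s + std::string(columns - width, ' ');
}
```

Padding by s.size() instead would under-pad any cell containing branch characters, since each of them is three bytes long in UTF-8.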