Custom Columns Derived from DFTracer
This section describes how to derive custom columns for a DFAnalyzer dataframe from DFTracer events by supplying a custom function and declaring which fields to load.
Example: Custom Workflow Columns
We can define custom columns in DFAnalyzer by specifying a function that extracts the desired fields from each event's json_object, together with a mapping from each field name to the column type it should be loaded with.
Below is an example of how to define a custom function to derive columns from a Pegasus Montage workflow trace:
def wf_cols_function(json_object, current_dict, time_approximate, condition_fn, load_data):
    d = {}
    if "args" in json_object:
        if "size" in json_object["args"]:
            d["size"] = int(json_object["args"]["size"])
        if "ret" in json_object["args"]:
            d["ret"] = int(json_object["args"]["ret"])
    return d

load_cols_wf = {'size': "int64[pyarrow]", 'ret': "int64[pyarrow]"}
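To see what such a function returns, it can be called directly on hand-written events before wiring it into DFAnalyzer. The sketch below repeats the function so that it runs on its own. The two sample events and the None placeholders passed for the remaining parameters are assumptions made for illustration; this particular function never reads those parameters.

```python
def wf_cols_function(json_object, current_dict, time_approximate, condition_fn, load_data):
    # Same logic as the example above: copy "size" and "ret" out of "args".
    d = {}
    if "args" in json_object:
        if "size" in json_object["args"]:
            d["size"] = int(json_object["args"]["size"])
        if "ret" in json_object["args"]:
            d["ret"] = int(json_object["args"]["ret"])
    return d

# Hypothetical events, shaped like parsed trace records (not real trace data).
event_with_args = {"name": "write", "cat": "POSIX", "args": {"size": "4096", "ret": 4096}}
event_without_args = {"name": "compute", "cat": "APP"}

# The extra parameters are unused by this function, so placeholders suffice here.
cols_with_args = wf_cols_function(event_with_args, None, None, None, None)
cols_without_args = wf_cols_function(event_without_args, None, None, None, None)

print(cols_with_args)
print(cols_without_args)
```

Events that carry no args entry yield an empty dictionary, so the function contributes no size or ret value for them.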
Next, pass this function to DFAnalyzer to load traces with the custom columns. The example below loads Montage traces:
from dfanalyzer import DFAnalyzer

analyzer = DFAnalyzer(
    "/path/to/montage-*-preload.pfw.gz",
    load_fn=wf_cols_function,
    load_cols=load_cols_wf
)
Here, the custom columns size and ret are loaded into the DFAnalyzer using the wf_cols_function.
You can modify the function and column types to match the fields relevant to your workload.
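As a minimal sketch of such a modification, the variant below keeps size and adds a string column. The field name fname, the function name io_cols_function, and the sample event are hypothetical and chosen only for illustration. "string[pyarrow]" is a pandas dtype alias; that DFAnalyzer accepts it in load_cols is assumed here by analogy with "int64[pyarrow]" above.

```python
def io_cols_function(json_object, current_dict, time_approximate, condition_fn, load_data):
    # Same signature as wf_cols_function; only the extracted fields differ.
    d = {}
    args = json_object.get("args", {})
    if "size" in args:
        d["size"] = int(args["size"])
    if "fname" in args:  # hypothetical field name, adjust to your trace
        d["fname"] = str(args["fname"])
    return d

# Column types for the fields returned above (string dtype is an assumption).
load_cols_io = {'size': "int64[pyarrow]", 'fname': "string[pyarrow]"}

# Quick check on a hand-written, hypothetical event.
sample_event = {"name": "read", "args": {"size": "512", "fname": "/tmp/input.fits"}}
sample_cols = io_cols_function(sample_event, None, None, None, None)
print(sample_cols)
```

Keeping the keys of the returned dictionary identical to the keys of the type mapping makes it easy to confirm that every extracted field has a declared type.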