PyTorch Profiler Integration
This guide explains how to use pydftracer with PyTorch’s built-in profiler to capture profiling events and log them to DFTracer trace files.
Overview
The PyTorch Profiler integration allows you to:
Capture PyTorch profiler events (CPU, CUDA, memory usage)
Log profiling data to DFTracer trace files with category PP (PyTorch Profiler)
Combine PyTorch profiling with I/O tracing for comprehensive analysis
Setup
Enable DFTracer before using the PyTorch Profiler integration:
export DFTRACER_ENABLE=1
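When a shell export is inconvenient (for example in a notebook), the same variable can be set from Python. This is a minimal sketch. It assumes DFTracer reads DFTRACER_ENABLE from the process environment when it is imported or initialized, so the assignment should come before any dftracer import.

```python
import os

# Assumption: DFTracer reads this variable from the process environment,
# so set it before importing or initializing dftracer.
os.environ["DFTRACER_ENABLE"] = "1"

# Quick sanity check that the setting is visible to this process.
dftracer_enabled = os.environ.get("DFTRACER_ENABLE") == "1"
print(f"DFTRACER_ENABLE set: {dftracer_enabled}")
```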
Basic Usage
Use the trace_handler function as the on_trace_ready callback for PyTorch’s profiler:
import torch
from torch.profiler import profile, schedule, ProfilerActivity
from dftracer.python import dftracer
from dftracer.python.torch import trace_handler
# Initialize DFTracer logger
df_logger = dftracer.initialize_log("profiler_trace.pfw", None, -1)
# Define profiler schedule
profiler_schedule = schedule(
    wait=1,
    warmup=1,
    active=2,
    repeat=1,
)
# Run profiler with trace_handler
with profile(
    activities=[ProfilerActivity.CPU],
    schedule=profiler_schedule,
    on_trace_ready=trace_handler,
    profile_memory=True,
    with_stack=True,
) as p:
    for step in range(4):
        # Your training code here
        model(input_data)
        p.step()
df_logger.finalize()
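To see why the loop above runs four steps, it helps to trace which phase each step falls into. The sketch below is a pure-Python mirror of how torch.profiler.schedule is documented to behave. The function name schedule_phase is hypothetical, and the phase strings stand in for PyTorch's ProfilerAction values.

```python
def schedule_phase(step, wait, warmup, active, repeat=0, skip_first=0):
    """Return the profiler phase name for a 0-based step index."""
    if step < skip_first:
        return "NONE"
    step -= skip_first
    cycle = wait + warmup + active
    # With repeat > 0, profiling stops after `repeat` full cycles.
    if repeat > 0 and step >= repeat * cycle:
        return "NONE"
    pos = step % cycle
    if pos < wait:
        return "NONE"
    if pos < wait + warmup:
        return "WARMUP"
    # The last active step of a cycle is the one that hands the trace off.
    return "RECORD_AND_SAVE" if pos == cycle - 1 else "RECORD"


# Same settings as the schedule above: wait=1, warmup=1, active=2, repeat=1.
phases = [schedule_phase(s, wait=1, warmup=1, active=2, repeat=1) for s in range(4)]
for step, phase in enumerate(phases):
    print(step, phase)
```

With these settings, one cycle is wait + warmup + active = 4 steps, so a loop of four steps covers exactly one cycle and trace_handler receives the two recorded steps.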
Training Loop Example
Complete example with a training loop:
import torch
import torch.nn as nn
from torch.profiler import profile, schedule, ProfilerActivity, record_function
from dftracer.python import dftracer
from dftracer.python import dft_fn as Profile
from dftracer.python.torch import trace_handler
# Initialize logger
df_logger = dftracer.initialize_log("training.pfw", None, -1)
# Model setup
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
# Create DFTracer profile for additional tracing
df_test = Profile("training")
@df_test.log
def training_step(inputs, labels):
    optimizer.zero_grad()
    with record_function("forward"):
        outputs = model(inputs)
    with record_function("loss"):
        loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
# Profiler schedule
profiler_schedule = schedule(wait=1, warmup=1, active=2, repeat=1)
with profile(
    activities=[ProfilerActivity.CPU],
    schedule=profiler_schedule,
    on_trace_ready=trace_handler,
    profile_memory=True,
) as p:
    for step in range(4):
        inputs = torch.randn(8, 10)
        labels = torch.randint(0, 2, (8,))
        loss = training_step(inputs, labels)
        p.step()
df_logger.finalize()
What Gets Traced
The trace_handler captures the following from PyTorch profiler events:
Event name: Operation or kernel name
Timing: Start time and duration (microseconds)
Device type: CPU or CUDA device
Memory usage: CPU and device memory
Input shapes: Size of input tensors
CPU/Device utilization: Percentage metrics
All events are logged with category PP (PyTorch Profiler) in the trace file.
Example Output
Trace entries from PyTorch Profiler look like:
{"name":"aten::linear","cat":"PP","ts":1234567890,"dur":150,"args":{"device":0,"cpu_memory":1024,"device_memory_usage":0}}
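Entries like this can be inspected with the standard json module. The sketch below decodes the sample entry and keeps it only if its category is PP. The helper name parse_pp_event is hypothetical, and the handling of trailing commas and bracket lines assumes the trace file stores one JSON object per line, which may not match every DFTracer output configuration.

```python
import json

# The sample entry from the example output, as one line of text.
line = (
    '{"name":"aten::linear","cat":"PP","ts":1234567890,"dur":150,'
    '"args":{"device":0,"cpu_memory":1024,"device_memory_usage":0}}'
)


def parse_pp_event(text):
    """Return the decoded event if it is a PyTorch Profiler entry, else None."""
    text = text.strip().rstrip(",")
    if not text.startswith("{"):
        return None  # skip array brackets or blank lines
    event = json.loads(text)
    return event if event.get("cat") == "PP" else None


event = parse_pp_event(line)
print(event["name"], event["dur"], event["args"]["cpu_memory"])
```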
Combining with I/O Tracing
To capture both PyTorch profiler events and I/O operations:
export DFTRACER_ENABLE=1
export DFTRACER_DATA_DIR=all
from dftracer.python import dftracer, ai
from dftracer.python.torch import trace_handler
from torch.profiler import profile, ProfilerActivity
df_logger = dftracer.initialize_log("combined.pfw", None, -1)
# Use AI decorators for I/O tracing
@ai.data.item
def load_data(idx):
    # Your data loading code
    return data
# Use PyTorch profiler for compute tracing
with profile(
    activities=[ProfilerActivity.CPU],
    on_trace_ready=trace_handler,
) as p:
    for step in range(steps):
        data = load_data(step)
        output = model(data)
        p.step()
df_logger.finalize()
API Reference
- dftracer.python.torch.trace_handler(profiler_result)
Callback function for PyTorch profiler’s on_trace_ready parameter.
Parameters:
profiler_result (torch.profiler.profiler.ProfilerResult) – PyTorch profiler result object containing events
The handler iterates through all profiler events and logs them to DFTracer with the following information:
name: Event key/name
cat: Always "PP" (PyTorch Profiler)
start_time: Event start time in microseconds
duration: Event duration in microseconds
int_args: device, cpu_memory, is_remote, device_memory_usage, input_size
float_args: total_cpu_percent, total_device_percent
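Because on_trace_ready accepts any callable that takes a single argument, trace_handler can be combined with other callbacks. The sketch below shows one way to do that. The chain_handlers helper is hypothetical and not part of DFTracer, and the two callbacks are stand-ins so the example runs without torch or dftracer installed.

```python
def chain_handlers(*handlers):
    """Build one callback that forwards its argument to each handler in order."""
    def combined(profiler_result):
        for handler in handlers:
            handler(profiler_result)
    return combined


# Stand-in callbacks that record how they were called.
calls = []


def fake_trace_handler(prof):
    # Placeholder for dftracer.python.torch.trace_handler.
    calls.append(("dftracer", prof))


def fake_summary_handler(prof):
    # Placeholder for any second callback, such as printing a summary table.
    calls.append(("summary", prof))


on_trace_ready = chain_handlers(fake_trace_handler, fake_summary_handler)
on_trace_ready("profiler-object")
print(calls)
```

In real code the combined callback would be passed as on_trace_ready=chain_handlers(trace_handler, other_handler), where other_handler is whatever additional callback you need.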