Quick Start
This guide shows how to enable pydftracer, trace functions and loops, configure it through environment variables, and hook it into PyTorch.
Basic Usage
Enabling DFTracer
Enable DFTracer by setting the DFTRACER_ENABLE environment variable before running your script:
export DFTRACER_ENABLE=1
python your_script.py
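The same switch can be set from Python when the traced script is launched as a child process. The following is a minimal sketch using only the standard library; `tracing_env` and `run_traced` are hypothetical helper names, not part of pydftracer.

```python
import os
import subprocess
import sys


def tracing_env(enable=True, base=None):
    """Return a copy of the environment with DFTRACER_ENABLE set or removed."""
    env = dict(os.environ if base is None else base)
    if enable:
        env["DFTRACER_ENABLE"] = "1"
    else:
        # Remove the variable instead of guessing a "disabled" value.
        env.pop("DFTRACER_ENABLE", None)
    return env


def run_traced(script_path):
    """Run a Python script in a child process with tracing enabled."""
    return subprocess.run(
        [sys.executable, script_path],
        env=tracing_env(),
        check=True,
    )
```

Calling `run_traced("your_script.py")` should be equivalent to the two shell commands above, without changing the environment of the parent process.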
Initialize DFTracer Logger
First, initialize the DFTracer logger:
from dftracer.python import dftracer
# Initialize with log file path, data directory, and process ID
df_logger = dftracer.initialize_log(
    log_file="trace.pfw",
    data_dir="/tmp/data",
    process_id=-1  # -1 for auto process ID
)
# Your code here - I/O operations will be automatically traced
# Always finalize when done
df_logger.finalize()
Function Tracing with dft_fn
Use dft_fn to create tracers for specific functions:
from dftracer.python import dftracer, dft_fn
import numpy as np
# Initialize logger
df_logger = dftracer.initialize_log("trace.pfw", "/tmp/data", -1)
# Create a function tracer
io_tracer = dft_fn("data_io")
@io_tracer.log
def write_data(filename, data):
    np.save(filename, data)

@io_tracer.log
def read_data(filename):
    return np.load(filename)
# Use the traced functions
data = np.ones((100, 100))
write_data('test.npy', data)
result = read_data('test.npy')
df_logger.finalize()
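To make the decorator pattern concrete, here is a conceptual stand-in that records a category, a name, and a duration for each call. It is not pydftracer's implementation: `SketchTracer` and its event fields are hypothetical and only illustrate what a `log`-style decorator can capture.

```python
import functools
import time


class SketchTracer:
    """Hypothetical tracer: collects one event dict per decorated call."""

    def __init__(self, category):
        self.category = category
        self.events = []

    def log(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the event even if the function raises.
                self.events.append({
                    "cat": self.category,
                    "name": func.__name__,
                    "dur_s": time.perf_counter() - start,
                })
        return wrapper


tracer = SketchTracer("data_io")


@tracer.log
def add(a, b):
    return a + b


total = add(2, 3)
```

Using `functools.wraps` keeps the original function name and docstring, which matters when the name itself is what ends up in the trace.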
Using Iterators
Wrap an iterable with the iter() method of a dft_fn tracer to track each iteration:
from dftracer.python import dftracer, dft_fn
df_logger = dftracer.initialize_log("trace.pfw", "/tmp/data", -1)
my_tracer = dft_fn("training")
@my_tracer.log
def process_batch(batch_id):
    # Process each batch
    for i in my_tracer.iter(range(10)):
        # Process item i
        pass
process_batch(0)
df_logger.finalize()
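An iterator wrapper can attribute time to each loop iteration by measuring between successive requests for the next item. The sketch below shows that idea with a plain generator; `timed_iter` is a hypothetical name and this is not how pydftracer implements `iter()`.

```python
import time


def timed_iter(iterable, durations):
    """Yield items from iterable, appending each loop body's duration."""
    for item in iterable:
        start = time.perf_counter()
        yield item
        # Execution resumes here when the consumer asks for the next item,
        # which is after the loop body for `item` has finished.
        durations.append(time.perf_counter() - start)


durations = []
squares = [i * i for i in timed_iter(range(3), durations)]
```

One consequence of this design is that if the consuming loop exits early with `break`, the duration of the final, interrupted iteration is not recorded.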
Environment Configuration
Configure pydftracer using environment variables:
# Enable DFTracer
export DFTRACER_ENABLE=1
# Set initialization mode (PRELOAD or FUNCTION)
export DFTRACER_INIT=PRELOAD
# Set log level (DEBUG, INFO, WARN, ERROR)
export DFTRACER_LOG_LEVEL=INFO
You can also check these in your code:
from dftracer.python import (
    DFTRACER_ENABLE,
    DFTRACER_INIT_PRELOAD,
    DFTRACER_LOG_LEVEL
)

if DFTRACER_ENABLE:
    print("DFTracer is enabled")
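In a launcher or test script where pydftracer is not imported, the same variables can be read directly from the process environment. This is a minimal sketch: `read_dftracer_env` is a hypothetical helper, and treating only the literal value `1` as enabled is an assumption based on the export shown above.

```python
import os


def read_dftracer_env(environ=None):
    """Collect the DFTracer variables used in this guide into a dict."""
    env = os.environ if environ is None else environ
    return {
        # Assumption: only the literal "1" counts as enabled.
        "enable": env.get("DFTRACER_ENABLE") == "1",
        # None when the variable is unset; no default is assumed.
        "init": env.get("DFTRACER_INIT"),
        "log_level": env.get("DFTRACER_LOG_LEVEL"),
    }


settings = read_dftracer_env(
    {"DFTRACER_ENABLE": "1", "DFTRACER_LOG_LEVEL": "INFO"}
)
```

Passing a mapping instead of always reading `os.environ` makes the helper easy to check in isolation.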
AI/ML Tracing
PyTorch Dynamo Integration
For PyTorch applications, use the create_backend function to create a custom DFTracer backend for torch.compile:
import torch
from dftracer.python import dftracer
from dftracer.python.dynamo import create_backend
# Initialize logger
df_logger = dftracer.initialize_log("model_trace.pfw", "/tmp/data", -1)
# Create a custom DFTracer backend
backend = create_backend(
    name="my_model",
    epoch=0,
    step=0,
    enable=True
)
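# Use with torch.compile (MyModel is a placeholder for your own torch.nn.Module)
model = MyModel()
compiled_model = torch.compile(model, backend=backend)
# Run your model - operations will be traced (input_tensor is a placeholder for your input)
output = compiled_model(input_tensor)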
# Finalize when done
df_logger.finalize()
Dynamo Class
For more control over Dynamo tracing:
from dftracer.python import Dynamo
# Create a Dynamo tracer instance
dynamo_tracer = Dynamo(
    name="my_model",
    epoch=1,
    step=100,
    enable=True
)
# Use in your training loop
# The tracer will record PyTorch operations
AI Tracing Features
DFTracerAI creates a tracer tagged with a category, a name, an epoch, and a step:
from dftracer.python import DFTracerAI
# Create an AI-specific tracer
ai_tracer = DFTracerAI(
    cat="training",
    name="resnet50",
    epoch=5,
    step=1000,
    enable=True
)
Advanced Usage
Metadata Events
Attach key-value metadata to the trace with log_metadata_event:
from dftracer.python import dftracer
dft = dftracer()
# Log metadata
dft.log_metadata_event("key", "value")
Next Steps
Explore the API Reference for detailed API documentation
Check the DFTracer main documentation for more advanced features
Look at example scripts in the repository