W&B tracks the inputs and outputs of runs using directed acyclic graphs (DAGs) called lineage graphs. Lineage graphs are visual representations of the relationships between artifacts and runs in an ML experiment. They show how data and models flow through different stages of the ML lifecycle, from raw data ingestion to model training and evaluation. Tracking artifact lineage provides several key advantages:
  • Reproducibility: Enables teams to reproduce experiments, models, and results for debugging, experimentation, and validation.
  • Version control: Tracks changes to artifacts over time, allowing teams to revert to previous data or model versions when needed.
  • Auditing: Maintains a detailed record of artifacts and transformations to support compliance and governance.
  • Collaboration: Improves teamwork by making experiment history transparent, reducing duplicated effort, and accelerating development.
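To make the DAG idea concrete, here is a minimal plain-Python sketch of a lineage graph. It is an illustration only and does not reflect how W&B stores lineage internally; the `LineageGraph` class, its methods, and the node labels are all hypothetical names chosen for this example.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage graph: nodes are runs or artifacts, edges point downstream."""

    def __init__(self):
        # Maps each node to the set of nodes it directly depends on.
        self._parents = defaultdict(set)

    def add_input(self, artifact, run):
        """Record that `run` consumed `artifact` (edge: artifact -> run)."""
        self._parents[run].add(artifact)

    def add_output(self, run, artifact):
        """Record that `run` produced `artifact` (edge: run -> artifact)."""
        self._parents[artifact].add(run)

    def upstream(self, node):
        """Return every run and artifact that `node` transitively depends on."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical pipeline: ingest -> raw data -> train -> model -> evaluate
graph = LineageGraph()
graph.add_output("run:ingest", "raw-data:v0")
graph.add_input("raw-data:v0", "run:train")
graph.add_output("run:train", "model:v0")
graph.add_input("model:v0", "run:evaluate")

ancestors = graph.upstream("model:v0")
print(sorted(ancestors))
```

Walking upstream from the model recovers the training run, the raw data it consumed, and the ingestion run that produced that data, which is the kind of question a lineage graph is meant to answer.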

View an artifact’s lineage graph

To view an artifact’s lineage graph:
  1. Navigate to your project’s workspace in the W&B App.
  2. Click on the Artifacts tab in the project sidebar.
  3. Select an artifact, then click the Lineage tab.
The Lineage tab displays the artifact’s lineage graph. Use the W&B App UI or the Python SDK to explore and traverse the graph.
Nodes with green icons represent runs. Nodes with blue icons represent artifacts. Arrows between nodes indicate the input and output of a run or artifact.
Artifact nodes display the artifact’s name and version in the form <artifact_name>:<version>. The artifact’s type is displayed above its name. You can view the type and name of an artifact in both the left sidebar and the lineage graph node.
Run nodes display the run’s name.
Run and artifact nodes
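If you need to work with artifact node labels in your own scripts, the <artifact_name>:<version> form can be split with a small helper. The sketch below is plain Python and not part of the W&B SDK; `ArtifactLabel`, `parse_artifact_label`, and the sample label are hypothetical.

```python
from typing import NamedTuple


class ArtifactLabel(NamedTuple):
    name: str
    version: str  # a version such as "v3", or an alias such as "latest"


def parse_artifact_label(label: str) -> ArtifactLabel:
    """Split '<artifact_name>:<version>' at the last colon."""
    name, sep, version = label.rpartition(":")
    if not sep or not name or not version:
        raise ValueError(f"expected '<artifact_name>:<version>', got {label!r}")
    return ArtifactLabel(name, version)


label = parse_artifact_label("mnist-dataset:v3")
print(label.name, label.version)
```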
Click any run node to see more information about that run, such as its start time, duration, author, and job type. Click any artifact node to see more information about that artifact, such as its aliases, creation time, type, version, description, file size, and the run that logged it.
Previewing a run
Runs that create multiple versions of the same artifact are grouped together in a cluster. Click on a specific artifact version listed within the cluster to view specific information about that artifact version.
Cluster of artifact versions in a lineage graph
Click and drag a node to rearrange the graph and customize its layout. You can also zoom in and out of the graph to get a better view of the nodes and their relationships.
Rearranging nodes in a lineage graph
Hover your mouse over a node and click on the eye icon to hide or show a node in the graph. This is useful for decluttering the graph to focus on specific nodes and their relationships.
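This section notes that the Python SDK can also traverse a lineage graph. The sketch below shows one way to look up the runs directly connected to an artifact. It relies on the `logged_by()` and `used_by()` methods of an artifact fetched with `wandb.Api().artifact()`. The helper name `describe_neighbors` and the `main()` wrapper are hypothetical, and `wandb` is imported inside `main()` only so that the helper can be exercised without the SDK or a network connection.

```python
def describe_neighbors(artifact):
    """Return the runs directly connected to `artifact` in its lineage graph.

    `artifact` is expected to behave like an artifact fetched through the
    W&B public API: `logged_by()` returns the producing run (or None) and
    `used_by()` returns the runs that consumed the artifact.
    """
    producer = artifact.logged_by()
    consumers = artifact.used_by()
    return {
        "artifact": artifact.name,
        "logged_by": producer.name if producer is not None else None,
        "used_by": [run.name for run in consumers],
    }


def main():
    # Requires `pip install wandb`, `wandb login`, and network access.
    import wandb

    api = wandb.Api()
    # Placeholder path: replace the angle-bracket values with your own.
    artifact = api.artifact("<entity>/<project>/<artifact_name>:<alias>")
    print(describe_neighbors(artifact))
```

Call `main()` from a script once the placeholders are filled in. To walk further up the graph, repeat the lookup on the input artifacts of the producing run.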

Enable lineage graph tracking

To enable lineage graph tracking, you need to mark artifacts as inputs or outputs of a run using the W&B Python SDK.

Track the input of a run

Mark an artifact as the input (or dependency) of a run with the wandb.Run.use_artifact() method. Specify the name of the artifact and an optional alias to reference a specific version of that artifact. The name of the artifact is in the format <artifact_name>:<version> or <artifact_name>:<alias>. Replace values enclosed in angle brackets (< >) with your values:
import wandb

# Initialize a run
with wandb.init(entity="<entity>", project="<project>") as run:
    # Fetch the artifact and mark it as a dependency (input) of this run.
    # The version or alias is part of the name, after the colon.
    artifact = run.use_artifact(artifact_or_name="<artifact_name>:<alias>")

Track the output of a run

To declare an artifact as an output of a run, first create the artifact with the wandb.Artifact() constructor and add files to it. Then log the artifact with wandb.Run.log_artifact(). Replace values enclosed in angle brackets (< >) with your values:
import wandb

# Initialize a run
with wandb.init(entity="<entity>", project="<project>") as run:
    # Create an artifact and add a local file to it
    artifact = wandb.Artifact(name="<artifact_name>", type="<artifact_type>")
    artifact.add_file(local_path="<local_filepath>", name="<optional-name>")

    # Log the artifact as an output of the run
    run.log_artifact(artifact_or_path=artifact)
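A run that both consumes and produces artifacts connects an input node to an output node in the lineage graph. The sketch below combines the two calls described in this section in a single helper. The helper name `log_derived_artifact`, its parameters, the `main()` wrapper, and the "preprocess" job type are hypothetical. The artifact class is passed in as an argument, and `wandb` is imported inside `main()`, only so that the helper can be exercised without a W&B account.

```python
def log_derived_artifact(run, artifact_cls, source_ref, output_name,
                         output_type, local_path):
    """Mark `source_ref` as an input of `run`, then log a new output artifact.

    `run` must provide `use_artifact()` and `log_artifact()`, and
    `artifact_cls` must accept `name` and `type` like `wandb.Artifact`.
    """
    # Input edge: source artifact -> run
    source = run.use_artifact(source_ref)

    # Output edge: run -> derived artifact
    derived = artifact_cls(name=output_name, type=output_type)
    derived.add_file(local_path=local_path)
    run.log_artifact(derived)
    return source, derived


def main():
    # Requires `pip install wandb`, `wandb login`, and network access.
    import wandb

    # Placeholder values: replace the angle-bracket values with your own.
    with wandb.init(entity="<entity>", project="<project>",
                    job_type="preprocess") as run:
        log_derived_artifact(
            run,
            wandb.Artifact,
            source_ref="<artifact_name>:<alias>",
            output_name="<output_artifact_name>",
            output_type="<artifact_type>",
            local_path="<local_filepath>",
        )
```

After a run like this finishes, the lineage graph of the output artifact should show the run as its producer and the source artifact as that run's input.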

Artifact clusters

When a level of the graph contains five or more runs or artifacts, the graph groups them into a cluster. A cluster has a search bar for finding specific versions of runs or artifacts. Click a node in the cluster to open a preview with an overview of that node. Click the arrow to extract the run or artifact from the cluster so you can continue examining the lineage of the extracted node.
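The clustering rule can be pictured with a short plain-Python sketch. This is not the W&B App's implementation. The function names, the dictionary layout, and the sample labels are hypothetical, and only the threshold of five comes from the description above.

```python
CLUSTER_THRESHOLD = 5  # five or more nodes on one level form a cluster


def layout_level(nodes, threshold=CLUSTER_THRESHOLD):
    """Collapse a level into one cluster when it has `threshold` or more nodes."""
    nodes = list(nodes)
    if len(nodes) >= threshold:
        return [{"kind": "cluster", "members": nodes}]
    return [{"kind": "node", "name": name} for name in nodes]


def search_cluster(cluster, query):
    """Return cluster members whose label contains `query` (case-insensitive)."""
    needle = query.lower()
    return [name for name in cluster["members"] if needle in name.lower()]


# Six versions of one artifact on the same level collapse into a cluster.
versions = [f"model:v{i}" for i in range(6)]
level = layout_level(versions)
matches = search_cluster(level[0], "v4")
print(level[0]["kind"], matches)
```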
Searching a run cluster