Key Design Concepts

Core Data Model

DUSTrack organizes point tracking data using three hierarchical concepts: Annotations, Labels, and Layers.

Annotations

An annotation is a single data point consisting of a frame number (integer) and a 2D pixel location (x, y coordinates as floats).

Example: 5: [120.3, 240.7] means at frame 5, the point is located at pixel coordinates (120.3, 240.7).

Annotations are the atomic units of tracking data—each represents one observation of a point at one moment in time.

Labels

A label is the identifier for one anatomical point being tracked throughout the video. Each label contains a collection of annotations across different frames. Labels are numeric strings for keyboard efficiency: press the number keys "0" through "9" to switch between labels during annotation. When tracking more than 10 points, extended ranges ("10"-"19", "20"-"29", etc.) are accessed via Shift+, and Shift+.

While numbers are used for efficient manual annotation, we recommend creating a separate .txt file containing human-readable names for each anatomical landmark. Create a dlc_trackermap.txt file to map numeric labels to anatomical names:

point0 - muscle_boundary
point1 - fascia
point2 - bone_surface
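
An analysis script can read this mapping back with a few lines of plain Python. The sketch below is illustrative only: parse_trackermap is a hypothetical helper, not part of DUSTrack, and it assumes that "pointN" in the file corresponds to the numeric label "N".

```python
def parse_trackermap(text):
    """Parse 'pointN - name' lines into {"N": "name"} (hypothetical helper)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, name = line.partition(" - ")
        key = key.strip()
        # Assumption: "point0" in the file refers to the numeric label "0"
        label = key[len("point"):] if key.startswith("point") else key
        mapping[label] = name.strip()
    return mapping


example = """point0 - muscle_boundary
point1 - fascia
point2 - bone_surface"""

names = parse_trackermap(example)
print(names["1"])  # fascia
```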

Data structure: Each label stores a dictionary mapping frame numbers to pixel coordinates. The example below gives the pixel location of the same anatomical landmark in three different frames:

"0": {5: [120.3, 240.7], 10: [122.1, 241.5], 15: [123.8, 242.1]}  # label "0": 3 annotations

Layers

A layer is a complete annotation set stored in one file. Each layer contains multiple labels (typically tracking different anatomical landmarks).

Complete example:

layer.data = {
    "0": {5: [120.3, 240.7], 10: [122.1, 241.5], 15: [123.8, 242.1]},  # label "0" with 3 annotations
    "1": {5: [130.5, 250.2], 10: [131.8, 251.0]},                      # label "1" with 2 annotations
}

In this structure:

Layer: The entire layer.data dictionary plus metadata (video path, etc.)
Labels: "0" and "1" (two different anatomical points)
Annotations: Individual frame-coordinate pairs like 5: [120.3, 240.7]
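
Because the structure is a plain nested dictionary, it can be explored with ordinary Python. The sketch below uses a standalone variable named layer_data (an illustrative stand-in for layer.data) to count annotations per label and to find the frames annotated for every label.

```python
# Stand-in for layer.data from the example above
layer_data = {
    "0": {5: [120.3, 240.7], 10: [122.1, 241.5], 15: [123.8, 242.1]},
    "1": {5: [130.5, 250.2], 10: [131.8, 251.0]},
}

# Labels are the outer keys; annotations are the inner frame -> [x, y] pairs
counts = {label: len(annotations) for label, annotations in layer_data.items()}

# Frames that carry an annotation for every label
shared_frames = set.intersection(*(set(a) for a in layer_data.values()))

for label, annotations in layer_data.items():
    for frame, (x, y) in sorted(annotations.items()):
        print(f"label {label!r}, frame {frame}: x={x}, y={y}")
```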

Understanding Layers and File Naming

Layers in DUSTrack represent individual annotation sessions or annotators. Each layer corresponds to a single file and is managed internally by the VideoAnnotation class. The layer naming convention typically uses the annotator’s initials.

When you save your annotations using the s keyboard shortcut, DUSTrack generates a JSON file following the pattern: {video_name}_annotations_{layer_name}.json.
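
The naming pattern can be reproduced in a script, for example to check whether a layer file already exists next to a video. This is a minimal sketch: annotation_path is a hypothetical helper, and it assumes that video_name means the file name without its extension and that the file sits in the same folder as the video.

```python
from pathlib import Path


def annotation_path(video_path, layer_name):
    """Build the expected annotation file path for a layer (hypothetical helper)."""
    video = Path(video_path)
    # Pattern: {video_name}_annotations_{layer_name}.json, next to the video
    return video.with_name(f"{video.stem}_annotations_{layer_name}.json")


path = annotation_path("recordings/video.mp4", "manual")
print(path.name)  # video_annotations_manual.json
```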

DeepLabCut (DLC) uses the .h5 file format. DUSTrack’s annotations are converted to a .h5 file when preparing data to train a ResNet model using DLC. DLC’s model predictions are also in .h5 format and can be directly loaded into DUSTrack (see Working with Layers).

Tip: Use the VideoAnnotation class independently in analysis scripts to read annotations stored in .json or .h5 files.

Why multiple layers?

DUSTrack supports simultaneous layers to enable:

Iterative Refinement Workflow
- Layer "manual": Initial manual annotations on sparse frames
- Layer "dlc_iter1": First DLC model predictions
- Layer "dlc_iter2": Refined model after correcting mislabeled frames
- Layer "lkmovavg_0.500": Post-processed results with jitter reduction
Comparison Between Sources
- Compare manual ground truth vs. automated predictions
- Evaluate different post-processing window sizes
- Visually assess inter-rater reliability between annotators
Non-destructive Editing
- Keep original annotations while experimenting with refinements
- Roll back to previous versions if needed

Current Layer vs. Overlay Layer

The GUI displays two layers simultaneously:

Current Layer (opaque markers): The layer you’re actively editing. Press -/= to cycle through layers.
Overlay Layer (translucent trace): A reference layer for comparison. Press [/] to cycle through overlays or set to None.

Common patterns:

Manual refinement: Current = "manual", Overlay = "dlc_iter1" (copy good predictions with c key)
Quality assessment: Current = "lkmovavg_0.500", Overlay = "dlc_iter1" (compare smoothed vs. raw)

Working with Layers

Creating layers:

import dustrack

# Initialize with single layer
tracker = dustrack.open('video.mp4', "manual")

# If video_annotations_manual.json file exists in the same folder as video.mp4, 
# then annotations from this file will be loaded into the "manual" layer.
# Otherwise, an "empty" layer will be created.

# Initialize with multiple layers
tracker = dustrack.open('video.mp4', ["manual", "dlc_iter1"])

# Initialize with specific file paths
tracker = dustrack.open('video.mp4', {
    'manual': 'path/to/manual_annotations.json',
    'predictions': 'path/to/dlc_predictions.h5'
})

Copying between layers (keyboard shortcuts):

c: Copy current label’s annotation from overlay to current layer (current frame only)
Alt+C: Copy annotations at marked frames of interest from overlay
Ctrl+Alt+C: Copy annotations in selected interval from overlay

The Buffer Layer: Every DUSTrack session includes a special "buffer" layer for temporary storage and experimentation. It serves as a scratch space.

Derived / dense layers: a few layers are produced from other layers by built-in operations and are rendered as continuous lines (not dots) by default:

dlc_iteration-N_M — DLC inference for iteration N at step M. Produced by Train DLC model; added live to the session after training completes.
dlccorr — Applied-manual-corrections layer. Produced by Apply manual corrections: the active sparse manual layer’s edits are spliced into the current DLC overlay’s per-frame trace, yielding a dense annotation. Excluded from DLC training input.
lkmovavg_<window> — Lucas-Kanade RSTC jitter-reduction output. Produced by Reduce jitter on a dense source layer (typical input: dlccorr after corrections).

The line-vs-dot rendering picks itself: any layer whose name starts with dlc_, equals dlccorr, or contains lkmovavg renders as a line; everything else (manual annotations) renders as dots.
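
The naming rule above can be written as a small predicate. This is a sketch of the rule as described, not DUSTrack's own code; renders_as_line is a hypothetical name.

```python
def renders_as_line(layer_name):
    """Return True if a layer name matches the line-rendering rule described above."""
    return (
        layer_name.startswith("dlc_")
        or layer_name == "dlccorr"
        or "lkmovavg" in layer_name
    )


for name in ["manual", "dlc_iter1", "dlccorr", "lkmovavg_0.500"]:
    style = "line" if renders_as_line(name) else "dots"
    print(f"{name}: {style}")
```

Note that "dlccorr" needs its own check because it does not start with "dlc_".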

Workflow buttons and sidebar groups

The workflow panel groups its buttons by task into four sections separated by horizontal lines, with a pastel per-group palette so visual scanning matches the workflow phase:

Group	Buttons	What they do
Workflow	Create DLC project · Train DLC model · Apply manual corrections · Reduce jitter · Save annotation as…	The five-step pipeline: scaffold a DLC project from your manual annotations → train ResNet → splice manual corrections into DLC predictions → smooth jitter with LK-RSTC → export.
Display	Trace: line / Trace: dot · Freeze/Unfreeze plot axes · Refresh UI · Keyboard shortcuts · (EnhanceWidget)	Visual controls + UI utilities. The EnhanceWidget is two sliders — CLAHE clip and Gamma — plus a `[None
Niche	Decimate annotations · Discard unsaved annotations · Replace existing from overlay · Remove layer	Layer-mutating affordances. Decimate annotations prunes incomplete frames then halves the remainder in the selected interval (training-feed prep). Discard unsaved annotations rolls the active manual layer back to disk (severity-aware confirm; refuses on `dlccorr` / `dlc_*` / dense layers). Replace existing from overlay copies the overlay’s annotations into the current layer at frames the current layer is missing. Remove layer drops a layer from the session (the file on disk is not deleted; to fully discard, save annotation as… then delete the original).
Swap layers	(single button)	Toggles foreground ↔ overlay. Positioned directly above the statevars widget it manipulates so the cause/effect is one glance.

The workflow order in the Workflow group matches the steady-state DUSTrack workflow:

Manually annotate sparse frames (no button — use the GUI directly).
Create DLC project — scaffolds config.yaml, videos/, labeled-data/. Migrates your manual-annotation layer files to the project folder.
Train DLC model — runs DLC extract_frames → train → evaluate → analyze_videos in a worker thread under a modal progress overlay (DLC stdout/stderr stream live into the overlay log and to the launching terminal). On completion the new dlc_iteration-N_M trace layer is added live to the session and a fresh iteration-{N+1} manual layer is created as the destination for the next round of corrections.
Apply manual corrections — splices the active manual layer’s sparse edits into the DLC overlay’s dense per-frame trace, producing a dlccorr layer.
Reduce jitter — runs LK-RSTC over a dense source layer (usually dlccorr), producing lkmovavg_<window>.
Save annotation as… — Qt file dialog seeded with {video_stem}_annotations_{layer_name}.json; useful for one-off exports outside the auto-save pattern.
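
To make the Apply manual corrections step concrete, the sketch below shows the simplest possible reading of "splice": wherever the sparse manual layer has an annotation, it overrides the dense DLC trace at that frame. This is an assumption for illustration; DUSTrack's actual operation may do more than a plain overwrite, and splice_corrections is a hypothetical name.

```python
import copy


def splice_corrections(dense, manual):
    """Return a new layer dict: the dense trace with manual edits overriding it."""
    result = copy.deepcopy(dense)  # leave the DLC layer untouched
    for label, annotations in manual.items():
        # Assumption: a manual annotation simply replaces the prediction
        result.setdefault(label, {}).update(copy.deepcopy(annotations))
    return result


dense = {"0": {10: [122.0, 241.0], 11: [122.4, 241.2], 12: [129.9, 250.0]}}
manual = {"0": {12: [122.8, 241.4]}}  # correction for a bad prediction at frame 12

dlccorr = splice_corrections(dense, manual)
print(dlccorr["0"][12])  # [122.8, 241.4]
```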

Discarding unsaved annotations

If you’ve made manual edits to the active layer but want to revert without closing the window, click Discard unsaved annotations (or use the keyboard shortcut from the cheatsheet). DUSTrack confirms via a severity-aware modal (the title is red if the diff is destructive), then either reloads the layer’s JSON from disk or — if no backing file exists — resets the layer to empty.

This is for manually authored layers; the operation refuses on dlccorr / dlc_* / dense layers, which are regenerable from upstream and should be removed and re-applied rather than reloaded.

Closing the window with unsaved changes

DUSTrack guards window close (X button, Alt+F4, plt.close()) by scanning every manual annotation layer for in-memory diffs vs disk. If any layer has unsaved changes, a modal lists the per-layer breakdown (+added / -removed / ~modified counts) and offers Save all / Discard / Cancel — with Cancel as the default, so accidental Enter/Esc won’t lose data.
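
The per-layer breakdown can be understood as a comparison of two layer dictionaries. The following sketch computes the three counts for one layer, assuming both versions use the label -> {frame: [x, y]} structure from the Core Data Model; diff_counts is a hypothetical helper, not DUSTrack's implementation.

```python
def diff_counts(on_disk, in_memory):
    """Return (added, removed, modified) annotation counts for one layer."""
    added = removed = modified = 0
    for label in set(on_disk) | set(in_memory):
        old = on_disk.get(label, {})
        new = in_memory.get(label, {})
        added += len(set(new) - set(old))    # frames only in memory
        removed += len(set(old) - set(new))  # frames only on disk
        # frames present in both versions but with different coordinates
        modified += sum(1 for f in set(old) & set(new) if old[f] != new[f])
    return added, removed, modified


on_disk = {"0": {5: [120.3, 240.7], 10: [122.1, 241.5], 20: [124.0, 243.0]}}
in_memory = {
    "0": {5: [120.3, 240.7], 10: [125.0, 241.5], 15: [123.8, 242.1]},
    "1": {5: [130.5, 250.2]},
}

added, removed, modified = diff_counts(on_disk, in_memory)
print(f"+{added} / -{removed} / ~{modified}")  # +2 / -1 / ~1
```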

Video Frame Indexing

DUSTrack uses zero-based frame indexing following Python conventions:

First frame: frame 0
For a 100-frame video: frames 0 to 99

Frame numbers in the GUI:

The current frame number appears in the interface state panel
Right-click on the trajectory plots to jump to specific frames
Frame markers show your current position in the x and y trajectory plots

Temporal navigation: DUSTrack treats videos as sequences of discrete frames. When you:

Annotate frame 10, then frame 50, you’re creating a sparse annotation
Run optical flow interpolation, it fills frames 11-49 based on the boundary conditions
Load DLC predictions, you typically get dense annotations (one prediction per frame)
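
The difference between sparse and dense annotations, and the zero-based frame range, can be checked with plain Python. The coordinates and the helper name unannotated_frames below are illustrative.

```python
def unannotated_frames(annotations, start, end):
    """Frames in [start, end] (inclusive) that have no annotation."""
    return [frame for frame in range(start, end + 1) if frame not in annotations]


# Sparse annotation of one label: only frames 10 and 50 are labeled
sparse = {10: [120.3, 240.7], 50: [131.8, 251.0]}
gaps = unannotated_frames(sparse, 10, 50)
print(gaps[0], gaps[-1], len(gaps))  # 11 49 39

# Zero-based indexing: a 100-frame video has valid frames 0..99
valid_frames = range(100)
print(valid_frames[0], valid_frames[-1])  # 0 99
```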

Frames of Interest

When working with videos, you often need to focus on (e.g. jump back and forth between) specific frames rather than sequentially reviewing all frames. DUSTrack’s “frames of interest” feature lets you mark and rapidly navigate between important frames.

What are Frames of Interest?

Frames of interest are user-marked frames that deserve special attention. Common use cases:

Evaluating model predictions: Mark frames where DLC predictions look questionable
Assessing annotation consistency: Mark frames to compare across multiple layers
Sparse manual annotation: Mark candidate frames for labeling before running optical flow interpolation
Quality control: Mark frames for systematic review during iterative refinement

Workflow Example

To review DLC predictions:

Set current layer to "manual", overlay to "dlc_iter1"
Step through video and press m on frames with poor predictions
Use Alt+, / Alt+. to cycle between marked frames
Press c to copy good predictions from overlay, or manually correct bad ones

Without frames of interest, reviewing predictions in a 1000-frame video means clicking through linearly or manually noting frame numbers. With frames of interest, you create a focused review queue that streamlines quality assessment and refinement.
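
Conceptually, the review queue is a sorted set of frame numbers with next and previous lookups. The sketch below illustrates that idea; it assumes that navigation wraps around at either end, which the text above does not state, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def next_marked(marked, current):
    """Next marked frame after `current`, wrapping to the first (assumed behavior)."""
    if not marked:
        return current  # nothing marked: stay on the current frame
    frames = sorted(marked)
    return frames[bisect_right(frames, current) % len(frames)]


def previous_marked(marked, current):
    """Previous marked frame before `current`, wrapping to the last (assumed behavior)."""
    if not marked:
        return current
    frames = sorted(marked)
    # An index of -1 wraps to the last marked frame
    return frames[bisect_left(frames, current) - 1]


marked = {340, 120, 875}
print(next_marked(marked, 340))      # 875
print(next_marked(marked, 875))      # 120
print(previous_marked(marked, 120))  # 875
```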

Event Intervals for Optical Flow Interpolation

DUSTrack provides an interval selection system that works seamlessly with the overlay layer concept to enable optical flow interpolation between sparse annotations.

What are Event Intervals?

An event interval is a range of frames defined by a start frame and end frame. This interval specifies where optical flow algorithms should interpolate point positions based on boundary conditions from the overlay layer.

How It Works

The typical workflow combines three elements:

Overlay layer containing sparse annotations (e.g., manual labels at frames 10 and 50)
Event interval marking the range to interpolate (frames 10-50)
Optical flow interpolation filling in frames 11-49 using Lucas-Kanade RSTC

The `z-z-a` Pattern

The most common operation is the z, z, a sequence:

First z: Hover your mouse over the trajectory plot at the start frame and press z to mark interval start
Second z: Hover over the end frame and press z to mark interval end
- A gray shaded region appears showing the selected interval
Press a: Triggers Lucas-Kanade RSTC interpolation for the current label
- Uses overlay layer positions at start/end frames as boundary conditions
- Fills intermediate frames in the current layer

Example workflow:

import dustrack

# Setup: manual layer has labels at frames 10 and 50
tracker = dustrack.open('video.mp4', "manual")

# In GUI:
# 1. Set overlay to "manual" (to use as reference)
# 2. Create new layer for interpolated results
# 3. Hover over trajectory plot at x=10, press 'z'
# 4. Hover over trajectory plot at x=50, press 'z'
# 5. Press 'a' to interpolate current label
# 6. Result: frames 11-49 now have smoothly interpolated positions

Overlay Layer as Boundary Conditions

The event interval system is designed to work with the overlay layer:

The overlay layer provides the source annotations (start and end points)
The current layer receives the interpolated results
This allows non-destructive interpolation: original annotations remain unchanged

Common pattern: Use manual annotations as overlay, create a new layer for interpolated results:

Current layer: "manual_interpolated" (empty or partially filled)
Overlay layer: "manual" (sparse annotations)
Select interval spanning two manual annotations
Press a to fill intermediate frames in current layer
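
The data flow of this pattern (boundary positions read from the overlay, results written to the current layer) is sketched below. Plain linear interpolation is used as a stand-in: it is not Lucas-Kanade RSTC, which tracks image content between the boundaries. fill_interval is a hypothetical function.

```python
def fill_interval(overlay, current, label, start, end):
    """Fill frames strictly between start and end in `current` for one label.

    Linear interpolation stands in for Lucas-Kanade RSTC in this sketch.
    """
    x0, y0 = overlay[label][start]  # boundary condition at the interval start
    x1, y1 = overlay[label][end]    # boundary condition at the interval end
    target = current.setdefault(label, {})
    for frame in range(start + 1, end):
        t = (frame - start) / (end - start)
        target[frame] = [x0 + t * (x1 - x0), y0 + t * (y1 - y0)]
    return current


overlay = {"0": {10: [100.0, 200.0], 50: [140.0, 220.0]}}  # "manual" layer
current = {}                                               # "manual_interpolated" layer

fill_interval(overlay, current, "0", 10, 50)
print(current["0"][30])  # [120.0, 210.0]
```

The overlay dictionary is only read, never written, which mirrors the non-destructive behavior described above.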

Interpolating Multiple Labels

Alt+A: Interpolate all labels in the selected interval (not just current label)
Useful when you’ve manually labeled multiple points at the interval boundaries

Design Rationale

Without event intervals, you’d need to:

Manually annotate every frame (tedious for 1000+ frame videos)
Write custom scripts to specify frame ranges
Risk overwriting original annotations during interpolation

With event intervals + overlay system:

Visually select ranges on trajectory plots
Preserve original data in overlay layer
Rapidly fill sparse annotations with optical flow

Architecture: Building on datanavigator

DUSTrack is built on top of the datanavigator package, which provides the foundational video browsing and annotation framework – specifically datanavigator.VideoBrowser, the asset-manager + button-row + state-variable widgets, and the events / _qt scaffolding. Understanding this relationship helps when extending DUSTrack or troubleshooting issues. More importantly, to develop a better understanding of design concepts in DUSTrack’s user interface, you need to dig into the datanavigator package.

Inheritance Hierarchy

datanavigator.VideoBrowser           # Foundational video-browsing GUI
    └── dustrack.DUSTrack            # Public: point-annotation primitives
                                     # + DLC workflow + UI (single class)

dustrack.VideoAnnotation             # Annotation data container; ships with
                                     # the LK-RSTC postprocess hook attached
                                     # at import time

Since DUSTrack inherits from datanavigator.VideoBrowser:

All datanavigator keyboard shortcuts work in DUSTrack
datanavigator documentation applies to basic operations

Note: the point-annotation UI primitives (formerly datanavigator.VideoPointAnnotator) relocated to dustrack in 1.2.0a1 as the internal _DUSTrackBase parent class. The 1.2.0rc1 refactor collapsed _DUSTrackBase into dustrack.DUSTrack (a single class is easier to navigate, and the parent/child split was no longer load-bearing). Headless / mpl-only constructions (the test suite, scripted render paths) pass fast_render=False to opt out of the Qt image pane.