Presnap Identity from Tracking Data
(IN PROGRESS)
This is a personal research project built on player-tracking data from the 2025 NFL Big Data Bowl dataset. It is an ongoing framework for learning the structural identity of football units directly from tracking data, with the goal of linking formation structure, spatial organization, and presnap behavior to performance and matchup outcomes.
For every play, I represent each unit’s presnap structure using a covariance matrix built from ball-relative spacing and movement features (e.g., dx, dy, speed, acceleration, and directional components). These matrices are symmetric positive definite (SPD) whenever they are full rank, which allows geometry-aware modeling: each matrix is mapped into a Euclidean tangent space using the matrix logarithm, and the resulting log-matrices are averaged across plays to form a team–week identity vector. This produces a compact mathematical representation of how an offense or defense typically aligns and organizes itself before the snap.
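One way the covariance, log-map, and averaging steps could look in code is sketched below. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function names, the small `eps` ridge added to keep each covariance strictly positive definite, and the sqrt(2) weighting of off-diagonal entries during vectorization are all hypothetical choices.

```python
import numpy as np

def play_covariance(features, eps=1e-6):
    """Covariance of one unit's presnap features for a single play.

    features: array of shape (n_players, n_features), e.g. columns such as
    dx, dy, speed, acceleration (column choice is an assumption here).
    eps: small ridge (assumption) so the matrix is strictly positive definite.
    """
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])

def spd_log(matrix):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def tangent_vector(matrix):
    """Flatten log(matrix) to a vector using its upper triangle.

    Off-diagonal entries are scaled by sqrt(2) (assumption) so the vector's
    Euclidean norm equals the Frobenius norm of the log-matrix.
    """
    log_m = spd_log(matrix)
    rows, cols = np.triu_indices(log_m.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_m[rows, cols] * scale

def identity_vector(plays):
    """Average the tangent vectors of all plays in one team-week."""
    return np.mean([tangent_vector(play_covariance(p)) for p in plays], axis=0)
```

With the sqrt(2) weighting, the Euclidean distance between two such vectors matches the Frobenius distance between the corresponding log-matrices, so distances computed later on the vectors stay consistent with the matrix geometry.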
The movement plots measure how much a unit’s presnap identity changes from week to week. I compute the distance between a team’s identity vector in consecutive weeks and sum these distances across the selected weeks.
High movement → the unit is changing its presnap structural identity more across games (more adaptive / more variable)
Low movement → the unit looks structurally similar week to week (more stable / more rigid)
This is not “motion at the line” on a single play; it is game-to-game identity drift, which is useful for understanding coaching tendencies and adaptability.
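The movement measure described above can be sketched as a sum of distances between consecutive weekly identity vectors. The sketch assumes Euclidean distance in the tangent space and treats "consecutive" as adjacent entries among the selected weeks (so a bye week is simply skipped); the function name and input layout are hypothetical.

```python
import numpy as np

def identity_movement(weekly_vectors):
    """Total week-to-week drift of one unit's identity.

    weekly_vectors: dict mapping week number -> identity vector (1-D array).
    Returns the sum of Euclidean distances between adjacent selected weeks.
    """
    weeks = sorted(weekly_vectors)
    total = 0.0
    for prev_week, next_week in zip(weeks, weeks[1:]):
        diff = np.asarray(weekly_vectors[next_week]) - np.asarray(weekly_vectors[prev_week])
        total += float(np.linalg.norm(diff))
    return total
```

A unit with only one selected week gets a movement of 0.0, since there is no consecutive pair to compare.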
The quadrant plots combine identity movement (x-axis) with performance (y-axis):
Defense plot: y = −EPA allowed (higher means better defense)
Offense plot: y = EPA per play (higher means better offense)
This produces four archetypes:
High movement + Good EPA: adaptive identity that’s working (effective weekly changes)
High movement + Bad EPA: changing identity without payoff (instability / over-adjustment)
Low movement + Good EPA: stable foundation that performs (system/execution strength)
Low movement + Bad EPA: predictable or stale identity (needs constraint-breakers)
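The four archetypes can be assigned programmatically once each unit has a movement value and a performance value. The sketch below assumes the quadrant boundaries are the league medians on each axis, which is an illustrative choice and not something stated above; the function name and the team keys are hypothetical. For defenses, `performance` is expected to already be −EPA allowed so that higher is better on both plots.

```python
import numpy as np

def assign_quadrants(movement, performance):
    """Label each team with one of the four archetypes.

    movement, performance: dicts mapping team -> float.
    Boundaries are the medians across teams (assumption).
    """
    teams = list(movement)
    movement_cut = np.median([movement[t] for t in teams])
    performance_cut = np.median([performance[t] for t in teams])

    labels = {}
    for team in teams:
        move_label = "High movement" if movement[team] > movement_cut else "Low movement"
        epa_label = "Good EPA" if performance[team] > performance_cut else "Bad EPA"
        labels[team] = f"{move_label} + {epa_label}"
    return labels
```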
Because the identity vectors live in the same tangent space, I can compare an offense’s presnap identity to a defense’s and quantify style mismatch. Combined with EPA priors, this creates directional matchup scores (Team A offense vs Team B defense, and the reverse). This enables identification of favorable matchups, scheme vulnerabilities, and structural advantages.
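As a rough illustration of the style-mismatch part only, the sketch below measures mismatch as the Euclidean distance between an offense's and a defense's identity vectors, in both directions. The distance choice, the function names, and the `(team, side)` key layout are assumptions; how the mismatch is combined with EPA priors is not specified above, so that step is deliberately left out.

```python
import numpy as np

def style_mismatch(offense_vec, defense_vec):
    """Euclidean distance between an offense and a defense identity vector."""
    diff = np.asarray(offense_vec, dtype=float) - np.asarray(defense_vec, dtype=float)
    return float(np.linalg.norm(diff))

def directional_mismatches(identities, team_a, team_b):
    """Mismatch for each direction of a matchup.

    identities: dict mapping (team, side) -> identity vector,
    where side is "offense" or "defense" (hypothetical layout).
    """
    return {
        f"{team_a} offense vs {team_b} defense": style_mismatch(
            identities[(team_a, "offense")], identities[(team_b, "defense")]
        ),
        f"{team_b} offense vs {team_a} defense": style_mismatch(
            identities[(team_b, "offense")], identities[(team_a, "defense")]
        ),
    }
```

Both vectors must be built from the same feature set so that they have the same length and their entries are comparable.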