LeanTrack 3D

Head-tracked parallax for the web

The Project

LeanTrack 3D is a WebGPU-powered 3D scene that responds to your head movement in real time. Using MediaPipe's Face Landmarker, it tracks your eye position and translates it into camera movement and field-of-view changes, creating the illusion of looking through a window into a 3D space.

The entire project is a single HTML file with no build step required. It's open source under the MIT license.

About Me

I'm Sean Lavery, a Principal Engineer based in Lake Forest, California, specializing in Game Dev, AI, ML, AR, VR, and Robotics.

I'm a self-taught Lead Engineer and 3D generalist who comes from a background in IT. Since 2016, I've been hands-on leading teams and developing software, tools, full-stack web apps, and XR applications. Using Unity3D and C# as my primary tools, along with VisionOS, ARKit, ARCore, and more, I've created and published applications on Apple Vision Pro, Android, iOS, Windows, and Meta Quest platforms.

"I am constantly building the future." — Sean Lavery

How It Works

LeanTrack 3D technology stack and implementation

Tech Stack

  • WebGPU: 3D rendering
  • MediaPipe: face tracking
  • WGSL: GPU shaders
  • Vanilla JS: no framework

AI Recipe

Want to build something similar? Give this prompt to an AI assistant:

Build a single-file HTML page with WebGPU that:

1. Uses MediaPipe Face Landmarker (@mediapipe/tasks-vision) to track face position from webcam
2. Extracts eye center (landmarks 468, 473) for X/Y position and nose tip Z (landmark 1) for depth
3. Applies exponential smoothing to reduce jitter
4. Creates a 3D room with grid-lined walls using WebGPU:
   - Custom WGSL vertex/fragment shaders
   - Depth buffer for proper 3D sorting
   - Instanced rendering for targets
5. Maps head movement to camera:
   - X/Y position controls camera pan
   - Z depth controls both camera position AND field-of-view
   - Leaning closer = wider FOV + camera moves deeper
6. Adds archery-style circular targets at various depths with a concentric ring pattern (gold center, red/white rings)

Key learnings:
  • "target" is a reserved word in WGSL; use "tgt" instead
  • MediaPipe Z is negative when closer to camera
  • Use min/ideal/max for getUserMedia to avoid cropping
  • Confidence thresholds around 0.3 work well for tracking
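The exponential smoothing in step 3 can be sketched as a one-pole filter. This is a minimal illustration, not the project's actual code; the alpha value and landmark shape are assumptions:

```javascript
// Exponential moving average to damp per-frame jitter in landmark positions.
// alpha near 0 = heavy smoothing (laggy); alpha near 1 = responsive (jittery).
function makeSmoother(alpha = 0.3) {
  let state = null;
  return (sample) => {
    if (state === null) {
      state = { ...sample };
    } else {
      for (const k of Object.keys(sample)) {
        state[k] += alpha * (sample[k] - state[k]);
      }
    }
    return { ...state };
  };
}

// Feed raw eye-center positions frame by frame:
const smooth = makeSmoother(0.3);
smooth({ x: 0.50, y: 0.50, z: -0.02 });
const s = smooth({ x: 0.60, y: 0.50, z: -0.02 });
// s.x has moved only 30% of the way toward the new reading
```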

Screen Size Estimation

The site estimates your physical screen size using device pixel ratio to calibrate how much camera movement corresponds to head movement. This creates a more natural 1:1 feeling without manual calibration.

Dev Log

The journey of building LeanTrack 3D

Starting Point

Initial Goal

Test MediaPipe Face Landmarker on the web. Started with a simple HTML page that draws face landmarks on a canvas overlay—the classic "does this work on my machine?" test.

Evolution to WebGPU

Switching to WebGPU

Decided to drop the canvas overlay and build a proper 3D scene with WebGPU. Created archery-style targets at different depths with a grid-lined room environment. The parallax effect when moving your head is immediately satisfying.

WGSL Reserved Keywords

Hit a shader compilation error: "target" is a reserved keyword in WGSL. Renamed variables to "tgt". Built a Playwright test to catch shader errors early—invaluable for iterating on GPU code.
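The reserved-word pitfall looks like this in WGSL (field names here are illustrative, not the project's actual uniforms):

```wgsl
// Fails to compile: "target" is reserved in WGSL.
// struct Uniforms { target : vec3<f32> };

// Works: rename to a non-reserved identifier.
struct Uniforms {
  tgt : vec3<f32>,
  fov : f32,
};
```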

Coordinate System Confusion

The Z-Direction Saga

Lost track of which way Z goes multiple times. MediaPipe Z is negative when closer, but the scene uses negative Z for depth. Added detailed comments documenting the coordinate systems:

  • MediaPipe: closer = more negative Z
  • Scene: deeper into the room = more negative Z
  • Camera at Z=4 looking toward Z=-60
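The conventions above can be captured in one mapping function. This is a sketch; the gain constant is an assumption, only the rest position (Z=4) and direction come from the notes:

```javascript
// MediaPipe: leaning closer => nose Z becomes more negative.
// Scene: deeper into the room => more negative Z; camera rests at z = 4.
const CAMERA_REST_Z = 4;  // resting camera position (from the scene setup)
const LEAN_GAIN = 20;     // assumed scale from landmark units to scene units

function cameraZFromNose(noseZ) {
  // A more negative noseZ (face closer) pushes the camera toward -Z (deeper).
  return CAMERA_REST_Z + noseZ * LEAN_GAIN;
}

cameraZFromNose(0);     // at rest: 4
cameraZFromNose(-0.05); // leaning in: camera moves toward -60
```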

Added UI toggles to test camera movement and FOV changes independently—essential for debugging.

Tuning the Feel

Screen Size Estimation

The parallax felt "off": moving my head didn't translate 1:1 to the scene. Implemented screen size estimation using devicePixelRatio and assumed PPI values. Combined with an adjustable sensitivity slider, this lets users dial in the feel.
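One way the estimation could work, assuming a typical panel density (the PPI values and function shape are assumptions, not the site's actual code):

```javascript
// Estimate the screen's physical width in inches from CSS width and
// devicePixelRatio, assuming a panel density (PPI is a guess the user
// can correct via the sensitivity slider).
function estimateScreenWidthInches(cssWidth, dpr, assumedPPI = 110) {
  const devicePixels = cssWidth * dpr; // physical pixels across the screen
  return devicePixels / assumedPPI;    // inches, under the PPI assumption
}

// In the browser this would be called as:
//   estimateScreenWidthInches(screen.width, window.devicePixelRatio);
// e.g. a 1512-pt laptop screen at dpr 2 and ~223 PPI:
const inches = estimateScreenWidthInches(1512, 2, 223); // ~13.6 inches
```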

Dual Z Effects

Leaning in now does two things: moves the camera deeper into the room AND widens the field of view. This creates a natural "peering through a window" effect that feels much more immersive than either alone.
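The dual effect can be sketched as a single lean parameter driving both values; all constants here are illustrative, not the tuned values from the site:

```javascript
// lean in [0, 1]: 0 = at rest, 1 = fully leaned in.
// Leaning in both dollies the camera deeper AND widens the FOV,
// which together read as "peering through a window".
const BASE_FOV = 60;      // degrees at rest (assumed)
const MAX_EXTRA_FOV = 30; // degrees added at full lean (assumed)
const REST_Z = 4;         // camera rest position from the scene setup
const MAX_DOLLY = 3;      // scene units traveled at full lean (assumed)

function cameraFromLean(lean) {
  const t = Math.min(Math.max(lean, 0), 1); // clamp to [0, 1]
  return {
    fovDeg: BASE_FOV + t * MAX_EXTRA_FOV, // wider when closer
    z: REST_Z - t * MAX_DOLLY,            // deeper (toward -Z) when closer
  };
}
```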

MediaPipe Confidence Thresholds

The default confidence of 0.5 caused frequent tracking dropouts. Lowered it to 0.3 for more permissive detection. The setOptions() API allows real-time adjustment without recreating the landmarker.
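A sketch of what that adjustment could look like. The option names follow @mediapipe/tasks-vision's FaceLandmarker options; treat the exact names as assumptions to verify against the installed version:

```javascript
// More permissive thresholds to avoid dropouts near frame edges.
// Values per the dev log; option names assumed from @mediapipe/tasks-vision.
const trackingOptions = {
  minFaceDetectionConfidence: 0.3,
  minFacePresenceConfidence: 0.3,
  minTrackingConfidence: 0.3,
};

// Applied at runtime without rebuilding the landmarker, e.g.:
//   await faceLandmarker.setOptions(trackingOptions);
```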

Camera Feed Issues

Frame Cropping

Tracking was lost near frame edges even with the face clearly visible. The issue: getUserMedia constraints were too strict, causing the browser to crop. Fixed by using min/ideal/max ranges and adding resizeMode: "none".
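A sketch of such a constraint set (the resolution values are assumptions; resizeMode is a standard MediaTrackConstraints member, though browser support varies):

```javascript
// Range constraints (min/ideal/max) let the browser pick a supported
// capture mode instead of cropping to satisfy an exact resolution.
const videoConstraints = {
  video: {
    width:  { min: 640, ideal: 1280, max: 1920 },
    height: { min: 480, ideal: 720,  max: 1080 },
    resizeMode: "none", // ask for the raw sensor frame, no crop/scale
  },
  audio: false,
};

// In the page:
//   const stream = await navigator.mediaDevices.getUserMedia(videoConstraints);
```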

Final Polish

Target Placement

Made targets smaller (0.25 units) with more depth variance. Close targets stay centered to avoid going off-frame; far targets can spread wider. Some targets now "pop out" in front of the viewing plane.
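The depth-dependent spread could be sketched like this; only the 0.25-unit radius comes from the log, every other constant is illustrative:

```javascript
// Place targets so spread grows with depth: close targets stay near the
// center (so they don't leave the frame), far ones can wander wider.
const TARGET_RADIUS = 0.25; // units, per the dev log

function placeTarget(depth, rng = Math.random) {
  // depth: 0 (near plane) .. 1 (back of room); constants are assumptions.
  const maxSpread = 0.5 + depth * 4.0; // spread widens with distance
  const z = 2 - depth * 62;            // from "pop out" (+2) back to z = -60
  return {
    x: (rng() * 2 - 1) * maxSpread,
    y: (rng() * 2 - 1) * maxSpread * 0.5,
    z,
    radius: TARGET_RADIUS,
  };
}
```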

Key Takeaways