LeanTrack 3D

Head-tracked parallax for the web

The Project

LeanTrack 3D is a WebGPU-powered 3D scene that responds to your head movement in real time. Using MediaPipe's Face Landmarker, it tracks your eye position and translates it into camera movement and field-of-view changes, creating the illusion of looking through a window into a 3D space.

The entire project is a single HTML file with no build step required. It's open source under the MIT license.

About Me

I'm Sean Lavery, a Principal Engineer based in Lake Forest, California, specializing in Game Dev, AI, ML, AR, VR, and Robotics.

I'm a self-taught Lead Engineer and 3D generalist who comes from a background in IT. Since 2016, I've been hands-on leading teams and developing software, tools, full-stack web apps, and XR applications. Using Unity3D and C# as my primary tools, along with VisionOS, ARKit, ARCore, and more, I've created and published applications on Apple Vision Pro, Android, iOS, Windows, and Meta Quest platforms.

"I am constantly building the future." — Sean Lavery

How It Works

LeanTrack 3D technology stack and implementation

Tech Stack

  • WebGPU: 3D rendering
  • MediaPipe: face tracking
  • WGSL: GPU shaders
  • Vanilla JS: no framework

AI Recipe

Want to build something similar? Give this prompt to an AI assistant:

Build a single-file HTML page with WebGPU that:

1. Uses MediaPipe Face Landmarker (@mediapipe/tasks-vision) to track face position from webcam
2. Extracts eye center (landmarks 468, 473) for X/Y position and nose tip Z (landmark 1) for depth
3. Applies exponential smoothing to reduce jitter
4. Creates a 3D room with grid-lined walls using WebGPU:
   - Custom WGSL vertex/fragment shaders
   - Depth buffer for proper 3D sorting
   - Instanced rendering for targets
5. Maps head movement to camera:
   - X/Y position controls camera pan
   - Z depth controls both camera position AND field-of-view
   - Leaning closer = wider FOV + camera moves deeper
6. Adds archery-style circular targets at various depths with a concentric ring pattern (gold center, red/white rings)

Key learnings:
  • "target" is a reserved word in WGSL; use "tgt" instead
  • MediaPipe Z is negative when closer to camera
  • Use min/ideal/max for getUserMedia to avoid cropping
  • Confidence thresholds around 0.3 work well for tracking
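The exponential smoothing in step 3 can be sketched as a one-pole filter. This is a minimal illustration, not the project's actual code; the alpha value and landmark shape are assumptions:

```javascript
// Exponential moving average to damp per-frame jitter in landmark positions.
// alpha near 0 = heavy smoothing (laggy); alpha near 1 = responsive (jittery).
function makeSmoother(alpha = 0.3) {
  let state = null;
  return (sample) => {
    if (state === null) {
      state = { ...sample };
    } else {
      for (const k of Object.keys(sample)) {
        state[k] += alpha * (sample[k] - state[k]);
      }
    }
    return { ...state };
  };
}

// Feed raw eye-center positions frame by frame:
const smooth = makeSmoother(0.3);
smooth({ x: 0.50, y: 0.50, z: -0.02 });
const s = smooth({ x: 0.60, y: 0.50, z: -0.02 });
// s.x has moved only 30% of the way toward the new reading
```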

Screen Size Estimation

The site estimates your physical screen size using device pixel ratio to calibrate how much camera movement corresponds to head movement. This creates a more natural 1:1 feeling without manual calibration.

Dev Log

The journey of building LeanTrack 3D

Starting Point

Initial Goal

Test MediaPipe Face Landmarker on the web. Started with a simple HTML page that draws face landmarks on a canvas overlay—the classic "does this work on my machine?" test.

Evolution to WebGPU

Switching to WebGPU

Decided to drop the canvas overlay and build a proper 3D scene with WebGPU. Created archery-style targets at different depths with a grid-lined room environment. The parallax effect when moving your head is immediately satisfying.

WGSL Reserved Keywords

Hit a shader compilation error: "target" is a reserved keyword in WGSL. Renamed variables to "tgt". Built a Playwright test to catch shader errors early—invaluable for iterating on GPU code.
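The reserved-word pitfall looks like this in WGSL (field names here are illustrative, not the project's actual uniforms):

```wgsl
// Fails to compile: "target" is reserved in WGSL.
// struct Uniforms { target : vec3<f32> };

// Works: rename to a non-reserved identifier.
struct Uniforms {
  tgt : vec3<f32>,
  fov : f32,
};
```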

Coordinate System Confusion

The Z-Direction Saga

Lost track of which way Z goes multiple times. MediaPipe Z is negative when closer, but the scene uses negative Z for depth. Added detailed comments documenting the coordinate systems:

  • MediaPipe: closer = more negative Z
  • Scene: deeper into the room = more negative Z
  • Camera at Z=4 looking toward Z=-60
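The conventions above can be captured in one mapping function. This is a sketch; the gain constant is an assumption, only the rest position (Z=4) and direction come from the notes:

```javascript
// MediaPipe: leaning closer => nose Z becomes more negative.
// Scene: deeper into the room => more negative Z; camera rests at z = 4.
const CAMERA_REST_Z = 4;  // resting camera position (from the scene setup)
const LEAN_GAIN = 20;     // assumed scale from landmark units to scene units

function cameraZFromNose(noseZ) {
  // A more negative noseZ (face closer) pushes the camera toward -Z (deeper).
  return CAMERA_REST_Z + noseZ * LEAN_GAIN;
}

cameraZFromNose(0);     // at rest: 4
cameraZFromNose(-0.05); // leaning in: camera moves toward -60
```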

Added UI toggles to test camera movement and FOV changes independently—essential for debugging.

Tuning the Feel

Screen Size Estimation

The parallax felt "off": moving my head didn't translate 1:1 to the scene. Implemented screen size estimation using devicePixelRatio and assumed PPI values. Combined with an adjustable sensitivity slider, this lets users dial in the feel.
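One way the estimation could work, assuming a typical panel density (the PPI values and function shape are assumptions, not the site's actual code):

```javascript
// Estimate the screen's physical width in inches from CSS width and
// devicePixelRatio, assuming a panel density (PPI is a guess the user
// can correct via the sensitivity slider).
function estimateScreenWidthInches(cssWidth, dpr, assumedPPI = 110) {
  const devicePixels = cssWidth * dpr; // physical pixels across the screen
  return devicePixels / assumedPPI;    // inches, under the PPI assumption
}

// In the browser this would be called as:
//   estimateScreenWidthInches(screen.width, window.devicePixelRatio);
// e.g. a 1512-pt laptop screen at dpr 2 and ~223 PPI:
const inches = estimateScreenWidthInches(1512, 2, 223); // ~13.6 inches
```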

Dual Z Effects

Leaning in now does two things: moves the camera deeper into the room AND widens the field of view. This creates a natural "peering through a window" effect that feels much more immersive than either alone.
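The dual effect can be sketched as a single lean parameter driving both values; all constants here are illustrative, not the tuned values from the site:

```javascript
// lean in [0, 1]: 0 = at rest, 1 = fully leaned in.
// Leaning in both dollies the camera deeper AND widens the FOV,
// which together read as "peering through a window".
const BASE_FOV = 60;      // degrees at rest (assumed)
const MAX_EXTRA_FOV = 30; // degrees added at full lean (assumed)
const REST_Z = 4;         // camera rest position from the scene setup
const MAX_DOLLY = 3;      // scene units traveled at full lean (assumed)

function cameraFromLean(lean) {
  const t = Math.min(Math.max(lean, 0), 1); // clamp to [0, 1]
  return {
    fovDeg: BASE_FOV + t * MAX_EXTRA_FOV, // wider when closer
    z: REST_Z - t * MAX_DOLLY,            // deeper (toward -Z) when closer
  };
}
```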

MediaPipe Confidence Thresholds

The default confidence of 0.5 caused frequent tracking dropouts. Lowered it to 0.3 for more permissive detection. The setOptions() API allows real-time adjustment without recreating the landmarker.
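A sketch of what that adjustment could look like. The option names follow @mediapipe/tasks-vision's FaceLandmarker options; treat the exact names as assumptions to verify against the installed version:

```javascript
// More permissive thresholds to avoid dropouts near frame edges.
// Values per the dev log; option names assumed from @mediapipe/tasks-vision.
const trackingOptions = {
  minFaceDetectionConfidence: 0.3,
  minFacePresenceConfidence: 0.3,
  minTrackingConfidence: 0.3,
};

// Applied at runtime without rebuilding the landmarker, e.g.:
//   await faceLandmarker.setOptions(trackingOptions);
```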

Camera Feed Issues

Frame Cropping

Tracking was lost near frame edges even with the face clearly visible. The issue: getUserMedia constraints were too strict, causing the browser to crop. Fixed by using min/ideal/max ranges and adding resizeMode: "none".
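A sketch of such a constraint set (the resolution values are assumptions; resizeMode is a standard MediaTrackConstraints member, though browser support varies):

```javascript
// Range constraints (min/ideal/max) let the browser pick a supported
// capture mode instead of cropping to satisfy an exact resolution.
const videoConstraints = {
  video: {
    width:  { min: 640, ideal: 1280, max: 1920 },
    height: { min: 480, ideal: 720,  max: 1080 },
    resizeMode: "none", // ask for the raw sensor frame, no crop/scale
  },
  audio: false,
};

// In the page:
//   const stream = await navigator.mediaDevices.getUserMedia(videoConstraints);
```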

Final Polish

Target Placement

Made targets smaller (0.25 units) with more depth variance. Close targets stay centered to avoid going off-frame; far targets can spread wider. Some targets now "pop out" in front of the viewing plane.
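The depth-dependent spread could be sketched like this; only the 0.25-unit radius comes from the log, every other constant is illustrative:

```javascript
// Place targets so spread grows with depth: close targets stay near the
// center (so they don't leave the frame), far ones can wander wider.
const TARGET_RADIUS = 0.25; // units, per the dev log

function placeTarget(depth, rng = Math.random) {
  // depth: 0 (near plane) .. 1 (back of room); constants are assumptions.
  const maxSpread = 0.5 + depth * 4.0; // spread widens with distance
  const z = 2 - depth * 62;            // from "pop out" (+2) back to z = -60
  return {
    x: (rng() * 2 - 1) * maxSpread,
    y: (rng() * 2 - 1) * maxSpread * 0.5,
    z,
    radius: TARGET_RADIUS,
  };
}
```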

Key Takeaways