
Here are two declassified case studies showing how classic computer vision delivered real‑time results on modest hardware in the early 2000s. To the best of my knowledge, they were the first publicly deployed real‑time, multi‑camera tracking systems of their kind on commodity CPUs:
- Speedway (2003): real‑time multi‑motorcycle tracking in a stadium.
- Greyhound (2004-2005): multi‑camera dog (and mechanical hare) tracking at a racetrack.
Both ran live, streamed data to dial‑up users, and had to succeed on tiny FLOPS budgets; that scarcity forged habits that still matter on modern edge devices. For additional detail, the articles above are the best place to start (the Greyhound piece in particular includes plenty of photos); a brief summary and the lessons learned are outlined below.
2003–2005: What Shipped Under Tight Constraints
Context
When these systems went live, OpenCV was a fledgling v0.x, AlexNet (2012) was years away, and a single Pentium 4 could push on the order of 12 GFLOPS (≈0.012 TOPS) of peak compute. There was no practical GPGPU, 1 GB of RAM counted as high‑end, and the video was interlaced PAL at 25 fps.
Systems at a glance
System | Cameras | Compute | Objects | End‑to‑End Latency | Notable Tricks |
---|---|---|---|---|---|
Speedway (2003) | 9 × PAL CCTV | 3 × Pentium 4 | 4 motorcycles | <200 ms | SSE2 color kernels, helmet‑cam identity hints |
Greyhound (2004-2005) | 52 × PAL CCTV | 9 × Pentium 4 | 6 dogs + hare | <220 ms | 64×16 analog video matrix; 1‑D “track‑unwrapped” EKF |
Why the “primitive” hardware helped
Constraint | Counter‑measure |
---|---|
25 fps interlaced PAL | Use single‑field processing to halve motion blur; regain detail via multi‑view geometry |
Zero GPUs, 1 GB RAM | Hand‑rolled SIMD, LUT color classifiers, early‑exit motion masks, ROI pyramids |
100 Mb LAN; many dial‑up users | Stream state vectors (<1 kB per frame) instead of video |
Dust, glare, dropouts | Per‑pixel variance masks; auto‑recovery and camera failover |
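As a taste of how those counter‑measures look in code, here is a minimal sketch of a LUT colour classifier in NumPy; the 5‑bit quantisation, the 32 KB table size, and the colour ranges are illustrative assumptions, not the original SSE2 implementation:

```python
import numpy as np

# 32x32x32 lookup table (5 bits per RGB channel, 32 KB total), small
# enough to stay resident in a Pentium 4-era cache. 0 = background.
LUT = np.zeros((32, 32, 32), dtype=np.uint8)

def mark_colour(class_id, lo, hi):
    """Label every quantised RGB cell inside the box [lo, hi] with class_id."""
    l = np.asarray(lo) >> 3
    h = np.asarray(hi) >> 3
    LUT[l[0]:h[0] + 1, l[1]:h[1] + 1, l[2]:h[2] + 1] = class_id

mark_colour(1, (180, 0, 0), (255, 80, 80))   # e.g. a red bib or helmet
mark_colour(2, (0, 0, 150), (90, 90, 255))   # e.g. a blue one

def classify(frame):
    """H x W x 3 uint8 frame -> H x W class map, one table fetch per pixel."""
    q = frame >> 3                            # quantise to 5 bits per channel
    return LUT[q[..., 0], q[..., 1], q[..., 2]]
```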
Engineering highlights
- Geometry‑first pipelines. In both systems the oval track was “unwrapped” to a 1‑D arclength coordinate s. Each camera produced observations z = s + noise; a single EKF fused them into smooth trajectories, so occlusions became gaps along s rather than hard 2‑D re‑identification problems (a sketch of this fusion follows the list).
- Deterministic latency. Fixed time budgets per stage (capture → mask → blob → association → fuse), with watchdogs that degraded gracefully (smaller ROIs, shorter association windows) under load; a toy watchdog is sketched below.
- Robust association. Simple gating (Mahalanobis distance) plus nearest‑neighbour matching across cameras outperformed heavier global solvers on the commodity CPUs of the era; the fusion sketch below includes such a gate.
- Operational pragmatism. Camera‑by‑camera health scores; automatic de‑weighting in the filter when variance spiked (rain, floodlights, spectators).
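To make the geometry‑first idea concrete, here is a minimal sketch of the track‑unwrapped fusion, assuming a constant‑velocity motion model; the track length, noise values, and gate threshold are invented for illustration, and with this linear measurement model the EKF reduces to a plain Kalman filter:

```python
import numpy as np

TRACK_LEN = 480.0  # metres of unwrapped oval; an assumed figure

class UnwrappedTrackFilter:
    """Constant-velocity filter on the 1-D arclength coordinate s."""

    def __init__(self, s0, v0=15.0):
        self.x = np.array([s0, v0])    # state: [position s, speed]
        self.P = np.diag([4.0, 4.0])   # state covariance (illustrative)
        self.Q = np.diag([0.05, 0.5])  # process noise per second (illustrative)

    def predict(self, dt):
        F = np.array([[1.0, dt], [0.0, 1.0]])
        self.x = F @ self.x
        self.x[0] %= TRACK_LEN                 # stay on the oval
        self.P = F @ self.P @ F.T + self.Q * dt

    def update(self, z, r, gate=9.0):
        """Fuse one camera's observation z = s + noise (variance r)."""
        # Signed innovation with wrap-around at the start/finish line:
        y = (z - self.x[0] + TRACK_LEN / 2) % TRACK_LEN - TRACK_LEN / 2
        S = self.P[0, 0] + r                   # innovation variance (H = [1, 0])
        if y * y / S > gate:                   # squared Mahalanobis distance, ~3-sigma gate
            return False                       # gated out: a gap along s, not a re-ID crisis
        K = self.P[:, 0] / S                   # Kalman gain
        self.x = self.x + K * y
        self.P = self.P - np.outer(K, self.P[0, :])
        return True

# One PAL field step (50 Hz), then fuse two overlapping cameras:
f = UnwrappedTrackFilter(s0=12.0)
f.predict(dt=0.02)
f.update(z=12.7, r=0.25)
f.update(z=12.4, r=0.5)
```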
Core idea: Use the world’s structure (track layout, motion priors, order constraints) so that simple algorithms win in real time.
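The deterministic‑latency point deserves a sketch too. The per‑stage budgets below are illustrative (they happen to sum to 200 ms, in the spirit of the latency figures above, but the real budgets were never published), and the stage functions are hypothetical no‑ops standing in for the real pipeline:

```python
import time

# Illustrative per-stage budgets in milliseconds:
BUDGET_MS = {"capture": 40, "mask": 30, "blob": 40, "associate": 60, "fuse": 30}

def run_stage(name, fn, load):
    """Run one pipeline stage; shed work if it blows its budget."""
    t0 = time.monotonic()
    fn(load)
    elapsed_ms = (time.monotonic() - t0) * 1000.0
    if elapsed_ms > BUDGET_MS[name]:
        # Degrade gracefully instead of missing the frame deadline:
        load["roi_scale"] = max(0.5, load["roi_scale"] * 0.8)    # smaller ROIs
        load["assoc_window"] = max(2, load["assoc_window"] - 1)  # shorter windows
    return elapsed_ms

# Hypothetical no-op stages standing in for the real ones:
stages = [(name, lambda load: None) for name in BUDGET_MS]
load = {"roi_scale": 1.0, "assoc_window": 5}
for name, fn in stages:
    run_stage(name, fn, load)
```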
Lessons that are still relevant today
2003 Approach | 2025 Equivalent |
---|---|
Hand‑optimised kernels and cache awareness | Better quantisation strategies, compiler pragmas, and memory layouts for edge TPUs / NPUs |
Geometry before deep nets | Smaller models, fewer labels; homographies and EKFs reduce training burden |
Bandwidth‑first design | On‑device inference + lightweight uplinks (telemetry, not video) lower cost and improve privacy |
Designed‑for‑failure | Self‑healing nodes, health telemetry, and graceful degradation are as important as your models' mean average precision (mAP) |
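To ground the first row, here is a hedged sketch of post‑training int8 quantisation with the TensorFlow Lite converter; the model path, input shape, and calibration stream are placeholders, not a recipe from the original systems:

```python
import tensorflow as tf

def representative_data():
    # Placeholder calibration stream; in practice, a few hundred real frames.
    for _ in range(100):
        yield [tf.random.uniform([1, 320, 320, 3], dtype=tf.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("detector_savedmodel/")  # hypothetical path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

with open("detector_int8.tflite", "wb") as f:
    f.write(converter.convert())
```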
These early deployments showed that real‑time performance is achievable on modest hardware by leaning on geometry, priors, and simplification of the problem space.
Evergreen lesson: Scarcity clarifies vision, whether it’s squeezing handcrafted kernels onto a Pentium 4 in 2003 or quantising modern detectors onto a 3–5 W edge accelerator in 2025.