Pick the right model.
With reasons.
Answer five questions about your problem. The atlas routes you to one of thirteen discriminative models with a deep, live, interactive explanation. Every junior data scientist asks "what model should I use?" too late. This page inverts that.
The wizard below is opinionated. It is not a substitute for cross-validation, baseline comparisons, or honest evaluation against your actual data. It is a shape-of-the-problem heuristic — the conversation a senior would have with a junior on day one. Read the methodology section at the bottom for what it is and is not.
For readers who already know what they're looking for. Each card links to a destination page (or a stub, in the case of phase 2/3 models still on the build queue).
The wizard is a scoring function over a curated catalog. Question 1 (the task) is a hard filter — answering "regression" will never surface a classifier. Questions 2 through 5 contribute weighted scores. The top score is the recommendation; ranks 2 and 3 are offered as "honest alternatives." The full rule set lives in model-decision-tree.js — ~150 lines, intentionally readable. Tweak it; the wizard recalculates instantly.
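The shape of that rule set can be sketched in a few lines. This is a toy catalog with made-up model entries, score keys, and weights — not the actual contents of model-decision-tree.js:

```javascript
// Toy catalog. Each model carries a task (hard filter) and per-trait
// scores that the answers to questions 2–5 weight and sum.
const catalog = [
  { name: "logistic-regression", task: "classification",
    scores: { linearData: 3, interpretability: 3, speed: 3, smallData: 2 } },
  { name: "random-forest", task: "classification",
    scores: { linearData: 0, interpretability: 1, speed: 1, smallData: 2 } },
  { name: "linear-regression", task: "regression",
    scores: { linearData: 3, interpretability: 3, speed: 3, smallData: 2 } },
];

// answers = { task: "...", weights: { trait: weight, ... } }
function recommend(answers, models = catalog) {
  const ranked = models
    .filter(m => m.task === answers.task)          // Q1: hard filter
    .map(m => ({
      name: m.name,
      score: Object.entries(answers.weights)       // Q2–Q5: weighted sum
        .reduce((sum, [trait, w]) => sum + w * (m.scores[trait] || 0), 0),
    }))
    .sort((a, b) => b.score - a.score);
  return { pick: ranked[0], alternatives: ranked.slice(1, 3) };
}
```

A caller who weights linearity and interpretability gets logistic regression on top, with random forest as the honest alternative; the regression model never appears because the task filter removed it before scoring.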
Real model selection is empirical. You build several candidates, cross-validate them on your data, and pick the one that wins on a metric you trust. The atlas is the conversation before that — a structured way to narrow from thirteen candidates to two or three, so the empirical work has fewer directions to spread across. "Best model" is a fiction. The right model is the one that survives evaluation on your data.
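That empirical step is also small enough to sketch. The helper below is hypothetical — a bare k-fold loop plus a deliberately trivial 1-nearest-neighbour candidate, not atlas code and not a replacement for a real library:

```javascript
// k-fold cross-validation: hold out each fold in turn, train a candidate
// on the rest, and report mean held-out accuracy.
function kFold(X, y, k, trainFn) {
  const n = X.length;
  let correct = 0;
  for (let f = 0; f < k; f++) {
    const test = [], trainX = [], trainY = [];
    for (let i = 0; i < n; i++) {
      if (i % k === f) test.push(i);               // every k-th row is held out
      else { trainX.push(X[i]); trainY.push(y[i]); }
    }
    const predict = trainFn(trainX, trainY);       // candidate returns predict()
    for (const i of test) if (predict(X[i]) === y[i]) correct++;
  }
  return correct / n;
}

// A deliberately simple candidate: 1-nearest-neighbour by squared distance.
const oneNN = (trainX, trainY) => x => {
  let best = 0, bestD = Infinity;
  trainX.forEach((t, i) => {
    const d = t.reduce((s, v, j) => s + (v - x[j]) ** 2, 0);
    if (d < bestD) { bestD = d; best = trainY[i]; }
  });
  return best;
};
```

Run each finalist the wizard surfaced through `kFold` on your own data and keep whichever wins — that result, not the wizard's ranking, is the decision.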
Every destination page demonstrates its model on a 2D synthetic dataset. Two dimensions are everything you can see. They are not everything that matters — the curse of dimensionality is real and not visualizable. A model that handles 2D data perfectly may struggle in 200D. Read the per-model "when it fails" sections; they break the demo on purpose so you can feel where the boundaries are.
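One way to feel that curse numerically rather than visually — this is an illustrative sketch with an invented helper, not part of any destination page: in a uniform random cloud, the ratio of farthest to nearest neighbour is large in 2D and collapses toward 1 as dimension grows, so "near" stops carrying information.

```javascript
// Ratio of farthest to nearest neighbour of one point in a uniform cloud.
// As dim grows the ratio approaches 1: distances concentrate.
function distanceSpread(dim, points = 200, seed = 1) {
  // Tiny deterministic PRNG (mulberry32) so the sketch is reproducible.
  let s = seed;
  const rand = () => {
    s |= 0; s = (s + 0x6D2B79F5) | 0;
    let t = Math.imul(s ^ (s >>> 15), 1 | s);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
  const cloud = Array.from({ length: points }, () =>
    Array.from({ length: dim }, rand));
  const origin = cloud[0];
  let nearest = Infinity, farthest = 0;
  for (const p of cloud.slice(1)) {
    const d = Math.sqrt(p.reduce((acc, v, j) => acc + (v - origin[j]) ** 2, 0));
    nearest = Math.min(nearest, d);
    farthest = Math.max(farthest, d);
  }
  return farthest / nearest;
}
```

Compare `distanceSpread(2)` against `distanceSpread(200)`: the 2D spread is an order of magnitude larger. Any model whose demo leans on local geometry — neighbours, margins, density — inherits this collapse when you leave the plane.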
The inference / accuracy / training / size ratings on every model card are directional, not exact. They reflect typical-case performance on typical-size data. Specific architectures, optimizations, and hardware can move any of these values significantly. Treat them as a map, not a benchmark.