Three ways to
read a taste.
A live, client-side recommendation engine. Click posters you like, watch the shelves re-sort. Toggle between three algorithms and see the same taste interpreted differently — that gap is where every streaming service quietly spends its product budget.
A well-tuned recommender is one of the most expensive pieces of software in a retail-adjacent business. ~75% of what people watch on Netflix originates from its recommendations, and every 0.1% of engagement lift translates into measurable revenue.
The interesting question is never whether to build one. It's which signal to trust: the content itself, the behaviour of similar users, or some blend of both — and how honest you are with the user about which of those the system leaned on for any given suggestion. This page exposes all three, side by side, and lets you feel the difference.
Every poster you click runs real cosine math on precomputed similarity matrices in your browser. No API, no backend, no login, no tracking. Source of everything: the linked notebook.
Click any poster to mark it as something you liked. The "Because you liked" row at the top re-sorts instantly. Toggle the algorithm to watch the same taste interpreted differently.
Seeded with globally top-rated titles. Start clicking — the top row will become yours.
Content-based
Each movie becomes a TF-IDF vector over its genres and the most common user tags. Cosine similarity in that space tells us "this movie looks like that one on paper." Good at new releases, blind to taste — it'll happily recommend a 30-year-old film you've never heard of if the tags match.
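Stripped to its core, that step is a few lines of scikit-learn. A sketch with toy data: the titles, genre strings, and variable names here are illustrative, not taken from the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

# Toy catalogue: genres flattened into a text field per movie.
movies = pd.DataFrame({
    "title": ["Toy Story", "Heat", "Jumanji"],
    "genres": ["Animation Comedy Family",
               "Action Crime Thriller",
               "Adventure Family Fantasy"],
})

vec = TfidfVectorizer()
X = vec.fit_transform(movies["genres"])   # movie x term TF-IDF matrix
sim = cosine_similarity(X)                # movie x movie cosine similarity

# Nearest neighbour of Toy Story, excluding itself:
best = sim[0].argsort()[::-1][1]
print(movies["title"][best])              # Jumanji (shares "Family"; Heat shares nothing)
```

Note the failure mode the paragraph describes: Heat scores exactly 0 against Toy Story because no tags overlap, and nothing about user taste enters the computation at all.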
Collaborative
We build a user-by-movie matrix of mean-centred ratings, then cosine on the columns — so movies are similar when the same users rated them the same way. Learns taste patterns humans don't articulate, but can't say anything about a movie nobody's rated yet (cold-start).
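In toy form, the mean-centring and column cosine look like this. To keep the centring simple, every user here rates every movie; the real matrix is mostly empty, and means are taken over rated items only.

```python
import numpy as np

# rows = users, cols = movies (dense toy example)
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
])

# Subtract each user's mean rating so "liked more than usual" is
# positive and "liked less than usual" is negative.
centred = R - R.mean(axis=1, keepdims=True)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Movies 0 and 1 are rated alike by the same users -> high similarity;
# movie 2 moves the opposite way for every user -> negative similarity.
print(cosine(centred[:, 0], centred[:, 1]))   # positive
print(cosine(centred[:, 0], centred[:, 2]))   # negative
```

The centring matters: without it, a generous user who rates everything 4 to 5 would make unrelated movies look similar just by inflating every dot product.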
Hybrid
A simple weighted blend: α · content + (1−α) · collab, at α = 0.5. Nothing clever — and often, that's the production answer. Content gives cold-start coverage; collab gives taste depth; the blend smooths both.
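The blend really is one line. A sketch with invented scores, assuming both similarity vectors are on a comparable scale (in practice each is usually normalised first):

```python
import numpy as np

alpha = 0.5
content_sim = np.array([0.9, 0.1, 0.4])   # "looks alike on paper"
collab_sim  = np.array([0.2, 0.8, 0.5])   # "rated alike by the same users"

hybrid = alpha * content_sim + (1 - alpha) * collab_sim
print(hybrid)            # [0.55 0.45 0.45]
print(hybrid.argmax())   # 0: a movie neither model alone would rank first
```

Movie 0 wins the blend despite being only second choice for the collaborative model, which is exactly the smoothing behaviour described above.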
The model doesn't see a list of movies — it sees a vector. Every click updates the implicit genre weights below. Watch for the weights you didn't think you'd emit.
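One plausible way to derive such implicit weights, averaging the genre vectors of everything the user has liked, can be sketched as follows. This is a hypothetical reconstruction, not the page's actual code.

```python
import numpy as np

genres = ["Action", "Comedy", "Sci-Fi"]

# One row per liked title, one column per genre (binary membership).
liked = np.array([
    [1.0, 0.0, 1.0],   # an action/sci-fi pick
    [0.0, 0.0, 1.0],   # a pure sci-fi pick
])

# The "taste vector": average genre membership across liked titles.
taste = liked.mean(axis=0)
print(dict(zip(genres, taste.tolist())))   # {'Action': 0.5, 'Comedy': 0.0, 'Sci-Fi': 1.0}
```

Two clicks and the vector already says something the user never typed: all-in on sci-fi, lukewarm on action, indifferent to comedy.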
— click some posters above —
The same likes, fed to three different models, yield three top-10 lists. Overlap is expected; the interesting signal is the gap — where one algorithm sees something the others don't. That gap is a real product question: do you ensemble the three, pick one, or show all three for transparency?
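One cheap way to put a number on that gap is Jaccard overlap between two top-N sets (titles invented for illustration):

```python
# Jaccard overlap: shared titles divided by all distinct titles.
content_top = {"Heat", "Alien", "Se7en", "Blade Runner"}
collab_top  = {"Alien", "Blade Runner", "Arrival", "Her"}

overlap = len(content_top & collab_top) / len(content_top | collab_top)
print(round(overlap, 2))   # 0.33: two shared titles out of six distinct
```

An overlap near 1.0 would mean the toggle is cosmetic; an overlap near 0.0 means the two models have genuinely different theories of your taste.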
— click some posters above —
Ratings and tags: MovieLens ml-latest-small ↗ — 100k ratings, 600 users, 9,700 films. Filtered to titles with ≥50 ratings and ≥3.3 average. Poster art from TMDB ↗, fetched at build time and lazy-loaded at runtime. TMDB does not endorse this project.
notebooks/reco_model.py ↗ — downloads MovieLens, fits TF-IDF and item-item CF, precomputes top-20 neighbours under each algorithm, writes six JSON files to /assets/data/reco/. Re-run any time with python notebooks/reco_model.py.
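The precompute step can be sketched like this: for each movie, keep only its k most similar neighbours so the browser payload stays small. Shapes and k are shrunk for illustration, and the variable names are mine, not the script's.

```python
import json
import numpy as np

# Stand-in for a computed movie x movie similarity matrix.
sim = np.random.default_rng(42).random((5, 5))
np.fill_diagonal(sim, 0.0)   # a movie should never recommend itself

top_k = 3  # the real script keeps 20
neighbours = {
    movie: sim[movie].argsort()[::-1][:top_k].tolist()
    for movie in range(sim.shape[0])
}

# Serialised once at build time; the browser only ever does lookups.
print(json.dumps(neighbours)[:60])
```

This is why the page needs no backend: the expensive cosine work happens once in Python, and the client just indexes into small JSON tables.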
random_state=42 everywhere it matters. Last regenerated —. All similarity scores and recommendations computed in your browser; no server involved.
Cold-start for new users — the page seeds with global top-rated, but until a visitor clicks there is no personalization. Cold-start for new movies is worse: a brand-new film has no ratings, and if its tags are sparse the content model has nothing to grab. No dislike signal, no temporal decay, no contextual features — all standard pieces of a production recommender, all out of scope here.