Three ways to
read a taste.
A live, client-side recommendation engine. Click posters you like, watch the shelves re-sort. Toggle between three algorithms and see the same taste interpreted differently — that gap is where every streaming service quietly spends its product budget.
A well-tuned recommender is one of the most expensive pieces of software in a retail-adjacent business. ~75% of what people watch on Netflix originates from its recommendations, and every 0.1% of engagement lift translates into measurable revenue.
The interesting question is never whether to build one. It's which signal to trust: the content itself, the behaviour of similar users, or some blend of both — and how honest you are with the user about which of those the system leaned on for any given suggestion. This page exposes all three, side by side, and lets you feel the difference.
Every poster you click runs real cosine math on precomputed similarity matrices in your browser. No API, no backend, no login, no tracking. Source of everything: the linked notebook.
Click any poster to mark it as something you liked. The "Because you liked" row at the top re-sorts instantly. Toggle the algorithm to watch the same taste interpreted differently.
Seeded with globally top-rated titles. Start clicking — the top row will become yours.
Content-based
Each movie becomes a TF-IDF vector over its genres and the most common user tags. Cosine similarity in that space tells us "this movie looks like that one on paper." Good at new releases, blind to taste — it'll happily recommend a 30-year-old film you've never heard of if the tags match.
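Stripped to its core, that step is a few lines of scikit-learn. A sketch with toy data: the titles, genre strings, and variable names here are illustrative, not taken from the notebook.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd

# Toy catalogue: genres flattened into a text field per movie.
movies = pd.DataFrame({
    "title": ["Toy Story", "Heat", "Jumanji"],
    "genres": ["Animation Comedy Family",
               "Action Crime Thriller",
               "Adventure Family Fantasy"],
})

vec = TfidfVectorizer()
X = vec.fit_transform(movies["genres"])   # movie x term TF-IDF matrix
sim = cosine_similarity(X)                # movie x movie cosine similarity

# Nearest neighbour of Toy Story, excluding itself:
best = sim[0].argsort()[::-1][1]
print(movies["title"][best])              # Jumanji (shares "Family"; Heat shares nothing)
```

Note the failure mode the paragraph describes: Heat scores exactly 0 against Toy Story because no tags overlap, and nothing about user taste enters the computation at all.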
Collaborative
We build a user-by-movie matrix of mean-centred ratings, then cosine on the columns — so movies are similar when the same users rated them the same way. Learns taste patterns humans don't articulate, but can't say anything about a movie nobody's rated yet (cold-start).
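In toy form, the mean-centring and column cosine look like this. To keep the centring simple, every user here rates every movie; the real matrix is mostly empty, and means are taken over rated items only.

```python
import numpy as np

# rows = users, cols = movies (dense toy example)
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
])

# Subtract each user's mean rating so "liked more than usual" is
# positive and "liked less than usual" is negative.
centred = R - R.mean(axis=1, keepdims=True)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Movies 0 and 1 are rated alike by the same users -> high similarity;
# movie 2 moves the opposite way for every user -> negative similarity.
print(cosine(centred[:, 0], centred[:, 1]))   # positive
print(cosine(centred[:, 0], centred[:, 2]))   # negative
```

The centring matters: without it, a generous user who rates everything 4 to 5 would make unrelated movies look similar just by inflating every dot product.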
Hybrid
A simple weighted blend: α · content + (1−α) · collab, at α = 0.5. Nothing clever — and often, that's the production answer. Content gives cold-start coverage; collab gives taste depth; the blend smooths both.
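The blend really is one line. A sketch with invented scores, assuming both similarity vectors are on a comparable scale (in practice each is usually normalised first):

```python
import numpy as np

alpha = 0.5
content_sim = np.array([0.9, 0.1, 0.4])   # "looks alike on paper"
collab_sim  = np.array([0.2, 0.8, 0.5])   # "rated alike by the same users"

hybrid = alpha * content_sim + (1 - alpha) * collab_sim
print(hybrid)            # [0.55 0.45 0.45]
print(hybrid.argmax())   # 0: a movie neither model alone would rank first
```

Movie 0 wins the blend despite being only second choice for the collaborative model, which is exactly the smoothing behaviour described above.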
The model doesn't see a list of movies — it sees a vector. Every click updates the implicit genre weights below. Watch for the weights you didn't think you'd emit.
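One plausible way to derive such implicit weights, averaging the genre vectors of everything the user has liked, can be sketched as follows. This is a hypothetical reconstruction, not the page's actual code.

```python
import numpy as np

genres = ["Action", "Comedy", "Sci-Fi"]

# One row per liked title, one column per genre (binary membership).
liked = np.array([
    [1.0, 0.0, 1.0],   # an action/sci-fi pick
    [0.0, 0.0, 1.0],   # a pure sci-fi pick
])

# The "taste vector": average genre membership across liked titles.
taste = liked.mean(axis=0)
print(dict(zip(genres, taste.tolist())))   # {'Action': 0.5, 'Comedy': 0.0, 'Sci-Fi': 1.0}
```

Two clicks and the vector already says something the user never typed: all-in on sci-fi, lukewarm on action, indifferent to comedy.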
— click some posters above —
The same likes, fed to three different models, yield three top-10 lists. Overlap is expected; the interesting signal is the gap — where one algorithm sees something the others don't. That gap is a real product question: do you ensemble the three, pick one, or show all three for transparency?
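One cheap way to put a number on that gap is Jaccard overlap between two top-N sets (titles invented for illustration):

```python
# Jaccard overlap: shared titles divided by all distinct titles.
content_top = {"Heat", "Alien", "Se7en", "Blade Runner"}
collab_top  = {"Alien", "Blade Runner", "Arrival", "Her"}

overlap = len(content_top & collab_top) / len(content_top | collab_top)
print(round(overlap, 2))   # 0.33: two shared titles out of six distinct
```

An overlap near 1.0 would mean the toggle is cosmetic; an overlap near 0.0 means the two models have genuinely different theories of your taste.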
— click some posters above —
Ratings and tags: MovieLens ml-latest-small ↗ — 100k ratings, 600 users, 9,700 films. Filtered to titles with ≥50 ratings and ≥3.3 average. Poster art from TMDB ↗, fetched at build time and lazy-loaded at runtime. TMDB does not endorse this project.
notebooks/reco_model.py ↗ — downloads MovieLens, fits TF-IDF and item-item CF, precomputes top-20 neighbours under each algorithm, writes six JSON files to /assets/data/reco/. Re-run any time with python notebooks/reco_model.py.
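The precompute step can be sketched like this: for each movie, keep only its k most similar neighbours so the browser payload stays small. Shapes and k are shrunk for illustration, and the variable names are mine, not the script's.

```python
import json
import numpy as np

# Stand-in for a computed movie x movie similarity matrix.
sim = np.random.default_rng(42).random((5, 5))
np.fill_diagonal(sim, 0.0)   # a movie should never recommend itself

top_k = 3  # the real script keeps 20
neighbours = {
    movie: sim[movie].argsort()[::-1][:top_k].tolist()
    for movie in range(sim.shape[0])
}

# Serialised once at build time; the browser only ever does lookups.
print(json.dumps(neighbours)[:60])
```

This is why the page needs no backend: the expensive cosine work happens once in Python, and the client just indexes into small JSON tables.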
random_state=42 everywhere it matters. Last regenerated —. All similarity scores and recommendations computed in your browser; no server involved.
Cold-start for new users — the page seeds with global top-rated, but until a visitor clicks there is no personalization. Cold-start for new movies is worse: a brand-new film has no ratings, and if its tags are sparse the content model has nothing to grab. No dislike signal, no temporal decay, no contextual features — all standard pieces of a production recommender, all out of scope here.