Master of SWE & AI · Machine Learning · 2026

Review Pulse — multi-domain sentiment classifier

An ML pipeline that classifies product-review sentiment and holds up across four retail domains.

AI/ML · Data

Problem

Product reviews carry signal that’s expensive to read at scale, and a model trained on one category often degrades on another. The task: build a sentiment classifier that generalises across several distinct retail domains rather than performing well on only one.

Approach

Worked from a labelled dataset of ~8,000 Amazon reviews across four domains (Books, DVDs, Electronics, Kitchen & Housewares).
Cleaned and vectorised the text, then trained and evaluated classical ML models with scikit-learn, following a CRISP-DM workflow.
Measured per-domain and cross-domain performance to test generalisation — not just headline accuracy.
Iterated in Jupyter notebooks, version-controlled in a dedicated open-source repo.
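
The cross-domain check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the TF-IDF plus logistic regression pipeline is one plausible "classical" choice, and the `TOY_REVIEWS` data and the helper names `build_model` and `cross_domain_accuracy` are hypothetical stand-ins for the real ~8,000-review dataset and notebooks.

```python
from itertools import permutations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: (review text, label) pairs, 1 = positive, 0 = negative.
# The real project would load its labelled reviews with pandas instead.
TOY_REVIEWS = {
    "books": [
        ("a gripping and wonderful story", 1),
        ("beautifully written, loved it", 1),
        ("dull plot and boring characters", 0),
        ("a terrible waste of time", 0),
    ],
    "dvds": [
        ("wonderful film, loved the acting", 1),
        ("great picture and a gripping plot", 1),
        ("boring movie, terrible script", 0),
        ("dull and a waste of money", 0),
    ],
    "electronics": [
        ("great sound, works perfectly", 1),
        ("loved this charger, excellent value", 1),
        ("terrible battery, stopped working", 0),
        ("broke after a week, waste of money", 0),
    ],
    "kitchen": [
        ("excellent blender, works perfectly", 1),
        ("great pan, loved cooking with it", 1),
        ("handle broke, terrible quality", 0),
        ("stopped working, dull knives", 0),
    ],
}


def build_model():
    """Assumed model: TF-IDF features (unigrams + bigrams) into logistic regression."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )


def cross_domain_accuracy(data):
    """Train on one domain and test on another, for every ordered pair of domains."""
    scores = {}
    for source, target in permutations(data, 2):
        train_texts, train_labels = map(list, zip(*data[source]))
        test_texts, test_labels = map(list, zip(*data[target]))
        model = build_model().fit(train_texts, train_labels)
        predictions = model.predict(test_texts)
        scores[(source, target)] = accuracy_score(test_labels, predictions)
    return scores


scores = cross_domain_accuracy(TOY_REVIEWS)
for (source, target), acc in sorted(scores.items()):
    print(f"train={source:<12} test={target:<12} accuracy={acc:.2f}")
```

Laying the scores out as a train-domain by test-domain grid makes generalisation gaps visible that a single pooled accuracy would hide. In-domain (per-domain) scores would come from a held-out split or cross-validation within each domain rather than from this loop.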

Stack

Python · scikit-learn · pandas · Jupyter · CRISP-DM

Outcome

A working multi-domain sentiment classifier with documented per-domain and cross-domain evaluation.
A reproducible ML pipeline (notebooks + repo) — concrete, hands-on ML, not just coursework slides.

review-pulse repo · Master’s repo