Problem
Keeping up with the literature for an MEng thesis on mental health classification and explainability meant running the same Semantic Scholar searches most mornings and skimming dozens of abstracts, most of them irrelevant.
The search needed automating end to end: fetch new papers, judge their relevance, summarise them, and land the result in my inbox before the day starts.
Approach
A Python pipeline runs as a daily cron job. A fetcher polls the Semantic Scholar API for fresh papers matching each configured query, a scorer asks Gemini Flash to rate their relevance against the thesis topics, and a summariser writes the digest, which is then emailed out.
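The three stages can be sketched as one function that takes each stage as a callable. This is a minimal illustration, not the project's actual code: the names Paper, build_digest and min_score are hypothetical, and the de-duplication across queries and the relevance threshold are assumptions. The fake_* functions stand in for the Semantic Scholar and Gemini Flash calls so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Paper:
    paper_id: str
    title: str
    abstract: str


def build_digest(
    queries: Iterable[str],
    fetch: Callable[[str], list[Paper]],
    score: Callable[[Paper], float],
    summarise: Callable[[Paper], str],
    min_score: float = 0.5,  # assumed cut-off, not taken from the project
) -> list[tuple[float, Paper, str]]:
    """Fetch papers per query, score each once, summarise the relevant ones."""
    seen: set[str] = set()
    digest: list[tuple[float, Paper, str]] = []
    for query in queries:
        for paper in fetch(query):
            if paper.paper_id in seen:
                continue  # the same paper can match several queries
            seen.add(paper.paper_id)
            relevance = score(paper)
            if relevance >= min_score:
                digest.append((relevance, paper, summarise(paper)))
    # Most relevant papers first.
    digest.sort(key=lambda item: item[0], reverse=True)
    return digest


# Offline stand-ins for the real fetcher, scorer and summariser.
def fake_fetch(query: str) -> list[Paper]:
    return [
        Paper("p1", "Explainable depression classification", "Placeholder abstract."),
        Paper("p2", "Unrelated crop yield study", "Placeholder abstract."),
    ]


def fake_score(paper: Paper) -> float:
    return 0.9 if "Explainable" in paper.title else 0.1


def fake_summarise(paper: Paper) -> str:
    return f"Summary of: {paper.title}"


digest = build_digest(
    ["mental health explainability", "depression classification"],
    fake_fetch,
    fake_score,
    fake_summarise,
)
for relevance, paper, summary in digest:
    print(f"{relevance:.2f}  {paper.title}  ({summary})")
```

Passing the stages in as callables keeps the orchestration testable without network access, which is one plausible way to reach high coverage on a pipeline like this.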
Pipeline behaviour lives in an app_config table in Neon Postgres, edited from a dashboard page. Swapping the Gemini model or changing a query is a config edit, not a deploy.
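A rough sketch of reading such a table at the start of each run follows. The column names (key, value), the JSON-encoded values, the setting names and the DEFAULTS dictionary are all assumptions for illustration, and the model names are placeholders. An in-memory SQLite database stands in for Neon Postgres so the example runs anywhere; the same query would work through any DB-API connection.

```python
import json
import sqlite3

# Hypothetical fallbacks used when a key is missing from the table.
DEFAULTS = {"model": "placeholder-model", "queries": []}


def load_config(conn) -> dict:
    """Read every app_config row into a dict, layered over DEFAULTS."""
    cur = conn.cursor()
    cur.execute("SELECT key, value FROM app_config")
    config = dict(DEFAULTS)
    for key, value in cur.fetchall():
        config[key] = json.loads(value)  # values assumed to be JSON text
    cur.close()
    return config


# SQLite stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_config (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
conn.execute(
    "INSERT INTO app_config (key, value) VALUES (?, ?)",
    ("queries", json.dumps(["mental health classification"])),
)
before = load_config(conn)

# A dashboard edit amounts to an UPDATE or INSERT; the next run picks it up.
conn.execute(
    "INSERT INTO app_config (key, value) VALUES (?, ?)",
    ("model", json.dumps("another-placeholder-model")),
)
after = load_config(conn)
print(before["model"], "->", after["model"])
```

Because the pipeline re-reads the table on every run, a changed row takes effect on the next morning's digest without a redeploy.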
CI enforces 95% total test coverage with a 90% floor for each module, and runs integration tests against a separate Neon branch.
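How the two thresholds are enforced is not stated here. One possible approach, sketched below, is to check the JSON report that coverage.py writes with `coverage json`, since its built-in fail_under setting only covers the total. The function name coverage_failures and the sample report are illustrative, not taken from the project.

```python
def coverage_failures(
    report: dict,
    total_min: float = 95.0,
    module_min: float = 90.0,
) -> list[str]:
    """Return one message per violated threshold; an empty list means pass."""
    failures: list[str] = []
    total = report["totals"]["percent_covered"]
    if total < total_min:
        failures.append(f"total coverage {total:.1f}% is below {total_min:.1f}%")
    for path, data in sorted(report["files"].items()):
        pct = data["summary"]["percent_covered"]
        if pct < module_min:
            failures.append(f"{path}: {pct:.1f}% is below {module_min:.1f}%")
    return failures


# Made-up sample with the same shape as a coverage.py JSON report.
sample_report = {
    "totals": {"percent_covered": 96.2},
    "files": {
        "fetcher.py": {"summary": {"percent_covered": 98.0}},
        "scorer.py": {"summary": {"percent_covered": 88.5}},
    },
}

failures = coverage_failures(sample_report)
for message in failures:
    print(message)
```

In a CI step, the report would be loaded from coverage.json with json.load and the job would exit non-zero whenever the returned list is not empty.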
Outcome
A scored, summarised digest arrives every morning. The reading loop takes minutes instead of an hour, and query or model changes never touch the code.