Recommender system for the OpenSource Together (OST) platform.
An AI‑powered data pipeline that discovers, understands, and curates open‑source projects to power OST’s recommendation system and surface high-quality projects to contribute to.
What it does:
- Discover: scan GitHub at scale with Golang scrapers
- Understand: detect language and semantics (fastText + transformers)
- Assess: score quality and relevance from activity and metadata signals
- Enrich: normalize topics, tech stacks, and fields into a coherent schema
- Deliver: output a clean, queryable dataset (PostgreSQL via Prisma)
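
To make the Assess and Enrich steps more concrete, here is a minimal Python sketch of what such stages might look like. It is not the project's actual code: the `Repo` fields, the `TOPIC_ALIASES` table, the weights, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List

# Hypothetical alias table; the pipeline's real normalization rules are not shown here.
TOPIC_ALIASES = {"js": "javascript", "golang": "go", "py": "python", "ml": "machine-learning"}


@dataclass
class Repo:
    # Hypothetical subset of the metadata a scraper might collect.
    name: str
    stars: int
    days_since_last_commit: int
    has_license: bool
    topics: List[str] = field(default_factory=list)


def normalize_topics(topics):
    """Lowercase, map aliases to a canonical name, and deduplicate while keeping order."""
    seen = []
    for topic in topics:
        key = topic.strip().lower().replace(" ", "-")
        canonical = TOPIC_ALIASES.get(key, key)
        if canonical and canonical not in seen:
            seen.append(canonical)
    return seen


def quality_score(repo):
    """Combine popularity, recency, and metadata into a score in [0, 1].

    The weights (0.5 / 0.4 / 0.1) are illustrative assumptions, not tuned values.
    """
    popularity = min(math.log10(repo.stars + 1) / 5.0, 1.0)   # saturates near 100k stars
    recency = math.exp(-repo.days_since_last_commit / 180.0)  # decays with inactivity
    license_bonus = 1.0 if repo.has_license else 0.0
    return 0.5 * popularity + 0.4 * recency + 0.1 * license_bonus


if __name__ == "__main__":
    repo = Repo("example/repo", stars=1000, days_since_last_commit=14,
                has_license=True, topics=["JS", "javascript", " Golang "])
    print(normalize_topics(repo.topics), round(quality_score(repo), 3))
```

A log scale for stars keeps very popular repositories from dominating the score, and the exponential decay favors recently active projects; both choices are assumptions made for this sketch.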
- Copy .env.example into .env and fill it in.
- Copy config/cfg_example.py to config/cfg.py and adjust the config to your personal parameters.
- Start the engine:

    # Start the engine
    docker compose up

Dagster UI: http://localhost:3000
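
It can help to confirm that every key listed in .env.example has a value in .env. The following is a small, optional Python sketch that does this; it is not part of the repository, the function names are hypothetical, and it only handles simple KEY=VALUE lines (no `export` prefixes or multi-line values).

```python
from pathlib import Path


def parse_env_keys(text):
    """Return {key: value} for simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_env_keys(example_text, env_text):
    """List keys declared in the example file that are absent or empty in the real one."""
    example = parse_env_keys(example_text)
    actual = parse_env_keys(env_text)
    return [key for key in example if not actual.get(key)]


if __name__ == "__main__":
    example_path, env_path = Path(".env.example"), Path(".env")
    if example_path.exists() and env_path.exists():
        missing = missing_env_keys(example_path.read_text(), env_path.read_text())
        print("Missing or empty keys:", missing or "none")
    else:
        print("Expected .env.example and .env in the current directory.")
```

Run it from the repository root; an empty result suggests the .env file covers everything the example file declares.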
Work in progress.
Building in public here: @spideyX
