shaneyale2005/cs336


This course provides a comprehensive, hands-on introduction to language modeling, guiding students through building language models from scratch. Topics include data collection, transformer architectures, model training, evaluation, and deployment. The course is implementation-heavy and requires strong Python and deep learning skills.


Logistics

  • Lectures: Tuesday/Thursday 3:00–4:20pm, NVIDIA Auditorium
  • Office Hours:
    • Tatsu Hashimoto (Gates 364): Fridays 3–4pm
    • Percy Liang (Gates 350): Fridays 11am–12pm
    • Marcel Rød (Gates 415): Mon/Wed 11am–12pm
    • Neil Band (Gates 358): Mon 4–5pm, Tues 5–6pm
    • Rohith Kuditipudi (Gates 358): Mon/Wed 10–11am
  • Contact: Use public Slack channels for questions and announcements. For personal matters, email cs336-spr2425-staff@lists.stanford.edu.

Coursework

  1. Basics: Implement and train a standard Transformer language model.
  2. Systems: Profile, optimize, and distribute model training.
  3. Scaling: Analyze and fit scaling laws for model growth.
  4. Data: Process and filter large-scale pretraining data.
  5. Alignment and Reasoning RL: Apply supervised finetuning and RL for reasoning tasks.

About

Curated learning materials for Stanford CS336, covering the course's key concepts and practice content.
