@@ -125,7 +125,7 @@ The code in the main chapters of this book is designed to run on conventional la
 
 [*Build A Reasoning Model (From Scratch)*](https://mng.bz/lZ5B), while a standalone book, can be considered a sequel to *Build A Large Language Model (From Scratch)*.
 
-It starts with a pretrained model and implements different reasoning approaches, including inference-time scaling, reinforcement learning, and distillation, to improve the model's reasoning capabilities.
+It starts with a pretrained model and implements different reasoning approaches, including inference-time scaling, reinforcement learning, and distillation, to improve the model's reasoning capabilities.
 
 Similar to *Build A Large Language Model (From Scratch)*, [*Build A Reasoning Model (From Scratch)*](https://mng.bz/lZ5B) takes a hands-on approach, implementing these methods from scratch.
 
@@ -146,35 +146,36 @@ In addition to the code exercises, you can download a free 170-page PDF titled
 
 <a href="https://www.manning.com/books/test-yourself-on-build-a-large-language-model-from-scratch"><img src="https://sebastianraschka.com/images/LLMs-from-scratch-images/test-yourself-cover.jpg?123" width="150px"></a>
 
-
-
 &nbsp;
 ## Bonus Material
 
 Several folders contain optional materials as a bonus for interested readers:
-
 - **Setup**
   - [Python Setup Tips](setup/01_optional-python-setup-preferences)
-  - [Installing Python Packages and Libraries Used In This Book](setup/02_installing-python-libraries)
+  - [Installing Python Packages and Libraries Used in This Book](setup/02_installing-python-libraries)
   - [Docker Environment Setup Guide](setup/03_optional-docker-environment)
-- **Chapter 2: Working with text data**
+
+- **Chapter 2: Working With Text Data**
   - [Byte Pair Encoding (BPE) Tokenizer From Scratch](ch02/05_bpe-from-scratch/bpe-from-scratch-simple.ipynb)
   - [Comparing Various Byte Pair Encoding (BPE) Implementations](ch02/02_bonus_bytepair-encoder)
   - [Understanding the Difference Between Embedding Layers and Linear Layers](ch02/03_bonus_embedding-vs-matmul)
-  - [Dataloader Intuition with Simple Numbers](ch02/04_bonus_dataloader-intuition)
-- **Chapter 3: Coding attention mechanisms**
+  - [Dataloader Intuition With Simple Numbers](ch02/04_bonus_dataloader-intuition)
+
+- **Chapter 3: Coding Attention Mechanisms**
   - [Comparing Efficient Multi-Head Attention Implementations](ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb)
   - [Understanding PyTorch Buffers](ch03/03_understanding-buffers/understanding-buffers.ipynb)
-- **Chapter 4: Implementing a GPT model from scratch**
-  - [FLOPS Analysis](ch04/02_performance-analysis/flops-analysis.ipynb)
+
+- **Chapter 4: Implementing a GPT Model From Scratch**
+  - [FLOPs Analysis](ch04/02_performance-analysis/flops-analysis.ipynb)
   - [KV Cache](ch04/03_kv-cache)
-  - [Attention alternatives](ch04/#attention-alternatives)
+  - [Attention Alternatives](ch04/#attention-alternatives)
   - [Grouped-Query Attention](ch04/04_gqa)
   - [Multi-Head Latent Attention](ch04/05_mla)
   - [Sliding Window Attention](ch04/06_swa)
   - [Gated DeltaNet](ch04/08_deltanet)
   - [Mixture-of-Experts (MoE)](ch04/07_moe)
-- **Chapter 5: Pretraining on unlabeled data:**
+
+- **Chapter 5: Pretraining on Unlabeled Data**
   - [Alternative Weight Loading Methods](ch05/02_alternative_weight_loading/)
   - [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)
   - [Adding Bells and Whistles to the Training Loop](ch05/04_learning_rate_schedulers)
@@ -184,32 +185,35 @@ Several folders contain optional materials as a bonus for interested readers:
   - [Llama 3.2 From Scratch](ch05/07_gpt_to_llama/standalone-llama32.ipynb)
   - [Qwen3 Dense and Mixture-of-Experts (MoE) From Scratch](ch05/11_qwen3/)
   - [Gemma 3 From Scratch](ch05/12_gemma3/)
-  - [Memory-efficient Model Weight Loading](ch05/08_memory_efficient_weight_loading/memory-efficient-state-dict.ipynb)
-  - [Extending the Tiktoken BPE Tokenizer with New Tokens](ch05/09_extending-tokenizers/extend-tiktoken.ipynb)
+  - [Memory-Efficient Model Weight Loading](ch05/08_memory_efficient_weight_loading/memory-efficient-state-dict.ipynb)
+  - [Extending the Tiktoken BPE Tokenizer With New Tokens](ch05/09_extending-tokenizers/extend-tiktoken.ipynb)
   - [PyTorch Performance Tips for Faster LLM Training](ch05/10_llm-training-speed)
-- **Chapter 6: Finetuning for classification**
-  - [Additional experiments finetuning different layers and using larger models](ch06/02_bonus_additional-experiments)
-  - [Finetuning different models on 50k IMDb movie review dataset](ch06/03_bonus_imdb-classification)
-  - [Building a User Interface to Interact With the GPT-based Spam Classifier](ch06/04_user_interface)
-- **Chapter 7: Finetuning to follow instructions**
+
+- **Chapter 6: Finetuning for Classification**
+  - [Additional Experiments Finetuning Different Layers and Using Larger Models](ch06/02_bonus_additional-experiments)
+  - [Finetuning Different Models on the 50k IMDb Movie Review Dataset](ch06/03_bonus_imdb-classification)
+  - [Building a User Interface to Interact With the GPT-Based Spam Classifier](ch06/04_user_interface)
+
+- **Chapter 7: Finetuning to Follow Instructions**
   - [Dataset Utilities for Finding Near Duplicates and Creating Passive Voice Entries](ch07/02_dataset-utilities)
   - [Evaluating Instruction Responses Using the OpenAI API and Ollama](ch07/03_model-evaluation)
   - [Generating a Dataset for Instruction Finetuning](ch07/05_dataset-generation/llama3-ollama.ipynb)
   - [Improving a Dataset for Instruction Finetuning](ch07/05_dataset-generation/reflection-gpt4.ipynb)
-  - [Generating a Preference Dataset with Llama 3.1 70B and Ollama](ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb)
+  - [Generating a Preference Dataset With Llama 3.1 70B and Ollama](ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb)
   - [Direct Preference Optimization (DPO) for LLM Alignment](ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb)
-  - [Building a User Interface to Interact With the Instruction Finetuned GPT Model](ch07/06_user_interface)
+  - [Building a User Interface to Interact With the Instruction-Finetuned GPT Model](ch07/06_user_interface)
 
-More bonus material from the [reasoning from scratch](https://github.com/rasbt/reasoning-from-scratch) repository:
+More bonus material from the [Reasoning From Scratch](https://github.com/rasbt/reasoning-from-scratch) repository:
 
-- **Qwen3 (from scratch) basics**
-  - [Qwen3 source code walkthrough](https://github.com/rasbt/reasoning-from-scratch/blob/main/chC/01_main-chapter-code/chC_main.ipynb)
+- **Qwen3 (From Scratch) Basics**
+  - [Qwen3 Source Code Walkthrough](https://github.com/rasbt/reasoning-from-scratch/blob/main/chC/01_main-chapter-code/chC_main.ipynb)
   - [Optimized Qwen3](https://github.com/rasbt/reasoning-from-scratch/tree/main/ch02/03_optimized-LLM)
+
 - **Evaluation**
-  - [Verifier-based evaluation (MATH-500)](https://github.com/rasbt/reasoning-from-scratch/tree/main/ch03)
-  - [Multiple-choice evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
-  - [LLM leaderboard evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
-  - [LLM-as-a-judge evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
+  - [Verifier-Based Evaluation (MATH-500)](https://github.com/rasbt/reasoning-from-scratch/tree/main/ch03)
+  - [Multiple-Choice Evaluation (MMLU)](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/02_mmlu)
+  - [LLM Leaderboard Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/03_leaderboards)
+  - [LLM-as-a-Judge Evaluation](https://github.com/rasbt/reasoning-from-scratch/blob/main/chF/04_llm-judge)
 
 <br>
 &nbsp;