Commit 2ec1bcb
Add configurable temperature parameter for RL rollout sampling (#86)
Co-authored-by: John Schulman <[email protected]>
1 parent: 20e26a6
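
The changed lines of this commit are not preserved on this page, so as a rough illustration of what a sampling temperature does during rollout generation, here is a minimal sketch in plain Python. It is not the cookbook's code: `SamplingConfig`, `apply_temperature`, and `sample_token` are hypothetical names, and the default of 1.0 is an assumption. The only technique shown is the standard one of dividing logits by the temperature before the softmax.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SamplingConfig:
    """Hypothetical config object; field name and default are assumptions."""
    temperature: float = 1.0


def apply_temperature(logits: list[float], temperature: float) -> list[float]:
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: list[float], config: SamplingConfig, rng: random.Random) -> int:
    """Draw one token index from the temperature-scaled distribution."""
    probs = apply_temperature(logits, config.temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

With the same logits, a temperature below 1 concentrates probability on the highest-scoring token and a temperature above 1 flattens the distribution, which is why exposing it as a parameter lets a training run trade exploration against sample quality.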
File tree
4 files changed: +382 -2 lines changed
- tinker_cookbook
  - recipes/math_rl
  - rl
  - tests

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 54 | 54 | |
| 55 | 55 | |
| 56 | 56 | |
| | 57 | + |
| 57 | 58 | |
| 58 | 59 | |
| 59 | 60 | |
| | | |
| 63 | 64 | |
| 64 | 65 | |
| 65 | 66 | |
| 66 | | - |
| | 67 | + |
| | 68 | + |
| | 69 | + |
| | 70 | + |
| | 71 | + |
| 67 | 72 | |
| 68 | 73 | |
| 69 | 74 | |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 34 | 34 | |
| 35 | 35 | |
| 36 | 36 | |
| | 37 | + |
| 37 | 38 | |
| 38 | 39 | |
| 39 | 40 | |
| | | |
| 124 | 125 | |
| 125 | 126 | |
| 126 | 127 | |
| | 128 | + |
| 127 | 129 | |
| 128 | 130 | |
| 129 | 131 | |

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 229 | 229 | |
| 230 | 230 | |
| 231 | 231 | |
| | 232 | + |
| 232 | 233 | |
| 233 | 234 | |
| 234 | 235 | |
| | | |
| 366 | 367 | |
| 367 | 368 | |
| 368 | 369 | |
| | 370 | + |
| 369 | 371 | |
| 370 | 372 | |
| 371 | 373 | |
| | | |
| 501 | 503 | |
| 502 | 504 | |
| 503 | 505 | |
| | 506 | + |
| 504 | 507 | |
| 505 | 508 | |
| 506 | 509 | |
| | | |
| 659 | 662 | |
| 660 | 663 | |
| 661 | 664 | |
| | 665 | + |
| 662 | 666 | |
| 663 | 667 | |
| 664 | 668 | |
| 665 | | - |
| | 669 | + |
| 666 | 670 | |
| 667 | 671 | |
| 668 | 672 | |
| | | |
| 991 | 995 | |
| 992 | 996 | |
| 993 | 997 | |
| | 998 | + |
| 994 | 999 | |
| 995 | 1000 | |
| 996 | 1001 | |