Releases · HomebrewML/TrueGrad
4.0.3
08 Aug 08:47
fix(Sign): self-graft correctly; previously, we computed update.sign() * update.norm(), omitting the division of the sign vector by its own norm. It is now F.normalize(update.sign()) * update.norm(), as sketched below. This changes the learning rates required for self-grafted tg.optim.Sign.
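A minimal sketch of the corrected self-grafting step (the standalone function and explicit flattening are illustrative, not tg.optim.Sign's internals):

```python
import torch
import torch.nn.functional as F

def self_grafted_sign(update: torch.Tensor) -> torch.Tensor:
    """Keep the sign direction of `update`, grafted onto its original L2 norm."""
    flat = update.flatten()
    # Buggy pre-4.0.3 form: flat.sign() * flat.norm() -- the sign vector's own
    # norm (sqrt of the number of nonzero entries) was never divided out, so
    # the result's norm was inflated by that factor.
    # Fixed form: unit-normalize the sign vector, then rescale to the original norm.
    return (F.normalize(flat.sign(), dim=0) * flat.norm()).view_as(update)
```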
4.0.2
23 Apr 06:18
Use WeightDecayChain in OptimizerOptimizer
4.0.1
23 Apr 06:16
Add missing params_flat in Graft
4.0.0
23 Apr 06:13
Add configurable weight decay via WeightDecayChain, including L1/L2 decay and decay to init/EMA
Remove the decay_to_init flag; use weight_decay_cls=tg.WeightDecayChain(tg.optim.WeightDecayToInit()) instead.
Remove the default_to_adam flag; use default_to_baseline instead (migration sketch below).
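A hedged migration sketch for these two removals; that tg.optim.TGAdamW accepts these keywords, and the tg alias itself, are assumptions based on the notes above:

```python
import torch
import truegrad as tg  # alias as used throughout these notes

model = torch.nn.Linear(4, 4)  # stand-in model

# Pre-4.0.0 (flags removed):
#   opt = tg.optim.TGAdamW(model.parameters(), decay_to_init=True, default_to_adam=True)

# 4.0.0+: weight decay is composed explicitly via WeightDecayChain,
# and the baseline fallback flag is now called default_to_baseline.
opt = tg.optim.TGAdamW(
    model.parameters(),
    weight_decay_cls=tg.WeightDecayChain(tg.optim.WeightDecayToInit()),
    default_to_baseline=True,
)
```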
2.3.5
14 Jan 19:40
2.2.0
14 Jan 13:16
Improve TG-Optimizer extensibility by adding a TrueGrad base optimizer class
Add (TG-)LaProp
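For reference, a self-contained sketch of the baseline LaProp update rule (Ziyin et al., 2020); this is the textbook rule, not tg's implementation, and the TG- variant presumably swaps in TrueGrad's gradient statistics:

```python
import torch

def laprop_step(param, grad, m, v, step, lr=1e-3,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """One LaProp step: unlike Adam, the gradient is divided by the adaptive
    denominator *before* entering the momentum buffer, decoupling momentum
    from the second-moment normalization."""
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
    denom = (v / (1 - beta2 ** step)).sqrt().add_(eps)  # bias-corrected RMS
    m.mul_(beta1).add_(grad / denom, alpha=1 - beta1)
    param.add_(m, alpha=-lr / (1 - beta1 ** step))      # bias-corrected momentum
```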
2.1.0
29 Nov 07:34
feat(nn.functional): allow parameters in more truegrad.nn.functional ops
fix(functional): allow odd shapes in truegrad.functional.einsum's backward
feat(utils): allow the combination of truegrad.nn with truegrad.utils.patch_model
fix(TGAdamW): improve stability
Together, these features enable performant use of off-the-shelf HuggingFace Transformers via truegrad.utils.patch_torch, as sketched below.
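A minimal sketch of that workflow (the specific checkpoint is illustrative):

```python
import transformers
from truegrad.utils import patch_torch
from truegrad.optim import TGAdamW

patch_torch()  # patch torch / torch.nn.functional before the model is built

model = transformers.BertModel.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
optim = TGAdamW(model.parameters())
# ...then train as usual: forward pass, loss.backward(), optim.step()
```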
2.0.0
27 Nov 15:14
Feature: Patch torch and torch.nn.functional in truegrad.utils.patch_torch
Feature: Add chunk, split and transpose to truegrad.functional
Fix: publicly expose truegrad.nn.functional
Fix: use the patched chunk, split, and transpose functions in truegrad.nn.functional.multi_head_attention_forward (closes #1)
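A short sketch of the new truegrad.functional ops, assuming they mirror their torch counterparts' signatures while additionally tracking TrueGrad's gradient statistics:

```python
import torch
from truegrad import functional as tgf

x = torch.randn(4, 8, requires_grad=True)
halves = tgf.chunk(x, 2, dim=1)  # like torch.chunk
parts = tgf.split(x, 2, dim=0)   # like torch.split
y = tgf.transpose(x, 0, 1)       # like torch.transpose
```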
1.0.0
27 Nov 14:26
Add truegrad.nn.functional
Extend truegrad.nn
Add truegrad.utils.patch_torch
Add truegrad.functional.TrueGradTensor to store sum_grad_squared (this fixes truegrad.functional.reshape)
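A hedged usage sketch of the 1.0.0 surface (that truegrad.nn mirrors torch.nn's module names is an assumption):

```python
import torch
from truegrad import nn as tgnn
from truegrad.optim import TGAdamW

model = tgnn.Linear(8, 2)  # assuming truegrad.nn mirrors torch.nn module names
optim = TGAdamW(model.parameters())

loss = model(torch.randn(4, 8)).square().mean()
loss.backward()  # populates .grad plus the sum_grad_squared statistic
optim.step()
```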
0.1.0
26 Nov 16:02
Add BackPack as a possible backend
Add a default_to_adam option for TGAdamW
Rename square_grad to sum_grad_squared
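A sketch of the BackPack-backed path. BackPack's extend/SumGradSquared API is real; that TGAdamW consumes the resulting param.sum_grad_squared, falling back to plain AdamW where the statistic is absent when default_to_adam=True, is inferred from these notes:

```python
import torch
from backpack import backpack, extend
from backpack.extensions import SumGradSquared
from truegrad.optim import TGAdamW

model = extend(torch.nn.Linear(8, 2))  # BackPack requires an extended model
optim = TGAdamW(model.parameters(), default_to_adam=True)

loss = model(torch.randn(4, 8)).square().mean()
with backpack(SumGradSquared()):
    loss.backward()  # BackPack writes param.sum_grad_squared next to param.grad
optim.step()
```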