Steering #15053

shakedzy · 2025-08-03T08:17:29Z

It seems many recent papers and research projects deal with a technique called "steering", in which a layer's attention activations are manipulated using a provided vector. On a more technical level, this involves one extra multiplication operation, given an artifact that contains the vector/matrix.
At the moment, the only available tool for this appears to be transformers, since it allows full interaction with, and manipulation of, the model weights.
I believe future versions of llama.cpp should support this functionality too. What do others here think?
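
To make the idea concrete, here is a minimal sketch in plain Python. It assumes the additive form of steering, where a scaled vector is added to a layer's hidden state; a multiplicative variant, as described above, would swap the addition for a product. The name `apply_steering` and its parameters are hypothetical, chosen for illustration only, and are not part of llama.cpp or transformers.

```python
from typing import List


def apply_steering(hidden: List[float], steering: List[float], scale: float) -> List[float]:
    """Return a new hidden state with a scaled steering vector added.

    Assumption: additive steering, hidden' = hidden + scale * steering.
    """
    if len(hidden) != len(steering):
        raise ValueError("hidden state and steering vector must have the same length")
    # Element-wise update; the input list is left unmodified.
    return [h + scale * s for h, s in zip(hidden, steering)]


# Toy usage: a 3-dimensional hidden state nudged along one direction.
steered = apply_steering([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], scale=2.0)
```

With `scale=0.0` the hidden state comes back unchanged, which is what makes it cheap to leave such a hook in place when steering is disabled.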

NeuralNotwerk · 2025-12-20T05:10:52Z

It would be interesting to see this implemented so that, if steering is enabled, llama.cpp inserts synthetic layers between every pair of layers on load. By default these would pass data straight through without modifying it. However, if you pass in a steering vector, a scaling value, and a target layer id at inference time, the synthetic layer at that position applies the steering vector for the duration of the inference call, then goes back to straight passthrough.
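
A rough sketch of that design in plain Python, purely to illustrate the control flow. Everything here is hypothetical: `SteeringSlot`, `SteerableModel`, and `infer` are made-up names, layers are modeled as simple functions on lists of floats, and the steering vector is assumed to be added to the layer output. None of this reflects actual llama.cpp internals.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


class SteeringSlot:
    """Hypothetical synthetic layer: identity unless a steering vector is set."""

    def __init__(self) -> None:
        self.vector: Optional[Vector] = None
        self.scale: float = 0.0

    def __call__(self, hidden: Vector) -> Vector:
        if self.vector is None:
            return hidden  # straight passthrough
        # Assumption: additive steering on the layer output.
        return [h + self.scale * v for h, v in zip(hidden, self.vector)]


class SteerableModel:
    """Toy model with one steering slot placed after every layer at load time."""

    def __init__(self, layers: Sequence[Layer]) -> None:
        self.layers = list(layers)
        self.slots = [SteeringSlot() for _ in self.layers]

    def infer(self, hidden: Vector,
              steer: Optional[Tuple[int, Vector, float]] = None) -> Vector:
        """Run all layers; `steer` is (target layer id, vector, scale)."""
        slot = None
        if steer is not None:
            layer_id, vector, scale = steer
            slot = self.slots[layer_id]
            slot.vector, slot.scale = vector, scale
        try:
            for layer, s in zip(self.layers, self.slots):
                hidden = s(layer(hidden))
            return hidden
        finally:
            # Revert to passthrough once this inference call is over.
            if slot is not None:
                slot.vector, slot.scale = None, 0.0


# Toy usage: two "layers" and a steering vector applied after layer 0.
model = SteerableModel([lambda h: [x * 2 for x in h],
                        lambda h: [x + 1 for x in h]])
plain = model.infer([1.0, 2.0])
steered = model.infer([1.0, 2.0], steer=(0, [1.0, 0.0], 0.5))
```

Resetting the slot in a `finally` block means a failed inference call cannot leave steering switched on for the next request.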
