Process shake restarts asynchronously #4879
Force-pushed from cc251cd to 88e0f88.
Force-pushed from 0a7302b to 4090ec3.
I can't get tests to pass with notifications fully async, which makes sense given what these notifications indicate. I thought I could get away with only the LSP side handling everything synchronously on the VFS side, but it appears not. I'm now trying a small variant where notifications are still synchronous, but the session restarts are initiated asynchronously, while still blocking subsequent LSP requests. So restarts still get squashed.
Force-pushed from 42b60e4 to 69e0d70.
Force-pushed from 1dfd24e to 5978cad.
```diff
-  sessionRestartTQueue <- withWorkerQueueSimple (cmapWithPrio Session.LogSessionWorkerThread recorder) "RestartTQueue"
+  sessionRestartTQueue <- liftIO $ newRestartSlot
   sessionLoaderTQueue <- withWorkerQueueSimple (cmapWithPrio Session.LogSessionWorkerThread recorder) "SessionLoaderTQueue"
+  ContT $ \action -> withRestartWorker ideMVar $ action ()
```
It would be better to extend the logic of the workerThread module so it can be reused here.
There is a slight chance that downstream code would swallow the async exception and hang the shutdown; workerThread handled that by setting up a shutdown flag.
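A minimal sketch of the shutdown-flag idea described above. This is not the actual workerThread module; all names here are illustrative. The point is that the worker re-checks an `IORef` flag before each job, so even if a badly behaved job swallows the cancellation exception, the loop still observes the flag on the next iteration and exits instead of hanging shutdown.

```haskell
module Main where

import Control.Concurrent
import Control.Exception (SomeException, try)
import Control.Monad (unless, void)
import Data.IORef

-- Hypothetical worker loop: run jobs from a channel until the shutdown
-- flag is set. Exceptions thrown inside a job cannot prevent the loop
-- from seeing the flag, because the flag is checked between jobs.
workerLoop :: IORef Bool -> Chan (IO ()) -> IO ()
workerLoop shutdown jobs = do
  stop <- readIORef shutdown
  unless stop $ do
    job <- readChan jobs
    -- A badly behaved job may catch and swallow any exception...
    _ <- try job :: IO (Either SomeException ())
    -- ...but the flag is rechecked on the next iteration regardless.
    workerLoop shutdown jobs

main :: IO ()
main = do
  shutdown <- newIORef False
  jobs <- newChan
  ran <- newEmptyMVar
  done <- newEmptyMVar
  _ <- forkIO (workerLoop shutdown jobs >> putMVar done ())
  writeChan jobs (putStrLn "job ran" >> putMVar ran ())
  takeMVar ran               -- make sure the first job completed
  writeIORef shutdown True
  writeChan jobs (pure ())   -- wake the worker so it re-reads the flag
  void (takeMVar done)
  putStrLn "worker stopped"
```

Run directly; it prints "job ran" followed by "worker stopped", showing the worker drains its current job and then honors the flag.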
Thanks for taking a peek at the PR! The idea I'm going for now is to keep notifications synchronous, but have them not wait for the shake restart to finish, unlike now. Multiple subsequent edits will then issue multiple synchronous notifications, each of which starts an asynchronous shake restart, and this allows the restarts to merge. Note that this is still me experimenting; I'm not sure if I'm breaking preconditions. This is all quite nuanced.
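To make the merging concrete, here is a small sketch of a "restart slot" in the spirit of the description above (the names `newRestartSlot`, `requestRestart`, and `restartWorker` are assumptions for illustration, not the PR's actual API). Notification handlers only flip a pending flag and return; a single worker thread clears the flag and performs one restart, so a burst of requests between worker wakeups coalesces into fewer actual restarts.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (forever, replicateM_, void)

-- A pending-restart flag shared between notification handlers and the worker.
newtype RestartSlot = RestartSlot (TVar Bool)

newRestartSlot :: IO RestartSlot
newRestartSlot = RestartSlot <$> newTVarIO False

-- Notification side: mark that a restart is wanted and return immediately.
requestRestart :: RestartSlot -> IO ()
requestRestart (RestartSlot v) = atomically (writeTVar v True)

-- Worker side: block until a restart is pending, clear the flag, and run
-- the restart action once. Requests arriving while a restart is running
-- are merged into the next iteration.
restartWorker :: RestartSlot -> IO () -> IO ()
restartWorker (RestartSlot v) doRestart = forever $ do
  atomically $ do
    pending <- readTVar v
    check pending
    writeTVar v False
  doRestart

main :: IO ()
main = do
  slot <- newRestartSlot
  count <- newTVarIO (0 :: Int)
  void $ forkIO $ restartWorker slot (atomically (modifyTVar' count (+ 1)))
  -- Five rapid-fire "edits": the worker typically collapses them into
  -- fewer restarts, but always at least one and never more than five.
  replicateM_ 5 (requestRestart slot)
  atomically (readTVar count >>= check . (>= 1))
  n <- readTVarIO count
  print (n >= 1 && n <= 5)
```

The squashing is exactly the behavior the comment above describes: the flag does not count requests, so consecutive edits cost one restart once the worker catches up.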
Good idea.
I've experienced this before: if we don't wait for Shake to fully restart, requests may observe stale state. I think we should ensure any pending restarts have completed before processing new requests, but keep the notification side of the story as you suggested; it might work. Also, it might be a good idea to go through the occurrences of
----- update -----
Also, we should move the "updatePositionMapping" in the notification handler into the restart, so every restart (merged or not) would see consistent mapping state?
Force-pushed from 7e4c4c7 to 224e74f.
Alright, report time. Tests now pass. The notification handlers now run synchronously, but any shake restart is done asynchronously through a worker thread. Some notes.
I ran the benchmarks in #4934 with smaller (very small) typing delays, which is optimal for observing the effect of this PR, but not super realistic. I couldn't observe a difference with the default delay.
Running the typing burst benchmarks with very small typing delays:
```
Example, Version, Configuration, name, success, samples, startup, setup, userT, delayedT, 1stBuildT, avgPerRespT, totalT, rulesBuilt, rulesChanged, rulesVisited, rulesTotal, ruleEdges, ghcRebuilds, maxResidency, allocatedBytes
cabal, upstream, All, hover after typing burst, True, 100, 4.07, 0.00, 6.49, 14.59, 0.55, 0.03, 21.09, 18, 15, 1164, 4042, 19942, 134, 286MB, 90219MB
cabal, HEAD, All, hover after typing burst, True, 100, 3.99, 0.00, 5.95, 14.93, 0.67, 0.03, 20.89, 18, 15, 1181, 4041, 19942, 114, 277MB, 88055MB
cabal, upstream, All, getDefinition after typing burst, True, 100, 0.66, 0.00, 7.51, 14.49, 0.64, 0.03, 22.02, 17, 14, 1183, 4045, 19949, 137, 271MB, 85329MB
cabal, HEAD, All, getDefinition after typing burst, True, 100, 0.57, 0.00, 6.10, 12.72, 0.75, 0.03, 18.83, 18, 15, 1181, 4043, 19949, 119, 279MB, 79739MB
cabal, upstream, All, completions after typing burst, True, 100, 0.67, 0.00, 9.62, 12.36, 0.76, 0.04, 21.99, 19, 16, 1182, 4043, 19941, 127, 348MB, 90711MB
cabal, HEAD, All, completions after typing burst, True, 100, 0.56, 0.00, 8.96, 11.11, 0.88, 0.04, 20.09, 19, 16, 1182, 4042, 19941, 117, 317MB, 88975MB
```
@crtschin consider this: notification A, then request B. How do we ensure B runs after A's session restart in this PR?
Good question, I'll walk you through an example timeline.
Critically, it's the locks added by notifications, which are waited on by follow-up requests, that ensure the ordering.
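A sketch of the lock-based ordering described above, under assumed names (`LastRestart`, `initiateRestart`, `waitForLastRestart` are illustrative, loosely mirroring the `waitForLastRestart` seen in the diffs below, not the PR's exact implementation). The synchronous part of notification A installs a fresh empty barrier before kicking off its asynchronous restart; request B waits on whichever barrier was installed most recently, so B cannot proceed until A's restart has completed.

```haskell
module Main where

import Control.Concurrent
import Control.Monad (void)

-- Barrier for the most recently initiated restart.
newtype LastRestart = LastRestart (MVar (MVar ()))

newLastRestart :: IO LastRestart
newLastRestart = do
  done <- newMVar ()            -- no restart yet: nothing to wait for
  LastRestart <$> newMVar done

-- Notification handler (synchronous part): swap in an empty barrier,
-- then run the restart asynchronously, filling the barrier on completion.
initiateRestart :: LastRestart -> IO () -> IO ()
initiateRestart (LastRestart ref) restart = do
  barrier <- newEmptyMVar
  void $ swapMVar ref barrier
  void $ forkIO $ restart >> putMVar barrier ()

-- Request handler: block until the latest restart has completed.
waitForLastRestart :: LastRestart -> IO ()
waitForLastRestart (LastRestart ref) = readMVar ref >>= readMVar

main :: IO ()
main = do
  lr <- newLastRestart
  order <- newMVar []
  initiateRestart lr $ do
    threadDelay 50000                            -- a slow restart for A
    modifyMVar_ order (pure . ("restartA" :))
  waitForLastRestart lr                          -- request B blocks here
  modifyMVar_ order (pure . ("requestB" :))
  readMVar order >>= print . reverse             -- ["restartA","requestB"]
```

Because the barrier swap happens in the synchronous notification handler, a request that arrives after the notification always observes the new barrier, never the already-filled old one.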
```haskell
-- old session and @GetLinkable@ errors.
-- See Note [Notification vs request restart ordering]
(modifyForEvaluate queueForEvaluation >> waitForLastRestart st)
(modifyForEvaluate unqueueForEvaluation)
```
Why do we not need to waitForLastRestart for unqueueForEvaluation as well?
Good one, I was thinking waiting was only needed for the session setup, but it makes sense for the eval itself as well.
```haskell
-- do need to occur here, the keys they invalidate need to propagate to the
-- worker so it can be used during the concrete restart.
-- See Note [Housekeeping rule cache and dirty key outside of hls-graph].
!newDirty <- fromListKeySet <$> ioActionBetweenShakeSession
```
ioActionBetweenShakeSession should be run after all the shake rules that rely on these states have fully stopped.
If it happens here, there is a chance that the newer state is picked up by an old-step rule run, resulting in an inconsistent shake database. This used to be an old bug that sat in HLS for years until it was fixed.
Hmm, I tried this, but some tests failed. I'll try this again, as some of the failures may have been eval-related.
Moves in-between shake actions to be accumulated and run on the worker thread again, instead of prior to restart submission.
Force-pushed from e2a5d19 to fad54eb.
Force-pushed from ca33b02 to 26d319d.

Would close #4725.
Implements the ideas from #4725 (comment). Makes the following changes: