@wswsmao wswsmao commented Nov 5, 2025

Issue #, if available:

  1. When users remove images with nerdctl rmi, fscache is not cleaned up
  2. When the snapshotter exits, fscache is not cleared
  3. After restarting the snapshotter, a new batch of fscache directories is created with duplicate content
  4. The snapshotter does not automatically restore fscache upon restart

Description of changes:
This patch implements explicit resource lifecycle management with immediate eviction support to fix resource leaks during unmount operations.

The core change enhances the LRU cache to support both deferred and immediate eviction modes. The done callback now accepts a boolean parameter: false for lazy cleanup and true for immediate removal. When immediate eviction is requested, cache entries are finalized and removed synchronously instead of waiting for LRU eviction.
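The two eviction modes can be illustrated with a minimal sketch. This is not the patch's actual code: the `Cache` type, its `Add`/`Get` methods, and the `onEvict` finalizer are hypothetical names chosen for illustration, and only the `done(evict bool)` shape comes from the description above.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is one cached value with a reference count.
type entry struct {
	key       string
	value     any
	refs      int           // callers that have not yet called done
	elem      *list.Element // nil once the entry has left the cache
	finalized bool
}

// Cache is a hypothetical reference-counted LRU cache (illustration only).
type Cache struct {
	mu      sync.Mutex
	max     int
	order   *list.List // front = most recently used
	items   map[string]*entry
	onEvict func(key string, value any) // must not call back into the cache
}

func New(max int, onEvict func(string, any)) *Cache {
	if max < 1 {
		max = 1
	}
	return &Cache{max: max, order: list.New(), items: map[string]*entry{}, onEvict: onEvict}
}

// Add stores value and returns its done callback.
func (c *Cache) Add(key string, value any) func(evict bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.items[key]; ok {
		c.removeLocked(old)
	}
	e := &entry{key: key, value: value, refs: 1}
	e.elem = c.order.PushFront(e)
	c.items[key] = e
	for c.order.Len() > c.max { // deferred path: ordinary LRU pressure
		c.removeLocked(c.order.Back().Value.(*entry))
	}
	return c.doneFunc(e)
}

// Get returns the cached value and a done callback for this reference.
func (c *Cache) Get(key string) (any, func(evict bool), bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok {
		return nil, nil, false
	}
	c.order.MoveToFront(e.elem)
	e.refs++
	return e.value, c.doneFunc(e), true
}

// doneFunc builds a one-shot callback: done(false) only drops the reference,
// done(true) also removes the entry from the cache right away.
func (c *Cache) doneFunc(e *entry) func(evict bool) {
	var once sync.Once
	return func(evict bool) {
		once.Do(func() {
			c.mu.Lock()
			defer c.mu.Unlock()
			e.refs--
			if evict {
				c.removeLocked(e)
			} else {
				c.finalizeLocked(e)
			}
		})
	}
}

func (c *Cache) removeLocked(e *entry) {
	if e.elem != nil {
		c.order.Remove(e.elem)
		e.elem = nil
		delete(c.items, e.key)
	}
	c.finalizeLocked(e)
}

// finalizeLocked runs the finalizer once the entry is out of the cache
// and no caller still holds a reference to it.
func (c *Cache) finalizeLocked(e *entry) {
	if e.elem == nil && e.refs == 0 && !e.finalized {
		e.finalized = true
		if c.onEvict != nil {
			c.onEvict(e.key, e.value)
		}
	}
}

func main() {
	var finalized []string
	c := New(2, func(key string, _ any) { finalized = append(finalized, key) })
	c.Add("layer-a", "span data")(false) // lazy release: entry stays cached
	_, done, ok := c.Get("layer-a")
	fmt.Println(ok, finalized) // true []
	done(true) // immediate eviction: entry is removed and finalized now
	_, _, ok = c.Get("layer-a")
	fmt.Println(ok, finalized) // false [layer-a]
}
```

In this sketch an entry that is still referenced elsewhere is dropped from the cache on `done(true)` but finalized only when its last reference is released. Whether the patch behaves the same way is an assumption.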

A new Close() method is introduced alongside the existing Done() to distinguish between normal resource release and unmount scenarios. During filesystem unmount, layers call Close() to trigger immediate resource eviction, preventing the resource leaks observed in the original issue. Error paths are also updated to use immediate eviction for invalid or corrupted resources, ensuring they don't occupy cache space.
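The split between Done() and Close() might look roughly like the following. The `layer` type, its `release` field, and the `unmount` helper are hypothetical stand-ins for illustration, not the snapshotter's real types.

```go
package main

import "fmt"

// layer is a hypothetical handle around a cached resource. release stands in
// for the done callback handed out by the cache: false requests lazy cleanup,
// true requests immediate eviction.
type layer struct {
	name     string
	release  func(evict bool)
	released bool
}

// Done is the normal release path: the resource may stay cached for reuse.
func (l *layer) Done() { l.finish(false) }

// Close is the unmount path: the resource should be evicted immediately.
func (l *layer) Close() { l.finish(true) }

// finish makes release idempotent so that a Close after a Done is harmless.
func (l *layer) finish(evict bool) {
	if l.released {
		return
	}
	l.released = true
	l.release(evict)
}

// unmount is a hypothetical teardown hook that closes every mounted layer.
func unmount(layers []*layer) {
	for _, l := range layers {
		l.Close()
	}
}

func main() {
	var log []string
	newLayer := func(name string) *layer {
		return &layer{name: name, release: func(evict bool) {
			log = append(log, fmt.Sprintf("%s evict=%v", name, evict))
		}}
	}
	a, b := newLayer("a"), newLayer("b")
	a.Done()                // released lazily before the unmount
	unmount([]*layer{a, b}) // a is already released, b is evicted now
	fmt.Println(log)        // [a evict=false b evict=true]
}
```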

Additionally, the patch adds explicit cleanup of span manager resources in the reader's Close() method and improves snapshot restoration robustness by pre-creating required directories before mount operations. Overall, this provides deterministic cleanup timing with clear "release when convenient" vs. "release immediately" semantics, which is critical for proper resource management during unmounts and error handling.
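Pre-creating directories before a mount can be sketched with `os.MkdirAll`, which succeeds when the directory already exists and so is safe to repeat on every restart. The `ensureDirs` helper and the directory names are assumptions for illustration and do not reflect the snapshotter's actual layout.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDirs creates each named directory under root if it is missing.
// It is a hypothetical helper, not a function from the patch.
func ensureDirs(root string, names ...string) error {
	for _, n := range names {
		if err := os.MkdirAll(filepath.Join(root, n), 0o700); err != nil {
			return fmt.Errorf("pre-create %s: %w", n, err)
		}
	}
	return nil
}

// demo runs ensureDirs twice against a temporary root to show that repeating
// the call (as would happen on every restart) is harmless.
func demo() error {
	root, err := os.MkdirTemp("", "snapshot-restore-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	// Illustrative directory names only.
	names := []string{"spancache", filepath.Join("snapshots", "1", "fs")}
	for i := 0; i < 2; i++ {
		if err := ensureDirs(root, names...); err != nil {
			return err
		}
	}
	for _, n := range names {
		info, err := os.Stat(filepath.Join(root, n))
		if err != nil {
			return err
		}
		if !info.IsDir() {
			return fmt.Errorf("%s is not a directory", n)
		}
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("directories ready")
}
```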

Testing performed:

  1. Run an image
nerdctl run --snapshotter soci --rm -it env12.com/ocs9:soci echo "hello"

Observed:

Every 1.0s: du -h --max-depth=1 /var/lib/soci-snapshotter-grpc/soci/spancache                   VM-33-248-tlinux: Wed Nov  5 09:35:08 2025

25M     /var/lib/soci-snapshotter-grpc/soci/spancache/2068117739
25M     /var/lib/soci-snapshotter-grpc/soci/spancache
  2. Remove the image
nerdctl images -q | xargs nerdctl rmi -f
Untagged: env12.com/ocs9:soci@sha256:979fe973858c87e1fb2152a9b17563f6fc59abc5c2f85293d15a7c827bbb6a02
Deleted: sha256:3a6ef9931f24a0ccd89e429366e0341d81b58d97bf7115612494f4249dd8d351

Observed:

Every 1.0s: du -h --max-depth=1 /var/lib/soci-snapshotter-grpc/soci/spancache                   VM-33-248-tlinux: Wed Nov  5 09:35:27 2025

4.0K    /var/lib/soci-snapshotter-grpc/soci/spancache
  3. Re-run the image
nerdctl run --snapshotter soci --rm -it env12.com/ocs9:soci echo "hello"
env12.com/ocs9:soci:                                                              resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:979fe973858c87e1fb2152a9b17563f6fc59abc5c2f85293d15a7c827bbb6a02:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:fb1079429352f239c48ea35304477b270a42932d000b0707e4d96990f68afb2f: done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:ba7df7c2d4c09ddea132cac1bcc9d77c461db75fb49d681837695bbec4df6142:   done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 0.6 s                                                                    total:  3.2 Ki (5.3 KiB/s)                                       
hello

Observed:

Every 1.0s: du -h --max-depth=1 /var/lib/soci-snapshotter-grpc/soci/spancache                   VM-33-248-tlinux: Wed Nov  5 09:35:59 2025

71M     /var/lib/soci-snapshotter-grpc/soci/spancache/3428176031
71M     /var/lib/soci-snapshotter-grpc/soci/spancache
  4. Kill the snapshotter process (Ctrl+C)

Observed:

Every 1.0s: du -h --max-depth=1 /var/lib/soci-snapshotter-grpc/soci/spancache                   VM-33-248-tlinux: Wed Nov  5 09:36:17 2025

4.0K    /var/lib/soci-snapshotter-grpc/soci/spancache
  5. Restart the snapshotter process and run the container again

Observed:

Every 1.0s: du -h --max-depth=1 /var/lib/soci-snapshotter-grpc/soci/spancache                   VM-33-248-tlinux: Wed Nov  5 09:36:31 2025

25M     /var/lib/soci-snapshotter-grpc/soci/spancache/1469904631
25M     /var/lib/soci-snapshotter-grpc/soci/spancache

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@wswsmao wswsmao requested a review from a team as a code owner November 5, 2025 09:39
@github-actions github-actions bot added the go Pull requests that update Go code label Nov 5, 2025
@wswsmao wswsmao changed the title from "Cleanup" to "fix fscache not cleanup" Nov 5, 2025
@wswsmao wswsmao mentioned this pull request Nov 5, 2025
@github-actions github-actions bot added the testing Unit and/or integration tests label Nov 5, 2025
Signed-off-by: Shubhranshu Mahapatra <[email protected]>
Signed-off-by: abushwang <[email protected]>


3 participants