Skip to content

Conversation

@Molter73
Copy link
Collaborator

@Molter73 Molter73 commented Nov 13, 2025

Description

Retrieve device id for the dentry being accessed on kernel side, then
use the mountinfo from /proc on userspace to adjust the path the user
would see on the host node.

This new logic requires us to keep track of mountinfo in userspace, so a
new EventParser type is added to do this in a cached manner and generate
events with it.

In case a device id is received that is not found in the mountinfo
cache, the cache will be rebuilt. If the device id is still not
found, an empty entry will be added for that id and we will assume we
cannot get the required information to correct the host path gathered
from kernelspace.

These changes also require some adjustments to work on k8s, so the
manifest is updated accordingly.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Testing Performed

Added tests will validate events generated on an overlayfs file properly
shows the event on the upper layer and the access to the underlying FS.
They also validate a mounted path on a container resolves to the correct
host path.

While developing these tests, it became painfully obvious getting the
information of the process running inside the container is not
straightforward. Because containers tend to be fairly static, we should
be able to manually create the information statically in the test and
still have everything work correctly. In order to minimize the amount of
changes on existing tests, the default Process constructor now takes
fields directly and there is a from_proc class method that builds a new
Process object from /proc. Additionally, getting the pid of a process in
a container is virtually impossible, so we make the pid check optional.

@Molter73 Molter73 marked this pull request as draft November 13, 2025 11:50
@Molter73 Molter73 force-pushed the mauro/ROX-30437/refine-host-path-algo branch from c3077eb to 4b356e7 Compare November 13, 2025 12:09
@Molter73 Molter73 requested a review from Stringy November 17, 2025 11:11
Copy link
Contributor

@Stringy Stringy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Performance impact notwithstanding, this does look like a pretty neat and clear way of resolving host paths.

Comment on lines +175 to +177
if let Ok(hf) = host_file.strip_prefix(&mount.root) {
host_file = mount.mount_point.join(hf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a little while to understand what we're doing here, could we have a small comment with an example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick note that we'll have to make similar updates in the main repo helm charts when this PR is merged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will happily put together the PR for the main repo after we discuss the alternate implementation I'm about to open.

Retrieve device id for the dentry being accessed on kernel side, then
use the mountinfo from /proc on userspace to adjust the path the user
would see on the host node.

This new logic requires us to keep track of mountinfo in userspace, so a
new EventParser type is added to do this in a cached manner and generate
events with it.

In case a device id is received that is not found in the mountinfo
cache, the cache will be rebuilt. If the device id is still not
found, an empty entry will be added for that id and we will assume we
cannot get the required information to correct the host path gathered
from kernelspace.

These changes also require some adjustments to work on k8s, so the
manifest is updated accordingly.
Added tests will validate events generated on an overlayfs file properly
shows the event on the upper layer and the access to the underlying FS.
They also validate a mounted path on a container resolves to the correct
host path.

While developing these tests, it became painfully obvious getting the
information of the process running inside the container is not
straightforward. Because containers tend to be fairly static, we should
be able to manually create the information statically in the test and
still have everything work correctly. In order to minimize the amount of
changes on existing tests, the default Process constructor now takes
fields directly and there is a from_proc class method that builds a new
Process object from /proc. Additionally, getting the pid of a process in
a container is virtually impossible, so we make the pid check optional.
@Molter73 Molter73 force-pushed the mauro/ROX-30437/refine-host-path-algo branch from 1c58b67 to e2489e9 Compare November 19, 2025 14:12
@Molter73
Copy link
Collaborator Author

There is an alternative implementation on #158. Compared to that implementation, there are a few benefits to this implementation:

  • As long as we don't hit the BPF side limit climbing the dentry list, we should always get the host path, regardless of where it is on the file system.
  • Userspace code does not require FFI, but overall complexity might be a bit higher.

Main downside of this approach is that we lose kernel side filtering unless we implement some path overlapping algorithm in BPF.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants