-
Notifications
You must be signed in to change notification settings - Fork 72
loader,smp: Enable caches on all cores for shareability #465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
bruelc
wants to merge
1
commit into
seL4:main
Choose a base branch
from
bruelc:master-cpusync
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is an argument that we should make the loader use outer shareable instead of inner shareable, then I think this should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that would solve any coherency problems when the caches are disabled on the secondary cores.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caches are enabled early in arm_secondary_cpu_entry(). Before, the only data that must be coherent is the parameter read from the stack by arm_secondary_cpu_entry_asm() (without this I read 0 as logical_cpu parameter, which fails). Instructions should be fine.
We could conservatively clean the entire cache here, but that seems like overkill. Am I missing anything, or do you see any other possible source of incoherence before caches are enabled in this sequence?
(edit: arm_secondary_cpu_entry_asm() read the stack, not the PSCI call)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I didn't notice this was for the stack, which probably needs a cache clean here yes.
What I had thought of was that the print_lock is a variable that exists across cores that won't be in the same shareability domain. I probably need to recheck the manual here, it might be that a "dsb sy" (full system barrier) would be OK when the memory is mapped ISH as opposed to OSH, but I also don't know if that's what the compiler atomics expect and whether the instructions they emit work for that case.
Admittedly this isn't an issue with this PR at all...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. It's best to discuss the details now :). I think the print_lock exclusive accesses are in the same shareability domain, because they are accessed after enable_mmu() on all cores, with the same MMU/cache attributes. Also, we enable both ISH and OSH shareability in the translation tables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean by this?
Anyway, it's a Friday evening and I'm mixing everything up 🙃. For some reason I'm thinking that ISH means "non-shareable"; I'm mixing things up. ISH should be fine for cores; all cores are (usually) in the same inner shareable domain; at least that's what the seL4 SMP code assumes. Non-shareable would be a different matter, it was just the comment in here that the shareability matches that of seL4 sees to be not true as seL4 is NSH on the non-SMP configurations.
tldr: no issue I mixed things up .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment was in reply to Julia's suggestion to have the memory as outer-shareable instead of inner-shareable, not your cache clean code. The code you have is to handle the case before caches are enabled, so it doesn't matter whether it's mapped inner or outer shareable on the other core.
Anyway, considering you know that the other core starts at the correct exception level, and the MMU is already configured, you could just enable it immediately at the start of
arm_secondary_cpu_entry_asminstead. That seems better than ad-hoc cache flushing.That, or start the other cores with cache disabled. But this weird mix of having cache enabled on one core and disabled on the others seems precarious.