Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions loader/src/aarch64/cpus.c
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
#include <stdint.h>

#include "smc.h"
#include "../arch.h"
#include "../cpus.h"
#include "../cutil.h"
#include "../loader.h"
Expand Down Expand Up @@ -110,6 +111,14 @@ void arm_secondary_cpu_entry(int logical_cpu, uint64_t mpidr_el1)

plat_save_hw_id(logical_cpu, mpidr_el1);

int r = arch_mmu_enable(logical_cpu);
if (r != 0) {
LDR_PRINT("ERROR", logical_cpu, "failed to enable MMU: ");
puthex32(r);
puts("\n");
goto fail;
}

start_kernel(logical_cpu);

fail:
Expand Down Expand Up @@ -175,6 +184,10 @@ int plat_start_cpu(int logical_cpu)
/* zero out what was here before */
sp[1] = 0;

/* clean up the cache line containing sp before use by secondary core */
asm volatile("dc cvac, %0" :: "r"(sp) : "memory");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is an argument that we should make the loader use outer shareable instead of inner shareable, then I think this should be fine.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would solve any coherency problems when the caches are disabled on the secondary cores.

Copy link
Copy Markdown
Author

@bruelc bruelc Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would solve any coherency problems when the caches are disabled on the secondary cores.

Caches are enabled early in arm_secondary_cpu_entry(). Before, the only data that must be coherent is the parameter read from the stack by arm_secondary_cpu_entry_asm() (without this I read 0 as logical_cpu parameter, which fails). Instructions should be fine.

We could conservatively clean the entire cache here, but that seems like overkill. Am I missing anything, or do you see any other possible source of incoherence before caches are enabled in this sequence?

(edit: arm_secondary_cpu_entry_asm() read the stack, not the PSCI call)

Copy link
Copy Markdown
Contributor

@midnightveil midnightveil Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I didn't notice this was for the stack, which probably needs a cache clean here yes.

What I had thought of was that the print_lock is a variable that exists across cores that won't be in the same shareability domain. I probably need to recheck the manual here, it might be that a "dsb sy" (full system barrier) would be OK when the memory is mapped ISH as opposed to OSH, but I also don't know if that's what the compiler atomics expect and whether the instructions they emit work for that case.

Admittedly this isn't an issue with this PR at all...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. It's best to discuss the details now :). I think the print_lock exclusive accesses are in the same shareability domain, because they are accessed after enable_mmu() on all cores, with the same MMU/cache attributes. Also, we enable both ISH and OSH shareability in the translation tables

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we enable both ISH and OSH shareability in the translation tables

I'm not sure what you mean by this?


Anyway, it's a Friday evening and I'm mixing everything up 🙃. For some reason I'm thinking that ISH means "non-shareable"; I'm mixing things up. ISH should be fine for cores; all cores are (usually) in the same inner shareable domain; at least that's what the seL4 SMP code assumes. Non-shareable would be a different matter, it was just the comment in here that the shareability matches that of seL4 sees to be not true as seL4 is NSH on the non-SMP configurations.

tldr: no issue I mixed things up .

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would solve any coherency problems when the caches are disabled on the secondary cores.

Caches are enabled early in arm_secondary_cpu_entry(). Before, the only data that must be coherent is the parameter read from the stack by arm_secondary_cpu_entry_asm() (without this I read 0 as logical_cpu parameter, which fails). Instructions should be fine.

We could conservatively clean the entire cache here, but that seems like overkill. Am I missing anything, or do you see any other possible source of incoherence before caches are enabled in this sequence?

My comment was in reply to Julia's suggestion to have the memory as outer-shareable instead of inner-shareable, not your cache clean code. The code you have is to handle the case before caches are enabled, so it doesn't matter whether it's mapped inner or outer shareable on the other core.

Anyway, considering you know that the other core starts at the correct exception level, and the MMU is already configured, you could just enable it immediately at the start of arm_secondary_cpu_entry_asm instead. That seems better than ad-hoc cache flushing.

That, or start the other cores with cache disabled. But this weird mix of having cache enabled on one core and disabled on the others seems precarious.

asm volatile("dsb sy" ::: "memory");

/* Arguments as per 5.1.4 CPU_ON of the PSCI spec.

§5.6 CPU_ON and §6.4 describes that:
Expand Down
16 changes: 7 additions & 9 deletions loader/src/loader.c
Original file line number Diff line number Diff line change
Expand Up @@ -100,15 +100,6 @@ static int print_lock = 0;

void start_kernel(int logical_cpu)
{
LDR_PRINT("INFO", logical_cpu, "enabling MMU\n");
int r = arch_mmu_enable(logical_cpu);
if (r != 0) {
LDR_PRINT("ERROR", logical_cpu, "failed to enable MMU: ");
puthex32(r);
puts("\n");
for (;;) {}
}

LDR_PRINT("INFO", logical_cpu, "jumping to kernel\n");

#ifdef CONFIG_PRINTING
Expand Down Expand Up @@ -175,6 +166,13 @@ int main(void)
puthex32(plat_get_active_cpus());
puts("\n");

r = arch_mmu_enable(0);
if (r != 0) {
LDR_PRINT("ERROR", 0, "failed to enable MMU: ");
puthex32(r);
fail();
}

for (int cpu = 1; cpu < plat_get_active_cpus(); cpu++) {
r = plat_start_cpu(cpu);
if (r != 0) {
Expand Down
Loading