Skip to content

Make dynsym order deterministic#1263

Draft
quic-areg wants to merge 1 commit into
qualcomm:mainfrom
quic-areg:det-dynsym
Draft

Make dynsym order deterministic#1263
quic-areg wants to merge 1 commit into
qualcomm:mainfrom
quic-areg:det-dynsym

Conversation

@quic-areg

Copy link
Copy Markdown
Contributor

Multithreaded relocation scanning made .dynsym (and r_info indices in .rela.*) dependent on insertion order. Sort symbols before assigning dynsym indices to make the order deterministic.

Fixes #1261.

Comment thread lib/Target/GNULDBackend.cpp Outdated
return;

// Sort before partitioning so dynsym indices are deterministic across runs.
m_Module.sortSymbols();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call sortSymbols also in GNULDBackend::emitRegNamePools. Should we now remove it from here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need both: this sort runs before layout to produce a deterministic dynsym partition and the existing sort in emitRegNamePools runs after layout when final addresses exist. Removing the second one would change .symtab order.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for the explanation. I am still concerned about a few things. Now, the dynsym order is deterministic but still arbitrary for the user. We have the below code in the Module::sortSymbols:

    auto ValA = A->outSymbol()->value();
    auto ValB = B->outSymbol()->value();
    if (ValA != ValB)
      return ValA < ValB;
    // Break ties deterministically by name, then input ordinal, so the final
    // order does not depend on insertion order which can vary across runs due
    // to multithreaded relocation scanning.
    int NameCmp = A->getName().compare(B->getName());
    if (NameCmp != 0)
      return NameCmp < 0;
    assert(A->resolvedOrigin() && B->resolvedOrigin());
    return A->resolvedOrigin()->getInput()->getInputOrdinal() <
           B->resolvedOrigin()->getInput()->getInputOrdinal();

Symbol value comparison seems misleading here because at this stage the symbols contain the original values from the input files. Hence, the "sort by address" intent is not actually being realized here. If we have the following symbols: lib1.so[foo] (value: 0x10), lib1.so[bar] (value: 0x20), lib2.so[baz] (0x10), then the output order of dynsym table will be: [lib1.so[foo], lib2.so[baz], lib1.so[bar]]. The order is deterministic but it is still difficult to reason without seeing the input values of each individual shared library symbols. Later, if the value of lib1.so[foo] changes, then the .dynsym order will also change.

Additionally, I believe that it would be better to only sort dynamic symbols here (GNULDBackend::DynamicSymbols vector) as that is both significantly more efficient and makes the intent clear from the code itself.

Comment thread include/eld/Config/LinkerConfig.h Outdated
Comment thread test/Common/standalone/DeterministicDynsym/DeterministicDynsym.test Outdated
@@ -0,0 +1,47 @@
## Repeated links of the same inputs into a shared library must produce
## identical output, even with multithreaded scanning.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check this on windows too by running the workflow on this PR.

Comment thread test/Common/standalone/DeterministicDynsym/DeterministicDynsym.test Outdated
Comment thread lib/Core/Module.cpp Outdated
if (A->type() != ResolveInfo::Section &&
(B->type() == ResolveInfo::Section))
return false;
if (A->isLocal() && !B->isLocal())

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not specific to this patch, but since you are changing this function. We need to handle absolute FILE symbols. Can be a seperate bug.

@quic-areg quic-areg force-pushed the det-dynsym branch 2 times, most recently from 4dfad1f to 3c14c44 Compare June 11, 2026 17:20
@quic-areg

Copy link
Copy Markdown
Contributor Author

@quic-areg quic-areg force-pushed the det-dynsym branch 2 times, most recently from 094ed23 to 013da80 Compare June 15, 2026 15:14
@quic-areg quic-areg marked this pull request as draft June 15, 2026 15:20
Multithreaded relocation scanning made the order in which symbols were
appended to the module symbol vector dependent on thread timing. Both
.dynsym and .symtab inherited that order: existing sorts had address
ties everywhere (undefs at 0, linker-created PLT/GOT stubs sharing
addresses), and stable_sort fell back to the racy insertion order.

Sort each table deterministically by (input ordinal, input .symtab index).

Assisted-by: Claude

Fixes qualcomm#1261.

Signed-off-by: quic-areg <aregmi@qti.qualcomm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix non-deterministic order of entries in .dynsym

3 participants