Skip to content

Speed up HTF creation#1858

Open
jthorton wants to merge 14 commits into
mainfrom
htf_speed
Open

Speed up HTF creation#1858
jthorton wants to merge 14 commits into
mainfrom
htf_speed

Conversation

@jthorton

@jthorton jthorton commented Feb 25, 2026

Copy link
Copy Markdown
Collaborator

Try to speed up the HTF creation, targeting the torsion force and nonbonded exceptions as they scale terribly with system size. Local testing on a small protein-ligand system shows good improvements

Main

systems built starting test
Handled harmonic bonds in 0.36160308588296175 seconds
Handled harmonic angles in 1.0089957918971777 seconds
Handled periodic torsions in 7.016903054900467 seconds
Handled nonbonded interactions in 25.400564597919583 seconds
HybridTopologyFactory took 37.79149968735874 seconds

This branch

systems built starting test
Handled harmonic bonds in 0.3190057808533311 seconds
Handled harmonic angles in 0.9470380647107959 seconds
Handled periodic torsions in 0.24904045276343822 seconds
Time to handle original exceptions 0.51 seconds
Handled nonbonded interactions in 2.244331883266568 seconds
Handled nonbonded exceptions in 0.04826593864709139 seconds
HybridTopologyFactory took 7.013191535137594 seconds

We probably need the tests in #1856 to confirm that nothing is broken with these changes.

TODO

  • test on membrane system

Checklist

  • All new code is appropriately documented (user-facing code must have complete docstrings).
  • Added a news entry, or the changes are not user-facing.
  • Ran pre-commit: you can run pre-commit locally or comment on this PR with pre-commit.ci autofix.

Manual Tests: these are slow so don't need to be run every commit, only before merging and when relevant changes are made (generally at reviewer-discretion).

Developers certificate of origin

@jthorton

Copy link
Copy Markdown
Collaborator Author

pre-commit.ci autofix

@mikemhenry

Copy link
Copy Markdown
Contributor

Oh this is very promising -- how stable are these timings? Do you think it would be worth making a test that ensures (for this test case) HybridTopologyFactory takes less than 10 seconds to build (might need to adjust depending on how long it takes on CI) but it would be nice to have some checks we don't lose these speed-ups later?

@jthorton

Copy link
Copy Markdown
Collaborator Author

how stable are these timings?

Locally, they are within a 3 second window so we could make a timing test, though I am not sure how common that is?We could also provide a timing script to be run whenever someone touches the HTF, which should not be often. Any preference @atravitz ?

@codecov

codecov Bot commented Feb 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.53731% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.67%. Comparing base (541e1ca) to head (8017bb8).

Files with missing lines Patch % Lines
...openfe/protocols/openmm_rfe/_rfe_utils/relative.py 92.53% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1858      +/-   ##
==========================================
- Coverage   94.95%   90.67%   -4.29%     
==========================================
  Files         217      217              
  Lines       20768    20746      -22     
==========================================
- Hits        19720    18811     -909     
- Misses       1048     1935     +887     
Flag Coverage Δ
fast-tests 90.67% <92.53%> (?)
slow-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jthorton

Copy link
Copy Markdown
Collaborator Author

With bccc53c the timings are now

systems built starting test
Handled harmonic bonds in 0.04707380570471287 seconds
Handled harmonic angles in 0.1307398546487093 seconds
Handled periodic torsions in 0.2686290815472603 seconds
Handled nonbonded interactions in 2.285492484457791 seconds
Handled nonbonded exceptions in 0.04942901339381933 seconds
HybridTopologyFactory took 6.055246475152671 seconds

@mikemhenry

Copy link
Copy Markdown
Contributor

There are a few ways to track regressions, one cool way is to make sure the algo doesn't change the way it scales: https://pythonspeed.com/articles/big-o-tests/ but we don't really have things set up to make that easy (but it would be good to know how the HTF scales with system size, so it might be worth doing)

@atravitz

atravitz commented Mar 9, 2026

Copy link
Copy Markdown
Contributor

how stable are these timings?

Locally, they are within a 3 second window so we could make a timing test, though I am not sure how common that is?We could also provide a timing script to be run whenever someone touches the HTF, which should not be often. Any preference @atravitz ?

I like the idea of a timing script for now. I want us to add performance benchmarking like this, but structure is TBD, so pop it in the devtools/ directory for now.

@github-actions

Copy link
Copy Markdown

No API break detected ✅

View workflow run

Griffe output
$ griffe check "openfe" -s src --no-inspection --no-color --verbose -a origin/main

$ griffe check "openfecli" -s src --no-inspection --no-color --verbose -a origin/main

@IAlibay IAlibay left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great, but there's a few things - especially key collisions, that probably should be further discussed / handled.

index1_hybrid, index2_hybrid,
[r0_old, k_old, r0_new, k_new])
# track that we've added this bond
hybrid_bonds[self._pair_key(index1_hybrid, index2_hybrid)] = True

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason for this to be a dictionary? Could keeping a set or list of pair key indices be suitable?


return []
index1, index2, r0, k = bond_force.getBondParameters(bond_index)
bond_lookup[HybridTopologyFactory._pair_key(index1, index2)] = (r0, k)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we ever be in a situation where there are multiple bonds acting on the same pair of indices? It would be odd, ,but I don't think there's anything in HarmonicBondForce that prevents that.

Given the likely low cost, I would probably suggest capturing that case with a quick "if pairkey in bond_lookup" check.

angle_lookup = {}
for angle_index in range(angle_force.getNumAngles()):
index1, index2, index3, theta0, k = angle_force.getAngleParameters(angle_index)
angle_lookup[HybridTopologyFactory._triplet_key(index1, index2, index3)] = (theta0, k)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above re: bonds with the same keys, should we guard against a case where you could have the same key but with a different angle term?

hybrid_index_list[2], hybrid_force_parameters
)
# track that we have added this angle for the hybrid system
hybrid_angles[self._triplet_key(*hybrid_index_list)] = True

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above - how about using a list or a set?


auxiliary_custom_torsion_force = []
old_custom_torsions_to_standard = []
# use sets to keep membership checks quick as systems have many torsions

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't sets drop torsions with equal indices? My understanding is that this is more common / acceptable.

hybrid_force_parameters = [
torsion_parameters[4], torsion_parameters[5],
torsion_parameters[6], 0.0, 0.0, 0.0
torsion_parameters[4], torsion_parameters[5].value_in_unit(unit.radian),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately clear to me, why the value_in_unit change?


# Check to see if this term is in the olds...
term = [hybrid_index_list[0], hybrid_index_list[1],
term = (hybrid_index_list[0], hybrid_index_list[1],

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exact quality on converted floats seems like it would increase the likelyhood of a miss here - it would be better if the comparisons was done to a certain float level instead, e.g. truncating the floats to 5 d.p. or something.

# Then this terms has to go to standard and be deleted...
old_index = auxiliary_custom_torsion_force.index(term)
old_custom_torsions_to_standard.append(old_index)
old_custom_torsions_to_standard.add(term)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] (I know it's a holdover from the previous code) maybe we should rename this variable - it's confusing

if particle_index in self._atom_classes['unique_old_atoms']:
# Get the parameters in the old system
old_index = hybrid_to_old_map[particle_index]
[charge, sigma, epsilon] = old_system_nonbonded_force.getParticleParameters(old_index)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're looking for other performance improvements - you could probably cache this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants