Bring `canonization` up to current commit and optimize.

It is difficult to build our crate on Windows, because of our dependence on Nauty. We want to get rid of this dependency. We only use its canonical-representation subroutine, which obtains a hashable representation of a graph, such that two isomorphic graphs have the same representation.

@devrz45's canonization branch has a function that does exactly this, and it is specific to molecular graphs (and therefore possibly faster). However,

1. It branched off of the main codebase early, and needs a rebase and update to interface with the current `molecule::Molecule`. 
2. We need to optimize it to be at least as fast as nauty.

For (2), I have a few optimization ideas: 

(a) Get rid of all internal string operations (string formatting, etc). I think the main place where string operations are used is to index into `MolAtomNode::canonize::DAG_vertex_map`.
(b) Remove almost all uses of hashmaps/hashsets. Use bitsets/vectors in their place if feasible. Note that `NodeIndex`s and `EdgeIndex`s are contiguous, so it is possible to index a vector or bitset by `{NodeIndex/EdgeIndex}::index()`. Such a representation will likely be faster than a hashmap/hashset. 

Ideally, we'd have a prototype that looks like this:

```rust
// Very lightweight representation, clones should be very cheap. 
// The meat of this struct might be a `Vec<usize>`, or similar.
#[derive(Hash, Copy, Clone)]
struct CanonicalRepresentation { ... }

// Should run in ~100us for most subgraphs of (say) Morphine.
fn canonical_representation(molecule: &Molecule) -> CanonicalRepresentation
```



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Bring `canonization` up to current commit and optimize. #83

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Bring canonization up to current commit and optimize. #83

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Bring `canonization` up to current commit and optimize. #83