Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added tooling for trie stats #8

Merged
merged 11 commits into from
Feb 8, 2024
Merged

Added tooling for trie stats #8

merged 11 commits into from
Feb 8, 2024

Conversation

BGluth
Copy link
Collaborator

@BGluth BGluth commented Feb 6, 2024

A bit more tooling related to extracting stats from tries. Primary use case that I want from this is to compare sub-tries from the base trie they were created from.

To give an idea of what the output looks like:

Trie Stats: (local_partial_plus_delta_storage_txn_4.json)
Counts:
Empty: 39 (18.660%)
Hash: 154 (73.684%)
Branch: 13 (6.220%)
Extension: 0 (0.000%)
Leaf: 3 (1.435%)

Depth stats:
Lowest depth: 4
Average leaf depth: 4.000
Average hash depth: 2.604

Trie Stats: (local_full_post_delta_storage_txn_4.json)
Counts:
Empty: 50 (19.455%)
Hash: 164 (63.813%)
Branch: 16 (6.226%)
Extension: 0 (0.000%)
Leaf: 27 (10.506%)

Depth stats:
Lowest depth: 4
Average leaf depth: 4.000
Average hash depth: 2.433

Node comparison: Total nodes: 209 / 257 (81.323%)
Non-empty: 170 / 207 (82.126%)
Total empty: 39 / 50 (78.000%)
Total hash: 154 / 164 (93.902%)
Total branch: 13 / 16 (81.250%)
Total extension: 0 / 0 (NaN%)
Total leaf: 3 / 27 (11.111%)

Depth comparison: Lowest depth: 4 / 4 (100.000%)
Avg leaf depth: 4.000 / 4.000 (100.000%)
Avg hash depth: 2.604 / 2.433 (107.027%)

@BGluth BGluth requested a review from Nashtare February 6, 2024 23:40
- Also did some refactoring to reduce code duplication that cut across
  modules
- Previously had a lot of redundant info that was also present in
  `TrieStats`. Now compares the depth values from both tries directly.
src/debug_tools/stats.rs Outdated Show resolved Hide resolved
}
}

fn get_trie_stats_rec<T: PartialTrie>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning the state rather than using a mut state as an arg would be more rusty

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only downside is then we would need to do more computation when merging together multiple CurrTrackingState together, which I guess doesn't really matter too much.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will revisit this afterwards!

Copy link
Contributor

@muursh muursh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given it's just a debug tool I'm not too worried about the bits I noted. Having working tools is more useful than not.

Copy link
Contributor

@Nashtare Nashtare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BGluth!
I'll stop commenting now because most of my comments are identical. So global points:

  • we can probably not have pub most methods / objects lying around. I think that's probably good practice to reduce public API in general.
  • we can add some clearer documentation on new types and methods, but it's not blocking.

src/debug_tools/stats.rs Outdated Show resolved Hide resolved
src/debug_tools/stats.rs Show resolved Hide resolved
src/debug_tools/stats.rs Outdated Show resolved Hide resolved
src/debug_tools/stats.rs Outdated Show resolved Hide resolved
src/debug_tools/stats.rs Outdated Show resolved Hide resolved
src/debug_tools/stats.rs Outdated Show resolved Hide resolved
src/debug_tools/stats.rs Show resolved Hide resolved
@muursh
Copy link
Contributor

muursh commented Feb 7, 2024

I think that's probably good practice to reduce public API in general.

Yeah agreed. Meant to comment this.

@BGluth
Copy link
Collaborator Author

BGluth commented Feb 7, 2024

Hmm yeah. So I ended up making a lot more of the interface public at the last minute because I thought the callers might want to access the stat data without relying just on Display. My thought that having a lot of public surface being ok in this is just it's stat info on the trie structure that is probably very stable at this point. I'm also fine with limiting the exposed API to just Display though.

@Nashtare
Copy link
Contributor

Nashtare commented Feb 7, 2024

It's not really blocking, we can always restrict the API in the future. This isn't even on crates.io yet, is it?

@BGluth
Copy link
Collaborator Author

BGluth commented Feb 7, 2024

It's on crates.io, but it's been a while since we've pushed a new version. It's a pretty easy change for me to make, so for now I'll restrict the API a bit more,

Co-authored-by: Robin Salen <30937548+Nashtare@users.noreply.github.com>
@Nashtare
Copy link
Contributor

Nashtare commented Feb 7, 2024

Ah nevermind for crates.io, for some reason I thought this was eth-trie-tools 🤦🏼‍♂️

@BGluth BGluth merged commit 731a095 into main Feb 8, 2024
2 checks passed
@BGluth BGluth deleted the debug_tools_stats branch February 8, 2024 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants