Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node/bindnode: support binding a plain Go map to an IPLD map #343

Open
mvdan opened this issue Jan 27, 2022 · 1 comment
Open

node/bindnode: support binding a plain Go map to an IPLD map #343

mvdan opened this issue Jan 27, 2022 · 1 comment
Assignees

Comments

@mvdan
Copy link
Contributor

mvdan commented Jan 27, 2022

Right now, we implement ordered maps in the form of:

struct {
    Keys   []K
    Values map[K]V
}

This is because maps in IPLD must have a stable order:

Iteration order is defined to be stable: two separate MapIterator created to iterate the same Node will yield the same key-value pairs in the same order. The order itself may be defined by the Node implementation: some Nodes may retain insertion order, and some may return iterators which always yield data in sorted order, for example. 

What we implement now with the struct above is an ordered map that keeps insertion order. However, in many common use cases like graphsync, that's unnecessary; they just want a map to encode and decode as dag-cbor. In that case, a plain Go map that sorts its keys when MapIterator gets called should be enough.

Sorting the keys any time MapIterator gets called is a certain amount of overhead, both in copying the keys and in sorting them, but:

  • The current ordered map already duplicates the keys
  • We could alleviate the cost of copying the keys and sorting them by caching the result in the Node wrapper

cc @warpfork @rvagg @hannahhoward

@warpfork
Copy link
Collaborator

warpfork commented Jan 31, 2022

Hm. One non issue: allocation amortization. It's a noteworthy issue where this pattern appears in the gogen code generator and in the basicnode implementation, but I don't think it applies here.

(In detail: if K is going to be returned as an ipld.Node, having it in one big slab of memory -- the Keys slice -- means the whole thing can escape to heap one time, and that can save on a lot of allocs that may otherwise occur. If those values needed to be boxed into an ipld.Node interface, one each, and didn't all come from one contiguous memory like that, it would mean iteration costs an alloc per step, which is not good.)

But this doesn't apply to bindnode: bindnode always has to return a new object for wrapping the key value, regardless, because the K type doesn't implement ipld.Node itself.

In fact, I guess making one []bindnode._node and wiring the contents up to contain all the keys may end up doing this amortization nicely, whereas (I think) currently there is no amortization, so this should... end up being a net win, often? (The wrapper allocs are O(n) vs sorting being O(nlogn), but I think the constant factor on allocs is quite a bit higher than the constant factors for comparison and swapping in the sorting.)

@BigLep BigLep moved this to 🗄️ Backlog in IPLD team's weekly tracker Jul 19, 2022
@rvagg rvagg moved this from 🗄️ Backlog to 🥞 Todo in IPLD team's weekly tracker Aug 2, 2022
@rvagg rvagg self-assigned this Aug 2, 2022
@rvagg rvagg moved this from 🥞 Todo to 🗄️ Backlog in IPLD team's weekly tracker Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🗄️ Backlog
Development

No branches or pull requests

3 participants