node/bindnode: support binding a plain Go map to an IPLD map #343

mvdan · 2022-01-27T10:25:17Z

Right now, we implement ordered maps in the form of:

struct {
    Keys   []K
    Values map[K]V
}

This is because maps in IPLD must have a stable order:

Iteration order is defined to be stable: two separate MapIterator created to iterate the same Node will yield the same key-value pairs in the same order. The order itself may be defined by the Node implementation: some Nodes may retain insertion order, and some may return iterators which always yield data in sorted order, for example.

What we implement now with the struct above is an ordered map that keeps insertion order. However, in many common use cases like graphsync, that's unnecessary; they just want a map to encode and decode as dag-cbor. In that case, a plain Go map that sorts its keys when MapIterator gets called should be enough.

Sorting the keys any time MapIterator gets called is a certain amount of overhead, both in copying the keys and in sorting them, but:

The current ordered map already duplicates the keys
We could alleviate the cost of copying the keys and sorting them by caching the result in the Node wrapper

cc @warpfork @rvagg @hannahhoward

The text was updated successfully, but these errors were encountered:

warpfork · 2022-01-31T11:52:17Z

Hm. One non issue: allocation amortization. It's a noteworthy issue where this pattern appears in the gogen code generator and in the basicnode implementation, but I don't think it applies here.

(In detail: if K is going to be returned as an ipld.Node, having it in one big slab of memory -- the Keys slice -- means the whole thing can escape to heap one time, and that can save on a lot of allocs that may otherwise occur. If those values needed to be boxed into an ipld.Node interface, one each, and didn't all come from one contiguous memory like that, it would mean iteration costs an alloc per step, which is not good.)

But this doesn't apply to bindnode: bindnode always has to return a new object for wrapping the key value, regardless, because the K type doesn't implement ipld.Node itself.

In fact, I guess making one []bindnode._node and wiring the contents up to contain all the keys may end up doing this amortization nicely, whereas (I think) currently there is no amortization, so this should... end up being a net win, often? (The wrapper allocs are O(n) vs sorting being O(nlogn), but I think the constant factor on allocs is quite a bit higher than the constant factors for comparison and swapping in the sorting.)

mvdan mentioned this issue Jan 27, 2022

[Feature] UUIDs, protocol versioning, v2 protocol w/ dag-cbor messaging ipfs/go-graphsync#332

Merged

mvdan added this to IPLD team's weekly tracker Feb 4, 2022

rvagg mentioned this issue Jun 29, 2022

spec: initial WAC spec ipld/ipld#226

Open

BigLep moved this to 🗄️ Backlog in IPLD team's weekly tracker Jul 19, 2022

rvagg moved this from 🗄️ Backlog to 🥞 Todo in IPLD team's weekly tracker Aug 2, 2022

rvagg self-assigned this Aug 2, 2022

rvagg moved this from 🥞 Todo to 🗄️ Backlog in IPLD team's weekly tracker Nov 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

node/bindnode: support binding a plain Go map to an IPLD map #343

node/bindnode: support binding a plain Go map to an IPLD map #343

mvdan commented Jan 27, 2022

warpfork commented Jan 31, 2022 •

edited

Loading

node/bindnode: support binding a plain Go map to an IPLD map #343

node/bindnode: support binding a plain Go map to an IPLD map #343

Comments

mvdan commented Jan 27, 2022

warpfork commented Jan 31, 2022 • edited Loading

warpfork commented Jan 31, 2022 •

edited

Loading