heterogeneous feature collection #49
Came here from JuliaArrays/StructArrays.jl#135 (cc: @Sov-trotter). In that case, you can use
StructArray(
    geometry=[Point(3, 1), Polygon(Point{2, Int}[(3, 1), (4, 4), (2, 4), (1, 2), (3, 1)])],
    city=["Abuja", "Borongan"],
    rainfall=[1221.2, 4114.0],
)
GeometryBasics uses custom types (
Yeah! The above method works. We have done a similar working implementation here. @visr pointed out that this method of breaking down "meta-geometries" (so that they can no longer be iterated over) deviates from the basic GeometryBasics idea and might create problems in the future when we try to put it into Makie (plotting) or perform spatial operations on the data. So we might keep it as a last resort in case no other generalization works.
The only tricky thing is that widening over a custom type is a bit ill-defined, as in general it's impossible to know how the parameters should change. The first step is definitely changing Then we can see to what extent the widening of StructArrays works. One possible solution would be to allow custom widening. Alternatively, one could flatten the structure (into a named tuple with geometry and metadata) on the fly while iterating. Then, once all the relevant vectors are created, one can easily transform the columns into a
@piever suggests that automatic widening for custom types seems tricky, while nesting / unnesting on the fly is much easier.
using GeometryBasics, StructArrays
function maketable(iter)
    unnested_iter = Base.Generator(iter) do geom_meta
        geom = getfield(geom_meta, :main) # well, the public accessor for this
        metadata = getfield(geom_meta, :meta)
        (; geometry=geom, metadata...) # I think the GeometryBasics name for this field is `:position`
    end
    soa = fieldarrays(StructArray(unnested_iter))
    return meta(soa.geometry; Base.tail(soa)...)
end
point1 = meta(Point(2, 1), city="Delhi", rainfall=121.1)
point2 = meta(Point(2, 1), city="Delhi", rainfall=120)
maketable([point1, point2])
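The unnesting step can be illustrated without either package. The following is only a minimal sketch in plain Julia: `WithMeta`, `unnest`, and `colnames` are hypothetical names introduced here (they are not part of GeometryBasics or StructArrays), and plain tuples stand in for points.

```julia
# Hypothetical wrapper standing in for a geometry-with-metadata object.
struct WithMeta{G, M<:NamedTuple}
    main::G
    meta::M
end

# Flatten one wrapped item into a single row: geometry first, then the metadata fields.
function unnest(item::WithMeta)
    metadata = item.meta
    return (; geometry = item.main, metadata...)
end

items = [WithMeta((2, 1), (city = "Delhi", rainfall = 121.1)),
         WithMeta((3, 2), (city = "Goa", rainfall = 120.0))]
rows = map(unnest, items)

# Build one column vector per field name of the flattened rows.
colnames = keys(first(rows))
columns = NamedTuple{colnames}(map(n -> [getfield(r, n) for r in rows], colnames))
```

Once the columns exist as ordinary vectors, they can be handed to whatever table-like container is preferred.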
The above example is pretty effective when it comes to heterogeneity in features/geometry, even when the metadata types are inconsistent. But I am unsure whether it is useful to change the
point1 = meta(Point(2, 1), city="Delhi", rainfall=121.1)
polygon2 = PolygonMeta(Point{2, Int}[(5, 1), (3, 3), (4, 8), (1, 2), (5, 1)], city="Delhi", rainfall=44)
sa = maketable([point1, polygon2])
2-element AnyMeta{Any,Array{Any,1},(:city, :rainfall),Tuple{Array{String,1},Array{Real,1}}}:
[2, 1]
Polygon{2,Int64,Point.....}
sa.any
2-element Array{Any,1}:
[2, 1]
Polygon{......}
sa.rainfall
2-element Array{Real,1}:
121.1
44
What do you think, @visr, @SimonDanisch?
This sounds like a good move to me. It would be breaking, though I guess we could temporarily define EDIT: probably the name |
Now that things are getting a bit clearer, we have come up with a different approach for handling meta and are slowly working towards it, while also experimenting with StructArrays along the way. What we currently aim to do is put geometry and metadata separately in a Feature struct, and create an iterable StructArray of Feature structs.
using StructArrays, GeometryBasics
struct Feature{Geom, NamedTuple}
    geometry::Geom
    properties::NamedTuple
end
p1 = Point(2, 1)
p2 = Point(3, 2)
sa = StructArray([Feature(Point(1, 0), (city = "Delhi", rainfall = 121)),
                  Feature(MultiPoint([p1, p2]), (city = "Goa", rainfall = 1211.1)),
                  Feature(Point(1.0, 2.2), (city = "Mumbai", rainfall = 1300))])
But here we leave the NamedTuple untyped, which hampers speed quite a bit in the case of homogeneous types.
struct Feature{Geom, Names, Types}
    geometry::Geom
    properties::NamedTuple{Names, Types}
end
Is there a way to have a
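To make the trade-off concrete, here is a small sketch in plain Julia of what the fully parameterized struct buys. `TypedFeature` is a hypothetical name chosen to avoid clashing with the `Feature` definitions above, and tuples stand in for geometries; the only claim is about whether the resulting element type is concrete.

```julia
# Same layout as the fully parameterized Feature, under a hypothetical name.
struct TypedFeature{Geom, Names, Types}
    geometry::Geom
    properties::NamedTuple{Names, Types}
end

# Homogeneous input: every element has exactly the same type.
same = [TypedFeature((1, 0), (city = "Delhi", rainfall = 121)),
        TypedFeature((3, 2), (city = "Goa", rainfall = 1211))]

# Heterogeneous input: both the geometry and the rainfall types differ.
mixed = [TypedFeature((1, 0), (city = "Delhi", rainfall = 121)),
         TypedFeature((1.0, 2.2), (city = "Mumbai", rainfall = 1300.5))]

println(isconcretetype(eltype(same)))   # concrete element type for homogeneous data
println(isconcretetype(eltype(mixed)))  # widened, non-concrete element type otherwise
```

With homogeneous data the element type stays concrete, which is what allows the compiler to specialize; with mixed data the vector falls back to a wider element type.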
I'd like to add that For construction, in most cases it'd be easiest to construct the StructArray from vectors:
How can we do this, given the StructArray defined above? |
I also feel that the
Nested approach
struct Feature{Geom, NamedTuple}
    geometry::Geom
    properties::NamedTuple
end
sa = StructArray(
    geometry=[Point(3, 1), Polygon(Point{2, Int}[(3, 1), (4, 4), (2, 4), (1, 2), (3, 1)])],
    city=["Abuja", "Borongan"],
    rainfall=[1221.2, 4114.0],
)
geom = sa.geometry
metadata = StructArray(Base.tail(fieldarrays(sa)))
type = Feature{eltype(geom), eltype(metadata)}
feature_vec = StructArray{type}((geom, metadata))
I call this "nested" because the second column of
Non-nested approach
This is a bit trickier, because you want the layout of the
This is probably also doable more easily with the GeoInterface Feature/FeatureCollection wrappers now. |
I understand that the way to have geometries with attributes behave like tables in this package is to create a StructArray for the collection.
Do you have any idea what we could do when we want to represent different geometry types? Most commonly a dataset is all of the same type, but there are exceptions, and it would be nice to be able to represent them as well.
This understandably fails:
If I try to throw it into for instance a TypedTables.Table, it works fine, accepting geometry as a Vector{Any}.
I know this is more of a basic StructArrays vs TypedTables question, and understand that with a Vector{Any} things will be slower. But it would be nice to have a "GeometryBasics" table solution for this as well.
EDIT: concrete example here: https://github.com/visr/GeoJSONTables.jl/blob/4104e66a638814d77ef98af1d205450549361519/test/basics.jl#L97-L101
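As a rough illustration of why a mixed geometry column ends up with an abstract element type, here is a sketch in plain Julia. `AbstractGeom`, `Pt`, and `Ring` are hypothetical stand-ins, not the GeometryBasics types.

```julia
# Hypothetical stand-ins for geometry types; not the GeometryBasics definitions.
abstract type AbstractGeom end

struct Pt <: AbstractGeom
    x::Float64
    y::Float64
end

struct Ring <: AbstractGeom
    points::Vector{Pt}
end

# All elements share one concrete type, so the vector is concretely typed.
homogeneous = [Pt(3, 1), Pt(4, 4)]

# Mixed elements: the array literal widens to the common supertype.
mixed = [Pt(3, 1), Ring([Pt(3, 1), Pt(4, 4), Pt(3, 1)])]

# A small Union is another option when only a few geometry types occur.
unioned = Union{Pt, Ring}[Pt(3, 1), Ring([Pt(3, 1), Pt(4, 4), Pt(3, 1)])]

println(eltype(homogeneous))
println(eltype(mixed))
println(eltype(unioned))
```

A column like `mixed` still works as table data; it simply gives up the concrete element type that makes the homogeneous case fast.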