-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Moving Least Squares example #919
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really good start. Quick partial review to address some of the common issues.
template <typename Double3D> | ||
KOKKOS_INLINE_FUNCTION Kokkos::Array<double, size> | ||
operator()(Double3D const &p) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be templated on a Point
. We can assert that the point is 3D (can we extend the basis to other dimensions)? Also, would want to switch away from using double
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was taken directly from DTK's code of different polynomial bases. I will strip out the templates and switch double
to float
on the example. As I am using ArborX::Point
, it is 3D.
I think it is possible to generate that code for any number of dimensions and polynomial degree, but it will most likely require a lot of templated code and classes.
auto radii = Kokkos::View<double*, MemorySpace>( | ||
"radii", target_points_num); | ||
constexpr double epsilon = std::numeric_limits<double>::epsilon(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want float
here, it's up for a discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends a lot on how the values should be computed. Should the coefficients be handled as doubles or floats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may make sense to do the calculations based on the floating type of the data. We have Point<DIM, FloatingPoint>
, so should probably use the same type as the second template argument of the Point
. Note that we need #918 with GeometricTraits::coordinate_type<Point>::type
to figure out that type.
auto phi = Kokkos::View<double**, MemorySpace>( | ||
"phi", target_points_num, num_neighbors); | ||
Kokkos::parallel_for( | ||
"phi_computation", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also try to prefix the kernels to indicate the origin, like Example::phi_computation
.
for (int i = 0; i < target_points_num; i++) { | ||
std::cout << "====\n" | ||
<< target_values_host(i) << '\n' | ||
<< target_values_exact_host(i) << "\n====\n"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to compute and print an error.
@mrlag31 You'll need to use clang-format 14 and fix the styling (e.g., using |
I've fixed some of the issues you found.
|
constexpr std::size_t source_points_num = cube_side * cube_side * cube_side; | ||
constexpr std::size_t target_points_num = 4; | ||
|
||
auto source_points = Kokkos::View<ArborX::Point *, MemorySpace>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to not use the default execution space.
Instead, it is preferable to create ExecutionSpace space;
, and pass it to all view allocations, deep copies and kernels. This way, we can better keep track of the kernels, and also allow running MLS on multiple CUDA streams at the same time.
Also not sure why you would prefer to write it as auto blah = Type()
instead of Type blah()
, if you spell out the type anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add the execution space on those calls.
Regarding why I write auto blah = Type()
, it's some habits that I have when declaring a variable. The other way is too similar to a function call in my opinion, and I think it helps readability. But I can change the declarations to use Type blah()
instead.
auto queries = Kokkos::View<ArborX::Nearest<ArborX::Point> *, MemorySpace>( | ||
"MLS_EX::queries", target_points_num); | ||
Kokkos::parallel_for( | ||
"MLS_EX::make_queries", | ||
Kokkos::RangePolicy<ExecutionSpace>(0, target_points_num), | ||
KOKKOS_LAMBDA(int const i) { | ||
queries(i) = ArborX::nearest(target_points(i), num_neighbors); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably steer user to a better approach in the example. Instead of creating a view of queries, requiring an extra memory, we could provide a specialization of AccessTraits
for a structure that contains target_points
, and create queries on the fly.
For an example, see struct ArborX::AccessTraits<NearestToOrigin, ArborX::PredicatesTag>
in examples/callback/example_callback.cpp
.
Not a blocker, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how it was done in DTK, so I copied. But I agree that using AccessTraits
would be a more "correct" way to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how it was done in DTK
We've come a long way since DTK. DTK is pretty outdated in the way it uses ArborX. It is a good starting point, but we want and should move more towards a better and more efficient use of ArborX.
KOKKOS_LAMBDA(int const i) { | ||
float radius = 10. * epsilon; | ||
|
||
for (int j = 0; j < num_neighbors; j++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. What if a query returned less than num_neighbors
? For example, if the number of source points is less? This loop would then use uninitialized data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the number of source points is less than the number of requested neighbors, I would most likely consider that an input error from the side of the user. Is there other reasons on why would a query not return the requested number of neighbors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is the only case where it would be not num_neighbors
.
I have switched from using Gaussian method to the SVD pseudo-inverse. I have also made adjustments from DTK's version by using the fact that the moment matrix is symmetric. |
05272fb
to
efd1abe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think DTK has a pretty decent interface for MLS
, and we should follow it more closely here, with some differences:
- Switch
DeviceType
toMemorySpace
(do we need it?)
DeviceType
is really outdated in the usage, and we prefer to useMemorySpace
andExecutionSpace
separately - Introduce
ExecutionSpace
to the constructor andapply
They don't have to be the same (e.g., using different CUDA streams should be ok) - Duplicate communicator (similar to how it's done in the
DistributedTree
)
We want to make sure our communication does not overlap with users. Duplicating communicator is the safe way to do that. - (maybe) store
Distributor
DTK stores indices, ranks, etc. Maybe we can simplify that.
Other things:
- Anything regarding data setup should probably belong to the example
.cpp
(e.g.,TargetPoints
) - I think
MLSComputation
is really a collection of free functions, which should be inArborX_DetailsMovingLeastSquares.hpp
in theDetails
namespace (similar to DTK) - Likely don't want to hardcode user data to be Views (or, at least, point data), and switch to using
AccessTraits
to access source and target points
This would allow the interface to be a lot more flexible, and user to avoid converting their data to Kokkos Views. - We prefer function to follow camel case (and other styling things, see here)
num_neighbors
should not be a parameter, it is determined from the polynomial basis size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round of comments I had. Still haven't looked through all the code, though.
template <typename T> | ||
using inner_value_t = std::decay_t<std::invoke_result_t< | ||
decltype(ArborX::AccessTraits<T, ArborX::PrimitivesTag>::get), T const &, | ||
int>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a utility to do that, AccessTraitsHelper
. You can use it as AccessTraitsHelper<Access>::type
. So you don't need this.
template <typename ValueType, typename PolynomialBasis, typename RBF, | ||
typename MemorySpace> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho, this should be
template <typename ValueType, typename PolynomialBasis, typename RBF, | |
typename MemorySpace> | |
template <typename MemorySpace, typename PolynomialBasis = <some default>, typename RBF = <some default>, typename FloatingPoint = float> |
I think we can find some reasonable default to arguments. We can debate on what arguments a user may more likely to change, which would affect the order.
Not sure what to call that FloatingPoint
. That's the type that the calculations are done in, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ValueType
(and FloatingPoint
in your example) is the type of the calculations and the stored matrices, but it might be different from the coordinate type and the evaluation type. FloatingCalculationType
would be a better name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FloatingCalculationType
would be a better name?
May be a bit better than a generic ValueType
, but it still does not quite express the intent. There must be a better name.
|
||
template <typename ValueType, typename PolynomialBasis, typename RBF, | ||
typename MemorySpace> | ||
class MLS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to MovingLeastSquares
, and then do an MLS
alias?
MLS(ExecutionSpace const &space, MPI_Comm comm, Points const &source_points, | ||
Points const &target_points, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to assume that source and target points are in a similar container? Or should they be allowed to have different access?
I can't recall the order of execution space and mpi comm in the distributed tree. We should probably match that here, as we wouldn't want to have different order in different places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both the source_points
and target_points
are coming from user space and represent array of points. But I guess they can still have different types, so I will probably add more types to the templat.
"Example::MLS::local_indices", _tgt_size * _num_neighbors); | ||
Kokkos::View<int *, MemorySpace> local_ranks("Example::MLS::local_ranks", | ||
_tgt_size * _num_neighbors); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These views don't need initializations, as they will be set in the parallel loop below. So, please use WithoutInitializing
.
source_points); | ||
|
||
// Perform the query | ||
Kokkos::View<Kokkos::pair<int, int> *, MemorySpace> index_ranks( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a type for that, I think, PairIndexRank
from src/details/ArborX_PairIndexRank.hpp
.
// a sign matrix (only 1 or -1 in the diagonal, 0 elsewhere). | ||
// Thus A = U.E.S.U^T | ||
template <class ValueType, typename MemorySpace> | ||
class SymmPseudoInverseSVD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should move this in a separate PR, and then add some tests for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also a bit confused. Why is this a class, and why do we need to store data? It's called exactly once from MLS, and I don't think we use any of the stored _es
, _inv
, etc, anywhere in MLS. Why not make it a free function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should move this in a separate PR, and then add some tests for it?
Why not, as it is one of the more complex set of functions and could actually be used for other things.
I'm also a bit confused. Why is this a class, and why do we need to store data? It's called exactly once from MLS, and I don't think we use any of the stored _es, _inv, etc, anywhere in MLS. Why not make it a free function?
This is usually how I create complex algorithms that need temporary variables. The user cannot instantiate it so these variables are dropped as soon as the static method returns. However, I can transform it so it uses only free functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified this part so it resembles what one could find in ArborX::Details
. The inverse is now computed using only free functions and with less restrictions on the input.
#include <mpi.h> | ||
|
||
template <typename MemorySpace> | ||
class MPIComms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't closely looked at this yet.
I have attempted to add support for hypergeometry (i.e. points that are not 3D) but |
I would just not use |
I think I suggested it before MPI was added. |
0d2efb8
to
2065168
Compare
2065168
to
dcdedd3
Compare
It is true that this starts to be quite a large example (although it can be used as an ArborX add-on). I think it would be interesting to discuss how can this PR be split. |
One of the ways we could proceed (requires general consensus) is like this:
This is just an idea. It really depends on whether we want to grow the set of the interpolation algorithms in the future and make them easily available to a user. |
Next year I have a project where I will need to couple two codes together (I don't remember the name of the codes), so I may use this. |
Can this be closed? |
This pull request adds a new example for ArborX, showing how it can be used to compute and approximate a function from a point cloud to another, using the Moving Least Squares method. This example is taken from DataTransferKit's MeshFree package and removes all usage of DTK. This is a draft PR as some extra work is still needed.
Some important differences from DTK's code: