-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Moving least squares interpolation #946
Conversation
f637b0f
to
7c3f0e1
Compare
7c3f0e1
to
354efe9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some feedback on the example
(Haven't looked at the rest yet)
examples/CMakeLists.txt
Outdated
find_package(Boost COMPONENTS program_options) | ||
if(Boost_FOUND) | ||
add_subdirectory(viz) | ||
add_subdirectory(raytracing) | ||
add_subdirectory(brute_force) | ||
endif() | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undo
template <typename T> | ||
KOKKOS_INLINE_FUNCTION double step(T const &p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the argument templated but not the return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old code when I tested different point dimensions. Simply forgot to remove the template.
template <typename T> | ||
KOKKOS_INLINE_FUNCTION double step(T const &p) | ||
{ | ||
return Kokkos::signbit(p[0]) ? 0 : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also return 1 - signbit(p[0])
but I think
auto const x = p[0];
return x >= 0 ? 1 : 0;
might be more readable.
Kokkos::ScopeGuard guard(argc, argv); | ||
ExecutionSpace space{}; | ||
|
||
static constexpr std::size_t num_points = 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider making it the default value and optionally taking it from the command line arguments?
This would be handy when studying convergence as the number of points increases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made that change. You can input the number of points using the command line arguments with --points
.
|
||
auto approx_values = mls.apply(space, source_values); | ||
|
||
double max_error = 0.; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually suggest not to initialize the value you reduce into because it may give the wrong impression to users,
that the reduced value will be contributed to an initial value or whatnot, when Kokkos will actually ignore that value and overwrite it with the result of the reduction.
double max_error = 0.; | |
double max_error; |
loc_error = Kokkos::max( | ||
loc_error, Kokkos::abs(target_values(i) - approx_values(i))); | ||
}, | ||
Kokkos::Max<double>(max_error)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you go for max absolute difference rather than L1 or L2 error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the first norm that came to my mind. I have changed it to use an L2 norm.
}, | ||
Kokkos::Max<double>(max_error)); | ||
|
||
std::cout << "Error: " << max_error << '\n'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you have a chance to study the convergence of the error as the number of points increases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit using the new L2 norm. It seems the more points, the better the error.
0d9c491
to
fdc710a
Compare
ArborX::Interpolation::MovingLeastSquares<MemorySpace, double> mls( | ||
space, source_points, target_points); | ||
|
||
auto approx_values = mls.apply(space, source_values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does mls.apply
mean? I would prefer something more expressive like mls.interpolate
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'apply' was the function name in DTK and I kept it, I will switch it
// This is done to avoid a clash with another predicates access trait | ||
template <typename Points> | ||
struct MLSTargetPointsPredicateWrapper | ||
{ | ||
Points target_points; | ||
int num_neighbors; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which would that be?
What do the possible types for Points
look like? I assume these are Views
or would the implementation also allow for something different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation allows any predicates access traits as long as "get" resolves to a point. However, I could not really test if Points is valid there as ArborX::Details::check_valid_access_traits
and ArborX::GeometryTraits::check_valid_geometry_traits
does not return a boolean.
class MovingLeastSquares | ||
{ | ||
public: | ||
// If num_neighbors is 0 or negative, it will instead be a default value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default value/behavior? Of course, the behavior should be documented in the Wiki.
I would move that comment to line 116. Also, did you consider using std::optional
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value is the size of the polynomial basis. And using optional is a better alternative than using zero or a negative.
int const num_targets = tgt_acc::size(target_points); | ||
_source_size = source_points.extent(0); | ||
// There must be enough source points | ||
ARBORX_ASSERT(num_neighbors <= _source_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure a ArborX::SearchException
makes sense here.
ARBORX_ASSERT(num_neighbors <= _source_size); | |
KOKKOS_ASSERT(num_neighbors <= _source_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this code is part of ArborX, it makes more sense to use ArborX's assert than Kokkos one. It is also used elsewhere for non-search errors as well.
SourcePoints const &source_points, | ||
TargetPoints const &target_points) | ||
: MovingLeastSquares(space, source_points, target_points, | ||
CRBF::Wendland<2>{}, PolynomialDegree<2>{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to document default values. Why did you choose these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were the ones that yielded the best results in my firsts small examples. We could discuss on which default would actually be best.
Kokkos::view_alloc(Kokkos::WithoutInitializing, | ||
"ArborX::MovingLeastSquares::source_view"), | ||
num_targets, num_neighbors); | ||
auto const values_indices = _values_indices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KOKKOS_CLASS_LAMBDA
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as previously, it was to get around nvcc's lambda restrictions.
test/tstInterpDetailsMLSC.cpp
Outdated
Kokkos::View<double *, MemorySpace> tgtv0("Testing::tgtv0", 3); | ||
Kokkos::View<double **, MemorySpace> coeffs0("Testing::coeffs0", 0, 0); | ||
Kokkos::parallel_for( | ||
"for", Kokkos::RangePolicy<ExecutionSpace>(space, 0, 3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"for", Kokkos::RangePolicy<ExecutionSpace>(space, 0, 3), | |
"Testing::mls_coefficients_case_1", Kokkos::RangePolicy<ExecutionSpace>(space, 0, 3), |
etc.
Kokkos::view_alloc(Kokkos::WithoutInitializing, "Example::target_values"), | ||
num_points); | ||
filledBoxRandom(0.5, source_points); | ||
filledBoxRandom(0.5, target_points); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of convergence behavior would you expect based on the number (and positions) of the source points?
IMHO, the interpretation of the error would be easier if at least the target_points
were evenly distributed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, both source and target points are uniformely distributed. However, I may try to implement a way to use any regular mesh.
I would expect, depending on the number of source points, to converge as an inverse exponential. I do not really know regarding position of the source point.
<< '\n'; | ||
} | ||
|
||
dump_file_stream.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary. std::fstream will close in the destructor.
dump_file_stream.close(); |
int const num_targets = tgt_acc::size(target_points); | ||
_source_size = source_points.extent(0); | ||
// There must be enough source points | ||
ARBORX_ASSERT(0 < num_neighbors_val && num_neighbors_val <= _source_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARBORX_ASSERT(0 < num_neighbors_val && num_neighbors_val <= _source_size); | |
KOKKOS_ASSERT(0 < num_neighbors_val && num_neighbors_val <= _source_size); |
|
||
// Source values must be a valuation on the points so must be as big as the | ||
// original input | ||
ARBORX_ASSERT(_source_size == source_values.extent_int(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARBORX_ASSERT(_source_size == source_values.extent_int(0)); | |
KOKKOS_ASSERT(_source_size == source_values.extent_int(0)); |
75b03ba
to
76cf2fe
Compare
76cf2fe
to
8b2e464
Compare
nvcc 11.0.3 used to segfault on the previous CI (and probably still segfaults). Except that, it should be ready to be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does a typical profile look like?
// Randomly fills a 2 * half_edge box centered at 0 | ||
void filledBoxRandom(double half_edge, | ||
Kokkos::View<Point *, MemorySpace> points) | ||
{ | ||
int n = points.extent(0); | ||
auto points_host = Kokkos::create_mirror_view(points); | ||
|
||
std::uniform_real_distribution<double> dist(-half_edge, half_edge); | ||
std::random_device rd; | ||
std::default_random_engine gen(rd()); | ||
auto random = [&dist, &gen]() { return dist(gen); }; | ||
for (int i = 0; i < n; ++i) | ||
for (int d = 0; d < DIM; ++d) | ||
points_host(i)[d] = random(); | ||
|
||
Kokkos::deep_copy(points, points_host); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need multiple variants of creating points in an example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is meant to show the user that they can use any set of points, either random or a regular mesh. It can be discussed if it should be kept or not.
Kokkos::deep_copy(points, points_host); | ||
} | ||
|
||
// Switches the set of points to the new base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, everything is generated in a box. This lets skew and rotate the box however the user wants to. Like the previous comment, it can be removed, as well as the creation of the basis matrix.
// Basis change | ||
Kokkos::View<Point[DIM], MemorySpace> source_basis("Example::source_basis"); | ||
Kokkos::View<Point[DIM], MemorySpace> target_basis("Example::target_basis"); | ||
Kokkos::parallel_for( | ||
"Example::fill_basis", Kokkos::RangePolicy<ExecutionSpace>(space, 0, 1), | ||
KOKKOS_LAMBDA(int const) { | ||
for (int i = 0; i < DIM; i++) | ||
{ | ||
source_basis(i) = Point{}; | ||
target_basis(i) = Point{}; | ||
for (int j = 0; j < DIM; j++) | ||
{ | ||
source_basis(i)[j] = double(i == j); | ||
target_basis(i)[j] = double(i == j); | ||
} | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just the identity why do we need to change the basis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly to show that the user can do it. I also used it to build a skeded/triangular mesh in my report. It can be removed with the basis change function.
num_points); | ||
|
||
// Sets the meshes | ||
filledBoxEven(0.5, {side_len, side_len}, source_points); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we just have this function return the source_points
(and similar for the other Views/functions)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be done. But I don't know which of "returning the view" or "giving an uninitialized view" is the better design choice.
(!num_neighbors) | ||
? Details::polynomialBasisSize<dimension, PolynomialDegree::value>() | ||
: *num_neighbors; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(!num_neighbors) | |
? Details::polynomialBasisSize<dimension, PolynomialDegree::value>() | |
: *num_neighbors; | |
num_neighbors | |
? *num_neighbors | |
: Details::polynomialBasisSize<dimension, PolynomialDegree::value>(); |
to improve readability some.
auto const source_view = fillValuesIndicesAndGetSourceView( | ||
space, indices, offsets, num_targets, num_neighbors_val, source_points); | ||
|
||
// Compute the Moving Least Squares |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Compute the Moving Least Squares | |
// Compute the Moving Least Squares coefficients |
Details::movingLeastSquaresCoefficients<CRBF, PolynomialDegree>( | ||
space, source_view, target_points, _coeffs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason not to return the coefficients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like previously, I do not know if it is better to return the coefficients or to give an uninitialized array to it. Querying a tree requires to give out empty views for example.
What do you mean by "profile"? |
What does the output for |
Using Cuda and my laptop's Nvidia A1000, here are the results for both 4096 points and 1000000 points (source and target). In short, to compute the coefficients, it lasts 3ms for 4096 points and 200ms for 1000000 points. And, as I had to do the same measures in my report, the duration scales linearly with the number of points. |
So |
This PR doesn't compile with CUDA 11.0.3 https://cloud.cees.ornl.gov/jenkins-ci/job/arborx/job/PR-946/4/pipeline-console/?selected-node=44 |
Retest this please. |
retest this please |
I can compile and run the example successfully on
|
@mrlag31 Can you post the issues you are seeing here if you can find them? |
For the moment, the crash appeared after updating my code to the new callback interface. I will investigate it. |
b585725
to
f6a7b1e
Compare
88b439f
to
ff7e6c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with cleaning up in a follow-up.
Couple builds did not run (HIP, SYCL, Cuda-Clang), but that's fine. |
This is a follow-up from #919. It properly adds the moving least square interpolation method as well as some tests.