add accumulate! function for arbitrary number of dimensions #213
First, observe that the chunks of the array whose procs share the same coordinates in the procs array, except along the axis we accumulate over, can be processed independently of all the other chunks. This works because every value depends only on the previous values along the accumulation axis, and by construction, if two procs share the same coordinate on some axis, they also share the same local indices for that axis.
So the procs that differ only along the accumulation axis form a maximal set that is independent of every other such set,
and we can process these sets in an embarrassingly parallel fashion by calling the accumulateindep function on each set.
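The grouping described above can be sketched as follows. This is only an illustration, not the code in this PR: `independent_sets` and `procgrid` are hypothetical names, and the process ids are made-up values standing in for the real procs array.

```julia
# Hypothetical helper (not part of this PR): split a grid of process ids into
# the groups that can be accumulated independently along `dim`.
function independent_sets(procgrid::AbstractArray{T,N}, dim::Integer) where {T,N}
    sets = Vector{Vector{T}}()
    # Fix every coordinate except `dim`; each combination identifies one group.
    outer = CartesianIndices(ntuple(d -> d == dim ? (1:1) : (1:size(procgrid, d)), N))
    for I in outer
        # Sweep along `dim` to collect the procs that belong to this group.
        push!(sets, [procgrid[ntuple(d -> d == dim ? k : I[d], N)...]
                     for k in 1:size(procgrid, dim)])
    end
    return sets
end

# Assumed example layout: 6 worker ids arranged on a 2x3 grid.
procgrid = reshape(collect(2:7), 2, 3)      # [2 4 6; 3 5 7]
sets_dim2 = independent_sets(procgrid, 2)   # [[2, 4, 6], [3, 5, 7]]
sets_dim1 = independent_sets(procgrid, 1)   # [[2, 3], [4, 5], [6, 7]]
```

Each inner vector lists the procs that differ only along the accumulation axis, so each one could be handed to a separate accumulateindep call.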
In accumulateindep we do the following:
1- perform accumulate on each process's local part
2- gather the last result from each process
3- accumulate the gathered values to get the value that precedes every process. I made this accumulation always run along the first axis, which should make it faster because Julia stores arrays in column-major order
4- send every process the value that precedes it, and combine that value with the local result using broadcast.
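The four steps can be illustrated with a serial stand-in. This is a minimal sketch under assumptions, not the distributed implementation in this PR: `accumulate_chunks` is a hypothetical name, each entry of `chunks` plays the role of one process's non-empty local part along the accumulation axis, and `op` is assumed to be associative.

```julia
# Serial stand-in for the four steps (hypothetical, not the PR's code).
function accumulate_chunks(op, chunks::Vector{<:AbstractVector})
    # 1- accumulate each local part on its own
    locals = [accumulate(op, c) for c in chunks]
    # 2- gather the last result of every part
    lasts = [last(l) for l in locals]
    # 3- accumulate the gathered values; totals[i - 1] is what precedes part i
    totals = accumulate(op, lasts)
    # 4- broadcast the preceding value into every part except the first
    for i in 2:length(locals)
        locals[i] = op.(totals[i - 1], locals[i])
    end
    return locals
end

chunks = [[1, 2, 3], [4, 5], [6, 7, 8]]
result = accumulate_chunks(+, chunks)
# result == [[1, 3, 6], [10, 15], [21, 28, 36]]
```

Concatenating the parts gives the same values as `accumulate(+, 1:8)`. The preceding value is passed as the left argument of `op`, so the element order is preserved for operations that are associative but not commutative.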