Amit's Proposal
The tensor library currently lacks compile-time extents and strides. I am going to add static extents and static strides, and let the user choose between the dynamic and static variants. The tensor also lacks a way to specify the type of storage, such as sparse or band, so I am going to provide a few customization points through which users can pass their own storage policy. Finally, I am going to design device APIs for GPUs and CPUs, which will help speed up computation.
The title of my proposal is "Design Policy and Improve the Design of Tensor".
What is Design Policy?
Policy-based design is a great way for library authors to provide more flexibility to the user. Instead of hard coding certain behaviors, policy-based design provides various policies the users can select to customize the behavior. If done properly, a library author can accommodate all use cases with a single implementation. It was first popularized in C++ by Andrei Alexandrescu with Modern C++ Design.
For more info, click here.
For example:
template<typename LanguagePolicy>
struct Book : LanguagePolicy {};
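To make the idea concrete, here is a minimal sketch of how a policy could shape the behaviour of `Book`. The `English` and `French` policies, the `greeting()` function, and `first_line()` are hypothetical names chosen for illustration; they are not part of the proposal.

```cpp
#include <cassert>
#include <string>

// Two hypothetical policies; each one supplies a greeting() function.
struct English {
    static std::string greeting() { return "Hello"; }
};

struct French {
    static std::string greeting() { return "Bonjour"; }
};

// Book takes its language-specific behaviour from the policy it derives from.
template <typename LanguagePolicy>
struct Book : LanguagePolicy {
    std::string first_line() const {
        return LanguagePolicy::greeting() + ", reader";
    }
};

// Usage: the policy is chosen where Book is instantiated.
inline void book_usage() {
    assert(Book<English>().first_line() == "Hello, reader");
    assert(Book<French>().first_line() == "Bonjour, reader");
}
```

`Book` itself never changes; supporting another language only requires another policy type.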
Static Extents
It stores extents at compile time, which reduces overhead and binary size. It also speeds up tensor computation, since the extent computation is already done at compile time. I'm going to provide the same API as the current dynamic extents, because I want a seamless transition between them: the user should not have to care which one is in use. The design is based on [kokkos mdspan](https://github.com/kokkos/array_ref/blob/master/reference/include/mdspan), which is beautifully designed and executed.
Example code:
static_extents<4> e(1,2,3,4); // static_extents ==> <1,2,3,4>
static_extents<4,1,2,3,4> f; // static_extents ==> <1,2,3,4>
where the first template argument defines the rank of the extents.
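The fully static case can be sketched as follows. This is a simplified illustration, not the proposed implementation: `static_extents_sketch`, `at()`, and `product()` are assumed names, and only the form with all extents given as template arguments is covered.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical simplification: the rank comes first, followed by one
// compile-time extent per dimension.
template <std::size_t Rank, std::size_t... Extents>
struct static_extents_sketch {
    static_assert(Rank > 0 && sizeof...(Extents) == Rank,
                  "expected exactly one extent per dimension");

    static constexpr std::size_t rank() { return Rank; }

    // Extent of dimension i (no bounds check in this sketch).
    static constexpr std::size_t at(std::size_t i) {
        constexpr std::size_t values[] = {Extents...};
        return values[i];
    }

    // Number of elements a tensor with these extents would hold.
    static constexpr std::size_t product() {
        std::size_t p = 1;
        for (std::size_t i = 0; i < Rank; ++i) p *= at(i);
        return p;
    }
};

using e4 = static_extents_sketch<4, 1, 2, 3, 4>;
static_assert(e4::rank() == 4, "the rank is a compile-time constant");
static_assert(e4::at(2) == 3, "each extent is a compile-time constant");
static_assert(e4::product() == 24, "1 * 2 * 3 * 4");
static_assert(std::is_empty<e4>::value, "no extent is stored in the object");
```

Because every value lives in the type, an object of this class carries no data members, which is where the reduced overhead comes from.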
Three cases arise from combining static and dynamic extents:
- static rank and static extents
- static rank and dynamic extents
- dynamic rank and dynamic extents
shape_t is a way to express all three cases.
Example code:
auto e = shape_t<4,1,2,3,4>(); // static rank and static extents
auto f = shape_t<4>(1,2,3,4); // static rank and dynamic extents
auto f2 = shape_t<4,1,dynamic_extent,dynamic_extent,4>(2,3); // static rank with mixed static and dynamic extents
auto g = shape_t<dynamic_shape>{1,2,3,4}; // dynamic rank and dynamic extents
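The mixed case, where some extents are fixed and others are supplied at run time, might be sketched as follows. Everything here is an assumption made for illustration: the `sketch` namespace, the `mixed_extents` class, the `count_dynamic` helper, and the sentinel value of `dynamic_extent`. Unlike `shape_t` above, the sketch omits the leading rank argument.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Sentinel for "known only at run time" (the value is an assumption).
constexpr std::size_t dynamic_extent = static_cast<std::size_t>(-1);

// Counts the dynamic entries among the first n extents of the pattern.
template <std::size_t... Extents>
constexpr std::size_t count_dynamic(std::size_t n) {
    constexpr std::size_t pattern[] = {Extents...};
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (pattern[i] == dynamic_extent) ++count;
    }
    return count;
}

template <std::size_t... Extents>
class mixed_extents {
public:
    // Only the run-time extents are passed, in order of appearance.
    template <typename... Values>
    constexpr explicit mixed_extents(Values... values)
        : dynamic_{static_cast<std::size_t>(values)...} {
        static_assert(sizeof...(Values) == count_dynamic<Extents...>(sizeof...(Extents)),
                      "pass one value per dynamic_extent");
    }

    static constexpr std::size_t rank() { return sizeof...(Extents); }

    // Static extents come from the type, dynamic ones from the stored array.
    constexpr std::size_t at(std::size_t i) const {
        constexpr std::size_t pattern[] = {Extents...};
        return pattern[i] == dynamic_extent
                   ? dynamic_[count_dynamic<Extents...>(i)]
                   : pattern[i];
    }

private:
    std::array<std::size_t, count_dynamic<Extents...>(sizeof...(Extents))> dynamic_;
};

}  // namespace sketch

// Counterpart of f2 in the text, without the leading rank argument.
constexpr sketch::mixed_extents<1, sketch::dynamic_extent, sketch::dynamic_extent, 4> f2(2, 3);
static_assert(f2.at(0) == 1 && f2.at(1) == 2 && f2.at(2) == 3 && f2.at(3) == 4, "");
static_assert(sizeof(f2) == 2 * sizeof(std::size_t), "only the two dynamic extents are stored");
```

Only the dynamic extents take up space in the object; the static ones remain part of the type.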
Static Stride
It is similar to static extents, but it stores strides, which can follow one of two layouts: column-major or row-major (for more info on layouts, click here).
Example code:
auto f = static_stride<static_extents<4,1,2,3,4>, first_order>();
auto l = static_stride<static_extents<4,1,2,3,4>, last_order>();
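How strides could be derived from static extents at compile time is sketched below, assuming that `first_order` means column-major and `last_order` means row-major. The `static_strides` class is a hypothetical stand-in that takes the extents directly rather than a `static_extents` type, and it uses the textbook stride definition; the actual library may treat special cases such as vectors differently.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {

struct first_order {};  // column-major: the first index varies fastest
struct last_order {};   // row-major: the last index varies fastest

template <typename Layout, std::size_t... Extents>
struct static_strides {
    static_assert(std::is_same<Layout, first_order>::value ||
                      std::is_same<Layout, last_order>::value,
                  "unknown layout tag");

    static constexpr std::size_t rank() { return sizeof...(Extents); }

    // Stride of dimension k: the number of elements skipped when index k grows by one.
    static constexpr std::size_t at(std::size_t k) {
        constexpr std::size_t e[] = {Extents...};
        std::size_t stride = 1;
        if (std::is_same<Layout, first_order>::value) {
            for (std::size_t i = 0; i < k; ++i) stride *= e[i];  // extents before k
        } else {
            for (std::size_t i = k + 1; i < sizeof...(Extents); ++i) stride *= e[i];  // extents after k
        }
        return stride;
    }
};

}  // namespace sketch

using col = sketch::static_strides<sketch::first_order, 2, 3, 4>;
using row = sketch::static_strides<sketch::last_order, 2, 3, 4>;
static_assert(col::at(0) == 1 && col::at(1) == 2 && col::at(2) == 6, "column-major strides");
static_assert(row::at(0) == 12 && row::at(1) == 4 && row::at(2) == 1, "row-major strides");
```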
Storage Type
It is a way to make the tensor store data in a specific way and retrieve it when needed. There are three ways to store data:
- Dense tensors store every value in a contiguous block of memory, which makes them memory-heavy. When non-zero elements greatly outnumber zero elements, dense storage is preferred because the other storage types offer no gain. Contiguous memory is also cache-friendly, which makes operations faster than with other containers.
- Sparse tensors are large tensors in which zero elements greatly outnumber non-zero elements. Computation is faster when it iterates only over the non-zero elements. They are compressed using various algorithms or data structures such as CSR or maps.
- Band tensors are sparse tensors whose non-zero entries are confined to a diagonal band comprising the main diagonal and zero or more diagonals on either side. They are stored in a similar way to sparse tensors.
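A storage customization point could look roughly like the sketch below. It is not the proposed API: `dense_storage`, `map_storage`, and `flat_tensor` are hypothetical names, the container is one-dimensional to keep the example short, and the sparse policy uses a `std::map` keyed by linear index rather than a format such as CSR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

namespace sketch {

// Dense policy: every element, zero or not, occupies a slot.
template <typename T>
class dense_storage {
public:
    explicit dense_storage(std::size_t n) : data_(n, T{}) {}
    T get(std::size_t i) const { assert(i < data_.size()); return data_[i]; }
    void set(std::size_t i, const T& v) { assert(i < data_.size()); data_[i] = v; }
    std::size_t stored() const { return data_.size(); }
private:
    std::vector<T> data_;
};

// Sparse policy: only non-zero elements are kept, keyed by linear index.
template <typename T>
class map_storage {
public:
    explicit map_storage(std::size_t) {}
    T get(std::size_t i) const {
        auto it = data_.find(i);
        return it == data_.end() ? T{} : it->second;
    }
    void set(std::size_t i, const T& v) {
        if (v == T{}) data_.erase(i); else data_[i] = v;
    }
    std::size_t stored() const { return data_.size(); }
private:
    std::map<std::size_t, T> data_;
};

// Container parameterised on the storage policy; dense is the default.
template <typename T, template <typename> class StoragePolicy = dense_storage>
class flat_tensor {
public:
    explicit flat_tensor(std::size_t n) : size_(n), storage_(n) {}
    T operator()(std::size_t i) const { return storage_.get(i); }
    void assign(std::size_t i, const T& v) { storage_.set(i, v); }
    std::size_t size() const { return size_; }
    std::size_t stored() const { return storage_.stored(); }  // slots actually held
private:
    std::size_t size_;
    StoragePolicy<T> storage_;
};

}  // namespace sketch
```

With 1000 logical elements and a single non-zero value, the dense variant holds 1000 slots while the map-based variant holds one, which is the trade-off the list above describes.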
Device Policy or Execution Policy
It tells the tensor how and where to execute an operation. For example, you may want to perform a tensor operation on the CPU in parallel.
Example code (not the final API; it only conveys the gist of the device policy and how it is going to work):
template<typename ExecutionPolicy = device::cpu::parallel>
void do_something();
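One way such a policy parameter could select a code path is tag dispatch, sketched below. The `sum` function, the `sequential` tag, and the `chunks` member are assumptions made for illustration. The parallel overload only shows the chunked structure a real backend would distribute across workers; in this sketch the chunks are processed one after another, so no threads are involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace device { namespace cpu {
struct sequential {};
struct parallel { std::size_t chunks = 4; };  // hypothetical tuning knob
}}  // namespace device::cpu

namespace sketch {

// Sequential path: a single pass over the data.
inline double sum(const std::vector<double>& v, device::cpu::sequential) {
    double total = 0.0;
    for (double x : v) total += x;
    return total;
}

// Parallel path, reduced to its structure: the input is split into chunks whose
// partial sums are independent. A real backend would hand each chunk to a
// worker; this sketch walks them one after another.
inline double sum(const std::vector<double>& v, device::cpu::parallel policy) {
    assert(policy.chunks > 0 && "at least one chunk is required");
    const std::size_t step = (v.size() + policy.chunks - 1) / policy.chunks;
    double total = 0.0;
    for (std::size_t begin = 0; begin < v.size(); begin += step) {
        const std::size_t end = std::min(begin + step, v.size());
        double partial = 0.0;
        for (std::size_t i = begin; i < end; ++i) partial += v[i];
        total += partial;
    }
    return total;
}

// Entry point mirroring the signature in the text: the policy is a template
// parameter with a default, and it picks the overload at compile time.
template <typename ExecutionPolicy = device::cpu::parallel>
double sum(const std::vector<double>& v) {
    return sum(v, ExecutionPolicy{});
}

}  // namespace sketch
```

Calling `sketch::sum(v)` takes the default parallel path, while `sketch::sum<device::cpu::sequential>(v)` selects the sequential one without any run-time branching.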
We both would like to thank our mentor Cem for his constant support and help in achieving our goals. He was always helpful and easy to reach for help or discussion regarding the work. We would also like to thank Google for the Google Summer of Code programme, without which none of this would have been possible. Lastly, we express our gratitude to our parents for providing us, directly or indirectly, with everything we needed to carry out our work at home.