Amit's Proposal

Abstract

Tensors lack the ability to compile-time extents and strides which I'm going to add with static extents and static stride which decides when to use dynamic and static based on user's choice. Another feature which tensor lacks is the ability to specify the type of storage either it is sparse or band and I'm going to provide a few customization points using which user can pass their own storage policy. In the end, I'm going to design devices APIs for GPUs and CPUs which will help in increasing the speed of computation.

Proposal

Title of my Proposal is Design Policy And Improve The Design Of Tensor

What is Design Policy ?

Policy-based design is a great way for library authors to provide more flexibility to the user. Instead of hard coding certain behaviors, the policy-based design provides various policies the users can select to customize the behavior. If done properly, a library author can accommodate all use cases with a single implementation. It was first popularized in C++ by Andrei Alexandrescu with Modern C++ Design.

For more info Click here

For ex :-

template<typename LanguagePolicy>
struct Book : LanguagePolicy{}

Static Extents

It can store extents at compile time which reduces overhead and binary size. It also increases the speed of Tensor to do computation as all the extent computation is already done during compile time. I'm going to provide the same APIs as current dynamic extents because I want a seamless transition between them as the user won't be able to distinguish which is which. The design is based on [kokkos mdspan] (https://github.com/kokkos/array_ref/blob/master/reference/include/mdspan) which is beautifully designed and executed.

Example code:-

static_extents<4> e(1,2,3,4); // static_extents ==> <1,2,3,4>
static_extents<4,1,2,3,4> f;  // static_extents ==> <1,2,3,4>

where argument defines the rank of extents.

There are 3 cases which arise due to static and dynamic extents

static rank and static extents
static rank and dynamic extents
dynamic rank and dynamic extents

shape_t is a way to define all three way Example code :-

auto e = static_extents<1,2,3,4>{}; // static rank and static extents
auto f1 = dynamic_extents<4>(1,2,3,4); // static rank and dynamic extents
auto f2 = static_extents<4>(1,2,3,4); // static rank and dynamic extents
auto g = dynamic_extents<>{1,2,3,4}; // dynamic rank and dynamic extents

Static Stride

It is similar to static extents but it stores strides which could be of two types Column Major and Row Major ( for more info on layout click here )

Example code:-

auto f = static_stride<static_extents<4,1,2,3,4>, first_order>();
auto l = static_stride<static_extents<4,1,2,3,4>, last_order>();

Storage Type

It is a way to make Tensor to store data in a specific way and retrieve it when needed. There are three ways to store data which are

Dense Tensors store values in a contiguous sequential block of memory where all values are represented because of which they heavy on the memory side. If we have a large number of non-zeroes or non-zero elements that are much greater than the number of zero elements, it is preferred to store in dense as there is no gain in using another storage type. As they are contiguous memory which can be cached and making operations faster than other containers.
Sparse Tensors is a large tensor which has a large amount of zeroes or zero elements are much greater than the number of non-zero elements, it's faster to perform computation by iterating through the non-zero elements. They are compressed using various algorithms or data structures such as CSR, maps, etc.
Band Tensors is a sparse tensor whose non-zero entries are confined to a diagonal band, comprising the main diagonal and zero or more diagonals on either side. It is also stored similarly as a sparse tensor.

Slices

The slice is a way to get the elements in a range by specifying the first( from where to start the slice ), last( where to end the slice ) and step ( number of elements to be skipped in order to reach to the next element ). These parameters are enough to define a range. There are two types of slices one is static slice and the other one is dynamic slice. The static slice is processed at compile time which reduces the runtime cost but increases the compile-time and the dynamic slice is processed at runtime.

auto s1 = boost::numeric::ublas::span::slice<>{0,10,2}; // dynamic slice
auto s2 = boost::numeric::ublas::span::slice<0,10,2>{}; // static slice

Note: In some cases slice<>{} can be considered static slice. For example if you want get everything in a perticular range.

Subtensor

Subtensor is a type of tensor which holds the pointer to the tensor data and does not get a whole new memory for data rather just point to the existing data. Its also called view to the tensor which just a fancy of saying that it just points to the data.

Subtensor speeds things up compared to creating new tensor which copies all the data to new memory and results in slow runtime. According to the slice type and extents type subtensor could be of two types static or dynamic.

using namespace boost::numeric::ublas;

auto t1 = tensor{static_extents<10,10>{},1.f};
auto t2 = tensor{dynamic_extents<>{10,10}, 1.f};
auto s1 = t1(span::slice<>{}, span::slice<0,-2,2>{}); // static subtensor
auto s2 = t2(span::slice<>{}, span::slice<0,-2,2>{}); // dynamic subtensor
auto s3 = t1(span::slice<>{}, span::slice<>{0,-2,2}); // dynamic subtensor
auto s4 = t2(span::slice<>{}, span::slice<>{0,-2,2}); // dynamic subtensor

Acknowledgement

We both would like to thank our mentor Cem for his constant support and help in achieving our goals. We always find him helpful and he was always easy to reach for help or discussion regarding the work. We would also like to thank Google for the Google Summer of Code Programme, without which all these wouldn't be possible. Lastly, we express our gratitude to our parents for helping and providing us with all the indirect or direct needs for carrying out our work nicely in our homes.

Boost

Quick Links

Ashar's wiki

Home
Project Proposal
Milestones and Tasks
Implementation
- Commit History
- Diff Comparison
Documentation
Discussions
Examples
- Expression
- ublas::apply
Experimental features
- std::variant based Runtime Optimizer

Amit's wiki

Project Proposal
Milestones and Tasks
Implementation
- Commit History
- Diff Comparison
Documentation
Discussions
- Adding Design Policy and Design Improvements
Example Code
- Static Extents and Static Stride
- Slice and Subtensor

Provide feedback

Saved searches

Use saved searches to filter your results more quickly