This repository contains the second version of the code base for the Xilinx SDAccel FPGA implementation of Dynamic Graph CNN. The host program in this repository implements out-of-order kernel queuing. The version with in-order queuing is available here. The paper can be accessed here.
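As background on the term: in an out-of-order command queue (in OpenCL, a queue created with the `CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE` property), the order in which commands are enqueued does not fix the order in which they run, so ordering has to be expressed through event dependencies. The sketch below is not the host code of this repository and uses no OpenCL calls. It only models the idea in plain C++, assuming each launch carries a wait list of earlier launches. `KernelLaunch` and `ScheduleOutOfOrder` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for an enqueued kernel launch: a name plus the
// indices of the launches whose completion events it waits on.
struct KernelLaunch {
    std::string name;
    std::vector<std::size_t> waitList;
};

// Returns one valid execution order for an out-of-order queue: a launch
// becomes runnable as soon as every launch in its wait list has finished
// (Kahn's topological sort). Indices refer to positions in `launches`.
std::vector<std::size_t> ScheduleOutOfOrder(const std::vector<KernelLaunch>& launches) {
    const std::size_t n = launches.size();
    std::vector<std::size_t> pending(n, 0);               // unfinished dependencies per launch
    std::vector<std::vector<std::size_t>> dependents(n);  // launches waiting on each launch

    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t dep : launches[i].waitList) {
            assert(dep < n && "wait list refers to an unknown launch");
            dependents[dep].push_back(i);
            ++pending[i];
        }
    }

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i) {
        if (pending[i] == 0) ready.push(i);  // no dependencies: runnable immediately
    }

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t current = ready.front();
        ready.pop();
        order.push_back(current);
        // Completing `current` signals its event and may release dependents.
        for (std::size_t next : dependents[current]) {
            if (--pending[next] == 0) ready.push(next);
        }
    }
    return order;  // shorter than n if the wait lists contain a cycle
}
```

With four launches where the second waits on the first and the fourth waits on the second and third (an invented dependency graph), the third launch is scheduled before the second because it has no dependencies, which an in-order queue would not allow.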
Although the SDx GUI is easy to use, it is recommended to use the provided CMake scripts to run synthesis and build the binaries for both the selected FPGA platform and the host.
This project relies on the following software and libraries, which should be installed on the OS:
- Xilinx SDAccel 2019.1 (tested); 2018.3, 2018.2, and 2017.4 (not tested)
- Xilinx XRT
- python3 (symlinked as `python3`)
- CMake3 (>3.10; do **not** use the default CMake package available on AWS-F1)
- Bash (>4.0; Dash and other shells are not tested)
- devtoolset-7 (>7.0, for C++14 support)
- Make sure that the latest Vivado patches are applied, such as AR73068.
To make it easier to explore the design space and try different configurations, all of the parameters that affect the output performance of the task kernels are gathered in a separate submodule repository in the `config` directory.
Also, please note that various Vivado directives are used for the different implementation steps (opt, place, and route) to facilitate design implementation.
Refer to the table below.
Name | Supported Platform | Implementation | Datasets |
---|---|---|---|
CModel1 | Cpu, Xil | CImplementationCpu, CImplementationXil | ShapeNet, ModelNet |
There are two types of tests for the project: `KernelTests` and `OclTests`.
`KernelTests` are located at `test/kerneltests` and mainly aim to test the HLS code of the kernels.
To run the kernel tests, run:
`make test`
`OclTests` use the Google Test framework to check the correctness of the kernel outputs in HW-EMU and HW modes, and also test all of the OpenCL-related infrastructure of DeepPointFPGA.
The main executable of `OclTests` is located at `test/ocltests/`. To execute the `OclTests`, run this in the build directory:
`sh LaunchOclTests.sh`
- Tensors
  - CTensor<T> : CTensorBase
  - CTensorXil<T> : CTensorBase
- Implementations
  - CPlatformSelection
  - CImplementationCpu : CImplementationBase
  - CImplementationXil : CImplementationBase
- Kernels
  - CKernelWrapperBasicOps : CKernelWrapper
  - CKernelWrapperConcat : CKernelWrapper
  - CKernelWrapperConv : CKernelWrapper
  - CKernelWrapperGather : CKernelWrapper
  - CKernelWrapperMatmul : CKernelWrapper
  - CKernelWrapperPadUnpad : CKernelWrapper
  - CKernelWrapperReduce : CKernelWrapper
  - CKernelWrapperReluSqrtSquare : CKernelWrapper
  - CKernelWrapperTile : CKernelWrapper
  - CKernelWrapperTopK : CKernelWrapper
  - CKernelWrapperTranspose : CKernelWrapper
- Models
  - CClassifierMultiPlatform
  - CModel1
- Misc
  - CWeightLoader
  - CProfiler
  - CXilInfo
  - CStringFormatter
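The list above suggests a layout in which a platform selector forwards each operation to one of several implementations that share a base class. The following is a minimal sketch of that pattern only. The class names `CImplementationBase`, `CImplementationCpu`, `CImplementationXil`, and `CPlatformSelection` come from the list, but every member function, signature, and the `Platform` enum are assumptions made for illustration and are not taken from the repository's sources.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Assumed enum; the README only names the platforms "Cpu" and "Xil".
enum class Platform { Cpu, Xil };

// Hypothetical interface: the real base class is not shown in this README.
class CImplementationBase {
public:
    virtual ~CImplementationBase() = default;
    virtual std::string Name() const = 0;
    virtual std::vector<float> Relu(const std::vector<float>& input) const = 0;
};

// Reference implementation that computes on the host CPU.
class CImplementationCpu : public CImplementationBase {
public:
    std::string Name() const override { return "CImplementationCpu"; }
    std::vector<float> Relu(const std::vector<float>& input) const override {
        std::vector<float> output(input.size());
        std::transform(input.begin(), input.end(), output.begin(),
                       [](float v) { return std::max(0.0f, v); });
        return output;
    }
};

// Placeholder for the FPGA path. A real version would launch a kernel;
// this sketch computes on the host so that it stays self-contained.
class CImplementationXil : public CImplementationBase {
public:
    std::string Name() const override { return "CImplementationXil"; }
    std::vector<float> Relu(const std::vector<float>& input) const override {
        return CImplementationCpu().Relu(input);
    }
};

// Owns one implementation per platform and dispatches by the requested platform.
class CPlatformSelection {
public:
    CPlatformSelection()
        : cpu_(new CImplementationCpu()), xil_(new CImplementationXil()) {}

    const CImplementationBase& Get(Platform platform) const {
        return platform == Platform::Cpu ? *cpu_ : *xil_;
    }

    std::vector<float> Relu(Platform platform, const std::vector<float>& input) const {
        return Get(platform).Relu(input);
    }

private:
    std::unique_ptr<CImplementationBase> cpu_;
    std::unique_ptr<CImplementationBase> xil_;
};
```

Under these assumptions, a model would call `CPlatformSelection::Relu(Platform::Cpu, data)` or the `Platform::Xil` variant without knowing which concrete implementation runs, which also makes it straightforward to compare the outputs of the two paths.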
This repository contains multiple branches as described below:
Branch | AXI Width | DType | Tool | Notes |
---|---|---|---|---|
master | 512-bits | float32 | SDx2019.1 | - |
To cite this work, please use the following BibTeX entry:
@article{jamali2022dgcnn,
title={DGCNN on FPGA: Acceleration of the Point Cloud Classifier Using FPGAs},
author={Jamali Golzar, Saleh and Karimian, Ghader and Shoaran, Maryam and Fattahi Sani, Mohammad},
journal={Circuits, Systems, and Signal Processing},
pages={1--32},
year={2022},
publisher={Springer}
}
These repositories are used in this project:
Repo | Description | License |
---|---|---|
dgcnn | (Paper(ACM), Paper(Arxiv)) Dynamic Graph CNN for Point Clouds (TensorFlow) | N/S |
DeepPointV1-GPGPU | Our CUDA/OCL Version of DGCNN | N/S |
DeepPointV1-FPGA | Our FPGA Version of DGCNN | N/S |
hlslib | (Paper) CMake/HLS Libraries for Intel and Xilinx | BSD 3-Clause |
gemm_hls | (Paper(ACM), Paper(Arxiv)) Scalable matrix matrix multiplication on FPGA | BSD 3-Clause |
pp4fpgas | (Book(Arxiv)) Parallel Programming for FPGAs | N/S |
cnpy | C++ Library for working with *.npy files | MIT |
PointNet | (Paper) PointNet 1 | MIT |
PointNet++ | (Paper) PointNet 2 | MIT |
argparse | C++ Library for handling arguments | Apache-2.0-with-LLVM-Exception or GPL-3.0 |
spdlog | C++ Library for fast logging | MIT |
rapidjson | A fast JSON parser/generator for C++ with both SAX/DOM style API | MIT |
hls_tutorial_examples | (Paper) HLS examples and tutorials (Workshop) | BSD 3-Clause |
SimplePasteBin | Python Library for working with PasteBin.com | GPL-3.0 |
googletest | GoogleTest - Google Testing and Mocking Framework | BSD 3-Clause |