The RefMap inference optimization pipeline is a workflow for optimizing deep learning models, with the following features:
- RefMap Inference Optimization Toolkit: This framework optimizes deep learning models within the RefMap framework, ensuring their sustainability and efficiency under realistic use-case constraints. It addresses both the memory and latency limitations of the resource-constrained devices on which models are deployed in real-world conditions, refining models at both the structural and the computational level. The framework is designed to support the most widely used ML frameworks, namely PyTorch, TensorFlow, and ONNX. It implements a two-fold inference optimization process, providing greater abstraction and flexibility in optimization choices for specific use cases (see the sketch after this list): (a) Model Compression, which targets the model structure in a hardware-agnostic manner, offering general-purpose solutions and eliminating the need to specify the exact hardware backend for deployment, and (b) Compilation Optimizations, which incorporate hardware specifications into the optimization process to enhance model computations during execution. This comprehensive approach allows the framework to adapt to a variety of deployment scenarios while maintaining efficiency and performance.
- Examples: Play around with off-the-shelf compiler optimizations for turbulence prediction models.
- Benchmark: Explore and experiment with various pruning methods and configurations for general-purpose model compression.
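To make the two optimization levels concrete, the sketch below shows what they look like using only stock PyTorch APIs: `torch.nn.utils.prune` stands in for hardware-agnostic model compression and `torch.compile` (PyTorch 2.x) for hardware-aware compilation. The model and shapes are placeholders; this illustrates the general idea rather than the toolkit's own interface.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Placeholder model standing in for any RefMap use-case network.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# (a) Model Compression: hardware-agnostic structural refinement.
# Here, L1 unstructured pruning zeroes out 50% of each Linear layer's weights.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the pruning permanent

# (b) Compilation Optimizations: hardware-aware graph compilation.
# torch.compile (PyTorch 2.x) lowers the model to an optimized backend.
optimized_model = torch.compile(model)

with torch.no_grad():
    out = optimized_model(torch.randn(1, 128))
```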
For more technical details, please refer to our published papers:
TBA
- 2024.09.13 Structural Pruning for PyTorch Models
- 2024.09.10 Compiler Optimization for TensorFlow Models using OpenXLA (see the sketch after this list)
- 2024.09.10 Cross-Framework ML Model Converter
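To give a flavor of the two 2024.09.10 entries above, the snippet below uses only the vanilla upstream APIs they build on: `tf.function(jit_compile=True)` to request XLA/OpenXLA compilation in TensorFlow, and `torch.onnx.export` for PyTorch-to-ONNX conversion. Models and shapes are placeholders, and this is not the workflow's own API.

```python
import tensorflow as tf
import torch
import torch.nn as nn

# --- Compiler optimization via OpenXLA (TensorFlow) ---
# jit_compile=True asks TensorFlow to compile the traced graph with XLA.
@tf.function(jit_compile=True)
def xla_inference(model, batch):
    return model(batch, training=False)

tf_model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
_ = xla_inference(tf_model, tf.random.normal((1, 128)))

# --- Cross-framework conversion (PyTorch -> ONNX) ---
# torch.onnx.export is the stock conversion path to the ONNX format.
torch_model = nn.Linear(128, 10).eval()
torch.onnx.export(
    torch_model,
    torch.randn(1, 128),  # example input used for tracing
    "model.onnx",
    input_names=["input"],
    output_names=["logits"],
)
```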
Please do not hesitate to open an issue if you encounter any problems with the pipeline or the related papers.
The RefMap inference optimization workflow is compatible with Python 3.x. The pipeline supports multiple ML frameworks, including (a) PyTorch 1.x and 2.x (PyTorch 1.12.1 or later is strongly recommended), (b) TensorFlow 2.x, and (c) ONNX 1.x.
```bash
git clone https://github.com/CaffeineOverflowAngeL/RefMap_Inference_Optimization_Workflow
```
DepGraph is a versatile framework for automatic structural pruning across various neural network architectures, natively built on PyTorch. For more information, please refer to its GitHub page.
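As a hint of what a DepGraph-based pruning step looks like, here is a minimal sketch following the usage shown in the Torch-Pruning (`torch_pruning`) README; exact class and function names can vary between versions, so treat it as indicative rather than definitive.

```python
import torch
import torch_pruning as tp
from torchvision.models import resnet18

model = resnet18(weights=None)

# Build the dependency graph so coupled layers are pruned consistently.
DG = tp.DependencyGraph().build_dependency(
    model, example_inputs=torch.randn(1, 3, 224, 224)
)

# Group all layers affected by removing channels 2, 6, and 9 of conv1.
group = DG.get_pruning_group(
    model.conv1, tp.prune_conv_out_channels, idxs=[2, 6, 9]
)

# Apply the structural pruning if the grouped edit is valid.
if DG.check_pruning_group(group):
    group.prune()
```

The dependency graph is the key design point: removing output channels from `conv1` automatically propagates to every coupled layer (e.g. the following BatchNorm), which keeps the pruned network structurally consistent.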