Qubitium
Golang, Python, Kotlin, Swift. I prefer strongly typed languages and I do not worship PEP.
@ModelCloudAi
- ModelCloud.ai
- Earth/Epoch 2.0
- https://modelcloud.ai
- @qubitium
Pinned
- ModelCloud/GPTQModel (Public): Production-ready LLM model compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- sgl-project/sglang (Public): SGLang is a fast serving framework for large language models and vision language models.
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs.
- AutoGPTQ/AutoGPTQ (Public): An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
- flashinfer-ai/flashinfer (Public): FlashInfer: Kernel Library for LLM Serving
- Dao-AILab/flash-attention (Public): Fast and memory-efficient exact attention