[ English | 简体中文 ]
This project provides private large language model (LLM) services. It aims to offer quick access to both general models (GPT-3.5, GPT-4) and private models (Qwen1.5, ChatGLM3, LLaMA2, Baichuan2, etc.) through a unified API. Built on the LangChain framework, it provides multi-turn dialogue (Chat) and retrieval-augmented generation (RAG) services. The project is named after the character Aris from Blue Archive, shown in the figure below.
- [2024-07-13] We open-sourced Aris-AI-Model-Server, which integrates LLM, Embedding, and Reranker deployment services and exposes an OpenAI-compatible API, making it easy to deploy private models.
- [2024-06-23] We released the Aris-14B-Chat series models, fine-tuned (SFT and DPO) from Qwen1.5-14B-Chat on our private dataset. Please comply with the Qwen open-source license when using them.
- [2024-06-15] Neo4j is now used as the database for storing knowledge bases.
- Transformers
- PEFT
- PyTorch
- DeepSpeed
- llama.cpp
- llama-cpp-python
- LangChain
- FastAPI
- SQLAlchemy
- JWT
- MySQL
- Redis
- Neo4j
- Streamlit
- Docker
- User registration, login, and permission management
- Dialogue management and history management
- Model (LLM, Embedding) management and preset (System) prompt management
- Vector database management and insertion, supporting:
  - Files: PDF, Markdown, HTML, Jupyter, TXT, and code files (Python, C++, Java, etc.)
  - Links: arXiv, Git, unauthenticated URLs (with recursive crawling and automated tool crawling)
- Chat: supports multi-turn dialogue
- Retrieval QA: supports question answering with retrieval-augmented generation (RAG)
- Interface for uploading knowledge bases
- Dialogue interface
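The Retrieval QA feature above follows the usual RAG pattern: embed the query, retrieve the most similar knowledge-base chunks, and stuff them into the prompt. A minimal sketch of that retrieval step, with toy vectors standing in for a real Embedding model (this is a conceptual illustration, not the project's actual implementation):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, docs, k=2):
    # docs: list of (text, embedding) pairs; return the k most similar texts.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy embeddings stand in for a real Embedding model.
docs = [
    ("Aris supports multi-turn chat", [1.0, 0.1, 0.0]),
    ("Neo4j stores the knowledge base", [0.0, 1.0, 0.2]),
    ("Docker compose starts the stack", [0.1, 0.0, 1.0]),
]
context = retrieve([0.9, 0.2, 0.1], docs, k=1)
prompt = f"Answer using this context: {context[0]}\nQuestion: ..."
```

In the real service the embeddings come from the configured Embedding model and the vectors live in the knowledge-base store rather than in a Python list.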
.
├── assets
├── confs
│ ├── deployment
│ └── local
├── docker
│ ├── deployment
│ └── local
├── envs
│ ├── deployment
│ └── local
├── kubernetes
├── logs
├── pages
└── src
├── api
│ ├── auth
│ ├── model
│ └── router
│ └── v1
│ ├── model
│ └── oauth2
├── config
├── langchain_aris
├── logger
├── middleware
│ ├── jwt
│ ├── logger
│ ├── mysql
│ │ └── models
│ └── redis
└── webui
git clone https://github.com/hcd233/Aris-AI
cd Aris-AI
You can skip this step, but make sure your Python environment is version 3.11.
conda create -n aris python=3.11.0
conda activate aris
pip install poetry
poetry install
See the template file
docker-compose -f docker/local/docker-compose.yml up -d
Note that you need to load local/api.env as environment variables in your IDE before running:
python aris_api.py
Note that you need to load local/webui.env as environment variables in your IDE before running:
streamlit run aris_webui.py
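If you prefer running from a plain terminal instead of an IDE, the env file can be exported before launching. A sketch (the file path and variable below are illustrative stand-ins, not the project's real template):

```shell
# Stand-in env file; in the repo you would source local/api.env instead.
cat > /tmp/aris_demo.env <<'EOF'
API_PORT=8000
EOF
set -a                # auto-export every variable defined while sourcing
. /tmp/aris_demo.env
set +a
echo "API_PORT=$API_PORT"   # then run: python aris_api.py
```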
- SwaggerUI: http://localhost:${API_PORT}/docs
- WebUI: http://localhost:8501
See the template file
docker volume create mysql-data
docker volume create redis-data
docker volume create neo4j-data
docker-compose -f docker/deployment/docker-compose.yml up -d --no-build
- For login, only simple username/password verification is implemented, and the WebUI does not provide a registration page. Please call the API directly to register, and set the administrator flag (is_admin=1) in the database to gain access to private models.
- After logging in, you need to carry a JWT token to manage API keys, which are used to call the private model services.
- To call a general large model service (currently only the OpenAI series models, or proxies with OpenAI-compatible interfaces), you can access it directly through the API. Store the base URL, key, max_tokens, and similar settings in the database; the System prompt is customizable.
- To call a private model service, deploy the model behind an OpenAI-compatible API (for example with Aris-AI-Model-Server) and configure it accordingly.
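The JWT flow mentioned above can be sketched with a minimal HS256 sign/verify using only the standard library. This is a conceptual illustration of how such tokens work, not the project's actual middleware; the secret and claim names are made up for the example:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # illustrative; a real service loads this from config

def b64url(data: bytes) -> str:
    # base64url without padding, as used in JWTs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict) -> str:
    # Token is header.payload.signature, each part base64url-encoded.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    msg = f"{header}.{body}".encode()
    sig = b64url(hmac.new(SECRET, msg, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify(token: str) -> bool:
    # Recompute the signature and compare in constant time.
    header, body, sig = token.split(".")
    msg = f"{header}.{body}".encode()
    expected = b64url(hmac.new(SECRET, msg, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)

token = sign({"sub": "alice", "is_admin": 1})
```

In practice a library such as PyJWT also handles expiry (`exp`) and other registered claims; the sketch only shows the signing scheme.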
- Support more model providers (AzureOpenAI, Gemini, HuggingFaceEndpoint, llama.cpp)
- More RAG strategies (RAG fusion, reranking, multi-path recall, etc.)
- Support multi-modal Chat & RAG
- Maintain a key pool per model for load balancing
- Support Agents and tool calls
- Release fine-tuned private models
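For the key-pool roadmap item, the simplest scheme is round-robin rotation over the keys registered for one model. A minimal sketch of that idea (this is not implemented in the project yet; names and keys are illustrative):

```python
from itertools import cycle

class KeyPool:
    """Round-robin pool of API keys for a single model (illustrative sketch)."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("key pool must not be empty")
        self._cycle = cycle(keys)

    def next_key(self):
        # Each call hands out the next key, wrapping around at the end.
        return next(self._cycle)

pool = KeyPool(["sk-aaa", "sk-bbb", "sk-ccc"])
picks = [pool.next_key() for _ in range(4)]
```

A production version would also track per-key rate limits and temporarily evict keys that return errors, but round-robin is the usual starting point.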
Due to a busy work schedule, progress on this project may be slow and updates will be occasional. PRs and issues are welcome.