Chore()): add further tests #8

Merged
merged 6 commits on Jul 18, 2024
2 changes: 1 addition & 1 deletion .env
@@ -2,7 +2,7 @@ PROJECT_NAME= llm-mock
SERVER_PORT=8001
LLM_URL_ENDPOINT=chatgpt/chat/completions
## MOCK_LLM_RESPONSE_TYPE can be 'lorem' or 'stored'
MOCK_LLM_RESPONSE_TYPE=lorem
MOCK_LLM_RESPONSE_TYPE=stored
MAX_LOREM_PARAS=8
# SET DEBUG TO * START DETAILED DEBUGGING LOGS AND OFF TO STOP
DEBUG=OFF
12 changes: 12 additions & 0 deletions .env.chatgpt
@@ -0,0 +1,12 @@
PROJECT_NAME= llm-mock
SERVER_PORT=8001
LLM_URL_ENDPOINT=chatgpt/chat/completions
## MOCK_LLM_RESPONSE_TYPE can be 'lorem' or 'stored'
MOCK_LLM_RESPONSE_TYPE=lorem
MAX_LOREM_PARAS=8
# SET DEBUG TO * START DETAILED DEBUGGING LOGS AND OFF TO STOP
DEBUG=OFF
LLM_NAME=chatgpt
VALIDATE_REQUESTS=ON
# SET LOG_REQUESTS TO ON TO LOG DETAILS OF ALL INCOMING REQUESTS (VALIDATE REQUESTS MUST ALSO BE ON)
LOG_REQUESTS=ON
12 changes: 12 additions & 0 deletions .env.gemini
@@ -0,0 +1,12 @@
PROJECT_NAME= llm-mock
SERVER_PORT=8001
LLM_URL_ENDPOINT=models/gemini-pro:generateContent
## MOCK_LLM_RESPONSE_TYPE can be 'lorem' or 'stored'
MOCK_LLM_RESPONSE_TYPE=lorem
MAX_LOREM_PARAS=8
# SET DEBUG TO * START DETAILED DEBUGGING LOGS AND OFF TO STOP
DEBUG=OFF
LLM_NAME=gemini
VALIDATE_REQUESTS=ON
# SET LOG_REQUESTS TO ON TO LOG DETAILS OF ALL INCOMING REQUESTS (VALIDATE REQUESTS MUST ALSO BE ON)
LOG_REQUESTS=ON
7 changes: 1 addition & 6 deletions .github/workflows/tests.yaml
@@ -66,9 +66,4 @@ jobs:
node-version: '20'
cache: 'npm'
- run: npm ci
- run: npm run start

- name: Cypress run
uses: cypress-io/github-action@v6
with:
wait-on: 'http://localhost:8001'
- run: npm run test:e2e
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
node_modules
*.log
/src/logs/*.json
cypress/screenshots
cypress/videos
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 piyook

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
123 changes: 86 additions & 37 deletions README.md
@@ -4,9 +4,9 @@

The purpose of this project is to provide a quick-to-set-up standalone local mock LLM API framework running on localhost for developing with Large Language Models such as Chat-GPT.

This can be used for testing API code and logic before deploying to live servers or quickly producing API endpoints for rapid prototyping for developing frontend clients for web or mobile.
This can be used for testing API code and logic before deploying to live servers, or for quickly producing API endpoints when rapidly prototyping frontend clients for web or mobile. Using a local mock is quicker and cheaper (i.e. free) than using commercial LLMs for initial set-up work.

The project is built using MSW and can be run directly on a local machine or in docker containers. It is further adapted from our own general mock-api framework here: https://github.com/piyook/mock-api-framework-template
The project is built using MSW and can be run directly on a local machine or in Docker containers. It is further adapted from the general mock-api framework here: https://github.com/piyook/mock-api-framework-template

## Set-up

@@ -58,7 +58,7 @@ npm run dev

Available endpoints are listed at the URL root

```
```js
http://localhost:8001
```

Expand All @@ -70,14 +70,14 @@ By setting this to blank then the path will just be the api name E.g localhost:8

You can set this to any value, e.g.

```
LLM_URL_ENDPOINT=things
```js
LLM_URL_ENDPOINT = things;
```

will give

```
localhost:8001/things/users
```js
localhost: 8001 / things / users;
```

This can be used to match the expected path for the LLM (e.g. for chatGPT it is 'chatgpt/chat/completions')
@@ -98,8 +98,8 @@ The LLM response can be set to return random Lorem Ipsum or a random stored resp

Set the following environment variable to 'stored' to use responses stored in the data/data.json file:

```
MOCK_LLM_RESPONSE_TYPE=stored
```js
MOCK_LLM_RESPONSE_TYPE = stored;
```

![LLM Mock Stored Response](image3.png)
@@ -110,14 +110,14 @@ The responses are randomly picked and change for each http request.

Set the following environment variable to 'lorem' to return randomly generated Lorem Ipsum text instead of stored responses:

```
MOCK_LLM_RESPONSE_TYPE=lorem
```js
MOCK_LLM_RESPONSE_TYPE = lorem;
```

A random number of sentences is generated in each response, and the maximum number of sentences can be set with:

```
MAX_LOREM_PARAS=8
```js
MAX_LOREM_PARAS = 8;
```

### Validate Requests to the LLM are in the correct format
@@ -147,8 +147,8 @@ Requests that don't pass the validation will result in an error.

The last request made to the mock can be viewed from

```
localhost:8001/logs
```js
localhost: 8001 / logs;
```

This page provides information sent to the LLM Mock and is useful during development.
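
As a sketch, assuming the logs page answers a plain GET request, the same information can also be pulled from a script:

```js
// Minimal sketch: fetch the most recent request details from the mock's logs page.
const logs = await fetch('http://localhost:8001/logs');
console.log(await logs.text());
```
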
@@ -161,45 +161,44 @@ You will need to redirect all your LLM traffic to the mock llm endpoint. For exa

set the endpoint to the following:

```
LLM_URL_ENDPOINT=chatgpt/chat/completions
```js
LLM_URL_ENDPOINT = chatgpt / chat / completions;
```

this provides an enpoint on :
this provides an endpoint that mimics the chatGPT endpoint at:

```
```js
http://localhost:8001/chatgpt/chat/completions

```

In your LLM client code you can change the chatGPT baseURL if a DEV_MODE flag is set to 'true':

```
```js
import { ChatOpenAI, OpenAIEmbeddings } from '@langchain/openai';

const chatModel = new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'gpt-3.5-turbo',
configuration:
process.env.DEV_MODE === 'true'
? {
baseURL: process.env.DEV_BASE_URL,
}
: {},
});

openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'gpt-3.5-turbo',
configuration:
process.env.DEV_MODE === 'true'
? {
baseURL: process.env.DEV_BASE_URL,
}
: {},
});
```

For example

```
```js
DEV_MODE=true
DEV_BASE_URL=http://localhost:8001/chatgpt
```
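
With those values set, calls made through the configured model are served by the mock instead of OpenAI. A minimal sketch, assuming a LangChain version where `invoke` is available on the chat model:

```js
// Hypothetical usage: with DEV_MODE=true this request is answered by the mock at
// http://localhost:8001/chatgpt; with DEV_MODE unset it would go to the real OpenAI API.
const reply = await chatModel.invoke('Summarise this repo in one sentence.');
console.log(reply.content);
```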

note that you will also need to prevent requests for embeddings - you can do this in langchain using something like
Note that you will also need to prevent requests to the real LLM for embeddings - you can do this in LangChain using something like:

```
```js
import { FakeEmbeddings } from 'langchain/embeddings/fake';
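
// A hedged sketch (not part of the original README): swap the fake embeddings in wherever
// your client would otherwise create real ones, so no embedding requests leave your
// machine while developing against the mock.
const embeddings = new FakeEmbeddings();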


@@ -222,14 +221,64 @@ and then continue to use the code as for normal requests. Setting Dev mode to fa

It's often useful when developing chatbots and assistants to get back a variety of responses. You can set the LLM Mock env vars to use either random lorem ipsum (up to a maximum number of sentences)

```js
MOCK_LLM_RESPONSE_TYPE = lorem;
MAX_LOREM_PARAS = 8;
```
MOCK_LLM_RESPONSE_TYPE=lorem
MAX_LOREM_PARAS=8
```



https://github.com/user-attachments/assets/d36651ac-d7fa-41ad-b8d8-cd23812ae45a



or use stored responses that are randomly provided by setting

```js
MOCK_LLM_RESPONSE_TYPE = stored;
```
MOCK_LLM_RESPONSE_TYPE=stored

then just add required responses to the src/data/data.json file.
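
The exact schema of src/data/data.json isn't shown in this diff; purely as a hypothetical illustration, the idea is a pool of canned replies that the mock picks from at random:

```js
// Hypothetical shape only; check src/data/data.json in the repo for the real structure.
[
  "I'm a stored mock response about shipping estimates.",
  "Here is another canned reply the mock can return at random."
]
```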



https://github.com/user-attachments/assets/86115f16-63d7-49de-af32-c6f9c2bec7cf



# Using Different LLMs

Any LLM model can be set up by adding a JSON request template to the request-templates folder describing the shape of the expected request to the LLM endpoint. The file name must have the format <LLM_NAME>\_req.json

A JSON response-template is also added to the response-templates folder with the expected shape of the response object, replacing any expected content with 'DYNAMIC_CONTENT_HERE'. The file name must have the format <LLM_NAME>\_res.json

Example:

```js
"message": {
"role": "assistant",
"content": "DYNAMIC_CONTENT_HERE"
},
```

This is where lorem or stored responses will be injected into the response object.
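
For the request side, a <LLM_NAME>\_req.json template describes the body the mock should expect. As a hedged sketch for the chatgpt case, assuming it mirrors the standard OpenAI chat-completions request body (the actual file in request-templates is authoritative):

```js
{
  "model": "gpt-3.5-turbo",
  "messages": [{ "role": "user", "content": "Hello" }]
}
```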

Lastly, the .env file will need to be updated with the model name (which has to be the same as <LLM_NAME>).

For example, to use Google Gemini (another example included in this repo), set the variables below in the .env file:

```js
LLM_URL_ENDPOINT=models/gemini-pro:generateContent
LLM_NAME=gemini
VALIDATE_REQUESTS=ON
```

Gemini content is then available for GET and POST requests on

```js
http://localhost:8001/models/gemini-pro:generateContent
```
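
As a sketch, a POST in the public Gemini generateContent request shape (an assumption; the gemini request template in this repo is the source of truth for what the validator expects) could look like:

```js
// Hypothetical request against the mock's Gemini endpoint.
const res = await fetch(
  'http://localhost:8001/models/gemini-pro:generateContent',
  {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Hello from the mock client' }] }],
    }),
  },
);
console.log(await res.json());
```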

Any validation issues with requests from the frontend can be viewed at localhost:8001/logs

Any LLM framework that supports the new model (such as LangChain) can then be updated to use this endpoint (as in the chatGPT example).