
Private Benchmarking of Machine Learning Models

Project Status

Warning: This is an academic proof-of-concept prototype and has not received careful code review. This implementation is NOT ready for production use.

Pending items

  • SSL certificate security for the website (file: settings.py) (deployment task)
  • Implement trust levels 1, 2, 3, 4, and 5
  • Testing
  • Documentation
  • CI/CD workflows for GitHub Actions

Project Description

This project aims to create a platform that enables users to perform private benchmarking of machine learning models. The platform facilitates the evaluation of models based on different trust levels between the model owners and the dataset owners.

This repository provides the accompanying code for the paper https://arxiv.org/abs/2403.00393:

TRUCE: Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs

Tanmay Rajore, Nishanth Chandran, Sunayana Sitaram, Divya Gupta, Rahul Sharma, Kashish Mittal, Manohar Swaminathan

Installation

For a complete build with EzPC LLM support

  • Modify the setup.sh file according to your system configuration for NVIDIA drivers and the CUDA version (the defaults are CUDA 11.8 and GPU architecture 90, i.e. Hopper):
        (In setup.sh)
        line 42: export CUDA_VERSION=11.8
        line 43: export GPU_ARCH=90
    
  • Run the setup.sh file:
    ./setup.sh
    Enter the Server IP address: <your_server_IP>
    
    • The setup.sh file will install the required dependencies and set up the environment for the platform to run. The .env file contains the Django secret key, which can be changed as per the user's requirements. For any key-related storage, use only the .env file.
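
As an illustration of the last point, the platform code can read all key material from the .env file instead of hard-coding it. The snippet below is a minimal sketch, assuming the python-dotenv package and an illustrative variable name SECRET_KEY; adapt it to the variable names actually present in your .env file.

    import os
    from dotenv import load_dotenv  # assumes python-dotenv is available

    # Load variables from the .env file in the project root into the process environment.
    load_dotenv()

    # Read the Django secret key from the environment instead of hard-coding it;
    # this raises immediately if the variable is missing from .env.
    SECRET_KEY = os.environ["SECRET_KEY"]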

For the platform only

  • NOTE: For this path you must manually set the ENCRYPTION_KEY environment variable (a 32-byte/256-bit key) for the TTP/TEE server, create the .env file for the platform, and choose the IP address the platform will run on (see the sketch after the commands below).
pip install -r requirements.txt
cd eval_website/eval_website
python manage.py makemigrations
python manage.py migrate
python manage.py runserver 0.0.0.0:8000
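
For the manual setup described in the note above, the ENCRYPTION_KEY and a minimal .env file can be prepared with a short script such as the following sketch. Only ENCRYPTION_KEY is a name the server actually expects; the SECRET_KEY entry written to .env is illustrative and should match whatever the platform settings read.

    import binascii
    import os
    import secrets

    # Generate a 32-byte (256-bit) key, hex-encoded, for the TTP/TEE server.
    encryption_key = binascii.hexlify(os.urandom(32)).decode("utf-8")

    # Write a minimal .env file for the platform (the variable name here is illustrative).
    with open(".env", "w") as env_file:
        env_file.write(f"SECRET_KEY={secrets.token_urlsafe(50)}\n")

    # The TTP/TEE server reads ENCRYPTION_KEY from the environment, e.g.:
    #   export ENCRYPTION_KEY=<generated hex string>
    print(f"export ENCRYPTION_KEY={encryption_key}")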

Usage

To use the project after installation, visit:

http://127.0.0.1:8000 (on Localhost) or http://<your_server_IP>:8000 (on Public IP)

  • Sample User Credentials

    • Model Owner
      • username: ModelOwner
      • password: helloFriend
    • Dataset Owner
      • username: DatasetOwner
      • password: helloFriend
  • Certain ports are pre-assigned as follows (a port-availability sketch follows after this list):

    • 8000: for the main website
    • 8001: for the EzPC LLM secure communication with Trusted third party server
    • 7000: for the Trusted execution environment to communicate with the website
    • 7001: for the Trusted third party server to receive model files
    • 7002: for the Trusted third party server to receive dataset files
    • 9000: for communication of Dataset owner with the website for receiving key files for EzPC
    • 9001: for communication of Model owner with the website for receiving key files for EzPC
  • Trusted Third Party (TTP) Server

    • The TTP server is a separate server used to perform the secure computation of the model. It must be running for the secure computation to be performed, and can be started with the following commands.
      cd utils/TTP_TEE_files
      python ttp_server.py
    • Assumptions:
      • The TTP server-related details are set in the platform's backend database.
      • The TTP server receives model and dataset files for evaluation from the respective parties on ports 7001 and 7002, respectively.
      • The TTP server will perform the secure computation and return the results to the platform.
      • The TTP server also requires server.crt and server.key files to be present in the same directory as ttp_server.py. These files are used for secure communication between the TTP server and the platform via the CA that the platform generates after its first run, and they can be generated with the following commands:
      openssl req -newkey rsa:2048 -nodes -keyout "./server.key" -out server.csr -subj /CN=127.0.0.1
      
      # ca.crt and ca.key are generated in the eval_website root after the platform's first run
      openssl x509 -req -in server.csr -CA <path>/ca.crt -CAkey <path>/ca.key -CAcreateserial -out ./server.crt -days xxx
    • The ENCRYPTION_KEY environment variable must be set to a 32-byte (256-bit) key for the TTP/TEE server to run (a validation sketch follows after this list).
      export ENCRYPTION_KEY="32 bytes key"
      #generate a 32 bytes key using the following command
      python -c 'import os, binascii; print(binascii.hexlify(os.urandom(32)).decode("utf-8"))'
      
  • Trusted Execution Environment (TEE)

    • The Trusted Execution Environment is a separate server used to perform the secure computation of the model (based on the TTP scripts). It must be running for the secure computation to be performed. Detailed instructions for setting it up can be found under TTP/TEE.
  • EzPC LLM

    • Currently, EzPC supports the following models:

      • bert-tiny
      • bert-base
      • bert-large
      • gpt2
      • gpt-neo
      • llama7b
      • llama13b
    • For more information on how to use EzPC LLM, refer to the EzPC LLM documentation.
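
The components above fail if their pre-assigned ports are already taken, so it can help to check the ports before starting anything. The following is a small local sketch, not part of the platform; the port list simply mirrors the assignments above.

    import socket

    # Ports the platform and its helper servers expect to be available (see the list above).
    PORTS = [8000, 8001, 7000, 7001, 7002, 9000, 9001]

    def is_free(port: int) -> bool:
        """Return True if nothing is currently bound to the given local TCP port."""
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                sock.bind(("0.0.0.0", port))
                return True
            except OSError:
                return False

    for port in PORTS:
        print(f"port {port}: {'free' if is_free(port) else 'in use'}")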
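
Similarly, because the TTP/TEE server will not run without a correctly sized ENCRYPTION_KEY, the sketch below checks that the variable is set and decodes to exactly 32 bytes. It assumes the hex encoding produced by the generation command in the TTP server section; the server's own checks may differ.

    import binascii
    import os
    import sys

    # ENCRYPTION_KEY is expected to be the hex encoding of a 32-byte (256-bit) key,
    # as produced by the generation command shown in the TTP server section.
    key_hex = os.environ.get("ENCRYPTION_KEY")
    if key_hex is None:
        sys.exit("ENCRYPTION_KEY is not set")

    try:
        key = binascii.unhexlify(key_hex)
    except (binascii.Error, ValueError):
        sys.exit("ENCRYPTION_KEY is not valid hex")

    if len(key) != 32:
        sys.exit(f"ENCRYPTION_KEY decodes to {len(key)} bytes, expected 32")

    print("ENCRYPTION_KEY looks valid (32 bytes / 256 bits)")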

Artifacts Evaluation

The artifacts evaluation for reproducing the table in the paper can be found under Artifacts Evaluation.

Contributing

If you would like to contribute to this project, please follow the guidelines outlined in the contributing.md file.

License

This project is licensed under the MIT license. Please see the LICENSE file for more information.
