SEEDEVAL

A library for software engineering task evaluation

Supported Tasks

Text-to-Code

Task Definition

| Task | Input | Output | Task Definition |
|---|---|---|---|
| Code Generation | A natural language description/comment on implementing a certain specification. | Code | Generate code for a given specification written in natural language. |
| Code Search | A natural language description of code. | The code that matches the description. | Given a natural language query, search for source code that matches it. |

Metrics

| Task | Metric | Reference | If Integrated? |
|---|---|---|---|
| Code Generation | EM (Exact Match) | CodeXGLUE - Text2Code Generation | ✔️ |
| | BLEU | CodeXGLUE - Text2Code Generation | ✔️ |
| Code Search | MRR | CodeXGLUE - Code Search (AdvTest) | ✔️ |
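
To make the MRR metric concrete, here is a minimal sketch of how mean reciprocal rank could be computed for code search. The function name `mean_reciprocal_rank` and its input layout are assumptions made for illustration; they are not taken from this library or from the CodeXGLUE evaluator.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    ranked_results: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct answer over all queries.

    ranked_results[i] is the ranked candidate list for query i (best first),
    gold[i] is the single correct candidate for query i. Both names are
    hypothetical and only used for this sketch.
    """
    if len(ranked_results) != len(gold):
        raise ValueError("ranked_results and gold must have the same length")
    if not gold:
        return 0.0

    total = 0.0
    for candidates, answer in zip(ranked_results, gold):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == answer:
                total += 1.0 / rank
                break  # only the first hit counts; a miss contributes 0
    return total / len(gold)


# Hypothetical usage: three queries, the third one never finds its answer.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
answers = ["a", "y", "missing"]
score = mean_reciprocal_rank(ranked, answers)
```

A query whose correct snippet is ranked first contributes 1, one ranked second contributes 1/2, and a query whose answer is absent from the list contributes 0.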

Code-to-Code

Task Definition

| Task | Input | Output | Task Definition |
|---|---|---|---|
| Code Translation | A function written in either C# or Java. | The function translated from Java to C# or vice versa. | Translate code from one programming language to another. |
| Code Repair | A Java function with bugs. | The refined function with no bugs. | Automatically refine code by fixing bugs. |
| Code Completion | A chunk of Java or Python context code. | The predicted next token. | Predict subsequent tokens given the context of code. |

Metrics

| Task | Metric | Reference | If Integrated? |
|---|---|---|---|
| Code Translation | EM (Exact Match) | CodeXGLUE - Code Translator | ✔️ |
| | BLEU | CodeXGLUE - Code Translator | ✔️ |
| Code Repair | EM (Exact Match) | CodeXGLUE - Code Refinement | ✔️ |
| | BLEU | CodeXGLUE - Code Refinement | ✔️ |
| Code Completion | EM (Exact Match) | CodeXGLUE - Code Completion (token level) | ✔️ |
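
As a rough illustration of the two metrics listed above, the sketch below computes exact match and a simplified sentence-level BLEU. It assumes whitespace tokenization and add-one smoothing for n-grams of order 2 and higher. The function names are hypothetical, and the BLEU variant is not the CodeXGLUE reference script, so its scores should not be expected to match published numbers.

```python
import math
from collections import Counter
from typing import List, Sequence


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference, ignoring outer whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(prediction: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU for one prediction/reference pair (assumed variant)."""
    pred, ref = prediction.split(), reference.split()
    if not pred:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        pred_ngrams = ngram_counts(pred, n)
        ref_ngrams = ngram_counts(ref, n)
        # Clipped overlap: an n-gram is credited at most as often as it occurs in the reference.
        overlap = sum(min(count, ref_ngrams[gram]) for gram, count in pred_ngrams.items())
        total = sum(pred_ngrams.values())
        if n == 1:
            if overlap == 0:
                return 0.0  # no shared tokens at all
            precision = overlap / total
        else:
            precision = (overlap + 1) / (total + 1)  # add-one smoothing
        log_precision_sum += math.log(precision)

    # Brevity penalty discourages predictions shorter than the reference.
    brevity = 1.0 if len(pred) >= len(ref) else math.exp(1 - len(ref) / len(pred))
    return brevity * math.exp(log_precision_sum / max_n)


# Hypothetical usage on two tokenized Java statements.
em = exact_match(["int x = 0 ;", "return y ;"], ["int x = 0 ;", "return z ;"])
bleu = sentence_bleu("return y ;", "return z ;")
```

Exact match rewards only fully identical outputs, while BLEU gives partial credit for overlapping n-grams, which is why the two are commonly reported together for translation and repair.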

Code-to-Text

Task Definition

| Task | Input | Output | Task Definition |
|---|---|---|---|
| Code Summarization | Code | A natural language description of the code. | Generate natural language comments for code. |

Metrics

| Task | Metric | Reference | If Integrated? |
|---|---|---|---|
| Code Summarization | EM (Exact Match) | CodeXGLUE - Code-Text | ✔️ |

Code Classification

Task Definition

| Task | Input | Output | Task Definition |
|---|---|---|---|
| Clone Detection | Two examples of code. | A binary classification of similar or not. | Measure the semantic similarity between code snippets. |
| Bug/Defect Prediction - Binary | Code | A binary classification of defective or not. | Classify whether code contains defects that may be used to attack software systems. |
| Bug/Vulnerability Type Prediction - Multi-class | Code | The type of a variable, parameter, or function. | Predict the correct type for a particular variable, parameter, or function. |

Metrics

| Task | Metric | Reference | If Integrated? |
|---|---|---|---|
| Clone Detection | MAP@R score | CodeXGLUE - Clone Detection | ✔️ |
| Bug/Defect Prediction - Binary | EM (Exact Match) | CodeXGLUE - Defect Detection | ✔️ |
| Bug/Vulnerability Type Prediction - Multi-class | EM (Exact Match) | CodeXGLUE - Type Prediction - TypeScript | ✔️ |
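
For readers unfamiliar with MAP@R, the following is a minimal sketch of one way to compute it with numpy. It assumes each code snippet is represented by an embedding vector, that similarity is the dot product, and that R for a query is the number of other snippets sharing its label. The function name `map_at_r` and these choices are assumptions for illustration, not this library's API.

```python
import numpy as np


def map_at_r(embeddings, labels) -> float:
    """Mean average precision at R over all queries (assumed formulation).

    For query i, R is the number of *other* items with the same label.
    The top-R most similar items are retrieved and scored with
    AP@R = (1/R) * sum_k precision@k * relevant(k).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    scores = embeddings @ embeddings.T   # pairwise dot-product similarity
    np.fill_diagonal(scores, -np.inf)    # a query must never retrieve itself

    average_precisions = []
    for i in range(len(labels)):
        r = int(np.sum(labels == labels[i])) - 1
        if r == 0:
            continue  # no positives exist for this query, so it is skipped
        top_r = np.argsort(-scores[i])[:r]
        relevant = (labels[top_r] == labels[i]).astype(float)
        precision_at_k = np.cumsum(relevant) / np.arange(1, r + 1)
        average_precisions.append(float(np.sum(precision_at_k * relevant) / r))

    return float(np.mean(average_precisions)) if average_precisions else 0.0


# Hypothetical usage: four snippets in two clone classes.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
score = map_at_r(vectors, [0, 0, 1, 1])
```

Because R adapts to the size of each clone class, a perfect score requires every one of the top-R retrieved snippets to belong to the query's class.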

Others

Task Definition

| Task | Input | Output | Task Definition |
|---|---|---|---|

Metrics

| Task | Metric | Reference | If Integrated? |
|---|---|---|---|
| Fault/Bug Localization | | Paper with Replication Package | |

How to Contribute

Thank you for your interest in contributing! This document outlines the process for contributing to our project. Your contributions can make a real difference, and we appreciate every effort you make to help improve this project.

Getting Started

  1. Identify your target software engineering task (Unfamiliar with SE tasks? Find them here!)

You can either integrate an existing evaluation technique or add a new one.

Note that some evaluation tasks may already be in progress. Check the pull requests tab to see whether a task is already being worked on.

  2. Integrate the evaluation method

Ensure that you have a detailed readme that describes how to use the evaluation method.

An example of an evaluation method and an appropriate readme can be found here.

  3. Add a test script for your evaluation

To ensure the validity of the evaluation method, we require that you provide a test script as well.

You must add your tests to the separate test folder. We also ask that you provide a 'how-to-test' section in your readme, detailing how to test the evaluation method.

An example test script can be found here.
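
As a rough idea of what such a test script might look like, here is a minimal sketch using Python's built-in `unittest` module. The metric under test (`exact_match`), the class name, and the test cases are hypothetical stand-ins; in a real contribution the function would be imported from your evaluation module rather than defined inline, and the project's linked example remains the authoritative template.

```python
import unittest


def exact_match(predictions, references):
    """Stand-in metric for this sketch: fraction of predictions equal to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


class ExactMatchTest(unittest.TestCase):
    """Hypothetical test case covering typical inputs and one failure mode."""

    def test_all_predictions_correct(self):
        self.assertEqual(exact_match(["a", "b"], ["a", "b"]), 1.0)

    def test_half_of_predictions_correct(self):
        self.assertAlmostEqual(exact_match(["a", "x"], ["a", "b"]), 0.5)

    def test_empty_input_scores_zero(self):
        self.assertEqual(exact_match([], []), 0.0)

    def test_length_mismatch_is_rejected(self):
        with self.assertRaises(ValueError):
            exact_match(["a"], ["a", "b"])
```

Saved in the test folder, a file like this can be run with `python -m unittest` from the repository root, which is also a reasonable command to document in the 'how-to-test' section of your readme.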

Coordinator

Mitchell Huggins. Please contact mrhuggin@ncsu.edu with any questions about SEVAL.

Contributors

mrhuggins03 chaseltb ArsalaanK7 BrennenFa EZ7051 ywang146 kritipat gsharma3

Dependencies

  • Python 3.6 or 3.7
  • numpy