Logithon-AI-Hack

Self Learning AI PDF to Data Converter

This project tackles PDF data extraction using a Large Language Model for layout-agnostic and context-aware results.

Table of contents

1. About the Project

2. Tech Stack

3. Theory and Approach

4. Results and Demo

5. Future Work

6. Contributors

7. Acknowledgements

About the project

This project builds a question-answering system for text. Users type a question, and the system answers it using the Llama 2 13B model.

Tech Stack

Web Technologies:

Next.js
TypeScript
FastAPI

Machine Learning Technologies:

Python
Hugging Face LLMs
PyTorch
Reinforcement Learning

Data Analysis:

NumPy
Pandas
Matplotlib

Databases:

ChromaDB (Vector database)
Firebase

Theory and Approach

The project starts from Llama 2 13B, a pretrained large language model, which was fine-tuned on a dedicated dataset. Reinforcement learning from human feedback (RLHF) was then applied, with GPT-2 serving as the reward model, to further adapt the model to data conversion tasks.
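To make the role of the reward model more concrete, here is a minimal sketch of the per-response reward used in many PPO-style RLHF setups: the reward model's score minus a KL-style penalty that keeps the tuned model close to a frozen reference copy. The README does not say which RL algorithm, library, or penalty weight the project used, so this formulation is an assumption, and the names `Sample`, `kl_penalized_reward`, and `beta` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One generated response with the quantities an RLHF update needs."""
    reward_score: float           # scalar score from the reward model (GPT-2 in this project)
    policy_logprobs: List[float]  # per-token log-probs under the model being tuned
    ref_logprobs: List[float]     # per-token log-probs under the frozen reference model


def kl_penalized_reward(sample: Sample, beta: float = 0.1) -> float:
    """Return the reward-model score minus beta times the summed log-ratio.

    Assumption: this is a common PPO-style RLHF formulation, not a formula
    confirmed by the project. beta is a hypothetical penalty weight.
    """
    if len(sample.policy_logprobs) != len(sample.ref_logprobs):
        raise ValueError("policy and reference log-probs must align token by token")
    # Summed per-token log-ratio: a sample-based estimate of the divergence
    # between the tuned policy and the reference model on this response.
    log_ratio = sum(
        p - r for p, r in zip(sample.policy_logprobs, sample.ref_logprobs)
    )
    return sample.reward_score - beta * log_ratio
```

A response that the reward model rates highly but that drifts far from the reference model receives a lower training reward, which discourages the tuned model from exploiting weaknesses in the reward model.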

Results and demo

[Screenshots: ss1, ss]

Future work

Implementation of vision transformers.

Speech-to-text conversion.

Contributors

PARAM THAKKAR

ABHI MEHTA

ANUSHKA YADAV

AKSHITA BHASIN

Acknowledgements

Logithon 2024