This project extracts data from PDFs with a large language model (LLM), so that results are layout-agnostic and context-aware.
1. About the Project
2. Tech Stack
3. Theory and Approach
4. Results and Demo
5. Future Work
6. Contributors
7. Acknowledgements and Resources
The project builds a question-answering system: users type a question in plain text, and the system generates an answer with the Llama 2 13B model.
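As a rough illustration of the flow described above, the sketch below shows one way to combine text extracted from a PDF with a user's question into a single prompt for the model. The function name `build_prompt`, the `max_context_chars` budget, and the prompt wording are all hypothetical and are not taken from this project's code.

```python
def build_prompt(document_text: str, question: str, max_context_chars: int = 4000) -> str:
    """Combine extracted PDF text and a user question into one prompt string.

    Hypothetical helper: the prompt template is an assumption, not the
    template used by this project.
    """
    # Collapse the irregular whitespace that PDF text extraction tends to leave.
    context = " ".join(document_text.split())
    # Truncate so the prompt stays within an assumed context budget.
    context = context[:max_context_chars]
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


prompt = build_prompt("Invoice  No: 1042\nTotal:   $310.00", "What is the invoice total?")
print(prompt)
```

The resulting string would then be passed to the model's generation call; that step is omitted here because it depends on how the model is served.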
Next.js
TypeScript
FastAPI
Python
Hugging Face LLMs
PyTorch
Reinforcement Learning
NumPy
Pandas
Matplotlib
ChromaDB (Vector database)
Firebase
The starting point is the pretrained Llama 2 13B language model, which was fine-tuned on a task-specific dataset. Reinforcement learning from human feedback (RLHF) was then applied, with GPT-2 serving as the reward model, to further improve the model on data conversion tasks.
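The README does not say which RLHF algorithm was used. A common setup, assumed here purely for illustration, is PPO-style training in which the reward model's score for a response is reduced by a KL penalty that keeps the fine-tuned policy close to a frozen reference model. The sketch below shows only that reward computation; `kl_penalized_reward`, `beta`, and the example numbers are hypothetical.

```python
from typing import Sequence


def kl_penalized_reward(
    reward_score: float,
    policy_logprobs: Sequence[float],
    reference_logprobs: Sequence[float],
    beta: float = 0.1,
) -> float:
    """Reward-model score minus a KL penalty against a reference model.

    Assumed formulation (not confirmed by this project):
        total = reward_score - beta * sum(log pi(token) - log pi_ref(token))
    where the sum runs over the tokens of the generated response.
    """
    if len(policy_logprobs) != len(reference_logprobs):
        raise ValueError("log-probability sequences must have the same length")
    # Sample-based estimate of the KL divergence for this one response.
    kl_estimate = sum(p - r for p, r in zip(policy_logprobs, reference_logprobs))
    return reward_score - beta * kl_estimate


# The policy agrees with the reference model: no penalty is applied.
same = kl_penalized_reward(1.5, [-0.2, -0.7], [-0.2, -0.7])
# The policy has drifted from the reference model: the reward is reduced.
drifted = kl_penalized_reward(1.5, [-0.1, -0.3], [-0.9, -1.1])
print(same, drifted)
```

In this formulation the reward model (GPT-2 in this project) would supply `reward_score`, while `beta` trades off reward maximization against staying close to the fine-tuned starting point.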
Logithon 2024