queryDex is an interactive data analysis tool for exploring and analyzing datasets with natural language queries. It uses Streamlit for the user interface and Google's Generative AI to interpret each query and select the appropriate statistical analysis.
- Upload CSV or Excel datasets
- Perform various statistical analyses:
  - Chi-square test
  - T-test
  - Pearson correlation
  - Summary statistics
  - And more
- Visualize data with:
  - Histograms
  - Scatter plots
  - Box plots
  - Line charts
- Natural language query processing
- Interactive web interface
- Clone this repository:
    git clone https://github.com/yourusername/queryDex.git
    cd queryDex
- Install the required dependencies:
    pip install -r requirements.txt
- Set up your environment variables:
  Create a .env file in the root directory and add your Google Generative AI API key:
    GEMINI_API_KEY=your_api_key_here
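To check that the key is picked up before launching the app, a minimal standard-library sketch like the one below can help. It is not taken from the queryDex source: the helper names `load_env_file` and `get_gemini_api_key` are hypothetical, and the parser only handles plain `KEY=value` lines. Only the variable name `GEMINI_API_KEY` comes from the step above.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(path=".env"):
    """Return GEMINI_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to the .env file")
    return key
```

Calling `get_gemini_api_key()` from the repository root raises a clear error when the key is missing, which is easier to diagnose than a failed API call later on.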
- Run the Streamlit app:
    streamlit run app.py
- Open your web browser and navigate to the provided local URL (usually http://localhost:8501).
- Upload your CSV or Excel dataset using the file uploader.
- Enter your query in natural language. For example:
  - "Show me a histogram of age"
  - "Is there a correlation between height and weight?"
  - "Perform a t-test on salary between male and female employees"
  - "Plot a box plot of income by region"
  - "Display a line chart of sales over time"
- View the results of your query, including statistical test results or visualizations.
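To give a sense of what two of the example queries compute, here is a rough sketch using pandas and scipy directly. It is an illustration, not the app's code: the toy data is made up, the column names are assumptions, and the choice of Welch's t-test (`equal_var=False`) is ours, since the README does not say which variant the app runs.

```python
import pandas as pd
from scipy import stats

# Toy data standing in for an uploaded dataset (values are made up for illustration).
df = pd.DataFrame({
    "height": [150, 160, 165, 170, 180, 185],
    "weight": [50, 58, 63, 68, 80, 84],
    "gender": ["female", "male", "female", "male", "female", "male"],
    "salary": [52000, 50000, 61000, 58000, 70000, 66000],
})

# "Is there a correlation between height and weight?"
r, r_pvalue = stats.pearsonr(df["height"], df["weight"])

# "Perform a t-test on salary between male and female employees"
male = df.loc[df["gender"] == "male", "salary"]
female = df.loc[df["gender"] == "female", "salary"]
# Welch's t-test: does not assume equal variances (an assumption of this sketch).
t_stat, t_pvalue = stats.ttest_ind(male, female, equal_var=False)

print(f"Pearson r = {r:.3f} (p = {r_pvalue:.4f})")
print(f"t = {t_stat:.3f} (p = {t_pvalue:.4f})")
```

In the app itself, these numbers would be shown in the Streamlit page rather than printed.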
- The app uses Streamlit for the user interface, allowing for easy data upload and query input.
- User queries are processed using Google's Generative AI to determine the type of analysis requested.
- The app extracts relevant column names from the query and matches them to the dataset.
- Depending on the analysis type, the app performs the appropriate statistical test or generates the requested visualization.
- Results are displayed directly in the Streamlit interface.
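The query-handling steps above can be sketched roughly as follows. This is a simplified stand-in, not the project's implementation: a keyword table replaces the Generative AI call for deciding the analysis type, fuzzy matching from the standard library's `difflib` stands in for the column-extraction step, and the names `ANALYSIS_KEYWORDS`, `classify_query`, and `match_columns` are hypothetical.

```python
import difflib
import re

# Hypothetical keyword table standing in for the Generative AI classification step.
ANALYSIS_KEYWORDS = {
    "histogram": "histogram",
    "scatter": "scatter_plot",
    "box plot": "box_plot",
    "line chart": "line_chart",
    "correlation": "pearson_correlation",
    "t-test": "t_test",
    "chi-square": "chi_square",
    "summary": "summary_statistics",
}


def classify_query(query):
    """Return the first analysis type whose keyword appears in the query, else None."""
    lowered = query.lower()
    for keyword, analysis in ANALYSIS_KEYWORDS.items():
        if keyword in lowered:
            return analysis
    return None


def match_columns(query, columns, cutoff=0.85):
    """Return dataset columns mentioned in the query, in the order they appear."""
    lookup = {col.lower(): col for col in columns}
    matched = []
    for word in re.findall(r"[a-z0-9_]+", query.lower()):
        # Fuzzy match tolerates small typos; the cutoff keeps unrelated words out.
        close = difflib.get_close_matches(word, list(lookup), n=1, cutoff=cutoff)
        if close and lookup[close[0]] not in matched:
            matched.append(lookup[close[0]])
    return matched
```

For a dataset with columns `Age`, `Height`, `Weight`, and `Salary`, `classify_query` and `match_columns` together yield an analysis type plus the columns to run it on, which is the information the statistical or plotting step needs.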
Demo video: QUERYDEX.WORKING.mp4
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Streamlit for the web app framework
- Google Generative AI for natural language processing
- pandas for data manipulation
- scipy for statistical computations
- matplotlib and seaborn for data visualization