The ENEM Dashboard is an interactive data visualization tool built with Streamlit that analyzes Brazilian National High School Exam (ENEM) data from 2016 to 2020. ENEM is Brazil's primary standardized test for university admissions, making this dashboard valuable for educators, researchers, and policymakers interested in educational trends and outcomes.
- Geographic Distribution: Interactive maps showing average scores across Brazilian states
- Score Analysis:
  - Average scores by exam type (Mathematics, Natural Sciences, Languages, Human Sciences)
  - Score distribution visualizations for each exam type
  - Year-over-year score comparisons (2016-2020)
- Population Statistics:
  - Race and age distribution by state and year
  - Socioeconomic analysis using violin plots
  - Wealth strata correlation with performance
- Difficulty Metrics:
  - Question difficulty assessment for each exam type
  - Year-by-year difficulty comparisons
  - Subject-specific question analysis
- Machine Learning Model:
  - Score prediction using Multilayer Perceptrons (MLPs)
  - Based on number of correct answers per subject
  - Individual subject score estimates
- Python 3.7+
- pip package manager
- Git (optional)
- Clone the repository:
    git clone https://github.com/yourusername/enem-dashboard.git
    cd enem-dashboard
- Create and activate a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Download required geospatial data:
  - Visit the IBGE website
  - Download bcim_2016_21_11_2018.gpkg
  - Place it in the outils/ directory
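Before starting the dashboard, it can help to confirm that the geospatial file is where the setup steps put it. The following is a minimal sketch, not part of the project: the helper `find_geodata` is hypothetical, and only the file name and the `outils/` directory come from the steps above.

```python
from pathlib import Path

# File and directory names come from the setup steps; the helper is hypothetical.
GPKG_NAME = "bcim_2016_21_11_2018.gpkg"
DATA_DIR = Path("outils")


def find_geodata(base_dir: Path = Path(".")) -> Path:
    """Return the path to the GeoPackage file, or raise if it is missing."""
    path = base_dir / DATA_DIR / GPKG_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected {GPKG_NAME} in {base_dir / DATA_DIR}; download it from IBGE first."
        )
    return path


if __name__ == "__main__":
    # Run from the repository root to check the default location.
    try:
        print(f"Found geospatial data at {find_geodata()}")
    except FileNotFoundError as err:
        print(err)
```

Running the script from the repository root prints either the resolved path or a reminder to download the file.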
- Start the Streamlit server:
    streamlit run dashboard.py
- Access the dashboard:
  - Open your web browser
  - Navigate to http://localhost:8501
- Select exam type from dropdown menu
- Choose year using slider
- View geographic distribution and score histograms
- Switch between race and age group visualizations
- Filter by year and state
- Explore correlation with performance
- Select subject area
- Compare difficulty levels across years
- Analyze question characteristics
- Enter number of correct answers for each subject
- Get estimated scores based on ML model
- View prediction confidence intervals
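The prediction step above maps the number of correct answers per subject to estimated scores. The sketch below shows one way this could look with scikit-learn's `MLPRegressor`. It is an illustration under stated assumptions, not the project's trained model: the training data is synthetic, the linear score relation is invented, and the layer sizes and the 45-questions-per-subject range are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: correct answers (assumed range 0-45) in four subjects.
# The dashboard itself uses INEP microdata; these values are made up.
n_samples = 500
correct = rng.integers(0, 46, size=(n_samples, 4)).astype(float)

# Hypothetical score relation plus noise, for illustration only.
scores = 300.0 + 10.0 * correct + rng.normal(0.0, 15.0, size=correct.shape)

# Scale the inputs, then fit one MLP that outputs a score per subject.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0),
)
model.fit(correct, scores)

# One candidate: correct answers in Mathematics, Natural Sciences,
# Languages, Human Sciences (the order is an assumption).
example = np.array([[30, 25, 35, 28]], dtype=float)
predicted = model.predict(example)
print(predicted.shape)  # one row, four estimated subject scores
```

This sketch does not produce confidence intervals; `MLPRegressor` returns point estimates only, so any interval would need an additional technique.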
- Data sourcing from INEP
- Preprocessing and sampling of large datasets
- Geographic data integration with IBGE geospatial data (the GeoPackage file placed in outils/)
- Machine learning model training and validation
- Frontend: Streamlit
- Data Processing: Pandas, NumPy
- Visualization: Plotly, Seaborn
- Machine Learning: Scikit-learn (MLPRegressor)
- Geospatial Analysis: GeoPandas
- Efficient data loading with chunking
- Caching of processed data
- Optimized ML model persistence
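Chunked loading, as listed above, can be sketched with the `chunksize` option of pandas' `read_csv`, which yields the file piece by piece so that aggregates can be built without holding everything in memory. The example is a sketch only: the column names, the `;` separator, and the helper `mean_score_by_state` are assumptions, and a tiny in-memory CSV stands in for the real microdata.

```python
import io

import pandas as pd

# Tiny in-memory CSV standing in for a large microdata file.
# Column names are hypothetical, not the real INEP schema.
csv_text = "state;math_score\nSP;600\nRJ;550\nSP;700\nRJ;650\nMG;500\n"


def mean_score_by_state(source, chunksize=2):
    """Compute per-state mean scores while reading the file in chunks."""
    totals = {}
    counts = {}
    for chunk in pd.read_csv(source, sep=";", chunksize=chunksize):
        # Reduce each chunk to per-state sums and counts before merging.
        grouped = chunk.groupby("state")["math_score"].agg(["sum", "count"])
        for state, row in grouped.iterrows():
            totals[state] = totals.get(state, 0.0) + float(row["sum"])
            counts[state] = counts.get(state, 0) + int(row["count"])
    return {state: totals[state] / counts[state] for state in totals}


means = mean_score_by_state(io.StringIO(csv_text))
print(means)
```

Because only running sums and counts are kept, memory use depends on the chunk size and the number of states rather than on the file size. The aggregated result is also small enough to cache between dashboard reruns.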
- Screenshot: Geographic distribution of ENEM scores
- Screenshot: Demographic distribution across states
- Screenshot: Detailed demographic breakdown
- Screenshot: ML-based score prediction interface
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit changes (git commit -m 'Add AmazingFeature')
- Push to branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Add tests for new features
- Update documentation as needed
- Maintain backwards compatibility
This project is licensed under the MIT License - see the LICENSE file for details.
- INEP for providing ENEM microdata
- IBGE for geospatial data
For questions and feedback, please open an issue on the GitHub repository.