An analysis of school district trends using pandas in a Jupyter Notebook.
The analysis was performed for the school district on the budgets and priorities of 15 schools. It compares various metrics across those schools using the data they provided.
The purpose of this project is to compare the schools based on their student size, school type (District vs. Charter), and grades. However, the school district discovered a miscalculation in the grades of the ninth-grade students at Thomas High School and requested a change to the initial analysis. The change required us to remove the 9th-grade data for Thomas High School and replace it with NaN while keeping the other data intact, and to provide the analysis based on the changed dataset.
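The replacement step can be sketched as follows. This is a minimal illustration rather than the notebook's actual code: the sample rows, the `students` variable, and the grade label `"9th"` are assumptions, and the column names are taken from the field list in this document.

```python
import numpy as np
import pandas as pd

# Hypothetical sample of the student data (values are made up).
students = pd.DataFrame({
    "Student_name": ["A", "B", "C", "D"],
    "School_name": ["Thomas High School", "Thomas High School",
                    "Thomas High School", "Other High School"],
    "Grade": ["9th", "10th", "9th", "9th"],  # assumed grade labels
    "Reading_score": [90.0, 85.0, 70.0, 88.0],
    "Math_score": [80.0, 75.0, 95.0, 60.0],
})

# Select only the 9th graders at Thomas High School.
mask = (
    (students["School_name"] == "Thomas High School")
    & (students["Grade"] == "9th")
)

# Overwrite both score columns with NaN for the selected rows only;
# every other row and column is left untouched.
students.loc[mask, ["Reading_score", "Math_score"]] = np.nan
```

Using a boolean mask with `.loc` keeps the rows in place, so the students are still present in the dataset and only their scores are missing.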
The data was provided by the schools and compiled into two files. The school file contains the following fields:
- School ID
- School_name
- Type
- Size
- Budget

The student file contains the following fields:
- Student ID
- Student_name
- Gender
- Grade
- School_name
- Reading_score
- Math_score
The initial analysis merged the two datasets into one and created a district summary containing the total number of schools and students, the total budget, the average math and reading scores, the passing percentages for math and reading, and the overall passing percentage across schools.
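A district summary of this kind might be computed roughly as shown below. The sample data, the variable names, and the passing threshold of 70 are assumptions made for illustration; the source does not state the threshold that was used.

```python
import pandas as pd

# Hypothetical school and student data (values are made up).
schools = pd.DataFrame({
    "School_name": ["Alpha High", "Beta High"],
    "Type": ["District", "Charter"],
    "Budget": [1000, 500],
})
students = pd.DataFrame({
    "Student ID": [0, 1, 2, 3],
    "School_name": ["Alpha High", "Alpha High", "Beta High", "Beta High"],
    "Reading_score": [80.0, 60.0, 90.0, 75.0],
    "Math_score": [65.0, 72.0, 85.0, 70.0],
})

PASSING_SCORE = 70  # assumed threshold, not taken from the source

# Merge the two files on the shared school name column.
merged = students.merge(schools, on="School_name", how="left")

# Boolean flags per student; a NaN score compares as False here.
passed_math = merged["Math_score"] >= PASSING_SCORE
passed_reading = merged["Reading_score"] >= PASSING_SCORE

district_summary = pd.DataFrame([{
    "Total Schools": merged["School_name"].nunique(),
    "Total Students": merged["Student ID"].count(),
    "Total Budget": schools["Budget"].sum(),
    "Average Math Score": merged["Math_score"].mean(),
    "Average Reading Score": merged["Reading_score"].mean(),
    "% Passing Math": passed_math.mean() * 100,
    "% Passing Reading": passed_reading.mean() * 100,
    "% Overall Passing": (passed_math & passed_reading).mean() * 100,
}])
```

In this sketch the percentages use all students as the denominator. Because a NaN score never meets the threshold, students with missing scores count as not passing unless they are excluded from the denominator first, so that choice should be made deliberately.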
The same analysis was then performed per school. The changes to the Thomas High School data did not affect the school district summary by more than 0.1%.
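One way to produce the per-school version is a `groupby` on the school name. The sketch below uses made-up data, hypothetical output column names, and an assumed passing threshold of 70.

```python
import pandas as pd

# Hypothetical student data (values are made up).
students = pd.DataFrame({
    "School_name": ["Alpha High", "Alpha High", "Beta High", "Beta High"],
    "Reading_score": [80.0, 60.0, 90.0, 75.0],
    "Math_score": [65.0, 72.0, 85.0, 70.0],
})

PASSING_SCORE = 70  # assumed threshold, not taken from the source

# Add one boolean flag per student for each passing criterion.
flags = students.assign(
    passed_math=students["Math_score"] >= PASSING_SCORE,
    passed_reading=students["Reading_score"] >= PASSING_SCORE,
)
flags["passed_both"] = flags["passed_math"] & flags["passed_reading"]

# Named aggregation: the mean of a boolean column is the passing fraction.
per_school_summary = flags.groupby("School_name").agg(
    total_students=("Math_score", "size"),
    average_math=("Math_score", "mean"),
    average_reading=("Reading_score", "mean"),
    pct_passing_math=("passed_math", "mean"),
    pct_passing_reading=("passed_reading", "mean"),
    pct_overall_passing=("passed_both", "mean"),
)

# Convert the passing fractions to percentages.
pct_cols = ["pct_passing_math", "pct_passing_reading", "pct_overall_passing"]
per_school_summary[pct_cols] = per_school_summary[pct_cols] * 100
```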
The per-school summary was sorted by overall passing percentage to find the five top-performing and five bottom-performing schools. The change dropped Thomas High School from the top five to 8th in the list.
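The ranking step could look like the following sketch. The school names and percentages are invented for illustration, and `per_school_summary` and its column name are assumed.

```python
import pandas as pd

# Hypothetical per-school summary (names and values are made up).
per_school_summary = pd.DataFrame(
    {"% Overall Passing": [81.0, 44.0, 62.0, 80.5, 43.5, 79.0, 82.0]},
    index=["School A", "School B", "School C", "School D",
           "School E", "School F", "School G"],
)

# Highest overall passing percentage first, then keep the first five rows.
top_five = per_school_summary.sort_values(
    "% Overall Passing", ascending=False
).head(5)

# Lowest overall passing percentage first, then keep the first five rows.
bottom_five = per_school_summary.sort_values(
    "% Overall Passing", ascending=True
).head(5)
```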
The analysis then gave the average math and reading scores per school. The 9th-grade scores for Thomas High School were replaced with NaN, while the data for the other grades was kept intact.
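Assuming the averages are broken down by grade, a table with one row per school and one column per grade can be built with `groupby` and `unstack`. The data and grade labels below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical student data; the Thomas High School 9th-grade scores are NaN.
students = pd.DataFrame({
    "School_name": ["Thomas High School"] * 4 + ["Other High School"] * 4,
    "Grade": ["9th", "9th", "10th", "10th"] * 2,  # assumed grade labels
    "Math_score": [np.nan, np.nan, 80.0, 90.0, 70.0, 80.0, 60.0, 70.0],
    "Reading_score": [np.nan, np.nan, 85.0, 95.0, 75.0, 85.0, 65.0, 75.0],
})

grade_order = ["9th", "10th"]  # display order for the columns

# Mean score per (school, grade), then pivot grades into columns.
grouped = students.groupby(["School_name", "Grade"])
math_by_grade = grouped["Math_score"].mean().unstack("Grade")[grade_order]
reading_by_grade = grouped["Reading_score"].mean().unstack("Grade")[grade_order]
```

A group whose scores are all NaN has a NaN mean, so the Thomas High School 9th-grade cell stays empty while the other grades are unaffected.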
Average math scores per school
Average reading scores per school
The analysis was performed to show the overall performance of schools based on their spending summary.
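A spending breakdown like this is commonly built by binning the per-student budget with `pd.cut`. The sketch below is illustrative only: the bin edges, labels, and sample values are assumptions, not the ones used in the notebook.

```python
import pandas as pd

# Hypothetical per-school data (values are made up).
per_school = pd.DataFrame(
    {
        "Budget": [600000, 1300000, 1900000, 3200000],
        "Size": [1000, 2000, 3000, 5000],
        "% Overall Passing": [90.0, 88.0, 60.0, 55.0],
    },
    index=["School A", "School B", "School C", "School D"],
)

# Budget per student for each school.
per_school["Per Student Budget"] = per_school["Budget"] / per_school["Size"]

# Assumed bin edges; each bin includes its right edge (pd.cut default).
spending_bins = [0, 620, 640, 700]
spending_labels = ["<=$620", "$620-640", ">$640"]
per_school["Spending Range"] = pd.cut(
    per_school["Per Student Budget"], bins=spending_bins, labels=spending_labels
)

# Average overall passing percentage within each spending range.
spending_summary = per_school.groupby(
    "Spending Range", observed=True
)["% Overall Passing"].mean()
```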
The analysis was performed to show the overall performance of schools based on their size (number of students).
The analysis was performed to show the overall performance of schools based on their school type (District vs. Charter).
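The size and type breakdowns can be sketched in the same style. The size bin edges, the labels, and the sample data are assumptions for illustration.

```python
import pandas as pd

# Hypothetical per-school summary (values are made up).
per_school = pd.DataFrame(
    {
        "Type": ["Charter", "Charter", "District", "District"],
        "Size": [900, 1800, 2900, 4800],
        "Average Math Score": [84.0, 82.0, 77.0, 75.0],
        "% Overall Passing": [90.0, 92.0, 54.0, 56.0],
    },
    index=["School A", "School B", "School C", "School D"],
)

# Assumed size bins; each bin includes its right edge (pd.cut default).
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small", "Medium", "Large"]
per_school["School Size"] = pd.cut(
    per_school["Size"], bins=size_bins, labels=size_labels
)

metrics = ["Average Math Score", "% Overall Passing"]

# Average each metric within a size category and within a school type.
size_summary = per_school.groupby("School Size", observed=True)[metrics].mean()
type_summary = per_school.groupby("Type")[metrics].mean()
```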
The removal of data from the Thomas High School dataset led to five significant changes in the analysis:
- Number of total students
- Number of students counted at Thomas High School
- Average math and reading scores
- Overall percentages for math and reading at Thomas High School
- Top-performing schools
The removal of data implies a decrease in the total student count for this analysis, both overall and at Thomas High School specifically. Because the population decreased, the average scores and passing percentages changed. The most significant change was in the ranking of Thomas High School, which was initially the second-ranked school with an overall passing percentage of 91% and became the eighth-ranked school with an overall passing percentage of 65%. However, the remaining score and percentage changes were minimal, so we can assume that the removal of the math and reading scores of 9th graders at Thomas High School is not as significant as we might have imagined.