This project is the product of a mentorship program run by the WooTech Organization
Mentees : Ankita Jain, Ayushi Agarwal, Kamalpreet Kaur, Zhilin Huang
Mentor : Mohamed Ayoob Nazeem
The NASA Sloan Atlas contains images and parametric features of local galaxies. It is a curated catalogue of nearly 150,000 images. The aim of this project is to gain statistical insights into certain aspects of the data, as specified by the mentor. The mentees will also learn fundamental aspects of machine learning and data science over the course of this capstone project. The project will classify galaxies and analyse their parametric values to uncover additional insights and patterns in the variations and interpolations between features.
Women classifying stars is nothing this world hasn't seen before: it was pioneered by Edward Pickering, a scientist at Harvard University. Much to the ridicule and derision of his colleagues, he put a team of women in charge of cataloguing the stars. These amazing women not only listed and logged the star catalogues but went on to publish papers and invent a system of star classification that is still the basis of today's star classification frameworks. They achieved such feats in astronomy through their perseverance, grit and brilliance. The “Harvard Computers” was among the more decent nicknames they were given. The most notable among these women are Annie Jump Cannon and Williamina Fleming.
Explore the data and perform some EDA (Exploratory Data Analysis) on it. This is particularly important since we have a lot of variables in play and astronomy is a new science for most programmers.
Querying relevant data + Feature engineering + Data pre-processing - Pull Request
Implement Machine Learning models based on the 3 different approaches (Machine Learning, Statistical, Deep Neural Networks) - Pull Request
Evaluate metrics such as the confusion matrix, recall, precision, accuracy and F1 score. Additionally, unit test cases will be written.
Articles on the above three approaches to the problem. Complete Test Cases and start example Jupyter Notebook. - Pull Request
Finalize and complete the leftover work, including the test cases and articles. - Pull Request
Cushioning Week - for any unexpected delays or roadblocks. If there are none, the time will go to refining the documentation and the articles.
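The evaluation metrics named in the plan above can be computed directly from true and predicted labels. The following is a minimal sketch in plain Python, not the project's actual code: the class names ("elliptical", "spiral"), the sample labels and the helper function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels):
    """Compute precision, recall and F1 score for each class."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # true class i, predicted as another class
        fp = sum(row[i] for row in matrix) - tp   # another class, predicted as class i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical galaxy classes and made-up predictions, for illustration only.
labels = ["elliptical", "spiral"]
y_true = ["spiral", "spiral", "elliptical", "elliptical", "spiral", "elliptical"]
y_pred = ["spiral", "elliptical", "elliptical", "spiral", "spiral", "elliptical"]

matrix = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(matrix, labels)

print("confusion matrix:", matrix)
print("accuracy:", round(accuracy(matrix), 3))
for label in labels:
    print(label, {name: round(value, 3) for name, value in metrics[label].items()})
```

Functions this small are also convenient targets for the unit test cases mentioned in the plan, since each metric can be checked against a hand-computed value. In practice a library such as scikit-learn provides equivalent metrics.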
I will briefly touch upon auto-encoders and the causal relationships between variables. Future work could take up, as entirely independent research, the question of causal versus correlational relationships in the astronomical data.