
This repository contains the three-part capstone project for the DTU Data Science course 02450: Introduction to Machine Learning and Data Mining.


seby-sbirna/Coronary-Heart-Disease-Prognosis-and-Diagnosis-using-Machine-Learning-techniques


Coronary heart disease prognosis and diagnosis using Machine Learning techniques: Feature Extraction, Supervised and Unsupervised Learning

By Sebastian Sbirna


Determining the presence of disease is a skill that society has always needed and that, until recently, could only be performed by doctors with extensive training and experience.

Our problem of interest is to take advantage of the computational power available today by applying Machine Learning techniques to patients’ data, in order to detect quickly and accurately whether those patients are suffering from a disease.

For this project, we decided to focus on detecting the presence of coronary heart disease using a dataset provided by the UCI Machine Learning Repository and Kaggle. The dataset’s creators were Andras Janosi, M.D., William Steinbrunn, M.D., Matthias Pfisterer, M.D. and Robert Detrano, M.D.

First, we analysed our dataset using various data visualization and feature extraction methods, of which the most beneficial for our project was PCA (Principal Component Analysis).
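As a rough illustration of the PCA step, the sketch below standardizes a data matrix and projects it onto its leading principal components via the SVD. It is not the project's actual code: the `pca` helper and the synthetic `X` matrix (200 rows, 5 numeric attributes) are assumptions made for this example, and the real Heart Disease attributes would need to be loaded and encoded first.

```python
import numpy as np

def pca(X, n_components):
    """Project standardized data onto its top principal components (hypothetical helper)."""
    # Standardize each attribute to zero mean and unit variance,
    # so that attributes on large scales do not dominate the components.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
    # Fraction of total variance explained by each component.
    explained = S**2 / np.sum(S**2)
    scores = X_std @ Vt[:n_components].T
    return scores, explained

# Synthetic stand-in data (NOT the Heart Disease dataset):
# three correlated attributes plus two independent noise attributes.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
correlated = [latent + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)]
X = np.hstack(correlated + [rng.normal(size=(200, 2))])

scores, explained = pca(X, n_components=2)
print("Explained variance ratios:", np.round(explained, 3))
```

Plotting the first two columns of `scores` against each other, coloured by class label, is one common way to check whether the classes separate in the reduced space.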

Afterwards, we trained various Supervised Learning models on the Heart Disease data and evaluated their performance and characteristics, using Neural Networks, Decision Trees and Logistic Regression, with baselines for model comparison.
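To make the idea of comparing a model against a baseline concrete, here is a minimal sketch that estimates the error rate of a logistic regression classifier and of a majority-class baseline with K-fold cross-validation. Everything in it is an assumption for illustration: the data is synthetic, the helper names (`fit_logistic`, `predict_logistic`, `cross_validate`) are hypothetical, and the logistic regression is a plain gradient-descent implementation rather than whatever the project itself used.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=500):
    """Fit logistic regression weights by batch gradient descent."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))          # predicted probabilities
        w -= lr * Xb.T @ (p - y) / len(y)          # gradient of the mean log-loss
    return w

def predict_logistic(w, X):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return (Xb @ w > 0).astype(int)

def cross_validate(X, y, k=5, seed=0):
    """Return mean test error of the model and of a majority-class baseline."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    model_err, base_err = [], []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_logistic(X[train], y[train])
        model_err.append(np.mean(predict_logistic(w, X[test]) != y[test]))
        # Baseline: always predict the most frequent class in the training fold.
        majority = int(np.mean(y[train]) >= 0.5)
        base_err.append(np.mean(y[test] != majority))
    return float(np.mean(model_err)), float(np.mean(base_err))

# Synthetic binary classification data (NOT the Heart Disease dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=300) > 0).astype(int)

model_error, baseline_error = cross_validate(X, y)
print(f"model error: {model_error:.3f}, baseline error: {baseline_error:.3f}")
```

A model is only worth keeping if its cross-validated error is clearly below the baseline's, which is why the baseline is computed on exactly the same folds.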

Lastly, we investigated the grouping of patient readings and anomaly detection using Unsupervised Learning methods of density estimation and clustering, and used association mining to find frequently occurring patterns in patients' data that indicate disease with high confidence.
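For the association mining part, the two quantities that matter are the support of an itemset and the confidence of a rule. The sketch below computes both on a handful of made-up binarized patient records. The item names (`high_chol`, `exercise_angina`, `disease`) and the records themselves are hypothetical and only illustrate the definitions; they are not results from the project.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A).

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical binarized patient records, one set of present items per patient.
transactions = [
    {"high_chol", "exercise_angina", "disease"},
    {"high_chol", "disease"},
    {"exercise_angina", "disease"},
    {"high_chol"},
    {"high_chol", "exercise_angina", "disease"},
]

rule_support = support(transactions, {"exercise_angina", "disease"})
rule_confidence = confidence(transactions, {"exercise_angina"}, {"disease"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

A rule such as "exercise_angina implies disease" is interesting only when both numbers are high: support shows the pattern is frequent, and confidence shows how often the consequent holds when the antecedent does.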