DECISION TREE CLASSIFICATION BASED ON DIABETES DATASET

INTRODUCTION

The diabetes.csv dataset originally comes from the National Institute of Diabetes and Digestive and Kidney
Diseases. The objective of the dataset is to diagnostically predict whether a patient has diabetes,
based on certain diagnostic measurements included in the dataset. Several constraints were placed
on the selection of these instances from a larger database. In particular, all patients here are females
at least 21 years old of Pima Indian heritage.
diabetes.csv contains several independent variables (medical predictor variables)
and a single dependent target variable (Outcome).

INSTRUCTIONS

1. Use Copilot Chat to create a new notebook in your project. Use the /newnotebook command and name the notebook "Diabetes Tree Classifier".
2. Use Copilot and Copilot Chat to develop the exercise and support your learning.

EXERCISE

The objective of this project is to build a decision tree classifier based on Scikit-learn and
Python. The classifier should be able to predict whether a patient has diabetes or not based on
certain diagnostic measurements included in the dataset.

1. Importing the required libraries to build a decision tree classifier

2. Loading the dataset
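
Steps 1 and 2 could be sketched as follows. This is only an illustration: the helper name load_diabetes_data is hypothetical, and it assumes diabetes.csv sits in the notebook's working directory. The only column name taken from the text above is Outcome.

```python
from pathlib import Path

import pandas as pd


def load_diabetes_data(csv_path="diabetes.csv"):
    """Read the CSV into a DataFrame and check that the target column exists.

    The default path is an assumption about where the file lives.
    """
    path = Path(csv_path)
    if not path.exists():
        raise FileNotFoundError(f"Could not find {path.resolve()}")
    df = pd.read_csv(path)
    if "Outcome" not in df.columns:
        raise ValueError("Expected a target column named 'Outcome'")
    return df


# In the notebook, one would then call:
# df = load_diabetes_data()
```

Failing early on a missing file or a missing Outcome column keeps later steps from breaking with less obvious errors.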

3. Exploratory Data Analysis

3.1. Display the first 5 rows of the dataframe.
3.2. Display the number of rows and columns in the dataframe.
3.3. Display the data types of each column.
3.4. Display the number of missing values in each column.
3.5. Display the number of unique values in each column.
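
A minimal sketch of steps 3.1 to 3.5 using standard pandas calls. The explore helper and the tiny demo frame are made up so the example runs without diabetes.csv; in the notebook, the DataFrame loaded in step 2 would be passed in instead.

```python
import pandas as pd


def explore(df: pd.DataFrame) -> dict:
    """Collect the five summaries from steps 3.1 to 3.5."""
    return {
        "head": df.head(),             # 3.1 first 5 rows
        "shape": df.shape,             # 3.2 (rows, columns)
        "dtypes": df.dtypes,           # 3.3 data type of each column
        "missing": df.isnull().sum(),  # 3.4 missing values per column
        "unique": df.nunique(),        # 3.5 distinct non-missing values per column
    }


# Made-up demo data (not taken from diabetes.csv)
demo = pd.DataFrame({"Glucose": [148, 85, None, 85], "Outcome": [1, 0, 1, 0]})
summary = explore(demo)
for name, value in summary.items():
    print(f"--- {name} ---")
    print(value)
```

Commonly distributed copies of this dataset are reported to encode missing measurements as 0 rather than as empty cells, so a result of zero missing values in step 3.4 may not mean the data is complete. It is worth checking this against your own copy.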

4. Feature Selection

4.1. Split the data into features and target variable.
4.2. Split the data into training and testing sets.
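
Steps 4.1 and 4.2 might look like the sketch below. The 70/30 split, the seed, and the stratified sampling are assumptions chosen for illustration, not requirements of the exercise. The helper name and the demo data are hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_features_target(df, target="Outcome", test_size=0.3, seed=42):
    """Separate features from the target, then split into train and test sets."""
    X = df.drop(columns=[target])  # 4.1 every column except the target
    y = df[target]                 # 4.1 the target variable
    # 4.2 stratify=y keeps the class proportions similar in both sets
    return train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )


# Made-up demo data (not taken from diabetes.csv)
demo = pd.DataFrame({
    "Glucose": range(80, 100),
    "BMI": [20 + i * 0.5 for i in range(20)],
    "Outcome": [0, 1] * 10,
})
X_train, X_test, y_train, y_test = split_features_target(demo)
print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible between notebook runs, which helps when comparing models.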

5. Building the Decision Tree Classifier

5.1. Instantiate the DecisionTreeClassifier class.
5.2. Fit the model to the training data.
5.3. Predict the labels of the test data.
5.4. Evaluate the model.
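
The four steps above can be sketched as one small function. The max_depth value, the seed, and the choice of accuracy plus a confusion matrix as evaluation metrics are assumptions for illustration; the exercise does not prescribe them. The demo data is made up and deliberately easy to separate.

```python
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_and_evaluate(X_train, X_test, y_train, y_test, max_depth=3, seed=42):
    """Fit a decision tree and report simple metrics on the test set."""
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)  # 5.1
    clf.fit(X_train, y_train)                                             # 5.2
    y_pred = clf.predict(X_test)                                          # 5.3
    metrics = {                                                           # 5.4
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    return clf, y_pred, metrics


# Made-up demo data (not taken from diabetes.csv): two well-separated groups
demo = pd.DataFrame({
    "Glucose": list(range(80, 100)) + list(range(150, 170)),
    "Outcome": [0] * 20 + [1] * 20,
})
X = demo.drop(columns=["Outcome"])
y = demo["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
clf, y_pred, metrics = train_and_evaluate(X_train, X_test, y_train, y_test)
print("Accuracy:", metrics["accuracy"])
print(metrics["confusion_matrix"])
```

Limiting max_depth is one common way to reduce overfitting in decision trees. On the real dataset, expect accuracy well below what this toy data produces.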

6. Visualizing the Decision Tree
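
One lightweight way to inspect a fitted tree is scikit-learn's export_text, which prints the learned rules as indented text and needs no plotting library. The sketch below assumes a tree trained on made-up data; in the notebook, the classifier from step 5 and the real feature names would be used instead.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up demo data (not taken from diabetes.csv)
demo = pd.DataFrame({
    "Glucose": list(range(80, 100)) + list(range(150, 170)),
    "Outcome": [0] * 20 + [1] * 20,
})
X = demo[["Glucose"]]
y = demo["Outcome"]

clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X, y)

# Render the split rules as plain text, one line per node
rules = export_text(clf, feature_names=list(X.columns))
print(rules)
```

For a graphical view, sklearn.tree.plot_tree draws the same structure with matplotlib, which would need to be installed separately.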

7. Conclusion
