diabetes.csv is originally from the National Institute of Diabetes and Digestive and Kidney
Diseases. The objective of the dataset is to diagnostically predict whether a patient has diabetes,
based on certain diagnostic measurements included in the dataset. Several constraints were placed
on the selection of these instances from a larger database. In particular, all patients here are females
at least 21 years old of Pima Indian heritage.
diabetes.csv contains several variables: a set of independent variables
(the medical predictors) and a single dependent target variable (Outcome).
1. Use Copilot Chat to create a new notebook in your project. Use the /newnotebook command and name the notebook "Diabetes Tree Classifier".
2. Use Copilot and Copilot Chat to develop the exercise and support your learning.
The objective of this project is to build a decision tree classifier using scikit-learn and
Python. The classifier should be able to predict whether a patient has diabetes or not based on
certain diagnostic measurements included in the dataset.
3.1. Display the first 5 rows of the dataframe.
3.2. Display the number of rows and columns in the dataframe.
3.3. Display the data types of each column.
3.4. Display the number of missing values in each column.
3.5. Display the number of unique values in each column.
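Steps 3.1 to 3.5 could be sketched with pandas as follows. This is only one possible approach: the helper name summarize is hypothetical, and the commented usage assumes that diabetes.csv sits in the notebook's working directory.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect the summaries asked for in steps 3.1 to 3.5 (hypothetical helper)."""
    return {
        "head": df.head(5),            # 3.1 first 5 rows
        "shape": df.shape,             # 3.2 (number of rows, number of columns)
        "dtypes": df.dtypes,           # 3.3 data type of each column
        "missing": df.isnull().sum(),  # 3.4 missing values per column
        "unique": df.nunique(),        # 3.5 unique (non-missing) values per column
    }


# Usage in the notebook (assumes diabetes.csv is in the working directory):
# df = pd.read_csv("diabetes.csv")
# for name, result in summarize(df).items():
#     print(f"--- {name} ---")
#     print(result)
```

Note that isnull() only counts values pandas reads as missing (NaN); it does not flag placeholder values that a dataset may use in their place.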
4.1. Split the data into features and target variable.
4.2. Split the data into training and testing sets.
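A minimal sketch of steps 4.1 and 4.2, assuming the target column is named Outcome as described above. The helper name split_data, the 80/20 split, the fixed random_state, and the stratified sampling are illustrative choices, not requirements of the exercise.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_data(df: pd.DataFrame, target: str = "Outcome",
               test_size: float = 0.2, random_state: int = 42):
    """Hypothetical helper covering steps 4.1 and 4.2."""
    # 4.1 Separate the predictor columns (features) from the target column.
    X = df.drop(columns=[target])
    y = df[target]
    # 4.2 Hold out a test set; stratify=y keeps the class balance similar
    # in both parts (an assumption, the exercise does not ask for it).
    return train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )


# Usage in the notebook (assumes df was loaded from diabetes.csv):
# X_train, X_test, y_train, y_test = split_data(df)
```

Fixing random_state makes the split reproducible, which helps when comparing results between runs of the notebook.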
5.1. Instantiate the DecisionTreeClassifier class.
5.2. Fit the model to the training data.
5.3. Predict the labels of the test data.
5.4. Evaluate the model.
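Steps 5.1 to 5.4 might look like the sketch below. The helper name train_and_evaluate is hypothetical, and because the exercise does not say how to evaluate the model, the accuracy score, confusion matrix, and classification report used here are assumptions about what a reasonable evaluation could include.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.tree import DecisionTreeClassifier


def train_and_evaluate(X_train, X_test, y_train, y_test, random_state=42):
    """Hypothetical helper covering steps 5.1 to 5.4."""
    # 5.1 Instantiate the classifier (default hyperparameters, fixed seed).
    clf = DecisionTreeClassifier(random_state=random_state)
    # 5.2 Fit the model to the training data.
    clf.fit(X_train, y_train)
    # 5.3 Predict the labels of the test data.
    y_pred = clf.predict(X_test)
    # 5.4 Evaluate the predictions against the true test labels.
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred, zero_division=0),
    }
    return clf, y_pred, metrics


# Usage in the notebook (assumes the train/test split from step 4 exists):
# clf, y_pred, metrics = train_and_evaluate(X_train, X_test, y_train, y_test)
# print(metrics["accuracy"])
# print(metrics["confusion_matrix"])
# print(metrics["report"])
```

An unconstrained decision tree tends to fit the training data very closely, so it may be worth comparing training and test accuracy and experimenting with parameters such as max_depth.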