Doxa: Normalized

Friday, September 22, 2023

Normalized

The last three tutorials in Codemy's Pandas for Machine Learning series involve working with the diabetes dataset. It all seems very interesting at first glance, but not so fast. The data has been normalized!

Model-based and sequential feature selection — scikit-learn 1.3.1 documentation

* * *

The **sklearn diabetes dataset** is a popular dataset in machine learning, often used to develop and test algorithms. It consists of **442 samples** with **10 features** ¹²: age, sex, body mass index (BMI), average blood pressure, and six blood serum measurements ¹. The target variable is a quantitative measure of disease progression one year after baseline ¹.

* * *