Friday, September 22, 2023

Normalized

The last three tutorials in Codemy's Pandas for Machine Learning series involve working with the diabetes dataset. It all seems very interesting at first glance, but not so fast. The data has been normalized!!

Model-based and sequential feature selection — scikit-learn 1.3.1 documentation

                                                         *     *     *

The **sklearn diabetes dataset** is a popular dataset used in machine learning. It consists of **442 samples** with **10 features** ¹². The dataset is often used to develop and test machine learning algorithms. Each sample has **10 different attributes**, such as age, sex, body mass index (BMI), average blood pressure, and six blood serum measurements ¹. The target variable is a quantitative measure of disease progression one year after baseline ¹.
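To see the normalization for myself, here is a minimal sketch, assuming the data comes from scikit-learn's `load_diabetes` (the variable names are my own). According to the scikit-learn docs, each feature column is mean-centered and scaled so that its sum of squares is 1, so that is what the check looks for.

```python
import numpy as np
from sklearn.datasets import load_diabetes

# The default call returns the scaled (normalized) features.
X, y = load_diabetes(return_X_y=True)

print(X.shape)  # (442, 10)

# Each column should be mean-centered ...
col_means = X.mean(axis=0)
# ... and scaled so that its sum of squares is 1.
col_sum_sq = (X ** 2).sum(axis=0)

print(np.allclose(col_means, 0.0, atol=1e-6))
print(np.allclose(col_sum_sq, 1.0, atol=1e-6))
```

That explains why age, sex and the rest all show up as small decimals around zero instead of readable numbers.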

                                                         *     *     *

                                                                   

Still, I'm sticking with it...

                                                    *     *     *

Found the raw data, but I'm having a hard time finding out whether sex 1 is male or female...





Microsoft's breakdown of data types:



First five:






                                                     
  
