The Blue Nowhere: [scikit-learn] datasets

Friday, November 25, 2022

[scikit-learn] datasets

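I recently started taking Hsuan-Tien Lin's Machine Learning course (National Taiwan University) on Coursera. The second week focuses on PLA
(Perceptron Learning Algorithm), so I wanted to implement PLA in code.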

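The first thing needed is a linearly separable dataset, but generating one by hand is tedious. scikit-learn provides
datasets, a collection of ready-made datasets for machine learning users.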

Take the iris plants dataset as an example. It provides measurements of iris flowers:

https://scikit-learn.org/stable/datasets/toy_dataset.html

Sepal length (cm)
Sepal width (cm)
Petal length (cm)
Petal width (cm)
Class: Setosa, Versicolour, and Virginica

There are 50 samples for each of the three iris species, 150 samples in total. The data is loaded with load_iris:

https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html
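
As a quick check of the numbers above, here is a minimal sketch of loading the dataset and inspecting it. The variable names (iris, X, y, counts) are my own choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target  # features and integer class labels (0, 1, 2)

print(X.shape)             # (150, 4): 150 samples, 4 features
print(iris.feature_names)  # the four measurements listed above
print(iris.target_names)   # ['setosa' 'versicolor' 'virginica']

counts = np.bincount(y)    # number of samples per class
print(counts)              # [50 50 50]
```

Note that load_iris spells the second class "versicolor", while the dataset description uses "Versicolour".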

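scikit-learn's example "The Iris Dataset" shows how the data is distributed. The dataset can be used to practice the linearly
separable case, and also to practice the Pocket Algorithm in the linearly non-separable case.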
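
Both cases can be sketched roughly as follows. This is only an illustration under my own assumptions, not the course's reference implementation: the function names (add_bias, count_errors, pla, pocket), the cyclic visiting order, the epoch limits, and the choice of the two petal features are all mine. It relies on the fact, noted in scikit-learn's dataset description, that one class is linearly separable from the other two while the latter two are not linearly separable from each other.

```python
import numpy as np
from sklearn.datasets import load_iris


def add_bias(X):
    """Prepend the constant feature x0 = 1 so the threshold lives in w[0]."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def count_errors(w, Xb, y):
    """Number of points with y * (w . x) <= 0 (a zero score counts as an error)."""
    return int(np.sum(y * (Xb @ w) <= 0))


def pla(X, y, max_epochs=10000):
    """Cyclic PLA: update on every misclassified point until one clean pass."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:   # misclassified (or on the boundary)
                w = w + yi * xi      # PLA correction step
                updated = True
        if not updated:              # a full pass with no mistakes
            return w, True
    return w, False


def pocket(X, y, max_epochs=100):
    """Same updates as PLA, but keep the weights with the fewest errors seen."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    best_w, best_err = w.copy(), count_errors(w, Xb, y)
    for _ in range(max_epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:
                w = w + yi * xi
                err = count_errors(w, Xb, y)
                if err < best_err:   # better than what is in the pocket
                    best_w, best_err = w.copy(), err
    return best_w, best_err


iris = load_iris()
petals = iris.data[:, 2:4]  # petal length and petal width (my choice of features)

# Separable case: setosa (+1) against the other two species (-1).
y_sep = np.where(iris.target == 0, 1, -1)
w_sep, converged = pla(petals, y_sep)
sep_errors = count_errors(w_sep, add_bias(petals), y_sep)
print("PLA converged:", converged, "errors:", sep_errors)

# Non-separable case: versicolor (+1) against virginica (-1).
mask = iris.target != 0
X_ns = petals[mask]
y_ns = np.where(iris.target[mask] == 1, 1, -1)
w_ns, pocket_err = pocket(X_ns, y_ns)
print("Pocket errors:", pocket_err, "out of", len(y_ns))
```

The pocket version here re-counts the errors over the whole dataset after every update, which is simple but slow on large data; for 100 points it does not matter.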

Plotting the data distribution with matplotlib:
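
The original figure is not preserved here, so the following is only a sketch of what such a plot might look like. Plotting sepal length against sepal width is my assumption, and the output file name iris_sepal.png is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

fig, ax = plt.subplots()
# One scatter call per species so each gets its own color and legend entry.
for label, name in enumerate(iris.target_names):
    ax.scatter(X[y == label, 0], X[y == label, 1], label=name)

ax.set_xlabel(iris.feature_names[0])  # sepal length (cm)
ax.set_ylabel(iris.feature_names[1])  # sepal width (cm)
ax.set_title("Iris dataset")
ax.legend()
fig.savefig("iris_sepal.png")  # hypothetical file name
```

To view the plot interactively instead, drop the matplotlib.use("Agg") line and call plt.show() in place of fig.savefig.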
