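Machine Learning Study Notes Series (30): Ensemble Voting Regressor and Isotonic Regression

5 min read · Apr 6, 2021

In the previous post we introduced the Ensemble Voting Classifier and used Logistic Regression to fit its predicted values, which gave us the probability of each sample belonging to each class.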

Ensemble Voting Regressor

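After a quick look at how Kaggle and scikit-learn define VotingRegressor, it works much like the Ensemble Averaging Regressor we covered earlier: once the sub-models are trained, we simply average the predictions of all the sub-models.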
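The averaging step described above can be sketched in a few lines. This is a minimal illustration and not the code from the linked repository: the two sub-models (polynomial fits of degree 1 and 5), the synthetic sine data, and the helper name `voting_predict` are all assumptions made for the example.

```python
import numpy as np

# Synthetic 1-D regression data (an assumption for this sketch).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Two hypothetical sub-models: least-squares polynomial fits of degree 1 and 5.
sub_models = [np.polyfit(x, y, deg) for deg in (1, 5)]


def voting_predict(x_new):
    """Average the predictions of all trained sub-models."""
    preds = np.stack([np.polyval(coef, x_new) for coef in sub_models])
    return preds.mean(axis=0)


x_test = np.array([0.1, 0.5, 0.9])
y_vote = voting_predict(x_test)
print(y_vote)
```

scikit-learn's `VotingRegressor` follows the same idea and additionally accepts optional per-model weights; the plain mean here corresponds to equal weights.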

Python Sample Code:

GitHub: tomohiroliu22/Machine-Learning-Algorithm (github.com)

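Yes, that is all there is to it. So that this post is not too short, let's go back to the earlier classification problem. In the previous chapter we produced probabilities with Platt Scaling. In 2002, two great scientists, Zadrozny and Elkan, published a paper:

Transforming classifier scores into accurate multiclass probability estimates

It uses isotonic regression to fit the outputs predicted by our classifier.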

Isotonic Regression

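Before we get to this regression method, let's first review linear regression. Its optimization problem is

$$\min_{h} \sum_{i=1}^{n} \left( y_i - h(x_i) \right)^2$$

In other words, we want to find a function h such that, when we plug in the data points x, the result differs from the actual values y as little as possible. For isotonic regression, the optimization problem is

$$\min_{\beta} \sum_{i=1}^{n} \left( y_i - \beta_i \right)^2$$

subject to the constraint

$$\beta_1 \le \beta_2 \le \cdots \le \beta_n$$

Note in particular that the original data x_i must already be sorted in ascending order, with β_i the fitted value at x_i. This is the familiar quadratic programming (QP) problem: the objective is quadratic in β and the constraints are linear. Solving it gives us the β values.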
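One standard way to write the isotonic problem as a QP is sketched below. The symbols Q, c and A are assumptions introduced for illustration and may differ from the notation expected by any particular solver.

```latex
\begin{aligned}
\min_{\beta \in \mathbb{R}^n} \quad & \tfrac{1}{2}\, \beta^{\top} Q \beta + c^{\top} \beta \\
\text{s.t.} \quad & A \beta \le 0, \\[4pt]
Q = 2 I_n, \qquad c = -2y, \qquad
A &=
\begin{pmatrix}
1 & -1 &        &        &    \\
  & 1  & -1     &        &    \\
  &    & \ddots & \ddots &    \\
  &    &        & 1      & -1
\end{pmatrix}
\in \mathbb{R}^{(n-1) \times n}
\end{aligned}
```

Expanding the squared error gives \beta^{\top}\beta - 2 y^{\top}\beta + y^{\top}y; dropping the constant y^{\top}y yields Q and c above, and row i of A\beta \le 0 reads \beta_i - \beta_{i+1} \le 0.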

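We can see that in isotonic regression the predicted value only increases or stays flat as x increases. Clearly, though, this method is not very useful for solving regression problems, because it overfits badly.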

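It does, however, work well for the task just mentioned: fitting the probability that a data point belongs to each class. Here the input to isotonic regression is the classifier score f(x), the labels are t ∈ {0, 1}, and the result we get is:

Figure: blue points are classified correctly (p > 0.5); red points are classified incorrectly (p <= 0.5).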
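To make the calibration step concrete, here is a minimal sketch that fits isotonic regression to classifier scores and 0/1 labels. It uses the Pool Adjacent Violators algorithm (PAVA) instead of a QP solver; PAVA solves the same constrained least-squares problem, but it is swapped in here for brevity and is not necessarily what the linked repository does. The scores, the labels, and the helper names `pava` and `calibrate` are made up for illustration, and the linear interpolation between fitted points is a design choice of this sketch.

```python
import numpy as np


def pava(y):
    """Pool Adjacent Violators: least-squares nondecreasing fit to y."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every point in a block receives that block's mean.
    return np.concatenate([np.full(c, s / c) for s, c in blocks])


# Hypothetical classifier scores f(x) and true labels t in {0, 1}.
scores = np.array([-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
labels = np.array([0, 0, 1, 0, 0, 1, 1, 1, 1])

# Isotonic regression assumes the inputs are sorted in ascending order.
order = np.argsort(scores)
sorted_scores = scores[order]
prob_sorted = pava(labels[order])


def calibrate(new_scores):
    """Map new scores to probabilities by interpolating the fitted values."""
    return np.interp(new_scores, sorted_scores, prob_sorted)


print(prob_sorted)
print(calibrate(np.array([-3.0, 0.25, 3.0])))
```

The fitted probabilities never decrease as the score increases, which is exactly the ordering constraint on β; scores outside the training range are clipped to the first or last fitted value by `np.interp`.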

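The advantage of isotonic regression over Platt Scaling is that it can be plugged straight into a QP solver, whereas Platt Scaling requires computing derivatives so that it can be fitted with gradient descent. It is also more intuitive conceptually.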

Python Sample Code:

GitHub: tomohiroliu22/Machine-Learning-Algorithm (github.com)

Reference:

[1] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

[2] Zadrozny, B., & Elkan, C. (2002, July). Transforming classifier scores into accurate multiclass probability estimates. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 694–699).


Written by 劉智皓 (Chih-Hao Liu)
