Write a Function to Implement a Regularization-Based Training Procedure
Write a function to implement a regularization-based training procedure for 1D regression. The function should have the following form: function a = mytrain(train_x, train_y, n, lamda), where train_x is a column vector containing all the input points in the training set, train_y is a column vector containing the corresponding output points, n is the order of the polynomial model, lamda is a non-negative regularization parameter, and a is a column vector containing the coefficients of the polynomial model.

Write a function to implement the testing procedure for 1D regression. The function should have the following form: function test_y = mytest(test_x, a), where test_x is a column vector containing all the input points in the test set, a is a column vector containing the coefficients of the polynomial model, and test_y is a column vector containing the predicted output points.

Create a script to do the following experiment:
- Create 10 input points for training, evenly distributed between 0 and 1.
- Create the 10 corresponding output points for training as cos(2*pi*x) + 0.3*randn(10,1).
- Create 100 input points for testing, evenly distributed between 0 and 1.
- Generate models for lamda = 0 and n = 1, 2, 3, 4, 5, 6, 7, 8, 9, respectively.
- Generate models for n = 3 and lamda = 1e-8, 1e-5, 1e-2, 1, respectively.
- Generate models for n = 9 and lamda = 1e-8, 1e-5, 1e-2, 1, respectively.
- For each model, superimpose the testing results onto cos(2*pi*x).

Write a report to summarize and discuss your results.
Implementation of Regularization-Based Training for 1D Regression
This paper details the development and implementation of a regularization-based training procedure for one-dimensional (1D) polynomial regression. The goal is to accurately model the relationship between input features and target outputs using polynomial models of varying degrees, while incorporating regularization to prevent overfitting and improve generalization. The approach involves creating functions for training and testing the models and conducting systematic experiments to evaluate different configurations of polynomial degree and regularization strength.
Introduction
Polynomial regression is a widely used method in supervised learning for modeling nonlinear relationships between input variables and output responses. Traditional polynomial fitting can lead to overfitting, especially with high-degree polynomials, resulting in poor predictive performance on unseen data. Regularization techniques, such as Ridge regression, introduce a penalty term based on coefficient magnitude, which constrains the model parameters and enhances robustness.
Methodology
Training Function: mytrain
The core of the training procedure involves fitting a polynomial of degree n to the training data, with a Tikhonov regularization term weighted by lambda. In mathematical terms, the coefficients a are obtained by minimizing the regularized least squares objective:
\[ \min_{a} \|\mathbf{X}a - \mathbf{y}\|^{2} + \lambda \|a\|^{2} \]
where \(\mathbf{X}\) is the design matrix constructed from the training inputs, with each row representing a data point and each column representing a polynomial feature. The solution is obtained explicitly via the normal equations:
\[ a = (\mathbf{X}^{T}\mathbf{X} + \lambda \mathbf{I})^{-1} \mathbf{X}^{T} \mathbf{y} \]
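This closed form follows by setting the gradient of the regularized objective with respect to \(a\) to zero, a standard derivation reproduced here for completeness:
\[ \nabla_{a}\left( \|\mathbf{X}a - \mathbf{y}\|^{2} + \lambda \|a\|^{2} \right) = 2\mathbf{X}^{T}(\mathbf{X}a - \mathbf{y}) + 2\lambda a = 0 \quad\Longrightarrow\quad (\mathbf{X}^{T}\mathbf{X} + \lambda \mathbf{I})\,a = \mathbf{X}^{T}\mathbf{y} \]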
Testing Function: mytest
The testing function computes the predicted outputs for new input data using the polynomial coefficients:
\[ \hat{\mathbf{y}} = \mathbf{Z} a \]
where \(\mathbf{Z}\) is the design matrix constructed from test inputs similar to the training design matrix.
Implementation
Training Function
function a = mytrain(train_x, train_y, n, lamda)
    % Construct the Vandermonde (design) matrix for polynomial degree n
    X = zeros(length(train_x), n+1);
    for i = 0:n
        X(:, i+1) = train_x.^i;
    end
    % Regularized least squares solution via the normal equations
    a = (X'*X + lamda*eye(n+1)) \ (X'*train_y);
end
Testing Function
function test_y = mytest(test_x, a)
    n = length(a) - 1;
    % Construct the design matrix for the test data
    Z = zeros(length(test_x), n+1);
    for i = 0:n
        Z(:, i+1) = test_x.^i;
    end
    % Predict outputs
    test_y = Z * a;
end
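Before running the full experiments, the two functions can be exercised together as a quick sanity check. This is a minimal illustration, not part of the assignment script; the degree and regularization values below are arbitrary choices.
x = linspace(0, 1, 10)';
y = cos(2*pi*x) + 0.3*randn(10,1);
a = mytrain(x, y, 3, 1e-5);              % cubic fit with mild regularization
yhat = mytest(linspace(0, 1, 100)', a);  % predictions on a dense test grid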
Experiments
Data Generation
Ten evenly spaced training points between 0 and 1 are generated as the training inputs (train_x):
train_x = linspace(0, 1, 10)';
The corresponding training outputs (train_y) are generated by adding Gaussian noise to the true function:
train_y = cos(2*pi*train_x) + 0.3*randn(10,1);
For testing, 100 points are generated similarly:
test_x = linspace(0, 1, 100)';
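Two optional lines, not required by the assignment, make the experiment easier to reproduce and plot: fixing the random seed (which must be done before train_y is generated) and precomputing the noise-free reference curve on the test grid. The seed value here is an arbitrary choice.
rng(0);                      % arbitrary seed; call before generating train_y
true_y = cos(2*pi*test_x);   % noise-free reference curve for the plots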
Model Fitting and Evaluation
Models are fitted for various degrees (n) and regularization parameters (\(\lambda\)). For each configuration, the model's predictions on test data are superimposed onto the true function \(\cos(2\pi x)\) to visually evaluate fit quality.
Specifically, models are trained under the configurations:
- n = 1 to 9 with \(\lambda = 0\)
- n = 3 with \(\lambda = 10^{-8}, 10^{-5}, 10^{-2}, 1\)
- n = 9 with \(\lambda = 10^{-8}, 10^{-5}, 10^{-2}, 1\)
Each model's predictions are plotted alongside the true \(\cos(2\pi x)\) for visual comparison and analysis.
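A minimal sketch of the driver script follows, assuming mytrain.m and mytest.m are on the MATLAB path and that train_x, train_y, and test_x have been generated as above; the subplot layouts and titles are illustrative choices rather than requirements of the assignment.
% Experiment 1: lamda = 0, n = 1..9
figure;
for n = 1:9
    a = mytrain(train_x, train_y, n, 0);
    subplot(3, 3, n);
    plot(test_x, cos(2*pi*test_x), 'b-', ...   % true function
         test_x, mytest(test_x, a), 'r--', ... % model prediction
         train_x, train_y, 'ko');              % noisy training points
    title(sprintf('n = %d, \\lambda = 0', n));
end

% Experiments 2 and 3: fixed degree, varying lamda
lambdas = [1e-8, 1e-5, 1e-2, 1];
for n = [3, 9]
    figure;
    for k = 1:length(lambdas)
        a = mytrain(train_x, train_y, n, lambdas(k));
        subplot(2, 2, k);
        plot(test_x, cos(2*pi*test_x), 'b-', ...
             test_x, mytest(test_x, a), 'r--', ...
             train_x, train_y, 'ko');
        title(sprintf('n = %d, \\lambda = %g', n, lambdas(k)));
    end
end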
Results and Discussion
The experiments demonstrate that models with low-degree polynomials tend to underfit, capturing only the broad trend of the data, while higher-degree polynomials can overfit, especially with small regularization. Regularization effectively controls overfitting, as seen in the smoothing of predictions for higher degrees when \(\lambda\) is increased. For very small \(\lambda\), models closely follow the data, risking overfitting, whereas large \(\lambda\) values create overly smooth estimates, potentially underfitting. The choice of polynomial degree and regularization parameter significantly affects the model's ability to generalize, emphasizing the importance of model selection and tuning.
Conclusion
This study confirms that regularized polynomial regression effectively balances bias and variance, leading to improved predictive performance on unseen data. The explicit implementation of the training and testing functions provides a flexible framework for exploring different configurations. Future work could extend this approach to non-polynomial basis functions or incorporate cross-validation techniques for automated hyperparameter tuning.