Summary:
Training functions:
Levenberg-Marquardt: a combination of Gradient Descent and Gauss-Newton. When mu is small, it is closer to Gauss-Newton; when mu is large, it is closer to Gradient Descent (a sketch of one update step follows this list). It is sensitive to local minima.
Bayesian Regularization: addresses the overfitting problem.
Quasi-Newton: generally converges faster than Gradient Descent.
Backpropagation: a supervised learning method.
Conjugate Gradient: good for large numbers of predictors and observations, both linear and non-linear.
Gradient Descent: sensitive to local minima.
Gauss-Newton: good at finding the global minimum, with good convergence.
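To make the role of mu concrete, here is a minimal sketch of the Levenberg-Marquardt update for a generic least-squares problem, solving (J^T J + mu*I) d = -J^T r. The exponential-fit example, the variable names, and the mu-adaptation constants are illustrative assumptions, not taken from this post.

```python
import numpy as np

def lm_step(J, r, mu):
    """One Levenberg-Marquardt update: solve (J^T J + mu*I) d = -J^T r.

    mu -> 0 recovers the Gauss-Newton step; large mu pushes the step
    toward -J^T r / mu, i.e. a small Gradient Descent step."""
    A = J.T @ J + mu * np.eye(J.shape[1])
    return np.linalg.solve(A, -J.T @ r)

# Toy problem (hypothetical): fit y = a*exp(b*x) to noisy data.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(0.5 * x) + 0.05 * rng.standard_normal(x.size)

theta = np.array([1.0, 0.0])  # initial guess for (a, b)
mu = 1e-2
for _ in range(50):
    a, b = theta
    r = a * np.exp(b * x) - y                                    # residuals
    J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])  # dr/d(a,b)
    new_theta = theta + lm_step(J, r, mu)
    new_r = new_theta[0] * np.exp(new_theta[1] * x) - y
    # Standard LM heuristic: accept the step and shrink mu if the error
    # decreased; otherwise reject it and grow mu (more Gradient-Descent-like).
    if np.sum(new_r ** 2) < np.sum(r ** 2):
        theta, mu = new_theta, mu * 0.7
    else:
        mu *= 2.0

print(theta)  # should approach (2.0, 0.5)
```

The mu-update heuristic is what makes LM sensitive to local minima: once the error stops decreasing near a basin, growing mu only shortens the descent step rather than escaping the basin.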
The following are their results.
[Result plots for LM, Bayesian Regularization, Quasi-Newton, Backpropagation, Conjugate Gradient, and Gradient Descent not reproduced here.]
From the above six algorithms, we can see that Backpropagation is the best one, but it is still not good enough. Since the results are not good, I did not plot the six subplots (500 observations individually).
Tomorrow, I will try to find methods to improve them.
Comment: How will you find methods to improve them? At the same time, you need to read the literature on work similar to yours.
Reply: Ok, I have searched for some, but I have not found any yet.