#1 | rotw (Mechanical) | May 25, 2013
Hi there,
I just wanted to share some impressions after an extensive immersion in artificial neural networks in the context of regression analysis. It is true that neural nets are a powerful tool, but like any tool they have limitations. I feel I still have a lot to learn in this field; nevertheless, I would like to outline some of the reservations and opinions I have come to on this subject. I would be happy if this triggers further discussion/debate.
- Neural network as a "black box". It took me some time to understand what the reference to a black box truly means, and I have my own interpretation of it. A neural network is a black box because nobody can truly predict how the network will behave when the topology parameters are adjusted; the whole thing is simply unpredictable. A neural net with 2 units per hidden layer may do a good job where one with 3 units per hidden layer would overlearn the problem. Now pick a new random seed and the reverse happens: the net with 3 units per hidden layer is the one that performs better (the first code sketch after this list illustrates this seed sensitivity). This is terrible.
- The learning method. It is incredible how many update methods exist for adjusting the network weights during training (I refer especially to feedforward / backpropagation nets). What is disappointing is that repeatability is very poor: solve a problem with quickprop and again with batch backpropagation and you get different outcomes, ranging from small discrepancies to quite tangible differences for the same problem setup. Move to Levenberg-Marquardt or to second-order methods such as scaled conjugate gradient and each time there is a new outcome again. Change the tuning of the learning parameters (momentum, learning rate, etc.) and here again the results differ. Regularization techniques such as weight decay and the like do not produce the same results either, depending on the selected update method (the second sketch after this list shows the same effect with the solvers and tuning knobs of one common library).
- Overfitting. Neural networks need to be configured with the strictly necessary number of layers/units so as not to overfit a problem. This is key. But try to find the right balance without sacrificing (sometimes seriously) the accuracy: good luck. Even hours of work on preconditioning the data or doing principal component analysis, for example, can be of limited help, while being time consuming and a drain on resources. Overfitting is ugly. You can have all the calibration indicators looking okay (RMSE, max/min differences, etc.) and still get a disastrous prediction at some points that are WITHIN the training range, which immediately prompts an "OMG" type of reaction (the third sketch after this list shows the usual validation-based way of picking the network size).
- Estimating confidence intervals. It is quite a complicated procedure to estimate these intervals, and in the end they only reflect the mathematical parametrization of the network; they tell nothing about the physics itself. They can be misleading too (e.g. false positives). In addition, splitting your data into training, validation and test sets can be costly if you do not have enough data at hand; sometimes it is a luxury, I would say (the third sketch below relies on exactly that kind of three-way split).
- Randomness. Take the same network architecture and apply it to train and then predict on the same problem / data set. Each time you get a different outcome because of the random weight-initialization procedure. If you set up a network/model once and never touch it again, maybe that is fine. But if you want a network integrated as part of a prediction tool, you will never get the same weight distribution twice, and as a consequence a different set of predicted values each time (unless the seed is pinned, as in the first sketch below). Go explain this to an operator or to management who are ignorant of the peculiarities of neural nets: here too, good luck.
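To make the seed-sensitivity and randomness points concrete, here is a minimal sketch. It assumes Python with scikit-learn's MLPRegressor and a made-up 1-D toy problem, neither of which is what I actually worked with; it is only meant to illustrate the behavior. The same two small topologies are trained with different seeds and their ranking can flip; pinning random_state is also what makes a deployed model repeatable.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Hypothetical 1-D toy problem standing in for the real data set.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Same topologies (2 vs 3 units per hidden layer), different seeds:
# the ranking of the two architectures can flip from one seed to the next.
for seed in (1, 2, 3):
    for units in (2, 3):
        net = MLPRegressor(hidden_layer_sizes=(units, units), solver="lbfgs",
                           max_iter=5000, random_state=seed)
        net.fit(X_tr, y_tr)
        rmse = mean_squared_error(y_te, net.predict(X_te)) ** 0.5
        print(f"seed={seed}  units per layer={units}  test RMSE={rmse:.4f}")

# Fixing random_state (and the data split) is what gives the same weights,
# and therefore the same predictions, every time the tool is rebuilt.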
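On the learning-method point: the library assumed below does not offer quickprop, Levenberg-Marquardt or scaled conjugate gradient, so this sketch just swaps between the solvers it does have ("lbfgs", "adam", "sgd") and varies the learning rate, momentum and weight-decay term (alpha). It is a stand-in for the methods I mentioned, not my actual setup, but the same problem comes back with visibly different errors.

from itertools import product

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Same hypothetical toy data as in the previous sketch.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Different optimizers and learning parameters, same data, same topology.
settings = [
    {"solver": "lbfgs"},
    {"solver": "adam", "learning_rate_init": 0.001},
    {"solver": "adam", "learning_rate_init": 0.01},
    {"solver": "sgd", "learning_rate_init": 0.01, "momentum": 0.9},
    {"solver": "sgd", "learning_rate_init": 0.01, "momentum": 0.5},
]
for alpha, kwargs in product((1e-4, 1e-2), settings):  # alpha = L2 weight decay
    net = MLPRegressor(hidden_layer_sizes=(3, 3), alpha=alpha,
                       max_iter=5000, random_state=1, **kwargs)
    net.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, net.predict(X_te)) ** 0.5
    print(f"alpha={alpha:g}  {kwargs}  test RMSE={rmse:.4f}")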
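And on the overfitting and data-splitting points, the usual (if data-hungry) defence is exactly the three-way split I complained about: pick the network size on a validation set and keep an untouched test set for the final check. Again a sketch on the same assumed toy data, not a recipe:

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Same hypothetical toy data once more.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)

# Three-way split: train / validation (to pick the size) / test (final check).
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=0)

best_units, best_val_rmse = None, float("inf")
for units in (1, 2, 3, 5, 10, 20):
    net = MLPRegressor(hidden_layer_sizes=(units,), solver="lbfgs",
                       max_iter=5000, random_state=1)
    net.fit(X_train, y_train)
    val_rmse = mean_squared_error(y_val, net.predict(X_val)) ** 0.5
    if val_rmse < best_val_rmse:
        best_units, best_val_rmse = units, val_rmse

# Refit the chosen size and report the held-out test error; a large gap between
# training and test RMSE is the overfitting signature described above.
final = MLPRegressor(hidden_layer_sizes=(best_units,), solver="lbfgs",
                     max_iter=5000, random_state=1)
final.fit(X_train, y_train)
test_rmse = mean_squared_error(y_test, final.predict(X_test)) ** 0.5
print(f"chosen units: {best_units}  test RMSE: {test_rmse:.4f}")

None of this removes the black-box objection, of course; it only keeps the bookkeeping honest.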
I find neural networks useful for dealing with very, very complicated data structures, as long as they are used as a tool to aid engineers/scientists in finding patterns and understanding how systems work; that is great. I would refrain from relying on a neural network in an engineering system design whose failure could have consequences for safety and an impact on the public (injury/harm) and/or property, on the grounds that, as far as it goes, it is truly a black box.
Sorry if I have offended the experts in the field; that is really not my intent. I am just sharing modest impressions (and yes, some are personal disappointments) as someone who continues to learn the method and tries to make the most out of it.