Training a Sigmoidal Node Is Hard

Don R. Hush
1999 Neural Computation  
This paper proves that the task of computing near-optimal weights for sigmoidal nodes under the L1 regression norm is NP-Hard. For the special case where the sigmoid is piecewise-linear, we prove a slightly stronger result, namely that computing the optimal weights is NP-Hard. These results parallel those for the one-node pattern recognition problem, namely that determining the optimal weights for a threshold logic node is also intractable. Our results have important consequences for constructive algorithms that build a regression model one node at a time. They suggest that although such methods are (in principle) capable of producing efficient-size representations (e.g., see Barron (1993), Jones (1992)), finding such representations may be computationally intractable. These results hold only in the deterministic sense; that is, they do not exclude the possibility that such representations may be found efficiently with high probability. In fact, this motivates the use of heuristic and/or randomized algorithms for this problem.
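For concreteness, the following is a minimal sketch of the single-node L1 training objective that the hardness results concern. The notation (samples x_i, targets y_i, weights w, bias b, sigmoid sigma) is assumed for illustration and is not taken verbatim from the paper.

    % L1 (absolute-error) training objective for one sigmoidal node.
    % Notation (x_i, y_i, w, b, \sigma) is illustrative, not the paper's own.
    \[
      \min_{w \in \mathbb{R}^d,\; b \in \mathbb{R}} \;
      \sum_{i=1}^{m} \bigl| \sigma(w^{\top} x_i + b) - y_i \bigr|,
      \qquad
      \sigma(t) = \frac{1}{1 + e^{-t}}
    \]
    % The abstract's results state that computing near-optimal (w, b) for this
    % objective is NP-Hard, and that for a piecewise-linear \sigma even exact
    % optimization is NP-Hard.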
doi:10.1162/089976699300016449