
Publications about 'recurrent neural networks'
Articles in journal or book chapters
  1. J. Hanson, M. Raginsky, and E.D. Sontag. Learning recurrent neural net models of nonlinear systems. Proc. of Machine Learning Research, 144:1-11, 2021. [PDF] Keyword(s): machine learning, empirical risk minimization, recurrent neural networks, dynamical systems, continuous time, system identification, statistical learning theory, generalization bounds.
    Abstract:
    This paper considers the following learning problem: given sample pairs of input and output signals generated by an unknown nonlinear system (which is not assumed to be causal or time-invariant), one wishes to find a continuous-time recurrent neural net, with activation function tanh, that approximately reproduces the underlying i/o behavior with high confidence. Leveraging earlier work on matching derivatives of the input and output signals up to a finite order, the problem is reformulated in familiar system-theoretic language, and quantitative guarantees on the sup-norm risk of the learned model are derived in terms of the number of neurons, the sample size, the number of derivatives being matched, and the regularity properties of the inputs, the outputs, and the unknown i/o map.
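    A minimal numerical sketch of this setup (not the authors' algorithm) is the following Python fragment: it Euler-integrates a recurrent net of the assumed form x' = tanh(Ax + Bu), y = Cx, and evaluates the empirical sup-norm risk over a sample of input/output trajectories; all names, dimensions, and the Euler step size are illustrative assumptions.

        import numpy as np

        def simulate_ctrnn(A, B, C, u, dt=1e-2):
            """Euler integration of the assumed model x' = tanh(Ax + Bu), y = Cx, from x(0) = 0."""
            x = np.zeros(A.shape[0])
            outputs = []
            for u_t in u:                     # u: sampled input signal, shape (T, m)
                x = x + dt * np.tanh(A @ x + B @ u_t)
                outputs.append(C @ x)
            return np.array(outputs)          # sampled output signal, shape (T, p)

        def empirical_sup_risk(A, B, C, samples, dt=1e-2):
            """Worst sup-norm mismatch of the net over the sampled i/o pairs."""
            return max(np.max(np.abs(simulate_ctrnn(A, B, C, u, dt) - y))
                       for u, y in samples)   # samples: list of (input, output) arrays

    Minimizing such an empirical risk over (A, B, C) is one concrete instance of the empirical risk minimization problem the abstract refers to; the paper's guarantees bound the gap between empirical and true sup-norm risk.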


  2. E.D. Sontag and Y. Qiao. Further results on controllability of recurrent neural networks. Systems Control Lett., 36(2):121-129, 1999. [PDF] Keyword(s): machine learning, controllability, recurrent neural networks, neural networks.
    Abstract:
    This paper studies controllability properties of recurrent neural networks. The new contributions are: (1) an extension of the result in "Complete controllability of continuous-time recurrent neural networks" to a slightly different model, where inputs appear in an affine form, (2) a formulation and proof of a necessary and sufficient condition, in terms of local-local controllability, and (3) a complete analysis of the 2-dimensional case for which the hypotheses made in previous work do not apply.


  3. P. Koiran and E.D. Sontag. Vapnik-Chervonenkis dimension of recurrent neural networks. Discrete Appl. Math., 86(1):63-79, 1998. [PDF] [doi:http://dx.doi.org/10.1016/S0166-218X(98)00014-6] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    This paper provides lower and upper bounds for the VC dimension of recurrent networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. Ignoring multiplicative constants, the main results say roughly the following: 1. For architectures whose activation is any fixed nonlinear polynomial, the VC dimension is proportional to wk. 2. For architectures whose activation is any fixed piecewise polynomial, the VC dimension is between wk and w^2 k. 3. For architectures with threshold activations, the VC dimension is between w log(k/w) and the smaller of wk log(wk) and w^2 + w log(wk). 4. For the standard sigmoid tanh(x), the VC dimension is between wk and w^4 k^2.
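    For quick reference, the bounds stated above (up to multiplicative constants, with w the number of weights and k the input-sequence length) can be written as:

        \begin{align*}
        \text{fixed polynomial activation:} &\quad \mathrm{VCdim} \asymp wk \\
        \text{fixed piecewise-polynomial activation:} &\quad wk \lesssim \mathrm{VCdim} \lesssim w^{2}k \\
        \text{threshold activation:} &\quad w\log(k/w) \lesssim \mathrm{VCdim} \lesssim \min\{wk\log(wk),\; w^{2}+w\log(wk)\} \\
        \text{standard sigmoid } \tanh: &\quad wk \lesssim \mathrm{VCdim} \lesssim w^{4}k^{2}
        \end{align*}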


  4. E.D. Sontag. A learning result for continuous-time recurrent neural networks. Systems Control Lett., 34(3):151-158, 1998. [PDF] [doi:http://dx.doi.org/10.1016/S0167-6911(98)00006-1] Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.
    Abstract:
    The following learning problem is considered, for continuous-time recurrent neural networks having sigmoidal activation functions. Given a "black box" representing an unknown system, measurements of output derivatives are collected, for a set of randomly generated inputs, and a network is used to approximate the observed behavior. It is shown that the number of inputs needed for reliable generalization (the sample complexity of the learning problem) is upper bounded by an expression that grows polynomially with the dimension of the network and logarithmically with the number of output derivatives being matched.


  5. P. Koiran and E.D. Sontag. Vapnik-Chervonenkis dimension of recurrent neural networks. In Computational learning theory (Jerusalem, 1997), volume 1208 of Lecture Notes in Comput. Sci., pages 223-237. Springer-Verlag, London, UK, 1997. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  6. E.D. Sontag. Recurrent neural networks: Some systems-theoretic aspects. In M. Karny, K. Warwick, and V. Kurkova, editors, Dealing with Complexity: a Neural Network Approach, pages 1-12. Springer-Verlag, London, 1997. [PDF] Keyword(s): machine learning, neural networks, recurrent neural networks, learning, VC dimension.
    Abstract:
    This paper provides an exposition of some recent results regarding system-theoretic aspects of continuous-time recurrent (dynamic) neural networks with sigmoidal activation functions. The class of systems is introduced and discussed, and a result is cited regarding their universal approximation properties. Known characterizations of controllability, observability, and parameter identifiability are reviewed, as well as a result on minimality. Facts regarding the computational power of recurrent nets are also mentioned.


  7. R. Koplon and E.D. Sontag. Using Fourier-neural recurrent networks to fit sequential input/output data. Neurocomputing, 15:225-248, 1997. [PDF] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    This paper suggests the use of Fourier-type activation functions in fully recurrent neural networks. The main theoretical advantage is that, in principle, the problem of recovering internal coefficients from input/output data is solvable in closed form.


  8. E.D. Sontag and H.J. Sussmann. Complete controllability of continuous-time recurrent neural networks. Systems Control Lett., 30(4):177-183, 1997. [PDF] [doi:http://dx.doi.org/10.1016/S0167-6911(97)00002-9] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    This paper presents a characterization of controllability for the class of control systems commonly called (continuous-time) recurrent neural networks. The characterization involves a simple condition on the input matrix, and is proved when the activation function is the hyperbolic tangent.


  9. B. DasGupta and E.D. Sontag. Sample complexity for learning recurrent perceptron mappings. IEEE Trans. Inform. Theory, 42(5):1479-1487, 1996. [PDF] Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.
    Abstract:
    Recurrent perceptron classifiers generalize the usual perceptron model. They correspond to linear transformations of input vectors obtained by means of "autoregressive moving-average schemes", or infinite impulse response filters, and make it possible to take into account the correlations and dependences among input coordinates that arise from linear digital filtering. This paper provides tight bounds on the sample complexity associated with fitting such models to experimental data. The results are expressed in the context of the theory of probably approximately correct (PAC) learning.
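    As an illustration of the model class (under the assumption that a "recurrent perceptron" thresholds the output of a learned IIR/ARMA filter driven by the input sequence; all names and coefficient conventions are hypothetical), here is a short Python sketch:

        import numpy as np

        def recurrent_perceptron(u, a, b):
            """Classify the input sequence u by running it through an IIR (ARMA) filter
            with feedback coefficients a and feedforward coefficients b, then taking
            the sign of the final output, as in an ordinary perceptron decision."""
            y = np.zeros(len(u))
            for t in range(len(u)):
                ar = sum(a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
                ma = sum(b[j] * u[t - j] for j in range(len(b)) if t - j >= 0)
                y[t] = ar + ma
            return 1 if y[-1] >= 0 else -1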


  10. H. T. Siegelmann and E.D. Sontag. On the computational power of neural nets. J. Comput. System Sci., 50(1):132-150, 1995. [PDF] [doi:http://dx.doi.org/10.1006/jcss.1995.1013] Keyword(s): machine learning, neural networks, recurrent neural networks, analog computing, theory of computing, computational complexity, super-Turing computation.
    Abstract:
    This paper deals with finite size networks which consist of interconnections of synchronously evolving processors. Each processor updates its state by applying a "sigmoidal" function to a rational-coefficient linear combination of the previous states of all units. We prove that one may simulate all Turing Machines by such nets. In particular, one can simulate any multi-stack Turing Machine in real time, and there is a net made up of 886 processors which computes a universal partial-recursive function. Products (high order nets) are not required, contrary to what had been stated in the literature. Non-deterministic Turing Machines can be simulated by non-deterministic rational nets, also in real time. The simulation result has many consequences regarding the decidability, or more generally the complexity, of questions about recursive nets.
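    A minimal sketch of one synchronous update of such a net, assuming the saturated-linear "sigmoidal" activation of the linear-saturated nets referred to in the companion complexity papers and exact rational weights (all names and shapes are illustrative):

        from fractions import Fraction

        def saturate(z):
            """Saturated-linear activation: clip the argument to [0, 1]."""
            return max(Fraction(0), min(Fraction(1), z))

        def synchronous_step(states, inputs, W, U, c):
            """Each processor applies the saturation to a rational-coefficient affine
            combination of all previous states and the current external inputs."""
            n = len(states)
            return [saturate(sum(W[i][j] * states[j] for j in range(n))
                             + sum(U[i][k] * inputs[k] for k in range(len(inputs)))
                             + c[i])
                    for i in range(n)]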


  11. F. Albertini and E.D. Sontag. State observability in recurrent neural networks. Systems Control Lett., 22(4):235-244, 1994. [PDF] [doi:http://dx.doi.org/10.1016/0167-6911(94)90054-X] Keyword(s): machine learning, neural networks, recurrent neural networks, observability, identifiability.
    Abstract:
    This paper concerns recurrent networks x'=s(Ax+Bu), y=Cx, where s is a sigmoid, in both discrete time and continuous time. Our main result is that observability can be characterized, if one assumes certain conditions on the nonlinearity and on the system, in a manner very analogous to that of the linear case. Recall that for the latter, observability is equivalent to the requirement that there not be any nontrivial A-invariant subspace included in the kernel of C. We show that the result generalizes in a natural manner, except that one now needs to restrict attention to certain special "coordinate" subspaces.
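    The linear benchmark recalled here can be checked with the standard Kalman rank test; the sketch below illustrates only that linear criterion, not the paper's recurrent-net result (which further restricts attention to coordinate subspaces).

        import numpy as np

        def linear_pair_observable(A, C):
            """The pair (A, C) is observable, i.e. no nontrivial A-invariant subspace is
            contained in ker C, iff [C; CA; ...; CA^(n-1)] has full column rank n."""
            n = A.shape[0]
            O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
            return np.linalg.matrix_rank(O) == n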


  12. H. T. Siegelmann and E.D. Sontag. Analog computation via neural networks. Theoret. Comput. Sci., 131(2):331-360, 1994. [PDF] [doi:http://dx.doi.org/10.1016/0304-3975(94)90178-3] Keyword(s): analog computing, neural networks, computational complexity, super-Turing computation, recurrent neural networks.
    Abstract:
    We consider recurrent networks with real-valued weights. If allowed exponential time for computation, they turn out to have unbounded power. However, under polynomial-time constraints there are limits on their capabilities, though they remain more powerful than Turing Machines. Moreover, there is a precise correspondence between nets and standard non-uniform circuits with equivalent resources, and as a consequence one has lower bound constraints on what they can compute. We note that these networks are not likely to solve NP-hard problems in polynomial time, as the equality "P=NP" in our model implies the almost complete collapse of the standard polynomial hierarchy. We show that a large class of different networks and dynamical system models have no more computational power than this neural (first-order) model with real weights. The results suggest the following Church-like Thesis of Time-bounded Analog Computing: "Any reasonable analog computer will have no more power (up to polynomial time) than first-order recurrent networks."


  13. F. Albertini, E.D. Sontag, and V. Maillot. Uniqueness of weights for neural networks. In R. Mammone, editor, Artificial Neural Networks for Speech and Vision, pages 115-125. Chapman and Hall, London, 1993. [PDF] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    In this short expository survey, we sketch various known facts about uniqueness of weights in neural networks, including results about recurrent nets, and we provide a new and elementary complex-variable proof of a uniqueness result that applies in the single hidden layer case.


  14. E.D. Sontag. Neural networks for control. In H. L. Trentelman and J. C. Willems, editors, Essays on control: perspectives in the theory and its applications (Groningen, 1993), volume 14 of Progr. Systems Control Theory, pages 339-380. Birkhäuser Boston, Boston, MA, 1993. Note: A longer version (tech report with more details) is here: http://sontaglab.org/FTPDIR/neural-nets-siemens.pdf. [PDF] Keyword(s): neural networks, recurrent neural networks, machine learning.
    Abstract:
    This paper gives an expository introduction to two related topics: (a) some mathematical results regarding "neural networks", and (b) so-called "neurocontrol" and "learning control" (each part can be read independently of the other). It was prepared for a short course given at the 1993 European Control Conference.


  15. F. Albertini and E.D. Sontag. For neural networks, function determines form. Neural Networks, 6(7):975-990, 1993. [PDF] Keyword(s): machine learning, neural networks, identifiability, recurrent neural networks, realization theory, observability.
    Abstract:
    This paper shows that the weights of continuous-time feedback neural networks x'=s(Ax+Bu), y=Cx (where s is a sigmoid) are uniquely identifiable from input/output measurements. Under very weak genericity assumptions, the following is true: given two nets whose neurons all have the same nonlinear activation function s, if the two nets have equal behaviors as "black boxes", then they must have the same number of neurons and, except possibly for sign reversals at each node, the same weights. Moreover, even if the activations are not a priori known to coincide, they are shown to be also essentially determined from the external measurements.
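    The sign-reversal symmetry mentioned here can be checked numerically. The sketch below assumes the activation s = tanh (which is odd) and an Euler discretization from x(0) = 0; conjugating the weights by a diagonal sign matrix T (A -> TAT, B -> TB, C -> CT) then leaves the input/output behavior unchanged. All names and sizes are illustrative.

        import numpy as np

        def run(A, B, C, u, dt=1e-2):
            """Euler simulation of x' = tanh(Ax + Bu), y = Cx, from x(0) = 0."""
            x = np.zeros(A.shape[0])
            out = []
            for u_t in u:
                x = x + dt * np.tanh(A @ x + B @ u_t)
                out.append(C @ x)
            return np.array(out)

        rng = np.random.default_rng(0)
        n, m, p, steps = 4, 2, 1, 200
        A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
        C = rng.standard_normal((p, n)); u = rng.standard_normal((steps, m))

        T = np.diag([1.0, -1.0, 1.0, -1.0])   # flip the sign of two internal nodes
        y1 = run(A, B, C, u)
        y2 = run(T @ A @ T, T @ B, C @ T, u)  # different weights, same i/o behavior
        print(np.max(np.abs(y1 - y2)))        # ~ 0 up to floating-point error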


  16. H. T. Siegelmann and E.D. Sontag. Turing computability with neural nets. Appl. Math. Lett., 4(6):77-80, 1991. [PDF] Keyword(s): machine learning, neural networks, computational complexity, recurrent neural networks.
    Abstract:
    This paper shows the existence of a finite neural network, made up of sigmoidal neurons, which simulates a universal Turing machine. It is composed of fewer than 100,000 synchronously evolving processors, interconnected linearly. High-order connections are not required. (Note: this paper was placed here by special request. The results in this paper have by now been improved considerably: see the JCSS paper, which among other aspects provides a polynomial-time simulation. This paper, based on a unary encoding, results in an exponential slowdown.)


Conference articles
  1. E.D. Sontag and Y. Qiao. Remarks on controllability of recurrent neural networks. In Proc. IEEE Conf. Decision and Control, Tampa, Dec. 1998, IEEE Publications, pages 501-506, 1998. Keyword(s): machine learning, neural networks, recurrent neural networks.


  2. E.D. Sontag. Some learning and systems-theoretic questions regarding recurrent neural networks. In Proc. Conf. on Information Sciences and Systems (CISS 97), Johns Hopkins, Baltimore, MD, March 1997, pages 630-635, 1997. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  3. B. DasGupta and E.D. Sontag. Sample complexity for learning recurrent perceptron mappings. In D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 204-210. MIT Press, Cambridge, MA, 1996. Keyword(s): machine learning, neural networks, VC dimension, recurrent neural networks.


  4. R. Koplon and E.D. Sontag. Techniques for parameter reconstruction in Fourier-Neural recurrent networks. In Proc. IEEE Conf. Decision and Control, Orlando, Dec. 1994, IEEE Publications, pages 213-218, 1994. Keyword(s): machine learning, neural networks, recurrent neural networks.


  5. F. Albertini and E.D. Sontag. Identifiability of discrete-time neural networks. In Proc. European Control Conf., Groningen, June 1993, pages 460-465, 1993. Keyword(s): machine learning, neural networks, recurrent neural networks.


  6. F. Albertini and E.D. Sontag. State observability in recurrent neural networks. In Proc. IEEE Conf. Decision and Control, San Antonio, Dec. 1993, IEEE Publications, pages 3706-3707, 1993. Keyword(s): machine learning, neural networks, observability, recurrent neural networks.


  7. F. Albertini and E.D. Sontag. Uniqueness of weights for recurrent nets. In Systems and Networks: Mathematical Theory and Applications, Proc. MTNS '93, Vol. 2, Akad. Verlag, Regensburg, pages 599-602, 1993. Note: Full version, never submitted for publication, is here: http://sontaglab.org/FTPDIR/93mtns-nn-extended.pdf. [PDF] Keyword(s): machine learning, neural networks, identifiability, recurrent neural networks.
    Abstract:
    This paper concerns recurrent networks x'=s(Ax+Bu), y=Cx, where s is a sigmoid, in both discrete time and continuous time. The paper establishes parameter identifiability under stronger assumptions on the activation than in "For neural networks, function determines form", but on the other hand deals with arbitrary (nonzero) initial states.


  8. J. L. Balcázar, R. Gavaldà, H. T. Siegelmann, and E.D. Sontag. Some structural complexity aspects of neural computation. In Proceedings of the Eighth Annual Structure in Complexity Theory Conference (San Diego, CA, 1993), Los Alamitos, CA, pages 253-265, 1993. IEEE Comput. Soc. Press. [PDF] Keyword(s): machine learning, analog computing, neural networks, computational complexity, super-Turing computation, theory of computing and complexity.
    Abstract:
    Recent work by H.T. Siegelmann and E.D. Sontag (1992) has demonstrated that polynomial time on linear saturated recurrent neural networks equals polynomial time on standard computational models: Turing machines if the weights of the net are rationals, and nonuniform circuits if the weights are real. Here, further connections between the languages recognized by such neural nets and other complexity classes are developed. Connections to space-bounded classes, simulation of parallel computational models such as Vector Machines, and a discussion of the characterizations of various nonuniform classes in terms of Kolmogorov complexity are presented.


  9. H.T. Siegelmann and E.D. Sontag. Analog computation via neural networks. In Proc. 2nd Israel Symposium on Theory of Computing and Systems (ISTCS93), IEEE Computer Society Press, 1993. Keyword(s): analog computing, neural networks, computational complexity, super-Turing computation, recurrent neural networks.


  10. F. Albertini and E.D. Sontag. For neural networks, function determines form. In Proc. IEEE Conf. Decision and Control, Tucson, Dec. 1992, IEEE Publications, pages 26-31, 1992. Keyword(s): machine learning, neural networks, recurrent neural networks.


  11. H.T. Siegelmann and E.D. Sontag. On the computational power of neural nets. In COLT '92: Proceedings of the fifth annual workshop on Computational learning theory, New York, NY, USA, pages 440-449, 1992. ACM Press. [doi:http://doi.acm.org/10.1145/130385.130432] Keyword(s): analog computing, neural networks, computational complexity, super-Turing computation, recurrent neural networks.


  12. H.T. Siegelmann and E.D. Sontag. Some results on computing with neural nets. In Proc. IEEE Conf. Decision and Control, Tucson, Dec. 1992, IEEE Publications, pages 1476-1481, 1992. Keyword(s): analog computing, neural networks, computational complexity, super-Turing computation, recurrent neural networks.


  13. H.T. Siegelmann, E.D. Sontag, and C.L. Giles. The Complexity of Language Recognition by Neural Networks. In Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture - Information Processing '92, Volume 1, pages 329-335, 1992. North-Holland. Keyword(s): machine learning, neural networks, computational complexity, recurrent neural networks, theory of computing and complexity.


  14. E.D. Sontag. Neural nets as systems models and controllers. In Proc. Seventh Yale Workshop on Adaptive and Learning Systems, Yale University, 1992, pages 73-79, 1992. [PDF] Keyword(s): machine learning, neural networks, recurrent neural networks.
    Abstract:
    A conference paper, placed here by request; it contains little that is not also contained in the survey on neural nets listed above.


  15. E.D. Sontag. Systems combining linearity and saturations, and relations to neural nets. In M. Fliess, editor, Nonlinear Control Systems Design 1992, IFAC Symposia Series, Pergamon Press, Oxford, pages 15-21, 1993. Note: Also in Proc. Nonlinear Control Systems Design Symp., Bordeaux, June 1992, M. Fliess, Ed., IFAC Publications, pp. 242-247. Keyword(s): machine learning, neural networks, recurrent neural networks.







Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders.



