"A Brief Introduction to Neural Networks" is bilingual now

After the beta and gamma versions, the delta version of the manuscript was published in February 2008. Many thanks to all the readers who sent me remarks. Special thanks go to Beate Kuhl: she managed to translate the whole manuscript! Thus, it is now available in both German and English!

As usual, the manuscript is available on the corresponding subpage about neural networks. The following extensions of the manuscript are planned:

  • Further enhancements of the overall arrangement (IMHO, there are too many small chapters).
  • The biology chapter will be added (chapter number 2 is reserved for it). It's practically finished.
  • A chapter “Evolution of Neural Networks” will be added, introducing both evolutionary optimization strategies and their application to neural nets. There are lots of evolutionary operators that allow not only for adjusting the synaptic weights of the net, but also for growing network topologies to fit your needs.

A Brief Introduction to Neural Networks

Manuscript Download - Zeta2 Version

Filenames are subject to change. Thus, if you place links, please do so with this subpage as target.

          Original version         eBookReader optimized
English   PDF, 6.2MB, 244 pages    PDF, 6.1MB, 286 pages
German    PDF, 6.2MB, 256 pages    PDF, 6.2MB, 296 pages

Original Version? eBookReader Version?

The original version is the two-column layout you are used to. The eBookReader optimized version, on the other hand, has a one-column layout. In addition, headers, footers and marginal notes have been removed.

For print, the eBookReader version is obviously less attractive. It lacks the nice layout and reading features and takes up a lot more pages. On electronic readers, however, the simpler layout significantly reduces the scrolling effort.

During every release process from now on, the eBookReader version is going to be generated automatically from the original content. However, unlike the original version, it does not get an additional manual layout and typography tuning cycle in the release workflow. So concerning the aesthetics of the eBookReader optimized version, do not expect any support :-)

Further Information for Readers

Provide Feedback!

This manuscript relies very much on your feedback to improve. As you can see from the many helpers mentioned in my frontmatter, I really appreciate and make use of the feedback I receive from readers. If you have any complaints, bug fixes, suggestions, or praise :-) send me an email or leave a comment in the newly added discussion section at the bottom of this page. You can be sure to get a response.

How to Cite this Manuscript

There's no official publisher, so you need to be careful with your citation. For now, use this:

David Kriesel, 2007, A Brief Introduction to Neural Networks, available at

This reference is, of course, for the English version. Please look at the German translation of this page to find the German reference.

Please always include the URL – it's the only unique identifier of the text (for now)! Note that the edition name is deliberately left out: it changes with every new edition, and Google Scholar and Citeseer both have trouble with fast-changing editions. If you prefer BibTeX:

@Book{ Kriesel2007NeuralNetworks,
       author = { David Kriesel },
       title =  { A Brief Introduction to Neural Networks },
       year =   { 2007 },
       url =    { available at }
}

Again, this reference is for the English version.

Terms of Use

Starting with the Epsilon edition, the text is licensed under the Creative Commons Attribution-No Derivative Works 3.0 Unported License, except for some small portions of the work that are licensed under more liberal licenses, as mentioned in the frontmatter or throughout the text. Note that this license does not extend to the source files used to produce the document. Those are still mine.


To round off the manuscript, there is still some work to do. In general, I want to add the following aspects:

  1. Implementation and SNIPE: While I was editing the manuscript, I was also implementing SNIPE, a high-performance framework for using neural networks with Java. This has to be brought in line with the manuscript: I'd like to place remarks (e.g. “This feature is implemented in method XXX in SNIPE”) all over the manuscript. Moreover, an extensive discussion chapter on the efficient implementation of neural networks will be added. Thus, SNIPE can serve as a reference implementation for the manuscript, and vice versa.
  2. Evolving neural networks: I want to add a nice chapter on evolving neural networks (which is, for example, one of the focuses of SNIPE, too). Evolving simply means growing populations of neural networks in an evolution-inspired way, including topology and synaptic weights; this also works with recurrent neural networks.
  3. Hints for practice: In chapters 4 and 5, I'm still missing lots of practice hints (e.g. how to preprocess learning data, and other hints particularly concerning MLPs).
  4. Smaller issues: A short section about resilient propagation and some more algorithms would be great in chapter 5. The chapter about recurrent neural networks could be extended. Some references are still missing. A small chapter about echo state networks would be nice.

I think this is it … :-) As you can see, there's still a bit of work to do until I call the manuscript “finished”. All in all, it will be less work than I have already done. However, it will take several further releases until everything is included.

Recent News

As of the manuscript's Epsilon version, update information is published in news articles whose headlines you find right below. Please click on any news title to get the information.

What are Neural Networks, and what are the Manuscript Contents?

Neural networks are a bio-inspired data processing mechanism that enables computers to learn in a way technically similar to a brain, and even to generalize once solutions to enough problem instances have been taught.

The manuscript “A Brief Introduction to Neural Networks” is divided into several parts, which are in turn split into chapters. The contents of each chapter are summarized below.

Part I: From Biology to Formalization -- Motivation, Philosophy, History and Realization of Neural Models

Introduction, Motivation and History

How to teach a computer? You can either write a rigid program – or you can enable the computer to learn on its own. Living beings don't have any programmer writing a program for developing their skills, which only has to be executed. They learn by themselves – without the initial experience of external knowledge – and thus can solve problems better than any computer today. What qualities are needed to achieve such behavior in devices like computers? Can such cognition be adapted from biology? History, development, decline and resurgence of a broad approach to solving problems.

Biological Neural Networks

How do biological systems solve problems? How does a system of neurons work? How can we understand its functionality? What are different quantities of neurons able to do? Where in the nervous system is information processed? A short biological overview of the complexity of simple elements of neural information processing, followed by some thoughts about their simplification in order to adapt them technically.

Components of Artificial Neural Networks

Formal definitions and colloquial explanations of the components that realize the technical adaptations of biological neural networks. Initial descriptions of how to combine these components into a neural network.

How to Train a Neural Network?

Approaches and thoughts on how to teach machines. Should neural networks be corrected? Should they only be encouraged? Or should they even learn without any help? Thoughts about what we want to change during the learning procedure and how we will change it, about the measurement of errors and when we have learned enough.

Part II: Supervised learning Network Paradigms

The Perceptron

A classic among the neural networks. If we talk about a neural network, then in the majority of cases we speak about a perceptron or a variation of it. Perceptrons are multi-layer networks without recurrence and with fixed input and output layers. Description of a perceptron, its limits, and extensions that are meant to overcome these limits. Derivation of learning procedures and discussion of their problems.
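
To make the idea of a layered network without recurrence concrete, here is a minimal sketch in Python. It is not taken from the manuscript or from SNIPE: the function names and the hand-picked weights are assumptions made purely for illustration. It shows a forward pass through one hidden layer of threshold neurons that computes XOR, a function a single threshold neuron cannot represent.

```python
def step(net_input):
    """Binary threshold activation: output 1 if the net input is positive."""
    return 1 if net_input > 0 else 0


def layer(inputs, weights, biases):
    """Propagate inputs through one fully connected layer of threshold neurons."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked weights (hypothetical, chosen for illustration, not learned).
HIDDEN_W = [[1, 1], [1, 1]]
HIDDEN_B = [-0.5, -1.5]   # first hidden neuron acts as OR, second as AND
OUTPUT_W = [[1, -1]]
OUTPUT_B = [-0.5]         # fires for "OR but not AND", i.e. XOR


def xor_net(a, b):
    """Forward pass: input layer -> hidden layer -> output neuron."""
    hidden = layer([a, b], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))
```

In practice the weights would be found by a learning procedure rather than set by hand; the sketch only illustrates the feedforward structure.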

Radial Basis Functions

RBF networks approximate functions by stretching and compressing Gaussians and then summing them spatially shifted. Description of their functions and their learning process. Comparison with multi-layer perceptrons.
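
The summing of shifted and scaled Gaussians can be sketched as follows. This is a minimal illustration under assumptions, not the manuscript's notation: the names `centers`, `widths` and `weights` are hypothetical, the input is one-dimensional, and the parameters are fixed by hand instead of being learned.

```python
import math


def gaussian(x, center, width):
    """Gaussian bell centered at `center`; `width` stretches or compresses it."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))


def rbf_output(x, centers, widths, weights):
    """Network output: weighted sum of the spatially shifted Gaussians."""
    return sum(
        w * gaussian(x, c, s)
        for c, s, w in zip(centers, widths, weights)
    )


if __name__ == "__main__":
    # Two hypothetical RBF neurons: a positive bump at 0, a negative one at 3.
    centers = [0.0, 3.0]
    widths = [1.0, 0.5]
    weights = [2.0, -1.0]
    for x in (0.0, 1.5, 3.0):
        print(x, rbf_output(x, centers, widths, weights))
```

Each hidden neuron contributes one bell curve; the weight sets its height and sign, the width its spread, and the center its position.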

Recurrent Multi-layer Perceptrons

Some thoughts about networks with internal states. Learning approaches using such networks, overview of their dynamics.

Hopfield Networks

In a magnetic field, each particle applies a force to any other particle so that all particles adjust their movements in the energetically most favorable way. This natural mechanism is copied to adjust noisy inputs in order to match their real models.
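
The following minimal sketch illustrates how a noisy input can be pulled back to a stored pattern. It assumes the common textbook formulation (states of +1 and -1, Hebbian weight storage, asynchronous updates), which may differ in detail from the manuscript; the function names are hypothetical.

```python
def train(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, sweeps=5):
    """Update neurons one at a time until the state settles (fixed sweeps)."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            net = sum(weights[i][j] * state[j] for j in range(n))
            state[i] = 1 if net >= 0 else -1
    return state


if __name__ == "__main__":
    stored = [1, -1, 1, -1, 1, -1]
    noisy = [1, 1, 1, -1, 1, -1]   # second component flipped
    w = train([stored])
    print(recall(w, noisy))
```

With a single stored pattern and one flipped component, the network settles back into the stored pattern.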

Learning Vector Quantisation

Learning vector quantization is a learning procedure that aims to reproduce vector training sets, which are divided into predefined classes, as well as possible using a few representative vectors. Once this has been achieved, vectors that were unknown until then can easily be assigned to one of these classes.
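
As a rough illustration, the sketch below classifies a vector by its nearest representative (codebook) vector and performs one training step of the basic LVQ1 variant. The data layout (a list of `(vector, class)` pairs), the function names and the learning rate are assumptions made for this example, not the manuscript's definitions.

```python
def nearest(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i][0], x)),
    )


def classify(codebook, x):
    """Assign x to the class of its nearest codebook vector."""
    return codebook[nearest(codebook, x)][1]


def lvq1_step(codebook, x, label, rate=0.1):
    """Move the winning codebook vector toward x if its class matches
    the label of x, and away from x otherwise."""
    i = nearest(codebook, x)
    vector, cls = codebook[i]
    sign = 1 if cls == label else -1
    codebook[i] = (
        [c + sign * rate * (v - c) for c, v in zip(vector, x)],
        cls,
    )


if __name__ == "__main__":
    # Two hypothetical classes, each represented by one codebook vector.
    codebook = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
    lvq1_step(codebook, [0.2, 0.0], "a", rate=0.5)
    print(codebook[0])
    print(classify(codebook, [0.9, 0.8]))
```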

Part III: Unsupervised learning Network Paradigms

Self Organizing Feature Maps

A paradigm of unsupervised learning neural networks that maps an input space by its fixed topology and thus independently looks for similarities. Function, learning procedure, variations and neural gas.

Adaptive Resonance Theory

An ART network in its original form is meant to classify binary input vectors, i.e. to assign them to a 1-out-of-n output. At the same time, patterns that have not been classified so far are to be recognized and assigned to a new class.

Part IV: Excursi, Appendices and Registers

Cluster Analysis and Regional and Online Learnable Fields

In Grimm's dictionary the extinct German word “Kluster” is described by “was dicht und dick zusammensitzet (a thick and dense group of sth.)”. In static cluster analysis, the formation of groups within point clouds is explored. Introduction of some procedures, comparison of their advantages and disadvantages. Discussion of an adaptive clustering method based on neural networks. From a point cloud, possibly one with a lot of points, a regional and online learnable field models a comparatively small set of neurons that is representative of the point cloud.

Neural Networks Used for Prediction

Discussion of an application of neural networks: A look ahead into the future of time series.

Reinforcement Learning

What if there were no training examples, but it were nevertheless possible to evaluate how well we have learned to solve a problem? Let us regard a learning paradigm that is situated between supervised and unsupervised learning.