An Introduction to Neural Networks
Let us first recap the most important features of the neural networks found in the brain. Firstly, the brain contains many billions of a very special kind of cell: the nerve cell, or neuron. These cells are organized into a very complicated intercommunicating network. Typically, each neuron is physically connected to tens of thousands of others, and it uses these connections to pass electrical signals to them. The connections are not merely on or off; each has a varying strength, which allows the influence of a given neuron on one of its neighbors to be very strong, very weak (perhaps even nonexistent), or anything in between. Furthermore, many aspects of brain function, particularly the learning process, are closely associated with the adjustment of these connection strengths. Brain activity is then represented by particular patterns of firing activity amongst this network of neurons. It is this simultaneous, cooperative behavior of very many simple processing units which is at the root of the enormous sophistication and computational power of the brain.
The Hopfield Network
Artificial Neural Networks are computers whose architecture is modeled after the brain. They typically consist of many hundreds of simple processing units which are wired together in a complex communication network. Each unit, or node, is a simplified model of a real neuron which fires (sends off a new signal) if it receives a sufficiently strong input signal from the other nodes to which it is connected. The strengths of these connections may be varied in order for the network to perform different tasks, corresponding to different patterns of node firing activity. This structure is very different from that of traditional computers.
The traditional computers that we deal with every day have changed very little since their beginnings in the 1940s. While there have been very significant advances in the speed and size of the silicon-based transistors that form their basic elements (the hardware), the overall design, or architecture, has not changed significantly. They still consist of a central processing unit, or CPU, which executes a rigid set of rules (the program, or software) sequentially, reading data from and writing data to a separate unit, the memory. All the "intelligence" of the machine resides in this set of rules, which are supplied by the human programmer. The usefulness of the computer lies in its vast speed at executing those rules - it is a superb machine but not a mind.
Neural networks are very different - they are composed of many rather feeble processing units which are connected into a network. Their computational power comes from the units working together on a task - this is sometimes termed parallel processing. There is no central CPU following a logical sequence of rules - indeed, there is no set of rules or program at all. Computation instead emerges from a dynamic process of node firings. This structure is much closer to the physical workings of the brain, and it leads to a new type of computer that is rather good at a range of complex tasks.
The Hopfield Neural Network is a simple artificial network which is able to store certain memories or patterns in a manner rather similar to the brain - the full pattern can be recovered if the network is presented with only partial information. Furthermore, there is a degree of stability in the system - if just a few of the connections between nodes (neurons) are severed, the recalled memory is not too badly corrupted, and the network can respond with a "best guess". Of course, a similar phenomenon is observed in the brain - during an average lifetime many neurons will die, but we do not suffer a catastrophic loss of individual memories; our brains are quite robust in this respect (by the time we die we may have lost 20 percent of our original neurons).
The nodes in the network are vast simplifications of real neurons - they can only exist in one of two possible "states": firing or not firing. Every node is connected to every other node with some strength. At any instant of time, a node will change its state (i.e., start or stop firing) depending on the inputs it receives from the other nodes.
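In code, the update rule for a single node might look like the following minimal sketch, assuming "firing" and "not firing" are represented by +1 and -1, a firing threshold of zero, and a matrix w of connection strengths (all of the names here are illustrative, not taken from the text):

    import numpy as np

    def update_node(states, w, i):
        # Total signal arriving at node i from every other node
        # (the self-connection w[i, i] is taken to be zero).
        total_input = np.dot(w[i], states)
        # Fire (+1) if the incoming signal is strong enough, else go quiet (-1).
        return 1 if total_input >= 0 else -1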
If we start the system off with any general pattern of firing and non-firing nodes, then this pattern will in general change with time. To see this, think of starting the network with just one firing node. This node will send a signal to all the other nodes via its connections, so that a short time later some of these other nodes will fire. These newly firing nodes will then excite others after a further short time interval, and a whole cascade of different firing patterns will occur. One might imagine that the firing pattern of the network would change in a complicated, perhaps random way with time. The crucial property of the Hopfield network which renders it useful for simulating memory recall is the following: we are guaranteed that the firing activity will settle down, after a long enough time, to some fixed pattern, with certain nodes always "on" and others always "off". Furthermore, it is possible to arrange that these stable firing patterns of the network correspond to the desired memories we wish to store!
The reason for this is somewhat technical, but we can proceed by analogy. Imagine a ball rolling on some bumpy surface, and let the position of the ball at any instant represent the activity of the nodes in the network. Memories will be represented by special patterns of node activity corresponding to wells in the surface. Thus, if the ball is let go, it will execute some complicated motion, but we are certain that eventually it will end up in one of the wells of the surface. We can think of the height of the surface as representing the energy of the ball. The ball will seek to minimize its energy by seeking out the lowest spots on the surface - the wells.
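To make the analogy precise: in the standard formulation of the model, the "height of the surface" is an energy assigned to each firing pattern,

    E = -\frac{1}{2} \sum_{i \neq j} w_{ij} s_i s_j

where s_i = +1 or -1 is the state of node i and w_{ij} is the strength of the connection between nodes i and j. One can show that each node update either lowers E or leaves it unchanged, which is why the network, like the ball, always rolls "downhill" and must eventually come to rest in a well.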
Furthermore, the well it ends up in will usually be the one it started off closest to. In the language of memory recall: if we start the network off with a pattern of firing which approximates one of the "stable firing patterns" (memories), it will "under its own steam" end up in the nearby well in the energy surface, thereby recalling the original perfect memory.
The smart thing about the Hopfield network is that there exists a rather simple way of setting up the connections between nodes so that any desired set of patterns becomes a set of "stable firing patterns". Thus any set of memories can be burned into the network at the beginning. Then, if we kick the network off with any old set of node activity, we are guaranteed that a "memory" will be recalled. Not too surprisingly, the memory that is recalled is the one "closest" to the starting pattern. In other words, we can give the network a corrupted image or memory and the network will "all by itself" try to reconstruct the perfect image. Of course, if the input image is sufficiently poor, it may recall the incorrect memory - the network can become "confused" - just like the human brain: when we try to remember someone's telephone number, we will sometimes produce the wrong one! Notice also that the network is reasonably robust - if we change a few connection strengths just a little, the recalled images are still "roughly right", and we do not lose any of the images completely.
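One standard prescription for this "burning in" is the Hebbian rule: the connection between two nodes is strengthened in proportion to how often they take the same state across the stored patterns. The sketch below uses the same +1/-1 convention as before; the function names and the toy 8-node memories are invented purely for illustration:

    import numpy as np

    def store_patterns(patterns):
        # Hebbian rule: correlate the states of every pair of nodes
        # across all patterns, with no self-connections.
        n = patterns.shape[1]
        w = np.zeros((n, n))
        for p in patterns:
            w += np.outer(p, p)
        np.fill_diagonal(w, 0)
        return w / len(patterns)

    def recall(w, state, sweeps=10):
        # Update nodes one at a time, in random order, until the
        # firing pattern has had a chance to settle into a well.
        state = state.copy()
        for _ in range(sweeps):
            for i in np.random.permutation(len(state)):
                state[i] = 1 if np.dot(w[i], state) >= 0 else -1
        return state

    # Store two 8-node memories, then recall from partial information.
    memories = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                         [ 1, -1,  1, -1,  1, -1,  1, -1]])
    w = store_patterns(memories)
    corrupted = memories[0].copy()
    corrupted[:2] = -1                  # corrupt two nodes of the first memory
    print(recall(w, corrupted))         # typically prints the first memory intact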
The Perceptron - A Network for Decision Making
An artificial neural network which attempts to emulate the brain's pattern-recognition process is called the Perceptron. In this model, the nodes representing artificial neurons are arranged into layers. The signal representing an input pattern is fed into the first layer. The nodes in this layer are connected to a second layer (sometimes called the "hidden layer"), and the firing of nodes on the input layer is conveyed via these connections to the hidden layer. Finally, the activity of the nodes in the hidden layer feeds into the final output layer, where the pattern of firing of the output nodes defines the response of the network to the given input pattern. Signals are only conveyed forward, from one layer to a later layer - the activity of the output nodes does not influence the activities on the hidden layer.
In contrast to the Hopfield network, this network produces its response to any given input pattern almost immediately - the firing pattern of the output is automatically stable. There is no relaxation process to a stable firing pattern, as occurs with the Hopfield model.
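A sketch of this one-pass, feed-forward computation, in the same style as before; the layer sizes, the random connection strengths, and the simple zero-threshold firing rule are all invented for illustration:

    import numpy as np

    def fires(signal):
        # A node lights up (1) if its total input exceeds zero, else stays dark (0).
        return (signal > 0).astype(int)

    def feed_forward(inputs, w_hidden, w_output):
        # One forward sweep: input layer -> hidden layer -> output layer.
        hidden = fires(w_hidden @ inputs)
        return fires(w_output @ hidden)

    rng = np.random.default_rng(0)
    inputs = rng.integers(0, 2, size=9)      # a 3x3 "screen" of lit/unlit bulbs
    w_hidden = rng.normal(size=(12, 9))      # connections into 12 hidden nodes
    w_output = rng.normal(size=(9, 12))      # connections out to the output screen
    print(feed_forward(inputs, w_hidden, w_output))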
To simplify things, we can think of a model in which the network is made up of two screens: the nodes of the first (input) layer are represented as light bulbs arranged in a regular pattern on the first screen, and the nodes of the third (output) layer are represented as a regular array of light bulbs on the second screen. There is no screen for the hidden layer - that is why it is termed "hidden"! Instead, we can think of a black box which connects the first screen to the second. Of course, how the black box functions depends on the network connections to, from, and among the hidden nodes inside it. When a node is firing, we show this by lighting its bulb. See the picture for illustration.
We can now think of the network functioning in the following way: a given pattern of lit bulbs is set up on the first screen. This feeds into the black box (the hidden layer) and results in a new pattern of lit bulbs on the second screen. This might seem a rather pointless exercise in flashing lights except for the following crucial observation: it is possible to "tweak" the contents of the black box (adjust the strengths of all the internode connections) so that the system produces any desired pattern on the second screen for a very wide range of input patterns. For example, if the input pattern is a triangle, the output pattern can be trained to be a triangle. If an input pattern containing a triangle and a circle is presented, the output can still be arranged to be a triangle. Similarly, we may add a variety of other shapes to the network's input pattern and teach the net to respond only to triangles. If there is no triangle in the input, the network can be made to respond, for example, with a zero.
In principle, by using a large network with many nodes in the hidden layer, it is possible to arrange that the network still spots triangles in the input pattern, independently of whatever other junk there is around. Another way of looking at this is that the network can classify all pictures into one of two sets: those containing triangles and those which do not. The perceptron is said to be capable of both recognizing and classifying patterns.
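The "tweaking" can itself be automated. For the simplest case - a single output node and no hidden layer - the classic perceptron learning rule nudges each connection strength whenever the network answers wrongly. The sketch below assumes pictures flattened into 0/1 arrays with labels of 1 (contains a triangle) or 0 (does not); training a network with a hidden layer requires a more elaborate procedure.

    import numpy as np

    def train_perceptron(pictures, labels, sweeps=20, rate=0.1):
        # Learn weights so the node fires (1) for triangles, stays off (0) otherwise.
        w = np.zeros(pictures.shape[1])
        bias = 0.0
        for _ in range(sweeps):
            for x, target in zip(pictures, labels):
                output = 1 if np.dot(w, x) + bias > 0 else 0
                error = target - output        # zero when the answer is right
                w += rate * error * x          # strengthen or weaken connections
                bias += rate * error
        return w, bias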
Furthermore, we are not restricted to spotting triangles - we could simultaneously arrange for the network to spot squares, diamonds, or whatever we wanted. We could be more ambitious and ask that the network respond with a circle whenever we present it with a picture which contains both triangles and squares but no diamonds. There is another important task that the perceptron can usefully perform: drawing associations between objects. For example, whenever the network is presented with a picture of a dog, its output may be a cat. Hopefully, you are beginning to see the power of this machine at rather complex pattern recognition, classification, and association tasks. It is no coincidence, of course, that these are the types of task at which the brain is exceptionally good.