A neural computer network is a programmatic structure that simulates the workings of the organic brain. A well-trained neural network (NN) accepts a few parameters and, drawing on past experience, produces a logically derived output. Yes, we need to 'train' the network first, which is nothing but teaching it how to think. There is absolutely no magic involved: almost every step we are about to perform is rooted in mathematics, mainly calculus and statistics. Let us start with the basics of NNs and see how we can make use of them.
How they work
A NN has a hierarchical multi-layered structure. While the whole thing is called a 'network', inside it is a set of layers. A layer can be of three types: input, hidden or output. The input layer accepts values, one neuron to a parameter, and passes them on without performing any operations on the data. The hidden layer (also called the operational neuron layer) is where most of the action happens. Here, data is transformed in various steps. There is no limit to the number of hidden layers a NN can have, although typical simple examples use just one. Then comes the output layer. This layer does not operate on the data like the hidden layer does, but uses aggregation logic to combine the multiple results it receives into something more meaningful. Actually, a NN is easier to understand than a logic-gate network, and works quite similarly.
In a NN, you have two entities borrowed from biology: neurons and dendrites. A NN layer, as we saw earlier, has a number of neurons at the same hierarchical level. All the neurons in a level are connected to all the neurons in the level before and the level after. These interconnections are known as dendrites. Just as the signal reaching a particular device in an electronic network is controlled by factors like connector resistance, potential and so on, data passing through a NN is subject to fairly similar parameters. The equivalent of connector resistance is called 'Dendrite Weight', except that it works the other way round: the higher the weight, the stronger the signal reaching the other end. Then you have the 'Bias', which is similar to the electromotive potential in electronics.
The learning mechanism
Neural networks would be useless, no different from any other algorithmic logic, if they couldn't use their results and past data to learn and achieve better results every time. To do this, they use two mechanisms. One is the stored training data: past (statistical) data is collected for all affecting parameters, along with the optimal (correct) result in each case. This is stored in a file or a database, and the NN program can load it to train itself. The other is the pair of feed-forward and loop-feed mechanisms, which adjust the bias and weight values automatically, pushing the engine towards a better result. This is done on live runs as well as on training data. Both mechanisms are governed by yet another parameter called the 'Learning Rate'.
What's going to happen
Let's now take a sample NN of our own and trace the events as they happen, to understand it a little better. As always we need a purpose, so we're writing a program that will attempt to predict when a particular server will crash, given five parameters. We said there is one neuron per input, so that makes five neurons in the input layer. Next, each of these values interacts with every other input value in the hidden layer. In our example, we take an arbitrary single hidden layer with just three neurons (you can add more if you like). Being arbitrary again, we take a three-neuron output layer. This means we get the structure shown in the chart.
Let us proceed by tracing the inputs and outputs at NH1, as an example. We already know that NI1...NI5 pass on their inputs unchanged. At NH1, therefore, the five incoming values are combined into one aggregate input. This is done by a simple equation:
NH1 (input) = (Bias of NH1) + NI1 * Whi1 + NI2 * Whi2 + ... + NI5 * Whi5
where 'WhiN' is the weight of the dendrite between NH1 and each of the input neurons. The output of a neuron is given by the mathematical equation:
Output = 1 / (1 + e^(-input))
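The two formulas above can be tried out in a few lines. What follows is a minimal Python sketch, not the VB.NET code from NeuralNetwork.vb. The function names neuron_input and neuron_output, and all the sample numbers, are made up for illustration.

```python
import math

def neuron_input(bias, inputs, weights):
    """Bias plus the weighted sum of the incoming values (the NH1 input formula)."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one dendrite weight per input")
    return bias + sum(x * w for x, w in zip(inputs, weights))

def neuron_output(total_input):
    """Squash the aggregate input into the range 0..1: 1 / (1 + e^(-input))."""
    return 1.0 / (1.0 + math.exp(-total_input))

# Hypothetical values passed on by NI1...NI5, and hypothetical weights
# for the five dendrites leading into NH1.
ni = [0.2, 0.4, 0.6, 0.8, 1.0]
whi = [0.5, 0.1, 0.3, 0.2, 0.4]

nh1_in = neuron_input(0.1, ni, whi)   # bias of NH1 assumed to be 0.1
nh1_out = neuron_output(nh1_in)
print(nh1_in, nh1_out)
```

Whatever the aggregate input is, the output formula always lands strictly between 0 and 1, and an input of exactly 0 gives 0.5.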
Every NN computation will have an associated error and this is calculated using a formula. Once we get the error value, we need to fix the bias so that the NN learns. That is achieved by another formula. Both these are given below.
Error = Output * (1 - Output) * (Desired output - Output)
New Bias = Old Bias + (Old Bias * Error * Learning Rate)
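To see these two formulas at work on actual numbers, here is a small Python sketch (again, not the article's VB.NET code). It implements both formulas exactly as printed above. The function names and the sample values are assumptions made for illustration.

```python
def output_error(output, desired):
    """Error = Output * (1 - Output) * (Desired output - Output)."""
    return output * (1.0 - output) * (desired - output)

def new_bias(old_bias, error, learning_rate):
    """New Bias = Old Bias + (Old Bias * Error * Learning Rate), as printed above."""
    return old_bias + (old_bias * error * learning_rate)

# Hypothetical run: the neuron produced 0.5 but we wanted 1.0.
err = output_error(0.5, 1.0)                # 0.5 * 0.5 * 0.5 = 0.125
bias = new_bias(0.4, err, 0.5)              # 0.4 + 0.4 * 0.125 * 0.5 = 0.425
print(err, bias)
```

When the output already matches the desired value, the error is zero and the bias stays put. Note that many textbook versions of this update add just Error * Learning Rate, without the extra Old Bias factor; the sketch sticks to the formula as given here.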
Finally, our particular program sums the values at NO1...NO3 and divides the total by 3 (all arbitrary) to get a single value that can be interpreted and stored as our result. The initial Bias and Weight values at each neuron are always random, and we use the .NET 'Random' class to generate them. Before we step into the code proper, note that the entire program uses only the 'Double' data type for the NN values, because all the neuron and dendrite values are always between 0 and 1.
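Putting the pieces together, one full pass through our 5-3-3 network might look like the Python sketch below. It is only an illustration, not the code in NeuralNetwork.vb: the helper names are invented, Python's random module stands in for the .NET 'Random' class, and it assumes that the output neurons use the same input and output formulas as the hidden ones.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_neurons, n_inputs, rng):
    """One (bias, weights) pair per neuron, all drawn at random from 0..1."""
    return [(rng.random(), [rng.random() for _ in range(n_inputs)])
            for _ in range(n_neurons)]

def run_layer(layer, inputs):
    """Apply the neuron input and output formulas to every neuron in a layer."""
    return [sigmoid(bias + sum(x * w for x, w in zip(inputs, weights)))
            for bias, weights in layer]

def predict(hidden, output, inputs):
    """Feed the inputs forward, then average NO1...NO3 into a single value."""
    outs = run_layer(output, run_layer(hidden, inputs))
    return sum(outs) / len(outs)

rng = random.Random(42)            # fixed seed so that runs are repeatable
hidden = make_layer(3, 5, rng)     # three hidden neurons, five inputs each
output = make_layer(3, 3, rng)     # three output neurons, three inputs each
result = predict(hidden, output, [0.2, 0.4, 0.6, 0.8, 1.0])
print(result)
```

Because every output neuron's value lies between 0 and 1, so does their average.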
Emulating the neurons
Let's make a thinking life form now. The zip file on the PCQEssentials CD contains a class file called 'NeuralNetwork.vb'. Open this file now to follow the text below. The file contains three classes: nnNeuron, nnNeuronLayer and NeuralNetwork. The nnNeuron class emulates a single neuron. The nnNeuronLayer class is a set of neurons and adds a few values and functions to perform layer-level tasks. Finally, the NeuralNetwork class manages all the layers. All outside programmatic actions happen at the NeuralNetwork level, and we do not directly touch the layers or the neurons at any time because of all the interactions between them. Imagine what would happen if a random neuron in your body were to be replaced!
The process of life is set in motion by the NeuralNetwork.CreateNetwork function. This actually just initializes a neuron layer and tells the layer how many neurons we need in that layer (nnNeuronLayer.PopulateNeurons function). This function then creates that many neurons in that layer using the CreateNeuron function. The actual neuron-creation happens within the New function of the nnNeuron class.
We have built the three classes to a high degree of abstraction so that they can be used for any neuron and any layer, to create as diverse a network as we need. This comes to life in two places. The first is the IF statement in the CreateNetwork function: if the layer number is '0', it makes the 'input' layer, and if it is the last layer (number of 'hidden' layers plus 1), it makes the 'output' layer. All other layers are our 'hidden' layers.
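The layer-numbering rule just described is easy to express on its own. The Python sketch below is a hypothetical stand-in, not the IF statement from CreateNetwork; the function name layer_kind is invented for illustration.

```python
def layer_kind(layer_index, hidden_layer_count):
    """Classify a layer by its position: 0 is input, the last one is output."""
    last = hidden_layer_count + 1          # index of the output layer
    if not 0 <= layer_index <= last:
        raise ValueError("layer index out of range")
    if layer_index == 0:
        return "input"
    if layer_index == last:
        return "output"
    return "hidden"

# Our example network has a single hidden layer, so three layers in all.
kinds = [layer_kind(i, 1) for i in range(3)]
print(kinds)
```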
The second is what happens in the MakeInputLayerNeuron function at line 141. The code looks like a simple initializer, but it is actually important:
ReDim m_Inputs(0)
ReDim m_DendriteWeights(0)
m_Inputs(0) = InputValue
m_DendriteWeights(0) = 1
m_TargetOutput = m_Inputs(0)
The ReDim statements make sure there is only one input to the neuron and only one dendrite associated with it. The InputValue for each input neuron is already known: it's either what is entered through our form, or a part of the test data. This is passed in as a function parameter and assigned directly. The dendrite weight is set to 1, because our CalcDendriteInput function in line 244 multiplies this value with the active input value. Setting it to 1 therefore ensures that the input value is passed out unchanged. Similarly, to make this neuron 'stupid' so that it does not learn anything new, we set the m_TargetOutput value to the input value, so that only a change to the value would trigger an error condition. The three quick-fire functions between lines 243 and 276 do most of the neurological work. CalcDendriteInput finds the incoming value to the neuron along one dendrite.
CalcNeuronInput puts all of that together using the hidden neuron input formula we saw earlier. The output value is calculated similarly in CalcNeuronOutput, which in turn calls the Adjust function to take care of the Bias and Weights. The IF statement in CalcNeuronOutput makes sure that if we're dealing with an input neuron, we just return the input value.
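The pass-through behaviour of an input neuron can be checked with a short Python sketch. This is a hypothetical stand-in for the nnNeuron class, not a translation of it: the attribute and method names merely echo the VB.NET ones mentioned in the text.

```python
class InputNeuron:
    """A 'stupid' neuron: one input, one dendrite of weight 1, nothing to learn."""

    def __init__(self, input_value):
        self.inputs = [input_value]          # mirrors ReDim m_Inputs(0)
        self.dendrite_weights = [1.0]        # mirrors m_DendriteWeights(0) = 1
        self.target_output = input_value     # the target is the input itself

    def calc_dendrite_input(self, index):
        # Incoming value along one dendrite: the input times its weight.
        return self.inputs[index] * self.dendrite_weights[index]

    def calc_neuron_output(self):
        # Input neurons skip the output formula and return the value as-is.
        return self.calc_dendrite_input(0)

neuron = InputNeuron(0.37)
out = neuron.calc_neuron_output()
# Same error formula as for any other neuron; it is zero here.
error = out * (1.0 - out) * (neuron.target_output - out)
print(out, error)
```

With a weight of 1 the value comes out exactly as it went in, and since the target equals the output, the error formula gives zero, so there is nothing to adjust.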
The form
The code here is pretty straightforward. Most of it just reads and writes our trainer-data file. The only calls to our NN are in lines 297-298 and again in line 316. The first set feeds in the trainer data and gets the network to compute the results. Note that results are not taken out at this stage, because we're just priming the weights and bias. Line 316 causes a network compute and gets the aggregate value. What we do with that value, and how we interpret it, is completely up to the application we're making. In this case, we get a value less than 1, so we simply multiply it by 100 and tell the user that their server will fail at the end of that many days. The prediction need not be true; this is just a demo. Have fun coding other neurological (and neuropathic) entities, and build your own demon-bots and crystal-gazing Deepblues!
Sujay V Sarma