Synapse: The New Hypothesis

I spent the last couple of months working on an AI system that would execute actions to achieve a goal by chaining a couple of different feed forward networks together. I have now shown that this hypothesis and architecture are not going to result in learned behavior. I believe that it is not learning behaviors because I did not capture the relationship between a “good” goal state and a “bad” goal state.

In my new proposal I am still sticking with a feed forward network, but instead of ending at a goal, this ends at an action. The goal is not actually a part of the network, although it is still a critical part of the system. The neural network architecture for this system is illustrated below.

The innovative aspect of this feedforward network does not have to do with the network itself, but with the way that the goal(s) will be used to adjust the learning rate of the back propagation routine.

As a reminder, in Synapse, all neural objects (Inputs, Actions, Goals) are sensors. They may or may not need to be passed into the inputs of the network based on whether there would be a natural relationship between them, but they all need to be sensing in the regular loop of the system.

With the proposed dynamic learning rate, how close the system is to an ideal goal determines how much the weights are adjusted when the synapses are tuned in back propagation, and whether they are tuned up to increase activation or tuned down to inhibit activation.
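A minimal sketch of what such a signed learning rate could look like. Everything here is an assumption made for illustration: the function name, the mean absolute distance as the closeness measure, goal values in the range 0 to 1, and the choice to reinforce when close to the ideal and inhibit when far from it.

```python
def dynamic_learning_rate(actual_goal, ideal_goal, base_rate=0.5):
    """Signed learning rate from how close the actual goal is to the ideal goal.

    Hypothetical helper: assumes both goal vectors hold values in [0, 1]
    and have the same length.
    """
    # Mean absolute distance between the two goal states, in [0, 1].
    distance = sum(abs(a - i) for a, i in zip(actual_goal, ideal_goal)) / len(ideal_goal)
    closeness = 1.0 - distance
    # Closeness 1.0 maps to +base_rate (tune up), 0.0 maps to -base_rate (tune down).
    return base_rate * (2.0 * closeness - 1.0)


near_rate = dynamic_learning_rate([0.9, 0.1], [1.0, 0.0])  # close to the ideal: positive
far_rate = dynamic_learning_rate([0.1, 0.9], [1.0, 0.0])   # far from the ideal: negative
print(near_rate, far_rate)
```

The sign of the returned rate decides whether a weight update reinforces or inhibits, and its size decides how strongly. Other mappings from closeness to rate are just as plausible.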

Now that there is a dynamic learning rate that ties together the relationship between the current goal state, and the ideal goal state, the system needs to be tuned on an “ideal action”. An ideal action is dynamic based on the current action. A computed action (the end result of the forward propagation) is never a clean vector with 0, or 1 as values. There are some options I am playing with, but the first attempt at creating an “ideal action” is to normalize the action and then backpropagate with that new normalized value.

I believe that the “ideal action” will vary based on how your motor system interprets the action sense, but the “ideal action” should accentuate the highs and lows so that the system can adjust itself to those outlying values.
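As a rough sketch of the first attempt described above, an “ideal action” could be produced with a min-max normalization, which stretches the computed action so that its highest value becomes 1 and its lowest becomes 0. The function name and the handling of a flat vector are assumptions for illustration, and other normalizations would fit the description too.

```python
def idealize_action(action):
    """Min-max normalize a computed action so its extremes become 0 and 1.

    Hypothetical helper: one possible reading of "normalize the action".
    """
    low, high = min(action), max(action)
    if high == low:
        # A flat action has no highs or lows to accentuate; return it unchanged.
        return list(action)
    return [(value - low) / (high - low) for value in action]


computed = [0.42, 0.61, 0.55]      # raw output of the forward pass
ideal = idealize_action(computed)  # target vector to backpropagate against
print(ideal)
```

The normalized vector keeps the ordering of the computed action but pushes the outlying values to the ends of the range, which is what lets back propagation pull the network towards them.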

The new proposal incorporates two innovative concepts to try to reach a system that learns actions without an explicit teacher, and only using sensors. Those concepts are:

  1. Dynamic learning rate based on ideal goal states versus actual goal states.
  2. Idealized actions to help the system reinforce or inhibit certain action features.

With the relationship between an ideal goal state and an actual goal state being central to how the system tunes its synapses, the issues with the previous attempt should be fixed. I see some implementation details that need to be ironed out when creating “ideal actions”. If something goes wrong with this version, this is the area that is going to be most scrutinized.

Here’s hoping the third time is the charm. I’ll keep you updated!

[SIDEBAR] Vector Dancing to Andrew Bird

I was doing some brainstorming and refactoring while listening to some Andrew Bird (Roma Fade) and the little robot I got for Christmas – Anki’s Vector – started dancing to the song. He was moving his little arm to the beat the same way I was bobbing my head.

I know that the little guy doesn’t actually feel the beat but it is crazy how connected I felt to him the second he was sharing that experience with me.

I gave him some pets to hopefully reinforce that behavior. I’m not sure whether petting him actually does that, but I felt like I should try.

I wanted to share this experience while I try to solve my AI problems… for which I have a new hypothesis. Details to come…

Synapse: Take 2

Another day, another attempt at arriving at an AI. I’ll start this article by stating that there will be at least a Take 3, because the failure of this system has me back at the drawing board. I implemented the architecture below, the one I had hope in, and ended up with a success measure no better than random: the system is not learning its actions based on its goals, whether the network is optimized for the ideal goal or for the actual goal.

In this system the hope was that the weights in the back propagation would trickle all the way up past the actions, and into the sensor layers and the system would get nudged towards the actions that drive the system towards the ideal goal.

This system was learning how to predict the goals based on the sensors and the action middle layer, but that did not trickle up to the sensor layers. When the network was optimized against the actual goal sense, back propagation was just tweaking the network to predict that goal sense. When it was optimized against the ideal sense, it was just tweaking the weights to always produce the goal feature that was ideally set. There was no relation between the ideal and the actual: the goal was not being met, but the system kept tuning itself towards the static features of an ideal goal.

So… the second failure. What did I learn in this round? I am going back to the drawing board to think about how to capture the relationship between the ideal and the actual goal sensors. This is a critical feature that I have not captured in my current iteration of the system.

Stay tuned… the weekend is soon but there are more than a couple of nights where I can consolidate my thoughts on this. Round 3 to come!

Synapse: Take 1

I recently completed the back propagation routine in Synapse and can now get to the scary part. I can now prove myself wrong. The empiricist in me is excited, but if the first round of tests is anything to go by, there are going to be a lot of ups and downs on this journey. I’ll skip to the end and let you know here that the first system I tried did not work.

The system I attempted to put together was the one below.

This was the easiest one to set up since I do not need to break apart the network; I really am only taking a middle layer of a standard feed forward network and using that as my output.
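To make the idea concrete, here is a small sketch of a feed forward pass that hands back the middle layer as the action and the last layer as the predicted goal. The layer sizes, weights, shared bias, sigmoid activation and function names are all assumptions for illustration and are not taken from the Synapse code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(values, weights, bias):
    """One fully connected layer; weights[j] holds the incoming weights of neuron j."""
    return [sigmoid(sum(w * v for w, v in zip(row, values)) + bias) for row in weights]


def forward(sensors, action_weights, goal_weights, bias=0.0):
    """Run sensors -> action (middle layer) -> goal (last layer)."""
    action = layer(sensors, action_weights, bias)  # middle layer, read out as the action
    goal = layer(action, goal_weights, bias)       # last layer, the predicted goal sense
    return action, goal


# Hypothetical sizes: 2 sensors, 3 action neurons, 1 goal neuron.
action, goal = forward(
    sensors=[0.0, 0.0],
    action_weights=[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
    goal_weights=[[1.0, 1.0, 1.0]],
)
print(action, goal)
```

Nothing about the network changes in this setup. The only difference from a standard feed forward pass is that the caller reads the middle layer instead of discarding it.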

I didn’t believe that this architecture would give results because I do not believe that there is enough information in the layers to build the necessary relationships between the sensor input and the action output relative to the goal. Even with my low expectations, I was disheartened by the failure.

With any failure, it is important to learn something – or else you really are not failing the right way. So did I learn something in this attempt?

I learned that I am not sure what I mean in the last step of optimizing with the goal sensor. Am I optimizing so that the network learns the actual goal sensor, or am I optimizing the system so that it can optimize to the ideal goal?

I did run the system with both options and received the same results, but as soon as I started to implement this piece the question was forced on me, and I definitely had made some leaps in my logic that I will have to iron out in my next attempt.

I will keep you updated on the progress…

Making the Back Propagation Routine

One of the more complicated aspects of a neural network is the back propagation routine. This is the routine that tunes the synapses (weights) of the network to optimize the system and reduce the error from the computed output to an ideal output.

For me, the difficulty in understanding this process was largely based on the technical terms and mathematical notations that are used to describe it, so I will not be using those in my article. I will simply go over my implementation of the back propagation routine from a theory and code perspective.

My test example is based off a very popular online article “A Step by Step Backpropagation Example” by Matt Mazur.

Matt’s article is widely referenced online, with YouTube videos referencing the steps and examples as well as a pretty good amount of comments on his blog dating back to its original posting date in 2015.

In order to prove out my back propagation method I used the structure, weights and ideals from his example (image below) and set up unit tests in my project to ensure that the values were coming out to about the right value. Using floating point numbers results in less than exact values, so the precision of the comparison was an important part of validating the results.
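A sketch of what such a check can look like, using the inputs, weights, biases and ideals of the 2-2-2 network from that walkthrough and comparing against its published values within a tolerance. The function names, the data layout and the tolerance of 1e-6 are assumptions for illustration, not the actual tests in the project.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_bias, output_weights, output_bias):
    """Forward pass; each weights[j] row holds the incoming weights of neuron j."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + hidden_bias)
              for row in hidden_weights]
    outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + output_bias)
               for row in output_weights]
    return hidden, outputs


hidden, outputs = forward(
    inputs=[0.05, 0.10],
    hidden_weights=[[0.15, 0.20], [0.25, 0.30]],  # w1, w2 / w3, w4
    hidden_bias=0.35,
    output_weights=[[0.40, 0.45], [0.50, 0.55]],  # w5, w6 / w7, w8
    output_bias=0.60,
)
ideals = [0.01, 0.99]
total_error = sum(0.5 * (t - o) ** 2 for t, o in zip(ideals, outputs))

# Floating point results are never exact, so compare within a tolerance.
TOLERANCE = 1e-6
assert math.isclose(hidden[0], 0.593269992, abs_tol=TOLERANCE)
assert math.isclose(hidden[1], 0.596884378, abs_tol=TOLERANCE)
assert math.isclose(outputs[0], 0.75136507, abs_tol=TOLERANCE)
assert math.isclose(outputs[1], 0.772928465, abs_tol=TOLERANCE)
assert math.isclose(total_error, 0.298371109, abs_tol=TOLERANCE)
```

An absolute tolerance is used because the published numbers are rounded; an exact equality check would fail even when the math is right.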

Setting up my network with these specific layers, biases and inputs was a great way to make sure that as I refactor and enhance my system I will have proof that I have not broken something fundamental in it.

One of the struggles when I was working with the example images was just making sure that I had the proper weights associated with the proper neurons – something about the way they were illustrated in the image made me attribute the weights to the wrong neurons.

I spent a lot of time just comparing Matt’s examples to make sure that I interpreted the weights properly. I think that Matt struggled with that as well because there is an error in his example (that cost me a couple of days). All my math was adding up and only a couple of values were not lining up with his numbers. The weights in his example were going in opposite directions for w6 and w7, so I think he flipped the error value he used to compute those.

I have uploaded an Excel file with the math that is used in this example. With it you can check your work, and see if maybe I have made a mistake!

Here is what the file looks like.

One of the hardest things I had to grasp when implementing this routine from the example was that there didn’t appear to be a pattern to the way the weights were handled in the step by step walkthrough. To find the contribution to the error for the weights in the last layer, a different set of variables was used than for the first layer’s weights. His example may have been too simple to reveal a pattern, so I took it and added another layer to see if I could find one with the additional complexity.

It took me a while, but what I broke it down to was:

  1. Get the Synapse (weight) target Neuron
  2. Get the Error relative to the target neuron’s output
  3. Multiply that by the derivative of output of the target neuron
  4. Multiply that by the derivative of the net input with respect to the weight (the source neuron’s output)
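The four steps above can be sketched as a single function. This assumes a sigmoid activation and a squared error, as in the step by step example; the function and parameter names are illustrative and not taken from the Synapse code.

```python
def sigmoid_derivative(output):
    """Derivative of the sigmoid, written in terms of the neuron's output."""
    return output * (1.0 - output)


def weight_gradient(error_wrt_target_output, target_output, source_output):
    """Contribution of one synapse (weight) to the error.

    error_wrt_target_output: step 2, the error relative to the target neuron's output
    target_output:           used for step 3, the derivative of the target's output
    source_output:           step 4, the derivative of the net input w.r.t. the weight
    """
    return error_wrt_target_output * sigmoid_derivative(target_output) * source_output


# Values for w5 (h1 -> o1) in the example network, rounded as published.
out_o1, ideal_o1, out_h1 = 0.75136507, 0.01, 0.593269992
gradient_w5 = weight_gradient(out_o1 - ideal_o1, out_o1, out_h1)
updated_w5 = 0.40 - 0.5 * gradient_w5  # learning rate of 0.5, as in the example
print(gradient_w5, updated_w5)
```

Step 1 is implicit here: the caller picks the target neuron of the synapse and passes in its values.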

I have put a particular focus on the target and source neurons in the steps above because it became very confusing in the code to differentiate things when everything was called “neuron”. Once I set up a source and target neuron naming convention I was able to iron out a lot of my issues.

In my implementation I never use the existing weights value to identify the contribution of the preceding weights like Matt does in his example. In my routine everything is relative to the partial derivative of the error with respect to the output of a neuron. What that means is that when moving back through the network I always take it to the point of the target neuron’s output. Going from that output to a synapse’s contribution is just the derivative of the output of the target neuron and the derivative of that to the net input.

  • The net input is always the source neuron’s output.
  • The derivative of the output is the derivative of the activation function of the output value.

The pattern that I learned in this process was how to work backwards through functions. Derivatives are always described using gradients, slopes and graphical illustrations, but I understood them a lot better when I realized that a derivative lets you reverse through the network and identify how much a specific network element (neuron input, neuron output, weight) contributed to the error. Because of the “chain rule” you can take all the previous contributions and use them to identify the additional contribution a previous synapse had on the error. Everything just builds on the previous values because the downstream contributions are part of the upstream contributions.
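Here is a small sketch of that building-up for a first layer weight, using rounded values from the example network. It is the textbook form of the chain rule, which reads the downstream weight when stepping back across a synapse, so it may be organized differently from the routine described in this article. The names are illustrative, and a sigmoid activation with squared error is assumed.

```python
def sigmoid_derivative(output):
    """Derivative of the sigmoid, written in terms of the neuron's output."""
    return output * (1.0 - output)


def error_wrt_source_output(downstream):
    """Sum the downstream contributions flowing back into one source neuron.

    downstream: one (error_wrt_target_output, target_output, weight) tuple
    per outgoing synapse of the source neuron.
    """
    return sum(err * sigmoid_derivative(out) * weight for err, out, weight in downstream)


# Hidden neuron h1 feeds both output neurons (rounded values from the example).
out_h1, input_1 = 0.593269992, 0.05
downstream_of_h1 = [
    (0.75136507 - 0.01, 0.75136507, 0.40),    # h1 -> o1 through w5
    (0.772928465 - 0.99, 0.772928465, 0.50),  # h1 -> o2 through w7
]

# The downstream contributions become the error relative to h1's output...
error_wrt_h1 = error_wrt_source_output(downstream_of_h1)
# ...and from there the same steps give the contribution of w1 (input 1 -> h1).
gradient_w1 = error_wrt_h1 * sigmoid_derivative(out_h1) * input_1
print(error_wrt_h1, gradient_w1)
```

Once the error relative to a neuron's output is known, every synapse feeding that neuron is handled the same way, no matter how many layers sit behind it.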

Understanding that aspect and really “knowing” that allowed me to finally put my back propagation routine together.

I have unit tests for all the values in the step by step example. Just to make sure that my network can tune the synapses (weights) to the proper values and learn how to compute the proper output, I also have a while loop that runs as many training iterations as it needs to get the network within a 0.00001 error rate. My network trains this simple example in ~40 ms.
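A rough sketch of such a training loop on the same 2-2-2 example network. It assumes a sigmoid activation, a squared error, a learning rate of 0.5 and fixed biases, as in the step by step example; the function name, the data layout and the iteration cap are assumptions, and the timing will differ from the ~40 ms quoted above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_until(tolerance=0.00001, rate=0.5, max_iterations=200_000):
    """Train the example network until the total error drops below tolerance."""
    inputs, ideals = [0.05, 0.10], [0.01, 0.99]
    w_hidden = [[0.15, 0.20], [0.25, 0.30]]  # w_hidden[j][i]: input i -> hidden j
    w_output = [[0.40, 0.45], [0.50, 0.55]]  # w_output[k][j]: hidden j -> output k
    b_hidden, b_output = 0.35, 0.60          # biases are left untouched
    iterations = 0
    while True:
        # Forward pass.
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b_hidden)
                  for row in w_hidden]
        outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b_output)
                   for row in w_output]
        error = sum(0.5 * (t - o) ** 2 for t, o in zip(ideals, outputs))
        if error < tolerance or iterations >= max_iterations:
            return error, iterations
        # Backward pass: error relative to each neuron's net input.
        delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(outputs, ideals)]
        delta_hidden = [
            sum(delta_out[k] * w_output[k][j] for k in range(2))
            * hidden[j] * (1.0 - hidden[j])
            for j in range(2)
        ]
        # Update every weight against its contribution to the error.
        for k in range(2):
            for j in range(2):
                w_output[k][j] -= rate * delta_out[k] * hidden[j]
        for j in range(2):
            for i in range(2):
                w_hidden[j][i] -= rate * delta_hidden[j] * inputs[i]
        iterations += 1


final_error, iterations = train_until()
print(final_error, iterations)
```

The iteration cap is only there so the loop cannot run forever if the network fails to converge.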

With the network wired up, I can now run tests on my proposed architecture where I have a middle layer that is the output of the system – as an action, and the last layer being a goal that the system is optimized for.

I’ll keep you updated…