Pivot on a Pivot… Creating an Automated Supervisor for Labeled Data

Last time you heard from me I had to pivot due to some technical limitations of the Unity3D engine and my choice of neural processor (Encog).

Today I pivot again. The difference between my original plan and this new one is significant: my original plan was an unsupervised network, and this new approach is a supervised one.

I thought I could make it work… and I could, technically, even if it is a bit of a shoehorn. But I thought of something that fits the model a lot better: an automated supervisor, or Robovisor. I will build a literal supervisor that provides feedback to the network based on whether the Robovisor likes the action of the system, or in this implementation, whether the number the system returns is the actual number. If it is, the system will be given positive feedback; if it is wrong, it will receive negative feedback. The goal of the system is to keep receiving positive feedback.

I thought of this feedback sensor previously as I was brainstorming ways to coach an embodied system with Synapse. I imagined that if I liked the way the truck took a turn in the game, I would have an app where I could “thumbs up” the truck, providing it positive feedback. I am going to expand this concept: instead of me providing feedback for the truck, I will build a supervisor that knows the number written in the MNIST image and provides positive or negative feedback based on the action of the system.

I’m not sure if this will work or not, but this is a lot more interesting to test than the standard labeled dataset supervised learning examples that exist everywhere. I think that this is a novel approach and am looking forward to how it works out!

New layout for MNIST Implementation

Sensors

  • Number Sensor (0-9) [This is for the Supervisor]
  • Image Sensor (28×28 pixel images)
  • Feedback Sensor (-1, 0, 1)

Actions

  • Interpreted Number (0-9)

Goals

  • Positive Feedback (Goal of 1 = Correct Answer)
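As a rough sketch of what the Robovisor loop could look like in code (the `interpret` and `reinforce` calls are hypothetical placeholders, not an actual Synapse API):

```python
# A minimal sketch of the Robovisor feedback loop.
# `system.interpret` and `system.reinforce` are hypothetical stand-ins
# for whatever Synapse ends up calling these operations.

def robovisor_feedback(predicted: int, label: int) -> int:
    """The supervisor compares the system's action to the true number."""
    return 1 if predicted == label else -1

def training_step(system, image, label):
    predicted = system.interpret(image)       # Action: a number 0-9
    feedback = robovisor_feedback(predicted, label)
    system.reinforce(feedback)                # Goal: keep feedback at +1
    return feedback
```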

Do you think that this new approach will work? Have you seen something like this implemented before?


Creating the MNIST Sensor, Actions & Goals

Since pivoting from building an AI for my racing game to starting with the “hello world” of AI training data (MNIST), I have started on the first step that anyone using my platform would take: create the Sensors, Actions & Goals.

Sensors

On my commute to work I have been thinking about the many different ways to encode visual information, and I look forward to trying different types out, but I think the most straightforward sensor is one that interprets the grayscale 28×28 pixel image as a 784-feature vector (28×28=784). Each pixel will be represented by one feature of the vector.

The first 28 features will be the first row of pixels in the image, and the 2nd 28 features will be the 2nd row of pixels. Each feature will hold a value from 0 to 1, with 0 being a white pixel and 1 being a black pixel; a gray pixel would be 0.5.

[Image: MNISTExample] An MNIST image broken down into each pixel. In this example, the first 5 rows are all 0 value, as the 4 is not visible until the 5th row.
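A minimal sketch of that encoding, assuming the raw MNIST convention of 0–255 grayscale values (so each pixel is divided by 255 to land in the 0–1 range):

```python
# Sketch: flatten a 28x28 grayscale image into a 784-feature vector,
# scaling each pixel from 0-255 down to 0.0 (white) - 1.0 (black).
def encode_image(pixels):
    """pixels: 28 rows of 28 values in the 0-255 range (MNIST convention)."""
    return [p / 255.0 for row in pixels for p in row]

image = [[0] * 28 for _ in range(28)]    # an all-white image
image[5][10] = 255                       # one black pixel in row 6
vector = encode_image(image)
assert len(vector) == 784
assert vector[5 * 28 + 10] == 1.0        # row-major: row 5, column 10
```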

Actions

The action (output of the network) will be a representation of a number (0, 1, 2…9). This means there are 10 different potential results. The simplest and most straightforward way of encoding this information is a 10-feature vector: the first feature represents 0, and the last feature represents 9.

[Image: MNISTAction] An example of an action which has 4 as the most likely number, and 9 as the 2nd most likely.

This means that the structure of this network will be a 784-feature vector that goes through 2 hidden layers (the default for Synapse, reasoned from Jeff Heaton’s findings) and results in a 10-feature vector.

The “motor system” will interpret this 10 feature vector and will print the number to the screen.
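As an illustration of that shape, here is a toy feed-forward pass with placeholder random weights (the hidden layer sizes are made up, and this is not a trained network), ending with the motor system picking the most activated feature:

```python
import math
import random

random.seed(0)

def layer(inputs, n_out):
    """One fully connected layer with sigmoid activation.
    Weights are random placeholders, not trained values."""
    return [
        1 / (1 + math.exp(-sum(x * random.uniform(-1, 1) for x in inputs)))
        for _ in range(n_out)
    ]

x = [0.0] * 784                  # a flattened 28x28 image
h1 = layer(x, 128)               # hidden layer sizes are illustrative
h2 = layer(h1, 64)
action = layer(h2, 10)           # the 10-feature action vector

# The "motor system": pick the most activated feature and print the number.
print(action.index(max(action)))
```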

(I think I might need to add a diagram of my vision of the standard architecture at some point. Would that be helpful? If you want it let me know.)

[DISCLAIMER] Using “actions” to describe the output of Synapse in this supervised instance is not ideal, but since the goal of Synapse is to be embodied (in a virtual race car initially), I will continue with this naming scheme. Just be aware that the output of Synapse is an Action, and in this MNIST implementation, the Action is to interpret the handwritten image.

Goals

The goal of this network is to determine the correct number written in the image. Every goal in Synapse is required to have a sensor. My rationale is that, just like any goal in life, you can never know whether you achieved a goal if you never measure it.

I will need to take the label that lets the system know what number is written in the image and encode it into a sensor. This goal sensor will then be used to optimize the network against when the system trains, or as I term it in Synapse, “sleeps”.

The goal sensor will use the exact same encoding pattern as the action. That is how the system will determine whether the goal was achieved for the image.

[Images: MNISTGoal, MNISTAction] The first image is the goal sense, the 2nd is the action. In this example the system properly interpreted a 4 as the most likely number.
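A sketch of that comparison, assuming the goal sensor is a simple one-hot vector and “achieved” means the action’s strongest feature lines up with the goal’s hot feature:

```python
def encode_label(number: int):
    """One-hot encode a label 0-9 into the same 10-feature layout as the action."""
    vec = [0.0] * 10
    vec[number] = 1.0
    return vec

def goal_achieved(goal_vec, action_vec) -> bool:
    """The goal is met when the action's strongest feature matches the goal's hot feature."""
    return action_vec.index(max(action_vec)) == goal_vec.index(max(goal_vec))

goal = encode_label(4)
# An action with 4 as the most likely number and 9 as the 2nd most likely:
action = [0.0, 0.1, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.6]
assert goal_achieved(goal, action)
```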

I think that this setup will be a good initial test of the system. Would you have done it differently?

Adding Supervised Learning to Synapse

I started my AI platform (Synapse) with the understanding that I wanted to make an AI that paralleled some human constructs, as the field of AI has too many of its own terms, which makes learning AI more complicated than it needs to be.

This meant that Synapse would be an unsupervised system. The difference between a supervised and an unsupervised system is labeled data versus unstructured data. An example of labeled data for supervised learning is the CAPTCHA tests you have to pass to prove you are not a robot when logging into your favorite service. By selecting the images that all have bikes in them, you are helping label image data. Without the label, the system cannot learn what a bike is. Labeled data provides the correct answer, while unsupervised learning has no right or wrong; it just learns the features of the data being processed. [What’s the difference between supervised and unsupervised?]

I made a definition for AI only slightly modified from a techemergence.com definition, and made the statement that all AIs could be defined with it. So does that definition apply to unsupervised AI alone, or to supervised as well?

“Artificial intelligence is an entity, able to receive and process inputs from the environment in order to flexibly act upon the environment in a way that will achieve goals over time.”

With this definition it is important to identify the new environment (MNIST data) and determine what each of the pieces are.

  • Environment:
    • MNIST Dataset
  • Inputs:
    • 28×28 pixel images
  • Actions:
    • Print a Number (0, 1, 2…9)
  • Goals:
    • Accurately Interpret the Numbers

The trick for a supervised system in Synapse is that in the unsupervised implementation, the goal is usually an ideal sensor state. For example, in my Super Truckin’ AI, the vehicle’s speed has a sensor, and there is a goal represented in the system of maximum speed. The system (theoretically at this point) would identify the relation between hitting the gas and getting closer to the goal of top speed, and learn to act by hitting the gas.

In a supervised system, if I were to provide the actual number as an input goal in addition to the pixels of the image, then the system would learn to ignore the image pixels and just repeat the goal as the number, which is essentially useless. That setup is like a round of Jeopardy where you need to answer with a question but are given the question already.

Since the goal is explicit in a supervised system, the system needs to optimize the output (action) against a dynamic goal (each image is a different number), not a static goal (go fast), since the goal is different and explicit for each image.

This implementation of MNIST is not bringing anything new to the world of AI, but I plan on using the MNIST dataset as a test of my neural network. I’ll start with version 1 of the network using some basic parameters and as the system evolves, I can use this data as a benchmark of progress.

I’ll let you know after I implement the “pivot” in the system and add supervised learning whether I have to revise the AI definition, but I think I have re-framed the problem in a way that solves it for a supervised implementation, even if it breaks some of my architectural constructs.

Do you disagree? I hope so, because then one of us is going to learn something…

Pivot… Hello World Synapse, with MNIST

I started working on Synapse (AI platform) with the goal of creating an AI that would drive the vehicles in my racing game Super Truckin’. Unfortunately, after attempting to get Super Truckin’ up and running I have determined I will have to “pivot”.

When I updated Super Truckin’ to the new Unity3D build, I was forced to update the Edy’s Vehicle Physics engine, which broke everything. There was no “automatic update” for the changes, and there were significant architectural changes in the physics.

Another challenge was that the libraries I was going to use for the feed-forward and backpropagation methods were in Encog (C#), and that library is completely incompatible with Unity, as Unity only supports .NET 2.0 and Encog is .NET 3.5. [Encog]

This doesn’t change my goal, as I was still planning on making my own feed-forward and backpropagation methods, but it does put things off for a while until I can build the neural network pieces of my platform.

With these bumps in the road, and my patience for seeing results running short, I have now chosen to make my first complete project with Synapse one based on the MNIST dataset, a “hello world” set of data for AI developers. For those not familiar, MNIST is a set of handwritten numbers (28×28 pixel images). [MNIST Data]

This does change my platform’s first implementation from an unsupervised network to a supervised network, since MNIST is labeled, meaning the system isn’t going to “learn” intuitively what the number is; it will have to be told after each attempt whether the number it believes it sees is the correct number. This is a drastic change, but one that tests whether my original definition of an AI is still valid, or if it was only valid for unsupervised AIs. [Original AI Definition]

Is it going to be easy to switch a system from unsupervised to supervised learning? We will find out…

 

Sensor Types of Synapse (my AI platform)

I am building an AI platform with 3 main components: Sensors (inputs), Actions (outputs), and Goals.

The initial version of this system is being built to try to make an automated vehicle AI for the racing game I made about 5 years ago. The original AI in that game was built with waypoints and required a lot of setup and tweaking for the vehicles to make their way around the course, and it did not have any “learning”, so it is an AI only in the video game sense, not the machine learning sense.

I am planning on adding 4 different sensors to the vehicle.

  1. Location Sensor
    1. This reads the location on the course, like a GPS sensor
  2. Proximity Sensor
    1. This reads if something is within range of the vehicle
  3. Rank Sensor
    1. This reads the current race position (last place, 2nd place…)
  4. Speed Sensor
    1. This reads the speedometer

These are more than enough inputs for the system to optimize its actions and learn how to race along the course.

As I started coding these, I realized that I was going to encode the sensor readings to vector inputs in two distinct ways.

  1. Metered
  2. Discrete

Metered Sensor
The speed sensor is a good example of a metered sensor. In a metered sensor, the larger the reading, the more features of the input are activated; the smaller the reading, the fewer. If the maximum speed for a vehicle is 100 mph and I have 20 feature dimensions on my sensor, then at 50 mph the sensor would have its first 10 features activated.

[Image: metered sensor encoding]
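A sketch of the metered encoding described above (20 features and a 100 mph maximum, per the example):

```python
def metered_encode(reading: float, max_reading: float, n_features: int = 20):
    """Metered sensor: the larger the reading, the more leading features activate."""
    active = round(n_features * reading / max_reading)
    return [1.0 if i < active else 0.0 for i in range(n_features)]

speed = metered_encode(50, 100)          # 50 mph of a 100 mph maximum
assert sum(speed) == 10                  # first 10 of 20 features are active
assert speed[:10] == [1.0] * 10
```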

Discrete Sensor
The location sensor is a good example of a discrete sensor. There should be no relationship between the location of the vehicle on the course and the number of features activated; it would make little sense to activate more features when the vehicle is at the top of the course versus the bottom. In this sensor type, each feature represents a specific position across the X, Y, and Z dimensions.

[Image: discrete sensor encoding]
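A sketch of a discrete encoding, assuming the course is quantized into a small 4×4×4 grid of cells (the grid resolution is my own illustrative choice):

```python
def discrete_encode(x: int, y: int, z: int, grid: int = 4):
    """Discrete sensor: one feature per grid cell; exactly one feature activates.
    x, y, z are cell indices on a grid x grid x grid quantization of the course."""
    vec = [0.0] * (grid ** 3)
    vec[(x * grid + y) * grid + z] = 1.0
    return vec

loc = discrete_encode(1, 2, 3)
assert sum(loc) == 1.0                   # exactly one feature fires
assert loc[(1 * 4 + 2) * 4 + 3] == 1.0   # feature index 27
```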

Theoretically all sensors could be metered, and the system could learn the discrete elements of the sensor reading. I am going to shortcut this, but maybe it is worth testing after the system is up and running and I have the second phase implemented, which will create more complex internal representations and “learn”.

I can’t think of any other sensor encoding types one could build to interpret raw sensor readings as vectors, so until the need is identified, these will be the base sensor types.

UPDATE: I did think of another way of encoding a number. It could just be a single feature dimension. Pretty straightforward actually… I might have overcomplicated things. 🙂
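For comparison, that single-feature version is just the normalized reading:

```python
def scalar_encode(reading: float, max_reading: float):
    """Single-feature alternative: one normalized value instead of a metered bank."""
    return [reading / max_reading]

assert scalar_encode(50, 100) == [0.5]
```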

Do you disagree? I hope you do because then one of us will learn something…

 

Definition and Elements of Synapse (my AI platform)

 

There is a really good definition of AI that has been put together by Techemergence.com.

“Artificial intelligence is an entity (or collective set of cooperative entities), able to receive inputs from the environment, interpret and learn from such inputs, and exhibit related and flexible behaviors and actions that help the entity achieve a particular goal or objective over a period of time.”

https://www.techemergence.com/what-is-artificial-intelligence-an-informed-definition/

The reason I think it is so good is that it has the three elements I have identified in my AI platform, and in all other platforms, whether they are aware of it or not: inputs (sensors), actions (outputs), and goals (objectives). What a system made up of these things does is process them. This definition is the best one I have found, but if I were to criticize it, it is a bit too verbose. I would change it in this way.

“Artificial intelligence is an entity, able to receive and process inputs from the environment in order to flexibly act upon the environment in a way that will achieve goals over time.”

I believe that inputs, actions, and goals are THINGS (objects) in an AI, and processing is the thing an AI does. I think this definition is a bit more direct than the techemergence.com definition, and for someone writing an object-oriented solution it would be a bit more specific.

It is important to differentiate that the action in an AI system is an object or THING, not a verb or actual action. An AI system only processes, and the outputs of the AI system are the actions represented in an object. The action, if embodied, would then be interpreted by a motor system.

An oversimplification of an AI workflow would be something like the one below.

[Image: simplified AI workflow]

The Actions piece of this workflow would then be ingested by a motor system to execute in the environment, but the instruction for the action is the output of the AI system.

Do you disagree? I hope so, because then one of us is going to learn something…

Changing the format…

This WordPress site was put together initially to promote and demonstrate the progress of a project I was working on with my brother and a friend. That project is scrapped, but the name and the site are too good to just let sit.

I am going to start to post my work on AI – so instead of “Developments on Development” it’s going to be “Thoughts on Thoughts”.

There will be some well thought out articles, and then there will be random musings and thought experiments. I’ve been working on my AI project largely by myself at home and there is a big risk of getting tunnel vision, so I am going to take advantage of Cunningham’s law and start putting my wrong answers up here and hope someone corrects me.

Enjoy! Hopefully we both learn something in this process.

-Alex