For my machine learning class I created a neural network and a visualization to show how it learns.
What is a neural network? Basically, it's a computer model inspired by the way your brain works. The simplest version only has inputs and outputs. The synapses between the inputs and outputs start with small random weights (think multipliers), which get adjusted in small increments over many epochs (passes through the training data) so that the network's output approaches the desired output.
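To make the weight-nudging idea concrete, here is a minimal Python sketch of that simplest case: one layer of weights, a linear output, and the delta rule for the updates. It is not the project's code; the function name, learning rate, epoch count, and toy samples are all assumptions made up for illustration.

```python
import random

def train_single_layer(samples, epochs=2000, learning_rate=0.1, seed=0):
    """Train one layer of weights (no hidden layers) with the delta rule."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Start with small random weights.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    for _ in range(epochs):
        for inputs, target in samples:
            # Output is just the weighted sum of the inputs.
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Nudge each weight a little in the direction that shrinks the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights

# Toy data (made up): target = 0.5 * first input + 0.25 * second input.
samples = [([1.0, 0.0], 0.5), ([0.0, 1.0], 0.25), ([1.0, 1.0], 0.75)]
weights = train_single_layer(samples)
print([round(w, 3) for w in weights])
```

On this toy data the two weights should settle close to 0.5 and 0.25, the multipliers the targets were built from.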
It gets really interesting when you add more layers between the inputs and outputs, called hidden layers. These allow the neural net to, in a sense, hold abstractions and generalize from the training data.
For this project, I took data from the last 70 years and built a network with four layers: six inputs, four neurons in each of the two hidden layers, and one output. (Specifically, I built a feedforward network trained with supervised backpropagation, which is what I'm discussing here. There are many other types of neural nets.)
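As a rough picture of that 6-4-4-1 shape, the sketch below pushes one set of inputs forward through randomly initialized layers in plain Python. It is an illustration rather than the Encog network itself: the sigmoid activation, the bias weights, the initial weight range, and the helper names are assumptions. The sample inputs are the first training row from the console output further down.

```python
import math
import random

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def make_network(layer_sizes, seed=0):
    """Build layers of neurons; each neuron is a list of weights plus a bias."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        layer = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                 for _ in range(n_out)]
        network.append(layer)
    return network

def forward(network, inputs):
    """Feed the inputs through every layer and return the final outputs."""
    activations = list(inputs)
    for layer in network:
        # neuron[-1] is the bias; zip() pairs the other weights with the inputs.
        activations = [
            sigmoid(neuron[-1] + sum(w * a for w, a in zip(neuron, activations)))
            for neuron in layer
        ]
    return activations

# Six inputs, two hidden layers of four neurons, one output.
network = make_network([6, 4, 4, 1])
first_row = [0.66, 0.28, 0.534, 0.324, 0.18, 0.73]
prediction = forward(network, first_row)
print(prediction)
```

With untrained random weights the single output is meaningless; training is what moves it toward the ideal value for that row.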
As inputs, I used the number of Democrats and Republicans in the Senate and in the House (four inputs), the active strength of the US military, and the unemployment rate for the previous year. As output, I used the unemployment rate for that year, as that's what I wanted to predict.
The video above shows the neural net learning by adjusting the synapses between the neurons (animated in gray), then running through the training data, and finally through the testing data (i.e., data that it wasn't trained on; in this case, the most recent 15 years).
In the end, my neural network was much more accurate than I expected. The average error on the testing data was 11%, meaning that, in theory, if you give the network inputs for a given year (House and Senate makeup, military strength, last year's unemployment), its prediction will be off by about 11% of the true rate on average. So if the true rate was, say, 8.3%, a typical prediction would land somewhere between 7.39% and 9.21%. Not too bad. (Frankel and other economist friends, feel free to chime in about the various ways this is wrong and makes no sense.)
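The range quoted above is just the true rate scaled down and up by the relative error. A tiny Python sketch of that arithmetic (the function name is made up for illustration):

```python
def error_band(true_rate, relative_error):
    """Return the (low, high) values that sit relative_error away from true_rate."""
    low = true_rate * (1 - relative_error)
    high = true_rate * (1 + relative_error)
    return low, high

low, high = error_band(8.3, 0.11)
print(f"{low:.3f}% to {high:.3f}%")  # prints "7.387% to 9.213%"
```

Keep in mind that the 11% is an average over the test years, so individual years can fall outside this band.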
To do this, I used the excellent Encog library for Java. I like Python more than Java, so I messed with PyBrain a lot, but Encog has every feature conceivable, and it works with Processing, which made visualizing easier.
Before I got to what you see in the visualization, I tried a lot of different, similar neural nets. First, I tried it without the temporal data, that is, without last year's unemployment. This gave me a 17% error rate when testing on the last 15 years of data. Next, I tried testing on a random selection of 15 years instead, and the error rate went down slightly. Finally, I added the extra input of last year's unemployment and, with a ton of training (much more than in the video), achieved an 11% error rate on the test data. I also tried a wide range of hidden-layer sizes, from 2 to 500 neurons, but 4 turned out to be best.
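The two ways of holding out 15 years mentioned above can be sketched in Python as follows. This is an illustration only: the function names and the fixed seed are assumptions, and the rows are stand-in indices for the 70 yearly records rather than the real data.

```python
import random

def chronological_split(rows, n_test):
    """Hold out the most recent n_test rows as test data."""
    return rows[:-n_test], rows[-n_test:]

def random_split(rows, n_test, seed=0):
    """Hold out n_test randomly chosen rows as test data."""
    rng = random.Random(seed)
    test_indices = set(rng.sample(range(len(rows)), n_test))
    train = [row for i, row in enumerate(rows) if i not in test_indices]
    test = [row for i, row in enumerate(rows) if i in test_indices]
    return train, test

# Stand-in for the 70 yearly records, oldest first.
rows = list(range(70))

train_recent, test_recent = chronological_split(rows, 15)
train_random, test_random = random_split(rows, 15)
print(len(train_recent), len(test_recent))
print(len(train_random), len(test_random))
```

A chronological hold-out asks the harder question (predicting years after everything the network has seen), which may be part of why the random selection scored slightly better.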
There were two areas in which I had trouble. The first was data normalization using Encog (neural nets like data to be between 0 and 1), so I ended up normalizing the data myself in a Python script I was already using to aggregate the data.
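One common way to squeeze a column of numbers into the 0 to 1 range is min-max scaling, sketched below in Python. This is an assumption about the general technique, not a reconstruction of the actual script, and the sample rates are made up.

```python
def min_max_normalize(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # All values identical: nothing to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]

def denormalize(scaled, lowest, highest):
    """Map a scaled value back to the original units."""
    return lowest + scaled * (highest - lowest)

# Hypothetical unemployment rates in percent.
rates = [4.0, 6.0, 10.0]
scaled = min_max_normalize(rates)
print(scaled)
print(denormalize(scaled[1], min(rates), max(rates)))
```

Whatever scaling is used for the output column has to be reversed, as in `denormalize`, to turn a prediction back into a percentage.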
The second was displaying the actual values moving through the network while it was being tested. I rewrote most of my code in an attempt to add this (and learned how to use the iterable object correctly), as I think it would be really useful for understanding the way the network works, but I couldn't get the values for each layer to display correctly. With some time, I'd really like to add that.
The source code is available here, though it doesn't really resemble what's in the video. Below is the console output for the most successful run of the network. Because the weights start out random, every training run is different.
Epoch #5447 Error:1.4999955997114394E-4
Neural Network Results for Training Data:
0.66,0.28,0.534,0.324,0.18,0.73, actual=0.4983663395292939,ideal=0.495
0.66,0.28,0.534,0.324,0.3859,0.495, actual=0.23436250492049088,ideal=0.235
0.57,0.38,0.444,0.418,0.9045,0.235, actual=0.08452120095631022,ideal=0.095
0.57,0.38,0.444,0.418,1.1451,0.095, actual=0.0815033122889647,ideal=0.06
0.57,0.38,0.486,0.38,1.2056,0.06, actual=0.08130640799774146,ideal=0.095
0.57,0.38,0.486,0.38,0.3025,0.095, actual=0.24327423747050617,ideal=0.195
0.45,0.51,0.376,0.492,0.1582,0.195, actual=0.20086019727739185,ideal=0.195
0.45,0.51,0.376,0.492,0.1445,0.195, actual=0.2008001851266592,ideal=0.19
0.54,0.42,0.526,0.342,0.1613,0.19, actual=0.27778673711288804,ideal=0.295
0.54,0.42,0.526,0.342,0.1459,0.295, actual=0.28814359093564434,ideal=0.265
0.48,0.47,0.468,0.398,0.325,0.265, actual=0.16153750109068066,ideal=0.165
0.48,0.47,0.468,0.398,0.3635,0.165, actual=0.13558489288725128,ideal=0.15
0.46,0.48,0.426,0.442,0.3555,0.15, actual=0.17602786013114474,ideal=0.145
0.46,0.48,0.426,0.442,0.3303,0.145, actual=0.2199931108910821,ideal=0.275
0.48,0.47,0.464,0.406,0.2935,0.275, actual=0.23021456016162994,ideal=0.22
0.48,0.47,0.464,0.406,0.2807,0.22, actual=0.253903192793502,ideal=0.205
0.49,0.47,0.468,0.402,0.2795,0.205, actual=0.2555298780817681,ideal=0.215
0.49,0.47,0.468,0.402,0.2599,0.215, actual=0.26123536179631845,ideal=0.34
0.64,0.34,0.566,0.306,0.2504,0.34, actual=0.27390713255535215,ideal=0.275
0.64,0.34,0.566,0.306,0.2476,0.275, actual=0.29709252965041594,ideal=0.275
0.64,0.36,0.524,0.35,0.2483,0.275, actual=0.2855653767340758,ideal=0.335
0.64,0.36,0.524,0.35,0.2808,0.335, actual=0.24978578065111492,ideal=0.275
0.67,0.33,0.516,0.352,0.27,0.275, actual=0.27536312822913894,ideal=0.285
0.67,0.33,0.516,0.352,0.2687,0.285, actual=0.2765217655180545,ideal=0.26
0.68,0.32,0.59,0.28,0.2656,0.26, actual=0.2554208345203682,ideal=0.225
0.68,0.32,0.59,0.28,0.3093,0.225, actual=0.1939832730643538,ideal=0.19
0.64,0.36,0.496,0.374,0.3375,0.19, actual=0.19772126359928496,ideal=0.19
0.64,0.36,0.496,0.374,0.3547,0.19, actual=0.16332311108949626,ideal=0.18
0.58,0.42,0.486,0.384,0.346,0.18, actual=0.16442127838161877,ideal=0.175
0.58,0.42,0.486,0.384,0.3066,0.175, actual=0.24245546463834086,ideal=0.245
0.54,0.44,0.51,0.36,0.2714,0.245, actual=0.2558851329275245,ideal=0.295
0.54,0.44,0.51,0.36,0.2324,0.295, actual=0.2954853700500669,ideal=0.28
0.56,0.42,0.484,0.384,0.2253,0.28, actual=0.2681779921075669,ideal=0.245
0.56,0.42,0.484,0.384,0.2163,0.245, actual=0.26151565006675204,ideal=0.28
0.61,0.37,0.582,0.288,0.2129,0.28, actual=0.3470994484698628,ideal=0.425
0.61,0.37,0.582,0.288,0.2081,0.425, actual=0.37127167300622715,ideal=0.385
0.61,0.38,0.584,0.286,0.2075,0.385, actual=0.3441559331912948,ideal=0.355
0.61,0.38,0.584,0.286,0.2062,0.355, actual=0.34953890305512214,ideal=0.305
0.58,0.41,0.554,0.316,0.2031,0.305, actual=0.33600392793467077,ideal=0.29
0.58,0.41,0.554,0.316,0.2063,0.29, actual=0.3332303554084988,ideal=0.355
0.46,0.53,0.484,0.384,0.2101,0.355, actual=0.38391068153643754,ideal=0.38
0.46,0.53,0.484,0.384,0.213,0.38, actual=0.4706693478188674,ideal=0.485
0.46,0.54,0.538,0.332,0.2163,0.485, actual=0.4305970376393928,ideal=0.48
0.46,0.54,0.538,0.332,0.2184,0.48, actual=0.42942131121472155,ideal=0.375
0.47,0.53,0.506,0.364,0.2207,0.375, actual=0.3791308632978942,ideal=0.36
0.47,0.53,0.506,0.364,0.2233,0.36, actual=0.3408292536305523,ideal=0.35
0.55,0.45,0.516,0.354,0.2244,0.35, actual=0.3180255945111218,ideal=0.31
0.55,0.45,0.516,0.354,0.2209,0.31, actual=0.3054461292681537,ideal=0.275
0.55,0.45,0.52,0.35,0.2203,0.275, actual=0.30308948074905867,ideal=0.265
0.55,0.45,0.52,0.35,0.2144,0.265, actual=0.3008652986844994,ideal=0.28
0.56,0.44,0.534,0.334,0.2077,0.28, actual=0.3167321964928064,ideal=0.34
0.56,0.44,0.534,0.334,0.188,0.34, actual=0.33105436361549073,ideal=0.375
0.57,0.43,0.516,0.352,0.1775,0.375, actual=0.35928649461540163,ideal=0.345
0.57,0.43,0.516,0.352,0.1678,0.345, actual=0.3048710453119955,ideal=0.305
0.48,0.52,0.408,0.46,0.1583,0.305, actual=0.2624632144627692,ideal=0.28
-------------------------------
-------------------------------
Neural Network Results for NEW DATA:
0.48,0.52,0.408,0.46,0.1538,0.28, actual=0.23930077726065127, ideal=0.27, percentage error=0.11370082496055091
0.45,0.55,0.414,0.452,0.1504,0.27, actual=0.24239373755821966, ideal=0.245, percentage error=0.010637805884817685
0.45,0.55,0.414,0.452,0.147,0.245, actual=0.22977833679284043, ideal=0.225, percentage error=0.021237052412624118
0.45,0.55,0.422,0.446,0.1451,0.225, actual=0.22768245012752356, ideal=0.21, percentage error=0.08420214346439793
0.45,0.55,0.422,0.446,0.1449,0.21, actual=0.2253461254045889, ideal=0.2, percentage error=0.12673062702294444
0.5,0.5,0.424,0.442,0.1451,0.2, actual=0.21840667496538055, ideal=0.235, percentage error=0.07060989376433804
0.5,0.5,0.424,0.442,0.1478,0.235, actual=0.22297073742023873, ideal=0.29, percentage error=0.23113538820607327
0.48,0.51,0.41,0.458,0.15,0.29, actual=0.24921051256244264, ideal=0.3, percentage error=0.16929829145852449
0.48,0.51,0.41,0.458,0.1494,0.3, actual=0.25960230146033725, ideal=0.275, percentage error=0.055991631053319176
0.44,0.55,0.404,0.462,0.1455,0.275, actual=0.24845814093718874, ideal=0.255, percentage error=0.025654349265926538
0.44,0.55,0.404,0.462,0.1456,0.255, actual=0.23292273464134336, ideal=0.23, percentage error=0.012707541918884128
0.49,0.49,0.472,0.398,0.1451,0.23, actual=0.24803874970006634, ideal=0.23, percentage error=0.07842934652202753
0.49,0.49,0.472,0.398,0.1474,0.23, actual=0.24857095999001486, ideal=0.29, percentage error=0.14285875865512113
0.58,0.4,0.514,0.356,0.1493,0.29, actual=0.26828403415745933, ideal=0.465, percentage error=0.4230450878334208
0.58,0.4,0.514,0.356,0.1506,0.465, actual=0.5537600529394734, ideal=0.48, percentage error=0.15366677695723627
Average Error for Test data 0.11466037
Hey there. End of program.
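The "percentage error" column in the output above appears to be |actual - ideal| / ideal for each test row, and the final figure is the mean of that column. The Python sketch below recomputes both from the 15 (actual, ideal) pairs printed for the new data; the variable and function names are my own, and the formula is inferred from the numbers rather than taken from the source code.

```python
# (actual, ideal) pairs copied from the "NEW DATA" rows of the console output.
test_results = [
    (0.23930077726065127, 0.27),
    (0.24239373755821966, 0.245),
    (0.22977833679284043, 0.225),
    (0.22768245012752356, 0.21),
    (0.2253461254045889, 0.2),
    (0.21840667496538055, 0.235),
    (0.22297073742023873, 0.29),
    (0.24921051256244264, 0.3),
    (0.25960230146033725, 0.275),
    (0.24845814093718874, 0.255),
    (0.23292273464134336, 0.23),
    (0.24803874970006634, 0.23),
    (0.24857095999001486, 0.29),
    (0.26828403415745933, 0.465),
    (0.5537600529394734, 0.48),
]

def relative_error(actual, ideal):
    """Size of the miss as a fraction of the ideal value."""
    return abs(actual - ideal) / ideal

errors = [relative_error(actual, ideal) for actual, ideal in test_results]
average_error = sum(errors) / len(errors)
worst_error = max(errors)

print(f"average error: {average_error:.4f}")
print(f"worst single-year error: {worst_error:.4f}")
```

The average lands on the reported 0.1147, while the worst single year misses by about 42%, which is a reminder that the 11% figure is an average and not a guarantee for any one year.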