March 29, 2010
Understanding the Bitworm NuPIC HTM Example Program, Part 1
Now for my least favorite part of intellectual projects: figuring out someone else's computer code.
When I installed the NuPIC package, a program called Bitworm ran to show that NuPIC had installed correctly. Bitworm's main program, RunOnce.py, is a Python script and might be characterized as the simplest meaningful example program, which makes it considerably more complicated than your typical Hello World one-liner.
An explanation of Bitworm, with instructions for running and playing with it, can be found in Getting Started With NuPIC (see pages 14-23). If you open RunOnce.py (mine conveniently opened in IDLE, "Python's Integrated Development Environment"), there is a good outline of the process there too.
The point is to test an HTM (Hierarchical Temporal Memory) with a simple data set. If you got here without knowing about HTMs, see www.numenta.com or my glosses starting with Evaluating HTMs, Part 1.
Bitworm, or RunOnce, starts by creating a minimal HTM. It does this by importing nodes and components using functions that are part of the NuPIC package. It also sets some parameters which have already been built elsewhere. Then the HTM is trained using another already-created data set of bitworms, which are essentially short binary strings, easily visualized if 1's are interpreted as black and 0's as white (or whatever colors you like). Later I'll want to look inside the nodes, and at how nodes are interconnected, in order to understand why this works, but for now I'll keep to the top-level view.
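To make the black-and-white picture concrete, here is a minimal sketch in plain Python that prints a binary vector as a row of characters. It does not use NuPIC at all; the function name render_bitworm and the example vector are my own inventions for illustration, not something taken from the Bitworm data files.

```python
def render_bitworm(bits, on="#", off="."):
    """Return a one-line picture of a binary vector: `on` for 1, `off` for 0."""
    return "".join(on if b else off for b in bits)


# A made-up vector for illustration only (not real Bitworm data).
example = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(render_bitworm(example))  # prints ...#####........
```

Swap the on and off characters for anything you like; the point is only that a run of 1's shows up as a solid "worm" against a blank background.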
To test whether the NuPIC HTM network learned to distinguish two types of bitworms, the training data set is presented again to see what outputs the HTM gives. This is also known as pattern recognition, but in temporal memory talk we prefer the term inference. The bitworms are examples of causes (objects in most other systems), and the HTM infers, from the data, which causes are being presented to it.
That seems like too easy a trick, inferring causes based on the training set, so RunOnce also sees how the trained network does when trying to infer causes from a somewhat different set of data.
As output, RunOnce gives us the percentages of correct inferences for the training set and the second data set, plus some information about the network itself.
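The percentage of correct inferences is simple arithmetic: count how often the inferred category matches the expected one, and divide by the number of vectors. The following is a rough sketch of that idea only, not RunOnce's actual code; the function percent_correct and the two small lists are hypothetical.

```python
def percent_correct(inferred, expected):
    """Percentage of positions where the inferred category equals the expected one."""
    if len(inferred) != len(expected):
        raise ValueError("both lists must have one entry per data vector")
    if not expected:
        raise ValueError("need at least one data vector")
    matches = sum(1 for got, want in zip(inferred, expected) if got == want)
    return 100.0 * matches / len(expected)


# Hypothetical categories for five vectors (not real Bitworm output).
inferred = [0, 1, 1, 0, 1]
expected = [0, 1, 0, 0, 1]
print(percent_correct(inferred, expected))  # prints 80.0
```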
Presuming that you are using Windows, to run Bitworm with RunOnce.py, open a command prompt (press Start and type Command in the search box. This should show Command Prompt at the top of the program list. Click it once. Since you will need Command Prompt often, you might also return to Start, right-click on Command Prompt, and choose Pin to Start Menu, so that it is always in your Start Menu. Or create a shortcut).
and hit Enter. That will get you in the right directory.
Then run RunOnce by typing the following and hitting Enter:
If you get errors, you need to run the Command Prompt as an Administrator. Close the window, then right-click on Command Prompt and choose Run As Administrator. Click through the security warnings.
The output says there were two sets of 420 data vectors written. Inference with the training set as input data was 100% accurate. Inference with the second data set was 97.85...% accurate.
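As a quick sanity check on that second figure, one can ask which whole number of correct inferences out of 420 would give a percentage beginning 97.85. This sketch assumes the second data set also holds 420 vectors and that "97.85..." is a truncated decimal; both are my reading of the output, not something the report states.

```python
total = 420  # assumed size of the second data set

# Keep every count k whose percentage starts with 97.85.
candidates = [k for k in range(total + 1)
              if 97.85 <= 100.0 * k / total < 97.86]
print(candidates)  # prints [411]
```

Under those assumptions, the network got 411 of the 420 vectors right and missed 9.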
As it says, you can also open report.txt. Here's what mine says:
General network statistics:
Node Level1 has 40 coincidences and 7 groups.
Comparing: test_results.txt with test_categories.txt
Getting groups and coincidences from the node Level1 in network ' trained_bitworm.xml
====> Group = 0
====> Group = 1
====> Group = 2
====> Group = 3
====> Group = 4
====> Group = 5
====> Group = 6
Full set of Level 2 coincidences: