6.1 Shifter Problem
6.2 Single Shifter
6.3 Double Shifter
All you do to solve a shifter-problem is to click a few buttons; however, first we will explain the problem.
The shifter problem generates a pattern, shifts the pattern in one of many possible ways and asks a neural network to identify the shifting.
This is obviously synthetic data instead of real data, such as a person's signature. The shifter problem is chosen because:
(1) the number of neurons is arbitrary, allowing you to work with many different sizes of neural networks;
(2) the amount of data is arbitrary, allowing you to study the relation between the amount of training data and the results;
(3) this is a very hard problem due to the large sample
space.
Example 4-bit Shifter problem.
100
1100 1001 Left Shift,
Wrap-around
001
1100 0110 Right Shift
010
1100 1100 No Shift
111
1111 0000 Not Valid
When you shift a pattern once, it is a single shifter; when you shift a pattern twice, it is a double shifter, ... Shifter data is used to train the neural network. Once training is finished, the network has to solve two types of problems: identify a shift or complete a shift.
Two files are used: training, testing. You can choose
the file names by clicking: "Data/Link". For the moment, we suggest you
to use the default file names:
Training file: example1a.txtTo generate training data from ABM, click:
Testing file: example1b.txt
Example/Shifter/Data.
This command will also open the two files and set
up the neural network.
The default values (for the shifter problem) are:
15-bit Shifter (33 neurons),
250 training-patterns,
50 testing-patterns.
To change the default settings, click:
Example/Shifter/Parameter.Example. A 15-Bit Shifter's training file is:
Shifter, Training Data File
*
33
010 000001000000010
000001000000010
100 101000000110010
010000001100101
010 000001110011101
000001110011101
... ...
010
means no shift;
000001000000010 is a 15-bit pattern;
and
000001000000010 is generated
by 'no shift'.
Use the command "Data/Symmetry" to read or set the symmetry parameters. In this problem, it indicates that the shifter has a 1-dimensional x-translation symmetry. This is very important for training the network.
Now that the network is initialized and the data is generated, it is ready to be trained and tested. Click Run/Classification. This command takes no time.
If you click Run/Distribution, the Boltzmann Machine will present you with a distribution, which is a set of possibilities and the probability for each possibility. The one with the largest number is the preferred answer. The following is an example of the output for a shifter problem:
xxx
001101000101111
100110100010111
Distribution begins:
100 2
001 50
End of the Distribution
001
001101000101111
100110100010111 50 1
xxx
010000001111001
100000011110010
Distribution begins:
010 7
100 440
001 33
End of the Distribution
100
010000001111001
100000011110010 440 1
In each instance, the testing pattern is printed first. This is followed by all possible solutions and a relative probability for each possibility. In the first instance, 001 ( Right Shift ) is the answer; and in the second instance, 100 ( Left Shift ) is the answer. In both cases, the answers are correct.
The probabilities printed next to each possibility are relative probabilities. In the first example, the probabilities for 100, 010, and 001 are 2/52, 0, and 50/52, respectively. The probabilities are not normalized because they carry other information: the larger the number is, the more reliable the answer is.
In this example, 45 or so testing-patterns are recognized. If you are satisfied, copy the unrecognized patterns to the training file and run again (with xxx being replaced properly). Then you should get 100% correct classification. Alternatively, generate more training data by clicking:
Example/Shifter/Parameter,
And reset the number of training patterns.
Click commands "Run/Classification" or "Run/Distribution", you can expect three possibilities for each testing pattern:
Incorrect classification: this is usually the case when: the Boltzmann Machine assigns a predominate probability to more than one configuration; or the relative probability is "small". It is hard to define "small" here because it depends on the problem, but for a given problem, an experienced user will know what is "small".
No classification: this is usually the case when
the Boltzmann Machine can find little correlation in the test pattern,
based on the current training. In such a case, "No classification " is
printed in the output data file.
If you shift a pattern twice, you will get a double shifter. Here are two examples:
010001
100100001000000
100100001000000
010010000100000
100010
101000111110110
010001111101101
101000111110110.
The possible shifts are:
LL 100100
NL 010100
RL 001100
LN 100010
NN 010010
RN 001010
LR 100001
NR 010001
RR 001001
Where RL means Right-Left shifts. The testing patterns
look like this:
xxxxxx
100000000001011
000000000010111
000000000010111
The output data file for "Run/Distribution" looks like this:
xxxxxx
100000000001011
000000000010111
000000000010111
Distribution begins:
100100 260
010100 128
010010 32
010001 32
001100 80
001001 32
End of the Distribution
100100
100000000001011
000000000010111
000000000010111 260 1
The output data file contains the original vector, plus
several possible shifts. Finally, the answer is printed. Each possibility
is associated with a number:
for class 100100, the number is 260;
for class 010100, the number is 128;
for class 010010, the number is 32;
. . .
The predominate number here is 260, making the preferred
answer 100100 (the left-left shift), which is the correct answer. These
numbers are relative probabilities. Note, the probabilities printed next
to each possibility are relative probabilities. The probabilities are not
normalized because they carry other information: the larger the number
is, the more reliable the answer is.
If you use "Run/Classification" instead of "Run/Distribution", then the results will look like this:
xxxxxx
100000000001011
000000000010111
000000000010111
100100
100000000001011
000000000010111
000000000010111 260 1
In the output data file, the original testing-vector is printed first, followed by the best match output-vector.