Lab 6: Fun with Autoassociators
The patterns in "auto1.pat" have identical inputs and outputs. The 8 patterns are given below:
1 1 0 1 0 0 1
1 0 0 1 1 1 1
1 0 1 1 1 0 0
1 1 0 1 0 0 0
0 1 0 0 0 0 1
0 1 0 0 0 1 0
0 0 0 0 1 0 0
0 0 1 0 1 1 0
First consider these patterns informally. Are there natural subsets based on similarity? Remember that for a hidden unit looking down on this input array, unit adjacency is irrelevant, but unit identity is not. You should be able to identify several possible bases for grouping at least some of these patterns together. Which pairs are most similar?
Using your cluster analysis skills, examine the patterns for similarity relations.
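If you want a mechanical check on your informal groupings, the sketch below is one way to do it in plain Python with numpy and scipy. It is not the lab's cluster plot tool, and the choice of Hamming distance with average-linkage clustering is mine; compare the groups it reports with the ones you found by eye.

# A minimal sketch using numpy/scipy (not the lab's cluster plot tool).
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

# The eight 7-unit patterns from auto1.pat, one per row.
patterns = np.array([
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0],
], dtype=float)

# For 0/1 vectors, city-block distance equals the Hamming distance
# (the number of units on which two patterns differ).
dist = pdist(patterns, metric="cityblock")
print(squareform(dist).astype(int))              # 8 x 8 distance matrix

# Average-linkage hierarchical clustering of the patterns.
tree = linkage(dist, method="average")
print(fcluster(tree, t=2, criterion="maxclust"))  # two-way split
print(fcluster(tree, t=4, criterion="maxclust"))  # finer grouping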
Now train a 7-n-7 network to map inputs to outputs. You will have to experiment to find a suitable value of n.
Examine the hidden unit patterns in the trained networks. Do they reveal the same similarity relations you found in either of the two previous steps?
What is the smallest value of n you can use and still get the network to learn?
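To see what is involved in these three steps, here is a rough, self-contained numpy sketch of a 7-n-7 autoassociator trained with plain backpropagation. It is not the lab simulator, and the helper names (train_autoencoder, forward), learning rate, epoch count, seed, and the "every output unit rounds to its target" success test are my own choices. It trains with 3 hidden units, prints the hidden-unit activations for each pattern, and then tries progressively smaller values of n; if a given n fails, it may simply need more epochs or a different seed.

# A rough numpy sketch of a 7-n-7 autoassociator (not the lab simulator).
# The hyperparameters and the "rounds to target" test are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

patterns = np.array([
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0],
], dtype=float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    # Return (hidden, output) activations for input x.
    h = sigmoid(x @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def train_autoencoder(X, n_hidden, lr=0.5, epochs=20000):
    # Train an n_in - n_hidden - n_in sigmoid network on X -> X with
    # batch backpropagation (sum-squared error).
    n_in = X.shape[1]
    W1 = rng.normal(0.0, 0.5, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        h, o = forward(X, W1, b1, W2, b2)
        d_o = (o - X) * o * (1 - o)          # output-layer deltas
        d_h = (d_o @ W2.T) * h * (1 - h)     # hidden-layer deltas
        W2 -= lr * h.T @ d_o;  b2 -= lr * d_o.sum(axis=0)
        W1 -= lr * X.T @ d_h;  b1 -= lr * d_h.sum(axis=0)
    return W1, b1, W2, b2

# Train with 3 hidden units and inspect the hidden-unit patterns.
W1, b1, W2, b2 = train_autoencoder(patterns, n_hidden=3)
hidden, output = forward(patterns, W1, b1, W2, b2)
print(np.round(hidden, 2))                   # one row per pattern; compare with your clustering
print((np.round(output) == patterns).all())  # True if all 8 patterns were learned

# How small can n be?  Retrain with n = 4, 3, 2, 1 and test whether every
# output unit rounds to its target.
for n in (4, 3, 2, 1):
    W1, b1, W2, b2 = train_autoencoder(patterns, n_hidden=n)
    _, out = forward(patterns, W1, b1, W2, b2)
    print(n, "hidden units:", "learned" if (np.round(out) == patterns).all() else "not learned")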
Content addressable memory
When you address an envelope, the contents of the envelope are strictly independent of the address you write on the outside. Computer memory works the same way: the address used to access some content is independent of the content itself. In a content-addressable memory (CAM), or associative memory, things are different: the address and the content are linked, so the way you access something is not independent of what that thing is. An auto-associative network can be interpreted as a form of CAM. For the following exercises, load your simulation that has learned to autoassociate the above set of patterns using 3 hidden units.
You can edit pattern files with a simple text editor (not Word). Have a look at the existing ones to get to grips with the required format.
Set up a test pattern consisting of a degraded form of one of the learned patterns. For example, your test pattern could be:
<1 1 0.5 0.5 0 0 1>
Here, the two 0.5 entries are ambiguous between values of 0 and 1. However, looking at the set of patterns, you can see that this is nevertheless enough information to uniquely specify the first pattern. Don't worry about the test target; that is irrelevant here. Now, what output does the network produce for this pattern? Repeat this with many variants on the same theme: degrade a pattern, and observe the output. You should find that a partial cue is sufficient to recover a whole pattern. This is because the cue (address) and the content (stored pattern) are deeply linked.
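If you want to reproduce this outside the simulator, the self-contained numpy sketch below trains a 7-3-7 autoassociator in the same way as the earlier sketch and then presents a few degraded cues. Again, the learning rate, epoch count, seed, and the particular cues are illustrative choices of mine.

# Self-contained numpy sketch (not the lab simulator): train a 7-3-7
# autoassociator with plain backpropagation, then present degraded cues.
# Hyperparameters are illustrative; if training fails, try more epochs
# or another seed.
import numpy as np

rng = np.random.default_rng(0)

patterns = np.array([
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 0],
], dtype=float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Batch backpropagation on the autoassociation task (targets == inputs).
W1 = rng.normal(0.0, 0.5, (7, 3)); b1 = np.zeros(3)
W2 = rng.normal(0.0, 0.5, (3, 7)); b2 = np.zeros(7)
lr = 0.5
for _ in range(20000):
    h = sigmoid(patterns @ W1 + b1)
    o = sigmoid(h @ W2 + b2)
    d_o = (o - patterns) * o * (1 - o)
    d_h = (d_o @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_o;  b2 -= lr * d_o.sum(axis=0)
    W1 -= lr * patterns.T @ d_h;  b1 -= lr * d_h.sum(axis=0)

# Present degraded cues; the test target is irrelevant, we only look at
# what output the network produces.  These example cues are my own.
cues = np.array([
    [1, 1, 0.5, 0.5, 0, 0, 1],    # degraded version of pattern 1
    [1, 0, 0.5, 1, 1, 0.5, 1],    # degraded version of pattern 2
    [0.5, 1, 0, 0, 0, 1, 0.5],    # degraded version of pattern 6
])
for cue in cues:
    out = sigmoid(sigmoid(cue @ W1 + b1) @ W2 + b2)
    print(cue, "->", np.round(out, 2))
# With successful training, each output should be close to the intact
# stored pattern that the cue most resembles.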
How much can you degrade a pattern and still have it recalled? Is its similarity to other patterns important? After a while, you should be able to use the output of the cluster plot tool to predict, more or less, how much you can degrade a pattern, and in what way, and still have it reproduced in pristine form at the output.