Hopfield Networks

As an in-class activity, we will replicate an experiment described in Section 9.2 (pages 226-231) of our text, training an artificial neural network to recognize a set of 10x10-pixel images of numerals.


Training:
We are given a series of N patterns, q0, q1, ..., qN-1, with each pattern qk ∈ {-1, +1}^n. For this experiment, n = 100, as we simply view each 10x10 image as a 100-dimensional vector.

An n-node network is modeled as a complete graph with real-valued weights set according to the following formula, for all 0 ≤ i < n and 0 ≤ j < n:

wij = (1/n) Σk qki qkj
where k ranges over all N patterns qk in the training set.
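As a concrete illustration, the weight computation can be sketched in a few lines of NumPy (the function name and array layout are our own choices, not those of the course software):

```python
import numpy as np

def train_weights(patterns):
    """Build a Hopfield weight matrix from +/-1 patterns.

    patterns: array of shape (N, n), entries in {-1, +1}.
    Returns the (n, n) matrix with w[i][j] = (1/n) * sum_k q[k][i]*q[k][j].
    """
    P = np.asarray(patterns)
    n = P.shape[1]
    # P.T @ P accumulates the sum of outer products q_k q_k^T over all k.
    return P.T @ P / n
```

Note that the formula includes the i = j terms; they are harmless here because the classification rule below sums only over j ≠ i.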


Classification:
To perform a classification, we are seeking a steady state in which each node of the network outputs a value xi ∈ {-1, +1}, such that

xi = -1 if Σ_{j≠i} wij xj < 0
xi = +1 if Σ_{j≠i} wij xj ≥ 0
We seek this equilibrium with an iterative algorithm. If not currently at equilibrium, we randomly select some xi that does not satisfy the above formula, and we invert it. We repeat this until reaching equilibrium, or until a predetermined maximum number of iterations is exhausted.

After reaching equilibrium, we check to see if we have reached one of the training patterns, or an unknown configuration.
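The iterative search for equilibrium can be sketched as follows, under the same conventions (names are our own; the interface of the course software may differ):

```python
import numpy as np

def classify(W, x, max_iters=10000, rng=None):
    """Flip randomly chosen violating nodes until equilibrium or a cap.

    W: (n, n) weight matrix; x: initial +/-1 state vector.
    Returns (final_state, converged_flag); the caller can then compare
    the final state against the stored training patterns.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x, dtype=int)           # work on a copy
    for _ in range(max_iters):
        h = W @ x - np.diag(W) * x       # field at each node, excluding j == i
        desired = np.where(h < 0, -1, 1)
        unstable = np.flatnonzero(desired != x)
        if unstable.size == 0:
            return x, True               # every node satisfies the rule
        x[rng.choice(unstable)] *= -1    # invert one violating node
    return x, False                      # iteration budget exhausted
```

For example, with a single stored pattern q, both q and its negation -q satisfy the steady-state condition, so either is returned unchanged.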


Experimental Setup:
We can train a network to use portions of the sample images. To be consistent with the book, we start by using pattern "1", then "2", and so on, only using "0" as the tenth pattern, if desired. We allow for the user to choose how many of those patterns to include in the training set (as we will see that it will be difficult to effectively differentiate between all 10 patterns when using only 100 pixels).

Once trained, we perform one or more tests, where each test consists of taking one of the original samples (a randomly chosen one, by default) and intentionally introducing noise by flipping each bit of that pattern independently with some probability p (e.g., p = 0.10). We then run the classification process and, when it concludes, determine whether it reached the original image for that numeral, some other numeral, or some equilibrium position that was distinct from all samples in the training set.
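The noise step is simply an independent coin flip per bit; a minimal sketch (again with our own function name):

```python
import numpy as np

def perturb(pattern, p, rng=None):
    """Flip each +/-1 entry of `pattern` independently with probability p."""
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(pattern)
    flips = rng.random(q.shape) < p      # True where a bit should be flipped
    return np.where(flips, -q, q)
```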

Our software will allow for an arbitrary number of such tests, and it reports on the overall success rate, as well as a matrix showing how often each query numeral was (mis)classified to each possible result. For example, if training on 4 samples and using 30% noise, we get the following results for 1000 trials:

Overall success rate of 0.7080

      1   2   3   4 other
 1: 205   .   .   .  50
 2:   . 163   4   .  74
 3:   .   7 145   . 115
 4:   1   .   . 195  41
We see that we had the most trouble with the numeral "3", occasionally classifying it as a "2", and many times reaching some other steady state.
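Putting the three steps together, a self-contained toy harness in the same spirit might look like the following (a sketch only: the helper name, toy patterns, and tallying layout are our own, not those of hopfield.py):

```python
import numpy as np

def run_trials(patterns, p, trials, rng):
    """Perturb a random stored pattern, relax the network, and tally
    which stored pattern (if any) each trial's equilibrium matches."""
    P = np.asarray(patterns)
    N, n = P.shape
    W = P.T @ P / n                           # training
    counts = np.zeros((N, N + 1), dtype=int)  # last column counts 'other'
    for _ in range(trials):
        k = rng.integers(N)                   # choose a query pattern
        x = np.where(rng.random(n) < p, -P[k], P[k])   # add noise
        for _ in range(10000):                # classification loop
            h = W @ x - np.diag(W) * x        # field excluding the self-term
            bad = np.flatnonzero(np.where(h < 0, -1, 1) != x)
            if bad.size == 0:
                break                         # equilibrium reached
            x[rng.choice(bad)] *= -1
        hit = np.flatnonzero((P == x).all(axis=1))
        counts[k, hit[0] if hit.size else N] += 1
    return counts
```

With two orthogonal 8-bit patterns and no noise, every trial returns to its own pattern; raising p populates the off-diagonal and "other" columns, much as in the table above.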

Unfortunately, we will see that this success falls apart when we add more patterns (and in particular this specific collection of patterns). In fact, when we add in the number "5" it turns out that the training results in a network for which even the unperturbed numbers "2", "3", and "5" are no longer steady states.


Software:
The necessary software can be found at turing:/Public/goldwasser/362/hopfield/ or downloaded as the following zip file.

You are responsible for implementing two functions, as described above (and documented within the software).

Usage: hopfield.py [options]

Options:
  -h, --help       show this help message and exit
  -a               show all test patterns and exit
  -s SEED          seed for all randomization [default: clock]

  Experiment Options:
    -n PATTERNS    number of patterns to use in training [default: 4]
    -p PROB        Probability of perturbing each bit in the test pattern [default: 0.1]
    -r REPS        Number of independent tests to perform [default: 1]
    -f NUMERAL     force numeral to choose as basis for test query [default: random]
    -m ITERATIONS  maximum number of iterations to perform per query [default: 10000]

  Display Options:
    -t STEPS       trace status every t steps (no trace if 0) [default: 0]
    -d DELAY       per step delay for trace; manual if 0 [default: 0.001]
    -v             visualize trace [default: False]
    -w WIDTH       width of window for visualization [default: 200]
    -q             no console output (other than statistics) [default: False]

Michael Goldwasser
Last modified: Thursday, 21 November 2013