The Proposal: Subspacing Bayesian Ensembles

A place for me to keep track of my ‘idea blitz’ during the last year of going through papers.

  1. Cost/Fitness Landscape. Bayesian ensemble as a solution to flexibility in the face of change. Biological population/machine population parallel. Problem: a high-dimensional volume is difficult to sample.
  2. Dynamical Systems & Basin Entropy. Sensing & behavior as perturbations overcoming energy barriers between adjacent basins. Arrangement and size of basins in state-space lead to a measure of entropy for a particular configuration of coefficients. Response of environment is included in Q-learning. ‘Flexibility’ as a specific range of allowable entropies.
  3. Coefficient Symmetry in Basin Entropy. 1st entropy management strategy: control entropy through managing coefficient symmetries.
  4. Clustering/Metastability. 2nd management strategy: control symmetries & entropy through clustered architecture. Kauffman’s ‘rule balancing.’
  5. Tiling -> clustering/symmetry. Non-periodic tiling insights into management of “organ systems.” Function-clusters as shapes in a high-dimensional nonperiodic tiling. Marginal units of these ‘organs’ rotate into apoptotic basins, restricting the shape with just ‘local knowledge.’
  6. Conformal bootstrap solved for metastability in the phase transition between being over-specified and under-specified. Should help to define allowable rules for clustering/tiling to maintain proper metastable conditions. 6A: the correlation function serves a similar purpose for Ising model magnetic particles and neural net coefficients; instead of a correlation depending on particle distance, as with magnets, it's a correlation depending on patterns from training data. Pattern recognition as a kind of “field” that emerges from configurations. Symmetrical “conservation” learning rules to break real world symmetry to conform to patterns.
  7. Restrict Bayesian search for changes to coefficients to metastability condition only, using Conformal Bootstrap cluster/tiles, mitigating the problem from #1. A form of subspacing. May generate nonlinear plasticity effects that oppose Hebbian learning – Hebbian “drops temperature,” breaks symmetry to learn pattern; neural plasticity.
  8. Algorithm proposal: the number of nearest neighbors in an Ising lattice is 2*d. In 1d the lattice is a line and the nearest neighbors are left and right = 2; in 2d it is a plane and the nearest neighbors are top, bottom, left, right = 4; etc. For the same number of nodes, a higher-dimensional model is more cross-connected = more chaos. What if training is a tension constantly pulling regions of our net down into lower and lower dimensions, filtering out bad data, breaking symmetries to learn patterns = more order? Can cite a paper on zero-valued eigenvalues correlating with training completeness. Then when we get things wrong, we fold up space around the major contributors to the wrongness and restore new symmetries to break (more chaos).
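
The 2*d neighbor count from item 8 can be checked with a quick sketch. It assumes a hypercubic lattice with periodic boundaries, which is the usual setting for that rule; the function names and the side length of 4 are just illustrative choices of mine.

```python
def nearest_neighbors(site, side):
    """Nearest neighbors of `site` on a periodic hypercubic lattice.

    `site` is a tuple of d integer coordinates; `side` is the number of
    sites along each axis (assumed >= 3 so the two neighbors on an axis
    are distinct).
    """
    neighbors = set()
    for axis in range(len(site)):
        for step in (-1, 1):
            shifted = list(site)
            shifted[axis] = (shifted[axis] + step) % side  # wrap around
            neighbors.add(tuple(shifted))
    return neighbors


def coordination_number(d, side=4):
    """Count the nearest neighbors of the origin in d dimensions."""
    origin = (0,) * d
    return len(nearest_neighbors(origin, side))


# Neighbor count for a line, a plane, a cube, and a 4d hypercube.
counts = {d: coordination_number(d) for d in (1, 2, 3, 4)}
print(counts)
```

Each added dimension contributes two more neighbors per node, which is the sense in which a higher-dimensional model with the same number of nodes is more cross-connected.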

To summarize: if we want to keep track of a population of machines, we can do it using a Bayesian ensemble that stores probability values for coefficients instead of a single coefficient value. That raises the specter of high dimensionality; if we allow 30 million coefficients to all vary at once, it will take forever to make enough machines to sample the space. All human variability is captured by a tiny percentage of the total DNA which is allowed to vary – we are all 99.9% similar. We can ask a similar question of our machine population – what tiny percentage of coefficients do we allow to vary? I suggest that we only allow our coefficients to change in ways that mostly preserve the relative distribution of attractor basins, to ensure metastability. One method of attacking this problem may be the conformal bootstrap borrowed from physics, which is used to search for “fascinating universal physics of scale invariant critical points.” The metastable attractor dynamics we are looking for may be on the edge of just this sort of phase transition. A second, related method is clustering – recent research suggests that clustering may actually aid in the transition between attractor basins.
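
As a rough sketch of the ensemble idea, here is a toy version in which each coefficient is stored as a mean and a standard deviation, and only a small fraction of coefficients is allowed to vary. Everything here is an assumption for illustration: the names, the Gaussian form, the 0.1 spread, and above all the random choice of which coefficients are free. The proposal is to pick those by a metastability criterion, which this sketch does not attempt.

```python
import random


def make_ensemble(n_coeffs, free_fraction, seed=0):
    """Build per-coefficient (mean, std) pairs with a small free subset.

    The free subset is chosen at random here purely as a placeholder for
    whatever metastability-preserving rule would pick it in practice.
    """
    rng = random.Random(seed)
    n_free = max(1, round(n_coeffs * free_fraction))
    free = set(rng.sample(range(n_coeffs), n_free))
    means = [rng.gauss(0.0, 1.0) for _ in range(n_coeffs)]
    # Frozen coefficients get zero spread; free ones get a hypothetical 0.1.
    stds = [0.1 if i in free else 0.0 for i in range(n_coeffs)]
    return means, stds, free, rng


def sample_machine(means, stds, rng):
    """Draw one machine (one coefficient vector) from the ensemble."""
    return [m + s * rng.gauss(0.0, 1.0) if s > 0 else m
            for m, s in zip(means, stds)]


# 1000 coefficients with 0.1% free to vary, echoing the DNA analogy.
means, stds, free, rng = make_ensemble(n_coeffs=1000, free_fraction=0.001)
population = [sample_machine(means, stds, rng) for _ in range(5)]
print(len(free), "free coefficient(s) out of", len(means))
```

Every machine in `population` agrees with the ensemble mean on the frozen coefficients, so the search only has to sample the free subspace instead of the full coefficient volume.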