Reading Group 1 – Introduction to music and predictive processing

Thanks to both Nikki and Lauren for the warm welcome and for helping set up these sessions!

The first EMPRes session of 2017 provided an introduction to predictive processing in music perception and cognition. We began with Rohrmeier & Koelsch’s (2012) detailed review of existing work on predictive information processing in music cognition, which draws together theoretical, behavioral, computational, and brain-imaging approaches. We then looked at a commentary by Michael and Wolf (2014) on the potential impact on music research of a specific framework of predictive processing, namely Hierarchical Predictive Processing (HPP) as put forward by Schaefer. These papers were a bit denser than an ‘introduction’ meeting was intended to be, so I’ll lay out a summary of them here, attempting to explain some of the computational bits as well. Please feel free to comment if you have any questions, or especially if you have any answers or better explanations! A review of our discussion points will be in this subsequent blog post.

Rohrmeier & Koelsch laid out the predictable qualities of music, how our brains may be utilizing those qualities (e.g. through perceptual Gestalts, structural knowledge), and behavioral evidence of prediction, followed by various computational models and neural evidence for predictive processes.

Predictable information within the music

  1. Predictability and combinatoriality require a discrete, finite set of elements
  2. Prediction in music occurs on both lower-level processes (predicting the next note) and higher-level processes (predicting a development section in a sonata)
  3. Four sources of prediction, which may work together or be in ‘mutual conflict’:
    • Acquired style-specific syntactic/schematic knowledge
    • Sensory & low level predictions (Gestalts; ‘data-driven’ perception)
    • Veridical Knowledge (from prior knowledge of/exposure to the piece)
    • Non-sensory structures acquired through online learning (knowledge gained from current listening, e.g. motifs, statistical structures, probability profiles)
  4. Prediction in music is messy: constant parallel predictions are made with respect not only to single melodic lines, but to complex harmonies, overall key structure, polyphonic and polyrhythmic sound streams, and to phrase-, movement-, or whole-composition levels. It becomes even messier when adding in texture/timbre changes, or when considering the more polyrhythmic, polymetric, or complexly polyphonic character of non-Western musics

Behavioral Findings

  1. Prediction effects are found in behavioral responses to (identification of) unexpected musical events, in the case of tones, intervals, and chords
  2. Musical priming studies (adapted from semantic priming studies in language) give evidence of implicit local knowledge, and perhaps even higher-level tonal key and temporal aspects

Computational Models- not as scary as they sound!

Why do we like them? “Predictive computational models provide a link between theoretical, behavioural, and neural accounts of music prediction”

  1. Hand-crafted models, such as Piston’s table of usual root progressions, show the general harmonic (root) progression expectancies based on tendencies in Western music. Your theory courses teach you to explicitly recognize these tendencies, which are implicitly learned and recognized by persons enculturated in Western music.

  2. Probabilistic models

N-gram models chop long segments up into shorter bits, and analyze those bits for statistical probabilities to predict the likelihood of the next unit.

  • Example of a 3-gram model of a sequence of pitches {A C E G C E A C E G}: the sequence is chopped into overlapping bits of three pitches, and the number of times each bit occurs is counted [ACE: 2 (occurs two times); CEG: 2; EGC: 1; GCE: 1; CEA: 1; EAC: 1]
  • We can use this model to predict that the notes ‘CE’ will occur after A 2/3 of the time, or after G 1/3 of the time (for this example, it’s easy to just count every instance of [_CE], 3 instances, and see 2 of those are A+CE, and 1 is G+CE)
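
The counting and prediction above can be sketched in a few lines of Python (the pitch names and the 3-gram size are just those of the example):

```python
from collections import Counter

def ngram_counts(sequence, n):
    """Count every overlapping n-length chunk of the sequence."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))

def predict_next(sequence, n, context):
    """Estimate P(next unit | context) from the n-gram counts,
    where context is the preceding n-1 units."""
    counts = ngram_counts(sequence, n)
    matching = {gram[-1]: c for gram, c in counts.items() if gram[:-1] == tuple(context)}
    total = sum(matching.values())
    return {unit: c / total for unit, c in matching.items()}

pitches = list("ACEGCEACEG")
print(ngram_counts(pitches, 3))        # ACE and CEG occur twice, the rest once
print(predict_next(pitches, 3, "CE"))  # after C E: G 2/3 of the time, A 1/3
```

The same counts also run in the other direction, as in the post’s [_CE] example: looking at what precedes a chunk rather than what follows it.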

Multiple Viewpoint Idea

  • Using information from multiple different features (viewpoints) to aid the prediction of a target feature
  • In music, using “duration or metrical position to improve the prediction of pitch class”. For example, an 8th-note anacrusis at the start of a piece will likely be 5 leading to 1 (Sol, Do). This particular prediction, though, requires prior exposure to a larger corpus of music, since it would be improbable to infer statistical regularities of the current piece from only two notes
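
A minimal sketch of the multiple-viewpoint idea (the toy corpus of pitches and metrical labels below is entirely invented for illustration): conditioning next-pitch prediction on metrical position as well as on the previous pitch can sharpen the distribution.

```python
from collections import Counter, defaultdict

# Toy corpus of (pitch, metrical_position) events. Made-up data: weak-beat
# G always resolves to C here, echoing the Sol-to-Do anacrusis example.
corpus = [("G", "weak"), ("C", "strong"), ("G", "strong"), ("E", "weak"),
          ("G", "weak"), ("C", "strong"), ("G", "strong"), ("A", "weak")]

def next_pitch_dist(corpus, use_position):
    """Distribution of the next pitch, conditioned either on the previous
    pitch alone or on the (pitch, metrical position) pair."""
    counts = defaultdict(Counter)
    for (p1, pos), (p2, _) in zip(corpus, corpus[1:]):
        context = (p1, pos) if use_position else p1
        counts[context][p2] += 1
    return {ctx: {p: c / sum(ctr.values()) for p, c in ctr.items()}
            for ctx, ctr in counts.items()}

print(next_pitch_dist(corpus, use_position=False)["G"])           # C only half the time
print(next_pitch_dist(corpus, use_position=True)[("G", "weak")])  # always C
```

Pitch alone leaves the prediction spread over three continuations; adding the metrical viewpoint makes the weak-beat case certain (in this toy data).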

Short-term models and Long-term models

  • Short term: knowledge from the current listening; “specific repetitive and frequent patterns that are particular to the current piece and picked up during implicit online-learning”
  • Long term: knowledge from an entire corpus; “long-term acquired (melodic) patterns”

IDyOM- Information Dynamics of Music
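
IDyOM combines the predictions of a long-term model (trained on a corpus) and a short-term model (trained online on the current piece). A toy sketch of such a combination, with made-up distributions and a simple entropy-based weighting heuristic (IDyOM’s actual weighting scheme is more sophisticated):

```python
import math

def entropy(dist):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def combine(ltm, stm):
    """Mix two next-event distributions, trusting the lower-entropy
    (i.e. more confident) model more. This weighting is a heuristic,
    not IDyOM's exact scheme."""
    w_ltm = 1.0 / (1.0 + entropy(ltm))
    w_stm = 1.0 / (1.0 + entropy(stm))
    events = set(ltm) | set(stm)
    return {e: (w_ltm * ltm.get(e, 0.0) + w_stm * stm.get(e, 0.0)) / (w_ltm + w_stm)
            for e in events}

ltm = {"C": 0.7, "G": 0.2, "A": 0.1}  # corpus-wide expectations (made-up numbers)
stm = {"C": 0.5, "G": 0.5}            # current-piece expectations (made-up numbers)
print(combine(ltm, stm))              # a compromise between the two models
```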

Hidden Markov Models

  • A Markov transition matrix is the same as a 2-gram model. So our earlier set of pitches {A C E G C E A C E G} would be split into [AC: 2; CE: 3; EG: 2; GC: 1; EA: 1], and the probabilities are modeled between single events.
  • A Hidden Markov Model (HMM) does not generate probabilities from the observed events directly; instead, the observations are generated from hidden underlying states, each with its own probability distribution. The probability of each subsequent state depends only on the current state (not on future states), reflecting the temporality of musical processing
  •  An introduction to HMM
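
The first-order (2-gram) Markov model from the bullets above can be written out directly, estimating P(next pitch | current pitch) from the bigram counts:

```python
from collections import Counter, defaultdict

def transition_probabilities(sequence):
    """First-order Markov model: estimate P(next | current) from bigram counts."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {state: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
            for state, ctr in counts.items()}

pitches = list("ACEGCEACEG")
probs = transition_probabilities(pitches)
print(probs["E"])  # after E: G 2/3 of the time, A 1/3
print(probs["A"])  # after A: always C
```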

Dynamic Bayesian Networks (DBN)

  • A DBN is an extension of an HMM in the same way that the multiple viewpoint model was an extension of n-gram models. DBNs analyze “dependencies and redundancies between different musical structures, such as chord, duration, or mode” to generate predictions

Connectionist Networks- Neural networks are designed to represent how biological neurons work, combining probabilistic models like those listed above with models of neural connections, firing, and growth dynamics

  • MUSCAT- an early musical neural net pre-programmed with Western features (12 chromatic pitches, 24 major and minor chords, 24 keys)- does very well at predicting features of tone perception and prediction
  • Self-Organizing Map- unsupervised learning of features of tonal music (different from MUSCAT in that it was not pre-programmed with any training data)- matched some experimental data for predicting chord relations, key relations, and tone relations
  • Simple Recurrent Networks- unsupervised learning of transition probabilities and structures—not as efficient as n-gram models
  • Dynamic cognitively motivated oscillator models- unsupervised learning to adapt to metrical structure—however, adaptation to tempo changes is slow
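
A deliberately crude illustration of that last point, assuming a single adaptive “oscillator” that nudges its period estimate by a fraction of each timing error (real oscillator models use coupled nonlinear dynamics; all numbers below are invented):

```python
def entrain(onsets, period, period_gain=0.1):
    """Toy beat tracker: after each onset, nudge the period estimate
    toward the observed timing. The small gain makes adaptation to a
    tempo change slow, as noted for oscillator models."""
    expected = onsets[0] + period
    for onset in onsets[1:]:
        error = onset - expected       # how early or late the beat landed
        period += period_gain * error  # adapt the tempo estimate slowly
        expected = onset + period      # predict the next beat
    return period

# Onsets every 0.5 s, then the tempo speeds up to 0.4 s intervals.
onsets = [i * 0.5 for i in range(8)] + [3.5 + i * 0.4 for i in range(1, 8)]
print(entrain(onsets, period=0.6))  # drifts toward 0.4 s, but hasn't arrived
```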

Neuroscientific Evidence

Increased brain response to incongruent (unexpected) stimuli within a sequence; in music this may be hearing a normal chord progression followed by an unusually placed chord

  • ERAN- early right anterior negativity
  • MMN- mismatch negativity

It’s not clear from the neuroscientific evidence whether these responses result from local or from hierarchical violations

Brain areas involved

  • Ventral premotor cortex, BA44 (Brodmann’s area 44, the right-hemisphere analogue to Broca’s area for language in the left hemisphere; perhaps both do hierarchical processing)

Michael & Wolf laid out perhaps a more accessible overview of areas where a particular predictive processing framework, hierarchical predictive processing (HPP), might lend a novel contribution to the study of music cognition and human social cognition more generally.

HPP is a predictive framework which describes the brain as a hierarchy of lower-level and higher-level models. Each higher-level model generates and sends predictions downstream to the model immediately below it, while each lower-level model sends sensory input upstream to the model immediately above it. The goal is to minimize prediction error between the higher-level predictions and the lower-level sensory representations. Every time a higher-level prediction meets a lower-level sensory input that *does not match* the prediction, a prediction error is sent upward. The higher-level model then takes that prediction error and revises its prediction, repeating until the incoming signal and the downward prediction are sufficiently matched. Higher-level models are thought to represent changes occurring over longer time scales, such as more abstract, structural, schematic, or style-specific aspects of music. Lower-level models represent change in sensory input over shorter time scales, as in the immediate local events of the next note or rhythm.
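
The error-minimization loop for a single level of the hierarchy can be caricatured in a few lines (the numbers and the update rule are invented for illustration; real HPP accounts involve precision-weighted message passing across many levels):

```python
def settle(prediction, sensory_input, gain=0.3, tolerance=0.01):
    """One level of a predictive hierarchy: the higher-level model revises
    its prediction by a fraction of the prediction error, repeating until
    prediction and input are sufficiently matched."""
    steps = 0
    while abs(sensory_input - prediction) > tolerance:
        error = sensory_input - prediction  # mismatch signal sent upward
        prediction += gain * error          # higher level revises its prediction
        steps += 1
    return prediction, steps

# e.g. the model expected an A (440 Hz) but hears an unexpected Bb (466 Hz)
final, steps = settle(prediction=440.0, sensory_input=466.0)
print(final, steps)
```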

Musical Preferences

  • HPP seems of little use in understanding musical preferences. It can’t be assumed that the preferred balance between ‘optimally predictable’ and ‘a bit of uncertainty’ is the same across individuals. The authors dub the search for this ‘sweet spot of predictability’ the “Goldilocks fallacy”, since even the right amount of predictability in a novice trumpet player’s crude sounds is likely still unpleasant.

Embodying Music

  • HPP might help in furthering our understanding of embodied music cognition by providing a clear link between perception and action, where perception simply is reflected by “a graded hierarchy of models functioning according to the same basic principles” separated only by time scales, and action is “in a sense equivalent to updating of higher-level cognitive models… through active inference”

Joint Action

  • In joint music making, agents engage in recursive higher-order modeling: “agents are not only modeling the music, but they are modeling the other agent’s actions and intentions, as well as the other agent’s model of her actions and intentions”. If joint music making is construed by the brain as a coordination problem, then HPP may be the perfect model to step in and try to minimize prediction (coordination) error in these complex, recursive social interactions.
This entry was posted in Spring 2017.
