Discussion Review- Reading Group 3 – Rhythm in music and predictive processing

Thanks, everyone, for a great final reading group on predictive processing and music! This week we had our usual gang of musicians and philosophers, plus a visitor from linguistics. For this session we read an experimental paper by Vuust et al. (2009) measuring neural responses to varying levels of rhythmic incongruity in expert (jazz) musicians and non-musicians, as well as Vuust & Witek’s (2014) review of predictive coding explanations of the perception of syncopation, polyrhythm, and groove. Below is a review of our discussion, and a summary of the 2009 text can be found here.

Discussing Vuust et al. (2009), we weren’t entirely comfortable with what still felt like fuzzy distinctions between rhythm, meter, beat, pulse, syncopation, etc. In particular, syncopation was conceived as a metric displacement which doesn’t interfere with the metric pulse, but which “breaks the metric expectancy by replacing a weak beat with a strong beat” (p82). This sounded contrary to our experience (in, say, jazz, but presumably any syncopated music tradition will do), where a syncopation sometimes serves to validate our perception of the meter (or pulse, or metric pulse(?)) and feels as if it enhances or confirms our metric expectancy. The interpretation of the MMN and P3am responses seemed quite in line with results from studies of responses to syntactic incongruities in language, and it fits the PC hypothesis that incongruent (unpredicted) stimuli require more neural processing to be explained away than predicted stimuli do. If expert musicians have more precise predictive models, then it also makes intuitive sense that they’d be more responsive to incongruities than non-musicians.
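
To make that last intuition a bit more concrete, here is a minimal sketch of precision-weighted prediction error, the general idea in PC that the same mismatch counts for more when the model behind the prediction is more precise. This is not the analysis from Vuust et al. (2009); the function name, the onset times, and the precision values are all made up for illustration.

```python
def weighted_prediction_error(observed: float, predicted: float, precision: float) -> float:
    """Toy precision-weighted prediction error.

    The raw error (observed - predicted) is scaled by the precision
    (confidence) of the predictive model. All values are hypothetical.
    """
    return precision * (observed - predicted)


# Hypothetical example: a note expected on beat 1.0 arrives early, at beat 0.75.
predicted_onset = 1.0
observed_onset = 0.75

# Assumed precisions: the expert's metric model is taken to be more precise.
expert_error = weighted_prediction_error(observed_onset, predicted_onset, precision=4.0)
novice_error = weighted_prediction_error(observed_onset, predicted_onset, precision=1.0)

print(f"expert: {expert_error:+.2f}, novice: {novice_error:+.2f}")
```

On this toy picture the same early note produces a larger weighted error for the listener with the more precise model, which is one way to read the stronger responses in the expert group.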

Vuust & Witek (2014) provide a conceptual review of the predictive coding (PC) model as it applies to the perception of rhythm, including syncopation, polyrhythm, and groove. Here we get a definition of rhythm as “a pattern of discrete durations…depend[ing] on the underlying perceptual mechanisms of grouping”, and of meter “as the temporal framework according to which rhythm is perceived” (p1). The equal importance of top-down (metrical framework) and bottom-up (rhythmic input) processing is a feature of both PC and Dynamic Attending Theory (DAT); however, DAT is meant to be much more flexible and dynamic than PC. In PC, meter is a generative, anticipatory model in the brain, whereas in DAT meter is an emergent property of “the reciprocal relationship between external periodicities [rhythm] and internal attending processes” (p2). Unfortunately, we didn’t talk much about DAT, but perhaps this dynamic/entrainment/enactive-y account may help soothe some of our tension with the PC take on metrical/rhythmic vocabulary. (For some more insight into entrainment and dynamical theories in music, see Clayton et al. (2004), Clayton (2012), and Phillips-Silver et al. (2010).)

Syncopation

We had some good discussion of the perception of syncopation in expert musicians and non-musicians in response to the 2009 text, so what piqued our interest next were the PC accounts of polyrhythm and groove.

Polyrhythm

The authors liken polyrhythms to the bistable percepts of binocular rivalry. The standard example of a binocular rivalry experiment involves presenting a participant with a face to one eye and a house to the other. One might expect that the percept (what the person sees) ends up being some morphed face-house object, averaging the sensory input given to each eye. However, what happens in these experiments is that the participant’s perception switches back and forth between seeing a face and seeing a house, never settling on a merged face-house percept (see Tong et al. 1998). Presumably, this is the result of a hyper-prior about the spatial and temporal relationships between houses and faces: you don’t expect them to occupy precisely the same space at the same time.

[Image: face/house binocular rivalry stimulus]


Polyrhythms, the authors argue, are similar to visual bistable percepts in that you can hear either a three-beat meter or a four-beat meter (in the case of a 3 against 4, or 4 against 3, polyrhythm), but you can’t hear both at the same time. However, it felt to some of us as though, when you listen to a polyrhythm, you actually hear the entire stimulus at once. This is similar to looking at Rubin’s vase, where you do see both the vase and the faces at once, but which is more salient depends on what you are actively attending to. The authors note that polyrhythm as a bistable percept differs importantly from binocular rivalry in that musical training can have an effect on which meter you hear or attend to. Our intuition was that, perhaps to the untrained ear, the examples in A and B below might ‘sound’ the same, and might just sound like a particular (complex) rhythm pattern rather than a superimposition of two different patterns.
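
For anyone who wants to see the ‘one complex pattern vs. two superimposed patterns’ point laid out, here is a small sketch that writes a 3 against 4 polyrhythm on a common 12-step grid. It is only an illustration, not the stimulus from the studies; the function names and the x/. notation are our own assumptions.

```python
from math import gcd


def polyrhythm_grid(a: int, b: int) -> dict:
    """Return onset grids (1 = onset, 0 = rest) for an a-against-b polyrhythm.

    The cycle is split into lcm(a, b) equal steps so both streams line up.
    Names and representation are illustrative assumptions.
    """
    steps = a * b // gcd(a, b)  # least common multiple of a and b
    stream_a = [1 if i % (steps // a) == 0 else 0 for i in range(steps)]
    stream_b = [1 if i % (steps // b) == 0 else 0 for i in range(steps)]
    # The composite is what reaches the ear: an onset wherever either stream has one.
    composite = [x | y for x, y in zip(stream_a, stream_b)]
    return {"three": stream_a, "four": stream_b, "composite": composite}


def render(grid: list) -> str:
    """Draw a grid as text: 'x' for an onset, '.' for a rest."""
    return "".join("x" if v else "." for v in grid)


grids = polyrhythm_grid(3, 4)
for name, grid in grids.items():
    print(f"{name:>9}: {render(grid)}")
```

The composite line is a single six-onset pattern, and nothing in it marks which onsets belong to the three and which to the four. That split has to come from the listener, which fits our hunch that an untrained ear might simply hear one complex rhythm.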

In the two polyrhythm studies mentioned (Vuust et al. 2006, 2011), when participants were asked to tap along to the counter-meter rather than the original meter, increased activation was found in Brodmann area (BA) 40, which is known to deal with the processing of bistable percepts, as well as in BA 47, which deals with semantic processing in language. Musicians showed less activation than non-musicians, which is consistent with the idea that their predictive model is more precise and thus they need less processing power to interpret the predicted stimuli. When we listened to the excerpt which provided the polymetric stimuli for these studies (the soprano saxophone solo in “Lazarus Heart”, by Sting – listen at 1m40s), it seemed to us that it could be interpreted as an instance of hemiola rather than polymeter. This led us to the question: is there a difference (neurally speaking) between the perception of a hemiola and the perception of a sustained polyrhythm/polymeter?

Groove, the authors say, is accounted for in PC as a sort of ‘Goldilocks’, just-right phenomenon, where “the balance between the rhythm and the meter is such that the tension is sufficient to produce prediction error, but not so complex as to cause the metric model to break down” (p9). This is a good example of action-oriented perception, where the propensity to move the body in time with ‘groove’ music serves as active inference of the causes of the sensory input (the rhythm) in comparison with the anticipatory model (meter). Witek et al. (2014) supported this notion through a web-based study in which participants rated their pleasure and how much they felt like moving after hearing a collection of differently syncopated drum breaks. The participants’ responses generated an ‘inverted U-shaped relationship’ (roughly a Gaussian curve) between degree of syncopation and ratings, which seemed to suggest a sweet spot somewhere in between too-low-complexity degrees of syncopation (where there is little incongruence between predicted and actual input) and too-high-complexity degrees of syncopation (where there is so much incongruence that ‘the predicted model breaks down’).
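
To picture the inverted U, here is a toy curve in which the rating peaks at a medium degree of syncopation and falls off on either side. We use a simple quadratic for the sketch rather than a Gaussian, and the peak, width, and rating scale are made-up numbers, not values from Witek et al. (2014).

```python
def inverted_u(syncopation: float, peak: float = 0.5, width: float = 0.5,
               max_rating: float = 1.0) -> float:
    """Toy inverted-U (quadratic) relationship between syncopation and rating.

    The rating is highest at `peak` and drops symmetrically on both sides.
    All parameter values are hypothetical, chosen only for illustration.
    """
    return max_rating - ((syncopation - peak) / width) ** 2


# Degree of syncopation on an assumed 0-to-1 scale (low, medium, high).
for degree in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"syncopation {degree:.2f} -> rating {inverted_u(degree):.2f}")
```

With these assumed numbers, the low and high ends both score lowest and the middle scores highest, which is the ‘sweet spot’ shape described above.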

So these readings conclude our introduction to a few of the ways predictive processing may shed light on our perception and cognition of music. I get the feeling that we’ve only just scratched the surface, and I certainly look forward to seeing where continuing this line of investigation will lead!




This entry was posted in Spring 2017.
