Beat perception in polyrhythms: Time is structured in binary units

Cecilie Møller; Jan Stupacher; Alexandre Celma-Miralles; Peter Vuust

doi:10.1371/journal.pone.0252174

Abstract

In everyday life, we group and subdivide time to understand the sensory environment surrounding us. Organizing time in units, such as diurnal rhythms, phrases, and beat patterns, is fundamental to behavior, speech, and music. When listening to music, our perceptual system extracts and nests rhythmic regularities to create a hierarchical metrical structure that enables us to predict the timing of the next events. Foot tapping and head bobbing to musical rhythms are observable evidence of this process. In the special case of polyrhythms, at least two metrical structures compete to become the reference for these temporal regularities, rendering several possible beats with which we can synchronize our movements. While there is general agreement that tempo, pitch, and loudness influence beat perception in polyrhythms, we focused on the yet neglected influence of beat subdivisions, i.e., the least common denominator of a polyrhythm ratio. In three online experiments, 300 participants listened to a range of polyrhythms and tapped their index fingers in time with the perceived beat. The polyrhythms consisted of two simultaneously presented isochronous pulse trains with different ratios (2:3, 2:5, 3:4, 3:5, 4:5, 5:6) and different tempi. For ratios 2:3 and 3:4, we additionally manipulated the pitch of the pulse trains. Results showed a highly robust influence of subdivision grouping on beat perception. This was manifested as a propensity towards beats that are subdivided into two or four equally spaced units, as opposed to beats with three or more complex groupings of subdivisions. Additionally, lower pitched pulse trains were more often perceived as the beat. Our findings suggest that subdivisions, not beats, are the basic unit of beat perception, and that the principle underlying the binary grouping of subdivisions reflects a propensity towards simplicity. This preference for simple grouping is widely applicable to human perception and cognition of time.

Citation: Møller C, Stupacher J, Celma-Miralles A, Vuust P (2021) Beat perception in polyrhythms: Time is structured in binary units. PLoS ONE 16(8): e0252174. https://doi.org/10.1371/journal.pone.0252174

Editor: Jessica Adrienne Grahn, University of Western Ontario, CANADA

Received: May 8, 2021; Accepted: August 1, 2021; Published: August 20, 2021

Copyright: © 2021 Møller et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data is available at https://researchbox.org/278.

Funding: Center for Music in the Brain is funded by the Danish National Research Foundation (DNRF117). Jan Stupacher is supported by an Erwin Schrödinger fellowship from the Austrian Science Fund (FWF J4288). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

In speech, music, and natural environments, we automatically group, subdivide, and structure sound sequences evolving in time. The function of such hierarchical structures is to scaffold and anticipate upcoming auditory events and to facilitate detection of unexpected events through a process that has been termed “predictive timing” [1, 2]. This perceptual grouping of temporal events is a cognitive mechanism essential for reducing complexity and making sense of the vibrant sensory environment surrounding us.

In search of a universal principle that can explain the ubiquity of rhythms in nature, physiology, attention, speech, poetry, and music, Bolton [3] performed one of the earliest investigations of human rhythm processing. He showed that when listening to unaccented, equally spaced events, such as the isochronous ticks of a clock, listeners tend to subjectively accentuate every fourth or second tick and only rarely accentuate every third tick. Subsequent studies used various paradigms to explore this “tick-tock effect” [4, 5] and its neurophysiological correlates [6, 7]. A general preference for binary or quaternary over ternary grouping is also evident in other tasks involving rhythm perception and production, such as in music [8–12].

The spontaneous clapping, tapping, swaying, and nodding in time with music is a universal human behavior. It provides evidence of our ability to extract and perceive a regular pulse and its underlying hierarchically organized metrical structure. This capacity for beat perception is a fundamental human cognitive skill [13, 14] and present from infancy [15]. Even when listening to complex musical rhythmic structures, which do not accent the beat itself, most people can extract a regular pulse and synchronize their movements to it, indicating that beat perception is a constructive and endogenous process [16].

The regular pulse we emphasize when synchronizing with music represents only one level in a more complex metrical structure. Fig 1 explains the concept of a metrical structure and illustrates its perceptual and behavioral consequences. Fig 1A illustrates how the subdivisions mark the points of the metrical grid, which is established on the basis of the smallest interval between perceptible events of the stimulus. The beat level is the level with which we usually synchronize our body movements. The cycle level marks the onset of the whole repeating pattern. Although the beat level is often the most perceptually salient level, we exhibit high flexibility with regard to synchronizing with any level in the metrical structure [17]. What is considered moving “in time with music” can relate to any level of the metrical structure. Which level of the metrical structure we synchronize our movements with may depend on a number of factors, including stimulus rate, dynamically changing rhythmic accents, and spontaneous motor tempo [18, 19].

Fig 1. Examples of metrical structures in rhythms.

A) Left and right panels show the same 2:3 polyrhythm with two different underlying metrical structures, corresponding to the two-beat pulse train with ternary grouped subdivisions (left) and the three-beat pulse train with binary grouped subdivisions (right). B) Three different examples of interpretations of the 2:3 polyrhythm that lead to three different behavioral outcomes when synchronizing body movements—here finger tapping—to the stimulus. The subjective experience of the rhythm’s ‘feeling’ depends on the perceived beat, which in turn depends on the grouping of subdivisions (see Fig I in S1 File, for more information). C) Stressing the bold syllables of the speech examples induces ternary grouped two-beat (left) or binary grouped three-beat (right) interpretations of the 2:3 polyrhythms.

https://doi.org/10.1371/journal.pone.0252174.g001

In the special case of polyrhythms, two or more metrical structures co-exist and as such, polyrhythms are often used to create tension and increase expressiveness in musical performances. A polyrhythm is created by presenting at least two pulse trains containing coprime numbers of beats within the same periodic cycle, e.g., in ratios of 2:3, 3:4, or 3:5. A listener may perceive one or the other pulse train as representing the underlying beat and extract the corresponding metrical structure. The example depicted in Fig 1 is a 2:3 polyrhythm. Note that the two manifestations of metrical structure (Fig 1A) are identical at the cycle level as well as at the subdivision level, which is defined as the least common denominator of the polyrhythm’s ratio (i.e., 6 in the case of the 2:3 polyrhythm). What distinguishes the two metrical structures is the beat level, which is defined by the grouping of elements at the subdivision level. On the left of Fig 1A, the ternary subdivision grouping results in two perceived beats per cycle. On the right, the binary subdivision grouping results in three perceived beats per cycle. The resulting metrical structures are organized differently and give rise to distinct and mutually exclusive perceptual experiences depending on how the elements at the subdivision level are grouped (Fig 1B). Two of these perceptual experiences are illustrated with speech examples in Fig 1C.
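The shared subdivision grid described above can be sketched in a few lines of Python. This is an illustration only, not part of the study's materials: the function name `polyrhythm_grid` is hypothetical, and the number of subdivisions per cycle (what the text calls the least common denominator) is computed here as the least common multiple of the two beat counts.

```python
from math import gcd


def polyrhythm_grid(slow: int, fast: int):
    """Return (subdivisions per cycle, slow onsets, fast onsets).

    Onsets are given as indices on the common subdivision grid.
    Hypothetical helper for illustration; not taken from the study.
    """
    n_sub = slow * fast // gcd(slow, fast)  # least common multiple of the beat counts
    slow_onsets = list(range(0, n_sub, n_sub // slow))
    fast_onsets = list(range(0, n_sub, n_sub // fast))
    return n_sub, slow_onsets, fast_onsets


# 2:3 polyrhythm: six subdivisions per cycle
print(polyrhythm_grid(2, 3))  # (6, [0, 3], [0, 2, 4])
```

In the 2:3 case, the two-beat pulse train falls on every third subdivision (ternary grouping) and the three-beat pulse train on every second subdivision (binary grouping). Both pulse trains coincide only at index 0, the start of the cycle.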

Previous polyrhythm studies have primarily focused on the constituent pulse trains [20–27], neglecting the polyrhythms’ metrical structures and the subdivisions underlying each of the pulse trains. Most of the research has aimed at assessing whether polyrhythms are perceived as integrated or segregated streams [28], while some studies have made efforts to describe the factors that influence beat perception in polyrhythms. Tempo is consistently reported to strongly influence whether the faster or the slower pulse train represents the beat, which is also affected by the density, pitch, accentuation of elements and the relative timing between them [29–33]. Importantly, these studies focused on characteristics of the individual pulse trains, taking no notice of the characteristics of the two competing metrical structures that emerge when the pulse trains are superimposed on each other. In order to elucidate how we organize temporal auditory patterns, it is necessary to explicitly consider the hierarchical relationships between metrical levels [34]. Beat perception studies that do consider metrical structures tend to focus on the beat and meter levels, e.g., by assessing sensitivity to various manipulations of events at strong and weak beat positions [9, 10, 35]. Yet, beat perception entails perception of subdivisions [34]. Unfolding the empirically established temporal relation between beats and subdivisions, London [34] pointed to the fact that the shortest inter-onset-interval (IOI) that can be perceived as representing beats is approximately 200 ms. In comparison, the shortest IOI necessary for subjective rhythmization, such as the “tick-tock” effect [3], and for interval discrimination [36] is approximately 100 ms, i.e., corresponding to subdividing the beat by a factor of two. As such, we only perceive a regular beat if the cognitive constraints on temporal perception allow us to perceive the subdivisions of that beat, at least potentially. In a tapping study investigating the benefits and costs of explicitly subdividing the beat, Repp [11] found similar thresholds for motor synchronization rates. Because participants were required to tap only to every second, third, or fourth element of pulse trains presented at different rates, Repp’s study also ruled out the possibility that motor constraints influenced participants’ ability to make judgements of the quantity of the faster subdivisions. Assuming that successful grouping of subdivisions is necessary for beat perception to occur [34], we have to move our research focus from the beat level to the subdivision level of the metrical structure, especially when assessing beat perception and sensorimotor synchronization in polyrhythms.

To provide a comprehensive account of beat perception that takes into account the most basic level of the metrical hierarchy, the purpose of this online finger tapping study was to assess how subdivision grouping biases listeners to adopt one rather than another possible metrical structure inherent in a given polyrhythm consisting of two pulse trains. Owing to their ambiguous nature, polyrhythms are ideal stimuli for assessing rhythmic interpretations in tapping studies. Such studies rest on the assumption that subjects synchronize their taps with the perceived beat [19, 20]. The paradigm in the present study allowed for categorization of tapping responses into all possible metrical levels, including half and double tempo in relation to the constituent pulse trains (see Fig I in S1 File). Overall, we hypothesized that participants would prefer to tap to a beat with binary rather than ternary or irregular subdivision grouping, and ternary rather than irregular subdivision grouping. This hypothesis was based on the general propensity for binary grouping of isochronous auditory stimuli and their subdivisions [e.g., 3, 7, 37].

Participants were recruited worldwide via social media. In three separate online experiments we manipulated tempo (N = 100), ratio (N = 120), and pitch (N = 80) of pulse trains in various polyrhythms. Tempo was manipulated to assess transition points of tapping preference within and between metrical structures. Ratio was manipulated to investigate different types of subdivision grouping (binary, ternary, irregular) in the slow and the fast pulse trains at different tempi. Pitch manipulations allowed assessing the effect of low-frequency notes on beat perception relative to the effect of subdivision grouping. We report the results of preregistered main analyses (https://aspredicted.org/yi5si.pdf).

Method

Participants

The study included data from 300 participants (159 female, age range 18–75 years, M = 31 years, IQR = 10.5 years). Additional incomplete or duplicate responses were excluded. The majority of respondents grew up in Denmark (32.7%) followed by Spain (12.3%), UK (7.3%), Germany (5.3%), and the US (4.7%). The remaining 37.7% of participants grew up in forty-four different countries. Musicianship was assessed with one item from Ollen’s Musical Sophistication Index [38]. Eleven percent considered themselves nonmusicians, 29% music-loving nonmusicians, 24% amateur musicians, 18% serious amateur musicians, 11% semi-professional musicians, and 8% professional musicians. Participants were randomly assigned to complete either the Tempo (N = 100), Ratio (N = 120) or Pitch (N = 80) Experiment (Fig 2). Sample sizes were determined a priori based on pilot studies and preregistered. Participants were informed that their data would be used for scientific purposes. They were not offered any kind of payment for their participation. The study was conducted in accordance with the guidelines from the Declaration of Helsinki and the Danish Code of Conduct for Research Integrity and Aarhus University’s Policy for research integrity, freedom of research and responsible conduct of research. In Denmark, research that neither collects nor stores personally identifiable or sensitive information is exempt from IRB approval, which we confirmed in correspondence with the local IRB. None of the collected data included personally identifiable information.

Fig 2. Experimental overview and main research questions for the three individual experiments.

https://doi.org/10.1371/journal.pone.0252174.g002

Procedure

Participants were recruited through social media and directed to a webpage containing a Shiny app, which was developed using the JavaScript library jsPsych [39] embedded into psychTestR [40]. Headphones and touch screens were recommended, though the experiment could also be run on computers using internal speakers. After initial assessment and testing of their devices, participants were randomly assigned to the three different experiments, Tempo, Ratio, or Pitch. For the Pitch dataset, we only included participants wearing headphones to ensure a proper representation of low-frequency tones. After a spontaneous motor tempo assessment, which familiarized participants with tapping on their device (touchscreen, touch pad, or mouse), the experimental tapping task was explained. Participants’ task was to listen to the polyrhythm and to start tapping with the index finger of their dominant hand when they could clearly “feel” the beat. They continued tapping until the sound stopped. Stimuli were presented once and in random order. After each trial, a 100-point slider allowed participants to rate how difficult it was to find the beat. Finally, participants filled out a short questionnaire assessing their musical and linguistic background. Questionnaire data, spontaneous motor tempo, and difficulty ratings were not analyzed in the present work, which focuses specifically on reporting the tapping data as specified in the preregistration (https://aspredicted.org/yi5si.pdf).

Finger tapping analyses

We obtained a high proportion of tapping responses. Only 1.3% of the tapping trials were missing in Tempo, 2.0% in Ratio, and 1.0% in Pitch. To remove involuntary double taps and device artefacts, we calculated all the time intervals between consecutive taps (Inter Tapping Intervals, ITIs) and removed the second tap of each ITI shorter than 150 ms (see Fig II(A) in S1 File). We additionally removed the first two taps of each trial. If fewer than five taps remained, we removed the trial from the analysis. The resulting percentages of excluded trials in this step were 3.7% in Tempo, 4.6% in Ratio, and 3.0% in Pitch.
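The cleaning steps in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `clean_taps` and its parameters are hypothetical, ITIs are assumed to be computed once on the raw tap sequence, and the first two taps are assumed to be dropped after the short-ITI filter.

```python
def clean_taps(tap_times_ms, min_iti_ms=150.0, n_initial_drop=2, min_taps=5):
    """Return the cleaned tap times, or None if the trial would be excluded.

    Assumptions (not confirmed by the text): ITIs are taken between
    consecutive raw taps, and the first two taps are removed afterwards.
    """
    taps = sorted(tap_times_ms)
    # Drop the second tap of every inter-tap interval shorter than min_iti_ms.
    kept = [t for i, t in enumerate(taps)
            if i == 0 or t - taps[i - 1] >= min_iti_ms]
    # Drop the first taps of the trial.
    kept = kept[n_initial_drop:]
    # Exclude the trial if fewer than min_taps taps remain.
    return kept if len(kept) >= min_taps else None


taps = [0, 500, 1000, 1050, 1500, 2000, 2500, 3000]
print(clean_taps(taps))  # [1000, 1500, 2000, 2500, 3000]
```

In the example, the tap at 1050 ms follows its predecessor by only 50 ms and is discarded as a double tap. The remaining sequence loses its first two taps and just meets the five-tap minimum.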

The timing of the taps was converted into angular measures, i.e., phase in radians. This means that the taps are measured as a circular angle in relation to the timing of a periodic reference event at angle 0. The periodic references were defined as the different metrical levels in each polyrhythm: cycle, slow pulse train, slow pulse train—double tempo, slow pulse train—half tempo, fast pulse train, fast pulse train—double tempo, fast pulse train—half tempo, and the common subdivision level (see Fig II(B) in S1 File). This method allowed us to obtain the consistency of the taps at each metrical level, even if the device or the participant missed some taps. Because every operating system, browser and tapping device has different delays, we did not analyze stimulus-tapping phase but used ITIs and circular statistics.
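As a rough illustration of this conversion, the sketch below maps tap times onto phases relative to a periodic reference and computes the length of the mean resultant vector as a consistency measure. The function names are hypothetical, and the reference is assumed to start at time 0. A constant device delay rotates all phases by the same angle and leaves the vector length unchanged.

```python
import cmath
import math


def tap_phases(tap_times_ms, period_ms):
    """Phase (radians, 0 to 2*pi) of each tap relative to a periodic reference."""
    return [2 * math.pi * ((t % period_ms) / period_ms) for t in tap_times_ms]


def mean_vector_length(phases):
    """Length of the mean resultant vector: 0 (no consistency) to 1 (perfect)."""
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(resultant)


taps = [1000, 1500, 2000, 2500]
r_match = mean_vector_length(tap_phases(taps, 500))      # close to 1
r_mismatch = mean_vector_length(tap_phases(taps, 1000))  # close to 0
r_delayed = mean_vector_length(tap_phases([t + 80 for t in taps], 500))  # still close to 1
```

Taps every 500 ms are perfectly consistent with a 500 ms reference but alternate between opposite phases of a 1000 ms reference, so the two vectors cancel.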

In the circular statistics analyses [41], we computed the mean resultant vector of the tapping responses to each stimulus for every metrical level. The tapping consistency is reflected by the length of the mean vector, ranging from 0 to 1. To determine the metrical level tapped by the participant in each trial, we took the longest vector length among all the metrical levels and checked whether the taps were uniformly distributed using Rao’s spacing test and the Rayleigh test. Tapping responses were assigned to a metrical category only when both tests were significant (p ≤ .05). This procedure filtered out non-regular tapping responses, including irregular grouping of subdivisions and synchronization with the rhythm itself. The percentages of trials not assigned to any metrical category (i.e., non-significant in at least one of the two circular tests) were 12.3% in Tempo, 24.7% in Ratio, and 5.1% in Pitch. The larger proportion of uncategorized responses in the Ratio Experiment reflects the increased complexity of the stimuli presented here. Finally, we confirmed the selection of the categorized metrical level by only accepting those metrical tapping responses (i) whose mean of the ITIs fell in the range of ±15% of the inter-onset interval (IOI), i.e., the tempo in milliseconds, of the categorized meter and (ii) whose standard deviation of the ITIs was smaller than 66% of the IOI. These means and standard deviations were obtained after removing ITIs that fell beyond two standard deviations from the mean ITI of each trial. The percentages of trials rejected in this step were 16.6% in Tempo, 14.7% in Ratio, and 14.7% in Pitch. The combination of linear and circular analyses resulted in the final inclusion of the following percentages of trials with consistent tapping at one of the metrical levels: 66.2% in Tempo, 54.1% in Ratio, and 76.2% in Pitch. See Fig III in S1 File, for visualization of trials excluded in each of the data cleaning steps.
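The final, linear confirmation step (criteria i and ii) might look like the sketch below. The function name `iti_matches_level` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify it. The circular tests (Rao's spacing test and the Rayleigh test) are not reproduced here.

```python
from statistics import mean, stdev


def iti_matches_level(tap_times_ms, ioi_ms, mean_tol=0.15, sd_limit=0.66):
    """Check whether the inter-tap intervals fit the IOI of a metrical level.

    Criterion (i): mean ITI within +/- mean_tol of the IOI.
    Criterion (ii): SD of the ITIs below sd_limit times the IOI.
    Both are computed after removing ITIs beyond 2 SD from the trial mean.
    Assumption: sample standard deviation (statistics.stdev).
    """
    itis = [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]
    if len(itis) < 2:
        return False
    m, sd = mean(itis), stdev(itis)
    trimmed = [x for x in itis if abs(x - m) <= 2 * sd]
    m_trim = mean(trimmed)
    sd_trim = stdev(trimmed) if len(trimmed) > 1 else 0.0
    return abs(m_trim - ioi_ms) <= mean_tol * ioi_ms and sd_trim < sd_limit * ioi_ms


taps = [1000, 1500, 2000, 2500, 3000]
print(iti_matches_level(taps, 500))  # True
print(iti_matches_level(taps, 250))  # False
```

A check of this kind separates levels that a vector length alone may not distinguish. Taps every 500 ms are perfectly phase-consistent with a 250 ms reference as well, but their mean ITI lies far outside ±15% of 250 ms.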

Tempo experiment

The perception of a regular beat in isochronous sequences of sounds is possible if the tempo is within a range of approximately 30–300 BPM / 2000–200 ms [18, 34, 42]. Particularly salient pulses are perceived at tempi between 80–160 BPM / 750–375 ms, corresponding to our preferred spontaneous motor tempo [4, 43, 44]. Previous studies suggested that, when synchronizing with polyrhythms, individuals tap in time with the pulse train closest to the human preferred tempo, i.e., the faster pulse train in slower tempi and vice versa [32, 33]. In contrast, we expected that individuals synchronize with the pulse train that can be subdivided into binary groups. We assessed 2:3 and 3:4 polyrhythms in a wide range of tempi. The fast pulse train in the 2:3 polyrhythm (i.e., 3) admits binary subdivision whereas the slow pulse train in the 3:4 polyrhythm (i.e., 3) admits binary subdivision.

Hypotheses.

We expected that any difference in distribution of tapping preference between the two polyrhythms can be explained by differences in subdivision grouping and not by the relative timings of the slow and the fast pulse trains. At moderate tempi, we expected taps to occur in time with the pulse train in which subdivisions could be grouped in pairs (binary). At faster tempi, we expected the pulse train itself to be perceived as subdivisions—again with preferences for binary grouping of subdivisions. At extremely slow and extremely fast tempi, we expected taps to shift towards the subdivisions and the cycle, respectively.

Stimuli.

All stimuli in this study were created with Ableton Live 8 (Ableton, Berlin, Germany; audio files in https://researchbox.org/278). The Tempo stimuli consisted of 2:3 and 3:4 polyrhythms ranging from very slow (approx. 40 BPM) to very fast tempi (approx. 450 BPM; see Table 1). The two pulse trains in each of the polyrhythms were presented with the same cowbell sound and the same amplitude. The duration of the 15 stimuli was between 18 and 28 s, depending on the ratio and tempo. Every stimulus was presented once in random order and consisted of at least six repetitions of a whole polyrhythm cycle. Additional stimuli for control analyses are shown in Figs IV and V in S1 File.

Table 1. The 15 stimuli of the tempo experiment.

https://doi.org/10.1371/journal.pone.0252174.t001

Statistical analyses.

Tapping responses were categorized as one of the following metrical categories: cycle, slow pulse train, double and half tempo of the slow pulse train, fast pulse train, double and half tempo of the fast pulse train, and the common subdivisions. Unclear tapping performances that could not be categorized were not analyzed. Within each metrical level, Cochran’s Q tests were used to investigate the effect of tempo. McNemar’s tests were used to analyze differences between neighboring tempo pairs. All analyses were Bonferroni-corrected for multiple comparisons.
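For readers unfamiliar with Cochran’s Q, the statistic can be computed as in the sketch below. This is a generic textbook formulation, not the authors' code. The data layout is an assumption: one row per participant, one column per tempo, and an entry of 1 if the participant's taps were assigned to the metrical level in question at that tempo. Complete binary data are also assumed.

```python
def cochrans_q(rows):
    """Cochran's Q for a table of 0/1 outcomes (rows: subjects, columns: conditions).

    Returns (Q, degrees of freedom). Under the null hypothesis, Q is
    approximately chi-square distributed with k - 1 degrees of freedom.
    """
    k = len(rows[0])
    col_totals = [sum(row[j] for row in rows) for j in range(k)]
    row_totals = [sum(row) for row in rows]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


# Hypothetical data: 4 participants, 3 tempi
table = [[1, 1, 0],
         [1, 0, 0],
         [1, 1, 0],
         [1, 0, 0]]
print(cochrans_q(table))  # (6.0, 2)
```

A p-value would follow from the chi-square distribution with the returned degrees of freedom. A Bonferroni correction then multiplies each p-value by the number of comparisons, capped at 1.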

Ratio experiment

Although the literature acknowledges that rhythmic interpretation depends on the configuration and in turn on the structure of the polyrhythm [29, 30, 45], no efforts have yet been made to directly assess which particular aspects of the polyrhythm configurations drive tapping preference. In the Ratio Experiment, we assessed preference for metrical structure by focusing on the polyrhythm subdivision level, rather than preference for the constituent pulse trains, as in previous studies [29–31, 33]. The paradigm included polyrhythms ranging from simple (e.g., 2:3) to complex (e.g., 5:6; see Table I in S1 File, for a definition of complexity). Different configurations of polyrhythms give rise to different possible subdivision groupings (binary, ternary, irregular). For example, in a 2:3 polyrhythm, a binary subdivision grouping subserves the three-beat and a ternary subdivision grouping subserves the two-beat (Fig 1A). Accordingly, with respect to subdivision grouping, the 2:3 polyrhythm can be denoted ternary:binary, while for instance the 2:5 polyrhythm can be denoted irregular:binary. Because six subdivisions may be grouped in either two or three, the 5:6 polyrhythm should be denoted binary/ternary:irregular.
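The slow:fast denotation used in this paragraph can be derived mechanically from the ratio, as the following sketch shows. The function names are hypothetical, and treating every group size other than 2, 3, 4, or 6 as irregular is an assumption; among the ratios studied here, the only irregular group size is 5.

```python
from math import gcd


def grouping_label(size: int) -> str:
    """Label the number of subdivisions grouped into one beat."""
    if size in (2, 4):
        return "binary"
    if size == 3:
        return "ternary"
    if size == 6:
        return "binary/ternary"  # six can be grouped in twos or in threes
    return "irregular"           # assumption: everything else, e.g., 5


def denote(slow: int, fast: int) -> str:
    """Return the slow:fast subdivision grouping of a polyrhythm ratio."""
    n_sub = slow * fast // gcd(slow, fast)  # subdivisions per cycle
    return f"{grouping_label(n_sub // slow)}:{grouping_label(n_sub // fast)}"


for slow, fast in [(2, 3), (2, 5), (3, 4), (3, 5), (4, 5), (5, 6)]:
    print(f"{slow}:{fast} -> {denote(slow, fast)}")
# 2:3 -> ternary:binary
# 2:5 -> irregular:binary
# 3:4 -> binary:ternary
# 3:5 -> irregular:ternary
# 4:5 -> irregular:binary
# 5:6 -> binary/ternary:irregular
```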

Hypothesis.

We expected that the metrical structure containing simpler subdivision grouping would be preferred over those containing more complex subdivision grouping. This means that we expected the following preferences: Binary grouping (2 or 4) is preferred over ternary grouping (3), and ternary grouping (3) is preferred over irregular grouping (5).

Stimuli.

The Ratio stimuli consisted of 2:3, 2:5, 3:4, 3:5, 4:5, and 5:6 polyrhythms (audio files in https://researchbox.org/278). The two pulse trains in each polyrhythm were presented with the same cowbell sound and the same amplitude. The tempo of the polyrhythms was based on the duration of their subdivisions, i.e., their least common denominator (Table 2). Two 3:4 polyrhythms with the subdivision tempi 167 and 125 ms, corresponding to pulse train tempi of 90:120 and 120:160 BPM (667:500 and 500:375 ms), were used as anchors. To make the tempo of the pulse trains comparable across the different ratios, 2:3 and 2:5 polyrhythms were additionally slowed down to half tempo, whereas 3:5, 4:5, and 5:6 polyrhythms were additionally sped up to double tempo (Table 2). Following the temporal constraints on beat perception described by London [34] and Repp [11], it is reasonable to assume that beat perception is only possible when the tempo allows for grouping of subdivisions. Consequently, in a metrical structure containing groupings of a large number of subdivisions, the tempo of the pulse train must be slowed down to allow beat perception to occur and to make balanced comparisons between different ratios possible. In other words, comparisons should be made between subdivision tempi, not pulse train tempi. The duration of the 22 stimuli was between 15 and 24 s, depending on the ratio and tempo. Every stimulus consisted of at least five repetitions of a whole polyrhythm cycle.
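The relation between pulse train tempi and subdivision durations can be checked with a short calculation. The sketch below uses hypothetical function names and only the standard conversion IOI (ms) = 60000 / BPM. It reproduces the two 3:4 anchor values given above.

```python
from math import gcd


def subdivision_ms(slow: int, fast: int, slow_bpm: float) -> float:
    """Duration of one subdivision, given the tempo of the slow pulse train."""
    n_sub = slow * fast // gcd(slow, fast)  # subdivisions per cycle
    slow_ioi_ms = 60000 / slow_bpm          # inter-onset interval of the slow pulse train
    cycle_ms = slow * slow_ioi_ms           # one cycle contains `slow` slow beats
    return cycle_ms / n_sub


def pulse_iois_ms(slow: int, fast: int, sub_ms: float):
    """IOIs (slow, fast) of both pulse trains, given the subdivision duration."""
    n_sub = slow * fast // gcd(slow, fast)
    return (n_sub // slow) * sub_ms, (n_sub // fast) * sub_ms


print(round(subdivision_ms(3, 4, 90)))   # 167
print(round(subdivision_ms(3, 4, 120)))  # 125
print(pulse_iois_ms(3, 4, 125))          # (500, 375)
```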

Table 2. The 22 stimuli of the ratio experiment.

https://doi.org/10.1371/journal.pone.0252174.t002

Statistical analyses.

Tapping responses were coded as 1 when falling into one of the fast pulse train categories (fast pulse train, fast pulse train—double tempo, or fast pulse train—half tempo) and coded as 0 when falling into one of the slow pulse train categories (slow pulse train, slow pulse train—double tempo, or slow pulse train—half tempo). These values were averaged across all tempi in each of the polyrhythm ratios (Fig 5A). For statistical analysis, we computed two means: 1) the mean of all polyrhythm ratios in which the slow:fast pulse train relation admits ternary:binary subdivision (2:3), irregular:binary subdivision (2:5, 4:5) or irregular:ternary subdivision (3:5), i.e., ratios with simpler subdivision grouping in the faster pulse train, and 2) the mean of all polyrhythm ratios in which the slow:fast pulse train relation admits binary:ternary subdivision (3:4) or binary/ternary:irregular subdivision (5:6), i.e., ratios with simpler subdivision grouping in the slower pulse train. These two means were compared using a paired sample Wilcoxon test.
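The coding scheme can be illustrated as follows. The category labels and function names are hypothetical. Leaving out responses at other metrical levels (e.g., cycle or common subdivisions) is an assumption, because the text only specifies codes for the fast and slow pulse train categories.

```python
from statistics import mean

# Hypothetical category labels for the metrical levels of each pulse train
FAST = {"fast", "fast_double", "fast_half"}
SLOW = {"slow", "slow_double", "slow_half"}


def code_response(category):
    """1 for fast pulse train categories, 0 for slow ones, None otherwise."""
    if category in FAST:
        return 1
    if category in SLOW:
        return 0
    return None  # assumption: other levels are left out of the average


def preference_score(categories):
    """Mean of the coded responses (proportion of fast pulse train responses)."""
    coded = [c for c in map(code_response, categories) if c is not None]
    return mean(coded) if coded else None


trials = ["fast", "fast_double", "slow", "cycle"]
score = preference_score(trials)  # 2 of 3 coded trials follow the fast pulse train
```

Computing one such score per participant for each of the two ratio groups yields paired values. These could then be compared with a paired Wilcoxon signed-rank test, for instance `scipy.stats.wilcoxon`.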

Pitch experiment

The pitch of elements in a musical rhythm is an important factor for beat perception. Low-pitched rhythmic elements increase the sensitivity to timing variation on behavioral and neural levels [46], and EEG activity at meter-related frequencies increases with low-pitch sounds [47]. In general, high energy in bass frequencies is important for inducing movements, such as tapping in time with the beat and dancing [48–51]. When investigating the effect of pitch on beat perception in polyrhythms, Handel and Oshinsky [31] found that participants tended to perceive the lower pitched pulse train as the beat and that this preference counteracted preferences related to the timing of the pulse trains. Here, we investigated this effect of lower pitch in more detail by not only varying pitch, but also loudness between the slow and fast pulse trains in 2:3 and 3:4 polyrhythms. This manipulation was informed by pilot studies showing that without a loudness manipulation the vast majority of participants tap in time with the pulse train admitting binary subdivision grouping. The resulting design allowed us to assess whether preferences for binary subdivision grouping have stronger effects on beat perception than loudness or bass frequencies.

Hypotheses.

We expected participants’ tapping responses to reflect a preference for lower pitched pulse trains. This means that the preference for binary subdivision grouping should be strengthened when coinciding with the low-pitched pulse train and weakened when coinciding with the high-pitched pulse train.

Stimuli.

The Pitch stimuli consisted of 2:3 and 3:4 polyrhythms created with marimba sounds at the tempi of 90:135 BPM (667:444 ms) and 90:120 BPM (667:500 ms), respectively (audio files in https://researchbox.org/278). Table 3 details the pitch and loudness manipulations. In each polyrhythm, one of the pulse trains was pitched low with a peak frequency of 262 Hz (C4) and the other pulse train was pitched higher with a peak frequency of 1047 Hz (C6). This manipulation was counterbalanced. The loudness of the two pulse trains was either the same, moderately louder for the pulse train with ternary subdivisions, or markedly louder for the pulse train with ternary subdivisions. Loudness was measured in Loudness K-weighted Full Scale (LKFS) with the Orban Loudness Meter (version 2.9.6; www.orban.com/meter/). The duration of the 12 stimuli was 17 and 18 s for ratios 2:3 and 3:4, respectively. To exclusively investigate the effect of amplitude, we presented six additional control stimuli with the same loudness manipulation as the experimental stimuli but using the same pitch in both pulse trains (C5 with a peak frequency of 524 Hz; Fig VI in S1 File).

Table 3. The 12 pitch stimuli.

https://doi.org/10.1371/journal.pone.0252174.t003

Statistical analyses.

The dependent variable was defined as the tapping consistency related to the fast pulse train minus the tapping consistency related to the slow pulse train in each trial. As the tapping consistency is defined as the vector length in circular statistics, this procedure resulted in a value ranging between -1 and 1. Values close to -1 indicate that participants consistently tapped in time with the slow pulse train, whereas values close to 1 indicate that participants consistently tapped in time with the fast pulse train. Data for the individual factor combinations (2 pitch × 3 loudness factor levels) were not normally distributed (Shapiro-Wilk p-values < .001). In the preregistration for this study, we planned to use linear mixed effects models for analyzing the pitch data. However, the residuals of these models were not normally distributed, as indicated by visual inspections of Q-Q plots and Shapiro-Wilk tests (all p-values < .001). Consequently, we used two Wilcoxon tests for paired samples to investigate the effect of pitch (low pitch in slow vs. fast pulse train) and two Kruskal-Wallis tests to investigate the effect of loudness in 2:3 and 3:4 polyrhythms, separately. We report the results of the nonparametric tests, which were supported by a double check analysis using the linear mixed effects models.

Results and discussion

In the following section, we report and discuss the results of the three experiments Tempo, Ratio, and Pitch separately, to unravel their individual effects on beat perception in polyrhythms. In the subsequent General Discussion, we focus on the converging evidence for a preference for binary grouping of subdivisions in musical rhythms and broaden the perspective to binary grouping of temporal units in the perception of time in everyday life.