Statistical learning and the uncertainty of melody and bass line in music

Authors: Tatsuya Daikoku ^aff001
Authors’ place of work: Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany ^aff001; Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Cambridge, United Kingdom ^aff002
Published in the journal: PLoS ONE 14(12)
Category: Research Article
doi: https://doi.org/10.1371/journal.pone.0226734

Summary

Statistical learning is the ability to learn based on transitional probability (TP) in sequential information, which has been considered to contribute to creativity in music. The interdisciplinary theory of statistical learning examines statistical learning as a mechanism of human learning. This study investigated how TP distribution and conditional entropy in TP of the melody and bass line in music interact with each other, using the highest and lowest pitches in Beethoven’s piano sonatas and Johann Sebastian Bach’s Well-Tempered Clavier. Results for the two composers were similar. First, the results detected specific statistical characteristics that are unique to each melody and bass line as well as general statistical characteristics that are shared between the melody and bass line. Additionally, a correlation of the conditional entropies sampled from the TP distribution could be detected between the melody and bass line. This suggests that the variability of entropies interacts between the melody and bass line. In summary, this study suggested that TP distributions and the entropies of the melody and bass line interact with but are partly independent of each other.

Keywords:

Learning – Markov models – Neurophysiology – Bioacoustics – Entropy – Music cognition – Statistical distributions – Conditional entropy

1. Introduction

1.1. Statistical learning in humans and computers

Statistical learning (SL) has been considered a domain-general and implicit learning system that encodes the probabilistic distribution of sequential phenomena such as music and language [1–3]. For example, the brain’s SL machinery automatically computes transitional probability (TP) distributions of sequences, calculates uncertainty/entropy of the distribution, and predicts a future state based on an internalized statistical model in order to minimize sensory reaction and uncertainty and optimize the efficiency of the prediction. SL is an interdisciplinary field that embraces both the brain’s SL system and artificial intelligence in the framework of predictions. When a brain or a computer encodes the TP distribution of a sequence, it expects a probable future stimulus with a high TP and inhibits the processing loads that will arise in response to predictable states [4,5]. SL has been considered to contribute to creativity in music [6,7], decision-making [8–10], and motor activities [11–13] as well as perception [14–17]. The TP is the conditional probability of an event B given that the latest event A has occurred, written as P(B|A). The TP distributions sampled from sequential information can be expressed by nth-order Markov models or n-gram models [18]. The Markov model has frequently been applied to develop artificial intelligence that gives computers learning abilities similar to those of the human brain, thus generating systems for data mining, automatic music composition [19], and automatic text classification in natural language processing [20].

Psychologists agree that computational and corpus studies on music can highlight some of the statistical properties available to musical learners by SL and implicit learning [21–24]. In particular, the Competitive Chunker [25], PARSER [26], Information Dynamics of Music (IDyOM) [27], and n-gram models [28] underlie the hypothesis that music is acquired by concatenating chunks. Computational studies calculate statistical distributions in music and devise corresponding models, then evaluate the validities of these models through neurological and behavioural experiments [27,29,30]. SL in Markov models, which correspond to n-gram models based on conditional probability [31], overlaps with SL in many other fields of study, such as neuroscience, behavioural science, and computational science. Entropy, which is calculated from the probability distribution and has been interpreted as the average degree of surprise associated with an outcome [32,33], has also been used to verify the validity of computational models including SL in music [34–37]. Thus, information-theoretical approaches including information content and entropy (i.e., transitional probability and uncertainty, respectively) based on nth-order Markov models are candidates for understanding musical SL on an interdisciplinary scale.

1.2. Uncertainty, probability, and order

To precisely predict individual events in a sequence, the brain encodes the degree of uncertainty of the statistical distributions in the sequence as well as the TP value itself [34,38]. This uncertainty can be quantified as “entropy” in Shannon’s sense [31]. In particular, conditional entropy can be calculated from the TP distribution and is interpreted as the average degree of surprise or uncertainty of an outcome. From a psychological perspective in music, a musical sequence with higher conditional entropy is considered to have information that makes its distributional structure more difficult to grasp. Therefore, in terms of information efficiency, an SL model sampled from a sequence with higher conditional entropy will be less optimized. Several studies have shown that the degree of conditional entropy modulates the precision of predictability in a sequence [30,39–41]. In addition, the uncertainty in musical sequences may account for the characteristics of musical SL ability in persons with developmental learning disorders such as amusia [42–44]. The literature on this topic indicates that persons with developmental learning disorders are impaired only with regard to higher- rather than lower-order SL [45]. Computational modelling has also suggested that individual differences in statistical knowledge gradually emerge from the lower- to higher-order SL models [46,47], and that statistical knowledge may shift from a lower- to higher-order (deeper) hierarchy through experience. Thus, distinct stages of SL strategies may be explained based on the information-theoretical concept of “order”. The order of SL is not independent of the degree of uncertainty but interdependent with it [48]. In the framework of information theory, higher-order statistical models represent lower conditional entropy (i.e., uncertainty) (see Fig 3B in [18]).
In other words, when the brain can construct a higher-, but not a lower-, order statistical model from music, it can internalize the music as having less uncertainty. Thus, the order of the SL model in music could modulate the uncertainty.

1.3. Creativity and uncertainty

Recent literature has suggested that specific developmental processes modulate SL ability in the brain. For example, both Western-classical and jazz musicians are better statistical learners in general than nonmusicians [49–53]. Furthermore, through long-term musical training, musicians optimize their brains’ probabilistic modelling ability for SL and decrease the degree of uncertainty [52]. In the end, the optimized SL models in musicians’ brains allow them to precisely and efficiently predict tones during SL of auditory sequences. This precision and efficiency of prediction may also enhance neural-processing efficiency. For example, neurophysiological studies have demonstrated the existence of individual differences in SL ability in the framework of prediction [54]. This may indicate that auditory training modulates neural processing that may reflect prediction based on SL. Although the brain tries to realize valuable behaviours at the lowest uncertainty, it also seeks a slightly suboptimal solution if such a solution can be afforded at a significantly low uncertainty [55]. This fluctuation of uncertainty could contribute to maximizing the rewards of curiosity, encouraging human creativity and creating new information regularities [56]. Recent computational studies on music have suggested that, from the early stage to the late stage of a composer’s lifetime, the transitional probabilities of familiar phrases in that composer’s music gradually decrease [46], whereas the conditional entropy (i.e., uncertainty) gradually increases. These findings were more prominent in higher- than in lower-order SL models. These studies suggest that higher- rather than lower-order statistical knowledge [38,46] may be more susceptible to long-term experience that modulates uncertainty in the brain’s probabilistic model [52].
Furthermore, computational studies on improvisation in music have suggested that lower-order SL models represent general characteristics shared among musicians, whereas higher-order SL models detect specific characteristics unique to each musician [57,58]. Thus, a growing body of literature indicates that SL affects musical structure and its statistical distributions. It is unknown, however, how the TP distributions of the melody and bass line interact with each other, and how tonal mode and key govern the statistical distributions and the interactions between the melody and bass line.

Western tonal classical music has a number of specific features such as isochronic metrical grids, tonal pitch spaces, hierarchical tension, and attraction contours based on the structure of the melody and chord progression [59,60]. The musical melody and bass line can interact with each other within the constraints of these features. In music, the highest and lowest pitches play an important role in establishing the frames of the melody and bass line, respectively. To form musical structures such as phrase and harmony, they are partly dependent on and partly independent of each other. According to neurophysiological and behavioural studies, SL of dyad sequences with distinct regularities in the high and low voices can be performed in parallel and independently [61,62]. In other words, distinct statistical knowledge of high- and low-pitch sequences can be acquired simultaneously. Another neurophysiological study suggested that SL is also possible for harmony sequences in which the highest and lowest pitches are randomly distributed without regularity [29]. Together, neural studies support the hypothesis that SL of the melody and SL of the bass line interact with and are partly independent of each other in the framework of the Gestalt principle in music [60]. To understand musical SL in humans and to refine the computational models, it is important to examine how the melody and bass line interact with each other based on statistical and music-specific features.

1.4. The aim of the present studies

The purpose of the present studies is to investigate how TP distributions of the melody and bass line interact with each other, and how tonal mode and keys govern the statistical distributions and the interaction between the melody and bass line. The information content of TPs in the sequences containing the highest and lowest pitches in all of the movements in Beethoven’s piano sonatas (No.1, Op.2-1 to No.32, Op.111) (Study 1) and Johann Sebastian Bach’s Well-Tempered Clavier (Study 2) was calculated based on six Markov stochastic models of different orders (i.e., zeroth- to fifth-order Markov chains). First, to investigate the statistical characteristics of the melody and bass line in each piece of music, the TP distribution was analysed using principal component analysis, based on the hypothesis that there are fundamental statistical characteristics shared between the melody and bass line, and specific statistical characteristics that are unique to each. Additionally, the detectability of these characteristics may depend on the tonal mode and the keys [63] and/or on the order of TP distributions (zeroth to fifth orders). If so, the interaction of statistical characteristics between the melody and bass line may depend on the tonality (tonal mode and keys) and/or order of the TP distribution [64]. Second, to investigate the relationships between entropy in the melody and entropy in the bass line in each tonality and each order of TP distribution, the conditional entropy of the TP distribution was compared by correlation analysis between the melody and bass line, and between music in a major key and music in a minor key. It was hypothesized that the variability of entropy in each piece of music depends on the tonality and order of TP distribution.
In the present studies it was expected that the statistical distribution of music would correspond with models of predictive function in the brain, and we first investigated how information-theoretical notions including information content and entropy are related to SL theory regarding human predictions.

2. Methods

All of the movements in Ludwig van Beethoven’s piano sonatas (No.1 in F minor, Op.2-1 to No.32 in C minor, Op.111, composed 1795–1822) and Johann Sebastian Bach’s Well-Tempered Clavier, BWV 846–893, which is a collection of two series (No.1 and No.2) of preludes and fugues in all 24 major and minor keys, were used in the present studies. Using a scorewriter software program (Finale version 25, MI Seven Japan, Inc.), electronic scoring data of the sequences of the highest and lowest pitches were extracted from the XML files. The highest and lowest pitches were defined as the highest and lowest pitches that can be played at a given point in time; in identifying these pitches, equivalent pitches were counted as one, and grace notes were excluded. Using all the pitch sequences in each piece of music, the TP distributions were calculated based on zeroth- to fifth-order Markov models. In Beethoven’s piano sonatas, the weighted averages of TPs of all the movements were calculated. In Bach’s Well-Tempered Clavier, the weighted averages of TPs of the prelude and fugue in No.1 and No.2 in each key were calculated. As described in detail previously [57], the nth-order Markov models are based on the conditional probability of an element e_{n+1}, given the preceding n elements e_n:

$$P(e_{n+1} \mid e_n) = \frac{P(e_{n+1} \cap e_n)}{P(e_n)}$$
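
In practice, such conditional probabilities are usually estimated from relative frequencies. The following is a minimal sketch of that counting step, not the analysis code used in the study: the function name `transition_probabilities` and the representation of pitches as integers are assumptions made for illustration.

```python
from collections import Counter


def transition_probabilities(sequence, order):
    """Estimate nth-order TPs P(next | context) by relative frequency.

    `sequence` is a list of hashable symbols (here, hypothetical integer
    pitch codes); `order` is the number of preceding elements used as
    context. Order 0 uses an empty context, so the result is simply the
    relative frequency of each symbol.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        context_counts[context] += 1
        pair_counts[(context, sequence[i])] += 1
    tps = {}
    for (context, nxt), count in pair_counts.items():
        # P(next | context) = count(context, next) / count(context)
        tps.setdefault(context, {})[nxt] = count / context_counts[context]
    return tps


# Toy pitch sequence (hypothetical data, not taken from the corpus).
toy = [60, 62, 60, 62, 64]
first_order = transition_probabilities(toy, 1)
# After 60 the sequence always moves to 62; after 62 it moves to 60 or 64.
```

Storing each context as a tuple keeps the same function usable for every order from zeroth to fifth.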

Then, for each type of pitch-interval transition, all of the intervals were numbered relative to the first pitch of the pattern, so that an increase or decrease of one semitone was 1 or -1, respectively. Representative examples are shown in Fig 1. This revealed interval patterns but not pitch patterns. This procedure was employed to eliminate the effects of key changes on transitional patterns. The interpretation of a key change depends on the musician and is difficult to define in an objective manner. Thus, the results of the present studies may represent a variation of statistics associated with relative pitch rather than absolute pitch. Then, the information content I(e_{n+1}|e_n) of each TP was calculated based on information theory [31] as:

$$I(e_{n+1} \mid e_n) = \log_2 \frac{1}{P(e_{n+1} \mid e_n)} \quad \text{(bit)}$$
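
The two steps described here, recoding a pattern relative to its first pitch and converting a TP into information content, can be illustrated with a short sketch. It assumes pitches are given as MIDI note numbers, which the text does not specify, and the function names are hypothetical.

```python
import math


def relative_to_first(pitches):
    """Recode a pitch pattern as semitone offsets from its first pitch.

    Assumes MIDI-style integer pitches, so that one semitone up is +1
    and one semitone down is -1.
    """
    first = pitches[0]
    return tuple(p - first for p in pitches)


def information_content(tp):
    """Information content in bits of a transition with probability tp."""
    if not 0.0 < tp <= 1.0:
        raise ValueError("tp must lie in (0, 1]")
    return math.log2(1.0 / tp)


# C sharp, F sharp, D as MIDI numbers 61, 66, 62 (an illustrative choice
# of octave) become the key-independent pattern (0, 5, 1).
pattern = relative_to_first([61, 66, 62])
# A transition with TP 0.25 carries 2 bits; a certain one carries 0 bits.
bits = information_content(0.25)
```

Because only offsets are kept, the same pattern is obtained when the phrase is transposed to another key, which is the purpose of the recoding.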

The SL mechanism can be explained using well-defined principles of information theory [31]. Information, also referred to as information content, is measured in binary digits, or bits. The key insight is that the average information content of a distribution is its entropy, i.e., the “uncertainty” of the statistical distribution. Thus, using the distributions of TPs (information content) in each melody and bass line of each piece of music, the distributional characteristics of each piece of music were analysed by principal component analysis (PCA). The present study hypothesized that a component shared within the melodies or bass lines and within major or minor keys represents a specific characteristic of TP distribution depending on voice part (i.e., melody and bass) and tonal mode (i.e., major and minor). Following our previous work [57], components with an eigenvalue greater than 1 were retained. The first two components that contribute to each piece of music (i.e., those with the first and second highest contribution ratios) were adopted in Study 1. In Study 2, on the other hand, the first three components were adopted in order to verify the components of major and minor keys as well as those of the melody and bass lines. Furthermore, the nth-order conditional entropy H(B|A) was calculated from the information content as follows:

$$H(B \mid A) = -\sum_{i}\sum_{j} P(a_i)\,P(b_j \mid a_i)\,\log_2 P(b_j \mid a_i)$$

where P(a_i) is the probability of event a_i occurring and P(b_j|a_i) is the probability of b_j occurring given that a_i occurred immediately before (i.e., the transitional probability of the sequence “a_i b_j”). The conditional entropy is the probability-weighted sum of the bits and is regarded as the “uncertainty” of the transitional-probability distribution. The conditional entropy of each TP distribution was compared by correlation analysis. Statistical significance levels were set at p = 0.05 for all analyses.
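
As an illustration of how the conditional entropy defined above could be computed from a symbol sequence, here is a minimal sketch. It is not the study's code: the function name is hypothetical, and P(a_i) is estimated only from contexts that are followed by at least one element, which is one reasonable convention among several.

```python
import math
from collections import Counter


def conditional_entropy(sequence, order):
    """Conditional entropy in bits of the next symbol given `order`
    preceding symbols, estimated by relative frequency."""
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        context_counts[context] += 1
        pair_counts[(context, sequence[i])] += 1
    total = sum(context_counts.values())
    entropy = 0.0
    for (context, _nxt), count in pair_counts.items():
        p_context = context_counts[context] / total  # P(a_i)
        p_next = count / context_counts[context]     # P(b_j | a_i)
        entropy -= p_context * p_next * math.log2(p_next)
    return entropy


# A strictly alternating sequence is fully predictable from one preceding
# element, so its first-order conditional entropy is 0 bits, while its
# zeroth-order entropy is 1 bit (two equally frequent symbols).
alternating = [0, 1, 0, 1]
h1 = conditional_entropy(alternating, 1)
h0 = conditional_entropy(alternating, 0)
```

The toy example also shows the point made in section 1.2: for the same sequence, a higher-order model can yield lower conditional entropy than a lower-order one.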

**Fig. 1. Representative phrases of transition patterns in the melody and bass line from zeroth- to fifth-order Markov models (Beethoven’s piano sonata).**

3. Study 1: Ludwig van Beethoven

3.1. Results

3.1.1. Retrieval of characteristics in the melody and the bass line in major and minor keys

The transitional-probability matrices and the entropies in each piece of music are shown in Supporting Information 1 and 2, respectively. All of the results are shown in Table 1, Table 2, and Fig 2. In the zeroth-order model, the two components accounted for 51.18% of the total variance. All of the pieces of music except for No.20 scored higher than .37 on component 1. This score represents the general component that is shared between the melody and the bass line. Component 2, in contrast, was unable to detect any shared characteristics between the melody and bass line. In the first-, second-, and third-order models, the two components accounted for 42.64%, 25.91%, and 18.56% of the total variance, respectively. All of the pieces of music scored higher than .44, .25, and .17 on component 1 in the first-, second-, and third-order models, respectively. These results represent the general component that is shared between the melody and the bass line. In component 2, on the other hand, the eigenvectors in the melody were generally higher than those in the bass lines. This represents the distinct components of the melody and bass lines. In the fourth- and fifth-order models, the two components accounted for 14.23% and 13.12% of the total variance, respectively. All of the pieces of music scored higher than .14 and .03 on component 1 in the fourth- and fifth-order models, respectively. These results represent the general component that is shared between the melody and the bass line. In component 2, the eigenvectors were generally lower in the melody than in the bass lines. This represents the distinct components of the melody and the bass line.
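
The percentages of variance reported above are the share of total variance captured by the retained components. As a rough illustration only, the sketch below extracts the first principal component of a small data set by power iteration. The text does not state whether the PCA was run on a correlation or a covariance matrix; the sketch assumes a correlation matrix, which is the usual setting for an eigenvalue-greater-than-1 criterion. All names and the toy data are hypothetical, and a statistics package would normally be used instead.

```python
import math


def correlation_matrix(columns):
    """Correlation matrix of equal-length numeric columns.

    Each column stands for one voice part of one piece; each row for one
    transition pattern. Constant columns are not supported (zero variance).
    """
    def unit_scaled(col):
        mean = sum(col) / len(col)
        norm = math.sqrt(sum((v - mean) ** 2 for v in col))
        return [(v - mean) / norm for v in col]

    z = [unit_scaled(c) for c in columns]
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in z] for zi in z]


def first_component(matrix, iterations=200):
    """Leading eigenvalue, eigenvector, and variance share of a
    correlation matrix, found by power iteration from a uniform start.

    The uniform start fails if the leading eigenvector is orthogonal to
    it; this is acceptable for a sketch but not for production use.
    """
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    product = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))
    # The trace of a correlation matrix equals the number of variables.
    return eigenvalue, vec, eigenvalue / k


# Three perfectly correlated toy columns share a single component.
toy_columns = [[1, 2, 3, 4], [2, 4, 6, 8], [2, 3, 4, 5]]
eigenvalue, loadings, share = first_component(correlation_matrix(toy_columns))
```

In this toy case every column loads equally on the first component, which is the pattern the text interprets as a general component shared between the melody and the bass line.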

**Tab. 1. The eigenvalue and percentages of variance and cumulative variance in Study 1 (Beethoven’s piano sonata).**

**Tab. 2. The eigenvectors for the principal components in Study 1 (Beethoven’s piano sonata).**

3.1.2. Correlation analysis

All of the results in the correlation analysis are shown in Fig 3. In first- to fifth-order TP distributions, the conditional entropies of the melody were significantly related to those of the bass line (1st: r = .60, p < 0.001; 2nd: r = .82, p < 0.001; 3rd: r = .80, p < 0.001; 4th: r = .55, p = 0.001; 5th: r = .50, p = 0.004).
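
The correlation coefficients above relate one entropy value per piece for the melody to the corresponding value for the bass line. The sketch below shows how such a coefficient could be computed, assuming Pearson's r, which the notation suggests but the text does not state explicitly. The function name and the input values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical conditional entropies (bits), one pair per piece; these
# are illustrative numbers, not values from the study.
melody_entropy = [2.1, 2.4, 2.0, 2.8, 2.6]
bass_entropy = [1.9, 2.3, 1.7, 2.5, 2.6]
r = pearson_r(melody_entropy, bass_entropy)
```

The sketch returns only the coefficient. A p-value would additionally require the t distribution, for which a library routine such as `scipy.stats.pearsonr` is the usual choice.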

**Fig. 3. The correlation analysis of conditional entropy between the melody and bass line based on zeroth- to fifth-order Markov models in Study 1 (Beethoven’s piano sonata).**

3.2. Discussion

This study examined how zeroth- to fifth-order TP distributions (Markov models) and the conditional entropies in the melody and bass line correlate and interact with each other in all movements of the piano sonatas by Ludwig van Beethoven (No.1 in F minor, Op.2-1 to No.32 in C minor, Op.111, composed 1795–1822). First, we investigated how the statistical characteristics of the melody and bass line can be extracted in each order Markov model using principal component analysis. It was hypothesized that there were general statistical characteristics shared between the melody and bass line as well as specific statistical characteristics that were unique to each melody and bass line based on each order model. The results showed that the TP distribution in the zeroth-order Markov model detected only a general component that is shared between the melody and bass line, whereas those in the first- to fifth-order Markov models also detected specific components that are unique to each melody and bass line (Fig 2). These results suggest that specific statistical characteristics in each melody and bass line can be disclosed in higher-order but not in zeroth-order statistical models. From the psychological and neurophysiological viewpoints of SL in the brain, this suggests that higher-order, but not lower-order, statistical knowledge of the melody and of the bass line is partially independent.

Second, we investigated the relationships of conditional entropies between the melody and bass line in each order Markov model using correlation analysis. It was hypothesized that the correlation of the variability in the entropy between the melody and bass line depends on the order of TP distribution. The results showed that a correlation of conditional entropies between the melody and bass line could be detected in the first- to fifth- but not zeroth-order Markov models. This points to a correlation in the variability of entropies between the melody and bass line in higher-order TP distributions, and may indicate that the correlation between the melody and bass line depends on the length of the sequence. Compared to the zeroth-order model, the higher-order models could essentially construct a musical phrase. Thus, it is possible that the analysis of an entire musical phrase may strengthen the perceived connection between the melody and bass line. In psychological and computational studies related to SL, predictive coding, and information theory, entropy has been interpreted as the average degree of surprise associated with an outcome [33]. Entropy has also been used to verify the validity of statistical models in music [34–37]. The present study found that the entropy of the melody is correlated with that of the bass line in higher-order statistical models. This may suggest that higher-order, but not lower-order, statistical knowledge of the melody and of the bass line is partially interdependent. This hypothesis seems plausible given what we know about musical properties. In general, musical constraints such as harmony and musical key control the phrasing of each melody and bass line. For example, if a five-tone melody is made up of C sharp, F sharp, and D (Fig 1, fourth-order), it constrains the harmony or key (e.g., the A major, F-sharp minor, D major, or B minor keys), and the concurrent bass line also follows the same key or harmony.
In contrast, a two-tone sequence with a semi- or whole-tone interval, which can be coded in a first-order model, is insufficient to establish a harmony, musical key, and phrase, unlike longer sequences. It is worth noting, however, that a pianist often picks up his or her hands as a phrase ends and restarts a new phrase, resulting in unpredictable jumps in pitch interval. Thus, we cannot exclude the possibility that the findings of the present study could simply be associated with texture and phrasing in music rather than melody and bass patterning itself. Further study will be needed to verify the relationships between musicological texture and statistical pattern with regard to entropy in several orders of TP distributions. In summary, this study may suggest that the SL of the melody and bass line correlate with and are partly independent of each other in terms of TP distribution. These findings may also be in agreement with the hypothesis in neural studies that the SL of the melody and bass line interact with and are partly independent of each other [29,61,65]. In the present studies, it was expected that this would occur based on some very specific findings in the neuroscience literature, but a previous neural study also suggested that SL could be modulated by music-specific features such as tonal mode and key [29]. Therefore, our next study will investigate how the tonalities of keys govern statistical distributions and the interaction between the melody and bass line.

4. Study 2: Johann Sebastian Bach

4.1. Results

4.1.1. Retrieval of characteristics in the melody and bass lines in major and minor keys

All of the results are shown in Table 3, Table 4, and Fig 4. In the zeroth- to fifth-order models, the three components accounted for 58.71%, 50.03%, 37.41%, 31.31%, 24.14%, and 15.94% of the total variance, respectively. All of the pieces of music scored higher than 0 on component 1, which represents the general component that is shared among all of the pieces of music. In component 2, in the first-, second-, and third-order models, the eigenvectors of the bass line were generally higher than those of the melody, representing the distinct components of the melody and the bass line. In component 3, in the second-order model, the eigenvectors of major keys were generally higher than those of minor keys, representing the various components of major and minor keys.

**Tab. 3. The eigenvalue and percentages of variance and cumulative variance in Study 2 (Bach’s Well-Tempered Clavier).**

**Tab. 4. The eigenvectors for the principal components in Study 2 (Bach’s Well-Tempered Clavier).**

4.1.2. Correlation analysis

All of the results in the correlation analysis are shown in Fig 5. In the zeroth-, second-, and third-order TP distributions, the conditional entropies of the melody were strongly (0.7 ≤ |r| < 1.0) related to those of the bass line (zeroth-order: major, r = .77, p = 0.003; minor, r = .85, p < 0.001; second-order: major, r = .93, p < 0.001; minor, r = .78, p = 0.003; third-order: major, r = .75, p = 0.005; minor, r = .91, p < 0.001; Fig 5A). In first-order TP distributions, the conditional entropies of the melody in major keys were strongly related while those in minor keys were moderately (0.4 ≤ |r| < 0.7) related to those of the bass line (major: r = .82, p = 0.001; minor: r = .62, p = 0.063). In fourth-order TP distributions, the conditional entropies of the melody in major keys were moderately related while those in minor keys were strongly related to those of the bass line (major: r = .59, p = 0.045; minor: r = .93, p < 0.001). In fifth-order TP distributions, the conditional entropies of the melody were strongly related to those of the bass line in minor keys (r = .81, p = 0.001), whereas no significant correlation was detected in major keys. No significant correlation was detected between major and minor keys (Fig 5B).

4.2. Discussion

In Study 2, using Johann Sebastian Bach’s Well-Tempered Clavier, BWV 846–893, which has preludes and fugues in all 24 major and minor keys, we investigated the interaction between the zeroth- to fifth-order TP distributions (Markov models) and the conditional entropies in the melody and bass line. First, the manner in which the statistical characteristics of the melody and bass line in each of the major and minor keys could be extracted in each order Markov model was investigated using principal component analysis. It was hypothesized that there were general statistical characteristics shared between the melody and the bass line and between the major and minor keys, as well as specific statistical characteristics that were unique to each melody and bass line and to each major or minor key. Additionally, it was hypothesized that the detectability of these characteristics depends on the tonalities of the keys and the order of TPs [63]. The results showed that the TP distribution in each order Markov model detected general components that are shared between the melody and bass line and between major and minor keys (Fig 4). The first- to third-order Markov models detected specific components that are unique to each melody and bass line. The second-order Markov models detected specific components that are unique to each major and minor key. These results suggest that statistical characteristics specific to each melody and bass line can be disclosed in first- to third-order models. Second, we investigated the relationships of conditional entropies between the melody and bass line and between major and minor keys in each order Markov model using correlation analysis. It was hypothesized that the correlation of variability in the entropies between the melody and bass line depends on the order of TP distribution and tonal mode.
The results suggested that the correlation of conditional entropies between the melody and bass line could be detected in most Markov models, including the zeroth-order model, although in the first- and fifth-order models it reached significance only in major and minor keys, respectively (Fig 5A). These results suggest that the variability of entropies is correlated between the melody and bass line across orders of TP distribution. Considering the psychological and computational viewpoints on entropy [34], the present findings that the entropies of the melody are correlated with those of the bass line suggest that statistical knowledge of the melody and bass line, but not of major and minor keys (Fig 5B), is partially interdependent. In summary, this study suggested that SL of the melody and SL of the bass line correlate with and are partly independent of each other. Thus, humans’ statistical knowledge of melodies and bass lines may be derived from their pairing with some noise in compositional systems.

5. General discussion

5.1. Statistical characteristics of melodies and bass lines

The present studies investigated how TP distributions and the conditional entropy of the melody and bass line interact with each other, using the highest and lowest pitches in Beethoven’s piano sonatas (Study 1) and Johann Sebastian Bach’s Well-Tempered Clavier (Study 2). Our findings were similar for the two composers. First, TP distribution in each model showed a general component (component 1) that is shared between the melody and bass line. Second, TP distribution in the first- and second- but not zeroth-order models detected specific components (component 2) that were unique to each melody and bass line. These results suggest that statistical characteristics specific to each melody and bass line can be disclosed in higher-order but not in zeroth-order statistical models. From the psychological and neurophysiological viewpoints of SL in the brain, higher-order but not lower-order statistical knowledge of the melody and bass line are partially independent of each other. Additionally, Study 2 also detected specific components (component 3) that are unique to each major and minor key as well as to the melody and bass line (Fig 4). Thus, the results suggest that a second-order Markov model (i.e., trigram model) may have the advantage of being able to extract statistical characteristics based on the tonalities of keys and voice parts. From a psychological viewpoint, a composer’s specific statistical knowledge of the melody and bass lines in music may be expressed in higher-order rather than zeroth-order TP distributions. It is of note, however, that the present studies investigated statistical characteristics in music belonging to only two corpora without taking any psychological or neurological measurements and did not directly demonstrate statistical knowledge of music in the composers. 
A previous study reported computational validation against a ground truth of human cognition by examining whether the output of computational modelling aligned with human assessments or behaviour [21]. Thus, it may be doubtful to claim that neurodynamics can be represented by TP distribution and entropy. Furthermore, the present studies might not prove the existence of a general musical phenomenon because of the small corpora, and there might be other possible explanations for our results. For instance, it might have been an intentional plan on the part of the composers to compose music based on the statistics of melodies and bass lines. Furthermore, it has been suggested that humans’ ability to generate random sequences of numbers [66] is associated with creativity [67]. The possibility that the findings in the present studies do not necessarily reflect the composers’ statistical learning cannot be excluded. Thus, it remains possible that the findings of these studies showed compositional tendencies that are present in the examined corpus but may not be inherent to cognitive function in the human brain. Future studies are required to investigate the phenomenon of music learning through experimentation and direct comparison of computational and neurophysiological results.

5.2. Relationships of entropy between the melody and the bass line

In computational and informatics studies, entropy has been used to verify the validity of computational models, including models of SL in music (e.g., [34]). A computational model with lower entropy indicates greater predictability. In neuroscience and psychology, as well as in studies related to predictive coding and information theory, entropy has been interpreted as the average degree of surprise associated with outcomes based on predictions in the brain [32,33]. Thus, both computational researchers and psychologists agree that entropy in the framework of statistical learning can highlight some of the statistical information that is available to music learners. Based on these studies, the present studies expected the variation of entropy in music to partially reflect typical patterns in musical expression associated with statistical knowledge. The results showed that a correlation of conditional entropies between the melody and the bass line could be detected in some Markov models for both composers. This suggests that the variability in the entropy of the TP distributions is correlated between the melody and the bass line. Based on neurophysiological theories, when the brain encodes TP distributions in musical sequences, it can anticipate the next tone. As a consequence of this processing, neurophysiological responses to predictable external stimuli can be inhibited to ensure efficiency and low entropy of neural processing [68–70]. Thus, the correlation between the melody and the bass line suggests that statistical knowledge of the melody and that of the bass line interact with each other. However, the results of Study 2 also suggest that the correlations of TP distributions and entropies between the melody and the bass line partly depend on tonality (i.e., major and minor keys).
In the second-order model, specific characteristics of TP distributions could be detected for major and minor keys in both the melody and the bass line. Additionally, the correlation of entropy between the melody and the bass line in the fifth-order model could be detected in minor keys but not in major keys. This may be because there is more variation in minor keys than in major ones, as the sixth and seventh scale degrees are more variable in minor keys than in major keys [71]. Another possibility is that, as previous studies have reported, SL of the melody and SL of the bass line interact with but are partly independent of each other [61,65], and SL can be modulated by music-specific features such as tonal mode and key [29]. The present studies may be in agreement with these previous neurophysiological findings. Thus, neurophysiological and computational findings may partially reflect shared SL processes. On the other hand, the computational approaches in the present studies did not consider pitch intervals between the melody and the bass line, although this is important information for the establishment of harmony and for predicting when the melody and the bass line will act similarly and when they will act differently. In these studies, the two lines were analysed as independent information and compared in order to explore whether their entropy levels are correlated with each other. The present studies suggest that statistical knowledge, which has been demonstrated by several neurophysiological studies, is mentally expressed in music composition. Future studies are required to investigate the neural basis underlying the mental expression of acquired statistical knowledge by directly comparing computational and neurophysiological results in an experiment. The present studies may offer novel methodologies for evaluating the statistical knowledge of a composer via interdisciplinary approaches that include informatics, musicology, and psychology.
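
The entropy comparison discussed in this section can be sketched as follows. The code uses the standard definition of conditional entropy, H(next | context) = −Σ P(context) Σ P(next | context) log2 P(next | context), computes it separately for the melody and the bass line of each piece, and then correlates the two series across pieces. The function names, the toy pieces, and the use of a Pearson correlation are assumptions made for illustration; the statistical tests and corpora of the present studies may differ.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(sequence, order=1):
    """Conditional entropy H(next | previous `order` symbols) in bits."""
    context_counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context_counts[tuple(sequence[i - order:i])][sequence[i]] += 1

    n_transitions = sum(sum(c.values()) for c in context_counts.values())
    entropy = 0.0
    for counts in context_counts.values():
        context_total = sum(counts.values())
        p_context = context_total / n_transitions
        for n in counts.values():
            p = n / context_total  # P(next | context)
            entropy -= p_context * p * math.log2(p)
    return entropy


def pearson(xs, ys):
    """Pearson correlation coefficient (illustrative choice of statistic)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy pieces as (melody, bass) pairs of MIDI note numbers.
# These are NOT data from the studies.
pieces = [
    ([60, 62, 60, 62, 60, 62], [48, 43, 48, 43, 48, 43]),
    ([60, 62, 64, 62, 60, 62, 64, 65], [48, 50, 48, 43, 48, 50, 48, 43]),
    ([60, 64, 62, 65, 60, 62, 64, 60], [48, 43, 45, 48, 45, 43, 48, 45]),
]

melody_entropy = [conditional_entropy(m, order=1) for m, _ in pieces]
bass_entropy = [conditional_entropy(b, order=1) for _, b in pieces]
r = pearson(melody_entropy, bass_entropy)
print(melody_entropy, bass_entropy, r)
```

Because each line is modelled on its own, a sketch like this shares the limitation noted above: it ignores the pitch intervals between the melody and the bass line.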

Supporting information

S1 Table [xlsx]
Transitional-probability matrices in all pieces of music.

S2 Table [xlsx]
Entropies in all pieces of music.

References

1. Saffran JR, Aslin RN, Newport EL. Statistical learning by 8-month-old infants. Science. 1996. doi: 10.1126/science.274.5294.1926 8943209

2. Cleeremans A, Destrebecqz A, Boyer M. Implicit learning: News from the front. Trends Cogn Sci. 1998;2: 406–416. doi: 10.1016/s1364-6613(98)01232-7 21227256

3. Perruchet P, Pacton S. Implicit learning and statistical learning: one phenomenon, two approaches. Trends Cogn Sci. 2006;10: 233–238. doi: 10.1016/j.tics.2006.03.006 16616590

4. Yumoto M, Daikoku T. Basic function. Clinical Applications of Magnetoencephalography. 2016. doi: 10.1007/978-4-431-55729-6_5

5. Daikoku T. Time-course variation of statistics embedded in music: Corpus study on implicit learning and knowledge. PLoS One. 2018;13. doi: 10.1371/journal.pone.0196493 29742112

6. Wiggins GA. Consolidation as Re-Representation: Revising the Meaning of Memory. Front Psychol. 2019; 1–22. doi: 10.3389/fpsyg.2019.00001

7. Wiggins GA. Creativity, information, and consciousness: The information dynamics of thinking. Phys Life Rev. 2018;1: 1–39. doi: 10.1016/j.plrev.2018.05.001 29803403

8. Berry D, Dienes Z. Implicit learning: Theoretical and empirical issues. Hove, England: Lawrence Erlbaum; 1993.

9. Reber AS. Implicit learning and tacit knowledge. An essay on the cognitive unconscious. New York: Oxford University Press; 1993.

10. Perkovic S, Orquin JL. Implicit Statistical Learning in Real-World Environments Leads to Ecologically Rational Decision Making. Psychol Sci. 2017;29: 34–44. doi: 10.1177/0956797617733831 29068761

11. Daikoku T, Takahashi Y, Tarumoto N, Yasuda H. Auditory statistical learning during concurrent physical exercise and the tolerance for pitch, tempo, and rhythm changes. Motor Control. 2018;22. doi: 10.1123/mc.2017-0006 28872415

12. Daikoku T, Yatomi Y, Yumoto M. Statistical learning of an auditory sequence and reorganization of acquired knowledge: A time course of word segmentation and ordering. Neuropsychologia. 2017;95. doi: 10.1016/j.neuropsychologia.2016.12.006 27939187

13. Daikoku T, Takahashi Y, Tarumoto N, Yasuda H. Motor Reproduction of Time Interval Depends on Internal Temporal Cues in the Brain: Sensorimotor Imagery in Rhythm. Front Psychol. 2018;9: 1–11. doi: 10.3389/fpsyg.2018.00001

14. Daikoku T, Yatomi Y, Yumoto M. Implicit and explicit statistical learning of tone sequences across spectral shifts. Neuropsychologia. 2014;63. doi: 10.1016/j.neuropsychologia.2014.08.028 25192632

15. Daikoku T, Yatomi Y, Yumoto M. Statistical learning of music- and language-like sequences and tolerance for spectral shifts. Neurobiol Learn Mem. 2015;118. doi: 10.1016/j.nlm.2014.11.001 25451311

16. Tsogli V, Jentschke S, Daikoku T, Koelsch S. When the statistical MMN meets the physical MMN. Sci Rep. 2019;9: 5563. doi: 10.1038/s41598-019-42066-4 30944387

17. Koelsch S, Busch T, Jentschke S, Rohrmeier M. Under the hood of statistical learning: A statistical MMN reflects the magnitude of transitional probabilities in auditory sequences. Sci Rep. 2016;6: 1–11. doi: 10.1038/s41598-016-0001-8

18. Daikoku T. Neurophysiological markers of statistical learning in music and language: Hierarchy, entropy, and uncertainty. Brain Sci. 2018;8. doi: 10.3390/brainsci8060114 29921829

19. Raphael C, Stoddard J. Functional Harmonic Analysis Using Probabilistic Models. Comput Music J. 2004;28: 45–52. doi: 10.1162/0148926041790676

20. Brent MR. Speech segmentation and word discovery: a computational perspective. Trends Cogn Sci. 1999;3: 294–301. doi: 10.1016/s1364-6613(99)01350-9 10431183

21. Temperley D. Computational Models of Music Cognition. In: Deutsch D, editor. The Psychology of Music (Third Edition). Academic Press; 2013. pp. 327–368. https://doi.org/10.1016/B978-0-12-381460-9.00008-0

22. Rohrmeier M, Rebuschat P. Implicit Learning and Acquisition of Music. Top Cogn Sci. 2012;4: 525–553. doi: 10.1111/j.1756-8765.2012.01223.x 23060126

23. Dubnov S. Information Dynamics and Aspects of Musical Perception. In: The Structure of Style. Springer-Verlag Berlin Heidelberg; 2010. p. 127. ISBN 978-3-642-12336-8. doi: 10.1007/978-3-642-12337-5_7

24. Wang W. Machine Audition: Principles, Algorithms and Systems. Information Science Reference; 2010.

25. Servan-Schreiber E, Anderson JR. Learning Artificial Grammars With Competitive Chunking. J Exp Psychol Learn Mem Cogn. 1990;16: 592–608. doi: 10.1037/0278-7393.16.4.592

26. Perruchet P, Vinter A. PARSER: A Model for Word Segmentation. J Mem Lang. 1998;39: 246–263. https://doi.org/10.1006/jmla.1998.2576

27. Pearce MT, Wiggins GA. Auditory Expectation: The Information Dynamics of Music Perception and Cognition. Top Cogn Sci. 2012;4: 625–652. doi: 10.1111/j.1756-8765.2012.01214.x 22847872

28. Pearce M, Wiggins G. Improved Methods for Statistical Modelling of Monophonic Music. J New Music Res. 2004;33: 367–385. doi: 10.1080/0929821052000343840

29. Daikoku T, Yatomi Y, Yumoto M. Pitch-class distribution modulates the statistical learning of atonal chord sequences. Brain Cogn. 2016;108. doi: 10.1016/j.bandc.2016.06.008 27429093

30. Agres K, Abdallah S, Pearce M. Information-Theoretic Properties of Auditory Sequences Dynamically Influence Expectation and Memory. Cogn Sci. 2018;42: 43–76. doi: 10.1111/cogs.12477 28121017

31. Shannon CE. A Mathematical Theory of Communication. Bell Syst Tech J. 1948;27: 623–656.

32. Friston K. The free-energy principle: A unified brain theory? Nat Rev Neurosci. 2010;11: 127–138. doi: 10.1038/nrn2787 20068583

33. Applebaum D. Probability and Information: An Integrated Approach. 2nd ed. Cambridge: Cambridge University Press; 2008. doi: 10.1017/CBO9780511755262

34. Pearce M. Expectation in melody. 2006; 377–405.

35. Manzara LC, Witten IH, James M. On the Entropy of Music: An Experiment with Bach Chorale Melodies. Leonardo Music J. 1992;2: 81–88. doi: 10.2307/1513213

36. Witten IH, Manzara LC, Conklin D. Comparing Human and Computational Models of Music Prediction. Comput Music J. 1994;18: 70–80. doi: 10.2307/3680523

37. Cox G. On the Relationship Between Entropy and Meaning in Music: An Exploration with Recurrent Neural Networks. 2010.

38. Daikoku T. Entropy, Uncertainty, and the Depth of Implicit Knowledge on Musical Creativity: Computational Study of Improvisation in Melody and Rhythm. 2018;12: 1–11. doi: 10.3389/fncom.2018.00097 30618691

39. Hasson U. The neurobiology of uncertainty: implications for statistical learning. Phil Trans R Soc B. 2017;372: 20160048. doi: 10.1098/rstb.2016.0048 27872367

40. Nastase S, Iacovella V, Hasson U. Uncertainty in visual and auditory series is coded by modality-general and modality-specific neural systems. Hum Brain Mapp. 2014;35: 1111–1128. doi: 10.1002/hbm.22238 23408389

41. Harrison LM, Duggins A, Friston KJ. Encoding uncertainty in the hippocampus. Neural Networks. 2006;19: 535–546. doi: 10.1016/j.neunet.2005.11.002 16527453

42. Omigie D, Stewart L. Preserved statistical learning of tonal and linguistic material in congenital amusia. Front Psychol. 2011;2: 1–11. doi: 10.3389/fpsyg.2011.00001

43. Omigie D, Pearce MT, Stewart L. Tracking of pitch probabilities in congenital amusia. Neuropsychologia. 2012;50: 1483–1493. doi: 10.1016/j.neuropsychologia.2012.02.034 22414591

44. Omigie D, Pearce MT, Williamson VJ, Stewart L. Electrophysiological correlates of melodic processing in congenital amusia. Neuropsychologia. 2013;51: 1749–1762. doi: 10.1016/j.neuropsychologia.2013.05.010 23707539

45. Daikoku T. Implicit learning in the developing brain: An exploration of ERP indices for developmental disorders. Clin Neurophysiol. 2019. https://doi.org/10.1016/j.clinph.2019.09.001

46. Daikoku T. Depth and the Uncertainty of Statistical Knowledge on Musical Creativity Fluctuate Over a Composer’s Lifetime. Frontiers in Computational Neuroscience. 2019. p. 27. Available: https://www.frontiersin.org/article/10.3389/fncom.2019.00027 31114493

47. Daikoku T. Method and apparatus for analyzing characteristics of music information. United States of America; US20190189100, 2019. Available: https://patentscope.wipo.int/search/en/detail.jsf?docId=US244367418&tab=NATIONALBIBLIO

48. Daikoku T, Okano T, Yumoto M. Relative difficulty of auditory statistical learning based on tone transition diversity modulates chunk length in the learning strategy. In Proceedings of the Biomagnetic. Proc Biomagn. 2017;22–24: p.75. doi: 10.1016/j.nlm.2014.11.001

49. Elmer S, Lutz J. Relationships between music training, speech processing, and word learning: a network perspective. 2018; 1–9. doi: 10.1111/nyas.13581 29542125

50. François C, Chobert J, Besson M, Schön D. Music Training for the Development of Speech Segmentation. Cereb Cortex. 2012; 1–6. doi: 10.1093/cercor/bhs180 22784606

51. Francois C, Schön D. Musical expertise boosts implicit learning of both musical and linguistic structures. Cereb Cortex. 2011;21: 2357–2365. doi: 10.1093/cercor/bhr022 21383236

52. Hansen NC, Pearce MT. Predictive uncertainty in auditory sequence processing. Front Psychol. 2014;5: 1–17. doi: 10.3389/fpsyg.2014.00001

53. Przysinda E, Zeng T, Maves K, Arkin C, Loui P. Jazz musicians reveal role of expectancy in human creativity. Brain Cogn. 2017;119: 45–53. doi: 10.1016/j.bandc.2017.09.008 29028508

54. Paraskevopoulos E, Kuchenbuch A, Herholz SC, Pantev C. Statistical learning effects in musicians and non-musicians: An MEG study. Neuropsychologia. 2012;50: 341–349. doi: 10.1016/j.neuropsychologia.2011.12.007 22197571

55. Tishby N, Polani D. Information Theory of Decisions and Actions. Perception-Action Cycle. 2011; 601–636. doi: 10.1007/978-1-4419-1452-1_19

56. Schmidhuber J. Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts. Conn Sci. 2006;18: 173–187. doi: 10.1080/09540090600768658

57. Daikoku T. Musical Creativity and Depth of Implicit Knowledge: Spectral and Temporal Individualities in Improvisation. Front Comput Neurosci. 2018;12: 1–27. doi: 10.3389/fncom.2018.00001

58. Daikoku T. Computational models and neural bases of statistical learning in music and language: Comment on “Creativity, information, and consciousness: The information dynamics of thinking” by Wiggins. Phys Life Rev. 2019. https://doi.org/10.1016/j.plrev.2019.09.001

59. Hauser MD, Chomsky N, Fitch WT. The Faculty of Language: What Is It, Who Has It, and How Did It Evolve? Science. 2002;298: 1569–1579. doi: 10.1126/science.298.5598.1569 12446899

60. Jackendoff R, Lerdahl F. The capacity for music: What is it, and what’s special about it? Cognition. 2006;100: 33–72. doi: 10.1016/j.cognition.2005.11.005 16384553

61. Daikoku T, Yumoto M. Single, but not dual, attention facilitates statistical learning of two concurrent auditory sequences. Sci Rep. 2017;7. doi: 10.1038/s41598-017-10476-x 28860466

62. Daikoku T, Takahashi Y, Futagami H, Tarumoto N, Yasuda H. Physical fitness modulates incidental but not intentional statistical learning of simultaneous auditory sequences during concurrent physical exercise. Neurol Res. 2017;39. doi: 10.1080/01616412.2016.1273571 28034012

63. Rohrmeier M, Cross I. Statistical Properties of Tonal Harmony in Bach’s Chorales. Proc 10th Intl Conf Music Percept Cogn. 2008;6: 123–1319. Available: http://icmpc10.psych.let.hokudai.ac.jp/ and http://www.mus.cam.ac.uk/files/2009/09/bachharmony.pdf

64. Daikoku T. Tonality Tunes the Statistical Characteristics in Music: Computational Approaches on Statistical Learning. Frontiers in Computational Neuroscience. 2019. p. 70. doi: 10.3389/fncom.2019.00070 31632260

65. Daikoku T, Yumoto M. Concurrent statistical learning of ignored and attended sound sequences: An MEG study. Front Hum Neurosci. 2019; under review.

66. Wagenaar W. Generation of random sequences by human subjects: A critical survey of literature. Psychol Bull. 1972;77: 65–72.

67. Bains W. Random number generation and creativity. Med Hypotheses. 2008;70: 186–190. doi: 10.1016/j.mehy.2007.08.004 17920778

68. Yumoto M, Daikoku T. Neurophysiological Studies on Auditory Statistical Learning [in Japanese] 聴覚刺激列の統計学習の神経生理学的研究. Japanese J Cogn Neurosci. 2018;20: 38–43.

69. Daikoku T, Ogura H, Watanabe M. The variation of hemodynamics relative to listening to consonance or dissonance during chord progression. Neurol Res. 2012;34. doi: 10.1179/1743132812Y.0000000047 22642826

70. Friston K. A theory of cortical responses. Philos Trans R Soc B Biol Sci. 2005;360: 815–836. doi: 10.1098/rstb.2005.1622 15937014

71. Albrecht J, Shanahan D. The Use of Large Corpora to Train a New Type of Key-Finding Algorithm. Music Percept An Interdiscip J. 2013;31: 59–67. doi: 10.1525/mp.2013.31.1.59