Song Genre Classification

Music Genre Classification with the Million Song Dataset

15-826 Final Report

Dawen Liang,† Haijie Gu,‡ and Brendan O'Connor‡

† School of Music, ‡ Machine Learning Department, Carnegie Mellon University

December 3, 2011

1 Introduction

The field of Music Information Retrieval (MIR) draws from musicology, signal processing, and artificial intelligence. A long line of work addresses problems including music understanding (extracting the musically meaningful information from audio waveforms), automatic music annotation (measuring song and artist similarity), and others. However, very little work has scaled to commercially sized data sets. The algorithms and data are both complex. An extraordinary range of information is hidden inside music waveforms, ranging from perceptual to auditory, which inevitably makes large-scale applications challenging. There are a number of commercially successful online music services, such as Pandora, Last.fm, and Spotify, but most of them are merely based on traditional text IR. Our course project focuses on large-scale data mining of music information using the recently released Million Song Dataset (Bertin-Mahieux et al., 2011) [1], which contains 300GB of audio features and metadata. This dataset was released to push the boundaries of Music IR research to commercial scales. Also, the associated musiXmatch dataset [2] provides textual lyrics information for many of the MSD songs. Combining these two datasets, we propose a cross-modal retrieval framework that combines the music and textual data for the task of genre classification:

Given N song-genre pairs (S_1, G_1), ..., (S_N, G_N), where S_i ∈ F for some feature space F and G_i ∈ G for some genre set G, output the classifier with the highest classification accuracy on the hold-out test set.

The raw feature space F contains multiple domains of sub-features, which can be of variable length. The genre label set G is discrete.

[1] http://labrosa.ee.columbia.edu/millionsong/
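The task stated above amounts to comparing candidate classifiers by their accuracy on held-out song-genre pairs. A minimal Python sketch of that evaluation follows. The toy fixed-length feature vectors, the `majority_baseline` classifier, and the 80/20 split are illustrative assumptions, not the features, models, or split used in this report.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Song = Sequence[float]              # hypothetical fixed-length feature vector S_i
Pair = Tuple[Song, str]             # one (S_i, G_i) pair
Classifier = Callable[[Song], str]  # maps a song to a predicted genre


def split_holdout(pairs: List[Pair], test_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle the pairs and hold out a fraction of them for testing."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_baseline(train: List[Pair]) -> Classifier:
    """Toy classifier (an assumption): always predict the most common genre."""
    most_common = Counter(genre for _, genre in train).most_common(1)[0][0]
    return lambda song: most_common


def accuracy(clf: Classifier, test: List[Pair]) -> float:
    """Fraction of held-out songs whose predicted genre matches the label."""
    correct = sum(1 for song, genre in test if clf(song) == genre)
    return correct / len(test)


if __name__ == "__main__":
    data = [([0.1, 0.2], "rock")] * 6 + [([0.9, 0.8], "jazz")] * 4
    train, test = split_holdout(data)
    print(accuracy(majority_baseline(train), test))
```

Any real classifier can be dropped in place of `majority_baseline` as long as it maps a song representation to a genre label; the baseline only gives a floor against which to read reported accuracies.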

1.1 Motivation

Genre classification is a standard problem in Music IR research. Most music genre classification techniques employ pattern recognition algorithms to classify feature vectors, extracted from short-time recording segments, into genres. Commonly used classifiers are Support Vector Machines (SVMs), Nearest-Neighbor (NN) classifiers, Gaussian Mixture Models, Linear Discriminant Analysis (LDA), etc. Several common music datasets have been used in experiments to make the reported classification accuracies comparable, for example the GTZAN dataset (Tzanetakis and Cook, 2002), which is the most widely used dataset for music genre classification. However, the datasets involved in those studies are very small compared to the Million Song Dataset. In fact, most Music IR research still focuses on very small datasets, such as the GTZAN dataset (Tzanetakis and Cook, 2002) with only 1000 audio tracks, each 30 seconds long; or CAL-500 (Turnbull et al., 2008), a set of 1700 human-generated musical annotations describing 500 popular western musical tracks. Both of these datasets are widely used in most state-of-the-art research in Music IR, but are far from practical application. Furthermore, most of the research on genre classification focuses only on music features, ignoring lyrics (mostly due to the difficulty of collecting large-scale lyric data).

[2] http://labrosa.ee.columbia.edu/millionsong/musixmatch


Nevertheless, besides the musical features (styles, forms), genre is also closely related to lyrics: songs in different genres may involve different topics or moods, which could be recoverable from word frequencies in lyrics. This motivates us to join the musical and lyrics information from two databases for this task.
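To illustrate how lyric word frequencies could sit alongside audio features, here is a minimal Python sketch. The regular-expression tokenizer, the tiny vocabulary, and the plain concatenation in `joint_features` are assumptions made for illustration; they are not the cross-modal method developed in this report.

```python
import re
from collections import Counter
from typing import List, Sequence


def word_counts(lyrics: str) -> Counter:
    """Lowercase the lyrics and count each word (simple regex tokenizer)."""
    return Counter(re.findall(r"[a-z']+", lyrics.lower()))


def lyric_features(lyrics: str, vocabulary: Sequence[str]) -> List[float]:
    """Relative frequency of each vocabulary word within the lyrics."""
    counts = word_counts(lyrics)
    total = sum(counts.values())
    if total == 0:
        # No words at all: return an all-zero vector of the right length.
        return [0.0] * len(vocabulary)
    return [counts[word] / total for word in vocabulary]


def joint_features(audio: Sequence[float], lyrics: str,
                   vocabulary: Sequence[str]) -> List[float]:
    """Concatenate audio features and lyric word frequencies (an assumption)."""
    return list(audio) + lyric_features(lyrics, vocabulary)


if __name__ == "__main__":
    vocab = ["love", "road"]  # hypothetical vocabulary
    print(joint_features([0.3, 0.7], "Love is a long road", vocab))
```

Concatenation is only the simplest way to join the two modalities; it treats lyric and audio dimensions as interchangeable, which a dedicated cross-modal model need not do.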

1.2 Contribution

To the best of our knowledge, there have been no published works that perform large-scale genre classification using cross-modal methods.

• We proposed a cross-modal retrieval framework of model...

References

Leonard E. Baum, Ted Petrie, George Soules, and Norman Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1): 164–171, 1970. ISSN 00034851. URL http://www.jstor.org/stable/2239727.

Robert M. Bell, Yehuda Koren, and Chris Volinsky. The BellKor solution to the Netflix Prize, 2007. http://www2.research.att.com/~volinsky/netflix/ProgressPrize2007BellKorSolution.pdf.

Robert M. Bell, Yehuda Koren, and Chris Volinsky. The BellKor 2008 solution to the Netflix Prize, 2008. http://www2.research.att.com/~volinsky/netflix/Bellkor2008.pdf.

Thierry Bertin-Mahieux, Daniel P. W. Ellis, Brian Whitman, and Paul Lamere. The million song dataset. In Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR 2011), 2011.

Byron Boots and Geoffrey J. Gordon. An online spectral learning algorithm for partially observable nonlinear dynamical systems. In AAAI, 2011.

M. M. Bradley and P. J. Lang. Affective norms for English words (ANEW): instruction manual and affective ratings. University of Florida: The Center for Research in Psychophysiology, 1999.

P. S. Dodds and C. M. Danforth. Measuring the happiness of large-scale written expression: Songs, blogs, and presidents. Journal of Happiness Studies, page 116, 2009.

J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting (With discussion and a rejoinder by the authors). The Annals of Statistics, 28(2): 337–407, 2000. ISSN 0090-5364.

Daniel Hsu, Sham M. Kakade, and Tong Zhang. A spectral algorithm for learning hidden Markov models. CoRR, abs/0811.4413, 2008.

Herbert Jaeger. Observable operator models for discrete stochastic time series. Neural Computation, 12(6): 1371–1398, 2000a.

Herbert Jaeger. Observable operator models for discrete stochastic time series. Neural Computation, 2000b.

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 1st edition, July 2008. ISBN 0521865719.

M. McVicar, T. Freeman, and T. De Bie. Mining the correlation between lyrical and audio features and the emergence of mood. In Proceedings of the 12th International Conference on Music Information Retrieval, 2011.

M. Müller. Information retrieval for music and motion. Springer, 2007.

Bo Pang and Lillian Lee. Opinion Mining and Sentiment Analysis. Now Publishers Inc, July 2008. ISBN 1601981503.

N. Rasiwasia, J. C. Pereira, E. Coviello, G. Doyle, G. Lanckriet, R. Levy, and N. Vasconcelos. A new approach to cross-modal multimedia retrieval. In Proceedings of the International Conference on Multimedia, 2010.

L. Ren, D. Dunson, S. Lindroth, and L. Carin. Dynamic nonparametric Bayesian models for analysis of music. Journal of the American Statistical Association, 105(490): 458–472, 2010.

Greg Ridgeway. Generalized boosted models: A guide to the gbm package, 2007. http://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf.

Matthew Rosencrantz, Geoff Gordon, and Sebastian Thrun. Learning low dimensional predictive representations. In Proceedings of the Twenty-First International Conference on Machine Learning, ICML '04, pages 88–, New York, NY, USA, 2004. ACM. ISBN 1-58113-838-5. URL http://doi.acm.org/10.1145/1015330.1015441.

Sajid Siddiqi, Byron Boots, and Geoffrey J. Gordon. Reduced-rank hidden Markov models. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS-2010), 2010.

Satinder Singh and Michael R. James. Predictive state representations: A new theory for modeling dynamical systems. In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512–519. AUAI Press, 2004.

Yla R. Tausczik and James W. Pennebaker. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 2009. URL http://jls.sagepub.com/cgi/rapidpdf/0261927X09351676v1.

Douglas Turnbull, Luke Barrington, David Torres, and Gert Lanckriet. Semantic annotation and retrieval of music and sound effects. IEEE Transactions on Audio, Speech and Language Processing, 16(2): 467–476, February 2008.

G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), July 2002.
