SMU Data Science Review


As the digital music landscape continues to expand, effective methods for understanding and contextualizing the diverse genres of lyrical content become increasingly important. This research applies transformer models to music analysis, specifically the task of lyric genre classification. By leveraging the capabilities of transformer architectures, the project aims to capture intricate linguistic nuances within song lyrics, thereby improving the accuracy and efficiency of genre classification. The relevance of this work lies in its potential to contribute to automated systems for music recommendation and genre-based playlist creation. Moreover, identifying the linguistic features that define distinct musical genres through transformer-based models offers valuable insight into the underlying patterns and characteristics of lyrical content. The pre-trained transformer chosen for the final model is DistilBERT, a "distilled" version of the widely used pre-trained transformer model BERT (Bidirectional Encoder Representations from Transformers). Implications, challenges, ethical concerns, and directions for future research are also discussed.