Clustering Songs in a Spotify Dataset Based on Audio Features Using the K-Means Algorithm
Abstract
Digital music is growing rapidly, but song retrieval systems that rely solely on genre often fail users who search for music by mood. This study clusters songs from the Spotify Songs and Artists Dataset on Kaggle using the K-Means algorithm on three audio features that reflect mood dimensions: Valence, Energy, and Danceability. This approach was chosen to uncover hidden patterns and to establish more personalized music categories. The methodology follows the CRISP-DM framework, covering data preprocessing with Z-Score normalization, selection of the number of clusters with the Elbow Method, and model evaluation. K-Means grouped the songs into three main clusters (k = 3) with distinct characteristics: Happy/Cheerful, Sad/Melancholy, and Energetic/Intense. Cluster quality evaluation using the Silhouette Coefficient yielded a score of 0.30. Although this score indicates some overlap between clusters, which is expected across the music emotion spectrum, the centroid analysis shows that the algorithm separated mood characteristics well enough to support a more relevant music recommendation system.
Keywords: K-Means, Spotify, Clustering, Music Information Retrieval, Mood.
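The pipeline summarized in the abstract (Z-Score normalization, K-Means, the Elbow Method, and the Silhouette Coefficient) can be sketched roughly as follows. This is a minimal standard-library Python illustration, not the authors' code: the nine (valence, energy, danceability) rows are made up, all function names are hypothetical, and the deterministic farthest-point seeding is an assumption chosen for reproducibility, since the abstract does not state how centroids were initialized.

```python
import math
from statistics import mean, pstdev


def zscore(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero on constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def farthest_point_seeds(points, k):
    """Deterministic seeding (an assumption): start at the first point, then
    repeatedly add the point farthest from its nearest chosen seed."""
    seeds = [list(points[0])]
    while len(seeds) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        seeds.append(list(nxt))
    return seeds


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centroids, inertia)."""
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(c) for c in zip(*members)]
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = mean(own)                                  # mean distance within own cluster
        b = min(mean(d) for d in by_cluster.values())  # mean distance to nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


# Made-up (valence, energy, danceability) rows, not taken from the Kaggle dataset.
songs = [
    (0.90, 0.80, 0.85), (0.85, 0.75, 0.80), (0.80, 0.70, 0.90),  # upbeat
    (0.15, 0.20, 0.30), (0.10, 0.25, 0.35), (0.20, 0.15, 0.25),  # subdued
    (0.30, 0.95, 0.50), (0.35, 0.90, 0.45), (0.25, 0.92, 0.55),  # intense
]
X = zscore(songs)

# Elbow Method: compute inertia for each candidate k and look for where the drop flattens.
inertias = {k: kmeans(X, k)[2] for k in range(1, 6)}

labels, centroids, _ = kmeans(X, 3)
score = silhouette(X, labels)
print(inertias)
print(labels, round(score, 2))
```

The toy rows are deliberately well separated, so their silhouette score will be much higher than the 0.30 reported for the real dataset. In practice the inertia values would be plotted against k, and a library such as scikit-learn would typically replace these hand-written functions.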
License
Copyright (c) 2026 Branchris, Kevin Alexander Yech, Andri Wijaya (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.

