Splunk is a fantastic data analytics tool. I’ve been involved with it for the past few years at work. Splunk allows you to download and run it freely with a limited ingest, which is perfect for home use and/or training. I have it setup at home, primarily to learn, and sometimes to try new things.
Warning, this is a long post.
In the past I wrote a Javascript application called iStats, which I posted about a couple of times. iStats was designed to provide play and library analytics for my iTunes (now Apple Music) library. I was interested in seeing most played tracks, artists, and albums, and other statistics. The previous blog posts for iStats are here:
Original: https://unfinishedbitness.info/2018/05/04/itunes-missing-feature-library-statistics/
Update: https://unfinishedbitness.info/2020/12/06/itunes-missing-feature-library-statistics-revised/
While iStats does exactly what I need, I was curious how Splunk could provide any additional insights as it has a far greater capability to analyze big data. My Apple Music Library is an 18MB XML file when exported, with roughly 9,000 songs. This post will demonstrate the searches and dashboard I built in Splunk to provide analytics on my library. There are 22 searches with corresponding visual reports built into a single dashboard.
iStats Results
iStats produces output like this. In the most recent update (linked above) I added the ability to export each track to CSV so I could pull the data into Elastic, another tool similar to Splunk. I did create a dashboard in Elastic, but didn’t make the effort to blog about it. The only comment I will make in comparing the two is that it was much easier to create the Splunk dashboard.
Splunk Dashboard
First, let’s see the Splunk dashboard. It’s broken into 5 parts. Open the images to see a larger version. Each section has a title that describes what it represents:
Getting Data In To Splunk
To get the data into Splunk, I used the CSV export from my iStats program. Note there is only one major section of the exported Apple Music library XML file that should be transposed into the CSV file – there is other stuff in the XML you don’t want to be part of the analysis. Without iStats, you may not be able to fully replicate what I’ve done here with accurate results. If you don’t want to use iStats, fine, however you will need a similar CSV. The required fields in the CSV are (the ones in quotes are alpha and quoted):
"Artist","Album","Track","Year","Genre",Plays,Skips,Size,Minutes,Seconds,Milliseconds,"Bitrate","Kind"
An example of the data:
"Candlebox","Candlebox","Far Behind","1993","Hard Rock",14,0,36579379,4,59,666,"971","Apple Lossless audio file"
The first step is to export the Apple Music library. To export the library from Apple Music, from the File menu, use the Library menu item, and the Export Library… sub menu item.
The second step is to transform the exported XML file to CSV, omitting the items that are not a targeted part of the analysis. This included movies/videos, voice memos, playlists, etc. The target is only the songs. iStats extracts only what is needed from the XML. If you don’t use iStats, your mileage may vary. Open iStats, ensure export to CSV is checked, select the exported XML file, and wait a few seconds. Once the results are displayed, the CSV file should be saved as well.
Now that the CSV is created, it can be pulled into Splunk. You’ll need to be logged into Splunk as a user with permissions to upload data. If “Add Data” doesn’t show up, your user doesn’t have the proper permissions. So, select Add Data:
Next, select Upload:
Next, select the CSV file. This can be done using the picker or drag and drop:
Next, Splunk doesn’t understand how to define the timestamp, so it gets set to current. This will mark all records with the current date and time:
Next, set the destination index. If you want to use the searches I provide later, create and name the index “applemusic” (no spaces):
Next, review the selections, and click Submit:
Lastly, Splunk will show a summary and provide some options for exploring the newly imported data. You can select “Start Searching”:
Going to the Search app will allow you to see the raw data:
Building the Dashboard
When building the dashboard, you do not need the elevated permissions required to import the data. The single dashboard being built is called “Apple Music Statistics”. It will have the description “Apple Music Library and Playtime Statistics”:
To build the dashboard, 22 searches will be created. Note that I mis-spelled one here – it says “Aple” instead of “Apple”. I didn’t notice until after capturing everything for this blog post. Correct it as you go. The searches are:
Building the Searches
I will build the searches in the order they appear on the dashboard as depicted at the top of this post (see Splunk Dashboard above). Each of these will be created from the Search app. Each search is listed with the Dashboard Section Name and the Search Name in parentheses:
Example: Dashboard Section Name (Search Name)
Library Kinds (Apple Music Kind)
index="applemusic" Kind!=Undefined Kind!="Purchased AAC audio file" | stats count by Kind | sort -count
Library Genres (Apple Music Genre)
index="applemusic" Kind!=Undefined Kind!="Purchased AAC audio file" | stats count by Genre | sort -count,Genre
Library Artist Count (Apple Music Library Artist Count)
index="applemusic" Artist != Compilation | dedup Artist | stats count
Library Album Count (Apple Music Library Album Count)
index="applemusic" | eval cd=Artist+" - "+Album | dedup cd | stats count
Library Track Count (Apple Music Library Track Count)
index="applemusic" | stats count
Library Size in Bytes (Apple Music Library Size)
index="applemusic" | stats sum(Size) as s_size
Library Duration (Apple Music Library Duration)
index="applemusic" | stats sum(Minutes) as s_minutes sum(Seconds) as s_seconds sum(Milliseconds) as s_milliseconds | eval tot_seconds=(s_minutes*60)+s_seconds+floor(s_milliseconds/1000) | eval str_seconds=tostring(tot_seconds,"duration") | rex field=str_seconds mode=sed "s/((((\d*)\+)?(\d*):)?(\d*):)?(\d*)/\4 days \5 hrs \6 mins \7 secs/" | rex field=str_seconds mode=sed "s/^ days/0 days/" | rename str_seconds as "Library Duration" | table "Library Duration"
Playtime Duration (Apple Music Play Duration)
index="applemusic" | eval track_seconds=Plays*((Minutes*60)+Seconds+floor(Milliseconds/1000)) | stats sum(track_seconds) as play_seconds | eval str_seconds=tostring(play_seconds,"duration") | rex field=str_seconds mode=sed "s/((((\d*)\+)?(\d*):)?(\d*):)?(\d*)/\4 days \5 hrs \6 mins \7 secs/" | rex field=str_seconds mode=sed "s/^ days/0 days/" | rename str_seconds as "Play Duration" | table "Play Duration"
Library Top 10 Artists by Track Count (Apple Music Library Artist Top 10)
index="applemusic" Artist != "Compilation" | top limit=10 Artist | rename count as Tracks | table Artist,Tracks
Library Top 10 Albums by Track Count (Apple Music Library Albums Top 10)
index="applemusic" | eval cd=Artist+" - "+Album | top 10 cd
Total Play Count (Apple Music Play Total)
index=applemusic | stats sum(Plays) as "Total Plays"
Total Skip Count (Apple Music Skip Total)
index=applemusic | stats sum(Skips) as "Total Skips"
Play Top 10 Artists (Apple Music Play Top 10 Artists)
index="applemusic" | stats sum(Plays) as "Play Count" by Artist | sort -"Play Count" | head 10 | table Artist,"Play Count"
Play Top 10 Albums (Apple Music Play Top 10 Albums)
index="applemusic" | eval art_album=Artist + " - " + Album | stats sum(Plays) as "Play Count" by art_album | rename art_album as "Album Name" | sort -"Play Count" | head 10
Play Top 10 Tracks (Apple Music Play Top 10 Tracks)
index="applemusic" | sort -Plays | eval title = Artist + " - " + Track | table title,Plays | head 10
Skip Top 10 Artists (Apple Music Skip Top 10 Artists)
index="applemusic" | stats sum(Skips) as "Skip Count" by Artist | sort -"Skip Count" | head 10 | table Artist,"Skip Count"
Skip Top 10 Albums (Apple Music Skip Top 10 Albums)
index="applemusic" | eval art_album=Artist + " - " + Album | stats sum(Skips) as "Skip Count" by art_album | rename art_album as "Album Name" | sort -"Skip Count" | head 10
Skip Top 10 Tracks (Apple Music Skip Top 10 Tracks)
index="applemusic" | sort -Skips | eval title = Artist + " - " + Track | table title,Skips | head 10
Library Top 10 Years (Apple Music Library Top 10 Years)
index="applemusic" | stats count as "Track Count" by Year | sort -"Track Count" | head 10
Play Top 10 Years (Apple Music Play Top 10 Years)
index="applemusic" | stats sum(Plays) as "Play Count" by Year | sort -"Play Count" | head 10
Library Top 10 Genres (Apple Music Library Top 10 Genre)
index="applemusic" | stats count as "Track Count" by Genre | sort -"Track Count" | head 10
Play Top 10 Genres (Apple Music Play Top 10 Genres)
index="applemusic" | stats sum(Plays) as "Play Count" by Genre | sort -"Play Count" | head 10
Dashboard Source
If your interested in the dashboard source XML, here it is. You can see the colors set, chart options, etc. Apologies the indenting was not preserved:
<dashboard theme="dark"> <label>Apple Music Statistics</label> <description>Apple Music Library and Playtime Statistics</description> <row> <panel> <chart> <title>Library Kinds</title> <search ref="Apple Music Kind"></search> <option name="charting.drilldown">none</option> </chart> </panel> <panel> <chart> <title>Library Genres</title> <search ref="Apple Music Genre"></search> <option name="charting.drilldown">none</option> </chart> </panel> </row> <row> <panel> <single> <title>Library Artist Count</title> <search> <query>index="applemusic" Artist != Compilation | dedup Artist | stats count</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="colorBy">value</option> <option name="colorMode">none</option> <option name="drilldown">none</option> <option name="numberPrecision">0</option> <option name="rangeColors">["0x53a051","0x0877a6","0xf8be34","0xf1813f","0xdc4e41"]</option> <option name="rangeValues">[0,30,70,100]</option> <option name="showSparkline">1</option> <option name="showTrendIndicator">1</option> <option name="trellis.enabled">0</option> <option name="trellis.scales.shared">1</option> <option name="trellis.size">medium</option> <option name="trendColorInterpretation">standard</option> <option name="trendDisplayMode">absolute</option> <option name="unitPosition">after</option> <option name="useColors">0</option> <option name="useThousandSeparators">1</option> </single> </panel> <panel> <single> <title>Library Album Count</title> <search ref="Apple Music Library Album Count"></search> <option name="drilldown">none</option> </single> </panel> <panel> <single> <title>Library Track Count</title> <search> <query>index="applemusic" | stats count</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="colorBy">value</option> <option name="colorMode">none</option> <option name="drilldown">none</option> <option name="numberPrecision">0</option> <option name="rangeColors">["0x53a051","0x0877a6","0xf8be34","0xf1813f","0xdc4e41"]</option> <option name="rangeValues">[0,30,70,100]</option> <option name="showSparkline">1</option> <option name="showTrendIndicator">1</option> <option name="trellis.enabled">0</option> <option name="trellis.scales.shared">1</option> <option name="trellis.size">medium</option> <option name="trendColorInterpretation">standard</option> <option name="trendDisplayMode">absolute</option> <option name="unitPosition">after</option> <option name="useColors">0</option> <option name="useThousandSeparators">1</option> </single> </panel> </row> <row> <panel> <single> <title>Library Size in Bytes</title> <search> <query>index="applemusic" | stats sum(Size) as s_size</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="colorBy">value</option> <option name="colorMode">none</option> <option name="drilldown">none</option> <option name="numberPrecision">0</option> <option name="rangeColors">["0x53a051", "0x0877a6", "0xf8be34", "0xf1813f", "0xdc4e41"]</option> <option name="rangeValues">[0,30,70,100]</option> <option name="showSparkline">1</option> <option name="showTrendIndicator">1</option> <option name="trellis.enabled">0</option> <option name="trellis.scales.shared">1</option> <option name="trellis.size">medium</option> <option name="trendColorInterpretation">standard</option> <option name="trendDisplayMode">absolute</option> <option name="unitPosition">after</option> <option name="useColors">0</option> <option name="useThousandSeparators">1</option> </single> </panel> </row> <row> <panel> <single> <title>Library Duration</title> <search ref="Apple Music Library Duration"></search> <option name="drilldown">none</option> </single> </panel> <panel> <single> <title>Playtime Duration</title> <search ref="Apple Music Play Duration"></search> <option name="colorMode">none</option> <option name="drilldown">none</option> <option name="rangeColors">["0x53a051","0x0877a6","0xf8be34","0xf1813f","0xdc4e41"]</option> <option name="useColors">0</option> </single> </panel> </row> <row> <panel> <table> <title>Library Top 10 Artists by Track Count</title> <search> <query>index="applemusic" Artist != "Compilation" | top limit=10 Artist | rename count as Tracks | table Artist,Tracks</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="drilldown">none</option> </table> </panel> <panel> <table> <title>Library Top 10 Albums by Track Count</title> <search> <query>index="applemusic" | eval cd=Artist+" - "+Album | top 10 cd | rename cd as "Album Name" | rename count as Tracks | table "Album Name",Tracks</query> <earliest>0</earliest> <latest></latest> <sampleRatio>1</sampleRatio> </search> <option name="drilldown">none</option> <option name="refresh.display">progressbar</option> </table> </panel> </row> <row> <panel> <single> <title>Total Play Count</title> <search> <query>index=applemusic | stats sum(Plays) as "Total Plays"</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="colorMode">none</option> <option name="drilldown">none</option> <option name="rangeColors">["0x53a051","0x0877a6","0xf8be34","0xf1813f","0x53a051"]</option> <option name="useColors">1</option> </single> </panel> <panel> <single> <title>Total Skip Count</title> <search> <query>index=applemusic | stats sum(Skips) as "Total Skips"</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="drilldown">none</option> <option name="rangeColors">["0x53a051","0x0877a6","0xf8be34","0xf1813f","0xdc4e41"]</option> <option name="useColors">1</option> </single> </panel> </row> <row> <panel> <table> <title>Play Top 10 Artists</title> <search> <query>index="applemusic" | stats sum(Plays) as "Play Count" by Artist | sort -"Play Count" | head 10 | table Artist,"Play Count"</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Play Count"> <colorPalette type="minMidMax" maxColor="#53A051" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Play Count"> <option name="precision">0</option> </format> </table> </panel> <panel> <table> <title>Play Top 10 Albums</title> <search> <query>index="applemusic" | eval art_album=Artist + " - " + Album | stats sum(Plays) as "Play Count" by art_album | rename art_album as "Album Name" | sort -"Play Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Play Count"> <colorPalette type="minMidMax" maxColor="#53A051" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Play Count"> <option name="precision">0</option> </format> </table> </panel> <panel> <table> <title>Play Top 10 Tracks</title> <search> <query>index="applemusic" | sort -Plays | eval title = Artist + " - " + Track | table title,Plays | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Plays"> <colorPalette type="minMidMax" maxColor="#53A051" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Plays"> <option name="precision">0</option> </format> </table> </panel> </row> <row> <panel> <table> <title>Skip Top 10 Artists</title> <search> <query>index="applemusic" | stats sum(Skips) as "Skip Count" by Artist | sort -"Skip Count" | head 10 | table Artist,"Skip Count"</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Skip Count"> <colorPalette type="minMidMax" maxColor="#DC4E41" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Skip Count"> <option name="precision">0</option> </format> </table> </panel> <panel> <table> <title>Skip Top 10 Albums</title> <search> <query>index="applemusic" | eval art_album=Artist + " - " + Album | stats sum(Skips) as "Skip Count" by art_album | rename art_album as "Album Name" | sort -"Skip Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Skip Count"> <colorPalette type="minMidMax" maxColor="#DC4E41" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Skip Count"> <option name="precision">0</option> </format> </table> </panel> <panel> <table> <title>Skip Top 10 Tracks</title> <search> <query>index="applemusic" | sort -Skips | eval title = Artist + " - " + Track | table title,Skips | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="count">20</option> <option name="dataOverlayMode">none</option> <option name="drilldown">none</option> <option name="percentagesRow">false</option> <option name="rowNumbers">false</option> <option name="totalsRow">true</option> <option name="wrap">true</option> <format type="color" field="Skips"> <colorPalette type="minMidMax" maxColor="#DC4E41" minColor="#FFFFFF"></colorPalette> <scale type="minMidMax"></scale> </format> <format type="number" field="Skips"> <option name="precision">0</option> </format> </table> </panel> </row> <row> <panel> <chart> <title>Library Top 10 Years</title> <search> <query>index="applemusic" | stats count as "Track Count" by Year | sort -"Track Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="charting.chart">column</option> <option name="charting.chart.showDataLabels">all</option> <option name="charting.drilldown">none</option> </chart> </panel> <panel> <chart> <title>Play Top 10 Years</title> <search> <query>index="applemusic" | stats sum(Plays) as "Play Count" by Year | sort -"Play Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="charting.chart">column</option> <option name="charting.chart.showDataLabels">all</option> <option name="charting.drilldown">none</option> </chart> </panel> </row> <row> <panel> <chart> <title>Library Top 10 Genres</title> <search> <query>index="applemusic" | stats count as "Track Count" by Genre | sort -"Track Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="charting.chart">column</option> <option name="charting.chart.showDataLabels">all</option> <option name="charting.drilldown">none</option> </chart> </panel> <panel> <chart> <title>Play Top 10 Genres</title> <search> <query>index="applemusic" | stats sum(Plays) as "Play Count" by Genre | sort -"Play Count" | head 10</query> <earliest>0</earliest> <sampleRatio>1</sampleRatio> </search> <option name="charting.chart">column</option> <option name="charting.chart.showDataLabels">all</option> <option name="charting.drilldown">none</option> </chart> </panel> </row> </dashboard>
That’s it for now. As I add more to the dashboard, I’ll post about the additions.