Reading MP3 metadata with Python

Luciano Ratamero

2022-04-18

Music is one of my real passions. If I'm not listening to something, I'm probably singing something to myself all the time. But for the longest time, I never had the time (or money) to have a better listening experience. Since I'm at home, and the pandemic is not over, I decided to understand a little bit better what (and how) I'm listening to music - with Python :]

Quick summary on how I listen to music

So, as millenials do, I have grown listening to cassette tapes, CDs, and eventually MP3s. I remember being 15, using a huge ass pair of headphones (real cheap ones too) with my 128MB MP3 player - and it was awesome.

With time, I grew my MP3 collection, both through good channels (like bandcamp for example), and less... legal channels (like Napster, Kazaa, and Souseek, whic I still use now and then). I can't say my library is big, but it's around 4200 musics in size.

The problem is that a good chunk of it is garbage; not in quality of the music itself, but in the quality of the file compression. And now that I have the money to get some better headphones, some of my music library is just too dank.

As with anything, Python can help me.

Enter mutagen

mutagen is a Python library that helps us to find metadata for our audio files. After discovering it, discovering which MP3 files were too compressed was pretty straightforward.

First, I installed it locally (in my case, it was already installed somehow. I probably fiddled around with it in a distant past):

pip3 install --user mutagen

Then, I wrote a quick and really dirty Python script to tell me which of my MP3 had a bitrate lower than 256kbps. The higher the bitrate, the less it's compressed:

import os
import mutagen


def list_dank_mp3s(folder):
    matches = []
    for root, dirnames, filenames in os.walk(folder):
        for filename in filenames:
            if filename.endswith(".mp3"):
                # I didn't take the time to understand 
                # why it was formatting things wrong
                # so I just hammered the names
                matches.append(os.path.join(root, filename)[:-1] + "3\n")

    bad_mp3s = []
    for item in matches:
        # here, we get the bitrateof the MP3 file
        # there are a LOT of other metadata there,
        # if you want to use it
        bitrate = mutagen.File(item.strip()).info.bitrate
        # if the bitrate doesn't end with 000, it means it's most likely variable
        # if it is, we leave it be. if not, we check if it's below 256kbps
        if str(bitrate).endswith("000") and bitrate / 1000 < 256:
            bad_mp3s.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), item))
    return bad_mp3s

This function gives me a list, that has the paths of the MP3s that I may want to buy or redownload in a better bitrate. I then write the list to a file, and voilĂ , all my bad sounding MP3s are listed.


Do you have any other uses for this kind of library? Do you have any tips on good headphones? Send me a comment down here so we can keep the discussion going! Thanks for your attention, and see you next time :]