Introduction
I have been working on perfecting a data-cleansing and import process for my Plex Server’s Music library for several years now, on and off.
What started as a process to acquire, sanitise and prepare music for DJing has grown into one whose main job is to acquire, clean and prepare music for streaming from my Plex server via Plexamp to all my devices, including those outside my home network.
Music Import Process
Acquisition
Either:
- Via Lidarr, a music collection manager for Usenet and BitTorrent users that downloads music and automatically fills holes in your music collection, coupled with Prowlarr for indexer management.
- Via FLAC files ripped on Windows with dBpoweramp CD Ripper and placed in the Samba share ‘incomingmusic’ on the server.
- Via any FLAC or MP3 files placed in the Samba share ‘incomingscratch’ on the server.
- Via files downloaded from Humble Bundle’s audio bundles, and from paid sites such as Bandcamp.com, and placed in the Samba share ‘incomingmusic’ on the server.
- Via ARM (Automatic Ripping Machine) running on the Linux server for headless, automatic ripping of music CDs. I do not currently rip music CDs with ARM, although I will probably re-establish this acquisition path in the future.
Sanitisation
- FLAC files are checked with the FLAC integrity checker, and any that fail the test are deleted. The integrity checker runs on every FLAC file in my collection once every 30 days.
- MP3 files are checked with mp3check, and any data outside the official MP3 headers is deleted. This removes spurious information and also saves disk space.
- Anything that is not an MP3 or FLAC file, such as cover art images and text log files, is deleted to save space, along with any directories that contain no MP3 or FLAC files.
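As a rough illustration of the rules above (not the actual scripts, which are bash), here is a minimal Python sketch that lists which files a sanitisation pass would delete. It assumes the reference `flac` command-line tool is installed and uses its `--test` mode as the integrity check. The function names and the dry-run approach are my own assumptions, and MP3 repair with mp3check and the removal of empty directories are left out.

```python
import subprocess
from pathlib import Path

# Only these file types are kept in the library (per the rules above).
AUDIO_SUFFIXES = {".flac", ".mp3"}


def flac_is_intact(path: Path) -> bool:
    """Return True if `flac --test` decodes the file without errors."""
    result = subprocess.run(
        ["flac", "--test", "--silent", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def plan_deletions(root: Path, flac_check=flac_is_intact) -> list[Path]:
    """Dry run: list the files under `root` that sanitisation would delete."""
    doomed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in AUDIO_SUFFIXES:
            # Cover art, logs and anything else that is not audio.
            doomed.append(path)
        elif suffix == ".flac" and not flac_check(path):
            # FLAC file that failed the integrity test.
            doomed.append(path)
    return doomed
```

Returning a list rather than deleting straight away makes it possible to review the result before anything is removed. The `flac_check` parameter exists only so the check can be swapped out, for example on a machine without `flac` installed.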
Preparation
I run FileBot, which handles the organisation, renaming and folder structure of the music library according to what Plex expects. As an extra safeguard, it also refuses to process corrupted or spurious files.
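To show the kind of layout Plex expects, here is a small Python sketch that builds an `Artist/Album/NN - Title.ext` path from tag values. It does not call FileBot and is not FileBot's naming logic. The helper names and the list of characters to replace are assumptions made for illustration.

```python
import re
from pathlib import Path

# Characters that are awkward in file names on common filesystems (assumed list).
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe(name: str) -> str:
    """Replace unsafe characters and trim trailing dots and spaces."""
    cleaned = _UNSAFE.sub("_", name).strip().rstrip(".")
    return cleaned or "Unknown"


def plex_path(library: Path, artist: str, album: str,
              track_no: int, title: str, suffix: str) -> Path:
    """Build library/Artist/Album/NN - Title.ext for one track."""
    filename = f"{track_no:02d} - {safe(title)}{suffix.lower()}"
    return library / safe(artist) / safe(album) / filename
```

For example, `plex_path(Path("/music"), "AC/DC", "Back in Black", 1, "Hells Bells", ".FLAC")` gives `/music/AC_DC/Back in Black/01 - Hells Bells.flac`.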
Frequency of Music Import Process
The music import process is a set of bash scripts that runs automatically once a week via cron, inside a screen session started under nohup.
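The real process is a set of bash scripts, but the idea of running the stages in a fixed order and stopping at the first failure can be sketched in Python as follows. The stage names and script paths in the usage note are hypothetical placeholders, not the actual scripts.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, command) pairs in order.

    Returns the name of the first stage that exits non-zero,
    or None if every stage succeeds. Later stages are skipped
    once one fails, so a bad sanitisation run never reaches import.
    """
    for name, command in stages:
        result = subprocess.run(command)
        if result.returncode != 0:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical stage scripts; substitute the real ones.
    pipeline = [
        ("sanitise", ["bash", "sanitise.sh"]),
        ("prepare", ["bash", "prepare.sh"]),
    ]
    failed = run_stages(pipeline)
    if failed is not None:
        sys.exit(f"import stopped: stage '{failed}' failed")
```

A weekly cron entry would then only need to start this one driver, and the non-zero exit status on failure gives cron something to report.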
Conclusion
I find that this approach works well and runs in an acceptable amount of time on my current music library, which is extremely large at 1.5 TB. Previous approaches using beets.io simply did not work for me: I couldn’t get beets to scale to a music library of this size, and it was forever corrupting its database.