HTML5 Audio: Format Wars

I had originally planned to use HTML5’s new <audio> tag to actually render the audio in Project: Apollo. That went south, however, when I discovered a few critical problems with current implementations that disqualify the <audio> tag from contention.

Different Browsers, Different Formats

Both Mozilla and WebKit have already adopted at least some support for the <audio> tag. Unfortunately, Mozilla has decided to avoid potential licensing issues by omitting support for any audio format that isn’t free of patents and royalties. This means no MP3 or AAC support (including AAC inside an MP4 container), as those formats are patent-encumbered.

On the other side of the fence, WebKit (driven by Apple) has a substantial interest in seeing both MP3 and MP4 thrive on the web (maybe you’ve heard of iTunes?). Citing “simplicity”, Apple isn’t budging on adding support for the open standards Mozilla supports, such as Ogg Vorbis.
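To make the mismatch concrete, here is a minimal sketch of how a page could choose between several encodings of the same track. It is modelled on the standard canPlayType() method of media elements, which answers "", "maybe" or "probably" for a given MIME type. The TrackSource shape and the pickPlayableSource name are my own inventions for illustration, and the check is passed in as a function so the logic can be tried without a real browser.

```typescript
// The three answers HTMLMediaElement.canPlayType() can give; "" means "no".
type CanPlayAnswer = "" | "maybe" | "probably";

// Hypothetical shape for one encoding of a track; not part of any real API.
interface TrackSource {
  url: string;
  mimeType: string; // e.g. "audio/mpeg" or "audio/ogg"
}

// Return the first source the browser claims it can play, or null if none.
// `canPlay` is injected so the logic can run without a real <audio> element.
function pickPlayableSource(
  sources: TrackSource[],
  canPlay: (mimeType: string) => CanPlayAnswer
): TrackSource | null {
  // Prefer a confident "probably" over a tentative "maybe".
  const probably = sources.find((s) => canPlay(s.mimeType) === "probably");
  if (probably !== undefined) {
    return probably;
  }
  const maybe = sources.find((s) => canPlay(s.mimeType) === "maybe");
  return maybe !== undefined ? maybe : null;
}
```

In a browser, the check would be something like (m) => document.createElement("audio").canPlayType(m). The catch is the null case: with a library that is mostly MP3 and AAC, a browser that only plays Ogg Vorbis gets nothing back unless every track is re-encoded into a second format.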

Google Web Toolkit

Unfortunately, there isn’t much support (yet) in Google Web Toolkit for HTML5, but it is coming. This makes it difficult to take advantage of HTML5 right now, since GWT is the main framework at play in Apollo. Specifically, I’m not sure how to go about listening for DOM events that are specific to HTML5, such as the ones fired by the <audio> element. This isn’t a showstopper, because I can always bridge to GWT with native JavaScript (ick). But it does add another hurdle to overcome.
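For what it’s worth, the native side of such a bridge could be fairly small. The sketch below is written in TypeScript rather than GWT’s own native-method syntax, and it assumes a hypothetical PlayerCallbacks object that the GWT code would hand over; those callback names are made up. The event names ("canplay", "ended", "error") are standard HTML5 media events.

```typescript
// Minimal slice of the DOM's addEventListener that an <audio> element provides.
interface Listenable {
  addEventListener(type: string, listener: () => void): void;
}

// Hypothetical callbacks the GWT side would expose; the names are made up.
interface PlayerCallbacks {
  onReady(): void; // enough data is buffered to start playing
  onEnded(): void; // playback reached the end of the track
  onError(): void; // loading or decoding failed
}

// Forward the standard HTML5 media events to the callbacks.
function bridgeAudioEvents(audio: Listenable, callbacks: PlayerCallbacks): void {
  audio.addEventListener("canplay", () => callbacks.onReady());
  audio.addEventListener("ended", () => callbacks.onEnded());
  audio.addEventListener("error", () => callbacks.onError());
}
```

Keeping the parameter type down to a single method means the function can be exercised with a stand-in object, while a real <audio> element should fit the same shape.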

The Solution, Sadly, is Flash

I have used Longtail’s JW Player for Flash in the past on other projects. Despite my reluctance to do so*, I have decided to implement JW Player 5 for Flash as the audio rendering engine. JW Player 5 has the best support for the formats I called for in my key features, and it is relatively simple to use.

* Okay, I just have some minor gripes with JW. Namely, their documentation is always a sore spot, as it’s often conflicting or missing. I also feel like I’m being nickel-and-dimed to death by their licensing terms.

An Overview of Project: Apollo

To start with, I’d like to lay down my vision statement and some of the major tenets of design for this project (or guiding principles, if you will). These points should help focus direction and guide tough decisions down the right path.

Project: Apollo connects users easily to their media and simplifies management of large, complex collections of media.

Project: Apollo is a learning experience, first and foremost.

The personal and professional gains to be had from seeing out a project of this size and scope are well beyond any short-term plans for entrepreneurialism.

Project: Apollo is designed with future standards in mind.

The goal is to discover what is possible with new tech, not work within the bounds of old tech.

Project: Apollo places the highest possible value on the user experience.

And the experience must be fast. And clean. And simple. And powerful. It is important that new users are just as comfortable as seasoned veterans.

Project: Apollo should be familiar.

Re-inventing the wheel is not a goal or a priority. However, innovation where necessary is perfectly acceptable.

Project: Apollo is about intelligent design.

The user is not stupid. Functionality should work for the user, rather than having the user work for functionality.

Project: Apollo is simple and focused on a single problem.

Stay focused on this problem and solve it well. Apollo is not intended to be a “push-button, do everything” solution nor a buzzword mashup.

Key Features and Functions

  • Browser-based, supporting Firefox 3.5+, Safari 3+, Chrome 5+, Internet Explorer* 8+, and iPad browsers. Native iOS applications are planned.
  • Supports both streaming and downloading of content in a user’s library.
  • Supports playlists with play one, play all, repeat and shuffle modes.
  • Comprehensive library browsers, including: by artist, by album, by genre, by song, by ranking and by year.
  • Support for album artwork and related media.
  • Support for MP3, MP4, AAC, and WAV audio.
* Support insofar as is reasonably possible. I refuse to let Internet Explorer cause me grief or frustration.
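
The playlist modes in that list boil down to one question: which track comes next? Here is a rough sketch of that decision, not a settled design. The mode names are illustrative, "repeat" is assumed to mean looping the whole playlist, and the random source is passed in so that shuffle can be tested.

```typescript
// The modes named in the feature list; the identifiers are illustrative.
type PlayMode = "playOne" | "playAll" | "repeat" | "shuffle";

// Return the index of the next track to play, or null to stop playback.
// `random` is injectable (defaults to Math.random) so shuffle is testable.
function nextTrackIndex(
  current: number,
  trackCount: number,
  mode: PlayMode,
  random: () => number = Math.random
): number | null {
  if (trackCount <= 0) {
    return null;
  }
  switch (mode) {
    case "playOne":
      return null; // stop after the current track
    case "playAll":
      return current + 1 < trackCount ? current + 1 : null;
    case "repeat":
      return (current + 1) % trackCount; // wrap around to the first track
    case "shuffle": {
      if (trackCount === 1) {
        return current;
      }
      // Jump ahead by 1..trackCount-1 so the same track never plays twice in a row.
      const offset = 1 + Math.floor(random() * (trackCount - 1));
      return (current + offset) % trackCount;
    }
    default:
      return null;
  }
}
```

This shuffle only avoids immediate repeats. Playing every track exactly once per pass would need a shuffled queue instead.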

The Audacity of Music

For a number of reasons, I find my collection of music to be a generally complicated topic.

It’s Enormous

First of all, I have a ton of it. Gigs of it. Thousands of tracks. I own a 60GB 5th-gen iPod. It’s full. Needless to say, I seem to fancy myself a collector of audio. I refuse to let go of old tracks I would never even admit to owning. Managing such a large library is tough, and most of the would-be music library managers I have tried tend to choke on large databases.

It’s Amorphous

My music collection isn’t just large, it’s also fragmented by duplicate tracks, multiple formats, and duplicate tracks in different formats.

It’s On My PC, At Home

Up until recently, I used my MacBook Pro at work and a Gateway desktop at home. All of my music resided on my Gateway with its 1TB internal disk drive. Then I decided to retire the desktop. It’s not old or broken (Intel quad-core 2.4GHz with 6GB RAM and 1TB RAID disk setup). I just grew tired of switching platforms night and day. It’s like getting into your car every morning to drive to work and finding that the seats and mirrors are never where you left them.

My desktop doesn’t travel with me, either. And since I have more music than my iPod can store, much of my music stays locked up at home (yes, I realize that millions of people have dealt with this problem since the dawn of recorded audio).

It’s Not All Physical Audio, or Even Virtual Audio

Not all the music I listen to exists in a physical incarnation. Some of it came from CDs, some of it I bought on iTunes. Some of it exists only through streaming radio stations such as Pandora.

So What to Do About It?

Unfortunately, I can’t just throw everything onto my MacBook and be off. I don’t have the disk space for it. And I have no interest in carrying a USB drive everywhere I go to store all my music.

I tinkered with a Network Attached Storage device, but many of the same problems remain. Namely, the device is still constrained to my home, and getting decent performance out of it would require a total upgrade of my home network to Gigabit.

Helium Music Manager

I did try, and thoroughly enjoyed, Helium Music Manager. It’s a fantastic audio database and has some pretty nifty features for audiophiles. However, it has one critical deal-breaker for me: it’s Windows-only.

Enter the MKLabs

Since I’m a developer, I think it ought to be possible to build a solution to suit my needs and wants. So that’s what I’m going to do. Part of MethodKnowledgy is to explore and learn about code, so this makes perfect sense. I think I’ll call my little “code shed” of tinkering MKLabs.

I won’t go into all the details of features here and now, but basically I intend to build a web-based audio database application and leverage some of the coolest new platforms and technology out there to make it happen.

I’ll use cloud-based hosting systems like Google AppEngine to host my application and give it a scalable, high performance back end. The AppEngine DataStore also looks like a good candidate for storing massive amounts of data on my music.

I’ll use Amazon S3 to store and back up my music and distribute it through Amazon CloudFront.

I’ll use Google Web Toolkit to build super-fast, engaging user interfaces so my application feels (and works) like a proper desktop application.

And I’ll use new features like the HTML5 <audio> tag to render the audio, removing the need for special plugins or browsers (although I’m making a rather large assumption about the capabilities of the average Joe’s browser).

I’m not trying to reinvent the wheel, though. Products like iTunes, Helium, and Grooveshark have proved what features do and don’t work. Now I just want something that works the way I want it to.

I aim to make regular updates here about the progress of my new music jukebox, codename Apollo. I’ll discuss major successes or failures along the way and try to document things that work well or don’t work well, and how to overcome certain challenges.

Should be fun.

I’ve Done it Again

Yes, again.

For the umpteenth time, I have re-launched my blog. Or a portion of my blog.

Like many people, I too have these delusions of grandeur. My visions tell me I’m going to create some über-popular technical blog that everyone and their sister will read. I start the blog and after a few weeks I lose interest. Then my interest is reinvigorated at some point later on. That’s how it goes; it’s a vicious cycle.

Would there be any point to promising to do more updates or write more about topic ‘X’? Probably not. But I’ll do it anyway. I’m going to make more updates and write more about topic ‘X’. Satisfied? Good.

In all seriousness, though, I really would like to start documenting some of the things I learn as I trowel through code. There’s no reason anybody, myself included, should have to make the same mistakes I already have.

It would make sense for me to start documenting things like how I finally managed to get Product Foo working on my atypical setup. Or how I didn’t.

So here goes. I’m looking forward to it. Read on and enjoy.