Wednesday, October 5, 2011

Good Coding Practices #...?

I have a semi-ongoing series on good coding practices... to the extent that I ever update my blog anyhow. Lately I've taken up work on an iTunes Android Sync tool. Is anyone else amazed that there are very few good programs that will sync iTunes playlists to an Android device? There are some out there, certainly, but for some reason each of them tends to have one or two major flaws: way too slow, randomly renames your songs to the titles of different songs, costs money - the usual complaints. So I looked into it, and it turns out that iTunes maintains an XML version of its library information file. But then I discovered that the Android playlists are stored in a highly technical format called ASCII-encoded text files. Let me tell you, it took forever to crack that puppy.

Well when all you have is a hammer every problem looks like a nail. When you have Python every problem looks... easy. So I decided to make a Python program to:

1) Read the XML library file.
2) Figure out what playlists are in there.
3) Figure out what songs are in those playlists.
4) Figure out where those songs are.
5) Make Android playlists from the iTunes playlists.
6) Copy the playlists and music files to the Android device.
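
The steps above can be sketched roughly as follows. This is a minimal sketch and not the tool described in this post: it leans on Python 3's plistlib (the iTunes XML file is a property list) instead of a hand-written parser, builds a made-up two-track library in memory, and assumes the Android playlist is simply one file path per line. The sample data and the names playlists_to_paths and as_playlist_text are invented for illustration, and step 6 (the copying) is left out.

```python
import plistlib
from urllib.parse import unquote, urlparse

# Hypothetical stand-in for a tiny iTunes library; a real file is far larger.
SAMPLE_LIBRARY = plistlib.dumps({
    "Major Version": 1,
    "Tracks": {
        "101": {"Track ID": 101, "Name": "Song A",
                "Location": "file://localhost/Music/Song%20A.mp3"},
        "102": {"Track ID": 102, "Name": "Song B",
                "Location": "file://localhost/Music/Song%20B.mp3"},
    },
    "Playlists": [
        {"Name": "Road Trip",
         "Playlist Items": [{"Track ID": 102}, {"Track ID": 101}]},
    ],
})


def playlists_to_paths(library_bytes):
    """Map each playlist name to the file paths of its tracks, in order."""
    library = plistlib.loads(library_bytes)              # step 1: read the XML
    tracks = library["Tracks"]                           # keyed by track ID (as a string)
    result = {}
    for playlist in library.get("Playlists", []):        # step 2: find the playlists
        paths = []
        for item in playlist.get("Playlist Items", []):  # step 3: find their songs
            track = tracks[str(item["Track ID"])]
            # step 4: turn the file:// URL into a plain, unescaped path
            paths.append(unquote(urlparse(track["Location"]).path))
        result[playlist["Name"]] = paths
    return result


def as_playlist_text(paths):
    """Step 5: one path per line (an assumed playlist layout)."""
    return "\n".join(paths) + "\n"


PLAYLISTS = playlists_to_paths(SAMPLE_LIBRARY)
```

Calling as_playlist_text(PLAYLISTS["Road Trip"]) gives the text that would be written to the playlist file. On a real device the paths would still need rewriting to wherever the music ends up.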

So at first I tried using my favorite parser - SGML parser. But it turned out that SGML parser doesn't handle self-closing tags. You know - the empty ones like <true/>, with nothing in them? With only a start tag that has a / in it and then it's done? Yeah, those. So I had to switch to expat, which isn't so bad either.
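
To see how expat treats those self-closing tags, here is a small sketch using the standard xml.parsers.expat module. The function name trace_events and the sample XML string are made up for illustration. The point is that expat reports an empty tag as a start event followed immediately by an end event, with no data in between.

```python
import xml.parsers.expat


def trace_events(xml_text):
    """Return the sequence of parser events expat reports for xml_text."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each run of text as a single chunk

    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))

    def on_text(text):
        if text.strip():       # ignore whitespace between tags
            events.append(("data", text))

    parser.CharacterDataHandler = on_text

    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


EVENTS = trace_events("<dict><key>Compilation</key><true/></dict>")
```

In EVENTS, the entry for <true/> shows up as ("start", "true") directly followed by ("end", "true"), so a handler class can treat it like any other tag.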

But enough of that! I'm going to show you what I did that's a good coding practice. The iTunes XML file has several parts in it: a general section that describes the library, a tracks section that describes each track and assigns it a unique ID, and a playlists section that describes each playlist and lists the unique IDs of the tracks in it.

I wanted to start off by parsing all the goodness of the general library section and ignore the rest while at the same time planning ahead so I would... be able to figure out where to put the code to parse the rest of it as well. To that end I present a random code snippet:

def handle_data(self, text):
    if self.current_tag == KEY_TAG:
        # Remember the key so the value that follows can be filed under it.
        self.current_key = str(text)
        print("Key: " + text)
    elif self.current_tag in (INTEGER_TAG, STRING_TAG, DATE_TAG, TRUE_TAG):
        if self.current_parent == LIBRARY_KEY:
            if self.current_key in libraryKeyList:
                print(self.current_tag + "=" + text)
                self.tempDict[self.current_key] = text
        elif self.current_parent == TRACKS_KEY:
            pass  # not written yet
        elif self.current_parent == PLAYLISTS_KEY:
            pass  # not written yet
        elif self.current_parent == TRACK_KEY:
            pass  # not written yet
        elif self.current_parent == PLAYLIST_KEY:
            pass  # not written yet

        # The value has been dealt with, so forget the key.
        self.current_key = ""

Some explanation: this function handles data inside of tags. It handles key tags specially, and it handles tags that contain data (integer, string, date, etc.) differently again depending on which section they reside in. So you can see I've written the code that handles the data in the library section but left out handling data in all the other sections. But this is by design: if I weren't planning ahead I wouldn't have put the if statement that checks what the parent is in that function. I would just have put in the code that handles data for the library section without verifying that I was still in the library section - and then it would have handled a whole lot of data from the rest of the file as if it belonged to the library.

By putting the parent key check in there and explicitly listing the different situations that I want to code for, I'm doing two things. First, I'm specifying the exact situation I expect this code to run in - putting my assumptions right out there in the code. Second, I'm listing all of the other situations that I haven't yet coded for but want to in the future. I'm using the code to inform myself (in the future) that I need to put code there that does something different. That's the good coding practice.

It can be used in a variety of languages. In Python use the above form but make sure that you put the pass statement in each empty case - otherwise the interpreter gets angry, because an empty block is a syntax error. In C you can use #warning directives (GCC and Clang support them, at least) to produce a warning at compile time when you know you'll have to write some code but just haven't yet. Like '#warning Will Robinson, you didn't handle the default case!'
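
Here is a minimal Python sketch of the same idea, assuming made-up section names and a made-up handle_value function. The pass branch is the placeholder described above. The warnings.warn call is a suggestion that goes beyond the post: a rough runtime stand-in for C's #warning, which only fires when the unfinished branch actually runs. The final else is another optional extra that makes an unplanned-for case fail loudly.

```python
import warnings

# Hypothetical section names, chosen only for this example.
LIBRARY_KEY, TRACKS_KEY, PLAYLISTS_KEY = "Library", "Tracks", "Playlists"


def handle_value(parent, key, value, store):
    """File a value under its key, depending on which section it came from."""
    if parent == LIBRARY_KEY:
        store[key] = value
    elif parent == TRACKS_KEY:
        pass  # placeholder: track handling is not written yet
    elif parent == PLAYLISTS_KEY:
        # Louder placeholder: nag whenever this unfinished branch is reached.
        warnings.warn("playlist data is not handled yet", stacklevel=2)
    else:
        # A section nobody planned for is a bug, not a case to skip quietly.
        raise ValueError("unexpected parent: %r" % (parent,))
```

The quiet pass is fine while you are still sketching. The warning is handy once the code runs against real files and you want a reminder of what is still missing.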
