Thursday, April 30, 2009

Laziness Continued

I am a fan of taking steps out of processes. Before tonight in my TinyCAD library creation process I had to do a few things:

  1. Update my CSV file with new attributes and/or data

  2. Update the schematic symbols in TinyCAD

  3. Export the symbol library as XML to a certain folder

  4. Run a Python script to turn each symbol in that XML file into its own XML file

  5. Run my main library creation Python script

  6. Open TinyCAD to check for errors. If any symbols are wrong, repeat from the second step


I hadn't really used it earnestly yet but it seemed like too many steps. So I decided to add functionality to my library creation script to pull the symbol data from the symbol library file directly instead of XML files created from that library. Tonight I got it to work, so now the process is:

  1. Update CSV file

  2. Update symbols in TinyCAD

  3. Run library generation script

  4. Check for errors


This is a great improvement. I may actually get work done now!

Tuesday, April 28, 2009

Exceptional Programming

I grew up programming. My first computer was a Laser 128 - an Apple IIc clone. I learned Applesoft BASIC. Those were the days when the command line was also the BASIC interpreter. I programmed everything on that computer. Well, if 'everything' is some science fair projects and cute programs with blocky graphics. I grew up and got an 8086 and cut my teeth on QuickBASIC, then eventually C. In college I was taught C++ and I also picked up Perl and PHP. But despite all of this I never learned about exceptions until I earnestly got into Python.

We see exceptions all the time on our computers. "Firefox caused an exception in blah blah blah blah." I didn't really know what it meant past 'the program screwed up'. I had done most of my programming in C which doesn't have exceptions (or if it does I've never used them). Sometimes I screwed up in my programs and all sorts of crazy things happened. Gibberish printing out on the screen, random lockups, the computer starts beeping and won't stop, etc. I can't be sure, but once I think I caused the gibberish to come out on the printer. I'm probably imagining that though.

But after working with Python I found out those screwups were actually called exceptions. And what's more, you could handle them. Just put everything inside a try statement and if something bad happens you can catch it below and go on your merry way. It's great to just be able to deal with it and continue.

But I noticed something. Exceptions weren't always exceptional. Try to access element number -1 of an array? C would happily let you do it and you could go crazy trying to figure out why. But in an exception-driven language it would catch it for you and stop you. But some functions throw exceptions instead of just telling you that you did something wrong. Well, I suppose that's HOW they tell you you did something wrong. But if you have three statements inside a try block and you just get a generic exception back, you're going to need to do more work to figure out what went wrong.

And imagine my horror when I learned that some people will use exceptions for something as mundane as input validation! Maybe I'm just old fashioned, but in C you did your own input validation. You didn't just accept whatever the user gave you and then throw a fit when it turned out to be wrong. But what's worse is that this wasn't on your run-of-the-mill PC, but on an embedded device!

Exceptions? On an embedded device?

I've always been told that embedded devices needed to be ultra-reliable. All failure modes had to be accounted for and handled. True, exceptions are one way of handling failure, but for a lazy programmer it's entirely too easy to wrap the entire program in one big try block and just restart in case something bad happens. It's too easy to not even plan for the failure modes because you have exceptions. That may fly for your DVD player but certainly not for your airplane.

So what's the alternative? For one, strict input validation for everything - not just user data. Don't even assume that your own code will always pass you valid values in functions - always check! But then how do you signal an error? In C many functions return an otherwise invalid value if there was an error. For instance, malloc() returns NULL if it can't get you the memory (and you had better check!). But this scheme doesn't always work. For some functions there may be no possible invalid values: atoi() returns 0 when it can't parse its input, and 0 is also a perfectly good answer. You can't take a chance on using a valid value as your error signal. Sure, you'll bury the true meaning of the return value somewhere in the documentation but honestly who's going to look until there's a problem? And by then they've already cursed you for being so tricky.

No, the solution is to return a status along with your return value. In some serial systems the receiving device will respond to any message with a status message to tell you that it successfully received your information. Typically zero is 'OK' and everything else has a specific value - either the byte as a whole means something or each bit has significance. This can be done with functions as well. You can either pass all of the parameters by reference and use the return value for the status, or return the status and parameters in a struct if you don't like pointers. Of course it may be more work to work with a struct (and it's dangerously close to object-oriented programming!). In this way a function can tell you that it failed and why it failed. You can use an enum to define all of the different error codes and then handle each of them. Of course some errors cannot be handled by your code alone. If data shows up late in a control system there's not much you can do about it except not use it in calculations. And the status returned will tell you if it was late (assuming you check it).
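Here's a rough sketch of the status-plus-value idea. I described it in C terms above, but it's written in Python here to keep it short, with a tuple standing in for the struct. Everything in it is made up for illustration: the `Status` names, the `scale_reading` function, the 10-bit ADC range and the 5V reference are all assumptions, not anything from a real system.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status codes; zero means OK, as in the serial convention."""
    OK = 0
    OUT_OF_RANGE = 1
    STALE_DATA = 2


def scale_reading(raw, age_ms, max_age_ms=100):
    """Convert an assumed 10-bit ADC count to volts and return (status, volts)."""
    # Validate the input even though "our own code" is the caller.
    if not 0 <= raw <= 1023:
        return Status.OUT_OF_RANGE, 0.0
    # Late data is reported, not silently used.
    if age_ms > max_age_ms:
        return Status.STALE_DATA, 0.0
    return Status.OK, raw * 5.0 / 1023


status, volts = scale_reading(1023, age_ms=10)
if status is Status.OK:
    print("reading:", volts)
else:
    print("rejected:", status.name)
```

The caller can't get at the value without also receiving the status, which at least makes it harder to forget the check.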

Exceptions are very good form for PC applications but it's easy to be lazy with them. Embedded programming sets a higher bar for the programmer, so bring your A game when developing for embedded devices. Always check inputs, parameters and status returned to make sure that you are working with valid data. The passengers on your airplane will thank you.

Update: I had a conversation with one of my colleagues on this issue and we're of the same opinion - in a sense. In essence he agrees with me, except that he thinks exceptions are a valid method of achieving the same goals. BUT - we both agree that in embedded development you shouldn't be catching general exceptions or just wrapping your entire main() in a try block. You have to know what failure you expect to happen and exactly how to correct it in EVERY case - exceptions or not. The way exceptions work on PCs (Oops, Firefox caused an exception - it's quitting) is unacceptable for embedded development. I shudder to think that the same approach would be applied to an embedded device by an unaware programmer, but that doesn't indict exceptions in general. Just their misuse.
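To make my colleague's side of it concrete, here's a small Python sketch of catching only the failures you expect and naming each one. The function `ratio_from_text` and its status strings are hypothetical, just something simple enough to fail in more than one way.

```python
def ratio_from_text(text):
    """Parse a string like '3/4' and return (status, value)."""
    try:
        # ValueError here if there is not exactly one '/'.
        left, right = text.split("/")
        # ValueError if either side is not a number,
        # ZeroDivisionError if the right side is zero.
        value = float(left) / float(right)
    except ValueError:
        return "malformed input", None
    except ZeroDivisionError:
        return "division by zero", None
    return "ok", value


print(ratio_from_text("3/4"))
print(ratio_from_text("3/0"))
print(ratio_from_text("three/4"))
```

There's no bare `except:` anywhere, so a failure I didn't plan for (passing in None instead of a string, say) still stops the program loudly instead of being swallowed.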

Saturday, April 25, 2009

Laziness (and Python)

I wrote enough last time to scare some of you about serial communication. I was going to write about the design of a simple serial messaging protocol to make all of you readers feel better, but my laziness got to me. I mean, it's all designed in my head, but I figured I'd need tables and diagrams and figures and such to really explain it well. That riled up my innate laziness, so I decided to write about that instead.

To put this in some perspective, I am supposed to be the library maintainer for TinyCAD. It's an open-source schematic capture program, and development on it just started again after a few years of languishing. My duties are to put together libraries of basic schematic symbols (resistors, capacitors, diodes, some ICs, etc) for other people to use.

I'm not doing a good job. I have released no libraries yet.

I do admit - I'm lazy. And this job requires a fair bit of tedious work. The libraries are stored in Microsoft JET database format and the symbols can only be edited with TinyCAD's built-in drawing tools. They're OK, but not if you want to make two dozen symbols at once. Or worse yet, make slight changes to two dozen symbols at once. Or just add a bunch of meta-data (part name, manufacturer, part number, etc). You have to create all the fields for every part, edit them manually, save, etc etc. The entire process was not made for batch creation and editing. I was in danger of getting absolutely no work done at all. Or worse yet, I was in danger of doing lots of tedious work and then re-doing it when I needed to make slight changes to every part I had already made. Something that realistically I should only have to do once and then forget about it.

I am no fan of manual operations, especially when they are tedious and repetitive. Humans are not geared towards that sort of work - we make more mistakes and are much less efficient than a small shell script. When I was making wirelists my brain would just shut off after a while and I would do all sorts of really wrong things. Or I would cut and paste things that shouldn't be cut and pasted because although I thought two things were the same they weren't. My wirelist had tons of errors that luckily weren't too expensive to fix. But the lesson is clear - don't use people for repetitive operations. They're bad at it.

So when I was faced with the prospect of doing just that for these libraries I said 'no' and started immediately with the laziness. Laziness is bad of course because it keeps you from getting work done. And since I wasn't required by force of law or paycheck to make these libraries I wasn't too insistent on starting a process I knew would be frustrating and error-prone. I did nothing until I learned about Python.

Python is a scripting language that is designed to be easy and do everything. It is very close to succeeding. As I said before these symbol libraries are stored in Microsoft Jet database format. Jet is the back end of Microsoft Access. I am not entirely impressed with Microsoft products but I knew SQL queries and it supports those so I had some baseline. And I had Python. A little searching and I figured out which module I had to import to allow access to the database. A little more searching and I figured out how I could insert the BLOB (Binary Large OBject - it's how raw data is stored in fields in databases) for the symbol drawings. Then I said to myself, why don't I store the symbol text data in a CSV file so I can just type things out once? Python has an import for that too. TinyCAD can export the parts data as XML? Great, I can import a DOM parser to access the XML and create a CSV out of the current libraries.
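The XML-to-CSV step looks roughly like the sketch below, using the standard `xml.dom.minidom` and `csv` modules. Be warned that the XML layout is invented for the example: the `<symbol>` element and its `name` and `description` attributes are placeholders, not TinyCAD's actual export format.

```python
import csv
import io
from xml.dom import minidom

# Hypothetical export layout, NOT TinyCAD's real schema.
SAMPLE_XML = """<library>
  <symbol name="R" description="Resistor"/>
  <symbol name="C" description="Capacitor"/>
</library>"""


def symbols_to_csv(xml_text):
    """Return CSV text with a header row and one row per <symbol> element."""
    dom = minidom.parseString(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "description"])
    for node in dom.getElementsByTagName("symbol"):
        # getAttribute returns an empty string if the attribute is missing.
        writer.writerow([node.getAttribute("name"),
                         node.getAttribute("description")])
    return out.getvalue()


print(symbols_to_csv(SAMPLE_XML))
```

Once the data is in a CSV, changing a field on two dozen parts is a find-and-replace in a text editor instead of two dozen trips through a dialog box.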

If you've ever used Matlab scripting you'll feel right at home. Heck, there's even SciPy - together with NumPy and matplotlib it mimics many of the functions of Matlab (plotting, matrix operations, basic math, etc) in case you're too poor to buy Matlab. You can use wxWidgets or any other GUI library to create GUI apps. You can access almost any database. You can draw with Tkinter. You could lose your voice from listing all the things you can do. It's easy, comprehensive and powerful.

Python gave me a reason for my laziness. I can do more work, more accurately and faster than if I had tried to do everything by hand. Bottom line: if you're avoiding doing some tedious work, pick up Python or any scripting language and do yourself a favor by making a tool to do the work for you. You'll thank me.

Tuesday, April 21, 2009

The scary world of serial protocols...

Serial is one of those things that may seem easy at first glance but in reality is so complicated that you want to hang yourself. I used to see plenty of job descriptions that required something along the lines of 'knowledge of serial protocols'. What a joke! 8-N-1, 115200bps, nine pin serial connector DONE! And you get BITS out the back end when you're done. Are people afraid of BITS?

Some people are afraid of BITS (I'm looking at you CS majors), but that's not the trouble with serial protocols. The trouble is that 'serial protocols' is really vague. The first protocol you might think of is good old RS-232. It's a lovely standby, 8 data bits per character, no parity, one stop bit and your choice of baud rate. Why would anyone want to deviate from that? But what do all of these settings actually mean? It's swell if you have a GUI to enter these values in, but it gets harder when you have to configure a microcontroller with nothing but assembly. And even then what actually HAPPENS when you send a message? What does the waveform even look like? These may be idle questions for you until the first time that something doesn't work and you have to dig deeper than the GUI to fix it.

For instance what are the voltage levels of RS-232? The answer - heh. There is no single answer. According to the standard a valid level is anywhere from 3V to 15V in magnitude, positive or negative. That'd be easy except that not everyone follows the standard! You might get +/-12V, or you might get 0 and 5V (TTL levels). The scary thing is that most of these work because transceivers are often not too picky about voltage levels. And then when you figure that out you'll be surprised to know that the voltage levels are the inverse of what you'd expect - a '1' is the negative voltage and a '0' is the positive one. If you don't know that you'll have at least 20 minutes of confusion.
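If you want to see what one character looks like on the wire, here's a little Python sketch. It builds an 8-N-1 frame (a start bit of 0, eight data bits with the least significant bit first, a stop bit of 1) and then flips it into line voltages. The function names are mine, and the 12V swing is just an assumed value for the example.

```python
def frame_8n1(byte):
    """Return the logic bits for one character: start, 8 data bits (LSB first), stop."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]  # least significant bit first
    return [0] + data_bits + [1]                     # start bit is 0, stop bit is 1


def rs232_volts(bits, swing=12):
    """Map logic bits to line voltage: a 1 goes out negative, a 0 goes out positive."""
    return [-swing if bit else swing for bit in bits]


bits = frame_8n1(0x35)
print(bits)               # [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]
print(rs232_volts(bits))  # [12, -12, 12, -12, 12, -12, -12, 12, 12, -12]
```

Put a scope on the line and that second list is roughly the picture you'd see, one entry per bit time.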

Add to that the fact that even if you've figured out your RS-232, there are MANY more serial interfaces out there - all different. I2C and SPI are synchronous - there's a clock signal transmitted, unlike RS-232. You don't have to worry about addresses with RS-232 since you've only got two devices communicating, but not so with the others: I2C devices have addresses and SPI devices each need their own chip select line. And have you even thought about handshaking? You can often get away without it on RS-232 by leaving RTS and CTS unconnected, but I2C builds an acknowledge bit into every byte. I hope you can figure out why you need an open collector output with I2C...

And don't even get me started about data representation. Do you think bits are just bits? That 0x35 is always equivalent to 53 decimal? Not so fast. That could be ASCII-encoded which would mean that it represents the digit '5', not a decimal value of 53. If you look at your serial data stream in HyperTerminal the output you are seeing is decoded from ASCII. That means that 0x35 will display '5', not 53 decimal. ASCII encoding is useful when data is being displayed on a terminal, but it's also used in other circumstances where you'll tear your hair out because of it.
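The same byte read both ways, as a quick Python sketch (the variable names are just for illustration):

```python
raw = bytes([0x35])            # one byte off the wire

as_number = raw[0]             # treated as a binary value: 53
as_text = raw.decode("ascii")  # treated as an ASCII character: '5'
as_digit = int(as_text)        # the digit that character stands for: 5

print(as_number, as_text, as_digit)
```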

For instance many serial buses have packets with headers, footers, checksums, etc. Packets are often started with 0x02 and ended with 0x04. You can transmit the length of the data packet when you send it so that your device will know when to start looking for another one, or you could choose not to. In that case, what if the data you're sending has an 0x04 in it? That would tell the device to stop listening and it would ignore the rest of the data. So, you encode all of your data with ASCII. If you need to send the value 0x04 (decimal value 4, obviously) then you encode it to ASCII and send it as two bytes - 0x30 and 0x34. When you receive it, chop off the 3's and push the nibbles together and get 0x04 again, there's your data.
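Here's a sketch of that scheme in Python. Each data byte is split into two nibbles, each nibble gets a 3 stuck on the front, and the whole thing is wrapped in the 0x02 and 0x04 markers. The function names are mine, there's no checksum, and sending the high nibble first is an arbitrary choice I made for the example.

```python
START = 0x02  # start-of-packet marker from the text
END = 0x04    # end-of-packet marker from the text


def encode_packet(payload):
    """Wrap payload bytes in START/END, sending each byte as two 0x3n bytes."""
    body = bytearray()
    for byte in payload:
        body.append(0x30 | (byte >> 4))    # high nibble first (an assumption)
        body.append(0x30 | (byte & 0x0F))  # then the low nibble
    return bytes([START]) + bytes(body) + bytes([END])


def decode_packet(packet):
    """Reverse encode_packet, rejecting anything that does not fit the scheme."""
    if len(packet) < 2 or packet[0] != START or packet[-1] != END:
        raise ValueError("missing start or end marker")
    body = packet[1:-1]
    if len(body) % 2:
        raise ValueError("odd number of encoded bytes")
    if any((b & 0xF0) != 0x30 for b in body):
        raise ValueError("encoded byte outside 0x30..0x3F")
    # Chop off the 3's and push the nibbles back together.
    return bytes(((hi & 0x0F) << 4) | (lo & 0x0F)
                 for hi, lo in zip(body[0::2], body[1::2]))


packet = encode_packet(b"\x04")
print(packet.hex())           # 02303404
print(decode_packet(packet))  # b'\x04'
```

Because every encoded byte lands between 0x30 and 0x3F, the markers can never show up inside the data. The price is that the payload doubles in size.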

And that brings up another point - nibble order. Do I send the least significant nibble first (the '4') or the most significant ('0')? What does the standard say? There isn't one. You have to be told how to interpret the data by whoever put it together, and God help you if you didn't do it yourself.

Suddenly 8-N-1 doesn't sound as simple as it used to. Yes, it's all just BITS, but they're scary bits. They're bits that mean whatever the person on the other end of the conversation wants them to mean. You have to parse them, reorder them, combine them, split them and take a magnifying glass to them to make any sense of it. Watch out kids - it's a jungle out there.