Monday, May 4, 2009

Firmware in all things

So today I'm home from work for a bit to wait for the repairman (or repairwoman) to come and take a look at my dishwasher. It has lights flashing on the front and won't do anything no matter how many buttons I press. You may be wondering why I don't just fix it myself. Despite all of the experience I have with fixing household appliances (none) and my familiarity with similar devices (huh?) I've decided that it's just not worth my time to try to fix the thing myself (although having to stay home from work to wait for the repairman somewhat negates that position). I tried some things. The manual says that it may be a bad heating element and to check the wiring. I checked the wiring (well, looked at it anyhow) and nothing. Some advice online said to try pressing a sequence of buttons in rapid succession - nothing. I gave up after that.

Why? Because it's obvious this isn't a hardware problem. And if it's not a hardware problem it's just a plain HARD problem. Mechanical systems are easy - they either work or they don't. No ill-defined states, no invalid inputs, no built-in tests. If something can't happen it physically can't happen. If one gear is moving then the gear in direct contact with it also has to be moving. If something sounds wrong it's probably directly linked to the problem - just follow the connections.

But somewhere in this dishwasher is firmware - code. Code breaks all of the rules. Impossible things happen in code all of the time. Jump to the wrong memory address? You could end up executing impossible code. 'This shouldn't happen! I never called this function! It's impossible!' Mess up with pointers or indices? You could start reading impossible values. 'Umm, an unsigned 8-bit integer CAN'T be 1024....' Use a case fall-through instead of explicit checks? 'That's IMPOSSIBLE! That value isn't handled! Why isn't it going to default...'
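
To make the "impossible value" complaint concrete, here's a minimal sketch in Python. Everything in it is made up for illustration (the memory layout, the field names, the numbers); it is not how any real dishwasher stores its state. It just shows how reading from the wrong index with the wrong width makes a field that is supposed to be an unsigned 8-bit integer come back as 1024.

```python
import struct

# Hypothetical memory layout (an assumption for this example, little-endian, no padding):
#   offset 0: water temperature, unsigned 8-bit
#   offset 1: cycle timer in seconds, unsigned 16-bit
MEMORY = struct.pack("<BH", 55, 1024)

def read_temperature(memory):
    """Correct read: one unsigned byte at offset 0."""
    return struct.unpack_from("<B", memory, 0)[0]

def read_temperature_buggy(memory):
    """Buggy read: off-by-one index and the wrong width.

    The caller still believes it is getting the 8-bit temperature,
    but these two bytes are really the 16-bit timer.
    """
    return struct.unpack_from("<H", memory, 1)[0]

print(read_temperature(MEMORY))        # 55
print(read_temperature_buggy(MEMORY))  # 1024
```

Nothing impossible happened, of course. The bytes were always there; the code just looked at them through the wrong window.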

Code is arbitrary. I have a flashing light. That's not a symptom of the problem - it's an indication. I need a manual to tell me what it means, and if that doesn't suffice I need the REAL manual - the one they only give to the service technicians (well, sell anyhow). It means whatever they tell me it means. A mechanical device is not arbitrary. If my engine is overheating it doesn't do something illogical like lock the drive shaft automatically - it just starts heating up. All according to rules Mr. Newton figured out (differential equations, baby!). You can track it and explain it with rules you find in your high school physics book. The code only follows the rules that Mr. Programmer set forth, and he's not bound to adhere to any standards, and even if he were he wouldn't tell you. Furthermore, the state of a machine is obvious and can be ascertained by observation (perhaps complicated observations, but still, observations). Code need not give any indication of its state and typically doesn't, or the indication it gives is not very useful (green light - good, red light - bad!).

And if you aren't rigorous in your code you can have undocumented behavior. It's entirely possible for you not to be able to reach one of your states or not be able to leave it if you make simple mistakes. And if your testing doesn't catch it then you'll have to reboot your dishwasher or some other such silliness.
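
Here's a rough sketch of the kind of mistake I mean, again in Python and again entirely hypothetical: the state names, events, and transition table are invented for the example, not pulled from any real firmware. The table has two simple mistakes baked in, one state nothing ever enters and one state nothing ever leaves, and a small search finds both.

```python
from collections import deque

# Hypothetical transition table: state -> {event: next_state}.
TRANSITIONS = {
    "IDLE":  {"start": "FILL"},
    "FILL":  {"full": "WASH"},
    "WASH":  {"done": "DRAIN", "overheat": "ERROR"},
    "DRAIN": {"empty": "IDLE"},
    "ERROR": {},                 # mistake: no event ever leaves ERROR
    "RINSE": {"done": "DRAIN"},  # mistake: no transition ever enters RINSE
}

def reachable_from(start, transitions):
    """Breadth-first search over the table; returns every state reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for next_state in transitions.get(state, {}).values():
            if next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return seen

def unreachable_states(start, transitions):
    """States defined in the table that can never be entered from start."""
    return set(transitions) - reachable_from(start, transitions)

def trap_states(start, transitions):
    """Reachable states from which there is no path back to start."""
    return {
        state
        for state in reachable_from(start, transitions)
        if start not in reachable_from(state, transitions)
    }

print(unreachable_states("IDLE", TRANSITIONS))  # {'RINSE'}
print(trap_states("IDLE", TRANSITIONS))         # {'ERROR'}
```

In this toy table, the only way out of ERROR is to pull the plug, which is exactly the "reboot your dishwasher" situation.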

Computers are very powerful, but simple. They execute instructions at memory addresses. Everything else is defined by the designer. If he or she is incomplete in the definition of the system then irrational behavior follows. Mechanical systems are bound by the laws of physics: they cannot perform physically impossible functions, they must move smoothly between states that are defined by their physical characteristics alone, and failures are characterized by a smooth transition to a new state (one which is obviously broken). Software doesn't follow these rules and that makes it much more difficult to troubleshoot.

PS: It was bugs. Living on the electronics. Dirtying them up. I'm not a dirty person, I swear. It wasn't a vague error code or some random impossible-to-get-to state, it was just flat-out broke. At least even computers have flat-out broke modes...
