Thursday, February 17, 2011

C Gotchas

Holy crap, two updates in one day? Yeah... I decided that if I find myself sitting at my computer thinking 'I wonder what's on Slashdot' or 'Do I have any more street cred over at Chiphacker?' I should do something potentially useful and update my blog. Doesn't have to be long, doesn't have to be good, dosen't ahve to ahve correct spellign - just do it. After all, the first step to making money with a blog is to update it every day. Second step? Have people actually read it. Step three: profit, baby!

But I'm sure you're all here for the real meat of this post and that is going to be the answer to the question 'What stupid thing did Steve do today that cost him hours of time?' It didn't take me hours, but here's a snippet of code that caused me trouble. By the way - you win $10,000 if you spot the bug and submit it before I hit 'Publish' for this post.

#define VALUE 0xF01FUL

short i = 0xF01F;

if( VALUE == i)
    printf("They are equal\r\n");
else
    printf("They are not equal fool\r\n");

Ok hotshot, what prints?

If you said 'They are equal' you are in fact wrong. I hope that feels good. But do you know why you are wrong? Here comes the science.

We have two things being compared here: i is a short int. On most processors/architectures that is a 16-bit signed integer. The #defined value is in hex (obviously) and supposedly the same value as the variable, but has a little 'UL' on the end of it. That suffix says it is to be treated as an unsigned long constant, which is a 32-bit type on most processors/architectures.

That might already give you the first inkling of why these two aren't equal: one is 32-bit but the other is 16. But you veterans out there (if you consider having taken an introductory C class in college being a veteran) will think 'Ah, but those bottom 16 bits are the same, so it shouldn't matter!'. You would be right, but C doesn't follow your rules. When C compares two integers of different types it applies the 'usual arithmetic conversions': both operands get converted to a common type, and with an unsigned long on one side that common type is 32 bits wide here. Basically, C expands our 16-bit variable to 32 bits before deciding whether the two are the same.

'Aha!' you say with a sly smile, 'I was right then! Even if you expand them both to 32 bits they're padded with 0's in the upper unused portions (obviously!) so they come out to the same thing!'

But once again you're incorrect in your assumptions. Why oh why do you assume that they're padded with 0's? Because the only other option is to pad them with one's and that would change the value? You obviously forgot how signed data is represented on a computer! In signed integer types in C the most significant bit is the sign bit - if it's 1 then it's negative. Seems simple enough. So let's follow your line of thinking and expand our (signed) variable to 32 bits:

16-bit value: 0xF01F
Expanded to 32 bits: 0x0000F01F

Wait a darn tootin' second! This was a negative number (the most significant byte was F which means all 1's which means the most significant bit was 1 - which means negative). Now that we've expanded it it's suddenly.... not negative. Well that can't be. It's not the same value then - positive vs. negative. Kind of a big change. So to preserve the value we'd have to expand it and pad with 1's - like this:

16 bit value: 0xF01F
32 bit value: 0xFFFFF01F

Let's check my math with a signed integer calculator that you can find online (via the Google):

0xF01F: -4,065
0xFFFFF01F: -4,065
0x0000F01F: 61,471

Yep... padding with 0's doesn't produce the same result if the integer is defined as signed. So where you see

if(0xF01F == 0xF01F)

C sees:

if(0x0000F01F == 0xFFFFF01F)

And then it looks at you funny for thinking they're the same.

But I don't look at you funny. I'm only so dismissive and rude because I just made this mistake today and the pain is still fresh. Someday we'll laugh about this.

But for now if you mention it I will end you.

Quick Debug Tip!

You've probably been told you should always read the error codes returned from functions - especially if you're working with a new API. It's too easy to assume you have everything working because it compiles, and then it all falls flat when you try to run it. But the question is what do you DO with the status? It's not always clear. There are usually a lot of codes, and often you won't be handling most of them in a release configuration (hopefully you will have learned how to avoid most of them by the time you release). Some are potentially ill-defined (how many times have you seen a code like ERR_UNKNOWN returned from three or four different places in one function?). But what you should always do is read the code and check it - like this:

status = api_function(args);
if(status != ERR_OK)
{
    ERROR();
}

This is good practice - always do this. Even if the if statement is blank, still do it. Just get your hands into the habit of reading returned error codes and checking them.

But you may have noticed I put something called ERROR() in there. That's a placeholder for a real error handling strategy. Just start by defining it as a macro (Note, this may not be correct, it's early and I'm not in my right mind):

#define ERROR()

Now it exists but it doesn't DO anything. This is a fancy way of making the macro expand to nothing at all while still reminding yourself that you have to do something later. If you do this for every returned error code then you will have a hook in place to do something when a proper status code isn't returned.

Now depending on what point in development you're at and what kind of system you're running you have several options. While still debugging I find it easiest to just define the macro to be something like this:

#define ERROR() do { disable_ints(); for(;;){ wdt_pet(); } } while(0)

This will entirely block the program (including interrupts) while simultaneously petting the watchdog timer so it doesn't restart the processor (just in case the watchdog interrupt isn't maskable on your processor). If you don't have a watchdog timer you can ignore that part. This approach works best when you have some sort of debugger. You start the program, wait a second and then pause execution to see if it's stuck in any of these loops. This approach also works in a multi-threaded system assuming that your scheduler runs in a maskable interrupt. When you release this code you should redefine that macro (back to nothing, or to something gentler) so that your system doesn't hang in the field over an ignorable error.

For a more advanced approach that you might actually want to use in a release environment you can potentially define different levels of error such as ERROR and FAULT. FAULTS would obviously be more important and warrant more attention while ERRORs might simply be counted and then ignored. Most of the time dire errors can't be handled locally, so your only option is to report it to the operator (if there is one) and usually his/her only option is to hit reset and hope everything goes back to normal. But at least there's a process!
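As a sketch of that two-level idea (every name below is hypothetical - adapt it to your own codebase):

```c
#include <stdio.h>

/* Hypothetical two-level error scheme. */
static unsigned int error_count = 0;

static void fault_report(const char *file, int line)
{
    /* on a real target: log to nonvolatile memory, then reset or park the CPU */
    printf("FAULT at %s:%d\n", file, line);
}

/* ERRORs are counted and otherwise ignored; FAULTs get reported. The
   do/while(0) wrapper keeps both safe inside an un-braced if statement. */
#define ERROR() do { error_count++; } while(0)
#define FAULT() do { fault_report(__FILE__, __LINE__); } while(0)
```

Then every `if(status != ERR_OK) ERROR();` you sprinkled around during development keeps working, only now the failures are at least being tallied.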

There are other interesting wrinkles in this error handling game. The ARM Cortex-M3 for instance has a fault interrupt that is called whenever something bad happens (try to access memory outside of RAM, divide by 0, wear white after Labor Day, etc). It pushes the state of its registers - stack pointer, program counter and so on - onto the stack and then visits the interrupt. You can use the information it saves to create a report (because sadly most faults that force a visit to the interrupt cannot be recovered from without a reset). The processor you're using may have similar error-handling features. Take a look.

To summarize - always check returned error codes. Even if most of the time you can't DO anything with them you can at least hang the program so you know you have a problem to fix. You might be able to get fancier later but as my favorite super-national paramilitary group used to say - Knowing is half the battle!

(Actually that's crap - they were all-American when I was growing up and I'm too old to change now! Get off my lawn! GO JOE!)

Tuesday, February 15, 2011

What are you doing?

Stop! Right now. What. Are. You. Doing?

Wow, let's stop that, I felt like a telegraph there for a minute (STOP). But the question stands: just what do you think you're doing? I intend this post mainly for people who have stopped developing software/hardware and have taken a break to absorb my acerbic wit. If I stopped you in the middle of enjoying a bowl of ice cream please believe me when I say I did not intend that you should question why you're putting it in your mouth. There's a good reason for that: it's ice cream. Duh.

But to all those coming for some witty banter - fresh from a break from developing the latest microcontroller-inspired widget - let me ask the question again. Just what do you think you're doing?

Probably your answer is going to be 'coding' or something similar. Good. Microcontrollers need code - that's obvious. But let me ask you this: how certain are you that the code you're writing right now is the code that's going to end up inside of that microcontroller when all is said and done? Want my analysis? Maybe 15% certain, I'd say - depending on what point you're at in the design. If you're early on in the design your chances are closer to 1%.

This is not your fault - well, that's a lie. It probably is your fault - but I'm trying not to scare you off. After all, this happens literally to everyone. Everyone. No one goes through a project without one of those moments where they realize they have seriously miscalculated the scope of their project, or its simplicity, or that they forgot about some other major hurdle. Then they end up decimating their code - and not in the literal sense where 90% of it is left afterward. No, more like 1% is left - and that's probably a header.

I typically see this problem because people don't consider whether their code actually works. They read specs, requirements and other documentation and then write a lot of C code. Or Java, or Python or whatever. It's all the same - almost literally because as I said 99% of it will typically be gone by the end of the project (so it makes little difference what language it's in anyway). Sure, it probably compiles - with only a few dozen warnings (It's all small stuff - it doesn't affect how the program works. Some casting will fix it, it's fine!) but there's no telling whether it actually does what is expected of it because you didn't set up benchmarks, tests or sanity checks on anything. Your development process goes something like this:

Delete 'old' code
Wish that you had used source control because that wasn't old code at all
Cod (not a misprint - you're enjoying fish at this point)
Ode (poetry break)
Blank stare
Blank stare
Delete 99% of code

And let's be honest - if at any point in the design process you stop to think about it you're going to think 'Man, integration is going to be a bitch.' There's no project for which that isn't the case - integration is always difficult. But there's a way to make it easier - don't save integration for last.

Why is it that when you work on something for some reason project management always seems to think that it's their job to keep developers apart for as long as possible? It's probably because the Waterfall Model says that development and integration are two separate phases and one (development) is a prerequisite for the other (integration). So no skipping ahead to integration! What will project management do if not enforce the flawed and ultimately unhelpful vanilla implementation of the Waterfall Model on the poor helpless engineers under its command?

Of course not everyone blindly follows the Waterfall Model or eXtreme Programming (seriously, the initialism is XP, not EP, so I capitalized the right letter there) - it certainly isn't the case where I work. No, what you need to do is integrate as soon as possible.

A project typically consists of several independent parts which can be integrated and tested without bothering the other parts. Projects usually also consist of several pieces of hardware, code or technology you've never worked with before. Let's be honest - when was the last time the new chip you used followed its own datasheet exactly? And for that matter, when was the last time two engineers working on opposite sides of a communication channel, attending the same meetings, reading the same documentation and potentially being in the same love triangle decided to implement their portions of a project in a compatible fashion? These aren't signs of immature engineers or bad project management or difficult documentation - it's just life. These things happen. The difference between an inexperienced engineer and an experienced one is basically how jaded he/she is. Optimism is not a useful trait when everything is contractually obligated to go wrong.

Given these problems it makes no sense to write literally everything and then go back and make sure it actually works. Here are several suggestions:

Use unit tests for complex algorithms to catch rookie mistakes such as off-by-one errors (we all make them).

Walk over to the other engineer's office/cubicle and ask copious amounts of questions. Nine times out of ten on a project the arbiters of what actually gets made are the engineers themselves. Just to get everyone on the same page it helps to ask of your fellows 'So what CRC method are we actually using and how does it work?' Then document it.

If two chips have to talk to each other chances are they will hate each other and refuse to speak. Start early with actual chips and force them to get along as soon as possible.

Abstract away interfaces so you don't have to worry about specifics in unrelated parts of code. You don't have to know whether your serial interface actually works if your application's only method of accessing it is a ring buffer.

Verify all assumptions as soon as possible. Chances are you're wrong (life just hates you like that).
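As an example of the abstraction point above, here's a minimal single-producer/single-consumer ring buffer sketch (all the names are mine): the serial ISR calls rb_put(), the application calls rb_get(), and neither needs to know anything about the other.

```c
#include <stdint.h>

#define RB_SIZE 64u   /* power of two so the index wraps with a cheap mask */

typedef struct {
    uint8_t buf[RB_SIZE];
    volatile uint8_t head;   /* next write slot (ISR side) */
    volatile uint8_t tail;   /* next read slot (application side) */
} ringbuf_t;

/* Returns 1 on success, 0 if the buffer is full. */
static int rb_put(ringbuf_t *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & (RB_SIZE - 1u));
    if (next == rb->tail) return 0;          /* full - drop or count it */
    rb->buf[rb->head] = byte;
    rb->head = next;
    return 1;
}

/* Returns 1 with a byte in *byte, or 0 if the buffer is empty. */
static int rb_get(ringbuf_t *rb, uint8_t *byte)
{
    if (rb->tail == rb->head) return 0;      /* empty */
    *byte = rb->buf[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & (RB_SIZE - 1u));
    return 1;
}
```

If the serial hardware changes (or turns out not to work yet), only the ISR side changes - the application keeps draining the same buffer.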

This is hardly an exhaustive list but I think you get the idea. In case you don't get the idea I'll state it as plainly as possible:

Code isn't useful unless it works! Don't sit on questions, concerns and unverified assumptions while popping out 1K lines of code a day. In a month you'll be left with 500 lines of good code and a looming deadline. Be pragmatic, be wary and be prepared.

Note to anyone who is reading this who may actually know me and/or work with me: I am not vindictive, frustrated or lacking empathy for this situation. Believe me I have been stuck in it plenty of times and it was all my fault. But I will be the last person to berate you and the first person to stick up for you in a meeting or directly to your/our boss. Sure, this may be a rookie mistake but we're certainly all allowed them. If we're not then we have no opportunity to become better engineers.

Thursday, February 3, 2011

Terminology Galore

If you're anything like me you hate terminology. You know, those special, magical technical words that people use that you don't know the definition of. Terminology. It'd be great if it weren't so imprecise. You'd think (well, hope) that a word has one definition. This is not even close to the case with regular English (and even worse with British English), but can't one hope for a more direct mapping from technically-minded people? A word should just mean one thing, right?

Take the word 'driver'. On your desktop PC you have drivers - all kinds of them. On an embedded system you have drivers - all kinds of them! But they're not the same kind of drivers - not exactly the same anyway. If you wanted to fill a job writing Windows drivers you might not want to fill it with someone who writes embedded systems drivers or even Linux drivers. So if you saw such a job advertised you'd want to make sure what kind of job you were getting into.

So some recruiter calls you and asks 'Do you have driver writing experience?' And you really want to ask what he means but you know he doesn't know. The best answer you're going to get is "What kind of experience do you have?" I would respond with something like "I've written multiple device drivers for bare-metal microcontrollers and real-time operating systems, is that what you're looking for?" And if you're lucky the notes he writes about your experience will be something like "bear-metal.. big iron? Iron Man? multiple operati.. operation systems. Operation - I loved that game...." And what he tells you is "Absolutely absolutely, I'll get in touch with them and let them know. That's great that's great!"

And that's only the answer I would give now - because now I have some idea what a driver is and how it isn't a board-support package or hardware abstraction layer (I think). But if you're anything like me a month ago you're a bit lost. You see I had developed drivers before (I think), and just didn't know it. So let's define some terms!

I consider a driver (in an embedded system) something that hides registers for you. For instance, here's some code that configures a timer on an MSP430 for creating a servo control pulse:

//Clear timer A config

TACTL = 0x04;              //TACLR = 1

//Source from SMCLK, no input divider
TACTL = (0x02 << 8) |      //TASSEL = 10
        (0x00 << 6);       //ID = 00

TACCTL0 = 0x0000;

//Configure compare and capture unit 1 for output compare mode 3
TACCTL1 = 0x0000;

TACCTL1 = (0x00 << 8) |    //CAP = 0
          (0x03 << 5);     //OUTMOD = 011

//Set CC1 to generate 1.5ms pulse - neutral

//MSP430 user's manual page 11-14

//?? what's going on here?

TACCR1 = 0x0000;           //SET output line HI at 0x0000

TACCR0 = PULSE_1MS;        //RESET output line (LO) at 1.5ms

//Set timer A period to 2ms


//Start timer

TACTL |= (0x02 << 4) |     //MC = 10
         (0x01 << 1);      //TAIE = 1

That's... a lot. A lot of bits. A lot of hex, a lot of OR'ing. A lot of bad formatting. Oh my, I can't handle this.

I'd rather do something like this:

timera_cc_setpw(1500 /*us*/);

See? No bits. That's a driver.
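Here's a guess at what might live inside that function. The register math assumes a 1 MHz timer clock (1 tick = 1 us), and since the real TACCRx registers only exist on an MSP430, shadow variables stand in for them so the sketch compiles anywhere:

```c
/* Shadow variables standing in for the memory-mapped TACCR registers. */
static volatile unsigned short TACCR1_shadow;
static volatile unsigned short TACCR0_shadow;

#define TICKS_PER_US 1u   /* assumed 1 MHz timer clock */

/* Set the servo pulse width in microseconds - all the bit fiddling from
   the ugly block above gets buried in here, out of the caller's sight. */
void timera_cc_setpw(unsigned int pulse_us)
{
    TACCR1_shadow = 0x0000;                                    /* output goes HI at count 0 */
    TACCR0_shadow = (unsigned short)(pulse_us * TICKS_PER_US); /* and LO again here */
}
```

The caller thinks in microseconds; only the driver knows about timer ticks and registers.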

Now, this is an internal peripheral. Those are easy. Well, easier. We know that drivers basically set registers. When it's an internal peripheral then accessing those registers is just as easy as saying 'register = value'. But it's harder if you have (for instance) a peripheral connected over SPI. You still have to set registers, but that requires writing data over SPI - usually commands like 'I WANT TO WRITE TO THIS MEMORY LOCATION. IT'S A CONFIGURATION REGISTER YO' and then the peripheral responds 'YO DAWG THAT'S COOL WHERE THE DATA AT?' and then with another SPI transfer you say 'HERE DA DATA AT!'. So basically you'll have a peripheral driver utilizing the SPI driver. It's a whole bunch of driver-on-driver goodness.
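That layering might look something like this sketch. The names are made up, and spi_transfer() just records its traffic here so the code runs anywhere; on real hardware it would clock bytes out and in:

```c
#include <stdint.h>

#define WRITE_CMD 0x80u   /* hypothetical 'write register' command bit */

/* Stand-in SPI driver: logs outgoing bytes instead of touching hardware. */
static uint8_t spi_log[8];
static int     spi_log_len = 0;

static uint8_t spi_transfer(uint8_t out)
{
    spi_log[spi_log_len++] = out;   /* pretend-shift the byte out */
    return 0x00;                    /* and pretend nothing came back */
}

/* The peripheral driver never touches SPI hardware directly - it only
   speaks through the SPI driver underneath it. */
static void chip_write_reg(uint8_t reg, uint8_t val)
{
    spi_transfer(WRITE_CMD | reg);  /* "I WANT TO WRITE TO THIS REGISTER" */
    spi_transfer(val);              /* "HERE DA DATA AT!" */
}
```

Swap the stand-in spi_transfer() for a real one and chip_write_reg() doesn't change at all - that's the payoff of the layering.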

So what about all the other crap? Like a Board Support Package. Well a BSP... supports a board. For instance the ez430-USB development kit has one LED on it (this is the extent of its on-board peripherals). It's located on P1.0 which is on physical pin 3 which can be accessed on port (yadayadayadayada). You don't want to know all of that - you just want a heartbeat LED. You want it to flash. So you have a smart guy write a function for you - a BOARD SUPPORT FUNCTION!

void bsp_led_toggle( void )
{
    P1OUT ^= 0x01;
}

This is great - I don't have to know where the LED is, I can just say 'toggle that please!' and it gets done. That's board support - it supports the board. Whatever's on the board needs functions so I don't have to know all about it.

And what about the dreaded HARDWARE ABSTRACTION LAYER?!?
The HAL just makes sure you don't actually need to know what your hardware looks like to use it. For example, you can turn a general-purpose I/O port into a TTL serial interface - you just have to be careful with timing and such but it's certainly possible. Now imagine on your Arduino you have the regular UART and a software-based UART. You want the same interface to both: get a byte, put a byte, turn it on, turn it off. So you write up a bunch of functions and then you just say something like:

byte = uart_get(fake_uart);
byte = uart_get(real_uart);

Same interface, same bytes, different underlying hardware. That's what a hardware abstraction layer does.
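One common way to build that (a sketch with made-up names, not the Arduino API) is to give every UART, real or bit-banged, the same table of function pointers:

```c
#include <stdint.h>

typedef struct {
    uint8_t (*get)(void);
    void    (*put)(uint8_t b);
} uart_ops_t;

/* Hardware UART backend - would read/write UART registers on a real part;
   the 0xAA return is a stand-in so the sketch runs anywhere. */
static uint8_t hw_get(void)      { return 0xAA; }
static void    hw_put(uint8_t b) { (void)b; }

/* Bit-banged backend - would sample and wiggle a GPIO pin with careful
   timing; 0x55 is likewise a stand-in. */
static uint8_t sw_get(void)      { return 0x55; }
static void    sw_put(uint8_t b) { (void)b; }

static const uart_ops_t real_uart = { hw_get, hw_put };
static const uart_ops_t fake_uart = { sw_get, sw_put };

/* The application calls these and never learns which backend it got. */
static uint8_t uart_get(const uart_ops_t *u)            { return u->get(); }
static void    uart_put(const uart_ops_t *u, uint8_t b) { u->put(b); }
```

Application code written against uart_get()/uart_put() works unchanged no matter which UART is underneath - which is the whole point of the HAL.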

Hopefully with some of these definitions you'll be a little more educated about what all these weird terms mean. Good luck!