Thursday, March 8, 2012

An In-Depth Look

I had someone ask to share the loops I was talking about in the earlier post for analysis.  Ask and ye shall receive!  Let's start with the loop that performed awfully.

Stare into the belly of the beast...
Here's a quick walkthrough (roughly from left to right):
  1. Load a text file as a spreadsheet with a break on all newline characters - this gives me all of the lines in the file in an array of strings (there's probably an easier way but this was most direct at the time.
  2. The array of strings is fed into the first FOR loop.  The loop is auto-indexed on the array of lines.
  3. Within the FOR loop there's a case statement that handles different lines in the text file differently.  There's header rows and such at the beginning so basically the case statement allows the loop to ignore those and start with the data rows.  
  4. Inside the case statement there's a second FOR loop - this one loops over each field in the line.  It is indexed by the array of integers.  Each row of that integer constant contains offsets and lengths for a string subset call. (More on this in a second)
  5. Inside the second FOR loop there's a case statement that has a case for each field.  Every field except the last is handled the same.
  6. Inside the case statement there's a string subset command fed with values from the constant integer array.  These values provide offset and length for each string subset call except for the last.  The last is the special case mentioned above - there's no length associated with the last call to string subset (not wiring the length causes it to read until the end of the line).
  7. When each field is picked off, that text replaces one of the elements in the constant string array being fed into the loops near the bottom.  The string array has an element for each field in each line, and the integer constant contains the array element within that string array to replace with the current field.
  8. When a full line has been read and turned into an array of strings, that array of strings is appended to the list of lines already read.  
  9. The loops then continue with the next line.
 I hope most of you can already guess what the main problem is so far.  It's the 'this loop inside this loop inside this case inside this loop inside....' yada yada ya. That's bad - for speed anyhow.  At first I tend to try to make my code.. shall we say, robust?  Able to handle even impossible situations?  I also try to make it clear, and clear means verbose or at least, doing much more than you need to to promote uniformity and centralization of data.  This often leads to unreasonable performance through over-engineering.

Let me give you an example -  I'll bet something like this has happened to you.  Let's say you've got a program that communicates with another system.  For what you're programming, you have to change your behavior depending on what commands the other system sends you. So as a hacker, you start hacking.  You read the first command you need to implement:
"Set mode to passive - in this mode your system will only respond to messages sent to it by the external system."  
Ah, easy!  I'll just make a boolean that tells me whether I've received that message!  Then I can read that boolean and determine whether or not to send messages automatically.  Great.  Second message:
"Set mode to active - in this mode your system will automatically send the following messages at these rates:..." 
Perfect - when you receive this just toggle the same boolean from before to set yourself back to automatic mode.  Third message:
"Set response delay to minimum - when you receive this message delay all automatically-sent messages by one frame."  
Okay... so I'll make a boolean for this one too?
"Set response delay to maximum: When you receive this message delay all automatic messages by three frames." 
Okay, okay, no, wait, I should make that last one an integer or something and not a boolean, I'll just go back and change that...

And right about here my bat-sense starts tingling.  "You're going to be doing this forever!  Use a state machine!  Otherwise you'll have a million variables and you won't be able to keep track of what combination of booleans and integers defines what state and what behavior!  STATE MACHINE!"  And a state machine sounds like a good idea - it allows you to be more rote, more design-oriented, more organized and more certain that your system works.  Implemented properly they improve deterministic behavior in complex systems.  So I add a state machine and start defining all the states I'll need: automatic, passive, umm, delay?  No, so far it's just two states but I'm sure there will be more.  Let's read the next requirement:

"....."

Ah, so we're done are we?  So I rigged up a whole state machine architecture when all I really needed was one boolean and one integer.  Sure, a state machine will WORK for this but it's certainly not efficient.  It needs memory, time, special coding practices, etc.  A boolean will do the same job much more efficiently.

And that's my problem - that's why I build loops inside of cases inside of loops (yo dawg, I herd u liek loops....). I think I'm doing something to make my programs deterministic and robust.  I'm actually making my programs robust, deterministic snails.  Like a giant snail-tank that's crushes anything in its path and is always on-time.  I don't think anyone wants that.  Unless that's what you ride to work every... month?

After much profiling of various methods of implementing this functionality, I settled on the below implementation:
So fresh and so clean!

Yeah - look at that.  ONE LOOP!  Seven calls to string subset (running in parallel) and no Build Array blocks!  Loading a 48000 line file used to take an hour.  Now it takes two seconds.  So how did I manage to rid myself of all of that cruft?

  1. First, I stripped off the header rows from the array containing each line.  That removed the need for a case statement inside of the first FOR loop (the one that ignored the header rows).  
  2. Then I removed the inner loop that looped over each field and just put all of the string subset calls in the same place.  This allows them to run in parallel instead of serially (if Labview is as smart at figuring out how to parallelize things as is claimed).
  3. Since I'm not ignoring any rows any more I can replace the Build Array block with a simple auto-indexer on the output of the FOR loop.  This significantly sped things up. 
  4. And finally, I configured the loop for parallel operation.  To do this right click the FOR loop and choose 'Configure loop parallelism' or something like that.  This lets different iterations of the loop run at the same time.  If you think about it it makes sense.  As long as the next loop iteration doesn't depend on the last loop iteration then you can just run them all at the same time.  If I had 48,000 processors then I could have each of them process one line of text and have this job done in microseconds.  When you use it though, pick a small single-digit number for the number of parallel loops. There is overhead associated with managing multiple threads so increasing the number of parallel threads starts to backfire after a while.
The second block diagram is much better programming than the first simply because it's quicker.  It's like I always say: any automated process is better than any manual one because it's consistent AND because once it works you can speed it up later.  There's no way this program would have been useful to me if it took an hour to load one log file.  So don't overcomplicate things!

No comments: