Lost in Translation

As a programmer, if you’ve ever been tasked with migrating a system from one implementation to another, then you have undoubtedly run aground on technical issues: missing or underperforming data structures, inefficient algorithms, or some combination of these and any number of other problems. It doesn’t matter whether you’re merely going from one language to another or moving across platforms; technical conundrums are inevitable.

This is not to say, however, that they are insurmountable. Often, in fact, when we stop trying to over-engineer our system and evaluate the situation as it is, we can make optimizations beyond our expectations. As a simple “for instance,” suppose the determinant for a conditional is initially a free-text entry. Under this premise, evaluating the conditional can take as little as a comparison of the first byte of the string or as much as a comparison of its full text; thus, for the sake of computation time, memory, and most of all maintainability, free text should be avoided at almost any cost. Compare this with replacing the free text with a numerically indexed list of choices: evaluating the conditional drops to a single fixed-cost integer comparison. (Even better, integer comparison in most implementations typically compiles down to something very close to a single machine instruction.)
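To make the contrast concrete, here is a minimal PHP sketch. The status labels, the constant names, and both function names are hypothetical, invented purely for illustration; they are not taken from any real system.

```php
<?php
// Hypothetical choice list replacing a free-text "status" field.
const STATUS_PENDING  = 0;
const STATUS_APPROVED = 1;
const STATUS_REJECTED = 2;

// Before: the conditional depends on free text, so it has to normalize
// the input and then compare it byte by byte.
function isApprovedFreeText(string $status): bool
{
    return strcasecmp(trim($status), 'approved') === 0;
}

// After: the conditional depends on an index into a fixed list, so it is
// a single integer comparison.
function isApprovedIndexed(int $status): bool
{
    return $status === STATUS_APPROVED;
}
```

The free-text version also has to decide what to do about stray whitespace and capitalization, which is exactly the sort of maintenance burden the indexed version avoids.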

But, inevitably, there are cases in which solutions such as these are not so prevalent or obvious. For instance, when implementing join-context SQL conditionals in PHP, it can take a tremendous amount of derived and out-of-band knowledge to perform the translation. Combine this with already complex logic (nested XORs, negated string comparisons, and cross-table lookups), and the code can get slow and ugly very fast. So what do you do as an engineer with an email, the SQL, and a pat on the back? Get a game plan.
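One example of the derived knowledge involved, offered as a sketch rather than as the translation discussed here: in SQL, a negated comparison against NULL evaluates to UNKNOWN and the WHERE clause drops the row, whereas PHP happily reports that null is not equal to a string. The row layout, the column names, and both function names below are assumptions made up for illustration.

```php
<?php
// Hypothetical rows as they might arrive from a LEFT JOIN of customers to
// orders; 'order_status' is null when a customer has no matching order.
$rows = [
    ['customer' => 'a', 'order_status' => 'shipped'],
    ['customer' => 'b', 'order_status' => 'cancelled'],
    ['customer' => 'c', 'order_status' => null],
];

// Naive translation of: WHERE o.status <> 'cancelled'
// PHP evaluates null !== 'cancelled' as true, so customer 'c' slips through.
function naiveFilter(array $rows): array
{
    return array_values(array_filter($rows, function (array $row): bool {
        return $row['order_status'] !== 'cancelled';
    }));
}

// Closer translation: SQL would drop the row with the NULL status, so the
// null case has to be excluded explicitly.
function sqlLikeFilter(array $rows): array
{
    return array_values(array_filter($rows, function (array $row): bool {
        return $row['order_status'] !== null
            && $row['order_status'] !== 'cancelled';
    }));
}
```

Nothing in the PHP syntax hints at the difference; it only surfaces if you know how the join and the comparison behave on the SQL side.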

It’s more important to pencil in a strategy for defining your criteria than to go off of pre-existing assumptions. For one, repeated calculations in the original may be consolidated into a single conditional whose value is stored and referenced later; the processing efficiency alone is worth the change when scaling up to enterprise-level datasets. Additionally, as I stated above, changing the data types behind pre-existing conditionals can be a great boon as well: comparing two integers is generally cheaper than calling strcmp.
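A rough sketch of the consolidation idea follows. The record fields, the return labels, and the function name are hypothetical; the point is only that the shared condition is evaluated once and its stored result is reused.

```php
<?php
// Hypothetical record shape; the field names are illustrative only.
function classify(array $record): string
{
    // Evaluate the shared condition once and keep the result, instead of
    // repeating the same comparisons in every branch below.
    $isActiveMember = $record['status'] === 1 && $record['balance'] >= 0;

    if ($isActiveMember && $record['region'] === 'EU') {
        return 'eu-active';
    }
    if ($isActiveMember && $record['region'] === 'US') {
        return 'us-active';
    }
    return 'other';
}
```

Naming the stored value also documents what the repeated comparisons meant, which helps whoever inherits the translation.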

Once the parameters are defined, we zoom out to observe the algorithm. Bearing in mind that the practice degrades maintainability, nesting some repeated conditionals can save processing power down the line. But don’t get nesting-happy, especially with code you’re simply translating and didn’t conceive yourself; you’ll forget why two conditionals had been nested or why their order mattered so much. This is where your formal logic class enters the picture. Class, what is {A, B, C, D} & {A, B, C}? Of course it’s {A, B, C}; and as long as {A, B, C, D} is every value the variable can take, it’s also ¬{D}. The optimization, however slight, of changing

if($var == A OR $var == B OR $var == C) { something_true(); }

to

if($var != D) { something_true(); }

may not increase performance by leaps and bounds, but if you’re deep in the code anyway, why wouldn’t you shave off the time?
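The two forms above are interchangeable only when $var can never hold anything other than A, B, C, or D. The sketch below checks that assumption; the integer values assigned to the constants and the three function names are hypothetical.

```php
<?php
// Hypothetical closed set of values for $var.
const A = 1;
const B = 2;
const C = 3;
const D = 4;

// The original, enumerated form of the conditional.
function matchesByEnumeration(int $var): bool
{
    return $var == A || $var == B || $var == C;
}

// The shortened, complement form of the conditional.
function matchesByComplement(int $var): bool
{
    return $var != D;
}

// Returns true when both forms agree for every value in $domain.
function formsAgree(array $domain): bool
{
    foreach ($domain as $value) {
        if (matchesByEnumeration($value) !== matchesByComplement($value)) {
            return false;
        }
    }
    return true;
}
```

Calling formsAgree([A, B, C, D]) reports agreement, while adding a fifth value such as 5 to the domain makes the two forms diverge, so the shortcut is worth taking only when the set of values is genuinely closed.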

Now, after the code is written and tested, there is much rejoicing. The only thing we now have to contend with is whether the translation will need to change. And while we may not (and hopefully will not) go back to square one with our evaluation, we should at least take the time to consider where we may have gone wrong in our previous implementations and where we can continue to go right.

