By Shamus Posted Thursday Jan 17, 2013
Filed under: Programming
157 comments
I think most software companies have a set of rules for writing and formatting code. This is often referred to as a “style guide”. This tells the programmers on staff how to make code that will fit together in a coherent system. For example, in C++ this code:
#define IMPORTANT_NUMBER 10

void DoSomething (const int iFirstVariable, char* szSecondVariable )
{
  int iThirdVariable ;
  float iFourthVariable ;

  iThirdVariable = iFirstVariable / IMPORTANT_NUMBER ;
  iFourthVariable = 0.0f;
  if ( iFirstVariable > IMPORTANT_NUMBER )
  {
    DoSomethingElse ( iFirstVariable + 1 ) ;
  }
  if ( strlen ( szSecondVariable ) > IMPORTANT_NUMBER )
  {
    return;
  }
  strcat ( szSecondVariable , "foo" ) ;
  iFourthVariable = (float)iFirstVariable / IMPORTANT_NUMBER ;
  if ( iFourthVariable < IMPORTANT_NUMBER )
  {
    if ( strlen ( szSecondVariable ) > IMPORTANT_NUMBER )
    {
      return;
    }
    else if ( CalculateSomething ( iFirstVariable * iFirstVariable ) > IMPORTANT_NUMBER )
    {
      strcat ( szSecondVariable , "bar" ) ;
    }
  }
  DoFinalThing ( iThirdVariable, iFourthVariable, szSecondVariable ) ;
}
… and this code…
void do_something (int var1, char* var2)
{
  int var3=var1/10;
  float var4=0;
  if (var1 > 10)
    do_something_else (var1+1);
  if (strlen (var2) > 10)
    return;
  strcat (var2, "foo");
  var4=(float)var1/10;
  if (var4 < 10) {
    if (strlen(var2) > 10)
      return;
    else if (calc_something (var1*var1) > 10) {
      strcat (var2, "bar");
    }
  }
  do_final_thing (var3, var4, var2);
}
…are basically the exact same code. (Assuming I didn’t botch the conversion.) They will do the same thing in the same way and will have the same bugs. (They might produce very slightly different executable code, but only because on line 3 of the second example, I assign a value to var3 before declaring var4. Other than this discrepancy, the resulting program should be identical. I think.) The second is written in a very compact style that was popular among traditional C programmers fifteen years ago. The first is more modern and is more common among younger, C++-focused coders. Both of them have drawbacks and shortcomings.
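One way to test the "same code" claim is to compile both versions against the same helpers and compare what they do. The sketch below is only an illustration under stated assumptions: the post never defines DoSomethingElse, CalculateSomething, or DoFinalThing, so the stubs here are hypothetical stand-ins that just record their arguments, and g_log, Record, and Trace are names invented for this example. Both listings are lightly adapted so they compile.

```cpp
#include <cassert>
#include <cstring>
#include <string>

#define IMPORTANT_NUMBER 10

// Hypothetical stand-ins: the post never defines these helpers, so these
// versions only log the arguments they receive.
static std::string g_log;
static void Record(const std::string& entry) { g_log += entry + ";"; }

void DoSomethingElse(int n) { Record("else:" + std::to_string(n)); }
int CalculateSomething(int n) { return n; }
void DoFinalThing(int a, float b, const char* s) {
    Record("final:" + std::to_string(a) + "," + std::to_string(b) + "," + s);
}
// Lower-case aliases so the second listing's names reach the same stubs.
void do_something_else(int n) { DoSomethingElse(n); }
int calc_something(int n) { return CalculateSomething(n); }
void do_final_thing(int a, float b, const char* s) { DoFinalThing(a, b, s); }

// First listing: verbose names, braces around every branch.
void DoSomething(const int iFirstVariable, char* szSecondVariable)
{
    int iThirdVariable;
    float iFourthVariable;

    iThirdVariable = iFirstVariable / IMPORTANT_NUMBER;
    iFourthVariable = 0.0f;
    if (iFirstVariable > IMPORTANT_NUMBER)
    {
        DoSomethingElse(iFirstVariable + 1);
    }
    if (strlen(szSecondVariable) > IMPORTANT_NUMBER)
    {
        return;
    }
    strcat(szSecondVariable, "foo");
    iFourthVariable = (float)iFirstVariable / IMPORTANT_NUMBER;
    if (iFourthVariable < IMPORTANT_NUMBER)
    {
        if (strlen(szSecondVariable) > IMPORTANT_NUMBER)
        {
            return;
        }
        else if (CalculateSomething(iFirstVariable * iFirstVariable) > IMPORTANT_NUMBER)
        {
            strcat(szSecondVariable, "bar");
        }
    }
    DoFinalThing(iThirdVariable, iFourthVariable, szSecondVariable);
}

// Second listing: compact names, literal 10, braces only where needed.
void do_something(int var1, char* var2)
{
    int var3 = var1 / 10;
    float var4 = 0;
    if (var1 > 10)
        do_something_else(var1 + 1);
    if (strlen(var2) > 10)
        return;
    strcat(var2, "foo");
    var4 = (float)var1 / 10;
    if (var4 < 10) {
        if (strlen(var2) > 10)
            return;
        else if (calc_something(var1 * var1) > 10)
            strcat(var2, "bar");
    }
    do_final_thing(var3, var4, var2);
}

// Runs one version on a fresh copy of `text` and returns the helper call
// log followed by the final buffer contents. Assumes `text` fits in 63 chars.
template <typename Fn>
std::string Trace(Fn fn, int first, const char* text)
{
    char buffer[64];
    std::strncpy(buffer, text, sizeof(buffer) - 1);
    buffer[sizeof(buffer) - 1] = '\0';
    g_log.clear();
    fn(first, buffer);
    return g_log + "|" + buffer;
}
```

Comparing Trace(DoSomething, n, s) with Trace(do_something, n, s) for a handful of inputs (a short string, a string longer than ten characters, a small and a large n) exercises each branch. Matching traces only show that the two versions agree on those inputs with these particular stubs. They say nothing about whether the compiler emits identical machine code.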
The thing is, the cost of code isn’t in writing it. It’s in debugging, maintaining, updating, and re-using it. Contrary to popular belief, we coders are not super-geniuses who spend all day solving equations in our massive pulsating distended brains. More often than not, we’re regular people with a very peculiar idea of fun. In fact, I’ve found that it’s the really smart programmers who understand just how limited their mental faculties are and will make allowances for their own limitations. It’s the dum-dums (or more often, just the inexperienced) who assume they will always remember everything in perfect detail later.
It’s pretty common to write code and then forget what it does or how it works. So when you’re writing code you should always be thinking of the other programmer. The dumb programmer. The programmer who doesn’t understand this code and has no idea what’s going on. The odds are very good that this other programmer is going to be a future version of you. That other programmer is going to want to look at this code and understand it as quickly as possible. This will happen many times, every time the code is revisited or even read. (Sometimes the coder will be looking for something else, but will get turned around and find themselves at your code. The sooner they realize this isn’t what they need, the sooner they can move on.)
If it takes five minutes to untangle your code, then it will cost five minutes every time someone needs to look at it. If you can reduce that re-learning time to one minute, then the other programmer (who still might be you!) will be able to work five times faster. Multiply this effect across the entire codebase and you suddenly realize that you have a lot of power over that other programmer. You can make their life easy or you can torment them with obscurity and confusion. (The awesome thing is that if the other programmer is you, then they get what they deserve either way.)
Back when I was paid to write software, our company had a coding practices guide. We were a small company, so the system was largely informal and only covered the basics. The original guide was devised by one guy, who basically wrote most of the original software himself. When the rest of us joined him on the coding team, his own personal style became the company style by default. The guide remained after he departed. His successor didn’t see any reason to change it because consistency is more important than personal preference and nobody wanted to re-write or re-format all that old code. When the job fell to me, I retained the style for the same reason. After several years I got comfortable with it and it influenced the style I use in my personal projects. If you’ve ever downloaded the source to any of the projects I’ve done over the years, you’ve probably noticed bits of anachronistic C formatting in my code. This is why.
In a larger company these guides are more likely created by a group of people, and the question of what is “best” is probably shaped by what sort of software you make. If your software is some sort of complex mathematical system (perhaps a simulation) then your team might be well served by a really sparse style with lots of empty space so it’s easy to follow the flow of all those densely packed variables. Meanwhile, if you’re writing something with lots of branching, looping code broken into small methods (like code that drives a GUI) then a really wide style will make even simple code overflow onto many excess pages. Your coders will end up spending all day looking at whitespace and smacking the Page Down key.
I bring all this up because the Doom 3 source code has been released to the public and someone asked about this analysis of the source. In the process of reading that, I stumbled on the internal coding conventions of id Software (which are publicly available as an Office document) and got to read about how they write code in the house of Carmack.
I’m going to follow up with another post where I talk about these standards, what I think about them, and what makes these fussy little details so important.
Shamus Young is a programmer, an author, and nearly a composer. He works on this site full time. If you'd like to support him, you can do so via Patreon or PayPal.