This is a continuation of the blog post 12 Principles of Good Software Design.
12. An idiom should allow a new developer to understand the code.
Up until now, all of the idioms (the ways we express ourselves in software) have favored clarity and simplicity over complexity and over-engineered solutions which just make our lives complicated. We’ve focused on idioms which keep functionality together, which make a particular module obvious, and which break functionality into well-defined components.
But it’s not enough to simply use patterns which emphasize simplicity. We also need to build our code so that what we’re attempting to do is understandable to another developer who takes over responsibility for it, by using a uniform approach which provides consistency and clarity across the overall project.
My own approach is to consider the “big picture.” For most mobile applications which talk to a back-end API, I consider two things: how does data flow through the system, and how does the user interact with the system. For a mobile application with a back end service, the flow is rather simple:
We store our back-end data in a database. We use a collection of database I/O classes (such as JDBC) to obtain and store data in the database. Business logic verifies that the data is stored in the back end correctly and maintains the integrity of the data. Business logic also handles complex operations such as logging in, database access and the like. The servlet logic handles API requests made by the model code on the mobile app. Then a system of views and controllers helps present the data.
Data flows up, from the database to the views. Changes in the data flow down, from the views to the database.
Most mobile applications we build follow the same flow: data moves up from the database to the views, and changes move down from the views to the database.
The flow of the individual screens within an application (either an iOS or Android application) tends to be determined by the specific domain of the application. But even there some generalities can be made: a login view controller flows to a “forgot password” screen and a “sign up” screen. (One of the advantages of modern iOS development is storyboards which, when used properly, allow the relationships between the screens of a section of the application to be maintained graphically.)
There are other types of applications we could build which have a different overall structure. For example, a compiler or parser tool which takes an input file and produces an output file would generally consist of code which manages reading the input files, a lexical tokenizer which converts a character stream into tokens, a parser which converts those tokens into a parse tree, and optimization and conversion steps which manipulate the parse tree as our language requires. A file writer then converts the parse tree into the desired output file. Data flows from the top down, converted at each step according to the rules of our language.
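To make that top-down flow concrete, here is a minimal sketch in C of such a pipeline for a toy language of my own invention: integers and single-letter variables joined by `+`. Every name in it (`lex_next`, `parse_expr`, `fold`, `emit`, `compile`) is hypothetical, there is no real error handling, and the "output file" is just a string of postfix tokens. It is only meant to show one stage handing its result to the next, not how a production compiler is built.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stage 1: tokenizer. Turns the character stream into tokens. */
enum tok_kind { TOK_NUM, TOK_IDENT, TOK_PLUS, TOK_END };
struct lexer { const char *p; enum tok_kind kind; long value; };

static void lex_next(struct lexer *lx)
{
    while (isspace((unsigned char)*lx->p))
        lx->p++;
    unsigned char c = (unsigned char)*lx->p;
    if (isdigit(c)) {
        char *end;
        lx->kind = TOK_NUM;
        lx->value = strtol(lx->p, &end, 10);
        lx->p = end;
        return;
    }
    lx->kind = isalpha(c) ? TOK_IDENT : (c == '+') ? TOK_PLUS : TOK_END;
    lx->value = c;                  /* only meaningful for TOK_IDENT */
    if (lx->kind != TOK_END)        /* TOK_END: end of input or unsupported char */
        lx->p++;
}

/* Stage 2: parser. Builds a parse tree. op is 'n' for a number, 'v' for a
 * variable, or '+'. A missing operand is read as the number 0 (no errors). */
struct node { char op; long value; struct node *lhs, *rhs; };

static struct node *new_node(char op, long value, struct node *l, struct node *r)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->op = op; n->value = value; n->lhs = l; n->rhs = r;
    return n;
}

static struct node *parse_operand(struct lexer *lx)
{
    char op = (lx->kind == TOK_IDENT) ? 'v' : 'n';
    long value = (lx->kind == TOK_NUM || lx->kind == TOK_IDENT) ? lx->value : 0;
    lex_next(lx);
    return new_node(op, value, NULL, NULL);
}

static struct node *parse_expr(struct lexer *lx)
{
    struct node *n = parse_operand(lx);
    while (lx->kind == TOK_PLUS) {
        lex_next(lx);
        n = new_node('+', 0, n, parse_operand(lx));
    }
    return n;
}

/* Stage 3: optimizer. Folds additions whose operands are both constants. */
static void fold(struct node *n)
{
    if (n->op != '+')
        return;
    fold(n->lhs);
    fold(n->rhs);
    if (n->lhs->op == 'n' && n->rhs->op == 'n') {
        n->value = n->lhs->value + n->rhs->value;
        free(n->lhs);
        free(n->rhs);
        n->op = 'n'; n->lhs = n->rhs = NULL;
    }
}

/* Stage 4: writer. Appends postfix output to out and frees the tree. */
static void emit(struct node *n, char *out, size_t cap)
{
    if (n->op == '+') {
        emit(n->lhs, out, cap);
        emit(n->rhs, out, cap);
    }
    size_t len = strlen(out);
    if (n->op == 'n')
        snprintf(out + len, cap - len, "%ld ", n->value);
    else
        snprintf(out + len, cap - len, "%c ", n->op == 'v' ? (char)n->value : '+');
    free(n);
}

/* Driver: data flows from the top down through the four stages. */
static void compile(const char *src, char *out, size_t cap)
{
    struct lexer lx = { src, TOK_END, 0 };
    lex_next(&lx);
    struct node *tree = parse_expr(&lx);
    fold(tree);
    out[0] = '\0';                  /* caller must pass cap >= 1 */
    emit(tree, out, cap);
}
```

Calling `compile("2 + 3 + x", out, sizeof out)` leaves `"5 x + "` in `out`: the tokenizer and parser build the tree, the optimizer folds `2 + 3`, and the writer emits what is left. A developer who knows the stages sit in this order knows where a new optimization or a new output format belongs.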
Not all applications have the same structure, because not all applications are mobile applications talking to a back end server architecture.
But I do personally consider how data flows: when it is created, when it is manipulated, when it is presented and when it is saved. I may consider the format of the data, how components of the data interact, what the business rules governing the data are, and what the data represents.
And that helps us with the structure of the application: how the model component works (if we’re writing an MVC style application), how the data is loaded, how the data moves–and that then drives the overall design of the rest of the application.
Of course how we structure our application also depends on a series of best practices–and it is those best practices which allow us to decide where to put the components of our application.
For example, one best practice is that business logic which constrains the integrity of data in a database needs to reside as close to the database as possible. That way, violations of business logic don’t leak in through bugs or security loopholes elsewhere in the application. This is why I’m a huge believer in putting the business logic of a client/server application on the server side: a hacker who hacks the API of a client/server application cannot do more to the back end than a user of the application which talks to the back end.
Once you realize that the business logic belongs in a particular place, it is easy to use the other idioms and familiar design patterns to create code that another programmer can follow. The logic lands as a chunk of business logic between the code which parses incoming HTTP requests and the database access code which gets and saves data to the database.
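As a rough illustration of that layering, here is a sketch in C (chosen only to match the other snippets in this post; a real back end of the kind described above would more likely be servlets over JDBC). The account table, the request body format, and every function name are invented for the example. What matters is the shape: the request handler only parses, the database layer only stores, and the integrity rule lives in the one function that sits between them.

```c
#include <stddef.h>
#include <stdio.h>

/* Database access layer: an in-memory table stands in for real storage.
 * All names here are hypothetical. */
struct account { int id; long balance_cents; };

static struct account accounts[] = { { 1, 5000 }, { 2, 0 } };

static struct account *db_find_account(int id)
{
    for (size_t i = 0; i < sizeof accounts / sizeof accounts[0]; i++)
        if (accounts[i].id == id)
            return &accounts[i];
    return NULL;
}

/* Business logic layer: the only code allowed to change a balance, so
 * the integrity rules are enforced no matter who sends the request. */
enum withdraw_result {
    WITHDRAW_OK, WITHDRAW_NO_ACCOUNT, WITHDRAW_BAD_AMOUNT, WITHDRAW_INSUFFICIENT_FUNDS
};

static enum withdraw_result withdraw(int account_id, long amount_cents)
{
    struct account *a = db_find_account(account_id);
    if (a == NULL)
        return WITHDRAW_NO_ACCOUNT;
    if (amount_cents <= 0)
        return WITHDRAW_BAD_AMOUNT;
    if (amount_cents > a->balance_cents)
        return WITHDRAW_INSUFFICIENT_FUNDS;
    a->balance_cents -= amount_cents;
    return WITHDRAW_OK;
}

/* Request layer: parses a body of the form "<account id> <amount in cents>"
 * (an invented format) and maps the outcome to an HTTP status code. */
static int handle_withdraw_request(const char *body)
{
    int id;
    long amount;
    if (sscanf(body, "%d %ld", &id, &amount) != 2)
        return 400;                        /* malformed request */
    switch (withdraw(id, amount)) {
    case WITHDRAW_OK:         return 200;
    case WITHDRAW_NO_ACCOUNT: return 404;
    default:                  return 422;  /* a business rule rejected it */
    }
}
```

A client that skips the app and sends `"1 999999"` straight to the API still gets a 422 and the balance is untouched, because the check does not depend on anything the client does.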
Besides considering the global structure of the application and being consistent in our use of that structure as we assemble our applications, we can also make use of notes, both in the form of comments and in separate documentation, which give the overall structural ‘gestalt’ of the application.
There are a number of books on the scene which eschew documentation, whether in the form of comments or in the form of separate documents. For example, in chapter 4 of Clean Code: A Handbook of Agile Software Craftsmanship, we see the remark:
Every time you express yourself in code, you should pat yourself on the back. Every time you write a comment, you should grimace and feel the failure of your ability of expression.
And certainly there is some value to this idea, at least when it comes to writing most code. After all,
if ((row.record == VALID_SYMBOL) && (row.name) && !(row.error)) {
would be better written:
if (IsRowValid(row)) {
with the logic in a self-contained function call. (This also has the advantage of concentrating the logic for validating a row in one location rather than allowing that logic to be scattered everywhere.)
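A minimal sketch of what that function might look like follows. The snippet above only shows the field names `record`, `name` and `error`, so the field types and the value of `VALID_SYMBOL` here are my assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record tag and row layout: only the field names come
 * from the snippet above, the types are assumed. */
enum { VALID_SYMBOL = 1 };

struct row {
    int record;         /* record type tag */
    const char *name;   /* NULL when the row has no name */
    int error;          /* nonzero when an error was recorded */
};

/* A row is valid when it is tagged as a symbol, has a name, and
 * carries no error. */
static bool IsRowValid(struct row row)
{
    return row.record == VALID_SYMBOL
        && row.name != NULL
        && row.error == 0;
}
```

If the definition of a valid row ever changes, this is now the only place that has to change.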
However, there are times when it is necessary to use a comment and to include documentation so that another developer can understand roughly where all the pieces are.
Comments are also necessary when writing complex code which implements a particular algorithm, so that a new developer understands what is being implemented, even if the comment is simply a reference to the textbook or paper from which the algorithm is pulled. You cannot count on every developer looking at your code having a Ph.D. in Computer Science.
For example, how self-evident is the following code?
struct node *grandparent(struct node *n)
{
    if ((n != NULL) && (n->parent != NULL))
        return n->parent->parent;
    else
        return NULL;
}

struct node *uncle(struct node *n)
{
    struct node *g = grandparent(n);
    if (g == NULL)
        return NULL;
    if (n->parent == g->left)
        return g->right;
    else
        return g->left;
}

void insert_case1(struct node *n)
{
    if (n->parent == NULL)
        n->color = BLACK;
    else
        insert_case2(n);
}

void insert_case2(struct node *n)
{
    if (n->parent->color == BLACK)
        return;
    else
        insert_case3(n);
}

...
But if we were to include the following comment at the top:
/*  insert.c
 *
 *      Red/Black Tree Insert Code. (From https://en.wikipedia.org/wiki/Red–black_tree#Insertion)
 */
now everything we’re doing is a lot more obvious.
The same can be said about the larger architecture used by your application. Just a simple one-page comment stating the rough layout of all the components can go a long way toward helping a developer new to the team understand where all the code lies.
Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones. – U.S. Secretary of Defense Donald Rumsfeld
Despite being ridiculed for his statement, he does allude to a fundamental principle of security, one I first encountered when studying for the CISSP certification.
But I contend there is a fourth combination: unknown knowns–things we are not aware that we know. And one of the biggest problems new developers to a project have is that they don’t even know where to start–and the other senior developers on the team don’t realize the knowledge they haven’t passed forward.
And until that knowledge is passed on, a new developer, not wanting to disturb the structure of the existing application, tends toward the Lava Flow antipattern.
In many ways, if we are writing code for a team, we need to think about how that code can survive our own authorship. To do this, we need to consider how we structure our code so that our intention is clear. But more importantly, we need to give new developers clues about how they are expected to extend and modify our code, where fixes should be made, and where new functionality should be added, in order to prevent various antipatterns from forming and to reduce technical debt.