- Introduction: What is a loop?
- Mistake 1: Redundant loops
- Mistake 2: Imprecise loops
- Summary
Mistake 1: Redundant loops
Can you spot a redundant loop? Are there any in the following model?
I'm guessing that you'll take one look at Model 7.3 and recoil in horror at the pointless excess. But which associations should be omitted? And why?
To figure this out, let's start with a bare-bones model lacking any loops. Then we'll see if we need to add anything.
There, now we're de-looped. Model 7.4 models the physical containment hierarchy from a File Cabinet on down to a Document. Given an instance of Document, not only do you know what Folder it's in, you also know what Drawer and what File Cabinet it's in. All you have to do is navigate the associations. Given an instance of Drawer you can find out what Documents it contains. Just traverse R2, get all of the Folders, then for each Folder, traverse R3 and get all of the Documents in the Folder. So as far as containment goes, no loops are necessary. The associations R4, R5 and R6 in Model 7.3 are redundant and should be omitted.
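To make the navigation concrete, here is a minimal Python sketch of the loop-free hierarchy. It is only an illustration, not generated code: the class layout, the attribute names, the helper functions cabinet_of and documents_in, and the sample instances are all assumptions of mine. The comments mark which association (R1, R2 or R3) each link stands for.

```python
class FileCabinet:
    def __init__(self, name):
        self.name = name
        self.drawers = []            # R1, seen from the File Cabinet


class Drawer:
    def __init__(self, name, cabinet):
        self.name = name
        self.cabinet = cabinet       # R1, seen from the Drawer
        self.folders = []            # R2, seen from the Drawer
        cabinet.drawers.append(self)


class Folder:
    def __init__(self, name, drawer):
        self.name = name
        self.drawer = drawer         # R2, seen from the Folder
        self.documents = []          # R3, seen from the Folder
        drawer.folders.append(self)


class Document:
    def __init__(self, title, folder):
        self.title = title
        self.folder = folder         # R3, seen from the Document
        folder.documents.append(self)


def cabinet_of(document):
    """Document -> File Cabinet by traversing R3, then R2, then R1."""
    return document.folder.drawer.cabinet


def documents_in(drawer):
    """Drawer -> Documents by traversing R2, then R3 for each Folder."""
    return [doc for folder in drawer.folders for doc in folder.documents]


# Hypothetical sample instances
cabinet = FileCabinet("C1")
top = Drawer("Top", cabinet)
bottom = Drawer("Bottom", cabinet)
invoices = Folder("Invoices", top)
memos = Folder("Memos", top)
report = Document("Q1 report", invoices)
memo = Document("Staff memo", memos)

print(cabinet_of(report).name)                 # C1
print([d.title for d in documents_in(top)])    # ['Q1 report', 'Staff memo']
```

Notice that neither helper needs a direct Document to File Cabinet or Drawer to Document link. Everything is reachable through R1, R2 and R3.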
Considering speed of access
What about speed and convenience of access? Our system may need to frequently ask the question, "What File Cabinet contains this Document?" Rather than answering this question by traversing R3->R2->R1 every time, couldn't we put R5 back in the model so we can take a shortcut?
We could. We would then have to write procedures to update two links every time we create or refile a Document. If you only did this in one place, it wouldn't be so bad. But you know what's going to happen once you get started down this road, don't you? Before you know it, you're second-guessing every access direction and modeling loops and redundant procedures everywhere. I've seen it done on real projects involving hundreds of classes. It isn't pretty.
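Here is a rough Python sketch of what that double bookkeeping looks like. It assumes R5 links a Document directly to its File Cabinet, and the procedure names refile, refile_forgetting_r5 and is_consistent are hypothetical, invented for this illustration. The second procedure shows the kind of slip that becomes possible once the same fact is stored twice.

```python
class FileCabinet:
    def __init__(self, name):
        self.name = name


class Drawer:
    def __init__(self, cabinet):
        self.cabinet = cabinet                 # R1


class Folder:
    def __init__(self, drawer):
        self.drawer = drawer                   # R2


class Document:
    def __init__(self, title, folder):
        self.title = title
        self.folder = folder                   # R3
        self.cabinet = folder.drawer.cabinet   # R5, the redundant shortcut


def refile(document, new_folder):
    """Move a Document. With R5 in the model, two links must change together."""
    document.folder = new_folder                   # update R3
    document.cabinet = new_folder.drawer.cabinet   # update R5


def refile_forgetting_r5(document, new_folder):
    """A hypothetical buggy procedure that updates R3 but forgets R5."""
    document.folder = new_folder


def is_consistent(document):
    """True when the R5 shortcut agrees with the R3->R2->R1 chain."""
    return document.cabinet is document.folder.drawer.cabinet


# Hypothetical sample instances
cabinet_a = FileCabinet("A")
cabinet_b = FileCabinet("B")
folder_a = Folder(Drawer(cabinet_a))
folder_b = Folder(Drawer(cabinet_b))
contract = Document("Contract", folder_a)

refile(contract, folder_b)
print(is_consistent(contract))    # True

refile_forgetting_r5(contract, folder_a)
print(is_consistent(contract))    # False: R5 still points at cabinet B
```

Without R5 there is only one link to update, so there is nothing to get out of step.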
Don't optimize for fast access in the class model
Of course we do want to optimize everywhere we can for fast data access, but the class model is the wrong place to do it. The model compiler should be smart enough to optimize frequently traversed chains of data navigation. (The analyst may have to color the high-traffic association chains to make this possible.) When the model compiler generates data structure code, it can generate extra pointers, hash tables and any other mechanisms it needs to optimize data access. The model compiler can scan the class and procedure models and any added coloring to choose the correct optimization mechanisms. In fact, the analyst's well-intentioned class model optimizations, such as redundant associations, can result in fatter, slower generated code. I've seen this happen several times and have come to the conclusion that the best optimization an analyst can offer is a minimal formalization of the essential requirements. The less gunk the model compiler has to sift through, the better.
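As a loose illustration of the idea, the Python sketch below derives a lookup table from the minimal R1, R2 and R3 links instead of adding R5 to the model. This is not how any particular model compiler works. The function build_cabinet_index and the sample instances are assumptions of mine, standing in for whatever mechanism the generated code might use.

```python
class FileCabinet:
    def __init__(self, name):
        self.name = name
        self.drawers = []                # R1


class Drawer:
    def __init__(self, cabinet):
        self.folders = []                # R2
        cabinet.drawers.append(self)


class Folder:
    def __init__(self, drawer):
        self.documents = []              # R3
        drawer.folders.append(self)


class Document:
    def __init__(self, title, folder):
        self.title = title
        folder.documents.append(self)


def build_cabinet_index(cabinets):
    """Derive a Document -> File Cabinet table by walking R1, R2 and R3 once."""
    index = {}
    for cabinet in cabinets:
        for drawer in cabinet.drawers:
            for folder in drawer.folders:
                for document in folder.documents:
                    index[document] = cabinet
    return index


# Hypothetical sample instances
cabinet_a = FileCabinet("A")
cabinet_b = FileCabinet("B")
contract = Document("Contract", Folder(Drawer(cabinet_a)))
invoice = Document("Invoice", Folder(Drawer(cabinet_b)))

index = build_cabinet_index([cabinet_a, cabinet_b])
print(index[contract].name)    # A
print(index[invoice].name)     # B
```

The table lives entirely on the implementation side. The class model still states each containment fact exactly once, and keeping the table current is a job for the generated code rather than for the analyst's procedures.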
A nonredundant loop
Now let's extend our file cabinet application requirements a little so that we do need a loop. Consider the following model:
Do we really need association R8? Given a Document, couldn't we just locate the publishing Department via R3->R2->R1->R7? No, we can't. This chain of navigation would lead us to the Department that OWNS the File Cabinet that contains our Document. But this is not necessarily the Department that PUBLISHED our Document.
The R8 association is necessary because it expresses a fact not already expressed. Since PUBLISH means something different than CONTAIN, R8 is not redundant. This is just another reason why it is so important to always name both perspectives on every association!
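A small Python sketch shows the two answers coming apart. It assumes R7 links a File Cabinet to the Department that owns it and R8 links a Document to the Department that published it. The department names, the helper owning_department and the sample instances are hypothetical.

```python
class Department:
    def __init__(self, name):
        self.name = name


class FileCabinet:
    def __init__(self, owner):
        self.owner = owner               # R7: Department OWNS File Cabinet


class Drawer:
    def __init__(self, cabinet):
        self.cabinet = cabinet           # R1


class Folder:
    def __init__(self, drawer):
        self.drawer = drawer             # R2


class Document:
    def __init__(self, title, folder, publisher):
        self.title = title
        self.folder = folder             # R3
        self.publisher = publisher       # R8: Department PUBLISHED Document


def owning_department(document):
    """Follow R3->R2->R1->R7 to the Department that owns the cabinet."""
    return document.folder.drawer.cabinet.owner


# Hypothetical sample instances
legal = Department("Legal")
marketing = Department("Marketing")
cabinet = FileCabinet(owner=legal)
folder = Folder(Drawer(cabinet))
brochure = Document("Brochure", folder, publisher=marketing)

print(owning_department(brochure).name)    # Legal
print(brochure.publisher.name)             # Marketing
```

The containment chain can only ever answer the ownership question. The publishing fact has to be recorded on its own link, which is exactly what R8 does.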