3.7 Tips for Reducing Complexity
"Complexity is the enemy, and our aim is to kill it."
-Jan Baan
One of Perl's greatest strengths is its expressiveness and extreme conciseness. Complexity is the bane of software development: when a program grows beyond a certain size, it becomes much harder to test, maintain, read, or extend. Unfortunately, today's problems are large enough that this is true of every program we need. Anything you can do to minimize the complexity of your program will pay handsome dividends.
The complexity of a program is a function of several factors:
The number of distinct lexical tokens
The number of characters
The number of branches in which control can pass to a different point
The number of distinct program objects in scope at any time
Whenever a language allows you to change some code to reduce any of these factors, you reduce complexity.
3.7.1 Lose the Temporary Variables
The poster child for complexity is the temporary variable. Any time a language intrudes between you and the solution you visualize, it diminishes your ability to implement the solution. All languages do this to some degree; Perl less than most. In most languages, you swap two variables a and b with the following algorithm:
Declare temp to be of the same type as a and b
temp = a;
a = b;
b = temp;
But most languages are not Perl:
($b, $a) = ($a, $b);
Iterating over an array usually requires an index variable and a count of how many things are currently stored in the array:
int i;
for (i = 0; i < count_lines; i++) {
    strcat(line[i], suffix);
}
Whereas in Perl, you have the foreach construct borrowed from the shell:
foreach my $line (@lines) { $line .= $suffix }
And if you feel put out by having to type foreach instead of just for, you're in luck, because they're synonyms for each other; so just type for if you want (Perl can tell which one you mean).
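One detail is worth seeing in action: the loop variable is an alias for each element in turn, which is why the one-liner above changes @lines itself rather than a copy. Here is a minimal sketch; the contents of @lines and $suffix are made-up sample values.

```perl
use strict;
use warnings;

# Sample data; the values are invented for illustration.
my @lines  = ('alpha', 'beta', 'gamma');
my $suffix = ';';

# $line is an alias for each element in turn, so appending to it
# modifies the array in place. No index variable, no element count.
for my $line (@lines) {
    $line .= $suffix;
}

print "$_\n" for @lines;
```

Spelling the loop keyword as foreach instead of for gives exactly the same behavior.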
Because functions can return lists, you no longer need to build special structures just to return multivalued data. Because Perl does reference-counting garbage collection, you can return variables from the subroutine in which they are created and know that they won't be trampled on, yet their storage will be released later when they're no longer in use. And because Perl doesn't have strong typing of scalars, you can fill a hierarchical data structure with heterogeneous values without having to construct a union datatype and some kind of type descriptor.
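The three points in this paragraph can be sketched in a few lines. The subroutine names min_max and make_record, and the data they handle, are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Return two values as a list; no struct or out-parameters needed.
# min_max is a hypothetical helper written for this sketch.
sub min_max {
    my @sorted = sort { $a <=> $b } @_;
    return ($sorted[0], $sorted[-1]);
}

# Return a reference to a lexical created inside the subroutine.
# The hash outlives the call because the caller still holds a reference;
# its storage is released once the last reference goes away.
sub make_record {
    my %record = (
        name  => 'example',     # a string
        sizes => [ 3, 1, 2 ],   # an array reference
        ratio => 0.5,           # a number
    );
    return \%record;
}

my ($min, $max) = min_max(7, 3, 9, 4);
my $record      = make_record();
my ($smallest)  = min_max(@{ $record->{sizes} });

print "min=$min max=$max smallest=$smallest\n";
```

Note that %record holds a string, an array reference, and a number side by side, with no union type or type tag in sight.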
Because built-in functions take lists of arguments where it makes sense to do that, you can pass them the results of other functions without having to construct an iterative loop:
unlink grep /~$/, readdir DIR;
And the map function lets you form a new list from an old one with no unnecessary temporary variables:
open PASSWD, '/etc/passwd' or die "passwd: $!\n";
my @usernames = map /^([^:]+)/, <PASSWD>;
close PASSWD;
Because Perl's arrays grow and shrink automatically and there are simple operators for inserting, modifying, or deleting array elements, you don't need to build linked lists and worry if you've got the traversal termination conditions right. And because Perl has the hash data type, you can quickly locate a particular chunk of information by key or find out whether a member of a set exists.
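As a rough illustration of those array operators and of a hash used as a set, consider the following sketch. The variable names @queue and %seen and the sample values are assumptions made for the example.

```perl
use strict;
use warnings;

my @queue = ('b', 'c');

push    @queue, 'd';          # append:        b c d
unshift @queue, 'a';          # prepend:       a b c d
splice  @queue, 2, 0, 'x';    # insert at 2:   a b x c d
splice  @queue, 1, 1;         # delete at 1:   a x c d
my $last = pop @queue;        # remove last:   a x c   ($last is 'd')

# A hash used as a set: keys are the members, values are unimportant.
my %seen = map { $_ => 1 } @queue;

print "queue: @queue\n";
print "has x\n"   if exists $seen{x};
print "lacks b\n" unless exists $seen{b};
```

None of these steps requires allocating nodes, relinking pointers, or tracking the array's length by hand.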
3.7.2 Scope Out the Problem
Of course, sometimes temporary variables are unavoidable. Whenever you create one, though, be sure to do it in the innermost scope possible (in other words, within the most deeply nested set of braces containing all references to the variable).
Create variables in the innermost scope possible.
For example, let's say somewhere in my program I am traversing my Netscape history file and want to save the URLs visited in the last 10 days in @URLs:
use Netscape::History;
my $history = new Netscape::History;
my (@URLs, $url);
while (defined($url = $history->next_url() )) {
    push @URLs, $url if time - $url->last_visit_time < 10 * 24 * 3600;
}
This looks quite reasonable on the face of it, but what if later on in our program we create a variable called $history or $url? We'd get the message
"my" variable $url masks earlier declaration in same scope
which would cause us to search backward in the code to find exactly which one it's referring to. Note the clause "in same scope": if in the meantime you created a variable $url at a different scope, well, that may be the one you find when searching backward with a text editor, but it won't be the right one. You may have to check your indentation level to see the scope level.
This process could be time-consuming. And really, the problem is in the earlier code, which created the variables $history and $url with far too wide a scope to begin with. We can (as of perl 5.004) put the my declaration of $url right where it is first used in the while statement and thereby limit its scope to the while block. As for $history, we can wrap a bare block around all the code that uses it to limit its scope as well:
use Netscape::History;
my @URLs;
{
    my $history = new Netscape::History;
    while (defined(my $url = $history->next_url() )) {
        push @URLs, $url if time - $url->last_visit_time < 10 * 24 * 3600;
    }
}
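The same scoping pattern can be tried without Netscape::History installed. In the sketch below, make_iterator is a hypothetical stand-in for the history object: it returns a closure that yields one item per call and then undef. The word list is likewise made up.

```perl
use strict;
use warnings;

# make_iterator is a hypothetical stand-in for an object with a
# next-style method; it hands back one item per call, then undef.
sub make_iterator {
    my @items = @_;
    return sub { return shift @items };
}

my @long_words;
{
    my $next = make_iterator(qw(scope lexical my bare block));
    while (defined(my $word = $next->())) {
        push @long_words, $word if length($word) > 4;
    }
    # $word is already out of scope here; $next still exists.
}
# Neither $next nor $word exists here, so both names are free to reuse
# without triggering the "masks earlier declaration" warning.
my $word = 'reused';

print "@long_words / $word\n";
```

Only @long_words, the one variable the rest of the program actually needs, survives past the closing brace.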
If you want to create a constant value to use in several places, use constant.pm to make sure it can't be overwritten:
$PI = 3.1415926535897932384;
use constant PI => 3.1415926535897932384;
my $volume = 4/3 * PI * $radius ** 3;
$PI = 3.0;   # The 'Indiana maneuver' works!
PI = 3.0;    # But this does not
In response to the last statement, Perl returns the error message, "Can't modify constant item in scalar assignment."
constant.pm creates a subroutine with the constant's name that returns the value you assigned to it, so trying to overwrite it is like trying to assign a value to a subroutine call. Although the absurdity of that may sound like sufficient explanation for how use constant works, perl (as of version 5.6) does in fact allow you to assign a value to a subroutine call, provided the result of the subroutine is a place where you could store the value. For example, the subroutine could return a scalar variable. The term for this feature is lvalue subroutine. But since the subroutines created by use constant don't return lvalues, lvalue subroutines cause no problems for them.
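To make the contrast concrete, here is a small sketch that puts an assignable subroutine (declared with the :lvalue attribute) next to a constant. The subroutine name setting and the variable $setting are invented for the example, and the failing assignment is wrapped in a string eval only so that the script can report the error instead of refusing to compile.

```perl
use strict;
use warnings;
use constant PI => 3.1415926535897932384;

my $setting = 1;

# An lvalue subroutine: the :lvalue attribute lets a call to it appear
# on the left of an assignment. The name 'setting' is made up for this sketch.
sub setting :lvalue { $setting }

setting() = 42;               # stores 42 in $setting

# PI is an ordinary (non-lvalue) subroutine returning a fixed value.
# Assigning to it is rejected at compile time, so the attempt is wrapped
# in a string eval to let the rest of the script run.
my $ok    = eval 'PI = 3.0; 1';
my $error = $ok ? '' : $@;

print "setting is $setting\n";
print "assignment to PI failed: $error";
```

The assignment through setting() succeeds because the subroutine hands back a real variable; the assignment to PI has nowhere to store its value.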