Higher-Order Ranges
In keeping with higher-order functions that take and return other functions, a higher-order range is one that aggregates and manipulates other ranges, while itself still offering a range interface. Building ranges that decorate other ranges is easy, useful, and fun. In fact, higher-order ranges fulfill the promise that many saw in the early days of the STL. Back then, people thought custom iterators would solve many problems, and consequently defined quite a few of them. The idea, however, has had limited success. I'm not sure why, but I hypothesize that the difficulties of defining iterators and the verboseness of using them were possible factors.
Ranges, by contrast, are very easy to define and use. The result of many algorithms are, in fact, custom ranges. Consider, for example, the classic higher-order function filter, which takes a range r plus a predicate, and returns a range that only offers the elements of r satisfying the predicate. The work that filter itself does is minimalit merely constructs and returns a range type, call it Filter, that does the filtration in its iteration primitives.
- Laziness. Higher-order ranges offer the opportunity of doing computation lazily, in the style preferred by functional languages, instead of eagerly. Consider, for example, the STL algorithm set_union that takes two sorted ranges and yields one sorted range containing the elements of both ranges, in linear time. set_union is eagerwhen it returns, it has finished the job. This approach has two problems. First, you need to create (and possibly allocate memory for) the target range. This is wasteful of both memory and time if all you want to do is peek at each element of set_union in turn and maybe finish the loop without inspecting all elements. Second, eager set_union needs to read all of its input before finishing, which simply does not work with infinite ranges.
- Preserving Range Categories. Recall Retro, a range that traverses a given range backwards. Clearly the original range, call it r, must offer double-ended iteration. Question: If the original range offered random access, should Retro also offer random access? The answer is a resounding yes. As a rule, a higher-order range must offer the highest capability that it can, subject to what the original range can offer. So Retro should do something like this:
A classic argument in favor of lazy evaluation is that it leads to better modular composition. This is because lazy evaluation allows for much more involved compositions, with powerful generators that construct a large data space, out of which a selector chooses the items of interest. This advantage has been beautifully argued by Hughes [9] and is famously used by Google in its implementation of the MapReduce algorithm [6]. D's standard library uses lazy evaluation wherever possible, to great effect.
struct Retro<DoubleEndedRange> { ... as before ... static if (isRandomAccess(DoubleEndedRange) && hasLength(DoubleEndedRange)) { Ref<T> at(unsigned i) { return r.at(r.length() - 1 - i); } DoubleEndedRange slice(int i, int j) { return r.slice(r.length() - j, r.length() - i); } } static if (hasLength(DoubleEndedRange)) { unsigned length() { return r.length(); } } }
I used the CDJ#++ construct static if as an optional declaration: If the tested condition is true, then the guarded code is compiled in; otherwise, it just vanishes. The predicates hasLength and isRandomAccess use introspection to figure out during compilation whether the original range offers length and random access, respectively. Note how DoubleEndedRange also may or may not define length, depending on whether r does.
This kind of optionally enriched interface depending on the type of the input puts a great strain on the language's static introspection mechanisms. I don't know of a way to do that in Java or C#, and in C++ things can be done, albeit with difficulty. In D, the static if construct exists and makes it easy to define isRandomAccess and hasLength. If your target language is dynamic, there should be no problem to use dynamic reflection to allow clients to discover the capabilities of a range object.
If static introspection is available, a host of really cool stuff can be done. For example, if Retro is composed with itself (e.g., Retro<Retro<SomeRange>>), why do all the busywork? The entire construct should statically boil down to SomeRange at exactly zero computational cost.