Isn't That the Kernel's Job?
At the very bottom of the existing stack, you find the DRM (Direct Rendering Manager) kernel module. This is responsible for providing a set of OS-agnostic interfaces to the component of the driver in the X server. Typically, this is really two modules: one that provides the OS-specific bits and another that provides device-specific command validation.
Recently, there has been a move to push a bit more into the kernel. The first part is mode setting. Currently, X changes device resolution by talking to the device directly. This is problematic for power saving because the kernel needs to be able to restore the device to the correct state after putting it to sleep. It's also redundant, because the kernel already needs to be able to set the device mode for console frame buffers and virtual terminals. Moving mode setting into the kernel simplifies matters considerably.
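To make the power-saving argument concrete, here is a minimal toy model in Python. It is only a sketch: the names Mode, ToyDisplay, and ToyKernelModeSetter are hypothetical and do not correspond to any real kernel or X interface. It assumes a device that forgets its mode when powered down, and shows that the component that records the mode is the one able to restore it on resume.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mode:
    """A display mode (hypothetical, for illustration only)."""
    width: int
    height: int
    refresh_hz: int


class ToyDisplay:
    """Stands in for the hardware: it forgets its mode when powered off."""

    def __init__(self) -> None:
        self.mode: Optional[Mode] = None

    def program(self, mode: Mode) -> None:
        self.mode = mode

    def power_off(self) -> None:
        self.mode = None


class ToyKernelModeSetter:
    """Stands in for kernel-side mode setting: every mode change goes
    through here, so the last requested mode is always known."""

    def __init__(self, display: ToyDisplay) -> None:
        self.display = display
        self.current: Optional[Mode] = None

    def set_mode(self, mode: Mode) -> None:
        self.display.program(mode)
        self.current = mode  # remembered so it can be restored later

    def suspend(self) -> None:
        self.display.power_off()

    def resume(self) -> None:
        # Restore whatever mode was active before the device slept.
        if self.current is not None:
            self.display.program(self.current)


display = ToyDisplay()
kms = ToyKernelModeSetter(display)
kms.set_mode(Mode(1280, 1024, 60))  # a request that would come from X
kms.suspend()
mode_while_asleep = display.mode
kms.resume()
mode_after_resume = display.mode
```

If X programmed the device directly instead, the kernel-side object in this model would have no record of the mode and nothing to restore on resume.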
The second component is memory management. Managing memory is traditionally the kernel's job, but for graphics devices the situation is somewhat more complicated. Early devices had their own memory, and this was managed by the driver. Then came AGP, and it became possible for devices to access the host memory quickly. With PCIe, the picture became more complicated still, and a number of cheap GPUs use system memory exclusively.
Over the years, the requirements for memory have also become more complicated. For example, a modern X environment uses a compositing manager. This means that every window is rendered to a texture and then composited to the screen by another process. This is difficult with the current DRI model, where OpenGL clients write directly to the screen. With DRI2, every OpenGL client renders to an off-screen buffer and the compositing manager can then combine these to produce the final version. This means that two processes have to be able to share the render buffer.
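The compositing step described above can be sketched with a toy model. This is an illustration only, not how a real compositing manager or DRI2 works: buffers are plain lists of characters rather than textures, and the helper names make_buffer and composite are hypothetical. Each client fills its own off-screen buffer, and a separate step copies the buffers onto the screen from back to front.

```python
def make_buffer(width, height, fill):
    """Create an off-screen buffer as a list of rows of 'pixels'."""
    return [[fill] * width for _ in range(height)]


def composite(screen_w, screen_h, windows, background="."):
    """Combine client buffers into one screen image.

    `windows` is a list of (x, y, buffer) tuples ordered back to front,
    so later windows overwrite earlier ones where they overlap.
    """
    screen = make_buffer(screen_w, screen_h, background)
    for x, y, buf in windows:
        for row_idx, row in enumerate(buf):
            for col_idx, pixel in enumerate(row):
                sx, sy = x + col_idx, y + row_idx
                # Clip anything that falls outside the screen.
                if 0 <= sx < screen_w and 0 <= sy < screen_h:
                    screen[sy][sx] = pixel
    return screen


# Two clients each render into their own off-screen buffer.
client_a = make_buffer(3, 2, "A")
client_b = make_buffer(3, 2, "B")

# The compositor places them on a 6x3 screen; B overlaps A's corner.
frame = composite(6, 3, [(0, 0, client_a), (2, 1, client_b)])
rows = ["".join(row) for row in frame]
```

Here `rows` is `["AAA...", "AABBB.", "..BBB."]`. Note that the compositor has to read buffers that the clients wrote, which is exactly why the render buffer must be shareable between processes.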
Managing the device's memory in the driver is the existing approach, and it ends up with the same problems we've seen elsewhere: massive code duplication and horrible spaghetti code where the kernel, X server, and drivers all contain code that does some part of the work. Because this memory is a resource that can be shared among processes, the only place where it really makes sense to manage it is the kernel. The Gallium model expects the kernel to take responsibility for this, although the exact interfaces are part of the windowing system layer.
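One way to picture a single, central owner for shareable buffers is the toy allocator below. It is a sketch under stated assumptions, not the interface of any real kernel memory manager: the class ToyBufferManager, its methods, and the process IDs are all hypothetical. Buffers are identified by integer handles, a handle can be shared with a second process, and the memory is reclaimed only when the last owner releases it.

```python
class ToyBufferManager:
    """Toy stand-in for a kernel-side graphics memory manager."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self._next_handle = 1
        self._buffers = {}  # handle -> {"data": bytearray, "owners": set}

    def allocate(self, pid, size):
        """Reserve `size` bytes for process `pid` and return a handle."""
        if self.used + size > self.capacity:
            raise MemoryError("out of toy graphics memory")
        handle = self._next_handle
        self._next_handle += 1
        self._buffers[handle] = {"data": bytearray(size), "owners": {pid}}
        self.used += size
        return handle

    def share(self, handle, pid):
        """Grant another process access to an existing buffer."""
        self._buffers[handle]["owners"].add(pid)

    def map_buffer(self, handle, pid):
        """Return the buffer contents if `pid` is allowed to see them."""
        entry = self._buffers[handle]
        if pid not in entry["owners"]:
            raise PermissionError("process does not own this buffer")
        return entry["data"]

    def release(self, handle, pid):
        """Drop one owner; free the memory when no owners remain."""
        entry = self._buffers[handle]
        entry["owners"].discard(pid)
        if not entry["owners"]:
            self.used -= len(entry["data"])
            del self._buffers[handle]


mgr = ToyBufferManager(capacity=1024)
handle = mgr.allocate(pid=100, size=256)   # an OpenGL client's render buffer
mgr.share(handle, pid=200)                 # the compositing manager

mgr.map_buffer(handle, 100)[0] = 0xFF      # the client draws
seen_by_compositor = mgr.map_buffer(handle, 200)[0]

mgr.release(handle, 100)                   # the client exits first
used_after_client_exit = mgr.used          # buffer is still alive
mgr.release(handle, 200)
used_after_all_released = mgr.used
```

Because the bookkeeping lives in one place, neither the client nor the compositor needs its own copy of the allocation logic, and the buffer outlives whichever process exits first.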