Conclusions
Many changes, small and large, could make dynamic language implementations faster, and some would also help with speed and safety in static languages. An architecture like ARM, which already has a number of different decoders, might be an ideal place to implement some of these changes, and now is the ideal time. Within a few years, I hope to be using a CPU designed for running the languages I use, rather than one on which the language designers are constantly fighting limitations imposed by the CPU designers.