You know, PPM works fine if it has enough (megabytes of) memory. PPM uses it for the contexts - information about the probabilities of the next symbol depending on some previous ones.
The idea is to use only one probability estimation per context order, so the number of probability sets will equal the maximum context order. All we need to do is sort the symbols in each context by recency of use (move-to-front), so the most recently used - and thus most probable - symbol comes first in every context.
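A minimal sketch of this idea (the class and method names here are my own, not from any particular implementation): each context keeps only a move-to-front ordered list of its symbols, while one shared rank-frequency table per order replaces the per-context counts.

```python
from collections import defaultdict

class RankedContextModel:
    """One shared probability table per context order, indexed by symbol rank.

    Each context stores its symbols in move-to-front order, so the most
    recently seen symbol sits at rank 0. Instead of per-context counts,
    one rank-frequency table per order is shared by all contexts of that
    order - this is where the memory saving comes from.
    """
    def __init__(self, max_order=2):
        self.max_order = max_order
        # contexts[order][ctx] -> symbols of that context, most recent first
        self.contexts = [defaultdict(list) for _ in range(max_order + 1)]
        # rank_counts[order][rank] -> how often the symbol at that rank matched
        self.rank_counts = [defaultdict(int) for _ in range(max_order + 1)]

    def update(self, history, symbol):
        """Record `symbol` in every context of order 0..max_order."""
        for order in range(min(len(history), self.max_order) + 1):
            ctx = tuple(history[len(history) - order:])
            symbols = self.contexts[order][ctx]
            if symbol in symbols:
                rank = symbols.index(symbol)
                self.rank_counts[order][rank] += 1
                symbols.pop(rank)        # move-to-front: the symbol just
            symbols.insert(0, symbol)    # seen becomes rank 0

    def rank_probability(self, order, rank):
        """Estimated probability that the rank-th symbol of a context matches."""
        total = sum(self.rank_counts[order].values())
        if total == 0:
            return 0.0
        return self.rank_counts[order][rank] / total
```

On strictly alternating input like "abababab", every order-1 context always predicts its rank-0 symbol, so the shared rank-0 estimate converges to 1 even though no per-context counts are stored.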
Of course, the compression ratio will be a little worse (because we'll be estimating the probabilities with lower accuracy). However, since we reduce memory usage about three times, we can allocate additional (higher-order) contexts in the freed space and even reach a higher compression ratio.
Once more, this makes sense only if you don't have enough memory (for example, less than 64k). If you have more than 384k of RAM, please use HA with the a2 switch instead :)