Difference between the GPU and CPU backends

I ran the example file mpm99.py with both (arch=ti.cpu) and (arch=ti.gpu), and the results are quite different. The run with the CPU backend is more likely to crash. Why does this happen?
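
For reference, here is a minimal sketch of how the backend can be switched between the two runs. It assumes the only difference between the runs is the `arch` argument passed to `ti.init`. The helper name `init_backend` is hypothetical and is not part of mpm99.py.

```python
def init_backend(backend: str):
    """Initialize Taichi on the requested backend ("cpu" or "gpu").

    Hypothetical helper for illustration; mpm99.py itself calls ti.init directly.
    """
    if backend not in ("cpu", "gpu"):
        raise ValueError(f"unknown backend: {backend!r}")

    # Imported lazily so the argument check above works even without Taichi installed.
    import taichi as ti

    # ti.cpu and ti.gpu are the two arch values compared in this report.
    ti.init(arch=ti.cpu if backend == "cpu" else ti.gpu)
    return ti


if __name__ == "__main__":
    import sys

    # Usage (assumed): python this_file.py cpu   or   python this_file.py gpu
    init_backend(sys.argv[1] if len(sys.argv) > 1 else "gpu")
```

Running the rest of the simulation unchanged after each call should isolate the backend as the only variable.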