Why does the code fail on the GPU but run correctly on the CPU?

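Code that used to run has suddenly stopped working. When running on the GPU, the error is raised at the kernel (scene.find_particle_min_radius()), but the same code still runs fine on the CPU. I don't know the specific cause.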

Taichi JIT:0: allocate_from_reserved_memory: block: [0,0,0], thread: [0,0,0] Assertion `Out of CUDA pre-allocated memory.
Consider using ti.init(device_memory_fraction=0.9) or ti.init(device_memory_GB=4) to allocate more GPU memory` failed.
[E 10/09/23 22:55:05.528 52987] [cuda_driver.h:operator()@92] CUDA Error CUDA_ERROR_ASSERT: device-side assert triggered while calling stream_synchronize (cuStreamSynchronize)


Traceback (most recent call last):
  File "sphere_packing.py", line 125, in <module>
    dem.run()            
  File "/home/eleven/work/GeoTaichi/src/dem/mainDEM.py", line 228, in run
    self.check_critical_timestep()
  File "/home/eleven/work/GeoTaichi/src/dem/mainDEM.py", line 236, in check_critical_timestep
    critical_timestep = self.get_critical_timestep()
  File "/home/eleven/work/GeoTaichi/src/dem/mainDEM.py", line 243, in get_critical_timestep
    return self.contactor.physpp.calcu_critical_timesteps(self.scene, self.sims.max_material_num)
  File "/home/eleven/work/GeoTaichi/src/dem/contact/HertzMindlin.py", line 23, in calcu_critical_timestep
    radius = scene.find_particle_min_radius()
  File "/home/eleven/work/GeoTaichi/src/dem/SceneManager.py", line 152, in find_particle_min_radius
    return find_particle_min_radius_(self.particleNum, self.particle)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/kernel_impl.py", line 974, in wrapped
    return primal(*args, **kwargs)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/kernel_impl.py", line 905, in __call__
    key = self.ensure_compiled(*args)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/kernel_impl.py", line 873, in ensure_compiled
    self.materialize(key=key, args=args, arg_features=arg_features)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/kernel_impl.py", line 560, in materialize
    self.runtime.materialize()
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/impl.py", line 459, in materialize
    self.materialize_root_fb(not self.materialized)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/lang/impl.py", line 394, in materialize_root_fb
    root.finalize(raise_warning=not is_first_call)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/_snode/fields_builder.py", line 170, in finalize
    return self._finalize(raise_warning, compile_only=False)
  File "/home/eleven/.local/lib/python3.8/site-packages/taichi/_snode/fields_builder.py", line 182, in _finalize
    return SNodeTree(_ti_core.finalize_snode_tree(_snode_registry, self.ptr, impl.get_runtime().prog, compile_only))
RuntimeError: [cuda_driver.h:operator()@92] CUDA Error CUDA_ERROR_ASSERT: device-side assert triggered while calling stream_synchronize (cuStreamSynchronize)
[E 10/09/23 22:55:05.611 52987] [cuda_driver.h:operator()@92] CUDA Error CUDA_ERROR_ASSERT: device-side assert triggered while calling stream_synchronize (cuStreamSynchronize)


terminate called after throwing an instance of 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >'
Aborted (core dumped)

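We changed how the CUDA backend allocates device memory, and the default pre-allocated amount is now smaller. You can raise it manually, for example with ti.init(device_memory_GB=2), and see if that helps.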
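
A minimal sketch of that suggestion, for reference. The traceback ends in finalize_snode_tree, which suggests the pre-allocated pool runs out when the fields are materialized on the first kernel launch, so the larger budget has to be passed to ti.init before any field is created. The helper names build_init_kwargs and init_cuda_with_memory are hypothetical and only for illustration; device_memory_GB and device_memory_fraction are the two ti.init options named in the error message, and the value 2 is just the starting point suggested above, not a tuned number.

```python
def build_init_kwargs(device_memory_gb=None, device_memory_fraction=None):
    """Collect the memory-related keyword arguments for ti.init().

    Hypothetical helper: it forwards at most one of the two options,
    preferring the absolute size in GB when both are given.
    """
    kwargs = {}
    if device_memory_gb is not None:
        if device_memory_gb <= 0:
            raise ValueError("device_memory_gb must be positive")
        kwargs["device_memory_GB"] = device_memory_gb
    elif device_memory_fraction is not None:
        if not 0 < device_memory_fraction <= 1:
            raise ValueError("device_memory_fraction must be in (0, 1]")
        kwargs["device_memory_fraction"] = device_memory_fraction
    return kwargs


def init_cuda_with_memory(device_memory_gb=None, device_memory_fraction=None):
    """Initialize Taichi on the CUDA backend with a larger memory pool."""
    # Imported lazily so the helper above can be used without Taichi installed.
    import taichi as ti

    kwargs = build_init_kwargs(device_memory_gb, device_memory_fraction)
    # Must run before any ti.field / kernel is created.
    ti.init(arch=ti.cuda, **kwargs)
```

Calling init_cuda_with_memory(device_memory_gb=2) in place of the existing ti.init call would apply the setting. If 2 GB is still not enough for the particle count in use, a larger value or device_memory_fraction=0.9 (as the error message proposes) may be worth trying.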