Can field variables be saved after a CUDA error occurs?

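My SPH program hits the error below after running for a certain number of steps; I suspect the cause is in the algorithm. The program stops as soon as the error occurs. Is there a way to export the current field variables after the error, so that I can track down the problem afterwards?
Also, is the "illegal memory access" error here caused by an out-of-bounds array index?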

[W 07/21/23 12:47:41.293 12612] [taichi/rhi/cuda/cuda_driver.h:taichi::lang::CUDADriverFunction<void * *,void *,char const *>::call_with_warning@85] CUDA Error CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered while calling module_get_function (cuModuleGetFunction)
[E 07/21/23 12:47:41.295 12612] [taichi/runtime/cuda/jit_cuda.h:taichi::lang::JITModuleCUDA::lookup_function@54] Cannot look up function reset_outside_particle_c102_0_kernel_1_range_for


Traceback (most recent call last):
  File "c:\Users\server\Desktop\better exclude\run_simlation.py", line 113, in <module>
    solver.step()
  File "c:\Users\server\Desktop\better exclude\sph_base.py", line 257, in step
    self.ps.initialize_particle_system()
  File "c:\Users\server\Desktop\better exclude\particle_system.py", line 506, in initialize_particle_system
    self.reset_outside_particle()
  File "C:\Users\server\anaconda3\lib\site-packages\taichi\lang\kernel_impl.py", line 1033, in __call__
    return self._primal(self._kernel_owner, *args, **kwargs)
  File "C:\Users\server-liuyunpu\anaconda3\lib\site-packages\taichi\lang\kernel_impl.py", line 906, in __call__
    return self.runtime.compiled_functions[key](*args)
  File "C:\Users\server\anaconda3\lib\site-packages\taichi\lang\kernel_impl.py", line 817, in func__
    raise e from None
  File "C:\Users\server\anaconda3\lib\site-packages\taichi\lang\kernel_impl.py", line 814, in func__
    t_kernel(launch_ctx)
RuntimeError: [taichi/runtime/cuda/jit_cuda.h:taichi::lang::JITModuleCUDA::lookup_function@54] Cannot look up function reset_outside_particle_c102_0_kernel_1_range_for
[E 07/21/23 12:47:41.559 12612] [taichi/rhi/cuda/cuda_driver.h:taichi::lang::CUDADriverFunction<void *>::operator ()@92] CUDA Error CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered while calling stream_synchronize (cuStreamSynchronize)

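Hi, exporting variables after the error has occurred is likely to be difficult. The usual approach is to export the arrays before the error and reason about the problem from there; after all, the cause necessarily lies "before" the failure.
An "illegal memory access" is not necessarily caused only by out-of-bounds array access; you can enable ti.init(debug=True) to help check whether any array access goes out of bounds.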
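Once the arrays have been exported, one possible host-side check is to look for particles whose positions are non-finite or outside the simulation box, since such values can turn into invalid indices later on. The sketch below is only an illustration: `find_bad_particles`, `lower`, and `upper` are hypothetical names, the domain is assumed to be an axis-aligned box, and `positions` is assumed to be what a call such as `field.to_numpy()` returns (a plain list of tuples works as well).

```python
import math


def find_bad_particles(positions, lower, upper):
    """Return indices of particles that are non-finite or outside [lower, upper)."""
    bad = []
    for i, p in enumerate(positions):
        # A particle is fine only if every coordinate is finite and inside the box.
        ok = all(
            math.isfinite(c) and lo <= c < hi
            for c, lo, hi in zip(p, lower, upper)
        )
        if not ok:
            bad.append(i)
    return bad
```

For example, `find_bad_particles(ps_x.to_numpy(), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))` could be called once per step, where `ps_x` stands for whatever position field the solver actually uses.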

What else can cause an illegal memory access error, then?

Given that your error message contains the following text

an illegal memory access was encountered while calling stream_synchronize (cuStreamSynchronize)

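there may be a data race in a parallel for loop (different threads trying to modify the same memory address), but without the source code I can't say for sure where the problem is.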

Hello, how did you eventually solve this? I've run into the same two problems and can't find where things go wrong.

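I'm not sure exactly why this error occurred... After I adjusted the algorithm, it never showed up again.
Perhaps save the field variables at every step; it is most likely an algorithm-level error.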
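Saving the fields at every step could look roughly like the sketch below, which keeps only the last few states in host memory and writes them to disk when a step raises `RuntimeError`. This is an assumption-laden illustration, not the original poster's code: `RollingSnapshots`, `run`, `step_fn`, and `dump_path` are hypothetical names. It relies only on Taichi fields having a `to_numpy()` method; the snapshot is taken before each step because the CUDA context is generally not usable for further copies once an illegal memory access has been reported.

```python
import pickle
from collections import deque


class RollingSnapshots:
    """Keep the last `keep` exported states in host memory."""

    def __init__(self, keep=3):
        self.buffer = deque(maxlen=keep)

    def record(self, step, fields):
        # Taichi fields provide to_numpy(), which copies device data to the host.
        # Objects without that method (e.g. plain lists) are copied as lists.
        state = {}
        for name, f in fields.items():
            state[name] = f.to_numpy() if hasattr(f, "to_numpy") else list(f)
        self.buffer.append((step, state))

    def dump(self, path):
        # Write the buffered (step, state) pairs, oldest first.
        with open(path, "wb") as fh:
            pickle.dump(list(self.buffer), fh)
        return path


def run(step_fn, fields, n_steps, dump_path, keep=3):
    """Call step_fn n_steps times; on RuntimeError, dump the snapshots and re-raise."""
    snaps = RollingSnapshots(keep)
    for step in range(n_steps):
        try:
            snaps.record(step, fields)  # export BEFORE the step that may crash
            step_fn()
        except RuntimeError:
            snaps.dump(dump_path)
            raise
```

With the solver from the traceback, usage might be `run(solver.step, {"x": ps.x}, 10000, "last_states.pkl")`, where `ps.x` is a placeholder for the real field names. Copying every field each step is slow, so a larger recording interval may be preferable once the failing step range is roughly known.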