Help with using GGUI

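Hi everyone, I wrote a simulation program in Taichi. It runs fine on my local machine, but I ran into problems when running it on a server.
When I initialize with ti.cuda, GGUI reports the following error: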

[Taichi] version 1.4.1, llvm 15.0.4, commit e67c674e, linux, python 3.9.13
[I 03/02/23 21:52:15.177 346087] [shell.py:_shell_pop_print@23] Graphical python shell detected, using wrapped sys.stdout
[Taichi] Starting on arch=cuda
90601
[E 03/02/23 21:52:22.420 346087] [cuda_driver.h:operator()@88] CUDA Error CUDA_ERROR_INVALID_DEVICE: invalid device ordinal while calling external_memory_get_mapped_buffer (cuExternalMemoryGetMappedBuffer)


Traceback (most recent call last):

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "/home/luyin/Desktop/Anisotropic-MPM/1AA_source/1A_working_version/1A_hanger_02_20/1A_sparse/1A_main_with_GUI.py", line 263, in <module>
    main()

  File "/home/luyin/Desktop/Anisotropic-MPM/1AA_source/1A_working_version/1A_hanger_02_20/1A_sparse/1A_main_with_GUI.py", line 249, in main
    render(obj_list, boundary_list, camera, scene, canvas, window, axisX, axisY, axisZ, colorX, colorY, colorZ,Circle_Center,Circle_Radius)

  File "/home/luyin/Desktop/Anisotropic-MPM/1AA_source/1A_working_version/1A_hanger_02_20/1A_sparse/render.py", line 60, in render
    canvas.scene(scene)

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/ui/canvas.py", line 136, in scene
    self.canvas.scene(scene.scene)

RuntimeError: [cuda_driver.h:operator()@88] CUDA Error CUDA_ERROR_INVALID_DEVICE: invalid device ordinal while calling external_memory_get_mapped_buffer (cuExternalMemoryGetMappedBuffer)

With the GGUI part commented out, the main program runs fine.

When I run the GGUI example programs, ti.cuda reports the same error, while ti.vulkan works. However, my main program reports the following error with ti.vulkan:

[Taichi] version 1.4.1, llvm 15.0.4, commit e67c674e, linux, python 3.9.13
[I 03/02/23 21:45:50.975 345528] [shell.py:_shell_pop_print@23] Graphical python shell detected, using wrapped sys.stdout
[Taichi] Starting on arch=vulkan
90601
[E 03/02/23 21:45:52.427 345528] [spirv_codegen.cpp:generate_listgen_kernel@2100] Not supported.


Traceback (most recent call last):

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "/home/luyin/Desktop/Anisotropic-MPM/1AA_source/1A_working_version/1A_hanger_02_20/1A_sparse/1A_main_with_GUI.py", line 263, in <module>
    main()

  File "/home/luyin/Desktop/Anisotropic-MPM/1AA_source/1A_working_version/1A_hanger_02_20/1A_sparse/1A_main_with_GUI.py", line 239, in main
    Reset()

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/lang/kernel_impl.py", line 974, in wrapped
    return primal(*args, **kwargs)

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/lang/shell.py", line 27, in new_call
    ret = old_call(*args, **kwargs)

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/lang/kernel_impl.py", line 901, in _call_
    return self.runtime.compiled_functions[key](*args)

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/lang/kernel_impl.py", line 826, in func__
    raise e from None

  File "/home/luyin/anaconda3/lib/python3.9/site-packages/taichi/lang/kernel_impl.py", line 823, in func__
    t_kernel(launch_ctx)

RuntimeError: [spirv_codegen.cpp:generate_listgen_kernel@2100] Not supported.

The server's GPU information is as follows:
[image not preserved]
The local machine's GPU information is as follows:
[image not preserved]

How can I solve this? Thanks!

The GGUI issue should already be fixed on the latest master. You could try installing Taichi nightly:
pip install -i https://pypi.taichi.graphics/simple/ taichi-nightly

As for the Vulkan backend issue, does your code use any sparse-related features? Vulkan does not support sparse yet.
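A minimal sketch of how a script might avoid the Vulkan backend when it uses sparse data structures. The helper names `pick_backend` and `init_taichi` and the preference order are assumptions made for illustration, not part of Taichi; only `ti.init` and the `ti.cuda` / `ti.vulkan` / `ti.cpu` arch constants come from the library.

```python
def pick_backend(uses_sparse, candidates):
    """Return the first acceptable backend name from `candidates`.

    Hypothetical helper: Vulkan is skipped whenever the program uses
    sparse SNodes, because (per the reply above) Vulkan does not
    support them yet.
    """
    for name in ("cuda", "vulkan", "cpu"):  # assumed preference order
        if name not in candidates:
            continue
        if uses_sparse and name == "vulkan":
            continue
        return name
    raise RuntimeError("no suitable backend among: %r" % (candidates,))


def init_taichi(uses_sparse):
    """Initialize Taichi with the chosen backend (requires Taichi installed)."""
    import taichi as ti  # imported lazily so pick_backend works without Taichi

    name = pick_backend(uses_sparse, ["cuda", "vulkan", "cpu"])
    ti.init(arch=getattr(ti, name))  # e.g. ti.cuda, ti.vulkan or ti.cpu
    return name
```

For a program like the one in this thread, `init_taichi(uses_sparse=True)` would request CUDA rather than Vulkan.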

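  • Nice! After upgrading, GGUI does run under CUDA!

  • I am indeed using sparse-related features. Sorry for not reading the documentation carefully.

  • One more small question. My program is a simulation loop that prints the time step on every iteration. I noticed that the compute time per time step is about the same before and after upgrading, but the time until computation starts (that is, until "time step:0" is printed) is very different. Before the upgrade I was on 1.4.1 and computation started after about 5 s; after the upgrade it takes around 27 s. I ran into the same thing earlier when upgrading from 1.13 to 1.30, and mentioned it in an earlier question. I guess you can't have it both ways :joy: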

Could you provide a small repro for this longer compile time? We'd like to look into the cause.

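Mystery solved: it was the offline cache. In my tests I only changed the position the object falls from, so before the upgrade many of those positions were probably already cached. With the cache disabled, the old and new versions take the same time. Thanks for your answers!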
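For anyone who wants to repeat this comparison, here is a rough sketch of timing the first kernel call (which includes compilation) separately from later calls, with the offline cache switched on or off through `ti.init(offline_cache=...)`. The function names and the toy kernel are made up for illustration and are not taken from the original program.

```python
import time


def time_first_and_later_calls(fn, repeats=3):
    """Return (first_call_seconds, mean_later_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    fn()  # first call: for a Taichi kernel this includes compilation
    first = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first, sum(later) / len(later)


def measure_taichi_startup(use_cache):
    """Hypothetical harness (requires Taichi): time a toy kernel."""
    import taichi as ti

    # offline_cache=False forces a fresh compile on every run of the script,
    # which makes start-up times comparable across Taichi versions.
    ti.init(arch=ti.cpu, offline_cache=use_cache)
    x = ti.field(ti.f32, shape=1024)

    @ti.kernel
    def step():
        for i in x:
            x[i] += 1.0

    def run():
        step()
        ti.sync()  # wait for the kernel to finish before stopping the clock

    return time_first_and_later_calls(run)
```

Calling `measure_taichi_startup(False)` under each Taichi version gives the uncached start-up cost; running it with `True` twice in a row should show the effect of a warm cache on the first call.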
