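Hi everyone. I have spent a whole day debugging and still cannot find the problem, so I have extracted the part of my code that fails; please take a look. It runs on the CPU, but on the GPU it runs for some parameter values and not for others. My system is Windows 10 with an Intel® Xeon® W-2235 CPU and an NVIDIA Quadro P2200 GPU, and I run the script from Spyder. The problem may lie with the GPU, but I do not know how to get it running properly, and the parameters I need happen to be exactly the ones that fail.
The code is as follows: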
import taichi as ti
import sys

ti.init(arch=ti.gpu)

@ti.data_oriented
class Test:
    def __init__(self):
        self.RankingNumAll = ti.field(ti.u32, shape=())
        self.RankingNumAll[None] = 0
        self.max_num_plumes = 80
        self.Pos = ti.Vector.field(2, dtype=ti.f32)
        self.R = ti.field(dtype=ti.f32)
        self.Vel = ti.Vector.field(14, dtype=ti.f32)
        self.plumelistA = ti.root.dynamic(ti.i, self.max_num_plumes)
        self.plumelistA.place(self.Pos, self.R, self.Vel)

    @ti.kernel
    def NewParticle(self):
        AppendRow = ti.length(self.plumelistA, [])
        for i in range(3):
            Row = AppendRow + i
            self.Pos[Row] = ti.Vector([0, 0])
            self.R[Row] = 0
            self.Vel[Row] = ti.Vector([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
        for i in range(3):
            Row = AppendRow + i + 3
            x = 0
            self.Pos[Row] = ti.Vector([x, 0])
            self.R[Row] = 0
            self.Vel[Row] = ti.Vector([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

    @ti.kernel
    def Act(self):
        NUMBEROFPLUMES = ti.length(self.plumelistA, [])
        for AN in range(NUMBEROFPLUMES * NUMBEROFPLUMES):
            targetIDRow = ti.cast(ti.floor(AN / NUMBEROFPLUMES), ti.i32)
            ActionIDRow = ti.cast(AN - NUMBEROFPLUMES * targetIDRow, ti.i32)
            x0 = self.Pos[targetIDRow][0]
            y0 = self.Pos[targetIDRow][1]
            r0 = self.R[targetIDRow]
            x1 = self.Pos[ActionIDRow][0]
            y1 = self.Pos[ActionIDRow][1]
            r1 = self.R[ActionIDRow]
            dx = x1 - x0
            dy = y1 - y0
            dist = ti.sqrt(dx * dx + dy * dy) / 10000
            interactionLength = 5 * (r0 + r1)
            AvoidLength = 0.5 * (r0 + r1)
            if (AvoidLength < dist < interactionLength):
                self.Vel[targetIDRow][2] += 1
                self.Vel[targetIDRow][3] += 1
            elif ((r0 + r1) / 2 < dist <= AvoidLength):
                self.Vel[targetIDRow][4] += 1
                self.Vel[targetIDRow][5] += 1

    def Run(self):
        stepall = 0
        while stepall < 200:
            stepall += 1
            self.NewParticle()
            self.Act()

if __name__ == '__main__':
    Test().Run()
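For anyone who wants to check what Act is supposed to compute without involving Taichi or the GPU, here is a rough plain-Python sketch of the same per-pair logic. It is only a reference under assumptions: the names split_pair_index and act_reference are made up for illustration, the fields are replaced by ordinary Python lists, and the loop runs serially instead of in parallel.

```python
import math


def split_pair_index(an, n):
    """Map a flat index an in [0, n*n) to (target_row, action_row).

    Mirrors floor(AN / N) and AN - N * target from the Act kernel.
    """
    target_row = an // n
    action_row = an - n * target_row
    return target_row, action_row


def act_reference(pos, r, vel):
    """Serial stand-in for Act (hypothetical helper, not Taichi API).

    pos: list of (x, y) pairs, r: list of radii,
    vel: list of 14-element lists, updated in place and returned.
    """
    n = len(pos)
    for an in range(n * n):
        t, a = split_pair_index(an, n)
        dx = pos[a][0] - pos[t][0]
        dy = pos[a][1] - pos[t][1]
        dist = math.sqrt(dx * dx + dy * dy) / 10000
        interaction_length = 5 * (r[t] + r[a])
        avoid_length = 0.5 * (r[t] + r[a])
        if avoid_length < dist < interaction_length:
            vel[t][2] += 1
            vel[t][3] += 1
        # Kept exactly as posted: avoid_length equals (r0 + r1) / 2,
        # so this condition cannot hold for any dist.
        elif (r[t] + r[a]) / 2 < dist <= avoid_length:
            vel[t][4] += 1
            vel[t][5] += 1
    return vel


# Six all-zero particles, like the state after one NewParticle call.
zero_vel = act_reference([(0.0, 0.0)] * 6, [0.0] * 6,
                         [[0.0] * 14 for _ in range(6)])
```

With the all-zero particles from the listing, neither branch fires, so Vel should stay zero on any backend; that gives a simple value to compare the CPU and GPU runs against. As posted, AvoidLength and (r0 + r1)/2 are the same value, so the elif branch cannot be entered; the sketch leaves that unchanged.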
On the GPU it runs for a while and then fails with the following error:
RuntimeError: [taichi/backends/cuda/jit_cuda.h:taichi::lang::JITModuleCUDA::lookup_function@53] Cannot look up function NewParticle_c42_0_kernel_1_serial
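However, the same code runs on the CPU: changing the initialization to ti.init(arch=ti.cpu) makes it work.
On the GPU, with ti.init(arch=ti.gpu), I found two cases in which it does run:
1) Shorten the maximum length of the dynamic node, i.e. reduce self.max_num_plumes = 80 (line 12 of my script) to a smaller value such as self.max_num_plumes = 40. It then runs normally on the GPU.
2) Leave self.max_num_plumes = 80 (line 12) unchanged and instead comment out another preset value, self.RankingNumAll[None] = 0 (line 10 of my script), i.e.: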
import taichi as ti
import sys

ti.init(arch=ti.gpu)

@ti.data_oriented
class Test:
    def __init__(self):
        self.RankingNumAll = ti.field(ti.u32, shape=())
        #self.RankingNumAll[None] = 0
        self.max_num_plumes = 80
        self.Pos = ti.Vector.field(2, dtype=ti.f32)
        self.R = ti.field(dtype=ti.f32)
        self.Vel = ti.Vector.field(14, dtype=ti.f32)
        self.plumelistA = ti.root.dynamic(ti.i, self.max_num_plumes)
        self.plumelistA.place(self.Pos, self.R, self.Vel)
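However, I need self.max_num_plumes = 400, and self.RankingNumAll[None] = 0 is used in other parts of my code and cannot be removed. With both in place the program fails on the GPU, and I do not know what the problem is. Could anyone kindly help?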