Does func work like a macro?

Does func work like a macro? If so, do I only need to call grad() on the kernel when I need gradients, instead of calling grad() on every func called in the kernel?

Yeah, funcs are essentially force-inlined functions, so you are right. You only need to invoke the gradient of the kernel, and you don't have to worry about the gradients of the funcs it calls, since those are expanded into the kernel body.