1. Goal
Learn the Taichi language to speed up Python
- Especially for my ISP project
- Computation-intensive
- Many for loops over the image pixels
2. Essentials
Backend
- CPU
- GPU: CUDA / Vulkan / OpenGL / Metal
Field: the basic data structure, similar to an ndarray in NumPy
Structure
- Main entry point
Parallel: Taichi parallelizes any for loop at the outermost scope of a kernel
- Loops that are not at the outermost scope run serially
Arguments
- Kernel arguments must be type-hinted with Taichi types (ti.types)
- Scalars and matrices are passed by value
- ndarray and template arguments are passed by reference
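Primitive types
- Kind: i (signed integer), u (unsigned integer), f (floating point)
- Bit width: 8, 16, 32, 64 (floating point has no 8-bit variant), e.g. ti.i32, ti.u8, ti.f32
- A variable's data type cannot change at run time; you can only cast the value to another type
- Scalar
- Vector: a field whose elements are 1D vectors, shape MxNxn; use an extra [] to access a component, e.g. field[i, j][k]
- Matrix: a field whose elements are 2D matrices, shape MxNxmxn; use an extra [] to access an entry, e.g. field[i, j][k, l]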
3. Takeaway
Local variables must be defined inside the for loop, since they are not shared globally across loop iterations.