CUDA - What's the most efficient way to compute the Euclidean distance between 2 float3?
Currently I'm using the following code to compute the Euclidean distance between 2 float3 values; I took it from one of the NVIDIA samples.
    inline __host__ __device__ float3 operator-(float3 a, float3 b)
    {
        return make_float3(a.x - b.x, a.y - b.y, a.z - b.z);
    }

    inline __host__ __device__ float dot(float3 a, float3 b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    inline __host__ __device__ float euclideandistance(float3 v)
    {
        return sqrtf(dot(v, v));
    }
Is there a (maybe more low-level) way that is faster?
CUDA has the functions norm3d{f}() in its math library. They are the best fit when computing the Euclidean distance of 3-vectors, since they ensure maximum accuracy and avoid overflow in intermediate computation. If you need to normalize vectors, you will want to look at rnorm3d{f}(). These are the canonical choice for this operation, and should be optimal.
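As a rough illustration of the answer above, here is a minimal sketch that passes the component differences of two points to norm3df(). The kernel name distanceKernel and the host helper distanceOnDevice are hypothetical names made up for this example, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: out[i] = Euclidean distance between a[i] and b[i].
__global__ void distanceKernel(const float3* a, const float3* b, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // norm3df(x, y, z) computes sqrt(x*x + y*y + z*z)
        // while guarding against intermediate overflow.
        out[i] = norm3df(a[i].x - b[i].x, a[i].y - b[i].y, a[i].z - b[i].z);
    }
}

// Hypothetical host helper: runs the kernel on a single pair of points.
// Error checking of the CUDA calls is omitted to keep the sketch short.
float distanceOnDevice(float3 a, float3 b)
{
    float3* dA = nullptr;
    float3* dB = nullptr;
    float* dOut = nullptr;
    float result = 0.0f;

    cudaMalloc(&dA, sizeof(float3));
    cudaMalloc(&dB, sizeof(float3));
    cudaMalloc(&dOut, sizeof(float));
    cudaMemcpy(dA, &a, sizeof(float3), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, &b, sizeof(float3), cudaMemcpyHostToDevice);

    distanceKernel<<<1, 32>>>(dA, dB, dOut, 1);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(&result, dOut, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dOut);
    return result;
}
```

For example, the points (1, 2, 3) and (4, 6, 3) differ by (-3, -4, 0), so distanceOnDevice(make_float3(1.0f, 2.0f, 3.0f), make_float3(4.0f, 6.0f, 3.0f)) should come out at roughly 5. Whether this is actually faster than the sqrtf(dot(v, v)) version depends on the GPU and compiler, so it is worth profiling both.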
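Note that it might be possible to run your computations in distance squared rather than distance. This works whenever you only compare distances, since squaring preserves their order. It eliminates the expensive square root operation, and should be considerably faster than using the Euclidean distance.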
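A minimal sketch of the squared-distance idea, assuming the goal is a radius test. The function names distanceSquared and withinRadius are hypothetical, and the comparison assumes radius is non-negative.

```cuda
#include <cuda_runtime.h>

// Squared Euclidean distance between two points; no square root is taken.
inline __host__ __device__ float distanceSquared(float3 a, float3 b)
{
    float dx = a.x - b.x;
    float dy = a.y - b.y;
    float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Hypothetical radius test: compares squared quantities on both sides.
// Assumes radius >= 0, so that d <= radius is equivalent to d*d <= radius*radius.
inline __host__ __device__ bool withinRadius(float3 a, float3 b, float radius)
{
    return distanceSquared(a, b) <= radius * radius;
}
```

The same trick applies to nearest-neighbour searches: track the smallest squared distance and take a single square root at the end, if the actual distance is needed at all. Unlike norm3df(), the plain sum of squares has no protection against overflow for very large coordinates.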
[This answer was assembled from comments and added as a community wiki entry to get the question off the unanswered list for the CUDA tag.]