c++ - Does ArrayFire have a popcount or bitcount function? - Stack Overflow

I am trying to move away from OpenCL and CUDA into ArrayFire. One of my functions uses the GPU's popcount() to make pre-processing data easier. But I can't find it anywhere in the list of functions in ArrayFire.

OpenCL has `popcount`, CUDA has `__popc`, and there is `__builtin_popcount` for CPU work. Where the heck is the function in ArrayFire? I see `count()` and `count_all()`, but those count the elements of an array, not the bits in an element (as far as I can tell).
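For reference, this is the CPU-side behaviour being asked for. It is a minimal sketch, assuming 32-bit unsigned input; `cpu_popcount` is a hypothetical helper name, and the fallback loop is only there for compilers without the GCC/Clang builtin.

```cpp
#include <cstdint>

// Count the set bits of a 32-bit value on the CPU.
// cpu_popcount is a hypothetical helper name, not part of any library.
inline int cpu_popcount(std::uint32_t v) {
#if defined(__GNUC__) || defined(__clang__)
    // Compiler builtin; the compiler may lower it to a hardware instruction.
    return __builtin_popcount(v);
#else
    // Portable fallback: clear the lowest set bit until none remain.
    int n = 0;
    while (v) {
        v &= v - 1;
        ++n;
    }
    return n;
#endif
}
```

For example, `cpu_popcount(0xF0F0u)` yields 8, since the value has two nibbles of four set bits each.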

Am I missing something or is this just a feature not implemented in the library? I feel like it is a pretty important function and expected it to be with the bitwise manipulation functions.

I was expecting some function with the ability to tell me the count of 1s in an integer. I honestly would like to leverage the optimization features of the library, but without this it is impossible.

Yes, I can write my own. No, I don't want to. I want to use the architecture-optimized implementations provided by the hardware vendors for their CPUs/GPUs.

Asked Nov 21, 2024 at 3:50 by Ben H
1 Answer

Well, I will answer my own question.

There is not one.

I looked through the repo. Yes, they use `__popc` (CUDA) and `popcount` (OpenCL) internally for nearest_neighbour, but it is not used anywhere else and nothing exposes it as a public function. So it is not implemented.

Now I have a few choices: write a custom kernel, fork their code and add it myself, or abandon this folly and move on.

I will probably try the custom kernel. If it fails I will switch back to OpenCL and CUDA.
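If the vendor intrinsic is out of reach, one fallback worth noting is the classic bit-parallel (SWAR) count, which needs only shifts, masks, adds and one multiply. This is a swapped-in technique, not something ArrayFire provides. The sketch below is plain scalar C++, assuming 32-bit unsigned values; `swar_popcount` is a hypothetical name. Whether the same expression works or performs well when applied element-wise to an ArrayFire array is an untested assumption.

```cpp
#include <cstdint>

// Branch-free population count for one 32-bit word (classic SWAR method).
// swar_popcount is a hypothetical name used only for this sketch.
inline std::uint32_t swar_popcount(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);                  // bit counts per 2-bit pair
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  // bit counts per 4-bit group
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  // bit counts per byte
    return (v * 0x01010101u) >> 24;                    // sum the four bytes into the top byte
}
```

This will not match a hardware popcount instruction for speed, but it has no branches and no loop, so every element costs the same fixed number of operations.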
