An efficient way to convert from a Vector{String} to String in Julia? - Stack Overflow

Is there an efficient method to convert from a Vector{String} to String in Julia?

I attempted a naive implementation using a for loop, but the performance is terrible (even when type information is available).

In this particular case, I have a data structure which is a Vector of single-character strings.

This is effectively a concatenation / reduce operation. I'm not sure how to express it in Julia.

The fact that a for loop gives poor performance makes me suspect there is no efficient implementation for this.

I think the poor performance is most likely the result of repeated memory allocations.

I tried sizehint!, but this doesn't appear to have an implementation for String. So I am not sure if there is a way to avoid the repeated memory allocation.

Asked Mar 3 at 15:49 by user2138149

3 Answers

You're likely going to get the best mileage out of join(iterator):

using BenchmarkTools

# 1000 random strings, each 10 lowercase letters long
v = [String(rand('a':'z', 10)) for _ in 1:1000];

@btime reduce(*, $v)
#   376.600 μs (999 allocations: 4.87 MiB)

@btime prod($v)
#   367.800 μs (999 allocations: 4.87 MiB)

@btime join($v)
#   13.000 μs (17 allocations: 41.55 KiB)

# All three approaches produce the same string
@assert reduce(*, v) == prod(v) == join(v)

The timing of reduce and prod being equal makes sense: they both ultimately call mapreduce(identity, *, v).
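Because both calls fold * over the vector, each step presumably builds a new, longer string. As a rough sketch, assuming the fold runs left to right and every * allocates a fresh string holding everything accumulated so far (ignoring per-string headers and allocator rounding), the total number of bytes allocated can be estimated. The helper name fold_bytes is hypothetical and not part of Julia.

```julia
# Estimate the bytes allocated by a left-to-right fold of `*` over n strings
# of k bytes each. Assumption: the i-th intermediate result is a brand-new
# string of i*k bytes (i = 2, ..., n), giving n-1 allocations in total.
function fold_bytes(n::Integer, k::Integer)
    return sum(i * k for i in 2:n; init = 0)
end

total = fold_bytes(1000, 10)                      # 1000 strings of 10 bytes each
println(total, " bytes")                          # 5004990 bytes
println(round(total / 2^20; digits = 2), " MiB")  # about 4.77 MiB
```

Under these assumptions the estimate is close to the 4.87 MiB and 999 allocations reported above, which suggests that the cost of reduce and prod is dominated by repeatedly copying the growing intermediate string.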

The speedup from join occurs because join(::Vector{String}) calls sprint(join, ::Vector{String}), which calls join(::IOBuffer, ::Vector{String}) and finally String(_unsafe_take!(::IOBuffer)). In other words, the call allocates a buffer to hold the concatenated string and then wraps the String around that buffer without checking. If your compiler allows it, the extra Array allocation is elided and an equivalent move operation is performed instead. If you want it to go even faster and you know the size of the resultant string, you can do something like this:

@btime sprint(join, $v, sizehint=length($v)*10)
#   11.500 μs (3 allocations: 9.97 KiB)

(I experimented with the amount of headroom in sizehint to maximize performance, so your mileage may vary.)
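The buffer-based approach can also be written out by hand, which may help when the pieces do not all have the same length. This is only a minimal sketch: concat_with_buffer is a hypothetical name, and it uses the public IOBuffer, write, and take! functions rather than the internal _unsafe_take!, so it is not necessarily as fast as join.

```julia
# Concatenate a vector of strings through a single pre-sized IOBuffer.
# Assumption: summing sizeof over the inputs is cheap compared with the
# repeated reallocation it avoids.
function concat_with_buffer(strings::Vector{String})
    total_bytes = sum(sizeof, strings; init = 0)   # exact size in bytes
    io = IOBuffer(sizehint = total_bytes)          # reserve the space up front
    for s in strings
        write(io, s)                               # append the bytes of s
    end
    return String(take!(io))                       # turn the buffer into a String
end

pieces = ["a", "b", "c", "d"]
@assert concat_with_buffer(pieces) == join(pieces) == "abcd"
```

Using sizeof rather than length matters for non-ASCII text, because sizehint counts bytes and a single character can occupy several bytes in UTF-8.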

prod(vector_of_string) seems both fastest (twice as fast as reduce on my machine) and semantically clearest.

One possible implementation:

vector_of_string::Vector{String} = ...

reduce(*, vector_of_string)

I don't know how this performs compared to other alternatives, but it seems much faster than my existing implementation.

The reverse operation, if you require it:

a_string::String = ...

collect(a_string) # returns Vector{Char}
string.(collect(a_string)) # returns Vector{String}
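As a quick check that the two directions are inverses of each other, here is a small round-trip sketch. The sample string and the variable names are made up for illustration.

```julia
# Split a string into single-character strings, then join them back together.
a_string = "héllo"                 # sample input, including a non-ASCII character

chars  = collect(a_string)         # Vector{Char}
pieces = string.(chars)            # Vector{String}, one character per element

@assert chars isa Vector{Char}
@assert pieces isa Vector{String}
@assert length(pieces) == 5        # five characters, although six bytes
@assert join(pieces) == a_string   # joining restores the original string
```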
