python - Fast way to expand split list into index list

Given an index split list T of length M + 1, where the first element is 0 and the last element is N, generate an array D of length N such that D[T[i]:T[i+1]] = i.

For example, given T = [0, 2, 5, 7], return D = [0, 0, 1, 1, 1, 2, 2].

I'm trying to avoid a for loop, but the best I can do is:

import numpy as np

def expand_split_list(split_list):
    # One constant block per segment: block i has length T[i+1] - T[i]
    return np.concatenate(
        [
            np.full(split_list[i + 1] - split_list[i], i)
            for i in range(len(split_list) - 1)
        ]
    )

Is there a built-in function for that purpose?

asked Mar 12 at 9:58 by Leo

3 Answers

You could combine diff, arange, and repeat:

n = np.diff(T)
out = np.repeat(np.arange(len(n)), n)

As a one-liner (Python ≥ 3.8, using the walrus operator):

out = np.repeat(np.arange(len(n:=np.diff(T))), n)

Another option is to assign ones at the interior split points of an array of zeros, then take the cumulative sum:

out = np.zeros(T[-1], dtype=int)
out[T[1:-1]] = 1
out = np.cumsum(out)

Output:

array([0, 0, 1, 1, 1, 2, 2])
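
The two snippets above agree on the example, but they can differ when T contains repeated values (empty segments). The following is a minimal sketch of that edge case, not part of the original answer; the function names expand_repeat and expand_cumsum are hypothetical wrappers around the snippets above.

```python
import numpy as np

def expand_repeat(T):
    # diff gives each segment's length; repeat emits index i that many times
    n = np.diff(T)
    return np.repeat(np.arange(len(n)), n)

def expand_cumsum(T):
    # Mark the interior split points with 1, then accumulate
    T = np.asarray(T)
    out = np.zeros(T[-1], dtype=int)
    out[T[1:-1]] = 1
    return np.cumsum(out)

print(expand_repeat([0, 2, 5, 7]))   # [0 0 1 1 1 2 2]
print(expand_cumsum([0, 2, 5, 7]))   # [0 0 1 1 1 2 2]

# Segment 1 is empty here, so index 1 should be skipped
print(expand_repeat([0, 2, 2, 5]))   # [0 0 2 2 2]
print(expand_cumsum([0, 2, 2, 5]))   # [0 0 1 1 1]
```

With a repeated split point, the fancy-indexed assignment writes 1 only once, so the cumsum version does not skip the empty segment's index. If empty segments can occur in your data, the repeat version is the one that matches the definition D[T[i]:T[i+1]] = i.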

Another NumPy option is np.searchsorted:

np.searchsorted(T, np.arange(max(T)), side='right')-1

which gives

array([0, 0, 1, 1, 1, 2, 2])

Another option, if you don't want to load NumPy, is itertools.accumulate (though it seems clumsy):

from itertools import accumulate
list(accumulate([1 if i in T else 0 for i in range(max(T))], initial=-1))[1:]

and you will obtain a list

[0, 0, 1, 1, 1, 2, 2]

If you want to leverage broadcasting, a different (but not the fastest) NumPy approach is np.meshgrid:

def expand_split_list(T):
    # Grid of positions 0..N-1, one column per segment start
    grid, _ = np.meshgrid(np.arange(T[-1]), T[:-1], indexing="ij")
    # Count how many segment starts each position has reached, then shift to a 0-based index
    return (grid >= T[:-1]).sum(axis=1) - 1
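
The same comparison can probably be written with plain broadcasting and no meshgrid call. This is a sketch under that assumption, not the answerer's code; expand_by_broadcasting is a hypothetical name. Like the meshgrid version, it builds an N × M boolean array, so memory grows with both the output length and the number of segments.

```python
import numpy as np

def expand_by_broadcasting(T):
    T = np.asarray(T)
    positions = np.arange(T[-1])[:, None]  # shape (N, 1)
    starts = T[:-1]                        # shape (M,)
    # (N, 1) >= (M,) broadcasts to (N, M); each row counts the starts reached
    return (positions >= starts).sum(axis=1) - 1

print(expand_by_broadcasting([0, 2, 5, 7]))  # [0 0 1 1 1 2 2]
```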

Another NumPy approach, if you want direct binning, is np.digitize. It is slightly slower than np.searchsorted() because it first checks that the bins are monotonic:

np.digitize(np.arange(T[-1]), bins=T) - 1
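
To see how these options compare on your own machine, a rough timing harness along the following lines may help. It is only a sketch: the segment count, the random segment lengths, and the repetition count are arbitrary assumptions, and results will vary with input size and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical test input: 10,000 segments with random lengths from 1 to 9
rng = np.random.default_rng(0)
lengths = rng.integers(1, 10, size=10_000)
T = np.concatenate(([0], np.cumsum(lengths)))

candidates = {
    "repeat": lambda: np.repeat(np.arange(len(T) - 1), np.diff(T)),
    "searchsorted": lambda: np.searchsorted(T, np.arange(T[-1]), side="right") - 1,
    "digitize": lambda: np.digitize(np.arange(T[-1]), bins=T) - 1,
}

reference = candidates["repeat"]()
repeats = 20
for name, fn in candidates.items():
    # Check each candidate against the repeat-based result before timing it
    assert np.array_equal(fn(), reference)
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{name:>12}: {seconds / repeats * 1e3:.3f} ms per call")
```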

If you're working with pandas, pd.cut() is another easy way to bin the values:

pd.cut(range(T[-1]), bins=T, labels=False, right=False).tolist()

For a pure Python approach, you can use bisect_right(), which performs binary search over T for each element:

from bisect import bisect_right

def expand_split_list(T):
    return [bisect_right(T, i) - 1 for i in range(T[-1])]
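
If binary search per element is more than you need, a pure Python version can also walk the segments once and emit each index as many times as its segment is long. This is an alternative sketch, not taken from the answers above; expand_split_list_linear is a hypothetical name.

```python
def expand_split_list_linear(T):
    # Pair each split point with the next one; emit index i (stop - start) times
    return [
        i
        for i, (start, stop) in enumerate(zip(T, T[1:]))
        for _ in range(stop - start)
    ]

print(expand_split_list_linear([0, 2, 5, 7]))  # [0, 0, 1, 1, 1, 2, 2]
```

This does work proportional to the output length plus the number of segments, and it handles empty segments by simply emitting nothing for them.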
