Given an index split list T of length M + 1, where the first element is 0 and the last element is N, generate an array D of length N such that D[T[i]:T[i+1]] = i.
For example, given T = [0, 2, 5, 7], return D = [0, 0, 1, 1, 1, 2, 2].
I'm trying to avoid a for loop, but the best I can do is:
import numpy as np

def expand_split_list(split_list):
    return np.concatenate(
        [
            np.full(split_list[i + 1] - split_list[i], i)
            for i in range(len(split_list) - 1)
        ]
    )
Is there a built-in function for that purpose?
Asked Mar 12 at 9:58 by Leo
3 Answers
You could combine diff, arange, and repeat:
n = np.diff(T)
out = np.repeat(np.arange(len(n)), n)
As a one-liner (Python ≥ 3.8):
out = np.repeat(np.arange(len(n:=np.diff(T))), n)
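To make the two steps concrete, here is a minimal sketch on the question's example. The function name is borrowed from the question; wrapping the two lines in a function is only for illustration.

```python
import numpy as np

def expand_split_list(T):
    # Segment lengths: T[i+1] - T[i] for each consecutive pair.
    n = np.diff(T)
    # Repeat each segment index i exactly n[i] times.
    return np.repeat(np.arange(len(n)), n)

T = [0, 2, 5, 7]
D = expand_split_list(T)
print(D)  # [0 0 1 1 1 2 2]
```

Because a zero-length segment is simply repeated zero times, this version should also cope with repeated values in T.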
Another option is to assign ones to an array of zeros at the interior split positions, then take the cumsum:
out = np.zeros(T[-1], dtype=int)
out[T[1:-1]] = 1
out = np.cumsum(out)
Output:
array([0, 0, 1, 1, 1, 2, 2])
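One caveat with the cumsum variant: index assignment writes each position only once, so if T contains repeated values (empty segments) the skipped index is lost, and an interior boundary equal to N would index out of bounds. A possible workaround, sketched here under the assumption that T is non-decreasing and starts at 0, is to count the boundaries with np.bincount instead. The function name is made up for illustration.

```python
import numpy as np

def expand_split_list_cumsum(T):
    # Hypothetical helper; assumes T is non-decreasing with T[0] == 0.
    T = np.asarray(T, dtype=np.intp)
    N = int(T[-1])
    # Count how many interior boundaries fall on each position.
    # minlength=N + 1 leaves room for boundaries equal to N,
    # which are then sliced off because they start no visible segment.
    steps = np.bincount(T[1:-1], minlength=N + 1)[:N]
    # Each boundary bumps the running segment index by one.
    return np.cumsum(steps)

print(expand_split_list_cumsum([0, 2, 5, 7]))  # [0 0 1 1 1 2 2]
print(expand_split_list_cumsum([0, 2, 2, 5]))  # [0 0 2 2 2]
```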
A NumPy option is np.searchsorted:
np.searchsorted(T, np.arange(max(T)), side='right')-1
which gives
array([0, 0, 1, 1, 1, 2, 2])
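The side='right' argument matters here. A small sketch comparing both sides on the question's example shows what happens at positions that sit exactly on a boundary. It uses T[-1] in place of max(T), which is equivalent as long as T is sorted.

```python
import numpy as np

T = [0, 2, 5, 7]
positions = np.arange(T[-1])  # T is sorted, so T[-1] equals max(T)

# side='right' counts the boundaries <= position, so a position that
# equals a boundary already belongs to the segment starting there.
right = np.searchsorted(T, positions, side='right') - 1

# side='left' counts the boundaries < position, so boundary positions
# are assigned one segment too early (and position 0 becomes -1).
left = np.searchsorted(T, positions, side='left') - 1

print(right)  # [0 0 1 1 1 2 2]
print(left)   # [-1  0  0  1  1  1  2]
```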
Another option, admittedly a clumsy one, is itertools.accumulate, if you don't want to load NumPy:
from itertools import accumulate
list(accumulate([1 if i in T else 0 for i in range(max(T))], initial=-1))[1:]
and you will obtain a list
[0, 0, 1, 1, 1, 2, 2]
If you want to leverage broadcasting, a different (but not the fastest) NumPy approach is to use np.meshgrid.
def expand_split_list(T):
    # Create a grid of positions (rows) against segment boundaries (columns)
    grid, _ = np.meshgrid(np.arange(T[-1]), T[:-1], indexing="ij")
    # Boolean mask to check segment membership, then sum to assign group indices
    return (grid >= T[:-1]).sum(axis=1) - 1
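The same comparison can probably be written with plain broadcasting, skipping the explicit meshgrid. This is only a sketch with a made-up function name. It still builds an N × M boolean array, so it shares the memory cost of the version above.

```python
import numpy as np

def expand_split_list_broadcast(T):
    # Hypothetical name; same idea as the meshgrid version.
    T = np.asarray(T)
    positions = np.arange(T[-1])[:, None]  # shape (N, 1)
    starts = T[:-1][None, :]               # shape (1, M)
    # For each position, count the segment starts at or before it;
    # subtracting 1 turns that count into a zero-based segment index.
    return (positions >= starts).sum(axis=1) - 1

print(expand_split_list_broadcast([0, 2, 5, 7]))  # [0 0 1 1 1 2 2]
```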
Another NumPy approach is np.digitize, if you want a direct binning approach, but it is slightly slower than np.searchsorted() due to monotonicity checks:
np.digitize(np.arange(T[-1]), bins=T) - 1
If you're working with pandas, pd.cut() is another easy way to segment values:
pd.cut(range(T[-1]), bins=T, labels=False, right=False).tolist()
For a pure Python approach, you can use bisect_right(), which performs a binary search over T for each element:
from bisect import bisect_right

def expand_split_list(T):
    return [bisect_right(T, i) - 1 for i in range(T[-1])]
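If you want to convince yourself that these variants agree, including when T has repeated values (empty segments), a quick sanity check along these lines may help. The random split list, its size, and the seed are arbitrary choices for illustration.

```python
import numpy as np
from bisect import bisect_right

rng = np.random.default_rng(0)

# Random non-decreasing split list: starts at 0, ends at N.
# Interior cuts may repeat, which produces empty segments.
N = 50
cuts = np.sort(rng.integers(0, N + 1, size=8))
T = np.concatenate(([0], cuts, [N]))

positions = np.arange(T[-1])
reference = np.repeat(np.arange(len(T) - 1), np.diff(T))
via_searchsorted = np.searchsorted(T, positions, side='right') - 1
via_digitize = np.digitize(positions, bins=T) - 1
T_list = T.tolist()
via_bisect = [bisect_right(T_list, i) - 1 for i in range(T_list[-1])]

# True only if every variant matches the repeat-based reference.
agree = (
    np.array_equal(reference, via_searchsorted)
    and np.array_equal(reference, via_digitize)
    and reference.tolist() == via_bisect
)
print(agree)
```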