Binning A Numpy Array

November 30, 2023

I have a numpy array which contains time series data. I want to bin that array into equal partitions of a given length (it is fine to drop the last partition if it is not the same size) and then calculate the mean of each bin.

Solution 1:

Just use reshape and then mean(axis=1).

As the simplest possible example:

import numpy as np

data = np.array([4,2,5,6,7,5,4,3,5,7])

print(data.reshape(-1, 2).mean(axis=1))

More generally, we'd need to do something like this to drop the last bin when the array length is not an exact multiple of the bin width:

import numpy as np

width = 3
data = np.array([4,2,5,6,7,5,4,3,5,7])

result = data[:(data.size // width) * width].reshape(-1, width).mean(axis=1)

print(result)
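The same idea can be wrapped in a small helper. This is only a sketch: the name bin_mean is made up for illustration, and it assumes a 1-D input and a positive integer width.

```python
import numpy as np

def bin_mean(data, width):
    """Average consecutive, non-overlapping bins of `width` samples.

    Assumes 1-D input. Trailing samples that do not fill a whole bin
    are dropped.
    """
    data = np.asarray(data)
    if width <= 0:
        raise ValueError("width must be a positive integer")
    n_bins = data.size // width          # number of complete bins
    trimmed = data[:n_bins * width]      # drop the incomplete tail
    return trimmed.reshape(n_bins, width).mean(axis=1)

data = [4, 2, 5, 6, 7, 5, 4, 3, 5, 7]
print(bin_mean(data, 3))  # three bins; the final sample (7) is dropped
```

If the array is shorter than one bin, the helper returns an empty array rather than raising.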

Solution 2:

Since you already have a numpy array, to avoid for loops, you can use reshape and consider the new dimension to be the bin:

In [33]: data.reshape(2, -1)
Out[33]: 
array([[4, 2, 5, 6, 7],
       [5, 4, 3, 5, 7]])

In [34]: data.reshape(2, -1).mean(0)
Out[34]: array([ 4.5,  3. ,  4. ,  5.5,  7. ])

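Note that this only works if the size of data is divisible by the first argument to reshape (2 here). Also, reshape(2, -1) splits the array into two halves, so mean(0) averages data[i] with data[i + 5] rather than adjacent samples; for bins of n consecutive samples, use data.reshape(-1, n).mean(1) instead.

Looks like Joe Kington has an answer that handles the non-divisible case.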
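To see which elements each layout groups together, here is a small comparison on the same array. The variable names are chosen only for this illustration.

```python
import numpy as np

data = np.array([4, 2, 5, 6, 7, 5, 4, 3, 5, 7])

# Two rows of five: row 0 is the first half, row 1 the second half.
halves = data.reshape(2, -1)
# Averaging over axis 0 combines data[i] with data[i + 5].
across_halves = halves.mean(axis=0)

# Five rows of two: each row holds two consecutive samples.
pairs = data.reshape(-1, 2)
# Averaging over axis 1 combines data[2*i] with data[2*i + 1].
consecutive = pairs.mean(axis=1)

print(across_halves)
print(consecutive)
```

For time series binning, the second layout is usually the one you want, since each bin then covers a contiguous stretch of samples.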


Solution 3:

Try this, using standard Python (NumPy isn't necessary for this). The code below assumes Python 3; on Python 2.x, use xrange instead of range:

data = [ 4, 2, 5, 6, 7, 5, 4, 3, 5, 7 ]

# example: for n == 2
n=2
partitions = [data[i:i+n] for i in range(0, len(data), n)]
# drop the last partition if it is shorter than n
partitions = partitions if len(partitions[-1]) == n else partitions[:-1]

# the above produces a list of lists
partitions
=> [[4, 2], [5, 6], [7, 5], [4, 3], [5, 7]]

# now the mean
[sum(x)/float(n) for x in partitions]
=> [3.0, 5.5, 6.0, 3.5, 6.0]
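The two steps above can also be folded into one function. This is a sketch under a few assumptions: Python 3.8+ (for statistics.fmean), a hypothetical name bin_values, and a reducer argument added here so other aggregates such as max can be swapped in.

```python
from statistics import fmean

def bin_values(values, n, reducer=fmean):
    """Apply `reducer` to consecutive chunks of `n` items.

    A trailing chunk shorter than `n` is dropped.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    full = len(values) - len(values) % n   # length covered by whole chunks
    return [reducer(values[i:i + n]) for i in range(0, full, n)]

data = [4, 2, 5, 6, 7, 5, 4, 3, 5, 7]
print(bin_values(data, 2))
print(bin_values(data, 3, max))
```

Computing the covered length up front avoids indexing partitions[-1], so an empty input simply yields an empty list.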

Solution 4:

I wrote a function that applies this to an array of any size or number of dimensions.

data is your array
axis is the axis you want to bin
binstep is the number of points between the starts of consecutive bins (this allows overlapping bins)
binsize is the size of each bin
func is the function you want to apply to each bin (np.max for max pooling, np.mean for an average ...)

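import numpy as np

def binArray(data, axis, binstep, binsize, func=np.nanmean):
    data = np.array(data)
    dims = np.array(data.shape)
    # permutation that swaps the axis to bin with axis 0
    argdims = np.arange(data.ndim)
    argdims[0], argdims[axis] = argdims[axis], argdims[0]
    data = data.transpose(argdims)
    # count only the bins that fit entirely inside the axis
    nbins = (dims[axis] - binsize) // binstep + 1
    data = [func(np.take(data, np.arange(int(i*binstep), int(i*binstep+binsize)), 0), 0)
            for i in np.arange(nbins)]
    # swap the binned axis back to its original position
    data = np.array(data).transpose(argdims)
    return data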

In your case it will be:

data = [4,2,5,6,7,5,4,3,5,7]
bin_data_mean = binArray(data, 0, 2, 2, np.mean)

or for the bin size of 3:

bin_data_mean = binArray(data, 0, 3, 3, np.mean)
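For the 1-D case, overlapping bins can also be sketched with NumPy's sliding_window_view. This is a different technique from the function above, not a drop-in replacement: it assumes NumPy 1.20 or newer, 1-D input, integer binstep and binsize, and the name overlapping_bins is made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def overlapping_bins(data, binstep, binsize, func=np.mean):
    """Apply `func` to windows of `binsize` samples taken every `binstep` samples.

    Only windows that fit entirely inside the array are used.
    """
    data = np.asarray(data)
    # All windows of length binsize: shape (len(data) - binsize + 1, binsize)
    windows = sliding_window_view(data, binsize)
    # Keep every binstep-th window, then reduce along the window axis
    return func(windows[::binstep], axis=-1)

data = [4, 2, 5, 6, 7, 5, 4, 3, 5, 7]
print(overlapping_bins(data, 2, 3))          # windows start at 0, 2, 4, 6
print(overlapping_bins(data, 2, 3, np.max))
```

With binstep equal to binsize this reduces to ordinary non-overlapping binning with the incomplete tail dropped.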
