
Select Certain Rows (condition Met), But Only Some Columns In Python/numpy

I have a NumPy array with 4 columns and want to select columns 1, 3 and 4, where the value of the second column meets a certain condition (i.e. a fixed value). I tried to first se

Solution 1:

>>> a = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
>>> a
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

>>> a[a[:,0] > 3] # select rows where first column is greater than 3
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

>>> a[a[:,0] > 3][:,np.array([True, True, False, True])] # select columns
array([[ 5,  6,  8],
       [ 9, 10, 12]])

# fancier equivalent of the previous
>>> a[np.ix_(a[:,0] > 3, np.array([True, True, False, True]))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])

For an explanation of the obscure np.ix_(), see https://stackoverflow.com/a/13599843/4323

Finally, we can simplify by giving the list of column numbers instead of the tedious boolean mask:

>>> a[np.ix_(a[:,0] > 3, (0,1,3))]
array([[ 5,  6,  8],
       [ 9, 10, 12]])
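
Applied to the question as asked (columns 1, 3 and 4, for rows where the second column equals a fixed value), a minimal sketch could look like the following. The array contents and the name `target` are made up for illustration.

```python
import numpy as np

# Hypothetical data; the second column holds the value to test.
a = np.array([[1, 2, 3, 4],
              [6, 1, 3, 4],
              [3, 2, 5, 6]])
target = 2  # assumed fixed value for the condition

row_mask = a[:, 1] == target   # boolean mask over rows
cols = [0, 2, 3]               # columns 1, 3 and 4, zero-based

# np.ix_ combines the row and column selections as a cross product.
result = a[np.ix_(row_mask, cols)]
print(result)
```

Plain `a[row_mask, cols]` would pair the two index arrays element by element instead of taking every row/column combination, which is why np.ix_() is used here.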

Solution 2:

If you would rather give the column indexes than a boolean mask, you can write it this way:

A[:, [0, 2, 3]][A[:, 1] == i]

Going back to your example:

>>> A = np.array([[1,2,3,4],[6,1,3,4],[3,2,5,6]])
>>> print(A)
[[1 2 3 4]
 [6 1 3 4]
 [3 2 5 6]]
>>> i = 2
>>> print(A[:, [0, 2, 3]][A[:, 1] == i])
[[1 3 4]
 [3 5 6]]
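
The order of the two indexing steps should not change the result. As a rough check, using the same `A` and `i`, this sketch compares selecting columns first with filtering rows first (the names `cols_first` and `rows_first` are only illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [6, 1, 3, 4],
              [3, 2, 5, 6]])
i = 2

# Pick the columns first, then filter rows with a mask built from the full array.
cols_first = A[:, [0, 2, 3]][A[:, 1] == i]

# Filter rows first, then pick the columns from the smaller result.
rows_first = A[A[:, 1] == i][:, [0, 2, 3]]

print(np.array_equal(cols_first, rows_first))
```

Filtering rows first means the column selection only has to copy the rows that matched, which may matter for large arrays.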


Solution 3:

>>> a=np.array([[1,2,3], [1,3,4], [2,2,5]])
>>> a[a[:,0]==1][:,[0,1]]
array([[1, 2],
       [1, 3]])
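
One caveat with chained indexing like this: it is fine for reading, but the first step returns a copy, so it cannot be used to assign back into the original array. A small sketch of the difference, where `rows`, `cols`, `b` and `c` are names chosen for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 3, 4], [2, 2, 5]])
rows = a[:, 0] == 1
cols = [0, 1]

# Chained indexing: b[rows] is a copy, so the assignment never reaches b.
b = a.copy()
b[rows][:, cols] = 0
print(np.array_equal(b, a))   # b is left unchanged

# A single indexing step with np.ix_ assigns into the array itself.
c = a.copy()
c[np.ix_(rows, cols)] = 0
print(c)
```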

Solution 4:

This also works.

I = np.array([row[[x for x in range(A.shape[1]) if x != i-1]] for row in A if row[i-1] == i])
print(I)

Edit: Since indexing starts from 0, i-1 should be used.
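
The same selection can probably be written without a Python-level loop. The sketch below assumes the `A` and `i` from Solution 2 and uses np.delete() to drop the tested column; `I_loop` and `I_vec` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [6, 1, 3, 4],
              [3, 2, 5, 6]])
i = 2  # one-based column number and also the value tested, as in the answer

# Row-by-row version from the answer above.
I_loop = np.array([row[[x for x in range(A.shape[1]) if x != i - 1]]
                   for row in A if row[i - 1] == i])

# Vectorized sketch: filter rows with a mask, then drop the tested column.
I_vec = np.delete(A[A[:, i - 1] == i], i - 1, axis=1)

print(np.array_equal(I_loop, I_vec))
```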

Solution 5:

I hope this answers your question; a snippet I have used with pandas is:

df_targetrows = df.loc[df[col2filter] <somecondition>, [col1, col2, ..., coln]]

For example,

targets = stockdf.loc[stockdf['rtns'] > .04, ['symbol','date','rtns']]

This will return a DataFrame with only the columns ['symbol','date','rtns'] from stockdf, restricted to the rows where stockdf['rtns'] > .04.
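
To make that runnable, here is a small sketch with an invented `stockdf`; the rows and the extra `volume` column are hypothetical and only there to show that unlisted columns are dropped.

```python
import pandas as pd

# Hypothetical data, invented only to make the example runnable.
stockdf = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC"],
    "date": ["2020-01-02", "2020-01-02", "2020-01-03"],
    "rtns": [0.05, 0.01, 0.07],
    "volume": [100, 200, 300],
})

# Boolean row condition first, list of column labels second.
targets = stockdf.loc[stockdf["rtns"] > .04, ["symbol", "date", "rtns"]]
print(targets)
```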

Hope this helps.
