Convert Numpy Object Array To Sparse Matrix
Solution 1:
It is possible to create a coo format matrix from your x:
In [22]: x = np.array([['a', 'b', 'c']], dtype=object)
In [23]: M=sparse.coo_matrix(x)
In [24]: M
Out[24]:
<1x3 sparse matrix of type '<class 'numpy.object_'>' with 3 stored elements in COOrdinate format>
In [25]: M.data
Out[25]: array(['a', 'b', 'c'], dtype=object)
coo has just flattened the input array and assigned it to its data attribute. (row and col have the indices.)
In [31]: M=sparse.coo_matrix(x)
In [32]: print(M)
(0, 0) a
(0, 1) b
(0, 2) c
But displaying it as an array produces an error.
In [26]: M.toarray()
ValueError: unsupported data types in input
Trying to convert it to other formats produces your TypeError.
dok sort of works:
In [28]: M=sparse.dok_matrix(x)
/usr/local/lib/python3.5/dist-packages/scipy/sparse/sputils.py:114: UserWarning: object dtype is not supported by sparse matrices
warnings.warn("object dtype is not supported by sparse matrices")
In [29]: M
Out[29]:
<1x3 sparse matrix of type '<class 'numpy.object_'>' with 3 stored elements in Dictionary Of Keys format>
String dtype, x.astype('U1'), works a little better, but still has problems with conversion to csr.
Sparse matrices were developed for large linear algebra problems, where matrix multiplication and solving linear equations mattered most. Their application to non-numeric tasks is recent and incomplete.
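If the strings are really just labels, one possible workaround (not from the session above, and only a sketch) is to map each distinct string to a positive integer code and store the codes instead. That gives an ordinary numeric matrix, which converts to csr without complaint. The names labels, codes and decoded are made up for illustration, and reserving code 0 for empty cells is my own convention:

```python
import numpy as np
from scipy import sparse

x = np.array([['a', 'b', 'c']], dtype=object)

# Assumption: every entry is a string label; map each distinct label to an integer.
labels, inverse = np.unique(x.ravel().astype(str), return_inverse=True)

# Shift by one so that 0 stays free to mean "no entry" in the sparse matrix.
codes = (inverse + 1).reshape(x.shape)

# Integer dtype, so the usual compressed formats are available.
M = sparse.csr_matrix(codes)

# Decode only the stored entries back to their labels.
coo = M.tocoo()
decoded = labels[coo.data - 1]
print(decoded.tolist())
```

The lookup table labels has to be kept alongside the matrix, since the matrix itself only holds the integer codes.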
Solution 2:
I don't think this is supported. The documentation is a bit sparse on this point, but this part of the source should show it:
# List of the supported data typenums and the corresponding C++ types
#
T_TYPES = [
('NPY_BOOL', 'npy_bool_wrapper'),
('NPY_BYTE', 'npy_byte'),
('NPY_UBYTE', 'npy_ubyte'),
('NPY_SHORT', 'npy_short'),
('NPY_USHORT', 'npy_ushort'),
('NPY_INT', 'npy_int'),
('NPY_UINT', 'npy_uint'),
('NPY_LONG', 'npy_long'),
('NPY_ULONG', 'npy_ulong'),
('NPY_LONGLONG', 'npy_longlong'),
('NPY_ULONGLONG', 'npy_ulonglong'),
('NPY_FLOAT', 'npy_float'),
('NPY_DOUBLE', 'npy_double'),
('NPY_LONGDOUBLE', 'npy_longdouble'),
('NPY_CFLOAT', 'npy_cfloat_wrapper'),
('NPY_CDOUBLE', 'npy_cdouble_wrapper'),
('NPY_CLONGDOUBLE', 'npy_clongdouble_wrapper'),
]
Asking for object-based types sounds like a lot. Even some more basic types like float16 are missing.
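To check a dtype against that list by hand, the C type names can be translated into NumPy's one-character type codes. The mapping and the helper is_listed below are my own illustration, not something scipy exposes, so treat this as a rough sketch:

```python
import numpy as np

# Assumed translation of the C types listed above into NumPy type codes:
# ? bool, b/B byte, h/H short, i/I int, l/L long, q/Q long long,
# f float, d double, g long double, F/D/G the complex counterparts.
LISTED = {np.dtype(code) for code in "?bBhHiIlLqQfdgFDG"}

def is_listed(dtype):
    """Return True if dtype matches one of the entries in the list above."""
    return np.dtype(dtype) in LISTED

for candidate in (np.float64, np.int32, np.float16, object, 'U1'):
    print(np.dtype(candidate), is_listed(candidate))
```

Both object and float16 come out as not listed, which matches the point above.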