Tensorflow Dense Tensor To Sparse Binarized Hash Trick Tensor

January 31, 2024 Post a Comment

I want to transform this dataset in such a way that each tensor has a given size n and that a feature at index i of this new tensor is set to 1 if and only if there is a i in the o

Solution 1:

Here is a possible implementation for that:

import tensorflow as tf

defbinarization_sparse(t, n):
    # Input size
    t_shape = tf.shape(t)
    t_rows = t_shape[0]
    t_cols = t_shape[1]
    # Make sparse row indices for each value
    row_idx = tf.tile(tf.range(t_rows)[: ,tf.newaxis], [1, t_cols])
    # Sparse column indices
    col_idx = t % n
    # "Flat" indices - needed to discard repetitions
    total_idx = row_idx * n + col_idx
    # Remove repeated elements
    out_idx, _ = tf.unique(tf.reshape(total_idx, [-1]))
    # Back to row and column indices
    sparse_idx = tf.stack([out_idx // n, out_idx % n], axis=-1)
    # Sparse values
    sparse_values = tf.ones([tf.shape(sparse_idx)[0]], dtype=t.dtype)
    # Make sparse tensor
    out = tf.sparse.SparseTensor(tf.cast(sparse_idx, tf.int64),
                                 sparse_values,
                                 [t_rows, n])
    # Reorder indices
    out = tf.sparse.reorder(out)
    return out

# Testwith tf.Graph().as_default(), tf.Session() as sess:
    t = tf.constant([
        [ 0,  3,  4],
        [12,  2,  4]
    ])
    # Sparse result
    t_m1h_sp = binarization_sparse(t, 9)
    # Convert to dense to check output
    t_m1h = tf.sparse.to_dense(t_m1h_sp)
    print(sess.run(t_m1h))

Output:

[[1 0 0 1 1 0 0 0 0]
 [0 0 1 1 1 0 0 0 0]]

I added the logic to remove repeated elements because in principle it could happen, but if you have a guarantee that there are no repetitions (including modulo), you may skip that step. Also, I reorder the sparse tensor at the end. That is not strictly necessary here, but (I think) sparse operations sometimes expect the indices to be ordered (and sparse_idx may not be ordered).

Also, this solution is specific to 2D inputs. For 1D inputs would be simpler, and it can be written for higher-dimensional inputs as well if necessary. I think a completely general solution is possible but it would be more complicated (specially if you want to consider tensors with unknown number of dimensions).

Python Programming Language

Tensorflow Dense Tensor To Sparse Binarized Hash Trick Tensor

Solution 1:

Post a Comment for "Tensorflow Dense Tensor To Sparse Binarized Hash Trick Tensor"