Sparsemax
The sparsemax activation function is similar to softmax, but it can output sparse probabilities: some classes receive a probability of exactly zero.
For batch i and class j:
sparsemax(logits)[i,j] = max(0, logits[i,j] - τ(logits[i,:]))
where τ(logits[i,:]) is the threshold chosen so that the outputs in row i sum to 1.
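
The formula leaves open how the threshold τ is obtained. A minimal NumPy sketch follows, assuming the common sort-based procedure: sort each row in descending order, find how many entries stay non-zero, and derive τ from their cumulative sum. The function name `sparsemax` and the 2-D `(batch, classes)` input shape are assumptions made for illustration, not this library's API.

```python
import numpy as np

def sparsemax(logits):
    """Row-wise sparsemax for a 2-D array of shape (batch, classes)."""
    z = np.asarray(logits, dtype=float)
    # Sort each row in descending order.
    z_sorted = -np.sort(-z, axis=1)
    k = np.arange(1, z.shape[1] + 1)
    z_cumsum = np.cumsum(z_sorted, axis=1)
    # An entry is in the support while 1 + k * z_sorted[k] > cumsum[k];
    # this holds for a prefix of each sorted row, so counting gives k(z).
    in_support = 1 + k * z_sorted > z_cumsum
    k_z = in_support.sum(axis=1)
    # Threshold tau per row: (sum of the supported entries - 1) / k(z).
    tau = (z_cumsum[np.arange(z.shape[0]), k_z - 1] - 1) / k_z
    # Apply the formula: max(0, logits[i, j] - tau[i]).
    return np.maximum(0.0, z - tau[:, None])
```

For example, `sparsemax([[2.0, 1.0, 0.1]])` gives `[[1.0, 0.0, 0.0]]`, where softmax would assign a non-zero probability to all three classes.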