Sparsemax
The sparsemax activation function is similar to softmax, but it can output sparse probabilities: some classes may receive a probability of exactly zero.
For batch i and class j:
sparsemax(logits)[i,j] = max(0, logits[i,j] - τ(logits[i,:]))
where τ(logits[i,:]) is the threshold that makes the outputs in row i sum to 1.
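The formula can be illustrated with a small pure-Python sketch. It is not this library's implementation; the function names `sparsemax_row` and `sparsemax` are hypothetical, and the threshold τ is computed with the standard sort-based procedure, which is an assumption about how τ is obtained.

```python
def sparsemax_row(logits):
    """Sparsemax of a single row of logits (a list of floats)."""
    # Sort the logits in descending order to find the support size.
    z = sorted(logits, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z, start=1):
        cumsum += z_k
        # The k largest logits stay in the support while
        # 1 + k * z_k exceeds their running sum; the last k that
        # satisfies this determines the threshold tau.
        if 1.0 + k * z_k > cumsum:
            tau = (cumsum - 1.0) / k
    # Apply the formula: max(0, logits[j] - tau) for every class j.
    return [max(0.0, v - tau) for v in logits]


def sparsemax(batch):
    """Apply sparsemax independently to each row i of a batch."""
    return [sparsemax_row(row) for row in batch]


print(sparsemax([[2.0, 1.0, 0.1], [0.5, 0.5, 0.0]]))
# → [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
```

In the first row the largest logit dominates, so the other two classes get a probability of exactly zero, which softmax would never produce.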