Rereading my 'notes' on activation functions (whether the neuron contributes or not), it is all starting to make sense to me. ReLU is actually linear for values above 0 and flat at 0 and below; the kink at zero is what makes the function non-linear as a whole. Interestingly, it is the one that has become favored for deep learning, and also the one that benefits from a small 'correction'. Below, tanh and sigmoid on the job, performing predictably well. For the last one, ReLU, without and with the aid of a correction factor. Indeed, in the last example the learning could be accelerated...
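A minimal sketch of the activations in question, in plain NumPy. The 'correction' is implemented here as a Leaky-ReLU-style slope on the negative side; the slope value 0.01 is my assumption, not something stated in the post:

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs into (0, 1); gradients shrink for large |x|
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered squashing into (-1, 1)
    return np.tanh(x)

def relu(x):
    # Identity above 0, flat 0 at and below; the kink makes it non-linear
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # The 'correction': a small slope (alpha, assumed 0.01 here) on the
    # negative side keeps some gradient flowing for negative inputs
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # [0.  0.  0.  0.5 2. ]
print(leaky_relu(x))  # [-0.02 -0.005 0. 0.5 2. ]
```

With the plain ReLU, a neuron stuck in the flat region gets zero gradient and can stop learning entirely; giving the negative side a small slope is the usual fix, which would fit the observation above that the corrected run learned faster.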