I would recommend both. For regression you can also use a sigmoid or tanh activation in the hidden layer, since it adds the non-linearity that the function approximation needs. For the output layer I would normally pick a linear activation for regression, so the predicted value is not confined to a fixed range, and a squashing function (such as sigmoid or softmax) for classification.
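To make the hidden/output distinction concrete, here is a minimal sketch in plain Python. The function names, the 2-2-1 layer sizes, and the weights are all hypothetical, picked only for illustration; a real model would learn the weights and would normally be built with a library.

```python
import math

def hidden_layer(x, weights, biases):
    """tanh hidden layer: one unit per (weight row, bias) pair."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def linear_output(h, weights, bias):
    """Linear (identity) output: unbounded, suited to regression targets."""
    return sum(w * hi for w, hi in zip(weights, h)) + bias

def sigmoid_output(h, weights, bias):
    """Sigmoid output: squashed into (0, 1), usable as a class probability."""
    z = linear_output(h, weights, bias)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, chosen by hand for this example (not trained).
W_hidden = [[0.5, -1.0], [1.5, 0.25]]
b_hidden = [0.0, -0.5]
w_out = [4.0, -3.0]
b_out = 2.0

x = [1.0, 2.0]
h = hidden_layer(x, W_hidden, b_hidden)      # non-linear features in (-1, 1)
y_reg = linear_output(h, w_out, b_out)       # regression: any real value
p_cls = sigmoid_output(h, w_out, b_out)      # classification: value in (0, 1)

print(h, y_reg, p_cls)
```

The same tanh hidden layer feeds both heads. Only the output activation changes: the linear head can produce values well outside the range of tanh, while the sigmoid head always stays between 0 and 1.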