It looks like a good starting point. I do have a few suggestions:
For scalability, fire() should be restructured so that a neuron that has already fired for the current input set doesn't recalculate its output each time. This matters as soon as you add another hidden layer or more than one output node, because the same neuron's output gets requested several times per forward pass. A caching sketch is below.
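Here's a minimal sketch of that caching idea. I'm assuming your code is Java-like with `inputs` and `weights` lists; the field and method names here (`cachedOutput`, `reset()`, `activate()`) are illustrative, not from your code:

```java
import java.util.ArrayList;
import java.util.List;

public class Neuron {
    private List<Neuron> inputs = new ArrayList<>();
    private List<Double> weights = new ArrayList<>();
    private Double cachedOutput; // null until fire() runs for the current input set

    public double fire() {
        if (cachedOutput == null) { // recalculate only when the cache is stale
            double sum = 0.0;
            for (int i = 0; i < inputs.size(); i++) {
                sum += inputs.get(i).fire() * weights.get(i);
            }
            cachedOutput = activate(sum);
        }
        return cachedOutput;
    }

    // Call this on every neuron before presenting a new input set.
    public void reset() {
        cachedOutput = null;
    }

    protected double activate(double sum) {
        return 1.0 / (1.0 + Math.exp(-sum)); // logistic sigmoid
    }
}
```

Input-layer neurons would override `fire()` to return their fixed input value.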
Consider splitting your threshold calculation into its own method, as with `activate()` in the sketch above. Then you can subclass Neuron and swap in different activation functions (bipolar sigmoid, RBF, linear, etc.); see the example after this point.
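For instance, building on the hypothetical `activate()` hook above, the subclasses stay tiny:

```java
// Subclasses only override the activation step; names are illustrative.
public class BipolarSigmoidNeuron extends Neuron {
    @Override
    protected double activate(double sum) {
        return 2.0 / (1.0 + Math.exp(-sum)) - 1.0; // output in (-1, 1)
    }
}

public class LinearNeuron extends Neuron {
    @Override
    protected double activate(double sum) {
        return sum; // identity activation, useful for output layers in regression
    }
}
```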
To learn more complex functions, add a bias input to each neuron. It's basically another input with its own weight value, except the input is always fixed at 1 (or -1). A sketch follows.
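One way to add that inside the Neuron class sketched earlier (`biasWeight` is a hypothetical field, trained like any other weight):

```java
private double biasWeight;

public double fire() {
    if (cachedOutput == null) {
        double sum = biasWeight * 1.0; // bias: weight times a fixed input of 1
        for (int i = 0; i < inputs.size(); i++) {
            sum += inputs.get(i).fire() * weights.get(i);
        }
        cachedOutput = activate(sum);
    }
    return cachedOutput;
}
```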
Don't forget to allow for training methods. Backpropagation will need something like the inverse of fire(): it takes a target output and ripples the weight changes back through each layer. A rough sketch is below.
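As a rough starting point, here's what the update could look like for a single output neuron, assuming the logistic activation above (whose derivative is `out * (1 - out)`) and a squared-error loss. The method name and structure are illustrative; a full implementation also accumulates deltas in the hidden layers before updating their weights:

```java
public void train(double target, double learningRate) {
    double out = fire();
    double delta = (target - out) * out * (1.0 - out); // output-layer error term
    for (int i = 0; i < weights.size(); i++) {
        double oldWeight = weights.get(i);
        double input = inputs.get(i).fire(); // already cached, so cheap
        weights.set(i, oldWeight + learningRate * delta * input);
        // A hidden neuron would sum delta * oldWeight from each downstream
        // neuron to form its own delta before adjusting its weights.
    }
    biasWeight += learningRate * delta; // bias input is fixed at 1.0
}
```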