Compression - The Glue
Why compression? One reason to put a compressor on the master bus is that you then need less compression on the individual tracks. It reduces the overall dynamic range of the mix, so the mix comes together faster and sounds "gelled". Compression acts like glue; with the right settings, the drums will sit better in the mix, and the whole mix will sound punchier. Lower frequencies carry the most energy, so they usually drive the compressor. In theory this may sound like a bad thing, since a kick drum hit can make the cymbals "duck". However, in my opinion this is the core of punchiness. The right type of pumping is the key. It's something to strive for!
Well, is it mastering then? No, it's not. If you remove the compressor, your mix will most likely collapse, because you made your mix decisions through the compressor. It's nothing to be worried about. Audio engineers have used this trick for ages. No one will ever hear your mix without the compressor. Just keep it there when you print the final mixes for mastering. You are trying to create the best possible mix for the song. Do whatever it takes to make it sound good!

An ordinary compressor might not be the best choice here. Sure, you can use one, but the setup will be much harder. There are many compressors on the market that are designed specifically for bus use. Their controls are usually pretty limited, and that's actually a good thing - you will have an easier time finding the right settings. The guideline is to use a fairly low ratio, a relatively slow attack, and a relatively quick release. My favourite plugins are the various SSL console master bus compressor emulations. The following settings will work with most "bus compressors".
The first adjustment to make is to set the ratio. I have found that the most "universal" ratio is 4:1. It's low enough to not eat the impact but high enough to make the drums punch. I know Andy Wallace likes it too.
The idea is to use a slow attack time so that the compressor doesn't kill your transients. I usually end up using 30 ms. If you find that too slow and want the compressor to bite more, 10 ms is another good option. Attack times shorter than 10 ms might work with some types of material, but I have yet to find any.
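To get a feel for why a slow attack leaves transients alone, here is a rough sketch. It assumes the compressor's attack behaves like a simple exponential (one-pole) smoother whose time constant equals the attack setting; real compressors define "attack time" in different ways, so treat the numbers as an illustration only. The function name and the 5 ms transient length are my own assumptions.

```python
import math

def fraction_of_reduction_applied(elapsed_ms: float, attack_ms: float) -> float:
    """Fraction of the target gain reduction reached after elapsed_ms.

    Assumption: the attack is a one-pole exponential with time constant
    attack_ms. Actual plugins may define attack time differently.
    """
    if attack_ms <= 0:
        raise ValueError("attack_ms must be positive")
    return 1.0 - math.exp(-elapsed_ms / attack_ms)

# How much of the compression has kicked in 5 ms into a drum hit?
for attack_ms in (30.0, 10.0, 1.0):
    frac = fraction_of_reduction_applied(5.0, attack_ms)
    print(f"attack {attack_ms} ms: {frac:.0%} of the target reduction after 5 ms")
```

Under this model, a 30 ms attack has applied only about 15% of the reduction in the first 5 ms, a 10 ms attack about 39%, and a 1 ms attack nearly all of it, which is the "killed transient" case.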
Most of the time I find myself using the fastest release available, usually 100 ms. Auto release may work better with songs that have constantly changing dynamics and tempos. Long release times will introduce more pumping. Setting the release to the length of an eighth note at the song's tempo can be a good starting point.
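The eighth-note starting point is simple arithmetic. A minimal sketch, assuming the tempo is counted in quarter-note beats (the helper name is made up for this example):

```python
def note_length_ms(bpm: float, notes_per_beat: float = 2.0) -> float:
    """Length of one note in milliseconds.

    Assumption: one beat is a quarter note, so notes_per_beat=2 gives
    an eighth note and notes_per_beat=4 a sixteenth note.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return 60_000.0 / bpm / notes_per_beat

for bpm in (90, 120, 150):
    print(f"{bpm} BPM: eighth note = {note_length_ms(bpm):.0f} ms")
```

At 120 BPM an eighth note lasts 250 ms. Most bus compressors only offer a handful of fixed release steps, so pick the nearest one and judge by ear.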
Set the threshold so that you get 3-5 dB of gain reduction on the loudest part of the song. If your mix is peaking at -3 dB, the threshold will usually end up somewhere between -12 and -20 dB.
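The relation between threshold, ratio, and gain reduction can be sketched with the textbook hard-knee formula. This is a steady-state model only: it ignores attack, release, and knee, and the function names are my own. With a slow attack, the level the compressor actually reacts to sits below the peak meter reading, which is one plausible reason the threshold ends up so far below the peaks.

```python
def gain_reduction_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Steady-state gain reduction of an ideal hard-knee compressor (in dB)."""
    over = max(0.0, level_db - threshold_db)
    return over * (1.0 - 1.0 / ratio)

def overshoot_for_reduction_db(reduction_db: float, ratio: float) -> float:
    """How far above the threshold the detected level must be (in dB)
    to produce the requested gain reduction."""
    if ratio <= 1.0:
        raise ValueError("ratio must be greater than 1")
    return reduction_db / (1.0 - 1.0 / ratio)

for target_db in (3.0, 5.0):
    over = overshoot_for_reduction_db(target_db, ratio=4.0)
    print(f"{target_db} dB of reduction at 4:1 needs {over:.1f} dB over the threshold")
```

At 4:1, 3-5 dB of gain reduction corresponds to a detected level of roughly 4 to 6.7 dB above the threshold. In practice you would still set the threshold by watching the gain reduction meter, not by calculation.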
Make-Up Gain
This is used for bringing the lost level back. Try to set the make-up gain to the estimated average amount of gain reduction throughout the song, in this case usually 2-4 dB.
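If you prefer to estimate that average instead of guessing, one simple approach is to note the gain reduction meter reading in each section of the song and average the values. The readings below are hypothetical, and a plain mean is only one way to do it (you could also weight by section length).

```python
from statistics import mean

def estimate_makeup_gain_db(reduction_readings_db):
    """Average a set of gain reduction meter readings (positive dB values)."""
    readings = list(reduction_readings_db)
    if not readings:
        raise ValueError("need at least one reading")
    return mean(readings)

# Hypothetical meter readings, one per song section
readings_db = {"verse": 2.0, "pre-chorus": 3.0, "chorus": 4.5, "bridge": 2.5}
makeup_db = estimate_makeup_gain_db(readings_db.values())
print(f"Suggested make-up gain: {makeup_db:.1f} dB")
```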
There are two different styles of compression: hard knee and soft knee. With a hard knee, compression kicks in at the full selected ratio as soon as the signal exceeds the threshold. Soft-knee compression is more gradual: it starts slightly below the threshold, and the ratio increases progressively around the threshold. Soft knee works better on the master bus, because it sounds more musical and transparent. Most of the time hard knee is too abrupt and aggressive, making the compression sound very obvious.
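The difference between the two knees is easy to see in the static input/output curve. The sketch below uses a commonly published quadratic soft-knee formula; it is not taken from any particular plugin, and the threshold (-16 dB) and knee width (6 dB) are values I picked for illustration.

```python
def compressed_level_db(level_db: float, threshold_db: float = -16.0,
                        ratio: float = 4.0, knee_db: float = 0.0) -> float:
    """Static output level of a compressor in dB.

    knee_db = 0 gives a hard knee. With knee_db > 0, the curve bends
    gradually over a region knee_db wide, centred on the threshold.
    """
    over = level_db - threshold_db
    if knee_db > 0.0 and 2.0 * abs(over) <= knee_db:
        # Inside the knee: quadratic blend between no compression and full ratio
        return level_db + (1.0 / ratio - 1.0) * (over + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    if over > 0.0:
        # Above the knee: full ratio
        return threshold_db + over / ratio
    # Below the knee: signal passes unchanged
    return level_db

print("input   hard-knee GR   soft-knee GR (dB)")
for level in range(-22, -9, 2):
    hard = level - compressed_level_db(level, knee_db=0.0)
    soft = level - compressed_level_db(level, knee_db=6.0)
    print(f"{level:5d}   {hard:12.2f}   {soft:12.2f}")
```

With the soft knee, a little gain reduction already appears just below the threshold, and the two curves meet again once the signal is past the knee region.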

Example Settings
Here are two different master bus compressor settings to try out. The best way to compare them is to mix the same song from scratch twice, once with each setting.
Ratio: 4:1
Attack: 30 ms
Release: 0.1 s
Threshold: set for 3-5 dB of gain reduction
Make-Up Gain: 3.5 dB

Ratio: 4:1
Attack: 10 ms
Release: auto
Threshold: set for 3-5 dB of gain reduction
Make-Up Gain: 3.5 dB

