A comprehensive quality assessment of the ozone products from 18 limb-viewing satellite instruments is provided by means of a detailed inter-comparison. Ozone climatologies in the form of monthly zonal mean time series, covering the upper troposphere to the lower mesosphere, are obtained from LIMS, SAGE I, SAGE II, UARS-MLS, HALOE, POAM II, POAM III, SMR, OSIRIS, SAGE III, MIPAS, GOMOS, SCIAMACHY, ACE-FTS, ACE-MAESTRO, Aura-MLS, HIRDLS, and SMILES for the period 1978-2010. The inter-comparisons focus on mean biases based on monthly and annual zonal mean fields, on inter-annual variability, and on seasonal cycles. Additionally, the physical consistency of the data sets is tested through diagnostics of the quasi-biennial oscillation and the Antarctic ozone hole. The evaluations reveal that the uncertainty in our knowledge of the atmospheric ozone mean state is smallest in the tropical middle stratosphere and in the midlatitude lower/middle stratosphere, where we find a 1σ multi-instrument spread of less than ±5%. While the overall agreement among the climatological data sets is very good for large parts of the stratosphere, individual discrepancies have been identified, including unrealistic month-to-month fluctuations, large biases in particular atmospheric regions, and inconsistencies in the seasonal cycle. Notable differences between the data sets exist in the tropical lower stratosphere and at high latitudes, with a multi-instrument spread of ±30% at the tropical tropopause and ±15% at polar latitudes. In particular, large relative differences are identified in the Antarctic polar cap during the time of the ozone hole, with a spread between the monthly zonal mean fields of ±50%. Differences between the climatological data sets appear to be partly related to inter-instrumental differences in vertical resolution and geographical sampling.
The evaluations as a whole provide guidance on which data sets are most reliable for applications such as studies of ozone variability, model-measurement comparisons, and detection of long-term trends. A detailed comparison against SAGE II data is presented, which can help identify suitable candidates for long-term data merging studies.
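To make the notion of a 1σ multi-instrument spread concrete, the following minimal sketch shows one way such a quantity could be computed from instrument climatologies on a common grid. It is illustrative only and is not taken from the study: the function name `multi_instrument_spread`, the use of the sample standard deviation (n − 1 in the denominator), and the normalisation by the multi-instrument mean are all assumptions.

```python
import numpy as np


def multi_instrument_spread(climatologies):
    """Multi-instrument mean and relative 1-sigma spread (in percent).

    Hypothetical helper, not the method of the study.
    climatologies: array-like of shape (n_instruments, ...) on a common
    grid, with NaN wherever an instrument provides no data.
    """
    data = np.asarray(climatologies, dtype=float)
    valid = ~np.isnan(data)
    n = valid.sum(axis=0)  # number of instruments reporting at each grid point

    # Mean over the instruments that report a value; NaN where none do.
    total = np.where(valid, data, 0.0).sum(axis=0)
    mean = np.full(n.shape, np.nan)
    np.divide(total, n, out=mean, where=n > 0)

    # Sample variance (assumed n - 1 denominator); needs at least two instruments.
    dev2 = np.where(valid, (data - mean) ** 2, 0.0).sum(axis=0)
    var = np.full(n.shape, np.nan)
    np.divide(dev2, n - 1, out=var, where=n > 1)

    # Spread expressed relative to the multi-instrument mean, in percent.
    with np.errstate(invalid="ignore", divide="ignore"):
        spread_percent = 100.0 * np.sqrt(var) / mean
    return mean, spread_percent
```

Under these assumptions, three instruments reporting 4, 5 and 6 (in any common unit) at a grid point give a multi-instrument mean of 5 and a relative spread of 20%.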