


The sampling error of an estimate is the error caused by the selection of a sample instead of conducting a census of the population. Sampling error is reduced by selecting a large sample and by using efficient sample design and estimation strategies such as stratification, optimal allocation, and ratio estimation.
With the use of probability sampling methods in the NHSDA, it is possible to develop estimates of sampling error from the survey data. These estimates have been calculated for all prevalence estimates presented in this report using a Taylor series linearization approach that takes into account the effects of the complex NHSDA design features. The sampling errors are used to identify unreliable estimates and to test for the statistical significance of differences between estimates.
Estimates considered to be unreliable due to unacceptably large sampling error are not shown in this report and are noted by asterisks (*) in the tables in the appendix. The criterion used for suppressing estimates was based on the relative standard error (RSE), defined as the ratio of the standard error to the estimate. The log transformation of the proportion estimate (p) was used to calculate the RSE. Specifically, rates and the corresponding estimated numbers of users were suppressed if:
RSE[ln(p)] > 0.175 when p < .5
or RSE[ln(1 - p)] > 0.175 when p ≥ .5.
Estimates were also suppressed if they rounded to zero or to 100 percent; this occurs if p < .0005 or if p > .9995. Statistical tests of significance have been computed for comparisons of estimates from 1995 with 1994. Results are shown in the Appendix 5 tables. As indicated in the footnotes, significant differences are noted by "a" (significant at the .05 level) and "b" (significant at the .01 level). All changes described in this report as increases or decreases were tested and found to be significant at least at the .05 level, unless otherwise indicated.
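The suppression rule above can be sketched in code. This is a minimal illustration, not the report's actual software: it assumes the delta-method relation RSE[ln(p)] = SE(p) / (p · |ln(p)|), and the function name and signature are hypothetical.

```python
import math

def suppress(p: float, se: float, threshold: float = 0.175) -> bool:
    """Return True if a prevalence estimate p (with standard error se)
    should be suppressed under the RSE-of-log rule sketched here."""
    # Estimates that round to 0 or 100 percent are always suppressed.
    if p < 0.0005 or p > 0.9995:
        return True
    if p < 0.5:
        # Delta method: RSE[ln(p)] = SE(p) / (p * |ln(p)|).
        rse_log = se / (p * abs(math.log(p)))
    else:
        # For p >= .5, the rule is applied to 1 - p instead.
        rse_log = se / ((1.0 - p) * abs(math.log(1.0 - p)))
    return rse_log > threshold
```

For example, a prevalence estimate of 10 percent with a standard error of 5 percentage points would be suppressed under this rule, while the same estimate with a standard error of 1 percentage point would not.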
Nonsampling errors, such as nonresponse and reporting errors, may affect the outcome of significance tests. Also, keep in mind that while a significance level of .05 is used to determine statistical significance in these tables, large differences associated with slightly higher p-values (specifically, those between .05 and .10) may be worth noting along with their p-values. Furthermore, statistically significant differences are not always meaningful, because the magnitude of the difference may be small or because the significance may have occurred simply by chance. In a series of twenty independent tests at the .05 level, one test is expected to indicate significance merely by chance even if there is no real difference in the populations compared. When more than one comparison is made among three or more percentages (e.g., comparing percentages within a table), no attempt has been made to adjust the level of significance to account for making simultaneous inferences (often referred to as multiple comparisons). Therefore, the probability of falsely rejecting the null hypothesis at least once in a family of k comparisons is higher than the significance level given for individual comparisons (in this report, either .01 or .05).
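The multiple-comparisons point can be made concrete with a short calculation. The sketch below (the function name is illustrative) computes the familywise probability of at least one false rejection in k independent tests, which is 1 - (1 - alpha)^k:

```python
def familywise_error(alpha: float, k: int) -> float:
    """Probability of at least one false rejection in k independent
    tests, each conducted at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** k

# At alpha = .05, twenty independent tests carry roughly a 64 percent
# chance of at least one spurious "significant" result, and the expected
# number of false rejections is k * alpha = 1.
for k in (1, 5, 20):
    print(k, round(familywise_error(0.05, k), 3))
```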
When making comparisons of estimates for different population subgroups from the same data year, the covariance term, which is usually small and positive, has typically been ignored. This results in somewhat conservative tests of hypotheses that will sometimes fail to establish statistical significance when in fact it exists.
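A minimal sketch of the kind of test described above is shown below: a two-sided z-test on the difference between two prevalence estimates, ignoring the covariance term. The function name is illustrative, and the standard errors are assumed to come from a design-based variance estimator such as the Taylor series linearization mentioned earlier.

```python
import math
from statistics import NormalDist

def z_test(p1: float, se1: float, p2: float, se2: float):
    """Return (z, two-sided p-value) for H0: p1 == p2, treating the
    two estimates as independent (covariance ignored)."""
    # Var(p1 - p2) ~ Var(p1) + Var(p2) when the covariance is dropped;
    # a small positive covariance would only make the test conservative.
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (p1 - p2) / se_diff
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value
```

For instance, comparing a 12 percent estimate with a 9 percent estimate, each with a 1-percentage-point standard error, yields z ≈ 2.12, which is significant at the .05 level but not at the .01 level.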



This page was last updated on June 16, 2008. 