Appendix B: Statistical Methods and Limitations of the Data
B.1. Target Population
An important limitation of the NHSDA estimates of drug use prevalence is that they are designed to describe only the target population of the survey, i.e., the civilian noninstitutionalized population aged 12 and older. Although this population includes almost 98% of the total U.S. population aged 12 and older, it excludes some important and unique subpopulations that may have very different drug-using patterns. The survey excludes active military personnel, who have been shown to have significantly lower rates of illicit drug use. Persons living in institutional group quarters, such as prisons and residential drug treatment centers, are not included in the NHSDA and have been shown in other surveys to have higher rates of illicit drug use. Also excluded are homeless persons not living in a shelter on the survey date, another population shown to have higher than average rates of illicit drug use. Appendix C describes other surveys that provide data for these populations.
B.2. Sampling Error and Statistical Significance
The sampling error of an estimate is the error caused by the selection of a sample instead of conducting a census of the population. Sampling error is reduced by selecting a large sample and by using efficient sample design and estimation strategies such as stratification, optimal allocation, and ratio estimation.
With the use of probability sampling methods in the NHSDA, it is possible to develop estimates of sampling error from the survey data. These estimates have been calculated for all prevalence estimates presented in this report using a Taylor series linearization approach that takes into account the effects of the complex NHSDA design features. The sampling errors are used to identify unreliable estimates and to test for the statistical significance of differences between estimates.
Variance Estimation for Totals
Estimates of proportions, such as drug use prevalence rates, take the form of nonlinear statistics whose variances cannot be expressed in closed form. Variance estimation for nonlinear statistics is performed using a first-order Taylor series approximation in RTI's SUDAAN software package. The approximation is unbiased for sufficiently large samples and has proven to be at least as accurate as, and less costly to implement than, competitors such as balanced repeated replication or jackknife methods (Rao and Wu, 1985).
Corresponding to the proportion estimates, \hat{p}_{d}, the number of drug users, Y_{d}, can be estimated as
\hat{Y}_{d} = \hat{N}_{d} \hat{p}_{d},
where \hat{N}_{d} is the estimated population total for domain d, and \hat{p}_{d} is the estimated proportion for domain d. The standard error for the total estimate is obtained by multiplying the standard error of the proportion by \hat{N}_{d}, i.e.,
SE(\hat{Y}_{d}) = \hat{N}_{d} SE(\hat{p}_{d}).
This approach is theoretically correct when the domain size estimates are among those forced to Census Bureau population projections through the weight calibration process. In these cases, \hat{N}_{d} is clearly not subject to sampling error.
For domain totals Y_{d} where \hat{N}_{d} is not fixed, this formulation may still provide a good approximation if we can reasonably assume that the sampling variation in \hat{N}_{d} is negligible relative to the sampling variation in \hat{p}_{d}. In most analyses conducted for prior years, this has been a reasonable assumption.
For some of the tables produced from the 2000 data, it was clear that the above approach yielded an underestimate of the variance of a total because \hat{N}_{d} was subject to considerable variation. In these cases, a different method was used to estimate variances. SUDAAN provides an option to directly estimate the variance of the linear statistic that estimates a population total. Using this option did not affect the standard error estimates for the corresponding proportions presented in the same sets of tables.
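The fixed-total case described above can be illustrated with a minimal sketch. The function name and the input values are hypothetical and are not NHSDA estimates; the sketch assumes the domain size is a calibrated control total with no sampling error, so it does not cover the direct variance estimation that SUDAAN performs when the domain size varies.

```python
import math


def total_and_se(n_hat: float, p_hat: float, se_p: float) -> tuple[float, float]:
    """Return (estimated total, standard error of the total).

    Assumes n_hat is a fixed (calibrated) domain size, so the only
    sampling variation comes from the estimated proportion p_hat.
    """
    y_hat = n_hat * p_hat   # estimated number of users in the domain
    se_y = n_hat * se_p     # standard error scales by the same constant
    return y_hat, se_y


# Hypothetical inputs: a domain of 20 million persons, 10% prevalence,
# and a standard error of 0.5 percentage points on the proportion.
y_hat, se_y = total_and_se(n_hat=20_000_000, p_hat=0.10, se_p=0.005)

# With a fixed domain size, the relative standard error of the total
# equals the relative standard error of the proportion.
rse_total = se_y / y_hat
rse_proportion = 0.005 / 0.10
```

Because the total and its standard error are scaled by the same constant, a suppression decision based on the relative standard error of the proportion carries over unchanged to the estimated number of users in this case.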
Suppression Criteria for Unreliable Estimates
As was done in the past, direct survey estimates considered to be unreliable due to unacceptably large sampling errors are not shown in this report, and are noted by asterisks (*) in the tables containing such estimates found in the appendices. The criterion used for suppressing all direct survey estimates was based on the relative standard error (rse), which is defined as the ratio of the standard error (se) over the estimate.
Proportion estimates (p) within the range [0 < p < 1], rates, and corresponding estimated numbers of users were suppressed if:
rse[-ln(p)] > 0.175 when p < 0.5
or
rse[-ln(1 - p)] > 0.175 when p > 0.5.
Using a first-order Taylor series approximation to estimate rse[-ln(p)] and rse[-ln(1 - p)], we have the following, which was used for computational purposes:
[se(p)/p] / [-ln(p)] > 0.175 when p < 0.5
or
[se(p)/(1 - p)] / [-ln(1 - p)] > 0.175 when p > 0.5.
The separate formulae for p < 0.5 and p > 0.5 produce a symmetric suppression rule; that is, if p is suppressed, then so is 1 - p. This is an ad hoc rule that requires an effective sample size in excess of 50. When 0.05 < p < 0.95, the symmetric properties of the rule produce a local maximum effective sample size of 68 at p = 0.5. Thus, estimates with p near 0.5 and effective sample sizes below 68 are suppressed. A local minimum effective sample size of 50 occurs at p = 0.2 and again at p = 0.8 within this same interval; so, estimates with p near these values are suppressed when the effective sample size is below 50.
In previous NHSDA surveys, these varying sample size restrictions sometimes produced unusual occurrences of suppression for a particular combination of prevalence rates. For example, in some cases, lifetime prevalence rates near p = 0.5 were suppressed (effective sample size was less than 68 but greater than 50), while the corresponding past year or past month estimates near p = 0.2 were not suppressed (effective sample sizes were greater than 50). To reduce the occurrence of this type of inconsistency, a minimum effective sample size of 68 was added to the suppression criteria in the 2000 NHSDA. As p approaches 0.00 or 1.00 outside the interval (0.05, 0.95), the suppression criteria will still require increasingly larger effective sample sizes. For example, for p = 0.01 and p = 0.001, the effective sample size must exceed 152 and 684, respectively.
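The effective sample sizes quoted above can be reproduced with a short sketch. It assumes, as an illustration only, that the standard error is written as se(p) = sqrt(p(1 - p)/n_eff), which is the usual definition of an effective sample size n_eff; setting the suppression ratio equal to 0.175 and solving for n_eff gives the threshold. The function name is hypothetical.

```python
import math


def min_effective_n(p: float, threshold: float = 0.175) -> float:
    """Effective sample size at which the log-based rule is exactly met.

    Assumes se(p) = sqrt(p * (1 - p) / n_eff). By the symmetry of the
    rule, p and 1 - p give the same answer, so work with the smaller one.
    """
    q = min(p, 1.0 - p)
    # [se/q] / [-ln(q)] = threshold  =>  n_eff = (1 - q) / (q * (threshold * ln q)^2)
    return (1.0 - q) / (q * (threshold * math.log(q)) ** 2)


for p in (0.5, 0.2, 0.8, 0.01, 0.001):
    print(f"p = {p}: effective n must exceed about {min_effective_n(p):.1f}")
```

Rounded to whole numbers, the thresholds for p = 0.5, 0.2, 0.01, and 0.001 come out at 68, 50, 152, and 684, matching the values given in the text.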
Also new to the 2000 survey is a minimum nominal sample size suppression criterion (n = 100) that protects against unreliable estimates caused by small design effects and small nominal sample sizes. Prevalence estimates are also suppressed if they are close to zero or 100 percent (i.e., if p < 0.00005 or if p > 0.99995).
Estimates of other totals (e.g., number of initiates) along with means and rates (both not bounded between 0 and 1) are suppressed if:
rse(p) > 0.5.
Additionally, estimates of mean age at first use were suppressed if the sample size was smaller than 10 respondents; also, the estimated incidence rate and number of initiates were suppressed if they rounded to 0.
The suppression criteria for various NHSDA estimates are summarized in Table B.1 below.
Table B.1. Summary of 2000 NHSDA Suppression Rules
| Estimate | Suppress if: |
|---|---|
| Prevalence rate, p, with nominal sample size, n, and design effect, deff | The estimated prevalence rate, p, is less than 0.00005 or greater than 0.99995, or [se(p)/p] / [-ln(p)] > 0.175 when p < 0.5, or [se(p)/(1 - p)] / [-ln(1 - p)] > 0.175 when p > 0.5, or effective n < 68, or n < 100, where effective n = n/deff. Note: The rounding portion of this suppression rule for prevalence rates will produce some estimates that round at one decimal place to 0.0% or 100.0% but are not suppressed from the tables. |
| Estimated number | The estimated prevalence rate, p, is suppressed. Note: In some instances when p is not suppressed, the estimated number may appear as a 0 in the tables; this means that the estimate is greater than 0 but less than 500 (estimated numbers are shown in thousands). |
| Mean age at first use, \bar{x}, with nominal sample size, n | rse(\bar{x}) > 0.5, or n < 10 |
| Incidence rate, r | Rounds to less than 0.1 per thousand person-years of exposure, or rse(r) > 0.5 |
| Number of initiates, t | Rounds to less than 1,000 initiates, or rse(t) > 0.5 |
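The prevalence-rate rules summarized in Table B.1 can be sketched as a single check. This is an illustrative reading of the criteria described in this section, not the production code used for the report; the function and argument names are hypothetical, and the effective sample size is taken to be the nominal sample size divided by the design effect.

```python
import math


def suppress_prevalence(p: float, se: float, n: int, deff: float,
                        threshold: float = 0.175) -> bool:
    """Return True if a prevalence estimate would be suppressed."""
    # Estimates that are essentially 0% or 100%.
    if p < 0.00005 or p > 0.99995:
        return True
    # Minimum nominal and effective sample sizes.
    if n < 100 or n / deff < 68:
        return True
    # Log-based relative standard error rule, symmetric in p and 1 - p.
    q = min(p, 1.0 - p)
    return (se / q) / (-math.log(q)) > threshold


def suppress_other(estimate: float, se: float) -> bool:
    """rse rule for totals, means, and rates not bounded by 0 and 1."""
    return se / estimate > 0.5


# Hypothetical inputs, not taken from the survey.
print(suppress_prevalence(p=0.20, se=0.02, n=500, deff=2.0))  # precise enough
print(suppress_prevalence(p=0.20, se=0.08, n=500, deff=2.0))  # rse rule fails
print(suppress_prevalence(p=0.20, se=0.02, n=120, deff=2.0))  # effective n = 60
```

The additional rounding and minimum-respondent conditions for mean age at first use, incidence rates, and numbers of initiates are omitted from the sketch for brevity.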
Statistical Significance of Differences
This section describes the methods that were used to compare the prevalence estimates in this report. Customarily, the observed difference between estimates is evaluated in terms of its statistical significance. "Statistical significance" refers to the probability that a difference as large as that observed would occur due to random error in the estimates if there were no difference in the prevalence rates for the population groups being compared. The significance of observed differences in this report is generally reported at the 0.05 and 0.01 levels. When making comparisons between the 1999 and 2000 prevalence estimates, one can test the null hypothesis (no difference in the 1999 and 2000 prevalence rates) against the alternative hypothesis (there is a difference in prevalence rates) using the standard difference in proportions test expressed as
Z = \frac{p_{1} - p_{2}}{\sqrt{var(p_{1}) + var(p_{2}) - 2 cov(p_{1}, p_{2})}},
where p_{1} = 1999 estimate, p_{2} = 2000 estimate, var(p_{1}) = variance of 1999 estimate, var(p_{2}) = variance of 2000 estimate, and cov(p_{1}, p_{2}) = covariance between p_{1} and p_{2}.
Under the null hypothesis, Z is asymptotically distributed as a standard normal random variable. Calculated values of Z can therefore be referred to the unit normal distribution to determine the corresponding probability level (i.e., p-value). Since there is a 50 percent overlap in the sampled segments between the 1999 and 2000 NHSDAs, the covariance term in the formula for Z will, in general, be greater than zero. Estimates of Z along with its p-value were calculated using SUDAAN, the software package from the Research Triangle Institute (RTI), using the analysis weights and accounting for the sample design as described in Appendix A. A similar procedure and formula for Z are used for estimated totals.
When making comparisons of estimates for different population subgroups from the same data year, the covariance term, which is usually small and positive, was ignored. This results in somewhat conservative tests of hypotheses that sometimes fail to establish statistical significance when in fact it exists.
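The effect of the covariance term can be seen in a minimal sketch of the difference-in-proportions test. The estimates, variances, and covariance below are hypothetical; in the report these quantities come from SUDAAN with the full sample design taken into account. The two-sided p-value uses the standard normal identity p = erfc(|Z| / sqrt(2)).

```python
import math


def z_test(p1: float, p2: float, var1: float, var2: float,
           cov: float = 0.0) -> tuple[float, float]:
    """Return (Z, two-sided p-value) for the difference p1 - p2."""
    z = (p1 - p2) / math.sqrt(var1 + var2 - 2.0 * cov)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical estimates with equal variances.
z_ignored, pval_ignored = z_test(0.10, 0.12, 0.0001, 0.0001)           # covariance ignored
z_used, pval_used = z_test(0.10, 0.12, 0.0001, 0.0001, cov=0.00005)    # positive covariance

print(f"covariance ignored: Z = {z_ignored:.3f}, p = {pval_ignored:.4f}")
print(f"covariance used:    Z = {z_used:.3f}, p = {pval_used:.4f}")
```

With these inputs, dropping a positive covariance enlarges the denominator, shrinks |Z|, and moves the p-value from below 0.05 to above it, which is the sense in which ignoring the term makes the test conservative.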
B.3. Nonsampling Error
Nonsampling errors can occur from nonresponse, coding errors, computer processing errors, errors in the sampling frame, reporting errors, and other errors not due to sampling. Nonsampling errors are reduced through data editing, statistical adjustments for nonresponse, close monitoring and periodic retraining of interviewers, and improvement in various quality control procedures.
Although nonsampling errors can often be much larger than sampling errors, measurement of most nonsampling errors is difficult or impossible. However, some indication of the effects of some types of nonsampling errors can be obtained through proxy measures such as response rates and from other research studies.
Screening and Interview Response Rate Patterns
Response rates for the NHSDA were stable for the period of 1994-1998, with the screening response rate at about 93% and the interview response rate at about 78% (response rates discussed in this Appendix are weighted). In 1999, the CAI screening response rate was 89.6% and the interview response rate was about 68.6%. A more stable and experienced field interviewer workforce improved these rates in 2000. Of the 182,576 eligible households sampled for the 2000 NHSDA main study, 169,769 were successfully screened for a weighted screening response rate of 92.8% (Table B.2). In these screened households, a total of 91,961 sample persons were selected, and completed interviews were obtained from 71,764 of these sample persons, for a weighted interview response rate of 73.9%. A total of 10,109 (15.0%) sample persons were classified as refusals, 4,834 (5.5%) were not available or never at home, and 5,254 (5.5%) did not participate for various other reasons, such as physical or mental incompetence or language barrier (Table B.3). Tables B.4 and B.5 show the distribution of the selected sample by interview code and age group. The weighted interview response rate was highest among 12- to 17-year-olds (82.6%), females (75.1%), blacks and Hispanics (76.2% and 78.0%, respectively), in nonmetropolitan areas (77.6%), and among persons residing in the South (76.4%) (Table B.6).
The increase in nonresponse between the 1998 and 1999 NHSDAs can be attributed primarily to the hiring of many new and inexperienced Field Interviewers in 1999 and a larger than usual turnover. By the end of 2000, the interviewer workforce consisted primarily of experienced interviewers, and fewer were leaving for other jobs. In 1999, there were 1,997 Field Interviewers hired and trained to conduct the computer-assisted interviewing (CAI) and paper-and-pencil interviewing (PAPI) surveys. More than a third of them (37.7%) did not complete the survey year. In 2000, the number of trained interviewers decreased to 1,356 (since only CAI interviews were conducted in 2000), and the attrition rate dropped to 29.8%. Both prior NHSDA experience and on-the-job experience were shown to be related to nonresponse. Previously experienced interviewers and interviewers with one, two, or three quarters of on-the-job experience were more successful at obtaining an interview.
The overall weighted response rate, defined as the product of the weighted screening response rate and weighted interview response rate, was 61.5% in 1999 and 68.6% in 2000 (a relative improvement of 11.5 percent over the 1999 rate). Nonresponse bias can be expressed as the product of the nonresponse rate (1 - R), where R is the response rate, and the difference in the characteristic of interest between respondents and nonrespondents in the population (P_{r} - P_{nr}). Thus, assuming the quantity (P_{r} - P_{nr}) is fixed over time, the improvement in response rates in 2000 will result in estimates with lower nonresponse bias.
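The arithmetic behind these statements can be sketched as follows. The screening and interview rates for 2000 are the weighted percentages from Tables B.2 and B.3; the respondent and nonrespondent prevalence values are hypothetical and serve only to show how the bias shrinks as the response rate rises. The bias expression used is the standard identity (1 - R)(P_{r} - P_{nr}), and the function names are illustrative.

```python
def overall_response_rate(screening_rate: float, interview_rate: float) -> float:
    """Overall rate as the product of screening and interview rates."""
    return screening_rate * interview_rate


def nonresponse_bias(response_rate: float, p_respondents: float,
                     p_nonrespondents: float) -> float:
    """Bias of the respondent-based estimate: (1 - R) * (P_r - P_nr)."""
    return (1.0 - response_rate) * (p_respondents - p_nonrespondents)


# 2000 NHSDA: weighted screening 92.84%, weighted interview 73.93%.
rate_2000 = overall_response_rate(0.9284, 0.7393)

# Hypothetical prevalence among respondents (10%) and nonrespondents (12%),
# held fixed across years, with the overall rates reported in the text.
bias_1999 = nonresponse_bias(0.615, 0.10, 0.12)
bias_2000 = nonresponse_bias(0.686, 0.10, 0.12)

print(f"overall 2000 response rate: {rate_2000:.1%}")
print(f"bias at 1999 rate: {bias_1999:+.4f}; at 2000 rate: {bias_2000:+.4f}")
```

Under the fixed-difference assumption, the magnitude of the bias is proportional to the nonresponse rate, so moving from 61.5% to 68.6% reduces it by roughly 18 percent in this illustration.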
Inconsistent Responses and Item Nonresponse
Among survey participants, item response rates were above 98% for most questionnaire items. However, inconsistent responses for some items, including the drug use items, are common. Estimates of substance use from the NHSDA are based on the responses to multiple questions by respondents, so that the maximum amount of information is used in determining whether a respondent is classified as a drug user. Inconsistencies in responses are resolved through a logical editing process that involves some judgment on the part of survey analysts and is a potential source of nonsampling error. Because of the automatic routing through the CAI questionnaire (e.g., lifetime drug use questions which skip entire modules when answered "no"), there is less editing of this type than in the PAPI questionnaire used in previous years.
Table B.2. Weighted Percent and Sample Size for 1999 and 2000 NHSDAs by Screening Result Code
| Screening Result | 1999 NHSDA Sample Size | 1999 NHSDA Weighted Percent | 2000 NHSDA Sample Size | 2000 NHSDA Weighted Percent |
|---|---|---|---|---|
| Total Sample | 223,868 | 100.00 | 215,860 | 100.00 |
| Ineligible Cases | 36,026 | 15.78 | 33,284 | 15.09 |
| Eligible Cases | 187,842 | 84.22 | 182,576 | 84.91 |
| Ineligibles | 36,026 | 100.00 | 33,284 | 100.00 |
| Vacant | 18,034 | 49.71 | 16,796 | 50.76 |
| Not a Primary Residence | 4,516 | 12.90 | 4,506 | 13.26 |
| Not a Dwelling Unit | 4,626 | 12.70 | 3,173 | 9.33 |
| All Military Personnel | 482 | 1.22 | 414 | 1.21 |
| Other, Ineligible | 8,368 | 23.46 | 8,395 | 25.43 |
| Eligible Cases | 187,842 | 100.00 | 182,576 | 100.00 |
| Screening Complete | 169,166 | 89.63 | 169,769 | 92.84 |
| No One Selected | 101,537 | 54.19 | 99,999 | 55.36 |
| One Selected | 44,436 | 23.63 | 46,981 | 25.46 |
| Two Selected | 23,193 | 11.82 | 22,789 | 12.03 |
| Screening Not Complete | 18,676 | 10.37 | 12,807 | 7.16 |
| No One Home | 4,291 | 2.38 | 3,238 | 1.82 |
| Respondent Unavailable | 651 | 0.36 | 415 | 0.24 |
| Physically or Mentally Incompetent | 419 | 0.24 | 310 | 0.16 |
| Language Barrier - Hispanic | 102 | 0.06 | 83 | 0.05 |
| Language Barrier - Other | 486 | 0.28 | 434 | 0.27 |
| Refusal | 11,097 | 5.92 | 7,535 | 4.14 |
| Other, Access Denied | 1,536 | 1.08 | 748 | 0.45 |
| Other, Eligible | 38 | 0.02 | 7 | 0.00 |
| Other, Problem Case | 56 | 0.03 | 37 | 0.02 |
Table B.3. Weighted Percent and Sample Sizes for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 12 or Older
| Final Interview Code | 1999 NHSDA Sample Size | 1999 NHSDA Weighted Percent | 2000 NHSDA Sample Size | 2000 NHSDA Weighted Percent |
|---|---|---|---|---|
| Total Selected Persons | 89,883 | 100.00 | 91,961 | 100.00 |
| Interview Complete | 66,706 | 68.55 | 71,764 | 73.93 |
| No One at Dwelling Unit | 1,795 | 2.13 | 1,776 | 2.02 |
| Respondent Unavailable | 3,897 | 4.53 | 3,058 | 3.52 |
| Break-Off | 50 | 0.07 | 72 | 0.09 |
| Physically/Mentally Incompetent | 1,017 | 2.62 | 1,053 | 2.57 |
| Language Barrier - Spanish | 168 | 0.12 | 109 | 0.08 |
| Language Barrier - Other | 480 | 1.46 | 441 | 1.06 |
| Refusal | 11,276 | 17.98 | 10,109 | 14.99 |
| Parental Refusal | 2,888 | 1.01 | 2,655 | 0.88 |
| Other | 1,606 | 1.53 | 924 | 0.86 |
Table B.4. Weighted Percent and Sample Sizes for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 12 to 17
| Final Interview Code | 1999 NHSDA Sample Size | 1999 NHSDA Weighted Percent | 2000 NHSDA Sample Size | 2000 NHSDA Weighted Percent |
|---|---|---|---|---|
| Total Selected Persons | 32,011 | 100.00 | 31,242 | 100.00 |
| Interview Complete | 25,384 | 78.07 | 25,756 | 82.58 |
| No One at Dwelling Unit | 322 | 1.09 | 278 | 0.86 |
| Respondent Unavailable | 872 | 3.04 | 617 | 2.05 |
| Break-Off | 13 | 0.03 | 18 | 0.05 |
| Physically/Mentally Incompetent | 244 | 0.76 | 234 | 0.76 |
| Language Barrier - Spanish | 15 | 0.03 | 10 | 0.03 |
| Language Barrier - Other | 58 | 0.18 | 50 | 0.20 |
| Refusal | 1,808 | 5.97 | 1,455 | 4.52 |
| Parental Refusal | 2,885 | 9.50 | 2,641 | 8.35 |
| Other | 410 | 1.33 | 183 | 0.59 |
Table B.5. Weighted Percent and Sample Size for 1999 and 2000 NHSDA by Final Interview Code Among Persons Aged 18 or Older
| Final Interview Code | 1999 NHSDA Sample Size | 1999 NHSDA Weighted Percent | 2000 NHSDA Sample Size | 2000 NHSDA Weighted Percent |
|---|---|---|---|---|
| Total Selected Persons | 57,872 | 100.00 | 60,719 | 100.00 |
| Interview Complete | 41,322 | 67.41 | 46,008 | 72.92 |
| No One at Dwelling Unit | 1,473 | 2.25 | 1,498 | 2.16 |
| Respondent Unavailable | 3,025 | 4.71 | 2,441 | 3.69 |
| Break-Off | 37 | 0.07 | 54 | 0.09 |
| Physically/Mentally Incompetent | 773 | 2.85 | 819 | 2.78 |
| Language Barrier - Spanish | 153 | 0.13 | 99 | 0.09 |
| Language Barrier - Other | 422 | 1.62 | 391 | 1.16 |
| Refusal | 9,468 | 19.41 | 8,654 | 16.22 |
| Parental Refusal | 3 | 0.00 | 14 | 0.01 |
| Other | 1,196 | 1.55 | 741 | 0.89 |
Table B.6. Response Rates and Sample Sizes for the 1999 and 2000 NHSDAs by Demographic Characteristics
| Characteristic | 1999 NHSDA Selected Persons | 1999 NHSDA Completed Interviews | 1999 NHSDA Weighted Response Rate | 2000 NHSDA Selected Persons | 2000 NHSDA Completed Interviews | 2000 NHSDA Weighted Response Rate |
|---|---|---|---|---|---|---|
| Total | 89,883 | 66,706 | 68.55% | 91,961 | 71,764 | 73.93% |
| Age | | | | | | |
| 12-17 | 32,011 | 25,384 | 78.07% | 31,242 | 25,756 | 82.58% |
| 18-25 | 30,439 | 22,151 | 71.21% | 29,424 | 22,849 | 77.34% |
| 26 or Older | 27,433 | 19,171 | 66.76% | 31,295 | 23,159 | 72.17% |
| Gender | | | | | | |
| Male | 43,883 | 31,987 | 67.12% | 44,899 | 34,375 | 72.68% |
| Female | 46,000 | 34,719 | 69.81% | 47,062 | 37,389 | 75.09% |
| Race/Ethnicity | | | | | | |
| Hispanic | 11,203 | 8,755 | 74.59% | 11,454 | 9,396 | 77.95% |
| Non-Hispanic, White | 63,211 | 46,272 | 67.98% | 64,517 | 49,631 | 73.39% |
| Non-Hispanic, Black | 10,552 | 8,044 | 70.39% | 10,740 | 8,638 | 76.19% |
| Non-Hispanic, All Other Races | 4,917 | 3,635 | 59.28% | 5,250 | 4,099 | 67.31% |
| Region | | | | | | |
| Northeast | 16,794 | 11,830 | 64.03% | 18,959 | 14,394 | 71.68% |
| Midwest | 24,885 | 18,103 | 69.63% | 25,428 | 19,355 | 73.23% |
| South | 27,390 | 21,018 | 70.93% | 27,217 | 22,041 | 76.38% |
| West | 20,814 | 15,755 | 67.47% | 20,357 | 15,974 | 72.68% |
| County Type | | | | | | |
| Large Metro | 36,101 | 25,901 | 65.15% | 37,754 | 28,744 | 71.77% |
| Small Metro | 30,642 | 22,612 | 69.98% | 31,400 | 24,579 | 74.96% |
| Nonmetro | 23,140 | 18,193 | 74.97% | 22,807 | 18,441 | 77.58% |
In addition, less logical editing is used with the CAI data because statistical imputation is relied upon more heavily to determine the final values of drug use variables in cases where logical editing could potentially be used to make a determination. The combined amount of editing and imputation in the CAI data is still considerably less than the total amount used in prior PAPI surveys. For the 2000 CAI data, for example, 3.2% of the estimate of past month hallucinogen use is based on logically edited cases and 5.4% on imputed cases, for a combined amount of 8.6%. For the 1999 CAI data, 1.7% of the estimate of past month hallucinogen use is based on logically edited cases and 4.6% on imputed cases, for a combined amount of 6.2%. In the 1998 NHSDA (administered using PAPI), the amount of editing and imputation for past month hallucinogen use was 60% and 0%, respectively, for a total of 60%. The combined amount of editing and imputation for the estimate of past month heroin use is 5.0% for the 2000 CAI, 14.8% for the 1999 CAI, and 37.0% for the 1998 PAPI data.
Imputation Error in the 1999 NHSDA Estimates
While working on the 2000 NHSDA imputations, a programming error was discovered in the 1999 imputations of recency of use, frequency of use, and age at first use for several drugs. This error resulted in overestimates of past year and past month use of marijuana, inhalants, heroin, and alcohol. Thus, estimates such as past month any illicit drug use and use of any illicit drug other than marijuana were also affected. The error was limited to cases that did not have complete recency information, where it was necessary to maintain consistency between the 30-day frequency and 12-month frequency data during the imputation process. This error did not affect lifetime use measures. Because of the sequential nature of the imputation procedures (i.e., imputed values for a substance processed early are used subsequently in the imputation of data on other substances), it was necessary to reimpute recency of use, frequency of use, and age at first use measures for all substances. Rerunning the imputations for all substances provided the opportunity to employ several minor enhancements to the imputation procedure that had been developed for the 2000 data, thereby improving consistency between the 1999 and 2000 estimates. Due to these enhancements and the random nature of the imputation process, the revised 1999 substance use estimates are slightly different from those previously published for all substances. Below is a discussion of how the error was discovered and the corrective actions that were taken. More information about the statistical imputation procedures used in the NHSDA data can be found in Appendix A. A more complete discussion of the imputation error can be found in the 1999 NHSDA Methodological Resource Book, Section 4.
How the Error Was Discovered:
New quality control checks were instituted on the 2000 imputations of substance use variables. These checks were also applied to the 1999 data, revealing unusual imputation results for alcohol, marijuana, inhalant, and heroin use variables. Results showed that a large proportion of respondents who were known lifetime users, but had missing recency information, had been imputed to be past month and past year users. Further checking of computer programs involved in the imputation of these variables identified the error.
Description of the Error:
If a respondent is a past month user of one of these four substances, he or she should have values for frequency of use in the past month and in the past year. Legitimate values for users are 1 to 30 for past month frequency and 1 to 365 for past year frequency. (For the 12-month frequency, the variable that is actually used in the imputation of missing values is the proportion of the past year that the donor used a particular drug.) However, if the respondent is a user of a substance in the past year but not the past month, he or she would not have a value for the 30-day frequency of use variable. Moreover, respondents who did not use a substance in the past year would not have values for either of the frequency of use variables. Before the NHSDA imputation programs are run, the editing procedures assign "skip" codes for the frequency of use variables for these respondents for whom frequency information is not present: a "93" for the 30-day frequency variables and a "993" for the 12-month frequency variables.
For NHSDA respondents with missing values for certain key items (such as recency and frequency of substance use), the imputation procedure involves defining a "donor pool" which consists of respondents with complete data that can be "donated" to the respondents with missing data. This process is done within subgroups of users based on the amount of information that is known. For example, respondents with missing data on lifetime use of a substance draw from a donor pool that includes both users and nonusers, but respondents who are known to be lifetime users but have unknown recency draw from a donor pool of lifetime users, excluding the nonusers. For many of the substance use measures, the imputation is multivariate, meaning that a respondent with more than one item missing will receive imputed values for all those missing items from a single donor.
The donor pool for respondents whose recency is not completely known should consist of respondents with a variety of values for recency and frequency of use, including skip codes for frequency of use where applicable. For example, if a respondent is a lifetime user of marijuana but past year and past month use information is missing, donors consist of the following possibilities:
past month user with valid values for 12-month frequency of use and 30-day frequency of use;
past year but not past month user with valid values for 12-month frequency of use and the skip code for 30-day frequency of use (93);
lifetime but not past year user with skip codes for 12-month frequency of use (993) and 30-day frequency of use (93), and missing values for the proportion of the past year that the donor used.
One of the constraints built into the imputation programs is to make sure that each respondent's 12-month frequency of use is greater than his or her 30-day frequency, provided he or she is a past month user. Thus, potential donors are checked to make sure that when their frequency-of-use information is donated to a respondent with missing data, it is consistent with preexisting frequency-of-use data for that respondent. The error resulted from implementing this check across all potential donors, regardless of their recency of use. As a result, skip codes and missing data values were incorrectly applied in comparisons that were designed to work only with valid frequency of use values. Many potential donors who were past year but not past month users were excluded from the donor pool because their past year frequency was less than 93, the skip code for 30-day frequency of use. Even more significant, potential donors who were lifetime but not past year users were entirely excluded from the donor pool because the proportion of the past year that the donor used was correctly coded to a missing value for these cases. The donated 12-month frequency that was derived from this proportion was therefore also missing. These missing values were then compared with the past month frequency skip code (93) and determined to be smaller by the software used (SAS). The result of these donor pool restrictions was that for respondents who were known lifetime users of any of the four drugs but had missing information on recency of use, the imputation procedure applied a donor pool made up entirely of past year users, most of whom were past month users.
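The mechanism of the error can be illustrated with a small sketch. It is not the NHSDA imputation code (which was written in SAS); the donor records, field names, and functions below are hypothetical, and a missing value is mapped to negative infinity only to mimic the SAS convention that a missing numeric value compares as smaller than any number.

```python
SKIP_30DAY = 93  # skip code for the 30-day frequency variable

# Hypothetical donors who are all lifetime users of the substance.
donors = [
    {"id": "A", "recency": "past_month", "freq30": 10, "freq12": 120},
    {"id": "B", "recency": "past_year", "freq30": SKIP_30DAY, "freq12": 40},
    {"id": "C", "recency": "past_year", "freq30": SKIP_30DAY, "freq12": 200},
    {"id": "D", "recency": "lifetime", "freq30": SKIP_30DAY, "freq12": None},
]


def sas_like(value):
    """Mimic SAS ordering: a missing value sorts below every number."""
    return float("-inf") if value is None else value


def pool_with_error(candidates):
    """Consistency check applied to every donor, regardless of recency."""
    return [d["id"] for d in candidates
            if sas_like(d["freq12"]) >= d["freq30"]]


def pool_recency_aware(candidates):
    """Check applied only where it is meaningful: past month users."""
    return [d["id"] for d in candidates
            if d["recency"] != "past_month" or d["freq12"] >= d["freq30"]]


print(pool_with_error(donors))     # B and D are dropped
print(pool_recency_aware(donors))  # all four donors remain eligible
```

In the sketch, donor B is lost because 40 is compared with the skip code 93, and donor D is lost because a missing value is treated as smaller than 93, leaving a pool that contains only past year users.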
Table B.7. Comparison of Original And Revised Estimates of Percentages Reporting Past Year and Past Month Use of Illicit Drugs and Alcohol Among Persons Aged 12 or Older: 1999
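| Drug | Past Year: 1999 Original | Past Year: 1999 Revised | Past Month: 1999 Original | Past Month: 1999 Revised |
|---|---|---|---|---|
| Any Illicit Drug^{1} | 11.9 | 11.5 | 6.7 | 6.3 |
| Marijuana and Hashish | 8.9 | 8.6 | 5.1 | 4.7 |
| Heroin | 0.2 | 0.2 | 0.1 | 0.1 |
| Inhalants | 1.1 | 0.9 | 0.5 | 0.3 |
| Any Illicit Drug Other Than Marijuana^{1} | 6.3 | 6.1 | 2.9 | 2.7 |
| Alcohol | 62.6 | 62.3 | 47.3 | 46.4 |
| Binge Use | -- | -- | 20.2 | 20.2 |
| Heavy Use | -- | -- | 5.6 | 5.7 |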
See footnotes at the end of Table B.8.
Table B.8. Comparison of Original And Revised 1999 Estimates of Percentages Reporting Past Year and Past Month Use of Illicit Drugs and Alcohol Among Persons Aged 12 to 17: 1999
| Drug | Past Year: 1999 Original | Past Year: 1999 Revised | Past Month: 1999 Original | Past Month: 1999 Revised |
|---|---|---|---|---|
| Any Illicit Drug^{1} | 20.3 | 19.8 | 10.9 | 9.8 |
| Marijuana and Hashish | 14.4 | 14.2 | 7.7 | 7.2 |
| Heroin | 0.3 | 0.3 | 0.2 | 0.2 |
| Inhalants | 4.6 | 3.9 | 1.9 | 1.1 |
| Any Illicit Drug Other Than Marijuana^{1} | 12.0 | 11.6 | 5.3 | 4.5 |
| Alcohol | 34.9 | 34.1 | 18.6 | 16.5 |
| Binge Use | -- | -- | 10.9 | 10.1 |
| Heavy Use | -- | -- | 2.5 | 2.4 |
-- Not available.
^{1} Any Illicit Drug indicates use at least once of marijuana/hashish, cocaine (including crack), heroin, hallucinogens (including LSD and PCP), inhalants, or any prescription-type psychotherapeutic used nonmedically. Any Illicit Drug Other Than Marijuana indicates use at least once of any of these listed drugs, regardless of marijuana/hashish use; marijuana/hashish users who also have used any of the other listed drugs are included.
^{2} Nonmedical use of any prescription-type pain reliever, tranquilizer, stimulant, or sedative; does not include over-the-counter drugs.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 CAI.
How the Error Was Corrected:
In the revised programs for the multivariate imputation of recency and frequency of use, the consistency constraints that are applied depend upon the recency of use of the potential donor. Hence, donors who are past month users have one set of consistency constraints applied, past year but not past month users have another set, and lifetime but not past year users have yet another set.
Tables B.7 and B.8 present the 1999 estimates before the error was corrected (original) and after the correction (revised). These original estimates are presented in the 1999 NHSDA Summary of Findings Report (SAMHSA, 2000c); the revised 1999 estimates are included in this report. As expected, most revised estimates are lower than the original estimates. Measures with the most notable decrease were past year and past month use of inhalants, particularly among adolescents. For example, past year inhalant use among persons aged 12 to 17 decreased from 4.6 percent to 3.9 percent (Table B.8).
Validity of Self-Reported Use
NHSDA estimates are based on self-reports of drug use, and their value depends on respondents' truthfulness and memory. Although many studies have generally established the validity of self-report data and the NHSDA procedures were designed to encourage honesty and recall, some degree of underreporting is assumed. No adjustment to NHSDA data is made to correct for this (Appendix D lists a number of references addressing the validity of self-reported drug use data). The methodology used in the NHSDA has been shown to produce more valid results than other self-report methods (e.g., by telephone) (Turner, Lessler, and Gfroerer 1992; Aquilino 1994). However, comparisons of NHSDA data with data from surveys conducted in classrooms suggest that underreporting of drug use by youth in their homes may be substantial (Gfroerer 1993; Gfroerer, Wright, and Kopstein 1997).
Assessment of Long-Term Trends
While the redesign has improved the NHSDA estimates of substance use prevalence, it has also made it difficult to assess long-term trends. Because of the major differences between the CAI and PAPI methods, it is not appropriate to compare the 1999 or 2000 CAI estimates of substance use prevalence to earlier NHSDA estimates to assess changes over time in substance use. To assess trends, SAMHSA fielded a supplemental national sample employing the PAPI methodology in 1999. This sample of 13,809 persons employed a paper questionnaire that was identical to the one fielded in 1998. Weighting, editing, and imputation procedures were also conducted in a manner comparable to prior years' surveys.
In spite of the efforts taken to maintain total methodological comparability, analyses have suggested that the 1999 PAPI data are not comparable to earlier data. Investigations into possible problems related to data collection, response rates, Quarter 1 startup, weighting, editing and imputation were done to see if any procedural changes or errors may underlie the problem. While no technical problems or obvious causes associated with these factors have been discovered, one line of inquiry was to investigate possible interviewer experience effects. That study shows that respondents were more likely to report substance use in interviews conducted by inexperienced interviewers than by experienced interviewers. Differences were found in prevalence rates based on data collected by experienced and inexperienced interviewers. Because of the expansion of the sample, a significantly larger proportion of the interviewers in 1999 were inexperienced than in prior years. Also observed was a decline in substance use rates over time (within 1999) that seemed to be correlated only with the growing experience of interviewers.
The impact on prevalence estimates is large enough that comparisons of the 1999 PAPI estimates to estimates from earlier NHSDAs should not generally be included to describe long-term trends. However, based on analysis of statistical models that account for the effect of interviewer experience, adjustments to 1999 PAPI data (in the form of revised analysis weights) have been developed for a limited set of key trend measures of interest. Analysis of the CAI sample discussed in this Appendix indicates smaller interviewer experience effects.
In view of the large discrepancies between the distributions of the interviewer characteristics over the two years, the bounds on the poststratification adjustment factor had to be broadened to keep the same set of covariates in the model in addition to the new interviewer experience covariates. As a result, the realized design effect for the total sample increased from 3.01 to 5.77 because, on average, the adjusted weights were about twice as large as the original weights for the prior NHSDA experience interviewer data while being cut in half for data corresponding to interviewers with no prior NHSDA experience.
Impact of Field Interviewer Experience on the 1999 and 2000 CAI Estimates
In the 1999 NHSDA Summary of Findings Report (SAMHSA, 2000c), it was reported that the large change in the distribution of experienced and inexperienced Field Interviewers (FI) between the 1998 and 1999 surveys was associated with unanticipated and unusually large increases in substance use rates for data collected using the paper and pencil interview (PAPI) method. The report also found that data collected from interviewers with prior NHSDA experience resulted in drug use rates that were significantly lower than rates based on data collected from interviewers with no prior NHSDA experience. As a result, the 1999 PAPI estimates presented in the above SAMHSA report were based on analysis weights that were adjusted to measures representing the 1998 FI experience distribution.
Along with fielding PAPI data, the 1999 NHSDA marked the beginning of the use of computer-assisted interviewing (CAI) methods to solicit data from over 66,000 respondents in 50 states and the District of Columbia that year. This section focuses on the analysis of 1999 and 2000 CAI data to determine the impact of FI experience on drug use estimates (PAPI data were not collected in 2000). Overall, it was found that these interviewer effects remain, although they are not as pronounced as in the PAPI data. Based on these findings, it was not necessary to adjust the CAI analysis weights as was done with the 1999 PAPI data.
Similar to analyses of the 1998 and 1999 PAPI data, Field Interviewer experience for 1999 and 2000 CAI data was defined in two different ways: 1) by a two-level overall experience variable (no prior NHSDA experience, some prior NHSDA experience), and 2) by interview order, which is a measure of experience level over the course of the survey year (i.e., 1=first interview conducted, 100=100th interview conducted). Here, interview order is represented by a five-level variable (1-19, 20-39, 40-59, 60-99, and 100+). For the 1999 CAI, interviewers with no experience were simply those who did not have NHSDA experience prior to the 1999 survey. For the 2000 survey, interviewers with no experience were those who did not have NHSDA experience prior to 1999 and did not complete any interviews in 1999; thus, until the 2000 survey, these individuals did not have any experience collecting NHSDA data. Tables B.9 and B.10 present the distribution of CAI Field Interviewers and interviews in 1999 and 2000 according to interviewer experience. Over 86 percent of the 1999 interviewer workforce had no prior NHSDA experience, and they were responsible for about 78 percent of the 66,706 completed interviews. In contrast, less than 28 percent of the 2000 interviewer workforce had no prior NHSDA experience, and they collected data in less than 15 percent of the 71,764 completed interviews. The large number of inexperienced interviewers in 1999 was due to extensive hiring to work the sample, which had expanded threefold from 1998. Note that over half of the interviews were conducted by FIs before their 40th interview in either survey year. Table B.11 (which is the weighted version of Table B.10) shows results similar to Table B.10. Overall, the 1999 FI workforce and collected data were dominated by inexperienced interviewers, while the opposite was true in 2000.
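The five-level interview order variable described above can be sketched in a few lines of Python. This is only an illustration of the grouping rule; the function and constant names are hypothetical and do not come from the NHSDA processing system.

```python
from bisect import bisect_right

# Lower bounds and labels of the five interview-order groups used in this appendix.
GROUP_STARTS = [1, 20, 40, 60, 100]
GROUP_LABELS = ["1-19", "20-39", "40-59", "60-99", "100+"]


def interview_order_group(order: int) -> str:
    """Map an interviewer's nth completed interview (1-based) to its group label."""
    if order < 1:
        raise ValueError("interview order starts at 1")
    # bisect_right counts how many group lower bounds are <= order.
    return GROUP_LABELS[bisect_right(GROUP_STARTS, order) - 1]


example_orders = [1, 19, 20, 99, 100, 250]
example_groups = [interview_order_group(n) for n in example_orders]
```

Under this rule, an interviewer's 40th interview is the first one that falls into the third group.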
Tables B.12 and B.13 compare 1999 CAI and PAPI weighted estimates of lifetime use of any illicit drug and nonmedical use of any psychotherapeutic drug by prior interviewer experience and interview order. Both the 1999 PAPI and 1999 CAI estimates show a decreasing trend as the interview order increases; also, estimates within a given year and interview order were generally higher among interviewers with no prior NHSDA experience than among those with some experience. However, the decline among PAPI interviewers was generally larger than among CAI interviewers. For example, among PAPI interviewers, the rate of lifetime nonmedical use of any psychotherapeutic drug decreased overall by 38.8 percent between the 1-19 and 100+ interview order groups (from 13.4 percent to 8.2 percent) (Table B.13). In comparison, estimates from the same interview order groups in the CAI declined by 15.8 percent (from 15.8 percent to 13.3 percent). Estimates of lifetime use of any illicit drug also declined for both PAPI and CAI overall, although at a slower rate between the lowest and highest interview order groups among CAI interviewers.
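The percent changes quoted above are relative declines between the first and last interview order groups. A minimal Python sketch of that arithmetic, using the "All Interviews" values from Table B.13 (the function name is an illustrative assumption):

```python
def percent_decline(first_group_rate: float, last_group_rate: float) -> float:
    """Relative decline, in percent, from the 1-19 group rate to the 100+ group rate."""
    return 100.0 * (first_group_rate - last_group_rate) / first_group_rate


# Lifetime nonmedical use of any psychotherapeutic drug, all interviews (Table B.13).
papi_decline = percent_decline(13.4, 8.2)
cai_decline = percent_decline(15.8, 13.3)
```

Rounded to one decimal place, these reproduce the 38.8 percent (PAPI) and 15.8 percent (CAI) declines cited in the text.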
Using the same two drug measures, Table B.14 contains prevalence rates from the 2000 survey as a function of interview order and experience. Parallel to what was observed from the 1999 PAPI and CAI data, there appears to be an inverse relationship between interview order and drug use rates.
To investigate the effects of adjusting for interview experience on various measures of change, a logistic regression model was used with the results shown as odds ratios. RTI's (Research Triangle Institute) SUDAAN was employed and the analysis weights were used in both years. The sample structure was represented using standard NHSDA analysis NEST statements for variance strata and variance replicates. The drug use measures modeled were lifetime, past year, and past month use of any illicit drug, marijuana, and nonmedical use of any psychotherapeutic drug (Table B.15). In these models, the response variable was a dichotomous measure of drug use (1=yes, 0=no). Odds ratios that are in bold and less than 1 for the "change from 1999 to 2000" effect indicate that 2000 estimates are significantly lower than the 1999 estimates; other odds ratios shown in bold are statistically significant from the reference class (at the α=0.05 level of significance). Results are shown before and after the adjustment for covariates. The covariates used are the following: (1) year (1999, 2000); (2) prior interviewer experience (no NHSDA experience, some NHSDA experience); (3) interview order (1-19, 20-39, 40-59, 60-99, and 100+); (4) age of respondent (12-17, 18-25, 26-34, 35+); (5) census region (Northeast, North Central, South, and West); (6) gender of respondent; (7) race/ethnicity of respondent (Hispanic, Non-Hispanic black, and Non-Hispanic, all other races); and (8) population density (1 million or more persons in a Metropolitan Statistical Area (MSA); 250,000 to 999,999 persons in an MSA; less than 250,000 persons in an MSA; persons not in an MSA and not in a rural area; and persons not in an MSA and in a rural area).
Table B.15 shows the unadjusted odds ratio for the "change from 1999 to 2000" to be, in general, similar to the model odds ratio, which controls for demographics, prior interviewer experience, and interview order. Most notable are the odds ratios for prior interviewer experience, which generally show lower odds of reported use for experienced interviewers than for those with no prior experience. However, compared to the PAPI analysis (using exactly the same model on the 1998 and 1999 PAPI data), the CAI odds ratios comparing experienced to inexperienced interviewers are much closer to 1.00. For example, the PAPI odds ratios for nonmedical use of any psychotherapeutic drug during the lifetime and past month were 0.69 and 0.59 (both statistically significant), respectively (SAMHSA, 2000c), compared to 0.85 (statistically significant) and 1.02 (not statistically significant), respectively, for CAI. Statistically significant odds ratios for any illicit drug and marijuana lifetime use from the PAPI data were also lower, ranging from 0.84 to 0.90 compared to 0.88 to 0.92 from the CAI data.
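To make the odds ratio scale concrete, the following Python sketch computes a crude (unadjusted) odds ratio from two prevalence rates. It is not the SUDAAN logistic regression used for Table B.15: it ignores the covariates, the pooling of two survey years, and the design-based variance estimation. The inputs are the 1999 CAI "All Interviews" lifetime any illicit drug rates from Table B.12 (40.5 percent for interviewers with no prior NHSDA experience, 37.3 percent for those with some), and the function names are hypothetical.

```python
def odds(prevalence: float) -> float:
    """Convert a prevalence proportion (0 < p < 1) into odds p / (1 - p)."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    return prevalence / (1.0 - prevalence)


def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Odds of reported use in a group relative to the reference class."""
    return odds(p_group) / odds(p_reference)


# Reference class: interviewers with no prior NHSDA experience (Table B.12, 1999 CAI).
or_some_vs_none = crude_odds_ratio(0.373, 0.405)
```

The crude value is roughly 0.87, in the same neighborhood as the model-based 0.88 reported for lifetime any illicit drug use in Table B.15, although the two quantities are not strictly comparable.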
Table B.16 shows results from age-specific models for lifetime and past month any illicit substance use. Results for marijuana (not shown) are similar to results for any illicit substance. Except for the elimination of age, the covariates are the same as those in the model used for Table B.15. As before, results are shown before and after adjustment for demographics, prior interviewer experience, and interview order. Similarly, across age groups, the adjustment does not significantly change the magnitude of the year-to-year change. Compared to the 1999 PAPI analysis, the odds ratios for Field Interviewers with some NHSDA experience were generally higher in the CAI interviewing environment (although still below 1.00).
In order to examine more directly the effect the more experienced field interviewer workforce in 2000 would have on the 1999 estimates, and subsequently trends, the analysis weights in the 1999 CAI were adjusted (for Table B.17 in this appendix only). More specifically, the 1999 analysis weights were adjusted by introducing additional controls from the 2000 survey into the poststratification step of the 1999 weighting process. The additional control totals were derived by using the 2000 weighted distribution as shown in Table B.11 (i.e., 86.0% with prior NHSDA experience vs. 14.0% with no prior NHSDA experience; 30.0% with interview number 1-19, 55.2% in the category 20-99, and 14.7% in the 100+ category). Since the 2000 control totals for FI experience were so different from the observed ones for the 1999 CAI, a drastic weight adjustment was required, which resulted in a threefold increase in the design effect due to unequal weighting (from 4.6 before adjustment to 15.9 after adjustment). On average, the adjusted weights were about 3.5 times larger than the original weights for the prior NHSDA experience interviewer data, while being cut by a factor of 0.3 for data corresponding to interviewers with no prior NHSDA experience. Table B.17 presents past month use of various illicit drugs, alcohol, and tobacco for 1999 (adjusted and unadjusted for interviewer experience) and 2000. As with the unadjusted 1999 estimates, the results of this interviewer experience adjustment show very few statistically significant differences between the adjusted 1999 and 2000 estimates. Statistically significant differences between the adjusted 1999 and 2000 estimates and between the unadjusted 1999 and 2000 estimates occurred among different characteristics. However, the direction of the change (statistically significant or not) was consistent.
For example, for binge alcohol use among persons aged 12 or older, there is a statistically significant increase between the adjusted 1999 estimate (19.3 percent) and the 2000 estimate (20.6 percent). The unadjusted 1999 estimate was 20.2 percent which, while not statistically different from the 2000 estimate, was lower in magnitude. Similar occurrences can be seen for cocaine use (aged 18 and over), heroin use (aged 12 to 17), use of pain relievers (aged 12 to 17), binge alcohol use (aged 18 and over), and cigarette use (aged 12 to 17).
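The growth in the design effect due to unequal weighting that accompanies such an adjustment can be illustrated with Kish's approximation, n Σw² / (Σw)². The report does not state which formula was used, so treating it as Kish's approximation is an assumption here, and the toy sample below (100 interviews with equal starting weights) is hypothetical rather than NHSDA data.

```python
def unequal_weighting_effect(weights):
    """Kish approximation of the design effect due to unequal weighting:
    n * sum(w^2) / (sum(w))^2, which equals 1 + CV^2 of the weights."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


# Hypothetical sample: 78 interviews by inexperienced FIs and 22 by experienced FIs,
# all starting with the same weight.
base_weights = [1.0] * 78 + [1.0] * 22

# Scale the weights by the average factors quoted in the text:
# about 0.3 for inexperienced-FI data and about 3.5 for experienced-FI data.
adjusted_weights = [0.3] * 78 + [3.5] * 22

deff_before = unequal_weighting_effect(base_weights)
deff_after = unequal_weighting_effect(adjusted_weights)
```

In this toy case the effect rises from 1.0 to about 2.7, which is the same order of increase as the reported change from 4.6 to 15.9. The actual adjustment was multivariate and bounded, so the numbers are not expected to match.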
The analysis presented here indicates that the uneven mix of experienced and inexperienced NHSDA field interviewers between 1999 and 2000 had some effect on estimated drug use rates in 1999, 2000, and the trend. Overall, the 1999 and 2000 CAI rates of decline are smaller in magnitude than the 1999 PAPI rates of decline, which is an indication that the CAI methods are playing a role in reducing the effects of FI experience on substance use rates. However, because the mechanism of these effects is unknown, additional studies will be undertaken to increase our understanding of this phenomenon. In the meantime, analyses of interviewer effects as seen in this Appendix will continue to be presented in subsequent reports.
These findings have resulted in added emphasis being placed in training and in the field to encourage experienced and new FIs to follow the interview protocol.
Table B.9. Unweighted Distribution of Interviewers by Field Interviewer Experience: 1999 and 2000 CAI
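| Prior Interviewer NHSDA Experience | 1999 CAI Interviewers: No. | 1999 CAI Interviewers: % | 2000 CAI Interviewers: No. | 2000 CAI Interviewers: % |
|---|---|---|---|---|
| None | 1544 | 86.4 | 368 | 27.5 |
| Some | 243 | 13.6 | 968 | 72.5 |
| Total | 1787 | 100.0 | 1336 | 100.0 |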
Table B.10. Unweighted Distribution of CAI Interviews by Interview Order and Prior Interviewer Experience: 1999 and 2000 CAI  
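| Interview Order | 1999 CAI: No Prior NHSDA, No. | 1999 CAI: No Prior NHSDA, % | 1999 CAI: Some Prior NHSDA, No. | 1999 CAI: Some Prior NHSDA, % | 1999 CAI: Total, % | 2000 CAI: No Prior NHSDA, No. | 2000 CAI: No Prior NHSDA, % | 2000 CAI: Some Prior NHSDA, No. | 2000 CAI: Some Prior NHSDA, % | 2000 CAI: Total, % |
|---|---|---|---|---|---|---|---|---|---|---|
| 1-19 | 18,713 | 28.1 | 2,999 | 4.5 | 32.6 | 5,036 | 7.0 | 15,744 | 21.9 | 29.0 |
| 20-39 | 12,088 | 18.1 | 2,656 | 4.0 | 22.1 | 2,633 | 3.7 | 13,143 | 18.3 | 22.0 |
| 40-59 | 7,902 | 11.9 | 2,262 | 3.4 | 15.2 | 1,276 | 1.8 | 10,163 | 14.2 | 15.9 |
| 60-99 | 8,505 | 12.8 | 3,076 | 4.6 | 17.4 | 1,126 | 1.6 | 12,244 | 17.1 | 18.6 |
| 100+ | 5,114 | 7.7 | 3,391 | 5.1 | 12.8 | 426 | 0.6 | 9,973 | 13.9 | 14.5 |
| Subtotals | 52,322 | 78.4 | 14,384 | 21.6 | 100.0 | 10,497 | 14.6 | 61,267 | 85.4 | 100.0 |
| Total | 66,706 | | | | | 71,764 | | | | |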
Table B.11. Weighted Distribution of CAI Interviews by Interview Order and Prior Interviewer Experience (Numbers in Thousands): 1999 and 2000 CAI  
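| Interview Order | 1999 CAI: No Prior NHSDA, No. | 1999 CAI: No Prior NHSDA, % | 1999 CAI: Some Prior NHSDA, No. | 1999 CAI: Some Prior NHSDA, % | 1999 CAI: Total, % | 2000 CAI: No Prior NHSDA, No. | 2000 CAI: No Prior NHSDA, % | 2000 CAI: Some Prior NHSDA, No. | 2000 CAI: Some Prior NHSDA, % | 2000 CAI: Total, % |
|---|---|---|---|---|---|---|---|---|---|---|
| 1-19 | 66,339 | 30.0 | 14,760 | 6.7 | 36.7 | 15,335 | 6.9 | 51,724 | 23.2 | 30.0 |
| 20-39 | 39,169 | 17.7 | 12,646 | 5.7 | 23.4 | 7,957 | 3.6 | 38,896 | 17.4 | 21.0 |
| 40-59 | 22,925 | 10.4 | 8,582 | 3.9 | 14.3 | 3,376 | 1.5 | 31,086 | 13.9 | 15.4 |
| 60-99 | 22,507 | 10.2 | 11,166 | 5.1 | 15.2 | 3,361 | 1.5 | 38,677 | 17.3 | 18.8 |
| 100+ | 12,416 | 5.6 | 10,613 | 4.8 | 10.4 | 1,259 | 0.6 | 31,610 | 14.2 | 14.7 |
| Subtotals | 163,355 | 73.9 | 57,768 | 26.1 | 100.0 | 31,287 | 14.0 | 191,993 | 86.0 | 100.0 |
| Total | 221,123 | | | | | 223,280 | | | | |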
Table B.12. Percent Reporting Lifetime Use of Any Illicit Drug by Interview Order  
| Interview Order | 1999 PAPI: No Prior NHSDA | 1999 PAPI: Some Prior NHSDA | 1999 PAPI: All Interviews | 1999 CAI: No Prior NHSDA | 1999 CAI: Some Prior NHSDA | 1999 CAI: All Interviews |
|---|---|---|---|---|---|---|
| 1-19 | 39.9 | 36.3 | 39.3 | 41.5 | 39.5 | 41.1 |
| 20-39 | 40.3 | 41.8 | 40.7 | 40.8 | 39.4 | 40.5 |
| 40-59 | 38.0 | 37.7 | 37.9 | 38.9 | 35.4 | 38.0 |
| 60-99 | 37.7 | 37.8 | 37.7 | 40.7 | 34.8 | 38.7 |
| 100+ | 35.7 | 30.6 | 33.8 | 37.1 | 35.8 | 36.5 |
| All Interviews | 38.9 | 37.1 | 38.5 | 40.5 | 37.3 | 39.7 |
| % Change from 1-19 to 100+ Interviews | 10.5% | 15.7% | 14.0% | 10.6% | 9.4% | 11.2% |
Table B.13. Percent Reporting Lifetime Nonmedical Use of Any Psychotherapeutic Drug by Interview Order and Prior Interviewer Experience: 1999 PAPI and 1999 CAI
| Interview Order | 1999 PAPI: No Prior NHSDA | 1999 PAPI: Some Prior NHSDA | 1999 PAPI: All Interviews | 1999 CAI: No Prior NHSDA | 1999 CAI: Some Prior NHSDA | 1999 CAI: All Interviews |
|---|---|---|---|---|---|---|
| 1-19 | 13.3 | 13.8 | 13.4 | 16.0 | 14.8 | 15.8 |
| 20-39 | 11.9 | 10.9 | 11.7 | 16.5 | 16.4 | 16.5 |
| 40-59 | 12.7 | 7.2 | 11.1 | 15.6 | 11.4 | 14.4 |
| 60-99 | 10.6 | 8.5 | 10.0 | 16.2 | 13.3 | 15.2 |
| 100+ | 9.2 | 6.7 | 8.2 | 14.2 | 12.2 | 13.3 |
| All Interviews | 12.0 | 9.7 | 11.4 | 16.0 | 13.9 | 15.4 |
| % Change from 1-19 to 100+ Interviews | 30.8% | 51.4% | 38.8% | 11.3% | 17.6% | 15.8% |
Table B.14. Percent Reporting Lifetime Use of Any Illicit and Nonmedical Use of Any Psychotherapeutic Drug by Interview Order and Prior Interviewer Experience: 2000 CAI  
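| Interview Order | 2000 CAI, Any Illicit Drug: No Prior NHSDA | 2000 CAI, Any Illicit Drug: Some Prior NHSDA | 2000 CAI, Any Illicit Drug: All Interviews | 2000 CAI, Any Psychotherapeutic Drug: No Prior NHSDA | 2000 CAI, Any Psychotherapeutic Drug: Some Prior NHSDA | 2000 CAI, Any Psychotherapeutic Drug: All Interviews |
|---|---|---|---|---|---|---|
| 1-19 | 42.9 | 40.9 | 41.4 | 18.4 | 15.7 | 16.3 |
| 20-39 | 40.0 | 38.7 | 38.9 | 17.1 | 14.8 | 15.2 |
| 40-59 | 43.5 | 35.9 | 36.6 | 15.3 | 11.9 | 12.2 |
| 60-99 | 45.7 | 38.1 | 38.7 | 13.5 | 13.7 | 13.7 |
| 100+ | 34.0 | 36.8 | 36.7 | 10.8 | 13.4 | 13.3 |
| All Interviews | 42.2 | 38.4 | 38.9 | 16.9 | 14.1 | 14.5 |
| % Change from 1-19 to 100+ Interviews | 20.7% | 10.0% | 11.4% | 41.3% | 14.6% | 18.4% |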
Table B.15. Odds Ratios for Year, Prior Interviewer Experience, and Order Effects for Any Illicit Drug, Marijuana, and Nonmedical Use of Any Psychotherapeutic: 1999 and 2000 CAI
| Description | Any Illicit: Lifetime | Any Illicit: Past Year | Any Illicit: Past Month | Marijuana: Lifetime | Marijuana: Past Year | Marijuana: Past Month | Any Psychotherapeutics: Lifetime | Any Psychotherapeutics: Past Year | Any Psychotherapeutics: Past Month |
|---|---|---|---|---|---|---|---|---|---|
| Change from 1999 to 2000 | | | | | | | | | |
| Before adjustment | 0.97 | 0.95 | 1.00 | 0.98 | 0.96 | 1.02 | 0.93 | 0.94 | 0.96 |
| Model adjustment | 1.06 | 1.03 | 1.05 | 1.05 | 1.04 | 1.07 | 1.05 | 1.00 | 0.97 |
| Prior interviewer experience | | | | | | | | | |
| No NHSDA (reference class) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Some NHSDA | 0.88 | 0.89 | 0.94 | 0.92 | 0.90 | 0.93 | 0.85 | 0.91 | 1.02 |
| Interview order | | | | | | | | | |
| 1-19 (reference class) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 20-39 | 0.94 | 0.93 | 0.90 | 0.93 | 0.94 | 0.87 | 1.00 | 0.94 | 1.04 |
| 40-59 | 0.85 | 0.85 | 0.84 | 0.88 | 0.85 | 0.79 | 0.82 | 0.95 | 1.07 |
| 60-99 | 0.92 | 0.88 | 0.87 | 0.95 | 0.92 | 0.88 | 0.90 | 0.86 | 0.84 |
| 100+ | 0.84 | 0.84 | 0.86 | 0.86 | 0.86 | 0.85 | 0.83 | 0.83 | 0.91 |
Odds ratios in bold are statistically different from 1.00 at the 0.05 level of significance.
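Table B.16. Odds Ratios for Year, Prior Interviewer Experience, and Order Effects for Any Illicit Drug, by Age Category: 1999 and 2000 CAI
| Description | Lifetime: 12-17 | Lifetime: 18-25 | Lifetime: 26-34 | Lifetime: 35+ | Past Month: 12-17 | Past Month: 18-25 | Past Month: 26-34 | Past Month: 35+ |
|---|---|---|---|---|---|---|---|---|
| Change from 1999 to 2000 | | | | | | | | |
| Before adjustment | 0.96 | 0.95 | 0.91 | 0.99 | 0.99 | 0.96 | 1.16 | 0.98 |
| Model adjustment | 1.02 | 1.02 | 0.97 | 1.10 | 1.08 | 1.04 | 1.14 | 1.01 |
| Prior interviewer experience | | | | | | | | |
| No NHSDA (reference class) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Some NHSDA | 0.92 | 0.89 | 0.93 | 0.86 | 0.88 | 0.90 | 1.06 | 0.97 |
| Interview order | | | | | | | | |
| 1-19 (reference class) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 20-39 | 0.98 | 0.91 | 0.92 | 0.95 | 0.92 | 0.88 | 0.90 | 0.92 |
| 40-59 | 0.89 | 0.91 | 0.83 | 0.84 | 0.93 | 0.90 | 0.80 | 0.78 |
| 60-99 | 0.92 | 0.89 | 0.88 | 0.94 | 0.89 | 0.94 | 0.83 | 0.83 |
| 100+ | 0.89 | 0.84 | 0.77 | 0.86 | 0.94 | 0.84 | 0.85 | 0.84 |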
Odds ratios in bold are statistically different from 1.00 at the 0.05 level of significance.
Table B.17. Percentages Reporting Past Month Use of Illicit Drugs, Alcohol, and Tobacco by Age Group: 1999, 1999 Adjusted^{1} and 2000
| Drug | Total: 1999 | Total: 1999 Adj^{1} | Total: 2000 | 12-17: 1999 | 12-17: 1999 Adj^{1} | 12-17: 2000 | 18 and Over: 1999 | 18 and Over: 1999 Adj^{1} | 18 and Over: 2000 |
|---|---|---|---|---|---|---|---|---|---|
| Any Illicit Drug^{2} | 6.3 | 6.2 | 6.3 | 9.8 | 9.2 | 9.7 | 5.8 | 5.9 | 5.9 |
| Marijuana and Hashish | 4.7 | 4.7 | 4.8 | 7.2 | 6.9 | 7.2 | 4.4 | 4.4 | 4.5 |
| Cocaine | 0.7 | 0.6 | 0.5 | 0.5 | 0.4 | 0.6 | 0.7^{a} | 0.6 | 0.5 |
| Crack | 0.2 | 0.2 | 0.1 | 0.1 | 0.1 | 0.1 | 0.2 | 0.3 | 0.1 |
| Heroin | 0.1 | 0.1 | 0.1 | 0.2^{a} | 0.2 | 0.1 | 0.1 | 0.1 | 0.1 |
| Hallucinogens | 0.4 | 0.5 | 0.4 | 1.1 | 1.1 | 1.2 | 0.3 | 0.4 | 0.4 |
| LSD | 0.2 | 0.2 | 0.2 | 0.6 | 0.7 | 0.5 | 0.2 | 0.2 | 0.1 |
| PCP | 0.0 | 0.0 | 0.0 | 0.1 | 0.1 | 0.1 | 0.0 | 0.0 | 0.0 |
| Inhalants | 0.3 | 0.2 | 0.3 | 1.1 | 0.9 | 1.0 | 0.2 | 0.2 | 0.2 |
| Nonmedical Use of Any | 1.8 | 2.0 | 1.7 | 2.9 | 2.6 | 3.0 | 1.7 | 1.9 | 1.6 |
| Pain Relievers | 1.2 | 1.3 | 1.2 | 2.1 | 1.9^{a} | 2.3 | 1.1 | 1.3 | 1.1 |
| Tranquilizers | 0.5 | 0.7 | 0.4 | 0.5 | 0.5 | 0.5 | 0.5 | 0.7 | 0.4 |
| Stimulants | 0.4 | 0.6 | 0.4 | 0.7 | 0.6 | 0.8 | 0.4 | 0.6 | 0.3 |
| Sedatives | 0.1 | 0.1 | 0.1 | 0.2 | 0.2 | 0.2 | 0.1 | 0.1 | 0.1 |
| Any Illicit Drug | 2.7 | 2.9 | 2.6 | 4.5 | 4.2 | 4.6 | 2.5 | 2.7 | 2.3 |
| Alcohol | 46.4 | 46.3 | 46.6 | 16.5 | 16.4 | 16.4 | 50.0 | 49.8 | 50.2 |
| "Binge" Alcohol Use^{4} | 20.2 | 19.3^{a} | 20.6 | 10.1 | 9.8 | 10.4 | 21.4 | 20.4^{a} | 21.8 |
| Heavy Alcohol Use^{4} | 5.7 | 5.2 | 5.6 | 2.4 | 2.4 | 2.6 | 6.1 | 5.5 | 6.0 |
| Cigarettes | 25.8 | 25.5 | 24.9 | 14.9^{b} | 14.5 | 13.4 | 27.0 | 26.7 | 26.3 |
| Smokeless Tobacco | 3.4 | 3.2 | 3.4 | 2.3 | 2.1 | 2.1 | 3.6 | 3.3 | 3.5 |
^{a}Difference between estimate and 2000 estimate is statistically significant at the .05 level.
^{b}Difference between estimate and 2000 estimate is statistically significant at the .01 level.
^{1} 1999 Adj estimates have been adjusted to reflect the 2000 distribution of NHSDA interviewing experience among field interviewers.
^{2} Any Illicit Drug indicates use at least once of marijuana/hashish, cocaine (including crack), heroin, hallucinogens (including LSD and PCP), inhalants, or any prescription-type psychotherapeutic used nonmedically. Any Illicit Drug Other Than Marijuana indicates use at least once of any of these listed drugs, regardless of marijuana/hashish use; marijuana/hashish users who also have used any of the other listed drugs are included.
^{3} Nonmedical use of any prescription-type pain reliever, tranquilizer, stimulant, or sedative; does not include over-the-counter drugs.
^{4} "Binge" Alcohol Use is defined as drinking five or more drinks on the same occasion on at least 1 day in the past 30 days. By "occasion" is meant at the same time or within a couple hours of each other. Heavy Alcohol Use is defined as drinking five or more drinks on the same occasion on each of 5 or more days in the past 30 days; all Heavy Alcohol Users are also "Binge" Alcohol Users.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.
B.4. Incidence Estimates
For diseases, the incidence rate for a population is defined as the number of new cases of the disease, N, divided by the person time, PT, of exposure or:

$$IR = \frac{N}{PT}.$$

The person time of exposure can be measured for the full period of the study or for a shorter period. The person time of exposure ends at the time of diagnosis (e.g., Greenberg et al., 1996, pp. 16-19). Similar conventions are applied for defining the incidence of first use of a substance.
Beginning in 1999, the NHSDA questionnaire allows for collection of year and month of first use for recent initiates. Month, day, and year of birth are also obtained directly or imputed in the process. In addition, the questionnaire call record provides the date of the interview. By imputing a day of first use within the year and month of first use reported or imputed, the key respondent inputs in terms of exact dates are known. Exposure time can be determined in terms of days and converted to an annual basis.
Having exact dates of birth and first use also allows us to determine person time of exposure during the targeted period, t. Let the target time period for measuring incidence be specified in terms of dates; e.g., for the period 1998 we would specify:

$$t = [t_1, t_2) = [\text{1 Jan 1998},\ \text{1 Jan 1999}),$$

a period that includes 1 January 1998 and all days up to but not including 1 January 1999. The target age group can also be defined by a half open interval as a = [a_{1}, a_{2}). For example, the age group 12 to 17 would be defined by a = [12, 18) for persons at least age 12, but not yet age 18. If person i was in age group a during period t, the time and age interval, L_{t,a,i}, can then be determined by the intersection:

$$L_{t,a,i} = [t_1, t_2) \cap \big[\mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_1),\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_2)\big),$$

assuming we can write the time of birth in terms of day (DOB_{i}), month (MOB_{i}), and year (YOB_{i}). Either this intersection will be empty or we will designate it by the half open interval L_{t,a,i} = [m_{1,i}, m_{2,i}), where:

$$m_{1,i} = \max\{t_1,\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_1)\}$$

and

$$m_{2,i} = \min\{t_2,\ \mathrm{DOB}_i\,\mathrm{MOB}_i\,(\mathrm{YOB}_i + a_2)\}.$$

The date of first use, t_{fu,d,i}, is also expressed as an exact date. An incident of first drug d use by person i in age group a occurs in time t if t_{fu,d,i} ∈ [m_{1,i}, m_{2,i}). The indicator function I_{i}(d, a, t) used to count incidents of first use is set to 1 when t_{fu,d,i} ∈ [m_{1,i}, m_{2,i}), and to 0 otherwise. The person time exposure measured in years and denoted by e_{i}(d, a, t) for a person i of age group a depends on the date of first use. If the date of first use precedes the target period (t_{fu,d,i} < m_{1,i}), then e_{i}(d, a, t) = 0. If the date of first use occurs after the target period or if person i has never used drug d, then:

$$e_i(d, a, t) = \frac{m_{2,i} - m_{1,i}}{365}.$$

If the date of first use occurs during the target period L_{t,a,i}, then:

$$e_i(d, a, t) = \frac{t_{fu,d,i} - m_{1,i}}{365}.$$

Note that both I_{i}(d, a, t) and e_{i}(d, a, t) are set to zero if the target period L_{t,a,i} is empty; i.e., person i is not in age group a during time t. The incidence rate is then estimated as a weighted ratio estimate:

$$IR(d, a, t) = \frac{\sum_i w_i\, I_i(d, a, t)}{\sum_i w_i\, e_i(d, a, t)},$$

where the w_{i} are the analytic weights.
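The exposure rules and the weighted ratio estimate described above can be sketched in Python as follows. This is an illustrative reading of the text, not the production NHSDA code: the names `m1` and `m2` for the endpoints of a person's age-and-period window, the 365-day year used to annualize exposure, the treatment of 29 February birthdays, and the three example respondents are all assumptions made for the sketch.

```python
from datetime import date

DAYS_PER_YEAR = 365.0  # assumption: exposure in days is annualized with a 365-day year


def birthday(dob, age):
    """Date on which a person born on `dob` reaches `age`."""
    try:
        return dob.replace(year=dob.year + age)
    except ValueError:
        # 29 February birthday in a non-leap year: assume 1 March.
        return date(dob.year + age, 3, 1)


def window(dob, t1, t2, a1, a2):
    """Intersection of the period [t1, t2) with the age interval [a1, a2), or None if empty."""
    m1 = max(t1, birthday(dob, a1))
    m2 = min(t2, birthday(dob, a2))
    return (m1, m2) if m1 < m2 else None


def contribution(dob, first_use, t1, t2, a1, a2):
    """Return (indicator of first use in the window, person-years of exposure)."""
    w = window(dob, t1, t2, a1, a2)
    if w is None:
        return 0, 0.0  # person is not in the age group during the period
    m1, m2 = w
    if first_use is None or first_use >= m2:
        return 0, (m2 - m1).days / DAYS_PER_YEAR  # at risk for the whole window
    if first_use < m1:
        return 0, 0.0  # initiated before the window, no longer at risk
    return 1, (first_use - m1).days / DAYS_PER_YEAR  # initiated inside the window


def incidence_rate(people, t1, t2, a1, a2):
    """Weighted ratio estimate; `people` holds (weight, date of birth, first-use date or None)."""
    numerator = denominator = 0.0
    for weight, dob, first_use in people:
        indicator, exposure = contribution(dob, first_use, t1, t2, a1, a2)
        numerator += weight * indicator
        denominator += weight * exposure
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical respondents, equal weights, target period 1998, age group [12, 18).
T1, T2 = date(1998, 1, 1), date(1999, 1, 1)
people = [
    (1.0, date(1985, 1, 1), None),              # never used, in the age group all year
    (1.0, date(1984, 6, 15), date(1998, 7, 2)), # first use during 1998
    (1.0, date(1986, 7, 1), None),              # turned 12 on 1 July 1998
]
rate = incidence_rate(people, T1, T2, 12, 18)
```

The rate is expressed as new users per person-year of exposure. Under the pre-1999 approach, the second respondent would simply have been assigned one-half year of exposure instead of the exact 182 days computed here.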
Prior to the 1999 survey, exact date data were not available for computing incidence rates. For these rates, a person was considered to be of age a during the entire time interval t if his/her ath birthday occurred during time interval t (generally, a single year). If the person initiated use during the year, the person time exposure was approximated as one-half year for all such persons rather than computing it exactly for each person.
Because of the new methodology, the incidence estimates discussed in section 5 are not strictly comparable to the estimates before the 1999 NHSDA. Since they are based on retrospective reports by survey respondents as was the case for earlier estimates, they may be subject to some of the same kinds of biases.
Bias due to differential mortality occurs because some persons who were alive and exposed to the risk of first drug use in the historical periods shown in the tables died before the 1999 NHSDA was conducted. This bias is probably very small for estimates shown in this report. Incidence estimates are also affected by memory errors, including recall decay (tendency to forget events occurring long ago) and forward telescoping (tendency to report that an event occurred more recently than it actually did). These memory errors would both tend to result in estimates for earlier years (i.e., 1960s and 1970s) that are downwardly biased (because of recall decay) and estimates for later years that are upwardly biased (because of telescoping). There is also likely to be some underreporting bias due to social acceptability of drug use behaviors and respondents' fear of disclosure. This is likely to have the greatest impact on recent estimates, which reflect more recent use and reporting by younger respondents. Finally, for drug use that is frequently initiated at age 10 or younger, estimates based on retrospective reports one year later underestimate total incidence because 11-year-old children are not sampled by the NHSDA. Prior analyses showed that alcohol and cigarette (any use) incidence estimates could be significantly affected by this. Therefore, for these drugs no 1998 estimates were made.
A recent study (Johnson, Gerstein, and Rasinski, 1998) concluded that the marijuana incidence trend from the NHSDA was biased because the reporting of initiation declines as the length of time between initiation and the survey increases. However, this study did not address very recent estimates, i.e., 1996-98, which could be biased because they reflect recent drug use and because they are heavily based on the reports of adolescents. In order to better understand the size of the biases and to assess the reliability of estimates for recent years, OAS performed an analysis of estimates based on single years of NHSDA data. This analysis focused on three drugs: cocaine, heroin, and marijuana. Using the survey data from 1994 to 1998, estimates were made of the number of initiates, the rate of initiation for youth aged 12 to 17, and the rate of initiation for persons aged 18 to 25. For the 1994 survey, an estimate was made for the year 1993. For the 1995 survey, another estimate was made for the year 1993. In this way, two recent estimates of the same year could be compared. Similarly, the 1995 and 1996 data provided two estimates for 1994, the 1996 and 1997 surveys provided two estimates for 1995, and the 1997 and 1998 surveys provided two estimates for 1996. Since these calculations represent two measurements of the same population characteristic, they would ideally be the same. Examples of these estimates are shown in the following table:
Table B.18. Comparison of Initiation Rates by Year of Initiation and Survey Year
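| Measure | Initiation 1993: 1994 Survey | Initiation 1993: 1995 Survey | Initiation 1994: 1995 Survey | Initiation 1994: 1996 Survey | Initiation 1995: 1996 Survey | Initiation 1995: 1997 Survey | Initiation 1996: 1997 Survey | Initiation 1996: 1998 Survey | Avg. of Ratio |
|---|---|---|---|---|---|---|---|---|---|
| Rate for Age 12-17 | | | | | | | | | |
| Marijuana | | | | | | | | | |
| Cocaine | 8.9 | 5.0 | 10.2 | 5.7 | 10.6 | 8.0 | 11.3 | 11.0 | 1.480 |
| Heroin | 0.7 | 0.5 | 2.1 | 1.4 | 2.5 | 1.8 | 3.9 | 1.5 | 1.722 |
| Rate for Age 18-25 | | | | | | | | | |
| Marijuana | | | | | | | | | |
| Cocaine | 12.8 | 12.8 | 9.9 | 11.8 | 13.8 | 14.7 | 14.8 | 13.9 | 0.961 |
| Heroin | 0.1 | 1.4 | 1.4 | 2.1 | 2.4 | 1.9 | 2.3 | 3.0 | 0.692 |
| Number of Initiates | | | | | | | | | |
| Marijuana | | | | | | | | | |
| Cocaine | 595 | 538 | 533 | 530 | 652 | 654 | 675 | 664 | 1.031 |
| Heroin | 41 | 62 | 122 | 97 | 141 | 93 | 171 | 127 | 1.195 |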
Drug initiation rates for youth aged 12 to 17 for the more hard core drugs (like cocaine and heroin) appear to be most prone to bias. For example, on average across the four survey years, the estimate for the rate of initiation of cocaine use among youth aged 12 to 17 was 48% higher the first time the estimate could be made than the second time. This indicates a probable bias in the estimation; however, it is unclear which estimate is the correct one. As a result, one should be cautious in interpreting any changes between the prior year and the most recent year in the initiation rates for youth of the more stigmatized drugs. Since only five years of data were used to estimate how the rate of incidence changes between the first year it can be estimated and the second, one should be cautious about inferring the magnitude of the bias (for example, that it is 48% for cocaine).
In the above table, the average ratio of one-year recall to two-year recall is calculated across four "years." Implicit in the above table is the fact that the estimates for each ratio vary around the average. For example, taking the 18 to 25 marijuana incidence numbers, the four individual ratios can be calculated as 1.13, .75, .89, and 1.06. While the average ratio is .96, the year-to-year variation is much larger, ranging from .75 to 1.13. So, it is clear that for any single year, the bias implied by the sample estimates could be negative or positive. Since it is not clear whether the 1-year recall or the 2-year recall estimate is closer to the unbiased true value, the estimate that we use for the most recent year could be as much as 25 percent too high or too low in this example. The samples for 1999 and 2000 based on the new computer-assisted interviewing method are significantly larger than those in prior years; therefore, estimates of bias should suffer from less sampling variability and the estimates should be less variable than before. Nevertheless, since there are only two years under the new computer-assisted interview method, and, therefore, only one calculation possible of the ratio of the one-to-two-year recall, more analysis is needed to see how stable the new estimates from CAI will be.
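The averaging described above can be checked with a short Python sketch. It uses only the four ratios quoted in the text for marijuana initiation among persons aged 18 to 25; the variable names are illustrative.

```python
# Ratios of the one-year-recall estimate to the two-year-recall estimate,
# as quoted in the text for initiation years 1993 through 1996.
ratios = [1.13, 0.75, 0.89, 1.06]

average_ratio = sum(ratios) / len(ratios)
lowest, highest = min(ratios), max(ratios)

# Largest departure of any single-year ratio from 1.00 (no recall difference).
max_departure = max(abs(r - 1.0) for r in ratios)
```

The mean of the four ratios is about .96, while the single-year ratios depart from 1.00 by as much as .25, which is the basis for the "25 percent too high or too low" statement.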
This page was last updated on June 03, 2008. 