1. Data Sources: Utilizing the list of Validated Archival Indicators of Risk and
Outcome Variables that Predict Problem Behavior and the definitions provided
by the Center for Substance Abuse Prevention (CSAP) at SAMHSA, appropriate data
were collected from existing records of state, county, city, and other
governmental agencies. The specific data source for each social indicator was
indicated on the page reporting the frequency of the social indicator within
the state, as well as in the Data Definitions section of this report.
2. Data Collection Methods: Data were obtained electronically, whenever possible. Some data
did, however, have to be transferred from a hard copy to an electronic
database. A format for entering the social indicator data into a database was
completed; all specific geographic coding received with the data set was
maintained.
3. Calculation of Population Frequency (Rates): Most
of the social indicator variables required that a frequency or rate be
calculated, e.g. juvenile arrest rate for alcohol violations per 100,000
juveniles. This calculation required that an appropriate denominator be
associated with the appropriate numerator. The numerator was the number of
events identified from the data set for the appropriate age range, gender, and
geographic unit. The denominator was the estimated number of persons of the
same age range, gender, and geographic unit who were potentially at risk, i.e.
lived in that area during the same time period.
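The rate calculation described above can be sketched as follows. This is a minimal illustration, not the code used for the report; the function name and the example figures are hypothetical.

```python
def rate_per_100k(events, population_at_risk, per=100_000):
    """Return the number of events per `per` persons at risk.

    `events` is the numerator (events for a given age range, gender, and
    geographic unit); `population_at_risk` is the matching denominator.
    """
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return events / population_at_risk * per


# Hypothetical figures: 42 juvenile arrests among 35,000 juveniles.
example_rate = rate_per_100k(42, 35_000)
print(f"{example_rate:.1f} per 100,000 juveniles")
```

The numerator and denominator must describe the same age range, gender, geographic unit, and time period for the resulting rate to be meaningful.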
For all data that used 1997
and 1999 event information, the appropriate denominator data were 1997 and
1999 population estimates. The county data were obtained from the United
States Census and were the same estimates used by the Department of Economic
Security and Arizona Department of Health Services (ADHS). These denominator
data may be found in table format at the end of the Data Definitions Section.
For those variables using 1990 US Census files as the source of numerators
(e.g. adults without a high school diploma), the denominator was obtained from
the 1990 census. For the 1999 community data, specific population estimates
were obtained from ADHS.
For most of the
county-level and state-level indicators, 95% confidence intervals were
calculated around the rate. All calculations were made using Stata software,
version 6 and assumed the Poisson distribution. It should be noted that for
those indicators that could incorporate negative change, e.g. net migration,
the underlying formulas did not allow the interval to overlap zero.
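As an illustration of a Poisson-based interval, the sketch below computes an exact 95% confidence interval for an event count and rescales it to a rate. The report states only that Stata version 6 was used and that the Poisson distribution was assumed; the choice of the exact interval, the bisection search, and all function names here are assumptions made for the example.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), with terms computed in log space."""
    if lam == 0:
        return 1.0
    total = sum(
        math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
        for i in range(k + 1)
    )
    return min(1.0, total)


def _bisect_decreasing(f, lo, hi, iterations=200):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(events, alpha=0.05):
    """Exact two-sided confidence limits for the mean of a Poisson count."""
    # Upper limit: the mean at which P(X <= events) falls to alpha / 2.
    search_max = events + 10 * math.sqrt(events + 1) + 10
    upper = _bisect_decreasing(
        lambda lam: poisson_cdf(events, lam) - alpha / 2, 0.0, search_max
    )
    if events == 0:
        return 0.0, upper
    # Lower limit: the mean at which P(X >= events) rises to alpha / 2.
    lower = _bisect_decreasing(
        lambda lam: poisson_cdf(events - 1, lam) - (1 - alpha / 2),
        0.0,
        float(events),
    )
    return lower, upper


def rate_with_ci(events, population_at_risk, per=100_000):
    """Return (rate, lower limit, upper limit), each per `per` persons."""
    lower, upper = exact_poisson_ci(events)
    scale = per / population_at_risk
    return events * scale, lower * scale, upper * scale
```

Because a Poisson mean cannot be negative, the lower limit from this kind of calculation never falls below zero, which is consistent with the remark above about indicators that can take negative values.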
All rates and confidence
intervals were recalculated for this report and some differences were noted
from the prior report. Results from the current report should be considered
the final results.
4. Geographic Areas Sampled: In order
to provide consistent reporting of the social indicator data across the state,
population frequencies (rates) for each indicator were estimated for each
county and for the state overall. Not all data sets included sufficient
information for estimation of the frequency of the indicator for geographic
areas smaller than a county, e.g. community. Also, the numbers of events were
extremely low for some indicators (e.g. adolescent suicide), making rate
estimation inappropriate. Furthermore, some jurisdictions, e.g. South Tucson,
were not recognized as geographical units in every data source. Constructing
rates for these smaller jurisdictions would require further assumptions and
interpolation.
5. Standard Definitions: The standard definitions
specified in the contract by SAMHSA were used whenever possible. However,
based on data availability, it became necessary to refine a number of
definitions. All definitions are listed in the Data Definitions section at the
end of this report. Final definitions are reported at the bottom of each table
reporting the statewide frequency of the social indicator and are labeled
"ADHS Definition" in the Data Definitions section of this report.
6. Graphic Presentations: For
each social indicator, state maps were created to graphically represent the
frequency of the social indicator throughout the state. ARCVIEW software was
used to develop these maps. The maps provided a visual description of the
ranking of the counties throughout the state for each social indicator. The
z-score (the number of standard deviations from the mean of all 15 counties) was used to
develop these maps. ARCVIEW categorizes the individual county scores by their
relative distance from the county mean. In the maps, shades of red represent
counties above the mean of the 15 counties and shades of blue represent
counties below the mean.
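The shading logic can be sketched as a simple classification of each county's z-score. The report does not state the class breaks that ARCVIEW applied, so the cut point of one standard deviation and the shade names below are hypothetical and serve only to illustrate the idea.

```python
def map_category(z):
    """Assign a map shade to a county z-score.

    The break at +/- 1.0 standard deviations is an assumption made for this
    example; the report does not specify the class boundaries used.
    """
    if z >= 1.0:
        return "dark red"    # well above the all-counties mean
    if z > 0.0:
        return "light red"   # above the mean
    if z == 0.0:
        return "unshaded"    # exactly at the mean (left unshaded in this sketch)
    if z > -1.0:
        return "light blue"  # below the mean
    return "dark blue"       # well below the all-counties mean


# Hypothetical z-scores for four counties.
for z in (1.2, 0.3, -0.3, -1.5):
    print(z, map_category(z))
```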
7. Z-Scores: For
this report, the z-score is the number of standard deviations by which a
county's social indicator rate differs from the mean of the rates for all
counties (the all-counties mean). It was calculated as: (county rate -
all-counties mean) / standard deviation of the county rates.
Conversion of the
individual county rates for each variable to z-scores produces a score
distribution with a mean of 0.00 and a standard deviation of 1.00. This score
allows meaningful comparisons between multiple variables that may, in their
unconverted forms, display widely varying means and standard deviations,
making comparison across counties and across variables difficult.
For this z-score
calculation, the all-counties mean is the mean of the 15 Arizona county data
points; in the z-score metric it has, as stated above, a value of 0.00. An
indicator z-score of +/- 1.00 therefore represents a value one standard
deviation from the all-counties mean: a z-score of positive 1.00 is one
standard deviation above the all-counties mean, and a z-score of negative
1.00 is one standard deviation below it.
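The z-score conversion described in this section can be sketched as follows. The report does not say whether the standard deviation was computed with an n or an n - 1 divisor; the sample standard deviation (n - 1) is used here as an assumption, and the county labels and rates are hypothetical.

```python
import statistics


def county_z_scores(county_rates):
    """Convert county rates to z-scores relative to the all-counties mean.

    `county_rates` maps a county name to its rate for one indicator.
    """
    values = list(county_rates.values())
    all_counties_mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 divisor).
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("all county rates are identical; z-scores undefined")
    return {
        county: (rate - all_counties_mean) / sd
        for county, rate in county_rates.items()
    }


# Hypothetical rates per 100,000 for three counties.
rates = {"County A": 100.0, "County B": 150.0, "County C": 200.0}
print(county_z_scores(rates))
```

With these hypothetical rates the all-counties mean is 150 and the standard deviation is 50, so the three counties receive z-scores of -1, 0, and +1.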
8. County Profiles:
Risk profiles were developed for each county to summarize the experience of
that county for the set of social indicators. The first page of the profiles
lists the specific county rates for each indicator and the overall Arizona
rate for the same time period. The second page of profiles consists of graphs
that display the degree to which the county rates vary from the experience of
all the state counties (the number of standard deviations above or below the mean of the
state's 15 counties for each social indicator). These graphs used the z-score
defined above.
9. Cautions: Several
issues arose throughout the study for some of the variables. These are
described within the Data Definitions Section; however, further note should be
made of these problems.
- Domestic Violence:
Data for this variable are submitted to the Governor's Council only
voluntarily. We calculated county rates; however, these seriously
underestimate the true rates because not all cities submitted data and not
all cities appeared to submit complete reports. We suggest caution in
interpreting county data for this variable. We did not create a map for
this variable.
- Children Living in
Foster Care: The definition used by the state agency changed between
the two time periods. The relationships between the counties may have
stayed the same; however, the absolute rates between the two time periods
will not be comparable.
- Tobacco Sales
Outlets: This variable was collected only for 1998. These data do not
appear to be routinely collected for smaller geographical areas. This
variable will not, therefore, be useful over the longer time period
needed for monitoring. However, the 1998 information is presented in this
report.
- 1990 Census
Variables: The list of Validated Archival Indicators of Risk and
Outcome Variables that Predict Problem Behavior mandated that five 1990
census variables be included in the risk profiles. It is unlikely,
however, that 1990 information will be of substantial utility for a
rapidly changing population. The next archival data collection period
should include 2000 Census information.
Limitations of the Data
Most of the indicator
variables in this study are aggregate measures, meaning they are summaries of
observations derived from individuals in the group. In this type of analysis,
the social indicator variables are ecological variables, with the group (e.g.
the county) as the unit of analysis. Within each geographic unit, we do not
actually know the joint distribution of any combination of variables at the
individual level. For instance, we do not know the joint distribution of
whether an individual is from a divorced home and a substance user, or whether
a person from a high poverty area is actually below the poverty level. As
noted by numerous statisticians and epidemiologists, it can be misleading to
use ecological variables as proxies for individual data in models to predict
individual behavior. This makes ecological analyses particularly prone to a
type of bias known as the ecological fallacy (Morgenstern, 1998). The
potential for ecological fallacy will be particularly relevant when comparing
the risk profile information with the student survey results.
The aggregate variables,
however, often measure a different construct than a similar variable at the
individual level. The variable may be the social environment or context in
which the individual lives, and this environment may be distinct from the
personal attribute of the individual (Susser, 1994). The creation of a risk
profile from social indicator data for substance abuse within communities
should not imply that community characteristics are equivalent to
individual-level characteristics. These ecological variables can be useful
tools to define high-risk groups for community intervention and education
programs (Feinleib, 1998).
Another problem inherent in
ecological analyses is temporal ambiguity. It is often unclear whether the
various social indicator variables resulted from the outcome (high or
low substance abuse rates) or led to the outcome. A specific problem
for the current study is the use of social indicator estimates derived from
the 1990 US Census data to represent population experiences during 1997 and
1999. In a state undergoing rapid population changes, it is unclear whether
the information from 1990 will still be relevant for all geographic areas
in 1997. These data were required for this Report and are included within the
tables. However, the information may not be as relevant as originally
intended. As 2000 US Census data become available, these indicators can
supplant the 1990 data.
Finally, it must be
remembered that these social indicator data are based on archival data
collected within the state by multiple agencies for multiple purposes, none of
which included prevention assessment. While the use of archival data can be
time and cost effective, there are limitations to its utility. There are
distinct variations in the geographic boundaries used by the different
collecting agencies. For instance, some information is collected only at the
zip code level and other information only at the city jurisdiction level.
Because zip codes and city jurisdictions do not coincide exactly, aggregating
zip code information to the city level requires a set of assumptions and
interpolations. The appropriateness of these assumptions
needs to be kept in mind while reviewing the risk profiles. Another issue is
that data systems used within the agencies for collecting and archiving data
are constantly changing. Variables that are available one year for the Social
Indicator Study may be modified, or even eliminated, by a reporting agency
another year. Definitions used to structure the variable can also change,
making it necessary to annually review the data sources being received by an
archival monitoring system.
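To make the zip code aggregation issue concrete, the sketch below allocates zip code event counts to cities using a crosswalk of population shares. The report does not describe a specific interpolation method; the proportional allocation, the crosswalk structure, and all codes and figures here are hypothetical.

```python
from collections import defaultdict


def allocate_to_cities(zip_counts, crosswalk):
    """Allocate zip-code-level event counts to cities.

    `zip_counts` maps a zip code to its event count. `crosswalk` maps a zip
    code to {city: share}, where share is the assumed fraction of the zip
    code's population living in that city. Assumption: events are spread
    within a zip code in proportion to population.
    """
    city_counts = defaultdict(float)
    for zip_code, count in zip_counts.items():
        for city, share in crosswalk.get(zip_code, {}).items():
            city_counts[city] += count * share
    return dict(city_counts)


# Hypothetical data: one zip code lies wholly in City A, one is split.
zip_counts = {"00001": 40, "00002": 20}
crosswalk = {
    "00001": {"City A": 1.0},
    "00002": {"City A": 0.25, "City B": 0.75},
}
print(allocate_to_cities(zip_counts, crosswalk))
```

The estimated city counts depend entirely on the assumed shares, which is why the appropriateness of such assumptions matters when the resulting profiles are interpreted.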
Summary and Recommendations
At the end of a project,
more is always known about the problems and issues than was known at the
beginning. The original goal of the Social Indicator Study was to develop an
ongoing system of gathering and monitoring a specific set of archival data.
The specific aims had been to collect annual data for three years, to
determine if there were changes in the frequency of the various indicators
over the time period, and then to compare the risk factors with the
corresponding domains in the student survey results. Of necessity, these aims were
modified to reflect the change in budget, a decrease in the number of years of
data collection, and the inability to compare archival data with the final
student surveys. The Social Indicator Study did, however, collect data for 40
indicators for two years and integrated results for all these data into a
documented database. The prevalence of the various social indicators was
calculated by standardized geographic and demographic subgroups for individual
years and by individual counties. Risk profiles for the 40 specific indicators
and for a potentially relevant subset were developed for counties and selected
communities.
From this
collective work, we make several suggestions for future archival data
monitoring projects within the state:
- Carefully evaluate each
variable for the coverage being collected by the agency. Do not include a
variable in the main database if it is not collected by most of the
jurisdictions within the state, regardless of the national mandate to
collect the information. Domestic violence arrests, for instance, is a
variable that is collected only voluntarily, which makes for poor coverage
and probably poor utility in an ongoing archival project.
- Consider presenting the
merged data across several years of data collection. This should increase
the reliability of the indicators and strengthen assumptions made
regarding the data.
- Make the archival
database flexible. Geographical areas of interest change; new variables
may need to be added as new data sources become available.
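One way to present merged data across several years, as suggested above, is a pooled rate: total events divided by total population at risk over the years combined. The sketch below is an illustration under that assumption; the report does not prescribe a pooling method, and the function name and figures are hypothetical.

```python
def pooled_rate(events_by_year, population_by_year, per=100_000):
    """Return a multi-year pooled rate per `per` persons.

    Both arguments map a year to a value; only years present in both are
    used, so numerator and denominator cover the same time periods.
    """
    years = sorted(set(events_by_year) & set(population_by_year))
    if not years:
        raise ValueError("no years with both events and population data")
    total_events = sum(events_by_year[y] for y in years)
    total_population = sum(population_by_year[y] for y in years)
    if total_population <= 0:
        raise ValueError("pooled population at risk must be positive")
    return total_events / total_population * per


# Hypothetical county with few events in each single year.
events = {1997: 3, 1999: 5}
population = {1997: 20_000, 1999: 20_000}
print(pooled_rate(events, population))
```

Pooling increases the event count behind each rate, which should narrow the confidence interval for rare indicators, at the cost of obscuring year-to-year change.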
References Used in Report
Arthur MW & Blitz C.
(2000). Bridging the gap between science and practice in drug abuse prevention
through needs assessment and strategic community planning. J Community
Psychology 28:241-255.
Cagle LT & Banks SM.
(1986). The validity of assessing mental health needs with social indicators.
Evaluation and Program Planning 9: 127-142.
Feinleib M. (1998). A new
twist in ecological studies. Am J Public Health. 88:1445-1446.
Fiorentine R. (1994).
Assessing drug and alcohol treatment needs of general and special populations:
Conceptual, empirical and inferential issues. Journal of Drug Issues.
24:445-462.
Gruenewald P.J., Treno A.J.,
Taff G. & Klitzner M. (1997). Measuring community indicators. A system
approach to drug and alcohol problems. Thousand Oaks, CA: Sage Publications.
Hawkins J.D., Arthur M.D.,
& Catalano R.F. (1995). Preventing substance abuse. In M. Tonry & D.
Farrington (eds), Crime and Justice: Vol 19. Building a safer society.
Strategic approaches to crime prevention. (p 343-427). Chicago: University of
Chicago Press.
Hawkins J.D., Catalano R.F.,
& Miller, J.Y. (1992). Risk and protective factors for alcohol and other
drug problems in adolescence and early adulthood: Implications for substance
abuse prevention. Psychological Bulletin, 112: 64-105.
Morgenstern H. (1998).
Ecological Studies. In: Rothman, KJ and Greenland, S. (Eds.), Modern
Epidemiology, 2nd Ed. (pp. 459-480). Philadelphia, PA: Lippincott-Raven
Publishers.
Susser M. (1994). The logic
in ecological: II. The logic of design. Am J Public Health, 84: 830-835.
Wieczorek, W. (1997).
Alcohol and other drug abuse prevention services needs assessment:
County-level social indicator study. Albany NY: New York State Office of
Alcoholism and Substance Abuse Services.