Analytic Framework

One artifact of using lists of witness testimonies as data sources is that any particular killing may have multiple witnesses to the events or to evidence of the killing. Thus, more than one witness may report having observed the same killing or killings to investigators. On the other hand, some killings may not have been witnessed by others and hence went unreported by anyone. Therefore, it is not adequate to simply add up the total number of killings in the data files as an estimate of the total number of killings that occurred. Researchers must attempt to determine the number of killings that were reported in multiple sources, as well as estimate the number of killings that were not reported in any of the three studies. The question then is how to analyze non-random data that contain multiple reports of some incidents, yet no reports at all of other incidents.

Population-based studies can be complicated by a low event rate in the population. In these studies, a random sample of the population is surveyed, and the prevalence of an event within the sample is weighted and applied to the population. A population-based study must sample enough cases to document a sufficient number of the studied event for analysis. Standard errors, necessary to construct confidence intervals, are sensitive to sample sizes and event prevalence, and sub-group analyses (e.g., by time, geography or perpetrator) can only be conducted with a sufficiently large number of observed events. Thus, for population-based studies, researching relatively rare events requires a larger sample.38 However, as sample size increases, so does the time and cost required to conduct the data collection. In the case of the event being studied here, the number of victims is a relatively small proportion of the overall population.
Traditional sampling techniques would require a large number of households to be interviewed in order to generate enough documented killings for reliable estimates and detailed analysis. For example, in the PHR sample, only about five percent of the sampled households reported the killing of one or more household members (59 out of over 1,000 interviews). While this number may be adequate for generating an overall estimate of the number of killings, the low prevalence in the sample limits the ability of researchers to conduct detailed analyses. To conduct in-depth analyses with reliable estimates, several thousand additional interviews would need to be conducted. In addition, population-based studies depend on systematic sampling techniques and individual events being reported only once. This can be ensured by restricting the reporting of witnesses to those events that occurred within the sampling unit. For example, the PHR study asks respondents about killings of household members only. Since only one representative of each household participated in the study, killings of individual household members cannot be included more than once (though a respondent may have reported killings of more than one household member). However, when all members of the sampling unit are killed, the killings cannot be reported.39 Thus, population-based studies may systematically exclude some reports in an attempt to eliminate duplicating reporting. AAAS has outlined two major obstacles to generating accurate estimates of the number of Kosovar Albanians killed during the violence from March 20 to June 12, 1999. First is that the prevalence in the population is sufficiently low that traditional population-based data collection techniques are more costly and less efficient. Second is that witnesses will often provide multiple reports of the same killing and attempts to limit over-reporting may result in under-reporting. 
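The sample-size constraint described above for rare events can be illustrated with a short sketch. It assumes simple random sampling (no stratification or design effect) and takes the PHR figure of 59 affected households as 59 out of exactly 1,000 interviews, which is an approximation since the text says only "over 1,000". The function names and the target precision are hypothetical choices made for illustration.

```python
import math


def proportion_se(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def relative_se(p: float, n: int) -> float:
    """Standard error expressed as a fraction of the proportion itself."""
    return proportion_se(p, n) / p


def households_needed(p: float, target_relative_se: float) -> int:
    """Smallest sample size whose relative standard error meets the target.

    Solves sqrt(p * (1 - p) / n) / p <= target for n.
    """
    return math.ceil((1.0 - p) / (p * target_relative_se ** 2))


if __name__ == "__main__":
    # Assumed prevalence: 59 households reporting a killing out of 1,000.
    p_hat = 59 / 1000
    print("relative SE at n=1000:", round(relative_se(p_hat, 1000), 3))
    # Hypothetical precision target: relative SE of 5 percent.
    print("households for 5% relative SE:", households_needed(p_hat, 0.05))
```

Because the required sample size scales with 1/p, halving the prevalence roughly doubles the number of interviews needed for the same relative precision, which is the cost problem the text describes.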
Thus, population-based estimation techniques may not be best suited to determining estimates of the killing that occurred in Kosova/Kosovo. While the PHR data were collected using a quasi-random stratified sampling technique, the HRW and ABA/CEELI - Center data were not collected using random sampling techniques. In both of these studies, the researchers specifically sought out reports of killings and human rights abuses by interviewing as many individual witnesses as possible. The projects were originally intended to provide as much documentation of human rights abuses as possible, not to make population estimates. Thus, while the PHR data can be considered a random sample of refugees, the HRW and ABA/CEELI - Center data can only be considered lists of killings. To use these two lists to generate population estimates, alternative analysis techniques must be employed. There are analytic techniques that are not only well-suited to correct for the limitations of list-based data, but actually benefit from this method of data collection. Marks, Seltzer and Krótki40 outline a technique that allows researchers to use the results of multiple, quasi-independent data collections to compute not only the recorded number of events but also estimate the unrecorded number of events. Using these techniques, AAAS is able to (1) compute the number of reported killings and (2) estimate the number of unreported killings to generate an overall estimate of the number of killings of Kosovar Albanians during the 85 days between March 20 and June 12, 1999. 
While originally developed to estimate wildlife populations, capture-recapture techniques have more recently been adapted by demographic, public health and human rights researchers for a variety of projects.41 Among other things, capture-recapture techniques have been used to estimate the prevalence of drug use,42 HIV infection43 and prostitution.44 This technique has also been used extensively to evaluate the level of undercount in the decennial census of the United States.45 In the area of human rights, capture-recapture techniques have been applied to analyze the number of killings during the violence in Guatemala between 1960 and 1996.46

Underlying capture-recapture techniques is basic probability theory. The most basic principle is that if A and B are two independent events, then the probability of the two events jointly occurring is equal to the probability of A occurring times the probability of B occurring.
The next proposition states that if a researcher uses a data collection method that is known to obtain reports of a fixed percentage of the total number of events in the population, the population total can be estimated by:
Thus, in order to estimate the number of events in the population, the number of events in the sample (NA) and an estimate of the efficiency of the data collection method (PA) must be determined. By returning to Equation 1, and keeping in mind the assumption of independence between the two data sources, PA can be estimated by:
Given that
the total number of observations in the population,
and thus,
The two formulas
and
can be substituted into Equation 6, to show that
or, in the two-sample notation style, where the subscript 1 indicates presence in and 0 indicates absence from a data source,
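A compact sketch of the two-sample derivation described in this passage, written under the stated independence assumption. The symbols $N_A$ and $P_A$ follow the text; $N_B$ and $P_B$ (the second source), $N_{AB}$ and $P_{AB}$ (events found in both sources) and the cell counts $M_{11}$, $M_{10}$, $M_{01}$ are notation assumed here for illustration, and the report's own equation numbering is not reproduced.

```latex
\begin{align*}
P_{AB} &= P_A \, P_B
  && \text{(independence of the two sources)} \\
\hat{N} &= \frac{N_A}{P_A}
  && \text{(population total from source A)} \\
\hat{P}_A &= \frac{P_{AB}}{P_B}
  \approx \frac{N_{AB}/N}{N_B/N} = \frac{N_{AB}}{N_B}
  && \text{(efficiency of source A)} \\
\hat{N} &= \frac{N_A \, N_B}{N_{AB}} \\
\intertext{With $N_A = M_{11} + M_{10}$, $N_B = M_{11} + M_{01}$ and $N_{AB} = M_{11}$:}
\hat{N} &= \frac{(M_{11} + M_{10})(M_{11} + M_{01})}{M_{11}}
  = M_{11} + M_{10} + M_{01} + \frac{M_{10} \, M_{01}}{M_{11}}
\end{align*}
```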
The last portion of the equation equals the number of events that are unreported to either study. Thus, by knowing the level of overlap, or the number of killings reported to two independent lists, it is possible to generate an estimate of the number of killings that occurred in the population that includes an estimate of the number of killings that were not recorded in either list.

The two-sample estimator shown in Equation 10 is the simplest model within this general analytic technique. Marks, Seltzer, and Krótki47 also present a model that allows for estimation using three samples. In addition, there have been many other developments, with some of the more recent variations48 on this basic technique also used in this report. The specific formulas for the estimates and their standard errors are presented below.

Given the number of killings reported by one, two, or all three projects, we can estimate the number of killings excluded from all three samples. Each estimation technique used in these analyses is based on the principles of the general capture-recapture model. There are several methods by which this estimate can be made, each of which involves different assumptions about the relationships between the sources of data. The first method is Equation 11 from Marks, Seltzer, and Krótki49
When combined with the unduplicated documented total
this yields a total estimate of 10,538. This model assumes that there may be “appreciable correlation bias,” that is, the existence of inter-system dependencies among the three lists or systems. Marks, Seltzer, and Krótki50 do not suggest a variance estimator. Therefore, the variance was computed using a jackknife estimator, following Wolter,51
where The variance is defined by
and
Equation 13 yields k values of The other beneficial result of the jackknife method is that the values
of
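The jackknife approach referred to here can be sketched generically. The block below is a delete-one-group jackknife in its textbook pseudo-value form, not a reconstruction of the report's Equations 13 to 15: the way records are split into k groups (interleaving) and the function names are assumptions made for illustration. The second helper applies a normal approximation (z = 1.96, also an assumption) to turn an estimate and its standard error into an interval, using the reported estimate of 10,538 and standard error of 1,576 as inputs.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def grouped_jackknife(
    records: Sequence,
    estimator: Callable[[Sequence], float],
    k: int,
) -> Tuple[float, float]:
    """Delete-one-group jackknife estimate and variance.

    Records are split into k interleaved groups (an assumed grouping).
    For each group, the estimator is recomputed without that group and
    converted to a pseudo-value; the variance is taken over pseudo-values.
    """
    if k < 2 or k > len(records):
        raise ValueError("k must be between 2 and the number of records")
    groups: List[list] = [list(records[i::k]) for i in range(k)]
    theta_full = estimator(records)
    pseudo_values = []
    for omitted in range(k):
        kept = [r for g, group in enumerate(groups) if g != omitted for r in group]
        pseudo_values.append(k * theta_full - (k - 1) * estimator(kept))
    theta_jack = mean(pseudo_values)
    variance = sum((p - theta_jack) ** 2 for p in pseudo_values) / (k * (k - 1))
    return theta_jack, variance


def normal_interval(estimate: float, std_error: float, z: float = 1.96) -> Tuple[float, float]:
    """Symmetric interval estimate +/- z * std_error (normal approximation)."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    low, high = normal_interval(10538, 1576)
    print(round(low), round(high))
```

Any estimator that maps a set of records to a single number can be passed in, for example a function that tabulates the overlap cells of the retained records and applies a capture-recapture formula.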
A standard error of 1,576 was calculated with Equation 15. This standard error is used to generate the confidence interval in bar one of Figure 1.

A variation on these equations yields convergent estimates of the number of killings and the corresponding standard error. Bishop, Fienberg, and Holland52 suggest the following estimator for cases in which one sample is independent of the first two. This model is plausible for this case since the PHR sample was taken systematically within some camps, which would lead it to substantially different biases from the arbitrarily collected information in the ABA/CEELI-Center and HRW samples (though all of these samples overcollected information from refugees relative to people who were internally displaced). Treating PHR as the third (independent) system yields
When combined with the documented total per Equation 12, Equation 16 yields a total estimate of 10,242. The standard error is defined by
where the subscript “+” indicates summation over that variable. This equation yields an estimated standard error of 1,412. While this estimate and its confidence interval are not presented in the text of this report, the results converge with, and therefore support, those presented.

Data Preparation and Estimation

Regardless of the specific equations used, the generation of estimates using the capture-recapture technique involves the following steps: generating two or more internally non-redundant lists of events; matching events across lists to identify those events that are documented in two or more of the sources; merging the lists into one file; and estimating the number of undocumented events using the information on the matching of incidents across sources. A final step involves generating the estimates of the standard error in order to develop a confidence interval around the estimate of the total number of documented and undocumented events.

Reports of killings took multiple forms within the data lists. In the PHR data, respondents identified specific household members who had been killed, and the result was reasonably detailed identifications of victims. The ABA/CEELI-Center and HRW data were collected during interviews in which respondents were asked to describe all the killings or evidence of killings that they witnessed. These data collection techniques often yielded imprecise descriptions of the victims. Many killings were described in specific terms, naming the individual who had been killed, and perhaps providing the person’s sex and age. For example, a respondent might say that her son, John Doe (a 27-year-old male), was killed. Or, the respondent might provide a list of individual people who were killed; killings identified in this way are called individual, named victims.
Other killings were reported as unnamed groups: “there were twenty people killed in village X on March 28.” Killing victims identified in this way are called anonymous victims. Many reports are a mixture of the two forms: “my son Adam and his wife Betty were killed, along with twenty others from the village.” The HRW and ABA/CEELI-Center projects sought multiple witnesses to killing events. This format meant that each killing could have been reported in many different interviews. This is a “many-to-many” reporting format, in which each witness may talk about many killings, and each killing may be reported by many different witnesses. Before any statistical analyses can begin, all reports of each killing had to be identified so that the victims are counted only once. The number of times each victim is reported is called that victim’s reporting density. For example, the victim Adam Smith (M 27) may have been reported as the respondent’s son and as the colleague of some other victim (say, Carl) in a report by another witness. The reporting density for Adam would therefore be two. Reports of killings may identify victims by slightly different information. Adam’s name may be spelled in various ways, his age might be reported differently or not reported at all. The killing may be reported in a slightly different location, or on a slightly different date. The matching must take all of this variability into consideration. Killings of large groups present other complexities. The group killing described above may be reported by one respondent as “my son Adam and his wife Betty were killed, along with twenty others from the village.” A second respondent might say “My friend Carl and his colleague Adam were killed, along with about 25 other people.” Matching these reports, three individuals are identified, each of whom has a reporting density equal to two. Adam is clearly identified in each report. 
Betty and Carl also have a reporting density of two because they are identified once by name, and a second time as implied members of an anonymous group. In this example, the group also has a reporting density of two. Its quantity is the number in the group (20 or 25, depending on which number coders judged to have been more precise) minus the number of named individuals identified in the group. If the data coders judged 20 to be the more accurate estimate of the group size than 25 (as reported in the matched interview), the group of anonymous victims would be assigned quantity equal to 18 (20 minus Betty and Carl). The data coders took a maximal approach to matching. That is, whenever two individuals or groups seemed likely to be matches, they were coded as matches. We were concerned that matching errors should be conservative, that is, that the errors would tend to create bias toward lower total estimates (see “Sources of Bias in Estimates” below). How the matching was done, and the subsequent data processing, are described below. Named victims were matched against other named victims by crosschecking the names, the reported dates of the killings, and the reported places of the killings. The record being examined is called the source; the records to which it is compared are called the targets. All target records within +/- one week of the source were considered, and all target records in the same municipality as the source were considered. Names that contained obvious variant spellings or partial information (i.e., including only a first name or surname) were considered matches. Named source records were also matched against target collective records using the place and time limitations described above. These matches were more difficult to establish because the collective records give only a place and date at which a number of people were killed. 
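The named-victim matching rule described above can be sketched as a two-stage filter. This is an illustration rather than the coders' actual procedure, which was manual: the `Record` fields, the reading that a target must satisfy both the one-week window and the same-municipality condition, and the 0.8 similarity threshold for variant spellings are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import List


@dataclass
class Record:
    name: str          # victim name as reported; may be partial
    killed_on: date    # reported date of the killing
    municipality: str  # reported place of the killing


def names_compatible(a: str, b: str, min_ratio: float = 0.8) -> bool:
    """True for partial names (first name or surname only) or close spellings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return False
    if tokens_a <= tokens_b or tokens_b <= tokens_a:
        return True  # one name is a subset of the other: partial information
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= min_ratio


def candidate_targets(source: Record, targets: List[Record], window_days: int = 7) -> List[Record]:
    """Targets in the same municipality and within +/- window_days of the source."""
    return [
        t for t in targets
        if t.municipality == source.municipality
        and abs((t.killed_on - source.killed_on).days) <= window_days
    ]


def likely_matches(source: Record, targets: List[Record]) -> List[Record]:
    """Candidates whose names are compatible with the source name."""
    return [t for t in candidate_targets(source, targets)
            if names_compatible(source.name, t.name)]
```

A permissive name test like this one errs toward over-matching, which is the direction the coders were instructed to favor.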
Whenever there was minimal agreement between the time and place of an individual and collective killing, the data coders defined the killing as a match. Similarly, whenever collective killings were compared to other collective killings, minimal agreement was sufficient to define a match.

Intra-system matching often produced clusters of linked records in which the links were overdetermined: all the records in a particular incident linked to all the other records. The overlinking conflated separate individuals who were each linked to groups of anonymous victims. Returning to an example suggested earlier, imagine two reports. The first report says “my son Adam and his wife Betty were killed, along with 20 others from the village.” A second respondent might say “My friend Carl and his colleague Adam were killed, along with about 25 other people.” Matching these reports, the coders linked the reference to Adam in each report, and they linked the two groups of unnamed victims. However, the linking process failed to distinguish among individuals and groups clearly, and so Carl and Betty were also inadvertently linked to each other and to Adam.

To solve this, an additional matching step was added. A first pass of matching pulled all the related records together as described above. In a second pass, each incident (composed of all the individuals and groups identified as having been killed at one time and place) was examined. All the separate individuals were unlinked from the anonymous group while maintaining the links for records that point to a single person. The overall count of the group of anonymous victims was then decremented by the number of identified individuals who had been pulled out. In this example, the group would then be identified as having 17 people in it (20 minus the three individuals).

Internal Matching of Incidents. Prior to matching victims across lists, each list must be free of internal duplications of reports.
Only one of the three data lists contained duplicate reports of the same killing incident. The PHR data contained reports of killings of household members only and therefore contained no redundant reports of incidents. The HRW data had been pre-processed by researchers at Human Rights Watch, who had examined all reports of killings and eliminated any redundant incidents. Thus, only the ABA/CEELI-Center list contained redundant reports that needed to be identified and eliminated. The identification and elimination of duplicate records was conducted in the manner described above.

Matching Incidents Across Lists. In order to determine the level of overlap of reports across the three sources of data, the three lists were then matched. The PHR data were matched to both the HRW and the ABA/CEELI-Center lists and the HRW list was matched to the ABA/CEELI-Center list. Records could be unmatched, double-matched (found in two lists) or triple-matched (found in all three lists).

Inter-Matcher Reliability. AAAS generated two different measures of inter-matcher reliability. The first relates to the identification of matches between the HRW and ABA/CEELI-Center lists. The second relates to the correspondence between matches to the ABA/CEELI-Center list from the PHR and HRW lists. Overall, the inter-matcher reliability across lists indicated a high level of consistency in identifying multiple reports of killings. When matched to the ABA/CEELI-Center list, the HRW list was divided into three overlapping subsets. Four hundred of the records were duplicated and listed in one of the other two subsets. Inter-matcher reliability was assessed by examining how different matchers linked these duplicated HRW records to the ABA/CEELI-Center list. Examination of the reliability across subsets indicates a very high level of agreement in matching clusters.
Of these 400 overlapping records, only 49, or twelve percent, contained any discrepancies in how they were matched to the ABA/CEELI-Center list. The second source of inter-matcher reliability was determined when comparing how the matching clusters from the HRW and PHR agreed or disagreed in their match to the ABA/CEELI-Center list. In total there were seven records from the PHR list that matched records from the HRW list. These seven records were evaluated to assess their agreement in either matching or not matching to records in the ABA/CEELI-Center list. The consistency in HRW and PHR matching to the ABA/CEELI-Center list was not highly reliable. Only one of the seven records agreed that there was no match to the ABA/CEELI-Center list, while the remaining six records contained some sort of discordance in their match. In three of the seven, the HRW and PHR records were linked to different ABA/CEELI-Center records. In the final three records, either the HRW or PHR record matched to an ABA/CEELI-Center record while the other did not. While this portion of the matching was not reliable, it is important to note that this involved only seven records in total. Overall, AAAS is confident that there was sufficient reliability during the matching of incidents across lists. While the reliability of the agreement of the HRW and PHR links to the ABA/CEELI-Center list was poor, it involved only seven cases. In the larger matching of the HRW to the ABA/CEELI-Center list, the inter-matcher reliability is quite high, with a rate of 88 percent. Data Merging. Once the three lists were matched, they were merged into one data file. Each list may contain records with reports of multiple killings. That is, one record could contain a report of two or more unnamed individuals who were killed in the same incident. The result is that a cluster of killings in one data file may contain multiple records of multiple killing reports and match to a cluster in one or both of the other data files. 
These other clusters may also contain multiple records of multiple killing reports. In addition, a cluster may be listed in multiple sources but each list may not report the same number of killings. Given this file structure, the data merging process was somewhat more complicated than a simple one-to-one match merge. The first step involved comparing the clusters across lists and correcting for those that contained an unknown number of killings. For example, the HRW list may contain a cluster that reports the killing of 20 individuals. This cluster may be linked to a cluster in the ABA/CEELI-Center list that reports the same incident but for which the total number of victims was unknown. In cases such as this, the number of victims in the ABA/CEELI-Center file was set to equal the number of cases in the matching HRW cluster. Thus, AAAS was able to properly match records for killings that were linked but the total number of victims was unknown in one list. It is important to note that this adjustment was only made in the case of a link between clusters where one but not both had an unknown number of victims; if both contained an unknown quantity or there were known but unequal numbers in both clusters, the number of victims was not adjusted. File merging was conducted in two stages. First, the PHR list was merged to the HRW list. In this case, seven of the 59 PHR records matched records from the HRW file. The 52 unmatched records were added to the merged file. For the matching seven records, the PHR and HRW data may have contained links to the ABA/CEELI-Center list, with disagreement on the linkage to a specific ABA/CEELI-Center cluster. These link discrepancies could take three forms. First, the HRW data may indicate no link to ABA/CEELI-Center clusters, while the PHR data indicate a link. Second, the HRW data could indicate a link to the ABA/CEELI-Center clusters while the PHR data indicate no link. 
Finally, it may be that the HRW and PHR data indicate links to different ABA/CEELI-Center clusters. Since there was no simple system for determining which of the two records were more accurate, the following rules were applied. If one data source indicated a link to an ABA/CEELI-Center cluster while another did not, the link was preserved. However, if both data files indicated a different cluster link, the HRW cluster link was preserved. These discrepancies occurred in a total of six records. Once the HRW and PHR lists were merged, the ABA/CEELI-Center list was then merged by cluster identification number, with any unmatched cluster records being appended to the file. When matching the ABA/CEELI-Center data to the combined HRW/PHR list, 1,827 individual killing records were merged, while the remainder were appended to the combined file. After selecting only those incidents that occurred between March 20 and June 12 within Kosova/Kosovo, the combined data file contained 7,322 documented killings. 3,909 were from the HRW list, while 21 and 5,417 were from the PHR and ABA/CEELI-Center lists, respectively.53 See Tables 1 through 3 for detailed information on the overlap of records for the three data files as they were merged into the combined data file. Table 1. Killing Overlap Counts for Combined Data File
Table 1 presents information on the number and percent of individual killings in the combined data file that were contained in the different lists. For example, 5 (0.07 percent) of the 7,322 killings were reported in all three lists (M111), while 2,005 (27 percent) of the killings were reported in the HRW and ABA/CEELI-Center lists but not the PHR list (M101).

Table 2. Overlap Percentages within Data Sources
Table 2 shows the overlap as the percentage of the number of killings within each list. For example, 0.13 percent of the 3,909 HRW killings were also listed in the PHR and ABA/CEELI-Center lists (M111), while nearly 24 percent of the 21 PHR killings were reported in the other two lists.

Table 3. Aggregate Overlap Percentages within Data Sources
Table 3 shows the percent of non-matches and double- and triple-matches as the percentage of the number of killings reported in each list. Note that for each list, a relatively high number of killings were reported in at least one other list. In the HRW, PHR and ABA/CEELI-Center lists, 52 percent, 71 percent and 37 percent of the reports were double- or triple-matched, respectively.

Sources of Bias in Estimates

Characteristics of the killings, as well as the data collection and preparation process, can introduce either upward or downward bias into these estimates.54 Some of these factors influence our estimates while others do not. The following section addresses potential sources of bias and discusses what, if any, effects they may have on these estimates.

Homogeneity of Catchability. Homogeneity within lists results when the probability of inclusion on a source list does not vary from individual to individual. Heterogeneity, or unequal probabilities of being listed in the data sources, can generate bias in the resulting estimates (the direction of bias depends on the pattern of heterogeneity). Studies relying on voluntary reporting are especially susceptible to this form of bias. While this study does not rely strictly on voluntary reporting, there are some characteristics of events that may affect the homogeneity of catchability. First, some witnesses may be more interested in reporting their experiences than other witnesses. Research in account-making in response to stressful events indicates that individuals psychologically benefit from relating these traumatic experiences to others.55 Thus, witnesses who experienced the loss of family members or for whom the killings may have been more “personal” may be more likely than witnesses of less personally traumatic events to relate their experiences to researchers. Some victims may seek out an audience for their accounts for personal reasons; others may be making an overt attempt to document the events that occurred.
Regardless of the reason, there will be some individual-based sources of heterogeneity in reporting events. However, the efforts of researchers to contact a large number of witnesses serve to mitigate this particular form of heterogeneity of catchability. There are additional potential sources of list heterogeneity. Within these lists, we observed differences across time and space in the likelihood that a witness will report their experiences to researchers. Analyses of the overlap of reports of killings across lists indicate that across both time and geography there exists variability in the probability of any killing being reported. While it is difficult to estimate the exact amount of bias caused by heterogeneity of catchability, AAAS does not expect that these estimates contain significant upward bias from this source.56

Event Clustering. As stated previously, a necessary assumption for unbiased estimates using the capture-recapture technique is that within any data source, the probability of selection is equal for all events. Given the nature of the violence studied, the probability of selection is probably not equal for all events. That is, within any killing incident, there are often groups with unnamed victims. Therefore, there are two probabilities of the event being reported. First is the probability that a particular killing cluster is reported. Second is that, given that the cluster is reported, there is a separate probability that a specific killing within the cluster is reported to the researcher by name or by inclusion in a quantified group. In addition, if a killing cluster is not reported to researchers, the probability is zero that any individual killing within that cluster is reported.
Although this clustering effect has been shown to introduce bias into the generated estimates within two-sample studies, in simulations the bias does not exceed one percent.57 Research on two-sample studies with clustered data indicates that the type of bias introduced varies depending on the relationship between cluster size and probability of cluster selection in both data sources. Where the probability of cluster selection is unrelated to cluster size in both data sources, there are techniques that can be used to correct for any introduced bias, but when the probability of selection varies according to cluster size, the bias introduced is difficult to determine. However, it appears that the net effect of clustering on the estimates is reasonably small. Given the event being studied here, it is likely that the probability of cluster selection does vary according to cluster size and that the relationship is similar for all data sources. It is probable that the fewer the number of individuals killed in a single incident, the less likely the event is to be reported (i.e., fewer witnesses exist to report the event). Thus, as the number of killings within a cluster increases, the probability that the cluster is reported to researchers could also increase. AAAS believes that clustering may introduce a small, downward bias to the estimates and that this effect could be tested with additional data.

Independence of Lists. Independence of lists occurs when the probability of an event being included in one list does not depend on whether the event is included on another list. For example, if researchers in the different projects coordinated efforts to only interview refugees who had not provided accounts to one of the other data collection projects, there would exist a negative dependence between the lists and a resulting upward bias in the estimates.
If researchers shared information that would lead each other to potential witnesses, the positive dependence between the lists would result in a downward bias in the estimates. However, there were no overt efforts by any of the researchers to exclude or include witnesses who had participated in another data collection project, and therefore AAAS anticipates no bias from dependence between the lists. Unpurged Out of Scope Reports. Upward bias occurs if reports of events that did not occur within the scope of the study are included in the analyses. In these analyses, the major boundaries of this project were of time and space. AAAS estimated only the number of killings that occurred between March 20 and June 12, 1999 within the 29 municipalities of the province of Kosova/Kosovo. In the process of these analyses, all cases with out of range or unknown dates were dropped from the analysis. In addition, the few reports of killings that occurred outside of Kosova/Kosovo were excluded from analyses. There were incidents for which witnesses could not accurately identify a specific location, and these cases were evaluated for whether or not they occurred within Kosova/Kosovo. Those that did not occur within the province were not included in the study. Thus, even those incidents that were not coded for their specific municipality within Kosova/Kosovo could be included in the analysis without biasing the estimates. It is unlikely that unpurged out-of-scope reports created any significant bias to these estimates. Perfect Matching. One assumption required for capture-recapture techniques is that all duplicate reports across lists are matched perfectly. That is, there are no falsely matched records (i.e., reports of separate killings that are mistakenly matched as duplicate reports of the same killing) and no false unmatched records (i.e., duplicate reports of the same killing that are mistakenly left unmatched). 
In the HRW and ABA/CEELI-Center lists, there is an additional required assumption: that duplicate reports within each of the lists are perfectly eliminated. That is, because the research techniques from both of these studies were such that killings could be reported by multiple witnesses, these duplicate reports must be detected and eliminated.

The elimination of false matches and non-matches depends primarily on the development of matching criteria that allow for sufficient discrimination between events and identification of matching events. The three lists used in these analyses were sorted by location and date of killing and compared across all of the characteristics of the killing, including the name, sex and age of victim and the type of perpetrator. With issues of matching and potential bias in mind, the identification of duplicate reports within lists and the matching of reports across lists was conducted with the intent to err on the side of over-matching records. That is, matchers were instructed that if they found a possible match, but were not certain about whether the reports were of the same killing, they should identify the pair as a match.

One of the most effective ways to match victims is by name. However, while most of the killing records had valid date and location information, a substantial number of records did not contain any victim names. It is reasonable to assume that among those incidents where the victims were named, there were few incorrect non-matches. However, among the unnamed victims, the matching process was less certain, and there may have been more errors. Indeed, matching rates do differ by whether the victim was named. For example, of the incidents in the HRW list for which victims were named, approximately 49 percent were matched to either the PHR or ABA/CEELI-Center lists.
Of the incidents where victims were unnamed, approximately 52 percent were matched to either the PHR or ABA/CEELI-Center lists. Among the ABA/CEELI-Center data, six percent of the named records were matched to either the HRW or PHR lists. On the other hand, of the unnamed clusters, approximately 44 percent were matched to either the HRW or PHR lists. In the case of the HRW list, roughly equivalent proportions of named and unnamed victims were matched to the other lists, while for the ABA/CEELI-Center list, a greater proportion of unnamed than named victims were matched. The question of whether these lists have been over- or under-matched cannot be resolved empirically. However, while there was potential for under-matching that could result in upward bias in the overall estimate, the high rate of matching across lists suggests that any upward bias is minimal.

Matching Rates Across Data Sources. Bias can occur when there is only a small number of matches across lists. Across the three lists used in these analyses, 28 percent of the documented killings were reported in two or three of the lists. These results indicate that the matching rate across lists is high enough to negate concerns about an upward bias in the estimates from this source.

Overall, there are several sources of potential bias in these estimates. For these sources, the effect on the overall estimate cannot be estimated empirically. However, the analysis process was designed to minimize upward bias whenever possible in favor of a downward bias. Of the possible sources of bias, it is likely that dependence of lists, unpurged out-of-scope events and matching rates do not introduce significant bias into these estimates. Heterogeneity of catchability may introduce a small positive bias into estimates, and the amount of bias introduced by imperfect matching within and across lists or by event clustering is unknown.
However, AAAS is confident that, overall, there is no significant upward bias to these estimates.

Supplemental Data on Expulsions

Additional data were used to provide comparisons between the documented killings and estimates of expulsions between March 20 and June 12, 1999. These supplemental data originate from research by Ball58 on the flight of ethnic Albanians from Kosova/Kosovo between March and May of 1999. Several sources of data were used to generate the estimates of expulsions. The primary data consisted of records maintained by border guards who registered refugees crossing into Albania. Supplementing these data were:
Ball first generated estimates of the number of refugees who crossed the borders into Albania, using border records supplemented by the other data sources to impute missing data. Using data on villages of origin and travel times from the in-depth interviews, Ball was then able to generate estimates of locations and departure dates for Albanian refugees.61 These estimates of locations and dates of departure were then compared to the documented killings by date and municipality. The comparisons of killings and expulsions were generated by sorting expulsion data by two-day intervals and by municipality.

The documented reports of killings and estimated expulsions showed a great level of correspondence in their patterns, as was shown in Figures 2 and 4 and can be seen in the following two figures. Figure 5 shows a scatterplot of documented killings and estimated expulsions by date. The plot shows a reasonably high level of correspondence between documented killings and estimated expulsions; the line indicates the slope of the estimated regression between the two. Of particular note is the data point in the upper right-hand corner of the plot. This data point represents March 27-28, when almost 1,000 Albanians were documented as having been killed and over 50,000 were estimated as having been expelled from their villages.

Figure 5: Scatterplot of Documented Killings and Estimated Expulsions by Time
Figure 6 presents a scatterplot of the documented killings and estimated expulsions by municipality. As with the scatterplot by date, this figure also indicates a high level of correspondence between the documented killings and estimated expulsions. The one main outlying observation is in the lower right-hand corner of the scatterplot; in the municipality of Prizren/Prizren, there was a high number of estimated expulsions but a relatively low number of documented killings. Regardless of this one municipality, the pattern is clear: in municipalities with high levels of estimated expulsions there was also a high number of documented killings, while in municipalities with low levels of estimated expulsions there was a low number of documented killings.

Figure 6: Scatterplot of Documented Killings and Estimated Expulsions by Municipality
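The two-day grouping and the regression line described for the scatterplots can be sketched as follows. This is a minimal illustration, not the procedure actually used: the helper names, the daily counts, and the choice to regress killings on expulsions (expulsions on the horizontal axis) are all assumptions, and the numbers are invented.

```python
from collections import defaultdict
from datetime import date

START = date(1999, 3, 20)  # first day of the study period


def two_day_bins(records, start=START):
    """Sum (day, count) records into consecutive two-day intervals."""
    totals = defaultdict(int)
    for day, count in records:
        totals[(day - start).days // 2] += count
    return dict(totals)


def least_squares(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical daily counts, invented for illustration only.
killings = [(date(1999, 3, 20), 5), (date(1999, 3, 21), 7),
            (date(1999, 3, 22), 10), (date(1999, 3, 24), 20),
            (date(1999, 3, 25), 4)]
expulsions = [(date(1999, 3, 20), 100), (date(1999, 3, 21), 140),
              (date(1999, 3, 23), 200), (date(1999, 3, 24), 300),
              (date(1999, 3, 25), 180)]

killed_by_bin = two_day_bins(killings)
expelled_by_bin = two_day_bins(expulsions)

# Pair the two series on the intervals they have in common.
common = sorted(set(killed_by_bin) & set(expelled_by_bin))
xs = [expelled_by_bin[i] for i in common]
ys = [killed_by_bin[i] for i in common]

slope, intercept = least_squares(xs, ys)
print(common, slope, intercept)
```

The same pairing step works for the comparison by municipality if the interval index is replaced by a municipality name as the grouping key.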
38 William G. Cochran, Sampling Techniques (1977).
39 The HRW data contain witness reports of the killing of entire households (F. Abrahams, personal communication, August 1, 2000).
40 E.S. Marks, et al., supra note 23.
41 See G. A. F. Seber, A Review of Estimating Animal Abundance II, 60(2) Int’l Stat. Rev. 129 (1992); Chandra C. Sekar and William E. Deming, On a Method of Estimating Birth and Death Rates and the Extent of Registration, 44 J. A. Stat. A. 101 (1949); Global Health Network, Capture Recapture Webpage, http://www.pitt.edu/~yuc2/cr/main.htm (2000).
42 J. N. Doscher & J. A. Woodward, Estimating the Size of Subpopulations of Heroin Users: Applications of Log-Linear Models to Capture-Recapture Sampling, 18 Int’l J. Addiction (1983); T. D. Mastro, et al., Estimating the Number of HIV Infected Injection Drug Users in Bangkok: A Capture-recapture Method, 84(7) A. J. Pub. Health 1094 (1994).
43 E. Drucker & S. H. Vermund, Estimating Population Prevalence of HIV Infection in Urban Areas with High Rates of Intravenous Drug Use, 130(1) A. J. Epidemiology 131 (1989); C. A. Perucci et al., The Impact of Intravenous Drug Use on Mortality of Young Adults in Rome, Italy, 87(12) British Journal of Addiction 1637 (1992).
44 N. McKeganey et al., Female Streetwalking Prostitution and HIV Infection in Glasgow, 308(6920) British Med. J. 27 (1994).
45 For example, see C. D. Cowan & D. Malec, Capture-Recapture Models When Both Sources Have Clustered Observations, 81(394) J. A. Stat. A. (1986).
46 Ball, et al., supra note 23 at Chapter 11.
48 For example, see Yvonne M. Bishop, et al., Discrete Multivariate Analysis: Theory and Practice (1975).
49 Marks, supra note 23, eq. 7.118 at 406.
51 Kirk Wolter, Introduction to Variance Estimation (1985).
52 Bishop, supra note 48, eq. 6.4-20 at 241.
53 Due to overlap, the counts from the individual data files do not sum to the total for the combined data file.
54 For more detail on sources of bias, see Cowan, supra note 45; J. N. Darroch et al., A Three-Sample Multiple-Recapture Approach to Census Population Estimation with Heterogenous Catchability, 81(394) J. A. Stat. A. 1137 (1993).
55 See Judith Herman, Trauma and Recovery (1992); see also Terri L. Orbuch, People’s Accounts Count: The Sociology of Accounts, 23 Ann. Rev. Sociology 455 (1997).
56 Research that has modeled the effect of variable catchability on capture-recapture estimates does not find that it contributes a significant amount to the overall bias. E.B. Hook and R.R. Regal, Effect of Variation in Probability of Ascertainment by Sources (Variable Catchability) upon Capture-Recapture Estimates of Prevalence, 137 Am. J. Epidemiology 1148 (1993).
58 Ball, supra note 27.
59 Some of these 123 interviews were also used to generate the estimates of killings in the current study.
60 These are the same 1,180 interviews that provided the 59 PHR cases used in the current study.
61 For detailed information on the data used and the estimation process, see Ball, supra note 27.