Monday, May 20, 2019
The Effect of Retention Interval on the Confidence–Accuracy Relationship for Eyewitness Identification
Law Hum Behav (2010) 34:337–347
DOI 10.1007/s10979-009-9192-x

ORIGINAL ARTICLE

James Sauer · Neil Brewer · Tick Zweck · Nathan Weber
School of Psychology, Flinders University, GPO Box 2100, Adelaide, SA 5001, Australia
Present address (J. Sauer): Department of Psychology, University of Portsmouth, Portsmouth, UK

Published online: 22 July 2009
© American Psychology-Law Society/Division 41 of the American Psychological Association 2009

Abstract
Recent research using a calibration approach indicates that eyewitness confidence assessments obtained immediately after a positive identification decision provide a useful guide as to the likely accuracy of the identification. This study extended research on the boundary conditions of the confidence–accuracy (CA) relationship by varying the retention interval between encoding and identification test. Participants (N = 1,063) viewed one of five different targets in a community setting and attempted an identification from an 8-person target-present or -absent lineup either immediately or several weeks later. Compared to the immediate condition, the delay condition produced greater overconfidence and lower diagnosticity. However, for choosers at both retention intervals there was a meaningful CA relationship, and diagnosticity was much stronger at high than low confidence levels.

Keywords: Eyewitness identification · Confidence–accuracy · Retention interval · Calibration

Criminal justice systems often use eyewitness identification evidence when assessing the likely guilt of a suspect or defendant. Yet, the likelihood of eyewitness identification error is well documented by laboratory- and field-based research demonstrating that, when presented with a lineup and asked to make an identification decision, witnesses sometimes (a) misidentify innocent lineup members as the culprit or (b) fail to identify the culprit when (s)he is present in the lineup (Cutler & Penrod, 1995; Innocence Project, 2009; Pike, Brace, & Kynan, 2002; Wells et al., 1998). Such identification errors divert investigative attention from the actual culprit and are likely to undermine the effectiveness of the criminal justice system. Their impact has motivated a substantial amount of research aimed at identifying markers capable of discriminating accurate from inaccurate identification decisions.

Eyewitness confidence is one possible marker of identification accuracy that has been used by forensic decision makers. Not only has confidence been endorsed by the U.S. Supreme Court as one of the criteria to be considered when assessing the likely accuracy of identification evidence (Neil v. Biggers, 1972), but there is also a substantial literature demonstrating that eyewitness confidence influences assessments of likely identification accuracy made by police officers, lawyers, jurors, and jury-eligible samples (e.g., Bradfield & Wells, 2000; Brewer & Burke, 2002; Cutler, Penrod, & Stuve, 1988; Deffenbacher & Loftus, 1982; Lindsay, Wells, & Rumpel, 1981). Moreover, there are sound theoretical grounds for predicting a meaningful confidence–accuracy (CA) relationship for eyewitness identification decisions, which are a form of recognition memory decision. A number of theories of decision making and confidence processing, such as signal detection theory (Egan, 1958; Green & Swets, 1966; Macmillan & Creelman, 1991) and accumulator models of decision making and perceptual discrimination (Van Zandt, 2000; Vickers, 1979), suggest a shared evidential basis for response and response confidence in recognition memory tasks.
Both classes of theory hold that confidence stems from the same evidence that drives the decision-making process and, consequently, conditions facilitating accurate responding (e.g., long exposure durations, focused attention, short retention intervals) should also produce high confidence. Conversely, conditions that hinder accurate responding should also lead to decreased confidence.

Although there have been repeated demonstrations of weak or, at best, modest CA correlations (e.g., Bothwell, Deffenbacher, & Brigham, 1987; Sporer, Penrod, Read, & Cutler, 1995), empirical support for the diagnostic utility of eyewitness identification confidence, under certain conditions, has grown (e.g., Brewer & Wells, 2006; Juslin, Olsson, & Winman, 1996; Lindsay, Nilsen, & Read, 2000; Lindsay, Read, & Sharma, 1998; Sauer, Brewer, & Wells, 2008; Sauerland & Sporer, 2009; Weber & Brewer, 2004). Continued research interest in the CA relationship has been stimulated by two lines of enquiry suggesting that the early correlational work underestimated the CA relationship. First, Lindsay et al. (2000, 1998) argued that the homogeneity of encoding and testing conditions (e.g., exposure duration, witnesses' attention to the target stimulus, retention interval, etc.) evident in most correlational investigations of the CA relationship for eyewitness identification tasks restricts variation in the quality of participants' memories for the target. Thus, variations in accuracy and confidence are limited, and the CA relationship underestimated. Lindsay et al. demonstrated substantial CA correlations across participants making a positive identification when witnessing conditions were varied to produce changes in the quality of the witness's memory for the target.

Second, Juslin et al. (1996) argued that the point-biserial correlation provides only a limited perspective on the CA relationship, whereas an alternative approach, calibration, provides (a) a more detailed representation of the CA relationship and (b) more forensically useful information. The calibration approach compares the objective and subjective probabilities of a response being correct, determining the proportion of correct responses at each confidence level (typically measured on a 0–100% scale). Perfect calibration is obtained when, for example, 100% of all responses made with 100% confidence are accurate, 90% of all responses made with 90% confidence are accurate, etc. This information is typically plotted on a graph, with the resulting calibration curve compared to the ideal function, to assess the CA relationship.

In addition to visual inspection of the curve, the calibration approach incorporates a number of statistical tools for assessing the CA relation. First, the calibration (C) statistic indexes the degree of correspondence between the subjective assessment (i.e., confidence) and the objective probability (i.e., accuracy) of correct recognition, and varies from 0 (perfect calibration) to 1. To calculate the C statistic, the difference between proportion correct and confidence level is computed, and squared, for each confidence level. These values, each multiplied by the number of judgments at the respective confidence level, are then summed and divided by the total number of judgments in the sample. Second, the computation of an over/underconfidence (O/U) statistic indicates the extent to which participants are, generally, more or less confident than they are accurate. The O/U statistic is calculated by subtracting the mean accuracy from the mean confidence of the sample. The O/U statistic can range from -1 to 1, with negative and positive scores indicating underconfidence and overconfidence, respectively.
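The verbal definitions of the C and O/U statistics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the argument layout (one entry per confidence level), and the example counts are all assumptions made for the sketch.

```python
from typing import Sequence


def calibration_statistic(confidence: Sequence[float],
                          n_judgments: Sequence[int],
                          n_correct: Sequence[int]) -> float:
    """C statistic: weighted mean squared gap between confidence and accuracy.

    confidence  -- confidence level of each category as a proportion (0 to 1)
    n_judgments -- number of judgments made at each confidence level
    n_correct   -- number of those judgments that were correct
    """
    total = sum(n_judgments)
    weighted_sum = 0.0
    for conf, n, k in zip(confidence, n_judgments, n_correct):
        if n == 0:
            continue  # an empty confidence level contributes nothing
        proportion_correct = k / n
        weighted_sum += n * (conf - proportion_correct) ** 2
    return weighted_sum / total


def over_underconfidence(confidence: Sequence[float],
                         n_judgments: Sequence[int],
                         n_correct: Sequence[int]) -> float:
    """O/U statistic: mean confidence minus mean accuracy (-1 to 1)."""
    total = sum(n_judgments)
    mean_confidence = sum(c * n for c, n in zip(confidence, n_judgments)) / total
    mean_accuracy = sum(n_correct) / total
    return mean_confidence - mean_accuracy


# Hypothetical counts (not data from the study): an overconfident sample.
levels = [0.5, 1.0]
judgments = [10, 10]
correct = [4, 6]
print(calibration_statistic(levels, judgments, correct))
print(over_underconfidence(levels, judgments, correct))
```

With these made-up counts, accuracy falls short of confidence at both levels, so the O/U value is positive (overconfidence) and C is above zero.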
Finally, resolution (like the CA correlation) assesses the extent to which confidence discriminates correct from incorrect decisions. The Normalized Resolution Index (NRI) ranges from 0 (no discrimination) to 1 (perfect discrimination).

The forensic utility of the calibration approach, when compared to correlation, lies in its indication of probable accuracy for each level of confidence. As Juslin et al. (1996) note, the knowledge that the CA correlation is, for example, .28 does not help assess the accuracy of an individual identification made with 80% confidence. On the other hand, knowing that 80% (or 70%, or 90%) of identifications made with 80% confidence are correct provides a guide for assessing the likely reliability of an individual identification decision.

Studies using the calibration approach have not only provided detailed information on the CA relationship but, in so doing, have also demonstrated robust CA relationships when participants positively identify a lineup member as the culprit (e.g., Brewer & Wells, 2006; Juslin et al., 1996; Sauerland & Sporer, 2009), provided confidence is assessed immediately after the identification is made (Bradfield, Wells, & Olson, 2002; Brewer, Weber, & Semmler, 2007). The reason for the poor CA relations typically observed for non-choosers remains unclear. However, it is well understood why delaying the assessment of confidence is harmful to the CA relation. As outlined above, the relationship between memory quality, confidence, and accuracy is fundamental to the CA relationship. However, confidence can be shaped not only by memory quality but also by various social, environmental, and meta-cognitive factors (see Wells, 1993).
As the influence of these non-memorial factors increases, the degree to which confidence reflects the evidential basis it shares with accuracy decreases and, in turn, the CA relation weakens.

Research testing the boundary conditions for CA calibration is under way. The difference in CA relations for choosers and non-choosers, and the deleterious effects of delaying assessments of confidence on the CA relationship, are well documented. Brewer and Wells (2006) examined the effects on CA calibration of varying instructional bias, foil similarity, and target-absent base rates, while Weber and Brewer (2003) tested the effect of varying the confidence scale on CA calibration in basic face recognition tasks. The present study further probes the boundary conditions for CA calibration by investigating the effects of varying the retention interval between encoding and the identification test on the CA relationship.

Retention interval is a variable of particular interest for three main reasons. First, witnesses to actual offences commonly experience delays ranging from hours to months between viewing an event and being asked to make an identification decision. For example, Pike et al. (2002) report UK survey data revealing a median delay of over 10 weeks between police requesting and administering a lineup, although they noted that more conservative estimates put the average delay at just over a month. Regardless, it seems safe to assume that the average retention interval for the witness (i.e., between viewing the crime and viewing the lineup) is longer. In contrast, retention intervals employed to date in laboratory-based investigations of CA calibration for eyewitness (e.g., 15 min in Brewer & Wells, 2006) and face recognition memory tasks (e.g., 3–10 min in Weber & Brewer, 2003, 2004, 2006) are considerably shorter and less varied in range. Juslin et al.'s (1996) CA calibration study provides an exception by employing 1 h and 1 week retention intervals, and their findings are addressed below. The emphasis placed on confidence by decision makers in the forensic setting makes understanding the effect of lengthened delays on the efficacy of confidence in discriminating accurate from inaccurate identification decisions a matter of forensic significance.

Second, theories of recognition and recall memory function suggest that, in general, the quantity, quality, and/or accessibility of information stored in memory decreases over time. This claim is supported by a large body of research literature demonstrating that, across a variety of memory task paradigms, increases in retention interval generally produce decreases in recognition and recall memory performance (e.g., Deffenbacher, Bornstein, McGorty, & Penrod, 2008; Ebbinghaus, 1964; Schacter, 1999). Thus, variations in retention interval should produce variations in accuracy. While memory strength is the proposed basis for both confidence and accuracy (e.g., Egan, 1958; Green & Swets, 1966; Macmillan & Creelman, 1991) and, hence, variations in memory strength should affect both components of the CA relationship, it is unclear whether the effects on confidence and accuracy will be equivalent. Previous research demonstrates that changes in accuracy are not always accompanied by equivalent changes in confidence (e.g., Weber & Brewer, 2004) and, further, that various manipulations can influence confidence, and the CA relation, independent of effects on accuracy (e.g., Busey, Tunnicliff, Loftus, & Loftus, 2000). Investigations of the CA relation for eyewitness recall memory suggest that repeated questioning produces confidence inflation (Shaw, 1996; Shaw & McClure, 1996).
For recognition memory, providing post-identification feedback, encouraging witnesses to reflect on whether their encoding conditions were likely to facilitate or hinder identification accuracy, and having witnesses consider their behavior during the identification process all produce variations in the CA relation, without affecting accuracy (e.g., Bradfield et al., 2002; Brewer, Keast, & Rishworth, 2002; Kassin, 1985; Kassin, Rigby, & Castillo, 1991). In sum, it is clear that despite the strong theoretical link between confidence, accuracy, and memory strength, non-memorial factors can lead to CA dissociation. Thus, while the effect of increased retention interval on memory strength (and accuracy) is predictable, the effect of increased retention interval on CA calibration is not.

Third, while numerous studies have investigated the effect of varied retention interval on recognition and recall memory accuracy (see Deffenbacher et al., 2008 for a review), studies probing the effect of varied retention interval on the CA relationship are scarce. Lindsay et al. (1998) varied retention interval, but it was manipulated in conjunction with a number of other variables in an effort to exert a compounded effect on memory quality. Further, they assessed the CA relation using correlation and, thus, their findings do not allow specific predictions regarding CA calibration (see also Lindsay et al., 1981). As mentioned above, Juslin et al. varied retention interval and found no difference in CA calibration for identifications made after retention intervals of 1 h and 1 week. However, Juslin et al.'s investigation is limited in two important ways. First, their manipulation of retention interval exerted a negligible effect on accuracy (correct identification rates were .69 and .64 for the 1 h and 1 week conditions, respectively).
Thus, there is no evidence that participants' memories were challenged by the additional delay, and these findings are unable to speak to the effect of delay-induced memory degradation on CA calibration. Juslin et al. presented an additional CA calibration curve, based on a different dataset from that described in the article, which (a) combined data from a 1 week and 3 month retention interval condition and (b) suggested a meaningful CA relation in the upper half of the confidence scale. However, for three reasons, this curve is not informative regarding the effect of retention interval. First, the experimental methodology and data underpinning this curve remain (to our knowledge) unpublished. Second, the absence of any accuracy data precludes an assessment of any decline in memory associated with the increased retention interval. Third, derivation of a calibration curve given such a small sample required collapsing data across retention interval conditions, and no indication was given of the relative contribution of data from each retention interval condition. Thus, we have no way of knowing to what extent this curve reflects the influence of either the shorter or longer of the two retention intervals. Simply put, Juslin et al.'s initial manipulation of retention interval was not strong enough to affect memory quality, and the introduction of the additional data did not overcome this limitation.

Second, after presenting a lineup but prior to making an identification decision, Juslin et al. (1996) had participants rate their confidence that any lineup member was presented at encoding. Brewer et al. (2002) found that having participants consider encoding conditions prior to rating confidence improved CA calibration. In a similar way, Juslin et al.'s initial rating task may have aided calibration.
For example, if a participant rates the likelihood that a lineup member was present at encoding as high, (s)he is likely to pick and do so with high confidence. Alternatively, if (s)he rates this likelihood as low but still chooses, confidence (and accuracy) is likely to be low. This pre-decision rating task may have improved CA calibration. Further, other research suggests that encouraging witnesses to consider confidence prior to making an identification can alter the decision making process and decision accuracy (e.g., Fleet, Brigham, & Bothwell, 1987).

In addition to these two major limitations, two idiosyncrasies in Juslin et al.'s (1996) methodology may have affected the CA relation observed. First, Juslin et al. used a target-absent base rate of .25, rather than the .50 base rate typical of eyewitness CA calibration research (and used in this research). While there is no reason to assume a .50 target-absent base rate in the applied setting (with the typically used .50 target-absent base rate perhaps representing a considerable overestimation), differences in the target-absent base rate affect CA calibration (Brewer & Wells, 2006). Second, the researchers provided instructions on calibration and interpretation of the confidence scale. Prior to eliciting confidence estimates, Juslin et al. informed participants that a positive identification accompanied by a confidence estimate of 0% amounted to a contradiction. While this logic may be sound, positive identifications are sometimes made with very low (even 0%) confidence, and this instruction may have influenced participants' confidence estimates and, consequently, the CA relationship observed.

Taken together, these differences are sufficient to raise doubts about the generalizability of Juslin et al.'s findings. Specifically, given that accuracy was barely affected by the manipulation, and that the rating task and lower target-absent base rate may have enhanced calibration and reduced underconfidence (cf. Brewer et al., 2002; Brewer & Wells, 2006), Juslin et al.'s (1996) study does not represent an adequate test of the effect of increased retention interval on CA calibration. CA calibration in Juslin et al.'s shorter retention interval condition was already strong. Thus, any over-estimation of the CA relation resulting from Juslin et al.'s methodology would most likely also manifest in the longer retention interval, increasing the likelihood of obtaining similar CA relations across conditions.

CA calibration research in the eyewitness identification area is in its infancy. The paucity of research in this area is understandable given the large number of participants required to generate stable estimates of CA calibration. Indeed, most of what is currently understood in this area relies on laboratory research using a limited range of stimulus materials. Only one study has previously examined CA calibration using a field study methodology (Sauerland & Sporer, 2009). The present research advances understanding of the CA relationship in three main ways. First, we used the CA calibration approach to examine the effect of retention interval on the CA relation, comparing the CA relation for a virtually immediate identification test with that for one conducted between 3 and 7 weeks after the encoding event (and producing lower identification accuracy). Second, we used five different sets of encoding and test stimuli and, third, we tested the robustness of the CA relation in a field setting that provided varied and more realistic encoding conditions (cf. Lindsay et al., 1998).
METHOD

Design

A 2 (retention interval: immediate test versus delayed test) × 2 (target-presence: present versus absent), between-subjects design was used to test the effect of varied retention interval on the confidence–accuracy relationship using multiple target stimuli in a field setting.

Participants

A total of 1,063 (548 female) participants provided data for this research. Participant ages ranged from 15 to 85 (M = 29.21, SD = 14.33). A functional grasp of the English language was the only prerequisite for participation.

Materials

Photographs of the target were cropped to present the individual, from the shoulders up, against a plain white/gray background, and were approximately 55 mm × 55 mm in size. Non-target (i.e., foil) photographs were selected from our laboratory's large database using a match-description strategy, with foil selection requiring agreement between the researchers and the experimenter from each pair that the foils matched the target's description. In sum, five different sets of target-present and target-absent lineups were constructed. For each target, identical foils were used for target-present and -absent lineups. Target-absent lineups were created by replacing the target with another foil photograph. However, as discussed in the Results section, because the designation of individual foils as target-replacements was arbitrary, the target-replacement is not analogous to an innocent suspect.

Procedure

Ten female, third-year honors psychology students collected data as part of a work experience course-component. The 10 students split into pairs with one acting as the researcher and the other as the target. Targets were of either Caucasian or Mediterranean appearance. Data were collected at various locations ranging from on-campus to city streets to parking lot areas.
While the target remained out of sight, the researcher approached members of the public (individually) and asked if they would like to participate in a psychology experiment. If the individual agreed, the researcher signaled to her partner, who moved into the participant's view and remained in view for 10 s. Targets were viewed at a pre-measured distance of 10 m, and participants were instructed to attend to the target for the full 10 s.

After encoding, participants were allocated to either an immediate or delayed testing condition. Data were obtained from 691 participants in the immediate condition and from 372 participants in the delayed condition (i.e., only about 55% of participants approached in the delay condition responded). Participants in the immediate condition were asked to perform an identification task. The researcher read the following instructions to the participant: "I'm now going to ask you to try and pick the person you just saw out of a group of photographs on this sheet." The researcher then presented the participant with a laminated piece of A4 paper displaying eight, clearly numbered, color photographs organized into two rows of four faces. The instructions continued: "The person may or may not be present in the lineup. If you think the person is not present, please say 'not present.' Please indicate the number of the person who is the person you have just viewed." The researcher then recorded the participant's response, asked the participant to indicate their confidence in the accuracy of their response on an 11-point scale (0–100%), and collected some demographic information.

Participants in the delayed condition provided an email address and were contacted approximately 18–21 days after encoding, and provided with a link to an online data collection system. Actual retention intervals ranged from 20 to 50 days (M = 23, SD = 5).
When entered into the system, participant email addresses were matched to the relevant researcher/target pair to ensure that each participant viewed the correct lineup for their target stimulus. Participants accessed the online system and were presented with instructions essentially identical to those reported above. However, rather than indicating responses verbally, participants in the delayed condition made identification decisions by either (a) clicking the photo of the lineup member they believed to be the target, or (b) clicking a button labeled "Not Present" at the bottom of the screen. Similarly, participants entered their confidence estimate by clicking one of eleven on-screen buttons representing the levels of confidence indicated above. Participants in the delayed condition were asked for the same demographic information as those in the immediate condition. Target-presence was counterbalanced in both the immediate and delayed conditions to achieve an equal number of target-present and -absent trials.

RESULTS

Retention Interval and Accuracy

Chi-square analyses performed on response accuracy for the delayed and immediate conditions found predictable effects of retention interval for both choosers, χ²(1, N = 614) = 11.59, p < .001, w = 0.14, and non-choosers, χ²(1, N = 449) = 13.85, p < .001, w = 0.18. In both cases, accuracy was greater in the immediate condition (62 and 82% for choosers and non-choosers, respectively) than in the delayed condition (47 and 66% for choosers and non-choosers, respectively). Thus, the effect of increased retention interval on identification accuracy was consistent with the expected reduction in memory quality. As found by Juslin et al. (1996) and Sauerland and Sporer (2009), accuracy rates for non-choosers were significantly higher than for choosers in both the immediate, χ²(1, N = 691) = 32.24, p < .001, w = 0.22, and delayed conditions, χ²(1, N = 372) = 13.4, p < .001, w = 0.19.
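The effect sizes reported alongside the chi-square tests can be cross-checked with a short calculation. The sketch below assumes that w denotes Cohen's effect size index for chi-square tests, w = sqrt(χ²/N); the article does not state the formula, so this is an interpretation, and the function name `cohens_w` is illustrative. The chi-square values and sample sizes are those reported in the text.

```python
import math


def cohens_w(chi_square: float, n: int) -> float:
    """Cohen's w for a chi-square test: square root of chi-square over N."""
    return math.sqrt(chi_square / n)


# (label, chi-square, N, reported w) as given in the Results text.
reported_tests = [
    ("retention interval, choosers", 11.59, 614, 0.14),
    ("retention interval, non-choosers", 13.85, 449, 0.18),
    ("chooser vs non-chooser, immediate", 32.24, 691, 0.22),
    ("chooser vs non-chooser, delayed", 13.4, 372, 0.19),
]

for label, chi_square, n, reported_w in reported_tests:
    computed = cohens_w(chi_square, n)
    print(f"{label}: computed w = {computed:.2f}, reported w = {reported_w:.2f}")
```

Under this assumption, each recomputed value rounds to the w reported in the text. By Cohen's conventional benchmarks (roughly 0.1 small, 0.3 medium), all four are small effects.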
The present non-chooser accuracy and diagnosticity data (see below) lend support to previous research demonstrating that lineup rejections can inform assessments of the likely guilt of a suspect (e.g., Clark, Howell, & Davey, 2008; Wells & Olson, 2002).

Retention Interval and the CA Relation

To enhance the stability of the plotted CA calibration functions, confidence data were collapsed from the 11 initial confidence categories (i.e., 0–100%) to five (i.e., 0–20%, 30–40%, 50–60%, 70–80%, 90–100%) (see Brewer & Wells, 2006; Juslin et al., 1996). Moreover, because foils are known in advance to be innocent, we excluded target-present, foil identifications from our calibration analyses (see Brewer & Wells, 2006). However, as there was no actual police suspect in the target-absent lineups, all false identifications of foils from target-absent lineups were included in calibration analyses, a practice that necessarily inflates the degree of overconfidence (see Footnote 1). Table 1 presents the distributions of confidence ratings for choosers and non-choosers, in the immediate and delayed conditions, according to identification response.

Given the well-documented differences in the CA relation for choosers and non-choosers, we present CA calibration analyses separately for these two groups (see Tables 1 and 2, and Fig. 1). In both retention interval conditions, meaningful CA relationships for choosers are apparent. Visual inspection of choosers' CA calibration functions (Fig. 1) shows increasing accuracy as confidence increases for both retention intervals. Moreover, in the upper section of the confidence scale, the immediate and delayed condition curves are almost identical. While reliance on visual inspection may appear to lack rigor, the standard error bars for each confidence interval permit an estimation of the stability of the results obtained.
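The collapsing of the 11 confidence categories into five, together with the rule that target-present foil identifications are left out of the chooser calibration analysis, can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the outcome labels ("correct", "foil", "false"), the function names, and the example responses are assumptions.

```python
from collections import defaultdict

# The five collapsed confidence categories (percent), as listed in the text.
CONFIDENCE_BINS = [(0, 20), (30, 40), (50, 60), (70, 80), (90, 100)]


def collapse_confidence(confidence_percent):
    """Map a rating on the 11-point scale (0, 10, ..., 100) to its collapsed bin."""
    for low, high in CONFIDENCE_BINS:
        if low <= confidence_percent <= high:
            return (low, high)
    raise ValueError(f"not a valid 11-point rating: {confidence_percent}")


def chooser_calibration_points(responses):
    """Proportion correct per collapsed bin for positive identifications.

    responses -- iterable of (confidence_percent, outcome) where outcome is
                 "correct" (target identified in a target-present lineup),
                 "foil"    (foil chosen from a target-present lineup; excluded),
                 "false"   (any choice from a target-absent lineup).
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [judgments, correct]
    for confidence_percent, outcome in responses:
        if outcome == "foil":
            continue  # known-innocent foils are excluded from calibration
        bin_key = collapse_confidence(confidence_percent)
        counts[bin_key][0] += 1
        counts[bin_key][1] += int(outcome == "correct")
    return {b: correct / n for b, (n, correct) in sorted(counts.items())}


# Hypothetical responses, for illustration only.
example = [(90, "correct"), (100, "false"), (100, "foil"), (10, "false")]
print(chooser_calibration_points(example))
```

Plotting each bin's proportion correct against the bin's confidence level gives the points of a calibration curve, which can then be compared with the ideal function.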
Overlapping standard error bars (evident for the two highest confidence intervals of the chooser curves) denote non-reliable differences between groups.

Table 1 presents the diagnosticity ratios for each confidence category. Diagnosticity ratios indicate the likely reliability of an identification decision, in this case, according to the level of confidence expressed. Chooser diagnosticity ratios compare the likelihood that a guilty suspect will be identified to the likelihood that an innocent suspect will be identified. The procedure for separating suspect from foil identifications from target-absent lineups is complex. In contrast to the forensic setting, the laboratory setting provides no basis for designating any particular member of a target-absent lineup as the suspect (cf. Brewer & Wells, 2006). Accordingly, we calculated target-absent suspect identification rates by dividing the total number of target-absent false identifications by the number of lineup members (i.e., eight). Non-chooser diagnosticity ratios compare the probability that the witness responds "not present," given the target is not present, to the probability that the witness responds "not present," given the target is present. Both retention interval conditions show increased diagnosticity at each successive confidence interval. Thus, when a suspect is identified, an increase in witness confidence is accompanied by an increase in the probability that the identified suspect is guilty.

Footnote 1: Including only target-replacement identifications as false identifications from target-absent lineups resulted in only 13 and 6% (in the immediate and delayed conditions, respectively) of all target-absent misidentifications being available for calibration analyses. Split over the five confidence intervals, these data are insufficient to provide stable estimates of calibration.

Table 1 Diagnosticity ratios and number of responses (according to response type) for each confidence interval, for choosers and non-choosers in the immediate and delayed testing conditions

                                  Confidence level (%)
Condition & response          0–20   30–40   50–60   70–80   90–100   Overall

Immediate, choosers
  Correct identification         5      12      40     103       90       250
  Foil identification            1       6      18      12        4        41
  False identification           9      13      31      43       18       114
  Overall                       15      31      89     158      112       405
  Diagnosticity ratio         6.68    8.87   11.08   18.74    37.79     17.80
  SE (diagnosticity)          2.71    2.40    1.91    2.56     8.61      1.49

Delayed, choosers
  Correct identification         1       5      16      39       38        99
  Foil identification            5       5       8       9        3        30
  False identification           9      11      25      24       10        79
  Overall                       15      21      49      72       51       208
  Diagnosticity ratio         1.56    4.02    6.28   13.63    20.47     10.12
  SE (diagnosticity)          1.66    1.83    1.52    2.63     6.23      1.11

Immediate, non-choosers
  Correct rejection              6      11      42      91       84       234
  Incorrect rejection            4       2      10      22       14        52
  Overall                       10      13      52     113       98       286
  Diagnosticity ratio         1.01    4.68    3.91    4.23     6.35      4.44
  SE (diagnosticity)          0.57    5.91    1.31    0.89     1.71       0.0

Delayed, non-choosers
  Correct rejection              5      10      25      41       27       108
  Incorrect rejection            2       9      16      14       14        55
  Overall                        7      19      41      55       41       163
  Diagnosticity ratio         1.48    1.01    1.23    2.79     2.87      1.92
  SE (diagnosticity)          1.62    0.35    0.30    0.74     0.76      0.25

There are, however, some differences apparent between the two retention interval conditions for choosers. A modified jackknife procedure (Koriat, Lichtenstein, & Fischhoff, 1980; Mosteller & Tukey, 1968) was performed on the C, O/U, and NRI statistics obtained for each retention interval condition. The jackknife procedure involves repeated calculation of each of the three statistics above, with each calculation omitting data from a different, individual participant. As many calculations are run as there are participants. This permits the calculation of mean and standard error data (Table 2) for the statistics obtained which, in turn, allows an assessment of differences in the relevant measures between groups. While these jackknife mean and standard error data cannot be subjected to inferential testing, they are intended to allow researchers to
While these jackknife mean and standard error data cannot be subjected to inferential testing, they are intended to allow researchers to draw inferences in conditions where data violate assumptions of conventional inferential testing techniques (Sheskin, 2004). Because the jackknife means replicated the original values in every case, only the original values are reported.

Fig. 1 Confidence–accuracy (CA) calibration curves for choosers (upper panel) and non-choosers (lower panel) in the delayed and immediate testing conditions. Error bars represent standard errors. [Figure not reproduced. Horizontal axis: Confidence (0–100); vertical axis: % Correct (0–100); series: immediate and delayed.]

Table 2 Calibration (C), overconfidence (O/U), and Normalized Resolution Index (NRI) statistics, for choosers and non-choosers, in the immediate and delayed testing conditions

                            Choosers              Non-choosers
Measure   Statistic       Immediate   Delay     Immediate   Delay
C         Value                0.01    0.04          0.03    0.04
          Jackknife SE         0.00    0.01          0.01    0.02
O/U       Value                0.09    0.19         -0.08    0.01
          Jackknife SE         0.02    0.03          0.02    0.04
NRI       Value                0.10    0.17          0.02    0.02
          Jackknife SE         0.03    0.05          0.02    0.02

Inspection of the calibration functions, together with the O/U statistics (Table 2), suggests greater overconfidence for the delayed compared to the immediate condition. However, two aspects of the calibration information justify qualification of this general observation. First, for the two highest confidence categories, the standard error bars for the two functions overlap, suggesting no meaningful difference in over/underconfidence. The applied value of this parity at the higher confidence intervals is addressed in the Discussion. Second, the overall difference in overconfidence between conditions is, in fact, exaggerated by underconfidence in the lower half of the calibration curve in the immediate condition. This produces differences between conditions in three measures of the CA relation: the visual appearance of the calibration function, the O/U statistic, and the NRI statistic. First, the calibration curve for the immediate condition flattens out in the lower half of the confidence scale, rather than following the ideal function. Further, the curve for the immediate condition shifts from overconfidence in the higher confidence intervals to underconfidence in the lower confidence intervals, a pattern not observed in the delayed condition. In addition to producing a visual flattening of the curve, this transition from overconfidence to underconfidence has important effects on two of the statistical measures of the CA relationship. It drives the immediate condition O/U statistic toward its mid-point (i.e., 0). Consequently, although the immediate condition curve exhibits noticeable underconfidence and overconfidence at the relevant extremes of the confidence scale, this is not reflected in the O/U statistic for that condition, thereby exaggerating the apparent difference in overconfidence between conditions. Finally, as evidenced by the NRI statistics (Table 2), it reduces the overall level of discrimination provided by confidence in the immediate condition. This discrepancy between conditions at the lower confidence extremes is addressed further in the Discussion.

The second difference between the CA relations for choosers in the delayed and immediate conditions is evident in the diagnosticity ratios reported for each confidence interval (Table 1). Consistent with the reported overall drop in identification accuracy associated with the delayed condition, the degree of diagnosticity at each confidence interval is greater in the immediate than in the delayed condition. Further, although no difference in overconfidence is apparent between conditions in the higher confidence brackets, the difference in diagnosticity persists. Nonetheless, as outlined above, the finding of increased diagnosticity with increased confidence is consistent (for choosers) across conditions.

In sum, the CA relations observed for choosers in the two retention interval conditions differ in terms of general overconfidence and discriminability, due primarily to the trend toward underconfidence at low confidence levels in the immediate condition. However, in the upper half of the confidence scale, the conditions produce highly similar calibration functions.

For non-choosers, both retention interval conditions produced the typically weak CA relations observed in previous CA calibration research. Further, any variations in diagnosticity between confidence levels were small and unsystematic in both conditions. While this absence of resolution might commonly be taken as an indication that a confident rejection should not be given any special status, this needs to be considered in the context of accuracy rates for rejections usually being high. Thus, from an applied perspective, provided the conditions are such that non-chooser accuracy is high (e.g., unbiased lineup instructions, good encoding conditions), it is important to note that a highly confident rejection is as good a guide to (in)accuracy as a confident ID. Importantly also, an unconfident rejection is also likely to be as accurate as a confident ID.

The CA correlation patterns are generally in line with previous research (e.g., Lindsay et al.
, 1998; Sporer et al., 1995). CA correlations of moderate strength were found for choosers in both the immediate (r(405) = 0.32, p < .001) and delayed conditions (r(209) = 0.41, p < .001). While these values lie toward the high end of typically reported CA correlations, the relationships are still only moderate in size. Consistent with previous research, correlations for non-choosers were weak and non-significant in both the immediate (r(286) = .09, ns) and delayed conditions (r(163) = .06, ns).

DISCUSSION

While the dominant perspective in eyewitness identification research has been that the CA relationship is, at best, a weak one, recent research, underpinned by theoretically motivated changes in design and analysis techniques, has demonstrated meaningful CA relationships when certain pre-conditions are met. The present study extends this research, providing an important test of the boundary conditions of the CA relation. Variation in retention interval is (a) theoretically linked to variation in memory quality (and, thus, confidence and accuracy), (b) typical in the forensic setting, and (c) atypical in psychological investigations of the CA relation. Further, the emphasis placed on confidence when assessing the reliability of identification evidence in the forensic setting makes the effect of varied retention interval on the CA relationship an issue of applied and theoretical relevance.

The most striking feature of our examination of the effect of retention interval on the CA relationship is the agreement of the findings across retention interval conditions. Consistent with previous calibration research in the eyewitness and face recognition paradigms (e.g., Brewer & Wells, 2006; Juslin et al.
, 1996; Sauerland & Sporer, 2009; Weber & Brewer, 2003, 2004, 2006), confidence and accuracy were meaningfully related for choosers in both the immediate and delayed conditions, particularly in the upper half of the confidence scale. Further, both conditions show systematic increases in diagnosticity with increased witness confidence. Compared to the immediate condition, the delayed condition demonstrated an increase in general overconfidence and a decrease in the absolute levels of diagnosticity. However, such differences are equally likely to occur when retention interval is held constant but target stimuli or instructional bias are varied (e.g., Brewer & Wells, 2006). Of primary importance is the finding that the fundamental nature of the CA relationship, as evidenced by the shape of the calibration functions and the systematic relationship between confidence and diagnosticity, did not vary meaningfully between conditions. As Bruck and Poole (2002) note, albeit in a different context, when assessing consistency across conditions, patterns of findings are often more informative than individual numbers.

While our conclusions may be similar to those of Juslin et al. (1996) in that CA calibration was still evident when the retention interval was extended, our findings add significantly to our understanding of the effect of retention interval on the CA relation. Whereas there was no evidence that Juslin et al.'s retention interval manipulation affected memory strength, our manipulation clearly affected recognition memory performance and yet evidence of CA calibration persisted. Moreover, CA calibration was evident at the longer retention interval in our study, despite the absence of several methodological features contained in Juslin et al.'s research that may have reinforced the CA calibration found at their longer retention interval. This suggests that these idiosyncrasies were not sufficient to affect the CA association.
Additionally, by providing data from a field setting using multiple sets of encoding and test materials, our study provides an important pointer to the likely generality of the above conclusions.

The improved diagnosticity in both retention interval conditions evident at the upper confidence levels has significant forensic implications. Highly confident identifications, when compared to those made with low confidence, are likely to have a greater impact on police investigations and jury decision making. For example, in the absence of other compelling evidence, police are more likely to proceed with a case given a highly confident identification than given an identification made with low confidence. Further, compared to an identification made with low confidence, an identification made with high confidence is likely to be more weighty in the courtroom, and thus exert a more pronounced effect on juror assessments of likely guilt. Thus, it is reassuring that the identification decisions likely to exert the greatest influence in the criminal justice system are those for which (a) diagnosticity is greatest and (b) there was no significant variation in the CA relationship associated with increased retention interval. We emphasize here, of course, that we are talking only about relationships detected when confidence was measured and recorded immediately after the identification, and not when opportunities for influencing confidence judgments had occurred.

A potentially interesting difference between the CA relations obtained in the two conditions presents in the lower half of the confidence scale for the chooser curves. As previously outlined, while the immediate condition curve exhibited underconfidence in the lower confidence levels, the delayed condition curve maintained its resemblance to the ideal function (i.e., low confidence ratings were accompanied by equivalently poor identification performance).
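The low-confidence pattern described above can be checked roughly against the chooser counts in Table 1. The sketch below is an approximation under stated assumptions, not the published analysis: accuracy in each category is taken as correct identifications divided by all choosers in that category, and the category midpoint stands in for the mean confidence rating (the published curves may use the actual ratings). The helper name `calibration_points` is hypothetical.

```python
# Chooser counts from Table 1, ordered by confidence category.
CATEGORIES = ["0-20", "30-40", "50-60", "70-80", "90-100"]
MIDPOINTS = [10, 35, 55, 75, 95]  # assumption: category midpoints, in percent
CHOOSERS = {
    "immediate": {"correct": [5, 12, 40, 103, 90], "overall": [15, 31, 89, 158, 112]},
    "delayed": {"correct": [1, 5, 16, 39, 38], "overall": [15, 21, 49, 72, 51]},
}


def calibration_points(correct, overall):
    """Return (confidence %, accuracy %, confidence minus accuracy) per category."""
    points = []
    for conf, hits, total in zip(MIDPOINTS, correct, overall):
        accuracy = 100 * hits / total
        points.append((conf, accuracy, conf - accuracy))
    return points


for condition, counts in CHOOSERS.items():
    print(condition)
    for label, (conf, acc, gap) in zip(CATEGORIES, calibration_points(**counts)):
        # A positive gap means confidence exceeds accuracy (overconfidence).
        direction = "over" if gap > 0 else "under"
        print(f"  {label:>6}%: confidence {conf:3d}, accuracy {acc:5.1f}, "
              f"{direction}confident by {abs(gap):4.1f}")
```

On these counts the immediate condition comes out underconfident in the two lowest categories (accuracy of about 33% and 39% against midpoints of 10% and 35%) and overconfident above them, whereas the delayed condition sits close to the ideal in the lowest category (about 7% against 10%) and is overconfident throughout.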
As noted earlier, confidence judgments may be shaped not only by memory strength but also by various non-memorial factors. It may be the case that, because the immediate condition provided virtually no time for the memory trace to degrade, very low confidence estimates in this condition reflected the influence of misleading metacognitive inferences. In contrast, the delayed condition allowed for significantly greater degradation in memory trace and, consequently, a greater drop in identification accuracy than did the immediate condition. In the delayed condition, very low confidence was perhaps more likely to reflect poor memory quality and, consequently, predict very poor performance. Thus, in this condition, confidence and accuracy corresponded more closely at the lower confidence levels, and the overall level of confidence-based discrimination increased (as evidenced by the NRI statistics). The improved resolution associated with the longer retention interval in the present study supports claims made by Lindsay et al. (2000, 1998) that the CA relation (and, in particular, resolution) is likely to be most evident in conditions that produce greatest variability in witnesses' memory strength. However, given the low number of data points for these confidence categories, any conclusions must be tentative. Moreover, from an applied perspective, the data clearly show that low confidence identifications are associated with low accuracy (regardless of the existence of over- or underconfidence).

We should note three features of this study that might possibly have influenced the pattern of results obtained. First, despite email reminders to participants in the delay condition, there was still significant attrition. If it turns out that those careful enough to respond were also more conscientious (and, importantly, effective) when determining confidence judgments, then it is conceivable that the strength of the CA relation is overestimated in our delay condition.
However, we know of no evidence that could sustain an argument either way on this issue. Second, our retention interval manipulation was confounded with method of responding. Participants in the immediate test condition provided their responses during face-to-face interactions with the researcher, while delayed condition participants responded via computer. As previously noted, social influence can undermine the confidence–accuracy relationship. However, given (a) the similarity of CA relationships evident between conditions in this experiment and (b) the similarity in CA relationships between the immediate condition in this experiment and previous work using similar (i.e., relatively short though not immediate) retention intervals and non-face-to-face responding (Brewer & Wells, 2006; Weber & Brewer, 2003, 2004, 2006), there is no reason to believe that method of responding exerted a significant effect on the results obtained. Third, for ethical reasons the encoded event in our field study did not involve a crime. Whether this might influence the CA relationship is also not known, though there is no obvious reason why this variable should interact with retention interval. What we do know, of course, is that the most reliable determinant of variations in the degree of over/underconfidence is task difficulty (see Brewer, 2006; Weber & Brewer, 2004), with our various stimuli providing tasks of sufficient difficulty to produce over- rather than underconfidence and, predictably, greater overconfidence in the delay condition.

In sum, this research asked: Does an increase in retention interval undermine the meaningful CA relationships reported in recent research? These results suggest not, at least not for retention intervals in the range used here. For choosers in both the delayed and immediate conditions, increased confidence was associated with increased probable accuracy. While this finding is encouraging, one important caveat is required.
Although retention interval did not affect the CA relationship observed, many factors capable of distorting the CA relation over time in the forensic setting (e.g., confirming feedback/interaction with co-witnesses, repeated post-event questioning) were not addressed in our approach. It would be premature to suggest that, in the forensic setting, confidence-based discrimination of accuracy will not ever vary with increased retention interval. Simply increasing retention does not, by itself, seem to dampen the CA relation, but increased retention intervals may be associated with increased exposure to other factors likely to affect the relationship between confidence and accuracy. Moreover, it should be noted that retention intervals long enough to reduce identification accuracy to chance levels (i.e., likely much longer than in this study) would constrain variation in accuracy, reducing the extent to which confidence can discriminate accurate from inaccurate identification decisions.

Acknowledgments This research was supported by grant DP0556876 from the Australian Research Council and a Flinders Research Grant. We are grateful to Monica Beshara, Megan Cant, Danielle Chant, Kelly Ferber, Suzana Freegard, Caitlin Hithcock, Michaela O'Keefe, Lucy Pillay, Carla Raphael, Nancy Whitaker, and Anneke Woods for their assistance with data collection.

REFERENCES

Bothwell, R. K., Deffenbacher, K. A., & Brigham, J. C. (1987). Correlations of eyewitness accuracy and confidence: Optimality hypothesis revisited. Journal of Applied Psychology, 72, 691–695.
Bradfield, A. L., & Wells, G. L. (2000). The perceived validity of eyewitness identification testimony: A test of the five Biggers criteria. Law & Human Behavior, 24, 581–594.
Bradfield, A. L., Wells, G. L., & Olson, E. A. (2002). The damaging effect of confirming feedback on the relation between eyewitness certainty and identification accuracy. Journal of Applied Psychology, 87, 112–120.
Brewer, N. (2006). Uses and abuses of eyewitness identification confidence. Legal and Criminological Psychology, 11, 3–23.
Brewer, N., & Burke, A. (2002). Effect of testimonial inconsistencies and eyewitness confidence on mock-juror judgements. Law & Human Behavior, 26, 353–364.
Brewer, N., Keast, A., & Rishworth, A. (2002). The confidence-accuracy relationship in eyewitness identification: The effects of reflection and disconfirmation on correlation and calibration. Journal of Experimental Psychology: Applied, 8, 44–56.
Brewer, N., Weber, N., & Semmler, C. (2007). A role for theory in eyewitness identification research. In R. C. L. Lindsay, D. F. Ross, J. D. Read, & M. P. Toglia (Eds.), The handbook of eyewitness psychology: Volume II. Memory for people (pp. 210–218). Mahwah, NJ: Lawrence Erlbaum Associates.
Brewer, N., & Wells, G. L. (2006). The confidence-accuracy relationship in eyewitness identification: Effects of lineup instructions, functional size and target-absent base rates. Journal of Experimental Psychology: Applied, 12, 11–30.
Bruck, M., & Poole, D. A. (2002). Introduction to the special issue on forensic developmental psychology. Developmental Review, 22, 331–333.
Busey, T. A., Tunnicliff, J., Loftus, G. R., & Loftus, E. F. (2000). Accounts of the confidence-accuracy relation in recognition memory. Psychonomic Bulletin & Review, 7, 26–48.
Clark, S. E., Howell, R., & Davey, S. L. (2008). Regularities in eyewitness identification. Law & Human Behavior, 32, 187–203.
Cutler, B. L., & Penrod, S. D. (1995). Mistaken identification: The eyewitness, psychology, and the law. New York: Cambridge University Press.
Cutler, B. L., Penrod, S. D., & Stuve, T. E. (1988). Jury decision making in eyewitness identification cases. Law & Human Behavior, 12, 41–56.
Deffenbacher, K. A., Bornstein, B. H., McGorty, E. K., & Penrod, S. (2008). Forgetting the once-seen face: Estimating the strength of an eyewitness's memory representation. Journal of Experimental Psychology: Applied, 14, 139–150.
Deffenbacher, K. A., & Loftus, E. F. (1982). Do jurors share a common understanding concerning eyewitness behavior? Law & Human Behavior, 6, 15–30.
Ebbinghaus, H. (1964). Memory: A contribution to experimental psychology. New York: Dover (Original work published 1895).
Egan, J. P. (1958). Recognition memory and the operating characteristic (Tech. Rep. No. AFCRC-TN-5851). Hearing and Communication Laboratory, Indiana University, Bloomington.
Fleet, M. L., Brigham, J. C., & Bothwell, R. K. (1987). The confidence-accuracy relationship: The effects of confidence-accuracy and choosing. Journal of Applied Social Psychology, 17, 171–187.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Innocence Project. (2009). Innocence project. Retrieved March 15, 2009, from http://www.innocenceproject.org/about/index.php.
Juslin, P., Olsson, N., & Winman, A. (1996). Calibration and diagnosticity of confidence in eyewitness identification: Comments on what can be inferred from the low confidence-accuracy correlation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1304–1316.
Kassin, S. M. (1985). Eyewitness identification: Retrospective self-awareness and the accuracy-confidence manipulation. Journal of Personality and Social Psychology, 49, 878–893.
Kassin, S. M., Rigby, S., & Castillo, S. R. (1991). The accuracy-confidence correlation in eyewitness testimony: Limits and extensions of the retrospective self-awareness effect. Journal of Personality and Social Psychology, 61, 698–707.
Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning & Memory, 6, 107–118.
Lindsay, D. S., Nilsen, E., & Read, J. D. (2000). Witnessing-condition heterogeneity and witnesses' versus investigators' confidence in the accuracy of witnesses' identification decisions. Law & Human Behavior, 24, 685–697.
Lindsay, D. S., Read, J. D., & Sharma, K. (1998). Accuracy and confidence in person identification: The relationship is strong when witnessing conditions vary widely. Psychological Science, 9, 215–218.
Lindsay, R. C. L., Wells, G. L., & Rumpel, C. M. (1981). Can people detect eyewitness-identification accuracy within and across situations? Journal of Applied Psychology, 66, 79–89.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. New York: Cambridge University Press.
Mosteller, F., & Tukey, J. W. (1968). Data analysis including statistics. In G. Lindzey & E. Aronsen (Eds.), The handbook of social psychology (Vol. 2, pp. 80–203). Reading, PA: Addison-Wesley.
Neil v. Biggers, 409 U.S. 188 (1972).
Pike, G., Brace, N., & Kynan, S. (2002). The visual identification of suspects: Procedures and practice (Briefing Note 2/02). London: Home Office.
Sauer, J. D., Brewer, N., & Weber, N. (2008). Multiple confidence estimates as indices of eyewitness memory. Journal of Experimental Psychology: General, 137, 528–547.
Sauerland, M., & Sporer, S. (2009). Fast and confident: Postdicting eyewitness identification accuracy in a field study. Journal of Experimental Psychology: Applied, 15, 46–62.
Schacter, D. L. (1999). The seven sins of memory. American Psychologist, 54, 182–203.
Shaw, J. S. (1996). Increases in eyewitness confidence resulting from postevent questioning. Journal of Experimental Psychology: Applied, 2, 126–146.
Shaw, J. S., & McClure, K. A. (1996). Repeated postevent questioning can lead to elevated levels of eyewitness confidence. Law & Human Behavior, 20, 629–653.
Sheskin, D. (2004). Handbook of parametric and non-parametric statistical procedures (3rd ed.). Boca Raton, FL: Chapman & Hall/CRC.
Sporer, S. L., Penrod, S. D., Read, D., & Cutler, B. L. (1995). Choosing, confidence, and accuracy: A meta-analysis of the confidence-accuracy relation in eyewitness identification studies. Psychological Bulletin, 118, 315–327.
Van Zandt, T. (2000). ROC curves and confidence judgments in recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 582–600.
Vickers, D. (1979). Decision processes in visual perception. New York: Academic Press.
Weber, N., & Brewer, N. (2003). The effect of judgment type and confidence scale on confidence-accuracy calibration in face recognition. Journal of Applied Psychology, 88, 490–499.
Weber, N., & Brewer, N. (2004). Confidence-accuracy calibration in absolute and relative face recognition judgements. Journal of Experimental Psychology: Applied, 10, 156–172.
Weber, N., & Brewer, N. (2006). Positive versus negative face recognition decisions: Confidence, accuracy and response latency. Applied Cognitive Psychology, 20, 17–31.
Wells, G. L. (1993). What do we know about eyewitness identification? American Psychologist, 48, 553–571.
Wells, G. L., & Olson, E. A. (2002). Eyewitness identification: Information gain from incriminating and exonerating behaviors. Journal of Experimental Psychology: Applied, 8, 155–167.
Wells, G. L., Small, M., Penrod, S., Malpass, R. S., Fulero, S. M., & Brimacombe, C. A. E. (1998). Eyewitness identification procedures: Recommendations for lineups and photo spreads. Law & Human Behavior, 22, 603–647.