False Positive Responses in Standard Automated Perimetry
ANDERS HEIJL, VINCENT MICHAEL PATELLA, JOHN G. FLANAGAN, AIKO IWASE, CHRISTOPHER K. LEUNG, ANJA TUULONEN, GARY C. LEE, THOMAS CALLAN, AND BOEL BENGTSSON
• PURPOSE: To analyze the relationship between rates of false positive (FP) responses and standard automated perimetry results.
• DESIGN: Prospective multicenter cross-sectional study.
• METHODS: One hundred twenty-six patients with manifest or suspect glaucoma were tested with Swedish Interactive Thresholding Algorithm (SITA) Standard, SITA Fast, and SITA Faster at each of 2 visits. We calculated intervisit differences in mean deviation (MD), visual field index (VFI), and number of statistically significant test points as a function of FP rates and also as a function of general height (GH).
• RESULTS: Increasing FP values were associated with higher MD values for all 3 algorithms, but the effects were small, 0.3 dB to 0.6 dB, for an increase of 10 percentage points of FP rate, and for VFI even smaller (0.6%-1.4%). Only small parts of intervisit differences were explained by FP (r² values 0.00-0.11). The effects of FP were larger in severe glaucoma, with MD increases of 1.1 dB to 2.0 dB per 10 percentage points of FP, and r² values ranging from 0.04 to 0.33. The numbers of significantly depressed total deviation points were affected only slightly, and pattern deviation probability maps were generally unaffected. GH was much more strongly related to perimetric outcomes than FP.
• CONCLUSIONS: Across 3 different standard automated perimetry thresholding algorithms, FP rates showed only weak associations with visual field test results, except in severe glaucoma. Current recommendations regarding acceptable FP ranges may require revision. GH or other analyses may be better suited than FP rates for identifying unreliable results in patients who frequently press the response button without having perceived stimuli. (Am J Ophthalmol 2021;233: 180–188. © 2021 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/))
Supplemental Material available at AJO.com. Accepted for publication June 25, 2021.
From Ophthalmology Research Unit, Department of Clinical Sciences Malmö, Lund University (A.H. and B.B.); Department of Ophthalmology, Skåne University Hospital, Malmö, Sweden (A.H.); Department of Ophthalmology (V.M.P.), University of Iowa, Iowa City, Iowa, USA; School of Optometry and Vision Science Program (J.G.F.), University of California, Berkeley, Berkeley, USA; Tajimi Iwase Eye Clinic (A.I.), Tajimi, Japan; Department of Ophthalmology and Visual Sciences (C.K.L.), Chinese University of Hong Kong, Hong Kong, China; Tays Eye Centre (A.T.), Tampere University Hospital, Tampere, Finland; Carl Zeiss Meditec, Inc. (G.C.L., T.C.), Dublin, California, USA
Inquiries to Anders Heijl, Department of Ophthalmology, Skåne University Hospital, Jan Waldenströms Gata 24, SE-20502, Malmö, Sweden; e-mail: anders.heijl@med.lu.se
With the introduction of computerized perimeters in the 1970s, 3 so-called “reliability parameters” were implemented with the hope of helping users judge whether test results were reliable and useful. These parameters were fixation losses (FLs), false negative (FN) responses, and false positive (FP) responses.1-3 FL responses are obtained using a method described in 1974 in which test stimuli are presented at the expected location of the physiologic blind spot of the tested eye.1 The method was originally designed to give a qualitative idea about fixation in an early computerized perimeter, where the operator could not see the tested eye. The method has been widely used in many or most automated perimeters, but has well-known shortcomings, especially in eyes where the blind spot is not situated in the assumed location. Today, various methods for gaze tracking can be considered superior to the blind spot technique, and at least one new testing algorithm relies by default upon gaze tracking and not FL estimates based on the blind spot method.4
FN responses were intended to be an index of patient vigilance. FN rates usually are measured by displaying stimuli that should be easily visible, based upon threshold sensitivity measurements made at the chosen locations earlier in the test. However, in the 1980s it was reported that the percentage of FN answers depended more on the level of visual field damage than on patient vigilance.5 In Bengtsson and Heijl,6 this shortcoming was clearly demonstrated by testing both eyes of patients having unilateral glaucoma. It is now recognized that test results should not be discarded solely on the basis of elevated FN response rates.
While FL and FN rates have been considered decreasingly important over time, this has not been the case thus far for FP response rates. FP rate estimates are meant to identify “trigger-happy” testing behavior, ie, examinations in which patients too frequently pressed the perimeter’s response button without having perceived a stimulus. Classic “trigger-happy” fields, with very high threshold sensitivity values and white patches in the grayscale maps, often have high percentages of FP answers, but this is not always the case (Figure 1). FP rates were originally estimated using catch trials in which no stimulus was presented, noting if the patient erroneously pressed the response button.7 More recently, Swedish Interactive Thresholding Algorithm (SITA) testing programs have incorporated a different method of estimating FP rates that is based upon detection of patient responses during times when it is impossible or unlikely that a stimulus was seen.8
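As a minimal illustration of the catch-trial approach described above, the FP rate can be expressed as the share of stimulus-free trials that drew a button press. The function name and the input format are hypothetical and are not taken from any perimeter's software.

```python
def catch_trial_fp_rate(pressed_on_catch_trial):
    """Return the FP rate (%) from a list of booleans, one per catch trial.

    True means the patient pressed the button although no stimulus was shown.
    """
    if not pressed_on_catch_trial:
        raise ValueError("at least one catch trial is required")
    false_positives = sum(1 for pressed in pressed_on_catch_trial if pressed)
    return 100.0 * false_positives / len(pressed_on_catch_trial)


# Illustrative data only: 2 presses in 10 catch trials.
example_trials = [False, True, False, False, False,
                  False, True, False, False, False]
example_rate = catch_trial_fp_rate(example_trials)
```

With few catch trials per test, an estimate of this kind moves in coarse steps; one extra press in 10 trials shifts it by 10 percentage points.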
The reason that high FP response rates are of interest is that they are expected to be associated with artifactually elevated threshold sensitivity values, with higher FP rates being associated with higher mean deviation (MD) values, both in perimetry-naïve normal subjects9 and in patients with glaucoma.10 In the first of these studies, the analysis was based on just a single visual field test per normal subject, and in the latter study the results were based on differences between predicted and observed MD values in eyes with suspect or manifest glaucoma. However, FP rates have also been reported to have almost no correlation to measurement variability in a cohort of patients with suspect or manifest glaucoma who underwent threshold visual field testing twice within approximately 1 week.11
Recommended limits for clinically “acceptable” FP rates have evolved over time. In the 1980s, we used an arbitrary limit of 33%, which simply was the limit we had chosen as an exclusion criterion for the visual field tests used to define normative significance limits for the first Humphrey Statpac interpretation package.12 Later, we suggested that FP rates >15% might indicate unreliable test results, a recommendation that was based upon the distribution of FP levels seen in a sample of field test results. Thus, FP rates >15% were flagged because they were uncommon, not because tests with higher FP rates were unreliable.13
FIGURE 1. False positive rates in “trigger-happy fields”: these 2 fields both show typical features of trigger-happy fields, including abnormally high threshold sensitivity values, “white scotomas,” “reversed cataract pattern,” many more significant test point locations in pattern deviation probability maps than in total deviation maps, and GHT classifications of “abnormally high sensitivity.” One field (A) shows a high rate of false positive responses (35%) while the other (B) has a false positive rate of 0%.
While in perimetry FP responses have traditionally been regarded as errors, signal detection theory provides a different perspective.14 In signal detection theory, FP responses are merely a reflection of the subject’s response criterion.
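The link between response criterion and FP rate can be sketched with the textbook equal-variance Gaussian model of signal detection theory. This is an illustration of the general theory, not a model fitted in this study; the function names and the example criterion values are assumptions.

```python
from statistics import NormalDist


def false_alarm_rate(criterion):
    """P(respond 'seen' | no stimulus) when internal noise is N(0, 1).

    The observer responds whenever the internal signal exceeds `criterion`.
    """
    return 1.0 - NormalDist(mu=0.0, sigma=1.0).cdf(criterion)


def hit_rate(criterion, d_prime):
    """P(respond 'seen' | stimulus) when the stimulus shifts the mean by d_prime."""
    return 1.0 - NormalDist(mu=d_prime, sigma=1.0).cdf(criterion)


# Illustrative criteria only: a strict and a more lenient observer.
strict = false_alarm_rate(2.0)
lenient = false_alarm_rate(1.0)
```

In this model, lowering the criterion raises the false alarm rate and the hit rate together, which is why FP responses reflect where the criterion sits rather than an error in the usual sense.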
Recently, while developing the SITA Faster (SFR) test strategy, we noticed that the percentage of FP answers was higher with the new program than with SITA Fast (SF),4 and this has been subsequently reported by other investigators.15 It has been known for >20 years that FP rate estimates are typically slightly higher with SF than with SITA Standard (SS). Despite the higher FP rates with SFR, the results of a multicenter clinical trial showed almost identical SFR and SF threshold test results.4 We realized that further analysis of our multicenter SFR study data might provide an opportunity to study the relationship between FP answers and perimetric test results in greater detail. The distinctive advantage of using this recent study material was that all patients had been tested twice within such a short period of time, <2 weeks, that it was reasonable to postulate that no significant visual field progression would have occurred between the 2 tests. A second advantage was that we could simultaneously evaluate FP effects in all 3 SITA testing algorithms. Therefore, we hoped to determine the extent to which differences in FP measurements between the first and second tests were associated with observed differences in measured threshold sensitivity and associated metrics.
The aim of the current investigation was to analyze our recent multicenter dataset, focusing on the relationship between FP rates and perimetric test results in each of 3 different testing strategies.
METHODS
• SETTING: This prospective multicenter study was conducted at 5 centers located in 5 different countries in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the Gifu Prefecture Medical Association, the Ethics Committee of Tampere University Hospital, the Committee for Protection of Human Subjects of the University of California Berkeley, and the Hong Kong Hospital Authority Kowloon Central Research Ethics Committee. The study was also submitted to the Regional Ethics Review Board in Lund, Sweden. The Lund Board concluded that the study did not need their approval but that they saw no ethical issues.
• STUDY POPULATION: The acquisition of study data has been previously described.4 The study included 126 patients with manifest or suspect glaucoma. No stages of glaucomatous visual field loss were excluded.
• OBSERVATION PROCEDURE: All participants underwent Humphrey 24-2 visual field testing in a single study eye using 3 different threshold testing strategies (SFR, SF, and SS) in randomized order. All perimetric testing was repeated at a second visit, between 1 day and 2 weeks later, with testing order reversed. At each study site, participants underwent all testing on the same Humphrey 860 perimeter (Carl Zeiss Meditec, Dublin, California, USA).
If, during testing, the perimetrist observed patient gaze instability or results consistent with false responses, patient misunderstanding, or inattentiveness, the perimetrist was allowed to stop the test, reinstruct the patient, and restart the test from the beginning, thus discarding the interrupted test. However, once a test had been completed, it could not be deleted, and it was included in all statistical analyses.
• MAIN OUTCOME MEASURES: From each visual field test we tabulated the percentage of FP responses, visual field index (VFI), and MD values, and the number of significantly depressed test points at the 1% and 0.5% significance levels in the total deviation (TD) and pattern deviation (PD) probability maps.
• STATISTICAL ANALYSIS: First, we registered FP rates and MD and VFI values for the 3 algorithms. For each tested eye and each test strategy, we then calculated differences between visit 1 and visit 2 FP rates, as well as intertest differences in VFI, MD, and the number of significantly depressed test points. We then performed linear regression analyses with intrasubject FP differences as the explanatory variable and intrasubject differences in VFI, MD, and number of significantly depressed test points as the dependent variables.
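The difference-on-difference regression described above can be sketched as follows. This is a plain ordinary least squares fit written for illustration, not the statistical software used in the study; the function names are hypothetical and the numbers are made up.

```python
def intervisit_differences(visit1, visit2):
    """Per-eye differences, visit 1 minus visit 2."""
    return [a - b for a, b in zip(visit1, visit2)]


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variability")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination for a one-predictor model.
    r2 = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Made-up values for 5 eyes tested with one strategy at 2 visits.
fp_visit1 = [2.0, 10.0, 0.0, 25.0, 5.0]      # FP rate, %
fp_visit2 = [4.0, 0.0, 3.0, 5.0, 5.0]
md_visit1 = [-3.1, -7.9, -1.0, -11.2, -5.0]  # MD, dB
md_visit2 = [-3.0, -8.6, -0.7, -12.5, -5.2]

diff_fp = intervisit_differences(fp_visit1, fp_visit2)
diff_md = intervisit_differences(md_visit1, md_visit2)
slope, intercept, r2 = simple_linear_regression(diff_fp, diff_md)
```

Here `slope` is the change in MD difference (dB) per percentage point of FP difference, and `r2` is the share of the intervisit MD variability that the FP differences account for.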
We also calculated intertest differences in general height (GH).16,17 GH is the difference between the numerical TD values and the PD values in the Statpac program of the Humphrey perimeter. We then performed the same regression analyses with GH differences, instead of FP differences, as the explanatory variable.
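Following the definition just given, GH can be recovered from a printout as the offset between TD and PD values. The sketch below assumes the offset is the same at every test point apart from rounding; the function name and the 1-dB tolerance are hypothetical choices, not part of Statpac.

```python
def general_height(td_values, pd_values, tolerance_db=1.0):
    """Estimate GH (dB) as the mean of TD minus PD over paired test points.

    Raises ValueError if the offset varies by more than `tolerance_db`,
    which would suggest mismatched inputs (the tolerance is an assumption
    meant to absorb rounding of printed values).
    """
    if not td_values or len(td_values) != len(pd_values):
        raise ValueError("TD and PD lists must be non-empty and equal in length")
    offsets = [td_value - pd_value
               for td_value, pd_value in zip(td_values, pd_values)]
    if max(offsets) - min(offsets) > tolerance_db:
        raise ValueError("TD minus PD is not constant across test points")
    return sum(offsets) / len(offsets)


# Made-up example: every TD value is 2 dB above its PD value.
example_gh = general_height([-1, -4, -12, 0], [-3, -6, -14, -2])
```

The intervisit GH difference used as the explanatory variable is then the visit 1 value minus the visit 2 value for the same eye and strategy.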
We also performed regression analyses with FP differences as the explanatory variable and differences in MD, VFI, and numbers of significantly depressed points as the dependent variables, with study eyes divided into 3 groups with early, moderate, or severe visual field loss using the MD values of the staging systems of Hodapp and associates18 and Mills and associates.19 The MD stage for each eye was defined as the average of the visit 1 and visit 2 MD values, for each test algorithm.
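The staging step can be sketched as below. The article does not list the cutoffs it applied; the values of −6 dB and −12 dB are the commonly cited MD limits of the Hodapp classification and are used here as an assumption, as is the handling of values that fall exactly on a cutoff.

```python
def md_stage(md_visit1, md_visit2, early_cutoff=-6.0, severe_cutoff=-12.0):
    """Classify an eye by the mean of its two MD values (dB).

    Cutoffs are assumed, not taken from the article:
    mean MD >= -6 dB is early, below -12 dB is severe, otherwise moderate.
    """
    mean_md = (md_visit1 + md_visit2) / 2.0
    if mean_md >= early_cutoff:
        return "early"
    if mean_md >= severe_cutoff:
        return "moderate"
    return "severe"


# Made-up MD pairs (visit 1, visit 2) for three eyes.
example_stages = [md_stage(-2.0, -3.0), md_stage(-8.0, -9.0), md_stage(-20.0, -22.0)]
```

Because staging is done separately for each test algorithm, the same eye can fall in different groups for SS, SF, and SFR, which is consistent with the differing group sizes across strategies in the stage-specific tables.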
Assumptions for linear regression were tested by residual analysis between differences in FP vs differences in MD and VFI. Histograms of residuals were produced, as were scatterplots of standardized residuals over standardized predicted values.
RESULTS
We analyzed test results from 125 patients, including 64 women (51%) and 61 men (49%). The mean age was 67 years (range 26-82 years). Results from 1 subject were excluded because testing of this patient had been interrupted because of the observation of large eye movements. The patient was reinstructed and a new test was started, but fixation stability was still considered unacceptable.
The 3 test strategies showed significantly different FP rates, while MD and VFI values were very similar4 (Table 1). Intervisit differences shown in Table 1 were all distributed normally.
TABLE 1. Descriptive Analysis of Mean, Median, Minimum, and Maximum Values for Parameters With Skewed Distributions, and Means and Standard Deviations for Variables That Were Normally Distributed
Parameter | Strategy | Visit 1 Mean; Median (Minimum, Maximum), Skewed Distribution | Visit 2 Mean; Median (Minimum, Maximum), Skewed Distribution | Intrapatient Difference (Visit 1 – Visit 2), Mean (SD), All Gaussian
FP (%) | SS | 2.8; 2 (0, 28) | 2.8; 2 (0, 13) | 0.0 (4.1)
FP (%) | SF | 3.3; 2 (0, 41) | 3.65; 2 (0, 32) | −0.4 (5.5)
FP (%) | SFR | 4.9; 0 (0, 39) | 5.0; 3 (0, 43) | −0.1 (9.5)
MD (dB) | SS | −8.5; −6.0 (−28.3, 0.56) | −8.5; −6.4 (−28.7, 0.58) | −0.1 (1.3)
MD (dB) | SF | −8.6; −6.2 (−28.7, 1.33) | −8.4; −6.1 (−28.9, 0.8) | −0.2 (1.6)
MD (dB) | SFR | −8.4; −5.8 (−28.5, 1.9) | −8.5; −6.4 (−28.2, 2.9) | 0.1 (1.5)
VFI (%) | SS | 75.9; 83 (8, 100) | 75.9; 83 (6, 100) | −0.0 (3.7)
VFI (%) | SF | 76.6; 82 (9, 100) | 77.1; 84 (11, 100) | −0.5 (4.6)
VFI (%) | SFR | 77.6; 85 (11, 100) | 77.1; 85 (11, 100) | 0.4 (4.6)
MD = mean deviation; SD = standard deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; VFI = visual field index.
TABLE 2. Relationships Between Differences in False Positive Response Rate Percentages and Mean Deviation and Visual Field Index Values, and Number of Significant Test Points in Total and Pattern Deviation Probability Maps: Differences Between Visits 1 and 2
Relationship | Strategy | r² | Slope (Change per Percentage Point Increase in FP Rate) and 95% CI | Effect per 10 Percentage Point–Increase in FP
Diff MD/diff FP | SS | 0.01 | 0.04 (−0.02 to 0.08) | 0.36 dB
Diff MD/diff FP | SF | 0.04 | 0.06 (0.01-0.11)a | 0.60 dB
Diff MD/diff FP | SFR | 0.11 | 0.05 (0.03-0.08)a | 0.51 dB
Diff VFI/diff FP | SS | 0.00 | 0.07 (−0.10 to 0.22) | 0.56%
Diff VFI/diff FP | SF | 0.03 | 0.14 (−0.01 to 0.28) | 1.37%
Diff VFI/diff FP | SFR | 0.04 | 0.10 (0.01-0.18)a | 0.95%
Diff TD 1%/diff FP | SS | 0.00 | −0.04 (−0.25 to 0.10) | −0.4 points
Diff TD 1%/diff FP | SF | 0.03 | −0.15 (−0.30 to 0.01) | −1.5 points
Diff TD 1%/diff FP | SFR | 0.04 | −0.09 (−0.17 to −0.01)a | −0.9 points
Diff PD 1%/diff FP | SS | 0.00 | −0.02 (−0.17 to 0.13) | −0.2 points
Diff PD 1%/diff FP | SF | 0.00 | 0.00 (−0.11 to 0.10) | 0.0 points
Diff PD 1%/diff FP | SFR | 0.00 | −0.02 (−0.09 to 0.05) | −0.2 points
CI = confidence interval; diff = difference; FP = false positive; MD = mean deviation; PD = pattern deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; TD = total deviation; VFI = visual field index.
a Statistically significant slope.
For each of the 3 strategies, intervisit differences in FP explained only a small part of the intervisit differences in MD and VFI, despite reaching statistical significance in half of the analyses (Table 2). Statistical significance may have been reached simply because of the relatively large number of observations. The coefficients of determination (r², the proportion of variability in the dependent variable that is explained by the explanatory variable) were small for all strategies for FP vs MD and even smaller for FP vs VFI. Increases in FP rates were associated with increases in MD values, as expected, but the effects were small (0.4-0.6 dB, depending upon testing strategy) for an increase of 10 percentage points in FP rates (for example, an increase in FP rate from 5% to 15%). Effects for VFI were even smaller, 0.6%-1.4% (approximately corresponding to 0.2-0.4 dB), for an increase of 10 percentage points in FP rates. Similarly, the associations between FP intervisit differences and differences in numbers of significantly depressed test points were weak for all 3 test strategies, with many r² values close to 0. Most of those relationships were not statistically significant.
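The per-10-percentage-point effects quoted above are simply the regression slopes scaled by 10. The short sketch below applies that scaling to the rounded MD slopes printed in Table 2; the function name is hypothetical. Table 2 reports 0.36, 0.60, and 0.51 dB, presumably because the published effects were computed from unrounded slopes.

```python
def effect_per_10_points(slope_per_point):
    """Scale a slope (change per 1 percentage point of FP) to 10 points."""
    return 10.0 * slope_per_point


# Rounded MD slopes (dB per percentage point of FP) as printed in Table 2.
rounded_md_slopes = {"SS": 0.04, "SF": 0.06, "SFR": 0.05}
md_effects = {strategy: effect_per_10_points(slope)
              for strategy, slope in rounded_md_slopes.items()}
```

For example, with the SF slope of 0.06 dB per percentage point, a rise in FP rate from 5% to 15% corresponds to an expected MD increase of about 0.6 dB.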
The relationships between intervisit differences in GH and MD were markedly stronger, with r² values ranging from 0.22 to 0.46 for the 3 strategies. The relationships between GH and VFI intervisit changes were weak but still much stronger than for FP vs VFI (Tables 2 and 3). The relationships between GH and number of significant TD test points were fairly strong across testing strategies but were much weaker for points in PD maps. This latter observation is not surprising, because GH was designed to correct for generalized changes in visual field sensitivity, such as those associated with cataract development.
Analysis of linear regression residual values revealed that assumptions implicit in linear regression were supported. Histograms of standardized residuals were normally distributed around zero. Most residual points in the scatterplots were within the ±2 intervals of standardized residuals on the y axes and randomly dispersed around standardized predicted MD and VFI values on the horizontal axes (Supplemental Figure 1).
TABLE 3. Relationships Between Differences in General Height and Mean Deviation and Visual Field Index Values, and Number of Significant Test Points in Total and Pattern Deviation Probability Maps: Differences Between Visits 1 and 2
Relationship | Strategy | r² | Slope (Change per dB of GH) and 95% CI | Effect per 10-dB Change in GH
Diff MD/diff GH | SS | 0.35 | 0.55 (−0.42 to 0.69) | 5.53 dB
Diff MD/diff GH | SF | 0.46 | 0.93 (0.75-1.1)a | 9.25 dB
Diff MD/diff GH | SFR | 0.22 | 0.56 (0.37-0.74)a | 5.55 dB
Diff VFI/diff GH | SS | 0.10 | 0.84 (0.39-1.28)a | 8.35%
Diff VFI/diff GH | SF | 0.22 | 1.83 (1.22-2.43)a | 18.25%
Diff VFI/diff GH | SFR | 0.03 | 0.64 (0.00-1.28)a | 6.38%
Diff TD 1%/diff GH | SS | 0.35 | −1.95 (−2.42 to −1.47)a | −19.5 points
Diff TD 1%/diff GH | SF | 0.39 | −2.56 (−3.17 to −2.00)a | −25.6 points
Diff TD 1%/diff GH | SFR | 0.20 | −1.51 (−2.05 to −0.90)a | −15.1 points
Diff PD 1%/diff GH | SS | 0.00 | 0.06 (−0.37 to 0.50) | 0.6 points
Diff PD 1%/diff GH | SF | 0.00 | −0.18 (−0.68 to 0.32) | −1.8 points
Diff PD 1%/diff GH | SFR | 0.11 | 0.92 (0.44-1.40)a | 9.2 points
CI = confidence interval; diff = difference; GH = general height; MD = mean deviation; PD = pattern deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; TD = total deviation; VFI = visual field index.
a Statistically significant slope.
TABLE 4. Relationships Between False Positive Response Rate Percentages and Mean Deviation Values at Different Stages of Glaucoma
Eyes (n) | Strategy | Stage | r² | Slope (Decibel Change per Percentage Point Change in FP Rate) | 95% CI | Effect per 10 Percentage Point–Increase in FP Rate (dB)
61 | SSc | Early | 0.01 | 0.03 | −0.04 to 0.09 | 0.3a
25 | SS | Moderate | 0.00 | 0.02 | −0.14 to 0.18 | 0.2
39 | SS | Severe | 0.05 | 0.11 | −0.05 to 0.27 | 1.1
58 | SFd | Early | 0.06 | 0.04 | −0.004 to 0.09 | 0.4
30 | SF | Moderate | 0.01 | 0.04 | −0.13 to 0.22 | 0.4
37 | SF | Severe | 0.09 | 0.14 | −0.01 to 0.28 | 1.4
63 | SFRe | Early | 0.06 | 0.03 | 0.001-0.05b | 0.3
23 | SFR | Moderate | 0.14 | 0.05 | −0.01 to 0.11 | 0.5
39 | SFR | Severe | 0.33 | 0.20 | 0.11-0.30b | 2.0
37a | SFR | Severe | 0.12 | 0.14 | 0.01-0.26b | 1.4
CI = confidence interval; FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.
a Two outliers excluded.
b Statistically significant slope.
The influence of FP rates on MD and VFI when dividing the field tests into severity stages is presented in Tables 4 and 5. For MD, the relationships did not reach statistical significance at any disease stage with SS and SF. In severe glaucoma, they were statistically significant with SFR and borderline significant with SS and SF. Influences on VFI were statistically significant only for SF and SFR in severe glaucoma. Eliminating 2 outliers among the SFR results in the group of eyes with severe glaucoma (Supplemental Figure 2) reduced the slopes considerably. Corresponding results at different stages of glaucoma on numbers of significantly depressed TD and PD test points are shown in Tables 6 and 7. None of the relationships were statistically significant.
TABLE 5. Relationships Between False Positive Response Rate Percentages and Visual Field Index Values at Different Stages of Glaucoma
Eyes (n) | Strategy | Stage | r² | Slope (VFI Percentage Change per Percentage Change in FP) | 95% CI | Effect per 10 Percentage Point–Increase in FP (%)
61 | SS | Early | 0.00 | 0.01 | −0.17 to 0.18 | 0.08
25 | SS | Moderate | 0.01 | 0.10 | −0.36 to 0.55 | 0.95
39 | SS | Severe | 0.04 | 0.28 | −0.20 to 0.76 | 2.81
58 | SF | Early | 0.01 | 0.05 | −0.07 to 0.16 | 0.47
30 | SF | Moderate | 0.01 | 0.12 | −0.38 to 0.61 | 1.17
37 | SF | Severe | 0.11 | 0.44 | 0.01-0.88a | 4.43
63 | SFR | Early | 0.01 | 0.02 | −0.04 to 0.08 | 0.23
23 | SFR | Moderate | 0.04 | 0.08 | −0.11 to 0.26 | 0.76
39 | SFR | Severe | 0.25 | 0.62 | 0.26-0.97a | 6.15
37b | SFR | Severe | 0.09 | 0.43 | −0.04 to 0.90 | 4.29
CI = confidence interval; FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.
a Statistically significant slope.
b Two outliers excluded.
TABLE 6. Relationships Between False Positive Response Rate Percentages and Numbers of Significant Points at the P < .01 Level in Total Deviation Probability Maps
Eyes (n) | Strategy | Stage | r² | Slope (Change in Number of 1% Points per Percentage Point Change in FP Rate) | P Value | Effect per 10 Percentage Point–Increase in FP
61 | SS | Early | 0.01 | −0.05 | .68 | −0.5 points
25 | SS | Moderate | 0.00 | −0.008 | .98 | −0.1 points
39 | SS | Severe | 0.00 | −0.002 | 1.00 | 0.0 points
58 | SF | Early | 0.02 | −0.08 | .36 | −0.8 points
30 | SF | Moderate | 0.02 | −0.16 | .52 | −1.6 points
37 | SF | Severe | 0.07 | −0.29 | .11 | −2.9 points
63 | SFR | Early | 0.01 | −0.04 | .42 | −0.4 points
23 | SFR | Moderate | 0.07 | −0.12 | .21 | −1.2 points
39 | SFR | Severe | 0.09 | −0.25 | .06 | −2.5 points
FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.
TABLE 7. Relationships Between False Positive Response Rate Percentages and Numbers of Significant Points at the P < .01 Level in Pattern Deviation Probability Maps
Eyes (n) | Strategy | Stage | r² | Slope (Change in Number of 1% Points per Percentage Point Change in FP Rate) | P Value | Effect per 10 Percentage Point–Increase in FP Rate
61 | SS | Early | 0.00 | 0.01 | .89 | 0.1 points
25 | SS | Moderate | 0.01 | −0.11 | .57 | −1.1 points
39 | SS | Severe | 0.00 | −0.05 | .79 | −0.5 points
58 | SF | Early | 0.01 | 0.03 | .56 | 0.3 points
30 | SF | Moderate | 0.02 | 0.13 | .52 | 1.3 points
37 | SF | Severe | 0.08 | −0.21 | .08 | −2.1 points
63 | SFR | Early | 0.00 | −0.02 | .69 | −0.2 points
23 | SFR | Moderate | 0.00 | 0.15 | .82 | 1.5 points
39 | SFR | Severe | 0.06 | −0.16 | .14 | −1.6 points
FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.
DISCUSSION
Our results indicate that across 3 different perimetric thresholding strategies, FP rate measurements generally showed only weak associations with visual field threshold sensitivity and associated analysis metrics. This finding is somewhat unexpected, but we see a parallel in the evolution of our thinking regarding the role of FN response rates >20 years ago.
The traditional definition of reliability in research is reproducibility. If that is what we want FP metrics to assess, then our findings suggest that the current FP index may be of little use. This is not at all a new finding, however, as similar results were reported 20 years ago; the results of reliability testing had almost negligible correlation with test reliability expressed as threshold reproducibility.11 If we instead define reliability as indicating the “usefulness” of test results, our findings show that FP measurement changes were associated with changes in test results in the same direction as in other published studies.9,10,20 Thus, in the current study, increasing rates of FP responses were associated with increases in MD values, but the effects on MD were small, except in severe disease, and even smaller for VFI. PD probability maps were not influenced at all, which is interesting because a higher number of significantly depressed PD test points than TD points is one of the classical hallmarks of a trigger-happy field. That observation alone illustrates the results reported herein: the relationship of higher FP rates to signs of trigger-happy fields is weak to poor.
The effects of FP rates on MD were larger with SFR than with SS and SF, but in early glaucoma the slopes were small with all 3 algorithms, while in severe disease the slope with SFR was considerably larger than those of SS and SF. The SFR results in severe glaucoma were partially explained by 2 outliers (Supplemental Figure 2). In line with earlier reports, we thus found that FP rates seemed to be more important in eyes with severe field loss. PD probability maps were not influenced.
Tan and associates9 and Yohannan and associates10 both reported that FP influenced MD to a greater extent with higher frequencies of FP. We applied the analyses of Tan and associates9 on our own data but could not confirm their findings. In our material, the effect of FP on MD did not differ between eyes with high vs low FP values. Each percentage point of higher FP rate was associated with an increase in MD of 0.06 dB in eyes with FP ≤15% and 0.04 dB in eyes with FP >15%.
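A comparison of this kind can be sketched by fitting separate slopes below and above an FP cutoff. The article does not spell out how eyes were assigned to the two groups or which values entered the fit, so the grouping rule, the function names, and the data below are all assumptions made for illustration.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least 2 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variability")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_by_fp_level(fp_rates, md_values, cutoff=15.0):
    """Return (slope for FP <= cutoff, slope for FP > cutoff)."""
    low = [(fp, md) for fp, md in zip(fp_rates, md_values) if fp <= cutoff]
    high = [(fp, md) for fp, md in zip(fp_rates, md_values) if fp > cutoff]
    low_slope = ols_slope([fp for fp, _ in low], [md for _, md in low])
    high_slope = ols_slope([fp for fp, _ in high], [md for _, md in high])
    return low_slope, high_slope


# Made-up data constructed so the slopes are 0.06 (low FP) and 0.04 (high FP).
example_fp = [0.0, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0]
example_md = [0.06 * fp - 5.0 if fp <= 15.0 else 0.04 * fp - 4.0
              for fp in example_fp]
low_slope, high_slope = slopes_by_fp_level(example_fp, example_md)
```

If FP influenced MD more strongly at high FP rates, the second slope would be clearly larger than the first; in the study data it was not.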
The main strength of the current study was that 2 visual fields were obtained with each of 3 perimetric thresholding algorithms within a very short time interval, eliminating the need to compare single test results to a model of an expected field, as has been the case in earlier studies.10 Other strengths include our study’s multicenter design and the fact that we could assess FP performance in subjects who were tested using 3 different threshold testing algorithms, making it possible to determine if observed trends were consistent across testing strategies, and the fact that we also studied the effects on the results expressed in probability maps.
A weakness of this study is the somewhat limited number of enrolled subjects. The material consisted mostly of eyes with manifest glaucoma but also contained glaucoma suspects. It would have been interesting to have had an equally large age-matched cohort of entirely normal subjects, each tested twice with all 3 strategies.
This study is not the first attempt to address the relationship between FP response rate measurements and MD values. Our results referring to MD values go in the same direction but are of a smaller magnitude than those reported in a large similar population of patients with suspect and manifest glaucoma10 and in another large study of normal subjects.9 These earlier studies have not reported the relationship of FP rates to the VFI, where we found even smaller effects, and, therefore, we cannot make any comparisons.
We have found no previous publications reporting the relationship of FP rates to the number of significantly depressed TD and PD points, which are central to the clinical interpretation of perimetric results. We found no significant influence of FP rates on PD probability maps.
One may speculate as to why SFR tests generate a larger number of FP responses than SF, and why SF generates more such responses than SS. According to signal detection theory, a more lenient response criterion leads to a higher rate of FP responses.14 This has also been shown to happen in computerized perimetric testing, where instructions encouraging test subjects to use more lenient response criteria resulted in higher FP rates.21 In the beginning of a visual field test, patients must set their own subjective response criteria.
In SS and SF, the test starts with stimuli that are quite a bit more intense than normal threshold sensitivity, which usually are easily perceived. It seems likely that patients taking a SF test may then require stronger stimuli before responding than in SFR tests, which start out at the normal age-corrected threshold. The lack of clearly visible (supraliminal) stimuli in SFR tests makes it reasonable to assume that patients might then tend to adopt a more lenient response criterion and respond more often when not being sure of having seen a stimulus. This might explain the higher FP rates with SFR. During most of the test, SFR presents stimuli at the patient’s predicted 50% threshold level, while SF presents stimuli that are approximately 1 dB brighter and SS 3 dB brighter, possibly explaining the smaller FP difference between SF and SS. The timing algorithms in SS and SF are identical, and the one used in SFR differs little from that of SS and SF. We do not believe that the differences of FP rates among the 3 algorithms are explained by the method used in SITA to assess FP rates, ie, to register as FP responses any button presses that occur during the first 180 milliseconds after stimulus initiation or during a period from the end of a response window until the onset of the next stimulus exposure.8
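The classification rule described above can be sketched as a timing check on each button press. Only the 180-millisecond limit comes from the text; the function name, the parameter names, the boundary handling, and the example timings are assumptions, and the sketch does not reproduce how SITA converts such presses into a percentage.

```python
def is_false_positive_press(press_ms, stimulus_onset_ms, window_end_ms,
                            next_onset_ms, min_reaction_ms=180.0):
    """Return True if a press falls in a period where seeing is implausible.

    All times are in milliseconds on a common clock. `window_end_ms` is the
    hypothetical end of the response window for the current stimulus.
    """
    since_onset = press_ms - stimulus_onset_ms
    # Too early: pressed within the first 180 ms after stimulus initiation.
    if 0.0 <= since_onset < min_reaction_ms:
        return True
    # Too late: pressed after the response window closed and before the
    # next stimulus was shown.
    if window_end_ms <= press_ms < next_onset_ms:
        return True
    return False


# Made-up timings: stimulus at 0 ms, window closes at 1000 ms,
# next stimulus at 1500 ms.
example_presses = [100.0, 400.0, 1200.0]
example_flags = [is_false_positive_press(t, 0.0, 1000.0, 1500.0)
                 for t in example_presses]
```

A press at 400 ms falls inside the plausible response window and is therefore not counted, while the presses at 100 ms and 1200 ms are.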
The intended aim of the FP index was to flag perimetry test results from “trigger-happy” patients where those results cannot be trusted, and on that basis current methods of estimating FP rates are far from optimal. Therefore, it seems likely that test results should never be discarded solely based on FP response rates. It is encouraging to note that the correlation of GH with other test metrics was much greater than that of FP rates and was usually highly significant. It might perhaps be possible to construct a metric that is based in part on GH to better identify trigger-happy fields where test results should not be trusted.
ALL AUTHORS HAVE COMPLETED AND SUBMITTED THE ICMJE FORM FOR DISCLOSURE OF POTENTIAL CONFLICTS OF INTEREST and none were reported.
Funding/Support: The clinical study was supported by the Herman Järnhardt Foundation, the Foundation for Visually Impaired in Former Malmöhus County, Sweden. These funding organizations had no role in the design or conduct of this research. Carl Zeiss Meditec Inc., Dublin, California, USA was directly involved in the development of SITA Faster and loaned perimeters to 4 of the participating clinical sites. Carl Zeiss Meditec Inc. provided research funding for this study to J.G.F. at the University of California, Berkeley. Financial Disclosures: A.H. and B.B. are consultants of and are entitled to royalties from Carl Zeiss Meditec. A.H. is a consultant for Allergan plc and has received speaker honoraria from Allergan and Zeiss. V.M.P. is a consultant for Carl Zeiss Meditec and was a Carl Zeiss Meditec employee during the development and evaluation of SITA Faster. J.G.F. is a consultant for and received research support from Carl Zeiss Meditec, Inc. A.I. is a consultant for Santen and has received speaker honoraria from Pfizer, Santen, Kowa, Alcon, Heidelberg Engineering through Japanfocus Company, and Carl Zeiss Meditec, Tokyo. A.I. also holds a patent licensed to Topcon without any royalties. C.K.L. has received speaker honoraria from Carl Zeiss Meditec, Topcon, Tomey, Allergan, Novartis, Santen, Glaukos, and Global Vision; research support in the form of instruments from Carl Zeiss Meditec, Heidelberg Engineering, Topcon, Tomey, and Optovue; research grants from Carl Zeiss Meditec, Topcon, Novartis, Glaukos, Alcon, and Optovue; consultant fees from Allergan and Novartis; and has patents with Carl Zeiss Meditec. A.T. has no financial disclosures, but Carl Zeiss Meditec provided the perimeter used in the clinical evaluation. G.C.L. and T.C. are employees of Carl Zeiss Meditec. All authors attest that they meet the current ICMJE criteria for authorship.
REFERENCES
1. Heijl A, Krakau CE. An automatic static perimeter, design and pilot study. Acta Ophthalmol (Copenh). 1975;53(3):293–310.
2. Fankhauser F, Spahr J, Bebié H. Some aspects of the automation of perimetry. Surv Ophthalmol. 1977;22(2):131–141.
3. Anderson DR, Patella VM, eds. Automated Static Perimetry. Mosby; 1999.
4. Heijl A, Patella VM, Chong LX, et al. A new SITA perimetric threshold testing algorithm: construction and a multicenter clinical study. Am J Ophthalmol. 2019;198:154–165.
5. Katz J, Sommer A. Reliability indexes of automated perimetric tests. Arch Ophthalmol. 1988;106(9):1252–1254.
6. Bengtsson B, Heijl A. False-negative responses in glaucoma perimetry: indicators of patient performance or test reliability? Invest Ophthalmol Vis Sci. 2000;41(8):2201–2204.
7. Haley M. The Field Analyzer Primer. 2nd ed. Humphrey Instruments; 1987.
8. Olsson J, Bengtsson B, Heijl A, Rootzen H. An improved method to estimate frequency of false positive answers in computerized perimetry. Acta Ophthalmol Scand. 1997;75(2):181–183.
9. Tan NYQ, Tham YC, Koh V, et al. The effect of testing reliability on visual field sensitivity in normal eyes: The Singapore Chinese Eye Study. Ophthalmology. 2018;125(1):15–21.
10. Yohannan J, Wang J, Brown J, et al. Evidence-based criteria for assessment of visual field reliability. Ophthalmology. 2017;124(11):1612–1620.
11. Bengtsson B. Reliability of computerized perimetric threshold tests as assessed by reliability indices and threshold reproducibility in patients with suspect and manifest glaucoma. Acta Ophthalmol Scand. 2000;78(5):519–522.
12. Heijl A, Lindgren G, Olsson J. A package for the statistical analysis of visual fields. In: Greve EL, Heijl A, eds. Seventh International Visual Field Symposium, Amsterdam, September 1986. Documenta Ophthalmologica Proceedings Series, vol 49. Springer, Dordrecht. doi:10.1007/978-94-009-3325-5_23.
13. Bengtsson B, Heijl A. Acceptable frequencies of false positive answers in computerized perimetry. Invest Ophthalmol Vis Sci. 2000;41(4):478.
14. Green DM, Swets JA. Signal Detection Theory and Psychophysics. Wiley; 1966.
15. Phu J, Khuu SK, Agar A, Kalloniatis M. Clinical evaluation of Swedish Interactive Thresholding Algorithm-Faster compared with Swedish Interactive Thresholding Algorithm-Standard in normal subjects, glaucoma suspects, and patients with glaucoma. Am J Ophthalmol. 2019;208:251–264.
16. Olsson J. Statistic in Perimetry. Lund, Sweden: Lund University, Department of Mathematical Statistics; 1991.
17. Asman P, Heijl A, Olsson J, Rootzen H. Spatial analyses of glaucomatous visual fields; a comparison with traditional visual field indices. Acta Ophthalmol (Copenh). 1992;70(5):679–686.
18. Hodapp E, Parrish 2nd RK, Anderson DR. Clinical Decisions in Glaucoma. Mosby; 1993.
19. Mills RP, Budenz DL, Lee PP, et al. Categorizing the stage of glaucoma from pre-diagnosis to end-stage disease. Am J Ophthalmol. 2006;141(1):24–30.
20. Junoy Montolio FG, Wesselink C, Gordijn M, Jansonius NM. Factors that influence standard automated perimetry test results in glaucoma: test reliability, technician experience, time of day, and season. Invest Ophthalmol Vis Sci. 2012;53(11):7010–7017.
21. Kutzko KE, Brito CF, Wall M. Effect of instructions on conventional automated perimetry. Invest Ophthalmol Vis Sci. 2000;41(7):2006–2013.