False Positive Responses in Standard Automated Perimetry


ANDERS HEIJL, VINCENT MICHAEL PATELLA, JOHN G. FLANAGAN, AIKO IWASE, CHRISTOPHER K. LEUNG, ANJA TUULONEN, GARY C. LEE, THOMAS CALLAN, AND BOEL BENGTSSON

• PURPOSE: To analyze the relationship between rates of false positive (FP) responses and standard automated perimetry results.

• DESIGN: Prospective multicenter cross-sectional study.

• METHODS: One hundred twenty-six patients with manifest or suspect glaucoma were tested with Swedish Interactive Thresholding Algorithm (SITA) Standard, SITA Fast, and SITA Faster at each of 2 visits. We calculated intervisit differences in mean deviation (MD), visual field index (VFI), and number of statistically significant test points as a function of FP rates and also as a function of general height (GH).

• RESULTS: Increasing FP values were associated with higher MD values for all 3 algorithms, but the effects were small, 0.3 dB to 0.6 dB, for an increase of 10 percentage points of FP rate, and for VFI even smaller (0.6%-1.4%). Only small parts of intervisit differences were explained by FP (r² values 0.00-0.11). The effects of FP were larger in severe glaucoma, with MD increases of 1.1 dB to 2.0 dB per 10 percentage points of FP, and r² values ranging from 0.04 to 0.33. The numbers of significantly depressed total deviation points were affected only slightly, and pattern deviation probability maps were generally unaffected. GH was much more strongly related to perimetric outcomes than FP.

• CONCLUSIONS: Across 3 different standard automated perimetry thresholding algorithms, FP rates showed only weak associations with visual field test results, except in severe glaucoma. Current recommendations regarding acceptable FP ranges may require revision. GH or other analyses may be better suited than FP rates for identifying unreliable results in patients who frequently press the response button without having perceived stimuli. (Am J Ophthalmol 2021;233: 180– This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/))

Supplemental Material available at AJO.com. Accepted for publication June 25, 2021.

From Ophthalmology Research Unit, Department of Clinical Sciences Malmö, Lund University (A.H. and B.B.); Department of Ophthalmology, Skåne University Hospital, Malmö, Sweden (A.H.); Department of Ophthalmology (V.M.P.), University of Iowa, Iowa City, Iowa, USA; School of Optometry and Vision Science Program (J.G.F.), University of California, Berkeley, Berkeley, USA; Tajimi Iwase Eye Clinic (A.I.), Tajimi, Japan; Department of Ophthalmology and Visual Sciences (C.K.L.), Chinese University of Hong Kong, Hong Kong, China; Tays Eye Centre (A.T.), Tampere University Hospital, Tampere, Finland; Carl Zeiss Meditec, Inc. (G.C.L., T.C.), Dublin, California, USA

Inquiries to Anders Heijl, Department of Ophthalmology, Skåne University Hospital, Jan Waldenströms Gata 24, SE-20502, Malmö, Sweden; e-mail: anders.heijl@med.lu.se

With the introduction of computerized perimeters in the 1970s, 3 so-called “reliability parameters” were implemented with the hope of helping users judge whether test results were reliable and useful. These parameters were fixation losses (FLs), false negative (FN) responses, and false positive (FP) responses.¹⁻³ FL responses are obtained using a method described in 1974 in which test stimuli are presented at the expected location of the physiologic blind spot of the tested eye.¹ The method was originally designed to give a qualitative idea about fixation in an early computerized perimeter, where the operator could not see the tested eye. The method has been widely used in many or most automated perimeters, but has well-known shortcomings, especially in eyes where the blind spot is not situated in the assumed location. Today, various methods for gaze tracking can be considered superior to the blind spot technique, and at least one new testing algorithm relies by default upon gaze tracking and not FL estimates based on the blind spot method.⁴

FN responses were intended to be an index of patient vigilance. FN rates usually are measured by displaying stimuli that should be easily visible, based upon threshold sensitivity measurements made at the chosen locations earlier in the test. However, in the 1980s it was reported that the percentage of FN answers depended more on the level of visual field damage than on patient vigilance.⁵ In Bengtsson and Heijl,⁶ this shortcoming was clearly demonstrated by testing both eyes of patients having unilateral glaucoma. It is now recognized that test results should not be discarded solely on the basis of elevated FN response rates.

While FL and FN rates have been considered decreasingly important over time, this has not been the case thus far for FP response rates. FP rate estimates are meant to identify “trigger-happy” testing behavior, ie, examinations in which patients too frequently pressed the perimeter’s response button without having perceived a stimulus. Classic “trigger-happy” fields, with very high threshold sensitivity values and white patches in the grayscale maps, often have high percentages of FP answers, but this is not always the case (Figure 1). FP rates were originally estimated using catch trials in which no stimulus was presented, noting if the patient erroneously pressed the response button.⁷ More recently, Swedish Interactive Thresholding Algorithm (SITA) testing programs have incorporated a different method of estimating FP rates that is based upon detection of patient responses during times when it is impossible or unlikely that a stimulus was seen.⁸

The reason that high FP response rates are of interest is that they are expected to be associated with artifactually elevated threshold sensitivity values, with higher FP rates being associated with higher mean deviation (MD) values, both in perimetry-naïve normal subjects⁹ and in patients with glaucoma.¹⁰ In the first of these studies, the analysis was based on just a single visual field test per normal subject, and in the latter study the results were based on differences between predicted and observed MD values in eyes with suspect or manifest glaucoma. However, FP rates have also been reported to have almost no correlation to measurement variability in a cohort of patients with suspect or manifest glaucoma who underwent threshold visual field testing twice within approximately 1 week.¹¹

Recommended limits for clinically “acceptable” FP rates have evolved over time. In the 1980s, we used an arbitrary limit of 33%, which simply was the limit we had chosen as an exclusion criterion for the visual field tests used to define normative significance limits for the first Humphrey Statpac interpretation package.¹² Later, we suggested that FP rates >15% might indicate unreliable test results, a recommendation that was based upon the distribution of FP levels seen in a sample of field test results. Thus, FP rates >15% were flagged because they were uncommon, not because tests with higher FP rates were unreliable.¹³

FIGURE 1. False positive rates in “trigger-happy fields”: these 2 fields both show typical features of trigger-happy fields, including abnormally high threshold sensitivity values, “white scotomas,” “reversed cataract pattern,” many more significant test point locations in pattern deviation probability maps than in total deviation maps, and GHT classifications of “abnormally high sensitivity.” One field (A) shows a high rate of false positive responses (35%) while the other (B) has a false positive rate of 0%.

While in perimetry FP responses have traditionally been regarded as errors, signal detection theory provides a different perspective.¹⁴ In signal detection theory, FP responses are merely a reflection of the subject’s response criterion.

Recently, while developing the SITA Faster (SFR) test strategy, we noticed that the percentage of FP answers was higher with the new program than with SITA Fast (SF),⁴ and this has been subsequently reported by other investigators.¹⁵ It has been known for >20 years that FP rate estimates are typically slightly higher with SF than with SITA Standard (SS). Despite the higher FP rates with SFR, the results of a multicenter clinical trial showed almost identical SFR and SF threshold test results.⁴ We realized that further analysis of our multicenter SFR study data might provide an opportunity to study the relationship between FP answers and perimetric test results in greater detail. The distinctive advantage of using this recent study material was that all patients had been tested twice within such a short period of time, <2 weeks, that it was reasonable to postulate that no significant visual field progression would have occurred between the 2 tests. A second advantage was that we could simultaneously evaluate FP effects in all 3 SITA testing algorithms. Therefore, we hoped to determine the extent to which differences in FP measurements between the first and second tests were associated with observed differences in measured threshold sensitivity and associated metrics.

The aim of the current investigation was to analyze our recent multicenter dataset, focusing on the relationship between FP rates and perimetric test results in each of 3 different testing strategies.

METHODS

• SETTING: This prospective multicenter study was conducted at 5 centers located in 5 different countries in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the Gifu Prefecture Medical Association, the Ethics Committee of Tampere University Hospital, the Committee for Protection of Human Subjects of the University of California Berkeley, and the Hong Kong Hospital Authority Kowloon Central Research Ethics Committee. The study was also submitted to the Regional Ethics Review Board in Lund, Sweden. The Lund Board concluded that the study did not need their approval but that they saw no ethical issues.

• STUDY POPULATION: The acquisition of study data has been previously described.⁴ The study included 126 patients with manifest or suspect glaucoma. No stages of glaucomatous visual field loss were excluded.

• OBSERVATION PROCEDURE: All participants underwent Humphrey 24-2 visual field testing in a single study eye using 3 different threshold testing strategies (SFR, SF, and SS) in randomized order. All perimetric testing was repeated at a second visit, between 1 day and 2 weeks later, with testing order reversed. At each study site, participants underwent all testing on the same Humphrey 860 perimeter (Carl Zeiss Meditec, Dublin, California, USA).

If, during testing, the perimetrist observed patient gaze instability or results consistent with false responses, patient misunderstanding, or inattentiveness, the perimetrist was allowed to stop the test, reinstruct the patient, and restart the test from the beginning, thus discarding the interrupted test. However, once a test had been completed, it could not be deleted, and it was included in all statistical analyses.

• MAIN OUTCOME MEASURES: From each visual field test we tabulated the percentage of FP responses, visual field index (VFI), and MD values, and the number of significantly depressed test points at the 1% and 0.5% significance levels in the total deviation (TD) and pattern deviation (PD) probability maps.

• STATISTICAL ANALYSIS: First, we registered FP rates and MD and VFI values for the 3 algorithms. For each tested eye and each test strategy, we then calculated differences between visit 1 and visit 2 FP rates, as well as intertest differences in VFI, MD, and the number of significantly depressed test points. We then performed linear regression analyses with intrasubject FP differences as the explanatory variable and intrasubject differences in VFI, MD, and number of significantly depressed test points as the dependent variables.

We also calculated intertest differences in general height (GH).¹⁶,¹⁷ GH is the difference between the numerical TD values and the PD values in the Statpac program of the Humphrey perimeter. We then performed the same regression analyses with GH differences, instead of FP differences, as the explanatory variable.

We also performed regression analyses with FP differences as the explanatory variable and differences in MD, VFI, and numbers of significantly depressed points as the dependent variables, with study eyes divided into 3 groups with early, moderate, or severe visual field loss using the MD values of the staging systems of Hodapp and associates¹⁸ and Mills and associates.¹⁹ The MD stage for each eye was defined as the average of the visit 1 and visit 2 MD values, for each test algorithm.
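The staging step can be illustrated with a minimal sketch. The cut-offs of −6 dB and −12 dB are an assumption here: they are the MD limits commonly quoted for Hodapp-type staging, and the text above does not state the exact values or how boundary values were handled. The function name and parameters are hypothetical.

```python
def md_stage(md_visit1, md_visit2, early_cutoff=-6.0, severe_cutoff=-12.0):
    """Assign a severity stage from the mean of the two visits' MD values (dB).

    The cut-offs are assumed values, not taken from the study text;
    a mean MD exactly on a cut-off is placed in the milder group.
    """
    mean_md = (md_visit1 + md_visit2) / 2.0
    if mean_md >= early_cutoff:
        return "early"
    if mean_md >= severe_cutoff:
        return "moderate"
    return "severe"


# Hypothetical eye: MD of -11.0 dB at visit 1 and -14.0 dB at visit 2
print(md_stage(-11.0, -14.0))
```

Because the stage is computed separately for each test algorithm, the same eye could in principle fall into different groups for SS, SF, and SFR, which may explain why the group sizes differ between strategies.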

Assumptions for linear regression were tested by residual analysis between differences in FP vs differences in MD and VFI. Histograms of residuals were produced, as were scatterplots of standardized residuals over standardized predicted values.
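The regression described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are invented, the helper names are hypothetical, differences are taken as visit 1 minus visit 2, and the residuals are standardized by the residual standard error, which may differ from the standardization used by the statistical package in the study.

```python
from math import sqrt


def ols_fit(x, y):
    """Ordinary least squares fit of y on x: slope, intercept, r2, residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    r2 = 1.0 - sum(r ** 2 for r in residuals) / syy
    return slope, intercept, r2, residuals


def effect_per_10_points(slope):
    """Predicted change in the outcome for a 10 percentage point FP increase."""
    return 10.0 * slope


def standardize(residuals):
    """Divide residuals by the residual standard error (n - 2 degrees of freedom)."""
    n = len(residuals)
    rse = sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return [r / rse for r in residuals]


# Invented example values for 6 eyes (FP in %, MD in dB); not study data
fp_visit1 = [2, 10, 0, 15, 1, 4]
fp_visit2 = [4, 6, 3, 5, 7, 4]
md_visit1 = [-3.1, -7.9, -1.2, -14.0, -5.6, -9.3]
md_visit2 = [-2.8, -8.4, -1.3, -14.4, -5.0, -9.5]

# Intervisit differences (visit 1 minus visit 2) for each eye
fp_diff = [a - b for a, b in zip(fp_visit1, fp_visit2)]
md_diff = [a - b for a, b in zip(md_visit1, md_visit2)]

slope, intercept, r2, residuals = ols_fit(fp_diff, md_diff)
std_resid = standardize(residuals)
print(f"slope={slope:.3f} dB per point, r2={r2:.2f}, "
      f"effect per 10 points={effect_per_10_points(slope):.2f} dB")
```

The same fit applies unchanged when GH differences replace FP differences as the explanatory variable, or when VFI or point counts replace MD as the dependent variable.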

RESULTS

We analyzed test results from 125 patients, including 64 women (51%) and 61 men (49%). The mean age was 67 years (range 26-82 years). Results from 1 subject were excluded because testing of this patient had been interrupted because of the observation of large eye movements. The patient was reinstructed and a new test was started, but fixation stability was still considered unacceptable.

The 3 test strategies showed significantly different FP rates, while MD and VFI values were very similar⁴ (Table 1). Intervisit differences shown in Table 1 were all distributed normally.

For each of the 3 strategies, intervisit differences in FP explained only a small part of the intervisit differences in MD and VFI, despite reaching statistical significance in half of the analyses (Table 2). Statistical significance may have been reached simply because of the relatively large number of observations. The coefficients of determination (r², the variability in the dependent variable that is explained by the explanatory variable) were small for all strategies for FP vs MD and even smaller for FP vs VFI. Higher FP rates were associated with greater increases in MD values, as expected, but the effects were small (0.4-0.5 dB), depending upon testing strategy, for an increase of 10 percentage points in FP rates (for example, an increase in FP rate from 5% to 15%). Effects for VFI were even smaller, 0.6%-1.4% (approximately corresponding to 0.2-0.4 dB), for an increase of 10 percentage points in FP rates. Similarly, the associations between FP intervisit differences and differences in numbers of significantly depressed test points were weak for all 3 test strategies, with many r² values close to 0. Most of those relationships were not statistically significant.

TABLE 1. Descriptive Analysis of Mean, Median, Minimum, and Maximum Values for Parameters With Skewed Distributions, and Means and Standard Deviations for Variables That Were Normally Distributed

| Parameter | Visit 1 Mean; Median (Minimum, Maximum), Skewed Distribution | Visit 2 Mean; Median (Minimum, Maximum), Skewed Distribution | Intrapatient Difference (Visit 1 – Visit 2), Mean (SD), All Gaussian |
| --- | --- | --- | --- |
| FP (%) SS | 2.8; 2 (0, 28) | 2.8; 2 (0, 13) | 0.0 (4.1) |
| FP (%) SF | 3.3; 2 (0, 41) | 3.65; 2 (0, 32) | −0.4 (5.5) |
| FP (%) SFR | 4.9; 0 (0, 39) | 5.0; 3 (0, 43) | −0.1 (9.5) |
| MD (dB) SS | −8.5; −6.0 (−28.3, 0.56) | −8.5; −6.4 (−28.7, 0.58) | −0.1 (1.3) |
| MD (dB) SF | −8.6; −6.2 (−28.7, 1.33) | −8.4; −6.1 (−28.9, 0.8) | −0.2 (1.6) |
| MD (dB) SFR | −8.4; −5.8 (−28.5, 1.9) | −8.5; −6.4 (−28.2, 2.9) | 0.1 (1.5) |
| VFI (%) SS | 75.9; 83 (8, 100) | 75.9; 83 (6, 100) | −0.0 (3.7) |
| VFI (%) SF | 76.6; 82 (9, 100) | 77.1; 84 (11, 100) | −0.5 (4.6) |
| VFI (%) SFR | 77.6; 85 (11, 100) | 77.1; 85 (11, 100) | 0.4 (4.6) |

MD = mean deviation; SD = standard deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; VFI = visual field index.

TABLE 2. Relationships Between Differences in False Positive Response Rate Percentages and Mean Deviation and Visual Field Index Values, and Number of Significant Test Points in Total and Pattern Deviation Probability Maps Differences Between Visits 1 and 2

| Relationship | r² | Slope (Change per Percentage Point Increase in FP Rate) and 95% CI | Effect per 10 Percentage Point–Increase in FP |
| --- | --- | --- | --- |
| Diff MD/diff FP SS | 0.01 | 0.04 (−0.02 to 0.08) | 0.36 dB |
| Diff MD/diff FP SF | 0.04 | 0.06 (0.01-0.11)^a | 0.60 dB |
| Diff MD/diff FP SFR | 0.11 | 0.05 (0.03-0.08)^a | 0.51 dB |
| Diff VFI/diff FP SS | 0.00 | 0.07 (−0.10 to 0.22) | 0.56% |
| Diff VFI/diff FP SF | 0.03 | 0.14 (−0.01 to 0.28) | 1.37% |
| Diff VFI/diff FP SFR | 0.04 | 0.10 (0.01-0.18)^a | 0.95% |
| Diff TD 1%/diff FP SS | 0.00 | −0.04 (−0.25 to 0.10) | −0.4 points |
| Diff TD 1%/diff FP SF | 0.03 | −0.15 (−0.30 to 0.01) | −1.5 points |
| Diff TD 1%/diff FP SFR | 0.04 | −0.09 (−0.17 to −0.01)^a | −0.9 points |
| Diff PD 1%/diff FP SS | 0.00 | −0.02 (−0.17 to 0.13) | −0.2 points |
| Diff PD 1%/diff FP SF | 0.00 | 0.00 (−0.11 to 0.10) | 0.0 points |
| Diff PD 1%/diff FP SFR | 0.00 | −0.02 (−0.09 to 0.05) | −0.2 points |

CI = confidence interval; diff = difference; FP = false positive; MD = mean deviation; PD = pattern deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; TD = total deviation; VFI = visual field index.

^a Statistically significant slope.

The relationships between intervisit differences in GH and MD were markedly stronger, with r² values ranging from 0.22 to 0.46 for the 3 strategies. The relationships between GH and VFI intervisit changes were weak but still much stronger than for FP vs VFI (Tables 2 and 3). The relationships between GH and number of significant TD test points were fairly strong across testing strategies but were much weaker for points in PD maps. This latter observation is not surprising, because GH was designed to correct for generalized changes in visual field sensitivity, such as those associated with cataract development.

Analysis of linear regression residual values revealed that assumptions implicit in linear regression were supported. Histograms of standardized residuals were normally distributed around zero. Most residual points in the scatterplots were within the ±2 intervals of standardized residuals on the y axes and randomly dispersed around standardized predicted MD and VFI values on the horizontal axes (Supplemental Figure 1).

TABLE 3. Relationships Between Differences in General Height and Mean Deviation and Visual Field Index Values, and Number of Significant Test Points in Total and Pattern Deviation Probability Maps Differences Between Visits 1 and 2

| Relationship | r² | Slope (Change per dB of GH) and 95% CI | Effect per 10-dB Change in GH |
| --- | --- | --- | --- |
| Diff MD/diff GH SS | 0.35 | 0.55 (−0.42 to 0.69) | 5.53 dB |
| Diff MD/diff GH SF | 0.46 | 0.93 (0.75-1.1)^a | 9.25 dB |
| Diff MD/diff GH SFR | 0.22 | 0.56 (0.37-0.74)^a | 5.55 dB |
| Diff VFI/diff GH SS | 0.10 | 0.84 (0.39-1.28)^a | 8.35% |
| Diff VFI/diff GH SF | 0.22 | 1.83 (1.22-2.43)^a | 18.25% |
| Diff VFI/diff GH SFR | 0.03 | 0.64 (0.00-1.28)^a | 6.38% |
| Diff TD 1%/diff GH SS | 0.35 | −1.95 (−2.42 to −1.47)^a | −19.5 points |
| Diff TD 1%/diff GH SF | 0.39 | −2.56 (−3.17 to −2.00)^a | −25.6 points |
| Diff TD 1%/diff GH SFR | 0.20 | −1.51 (−2.05 to −0.90)^a | −15.1 points |
| Diff PD 1%/diff GH SS | 0.00 | 0.06 (−0.37 to 0.50) | 0.6 points |
| Diff PD 1%/diff GH SF | 0.00 | −0.18 (−0.68 to 0.32) | −1.8 points |
| Diff PD 1%/diff GH SFR | 0.11 | 0.92 (0.44-1.40)^a | 9.2 points |

CI = confidence interval; diff = difference; GH = general height; MD = mean deviation; PD = pattern deviation; SF = SITA Fast; SFR = SITA Faster; SITA = Swedish Interactive Thresholding Algorithm; SS = SITA Standard; TD = total deviation; VFI = visual field index.

TABLE 4. Relationships Between False Positive Response Rate Percentages and Mean Deviation Values at Different Stages of Glaucoma

| Eyes (n) | Strategy | Stage | r² | Slope (Decibel Change per Percentage Point Change in FP Rate) | 95% CI | Effect per 10 Percentage Point–Increase in FP Rate (dB) |
| --- | --- | --- | --- | --- | --- | --- |
| 61 | SS^c | Early | 0.01 | 0.03 | −0.04 to 0.09 | 0.3^a |
| 25 | SS | Moderate | 0.00 | 0.02 | −0.14 to 0.18 | 0.2 |
| 39 | SS | Severe | 0.05 | 0.11 | −0.05 to 0.27 | 1.1 |
| 58 | SF^d | Early | 0.06 | 0.04 | −0.004 to 0.09 | 0.4 |
| 30 | SF | Moderate | 0.01 | 0.04 | −0.13 to 0.22 | 0.4 |
| 37 | SF | Severe | 0.09 | 0.14 | −0.01 to 0.28 | 1.4 |
| 63 | SFR^e | Early | 0.06 | 0.03 | 0.001-0.05^b | 0.3 |
| 23 | SFR | Moderate | 0.14 | 0.05 | −0.01 to 0.11 | 0.5 |
| 39 | SFR | Severe | 0.33 | 0.20 | 0.11-0.30^b | 2.0 |
| 37^a | SFR | Severe | 0.12 | 0.14 | 0.01-0.26^b | 1.4 |

CI = confidence interval; FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.

^a Two outliers excluded.

^b Statistically significant slope.

The influence of FP rates on MD and VFI when dividing the field tests into severity stages is presented in Tables 4 and 5. For MD, the relationships did not reach statistical significance at any disease stage with SS and SF. In severe glaucoma, they were statistically significant with SFR and borderline significant with SS and SF. Influences on VFI were statistically significant only for SF and SFR in severe glaucoma. Eliminating 2 outliers among the SFR results in the group of eyes with severe glaucoma (Supplemental Figure 2) reduced the slopes considerably. Corresponding results at different stages of glaucoma on numbers of significantly depressed TD and PD test points are shown in Tables 6 and 7. None of the relationships were statistically significant.

DISCUSSION

Our results indicate that across 3 different perimetric thresholding strategies, FP rate measurements generally showed only weak associations with visual field threshold sensitivity and associated analysis metrics. This finding is somewhat unexpected, but we see a parallel in the evolution of our thinking regarding the role of FN response rates >20 years ago.

TABLE 5. Relationships Between False Positive Response Rate Percentages and Visual Field Index Values at Different Stages of Glaucoma

| Eyes (n) | Strategy | Stage | r² | Slope (VFI Percentage Change per Percentage Change in FP) | 95% CI | Effect per 10 Percentage Point–Increase in FP (%) |
| --- | --- | --- | --- | --- | --- | --- |
| 61 | SS | Early | 0.00 | 0.01 | −0.17 to 0.18 | 0.08 |
| 25 | SS | Moderate | 0.01 | 0.10 | −0.36 to 0.55 | 0.95 |
| 39 | SS | Severe | 0.04 | 0.28 | −0.20 to 0.76 | 2.81 |
| 58 | SF | Early | 0.01 | 0.05 | −0.07 to 0.16 | 0.47 |
| 30 | SF | Moderate | 0.01 | 0.12 | −0.38 to 0.61 | 1.17 |
| 37 | SF | Severe | 0.11 | 0.44 | 0.01-0.88^a | 4.43 |
| 63 | SFR | Early | 0.01 | 0.02 | −0.04 to 0.08 | 0.23 |
| 23 | SFR | Moderate | 0.04 | 0.08 | −0.11 to 0.26 | 0.76 |
| 39 | SFR | Severe | 0.25 | 0.62 | 0.26-0.97^a | 6.15 |
| 37^b | SFR | Severe | 0.09 | 0.43 | −0.04 to 0.90 | 4.29 |

CI = confidence interval; FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.

^b Two outliers excluded.

TABLE 6. Relationships Between False Positive Response Rate Percentages and Numbers of Significant Points at the P < .01 Level in Total Deviation Probability Maps

| Eyes (n) | Strategy | Stage | r² | Slope (Change in Number of 1% Points per Percentage Point Change in FP Rate) | P Value | Effect per 10 Percentage Point–Increase in FP |
| --- | --- | --- | --- | --- | --- | --- |
| 61 | SS | Early | 0.01 | −0.05 | .68 | −0.5 points |
| 25 | SS | Moderate | 0.00 | −0.008 | .98 | −0.1 points |
| 39 | SS | Severe | 0.00 | −0.002 | 1.00 | 0.0 points |
| 58 | SF | Early | 0.02 | −0.08 | .36 | −0.8 points |
| 30 | SF | Moderate | 0.02 | −0.16 | .52 | −1.6 points |
| 37 | SF | Severe | 0.07 | −0.29 | .11 | −2.9 points |
| 63 | SFR | Early | 0.01 | −0.04 | .42 | −0.4 points |
| 23 | SFR | Moderate | 0.07 | −0.12 | .21 | −1.2 points |
| 39 | SFR | Severe | 0.09 | −0.25 | .06 | −2.5 points |

FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.

TABLE 7. Relationships Between False Positive Response Rate Percentages and Numbers of Significant Points at the P < .01 Level in Pattern Deviation Probability Maps

| Eyes (n) | Strategy | Stage | r² | Slope (Change in Number of 1% Points per Percentage Point Change in FP Rate) | P Value | Effect per 10 Percentage Point–Increase in FP Rate |
| --- | --- | --- | --- | --- | --- | --- |
| 61 | SS | Early | 0.00 | 0.01 | .89 | 0.1 points |
| 25 | SS | Moderate | 0.01 | −0.11 | .57 | −1.1 points |
| 39 | SS | Severe | 0.00 | −0.05 | .79 | −0.5 points |
| 58 | SF | Early | 0.01 | 0.03 | .56 | 0.3 points |
| 30 | SF | Moderate | 0.02 | 0.13 | .52 | 1.3 points |
| 37 | SF | Severe | 0.08 | −0.21 | .08 | −2.1 points |
| 63 | SFR | Early | 0.00 | −0.02 | .69 | −0.2 points |
| 23 | SFR | Moderate | 0.00 | 0.15 | .82 | 1.5 points |
| 39 | SFR | Severe | 0.06 | −0.16 | .14 | −1.6 points |

FP = false positive; SF = SITA Fast; SFR = SITA Faster; SS = SITA Standard.

The traditional definition of reliability in research is reproducibility. If that is what we want FP metrics to assess, then our findings suggest that the current FP index may be of little use. This is not at all a new finding, however, as similar results were reported 20 years ago; the results of reliability testing had almost negligible correlation with test reliability expressed as threshold reproducibility.¹¹ If we instead define reliability as indicating the “usefulness” of test results, our findings show that FP measurement changes were associated with changes in test results in the same direction as in other published studies.⁹,¹⁰,²⁰ Therefore, in the current study, increasing rates of FP responses were associated with increases in MD values, but the effects on MD were small, except in severe disease, and even smaller for VFI. PD probability maps were not influenced at all, which is interesting because a higher number of significantly depressed PD test points than TD points is one of the classical hallmarks of a trigger-happy field. That observation alone supports the results reported herein: the relationship of higher FP rates to signs of trigger-happy fields is weak to poor.

The effects of FP rates on MD were larger with SFR than with SS and SF, but in early glaucoma the slopes were small with all 3 algorithms, while in severe disease the slope with SFR was considerably larger than those of SS and SF. The SFR results in severe glaucoma were partially explained by 2 outliers (Supplemental Figure 2). In line with earlier reports, we thus found that FP rates seemed to be more important in eyes with severe field loss. PD probability maps were not influenced.

Tan and associates⁹ and Yohannan and associates¹⁰ both reported that FP influenced MD to a greater extent with higher frequencies of FP. We applied the analyses of Tan and associates⁹ on our own data but could not confirm their findings. Therefore, in our material, the effect of FP on MD did not differ between eyes with high vs low FP values. Each percentage point of higher FP rate was associated with an increase in MD of 0.06 dB in eyes with FP ≤15% and 0.04 dB in eyes with FP >15%.

The main strength of the current study was that 2 visual fields were obtained with each of 3 perimetric thresholding algorithms within a very short time interval, eliminating the need to compare single test results to a model of an expected field, as has been the case in earlier studies.¹⁰ Other strengths include our study’s multicenter design and the fact that we could assess FP performance in subjects who were tested using 3 different threshold testing algorithms, making it possible to determine if observed trends were consistent across testing strategies, and the fact that we also studied the effects on the results expressed in probability maps.

A weakness of this study is the somewhat limited number of enrolled subjects. The material consisted mostly of eyes with manifest glaucoma but also contained glaucoma suspects. It would have been interesting to have had an equally large age-matched cohort of entirely normal subjects, each tested twice with all 3 strategies.

This study is not the first attempt to address the relationship between FP response rate measurements and MD values. Our results referring to MD values go in the same direction but are of a smaller magnitude than those reported in a large similar population of patients with suspect and manifest glaucoma¹⁰ and in another large study of normal subjects.⁹ These earlier studies have not reported the relationship of FP rates to the VFI, where we found even smaller effects, and, therefore, we cannot make any comparisons.

We have found no previous publications reporting the relationship of FP rates to the number of significantly depressed TD and PD points, which are central to the clinical interpretation of perimetric results. We found no significant influence of FP rates on PD probability maps.

One may speculate as to why SFR tests generate a larger number of FP responses than SF, and why SF generates more such responses than SS. According to signal detection theory, a more lenient response criterion leads to a higher rate of FP responses.¹⁴ This has also been shown to happen in computerized perimetric testing, where instructions encouraging test subjects to use more lenient response criteria resulted in higher FP rates.²¹ In the beginning of a visual field test, patients must set their own subjective response criteria. In SS and SF, the test starts with stimuli that are quite a bit more intense than normal threshold sensitivity, which usually are easily perceived. It seems likely that patients taking a SF test may then require stronger stimuli before responding than in SFR tests, which start out at the normal age-corrected threshold. The lack of clearly visible (supraliminal) stimuli in SFR tests makes it reasonable to assume that patients might then tend to adopt a more lenient response criterion and respond more often when not being sure of having seen a stimulus. This might explain the higher FP rates with SFR. During most of the test, SFR presents stimuli at the patient’s predicted 50% threshold level, while SF presents stimuli that are approximately 1 dB brighter and SS 3 dB brighter, possibly explaining the smaller FP difference between SF and SS. The timing algorithms in SS and SF are identical, and the one used in SFR differs little from that of SS and SF. We do not believe that the differences of FP rates among the 3 algorithms are explained by the method used in SITA to assess FP rates, ie, to register as FP responses any button presses that occur during the first 180 milliseconds after stimulus initiation or during a period from the end of a response window until the onset of the next stimulus exposure.⁸
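The timing rule described above for registering FP responses can be sketched as a simple classifier. This is an illustration only and not the SITA implementation: the function name, the data layout, and the example response window of 1500 milliseconds are assumptions, and the way SITA converts such counts into an FP rate estimate is not reproduced here.

```python
def is_fp_press(press_ms, onsets_ms, window_ends_ms, min_latency_ms=180.0):
    """Return True if a button press falls in a period where the preceding
    stimulus could not plausibly have been seen.

    onsets_ms: sorted stimulus onset times; window_ends_ms: end of the
    response window for each stimulus (hypothetical values, same length).
    """
    for i, onset in enumerate(onsets_ms):
        next_onset = onsets_ms[i + 1] if i + 1 < len(onsets_ms) else float("inf")
        if onset <= press_ms < next_onset:
            if press_ms - onset < min_latency_ms:
                return True   # too early to be a reaction to this stimulus
            if press_ms >= window_ends_ms[i]:
                return True   # after the response window, before the next stimulus
            return False      # inside the response window: treated as a true response
    return False              # before the first stimulus: not classified in this sketch


# Hypothetical session: 2 stimuli, each with an assumed 1500 ms response window
onsets = [0.0, 2000.0]
window_ends = [1500.0, 3500.0]
presses = [100.0, 600.0, 1800.0, 2500.0]
fp_presses = [t for t in presses if is_fp_press(t, onsets, window_ends)]
print(len(fp_presses), "of", len(presses), "presses fall in FP periods")
```

Under a rule of this kind, every button press contributes information, which is why no separate catch trials are needed.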

The intended aim of the FP index was to flag perimetry test results from “trigger-happy” patients where those results cannot be trusted, and on that basis current methods of estimating FP rates are far from optimal. Therefore, it seems likely that test results should never be discarded solely based on FP response rates. It is encouraging to note that the correlation of GH with other test metrics was much greater than that of FP rates and was usually highly significant. It might perhaps be possible to construct a metric that is based in part on GH to better identify trigger-happy fields where test results should not be trusted.

ALL AUTHORS HAVE COMPLETED AND SUBMITTED THE ICMJE FORM FOR DISCLOSURE OF POTENTIAL CONFLICTS OF INTEREST and none were reported.

Funding/Support: The clinical study was supported by the Herman Järnhardt Foundation, the Foundation for Visually Impaired in Former Malmöhus County, Sweden. These funding organizations had no role in the design or conduct of this research. Carl Zeiss Meditec Inc., Dublin, California, USA was directly involved in the development of SITA Faster and loaned perimeters to 4 of the participating clinical sites. Carl Zeiss Meditec Inc. provided research funding for this study to J.G.F. at the University of California, Berkeley. Financial Disclosures: A.H. and B.B. are consultants of and are entitled to royalties from Carl Zeiss Meditec. A.H. is a consultant for Allergan plc and has received speaker honoraria from Allergan and Zeiss. V.M.P. is a consultant for Carl Zeiss Meditec and was a Carl Zeiss Meditec employee during the development and evaluation of SITA Faster. J.G.F. is a consultant for and received research support from Carl Zeiss Meditec, Inc. A.I. is a consultant for Santen and has received speaker honoraria from Pfizer, Santen, Kowa, Alcon, Heidelberg Engineering through Japanfocus Company, and Carl Zeiss Meditec, Tokyo. A.I. also holds a patent licensed to Topcon without any royalties. C.K.L. has received speaker honoraria from Carl Zeiss Meditec, Topcon, Tomey, Allergan, Novartis, Santen, Glaukos, and Global Vision; research support in the form of instruments from Carl Zeiss Meditec, Heidelberg Engineering, Topcon, Tomey, and Optovue; research grants from Carl Zeiss Meditec, Topcon, Novartis, Glaukos, Alcon, and Optovue; consultant fees from Allergan and Novartis; and has patents with Carl Zeiss Meditec. A.T. has no financial disclosures, but Carl Zeiss Meditec provided the perimeter used in the clinical evaluation. G.C.L. and T.C. are employees of Carl Zeiss Meditec. All authors attest that they meet the current ICMJE criteria for authorship.

REFERENCES

1. Heijl A, Krakau CE. An automatic static perimeter, design and pilot study. Acta Ophthalmol (Copenh). 1975;53(3):293–310.

2. Fankhauser F, Spahr J, Bebié H. Some aspects of the automation of perimetry. Surv Ophthalmol. 1977;22(2):131–141.

3. Anderson DR, Patella VM, eds. Automated Static Perimetry. Mosby; 1999.

4. Heijl A, Patella VM, Chong LX, et al. A new SITA perimetric threshold testing algorithm: construction and a multicenter clinical study. Am J Ophthalmol. 2019;198:154–165.

5. Katz J, Sommer A. Reliability indexes of automated perimetric tests. Arch Ophthalmol. 1988;106(9):1252–1254.

6. Bengtsson B, Heijl A. False-negative responses in glaucoma perimetry: indicators of patient performance or test reliability? Invest Ophthalmol Vis Sci. 2000;41(8):2201–2204.

7. Haley M. The Field Analyzer Primer. 2nd ed. Humphrey Instruments; 1987.

8. Olsson J, Bengtsson B, Heijl A, Rootzen H. An improved method to estimate frequency of false positive answers in computerized perimetry. Acta Ophthalmol Scand. 1997;75(2):181–183.

9. Tan NYQ, Tham YC, Koh V, et al. The effect of testing reliability on visual field sensitivity in normal eyes: The Singapore Chinese Eye Study. Ophthalmology. 2018;125(1):15–21.

10. Yohannan J, Wang J, Brown J, et al. Evidence-based criteria for assessment of visual field reliability. Ophthalmology. 2017;124(11):1612–1620.

11. Bengtsson B. Reliability of computerized perimetric threshold tests as assessed by reliability indices and threshold reproducibility in patients with suspect and manifest glaucoma. Acta Ophthalmol Scand. 2000;78(5):519–522.

12. Heijl A, Lindgren G, Olsson J. A package for the statistical analysis of visual fields. In: Greve EL, Heijl A, eds. Seventh International Visual Field Symposium, Amsterdam, September 1986. Documenta Ophthalmologica Proceedings Series, vol 49. Springer, Dordrecht. doi:10.1007/978-94-009-3325-5_23.

13. Bengtsson B, Heijl A. Acceptable frequencies of false positive answers in computerized perimetry. Invest Ophthalmol Vis Sci. 2000;41(4):478.

14. Green DM, Swets JA. Signal Detection Theory and Psychophysics. Wiley; 1966.

15. Phu J, Khuu SK, Agar A, Kalloniatis M. Clinical evaluation of Swedish Interactive Thresholding Algorithm-Faster compared with Swedish Interactive Thresholding Algorithm-Standard in normal subjects, glaucoma suspects, and patients with glaucoma. Am J Ophthalmol. 2019;208:251–264.

16. Olsson J. Statistic in Perimetry. Lund, Sweden: Lund University, Department of Mathematical Statistics; 1991.

17. Asman P, Heijl A, Olsson J, Rootzen H. Spatial analyses of glaucomatous visual fields; a comparison with traditional visual field indices. Acta Ophthalmol (Copenh). 1992;70(5):679–686.

18. Hodapp E, Parrish 2nd RK, Anderson DR. Clinical Decisions in Glaucoma. Mosby; 1993.

19. Mills RP, Budenz DL, Lee PP, et al. Categorizing the stage of glaucoma from pre-diagnosis to end-stage disease. Am J Ophthalmol. 2006;141(1):24–30.

20. Junoy Montolio FG, Wesselink C, Gordijn M, Jansonius NM. Factors that influence standard automated perimetry test results in glaucoma: test reliability, technician experience, time of day, and season. Invest Ophthalmol Vis Sci. 2012;53(11):7010–7017.

21. Kutzko KE, Brito CF, Wall M. Effect of instructions on conventional automated perimetry. Invest Ophthalmol Vis Sci. 2000;41(7):2006–2013.