Gabor filters have been applied to several tasks in object recognition, including
segmentation, representing local features, and extracting size information. Some applications
depend on a separate segmentation step before recognition can be performed. For that
reason, the use of Gabor filters in segmentation will be considered first before moving
to their use in the representation of objects, which will form the most significant part of
this section.
The power of inspecting locally prominent frequencies has made Gabor filters popular in
texture analysis. This has been utilised by Jain et al. for the segmentation of objects in
complex backgrounds [47]. A multi-channel decomposition of an image is first performed,
followed by a selection of channels based on minimising the reconstruction error. The
responses are subjected to a sigmoid nonlinearity, averaged, and then clustered to form
the segments. A bottom-up merging has also been used to segment images based on
Figure 4.6: Absolute filter responses in different frequencies and orientations.
(Axes: frequency f0 = 1/4, 1/8, 1/16, 1/32; orientation θ = 0°, 45°, 90°, 135°.)
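
A bank of Gabor filters of the kind whose absolute responses Figure 4.6 shows can be
sketched in Python with NumPy. This is a minimal sketch under stated assumptions: the
filter is a complex sinusoid under a circular Gaussian envelope whose width is tied to
the centre frequency, and the names gabor_kernel, filter_response, bank_magnitudes and
the parameter gamma are illustrative, not taken from the source.

```python
import numpy as np


def gabor_kernel(f0, theta, gamma=1.0):
    """Complex 2-D Gabor filter (assumed form, circular Gaussian envelope).

    f0    : centre frequency in cycles per pixel
    theta : orientation of the sinusoidal carrier in radians
    gamma : hypothetical parameter; envelope width in units of the wavelength
    """
    half = int(np.ceil(3.0 * gamma / f0))  # envelope has decayed to exp(-9) here
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the carrier
    envelope = np.exp(-((f0 / gamma) ** 2) * (x ** 2 + y ** 2))
    carrier = np.exp(2j * np.pi * f0 * xr)
    return (f0 ** 2 / (np.pi * gamma ** 2)) * envelope * carrier


def filter_response(image, kernel):
    """Linear convolution via zero-padded FFT, cropped to the image size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    spectrum = np.fft.fft2(image, shape) * np.fft.fft2(kernel, shape)
    full = np.fft.ifft2(spectrum)
    top, left = kh // 2, kw // 2
    return full[top:top + h, left:left + w]


def bank_magnitudes(image, freqs, thetas):
    """Absolute responses, shape (len(freqs), len(thetas), H, W)."""
    return np.array([[np.abs(filter_response(image, gabor_kernel(f, t)))
                      for t in thetas] for f in freqs])


# Example: a grating with a period of 8 pixels along x.
x = np.arange(64)
image = np.tile(np.cos(2 * np.pi * x / 8.0), (64, 1))
freqs = [1 / 4, 1 / 8, 1 / 16, 1 / 32]
thetas = np.deg2rad([0, 45, 90, 135])
magnitudes = bank_magnitudes(image, freqs, thetas)
```

With this test grating, the filter tuned to f0 = 1/8 and orientation 0° gives the
strongest mean response, which is the kind of frequency and orientation selectivity
that texture segmentation relies on.
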
In addition to segmentation, Gabor filters have been applied as features to represent
objects. These approaches can be divided into four categories depending on the type of
the information used: a) response at a single point is used; b) responses on a predefined
grid of points are used; c) responses on a deformable grid are used; d) global features of
responses over an area are used.
Gabor filter response on a single point has been used to realise affine invariant
recognition of image patches [6, 123, 7]. Full affine invariance was established by a log-log
mapping of the frequency domain together with filters in several orientations. However,
the image needs to be segmented in order to find the centres of the image patches. Weber
and Casasent performed distortion invariant recognition by an entirely different concept
using a linear combination of several separately optimised filters [124, 125]. The relative
filter positions were also optimised in addition to other filter parameters. While the
distortion tolerance of the system is good, there was no invariance to geometric
transformations. The author has used Gabor filters to perform distortion tolerant rotation
invariant recognition of electronic components [54, 53]. The recognition is based on a
robust estimation of the dimensions of an object in different orientations around a single
point. For a single sampling point, it is possible to realise invariances using analysis of
the filter response.
The Gabor responses on a rigid predefined grid have been used to recognise handwritten
numerals [37]. Filter parameters were optimised by hand with a 1-NN classification of
the response vectors. Walter and Arnrich propose to use a grid of nine real (even) Gabor
filters with a backpropagating multi-layer perceptron to position a robot manipulator
[122]. Jain et al. use even symmetric (real) Gabor filters to extract the local ridge
structures of fingerprints [46]. The fingerprint is inspected using a circular grid around
a detected reference point. For each cell of the grid, the deviation of the filter responses
for each filter is used to construct the feature vector. Altogether, none of the systems
employing rigid sampling of responses achieves rotation or scale invariance.
Invariance to small scale geometric deformations has been achieved through the use of a
deformable graph to represent the Gabor filter responses [15, 14, 65]. A single node of
a graph represents the filter responses of a bank of filters, often called a Gabor jet [15].
The model graph of an object can be constructed either using sampling on a regular grid
[15, 101] or sampling in interest locations [112]. The model graph is then compared with
an image to find the locations for each node in the image that minimise both the geometric
deformation of the graph and the difference of filter responses at these locations. This
approach is invariant to small changes in the locations of the graph nodes which may
result, for example, from geometric transformations. However, full invariance to rotation
or scale is not achieved as the filter responses are directly compared yet the responses
are invariant only to small changes in scale and rotation. This restriction is alleviated by
storing several model graphs for objects of different sizes. The approach has been applied
to face recognition [15, 14, 65, 128] and classification of hand postures [112, 113]. Krüger
and others present a variant that utilises Gabor features to extract local line segments
which are then fed to a learning process that outputs the model graph [61, 62]. Park and
Yang combine the use of local Gabor features with Randomized Hough transform type
evidence accumulation to estimate the parameters of a similarity transform (translation,
rotation, and scaling) [88]. Local Gabor features similar to Gabor jets are used to find
matching of local elements is performed in a rotation invariant manner which should
guarantee full rotational invariance while the invariance in scale remains small.
Gabor filter responses can be combined over an area to achieve a global descriptor for that
area. Lampinen and Oja [69] employ clustering of the responses of a Gabor jet to assign
each pixel to a single cluster. Then, the histogram of clusters over a Gaussian window is
used as an input to a subspace classifier. Douville [26] uses the responses summed over
the image. Shioyama et al. [102] compute a histogram of responses over a segmented
image region as a description of that segment. They demonstrate the performance by
detecting cars in real traffic scenes. The global systems presented are only scale and
translation tolerant and do not achieve full invariance.
In Publication IV, invariant recognition properties of Gabor filters are examined in the
context of directional edge detection in binary images. A method to select filter
parameters is presented which assures that the circular Gaussian is able to capture the required
number of angles. Translation invariance is achieved by summing the filter responses
over the image. That is, a feature vector $G$ is constructed from the individual responses
$r(x, y; \theta)$ as
$$G = (g_0, g_1, \ldots, g_{n-1}), \qquad g_k = \sum_{x} \sum_{y} |r(x, y; \theta_k)|,$$
where $n$ is the number of filters in different orientations. Because the filter responses
represent the edges in different orientations, scale invariance can be realised by normalising
the feature magnitudes. Let $G'$ denote the normalised feature vector.
Rotation invariant distance measure for the features can be presented by first defining
the rotation of the feature vector as
$$G^{(k)} = (g_k, g_{k+1}, \ldots, g_{n-1}, g_0, \ldots, g_{k-1}),$$
where $k = 0, \ldots, n-1$ is the rotation index. Then the rotation invariant squared Euclidean
distance of two feature vectors can be defined as the minimum over all rotations, that is,
$$d(G, H) = \min_{k} \lVert G^{(k)} - H \rVert^2,$$
where $G$ and $H$ are the feature vectors. In Publication IV the invariances are
demonstrated with an experiment with digits. A preliminary version of Publication IV was
published as [64].
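
The global feature and the rotation invariant distance described above can be sketched
in Python with NumPy. This is a minimal sketch under assumptions that the text here does
not confirm: the magnitude of each response is summed, scale normalisation divides by
the sum of the feature elements, and a rotation of the feature vector is a circular
shift of its elements. All function names are hypothetical.

```python
import numpy as np


def global_feature(responses):
    """Sum response magnitudes over the image (translation invariance).

    responses : array of shape (n, H, W), one response image per orientation.
    Returns the feature vector G of length n.
    """
    return np.abs(responses).sum(axis=(1, 2))


def normalise(G):
    """Normalise feature magnitudes; dividing by the sum is an assumption."""
    return G / G.sum()


def rotate(G, k):
    """Rotation with index k, modelled as a circular shift (g_k moves to front)."""
    return np.roll(G, -k)


def rotation_invariant_distance(G, H):
    """Minimum squared Euclidean distance over all n rotations of G."""
    return min(float(np.sum((rotate(G, k) - H) ** 2)) for k in range(len(G)))


# Example: H is G seen under a rotation of one orientation step.
G = normalise(np.array([4.0, 1.0, 0.5, 2.0]))
H = np.roll(G, 1)
distance = rotation_invariant_distance(G, H)
```

Because the minimum is taken over all n circular shifts, the cost of one comparison
grows as n squared, which stays small for the handful of orientations used in a
filter bank.
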
In Publication V, the recognition method is applied to matching of binary symbols. The
method is inspected using the larger data set presented in more detail in Publication III.
Both the Gabor filtering and Hough transform methods are based on the construction of
a histogram of edges in different orientations. The noise tolerance of the Gabor filtering
based image matching is found to be superior to the Hough transform generated
Discussion
Invariant object recognition has been recognised as one of the most central problems in
computer vision. Feature extraction is an important part of a vision system which
converts digital image data into higher level features. In this thesis, new feature extraction
methods have been presented and analysed. Methods based on Hough transform and
Gabor filtering have been improved. The publications will now be reviewed based on the
objectives of an object recognition system, and then discussed one by one concentrating
on the contributions and restrictions.
In Chapter 1, the objectives of an object recognition system were defined as: The
system should be general enough to recognise a large variety of objects, invariant to
natural variations, stable against distortions, and computationally efficient. These can
be used to survey the contents of this thesis. Both the line segments detected by the
Hough transform and the Gabor features can be thought of as general shape primitives
because they are not limited to a single application and are able to represent many
objects, particularly those which are man-made and have geometric shapes. However, it
seems likely that these low level features cannot be used directly but must be used in the
generation of a higher level description.
It is doubtful if global shape type features can be used for the recognition of complex
objects. The shapes of such objects can be complex and often cannot be described by a
single contour. Also, the global features can often be spatially unstable, that is, they are
not equally sensitive over the whole shape, and the objects need to be segmented from
the background in order to use global features. While local features can overcome these
problems, at least partially, the question is how local and global information should
be combined. The idea of combining local and global information can be seen both
in the Hough and Gabor approaches. The Hough transform is inherently an evidence
gathering process that accumulates local evidence to examine the global configuration. In
Publication I a robust local estimation technique is presented to estimate the parameters
of a line segment inside a local window to be used in the evidence gathering. The
inspection system presented in Publication II utilises information on local line segments
and on centroids from circle fitting, and compares it to a known global model. Likewise, Gabor
filtering provides a way to robustly extract local directional edges and lines, which can
then be combined to a global description as in Publication IV and Publication V.
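
The evidence gathering idea can be illustrated with the standard Hough transform for
lines. The sketch below assumes the commonly used normal parameterisation
rho = x cos(theta) + y sin(theta) and a fixed accumulator resolution; it is a textbook
illustration with hypothetical names, not the implementation used in the publications.

```python
import numpy as np


def hough_lines(edge_points, shape, n_theta=180):
    """Accumulate votes of edge points in a (rho, theta) parameter space.

    edge_points : iterable of (x, y) pixel coordinates
    shape       : (height, width) of the image, used to bound rho
    Returns the accumulator, the theta values, and the rho offset.
    """
    height, width = shape
    rho_max = int(np.ceil(np.hypot(height, width)))
    thetas = np.arange(n_theta) * np.pi / n_theta  # orientations in [0, pi)
    accumulator = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    columns = np.arange(n_theta)
    for x, y in edge_points:
        # Each point votes once per theta: it is local evidence for every
        # line that could pass through it.
        rhos = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + rho_max, columns] += 1
    return accumulator, thetas, rho_max


# Example: 60 points on the vertical line x = 5.
points = [(5, y) for y in range(60)]
accumulator, thetas, rho_max = hough_lines(points, (64, 64))
peak = np.unravel_index(np.argmax(accumulator), accumulator.shape)
```

The accumulator peak, here at rho = 5 and theta = 0, is the global configuration
supported by the most local evidence, and it remains the peak when some of the points
are missing.
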
Translation, rotation, and scale invariant recognition using global Gabor features is
presented in Publication IV and Publication V. In Publication II the rotation invariant
recognition was realised using local features with known spatial relationships. The Hough
transform based translation, rotation, and scale invariant features presented in
Publication III were based on global and local line orientations.
The existence of truly invariant features is questionable even in the human visual system,
as it has been found that humans recognise certain views of an object more quickly than
others [87]. In addition, the human visual system can be primed to certain poses of an
object. For example, alphabetic characters are recognised more quickly in their normal
position than upside down. An important note is that a human being can still recognise
these characters regardless of the pose even though the recognition time changes. While
this evidence is contradictory to the existence of global invariants, research in invariant
recognition is certainly warranted and the evidence may, in addition, provide ideas for
further approaches to computer vision. It is not very likely that the global features will
provide enough information for discriminating images in large data sets. It seems that
responses of Gabor filters are powerful as descriptors of local parts but the local parts
have to be combined in a global description.
The stability against distortions is one of the most important qualities of a computer
vision system which operates in natural surroundings. In Publication I the noise tolerance
of the connective randomised Hough transform was improved with a new algorithm for
the local estimation. The experiments in Publication V verify the noise tolerance of the
Gabor filtering based features. Comparing the Hough and Gabor approaches it seems
that the local features extracted by Gabor filtering are more stable against such
distortions as noise and small displacements of the target. However, the evidence gathering
nature of the Hough transform makes it more tolerant against missing data. Thus, it can
only be concluded that while the stability of features is desirable, the optimally stable
feature extraction method depends on the application.
The Hough transform is known to be computationally demanding. In Publication I a
variant was proposed that reduces the computational burden by employing local
information. While the method for local estimation is more complex than in the preceding
method, CRHT, the total computation time is decreased in some cases due to the better
estimates of the local structure. While separate analyses of parts of the algorithm can
reveal important information, they cannot be combined directly to form a truth about
the whole.
The standard Hough transform is normally impractically burdensome. In Publication VI
it was shown that with modern parallel environments computation can be sped up
considerably. The focus was on the Randomized Hough transform, which has been found to
outperform the standard Hough transform. A proposed parallel algorithm was discovered
to be superior both to sequential randomised and parallel standard Hough transforms in
the chosen environments. From the research the conclusion can be drawn that a low-scale
parallelisation of known algorithms is often possible in an efficient way. On the other
hand, in highly parallel environments the algorithm design should be strongly related to
Publication I presents a new method to extract local information to improve the
robustness of evidence accumulation in CRHT. The underlying idea of using the local
information has been presented by other authors, but the novel contribution of the paper
is the robust local estimation method. A restriction of the technique is that it only helps
in the detection problem, that is, it is helpful when a random subset of evidences, i.e.,
edge points, is missing causing short gaps in local line segments. In addition, if only short
gaps are present, the method can lower execution time because of the better estimates of
line parameters. However, the method is not applicable for long gaps in the lines because
the execution time grows intolerably.
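
For orientation, the basic Randomized Hough transform that CRHT builds on can be
sketched as follows. This is a simplified illustration with hypothetical names and
parameters: each randomly sampled pair of edge points casts a single vote for the line
through both points. It does not include the local window, the connectivity check, or
the robust local estimation of Publication I, and it omits the usual step of removing
the points of a detected line and restarting the accumulation.

```python
from collections import Counter

import numpy as np


def randomized_hough_lines(points, n_samples=2000, rho_step=1.0,
                           theta_step=np.pi / 180, seed=0):
    """Vote for lines with randomly sampled point pairs (basic RHT sketch).

    points : sequence of (x, y) edge point coordinates
    Returns a Counter mapping quantised (rho, theta) cells to vote counts.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    votes = Counter()
    for _ in range(n_samples):
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        # Angle of the line normal, folded into [0, pi). Lines whose normal
        # is close to 0 or pi can split between the first and last theta cell.
        theta = np.arctan2(x2 - x1, -(y2 - y1)) % np.pi
        rho = x1 * np.cos(theta) + y1 * np.sin(theta)
        cell = (int(np.rint(rho / rho_step)), int(np.rint(theta / theta_step)))
        votes[cell] += 1  # one vote per pair, stored sparsely
    return votes


# Example: 30 points on the line x = 5 plus 10 random clutter points.
line = [(5.0, float(y)) for y in range(30)]
clutter = np.random.default_rng(1).uniform(0, 64, size=(10, 2)).tolist()
votes = randomized_hough_lines(line + clutter)
best_cell = votes.most_common(1)[0][0]
```

Each sample touches one accumulator cell instead of a whole curve of cells, which is
the source of the computational advantage over the standard transform. The random
sampling is also what makes the workload hard to predict.
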
Publication II presents a system for inspecting two-dimensional sheet metal parts. The
contribution of the publication is in combining existing methods into a real-world
application, and in the analysis of the measurement accuracy. A drawback of the system is
that it is based on two-dimensional models of the parts. Thus, the calibration is
estimated using a two-dimensional similarity transform, the part and the image plane of the
camera must lie parallel, and the thickness of the parts cannot be taken into account.
Publication III presents new global features based on Hough transform for matching
images. Originally, it was desired that these features could be used for discriminating
between different types of documents such as electrical and architectural drawings and
maps because in these technical drawings, the orientations of lines often depend on the
type of the drawing. However, it was discovered that the feature vector did not have
strong correlation to the drawing type. Nevertheless, the features were found capable
of discriminating line drawing symbols. A major restriction of the method as a global
feature is that successful segmentation must be performed before recognition. For
recognising the drawing type, it seems that a more effective scheme could be constructed by
identifying typical primitives for each type, and performing the recognition based on
these.
Publication IV presents a global, Gabor filtering based feature for recognition. The
publication has two primary contributions, the selection of filter parameters and the global
feature. The method of parameter selection is based on a formulation of a Gabor filter
that forces the filter envelopes to be circular Gaussians. In addition, the relative
frequency bandwidth is selected. Therefore, the spatial size of a filter could be controlled
only using the frequency. However, as stated in Chapter 4, the frequency and
orientation bandwidths can be controlled independently using two separate parameters, which
would seem to be a more reasonable way to solve the problem of parameter selection.
The histogram like scaling and rotation invariant global feature presented is to the author's
knowledge original, but it is closely related to the energy on a certain band, which has
previously been used and is also computationally lighter. The necessity of
segmentation applies also to this feature as the result of its global nature, and presents
one of the major limitations for the application of the feature.
Publication V presents the application of the global Gabor features to symbol recognition
and compares the results to those in Publication III. The noise tolerance of the Gabor
feature is found superior to Hough transform based features. This is an important
attribute of using Gabor type features, as the noise tolerance is often very good because the
use of a limited range of the frequency spectrum suppresses noise on other frequencies.
It would be beneficial to compare the presented features to other global features such as
Publication VI presents parallel Hough transform algorithms. The contributions of the
publication are the parallel algorithm for the Randomized Hough transform and the
assessment of suitability of the parallel environment to the Hough transform. While the
results show that the use of networked PCs can decrease the computation time of HT,
it is not very likely that the environment would be useful in a real application because
in the case of standard network hardware, the network latencies are always considerable
and thus real time performance is hard to obtain if even possible. However, the use
of multiprocessor PCs seems to provide a cost efficient platform for image processing
and computer vision. A problem with the parallel RHT algorithm is that while good
performance is obtained, the algorithm does not scale up very well to a large number
of processors because of the random unpredictable nature of the algorithm. For optimal
performance in multiprocessor environments it could be beneficial to study other efficient
HT variants, such as the Adaptive HT, that do not have random behaviour and would
be more suitable to a parallel environment.
Some directions for future research can already be seen. First, incorporating the higher
level structure to the local features seems to be necessary, since global invariant features
do not seem to be effective in discriminating large and complex data sets. The spatial
relationships of the local features should be included in the representation. The
current graph based approaches suffer from the fact that while the features or the model
matching can be invariant, the non-invariant knowledge about local features, such as the
orientation, is not included in the matching. Another area of research is the application
of learning. Research on artificial neural networks has produced the means to
incorporate knowledge through training by examples. However, the downside of neural networks
is that their acquired knowledge is often very hard to interpret after the learning. In
contrast, traditional artificial intelligence techniques, such as decision trees, can be
interpreted, but they often do not tolerate uncertainty and also cannot describe complex
relationships. Therefore, future research is clearly necessary in the use of learning, both
inside the feature extraction and classification stages and in the interaction between the
stages. In 1960, Selfridge and Neisser [99] wrote: "No current program can generate test
features of its own." While this is no longer true as general low-level features are known,
knowledge about the interaction between the stages of a vision system is still mostly
nonexistent, and it is not clear what kind of general higher level descriptions should be