Gradient-Based Learning Applied to Document Recognition

YANN LECUN, MEMBER, IEEE, LÉON BOTTOU, YOSHUA BENGIO, AND PATRICK HAFFNER

Invited Paper

Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques.

Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN’s), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure.

Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks.

A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.

Keywords—Convolutional neural networks, document recognition, finite state transducers, gradient-based learning, graph transformer networks, machine learning, neural networks, optical character recognition (OCR).

NOMENCLATURE
GT        Graph transformer.
GTN       Graph transformer network.
HMM       Hidden Markov model.
HOS       Heuristic oversegmentation.
K-NN      K-nearest neighbor.
NN        Neural network.
OCR       Optical character recognition.
PCA       Principal component analysis.
RBF       Radial basis function.
RS-SVM    Reduced-set support vector method.
SDNN      Space displacement neural network.
SVM       Support vector method.
TDNN      Time delay neural network.
V-SVM     Virtual support vector method.

Manuscript received November 1, 1997; revised April 17, 1998.
Y. LeCun, L. Bottou, and P. Haffner are with the Speech and Image Processing Services Research Laboratory, AT&T Labs-Research, Red Bank, NJ 07701 USA.
Y. Bengio is with the Département d’Informatique et de Recherche Opérationelle, Université de Montréal, Montréal, Québec H3C 3J7 Canada.
Publisher Item Identifier S 0018-9219(98)07863-3.
0018–9219/98$10.00 © 1998 IEEE
2278 PROCEEDINGS OF THE IEEE, VOL. 86, NO. 11, NOVEMBER 1998

I. INTRODUCTION

Over the last several years, machine learning techniques, particularly when applied to NN’s, have played an increasingly important role in the design of pattern recognition systems. In fact, it could be argued that the availability of learning techniques has been a crucial factor in the recent success of pattern recognition applications such as continuous speech recognition and handwriting recognition.

The main message of this paper is that better pattern recognition systems can be built by relying more on automatic learning and less on hand-designed heuristics. This is made possible by recent progress in machine learning and computer technology. Using character recognition as a case study, we show that hand-crafted feature extraction can be advantageously replaced by carefully designed learning machines that operate directly on pixel images. Using document understanding as a case study, we show that the traditional way of building recognition systems by manually integrating individually designed modules can be replaced by a unified and well-principled design paradigm, called GTN’s, which allows training all the modules to optimize a global performance criterion.

Since the early days of pattern recognition it has been known that the variability and richness of natural data, be it speech, glyphs, or other types of patterns, make it almost impossible to build an accurate recognition system entirely by hand. Consequently, most pattern recognition systems are built using a combination of automatic learning techniques and hand-crafted algorithms. The usual method of recognizing individual patterns consists in dividing the system into two main modules shown in Fig. 1. The first module, called the feature extractor, transforms the input patterns so that they can be represented by low-dimensional vectors or short strings of symbols that: 1) can be easily matched or compared and 2) are relatively invariant with respect to transformations and distortions of the input patterns that do not change their nature. The feature extractor contains most of the prior knowledge and is rather specific to the task. It is also the focus of most of the design effort, because it is often entirely hand crafted. The classifier, on the other hand, is often general purpose and trainable. One of the main problems with this approach is that the recognition accuracy is largely determined by the ability of the designer to come up with an appropriate set of features. This turns out to be a daunting task which, unfortunately, must be redone for each new problem. A large amount of the pattern recognition literature is devoted to describing and comparing the relative merits of different feature sets for particular tasks.

Fig. 1. Traditional pattern recognition is performed with two modules: a fixed feature extractor and a trainable classifier.
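
The two-module arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: the row and column density features, the nearest-mean classifier, and the 3x3 toy images are all hypothetical choices made for this example and are not taken from the paper.

```python
# Illustrative sketch: the features, classifier, and data are assumptions
# chosen for this example, not the ones used in the paper.

def extract_features(image):
    """Fixed, hand-crafted extractor: fraction of 'on' pixels per row and per column."""
    rows = len(image)
    cols = len(image[0])
    row_density = [sum(row) / cols for row in image]
    col_density = [sum(image[r][c] for r in range(rows)) / rows for c in range(cols)]
    return row_density + col_density


class NearestMeanClassifier:
    """Trainable, general-purpose classifier: stores one mean feature vector per class."""

    def __init__(self):
        self.means = {}

    def fit(self, feature_vectors, labels):
        sums, counts = {}, {}
        for x, y in zip(feature_vectors, labels):
            if y not in sums:
                sums[y] = [0.0] * len(x)
                counts[y] = 0
            sums[y] = [s + v for s, v in zip(sums[y], x)]
            counts[y] += 1
        self.means = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(self, x):
        def squared_distance(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))
        return min(self.means, key=lambda y: squared_distance(x, self.means[y]))


# Toy 3x3 binary "images": a vertical bar and a horizontal bar.
VERTICAL = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
HORIZONTAL = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

classifier = NearestMeanClassifier()
classifier.fit(
    [extract_features(VERTICAL), extract_features(HORIZONTAL)],
    ["vertical", "horizontal"],
)

# A vertical bar with one extra noisy pixel, and a vertical bar shifted left.
NOISY = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
SHIFTED = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

noisy_prediction = classifier.predict(extract_features(NOISY))
shifted_prediction = classifier.predict(extract_features(SHIFTED))
```

In this toy setup the noisy bar is labeled "vertical", but the shifted bar is labeled "horizontal", because the column densities are not invariant to translation. The classifier is unchanged in both cases; only the choice of features decides the outcome, which is the dependence on the designer that the text points out.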

Historically, the need for appropriate feature extractors was due to the fact that the learning techniques used by the classifiers were limited to low-dimensional spaces with easily separable classes [1]. A combination of three factors has changed this vision over the last decade. First, the availability of low-cost machines with fast arithmetic units allows for reliance on more brute-force “numerical” methods than on algorithmic refinements. Second, the availability of large databases for problems with a large market and wide interest, such as handwriting recognition, has enabled designers to rely more on real data and less on hand-crafted feature extraction to build recognition systems. The third and very important factor is the availability of powerful machine learning techniques that can handle high-dimensional inputs and can generate intricate decision functions when fed with these large data sets. It can be argued that the recent progress in the accuracy of speech and handwriting recognition systems can be attributed in large part to an increased reliance on learning techniques and large training data sets. As evidence of this fact, a large proportion of modern commercial OCR systems use some form of multilayer NN trained with back propagation.

In this study, we consider the tasks of handwritten character recognition (Sections I and II) and compare the performance of several learning techniques on a benchmark data set for handwritten digit recognition (Section III). While more automatic learning is beneficial, no learning technique can succeed without a minimal amount of prior knowledge about the task. In the case of multilayer NN’s, a good way to incorporate knowledge is to tailor the architecture to the task. Convolutional NN’s [2], introduced in Section II, are an example of specialized NN architectures which incorporate knowledge about the invariances of two-dimensional (2-D) shapes by using local connection patterns and by imposing constraints on the weights. A comparison of several methods for isolated handwritten digit recognition is presented in Section III. To go from the recognition of individual characters to the recognition of words and sentences in documents, the idea of combining multiple modules trained to reduce the overall error is introduced in Section IV. Recognizing variable-length objects such as handwritten words using multimodule systems is best done if the modules manipulate directed graphs. This leads to the concept of trainable GTN, also introduced in Section IV. Section V describes the now classical method of HOS for recognizing words or other character strings. Discriminative and nondiscriminative gradient-based techniques for training a recognizer at the word level without requiring manual segmentation and labeling are presented in Section VI. Section VII presents the promising space-displacement NN approach that eliminates the need for segmentation heuristics by scanning a recognizer at all possible locations on the input. In Section VIII, it is shown that trainable GTN’s can be formulated as multiple generalized transductions based on a general graph composition algorithm. The connections between GTN’s and HMM’s, commonly used in speech recognition, are also treated. Section IX describes a globally trained GTN system for recognizing handwriting entered in a pen computer. This problem is known as “online” handwriting recognition since the machine must produce immediate feedback as the user writes. The core of the system is a convolutional NN. The results clearly demonstrate the advantages of training a recognizer at the word level, rather than training it on presegmented, hand-labeled, isolated characters. Section X describes a complete GTN-based system for reading handwritten and machine-printed bank checks. The core of the system is the convolutional NN called LeNet-5, which is described in Section II. This system is in commercial use in the NCR Corporation line of check recognition systems for the banking industry. It is reading millions of checks per month in several banks across the United States.
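
The idea of local connection patterns with constrained (shared) weights can be illustrated with a minimal sketch. It is not the LeNet-5 architecture, which the paper describes in Section II; the function name, the 2x2 kernel, and the toy images are assumptions made for this example, and the kernel is applied without flipping, as is common in NN implementations.

```python
def shared_weight_feature_map(image, kernel):
    """Apply one shared kernel at every position where it fits inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Local connections: this output unit sees only a kh x kw patch.
            # Weight constraint: every output unit reuses the same kernel values.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += kernel[di][dj] * image[i + di][j + dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical kernel that responds to a dark-to-bright vertical edge.
EDGE_KERNEL = [[-1.0, 1.0], [-1.0, 1.0]]

IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# The same edge, shifted one pixel to the left.
SHIFTED_IMAGE = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]

response = shared_weight_feature_map(IMAGE, EDGE_KERNEL)
shifted_response = shared_weight_feature_map(SHIFTED_IMAGE, EDGE_KERNEL)
```

Because the same weights are used everywhere, shifting the edge in the input simply shifts the peak in the feature map: the nonzero column moves from index 1 to index 0 while its value stays the same.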
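
A. Learning from Data

There are several approaches to automatic machine learning, but one of the most successful approaches, popularized in recent years by the NN community, can be called “numerical” or gradient-based learning. The learning machine computes a function $Y^p = F(Z^p, W)$, where $Z^p$ is the $p$th input pattern, and $W$ represents the collection of adjustable parameters in the system.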
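
A minimal sketch of gradient-based learning follows, for a learning machine that computes an output from an input pattern and a set of adjustable parameters. Everything specific here is an assumption made for illustration: the linear form of `F`, the mean squared error loss, the learning rate, the number of steps, and the toy data are not taken from the text above.

```python
def F(z, w):
    """Hypothetical learning machine: a linear function with parameters w = [slope, bias]."""
    return w[0] * z + w[1]


def mean_squared_error(patterns, targets, w):
    """Assumed loss: average squared gap between the outputs and the desired outputs."""
    return sum((F(z, w) - d) ** 2 for z, d in zip(patterns, targets)) / len(patterns)


def gradient(patterns, targets, w):
    """Analytic gradient of the mean squared error with respect to w."""
    n = len(patterns)
    g_slope = sum(2 * (F(z, w) - d) * z for z, d in zip(patterns, targets)) / n
    g_bias = sum(2 * (F(z, w) - d) for z, d in zip(patterns, targets)) / n
    return [g_slope, g_bias]


def train(patterns, targets, w, learning_rate=0.1, steps=500):
    """Plain gradient descent: repeatedly move w against the gradient of the loss."""
    for _ in range(steps):
        g = gradient(patterns, targets, w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


# Toy data generated by d = 2 * z + 1.
Z = [0.0, 1.0, 2.0, 3.0]
D = [1.0, 3.0, 5.0, 7.0]

W_INITIAL = [0.0, 0.0]
W_TRAINED = train(Z, D, W_INITIAL)

loss_before = mean_squared_error(Z, D, W_INITIAL)
loss_after = mean_squared_error(Z, D, W_TRAINED)
```

With these settings the parameters approach a slope of 2 and a bias of 1, and the loss falls from its initial value toward zero. The same loop applies to any differentiable `F`; only the gradient computation changes.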