Gradient-based learning applied to document recognition



YANN LECUN, MEMBER, IEEE, LÉON BOTTOU, YOSHUA BENGIO, AND PATRICK HAFFNER

Invited Paper

Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of two-dimensional (2-D) shapes, are shown to outperform all other techniques.

Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN’s), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure.

Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks.

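A graph transformer network for reading a bank check is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal checks. It is deployed commercially and reads several million checks per day.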

Keywords: Convolutional neural networks, document recognition, finite state transducers, gradient-based learning, graph transformer networks, machine learning, neural networks, optical character recognition (OCR).

Manuscript received November 1, 1997; revised April 17, 1998.
Y. LeCun, L. Bottou, and P. Haffner are with the Speech and Image Processing Services Research Laboratory, AT&T Labs-Research, Red Bank, NJ 07701 USA.
Y. Bengio is with the Département d’Informatique et de Recherche Opérationelle, Université de Montréal, Montréal, Québec H3C 3J7 Canada.
Publisher Item Identifier S 0018-9219(98)07863-3.
0018–9219/98$10.00 © 1998 IEEE
PROCEEDINGS OF THE IEEE, VOL. 86, NO. 11, NOVEMBER 1998

NOMENCLATURE

GT        Graph transformer.
GTN       Graph transformer network.
HMM       Hidden Markov model.
HOS       Heuristic oversegmentation.
K-NN      K-nearest neighbor.
NN        Neural network.
OCR       Optical character recognition.
PCA       Principal component analysis.
RBF       Radial basis function.
RS-SVM    Reduced-set support vector method.
SDNN      Space displacement neural network.
SVM       Support vector method.
TDNN      Time delay neural network.
V-SVM     Virtual support vector method.

I. INTRODUCTION

Over the last several years, machine learning techniques, particularly when applied to NN’s, have played an increasingly important role in the design of pattern recognition systems. In fact, it could be argued that the availability of learning techniques has been a crucial factor in the recent success of pattern recognition applications such as continuous speech recognition and handwriting recognition.

The main message of this paper is that better pattern recognition systems can be built by relying more on automatic learning and less on hand-designed heuristics. This is made possible by recent progress in machine learning and computer technology. Using character recognition as a case study, we show that hand-crafted feature extraction can be advantageously replaced by carefully designed learning machines that operate directly on pixel images. Using document understanding as a case study, we show that the traditional way of building recognition systems by manually integrating individually designed modules can be replaced by a unified and well-principled design paradigm, called GTN’s, which allows training all the modules to optimize a global performance criterion.

Since the early days of pattern recognition it has been known that the variability and richness of natural data, be it speech, glyphs, or other types of patterns, make it almost impossible to build an accurate recognition system entirely by hand. Consequently, most pattern recognition systems are built using a combination of automatic learning techniques and hand-crafted algorithms. The usual method of recognizing individual patterns consists in dividing the system into two main modules shown in Fig. 1. The first module, called the feature extractor, transforms the input patterns so that they can be represented by low-dimensional vectors or short strings of symbols that: 1) can be easily matched or compared and 2) are relatively invariant with respect to transformations and distortions of the input patterns that do not change their nature. The feature extractor contains most of the prior knowledge and is rather specific to the task. It is also the focus of most of the design effort, because it is often entirely hand crafted. The classifier, on the other hand, is often general purpose and trainable. One of the main problems with this approach is that the recognition accuracy is largely determined by the ability of the designer to come up with an appropriate set of features. This turns out to be a daunting task which, unfortunately, must be redone for each new problem. A large amount of the pattern recognition literature is devoted to describing and comparing the relative merits of different feature sets for particular tasks.

Fig. 1. Traditional pattern recognition is performed with two modules: a fixed feature extractor and a trainable classifier.
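The two-module arrangement described above (a fixed feature extractor followed by a trainable classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the paper: the function names, the toy 4x3 binary images, the hand-picked "ink density" features, and the use of a perceptron as the trainable classifier.

```python
# Hypothetical two-module recognizer: a fixed, hand-crafted feature
# extractor followed by a trainable classifier (here a perceptron).

def extract_features(image):
    """Fixed module: map a 2-D binary image to a 2-D feature vector.

    The features (ink density in the top and bottom halves) are an
    illustrative hand-crafted choice, not taken from the paper.
    """
    rows = len(image)
    cols = len(image[0])
    half = rows // 2
    top = sum(sum(r) for r in image[:half])
    bottom = sum(sum(r) for r in image[half:])
    return [top / (half * cols), bottom / ((rows - half) * cols)]


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Trainable module: learn weights w and bias b on the features."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 or -1
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: move the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def classify(image, w, b):
    """Run both modules: extract features, then apply the classifier."""
    x = extract_features(image)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy data: class +1 has ink in the top half, class -1 in the bottom half.
TOP_HEAVY = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 0]]
BOTTOM_HEAVY = [[0, 0, 0], [0, 1, 0], [1, 1, 1], [1, 1, 1]]

images = [TOP_HEAVY, BOTTOM_HEAVY]
labels = [1, -1]
w, b = train_perceptron([extract_features(im) for im in images], labels)
```

Only `train_perceptron` adapts to data here. The accuracy ceiling is set by `extract_features`, which is the dependence on the designer's choice of features that the text identifies as the main weakness of this approach.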

Historically, the need for appropriate feature extractors was due to the fact that the learning techniques used by the classifiers were limited to low-dimensional spaces with easily separable classes [1]. A combination of three factors has changed this vision over the last decade. First, the availability of low-cost machines with fast arithmetic units allows for reliance on more brute-force “numerical” methods than on algorithmic refinements. Second, the availability of large databases for problems with a large market and wide interest, such as handwriting recognition, has enabled designers to rely more on real data and less on hand-crafted feature extraction to build recognition systems. The third and very important factor is the availability of powerful machine learning techniques that can handle high-dimensional inputs and can generate intricate decision functions when fed with these large data sets. It can be argued that the recent progress in the accuracy of speech and handwriting recognition systems can be attributed in large part to an increased reliance on learning techniques and large training data sets. As evidence of this fact, a large proportion of modern commercial OCR systems use some form of multilayer NN trained with back propagation.

In this study, we consider the tasks of handwritten character recognition (Sections I and II) and compare the performance of several learning techniques on a benchmark data set for handwritten digit recognition (Section III). While more automatic learning is beneficial, no learning technique can succeed without a minimal amount of prior knowledge about the task. In the case of multilayer NN’s, a good way to incorporate knowledge is to tailor its architecture to the task. Convolutional NN’s [2], introduced in Section II, are an example of specialized NN architectures which incorporate knowledge about the invariances of two-dimensional (2-D) shapes by using local connection patterns and by imposing constraints on the weights. A comparison of several methods for isolated handwritten digit recognition is presented in Section III. To go from the recognition of individual characters to the recognition of words and sentences in documents, the idea of combining multiple modules trained to reduce the overall error is introduced in Section IV. Recognizing variable-length objects such as handwritten words using multimodule systems is best done if the modules manipulate directed graphs. This leads to the concept of trainable GTN, also introduced in Section IV. Section V describes the now classical method of HOS for recognizing words or other character strings. Discriminative and nondiscriminative gradient-based techniques for training a recognizer at the word level without requiring manual segmentation and labeling are presented in Section VI. Section VII presents the promising space-displacement NN approach that eliminates the need for segmentation heuristics by scanning a recognizer at all possible locations on the input. In Section VIII, it is shown that trainable GTN’s can be formulated as multiple generalized transductions based on a general graph composition algorithm. The connections between GTN’s and HMM’s, commonly used in speech recognition, are also treated. Section IX describes a globally trained GTN system for recognizing handwriting entered in a pen computer. This problem is known as “online” handwriting recognition since the machine must produce immediate feedback as the user writes. The core of the system is a convolutional NN. The results clearly demonstrate the advantages of training a recognizer at the word level, rather than training it on presegmented, hand-labeled, isolated characters. Section X describes a complete GTN-based system for reading handwritten and machine-printed bank checks. The core of the system is the convolutional NN called LeNet-5, which is described in Section II. This system is in commercial use in the NCR Corporation line of check recognition systems for the banking industry. It is reading millions of checks per month in several banks across the United States.
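The phrase "local connection patterns" and "constraints on the weights" can be made concrete with a small sketch. The code below is a generic 2-D convolution written for illustration only; it is not the architecture of Section II, and the kernel, the images, and the function name `conv2d_valid` are assumptions. Each output unit is connected to one small patch of the input, and all units share the same kernel weights.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Local connection: this output sees only a kh x kw patch.
            # Weight constraint: every position reuses the same kernel.
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += kernel[di][dj] * image[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out


# Illustrative vertical-edge kernel and two images that differ only by
# a one-pixel horizontal shift of the edge.
KERNEL = [[-1.0, 1.0],
          [-1.0, 1.0]]
IMAGE = [[0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 1]]
SHIFTED = [[0, 0, 0, 1, 1],
           [0, 0, 0, 1, 1],
           [0, 0, 0, 1, 1]]

response = conv2d_valid(IMAGE, KERNEL)
response_shifted = conv2d_valid(SHIFTED, KERNEL)
```

Because the weights are shared, shifting the edge by one pixel shifts the peak response by one position and leaves its value unchanged. This is one simple sense in which such an architecture encodes knowledge about the invariances of 2-D shapes.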

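A. Learning from Data

There are several approaches to automatic machine learning, but one of the most successful approaches, popularized in recent years by the NN community, can be called “numerical” or gradient-based learning. The learning machine computes a function $Y^p = F(Z^p, W)$, where $Z^p$ is the $p$th input pattern, and $W$ represents the collection of adjustable parameters in the system.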

