Posted on May 19, 2024 (author: 中关村在线攒机配置)
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 33, NO. 2, FEBRUARY 2011, p. 353
Learning to Detect a Salient Object
Tie Liu, Zejian Yuan, Jian Sun, Jingdong Wang, Nanning Zheng, Fellow, IEEE,
Xiaoou Tang, Fellow, IEEE, and Heung-Yeung Shum, Fellow, IEEE
Abstract: In this paper, [...] we formulate this problem as a binary labeling task [...]. We propose a set of novel features, including multiscale contrast, center-surround histogram, and color spatial distribution, to describe a salient object locally, regionally, [...]. A conditional random field is [...]. [...], we extend the proposed approach to detect a salient object [...]. We collected a large image database containing tens of thousands of carefully labeled images by multiple users and a video segment database, and conducted a set of experiments over them to demonstrate the effectiveness of the proposed approach.
Index Terms: Salient object detection, conditional random field, visual attention, saliency map.
1 INTRODUCTION
The human brain and visual system pay more attention [...] attention has been studied by researchers in physiology, psychology, neural systems, [...]. There are many applications for visual attention, for example, automatic image cropping [1], adaptive image display on small devices [2], image/video compression, advertising design [3], and image collection browsing [4]. Recent studies [5], [6], [7] demonstrated that visual attention helps object recognition, tracking, [...]. In this paper, we study one aspect of visual attention: salient object detection. Fig. 1 shows some examples of salient objects.
For instance, people are usually interested in the objects in images in Fig. 1, and the leaf, car, and woman attract the [...] them salient objects or foreground objects that we are familiar with, [...] applications, such as image display on small devices [2] and image collection browsing [4], people want to show the regions with the most interest, [...]. In this paper, we try to locate these salient objects automatically with the supposition that a salient object exists in an image.
• T. Liu is with the Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, and the Analytics and Optimization Department, IBM Research-China, Building 19A2F, Zhongguancun Software Park, 8 Dongbeiwang West Road, Haidian District, Beijing 100193, [...]. E-mail: liultie@[...].
• Z. Yuan and N. Zheng are with the Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, 28 Xianning Xilu, Xi’an 710049, China. E-mail: yzejian@[...], nnzheng@[...].
• J. Sun is with the Visual Computing Group, Microsoft Research Asia, 5/F, Beijing Sigma Center, No. 49, Zhichun Road, Haidian District, Beijing 100190, [...]. E-mail: jiansun@[...].
• J. Wang is with the Media Computing Group, Microsoft Research Asia, 5/F, Beijing Sigma Center, No. 49, Zhichun Road, Haidian District, Beijing 100190, [...]. E-mail: jingdw@[...].
• X. Tang is with the Department of Information Engineering, Chinese University of Hong Kong, Shatin, Hong Kong. E-mail: xtang@[...].
• H.-Y. Shum is with the On-Line Service Division, R&D, Microsoft, One Microsoft Way, Redmond, WA 98052. E-mail: hshum@[...].
Manuscript received 4 Dec. 2008; revised 23 Oct. 2009; accepted 29 Nov. 2009; published online 2 Mar. 2010.
For information on obtaining reprints of this article, please send e-mail to: tpami@[...], and reference IEEECS Log Number TPAMI-2008-12-0834.
Digital Object Identifier no. 10.1109/TPAMI.2010.70.
0162-8828/11/$26.00 © 2011 IEEE
1.1 Related Work
Most existing visual attention approaches are based on the bottom-up computational framework [8], [9], [10], [11], [12], [13], [14], [15], [16], where visual attention is supposed to be driven by low-level stimulus in the scene, such as intensity, contrast, [...] approaches consist of the following three steps: The first step is feature extraction in which multiple low-level visual features, such as intensity, color, orientation, texture, and motion, are extracted from the [...]. The second step is saliency [...]. The saliency is computed by a center-surround operation [13], self-information [8], or graph-based random walk [9] [...] normalization and linear/nonlinear combination, a master map [17] or a saliency map [14] is computed to represent the saliency of [...], a few key locations on the saliency map are identified by winner-take-all, or inhibition-of-return, [...]ly, a saliency model based on low, middle, and high-level image features was trained using the collected eye tracking data [18]. While these approaches have worked well in finding a few fixation locations in synthetic and natural images, they have not been able to accurately detect where the salient object should be.
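The three steps described above can be illustrated with a toy sketch. It is not the algorithm of [13] or of any other cited work: it uses raw intensity as the only feature, a box-filter difference as a stand-in for the center-surround operation, and a simple suppress-and-repeat loop for winner-take-all with inhibition of return. The window radii, the inhibition radius, and all function names are assumptions made for illustration.

```python
# Toy sketch of a three-step bottom-up attention pipeline on a grayscale grid.
# Window radii, the inhibition radius, and all function names are illustrative
# assumptions, not values or names taken from the paper or from [13].

def box_mean(img, y, x, r):
    """Mean intensity in the (2r+1) x (2r+1) window around (y, x), clipped at borders."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]
    return sum(vals) / len(vals)


def center_surround(img, center_r=0, surround_r=2):
    """Saliency step: absolute difference between a small center and a larger surround."""
    h, w = len(img), len(img[0])
    return [[abs(box_mean(img, y, x, center_r) - box_mean(img, y, x, surround_r))
             for x in range(w)]
            for y in range(h)]


def normalize(sal):
    """Rescale a map linearly to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in sal)
    hi = max(max(row) for row in sal)
    if hi == lo:
        return [[0.0 for _ in row] for row in sal]
    return [[(v - lo) / (hi - lo) for v in row] for row in sal]


def fixations(sal, n=2, inhibit_r=1):
    """Selection step: repeated winner-take-all with inhibition of return."""
    sal = [row[:] for row in sal]  # work on a copy so the caller's map is untouched
    h, w = len(sal), len(sal[0])
    picks = []
    for _ in range(n):
        y, x = max(((j, i) for j in range(h) for i in range(w)),
                   key=lambda p: sal[p[0]][p[1]])
        picks.append((y, x))
        # Suppress the winner's neighborhood so the next pick lands elsewhere.
        for j in range(max(0, y - inhibit_r), min(h, y + inhibit_r + 1)):
            for i in range(max(0, x - inhibit_r), min(w, x + inhibit_r + 1)):
                sal[j][i] = float("-inf")
    return picks


# Feature step: the only feature here is the raw intensity of a 7x7 image
# with a single bright pixel in the middle.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
saliency = normalize(center_surround(image))
points = fixations(saliency, n=2)
```

The output of such a pipeline is a short list of isolated fixation points, which is consistent with the limitation noted in the text: a few high-contrast locations are found, but nothing in the computation delimits an object.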
For instance, the middle row in Fig. 1 shows three saliency maps computed using Itti's algorithm [13]. Note that the visual saliency concentrates on several small local [...], the background grid in Fig. 1a, the shadow in Fig. 1b, and the foreground [...]. Although the leaf in Fig. 1a commands much attention, [...]. Therefore, these saliency maps computed from low-level features do not have the notion of objects, and they are not good indications of where a salient object is located while perusing these images.
Figure-ground segregation is somehow related to salient [...]. However, the usual figure-ground segregation algorithm works with the supposition of the category of objects [19], [20], [21] or with interactions [22], [23]. If the object is assigned a given category, the specific features, for example, for cows, can be defined specially, and these features cannot be adopted for other categories. For interactive figure-ground segmentation, the appearance model is usually set up, whereas for our salient object detection, we do not have such an appearance model.
Published by the IEEE Computer Society
Fig. 1. [...] top to bottom: input image with a salient object, saliency map computed by Itti's attention algorithm ([...]), and saliency map computed by our salient object detection approach.
Visual attention is also studied for sequential images, where the spatiotemporal cues from image sequences are [...]. For instance, motion from objects or backgrounds helps to indicate the salient fixations [24], [25], [26]. Large motion [27] and motion contrast [24] are supposed to induce prominent attention, [...]y, the visual saliency from a single image is combined with the motion saliency for better visual attention detection, and different combination strategies are introduced in [27]. Video surprise [11] is also related; it describes the Kullback-Leibler divergence between the prior and posterior distribution of a feature [...] visual attention approaches suffer from a shortcoming similar to that of the visual attention approaches for [...]tic object discovery [28], [29], [30] deals with a similar salient object detection task for sequential [...] objects are extracted and tracked using motion-based layer segmentation in [28], and a generative model of objects that defines switch variables for combinatorial model selection is adopted in [29]. The unsupervised video object discovery [30] combines the topic model and the temporal model for videos.
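The Kullback-Leibler divergence mentioned above for [11] can be made concrete with a short sketch over discrete distributions. The text does not say in which direction the divergence is taken, so the `surprise` helper below assumes KL(posterior || prior); that direction, the function names, and the example distributions are assumptions for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions given as equal-length lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Terms with p_i == 0 contribute nothing; q_i is floored at eps to avoid log(0).
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def surprise(prior, posterior):
    """Surprise of an observation, ASSUMED here to be KL(posterior || prior)."""
    return kl_divergence(posterior, prior)


# Hypothetical belief over two feature states before and after a new frame.
prior = [0.9, 0.1]
unchanged = [0.9, 0.1]   # the new frame confirms the prior belief
shifted = [0.5, 0.5]     # the new frame moves the belief noticeably

low = surprise(prior, unchanged)
high = surprise(prior, shifted)
```

A frame that leaves the belief unchanged yields zero divergence, while a frame that shifts it yields a positive value, which is why such a measure can flag unexpected events in a video.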
1.2 Our Approach
In this paper, we investigate one aspect of visual attention, namely, [...]. [...] incorporate the high-level concept of the salient object into the process of saliency [...]. [...] observed in Fig. 2, people naturally pay more attention to salient objects in images, such as a person, a face, a car, an animal, or a road sign. Therefore, we formulate salient object detection as a binary labeling problem that separates a salient object from the [...]. Like face detection, we learn to detect a familiar object; unlike face detection, we detect a familiar yet unknown object in an image.
Fig. 2. [...] images in our image database for salient object detection.
We present a supervised approach to learn to detect a [...], we model the salient object detection problem by a conditional random field (CRF), where a group of salient features are [...]er, the segmentation is also incorporated into the CRF to detect a salient object [...]t row in Fig. 1 shows [...], to overcome the challenge that we do not know what a specific object or object category is, we propose a set of novel local, regional, and global salient features to define a generic [...] define the salient features on the motion field similarly to capture the spatiotemporal cues. Then, we construct a large image database with 20,000+ well-[...]. To the best of our knowledge, it is the first time a large image database has been made available for quantitative evaluation.
The remainder of the paper is organized as follows: Section 2 introduces the formulation of the salient object detection problem, and the salient object features are [...]. Section 4 introduces the image [...]. Section 5 discusses the connections between our approach and related approaches, and the conclusion follows in Section 6.
2 FORMULATION
Given an image $I$, we represent the salient object as a binary mask $A = \{a_x\}$. For each pixel $x$, $a_x \in \{1, 0\}$ is a binary label to indicate whether the pixel $x$ belongs to the salient object. Similarly, the salient objects in sequential images, $\{I_1, \ldots, I_t, \ldots, I_N\}$, are represented by a sequence of binary masks $\{A_1, \ldots, A_t, \ldots, A_N\}$, with $A_t$ corresponding to image $I_t$.
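As a minimal sketch of this representation, a binary mask can be stored as a grid of 0/1 labels, one per pixel, and a sequence of images maps to a list of such masks. The helper names, the tiny image size, and the bounding-box summary are illustrative assumptions and are not part of the formulation itself.

```python
def make_mask(height, width, object_pixels):
    """Build a mask A = {a_x}: a_x is 1 for pixels of the salient object, 0 otherwise."""
    mask = [[0] * width for _ in range(height)]
    for y, x in object_pixels:
        mask[y][x] = 1
    return mask


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around the pixels labeled 1, or None."""
    coords = [(y, x) for y, row in enumerate(mask) for x, a in enumerate(row) if a == 1]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


# A hypothetical two-frame sequence {I_1, I_2} in which a two-pixel object
# moves one pixel to the right; each frame I_t gets its own mask A_t.
frame_objects = [{(1, 1), (1, 2)}, {(1, 2), (1, 3)}]
masks = [make_mask(3, 5, pixels) for pixels in frame_objects]
boxes = [bounding_box(m) for m in masks]
```

Each mask has the same height and width as its image, so the label of any pixel can be read directly by its coordinates.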
In this paper, we formulate the salient object detection problem as a binary labeling task by inspecting whether [...]t present the conditional random field formulation to the single-image