Systems and methods for training machine-learned visual attention models
Published: 12-04-2023
Author(s): Bardia DOOSTI, Bradley Ray Green, Ching-Hui Chen, Raviteja Vemulapalli, Xuhui JIA, YuKun ZHU
Owned by: Google LLC
Abstract: Systems and methods of the present disclosure are directed to a method for training a machine-learned visual attention model. The method can include obtaining image data that depicts a head of a person and an additional entity. The method can include processing the image data with an encoder portion of the visual attention model to obtain latent head and entity encodings. The method can include processing the latent encodings with the visual attention model to obtain a visual attention value and processing the latent encodings with a machine-learned visual location model to obtain a visual location estimation. The method can include training the models by evaluating a loss function that evaluates differences between the visual location estimation and a pseudo visual location label derived from the image data and between the visual attention value and a ground truth visual attention label.
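The two-term loss described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of either term, so everything concrete here is an assumption made for illustration: binary cross-entropy for the attention term, squared error over 2D coordinates for the location term, and the hypothetical names `combined_loss` and `location_weight`.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Assumed attention term: BCE between a predicted value p in (0, 1) and a label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def squared_error(pred, target):
    """Assumed location term: squared distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(pred, target))


def combined_loss(attention_value, attention_label,
                  location_estimate, pseudo_location_label,
                  location_weight=1.0):
    """Sum of the two differences named in the abstract.

    location_weight is a hypothetical balancing factor, not taken from the source.
    """
    attention_term = binary_cross_entropy(attention_value, attention_label)
    location_term = squared_error(location_estimate, pseudo_location_label)
    return attention_term + location_weight * location_term


if __name__ == "__main__":
    # Illustrative values only: one sample with a 2D location estimate.
    loss = combined_loss(0.8, 1.0, (0.40, 0.55), (0.50, 0.50))
    print(f"combined loss: {loss:.4f}")
```

In a real training setup both terms would be computed over batches of model outputs and minimized jointly, so that the pseudo location label supervises the location model while the ground truth label supervises the attention value.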
Systems and methods for training machine-learned visual attention models
Patent number: WO2022031261A1. Authors: YuKun ZHU, Ching-Hui Chen, Bradley Ray Green, Raviteja Vemulapalli, Xuhui JIA, Bardia DOOSTI. Owner: Google LLC. Publication date: 2022-02-10.