Publication date: 11-06-2020
Number: WO2020117028A1
Assignee:
A query response method performed by a query response device comprises the steps of: grouping the image frames, audio data, and subtitle data included in the video data of a data set into shot units, each shot corresponding to a single subtitle; extracting a shot feature vector by calculating a feature vector for each of the image frames, audio data, and subtitle data included in a shot; extracting, from a query-response pair included in the data set, a feature vector of the question data and a feature vector of each of a plurality of pieces of option data corresponding to the question data; calculating a video feature vector by inputting the shot feature vector to a neural network including a plurality of layers, assigning an attention weight calculated on the basis of the question data to the output vector of each layer, and summing the results; and selecting a final answer from among the plurality of pieces of option data on the basis of a similarity between the video feature ...
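The last two steps of the abstract (question-guided attention over layer outputs, then similarity-based answer selection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the patented method: the per-layer output vectors are given directly instead of coming from a trained network, the attention score is a dot product with the question vector normalized by softmax, the similarity is cosine similarity, and the video feature vector is compared against the option feature vectors (the abstract is truncated before it names the second operand). The function names `video_feature` and `select_answer` are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def video_feature(layer_outputs, question_vec):
    """Attention-weighted sum of per-layer output vectors.

    Assumption: the attention weight of a layer is the softmax of the
    dot product between that layer's output and the question vector.
    """
    weights = softmax([dot(h, question_vec) for h in layer_outputs])
    dim = len(question_vec)
    return [
        sum(w * h[i] for w, h in zip(weights, layer_outputs))
        for i in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def select_answer(video_vec, option_vecs):
    """Index of the option most similar to the video vector.

    Assumption: similarity is measured against the option feature vectors.
    """
    sims = [cosine(video_vec, o) for o in option_vecs]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy data: two layer outputs, one question vector, three options.
layer_outputs = [[1.0, 0.0], [0.0, 1.0]]
question_vec = [2.0, 0.0]
option_vecs = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]

video_vec = video_feature(layer_outputs, question_vec)
answer_index = select_answer(video_vec, option_vecs)
```

With this toy data the first layer output aligns with the question, so it receives the larger attention weight and the selected option is the one pointing in the same direction.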