Efficient cross-platform serving of deep neural networks for low latency applications
Patent number: EP4356306A1
Published: 24-04-2024
Author(s): Aaron Andalman
Assignee: Cognitiv Corp
Abstract: Systems, apparatuses, and methods for implementing an inference or prediction process using a recurrent neural network (RNN), particularly advantageous for low-latency applications. Embodiments introduce an RNN-based system with a fixed inference time (i.e., a constant computation time for the inference stage) that is independent of the input data sequence length. Embodiments may be used to implement real-time data mapping and management and to perform an inference strategy that enables the system to serve different types of models, including sequential deep neural networks, for low-latency (i.e., real-time or near-real-time) applications.
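The abstract does not say how the fixed inference time is achieved. As an illustration only, the sketch below shows one common way to make per-request RNN cost independent of sequence length: keep the hidden state for each sequence in a store and apply a single recurrent step when a new input arrives, rather than replaying the whole history. Everything here is an assumption made for the example and is not taken from the patent: the class name `StatefulRNNServer`, the in-memory `state_store` dictionary, the toy Elman cell, and the random weights.

```python
import math
import random


class StatefulRNNServer:
    """Toy Elman RNN that keeps one hidden state per sequence key (hypothetical)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Randomly initialised weights stand in for a trained model.
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
                     for _ in range(hidden_size)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
        # Assumed in-memory store: sequence key -> last hidden state.
        self.state_store = {}

    def _step(self, h, x):
        # One recurrent update: h' = tanh(W_in x + W_rec h).
        return [
            math.tanh(
                sum(w * xi for w, xi in zip(self.w_in[j], x))
                + sum(w * hi for w, hi in zip(self.w_rec[j], h))
            )
            for j in range(self.hidden_size)
        ]

    def _readout(self, h):
        # Sigmoid readout producing a single score in (0, 1).
        z = sum(w * hi for w, hi in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-z))

    def infer(self, key, x):
        """Apply one step from the cached state; work is the same for any history length."""
        h = self.state_store.get(key, [0.0] * self.hidden_size)
        h = self._step(h, x)
        self.state_store[key] = h
        return self._readout(h)

    def infer_full_replay(self, sequence):
        """Baseline for comparison: recompute from scratch, work grows with len(sequence)."""
        h = [0.0] * self.hidden_size
        for x in sequence:
            h = self._step(h, x)
        return self._readout(h)


server = StatefulRNNServer(input_size=3, hidden_size=4)
events = [[1.0, 0.0, 0.5], [0.2, 0.1, 0.0], [0.0, 1.0, 1.0]]

# Serve the events one at a time, each with a single recurrent step.
incremental = [server.infer("user-1", x) for x in events]

# Replaying the full history gives the same final score at higher cost.
replayed = server.infer_full_replay(events)
```

In this sketch the incremental path and the full replay produce the same final score, but the incremental path does a fixed amount of work per request. A production system would presumably replace the dictionary with a low-latency key-value store; that choice is likewise an assumption and not something the abstract specifies.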
Efficient cross-platform serving of deep neural networks for low latency applications
Patent number: CA3221902A1. Author: Aaron Andalman. Owner: Cognitiv Corp. Publication date: 2022-12-22.