Total found: 268. Displayed: 102.
Publication date: 11-09-2013

Heterogeneous multi-core thread scheduling method, heterogeneous multi-core thread scheduling system and heterogeneous multi-core processor

Number: CN103294550A

The invention relates to a heterogeneous multi-core thread scheduling method. The method generates ranking lists for the threads and the cores according to the dynamic characteristics of a program, finds an optimal stable match between the threads and the cores according to the ranking lists, and performs thread scheduling according to that stable match. Specifically, the method includes: receiving characteristic vectors of the threads running on the cores, and selecting a priority ranking of the cores for each thread according to the characteristic vectors; ranking each thread for each core; receiving the ranking lists of the threads and the cores, and finding the stable match results of the threads and the cores; and, on receiving the stable match results, scheduling by the operating system and allocating each thread to its corresponding core for running. The huge expenditure caused by sampling scheduling ...
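The thread-to-core stable matching described above can be found with the Gale-Shapley algorithm. A minimal Python sketch, not the patent's implementation: it assumes equal numbers of threads and cores and complete preference lists, and all names are illustrative.

```python
def stable_match(thread_prefs, core_prefs):
    """Gale-Shapley stable matching: threads 'propose' to cores in
    preference order until no thread/core pair would both rather be
    matched to each other than to their assigned partners."""
    rank = {c: {t: i for i, t in enumerate(prefs)}
            for c, prefs in core_prefs.items()}
    free = list(thread_prefs)                   # threads still unassigned
    next_choice = {t: 0 for t in thread_prefs}  # next core each thread tries
    engaged = {}                                # core -> thread
    while free:
        t = free.pop()
        c = thread_prefs[t][next_choice[t]]
        next_choice[t] += 1
        if c not in engaged:
            engaged[c] = t
        elif rank[c][t] < rank[c][engaged[c]]:  # core prefers the newcomer
            free.append(engaged[c])
            engaged[c] = t
        else:
            free.append(t)                      # rejected; tries next core
    return {t: c for c, t in engaged.items()}   # thread -> core
```

With two threads that both prefer a hypothetical `big` core, the core's own ranking breaks the tie and the other thread settles for `little`.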

Publication date: 18-09-2013

Network congestion information transmission method and device

Number: CN103312603A

The invention provides a network congestion information transmission method and device. The method comprises the following steps: a first node receives, over the main network, a data packet sent by a second node, the data packet carrying first congestion information, and the first node obtains the first congestion information from the packet. Because the data packets transmitted between the nodes carry the first congestion information, the congestion information is transmitted without adding a dedicated additional network, so the power consumption and area overhead of the network-on-chip need not be increased.
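The piggybacking idea reduces to carrying a congestion field inside the ordinary data packet. A toy Python sketch; all field and function names are hypothetical, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """A data packet that piggybacks congestion information on the main
    network instead of using a dedicated side network."""
    payload: bytes
    congestion: int = 0   # "first congestion information", e.g. a queue depth

def receive(packet: Packet):
    """First node: deliver the payload and extract the carried
    congestion information from the same packet."""
    return packet.payload, packet.congestion
```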

Publication date: 17-01-2019

METHOD AND DEVICE FOR ON-CHIP REPETITIVE ADDRESSING

Number: US20190018766A1

The present disclosure may include a method that comprises: partitioning data on an on-chip and/or an off-chip storage medium into different data blocks according to a pre-determined data partitioning principle, wherein data with a reuse distance less than a pre-determined distance threshold value is partitioned into the same data block; and a data indexing step for successively loading the different data blocks to at least one on-chip processing unit according to a pre-determined ordinal relation of a replacement policy, wherein the repeated data in a loaded data block is subjected to on-chip repetitive addressing. Because data with a reuse distance less than the threshold is partitioned into the same data block, that block can be loaded onto the chip once and then used as many times as possible, making access more efficient.
1. An on-chip repetitive addressing means, comprising:
a data partitioning step for partitioning data on an on-chip storage medium and/or an off-chip storage medium into different data blocks according to a pre-determined data partitioning principle, wherein, on the basis of the principle, data with a reuse distance less than a pre-determined distance threshold value is partitioned into the same data block; and
a data indexing step for successively loading the different data blocks to at least one on-chip processing unit according to a pre-determined ordinal relation of a replacement policy, wherein the repeated data in a loaded data block is subjected to on-chip repetitive addressing.
2. The on-chip repetitive addressing means according to claim 1, wherein an index address for a datum consists of a data block address and an in-block address; the data indexing step comprises successively loading the different data blocks to the at least one on-chip processing unit according to the pre-determined ordinal relation of the replacement policy ...
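The reuse-distance rule for forming data blocks can be sketched as follows. The access-trace format, the two-way hot/cold split, and the threshold are illustrative assumptions, not the patent's partitioning principle:

```python
from collections import defaultdict

def reuse_distances(trace):
    """For each address, the minimum gap (in accesses) between two
    consecutive uses; addresses used once never enter the result."""
    last_seen, dist = {}, defaultdict(lambda: float("inf"))
    for i, addr in enumerate(trace):
        if addr in last_seen:
            dist[addr] = min(dist[addr], i - last_seen[addr])
        last_seen[addr] = i
    return dist

def partition(trace, threshold):
    """Addresses whose reuse distance is below the threshold go into the
    same 'hot' block (kept on-chip); everything else is 'cold'."""
    dist = reuse_distances(trace)
    hot = {a for a in dist if dist[a] < threshold}
    return hot, set(trace) - hot
```

In the trace `[1, 2, 1, 3, 2, 1]` address 1 recurs within 2 accesses, so with a threshold of 3 it lands in the hot block and can be addressed repetitively on-chip.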

Publication date: 24-01-2019

ON-CHIP DATA PARTITIONING READ-WRITE METHOD, SYSTEM, AND DEVICE

Number: US20190026246A1

The present invention is directed to the technical field of storage and discloses an on-chip data partitioning read-write method. The method comprises: a data partitioning step for storing on-chip data in different areas, in an on-chip storage medium and an off-chip storage medium respectively, based on a data partitioning strategy; a pre-operation step for processing the on-chip address index of the on-chip storage data in advance when implementing data splicing; and a data splicing step for splicing the on-chip storage data and the off-chip input data, based on a data splicing strategy, to obtain a representation of the original data. A corresponding on-chip data partitioning read-write system and device are also provided. Reads and writes of repeated data can thus be realized efficiently, reducing memory-access bandwidth requirements while providing good flexibility and thereby reducing on-chip storage overhead.
1. An on-chip data partitioning read-write method, comprising:
a data partitioning step for storing on-chip data in different areas, in an on-chip storage medium and an off-chip storage medium respectively, based on a data partitioning strategy;
a pre-operation step for processing the on-chip address index of the on-chip storage data in advance when implementing data splicing; and
a data splicing step for splicing the on-chip storage data and the off-chip input data, based on a data splicing strategy, to obtain a representation of the original data.
2. The on-chip data partitioning read-write method according to claim 1, further comprising a data storing step for storing and carrying the on-chip storage data of the on-chip storage medium and the off-chip input data from the off-chip storage medium; read-write ports are separated in the data storing step, and reads and writes of the data are independent of each other; the pre-operation step further comprises: ...
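The splicing step amounts to interleaving on-chip and off-chip elements back into their original order. A minimal sketch under the assumption, not stated in the patent, that a boolean index records where each element lives:

```python
def splice(on_chip, off_chip, index):
    """Rebuild the original sequence: index[i] says whether element i
    is held in the on-chip (True) or off-chip (False) medium."""
    it_on, it_off = iter(on_chip), iter(off_chip)
    return [next(it_on) if on else next(it_off) for on in index]
```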

Publication date: 24-01-2019

NEURAL NETWORK ACCELERATOR AND OPERATION METHOD THEREOF

Number: US20190026626A1

A neural network accelerator and an operation method thereof, applicable in the field of neural network algorithms, are disclosed. The neural network accelerator comprises an on-chip storage medium for storing data transmitted from outside or generated during computing; an on-chip address index module for mapping to a correct storage address on the basis of an input index when an operation is performed; a core computing module for performing a neural network operation; and a multi-ALU device for obtaining input data from the core computing module or the on-chip storage medium to perform a nonlinear operation which cannot be completed by the core computing module. By introducing a multi-ALU design into the neural network accelerator, the operation speed of nonlinear operations is increased, making the accelerator more efficient.
1. A neural network accelerator, comprising:
an on-chip storage medium for storing data transmitted from outside the neural network accelerator or data generated during computation;
an on-chip address index module for mapping to a correct storage address on the basis of an input index when an operation is performed;
a core computing module for performing a linear operation of a neural network operation; and
a multi-ALU device for obtaining input data from the core computing module or the on-chip storage medium to perform a nonlinear operation which cannot be performed by the core computing module.
2. The neural network accelerator according to claim 1, wherein the data generated during computation comprises a computation result or an intermediate computation result.
3. The neural network accelerator according to claim 1, wherein the multi-ALU device comprises an input mapping unit, a plurality of arithmetic logic units (ALUs), and an output mapping unit; the input mapping unit is configured for mapping the input data obtained from the on-chip storage medium or the core ...
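As a rough illustration of the division of labor, a hypothetical core module computes the linear part while parallel ALU lanes apply a nonlinearity (sigmoid is chosen arbitrarily; the lane mapping is an assumption):

```python
import math

def core_linear(weights, x):
    """Core computing module: the linear part (one dot product per output)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def multi_alu(values, alus=4):
    """Multi-ALU device: a nonlinear op applied with the values mapped
    round-robin onto `alus` parallel lanes (sequentially simulated)."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    out = [None] * len(values)
    for lane in range(alus):                  # each lane is one ALU
        for i in range(lane, len(values), alus):
            out[i] = sigmoid(values[i])
    return out
```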

Publication date: 30-01-2020

COMPUTING APPARATUS AND RELATED PRODUCT

Number: US20200034698A1

The present application provides an operation device and related products. The operation device is configured to execute operations of a network model, where the network model includes a neural network model and/or a non-neural network model. The operation device comprises an operation unit, a controller unit, and a storage unit, where the storage unit includes a data input unit, a storage medium, and a scalar data storage unit. The technical solution provided by this application has the advantages of fast calculation speed and low energy consumption.
1. An operation device, comprising:
a storage unit configured to store data and instructions;
a controller unit configured to extract from the storage unit a first instruction, including sorting instructions or sparse processing instructions, and first data corresponding to the first instruction, including input neuron data and weight data; and
an operation unit configured to, in response to the first instruction, perform an operation corresponding to the first instruction on the input neuron data and the weight data to obtain an operation result.
2. The device according to claim 1, wherein the controller unit comprises an instruction buffer unit configured to buffer the instructions and an instruction processing unit configured to implement a decoding function.
3. The device according to claim 2, further comprising a configuration parsing unit and a mapping unit; when the first instruction is the sparse processing instruction and the first data further includes preset configuration data:
the configuration parsing unit is configured to set a mapping mode according to the preset configuration data;
the mapping unit is configured to perform mapping processing on the input neuron data and the weight data according to the mapping mode to obtain an input neuron-weight pair, which is a mapping relationship between the input neuron data and the weight data after the mapping processing;
the instruction buffer unit is ...

Publication date: 14-02-2019

Apparatus and methods for non-linear function operations

Number: US20190050369A1

A nonlinear function operation device and method are provided. The device may include a table looking-up module and a linear fitting module. The table looking-up module may be configured to acquire a first address of a slope value k and a second address of an intercept value b based on a floating-point number. The linear fitting module may be configured to obtain a linear function expressed as y=k×x+b based on the slope value k and the intercept value b, and substitute the floating-point number into the linear function to calculate a function value of the linear function, wherein the calculated function value is determined as the function value of a nonlinear function corresponding to the floating-point number.
1. A nonlinear function operation device, comprising:
a table looking-up module configured to acquire a first address of a slope value k and a second address of an intercept value b based on a floating-point number; and
a linear fitting module configured to obtain a linear function expressed as y=k×x+b based on the slope value k and the intercept value b, and substitute the floating-point number into the linear function to calculate a function value of the linear function, wherein the calculated function value is determined as the function value of a nonlinear function corresponding to the floating-point number.
2. The device of claim 1, further comprising a slope and intercept storing module configured to store multiple slope values and multiple intercept values of a plurality of linear functions, wherein the plurality of linear functions are obtained through piecewise-linear fitting of the nonlinear function, and wherein the table looking-up module comprises a selecting module configured to acquire the first address of the slope value k and the second address of the intercept value b based on the floating-point number.
3. The device of claim 2, wherein the table looking-up module is further configured to acquire the slope value k and the intercept value b ...
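The scheme can be sketched by tabulating slope/intercept pairs for a piecewise-linear fit of sigmoid. The choice of function, range [-8, 8), and segment count are illustrative assumptions, not values from the patent:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Build the (k, b) table: within each segment, y ≈ k*x + b.
LO, HI, SEGMENTS = -8.0, 8.0, 64
STEP = (HI - LO) / SEGMENTS
TABLE = []
for i in range(SEGMENTS):
    x0, x1 = LO + i * STEP, LO + (i + 1) * STEP
    k = (sigmoid(x1) - sigmoid(x0)) / STEP   # slope over the segment
    TABLE.append((k, sigmoid(x0) - k * x0))  # (slope, intercept)

def sigmoid_pwl(x):
    # table looking-up: the segment index acts as the address of (k, b)
    i = min(max(int((x - LO) / STEP), 0), SEGMENTS - 1)
    k, b = TABLE[i]
    return k * x + b                         # linear fitting step
```

With 64 segments the approximation error for sigmoid stays well below 1e-2 across the table's range.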

Publication date: 14-02-2019

Apparatus for operations at maxout layer of neural networks

Number: US20190050736A1

Aspects for maxout layer operations in neural networks are described herein. The aspects may include a load/store unit configured to retrieve input data from a storage module. The input data may be formatted as a three-dimensional vector that includes one or more feature values stored in a feature dimension of the three-dimensional vector. The aspects may further include a pruning unit configured to divide the one or more feature values into one or more feature groups based on one or more data ranges and select a maximum feature value from each of the one or more feature groups. Further still, the pruning unit may be configured to delete, in each of the one or more feature groups, feature values other than the maximum feature value and update the input data with the one or more maximum feature values.
1. An apparatus for data pruning at a maxout layer of a neural network, comprising:
a load/store unit configured to retrieve input data from a storage module, wherein the input data is formatted as a three-dimensional vector that includes one or more feature values stored in a feature dimension of the three-dimensional vector; and
a pruning unit configured to divide the one or more feature values into one or more feature groups based on one or more data ranges, select a maximum feature value from each of the one or more feature groups, delete, in each of the one or more feature groups, feature values other than the maximum feature value, and update the input data with the one or more maximum feature values.
2. The apparatus of claim 1, wherein the input data further includes an abscissa and an ordinate.
3. The apparatus of claim 1, further comprising a data conversion unit configured to adjust a write sequence in storing the input data.
4. The apparatus of claim 3, wherein the load/store unit is further configured to store the one or more feature values prior to storing data in other dimensions of the input data in accordance with the adjusted write ...
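The pruning step is essentially a grouped max. A minimal sketch, under the simplifying assumption that the feature groups are contiguous and of equal size:

```python
def maxout(features, group_size):
    """Divide feature values into contiguous groups and keep only each
    group's maximum (the pruning step of a maxout layer)."""
    if len(features) % group_size:
        raise ValueError("feature count must be divisible by group size")
    return [max(features[i:i + group_size])
            for i in range(0, len(features), group_size)]
```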

Publication date: 13-02-2020

Apparatus and methods for matrix multiplication

Number: US20200050453A1
Assignee: Cambricon Technologies Corp Ltd

Aspects for matrix multiplication in neural networks are described herein. The aspects may include a controller unit configured to receive a matrix-multiply-matrix (MM) instruction that includes a first starting address of a first matrix, a first size of the first matrix, a second starting address of a second matrix, and a second size of the second matrix; multiple computation modules configured to respectively multiply, in response to the MM instruction, row vectors of the first matrix with column vectors of the second matrix to generate one or more result elements; and an interconnection unit configured to combine the result elements to generate one or more row vectors of a result matrix.
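A sequential Python sketch of the row-times-column split across computation modules; the module count and the round-robin row assignment are illustrative assumptions, not the patent's scheme:

```python
def mm_instruction(a, b, modules=2):
    """Each 'computation module' multiplies its share of A's row vectors
    with B's column vectors; an 'interconnection unit' then reassembles
    the result rows in their original order."""
    cols = list(zip(*b))                       # column vectors of B
    def module(rows):                          # one computation module
        return [[sum(x * y for x, y in zip(r, c)) for c in cols]
                for r in rows]
    partial = [module(a[m::modules]) for m in range(modules)]
    out = [None] * len(a)                      # interconnection unit
    for m, rows in enumerate(partial):
        for j, row in enumerate(rows):
            out[m + j * modules] = row
    return out
```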

Publication date: 13-02-2020

PROCESSING APPARATUS AND PROCESSING METHOD

Number: US20200050918A1

A processing device with dynamically configurable operation bit width is characterized by comprising: a memory for storing data, the data comprising data to be operated on, intermediate operation results, final operation results, and data to be buffered in a neural network; a data width adjustment circuit for adjusting the width of the data to be operated on, the intermediate operation results, the final operation results, and/or the data to be buffered; an operation circuit for operating on the data, including performing operations on data of different bit widths by using adder circuits and multipliers; and a control circuit for controlling the memory, the data width adjustment circuit, and the operation circuit. The device of the present disclosure has the advantages of strong flexibility, high configurability, fast operation speed, and low power consumption.
1. A processing device with dynamically configurable operation bit width, comprising:
a memory for storing data, the data comprising data to be operated on, intermediate operation results, final operation results, and data to be buffered of a neural network;
a data width adjustment circuit for adjusting the width of the data to be operated on, the intermediate operation results, the final operation results, and/or the data to be buffered;
an operation circuit for operating on the data of the neural network; and
a control circuit for controlling the memory, the data width adjustment circuit, and the operation circuit.
2. The device according to claim 1, wherein operating on the data of the neural network comprises determining the type of multiplier circuit and adder circuit of the operation circuit according to the data so as to perform the operation.
3. The device according to claim 1, wherein the data width adjustment circuit comprises:
an input data processing module configured to adjust the data width of the data in the memory; ...
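As a toy stand-in for the data width adjustment circuit (the rounding and saturation behavior are assumptions, not taken from the patent), values can be clamped into a signed integer range of a configurable bit width:

```python
def adjust_width(values, bits):
    """Round and saturate values into the signed range of `bits` bits,
    e.g. [-128, 127] for bits=8."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(lo, min(hi, int(round(v)))) for v in values]
```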

Publication date: 13-02-2020

MULTIPLICATION AND ADDITION DEVICE FOR MATRICES, NEURAL NETWORK COMPUTING DEVICE, AND METHOD

Number: US20200050927A1

Aspects of a neural network operation device are described herein. The aspects may include a matrix element storage module configured to receive a first matrix that includes one or more first values, each of the first values being represented as a sequence that includes one or more bits. The matrix element storage module may be further configured to respectively store the one or more bits in one or more storage spaces in accordance with the positions of the bits in the sequence. The aspects may further include a numeric operation module configured to calculate an intermediate result for each storage space based on one or more second values in a second matrix, and an accumulation module configured to sum the intermediate results to generate an output value.
1. A neural network operation device, comprising:
a submatrix divider circuit configured to select a portion of an input data matrix as an input submatrix;
a matrix element memory configured to receive a convolution kernel matrix that includes one or more kernel values, wherein each of the one or more kernel values is represented as a sequence that includes one or more bits, and respectively store the one or more bits in one or more storage spaces in accordance with the positions of the one or more bits in the sequence;
a calculator circuit configured to calculate an intermediate result for each storage space based on one or more input elements in the input submatrix, wherein the one or more input elements correspond to non-zero values stored in the storage space;
an accumulator circuit configured to sum the intermediate results to generate an output value; and
a convolution result assembler circuit configured to assemble the output values calculated for different portions of the input data matrix to generate an output matrix.
2. The neural network operation device of claim 1, wherein the submatrix divider circuit is further configured to select the input submatrix in accordance with the convolution kernel matrix.
3. The ...
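The bit-plane idea can be sketched as a scalar multiply-accumulate, for illustration only: each bit position of the kernel values yields an intermediate sum over the inputs whose kernel bit is set, and the per-plane sums are shift-accumulated.

```python
def bitplane_multiply(kernel, inputs):
    """Compute sum(k * x) by decomposing the (non-negative integer)
    kernel values into bit planes: per bit position, sum the inputs
    whose kernel bit is set, then shift-accumulate the planes."""
    width = max(kernel).bit_length() or 1
    total = 0
    for bit in range(width):
        plane = sum(x for k, x in zip(kernel, inputs) if (k >> bit) & 1)
        total += plane << bit                  # weight of this bit plane
    return total
```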

Publication date: 21-02-2019

Apparatus and methods for submatrix operations

Number: US20190057063A1

Aspects for submatrix operations in neural networks are described herein. The aspects may include a controller unit configured to receive a submatrix instruction. The submatrix instruction may include a starting address of a submatrix of a matrix, a width of the submatrix, a height of the submatrix, and a stride that indicates the position of the submatrix relative to the matrix. The aspects may further include a computation module configured to select one or more values from the matrix as elements of the submatrix in accordance with the starting address of the matrix, the starting address of the submatrix, the width of the submatrix, the height of the submatrix, and the stride.
1. An apparatus for submatrix operations in a neural network, comprising:
a controller unit configured to receive a submatrix instruction, wherein the submatrix instruction includes a starting address of a submatrix of a matrix, a width of the submatrix, a height of the submatrix, and a stride that indicates the position of the submatrix relative to the matrix; and
a computation module configured to select one or more values from the matrix as elements of the submatrix in accordance with the starting address of the submatrix, the width of the submatrix, the height of the submatrix, and the stride.
2. The apparatus of claim 1, further comprising a matrix caching unit configured to store the matrix that includes one or more matrix elements.
3. The apparatus of claim 1, further comprising an instruction register configured to store the starting address of the submatrix, the width of the submatrix, the height of the submatrix, and the stride.
4. The apparatus of claim 1, wherein the submatrix instruction is an instruction selected from a submatrix-multiply-vector (SMMV) instruction, a vector-multiply-submatrix (VMSM) instruction, a submatrix-multiply-scalar (SMMS) instruction, a TENS instruction, a submatrix-addition (SMA) instruction ...
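Selecting submatrix elements out of flat row-major storage with a starting address, width, height, and stride can be sketched as follows (the flat-memory model is an illustrative assumption):

```python
def load_submatrix(memory, start, width, height, stride):
    """Read a height x width submatrix from flat row-major storage:
    consecutive submatrix rows begin every `stride` elements,
    starting at `start`."""
    return [memory[start + r * stride : start + r * stride + width]
            for r in range(height)]
```

For a 3x4 matrix stored as `list(range(12))`, a stride of 4 steps one full matrix row at a time.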

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057647A1

A convolution operation method and a processing device for performing the same are provided. The method is performed by a processing device that includes a main processing circuit and a plurality of basic processing circuits; the basic processing circuits are configured to perform the convolution operation in parallel. The technical solutions disclosed by the present disclosure provide short operation time and low energy consumption.
1. A convolution operation method performed by a processing device, comprising:
receiving, by the processing device, input data and a weight, wherein the input data is four-dimensional data arranged in dimensions N, C, H, and W, wherein dimension C is between the outermost layer and the innermost layer of the input data;
rearranging, by the processing device, the input data such that dimension C becomes the innermost layer of the input data; and
performing, by the processing device, a convolution operation between the rearranged input data and the weight.
2. The convolution operation method of claim 1, wherein the input data is in the dimension order NCHW, and rearranging the input data further includes changing the dimension order of the input data to NHWC or NWHC.
3. The convolution operation method of claim 1, wherein the processing device includes a main processing circuit and a plurality of basic processing circuits, and performing the convolution operation between the input data and the weight further includes:
dividing, by the main processing circuit, the weight into a plurality of basic data blocks;
distributing, by the main processing circuit, the plurality of basic data blocks to the basic processing circuits; and
broadcasting, by the main processing circuit, respective parts of data in the input data to the basic processing circuits.
4. The convolution operation method of claim 3, wherein performing the convolution operation between the input data and the weight further includes: ...
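The rearrangement that makes C innermost is the familiar NCHW-to-NHWC permutation. A pure-Python sketch on nested lists (real implementations would permute strides instead of copying):

```python
def nchw_to_nhwc(data):
    """Rearrange a nested-list tensor from N,C,H,W order to N,H,W,C so
    that the channel dimension C becomes the innermost, fastest-varying
    one."""
    return [[[[data[n][c][h][w]
               for c in range(len(data[n]))]       # C moves innermost
              for w in range(len(data[n][0][h]))]
             for h in range(len(data[n][0]))]
            for n in range(len(data))]
```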

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057648A1

A convolution operation method and a processing device for performing the same are provided. The method is performed by a processing device that includes a main processing circuit and a plurality of basic processing circuits; the basic processing circuits are configured to perform the convolution operation in parallel. The technical solutions disclosed by the present disclosure provide short operation time and low energy consumption.
1. A convolution operation method performed by a processing device, wherein the processing device comprises a main processing circuit and a plurality of basic processing circuits, and wherein the convolution operation method comprises:
receiving, by the main processing circuit, input data and a weight;
dividing, by the main processing circuit, the weight into a plurality of basic data blocks;
distributing, by the main processing circuit, the plurality of basic data blocks to the basic processing circuits;
broadcasting, by the main processing circuit, respective parts of data in the input data to the basic processing circuits;
performing, by the basic processing circuits, operations on the respective parts of data and the distributed basic data blocks to obtain operation results;
providing, by the basic processing circuits, the operation results to the main processing circuit; and
obtaining, by the main processing circuit, a calculation result of the convolution operation according to the operation results of the basic processing circuits.
2. The convolution operation method of claim 1, wherein the input data and the weight are four-dimensional data blocks; the input data is arranged in dimensions N, C, H, and W, and the weight is arranged in dimensions M, C, KH, and KW.
3. The convolution operation method of claim 2, wherein the weight includes M convolution kernels, and dividing the weight into a plurality of basic data blocks ...
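The distribute-and-broadcast pattern can be sketched sequentially; the round-robin distribution and the plain inner-product operation are illustrative simplifications, not the patent's exact scheme:

```python
def conv_distribute(weight_blocks, input_part, n_units=2):
    """Main circuit distributes weight blocks round-robin to basic
    circuits and broadcasts the same input part to all of them; each
    basic circuit computes inner products, and the results are gathered
    per unit (so the order is unit-major)."""
    units = [weight_blocks[i::n_units] for i in range(n_units)]  # distribute
    results = []
    for unit_blocks in units:          # each basic processing circuit...
        for block in unit_blocks:      # ...sees the broadcast input_part
            results.append(sum(w * x for w, x in zip(block, input_part)))
    return results                     # main circuit combines these
```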

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057649A1

A pooling operation method and a processing device for performing the same are provided. The pooling operation method may rearrange the dimension order of the input data before pooling is performed. The technical solutions provided by the present disclosure have the advantages of short operation time and low energy consumption.
1. A pooling operation method performed by a processing device, comprising:
receiving, by the processing device, input data, wherein the input data is four-dimensional data arranged in dimensions N, C, H, and W, wherein dimension C is between the outermost layer and the innermost layer of the input data;
rearranging, by the processing device, the input data such that dimension C becomes the innermost layer of the input data; and
performing, by the processing device, a pooling operation on the input data.
2. The pooling operation method of claim 1, wherein the input data is in the dimension order NCHW, and rearranging the input data further includes changing the dimension order of the input data to NHWC or NWHC.
3. The pooling operation method of claim 1, wherein the processing device includes a main processing circuit and a plurality of basic processing circuits, and performing the pooling operation on the input data further includes:
obtaining, by the main processing circuit, the data to be calculated by sliding an operation window in dimensions H and W of the rearranged input data; and
performing the pooling operation on the data to be calculated at each position of the operation window as it slides.
4. The pooling operation method of claim 3, wherein obtaining the data to be calculated further includes:
sliding the operation window in dimension W after sliding it in dimension H, or
sliding the operation window in dimension H after sliding it in dimension W.
5. The pooling operation method of claim 1, wherein the pooling operation is a maximum value operation or an average value ...
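The sliding-window pooling over H and W can be sketched on a single 2-D slice; the stride equal to the window size and the absence of padding are both assumptions for brevity:

```python
def max_pool2d(x, k):
    """Slide a k x k window over the H and W dimensions of a 2-D slice
    and keep the maximum at each window position (stride = k,
    no padding)."""
    h, w = len(x), len(x[0])
    return [[max(x[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(0, w - k + 1, k)]
            for i in range(0, h - k + 1, k)]
```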

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057650A1

A fully connected operation method and a processing device for performing the same are provided. The fully connected operation method designates distribution data and broadcast data. The distribution data is divided into basic data blocks and distributed to parallel processing units, and the broadcast data is broadcast to the parallel processing units. Operations between the basic data blocks and the broadcast data are carried out by the parallel processing units before the results are returned to a main unit for further processing. The technical solutions disclosed by the present disclosure provide short operation time and low energy consumption.
1. A fully connected operation method performed by a processing device, wherein the processing device comprises a main processing circuit and a plurality of basic processing circuits, and the method comprises:
receiving, by the main processing circuit, input data, a weight, and a fully connected operation instruction;
designating, by the main processing circuit, one of the input data and the weight as distribution data and the other as broadcasting data;
dividing, by the main processing circuit, the distribution data into M basic data blocks;
distributing, by the main processing circuit, the M basic data blocks to the plurality of basic processing circuits;
broadcasting, by the main processing circuit, the broadcasting data to the plurality of basic processing circuits;
performing, by the plurality of basic processing circuits, inner-product operations in parallel on the basic data blocks and the broadcasting data to obtain a plurality of processing results;
providing, by the plurality of basic processing circuits, the plurality of processing results to the main processing circuit; and
combining, by the main processing circuit, the plurality of processing results to obtain a computation result of the fully connected operation instruction.
2. The fully connected operation method of claim 1, wherein designating one of the ...

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057651A1

A matrix-multiplying-matrix operation method and a processing device for performing the same are provided. The matrix-multiplying-matrix method includes distributing, by a main processing circuit, basic data blocks of one matrix and broadcasting the other matrix to a plurality of the basic processing circuits. That way, the basic processing circuits can perform inner-product operations between the basic data blocks and the broadcasted matrix in parallel. The results are then provided back to main processing circuit for combining. The technical solutions proposed by the present disclosure provide short operation time and low energy consumption. 1. A matrix-multiplying-matrix operation method performed by a processing device , wherein the processing device comprises a main processing circuit and a plurality of basic processing circuits , and the method comprises:receiving, by the main processing circuit, a matrix A, a matrix B, and a multiplication instruction A*B;dividing, by the main processing circuit, the matrix A into M basic data blocks;distributing, by the main processing circuit, the M basic data blocks to the plurality of basic processing circuits;broadcasting, by the main processing circuit, the matrix B to the plurality of basic processing circuits;performing, by the plurality of basic processing circuits, inner-product operations in parallel on the basic data blocks and the matrix B to obtain a plurality of processing results;providing, by the plurality of basic processing circuits, the plurality of processing results to the main processing circuit; andcombining, by the main processing circuit, the plurality of processing results to obtain a computation result of the multiplication instruction.2. 
The matrix-multiplying-matrix operation method of claim 1 , wherein distributing the M basic data blocks to the plurality of basic processing circuits includes:distributing the M basic data blocks to the plurality of basic processing circuits non-repetitively in ...
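The same pattern applied to matrix-matrix multiplication can be sketched as below; `mat_mul_distributed` and `n_basic` are hypothetical names, and splitting A by rows is one plausible reading of "basic data blocks".

```python
import numpy as np

def mat_mul_distributed(A, B, n_basic=3):
    """Divide matrix A into M basic data blocks, broadcast matrix B whole,
    multiply each block by B (conceptually in parallel), and combine the
    row-blocks of the product in the main processing circuit."""
    blocks = np.array_split(A, n_basic, axis=0)  # distribute A
    partial = [blk @ B for blk in blocks]        # per-circuit inner products
    return np.vstack(partial)                    # main circuit combines

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))
B = rng.standard_normal((4, 3))
C = mat_mul_distributed(A, B)
```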

Publication date: 20-02-2020

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20200057652A1

A matrix-multiplying-vector operation method and a processing device for performing the same are provided. The matrix-multiplying-vector method includes distributing, by a main processing circuit, basic data blocks of the matrix and broadcasting the vector to a plurality of the basic processing circuits. That way, the basic processing circuits can perform inner-product operations between the basic data blocks and the broadcasted vector in parallel. The results are then provided back to main processing circuit for combining. The technical solutions proposed by the present disclosure provide short operation time and low energy consumption. 1. A matrix-multiplying-vector operation method performed by a processing device , wherein the processing device comprises a main processing circuit and a plurality of basic processing circuits , and the method comprises:receiving, by the main processing circuit, a matrix A, a vector B, and a matrix-multiplying-vector operation instruction;dividing, by the main processing circuit, the matrix A into M basic data blocks;distributing, by the main processing circuit, the M basic data blocks to the plurality of basic processing circuits;broadcasting, by the main processing circuit, the vector B to the plurality of basic processing circuits;performing, by the plurality of basic processing circuits, inner-product operations in parallel on the basic data blocks and the vector B to obtain a plurality of processing results;providing, by the plurality of basic processing circuits, the plurality of processing results to the main processing circuit; andcombining, by the main processing circuit, the plurality of processing results to obtain a computation result of the matrix-multiplying-vector operation instruction.2. The matrix-multiplying-vector operation method of claim 1 , wherein distributing the M basic data blocks to the plurality of basic processing circuits includes:distributing the M basic data blocks to the plurality of basic ...

Publication date: 28-02-2019

APPARATUS AND METHODS FOR GENERATING DOT PRODUCT

Number: US20190065184A1
Assignee:

Aspects for generating a dot product for two vectors in neural network are described herein. The aspects may include a controller unit configured to receive a vector load instruction that includes a first address of a first vector and a length of the first vector. The aspects may further include a direct memory access unit configured to retrieve the first vector from a storage device based on the first address of the first vector. Further still, the aspects may include a caching unit configured to store the first vector. 1. An apparatus for vector dot product computation in a neural network , comprising:a controller unit configured to receive a vector dot product instruction that includes a first address of a first vector and a second address of a second vector; and wherein the one or more multipliers are respectively configured to multiply, in response to the vector dot product instruction, a first element of the first vector with a corresponding second element of the second vector to generate one or more multiplication results; and', 'wherein the adder is configured to sum, in response to the vector dot product instruction, the multiplication results to generate a vector dot product computation result., 'a computation module that includes one or more multipliers and an adder,'}2. The apparatus of claim 1 , wherein the one or more multipliers are configured to transmit the multiplication results directly to the adder.3. The apparatus of claim 1 , wherein the vector dot product instruction further includes a length of the first vector.4. The apparatus of claim 3 , wherein the vector dot product instruction further includes a length of the second vector.5. The apparatus of claim 4 , further comprising a direct memory access unit configured toretrieve the first vector based on the first address and the length of the first vector, andretrieve the second vector based on the second address and the length of the second vector.6. The apparatus of claim 5 , further ...
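The multipliers-plus-adder structure of the claimed computation module can be mirrored in a few lines; `dot_product` is an illustrative name, not an identifier from the patent.

```python
def dot_product(first, second):
    """One multiplier per element pair, then a single adder summing the
    multiplication results, mirroring the claimed computation module."""
    assert len(first) == len(second)
    products = [a * b for a, b in zip(first, second)]  # the multipliers
    return sum(products)                               # the adder

r = dot_product([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6
```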

Publication date: 28-02-2019

Apparatus and Methods for Vector Operations

Number: US20190065187A1
Assignee:

Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a vector, wherein the vector includes one or more elements. The aspects may further include a computation module that includes one or more comparers configured to compare the one or more elements to generate an output result that satisfies a predetermined condition included in an instruction. 1. An apparatus for vector operations in a neural network , comprising:a controller unit configured to receive a vector operation instruction that indicates an address of a vector and a predetermined condition; and wherein the vector includes one or more elements, and', 'wherein the computation module includes one or more comparers configured to compare the one or more elements to generate an output result that satisfies the predetermined condition included in the vector operation instruction., 'a computation module configured to receive the vector based on the address included in the vector operation instruction,'}2. The apparatus of claim 1 ,wherein the vector operation instruction further indicates a length of the vector, andwherein the computation module is configured to retrieve the vector based on the length of the vector and the address of the vector.3. The apparatus of claim 1 , wherein the vector operation instruction includes one or more register IDs that identify one or more registers configured to store the address of the vector and a length of the vector.4. The apparatus of claim 1 , wherein the comparers are configured to select a maximum element from the one or more elements as the output result.5. The apparatus of claim 1 , wherein the comparers are configured to select a minimum element from the one or more elements as the output result.6. The apparatus of claim 1 , wherein the controller unit comprises an instruction obtaining module configured to obtain the vector operation instruction from an instruction storage device.7. 
...
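The comparer-based reduction in this entry (claims 4 and 5: select the maximum or minimum element) can be sketched as a sequential pass of pairwise comparers; `select_by_condition` and the lambda conditions are illustrative stand-ins for the "predetermined condition" in the instruction.

```python
def select_by_condition(vector, condition):
    """Pairwise comparers reduce the vector to the single element that
    satisfies the predetermined condition (e.g. maximum or minimum)."""
    result = vector[0]
    for element in vector[1:]:
        # each comparer keeps whichever element satisfies the condition
        if condition(element, result):
            result = element
    return result

vmax = select_by_condition([3, 9, 1, 7], lambda a, b: a > b)  # maximum
vmin = select_by_condition([3, 9, 1, 7], lambda a, b: a < b)  # minimum
```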

Publication date: 28-02-2019

Apparatus and Methods for Comparing Vectors

Number: US20190065189A1
Assignee:

Aspects for vector comparison in neural network are described herein. The aspects may include a direct memory access unit configured to receive a first vector and a second vector from a storage device. The first vector may include one or more first elements and the second vector may include one or more second elements. The aspects may further include a computation module that includes one or more comparers respectively configured to generate a comparison result by comparing one of the one or more first elements to a corresponding one of the one or more second elements in accordance with an instruction. 1. An apparatus for vector comparison in a neural network , comprising:a controller unit configured to receive a vector comparison instruction;a direct memory access unit configured to receive a first vector and a second vector from a storage device, wherein the first vector includes one or more first elements and the second vector includes one or more second elements; anda computation module that includes one or more comparers respectively configured to generate a comparison result by comparing one of the one or more first elements to a corresponding one of the one or more second elements in accordance with the vector comparison instruction.2. The apparatus of claim 1 , wherein the vector comparison instruction includes a first address of the first vector claim 1 , a second address of the second vector claim 1 , and an output address for an output comparison vector.3. The apparatus of claim 2 , wherein the vector comparison instruction further includes a bit length of the first vector and the second vector.4. The apparatus of claim 3 , wherein the comparison result is generated as a true value based on a determination that the one of the one or more first elements is not less than the corresponding one of the one or more second elements if the vector comparison instruction is a greater-than-equal-to (GE) instruction.5. The apparatus of claim 3 , wherein the ...
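The per-element comparers and the GE/GT/LE/LT/EQ opcode family mentioned in the claims can be sketched as follows; `compare_vectors` and the opcode table are illustrative, though the GE semantics ("true when the first element is not less than the second") come straight from claim 4.

```python
def compare_vectors(first, second, op="GE"):
    """One comparer per element pair; the opcode picks the predicate.
    A GE instruction yields a true value exactly when the first element
    is not less than the corresponding second element."""
    ops = {
        "GE": lambda a, b: a >= b,
        "GT": lambda a, b: a > b,
        "LE": lambda a, b: a <= b,
        "LT": lambda a, b: a < b,
        "EQ": lambda a, b: a == b,
    }
    compare = ops[op]
    return [compare(a, b) for a, b in zip(first, second)]

mask = compare_vectors([1, 5, 3], [2, 5, 1], op="GE")
```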

Publication date: 28-02-2019

Apparatus and Methods for Matrix Multiplication

Number: US20190065190A1
Assignee:

Aspects for matrix multiplication in neural network are described herein. The aspects may include a master computation module configured to receive a first matrix and transmit a row vector of the first matrix. In addition, the aspects may include one or more slave computation modules respectively configured to store a column vector of a second matrix, receive the row vector of the first matrix, and multiply the row vector of the first matrix with the stored column vector of the second matrix to generate a result element. Further, the aspects may include an interconnection unit configured to combine the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix and transmit the row vector of the result matrix to the master computation module. 1. An apparatus for matrix multiplication in a neural network , comprising: receive, in response to an instruction, a first matrix, and', 'transmit a row vector of the first matrix;, 'a master computation module configured to'} store a column vector of a second matrix,', 'receive the row vector of the first matrix, and', 'multiply, in response to the instruction, the row vector of the first matrix with the stored column vector of the second matrix to generate a result element; and, 'one or more slave computation modules respectively configured to'} combine the one or more result elements generated respectively by the one or more slave computation modules to generate a row vector of a result matrix, and', 'transmit the row vector of the result matrix to the master computation module., 'an interconnection unit configured to'}2. The apparatus of claim 1 , wherein the instruction is a matrix-multiply-vector (MMV) instruction that includes a first starting address of the first matrix claim 1 , a first length of the first matrix claim 1 , a first starting address of the column vector of the second matrix claim 1 , and a length of the column vector.3. 
The ...
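The master/slave/interconnection arrangement above (each slave holds one column of the second matrix, the master broadcasts one row of the first matrix at a time) can be sketched as below; `mat_mul_master_slave` is an illustrative name and the sequential loops stand in for the parallel hardware.

```python
import numpy as np

def mat_mul_master_slave(A, B):
    """Each slave module stores one column vector of B; the master transmits
    one row vector of A at a time; the interconnection unit combines the
    per-slave result elements into one row vector of the result matrix."""
    columns = [B[:, j] for j in range(B.shape[1])]     # one slave per column
    rows_out = []
    for row in A:                                      # master transmits a row
        elements = [row @ col for col in columns]      # slaves multiply in parallel
        rows_out.append(elements)                      # interconnection unit combines
    return np.array(rows_out)

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = mat_mul_master_slave(A, B)
```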

Publication date: 28-02-2019

Apparatus and Methods for Vector Based Transcendental Functions

Number: US20190065191A1
Assignee:

Aspects for vector-based transcendental functions in a neural network are described herein. The aspects may include a controller unit configured to receive a transcendental function instruction that includes an address of a vector and an operation code that identifies a transcendental function. The aspects may further include a CORDIC processor configured to receive the vector that includes one or more elements based on the address of the vector in response to the transcendental function instruction. The CORDIC processor may be further configured to apply the transcendental function to each element of the vector to generate an output vector. 1. An apparatus for neural network operations , comprising:a controller unit configured to receive a transcendental function instruction that indicates an address of a vector and an operation code that identifies a transcendental function; anda CORDIC processor configured to receive the vector that includes one or more elements based on the address of the vector in response to the transcendental function instruction, wherein the CORDIC processor is further configured to apply the transcendental function to each element of the vector to generate an output vector.2. The apparatus of claim 1 , wherein the transcendental function instruction includes one or more register IDs that identify one or more registers configured to store the address of the vector and the length of the vector.3. The apparatus of claim 1 ,wherein the transcendental function instruction further indicates a length of the vector, andwherein the CORDIC processor is configured to retrieve the vector based on the length of the vector and the address of the vector.4. The apparatus of claim 1 , wherein the CORDIC processor includes one or more CORDIC modules respectively configured to apply the transcendental function to one of the one or more elements to generate a result.5. 
The apparatus of claim 4 ,wherein the transcendental function instruction is an exponential ...
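A rotation-mode CORDIC iteration, the kind of module the claims describe applying per element, can be sketched as below; `cordic_sin_cos` and its fixed iteration count are assumptions for illustration, and the patent's CORDIC modules would cover other transcendental functions as well.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC: rotate the unit vector through a sequence of
    arctangent micro-angles using only shift-and-add style updates, then
    undo the accumulated gain. Valid for theta in roughly [-pi/2, pi/2]."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for a in angles:
        gain *= math.cos(a)               # K = prod 1/sqrt(1 + 2^-2i)
    x, y, z = 1.0, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0       # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x * gain, y * gain             # (cos theta, sin theta)

c, s = cordic_sin_cos(0.5)
```

Applying the function element-wise over a vector, as claim 4 describes, is then just a map over its elements.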

Publication date: 28-02-2019

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190065192A1
Assignee:

Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector. 1. An apparatus for vector operations in a neural network , comprising:a controller unit configured to receive a vector addition instruction that indicates a first address of a first vector, a second address of a second vector, and an operation code that indicates an operation to add the first vector and the second vector; anda computation module configured to receive the first vector and the second vector in response to the vector addition instruction based on the first address and the second address,wherein the first vector includes one or more first elements and the second vector includes one or more second elements, andwherein the computation module includes one or more adders configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results, anda combiner configured to combine the one or more addition results into an output vector.2. The apparatus of claim 1 ,wherein the vector addition instruction further indicates a first length of the first vector, andwherein the computation module is configured to retrieve the first vector based on the first address and the first length.3. 
The apparatus of claim 1 ,wherein the vector addition instruction further indicates a second length of the second vector, andwherein the computation ...
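The adders-plus-combiner structure can be sketched in a few lines; `vector_add` is an illustrative name, and the same per-element pattern carries over to the companion subtraction and multiplication entries below.

```python
def vector_add(first, second):
    """One adder per element pair; a combiner gathers the addition
    results into the output vector."""
    assert len(first) == len(second)
    # list comprehension: the adders; the list itself: the combiner's output
    return [a + b for a, b in zip(first, second)]

out = vector_add([1, 2, 3], [10, 20, 30])
```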

Publication date: 28-02-2019

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190065193A1
Assignee:

Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more subtractors and a combiner. The one or more subtractors may be configured to respectively subtract each of the first elements from a corresponding one of the second elements to generate one or more subtraction results. The combiner may be configured to combine the one or more subtraction results into an output vector. 1. An apparatus for vector operations in a neural network , comprising:a controller unit configured to receive a vector subtraction instruction that includes a first address of a first vector, a second address of a second vector, and an operation code that indicates an operation to subtract the first vector from the second vector;a computation module configured to receive the first vector and the second vector in response to the vector subtraction instruction based on the first address and the second address,wherein the first vector includes one or more first elements and the second vector includes one or more second elements, andwherein the computation module includes one or more subtractors configured to respectively subtract each of the first elements from a corresponding one of the second elements to generate one or more subtraction results, anda combiner configured to combine the one or more subtraction results into an output vector.2. The apparatus of claim 1 ,wherein the vector subtraction instruction further indicates a first length of the first vector, andwherein the computation module is configured to retrieve the first vector based on the first address and the first length.3. The apparatus of claim 1 ,wherein the vector subtraction instruction further indicates a second length of the second ...

Publication date: 28-02-2019

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190065194A1
Assignee:

Aspects for vector operations in neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more multipliers and a combiner. The one or more multipliers may be configured to respectively multiply each of the first elements with a corresponding one of the second elements to generate one or more multiplication results. The combiner may be configured to combine the one or more multiplication results into an output vector. 1. An apparatus for vector operations in a neural network , comprising:a controller unit configured to receive a vector-multiply-vector instruction that includes a first address of a first vector, a second address of a second vector, and an operation code that indicates an operation to multiply the first vector with the second vector; anda computation module configured to receive the first vector and the second vector in response to the vector-multiply-vector instruction based on the first address and the second address,wherein the first vector includes one or more first elements and the second vector includes one or more second elements, andwherein the computation module includes one or more multipliers configured to respectively multiply each of the first elements with a corresponding one of the second elements to generate one or more multiplication results, anda combiner configured to combine the one or more multiplication results into an output vector.2. The apparatus of claim 1 ,wherein the vector-multiply-vector instruction further indicates a first length of the first vector, andwherein the computation module is configured to retrieve the first vector based on the first address and the first length.3. The apparatus of claim 1 ,wherein the vector-multiply-vector instruction further indicates a ...

Publication date: 28-02-2019

PROCESSING DEVICE AND RELATED PRODUCTS

Number: US20190065208A1

A processing device and related products are disclosed. The processing device includes a main unit and a plurality of basic units in communication with the main unit. The main unit is configured to perform a first set of operations in a neural network in series, and transmit data to the plurality of basic units. The plurality of basic units are configured to receive the data transmitted from the main unit, perform a second set of operations in the neural network in parallel based on the data received from the main unit, and return operation results to the main unit. 1. A processing device , comprising:a main unit configured to perform a first set of operations in a neural network in series; and receive data transmitted from the main unit;', 'perform a second set of operations in the neural network in parallel based on the data received from the main unit; and', 'return operation results to the main unit., 'a plurality of basic units configured to2. The processing device of claim 1 , wherein the main unit is further configured to:divide the data into a distribution data block and a broadcast data block according to an operation instruction;split the distribution data block into a plurality of basic data blocks;distribute each of the plurality of basic data blocks to a corresponding basic unit; andbroadcast the broadcast data block to the plurality of basic units.3. The processing device of claim 2 , wherein the basic units are further configured to:obtain inner-product processing results by performing inner-product operations between the basic data blocks and the broadcast data block; andreturn the inner-product processing results as the operation results to the main unit.4. The processing device of claim 2 , further comprising a branch unit disposed between the main unit and at least one basic unit claim 2 , the branch unit being configured to forward data between the main unit and the at least one basic unit.5. 
The processing device of claim 2 , wherein each basic ...

Publication date: 28-02-2019

Apparatus and methods for combining vectors

Number: US20190065435A1
Assignee: Cambricon Technologies Corp Ltd

Aspects for vector combination in neural network are described herein. The aspects may include a direct memory access unit configured to receive a first vector, a second vector, and a controller vector. The first vector, the second vector, and the controller vector may each include one or more elements indexed in accordance with a same one-dimensional data structure. The aspects may further include a computation module configured to select one of the one or more control values of the controller vector, determine that the selected control value satisfies a predetermined condition, and select one of the one or more first elements that corresponds to the selected control value in the one-dimensional data structure as an output element based on a determination that the selected control value satisfies the predetermined condition.
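The index-wise selection this abstract describes amounts to a conditional merge of two vectors under a control vector; `combine_vectors` and the nonzero-as-true condition are illustrative assumptions about the "predetermined condition".

```python
def combine_vectors(first, second, controller):
    """For each index in the shared one-dimensional data structure, a
    control value that satisfies the predetermined condition (assumed
    here: nonzero) selects the first-vector element; otherwise the
    second-vector element is taken."""
    return [a if c else b for a, b, c in zip(first, second, controller)]

merged = combine_vectors([1, 2, 3], [9, 8, 7], [1, 0, 1])
```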

Publication date: 28-02-2019

APPARATUS AND METHODS FOR MATRIX ADDITION AND SUBTRACTION

Number: US20190065436A1
Assignee:

Aspects for matrix addition in a neural network are described herein. The aspects may include a controller unit configured to receive a matrix-addition instruction. The aspects may further include a computation module configured to receive a first matrix and a second matrix. The first matrix may include one or more first elements and the second matrix includes one or more second elements. The one or more first elements and the one or more second elements may be arranged in accordance with a two-dimensional data structure. The computation module may be further configured to respectively add each of the first elements to each of the second elements based on a correspondence in the two-dimensional data structure to generate one or more third elements for a third matrix. 1. An apparatus of matrix operations in a neural network , comprising:a controller unit configured to receive a matrix-addition instruction that indicates a first address of a first matrix and a second address of a second matrix; anda computation module configured to retrieve the first matrix and the second matrix from a storage device based on the first address of the first matrix and the second address of the second matrix, wherein the first matrix includes one or more first elements and the second matrix includes one or more second elements, and wherein the one or more first elements and the one or more second elements are arranged in accordance with a two-dimensional data structure, and to respectively add each of the first elements to each of the second elements based on a correspondence in the two-dimensional data structure in accordance with the matrix-addition instruction to generate one or more third elements for a third matrix.2. The apparatus of claim 1 , wherein the computation module includes a data controller configured to select a first portion of the first elements and a second portion of the second elements.3. The apparatus of claim 2 , wherein the computation module ...

Publication date: 28-02-2019

APPARATUS AND METHODS FOR MATRIX ADDITION AND SUBTRACTION

Number: US20190065437A1
Assignee:

Aspects for matrix addition in a neural network are described herein. The aspects may include a controller unit configured to receive a matrix-add-scalar instruction. The aspects may further include a computation module configured to receive a first matrix that includes one or more first elements arranged in accordance with a two-dimensional data structure, and to respectively add a scalar value to each of the first elements to generate one or more second elements for a second matrix. 1. An apparatus of matrix operations in a neural network , comprising:a controller unit configured to receive a matrix-add-scalar instruction that includes an address of the first matrix and a scalar value; anda computation module configured to receive the first matrix from a storage device based on the address of the first matrix, wherein the first matrix includes one or more first elements, and wherein the one or more first elements are arranged in accordance with a two-dimensional data structure, and to respectively add the scalar value to each of the one or more first elements of the first matrix in accordance with the matrix-add-scalar instruction to generate one or more second elements for a second matrix.2. The apparatus of claim 1 , wherein the computation module includes a data controller configured to select a portion of the first elements.3. The apparatus of claim 2 , wherein the computation module includes one or more adders configured to respectively add the scalar value to each of the portion of the first elements.4. The apparatus of claim 1 , wherein the matrix-add-scalar instruction includes a size of the first matrix and an address of the second matrix.5. 
The ...

Publication date: 28-02-2019

APPARATUS AND METHODS FOR FORWARD PROPAGATION IN FULLY CONNECTED LAYERS OF CONVOLUTIONAL NEURAL NETWORKS

Number: US20190065934A1
Assignee:

Aspects for forward propagation in fully connected layers of a convolutional artificial neural network are described herein. The aspects may include multiple slave computation modules configured to parallelly calculate multiple groups of slave output values based on an input vector received via the interconnection unit. Further, the aspects may include a master computation module connected to the multiple slave computation modules via an interconnection unit, wherein the master computation module is configured to generate an output vector based on the intermediate result vector. 1. An apparatus for forward propagation in fully connected layers of a multilayer neural network , comprising:a master computation module configured to transmit an input vector via an interconnection unit; and to receive an intermediate result vector combined by the interconnection unit based on the multiple groups of slave output values calculated by the multiple slave computation modules, and', 'to generate an output vector based on the intermediate result vector., 'one or more slave computation modules configured to parallelly calculate multiple groups of slave output values based on the input vector received via the interconnection unit, wherein the master computation module is configured'}2. The apparatus of claim 1 , wherein the master computation module is configured to perform one operation selected from the group consisting of:adding a bias value to the intermediate result vector;activating the intermediate result vector with an activation function;outputting a predetermined value based on a comparison between the intermediate result vector and a random number; andpooling the intermediate result vector.3. The apparatus of claim 1 , wherein each of the slave computation modules includes a slave neuron caching unit configured to store the input vector.4. 
The apparatus of claim 1 , wherein the interconnection unit is structured as a binary tree including one or more levels claim 1 , ...
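A forward pass through one fully connected layer under this master/slave split can be sketched as below; `fc_forward`, `n_slaves`, and the choice of ReLU as the activation are assumptions for illustration (the claims only say "an activation function").

```python
import numpy as np

def fc_forward(W, x, bias, n_slaves=2):
    """Slaves each hold a slice of the weight matrix and compute their group
    of slave output values in parallel; the interconnection unit combines
    the groups into the intermediate result vector; the master adds the
    bias and applies the activation function to produce the output vector."""
    slices = np.array_split(W, n_slaves, axis=0)   # per-slave weight slices
    groups = [slc @ x for slc in slices]           # slave output value groups
    intermediate = np.concatenate(groups)          # interconnection unit
    return np.maximum(intermediate + bias, 0.0)    # master: bias + ReLU

W = np.array([[1.0, -1.0], [2.0, 0.5], [-3.0, 1.0], [0.0, 1.0]])
x = np.array([1.0, 2.0])
b = np.array([0.0, 0.0, 0.0, -1.0])
y = fc_forward(W, x, b)
```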

Publication date: 28-02-2019

Apparatus and Methods for Pooling Operations

Number: US20190065938A1
Assignee:

Aspects for pooling operations in a multilayer neural network (MNN) in a MNN acceleration processor are described herein. The aspects may include a direct memory access unit configured to receive multiple input values from a storage device. The aspects may further include a pooling processor configured to select a portion of the input values based on a pooling kernel that include a data range, and generate a pooling result based on the selected portion of the input values. 1. An apparatus of pooling operation in a neural network , comprising:a controller unit configured to receive a pooling instruction; receive multiple input values,', 'select a portion of the input values based on a pooling kernel that include a data range in response to the pooling instruction, and', 'generate a pooling result based on the selected portion of the input values., 'a pooling processor configured to2. The apparatus of claim 1 , wherein the pooling processor is configured to calculate an average value for the selected portion of the input values as the pooling result.3. The apparatus of claim 1 , wherein the pooling processor is configured to select a maximum value from the selected portion of the input values as the pooling result.4. The apparatus of claim 1 , wherein the input values are indexed as a two-dimensional data structure.5. The apparatus of claim 1 , wherein the data range of the pooling kernel is a two-dimensional data range.6. The apparatus of claim 1 , wherein the pooling processor is further configured to adjust the data range in the pooling kernel.7. The apparatus of claim 1 , wherein the pooling processor is further configured to calculate an output data gradient vector based on a size of the pooling kernel and an input data gradient vector.8. The apparatus of claim 3 , wherein the pooling processor is further configured to calculate an output gradient vector based on an index vector associated with the maximum value and an input data gradient vector.9. 
A method for ...
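The pooling kernel's two-dimensional data range and the max/average reduction (claims 2 and 3) can be sketched as below; `pool2d` and the non-overlapping stride-equals-kernel sweep are illustrative assumptions.

```python
import numpy as np

def pool2d(inputs, kernel=(2, 2), mode="max"):
    """Slide the pooling kernel's two-dimensional data range over the input
    and reduce each selected portion to one value (maximum or average)."""
    kh, kw = kernel
    h, w = inputs.shape
    out = np.empty((h // kh, w // kw))
    for i in range(0, h - kh + 1, kh):
        for j in range(0, w - kw + 1, kw):
            window = inputs[i:i + kh, j:j + kw]       # selected portion
            out[i // kh, j // kw] = window.max() if mode == "max" else window.mean()
    return out

m = np.array([[1.0, 2.0, 5.0, 6.0],
              [3.0, 4.0, 7.0, 8.0]])
p = pool2d(m)
```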

Publication date: 28-02-2019

Apparatus and Methods for Generating Random Vectors

Number: US20190065952A1
Assignee:

Aspects for vector operations in a neural network are described herein. The aspects may include a controller unit configured to receive an instruction to generate a random vector that includes one or more elements. The instruction may include a predetermined distribution, a count of the elements, and an address of the random vector. The aspects may further include a computation module configured to generate the one or more elements, wherein the one or more elements are subject to the predetermined distribution.

1. An apparatus for random vector generation in a neural network, comprising:
a controller unit configured to receive a random vector generation instruction to generate a random vector that includes one or more elements, wherein the instruction indicates a predetermined distribution and includes a count of the elements and an address of the random vector; and
a computation module configured to generate, in response to the random vector generation instruction, the one or more elements, wherein the one or more elements are subject to the predetermined distribution.
2. The apparatus of claim 1, wherein the predetermined distribution is a uniform distribution and wherein the instruction further includes a minimum value and a maximum value of the one or more elements.
3. The apparatus of claim 1, wherein the predetermined distribution is a Gaussian distribution and wherein the instruction further includes a mean and a variance of the one or more elements.
4. The apparatus of claim 1, further comprising a vector caching unit configured to store the generated one or more elements.
5. The apparatus of claim 2, further comprising one or more scalar registers configured to store a count of the elements, the address of the random vector, the minimum value of the one or more elements, and the maximum value of the one or more elements.
6. The apparatus of claim 3, further comprising one or more scalar registers configured to store a count of the ...

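As a rough software model of the instruction described in this entry, a small decoder can dispatch on the distribution tag and fill the destination address. The instruction encoding (a dict with `dist`, `count`, `addr`, and parameter fields) and the memory model are illustrative assumptions, not the patented hardware:

```python
import random

# Hypothetical software model of the random-vector-generation instruction.
# Field names ("dist", "count", "addr", "min", "max", "mean", "variance")
# are assumptions made for illustration only.
def execute_random_vector(instr, memory):
    if instr["dist"] == "uniform":
        # claim 2: the instruction supplies min and max of the elements
        elems = [random.uniform(instr["min"], instr["max"])
                 for _ in range(instr["count"])]
    elif instr["dist"] == "gaussian":
        # claim 3: the instruction supplies a mean and a variance
        elems = [random.gauss(instr["mean"], instr["variance"] ** 0.5)
                 for _ in range(instr["count"])]
    else:
        raise ValueError("unknown distribution")
    memory[instr["addr"]] = elems   # write the vector at its destination address
    return elems

memory = {}
v = execute_random_vector(
    {"dist": "uniform", "min": 0.0, "max": 1.0, "count": 4, "addr": 0x100},
    memory)
```

A Gaussian variant would be invoked the same way, with `mean` and `variance` fields instead of the bounds.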
More details
28-02-2019 publication date

Device and Method for Performing Self-Learning Operations of an Artificial Neural Network

Number: US20190065953A1
Assigned to:

Aspects for self-learning operations of an artificial neural network are described herein. The aspects may include a master computation module configured to transmit an input vector via an interconnection unit and one or more slave computation modules connected to the master computation module via the interconnection unit. Each of the one or more slave computation modules may be configured to respectively store a column weight vector of a weight matrix and multiply the input vector with the column weight vector to generate a first multiplication result. The interconnection unit may be configured to combine the one or more first multiplication results into a first multiplication vector and transmit the first multiplication vector to the master computation module.

1. An apparatus for neural network operations, comprising:
a master computation module configured to transmit an input vector via an interconnection unit; and
one or more slave computation modules connected to the master computation module via the interconnection unit,
wherein each of the one or more slave computation modules is configured to respectively store a column weight vector of a weight matrix, and multiply the input vector with the column weight vector to generate a first multiplication result, and
wherein the interconnection unit is configured to combine the one or more first multiplication results into a first multiplication vector, and transmit the first multiplication vector to the master computation module.
2. The apparatus of claim 1, wherein the master computation module is further configured to:
add a first bias vector to the first multiplication vector to generate a first biased vector,
activate the first biased vector by applying a first activation function to the first biased vector to generate a first activated vector, and
sample the first activated vector by a Gibbs sampler to generate a first phase hidden layer vector.
3. The apparatus of claim 2, wherein the master computation ...

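The data flow in this entry (column-partitioned dot products, bias, activation, Gibbs sampling) can be sketched in plain Python. The function names, the sigmoid as the activation, and the Bernoulli draw as the Gibbs step are assumptions chosen to illustrate the claimed pipeline, not the patented implementation:

```python
import math
import random

# Each "slave" holds one column of the weight matrix and produces one dot
# product; the interconnection gathers them; the "master" adds the bias,
# applies an activation (sigmoid assumed), and Bernoulli-samples the
# hidden layer, mimicking one Gibbs sampling step.
def slave_dot(input_vec, column):
    return sum(x * w for x, w in zip(input_vec, column))

def master_step(input_vec, weight_columns, bias, rng=random.random):
    mults = [slave_dot(input_vec, col) for col in weight_columns]  # combine
    biased = [m + b for m, b in zip(mults, bias)]                  # add bias
    activated = [1.0 / (1.0 + math.exp(-z)) for z in biased]       # sigmoid
    hidden = [1.0 if rng() < p else 0.0 for p in activated]        # Gibbs sample
    return activated, hidden

acts, h = master_step([1.0, 0.0], [[2.0, -1.0], [0.5, 0.5]], [0.0, 0.0])
```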
More details
28-02-2019 publication date

Apparatus and Methods for Training in Fully Connected Layers of Convolutional Networks

Number: US20190065958A1
Assigned to:

Aspects for backpropagation in a fully connected layer of a convolutional neural network are described herein. The aspects may include a direct memory access unit configured to receive input data and one or more first data gradients from a storage device. The aspects may further include a master computation module configured to transmit the input data and the one or more first data gradients to one or more slave computation modules. The slave computation modules are respectively configured to multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector.

1. An apparatus for backpropagation in a fully connected layer of a neural network, comprising:
a controller unit configured to receive an instruction; and
one or more computation modules that include a master computation module and one or more slave computation modules,
wherein the master computation module is configured to receive input data and one or more first data gradients in response to the instruction, and transmit the input data and the one or more first data gradients to the one or more slave computation modules, and
wherein the one or more slave computation modules are respectively configured to multiply one of the one or more first data gradients with the input data to generate a default weight gradient vector.
2. The apparatus of claim 1, wherein the master computation module is further configured to update one or more weight values based on the default weight gradient vector.
3. The apparatus of claim 1, wherein the master computation module is further configured to:
calculate a scaled weight gradient vector based on the default weight gradient vector and a predetermined threshold value; and
update one or more weight values based on the scaled weight gradient vector.
4. The apparatus of claim 1, wherein the master computation module is further configured to apply a derivative of an activation function to the one or more first data gradients to generate ...

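The per-slave multiplication described above amounts to an outer product: each gradient element times the whole input vector gives one row of the weight gradient. A minimal sketch, where the threshold-based scaling rule is an assumption (the entry only says the scaled gradient is based on the default gradient and a threshold):

```python
# Each slave multiplies one element of the output-gradient vector with the
# input vector, producing one row of the default weight gradient.
def default_weight_gradients(input_data, grads):
    return [[g * x for x in input_data] for g in grads]  # outer product

# Assumed scaling rule: clip the gradient magnitude to the threshold
# before applying the update (illustrative, not the patented rule).
def scaled_update(weights, grad_rows, lr=0.1, threshold=None):
    norm = max(abs(g) for row in grad_rows for g in row)
    scale = threshold / norm if threshold is not None and norm > threshold else 1.0
    return [[w - lr * scale * g for w, g in zip(wrow, grow)]
            for wrow, grow in zip(weights, grad_rows)]

g = default_weight_gradients([1.0, 2.0], [0.5, -1.0])
w = scaled_update([[0.0, 0.0], [0.0, 0.0]], g, lr=1.0)
```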
More details
28-02-2019 publication date

Apparatus and methods for training in convolutional neural networks

Number: US20190065959A1
Assigned to: Cambricon Technologies Corp Ltd

Aspects for backpropagation of a convolutional neural network are described herein. The aspects may include a direct memory access unit configured to receive input data from a storage device and a master computation module configured to select one or more portions of the input data based on a predetermined convolution window. Further, the aspects may include one or more slave computation modules respectively configured to convolute one of the one or more portions of the input data with one of one or more previously calculated first data gradients to generate a kernel gradient, wherein the master computation module is further configured to update a prestored convolution kernel based on the kernel gradient.

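A rough 1-D model of the kernel-gradient step in the entry above: slide the convolution window over the input, accumulate window-times-gradient products into the kernel gradient, then update the prestored kernel. The exact arithmetic and the learning-rate update are assumptions made for illustration:

```python
# 1-D toy model: one window per output position; the slave convolutes the
# selected input portion with the output gradient, and the accumulated
# result updates the prestored kernel.
def kernel_gradient(inputs, out_grads, ksize):
    grad = [0.0] * ksize
    for i, g in enumerate(out_grads):      # one window per output position
        window = inputs[i:i + ksize]       # portion selected by the master
        for k in range(ksize):
            grad[k] += window[k] * g       # slave convolution/accumulation
    return grad

def update_kernel(kernel, grad, lr=0.01):  # assumed SGD-style update
    return [w - lr * g for w, g in zip(kernel, grad)]

grad = kernel_gradient([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0], ksize=2)
```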
More details
07-03-2019 publication date

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190073219A1
Assigned to:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include a computation module that includes one or more bitwise processors and a combiner. The bitwise processors may be configured to perform bitwise operations between each of the first elements and a corresponding one of the second elements to generate one or more operation results. The combiner may be configured to combine the one or more operation results into an output vector.

1. An apparatus for vector operations in a neural network, comprising:
a controller unit configured to receive a vector bitwise operation instruction that indicates a first address of a first vector and a second address of a second vector; and
a computation module configured to receive the first vector and the second vector based on the first address and the second address in response to the vector bitwise operation instruction, wherein the first vector includes one or more first elements and the second vector includes one or more second elements, and wherein the computation module includes:
one or more bitwise processors configured to perform bitwise operations between each of the first elements and a corresponding one of the second elements to generate one or more operation results, and
a combiner configured to combine the one or more operation results into an output vector.
2. The apparatus of claim 1, wherein the vector bitwise operation instruction includes one or more register IDs that identify one or more registers configured to store the first address of the first vector, the second address of the second vector, a length of the first vector, and a length of the second vector.
3. The apparatus of claim 1, further comprising one or more ...

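The element-wise split in this entry is easy to model: one "bitwise processor" per element pair, with a combiner gathering the results. The particular operation set (AND/OR/XOR) is an assumption, since the entry does not enumerate the supported operations:

```python
# Minimal model of the claimed element-wise path: each bitwise processor
# handles one element pair; the combiner gathers the results into the
# output vector. The operation set here is assumed for illustration.
BITWISE_OPS = {
    "and": lambda a, b: a & b,
    "or":  lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}

def vector_bitwise(op, first, second):
    f = BITWISE_OPS[op]
    results = [f(a, b) for a, b in zip(first, second)]  # per-element processors
    return results                                      # combiner output

out = vector_bitwise("xor", [0b1100, 0b1010], [0b1010, 0b0110])
```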
More details
07-03-2019 publication date

DATA READ-WRITE SCHEDULER AND RESERVATION STATION FOR VECTOR OPERATIONS

Number: US20190073220A1
Assigned to:

The present disclosure provides a data read-write scheduler and a reservation station for vector operations. The data read-write scheduler suspends instruction execution by providing a read instruction cache module and a write instruction cache module and detecting conflicting instructions based on the two modules. Once the required timing is satisfied, the instructions are re-executed, thereby resolving read-after-write and write-after-read conflicts between instructions and guaranteeing that correct data are provided to the vector operations component. The present disclosure therefore has considerable value for promotion and application.

1. A data write scheduler for vector operations, comprising:
a write instruction preprocessing unit configured to:
receive a vector write instruction,
detect at least one read-after-write conflict between the vector write instruction and one or more vector read instructions stored in a data read scheduler, and
identify one of the vector read instructions that results in the read-after-write conflict as a dependent vector read instruction that corresponds to the vector write instruction; and
a write control module configured to:
extract one or more write requests from the received vector write instruction,
sequentially submit the one or more write requests and content data to an on-chip RAM based on an execution status of the dependent vector read instruction that corresponds to the received vector write instruction, and
receive feedback from the on-chip RAM regarding the write requests.
2. The data write scheduler of claim 1, wherein the write instruction preprocessing unit is further configured to:
obtain a first pair of starting and ending addresses from the vector ...,
wherein the first pair of starting and ending addresses identifies a target write range in the on-chip RAM, and
wherein each of the one or more write requests is configured to write a portion of the content data to a write address within the target write range, ...

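The conflict check described above reduces to an address-range overlap test between a vector write and the pending vector reads. A minimal sketch, where representing instructions as start/end address pairs is an assumption about the data layout:

```python
# A vector write conflicts with a pending vector read when their address
# ranges in the on-chip RAM overlap; the write is then suspended until the
# dependent read retires. Instruction representation is assumed.
def ranges_overlap(start_a, end_a, start_b, end_b):
    return start_a <= end_b and start_b <= end_a

def find_dependent_read(write_instr, pending_reads):
    """Return the first pending read whose range overlaps the write range."""
    ws, we = write_instr["start"], write_instr["end"]
    for read in pending_reads:
        if ranges_overlap(ws, we, read["start"], read["end"]):
            return read          # write must wait for this read to finish
    return None                  # no conflict: the write may issue now

dep = find_dependent_read({"start": 0x40, "end": 0x7F},
                          [{"id": 1, "start": 0x00, "end": 0x3F},
                           {"id": 2, "start": 0x60, "end": 0x9F}])
```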
More details
07-03-2019 publication date

DATA READ-WRITE SCHEDULER AND RESERVATION STATION FOR VECTOR OPERATIONS

Number: US20190073221A1
Assigned to:

The present disclosure provides a data read-write scheduler and a reservation station for vector operations. The data read-write scheduler suspends instruction execution by providing a read instruction cache module and a write instruction cache module and detecting conflicting instructions based on the two modules. Once the required timing is satisfied, the instructions are re-executed, thereby resolving read-after-write and write-after-read conflicts between instructions and guaranteeing that correct data are provided to the vector operations component. The present disclosure therefore has considerable value for promotion and application.

1. A reservation station for vector operations, comprising an on-chip RAM, an I/O interface, a decoder, a vector operations component, and a data read-write scheduler;
wherein the on-chip RAM is configured to store the input data required for vector operations and store output data obtained through computation;
wherein the I/O interface is configured to perform read-write access to the on-chip RAM by devices outside of the reservation station, the read-write access including: loading the input data that need to be processed to the on-chip RAM, and moving the output data obtained after operations to external devices;
wherein the decoder is configured to read instructions from an external instruction cache queue, decode the instructions into vector read instructions, vector write instructions and vector operations instructions, send the vector read instructions and vector write instructions to the data read-write scheduler, and send the vector operations instructions to the vector operations component;
wherein the vector operations component is configured to receive the input data from the data read-write scheduler for operations after receiving the vector operations instruction sent by the decoder, and then transmit the output data obtained from operations to the data read-write scheduler;
wherein the data read-write scheduler comprises a data read scheduler and a data write scheduler; wherein the data read scheduler comprises a read instruction preprocessing module and a read control module; the data ...

More details
07-03-2019 publication date

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190073339A1
Assigned to:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.

1. An apparatus for vector operations in a neural network, comprising:
a controller unit configured to receive a scalar-subtract-vector instruction that includes a first address of a vector, a second address of a scalar value, and an operation code that indicates an operation to subtract the vector from the scalar value; and
a computation module configured to receive the vector and the scalar value in response to the scalar-subtract-vector instruction based on the first address and the second address,
wherein the vector includes one or more elements, and
wherein the computation module includes:
one or more subtractors configured to respectively subtract the scalar value from each element of the vector to generate one or more subtraction results, and
a combiner configured to combine the one or more subtraction results into an output vector.
2. The apparatus of claim 1,
wherein the scalar-subtract-vector instruction further indicates a first length of the first vector, and
wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1,
wherein the scalar-subtract-vector instruction further indicates a second length of the second vector, and
wherein the computation module is configured to retrieve the second vector based on the ...

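A toy model of the scalar-subtract-vector instruction named in the claims above. The memory/register layout is an assumption, and the sketch follows the operation code's reading (scalar minus each vector element); note the entry's own text is ambiguous about the operand order:

```python
# Illustrative model: the controller resolves the vector and scalar
# addresses, per-element subtractors compute scalar - element, and the
# combiner yields the output vector. Memory layout is assumed.
def scalar_subtract_vector(memory, instr):
    vec = memory[instr["vec_addr"]][:instr["length"]]
    s = memory[instr["scalar_addr"]]
    return [s - x for x in vec]   # subtract the vector from the scalar

memory = {0x10: [1.0, 2.0, 3.0], 0x20: 10.0}
out = scalar_subtract_vector(memory,
                             {"vec_addr": 0x10, "scalar_addr": 0x20,
                              "length": 3})
```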
More details
07-03-2019 publication date

APPARATUS AND METHODS FOR FORWARD PROPAGATION IN CONVOLUTIONAL NEURAL NETWORKS

Number: US20190073583A1
Assigned to:

Aspects for forward propagation of a convolutional artificial neural network are described herein. The aspects may include a direct memory access unit configured to receive input data from a storage device and a master computation module configured to select one or more portions of the input data based on a predetermined convolution window. Further, the aspects may include one or more slave computation modules respectively configured to convolute a convolution kernel with one of the one or more portions of the input data to generate a slave output value. Further still, the aspects may include an interconnection unit configured to combine the one or more slave output values into one or more intermediate result vectors, wherein the master computation module is further configured to merge the one or more intermediate result vectors into a merged intermediate vector.

1. An apparatus for forward propagation of a convolutional neural network, comprising:
a master computation module configured to receive input data, and select, in response to an instruction, one or more portions of the input data based on a predetermined convolution window;
one or more slave computation modules respectively configured to convolute a portion of a convolution kernel with one of the one or more portions of the input data to generate a slave output value; and
an interconnection unit configured to combine the one or more slave output values into one or more intermediate result vectors, wherein the master computation module is further configured to merge the one or more intermediate result vectors into a merged intermediate vector.
2. The apparatus of claim 1, wherein each of the one or more slave computation modules includes a slave neuron caching unit configured to store one of the one or more portions of the input data.
3. The apparatus of claim 1, wherein each of the one or more slave computation modules includes a weight value caching unit configured to store the portion of the ...

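The master/slave/interconnection split in this entry can be sketched as a 1-D convolution: the master slides the window, each slave multiplies one window element by its kernel portion, and the interconnection sums the slave outputs. Sizes, stride, and the exact dot-product arithmetic are assumptions:

```python
# Illustrative 1-D forward pass matching the claimed division of labor.
def forward_conv(inputs, kernel, stride=1):
    ksize = len(kernel)
    out = []
    for i in range(0, len(inputs) - ksize + 1, stride):
        window = inputs[i:i + ksize]                            # master's selection
        slave_outputs = [w * k for w, k in zip(window, kernel)]  # slave products
        out.append(sum(slave_outputs))                          # interconnection merge
    return out

result = forward_conv([1.0, 2.0, 3.0, 4.0], [1.0, -1.0])
```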
More details
07-03-2019 publication date

Apparatus and methods for forward propagation in neural networks supporting discrete data

Number: US20190073584A1
Assigned to:

Aspects for forward propagation of a multilayer neural network (MNN) in a neural network processor are described herein. As an example, the aspects may include a computation module that includes a master computation module and one or more slave computation modules. The master computation module may be configured to receive one or more groups of MNN data. The one or more groups of MNN data may include input data and one or more weight values, and at least a portion of the input data and the weight values may be stored as discrete values. The one or more slave computation modules may be configured to calculate one or more groups of slave output values based on a data type of each of the one or more groups of MNN data.

2. The apparatus of claim 1, wherein the interconnection unit is configured to combine the one or more groups of slave output values to generate one or more intermediate result vectors.
3. The apparatus of claim 1, wherein the one or more slave computation modules are configured to parallelly calculate the one or more groups of slave output values based on the input data and the weight values.
4. The apparatus of claim 1, wherein the master computation module is configured to perform one operation selected from the group consisting of:
adding a bias value to the merged intermediate vector;
activating the merged intermediate vector with an activation function, wherein the activation function is a function selected from the group consisting of non-linear sigmoid, tanh, relu, and softmax;
outputting a predetermined value based on a comparison between the merged intermediate vector and a random number; and
pooling the merged intermediate vector.
5. The apparatus of claim 1, wherein the interconnection unit is connected to the master computation module and the one or more slave computation modules and exchanges data between the master computation module and the one or more slave computation modules.
6. The apparatus of claim 1, wherein the master computation ...

More details
14-03-2019 publication date

Apparatus and Methods for Neural Network Operations Supporting Floating Point Numbers of Short Bit Length

Number: US20190079727A1
Assigned to:

Aspects for neural network operations with floating-point numbers of short bit length are described herein. The aspects may include a neural network processor configured to process one or more floating-point numbers to generate one or more process results. Further, the aspects may include a floating-point number converter configured to convert the one or more process results in accordance with at least one format of shortened floating-point numbers. The floating-point number converter may include a pruning processor configured to adjust a length of a mantissa field of the process results and an exponent modifier configured to adjust a length of an exponent field of the process results in accordance with the at least one format.

1. An apparatus for neural network operations, comprising:
a neural network processor configured to process one or more floating-point numbers to generate one or more process results; and
a floating-point number converter configured to convert the one or more process results in accordance with at least one format of shortened floating-point numbers, wherein the floating-point number converter includes:
a pruning processor configured to adjust a length of a mantissa field of the process results, and
an exponent modifier configured to adjust a length of an exponent field of the process results in accordance with the at least one format.
2. The apparatus of claim 1, further comprising a floating-point number analyzing processor configured to determine the at least one format of the shortened floating-point numbers, wherein the floating-point number analyzing processor includes:
a data extractor configured to collect one or more categories of the floating-point numbers;
a data analyzer configured to statistically analyze the one or more categories of the floating-point numbers to determine a data range for each of the one or more categories of the floating-point numbers and a distribution pattern for each of the one or more categories ...

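Mantissa pruning and exponent adjustment can be shown as two separate steps on an ordinary float. The field widths and the round-to-nearest rule below are assumptions chosen only to illustrate the two operations the entry names, not the patented format:

```python
import math

# Hedged sketch of conversion to a shortened floating-point format:
# prune the mantissa to mant_bits, clamp the exponent to what exp_bits
# can represent. Widths and rounding rule are illustrative assumptions.
def to_short_float(x, mant_bits=7, exp_bits=5):
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                  # x = m * 2**e, 0.5 <= |m| < 1
    exp_max = 2 ** (exp_bits - 1) - 1
    e = max(-exp_max, min(exp_max, e))    # exponent modifier: clamp the field
    scale = 2 ** mant_bits
    m = round(m * scale) / scale          # pruning: keep mant_bits of mantissa
    return math.ldexp(m, e)

y = to_short_float(0.333333333, mant_bits=4)
```

With a 4-bit mantissa, 1/3 rounds to 11/16 scaled by 2^-1, i.e. the nearest representable value rather than the exact one.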
More details
14-03-2019 publication date

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190079765A1
Assigned to:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.

1. An apparatus for vector operations in a neural network, comprising:
a controller unit configured to receive a vector-add-scalar instruction that includes a first address of a vector, a second address of a scalar value, and an operation code that indicates an operation to add the vector to the scalar value; and
a computation module configured to receive the vector and the scalar value in response to the vector-add-scalar instruction based on the first address and the second address,
wherein the vector includes one or more elements, and
wherein the computation module includes:
one or more adders configured to respectively add the scalar value to each element of the vector to generate one or more addition results, and
a combiner configured to combine the one or more addition results into an output vector.
2. The apparatus of claim 1,
wherein the vector-add-scalar instruction further indicates a first length of the first vector, and
wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1,
wherein the vector-add-scalar instruction further indicates a second length of the second vector, and
wherein the computation module is configured to retrieve the second vector based on the second address and the second length.
4. The ...

More details
14-03-2019 publication date

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Number: US20190079766A1
Assigned to:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.

1. An apparatus for vector operations in a neural network, comprising:
a controller unit configured to receive a scalar-subtract-vector instruction that includes a first address of a vector, a second address of a scalar value, and an operation code that indicates an operation to subtract the vector from the scalar value; and
a computation module configured to receive the vector and the scalar value in response to the scalar-subtract-vector instruction based on the first address and the second address,
wherein the vector includes one or more elements, and
wherein the computation module includes:
one or more subtractors configured to respectively subtract the scalar value from each element of the vector to generate one or more subtraction results, and
a combiner configured to combine the one or more subtraction results into an output vector.
2. The apparatus of claim 1,
wherein the scalar-subtract-vector instruction further indicates a first length of the first vector, and
wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1,
wherein the scalar-subtract-vector instruction further indicates a second length of the second vector, and
wherein the computation module is configured to retrieve the second vector based on the ...

More details
14-03-2019 publication date

Apparatus and methods for backward propagation in neural networks supporting discrete data

Number: US20190080241A1
Assigned to:

Aspects for backpropagation of a multilayer neural network (MNN) in a neural network processor are described herein. The aspects may include a computation module configured to receive one or more groups of MNN data. The computation module may further include a master computation module configured to calculate an input gradient vector based on a first output gradient vector from an adjacent layer and based on a data type of each of the one or more groups of MNN data. Further still, the computation module may include one or more slave computation modules configured to parallelly calculate portions of a second output vector based on the input gradient vector calculated by the master computation module and based on the data type of each of the one or more groups of MNN data.

1. An apparatus for backpropagation of a multilayer neural network (MNN), comprising:
a computation module configured to receive one or more groups of MNN data,
wherein the one or more groups of MNN data include input data and one or more weight values,
wherein at least a portion of the input data and the weight values are presented as discrete values, and
wherein the computation module includes:
a master computation module configured to calculate an input gradient vector based on a first output gradient vector from an adjacent layer and based on a data type of each of the one or more groups of MNN data, and
one or more slave computation modules configured to parallelly calculate portions of a second output vector based on the input gradient vector calculated by the master computation module and based on the data type of each of the one or more groups of MNN data; and
a controller unit configured to decode an instruction that initiates a backpropagation process and transmit the decoded instruction to the computation module.
2. The apparatus of claim 1, wherein the interconnection unit is configured to combine the portions of the second output gradient vector to generate the second ...

More details
21-03-2019 publication date

Apparatus and method for executing recurrent neural network and LSTM computations

Number: US20190087709A1
Assigned to: Cambricon Technologies Corp Ltd

Aspects for Long Short-Term Memory (LSTM) blocks in a recurrent neural network (RNN) are described herein. As an example, the aspects may include one or more slave computation modules, an interconnection unit, and a master computation module collectively configured to calculate an activated input gate value, an activated forget gate value, a current cell status of the current computation period, an activated output gate value, and a forward pass result.

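The five quantities the abstract above names (activated input gate, activated forget gate, current cell status, activated output gate, forward pass result) correspond to one standard LSTM forward step. A compact scalar sketch, where the weight layout and the specific sigmoid/tanh activations are assumptions:

```python
import math

# One LSTM forward step over scalars; gate weight names are illustrative.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    i = sigmoid(w["i"] * x + w["ri"] * h_prev)    # activated input gate
    f = sigmoid(w["f"] * x + w["rf"] * h_prev)    # activated forget gate
    g = math.tanh(w["g"] * x + w["rg"] * h_prev)  # candidate cell input
    c = f * c_prev + i * g                        # current cell status
    o = sigmoid(w["o"] * x + w["ro"] * h_prev)    # activated output gate
    h = o * math.tanh(c)                          # forward pass result
    return h, c

w = {k: 0.5 for k in ("i", "ri", "f", "rf", "g", "rg", "o", "ro")}
h, c = lstm_step(1.0, 0.0, 0.0, w)
```

In the claimed apparatus the per-gate multiply-accumulates would be spread over the slave modules and combined by the interconnection unit; here they are computed inline for clarity.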
More details
21-03-2019 publication date

APPARATUS AND METHOD FOR EXECUTING RECURRENT NEURAL NETWORK AND LSTM COMPUTATIONS

Number: US20190087710A1
Assigned to:

Aspects for Long Short-Term Memory (LSTM) blocks in a recurrent neural network (RNN) are described herein. As an example, the aspects may include one or more slave computation modules, an interconnection unit, and a master computation module collectively configured to calculate an activated input gate value, an activated forget gate value, a current cell status of the current computation period, an activated output gate value, and a forward pass result.

1. An apparatus for a backward pass in a recurrent neural network (RNN), comprising:
one or more slave computation modules configured to calculate a first cell output partial sum and a second cell output partial sum; and
an interconnection unit configured to add the first cell output partial sum and the second cell output partial sum to generate one or more cell output gradients.
2. The apparatus of claim 1, further comprising a master computation module configured to activate a dormant output gate value with a derivative of an activation function to generate an activated output gate value,
wherein the one or more slave computation modules are configured to multiply the cell output gradients with the activated current cell status to generate a cell output multiplication result, and
wherein the master computation module is further configured to multiply the activated output gate value with the cell output multiplication result to generate one or more output gate gradients.
3. The apparatus of claim 2,
wherein the one or more slave computation modules are configured to calculate a first cell status partial sum, a second cell status partial sum, a third cell status partial sum, a fourth cell status partial sum, and a fifth cell status partial sum; and
wherein the interconnection unit is configured to add the first cell status partial sum, the second cell status partial sum, the third cell status partial sum, the fourth cell status partial sum, and the fifth cell status partial sum to generate one or more cell ...

More details
21-03-2019 publication date

METHOD AND SYSTEM FOR PROCESSING NEURAL NETWORK

Number: US20190087716A1
Assigned to:

The present disclosure provides a neural network processing system comprising a multi-core processing module, composed of a plurality of core processing modules, for executing the vector multiplication and addition operations of a neural network operation; an on-chip storage medium; an on-chip address index module; and an ALU module for executing, on input data acquired from the multi-core processing module or the on-chip storage medium, non-linear operations that the multi-core processing module cannot complete. The plurality of core processing modules either share an on-chip storage medium and an ALU module or have independent ones. The present disclosure improves the operating speed of the neural network processing system, making its performance higher and more efficient.

1. A system for processing a neural network, comprising:
at least one on-chip storage medium for storing data transmitted from outside of the neural network processing system, or storing data generated during processing;
at least one on-chip address index module for mapping an input index to the correct storage address during operation;
a multi-core processing module composed of a plurality of core processing modules and for executing vector multiplication and addition operations in a neural network operation; and
at least one ALU module for executing a non-linear operation not completable by the multi-core processing module according to input data acquired from the multi-core processing module or the on-chip storage medium,
wherein the plurality of core processing modules share the on-chip storage medium and the ALU module, or the plurality of core processing modules have an independent on-chip storage medium and an ALU module.
2. The processing system according to claim 1, wherein the data generated during processing comprises a processing result or ...

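The division of labor in this entry (cores do the vector multiply-and-add, a shared ALU does the non-linearity) can be modeled by partitioning a layer's weight rows across cores. The row-wise partitioning scheme and tanh as the non-linear operation are assumptions for illustration:

```python
import math

# Toy model: each core processing module performs multiply-accumulate for
# its slice of neurons; the shared ALU module then applies the non-linear
# operation (tanh assumed) that the cores cannot complete themselves.
def core_mac(input_vec, weight_rows):
    return [sum(x * w for x, w in zip(input_vec, row)) for row in weight_rows]

def multi_core_layer(input_vec, weights, num_cores=2):
    per_core = (len(weights) + num_cores - 1) // num_cores
    partial = []
    for c in range(num_cores):                 # each core's slice of rows
        rows = weights[c * per_core:(c + 1) * per_core]
        partial.extend(core_mac(input_vec, rows))
    return [math.tanh(z) for z in partial]     # shared ALU: non-linear op

out = multi_core_layer([1.0, 1.0], [[0.5, 0.5], [1.0, -1.0]])
```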
More details
19-03-2020 publication date

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Number: US20200089534A1
Assigned to:

The disclosure provides a task segmentation device and method, a task processing device and method, and a multi-core processor. The task segmentation device includes a granularity task segmentation unit configured to segment a task at one or more granularities to form subtasks, and a task segmentation granularity selection unit configured to select the granularity to be adopted.

1. A task segmentation device for a neural network, comprising:
a granularity task segmentation unit configured to segment a task into one or more subtasks in accordance with at least one granularity; and
a task segmentation granularity selection unit configured to determine the granularity for segmenting the task.
2. The task segmentation device of claim 1, wherein the granularity task segmentation unit includes at least one of:
a first granularity task segmentation unit configured to identify the task as one of the one or more subtasks;
a second granularity task segmentation unit configured to divide sample data associated with the task into one or more subsets of sample data, and identify a computation of each subset of sample data as one of the one or more subtasks;
a third granularity task segmentation unit configured to segment the task according to layer types of the neural network, wherein computation for layers of the same layer type is identified as one of the one or more subtasks;
a fourth granularity task segmentation unit configured to segment the task according to an interlayer structure of the neural network, wherein computation for multiple adjacent layers is identified as one of the one or more subtasks; and
a fifth granularity task segmentation unit configured to segment the task according to intra-layer structures of the neural network to segment computation types in each of the layers of the neural network into subtasks.
3. The task segmentation device of claim 2, wherein the task segmentation granularity selection unit is configured to select at least one of the ...

Подробнее
19-03-2020 дата публикации

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Номер: US20200089535A1
Принадлежит:

The application provides a processor and a processing method. The processor includes a task segmentation device configured to perform task segmentation according to a task segmentation granularity, and a hardware resource division device configured to divide hardware resources of the processor according to a task segmentation result. The processor and processing method provided by the application improve processing performance and reduce overhead by performing task segmentation and configuring different hardware according to the task segmentation.
1. A processor, comprising: a task segmentation device configured to segment a task into multiple subtasks according to a task segmentation granularity; and a hardware resource division device configured to divide hardware resources of the processor respectively for the multiple subtasks.
2. The processor of claim 1, further comprising multiple processing elements, wherein the hardware resource division device is configured to divide the multiple processing elements of the processor into multiple computation groups respectively for the multiple subtasks.
3. The processor of claim 2, wherein the hardware resource division device is configured to dynamically adjust the multiple computation groups of the processing elements.
4. The processor of claim 1, wherein the task segmentation device includes: a task segmentation granularity selection unit configured to determine the task segmentation granularity.
5. The processor of claim 4, wherein the granularity task segmentation unit includes at least one of the following units: a first granularity task segmentation unit configured to take the whole task as one of the subtasks; a second granularity task segmentation unit configured to divide sample data associated with the task into one or more subsets of sample data and identify a computation of each subset of sample data as one of the subtasks; a third granularity task segmentation unit configured to ...
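Claim 2's division of processing elements into per-subtask computation groups can be mimicked with a simple proportional policy. This is a sketch under an assumed load-balancing rule (the application does not prescribe one); all names are illustrative, and re-running the function with updated loads would correspond to the dynamic adjustment of claim 3.

```python
def divide_processing_elements(num_pes, subtask_loads):
    # give each subtask a computation group roughly proportional to its load,
    # with at least one processing element per subtask (assumed policy)
    total = sum(subtask_loads)
    groups = [max(1, num_pes * load // total) for load in subtask_loads]
    # hand any leftover processing elements to the heaviest subtasks first
    order = sorted(range(len(subtask_loads)), key=lambda i: -subtask_loads[i])
    j = 0
    while sum(groups) < num_pes:
        groups[order[j % len(order)]] += 1
        j += 1
    return groups

print(divide_processing_elements(8, [3, 1]))      # [6, 2]
print(divide_processing_elements(10, [5, 2, 1]))  # [7, 2, 1]
```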

Подробнее
19-03-2020 дата публикации

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Номер: US20200089623A1
Принадлежит:

The disclosure provides an information processing device and method. The information processing device includes a storage module configured to acquire information data, wherein the information data includes at least one key feature and the storage module pre-stores a true confidence corresponding to the key feature; an operational circuit configured to determine a predicted confidence corresponding to the key feature according to the information data and judge whether the predicted confidence of the key feature exceeds a preset threshold value range of the true confidence corresponding to the key feature; and a controlling circuit configured to control the storage module to modify the key feature, or send out a modification signal to the outside, when the predicted confidence exceeds the preset threshold value of the true confidence. The information processing device of the disclosure can automatically correct and modify handwriting, text, image or video actions instead of relying on manual methods.
1. An information processing device, comprising: a storage module configured to acquire information data that includes at least one key feature, and store at least one true confidence corresponding to the at least one key feature; an operational circuit configured to determine a predicted confidence corresponding to the key feature according to the acquired information data, and determine whether the predicted confidence of the key feature exceeds a preset threshold value range of the true confidence corresponding to the key feature; and a controlling circuit configured to control the storage module to modify the key feature based on a determination that the predicted confidence exceeds the preset threshold value of the true confidence.
2. The information processing device of claim 1, wherein the storage module is further configured to store the predicted confidence determined by operation of the operational circuit, and send the true confidence and the predicted confidence into the operational circuit ...
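The threshold test the operational circuit performs can be expressed in a few lines. A hedged sketch: the disclosure does not define the shape of the range, so a symmetric interval around the stored true confidence is assumed here, and all names are illustrative.

```python
def needs_modification(predicted_confidence, true_confidence, threshold):
    # the key feature is flagged for modification (or an external signal)
    # when the predicted confidence falls outside the preset range
    # around the pre-stored true confidence
    return abs(predicted_confidence - true_confidence) > threshold

print(needs_modification(0.40, 0.90, 0.25))  # True: outside the range, modify
print(needs_modification(0.80, 0.90, 0.25))  # False: within the range
```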

Подробнее
19-03-2020 дата публикации

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Номер: US20200090024A1
Принадлежит:

The application provides an information processing device, system and method. The information processing device mainly includes a storage module and a data processing module, where the storage module is configured to receive and store input data, instructions and output data, and the input data includes one or more key features; the data processing module is configured to identify the key features included in the input data and score the input data in the storage module according to a judgment result. The information processing device, system and method provided by the application automatically score text, pictures, audio, video, and the like instead of manual scoring, which is more accurate and faster.
1. An information processing device, comprising: a storage module configured to receive and store input data and one or more instructions; and a data processing module configured to identify one or more key features included in the input data to generate a judgment result, and score the input data in the storage module according to the judgment result.
2. The information processing device of claim 1, wherein the input data is original input data or preprocessed data obtained by preprocessing the original input data.
3. The information processing device of claim 1, wherein the data processing module is configured to compute confidence of the key features included in the input data, and wherein the confidence is the judgment result.
4. The information processing device of claim 1, wherein the storage module stores data and one or more instructions, wherein the data includes the input data, input neurons, weights, and output neurons, wherein the input data is transmitted to each input node in an artificial neural network for subsequent computation, wherein values of the output neurons include the judgment result and a score, and wherein the judgment result and the score are determined as output data.
5. The information processing device of claim 4, ...
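The scoring step of claim 3 — confidences of the identified key features acting as the judgment result — might look like the following sketch. The weighted-mean aggregation is an assumption; the application fixes no scoring formula, and all names are illustrative.

```python
def score_input(feature_confidences, weights=None):
    # aggregate per-key-feature confidences into one score for the input data
    # (weighted mean is an assumed aggregation rule, not from the application)
    if weights is None:
        weights = [1.0] * len(feature_confidences)
    return sum(c * w for c, w in zip(feature_confidences, weights)) / sum(weights)

print(score_input([0.5, 1.0]))          # 0.75
print(score_input([0.5, 1.0], [3, 1]))  # 0.625
```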

Подробнее
01-04-2021 дата публикации

Information processing method and terminal device

Номер: US20210097326A1

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has advantages of a fast computation speed and high efficiency.

Подробнее
01-04-2021 дата публикации

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Номер: US20210097332A1
Принадлежит:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of a fast computation speed and high efficiency.
1. An information processing method applied to a computation circuit, wherein the computation circuit comprises a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain a target image to be processed, wherein the target image includes one or more feature identification objects which are used to identify the target image or identify objects included in the target image; and controlling, by the computation circuit, the operation circuit to obtain an operation instruction and execute the operation instruction to classify the feature identification objects in the target image to obtain a target classification result, wherein the target classification result indicates at least one target classification to which the feature identification objects belong.
2. The method of claim 1, wherein the computation circuit further includes a register circuit and a controller circuit, and the controlling, by the computation circuit, the operation circuit to obtain an operation instruction and execute the operation instruction to classify feature identification objects in the target image, so as to obtain a ...

Подробнее
01-04-2021 дата публикации

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Номер: US20210097647A1
Принадлежит:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of a fast computation speed and high efficiency.
1. An information processing method applied to a computation circuit, wherein the computation circuit includes a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain a first image to be processed, wherein the first image has an image parameter of a first index size; and controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to retouch the first image to obtain a second image, wherein the second image has an image parameter of a second index size, the second index size is better than the first index size, and the operation instruction is a preset instruction for image retouching.
2. The method of claim 1, wherein the computation circuit further includes a register circuit and a controller circuit, the first image has an image optimization parameter input by a user, and the controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to retouch the first image, so as to obtain the second image includes: controlling, by the computation circuit, the controller circuit to fetch an ...

Подробнее
01-04-2021 дата публикации

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Номер: US20210098001A1
Принадлежит:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of a fast computation speed and high efficiency.
1. An information processing method applied to a computation circuit, wherein the computation circuit comprises a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain a voice to be identified input by a user; and controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to perform voice identification processing on the voice to be identified to obtain target text information corresponding to the voice to be identified, wherein the operation instruction is a preset instruction for voice identification.
2. The method of claim 1, wherein the computation circuit further includes a register circuit and a controller circuit, and the controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to perform voice identification processing on the voice to be identified to obtain target text information corresponding to the voice to be identified includes: controlling, by the computation circuit, the controller circuit to fetch a first operation instruction and a second operation instruction associated with a network ...

Подробнее
28-03-2019 дата публикации

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Номер: US20190095206A1
Принадлежит:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
1. An apparatus for vector operations in a neural network, comprising: a controller unit configured to receive a vector-divide instruction that includes a first address of a first vector, a second address of a second vector, and an operation code that indicates an operation to divide the first vector by the second vector; and a computation module configured to receive the first vector and the second vector in response to the vector-divide instruction based on the first address and the second address, wherein the first vector includes one or more first elements and the second vector includes one or more second elements, and wherein the computation module includes one or more dividers configured to respectively divide each of the first elements by a corresponding one of the second elements to generate one or more division results, and a combiner configured to combine the one or more division results into an output vector.
2. The apparatus of claim 1, wherein the vector-divide instruction further indicates a first length of the first vector, and wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1, wherein the vector-divide instruction further indicates a second length of the second vector, and wherein the computation ...
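The divider-plus-combiner datapath of the vector-divide instruction reduces to elementwise division. A software sketch, with Python lists standing in for the cached vectors; names are illustrative.

```python
def vector_divide(first_vector, second_vector):
    # one divider per element pair; the combiner assembles the division
    # results into the output vector
    if len(first_vector) != len(second_vector):
        raise ValueError("operand vectors must have equal length")
    return [a / b for a, b in zip(first_vector, second_vector)]

print(vector_divide([6, 9, 8], [2, 3, 4]))  # [3.0, 3.0, 2.0]
```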

Подробнее
28-03-2019 дата публикации

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Номер: US20190095207A1
Принадлежит:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
1. An apparatus for vector operations in a neural network, comprising: a controller unit configured to receive a scalar-divide-vector instruction that includes a first address of a vector, a second address of a scalar value, and an operation code that indicates an operation to divide the vector by the scalar value; and a computation module configured to receive the vector and the scalar value in response to the scalar-divide-vector instruction based on the first address and the second address, wherein the vector includes one or more elements, and wherein the computation module includes one or more dividers configured to respectively divide each of the elements by the scalar value to generate one or more division results, and a combiner configured to combine the one or more division results into an output vector.
2. The apparatus of claim 1, wherein the scalar-divide-vector instruction further indicates a first length of the first vector, and wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1, wherein the scalar-divide-vector instruction further indicates a second length of the second vector, and wherein the computation module is configured to retrieve the second vector based on the second address and the second ...

Подробнее
28-03-2019 дата публикации

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Номер: US20190095401A1
Принадлежит:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
1. An apparatus for vector operations in a neural network, comprising: a controller unit configured to receive a vector-multiply-scalar instruction that includes a first address of a vector, a second address of a scalar value, and an operation code that indicates an operation to multiply the vector with the scalar value; and a computation module configured to receive the vector and the scalar value in response to the vector-multiply-scalar instruction based on the first address and the second address, wherein the vector includes one or more elements, and wherein the computation module includes one or more multipliers configured to respectively multiply each of the elements with the scalar value to generate one or more multiplication results, and a combiner configured to combine the one or more multiplication results into an output vector.
2. The apparatus of claim 1, wherein the vector-multiply-scalar instruction further indicates a first length of the first vector, and wherein the computation module is configured to retrieve the first vector based on the first address and the first length.
3. The apparatus of claim 1, wherein the vector-multiply-scalar instruction further indicates a second length of the second vector, and wherein the computation module is configured to retrieve the second vector based on the ...

Подробнее
26-03-2020 дата публикации

APPARATUS AND METHODS FOR VECTOR OPERATIONS

Номер: US20200097520A1
Принадлежит:

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector. The first vector may include one or more first elements and the second vector may include one or more second elements. The aspects may further include a computation module configured to calculate a cross product between the first vector and the second vector in response to an instruction.
1. An apparatus for vector operations in a neural network, comprising: a controller unit configured to receive a vector cross product instruction that indicates a first address of a first vector, a second address of a second vector, and an operation code that indicates an operation to calculate a cross product between the first vector and the second vector; and a computation module configured to receive the first vector and the second vector based on the first address and the second address in response to the vector cross product instruction, wherein the first vector includes one or more first elements, wherein the second vector includes one or more second elements, and wherein the computation module is configured to calculate a cross product between the first vector and the second vector in response to the vector cross product instruction.
2. The apparatus of claim 1, wherein the vector cross product instruction further includes a first length of the first vector, and wherein the computation module is configured to receive the first vector based on the first address and the first length.
3. The apparatus of claim 1, wherein the vector cross product instruction further includes a second length of the second vector, and wherein the computation module is configured to receive the second vector based on the second address and the second length.
4. The apparatus of claim 1, wherein the vector cross product instruction includes one or more register IDs that identify one or more registers ...
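The claims leave the dimensionality open; assuming the usual three-element cross product, the computation module's job is the following sketch (the instruction supplies the operand addresses and lengths that locate these vectors; names are illustrative).

```python
def cross_product(a, b):
    # classic 3-D cross product of the first vector with the second vector
    ax, ay, az = a
    bx, by, bz = b
    return [ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx]

print(cross_product([1, 0, 0], [0, 1, 0]))  # [0, 0, 1]
```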

Подробнее
26-03-2020 дата публикации

PROCESSING APPARATUS AND PROCESSING METHOD

Номер: US20200097792A1
Принадлежит:

The present disclosure relates to a processing device including a memory configured to store data to be computed; a computational circuit configured to compute the data to be computed, which includes performing acceleration computations on the data to be computed by using an adder circuit and a multiplier circuit; and a control circuit configured to control the memory and the computational circuit, which includes performing acceleration computations according to the data to be computed. The present disclosure may have high flexibility, good configurability, fast computational speed, low power consumption, and other features.
1. A processing device comprising: a memory configured to store data, wherein the data include data to be computed in a neural network; a computational circuit configured to compute the data to be computed in the neural network, including performing computations on the data to be computed in the neural network with different computation bit widths by using an adder circuit and a multiplier circuit; and a control circuit configured to control the memory and the computational circuit, including determining a type of the multiplier circuit and a type of the adder circuit of the computational circuit according to the data to be computed so as to perform computations.
2. The device of claim 1, wherein the memory includes: an input storage module configured to store the data to be computed in the neural network; an output storage module configured to store a computation result; a neuron storage module configured to store neuron parameters; a synaptic storage module configured to store synaptic parameters; and a caching module configured to cache data, wherein the output storage module further includes: an intermediate computation result storage sub-module configured to store an intermediate computation result, and a final computation result storage sub-module configured to store a final computation result.
3. The device of claim 2, wherein the ...

Подробнее
26-03-2020 дата публикации

PROCESSING APPARATUS AND PROCESSING METHOD

Номер: US20200097793A1
Принадлежит:

The present disclosure relates to a fused vector multiplier for computing an inner product between vectors, where the vectors to be computed are a multiplier number vector A = (AN … A2 A1 A0) and a multiplicand number vector B = (BN … B2 B1 B0), A and B having the same dimension N+1. The multiplier includes: N+1 multiplication sub-units configured to perform multiplication on each dimension of a vector respectively, taking the lower n bits of the multiplier number vector for multiplication each time, where the n bits are removed from the binary number of each dimension of the multiplier number vector after they are taken, and n is larger than 1 and less than N+1; an adder tree configured to perform addition on the results of the N+1 multiplication sub-units obtained from the same operation each time; and a result register configured to hold the result of every addition performed by the adder tree and send the result to the adder tree for the next computation.
1. A fused vector multiplier for computing an inner product between vectors, comprising: multiple multiplication sub-units configured to perform multiplication between a multiplier number vector and a multiplicand vector, wherein the multiplier number vector includes one or more first elements and the multiplicand vector includes one or more second elements, and wherein each of the multiplication sub-units is configured to respectively retrieve one or more least significant bits of each of the one or more first elements, multiply each of the one or more least significant bits with a respective one of the one or more second elements, and remove the retrieved one or more least significant bits from each of the one or more first elements; an adder tree configured to add results output from the multiplication sub-units; and a result register configured to store addition results output from the adder tree and feed the addition results ...
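The shift-and-add schedule in claim 1 — take the lower n bits of every multiplier element, form partial products, sum them in the adder tree, accumulate in the result register, then discard the consumed bits — can be modelled directly for non-negative integers. This is a behavioural sketch, not the hardware; names are illustrative.

```python
def fused_inner_product(multiplier_vec, multiplicand_vec, n=2):
    # computes the inner product the way the fused multiplier iterates:
    # lower n bits of each multiplier element per step, partial products
    # summed by the adder tree, accumulated with the appropriate shift
    assert len(multiplier_vec) == len(multiplicand_vec)
    a = list(multiplier_vec)
    result, shift, mask = 0, 0, (1 << n) - 1
    while any(a):
        partials = [(x & mask) * b for x, b in zip(a, multiplicand_vec)]
        result += sum(partials) << shift   # adder tree + result register
        a = [x >> n for x in a]            # consumed low bits are removed
        shift += n
    return result

print(fused_inner_product([13, 7], [3, 5]))  # 74, i.e. 13*3 + 7*5
```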

Подробнее
26-03-2020 дата публикации

PROCESSING APPARATUS AND PROCESSING METHOD

Номер: US20200097794A1
Принадлежит:

The present disclosure provides a counting device and counting method. The device includes a storage unit, a counting unit, and a register unit, where the storage unit may be connected to the counting unit for storing input data to be counted and storing the number of elements satisfying a given condition in the input data after counting; the register unit may be configured to store the address where the input data to be counted is stored in the storage unit; and the counting unit may be connected to the register unit, and may be configured to acquire a counting instruction, read the storage address of the input data to be counted from the register unit according to the counting instruction, acquire the corresponding input data to be counted in the storage unit, perform statistical counting on the number of elements in the input data to be counted that satisfy the given condition, and obtain a counting result. The counting device and method may improve computation efficiency by writing an algorithm that counts the number of elements satisfying a given condition in input data into an instruction form.
1. A counting device, comprising: a storage circuit configured to store input data to be counted and store a count of elements in the input data that satisfy a given condition after counting; a register circuit configured to store an address in the storage circuit where the input data to be counted is stored; and a counting circuit connected to the register circuit and the storage circuit, wherein the counting circuit is configured to acquire a counting instruction, read a storage address of the input data in the register circuit according to the counting instruction, acquire corresponding input data to be counted in the storage circuit, identify elements in the input data that satisfy the given condition, and obtain a counting result.
2. The counting device of claim 1, wherein the storage circuit is a main storage and/or a cache.
3. The counting device ...
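Functionally, the counting instruction reduces to a filtered count over the stored input data. A one-line software model, with the given condition passed as a predicate; names are illustrative.

```python
def count_satisfying(input_data, condition):
    # number of elements of the input data that satisfy the given condition,
    # i.e. the counting result the counting circuit writes back
    return sum(1 for element in input_data if condition(element))

print(count_satisfying([1, -2, 3, 0, 5], lambda x: x > 0))  # 3
```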

Подробнее
26-03-2020 дата публикации

PROCESSING APPARATUS AND PROCESSING METHOD

Номер: US20200097795A1
Принадлежит:

The present disclosure provides a computation device and method. The device may include an input module configured to acquire input data; a model generation module configured to construct an offline model according to an input network structure and weight data; a neural network operation module configured to generate a computation instruction based on the offline model, cache the computation instruction, and compute the data to be processed based on the computation instruction to obtain a computation result; and an output module configured to output the computation result. The device and method may avoid the overhead caused by running an entire software architecture, which is a problem in a traditional method.
1. A computation device comprising: an input circuit configured to acquire input data, wherein the input data includes data to be processed; a model generation circuit configured to construct an offline model according to the input data; and a neural network operation circuit configured to determine a computation instruction based on the offline model, cache the computation instruction, and compute the data to be processed according to the computation instruction to obtain a computation result.
2. The computation device of claim 1, wherein the input data includes offline model data.
3. The computation device of claim 1, wherein the input data includes a network structure and weight data.
4. The computation device of claim 1, further comprising: a control circuit configured to determine content of the input data, instruct the input circuit to transmit the network structure and the weight data in the input data to the model generation circuit, instruct the model generation circuit to generate the offline model based on the weight data and the network structure, and control the neural network operation circuit to compute the data to be processed based on the generated offline model, wherein, based on a determination that the input data includes a ...

Подробнее
26-03-2020 дата публикации

COMPUTING DEVICE AND METHOD

Номер: US20200097796A1
Принадлежит:

A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, and used for controlling shutdown of the computing unit of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, and can be combined with the upgrading mode of software, thereby further increasing the computing speed, reducing the computing amount, and reducing the computing power consumption of an accelerator. 1. An operation device comprising: a transformation module configured to perform spatial transformation on input data and/or a parameter matrix from a first geometric space into a second geometric space; and an operation module connected to the transformation module and configured to receive the transformed input data and parameter matrix, and then perform operations. 2. The operation device of claim 1, wherein the input data and the parameter matrix are presented by a linear combination of basis vectors of the second geometric space through spatial transformation, which, in other words, means that the input data and the parameter matrix are expanded in the second geometric space. 3. The operation device of claim 1, wherein the input data and the parameter matrix are input data and a parameter matrix used by a convolutional layer, a down-sampling layer, a normalization layer, or a regularization layer. 4. The operation device of claim 1, wherein the first geometric space is a spatial domain and the second geometric space is a frequency domain, and wherein the manner of the spatial transformation is an invertible spatial transformation including FFT, DFT, DCT, or DST. 5. The operation device of claim 1, wherein the operation module includes: a multiplier configured to multiply data that are input into the multiplier to obtain output after ...
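Claim 4 above maps spatial-domain operands into the frequency domain with an invertible transform such as the FFT, where convolution reduces to element-wise multiplication. The following is a minimal NumPy sketch of that idea (the convolution theorem); the function name and 1-D setting are my own illustration, not the patent's circuit:

```python
import numpy as np

def conv_via_fft(signal, kernel):
    """Convolve two 1-D arrays by multiplying in the frequency domain.

    Both operands are "expanded" in the frequency-domain basis via FFT,
    multiplied pointwise, and transformed back -- a software analogue of
    the spatial-to-frequency transformation module described above.
    """
    n = len(signal) + len(kernel) - 1            # full convolution length
    f_signal = np.fft.rfft(signal, n)            # signal in the frequency basis
    f_kernel = np.fft.rfft(kernel, n)            # kernel in the frequency basis
    return np.fft.irfft(f_signal * f_kernel, n)  # pointwise product, inverse FFT
```

The result matches direct convolution (`np.convolve`) up to floating-point error, which is what makes the frequency-domain route a drop-in replacement for large kernels.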

Publication date: 26-03-2020

INTEGRATED CIRCUIT CHIP DEVICE AND RELATED PRODUCT

Number: US20200097804A1
Assignee:

The present disclosure provides an integrated circuit chip device and a related product. The integrated circuit chip device includes: a primary processing circuit and a plurality of basic processing circuits. The primary processing circuit or at least one of the plurality of basic processing circuits includes the compression mapping circuits configured to perform compression on each data of a neural network operation. The technical solution provided by the present disclosure has the advantages of a small amount of computations and low power consumption. 1. An integrated circuit chip device, comprising a primary processing circuit, k branch circuits, and k groups of basic processing circuits, wherein the primary processing circuit is connected to the k branch circuits respectively, each branch circuit of the k branch circuits corresponds to one group of basic processing circuits of the k groups of basic processing circuits, and one group of basic processing circuits includes at least one basic processing circuit, wherein each of the branch circuits includes a compression mapping circuit configured to perform compression on each data in a neural network operation; the primary processing circuit is configured to perform operations of the neural network in series and transmit the data to the k branch circuits connected to the primary processing circuit; the k branch circuits are configured to forward the data between the primary processing circuit and the k groups of basic processing circuits, and control whether to start the compression mapping circuit to perform compression on the transmitted data according to the operation of the transmitted data; and the k basic processing circuits are configured to perform operations of the neural network in series according to the data or the compressed data, and transmit an operation result to the primary processing circuit. 2. The integrated circuit chip device of claim 1, further comprising: the primary processing circuit is ...

Publication date: 26-03-2020

PROCESSING METHOD AND ACCELERATING DEVICE

Number: US20200097806A1
Assignee:

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption. 1. An operation device, comprising: a filtering unit configured to select a feature map and a weight corresponding to the feature map participating in subsequent operations according to a connection array of the feature map composed of the output neuron and an input neuron, and output the feature map and the weight corresponding to the feature map to an operation unit; and/or configured to select a row of the feature map and a row of weight corresponding to the row of the feature map according to a connection array of each row in the feature map composed of an output neuron and an input neuron, and output the row of the feature map and the row of weight corresponding to the row of the feature map to the operation unit; and/or configured to select a column of the feature map and a weight column corresponding to the column of the feature map according to a connection array of each column in the feature map composed of an output neuron and an input neuron, and output the column of the feature map and the weight column of the column of the feature map to an operation unit; and the operation unit configured to perform a corresponding artificial neural network operation supporting structure clipping on data output by the filtering unit according to an instruction to obtain an output neuron. 2. The ...

Publication date: 26-03-2020

PROCESSING METHOD AND ACCELERATING DEVICE

Number: US20200097826A1
Assignee:

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, and an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption. 1. A processing device, comprising: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight; and an operation unit configured to train the neural network according to the pruned weight; wherein the coarse-grained pruning unit is configured to: select M weights from the weights of the neural network through a sliding window; and when the M weights meet a preset condition, set at least a portion of the selected weights to 0. 2. The processing device of claim 1, wherein the preset condition is that an information quantity of the M weights is less than a first given threshold. 3. The processing device of claim 2, wherein the information quantity of the M weights is an arithmetic mean of the absolute value of the M weights, a geometric mean of the absolute value of the M weights, or a maximum value of the absolute value of the M weights; the first given threshold is a first threshold, a second threshold, or a third threshold; and the information quantity of the M weights being less than the first given threshold includes: the arithmetic mean of the absolute value of the M weights being less than the first threshold, or the geometric mean of the absolute value of the M weights being less than the second threshold, or the ...
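The coarse-grained pruning described above can be sketched in a few lines of NumPy: walk a window of M weights at a time and zero the whole group when its "information quantity" (here, the arithmetic mean of absolute values, one of the metrics named in claim 3) falls below a threshold. The window size, threshold, and function name are illustrative assumptions, not the patent's parameters:

```python
import numpy as np

def coarse_grained_prune(weights, window=4, threshold=0.1):
    """Zero out whole windows of M=`window` weights whose mean absolute
    value is below `threshold` -- a sketch of coarse-grained pruning.
    """
    pruned = weights.copy()
    for start in range(0, len(pruned), window):
        group = pruned[start:start + window]    # the M weights under the sliding window
        if np.mean(np.abs(group)) < threshold:  # "information quantity" as arithmetic mean
            pruned[start:start + window] = 0.0  # set all M weights to 0
    return pruned
```

Zeroing entire windows (rather than individual weights) is what makes the sparsity "coarse-grained" and hardware-friendly: whole groups can be skipped during memory access and computation.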

Publication date: 26-03-2020

Processing method and accelerating device

Number: US20200097827A1

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption.

Publication date: 26-03-2020

PROCESSING METHOD AND ACCELERATING DEVICE

Number: US20200097828A1
Assignee:

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption. 1. A processing device, comprising: a coarse-grained selection unit configured to input position information of a neuron and a target weight, and select a neuron to be computed, where the target weight is a weight whose absolute value is greater than a given threshold; a lookup table unit configured to receive a quantized target weight dictionary and a quantized target weight codebook, perform a table lookup operation to obtain a target weight of a neural network and output the target weight of the neural network; and an operation unit configured to receive the selected neuron and target weight, perform an operation on the neural network, and output the neuron. 2. The processing device of claim 1, wherein the lookup table unit is further configured to transmit an unquantized target weight directly to the operation unit by a bypass. 3. The processing device of claim 1, further comprising: an instruction control unit configured to receive and decode the instruction to obtain control information to control the operation unit. 4. The processing device of claim 1, further comprising: a storage unit configured to store a neuron, a weight, and an instruction of the neural network. 5. The processing device of claim 4, wherein the storage unit is further configured to store the target weight and ...
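The lookup table unit above recovers weights from a quantized "dictionary" of indices plus a codebook of values. In software this is a single gather operation; the codebook contents, shapes, and names below are illustrative assumptions:

```python
import numpy as np

def dequantize_weights(indices, codebook):
    """Recover weights by table lookup: each quantized index selects a
    centroid value from the codebook, as the lookup table unit does."""
    return codebook[indices]                    # one gather per weight

codebook = np.array([-0.5, 0.0, 0.5, 1.0])      # 2-bit codebook (assumed size)
indices = np.array([[3, 0], [1, 2]])            # quantized target weight dictionary
weights = dequantize_weights(indices, codebook)
```

Storing 2-bit indices plus a tiny codebook instead of full-precision weights is what cuts the memory traffic that the abstract highlights.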

Publication date: 26-03-2020

Processing method and accelerating device

Number: US20200097831A1

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption.

Publication date: 08-04-2021

NEURAL NETWORK COMPUTING METHOD, SYSTEM AND DEVICE THEREFOR

Number: US20210103818A1
Assignee:

The present disclosure provides a neural network computing method, system and device therefor to be applied in the technical field of computers. The computing method comprises the following steps: A. dividing a neural network into a plurality of subnetworks having consistent internal data characteristics; B. computing each of the subnetworks to obtain a first computation result for each subnetwork; and C. computing a total computation result of the neural network on the basis of the first computation result of each subnetwork. By means of the method, the present disclosure improves the computing efficiency of the neural network. 1. A neural network computing method, comprising the following steps: A. dividing a neural network into a plurality of subnetworks having consistent internal data characteristics; B. computing each of the subnetworks to obtain a first computation result for each subnetwork; and C. computing a total computation result of the neural network on the basis of the first computation result of each subnetwork. 2. The computing method according to claim 1, wherein the step A comprises: A1. dividing the neural network into a plurality of subnetworks having consistent internal data characteristics on the basis of output neurons of the neural network; A2. dividing the neural network into a plurality of subnetworks having consistent internal data characteristics on the basis of input neurons of the neural network; and A3. dividing the neural network into a plurality of subnetworks having consistent internal data characteristics on the basis of neuron weights of the neural network. 3. The computing method according to claim 2, wherein the step A3 comprises: dividing the neural network into a plurality of subnetworks having consistent internal data characteristics on the basis of distribution of the neuron weights of the neural network; or dividing the neural network into a plurality of subnetworks having consistent internal data characteristics on the basis of ...
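Steps A-C can be illustrated on a single fully-connected layer: split the layer into subnetworks along its output neurons (one realization of step A1), compute each part's first result (step B), and assemble the total result (step C). The partition scheme and names are my own sketch, not the patent's method:

```python
import numpy as np

def layer_by_subnetworks(x, W, num_parts=2):
    """Split a fully-connected layer into subnetworks along its output
    neurons, compute each part, and concatenate the partial results."""
    parts = np.array_split(W, num_parts, axis=0)  # each subnetwork owns a slice of output neurons
    partials = [p @ x for p in parts]             # step B: first computation result per subnetwork
    return np.concatenate(partials)               # step C: total result assembled from the parts
```

Because the partition is along independent output neurons, the subnetwork results can be computed in parallel and concatenated without any cross-communication, which is the efficiency gain the abstract claims.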

Publication date: 02-04-2020

DATA PROCESSING APPARATUS AND METHOD

Number: US20200104167A1
Assignee:

The disclosure provides a data processing device and method. The data processing device may include: a task configuration information storage unit and a task queue configuration unit. The task configuration information storage unit is configured to store configuration information of tasks. The task queue configuration unit is configured to configure a task queue according to the configuration information stored in the task configuration information storage unit. According to the disclosure, a task queue may be configured according to the configuration information. 1. A data processing device, comprising: a task configuration information storage unit configured to store configuration information of tasks, wherein the task configuration information storage unit includes multiple storage units configured to store multiple task queues respectively into each storage unit according to the configuration information; and a task queue configuration unit configured to configure a task queue according to the configuration information stored in the task configuration information storage unit. 2. The data processing device of claim 1, wherein the configuration information includes at least one of a start tag of one of the tasks, an end tag of one of the tasks, a priority of one of the tasks, launching manners for the tasks, a priority of the storage unit, or a priority of each task in the storage unit. 3. The data processing device of claim 1, wherein the multiple storage units include: a first storage unit associated with a first priority, configured to store a first task queue, and a second storage unit associated with a second priority, configured to store a second task queue, wherein the first priority is higher than the second priority, and a priority of the first task queue is higher than a priority of the second task queue. 4. The data processing device of claim 2, wherein a task in the task queue of the storage unit is stored ...
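A minimal software sketch of configuring a task queue from per-task configuration information (priority, insertion order) follows; the class and field names are assumptions for illustration, and a heap stands in for the priority-ordered storage units:

```python
import heapq

class TaskQueueConfig:
    """Sketch of a task queue configured from per-task configuration
    info: lower priority values dispatch first, FIFO within a priority."""
    def __init__(self):
        self._heap = []
        self._seq = 0                  # tie-breaker keeps FIFO order within a priority

    def configure(self, task, priority):
        heapq.heappush(self._heap, (priority, self._seq, task))
        self._seq += 1

    def next_task(self):
        return heapq.heappop(self._heap)[2]  # highest-priority (lowest value) task runs first
```

Two separate instances of this structure, drained in priority order, would model the first/second storage units of claim 3.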

Publication date: 02-04-2020

DATA PROCESSING APPARATUS AND METHOD

Number: US20200104207A1
Assignee:

The disclosure provides a data processing device and method. The data processing device may include: a task configuration information storage unit and a task queue configuration unit. The task configuration information storage unit is configured to store configuration information of tasks. The task queue configuration unit is configured to configure a task queue according to the configuration information stored in the task configuration information storage unit. According to the disclosure, a task queue may be configured according to the configuration information. 1. A data redundancy method comprising: dividing data into one or more importance ranks; extracting an important bit of each piece of data in each of the one or more importance ranks; and performing data redundancy processing on the important bit. 2. The data redundancy method of claim 1, wherein the data redundancy processing includes replica redundancy processing and/or error correcting code processing. 3. The data redundancy method of claim 2, wherein the performing error correcting code processing on data includes: performing redundancy storage on the data in a CRC manner; when a read operation is executed, reading a stored CRC code and performing a CRC code computation on raw data; when two CRC codes are inconsistent, correcting the data according to the stored CRC code; and when a write operation is executed, storing both the raw data and the CRC code of the data. 4. The data redundancy method of claim 2, wherein the performing error correcting code processing on data includes: performing redundancy storage on the data in a manner of ECC memory, and when the read/write operation is executed, automatically performing ECC processing by the ECC memory. 6. The data redundancy method of claim 1, wherein the dividing data into the M importance ranks includes dividing the data according to at least one of a size of the data, an absolute value of the data, a type of the data, a read ...
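The CRC path of claim 3 (store a CRC code on write, recompute and compare on read) can be sketched with Python's standard CRC32. Note this sketch only detects corruption; the store layout and function names are my assumptions:

```python
import binascii

def write_with_crc(store, key, data: bytes):
    """Write path: store the raw data together with its CRC32 code."""
    store[key] = (data, binascii.crc32(data))

def read_with_crc(store, key):
    """Read path: recompute the CRC on the raw data and compare with the
    stored code; a mismatch signals corruption (CRC32 alone can detect,
    not generally correct, errors)."""
    data, stored_crc = store[key]
    if binascii.crc32(data) != stored_crc:
        raise ValueError("CRC mismatch: data corrupted")
    return data
```

Applying this only to the "important bits" of high-importance ranks, as the method describes, keeps the redundancy overhead proportional to data importance rather than data volume.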

Publication date: 02-04-2020

DATA PROCESSING APPARATUS AND METHOD

Number: US20200104573A1
Assignee:

The disclosure provides a data processing device and method. The data processing device may include: a task configuration information storage unit and a task queue configuration unit. The task configuration information storage unit is configured to store configuration information of tasks. The task queue configuration unit is configured to configure a task queue according to the configuration information stored in the task configuration information storage unit. According to the disclosure, a task queue may be configured according to the configuration information. 1. An information processing device comprising: a storage unit configured to store data and an instruction; and a data processing unit connected with the storage unit and configured to receive the data and the instruction sent by the storage unit, perform extraction and computational processing on key features included in the data, and generate a multidimensional vector according to a computational processing result. 2. The information processing device of claim 1, wherein the key features include a facial action, a facial expression, and a corresponding position thereof. 3. The information processing device of claim 1, wherein the input data includes one or more images, and the data processing unit generates a multidimensional vector for each image according to the computational processing result. 4. The information processing device of claim 1, wherein the multidimensional vector is an emotion vector, and each element included in the multidimensional vector represents an emotion on the face, wherein the emotion includes anger, delight, pain, depression, sleepiness, or doubt. 5. The information processing device of claim 4, wherein a value of each element of the emotion vector is a number between zero and one, and the value represents a probability of appearance of an emotion corresponding to the ...

Publication date: 02-04-2020

PROCESSING METHOD AND ACCELERATING DEVICE

Number: US20200104693A1
Assignee:

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, and an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption. 1. A processing device, comprising: a storage unit configured to store an input neuron, an output neuron, a weight, and an instruction of a neural network; a coarse-grained pruning unit configured to perform coarse-grained pruning on the weight of the neural network to obtain a pruned weight and store the pruned weight and position information of a target weight into the storage unit, wherein an absolute value of the target weight is greater than a second given threshold, and the coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, where M is an integer greater than 1, and when the M weights meet a preset condition, set all or part of the M weights to 0; a coarse-grained selection unit configured to receive an input neuron and the position information of the target weight and select an input neuron corresponding to the target weight according to the position information of the target weight; and an operation unit configured to perform training according to the pruned weight, where the weights that have been set to 0 in the training process remain 0, and to perform a neural network operation according to an input target weight and an input neuron corresponding to the target weight to get an output neuron, and ...

Publication date: 02-04-2020

COMPUTING DEVICE AND METHOD

Number: US20200104713A1
Assignee:

A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, and used for controlling shutdown of the computing unit of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, and can be combined with the upgrading mode of software, thereby further increasing the computing speed, reducing the computing amount, and reducing the computing power consumption of an accelerator. 1. A sparse training method comprising: selectively zeroing one or more gradients corresponding to one or more neurons included in a layer of a neural network according to a zero-setting condition; and performing training operations by using one or more non-zeroed gradients to obtain updated gradients and synapses. 2. The sparse training method of claim 1, further comprising: screening the one or more neurons included in the layer randomly prior to zeroing corresponding gradients of selected neurons according to the zero-setting condition. 3. The sparse training method of claim 2, wherein the zero-setting condition is a zero-setting probability condition, wherein the zero-setting probability is p, N*p neurons are selected randomly, and corresponding gradients of the N*p neurons are set to zero. 4. The sparse training method of claim 1, wherein the zero-setting condition is a zero-setting threshold condition, wherein the zero-setting threshold condition includes: being less than a given threshold, being greater than a given threshold, being within a given value range, or being outside a given value range, and wherein the zero-setting threshold condition is being less than a given threshold, where the given threshold is set to a first threshold, when a gradient is less than the first threshold, the gradient ...
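The zero-setting probability condition of claim 3 (randomly pick N*p of the N neurons and zero their gradients) can be sketched as follows; the RNG choice and rounding of N*p are my assumptions:

```python
import numpy as np

def zero_gradients_randomly(grads, p, seed=None):
    """Randomly zero a fraction p of the gradients before the training
    step, per the zero-setting probability condition."""
    rng = np.random.default_rng(seed)
    n = grads.size
    k = int(n * p)                               # N*p neurons to zero
    idx = rng.choice(n, size=k, replace=False)   # random screening of neurons
    sparse = grads.copy().ravel()
    sparse[idx] = 0.0                            # their gradients are set to zero
    return sparse.reshape(grads.shape)
```

Only the surviving (non-zeroed) gradients then participate in the update, which is what reduces the training computation.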

Publication date: 09-04-2020

COMPUTING DEVICE AND METHOD

Number: US20200110609A1
Assignee:

A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, and used for controlling shutdown of the computing unit of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, and can be combined with the upgrading mode of software, thereby further increasing the computing speed, reducing the computing amount, and reducing the computing power consumption of an accelerator. 1. A processor, comprising: an instruction control unit configured to fetch a processing instruction; and an operation module configured to receive frame information, neural network parameters, and the processing instruction, and perform neural network operations on the frame information and the neural network parameters according to the processing instruction. 2. The processor of claim 1, further comprising: a storage module configured to store the frame information and the neural network parameters, wherein the frame information includes complete frame information and reference frame information, and the neural network parameters include neurons, weights, topological structures and/or processing instructions, and wherein the operation module includes: an accurate operation unit configured to fetch the complete frame information and the weights in the neural network parameters, perform neural network operations to obtain a first operation result, and transfer the first operation result to the storage module; and an approximate operation unit configured to fetch the reference frame information and an operation result of the reference frame which is obtained in advance and stored in the storage module, perform approximate operations to obtain a second operation result, and transfer the second operation result to the storage module. 3. The processor of claim 2, wherein the ...

Publication date: 09-04-2020

DATA PROCESSING APPARATUS AND METHOD

Number: US20200110635A1
Assignee:

The disclosure provides a data processing device and method. The data processing device may include: a task configuration information storage unit and a task queue configuration unit. The task configuration information storage unit is configured to store configuration information of tasks. The task queue configuration unit is configured to configure a task queue according to the configuration information stored in the task configuration information storage unit. According to the disclosure, a task queue may be configured according to the configuration information. 1. A dynamic voltage frequency scaling (DVFS) method, comprising: obtaining a processor load and a neural network configuration signal within a time period; and predicting a frequency of a processor in a next time period. 2. The method of claim 1, further comprising predicting a voltage of the processor in the next time period according to the predicted frequency. 3. The method of claim 1, wherein predicting the frequency of the processor in the next time period includes predicting a frequency of a storage unit and/or a computation unit. 4. The method of claim 1, wherein predicting the frequency of the processor in the next time period includes: predicting m frequency scaling ranges for the computation unit, and generating m+1 frequency segmentation points f0, f1, . . . , and fm in total, wherein f0 ...

Publication date: 09-04-2020

APPARATUS AND METHODS FOR FORWARD PROPAGATION IN CONVOLUTIONAL NEURAL NETWORKS

Number: US20200110983A1
Assignee:

Aspects for forward propagation of a convolutional artificial neural network are described herein. The aspects may include a direct memory access unit configured to receive input data from a storage device and a master computation module configured to select one or more portions of the input data based on a predetermined convolution window. Further, the aspects may include one or more slave computation modules respectively configured to convolute a convolution kernel with one of the one or more portions of the input data to generate a slave output value. Further still, the aspects may include an interconnection unit configured to combine the one or more slave output values into one or more intermediate result vectors, wherein the master computation module is further configured to merge the one or more intermediate result vectors into a merged intermediate vector. 1. An apparatus for neural network operations, comprising: a controller circuit configured to receive an instruction; and a computation circuit configured to: receive input data; select, in response to the instruction, one or more portions of the input data based on a predetermined convolution window, wherein the instruction includes a first address of the one or more portions of the input data, a first size of the one or more portions of the input data, a second address of a portion of a convolution kernel, and a second size of the portion of the convolution kernel; convolute the portion of the convolution kernel with one of the one or more portions of the input data to generate a slave output value; combine the one or more slave output values into one or more intermediate result vectors; and merge the one or more intermediate result vectors into a merged intermediate vector. 2. The apparatus of claim 1, wherein the computation circuit includes a master computation circuit configured to: receive the input data, select, in response to the instruction, the one or more portions of the input data ...
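The forward pass described above, where a convolution window selects portions of the input and each portion is convoluted with the kernel to produce a slave output value, can be sketched in plain NumPy (function name, 2-D single-channel setting, and stride are my assumptions):

```python
import numpy as np

def conv2d_forward(x, kernel, stride=1):
    """Slide a convolution window over the input, convolute the kernel
    with each selected portion, and collect the outputs -- a software
    analogue of the master/slave split described in the abstract."""
    kh, kw = kernel.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i*stride:i*stride+kh, j*stride:j*stride+kw]  # portion chosen by the window
            out[i, j] = np.sum(window * kernel)                     # one slave output value
    return out
```

In the hardware described, the inner products for different windows would be distributed across the slave modules and the results merged by the interconnection unit.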

Publication date: 09-04-2020

COMPUTING DEVICE AND METHOD

Number: US20200110988A1
Assignee:

A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, and used for controlling shutdown of the computing unit of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, and can be combined with the upgrading mode of software, thereby further increasing the computing speed, reducing the computing amount, and reducing the computing power consumption of an accelerator. 1. An operation device comprising: an operation module comprising one or more operation units; and a control module comprising an operation control unit configured to disable at least one of the one or more operation units according to a determination condition. 2. The operation device of claim 1, wherein each of the one or more operation units includes a temporary caching unit and one or more operation components, and the operation components include one or more of an adder, a multiplier, or a selector. 3. The operation device of claim 2, wherein the operation module includes n multipliers located at a first stage and an adder tree with n inputs located at a second stage, wherein n represents a positive integer. 4. The operation device of claim 1, wherein the determination condition includes a threshold determination condition or a function mapping determination condition, wherein the threshold determination condition includes: being less than a given threshold, being greater than a given threshold, being within a given value range, or being outside a given value range, and wherein the function mapping determination condition determines whether a given condition is satisfied after a function transformation is performed. 5. The operation device of claim 3, wherein the n multipliers located at the first stage are connected to the operation ...
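Claim 3's two-stage structure, with n multipliers feeding an n-input adder tree, computes a dot product. A software sketch of the same dataflow (the pairwise-reduction layout of the tree is an assumption):

```python
def multiply_add_tree(a, b):
    """First stage: n multipliers form pairwise products; second stage:
    an n-input adder tree reduces them level by level to one sum."""
    products = [x * y for x, y in zip(a, b)]  # stage 1: n multipliers
    while len(products) > 1:                  # stage 2: adder tree, one level per iteration
        if len(products) % 2:
            products.append(0)                # pad odd levels so adders pair up
        products = [products[i] + products[i + 1]
                    for i in range(0, len(products), 2)]
    return products[0]
```

The tree shape matters in hardware because each level of adders works in parallel, giving log2(n) add latency instead of n sequential additions.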

Publication date: 09-04-2020

APPARATUS AND METHODS FOR TRAINING IN CONVOLUTIONAL NEURAL NETWORKS

Number: US20200111007A1
Assignee:

Aspects for backpropagation of a convolutional neural network are described herein. The aspects may include a direct memory access unit configured to receive input data from a storage device and a master computation module configured to select one or more portions of the input data based on a predetermined convolution window. Further, the aspects may include one or more slave computation modules respectively configured to convolute one of the one or more portions of the input data with one of one or more previously calculated first data gradients to generate a kernel gradient, wherein the master computation module is further configured to update a prestored convolution kernel based on the kernel gradient. 1. An apparatus for backpropagation of a convolutional neural network, comprising: a controller unit configured to receive an instruction; and a computation circuit configured to: receive input data; select one or more portions of the input data based on a predetermined convolution window in response to an instruction; respectively convolute one of the one or more portions of the input data with one of one or more calculated first data gradients to generate a kernel gradient; and update a prestored convolution kernel based on the kernel gradient. 2. The apparatus of claim 1, wherein the computation circuit includes a master computation module configured to: receive the input data; select the one or more portions of the input data based on the predetermined convolution window in response to the instruction; and update a prestored convolution kernel based on the kernel gradient. 3. The apparatus of claim 1, wherein the computation circuit includes one or more slave computation modules respectively configured to convolute the one of the one or more portions of the input data with the one of one or more calculated first data gradients to generate the kernel gradient. 4. The apparatus of claim 3, wherein the one or more slave computation modules are respectively ...
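The dataflow described above (convolve input windows with previously calculated output data gradients to obtain a kernel gradient, then update the prestored kernel) can be sketched for a stride-1, one-dimensional case. The function names and the learning-rate update rule are illustrative assumptions, not the patented implementation:

```python
# Hypothetical sketch: the kernel gradient of a stride-1, valid 1-D convolution
# is accumulated by convolving each input window ("convolution window" portion)
# with the corresponding previously calculated output (data) gradient.

def conv_kernel_gradient(inputs, out_grads, kernel_size):
    """Accumulate dL/dW for a stride-1, valid 1-D convolution."""
    grad = [0.0] * kernel_size
    for i, g in enumerate(out_grads):          # one output position per window
        window = inputs[i:i + kernel_size]     # the selected input portion
        for k, x in enumerate(window):
            grad[k] += x * g                   # dL/dW[k] = sum_i x[i+k] * g[i]
    return grad

def update_kernel(kernel, grad, lr=0.1):
    """Update the prestored kernel against the gradient (lr is an assumption)."""
    return [w - lr * dw for w, dw in zip(kernel, grad)]
```

For example, with inputs `[1, 2, 3, 4]`, a kernel of size 3, and output gradients `[1, 1]`, the kernel gradient is `[3, 5, 7]`.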

13-05-2021 publication date

Video retrieval method, and method and apparatus for generating video retrieval mapping relationship

Number: US20210142069A1
Assignee: Cambricon Technologies Corp Ltd

The present disclosure relates to a video retrieval method, system and device for generating a video retrieval mapping relationship, and a storage medium. The video retrieval method comprises: acquiring a retrieval instruction, wherein the retrieval instruction carries retrieval information for retrieving a target frame picture; and obtaining the target frame picture according to the retrieval information and a preset mapping relationship. The method for generating a video retrieval mapping relationship comprises: performing a feature extraction operation on each frame picture in a video stream by using a feature extraction model so as to obtain a key feature sequence corresponding to each frame picture; inputting the key feature sequence corresponding to each frame picture into a text sequence extraction model for processing so as to obtain a text description sequence corresponding to each frame picture; and constructing a mapping relationship according to the text description sequence corresponding to each frame picture.
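The mapping construction described above (frame → key feature sequence → text description sequence → mapping, then retrieval by matching the description) can be illustrated with toy stand-ins for the two models; `extract_features` and `describe` are hypothetical placeholders for the feature extraction model and the text sequence extraction model:

```python
# Toy stand-ins (assumptions) for the two models in the abstract.

def extract_features(frame):
    """Toy 'key feature sequence': sorted unique values of the frame."""
    return sorted(set(frame))

def describe(features):
    """Toy 'text description sequence' derived from the key features."""
    return " ".join(f"f{v}" for v in features)

def build_mapping(video_stream):
    """Map each frame's text description sequence to its frame index."""
    mapping = {}
    for idx, frame in enumerate(video_stream):
        mapping[describe(extract_features(frame))] = idx
    return mapping

def retrieve(mapping, query):
    """Return the target frame index whose description matches the query."""
    return mapping.get(query)
```

A retrieval instruction carrying the description `"f2"` would then resolve to the frame whose key features produced that text sequence.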

16-04-2020 publication date

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Number: US20200117519A1
Assignee:

A data sharing system may include a storage module and at least two processing modules. The at least two processing modules may share the storage module and the at least two processing modules communicate to implement data sharing. A data sharing method for the data sharing system is provided. According to the disclosure, a storage communication overhead may be reduced, and a data access delay may be effectively reduced. 1. A data sharing system, comprising: a first processing module that includes a first internal storage unit; and a second processing module configured to transmit, to the first processing module, a request signal that includes a data address in the first internal storage unit, wherein the first processing module is configured to: retrieve, upon receiving the request signal, data at the data address in the first internal storage unit, and transmit the retrieved data to the second processing module. 2. The data sharing system of claim 1, wherein the first processing module is further configured to transmit an acknowledge signal to the second processing module upon receiving the request signal. 3. The data sharing system of claim 1, wherein each of the first processing module and the second processing module includes a physical processor. 4. The data sharing system of claim 3, wherein the physical processor includes an artificial neural network processor configured to perform artificial neural network forward computations. 5. The data sharing system of claim 4, wherein the artificial neural network processor includes an instruction caching unit configured to read an instruction from a Direct Memory Access (DMA) and cache the read instruction. 6. The data sharing system of claim 5, wherein the artificial neural network processor includes a controlling unit configured to read the instruction from the instruction caching unit and decode the instruction into one or more microinstructions. 7. The data sharing system of claim 6, wherein the artificial ...

25-04-2019 publication date

Apparatus and Methods for Neural Network Operations Supporting Fixed Point Numbers of Short Bit Length

Number: US20190122094A1
Assignee: Cambricon Technologies Corp Ltd

Aspects for neural network operations with fixed-point number of short bit length are described herein. The aspects may include a fixed-point number converter configured to convert one or more first floating-point numbers to one or more first fixed-point numbers in accordance with at least one format. Further, the aspects may include a neural network processor configured to process the first fixed-point numbers to generate one or more process results.
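The float-to-fixed conversion can be sketched for one plausible format (a total bit width plus a fraction bit width, with saturation at the representable range). The format fields and the rounding policy are assumptions, since the abstract only says conversion happens "in accordance with at least one format":

```python
# Hedged sketch of converting floating-point numbers to short-bit-length
# fixed-point values; (total_bits, frac_bits) is an assumed format descriptor.

def float_to_fixed(values, total_bits=8, frac_bits=4):
    """Quantize floats to signed fixed-point integers, saturating at the range."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    scale = 1 << frac_bits
    out = []
    for v in values:
        q = int(round(v * scale))
        out.append(max(lo, min(hi, q)))   # saturate to the short bit length
    return out

def fixed_to_float(values, frac_bits=4):
    """Recover approximate floating-point values from the fixed-point codes."""
    return [v / (1 << frac_bits) for v in values]
```

With 8 total bits and 4 fraction bits, 1.5 encodes as 24 (1.5 × 16), and any value above 7.9375 saturates to 127.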

16-04-2020 publication date

Processing apparatus and processing method

Number: US20200117976A1

The present disclosure provides a processing device and method. The device includes: an input/output module, a controller module, a computing module, and a storage module. The input/output module is configured to store and transmit input and output data; the controller module is configured to decode a computation instruction into a control signal to control other modules to perform operations; the computing module is configured to perform the four arithmetic operations, logical operations, shift operations, and complement operations on data; and the storage module is configured to temporarily store instructions and data. The present disclosure can execute a composite scalar instruction accurately and efficiently.

16-04-2020 publication date

DATA SHARING SYSTEM AND DATA SHARING METHOD THEREFOR

Number: US20200118004A1
Assignee:

The present disclosure provides a processing device for performing a generative adversarial network and a method for machine creation applying the processing device. The processing device includes a memory configured to receive input data including a random noise and reference data, and store a discriminator neural network parameter and a generator neural network parameter, and the processing device further includes a computation device configured to transmit the random noise input data into a generator neural network and perform operation to obtain a noise generation result, and input both the noise generation result and the reference data into a discriminator neural network and perform operation to obtain a discrimination result, and further configured to update the discriminator neural network parameter and the generator neural network parameter according to the discrimination result. 1. A processing device for performing a generative adversarial network, comprising: a memory configured to: receive input data that includes a random noise and reference data, and store discriminator neural network parameters and generator neural network parameters; and a computation device configured to: transmit the random noise input data into a generator neural network and perform operation to obtain a noise generation result, input the noise generation result and the reference data into a discriminator neural network to obtain a discrimination result, and update the discriminator neural network parameters and the generator neural network parameters according to the discrimination result. 2. The processing device of claim 1, wherein the memory is further configured to store a computation instruction, and the processing device further includes: a controller configured to decode the computation instruction into one or more operation instructions and send the one or more operation instructions to the computation device. 3. The processing device of claim 1, ...

27-05-2021 publication date

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Number: US20210157992A1
Assignee:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of fast computation speed and high efficiency. 1. An information processing method applied to a computation circuit, wherein the computation circuit comprises a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain first language information input by a user; controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to process the first language information to obtain second language information, wherein when the processing is language translation processing, an applied language corresponding to the second language information is different from an applied language corresponding to the first language information; when the processing is chat prediction processing, the second language information is chat feedback information obtained by predicting the first language information; and the operation instruction is an instruction for language processing preset by a user side or a terminal side. 2. The method of claim 1, wherein the computation circuit further includes a register circuit and a controller circuit, and the controlling, by the computation circuit, the operation circuit to obtain and call an ...

27-05-2021 publication date

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Number: US20210158484A1
Assignee:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of fast computation speed and high efficiency. 1. An information processing method applied to a computation circuit, wherein the computation circuit includes a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain a first image to be processed, wherein the first image has a resolution of a first-level size; controlling, by the computation circuit, the operation circuit to obtain and execute an operation instruction to perform resolution optimization on the first image to obtain a second image, wherein the second image has a resolution of a second-level size, the first-level size is smaller than the second-level size, and the operation instruction is a preset instruction for optimizing an image resolution. 2. The method of claim 1, wherein the controlling, by the computation circuit, the communication circuit to obtain a first image to be processed includes: controlling, by the computation circuit, the communication circuit to obtain an original image to be processed input by a user, wherein the original image has a resolution of the first-level size, and controlling, by the computation circuit, the operation circuit to pre-process the original image to ...

02-05-2019 publication date

APPARATUS AND METHODS FOR CIRCULAR SHIFT OPERATIONS

Number: US20190129858A1
Assignee:

Aspects for vector circular shifting in neural networks are described herein. The aspects may include a direct memory access unit configured to receive a vector that includes multiple elements. The multiple elements are stored in a one-dimensional data structure. The direct memory access unit may store the vector in a vector caching unit. The aspects may further include an instruction caching unit configured to receive a vector shifting instruction that includes a step length for shifting the elements in the vector. Further still, the aspects may include a computation module configured to shift the elements of the vector toward one direction by the step length. 1. An apparatus for vector shifting in a neural network, comprising: a controller unit configured to receive a vector shifting instruction; and a computation module configured to: receive, in response to the vector shifting instruction, a vector that includes multiple elements, wherein the multiple elements are stored in a one-dimensional data structure, and shift, in response to the vector shifting instruction, the elements of the vector toward one direction in accordance with the vector shifting instruction, wherein the vector shifting instruction includes a step length for shifting the elements in the vector. 2. The apparatus of claim 1, wherein the computation module includes an element caching unit configured to temporarily store one or more of the elements. 3. The apparatus of claim 2, wherein the computation module is configured to duplicate a first portion of the elements to the element caching unit. 4. The apparatus of claim 3, wherein the computation module is further configured to overwrite the first portion of the elements in the vector caching unit with a second portion of the elements. 5. The apparatus of claim 4, wherein the computation module is further configured to retrieve the first portion of the elements from the element caching unit, and overwrite the second portion of the elements ...
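A one-dimensional circular shift by a step length, as the instruction above specifies, can be sketched in a few lines. The wrap direction and the handling of steps larger than the vector length are assumptions:

```python
def circular_shift(vector, step):
    """Shift elements of a 1-D vector toward one direction by `step`,
    wrapping the displaced portion around to the other end (playing the role
    of the element caching unit that temporarily holds the duplicated part)."""
    n = len(vector)
    if n == 0:
        return vector[:]
    s = step % n                 # steps beyond the length wrap around (assumption)
    return vector[s:] + vector[:s]
```

For example, shifting `[1, 2, 3, 4, 5]` by a step length of 2 yields `[3, 4, 5, 1, 2]`.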

02-05-2019 publication date

Apparatus and methods for backward propagation in neural networks supporting discrete data

Number: US20190130274A1
Assignee:

Aspects for backpropagation of a multilayer neural network (MNN) in a neural network processor are described herein. The aspects may include a computation module configured to receive one or more groups of MNN data. The computation module may further include a master computation module configured to calculate an input gradient vector based on a first output gradient vector from an adjacent layer and based on a data type of each of the one or more groups of MNN data. Further still, the computation module may include one or more slave computation modules configured to parallelly calculate portions of a second output vector based on the input gradient vector calculated by the master computation module and based on the data type of each of the one or more groups of MNN data. 1. An apparatus for backpropagation of a multilayer neural network (MNN), comprising: a computation module configured to receive one or more groups of MNN data, wherein the one or more groups of MNN data include input data and one or more weight values, wherein at least a portion of the input data and the weight values are presented as discrete values, and wherein the computation module includes: a master computation module configured to calculate an input gradient vector based on a first output gradient vector from an adjacent layer and based on a data type of each of the one or more groups of MNN data, and one or more slave computation modules configured to parallelly calculate portions of a second output vector based on the input gradient vector calculated by the master computation module and based on the data type of each of the one or more groups of MNN data; and a controller unit configured to decode an instruction that initiates a backpropagation process and transmit the decoded instruction to the computation module. 2. The apparatus of claim 1, wherein the interconnection unit is configured to combine the portions of the second output gradient vector to generate the second ...

23-04-2020 publication date

COMPUTING DEVICE AND METHOD

Number: US20200125938A1
Assignee:

A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, used for controlling shutdown of the computing units of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, can be combined with software upgrading, and thereby further increase the computing speed, reduce the amount of computation, and reduce the computing power consumption of an accelerator. 1. A training device comprising: a data processing module configured to compress or expand input data; and an operation module connected to the data processing module and configured to receive data processed by the data processing module to perform operations. 4. The training device of claim 2, wherein the data compression unit is configured to screen and compress input data according to sparse index values of the input data to obtain data to be operated on. 5. The training device of claim 2, wherein the data compression unit is configured to: make a determination according to values of the input data, and screen and compress the input data to obtain data that satisfies the compression determination condition. 6. The training device of claim 4, wherein the data compression unit is configured to screen and compress input neuron data according to sparse index values of synaptic data to obtain neuron data to be operated on. 7. The training device of claim 4, wherein the data compression unit is configured to screen and compress input synaptic data according to sparse index values of neuron data to obtain synaptic data to be operated on. 8. The training device of claim 4, wherein the data compression unit is configured to: compare values of synapses with a given threshold, and screen and compress the synapses to obtain synaptic data whose absolute values are not less than the given threshold. 9. ...
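The screening-and-compression steps in the claims above (selecting values by sparse index values, and thresholding synapse magnitudes) can be sketched as two small helpers; the exact screening condition and data layout are assumptions:

```python
def screen_synapses(synapses, threshold):
    """Return (sparse_indices, compressed_synapses): positions whose absolute
    value is not less than the given threshold, and their surviving values."""
    kept = [(i, w) for i, w in enumerate(synapses) if abs(w) >= threshold]
    return [i for i, _ in kept], [w for _, w in kept]

def compress_by_sparse_index(neurons, sparse_indices):
    """Keep only the neuron values selected by the synapses' sparse indices,
    so the operation module only receives data that has a surviving synapse."""
    return [neurons[i] for i in sparse_indices]
```

For example, screening `[0.1, -0.5, 0.0, 0.9]` with threshold 0.5 keeps indices 1 and 3, and those same indices then gather the neuron data to be operated on.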

23-04-2020 publication date

IMAGE PROCESSING APPARATUS AND METHOD

Number: US20200126554A1
Assignee:

The present disclosure discloses an image processing device including: a receiving module configured to receive a voice signal and an image to be processed; a conversion module configured to convert the voice signal into an image processing instruction and determine a target area according to a target voice instruction conversion model, in which the target area is a processing area of the image to be processed; and a processing module configured to process the target area according to the image processing instruction and a target image processing model. The examples may realize the functionality of using voice commands to control image processing, which may save users' time spent in learning image processing software prior to image processing, and improve user experience. 1. An image processing device comprising: a voice collector configured to collect a voice signal input by users; an instruction converter configured to convert the voice signal into an image processing instruction and a target area according to a target voice instruction conversion model, wherein the target area is a processing area of the image to be processed; and an image processor configured to process the target area according to the image processing instruction and a target image processing model. 3. The image processing device of claim 2, wherein the image processor includes: an instruction fetch module configured to obtain M image processing instructions from the memory in a preset time window, and a processing module configured to process the target area according to the M image processing instructions and the target image processing model. 4. The image processing device of claim 3, wherein the processing module is configured to: delete image processing instructions with identical functions in the M image processing instructions to obtain N image processing instructions, wherein N is an integer smaller than M, and process the target area according to the N image processing instructions ...

23-04-2020 publication date

IMAGE PROCESSING APPARATUS AND METHOD

Number: US20200126555A1
Assignee:

The present disclosure discloses an image processing device including: a receiving module configured to receive a voice signal and an image to be processed; a conversion module configured to convert the voice signal into an image processing instruction and determine a target area according to a target voice instruction conversion model, in which the target area is a processing area of the image to be processed; and a processing module configured to process the target area according to the image processing instruction and a target image processing model. The examples may realize the functionality of using voice commands to control image processing, which may save users' time spent in learning image processing software prior to image processing, and improve user experience. 1. An image processing device comprising: an input/output unit configured to input a voice signal and an image to be processed; a storage unit configured to store the voice signal and the image to be processed; an image processing unit configured to convert the voice signal into an image processing instruction and a target area, wherein the target area is a processing area of the image to be processed, process the target area according to the image processing instruction to obtain a processed image, and store the image to be processed into the storage unit; and the input/output unit is further configured to output the processed image. 2. The image processing device of claim 1, wherein the storage unit includes a neuron storage unit and a weight cache unit, and a neural network operation unit of the image processing unit includes a neural network operation subunit; when the neuron storage unit is configured to store the voice signal and the image to be processed, and the weight cache unit is configured to store a target voice instruction conversion model and a target image processing model, the neural network operation subunit is configured to convert the voice signal into the image ...

09-05-2019 publication date

Apparatus and Methods for Performing Multiple Transcendental Function Operations

Number: US20190138570A1
Assignee:

The present invention discloses an apparatus and a method for performing a variety of transcendental function operations. The apparatus comprises a pre-processing unit group, a core unit, and a post-processing unit group, wherein the pre-processing unit group is configured to transform an externally input independent variable a into x, y coordinates, an angle z, and other information k, and determine an operation mode to be used by the core unit; the core unit is configured to perform trigonometric or hyperbolic transformation on the x, y coordinates and the angle z, obtain transformed x′, y′ coordinates and angle z′, and output them to the post-processing unit group; and the post-processing unit group is configured to transform the x′, y′ coordinates and the angle z′ input by the core unit according to the other information k and a function f input by the pre-processing unit group to obtain an output result c. The present invention solves the problems of excessive overheads in the general-purpose processor manner and poor precision in the pure linear approximation manner, and efficiently strengthens the support for various transcendental function operations. 1. An apparatus for performing multiple transcendental function operations, comprising: a pre-processing unit group configured to transform an externally input independent variable a into a first coordinate x, a second coordinate y, an angle z, and other information k; a core unit configured to perform a transformation on the first coordinate x, the second coordinate y, and the angle z, and obtain a transformed first coordinate x′, a transformed second coordinate y′, and a transformed angle z′; and a post-processing unit group configured to transform the transformed first coordinate x′, the transformed second coordinate y′, and the transformed angle z′ input by the core unit according to the other information k and a function f input by the pre-processing unit group to obtain an output result c. 2. The apparatus ...

09-05-2019 publication date

Apparatus and methods for forward propagation in neural networks supporting discrete data

Number: US20190138922A1
Assignee:

Aspects for forward propagation of a multilayer neural network (MNN) in a neural network processor are described herein. As an example, the aspects may include a computation module that includes a master computation module and one or more slave computation modules. The master computation module may be configured to receive one or more groups of MNN data. The one or more groups of MNN data may include input data and one or more weight values, wherein at least a portion of the input data and the weight values are stored as discrete values. The one or more slave computation modules may be configured to calculate one or more groups of slave output values based on a data type of each of the one or more groups of MNN data. 1. An apparatus for forward propagation of a multilayer neural network (MNN), comprising: a computation module that includes a master computation module and one or more slave computation modules, wherein the master computation module is configured to: receive one or more groups of MNN data, wherein the one or more groups of MNN data include input data and one or more weight values and wherein at least a portion of the input data and the weight values are stored as discrete values, and transmit the MNN data to an interconnection unit; wherein the one or more slave computation modules are configured to: receive the one or more groups of MNN data, and calculate one or more groups of slave output values based on a data type of each of the one or more groups of MNN data; and wherein the master computation module is further configured to: calculate a merged intermediate vector based on the data type of each of the one or more groups of MNN data, and generate an output vector based on the merged intermediate vector; and a controller unit configured to transmit one or more instructions to the computation module. 2. The apparatus of claim 1, wherein the interconnection unit is configured to combine the one or more groups of slave output values to ...

30-04-2020 publication date

Processing method and accelerating device

Number: US20200134460A1

The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption.
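The sliding-window pruning described above (select M weights through a window, set all of them to 0 when they meet a preset condition) can be sketched as follows; the preset condition used here (every magnitude in the window below a threshold) and the non-overlapping window stride are illustrative assumptions:

```python
def coarse_grained_prune(weights, m, threshold):
    """Zero out whole windows of M weights when the group meets the condition.

    The condition here is an assumption: the window is pruned only when every
    weight's magnitude falls below `threshold`. Windows do not overlap, and a
    trailing partial window is left untouched.
    """
    pruned = list(weights)
    i = 0
    while i + m <= len(pruned):
        window = pruned[i:i + m]
        if all(abs(w) < threshold for w in window):  # preset pruning condition
            for j in range(i, i + m):
                pruned[j] = 0.0
        i += m
    return pruned
```

Pruning whole groups rather than individual weights is what makes the sparsity "coarse-grained": the operation unit can then skip entire windows, reducing both memory accesses and computation.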

16-05-2019 publication date

Apparatus and Methods for Vector Based Transcendental Functions

Number: US20190146793A1
Assignee:

Aspects for applying transcendental functions to vectors in neural networks are described herein. The aspects may include a controller unit configured to receive a transcendental function instruction that includes an address of a vector and an operation code that identifies a transcendental function. The aspects may further include a CORDIC processor configured to receive the vector that includes one or more elements based on the address of the vector in response to the transcendental function instruction. The CORDIC processor may be further configured to apply the transcendental function to each element of the vector to generate an output vector. 1. An apparatus for neural network operations, comprising: a controller unit configured to receive a transcendental function instruction that indicates an address of a vector and an operation code that identifies a transcendental function; and a CORDIC processor configured to receive the vector that includes one or more elements based on the address of the vector in response to the transcendental function instruction, wherein the CORDIC processor is further configured to apply the transcendental function to each element of the vector to generate an output vector. 2. The apparatus of claim 1, wherein the transcendental function instruction includes one or more register IDs that identify one or more registers configured to store the address of the vector and the length of the vector, wherein the transcendental function instruction further indicates a length of the vector, wherein the CORDIC processor is configured to retrieve the vector based on the length of the vector and the address of the vector, and wherein the CORDIC processor includes one or more CORDIC modules respectively configured to apply the transcendental function to one of the one or more elements to generate a result. 3. The apparatus of claim 2, wherein the transcendental function instruction is an instruction selected from a group consisting of an exponential ...
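A CORDIC module applied elementwise, as the claims describe, can be sketched in rotation mode for sine and cosine; the iteration count and the way the gain is corrected are assumptions, and the input range is restricted to CORDIC's convergence interval:

```python
import math

# Rotation-mode CORDIC sketch: each iteration rotates (x, y) by ±atan(2^-i)
# so the residual angle z is driven to zero, leaving (cos z0, sin z0) up to
# a known gain, which is corrected at the end.

def cordic_sin_cos(angle, iterations=32):
    """Return (cos(angle), sin(angle)) for angle within roughly [-1.74, 1.74]."""
    x, y, z = 1.0, 0.0, angle
    k = 1.0
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
        k *= math.cos(math.atan(2.0 ** -i))  # accumulate the gain correction
    return x * k, y * k

def apply_elementwise(vector, fn=cordic_sin_cos):
    """Apply the transcendental function to each element to build the output vector."""
    return [fn(v) for v in vector]
```

In hardware the per-iteration multiplies by 2^-i are plain shifts and the atan constants sit in a small table, which is why CORDIC avoids both the overhead of a general-purpose processor and the precision loss of pure linear approximation.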

16-05-2019 publication date

APPARATUS AND METHODS FOR MATRIX ADDITION AND SUBTRACTION

Number: US20190147015A1
Assignee:

Aspects for matrix addition in neural networks are described herein. The aspects may include a controller unit configured to receive a matrix-addition instruction. The aspects may further include a computation module configured to receive a first matrix and a second matrix. The first matrix may include one or more first elements and the second matrix includes one or more second elements. The one or more first elements and the one or more second elements may be arranged in accordance with a two-dimensional data structure. The computation module may be further configured to respectively add each of the first elements to each of the second elements based on a correspondence in the two-dimensional data structure to generate one or more third elements for a third matrix. 1. An apparatus of matrix operations in a neural network, comprising: a controller unit configured to receive a matrix-addition instruction that indicates a first address of a first matrix and a second address of a second matrix; and a computation module configured to: retrieve the first matrix and the second matrix from a storage device based on the first address of the first matrix and the second address of the second matrix, wherein the first matrix includes one or more first elements and the second matrix includes one or more second elements, and wherein the one or more first elements and the one or more second elements are arranged in accordance with a two-dimensional data structure; and respectively add each of the first elements to each of the second elements based on a correspondence in the two-dimensional data structure in accordance with the matrix-addition instruction to generate one or more third elements for a third matrix. 2. The apparatus of claim 1, wherein the computation module includes a data controller configured to select a first portion of the first elements and a second portion of the second elements. 3. The apparatus of claim 2, wherein the computation module ...
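The positional correspondence described above amounts to an elementwise sum over the two-dimensional structure, which can be sketched as follows (the shape check is an added assumption, since the claims do not say what happens on mismatched matrices):

```python
def matrix_add(a, b):
    """Add two equally-shaped 2-D matrices element-by-element: the third
    matrix's element at (i, j) is the sum of the first and second matrices'
    elements at the same (i, j) correspondence."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrix shapes must match")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

For example, adding `[[1, 2], [3, 4]]` and `[[10, 20], [30, 40]]` yields `[[11, 22], [33, 44]]`.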

17-06-2021 publication date

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Number: US20210182077A1
Assignee:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the embodiments in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. 1. An information processing method, wherein the method is applied to a terminal device that includes a computation device, and the computation device stores an instruction set which includes at least one operation instruction, and the method includes: obtaining first information, wherein the first information is to be processed by the terminal device; calling the operation instruction in the computation device to process the first information to obtain second information; and outputting the second information. 2. The method of claim 1, wherein the obtaining the first information includes: pre-processing raw information to obtain the first information, wherein the first information is in a preset format, and the pre-processing includes at least one of: data deduplication, data encoding, data conversion, and normalization. 3. The method of claim 1, wherein the operation instruction includes at least one of: a matrix-multiply-vector instruction, a vector-multiply-matrix instruction, a matrix-multiply-scalar instruction, a tensor operation instruction, a matrix addition instruction, a matrix subtraction instruction, a matrix retrieving instruction, a matrix loading instruction, a matrix saving instruction, and a matrix moving ...

Details
24-06-2021 publication date

INFORMATION PROCESSING METHOD AND TERMINAL DEVICE

Number: US20210192245A1
Assignee:

Disclosed are an information processing method and a terminal device. The method comprises: acquiring first information, wherein the first information is information to be processed by a terminal device; calling an operation instruction in a calculation apparatus to calculate the first information so as to obtain second information; and outputting the second information. By means of the examples in the present disclosure, a calculation apparatus of a terminal device can be used to call an operation instruction to process first information, so as to output second information of a target desired by a user, thereby improving the information processing efficiency. The present technical solution has the advantages of a fast computation speed and high efficiency.

1. An information processing method applied to a computation circuit, wherein the computation circuit includes a communication circuit and an operation circuit, and the method comprises: controlling, by the computation circuit, the communication circuit to obtain a target image to be processed, wherein the target image includes a target object to be identified; and controlling, by the computation circuit, the operation circuit to obtain and call an operation instruction to perform object detection and identification on the target image, so as to obtain a target detection result, wherein the target detection result is used to indicate the target object in the target image, and the operation instruction is a pre-stored instruction for object detection and identification.

2. The method of claim 1, wherein the computation circuit further includes a register circuit and a controller circuit, and the controlling, by the computation circuit, the operation circuit to obtain an operation instruction to perform object detection and identification on the target image, so as to obtain a target detection result includes: controlling, by the computation circuit, the controller circuit to fetch a ...
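The control flow in these claims can be sketched as follows: a controller circuit fetches a pre-stored object-detection instruction, and the operation circuit applies it to the target image to produce a target detection result. This is a toy sketch under stated assumptions; the names (`Controller`, `detect`, `INSTRUCTION_STORE`, `run`) and the threshold-based "detection" are hypothetical stand-ins for the patent's unspecified detection operation.

```python
# Illustrative sketch of the claimed control flow; names are hypothetical.

def detect(image, threshold=0.5):
    """Toy 'object detection': report pixel coordinates brighter than a threshold."""
    return [(r, c) for r, row in enumerate(image)
                   for c, px in enumerate(row) if px > threshold]

# Pre-stored instructions for object detection and identification.
INSTRUCTION_STORE = {"object-detect": detect}

class Controller:
    """Stands in for the controller circuit: fetches a stored instruction."""
    def fetch(self, name):
        return INSTRUCTION_STORE[name]

def run(image):
    op = Controller().fetch("object-detect")     # controller fetches the instruction
    return op(image)                             # operation circuit produces the result

print(run([[0.1, 0.9],
           [0.6, 0.2]]))                         # → [(0, 1), (1, 0)]
```

The returned coordinate list plays the role of the target detection result indicating where the target object lies in the image.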

Details