Total found: 5554. Displayed: 100.
Publication date: 26-01-2012

Generating Hardware Events Via the Instruction Stream for Microprocessor Verification

Number: US20120023315A1
Assignee: International Business Machines Corp

A processor receives an instruction operation (OP) code from a verification system. The instruction OP code includes instruction bits and forced event bits. The processor identifies a forced event based upon the forced event bits, which is unrelated to an instruction that corresponds to the instruction bits. In turn, the processor executes the forced event.
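As a rough illustration of carrying forced-event bits alongside ordinary instruction bits in a single OP-code word, the sketch below packs and splits the two fields. The field widths, event names, and helper functions are invented for the example and are not taken from the patent.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: low 24 bits carry the instruction, high 8 bits
 * carry forced-event flags injected by the verification system.       */
#define INSTR_MASK  0x00FFFFFFu
#define EVENT_SHIFT 24

enum forced_event { EV_NONE = 0, EV_FLUSH = 1, EV_PARITY = 2 };

static uint32_t encode_op(uint32_t instr_bits, uint8_t event_bits) {
    return (instr_bits & INSTR_MASK) | ((uint32_t)event_bits << EVENT_SHIFT);
}

static void execute_op(uint32_t op) {
    uint32_t instr = op & INSTR_MASK;              /* normal decode path        */
    uint8_t  event = (uint8_t)(op >> EVENT_SHIFT); /* forced event, unrelated to instr */
    printf("decode instr=0x%06X, forced event=%u\n", (unsigned)instr, (unsigned)event);
    /* a verification model would raise the event here regardless of instr */
}

int main(void) {
    execute_op(encode_op(0x123456u, EV_PARITY));
    return 0;
}
```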

Publication date: 09-02-2012

Event-based bandwidth allocation mode switching method and apparatus

Number: US20120036518A1
Author: Jack Kang, Yu-Chi Chuang
Assignee: Jack Kang, Yu-Chi Chuang

A system, apparatus, and method for allocation mode switching on an event-driven basis are described herein. The allocation mode switching method includes detecting an event, selecting a bandwidth allocation mode associated with the detected event, and allocating a plurality of execution cycles of an instruction execution period of a processor core among a plurality of instruction execution threads based at least in part on the selected bandwidth allocation mode. Other embodiments may be described and claimed.
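The cycle-allocation idea can be sketched as a small mode table consulted on every issue slot; the period length, thread count, and per-mode shares below are invented for illustration and are not taken from the claims.

```c
#include <stdio.h>

/* Illustrative only: two allocation modes over an 8-cycle execution
 * period, splitting issue cycles between two hardware threads.       */
#define PERIOD  8
#define THREADS 2

static const int mode_share[2][THREADS] = {
    { 6, 2 },   /* mode 0: favour thread 0               */
    { 4, 4 },   /* mode 1: equal split after some event  */
};

static int pick_thread(int mode, int cycle) {
    /* Issue thread 0 for its share of the period, then thread 1. */
    return (cycle % PERIOD) < mode_share[mode][0] ? 0 : 1;
}

int main(void) {
    int mode = 0;
    for (int cycle = 0; cycle < 2 * PERIOD; ++cycle) {
        if (cycle == PERIOD) mode = 1;   /* detected event: switch allocation mode */
        printf("cycle %2d -> thread %d\n", cycle, pick_thread(mode, cycle));
    }
    return 0;
}
```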

Publication date: 29-03-2012

Debugging of a data processing apparatus

Number: US20120079458A1
Assignee: ARM LTD

A data processing apparatus is provided comprising processing circuitry and instruction decoding circuitry. The data processing apparatus is capable of operating at a plurality of different privilege levels. Processing circuitry of the data processing apparatus imposes on program instructions different access permissions to at least one of a memory and a set of registers at different ones of the different privilege levels. A debug privilege-level switching instruction is provided and decoding circuitry is responsive to this instruction to switch the processing circuitry from a current privilege level to a target privilege level if the processing circuitry is in a debug mode. However, if the processing circuitry is in a non-debug mode the instruction decoding circuitry prevents execution of the privilege-level switching instruction regardless of the current privilege level.

Publication date: 12-04-2012

Decoding instructions from multiple instruction sets

Number: US20120089818A1
Author: Simon John Craske
Assignee: ARM LTD

A data processing apparatus, method and computer program are described that are capable of decoding instructions from different instruction sets. The method comprises: receiving an instruction; if an operation code of said instruction is an operation code of an instruction from a base set of instructions, decoding said instruction according to decode rules for said base set of instructions; and if said operation code of said instruction is an operation code of an instruction from at least one further set of instructions, decoding said instruction according to a set of decode rules determined by an indicator value indicating which of said at least one further set of instructions is currently to be decoded.
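A minimal sketch of the two-level decode described above, assuming (purely for illustration) that opcodes below 0x80 belong to the base set while higher opcodes are resolved through an extension-set indicator held by the decoder:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical encoding: opcodes below 0x80 always use base-set rules;
 * opcodes at or above 0x80 are shared by several further sets and are
 * decoded according to the currently selected extension indicator.    */
static void decode(uint8_t opcode, int extension_indicator) {
    if (opcode < 0x80) {
        printf("0x%02X: base-set rules\n", (unsigned)opcode);
    } else {
        printf("0x%02X: extension set %d rules\n", (unsigned)opcode, extension_indicator);
    }
}

int main(void) {
    decode(0x12, 0);   /* always the base set             */
    decode(0x9A, 0);   /* same bits, decoded per set 0    */
    decode(0x9A, 1);   /* ... or per set 1 when selected  */
    return 0;
}
```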

Publication date: 21-06-2012

System and method for performing deterministic processing

Number: US20120159131A1
Assignee: Bin1 ATE LLC

A system and method is provided for performing deterministic processing on a non-deterministic computer system. In one example, the system forces execution of one or more computer instructions to execute within a constant execution time. A deterministic engine, if necessary, waits a variable amount of time to ensure that the execution of the computer instructions is performed over the constant execution time. Because the execution time is constant, the execution is deterministic and therefore may be used in applications requiring deterministic behavior. For example, such a deterministic engine may be used in automated test equipment (ATE) applications.
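One way to picture the constant-execution-time idea is to pad a variable-latency operation out to a fixed time budget, as in this sketch using the POSIX clock_gettime interface; the budget value and the busy-wait loop are illustrative choices, not the patent's mechanism.

```c
#include <stdio.h>
#include <time.h>

/* Pad a possibly variable-latency operation out to a fixed budget so the
 * caller always observes roughly the same wall-clock cost (POSIX timers). */
static long work(long n) {           /* variable-time payload */
    long s = 0;
    for (long i = 0; i < n; ++i) s += i;
    return s;
}

static long run_deterministic(long n, long budget_ns) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    long r = work(n);
    do {                              /* wait out the remaining budget */
        clock_gettime(CLOCK_MONOTONIC, &t1);
    } while ((t1.tv_sec - t0.tv_sec) * 1000000000L +
             (t1.tv_nsec - t0.tv_nsec) < budget_ns);
    return r;
}

int main(void) {
    printf("%ld\n", run_deterministic(1000, 2000000L));    /* ~2 ms either way */
    printf("%ld\n", run_deterministic(100000, 2000000L));
    return 0;
}
```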

Publication date: 28-06-2012

Executing a Perform Frame Management Instruction

Number: US20120166758A1
Assignee: International Business Machines Corp

What is disclosed is a frame management function defined for a machine architecture of a computer system. In one embodiment, a frame management instruction is obtained which identifies a first and second general register. The first general register contains a frame management field having a key field with access-protection bits and a block-size indication. If the block-size indication indicates a large block, then an operand address of a large block of data is obtained from the second general register. The large block of data has a plurality of small blocks, each of which is associated with a corresponding storage key having a plurality of storage key access-protection bits. If the block-size indication indicates a large block, the storage key access-protection bits of each corresponding storage key of each small block within the large block are set with the access-protection bits of the key field.
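A toy model of the key-propagation step: every small block's storage key has its access-protection bits overwritten from the key field of the first register. The block count and the placement of the access-protection bits in the high nibble are assumptions made for the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative model: a "large block" made of 256 small blocks, each with
 * its own storage key.  The access-protection bits (here the high nibble)
 * of every small block's key are set from the key field of the register. */
#define SMALL_BLOCKS 256
#define ACC_MASK 0xF0u

static uint8_t storage_key[SMALL_BLOCKS];

static void set_block_keys(uint8_t key_field) {
    for (int i = 0; i < SMALL_BLOCKS; ++i)
        storage_key[i] = (uint8_t)((storage_key[i] & ~ACC_MASK) |
                                   (key_field & ACC_MASK));
}

int main(void) {
    set_block_keys(0xA0);   /* access-protection bits = 0xA */
    printf("key[0]=0x%02X key[255]=0x%02X\n",
           (unsigned)storage_key[0], (unsigned)storage_key[255]);
    return 0;
}
```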

Publication date: 19-07-2012

Apparatus and method for compressing trace data

Number: US20120185675A1
Assignee: SAMSUNG ELECTRONICS CO LTD

An apparatus and method for compressing trace data is provided. The apparatus includes a detection unit configured to detect trace data corresponding to one or more function units performing a substantially significant operation in a reconfigurable processor as valid trace data, and a compression unit configured to compress the valid trace data.

Publication date: 09-08-2012

Embedded opcode within an intermediate value passed between instructions

Number: US20120204006A1
Author: Jorn Nystad
Assignee: ARM LTD

A data processing system 2 is used to evaluate a data processing function by executing a sequence of program instructions including an intermediate value generating instruction Inst0 and an intermediate value consuming instruction Inst1. In dependence upon one or more input operands to the evaluation, an embedded opcode within the intermediate value passed between the intermediate value generating instruction and the intermediate value consuming instruction may be set to have a value indicating that a substitute instruction should be used in place of the intermediate value consuming instruction. The instructions may be floating point instructions, such as a floating point power instruction evaluating the data processing function a^b.

Publication date: 09-08-2012

Configurable pipeline based on error detection mode in a data processing system

Number: US20120204012A1
Assignee: RAMBUS INC

A method includes providing a data processor having an instruction pipeline, where the instruction pipeline has a plurality of instruction pipeline stages, and where the plurality of instruction pipeline stages includes a first instruction pipeline stage and a second instruction pipeline stage. The method further includes providing a data processor instruction that causes the data processor to perform a first set of computational operations during execution of the data processor instruction, performing the first set of computational operations in the first instruction pipeline stage if the data processor instruction is being executed and a first mode has been selected, and performing the first set of computational operations in the second instruction pipeline stage if the data processor instruction is being executed and a second mode has been selected.

Publication date: 11-10-2012

Load multiple and store multiple instructions in a microprocessor that emulates banked registers

Number: US20120260042A1
Assignee: Via Technologies Inc

A microprocessor supports an instruction set architecture that specifies: processor modes, architectural registers associated with each mode, and a load multiple instruction that instructs the microprocessor to load data from memory into specified ones of the registers. Direct storage holds data associated with a first portion of the registers and is coupled to an execution unit to provide the data thereto. Indirect storage holds data associated with a second portion of the registers and cannot directly provide the data to the execution unit. Which architectural registers are in the first and second portions varies dynamically based upon the current processor mode. If a specified register is currently in the first portion, the microprocessor loads data from memory into the direct storage, whereas if in the second portion, the microprocessor loads data from memory into the direct storage and then stores the data from the direct storage to the indirect storage.

Publication date: 11-10-2012

Heterogeneous ISA microprocessor that preserves non-ISA-specific configuration state when reset to different ISA

Number: US20120260066A1
Assignee: Via Technologies Inc

A microprocessor capable of operating as both an x86 ISA and an ARM ISA microprocessor includes first, second, and third storage that stores x86 ISA-specific, ARM ISA-specific, and non-ISA-specific state, respectively. When reset, the microprocessor initializes the first storage to default values specified by the x86 ISA, initializes the second storage to default values specified by the ARM ISA, initializes the third storage to predetermined values, and begins fetching instructions of a first ISA. The first ISA is the x86 ISA or the ARM ISA and a second ISA is the other ISA. The microprocessor updates the third storage in response to the first ISA instructions. In response to a subsequent one of the first ISA instructions that instructs the microprocessor to reset to the second ISA, the microprocessor refrains from modifying the non-ISA-specific state stored in the third storage and begins fetching instructions of the second ISA.

Publication date: 22-11-2012

Electronic Device and Method for Data Processing Using Virtual Register Mode

Number: US20120297165A1
Assignee: Texas Instruments Inc

The invention relates to an electronic device for data processing, which includes an execution unit with a temporary register, a register file, a first feedback path from the data output of the execution unit to the register file, a second feedback path from the data output of the execution unit to the temporary register, a switch configured to connect the first feedback path and/or the second feedback path, and a logic stage coupled to control the switch. The logic stage is configured to control the switch to connect the second feedback path if the data output of an execution unit is used as an operand in the subsequent operation of an execution unit.

Publication date: 27-12-2012

Compressed instruction format

Number: US20120331271A1
Assignee: Individual

A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described in which legacy, present, and future instruction set extensions are supported and increased functionality is provided without expanding the code size and, in some cases, while reducing the code size.

Publication date: 27-12-2012

System and method for compiling machine-executable code generated from a sequentially ordered plurality of processor instructions

Number: US20120331451A1
Author: Robert Keith Mykland
Assignee: Robert Keith Mykland

A method and system are provided for deriving a resultant software program from an originating software program having overlapping branches, wherein the resultant software program has either no overlapping branches or fewer overlapping branches than the originating software program. A preferred embodiment of the invented method generates a resultant software program that has no overlapping branches. The resultant software is more easily converted into programming reconfigurable logic than the originating software program. Separate and individually applicable aspects of the invented method are used to eliminate all four possible states of two overlapping branches, i.e., forward branch overlapping forward branch, back branch overlapping back branch, and each of the two possible and distinguishable states of forward branch and back branch overlap. One or more elements of each aspect of the invention may be performed by one or more computers or processors, or by means of a computer or a communications network.

Publication date: 28-02-2013

Clock data recovery circuit and clock data recovery method

Number: US20130054941A1
Author: Masayuki Tsuji
Assignee: Fujitsu Semiconductor Ltd

A processor includes: an arithmetic unit configured to execute instructions; an instruction decode part configured to decode the instructions executed in the arithmetic unit and to output opcodes; and an interrupt register configured to receive interrupt signals, wherein the instruction decode part includes an instruction code map that stores the opcodes in correspondence to instructions and outputs the opcodes in accordance with the instructions inputted, and the instruction code map stores a plurality of sets of opcodes to be output as switch opcodes corresponding to additional instructions, the additional instructions being a part of the instructions, and switches the sets of the switch opcodes in accordance with the interrupt signal.

Publication date: 28-03-2013

Processor and instruction processing method in processor

Number: US20130080747A1
Author: Young-Su Kwon

The present invention relates to a processor including: an instruction cache configured to store at least some of first instructions stored in an external memory and second instructions each including a plurality of micro instructions; a micro cache configured to store third instructions corresponding to the plurality of micro instructions included in the second instructions; and a core configured to read out the first and second instructions from the instruction cache and perform calculation, in which the core performs calculation by the first instructions from the instruction cache under a normal mode, and when the processor enters a micro instruction mode, the core performs calculation by the third instructions corresponding to the plurality of micro instructions provided from the micro cache.

Publication date: 18-04-2013

CONDITIONAL COMPARE INSTRUCTION

Number: US20130097408A1
Assignee: ARM LIMITED

An instruction decoder is responsive to a conditional compare instruction to generate control signals for controlling processing circuitry to perform a conditional compare operation. The conditional compare operation comprises: (i) if a current condition state of the processing circuitry passes a test condition, then performing a compare operation on a first operand and a second operand and setting the current condition state to a result condition state generated during the compare operation; and (ii) if the current condition state fails the test condition, then setting the current condition state to a fail condition state specified by the conditional compare instruction. The conditional compare instruction can be used to represent chained sequences of comparison operations where each individual comparison operation may test a different kind of relation between a pair of operands.

2. The data processing apparatus according to claim 1, wherein said status store comprises a status register.
3. The data processing apparatus according to claim 1, wherein said current condition state comprises the value of at least one condition code flag stored within said status store.
4. The data processing apparatus according to claim 1, wherein said conditional compare instruction includes a field for specifying said test condition.
5. The data processing apparatus according to claim 1, wherein said fail condition state is specified as an immediate value by said conditional compare instruction.
6. The data processing apparatus according to claim 5, wherein said immediate value is a programmable value set by the programmer of a program comprising said conditional compare instruction.
7. The data processing apparatus according to claim 5, wherein said immediate value is a programmable value set by a compiler of a program comprising said conditional compare instruction, said compiler selecting said programmable value in dependence on a desired condition that is to be ...
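The conditional compare behaviour can be modelled compactly: if the current condition state passes the test condition, a normal compare updates the flags; otherwise the flags are loaded from the instruction's immediate fail state. The NZCV packing and the two test conditions below are assumptions for the sketch, not the patent's encoding.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Condition state as an NZCV nibble (bit3..bit0 = N,Z,C,V), an assumed
 * layout.  Only two test conditions are modelled.                      */
typedef enum { COND_EQ, COND_GE } cond_t;

static bool passes(uint8_t flags, cond_t c) {
    bool n = flags & 8, z = flags & 4, v = flags & 1;
    return c == COND_EQ ? z : (n == v);
}

static uint8_t flags_from_cmp(int32_t a, int32_t b) {
    int64_t d = (int64_t)a - b;
    uint8_t f = 0;
    if (d < 0)  f |= 8;                              /* N */
    if (d == 0) f |= 4;                              /* Z */
    if ((uint32_t)a >= (uint32_t)b) f |= 2;          /* C: no borrow */
    if (d > INT32_MAX || d < INT32_MIN) f |= 1;      /* V: signed overflow */
    return f;
}

/* Conditional compare: compare only if the current state passes `test`,
 * otherwise load the immediate fail-state nibble.                       */
static uint8_t ccmp(uint8_t flags, cond_t test, int32_t a, int32_t b, uint8_t fail_nzcv) {
    return passes(flags, test) ? flags_from_cmp(a, b) : fail_nzcv;
}

int main(void) {
    uint8_t f = flags_from_cmp(3, 3);    /* Z set: chain "passes" so far     */
    f = ccmp(f, COND_EQ, 7, 5, 0x0);     /* test passes: real compare runs   */
    printf("NZCV=0x%X\n", (unsigned)f);
    f = ccmp(f, COND_EQ, 7, 5, 0x0);     /* test fails: fail state 0x0 taken */
    printf("NZCV=0x%X\n", (unsigned)f);
    return 0;
}
```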

Publication date: 25-04-2013

Data processing device and method, and processor unit of same

Number: US20130103930A1
Author: Takashi Horikawa
Assignee: NEC Corp

A processor unit (200) includes: cache memory (210); an instruction execution unit (220); a processing unit (230) that detects the fact that a thread enters an exclusive control section which is specified in advance to become a bottleneck; a processing unit (240) that detects the fact that the thread exits the exclusive control section; and an execution flag (250) that indicates whether there is a thread that is executing a process in the exclusive control section based on detection results. The cache memory (210) temporarily stores a priority flag in each cache entry, and the priority flag indicates whether data is to be used during execution in the exclusive control section. When the execution flag (250) is set, the processor unit (200) sets the priority flag that belongs to an access target of cache entries. The processor unit (200) leaves data used in the exclusive control section in the cache memory by determining a replacement target of cache entries using the priority flag when a cache miss occurs.

Publication date: 09-05-2013

Method and Apparatus for Unpacking Packed Data

Number: US20130117538A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A system comprising: a processing system to support: 2D/3D graphics, image processing, video compression, video decompression, and audio manipulation; wherein the processing system is further to be coupled with a display device and to be coupled with an input device, the processing system comprising a processor including: a register file including a first register to hold a first packed data including a first low data element and a first high data element, a second register to hold a second packed data including a second low data element and a second high data element, and a third register; a decoder to decode a first unpack instruction; a functional unit coupled to the decoder and the register file, the functional unit, in response to the decoder decoding the first unpack instruction, to transfer the first low data element to a high position of the third register and the second low data element to the low position of the third register.
39. The system of claim 38, wherein the decoder is further to decode a second unpack instruction, and wherein the functional unit, in response to the decoder decoding the second unpack instruction, to transfer the first high ...
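The interleaving described by the unpack operation can be sketched in a few lines; the element width, element count, and the exact ordering of the copied elements are illustrative assumptions rather than the claimed register layout.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of an "unpack low" style interleave on two 4-element packed
 * operands: the low halves of src1 and src2 are copied into the
 * destination so their elements alternate (src1[0], src2[0], src1[1],
 * src2[1]).  Element count and width are illustrative.                */
#define N 4

static void unpack_low(const uint8_t *src1, const uint8_t *src2, uint8_t *dst) {
    for (int i = 0; i < N / 2; ++i) {
        dst[2 * i]     = src1[i];
        dst[2 * i + 1] = src2[i];
    }
}

int main(void) {
    uint8_t a[N] = { 1, 2, 3, 4 }, b[N] = { 5, 6, 7, 8 }, d[N];
    unpack_low(a, b, d);
    for (int i = 0; i < N; ++i) printf("%u ", (unsigned)d[i]);   /* 1 5 2 6 */
    printf("\n");
    return 0;
}
```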

Publication date: 16-05-2013

Method and Apparatus for Unpacking Packed Data

Number: US20130124830A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A computing device comprising: a communication bus; a cache; a decoder operable to decode instructions specifying data manipulation operations; a register file comprising: a first set of registers operable to store 32-bit integer data; and a second set of registers operable to store a first packed data and a second packed data respectively including a first plurality of data elements and a second plurality of data elements; a functional unit coupled to the cache, the decoder, the register file, and the communication bus, and operable to execute decoded instructions specifying data manipulation operations, including: a first move instruction that, when executed by the functional unit, causes data to be transferred between a first packed data register and a second packed data register; a second move instruction that, when executed by the functional unit, causes data to be transferred between the first packed data register and a main memory; a third move instruction that, when executed by the functional unit, causes data to be transferred between the first packed data register and a 32-bit register; and an unpack instruction that, when executed by the functional unit, causes data elements from ...

Publication date: 16-05-2013

Method and Apparatus for Unpacking Packed Data

Number: US20130124832A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A processor comprising: a register file including at least a first register to hold a first data element, a second data element, a third data element, and a fourth data element; a second register to hold a fifth data element, a sixth data element, a seventh data element, and an eighth data element; and a third register; a decoder to decode a packed instruction, the packed instruction to identify the first and the second registers as source registers and the third register as a destination register; a functional unit coupled to the register file and the decoder, the functional unit, responsive to the first instruction, to store the sixth data element to a least significant portion of the third register, the eighth data element to a second least significant portion of the third register, the second data element to a third least significant portion of the third register, and the fourth data element to the fourth least significant portion of the third register.
39. The processor of claim 38, wherein the first, second, and third registers are to hold 32 bits, and wherein each of the first through eighth data elements are to be 8 bits in size.
40. The ...

Publication date: 16-05-2013

Method and Apparatus for Unpacking Packed Data

Number: US20130124833A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A processor comprising: a register file having a plurality of registers; a decoder coupled to the register file, the decoder to decode a first instruction, the first instruction to: have a 32-bit instruction format, include a first field to identify a first source register of the register file that is to store a first plurality of packed 8-bit integers, include a second field to identify a second source register of the register file that is to store a second plurality of packed 8-bit integers, and include a third field to identify a destination register, wherein each of the first and second pluralities of packed 8-bit integers are to include four packed 8-bit integers; and a functional unit including circuitry coupled to the decoder, the functional unit to generate a result that is to be stored in the destination register responsive to the first instruction, the destination register to have a same number of bits as the first and second source registers, wherein the result is to include a third plurality of packed 8-bit integers, the third plurality to include four packed 8-bit integers, the third plurality of packed 8-bit integers to include half of the 8-bit integers from the first ...

Publication date: 16-05-2013

Method and Apparatus for Unpacking Packed Data

Number: US20130124834A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A processor comprising: a register file including at least a first register to hold a first data element, a second data element, a third data element, and a fourth data element; a second register to hold a fifth data element, a sixth data element, a seventh data element, and an eighth data element; and a third register; a decoder to decode a packed instruction, the packed instruction to identify the first and the second registers as source registers and the third register as a destination register; a functional unit coupled to the register file and the decoder, the functional unit, responsive to the first instruction, to store the fifth data element to a least significant portion of the third register, the seventh data element to a second least significant portion of the third register, the first data element to a third least significant portion of the third register, and the third data element to the fourth least significant portion of the third register.
39. The processor of claim 38, wherein the first, second, and third registers are to hold 32 bits, and wherein each of the first through eighth data elements are to be 8 bits in size.
40. The ...

Publication date: 16-05-2013

Method and Apparatus for Packing Packed Data

Number: US20130124835A1

An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.

1-37. (canceled)
38. A processor comprising: a register file having a plurality of registers; a decoder coupled with the register file, the decoder to decode a first instruction, the first instruction having a 32-bit instruction format, the first instruction having a first field to specify a first source register of the register file having a first plurality of packed signed 16-bit integers and a second field to specify a second source register of the register file having a second plurality of packed signed 16-bit integers; and a functional unit including circuitry coupled with the decoder, the functional unit to generate a result according to the first instruction that is to be stored in a destination register specified by a third field of the first instruction, the result including a third plurality of packed 8-bit integers, the third plurality of the packed 8-bit integers including an 8-bit integer for each 16-bit integer in the first plurality of the packed signed 16-bit integers, and an 8-bit integer for each 16-bit integer in the second plurality of the packed signed 16-bit integers, the 8-bit integers corresponding to the first plurality of the packed signed 16-bit integers next to one another in ...
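For the packing variant, here is a sketch of a signed saturating pack from 16-bit to 8-bit elements; the four-element operand width and the result ordering (source 1's elements first) are assumptions for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of a signed pack-with-saturation: two operands of four signed
 * 16-bit values are narrowed to eight signed 8-bit results, source 1's
 * elements first.  Saturation clamps values outside [-128, 127].       */
static int8_t sat8(int16_t v) {
    if (v > 127)  return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

static void pack_signed(const int16_t a[4], const int16_t b[4], int8_t out[8]) {
    for (int i = 0; i < 4; ++i) {
        out[i]     = sat8(a[i]);
        out[i + 4] = sat8(b[i]);
    }
}

int main(void) {
    int16_t a[4] = { 1, 300, -5, -4000 }, b[4] = { 40, 50, 60, 70 };
    int8_t out[8];
    pack_signed(a, b, out);
    for (int i = 0; i < 8; ++i) printf("%d ", out[i]);  /* 1 127 -5 -128 40 50 60 70 */
    printf("\n");
    return 0;
}
```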

Publication date: 23-05-2013

Method of compressing and decompressing an executable or interpretable program

Number: US20130132710A1
Assignee: Invia SAS

The method of compressing and decompressing an executable program can be executed by a microprocessor or interpreted by an interpreter of an integrated circuit device: instructions are reformatted into the format of an initial set of instructions of said program for obtaining instructions in the format of an intermediate set of instructions; repetition templates in the program are determined and, for each repetition template, a pair is defined, formed of said repetition template and of an instruction in the format of a set of instructions; intermediate instructions are replaced by compressed instructions and the links of the compressed program are modified; the compressed program is stored in a memory of the device; and the compressed program is decompressed and the initial instructions are executed by said microprocessor or interpreted by said interpreter. The invention applies, in particular, to the integrated circuits of embedded devices.

Publication date: 06-06-2013

SYSTEM AND METHOD FOR PERFORMING A BRANCH OBJECT CONVERSION TO PROGRAM CONFIGURABLE LOGIC CIRCUITRY

Number: US20130145134A1
Author: MYKLAND ROBERT KEITH

A method and system are provided for deriving a resultant software code from an originating ordered list of instructions that does not include overlapping branch logic. The method may include deriving a plurality of unordered software constructs from a sequence of processor instructions; associating software constructs in accordance with an original logic of the sequence of processor instructions; determining and resolving memory precedence conflicts within the associated plurality of software constructs; resolving forward branch logic structures into conditional logic constructs; resolving back branch logic structures into loop logic constructs; and/or applying the plurality of unordered software constructs in a programming operation by a parallel execution logic circuitry. The resultant plurality of unordered software constructs may be converted into programming reconfigurable logic, computers or processors, and also by means of a computer network or an electronics communications network.

1. A method for forming logic circuits from a software encoded logic, the method comprising:
a. Selecting an ordered list of instructions having no overlapping branch logic;
b. Converting the ordered list of instructions into an unordered plurality of software-encoded logic constructs (“constructs”), wherein the plurality of constructs encodes all necessary opcode information and dependency information of the ordered list of instructions; and
c. Applying the plurality of constructs to the design of a digital logic circuit, whereby an internal connectivity and a structure of the digital logic circuit are formed to embody at least some of the opcode information and dependency information of the software encoded logic.
2. The method of claim 1, wherein the dependency information embodied by the digital logic circuit includes at least one data dependency.
3. The method of claim 1, wherein the dependency information embodied by the digital logic circuit includes at least one logic ...

Publication date: 13-06-2013

Securing microprocessors against information leakage and physical tampering

Number: US20130151865A1
Assignee: BlueRISC Inc

A processor system comprising: performing a compilation process on a computer program; encoding an instruction with a selected encoding; encoding the security mutation information in an instruction set architecture of a processor; and executing a compiled computer program in the processor using an added mutation instruction, wherein executing comprises executing a mutation instruction to enable decoding another instruction. A processor system with a random instruction encoding and randomized execution, providing effective defense against offline and runtime security attacks including software and hardware reverse engineering, invasive microprobing, fault injection, and high-order differential and electromagnetic power analysis.

Publication date: 20-06-2013

Instruction set architecture with extended register addressing

Number: US20130159676A1
Assignee: International Business Machines Corp

A method and circuit arrangement selectively repurpose bits from a primary opcode portion of an instruction for use in decoding one or more operands for the instruction. Decode logic of a processor, for example, may be placed in a predetermined mode that decodes a primary opcode for an instruction that is different from that specified in the primary opcode portion of the instruction, and then utilize one or more bits in the primary opcode portion to decode one or more operands for the instruction. By doing so, additional space is freed up in the instruction to support a larger register file and/or additional instruction types, e.g., as specified by a secondary or extended opcode.

Publication date: 27-06-2013

Apparatus comprising a plurality of arithmetic logic units

Number: US20130166890A1
Author: David Smith

An arrangement of at least two arithmetic logic units carries out an operation defined by a decoded instruction including at least one operand and more than one operation code. The operation codes and at least one operand are received and corresponding executions are performed by the arithmetic logic units on a single clock cycle. The result of the execution from one arithmetic logic unit is used as an operand by a further arithmetic logic unit. The decoding of the instruction is performed in an immediately preceding single clock cycle.

Publication date: 25-07-2013

INSTRUCTIONS AND LOGIC TO PERFORM MASK LOAD AND STORE OPERATIONS

Number: US20130191615A1

In one embodiment, logic is provided to receive and execute a mask move instruction to transfer a vector data element including a plurality of packed data elements from a source location to a destination location, subject to mask information for the instruction. Other embodiments are described and claimed.

1. A processor comprising: a decoder to decode instructions; a storage to store decoded instructions; and a logic to receive and execute a mask move instruction to transfer a vector data element including a plurality of packed data elements from a source location to a destination location, wherein the mask move instruction is to be executed subject to mask information in a vector mask register.
2. The processor of claim 1, further comprising a register file comprising a plurality of extended registers each to store a vector data element including a plurality of packed data elements, and a control register to store the mask information.
3. The processor of claim 2, wherein the processor further comprises a memory subsystem having a store buffer including a plurality of entries each to store a pending instruction, a destination identifier, a source identifier, and, if the pending instruction is a mask store instruction, the mask information.
4. The processor of claim 1, wherein the mask move instruction is a mask load instruction including an opcode, a source identifier and a destination identifier, and wherein the logic is to access the vector mask register responsive to the mask load instruction to obtain the mask information.
5. The processor of claim 4, wherein the logic is to access a first bit of each of a plurality of fields of the vector mask register to obtain the mask information, wherein each first bit is a mask value for a corresponding one of the plurality of packed data elements of a vector data element.
6. The processor of claim 1, wherein the mask move instruction is a mask ...
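A masked move can be sketched as a per-element conditional copy; the element count, 32-bit element width, and the merge policy of leaving unselected destination elements untouched are assumptions for the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of a masked move over a packed vector: element i is copied from
 * source to destination only when mask bit i is set; other destination
 * elements are left unchanged (one possible masking policy).            */
#define ELEMS 4

static void mask_move(const int32_t *src, int32_t *dst, uint8_t mask) {
    for (int i = 0; i < ELEMS; ++i)
        if (mask & (1u << i))
            dst[i] = src[i];
}

int main(void) {
    int32_t src[ELEMS] = { 10, 20, 30, 40 };
    int32_t dst[ELEMS] = { 0, 0, 0, 0 };
    mask_move(src, dst, 0x5);                 /* mask bits 0 and 2 set */
    for (int i = 0; i < ELEMS; ++i) printf("%d ", dst[i]);   /* 10 0 30 0 */
    printf("\n");
    return 0;
}
```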

Publication date: 25-07-2013

INSTRUCTION CONTROL CIRCUIT, PROCESSOR, AND INSTRUCTION CONTROL METHOD

Number: US20130191616A1
Author: Nishikawa Takashi
Assignee: FUJITSU SEMICONDUCTOR LIMITED

In a vector processing device, a data dependence detecting unit detects a data dependence relation between a preceding instruction and a succeeding instruction which are inputted from an instruction buffer, and an instruction issuance control unit controls issuance of an instruction based on a detection result thereof. When there is a data dependence relation between the preceding instruction and the succeeding instruction, the instruction issuance control unit generates a new instruction equivalent to processing related to a vector register including the data dependence relation with the succeeding instruction in processing executed by the preceding instruction and issues the new instruction between the preceding instruction and the succeeding instruction, and thereby a data hazard can be avoided between the preceding instruction and the succeeding instruction without making a stall occur.

1. An instruction control circuit of a vector processing device, the instruction control circuit comprising: an instruction buffer which stores a plurality of instructions; a data dependence detecting unit which detects a data dependence relation between a preceding instruction and a succeeding instruction which succeeds the preceding instruction among the plurality of instructions inputted from the instruction buffer; and an instruction issuance control unit which controls issuance of an instruction based on a detection result in the data dependence detecting unit, wherein the instruction issuance control unit generates a new instruction including a same instruction type as the preceding instruction when there is a data dependence relation between the preceding instruction and the succeeding instruction, and issues the generated new instruction between the preceding instruction and the succeeding instruction and, in the generation of the new instruction, determines identification information of a second register of the new instruction from identification information of a first ...

Publication date: 01-08-2013

Major branch instructions

Number: US20130198492A1
Assignee: International Business Machines Corp

Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.

Publication date: 01-08-2013

Major branch instructions

Number: US20130198496A1
Assignee: International Business Machines Corp

Major branch instructions are provided that enable execution of a computer program to branch from one segment of code to another segment of code. These instructions also create a new stream of processing at the other segment of code enabling execution of the other segment of code to be performed in parallel with the segment of code from which the branch was taken. In one example, the other stream of processing starts a transaction for processing instructions of the other stream of processing.

Publication date: 08-08-2013

PROCESSOR PERFORMANCE IMPROVEMENT FOR INSTRUCTION SEQUENCES THAT INCLUDE BARRIER INSTRUCTIONS

Number: US20130205121A1

A technique for processing an instruction sequence that includes a barrier instruction, a load instruction preceding the barrier instruction, and a subsequent memory access instruction following the barrier instruction includes determining, by a processor core, that the load instruction is resolved based upon receipt by the processor core of an earliest of a good combined response for a read operation corresponding to the load instruction and data for the load instruction. The technique also includes if execution of the subsequent memory access instruction is not initiated prior to completion of the barrier instruction, initiating by the processor core, in response to determining the barrier instruction completed, execution of the subsequent memory access instruction. The technique further includes if execution of the subsequent memory access instruction is initiated prior to completion of the barrier instruction, discontinuing by the processor core, in response to determining the barrier instruction completed, tracking of the subsequent memory access instruction with respect to invalidation.

1. A method of processing an instruction sequence that includes a barrier instruction, a load instruction preceding the barrier instruction, and a subsequent memory access instruction following the barrier instruction, the method comprising: determining, by a processor core, that the load instruction is resolved based upon receipt by the processor core of an earliest of a good combined response for a read operation corresponding to the load instruction and data for the load instruction; if execution of the subsequent memory access instruction is not initiated prior to completion of the barrier instruction, initiating by the processor core, in response to determining the barrier instruction completed, execution of the subsequent memory access instruction; and if execution of the subsequent memory access instruction is initiated prior to completion of the barrier instruction, ...

Publication date: 08-08-2013

Instruction set architecture-based inter-sequencer communications with a heterogeneous resource

Number: US20130205122A1
Assignee: Individual

In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.

Publication date: 15-08-2013

Processor to Execute Shift Right Merge Instructions

Number: US20130212359A1

Method, apparatus, and program means for performing bitstream buffer manipulation with a SIMD merge instruction. The method of one embodiment comprises determining whether any unprocessed data bits for a partial variable length symbol exist in a first data block. A shift merge operation is performed to merge the unprocessed data bits from the first data block with a second data block. A merged data block is formed. A merged variable length symbol comprised of the unprocessed data bits and a plurality of data bits from the second data block is extracted from the merged data block.

1. A processor comprising: a plurality of registers to store 128-bit operands; a decoder to decode a single instruction multiple data (SIMD) instruction, the SIMD instruction to indicate a first 128-bit operand having a first set of sixteen byte elements and a second 128-bit operand having a second set of sixteen byte elements, the SIMD instruction to have a 4-bit immediate to specify a number (n) of bytes; an execution unit coupled with the decoder and the plurality of registers, the execution unit in response to the SIMD instruction to store a 128-bit result in a destination indicated by the instruction, wherein the result is to include the number (n) least significant byte elements of the second operand in the number (n) most significant bytes of the result, concatenated with sixteen minus the number (n) most significant byte elements of the first operand in sixteen minus the number (n) least significant bytes of the result.

The present application is a continuation of U.S. patent application Ser. No. 13/602,546, filed on Sep. 4, 2012, entitled “PROCESSOR TO EXECUTE SHIFT RIGHT MERGE INSTRUCTIONS,” now pending, which is a continuation of U.S. patent application Ser. No. 13/477,544, filed on May 22, 2012, entitled “PROCESSOR TO EXECUTE SHIFT RIGHT MERGE INSTRUCTIONS,” now pending, which is a continuation of U.S. patent application Ser. No. 12/907,843 filed on Oct. 19, 2010 ...
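The result layout in the claim (the low n bytes of the second operand landing in the high n bytes of the result, ahead of the high 16 - n bytes of the first operand) matches a byte-wise shift of the concatenated operands, as sketched here for n in 0..15:

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of a byte-wise shift-right-merge: the two 16-byte operands are
 * treated as one 32-byte value (op2 above op1) that is shifted right by
 * n bytes (n in 0..15), keeping the low 16 bytes of the result.         */
static void shift_right_merge(const uint8_t op1[16], const uint8_t op2[16],
                              unsigned n, uint8_t result[16]) {
    for (unsigned i = 0; i < 16; ++i) {
        unsigned src = i + n;                    /* position in the 32-byte value */
        result[i] = src < 16 ? op1[src] : op2[src - 16];
    }
}

int main(void) {
    uint8_t a[16], b[16], r[16];
    for (int i = 0; i < 16; ++i) { a[i] = (uint8_t)i; b[i] = (uint8_t)(100 + i); }
    shift_right_merge(a, b, 4, r);
    for (int i = 0; i < 16; ++i) printf("%u ", (unsigned)r[i]);
    printf("\n");   /* 4..15 from op1, then 100..103 from op2 */
    return 0;
}
```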

Publication date: 05-09-2013

Unpacking Packed Data In Multiple Lanes

Number: US20130232321A1

Receiving an instruction indicating first and second operands. Each of the operands having packed data elements that correspond in respective positions. A first subset of the data elements of the first operand and a first subset of the data elements of the second operand each corresponding to a first lane. A second subset of the data elements of the first operand and a second subset of the data elements of the second operand each corresponding to a second lane. Storing result, in response to instruction, including: (1) in first lane, only lowest order data elements from first subset of first operand interleaved with corresponding lowest order data elements from first subset of second operand; and (2) in second lane, only highest order data elements from second subset of first operand interleaved with corresponding highest order data elements from second subset of second operand.

1. A method comprising: receiving an instruction, the instruction indicating a first operand and a second operand, each of the first and second operands having a plurality of packed data elements that correspond in respective positions, a first subset of the packed data elements of the first operand and a first subset of the packed data elements of the second operand each corresponding to a first lane, and a second subset of the packed data elements of the first operand and a second subset of the packed data elements of the second operand each corresponding to a second lane; and storing a result in response to the instruction, the result including: (1) in the first lane, only lowest order data elements from the first subset of the first operand interleaved with corresponding lowest order data elements from the first subset of the second operand; and (2) in the second lane, only highest order data elements from the second subset of the first operand interleaved with corresponding highest order data elements from the second subset of the second operand.
2. The method of claim 1, wherein the ...
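A sketch of the per-lane interleave: lane 0 of the result takes the low halves of the two operands interleaved, lane 1 takes their high halves. The 32-bit element width and four-element lanes are illustrative assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of a two-lane unpack over 32-bit elements: within lane 0 the
 * low halves of the two operands are interleaved, within lane 1 their
 * high halves are interleaved.  Lane width (4 elements) is illustrative. */
#define LANE  4
#define ELEMS (2 * LANE)

static void unpack_lanes(const uint32_t a[ELEMS], const uint32_t b[ELEMS],
                         uint32_t r[ELEMS]) {
    for (int i = 0; i < LANE / 2; ++i) {
        r[2 * i]            = a[i];                    /* lane 0: low halves  */
        r[2 * i + 1]        = b[i];
        r[LANE + 2 * i]     = a[LANE + LANE / 2 + i];  /* lane 1: high halves */
        r[LANE + 2 * i + 1] = b[LANE + LANE / 2 + i];
    }
}

int main(void) {
    uint32_t a[ELEMS] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    uint32_t b[ELEMS] = { 10, 11, 12, 13, 14, 15, 16, 17 };
    uint32_t r[ELEMS];
    unpack_lanes(a, b, r);
    for (int i = 0; i < ELEMS; ++i) printf("%u ", (unsigned)r[i]);  /* 0 10 1 11 6 16 7 17 */
    printf("\n");
    return 0;
}
```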

Publication date: 19-09-2013

Run-time-instrumentation controls emit instruction

Number: US20130246747A1
Assignee: International Business Machines Corp

Embodiments of the invention relate to executing a run-time-instrumentation EMIT (RIEMIT) instruction. A processor is configured to capture the run-time-instrumentation information of a stream of instructions. The RIEMIT instruction is fetched and executed. It is determined if the current run-time-instrumentation controls are configured to permit capturing and storing of run-time-instrumentation information in a run-time-instrumentation program buffer. If the controls are configured to store run-time-instrumentation instructions, then a RIEMIT instruction specified value is stored as an emit record of a reporting group in the run-time-instrumentation program buffer.

Publication date: 19-09-2013

DETERMINING THE STATUS OF RUN-TIME-INSTRUMENTATION CONTROLS

Number: US20130246748A1

The invention relates to determining the status of run-time-instrumentation controls. The status is determined by executing a test run-time-instrumentation controls (TRIC) instruction. The TRIC instruction is executed in either a supervisor state or a lesser-privileged state. The TRIC instruction determines whether the run-time-instrumentation controls have changed. The run-time-instrumentation controls are set to an initial value using a privileged load run-time-instrumentation controls (LRIC) instruction. The TRIC instruction is fetched and executed. If the TRIC instruction is enabled, then it is determined if the initial value set by the run-time-instrumentation controls has been changed. If the initial value set by the run-time-instrumentation controls has been changed, then a condition code is set to a first value.

1. A computer implemented method for modifying run-time-instrumentation controls from a lesser-privileged state, the method comprising: setting a set of run-time-instrumentation controls to an initial value using a privileged load run-time-instrumentation controls (LRIC) instruction; fetching the TRIC instruction; and executing the TRIC instruction, the executing comprising: based on the TRIC instruction being enabled, determining whether the initial value set by the run-time-instrumentation controls has been changed; and based on determining the initial value set by the run-time-instrumentation controls has been changed, setting a condition code to a first value.
2. The method according to claim 1, wherein determining that the TRIC instruction is enabled comprises any one of: based on the TRIC instruction being executed in supervisor mode, determining that the TRIC instruction is enabled; and based on the TRIC instruction being executed in the lesser-privileged state and a field of the run-time-instrumentation controls is set.
3. The method according to claim 1, further comprising: based on the TRIC instruction being not-enabled, setting the condition code ...

Publication date: 19-09-2013

DATA PROCESSOR

Number: US20130246765A1
Author: Arakawa Fumio
Assignee: RENESAS ELECTRONICS CORPORATION

For efficient issue of a superscalar instruction, a circuit is employed which retrieves an instruction of each instruction code type other than a prefix based on a determination result of decoders for determining instruction code type, adds the immediately preceding instruction to the retrieved instruction, and outputs the resultant. When an instruction of a target code type is detected in a plurality of instruction units to be searched, the circuit outputs the detected instruction code and the immediately preceding instruction other than the target code type as prefix code candidates. When an instruction of a target code type cannot be detected at the rear end of the instruction units, the circuit outputs the instruction at the rear end as a prefix code candidate. When an instruction of a target code type is detected at the head in the instruction code search, the circuit outputs the instruction code at the head.

1-4. (canceled)
5. A data processor of an instruction set architecture comprising a prefix code for modifying a subsequent instruction, wherein the instruction set comprises a fixed register using instruction which is implicitly assigned by an instruction, and wherein the prefix code modifies the subsequent instruction so that the subsequent instruction becomes an instruction having the same function as that of the fixed register using instruction and the fixed register is replaced with an operand which is not limited to a fixed register.
6. The data processor according to claim 5, wherein the prefix code is arranged before the fixed register using instruction and modifies the fixed register using instruction so as to replace the fixed register with another register which can be assigned by an instruction.
7. The data processor according to claim 5, wherein the prefix code is arranged before the fixed register using instruction and modifies the fixed register using instruction so as to replace the fixed register with an intermediate value.
8. The data ...

Publication date: 19-09-2013

TRANSFORMING NON-CONTIGUOUS INSTRUCTION SPECIFIERS TO CONTIGUOUS INSTRUCTION SPECIFIERS

Number: US20130246766A1
Author: Gschwind Michael K.

Emulation of instructions that include non-contiguous specifiers is facilitated. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction. Non-contiguous specifiers of instructions defined in one computer system architecture are transformed to contiguous specifiers usable by instructions defined in another computer system architecture. The instructions defined in the another computer system architecture emulate the instructions defined for the one computer system architecture.

1. A method of transforming instruction specifiers of a computing environment, the method comprising: obtaining, by a processor, from a first instruction defined for a first computer architecture, a non-contiguous specifier, the non-contiguous specifier having a first portion and a second portion, wherein the obtaining comprises obtaining the first portion from a first field of the instruction and the second portion from a second field of the instruction, the first field separate from the second field; generating a contiguous specifier using the first portion and the second portion, the generating using one or more rules based on the opcode of the first instruction; and using the contiguous specifier to indicate a resource to be used in execution of a second instruction, the second instruction defined for a second computer architecture different from the first computer architecture and emulating a function of the first instruction.
2. The method of claim 1, wherein the processor comprises an emulator, and wherein the first portion includes a first one or more bits, and the second portion includes a second one or more bits, and the generating comprises concatenating the second one or more bits with the first one or more bits to form the ...
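The transformation itself is just a bit-field concatenation; the sketch below assumes (for illustration only) a 4-bit field plus a single extension bit at fixed positions, which is not the actual field layout of either architecture.

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch: a register specifier split across two instruction fields (here
 * 4 low bits in one field plus 1 extension bit elsewhere, an assumed
 * layout) is transformed into a single contiguous 5-bit specifier.      */
static unsigned contiguous_specifier(uint32_t instr,
                                     unsigned lo_shift, unsigned lo_bits,
                                     unsigned ext_shift) {
    unsigned low = (instr >> lo_shift) & ((1u << lo_bits) - 1);
    unsigned ext = (instr >> ext_shift) & 1u;
    return (ext << lo_bits) | low;   /* concatenate: extension bit becomes the MSB */
}

int main(void) {
    /* 4-bit field at bit 16, extension bit at bit 7 (illustrative positions). */
    uint32_t instr = (0xAu << 16) | (1u << 7);
    printf("register %u\n", contiguous_specifier(instr, 16, 4, 7));  /* 26 */
    return 0;
}
```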

19-09-2013 publication date

RUN-TIME INSTRUMENTATION MONITORING OF PROCESSOR CHARACTERISTICS

Number: US20130246771A1

Embodiments of the invention relate to monitoring processor characteristic information of a processor using run-time-instrumentation. An aspect of the invention includes executing an instruction stream on the processor and detecting a run-time instrumentation sample point of the executing instruction stream on the processor. A reporting group is stored in a run-time instrumentation program buffer based on the run-time instrumentation sample point. The reporting group includes processor characteristic information associated with the processor. 1. A computer implemented method for monitoring processor characteristic information of a processor using run-time-instrumentation , the method comprising:executing an instruction stream on a processor;detecting a run-time instrumentation sample point of the executing instruction stream on the processor; andstoring a reporting group in a run-time instrumentation program buffer based on the run-time instrumentation sample point, the reporting group including processor characteristic information associated with the processor.2. The method of claim 1 , further comprising:checking current processor characteristic information prior to storing a subsequent reporting group in the run-time instrumentation program buffer based on the subsequent run-time instrumentation sample point; and storing the subsequent reporting group in the run-time instrumentation program buffer;', 'suppressing storage of the subsequent reporting group in the run-time instrumentation program buffer; and', 'halting run-time instrumentation., 'based on the current processor characteristic information, determining whether to perform one of3. The method of claim 2 , further comprising:determining whether processors in a current configuration are configured to operate with a common CPU capability; and reading a suppression control of a run-time instrumentation control; and', 'suppressing storage of the subsequent reporting group in the run-time instrumentation ...

19-09-2013 publication date

RUN-TIME INSTRUMENTATION INDIRECT SAMPLING BY INSTRUCTION OPERATION CODE

Number: US20130246772A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by instruction operation code. An aspect of the invention includes a method for implementing run-time instrumentation indirect sampling by instruction operation code. The method includes reading sample-point instruction operation codes from a sample-point instruction array, and comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor. The method also includes recognizing a sample point upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes. The run-time instrumentation information is obtained from the sample point. The method further includes storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group. 1. A computer implemented method for implementing run-time instrumentation indirect sampling by instruction operation code , the method comprising:reading sample-point instruction operation codes from a sample-point instruction array;comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor;recognizing a sample point upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes, wherein run-time instrumentation information is obtained from the sample point; andstoring the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group.2. The method of claim 1 , wherein the run-time instrumentation information comprises run-time instrumentation event records collected in a collection buffer of the processor and the reporting group further comprises system information records in combination with the run-time instrumentation event records.3. The method of ...
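
A hedged software analogue of the sampling step only: each executed opcode is compared against a sample-point opcode array, and a match stores a small reporting group into a program buffer. The record layout and buffer sizes are invented for the sketch.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Stand-in for the run-time instrumentation program buffer. */
    #define BUF_LEN 64

    static uint16_t report_buf[BUF_LEN];
    static size_t   report_n;

    static void store_reporting_group(uint16_t opcode, size_t insn_index) {
        if (report_n + 2 <= BUF_LEN) {     /* one record = opcode + position */
            report_buf[report_n++] = opcode;
            report_buf[report_n++] = (uint16_t)insn_index;
        }
    }

    static void run_with_sampling(const uint16_t *stream, size_t n,
                                  const uint16_t *sample_ops, size_t n_ops) {
        for (size_t i = 0; i < n; i++) {
            /* ... execute stream[i] here ... */
            for (size_t j = 0; j < n_ops; j++) {
                if (stream[i] == sample_ops[j]) {      /* sample point hit */
                    store_reporting_group(stream[i], i);
                    break;
                }
            }
        }
    }

    int main(void) {
        uint16_t stream[]     = { 0x10, 0x47, 0x10, 0x99 };
        uint16_t sample_ops[] = { 0x47, 0x99 };
        run_with_sampling(stream, 4, sample_ops, 2);
        printf("%zu records stored\n", report_n / 2);   /* 2 records */
        return 0;
    }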

19-09-2013 publication date

HARDWARE BASED RUN-TIME INSTRUMENTATION FACILITY FOR MANAGED RUN-TIMES

Number: US20130246773A1

Embodiments of the invention relate to performing run-time instrumentation. Run-time instrumentation is captured, by a processor, based on an instruction stream of instructions of an application program executing on the processor. The capturing includes storing the run-time instrumentation data in a collection buffer of the processor. A run-time instrumentation sample point trigger is detected by the processor. Contents of the collection buffer are copied into a program buffer as a reporting group based on detecting the run-time instrumentation sample point trigger. The program buffer is located in main storage in an address space that is accessible by the application program. 1. A computer implemented method for performing run-time instrumentation , the method comprising:capturing, by a processor, run-time instrumentation data based on an instruction stream of instructions of an application program executing on the processor, the capturing comprising storing the run-time instrumentation data in a collection buffer of the processor;detecting, by the processor, a run-time instrumentation sample point trigger; andcopying contents of the collection buffer into a program buffer as a reporting group based on the detecting the run-time instrumentation sample point trigger, the program buffer located in main storage in an address space that is accessible by the application program.2. The method of claim 1 , wherein the collection buffer is implemented by hardware located on the processor.3. The method of claim 1 , wherein the collection buffer is not accessible by the application program.4. The method of claim 1 , wherein the capturing and the detecting are performed in a manner that is transparent to the executing.5. The method of further comprising capturing claim 1 , in the collection buffer claim 1 , instruction addresses and metadata corresponding to events detected during the executing of the instruction stream.6. The method of claim 1 , wherein the reporting group ...
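
A rough software model, not the hardware facility itself: events accumulate in a private collection buffer and, on a sample-point trigger, the buffer contents are appended to an application-visible program buffer as one reporting group. Buffer sizes and the record format are assumptions made for the example.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    enum { COLLECT_SZ = 8, PROGRAM_SZ = 64 };

    static uint64_t collect[COLLECT_SZ];     /* collection buffer (not app-visible) */
    static int      collect_n;
    static uint64_t program_buf[PROGRAM_SZ]; /* program buffer in "main storage"    */
    static int      program_n;

    static void record_event(uint64_t ia_and_metadata) {
        collect[collect_n % COLLECT_SZ] = ia_and_metadata;   /* wraps when full */
        collect_n++;
    }

    static void sample_point_trigger(void) {
        int n = collect_n < COLLECT_SZ ? collect_n : COLLECT_SZ;
        if (program_n + n <= PROGRAM_SZ) {
            memcpy(&program_buf[program_n], collect, n * sizeof collect[0]);
            program_n += n;                  /* one reporting group appended */
        }
        collect_n = 0;                       /* start a fresh collection     */
    }

    int main(void) {
        for (uint64_t ia = 0x1000; ia < 0x1005; ia++) record_event(ia);
        sample_point_trigger();
        printf("reporting group of %d records\n", program_n);  /* 5 */
        return 0;
    }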

19-09-2013 publication date

RUN-TIME INSTRUMENTATION INDIRECT SAMPLING BY ADDRESS

Number: US20130246774A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by address. An aspect of the invention includes a method for implementing run-time instrumentation indirect sampling by address. The method includes reading sample-point addresses from a sample-point address array, and comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor. The method further includes recognizing a sample point upon execution of the instruction associated with the address matching one of the sample-point addresses. Run-time instrumentation information is obtained from the sample point. The method also includes storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group. 1. A computer implemented method for implementing run-time instrumentation indirect sampling by address , the method comprising:reading sample-point addresses from a sample-point address array;comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor;recognizing a sample point upon execution of the instruction associated with the address matching one of the sample-point addresses, wherein run-time instrumentation information is obtained from the sample point; andstoring the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group.2. The method of claim 1 , wherein the address associated with the instruction claim 1 , based on address type claim 1 , is one of: an address of the instruction and an address of an operand of the instruction.3. The method of claim 1 , further comprising:initializing a run-time-instrumentation control based on executing a load run-time instrumentation controls (LRIC) instruction, the LRIC instruction establishing a sampling mode and a sample-point address (SPA) control.4. The ...

19-09-2013 publication date

RUN-TIME INSTRUMENTATION SAMPLING IN TRANSACTIONAL-EXECUTION MODE

Number: US20130246775A1

Embodiments of the invention relate to implementing run-time instrumentation sampling in transactional-execution mode. An aspect of the invention includes a method for implementing run-time instrumentation sampling in transactional-execution mode. The method includes determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction. The method also includes interlocking completion of storage operations of the instructions to prevent instruction-directed storage until completion of the transaction. The method further includes recognizing a sample point during execution of the instructions while in the transactional-execution mode. The method additionally includes run-time-instrumentation-directed storing, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point. 1. A computer implemented method for implementing run-time instrumentation sampling in transactional-execution mode , the method comprising:determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction;interlocking completion of storage operations of the instructions to prevent instruction-directed storage until completion of the transaction;recognizing a sample point during execution of the instructions while in the transactional-execution mode; andrun-time-instrumentation-directed storing, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point.2. The method of claim 1 , wherein run-time-instrumentation-directed storing the run-time instrumentation information obtained at the sample point further comprises:collecting run-time instrumentation events in a collection buffer while in the transactional-execution mode;deferring storage of the collected run-time ...
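
A simplified model of the interlock described above: while in transactional-execution mode, sampled records are held back, and the run-time-instrumentation-directed stores happen only if the transaction commits. The commit/abort signalling here is a stand-in, not the architected mechanism.

    #include <stdio.h>

    enum { MAX_PENDING = 16 };

    static unsigned long pending[MAX_PENDING];
    static int pending_n;
    static int in_transaction;

    static void rti_sample(unsigned long record) {
        if (in_transaction) {                  /* defer instruction-directed store */
            if (pending_n < MAX_PENDING) pending[pending_n++] = record;
        } else {
            printf("store %#lx\n", record);    /* immediate store outside a txn */
        }
    }

    static void txn_end(int committed) {
        if (committed)
            for (int i = 0; i < pending_n; i++)
                printf("store %#lx (deferred)\n", pending[i]);
        pending_n = 0;                         /* aborted work is dropped */
        in_transaction = 0;
    }

    int main(void) {
        in_transaction = 1;
        rti_sample(0x100); rti_sample(0x104);
        txn_end(1);                            /* commit: both records stored */
        return 0;
    }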

19-09-2013 publication date

RUN-TIME INSTRUMENTATION REPORTING

Number: US20130246776A1

Embodiments of the invention relate to run-time instrumentation reporting. An instruction stream is executed by a processor. Run-time instrumentation information of the executing instruction stream is captured by the processor. Run-time instrumentation records are created based on the captured run-time instrumentation information. A run-time instrumentation sample point of the executing instruction stream on the processor is detected. A reporting group is stored in a run-time instrumentation program buffer. The storing is based on the detecting and the storing includes: determining a current address of the run-time instrumentation program buffer, the determining based on instruction accessible run-time instrumentation controls; and storing the reporting group into the run-time instrumentation program buffer based on an origin address and the current address of the run-time instrumentation program buffer, the reporting group including the created run-time instrumentation records. 1. A computer implemented method for run-time instrumentation reporting , the method comprising:executing an instruction stream by a processor;capturing, by the processor, run-time instrumentation information of said executing instruction stream;based on said captured run-time instrumentation information, creating run-time instrumentation records;detecting a run-time instrumentation sample point of the executing instruction stream on the processor; and determining a current address of the run-time instrumentation program buffer, the determining based on instruction accessible run-time instrumentation controls; and', 'storing the reporting group into the run-time instrumentation program buffer based on an origin address and the current address of the run-time instrumentation program buffer, the reporting group comprising said created run-time instrumentation records., 'storing a reporting group in a run-time instrumentation program buffer, the storing based on the detecting a run-time ...

19-09-2013 publication date

METHOD AND APPARATUS FOR BANDWIDTH ALLOCATION MODE SWITCHING BASED ON RELATIVE PRIORITIES OF THE BANDWIDTH ALLOCATION MODES

Number: US20130247072A1
Author: Chuang Yu-Chi, Kang Jack
Assignee: MARVELL WORLD TRADE LTD.

A system, apparatus, and method for allocation mode switching on an event-driven basis are described herein. The allocation mode switching method includes detecting an event, selecting a bandwidth allocation mode associated with the detected event, and allocating a plurality of execution cycles of an instruction execution period of a processor core among a plurality of instruction execution threads based at least in part on the selected bandwidth allocation mode. Other embodiments may be described and claimed. 1. A method of operating in one of at least a first bandwidth allocation mode and a second bandwidth allocation mode , the method comprising: detecting a first event associated with the first bandwidth allocation mode, and', 'comparing a priority of the first bandwidth allocation mode with a priority of the second bandwidth allocation mode; and, 'while operating in the second bandwidth allocation mode,'}based on comparing the priority of the first bandwidth allocation mode with the priority of the second bandwidth allocation mode, either (i) continue operating in the second bandwidth allocation mode or (ii) switching to the first bandwidth allocation mode.2. The method of claim 1 , wherein either (i) continue operating in the second bandwidth allocation mode or (ii) switching to the first bandwidth allocation mode further comprises:in response to the priority of the first bandwidth allocation mode being higher than the priority of the second bandwidth allocation mode, switching to the first bandwidth allocation mode.3. The method of claim 1 , wherein either (i) continue operating in the second bandwidth allocation mode or (ii) switching to the first bandwidth allocation mode further comprises:in response to the priority of the first bandwidth allocation mode being lower than the priority of the second bandwidth allocation mode, continue operating in the second bandwidth allocation mode.4. The method of claim 1 , further comprising:while operating in the first ...
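
A compact sketch of the priority rule in the claims: on an event tied to another bandwidth allocation mode, the core switches only when that mode's priority is strictly higher than the current one's. The mode names and priority values are examples, not values from the patent.

    #include <stdio.h>

    struct alloc_mode { const char *name; int priority; };

    /* Switch only if the requested mode outranks the current one. */
    static const struct alloc_mode *
    on_event(const struct alloc_mode *current, const struct alloc_mode *requested) {
        return requested->priority > current->priority ? requested : current;
    }

    int main(void) {
        struct alloc_mode normal = { "normal",    1 };
        struct alloc_mode rt     = { "real-time", 3 };
        const struct alloc_mode *cur = &normal;

        cur = on_event(cur, &rt);       /* real-time event: switch            */
        cur = on_event(cur, &normal);   /* lower-priority event: keep current */
        printf("active mode: %s\n", cur->name);   /* real-time */
        return 0;
    }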

03-10-2013 publication date

PROCESSOR AND METHOD FOR DRIVING THE SAME

Number: US20130262828A1
Author: Yoneda Seiichi

A low-power processor that does not easily malfunction is provided. Alternatively, a low-power processor having high processing speed is provided. Alternatively, a method for driving the processor is provided. In power gating, the processor performs part of data backup in parallel with arithmetic processing and performs part of data recovery in parallel with arithmetic processing. Such a driving method prevents a sharp increase in power consumption in a data backup period and a data recovery period and generation of instantaneous voltage drops and inhibits increases of the data backup period and the data recovery period. 1. A processor comprising:an instruction decoder;a logic unit including a plurality of logic circuit blocks including a volatile memory block and a nonvolatile memory block;a backup/recovery controller including a storage storing first reference instruction enumeration and second reference instruction enumeration;a power controller; anda flag storage.2. The processor according to claim 1 , wherein the volatile memory block includes a register.3. The processor according to claim 1 , wherein the nonvolatile memory block includes a transistor including an oxide semiconductor.4. The processor according to claim 1 , wherein the processor is incorporated in one selected from the group consisting of an air conditioner claim 1 , an electric refrigerator-freezer claim 1 , an image display device claim 1 , and an electric vehicle.5. A processor comprising:an instruction decoder;a logic unit including a plurality of logic circuit blocks including a volatile memory block and a nonvolatile memory block;a backup/recovery controller including a storage storing first reference instruction enumeration and second reference instruction enumeration;a power controller; anda flag storage,wherein the instruction decoder receives an instruction from an outside of the processor and gives an instruction to the logic unit, the backup/recovery controller, and the power ...

03-10-2013 publication date

INSTRUCTION MERGING OPTIMIZATION

Number: US20130262840A1

A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization. Eligibility is based on a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The method includes merging the two or more machine instructions into a single optimized internal instruction that is configured to perform first and second functions of two or more machine instructions employing operands specified by the two or more machine instructions. The single optimized internal instruction specifies the first target register only as a single target register and the single optimized internal instruction specifies the first and second functions to be performed. The method includes executing the single optimized internal instruction to perform the first and second functions of the two or more instructions. 1. A computer-implemented method comprising:determining that two or more instructions of an instruction stream are eligible for optimization, wherein the being eligible comprises determining that the two or more machine instructions comprise a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register, wherein the second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed;merging the two or more machine instructions into a single optimized internal instruction that is configured to perform the first and second functions of the two or more machine instructions employing operands specified by the two or more machine instructions, wherein the single optimized internal instruction specifies the first target register only as a single target register, wherein ...
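
A schematic of the eligibility test and the merge, assuming a toy three-operand instruction record: the first instruction writes register T, the next sequential instruction names T as both a source and its target, so the pair can be fused into one internal op with a single target. The structures are illustrative, not the processor's internal format.

    #include <stdio.h>
    #include <stdbool.h>

    struct insn { int opcode; int target; int src1; int src2; };

    /* Eligibility: a writes T, b reads T and also targets T. */
    static bool mergeable(const struct insn *a, const struct insn *b) {
        return a->target == b->target &&
               (b->src1 == a->target || b->src2 == a->target);
    }

    /* Fused internal op: one target, both functions recorded, operands of the
     * original pair carried along. */
    struct fused { int op_a, op_b; int target; int a_src1, a_src2, b_other_src; };

    static struct fused merge(const struct insn *a, const struct insn *b) {
        struct fused f = { a->opcode, b->opcode, a->target,
                           a->src1, a->src2,
                           b->src1 == a->target ? b->src2 : b->src1 };
        return f;
    }

    int main(void) {
        struct insn load_imm = { 1, /*r*/5, 0, 0 };   /* r5 <- constant */
        struct insn add      = { 2, /*r*/5, 5, 7 };   /* r5 <- r5 + r7  */
        if (mergeable(&load_imm, &add)) {
            struct fused f = merge(&load_imm, &add);
            printf("fused ops %d+%d -> r%d\n", f.op_a, f.op_b, f.target);
        }
        return 0;
    }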

03-10-2013 publication date

INSTRUCTION MERGING OPTIMIZATION

Number: US20130262841A1

A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization, where the two or more instructions include a memory load instruction and a data processing instruction to process data based on the memory load instruction. The method includes merging, by a processor, the two or more instructions into a single optimized internal instruction and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction. 1. A computer-implemented method comprising:determining that two or more instructions of an instruction stream are eligible for optimization, the two or more instruction including a memory load instruction and a data processing instruction to process data based on the memory load instruction;merging, by a processor, the two or more instructions into a single optimized internal instruction; andexecuting the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.2. The computer-implemented method of claim 1 , wherein executing the single optimized internal instruction includes executing the single optimized internal instruction instead of two or more separate internal instructions corresponding to the two or more instructions of the instruction stream.3. The computer-implemented method of claim 1 , further comprising storing the single optimized internal instruction in a single instruction slot of a queue claim 1 ,wherein executing the single optimized internal instruction includes fetching the single optimized internal instruction from the queue and generating from the single optimized internal instruction two or more separate internal instructions corresponding to the memory load instruction and the data processing instruction.4. The ...

03-10-2013 publication date

FUNCTION-BASED SOFTWARE COMPARISON METHOD

Number: US20130262843A1
Author: Du Ben-Chuan
Assignee: MStar Semiconductor, Inc.

A method for comparing a first subroutine and a second subroutine in functionality, includes: defining a plurality of instruction sets, each instruction set associated with a corresponding instruction set process; obtaining a first program section and a second program section from a first subroutine and a second subroutine, respectively, and categorizing the first subroutine and the second subroutine to one of the instruction sets, respectively; performing a program section comparison process to select and perform one of the instruction sets according to the instruction set to which the first program section is categorized and the instruction set to which the second program section is categorized, so as to compare whether the first program section and the second program section have identical functions, and to accordingly determine whether the first subroutine and the second subroutine are equivalent in functionality. 1. A method for comparing a first subroutine and a second subroutine in functionality , comprising:defining a plurality of instruction sets, each of which being associated with a corresponding instruction set process;performing a capturing process to obtain a first program section and a second program section respectively from the first subroutine and the second subroutine, and respectively categorizing the first program section and the second program section to one of the instruction sets; andperforming a program section comparison process to select and perform one of the instruction set processes according to the instruction set to which the first program section is categorized and the instruction set to which the second program section is categorized, so as to compare whether the first program section and the second program section are identical in functionality.2. The method according to claim 1 , wherein the program section comparison process comprises:when the first program section and the second program section are categorized to the same ...

10-10-2013 publication date

CORE SWITCHING ACCELERATION IN ASYMMETRIC MULTIPROCESSOR SYSTEM

Number: US20130268742A1

An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code executing on the ASMP is analyzed by a binary analysis unit to determine what functions are called by the program code and select which of the cores are to execute the program code, or a code segment thereof. Selection may be made to provide for native execution of the program code, to minimize power consumption, and so forth. Control operations based on this selection may then be inserted into the program code, forming instrumented program code. The instrumented program code is then executed by the ASMP. 1. A device comprising:a code analyzer unit to determine one or more instructions called by a code segment; anda code instrumentation unit to select a subset of a plurality of processing cores to execute the code segment and to modify the code segment to include one or more control operations based on the selected subset of the plurality of processing cores.2. The device of claim 1 , the plurality of processing cores comprising a first core and a second core claim 1 , wherein the one or more operations are to initiate migration of the code segment to the first core or the second core.3. The device of claim 1 , wherein the one or more operations are to wake one or more of the plurality of processing cores.4. The device of claim 1 , wherein the plurality of processing cores comprise a first core to execute a first instruction set and a second core to execute a second instruction set.5. A processor comprising:a first core to operate at a first maximum power consumption rate;a second core to operate at a second maximum power consumption rate which is less than the first maximum power consumption rate; and determine what instructions are called by one or more code segments within the program code;', 'select which of the first core or the second core to assign the one or more code segments ...

17-10-2013 publication date

METHOD AND APPARATUS TO PROCESS KECCAK SECURE HASHING ALGORITHM

Number: US20130275722A1
Assignee: Intel Corporation

A processor includes a plurality of registers, an instruction decoder to receive an instruction to process a KECCAK state cube of data representing a KECCAK state of a KECCAK hash algorithm, to partition the KECCAK state cube into a plurality of subcubes, and to store the subcubes in the plurality of registers, respectively, and an execution unit coupled to the instruction decoder to perform the KECCAK hash algorithm on the plurality of subcubes respectively stored in the plurality of registers in a vector manner. 1. A processor , comprising:a plurality of registers;an instruction decoder to receive an instruction to process a KECCAK state cube of data representing a KECCAK state of a KECCAK hash algorithm, to partition the KECCAK state cube into a plurality of subcubes, and to store the subcubes in the plurality of registers, respectively; andan execution unit coupled to the instruction decoder to perform the KECCAK hash algorithm on the plurality of subcubes respectively stored in the plurality of registers in a vector manner.2. The processor of claim 1 , wherein the KECCAK state cube includes 64 slices partitioned into 4 subcubes claim 1 , wherein each subcube contains 16 slices.3. The processor of claim 2 , wherein the plurality of registers include 4 registers claim 2 , each having at least 450 bits.4. The processor of claim 1 , wherein claim 1 , for each round of the KECCAK algorithm claim 1 , the execution unit is configured to perform KECCAK_THETA operations claim 1 , includingperforming a θ function of the KECCAK algorithm on the subcubes stored in the registers in parallel, andperforming a first portion of a ρ function of the KECCAK algorithm on the subcubes in parallel.5. The processor of claim 4 , wherein the execution unit is further configured to perform KECCAK_ROUND operations claim 4 , includingperforming a second portion of the ρ function of the KECCAK algorithm on the subcubes in parallel,performing a π function of the KECCAK algorithm on the ...
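
A sketch of the state partitioning only, not of the θ/ρ/π round steps: treating the 1600-bit KECCAK-f[1600] state as 25 lanes of 64 bits, each 16-slice subcube is the corresponding 16-bit fragment of every lane. The packing of subcubes into registers of at least 450 bits mentioned in the abstract is not modelled here.

    #include <stdint.h>
    #include <stdio.h>

    /* A slice is one bit position across all 25 lanes, so 64 slices total;
     * subcube k holds slices 16k .. 16k+15. */
    typedef struct { uint16_t lane[25]; } subcube;   /* 25 x 16 = 400 state bits */

    static void partition(const uint64_t state[25], subcube out[4]) {
        for (int k = 0; k < 4; k++)
            for (int i = 0; i < 25; i++)
                out[k].lane[i] = (uint16_t)(state[i] >> (16 * k));
    }

    int main(void) {
        uint64_t state[25] = { 0x0123456789abcdefULL };   /* lane 0 only, for show */
        subcube sc[4];
        partition(state, sc);
        /* lane 0 split into four 16-bit fragments, lowest slices first */
        printf("%04x %04x %04x %04x\n",
               sc[0].lane[0], sc[1].lane[0], sc[2].lane[0], sc[3].lane[0]);
        return 0;   /* prints cdef 89ab 4567 0123 */
    }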

17-10-2013 publication date

SYSTEMS, APPARATUSES, AND METHODS FOR GENERATING A DEPENDENCY VECTOR BASED ON TWO SOURCE WRITEMASK REGISTERS

Number: US20130275724A1
Author: Bharadwaj Jayashankar

Embodiments of systems, apparatuses, and methods of performing in a computer processor dependency index vector calculation in response to an instruction that includes a first and second source writemask register operands, a destination vector register operand, and an opcode are described. 1. A method of performing in a computer processor dependency index vector calculation in response to an instruction that includes a first and second source writemask register operands , a destination vector register operand , and an opcode , the method comprising steps of:executing the instruction to determine, for each bit position the first source writemask register, a dependence value that indicates for an iteration corresponding to that bit position, which bit position that it is dependent on;storing the determined dependence values in corresponding data element positions of the destination vector register.2. The method of claim 1 , wherein the destination vector register is a 128-bit vector register.3. The method of claim 1 , wherein the destination vector register is a 256-bit vector register.4. The method of claim 1 , wherein the destination vector register is a 512-bit vector register.5. The method of claim 1 , wherein the source writemask registers are 16-bit registers.6. The method of claim 1 , wherein the source writemask registers are 64-bit registers.7. The method of claim 1 , wherein the determining and storing further comprises:setting a counter value and a temporary value to 0;determining if a value in the counter value bit position of the first source writemask register is 1;when the value in the counter value bit position of the first source writemask register is 1, setting a destination vector register data element at position counter value to be the temporary value;when the value in the counter value bit position of the first source writemask register is 0, setting a destination vector register data element at position counter value to be 0;determining if a ...

24-10-2013 publication date

Computer Program Instruction Analysis

Number: US20130283011A1
Author: David A. Gilbert
Assignee: International Business Machines Corp

Disclosed is a method of analysis of a computer program instruction for use in a central processing unit having a decoding unit. The method comprises receiving an address of an instruction to be analysed, fetching said instruction stored at said address, decoding by a decoding unit associated with the central processing unit, the fetched instruction; and returning the results of said decoding of said fetched instruction. The decoded results are returned as a data block stored in memory associated with the central processing unit or in one or more registers of the central processing unit. The decoded results include the type of the instruction and/or the instruction length. The method optionally further comprises analysing the decoded results to determine whether the instruction may be replaced with one of a trap or a break point. Also disclosed is a system and computer program for analysis of a computer program instruction for use in a central processing unit having a decoding unit.
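
An illustrative decode-and-report routine over a made-up toy encoding, showing the kind of result block (instruction type plus length) the analysis returns. The opcode classes and lengths are invented, not those of any real instruction set.

    #include <stdint.h>
    #include <stdio.h>

    struct decode_result { const char *type; unsigned length; };

    /* Toy rule: the top two bits of the first byte select the instruction
     * class and imply its length. */
    static struct decode_result decode_at(const uint8_t *addr) {
        switch (addr[0] >> 6) {
        case 0:  return (struct decode_result){ "arith",  2 };
        case 1:  return (struct decode_result){ "load",   4 };
        case 2:  return (struct decode_result){ "branch", 4 };
        default: return (struct decode_result){ "system", 6 };
        }
    }

    int main(void) {
        uint8_t text[] = { 0x45, 0x00, 0x81, 0x10, 0x00, 0x00 };
        struct decode_result r = decode_at(&text[0]);
        printf("type=%s length=%u\n", r.type, r.length);  /* arith, 2 bytes */
        /* a caller could now decide whether a trap or breakpoint may replace it */
        return 0;
    }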

24-10-2013 publication date

PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO GENERATE SEQUENCES OF INTEGERS IN NUMERICAL ORDER THAT DIFFER BY A CONSTANT STRIDE

Number: US20130283019A1

A method of an aspect includes receiving an instruction indicating a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two. In an aspect, storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction. Other methods, apparatus, systems, and instructions are disclosed. 1. A method comprising:receiving an instruction, the instruction indicating a destination storage location; andstoring a result in the destination storage location in response to the instruction, the result including a sequence of at least four non-negative integers in numerical order with all integers in consecutive positions differing by a constant stride of at least two,wherein storing the result including the sequence of the at least four integers is performed without calculating the at least four integers using a result of a preceding instruction.2. The method of claim 1 , wherein receiving the instruction comprises receiving a control indexes generation instruction claim 1 , and wherein storing the result comprises storing the sequence of the at least four integers as at least four corresponding control indexes.3. The method of claim 1 , wherein storing the integers in the numerical order with the integers in the consecutive positions differing by a constant stride is fixed by an opcode of the instruction.4. The method of claim 1 , wherein receiving the instruction comprises receiving an instruction specifying the constant stride.5. The method of claim 1 , wherein receiving the instruction comprises receiving an instruction specifying an integer offset claim 1 , and wherein storing comprises storing a smallest one of the at least four integers which ...
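
A plain-C rendering of the result the instruction produces: at least four non-negative integers in numerical order, consecutive entries separated by a constant stride of at least two, generated without using the result of a preceding instruction. The offset and stride here are example parameters.

    #include <stdio.h>

    /* Materialise a stride sequence directly from (offset, stride). */
    static void stride_sequence(unsigned dest[], int n, unsigned offset, unsigned stride) {
        for (int i = 0; i < n; i++)
            dest[i] = offset + (unsigned)i * stride;   /* no dependence on prior results */
    }

    int main(void) {
        unsigned idx[8];
        stride_sequence(idx, 8, 0, 2);        /* 0 2 4 6 8 10 12 14 */
        for (int i = 0; i < 8; i++) printf("%u ", idx[i]);
        printf("\n");
        return 0;
    }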

24-10-2013 publication date

APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Number: US20130283021A1
Assignee: LURGI GMBH

An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity. 1. An apparatus , comprising:instruction execution logic circuitry to execute:a) a first instruction and a second instruction, where, both said first instruction and said second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors, said first group having a first bit width, each of said multiple first non overlapping sections having a same bit width as said first group;b) a third instruction and a fourth instruction, where, both said third instruction and said fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors, said second group having a second bit width that is larger than said first bit width, each of said multiple second non overlapping sections having a same bit width as said second group;masking layer ...

07-11-2013 publication date

Decomposing Operations in More than One Dimension into One Dimensional Point Operations

Number: US20130297908A1
Author: Krig Scott A.

A processing architecture uses stationary operands and opcodes common on a plurality of processors. Only data moves through the processors. The same opcode and operand is used by each processor assigned to operate, for example, on one row of pixels, one row of numbers, or one row of points in space. 1. A method comprising:in a computer processor, converting operations in more than one dimension into a series of one dimensional operations.2. The method of including converting an area operation into a series of one dimensional operations.3. The method of including converting an area operation into point operations implemented by memory writes into memory cells.4. The method of including combining a memory cell value with an operand according to a processing opcode and accumulating results back into the memory cell.5. The method of including:programming a plurality of parallel processors with the same operand and the same opcode; andperforming a plurality of parallel operations and storing the results in one line in a memory.6. The method of including performing a precision and numeric conversion in said processors.7. The method of wherein moving only data claim 5 , and not instructions claim 5 , along a processing pipeline.8. The method of including providing a parallel processor for each row of pixels in a frame.9. The method of including providing a storage and accumulation cell in said memory for each pixel.10. The method of including enabling each processor to perform both a point operation and an accumulation into the storage cell.11. A non-transitory computer readable medium storing instructions to implement a method comprising:converting operations in more than one dimension into a series of one dimensional operations.12. The medium of including converting an area operation into a series of one dimensional operations.13. The medium of including converting an area operation into point operations implemented by memory writes into memory cells.14. The medium of ...
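
A small sketch of the decomposition: each row is handled with the same opcode and operand as a one-dimensional point pass that accumulates back into the row's memory cells. The frame size and the two opcodes are arbitrary stand-ins for whatever area operation is being decomposed.

    #include <stdio.h>

    enum { W = 4, H = 3 };
    enum opcode { OP_ADD, OP_MUL };

    /* One "row processor": same opcode/operand applied to every pixel,
     * result accumulated back into the row's storage cells. */
    static void row_pass(int row[W], enum opcode op, int operand) {
        for (int x = 0; x < W; x++)
            row[x] = (op == OP_ADD) ? row[x] + operand : row[x] * operand;
    }

    int main(void) {
        int frame[H][W] = { {1,2,3,4}, {5,6,7,8}, {9,10,11,12} };
        for (int y = 0; y < H; y++)        /* each iteration models one row processor */
            row_pass(frame[y], OP_ADD, 10);
        printf("%d %d\n", frame[0][0], frame[2][3]);   /* 11 22 */
        return 0;
    }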

07-11-2013 publication date

FLAG NON-MODIFICATION EXTENSION FOR ISA INSTRUCTIONS USING PREFIXES

Number: US20130297915A1

In one embodiment, a processor includes an instruction decoder to receive and decode an instruction having a prefix and an opcode, an execution unit to execute the instruction based on the opcode, and flag modification override logic to prevent the execution unit from modifying a flag register of the processor based on the prefix of the instruction. 1. A method , comprising:in response to an instruction having a prefix and an opcode received at a processor, executing, by an execution unit of the processor, the instruction based on the opcode; andpreventing the execution unit from modifying a flag register of the processor based on the prefix of the instruction.2. The method of claim 1 , further comprising:extracting the prefix from the instruction; anddetermining whether the instruction is valid based on the prefix in view of a capability of the processor, wherein the execution unit is to execute the instruction only if the instruction is valid.3. The method of claim 2 , wherein determining whether the instruction is valid comprises examining a value of one or more bits of the prefix in view of a processor identifier that identifies a type of the processor.4. The method of claim 2 , further comprising generating an exception indicating that the instruction is invalid claim 2 , if one or more bits of the prefix matches a predetermined bit pattern based on the capability of the processor.5. The method of claim 1 , further comprising:preventing the execution unit from modifying the flag register if one or more bits of the prefix match a first predetermined bit pattern; andallowing the execution unit to modify the flag register if one or more bits of the prefix match a second predetermined bit pattern.6. The method of claim 1 , wherein the opcode of the instruction represents an integer operation that when executed would normally modify the flag register.7. The method of claim 1 , wherein the prefix represents a vector length when the opcode includes a vector ...
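
A behavioural sketch of the prefix semantics: the integer operation computes its result normally, but a no-flags prefix bit carried with the decoded instruction suppresses the update of the flag register. The single zero flag and the prefix representation are simplifications made for the example.

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct flags { bool zero; };

    /* ADD that normally updates the zero flag; the prefix overrides that. */
    static uint32_t add_op(uint32_t a, uint32_t b, bool no_flags_prefix,
                           struct flags *f) {
        uint32_t r = a + b;
        if (!no_flags_prefix)
            f->zero = (r == 0);
        return r;
    }

    int main(void) {
        struct flags f = { .zero = true };   /* set by some earlier compare */
        add_op(3, 4, /*no_flags_prefix=*/true, &f);
        printf("zero flag still %d\n", f.zero);   /* 1: ADD left flags intact */
        return 0;
    }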

14-11-2013 publication date

Microprocessor that enables ARM ISA program to access 64-bit general purpose registers written by x86 ISA program

Number: US20130305014A1
Author: Mark John Ebersole
Assignee: Via Technologies Inc

A microprocessor includes hardware registers that instantiate the Intel 64 Architecture R8-R15 GPRs. The microprocessor associates with each of the R8-R15 GPRs a respective unique MSR address. The microprocessor also includes hardware registers that instantiate the ARM Architecture GPRs. In response to an ARM MRRC instruction that specifies the respective unique MSR address of one of the R8-R15 GPRs, the microprocessor reads the contents of the hardware register that instantiates the specified one of the R8-R15 GPRs into the hardware registers that instantiate two of the ARM GPRs registers. In response to an ARM MCRR instruction that specifies the respective unique MSR address of one of the R8-R15 GPRs, the microprocessor writes into the hardware register that instantiates the specified one of the R8-R15 GPRs the contents of the hardware registers that instantiate two of the ARM Architecture GPRs registers. The hardware registers may be shared by the two Architectures.

14-11-2013 publication date

MFENCE and LFENCE Micro-Architectural Implementation Method and System

Number: US20130305018A1
Assignee: Individual

A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.

14-11-2013 publication date

EXECUTION OF A PERFORM FRAME MANAGEMENT FUNCTION INSTRUCTION

Number: US20130305023A1

Optimizations are provided for frame management operations, including a clear operation and/or a set storage key operation, requested by pageable guests. The operations are performed, absent host intervention, on frames not resident in host memory. The operations may be specified in an instruction issued by the pageable guests. 1. A computer system for executing an instruction , the computer system comprising:a memory; and obtaining a perform frame management function (PFMF) machine instruction, the PFMF machine instruction comprising an opcode field, a first field and a second field;', 'performing an operation on a guest frame designated by the second field, said guest frame being non-resident in host memory, the operation being specified in a location indicated by the first field and comprising a clear operation, and wherein the performing is absent host intervention and is based on a usage indicator specified in the location.', 'executing, by a pageable guest, the obtained PFMF machine instruction, the executing comprising], 'a processor in communications with the memory, wherein the computer system is configured to perform a method, said method comprising2. The computer system of claim 1 , wherein the usage indicator specifies that a program has indicated that it is likely to use the guest frame within a near future claim 1 , and wherein the clear operation includes:obtaining a host frame from a list of cleared available frames; andattaching the obtained host frame to the guest frame to be cleared.3. The computer system of claim 2 , wherein the operation further comprises a set storage key operation claim 2 , and the performing comprises including a value of a key in a control block used by a host managing the pageable guest.4. The computer system of claim 1 , wherein the usage indicator specifies that a program has indicated that it is not likely to use the guest frame within a near future claim 1 , and wherein the clear operation includes:marking one or more ...

21-11-2013 publication date

RUNNING STATE POWER SAVING VIA REDUCED INSTRUCTIONS PER CLOCK OPERATION

Number: US20130311755A1
Assignee: VIA TECHNOLOGIES, INC.

A microprocessor includes functional units and control registers writeable to cause the functional units to institute actions that reduce the instructions-per-clock rate of the microprocessor to reduce power consumption when the microprocessor is operating in its lowest performance running state. Examples of the actions include in-order vs. out-of-order execution, serial vs. parallel cache access and single vs. multiple instruction issue, retire, translation and/or formatting per clock cycle. The actions may be instituted only if additional conditions exist, such as residing in the lowest performance running state for a minimum time, not running in a higher performance state for more than a maximum time, a user did not disable the feature, the microprocessor supports multiple running states and the operating system supports multiple running states. 1. A microprocessor , comprising:functional units; andcontrol registers, writeable to cause the functional units to institute one or more actions that reduce the instructions-per-clock rate of the microprocessor to reduce power consumption when the microprocessor is operating in its lowest performance running state;wherein the lowest performance running state comprises a non-sleeping state in which the microprocessor runs at its lowest supported clock frequency.2. The microprocessor of claim 1 , wherein the one or more actions comprise:the functional units switch from executing instructions out of program order to executing instructions in program order.3. The microprocessor of claim 1 , wherein the functional units comprise an instruction issue unit claim 1 , wherein the one or more power saving actions comprise:the instruction issue unit switches from issuing for execution multiple instructions per clock cycle to issuing only one instruction per clock cycle.4. The microprocessor of claim 1 , wherein the functional units comprise an instruction retire unit claim 1 , wherein the one or more power saving actions comprise: ...

05-12-2013 publication date

METHOD, APPARATUS AND INSTRUCTIONS FOR PARALLEL DATA CONVERSIONS

Number: US20130326194A1
Author: Ramanujam Gopalan

Method, apparatus, and program means for performing a conversion. In one embodiment, a disclosed apparatus includes a destination storage location corresponding to a first architectural register. A functional unit operates responsive to a control signal, to convert a first packed first format value selected from a set of packed first format values into a plurality of second format values. Each of the first format values has a plurality of sub elements having a first number of bits The second format values have a greater number of bits. The functional unit stores the plurality of second format values into an architectural register. 1. A system comprising:a processor comprisinga register file including a first packed data register and a second packed data register,a decoder to decode a first instruction,scheduling logic to allocate resources and queue operations corresponding to the first instruction for execution, andexecution logic coupled to the decoder and the scheduling logic,wherein, responsive to the decoder decoding the first instruction, the execution logic is to convert a plurality of first packed data elements to a plurality of results,wherein the plurality of first packed data elements from the first packed data register is converted to the plurality of results, the results are saturated and stored in the second packed data register, and each of the first packed data elements has a first number of bits, each of the results has a second number of bits, and the second number of bits is one half the first number of bits;a memory controller coupled to the processor, wherein the memory controller is integral with the processor;a communication interface to a wireless network, the communication interface coupled to the processor; anda graphics interface to a display, the graphics interface coupled to the processor.2. The system of claim 1 , wherein the first number of bits is 32 and the second number of bits is 16.3. The system of claim 1 , wherein the first ...
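
A scalar model of the conversion the abstract outlines, assuming signed 32-bit sources packed down to signed 16-bit results with saturation; vector registers and lane counts are left out of the sketch.

    #include <stdint.h>
    #include <stdio.h>

    /* Clamp a 32-bit value into the signed 16-bit range. */
    static int16_t saturate16(int32_t v) {
        if (v > INT16_MAX) return INT16_MAX;
        if (v < INT16_MIN) return INT16_MIN;
        return (int16_t)v;
    }

    /* Convert n packed 32-bit elements to half-width, saturated results. */
    static void pack_saturate(const int32_t *src, int16_t *dst, int n) {
        for (int i = 0; i < n; i++)
            dst[i] = saturate16(src[i]);
    }

    int main(void) {
        int32_t src[4] = { 100, -70000, 40000, -5 };
        int16_t dst[4];
        pack_saturate(src, dst, 4);
        printf("%d %d %d %d\n", dst[0], dst[1], dst[2], dst[3]);
        /* 100 -32768 32767 -5 */
        return 0;
    }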

05-12-2013 publication date

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING VECTOR PACKED UNARY DECODING USING MASKS

Number: US20130326196A1

Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed unary value decoding using masks in response to a single vector packed unary decoding using masks instruction that includes a destination vector register operand, a source writemask register operand, and an opcode are described. 1. A method of performing in a computer processor vector packed unary value decoding using masks in response to a single vector packed unary decoding using masks instruction that includes a destination vector register operand , a source writemask register operand , and an opcode , the method comprising steps of:executing the single vector packed unary value decoding using masks instruction to determine and decode the unary encoded values stored in the source writemask register; andstoring each determined and decode unary encoded values as packed data elements in packed data element positions of the destination register that correspond to their position in the source writemask register.2. The method of claim 1 , wherein each unary encoded value is stored in a format of its most significant bit position in the writemask being a 1 value and zero or more 0 values following the 1 value in bit positions of the destination writemask register that are of less significance than the bit position of the 1 value.3. The method of claim 1 , wherein the decoded least significant unary encoded value of the source vector register is stored in the least significant packed data element position of the destination register.4. The method of claim 1 , wherein the source writemask register is 16 bits.5. The method of claim 1 , wherein the source writemask register is 64 bits.6. The method of claim 1 , wherein after all of the decoded unary encoded values are stored in the destination register claim 1 , all remaining packed data element positions of the destination vector register are set to all 1s.7. The method of claim 1 , wherein the executing step comprises: ...
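
A hedged software reading of the claimed format: scanning the writemask from least to most significant bit, each value is a run of 0s terminated by a 1 (the 1 being that value's most significant bit). The convention that the decoded value equals the length of the zero run is an assumption made for this example; the text quoted above does not fix it.

    #include <stdint.h>
    #include <stdio.h>

    /* Decode unary-encoded values from a 16-bit mask into dest[]; returns the
     * number of values found.  Assumed convention: "1" -> 0, "01" -> 1,
     * "001" -> 2, reading each group from LSB upward. */
    static int unary_decode(uint16_t mask, uint32_t dest[], int max_elems) {
        int elems = 0, zeros = 0;
        for (int bit = 0; bit < 16 && elems < max_elems; bit++) {
            if (mask & (1u << bit)) {
                dest[elems++] = (uint32_t)zeros;   /* group terminated by a 1 */
                zeros = 0;
            } else {
                zeros++;
            }
        }
        return elems;                              /* remaining elements untouched */
    }

    int main(void) {
        uint32_t dest[16];
        /* LSB-first groups: "1", "01", "001"  ->  values 0, 1, 2 */
        int n = unary_decode(0x0025 /* binary ...100101 */, dest, 16);
        printf("%d values: %u %u %u\n", n, dest[0], dest[1], dest[2]);
        return 0;
    }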

12-12-2013 publication date

SET SAMPLING CONTROLS INSTRUCTION

Number: US20130332709A1

A measurement sampling facility takes snapshots of the central processing unit (CPU) on which it is executing at specified sampling intervals to collect data relating to tasks executing on the CPU. The collected data is stored in a buffer, and at selected times, an interrupt is provided to remove data from the buffer to enable reuse thereof. The interrupt is not taken after each sample, but in sufficient time to remove the data and minimize data loss. 1. A computer system for executing a machine instruction in a central processing unit , the computer system comprising:a memory; and [ an opcode field identifying a set sampling controls instruction; and', 'a first field and a second field to be used to form a second operand address; and, 'obtaining a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, activating sampling for one or more sampling intervals to obtain information relating to processing of the central processing unit, wherein the activating sampling comprises at least one of activating basic sampling to obtain a set of architected sample data or activating diagnostic sampling to obtain a set of non-architected sample data; and', 'placing in one or more control registers one or more sampling controls of a request block located in one or more storage locations designated by the second operand address., 'executing the machine instruction, the executing comprising], 'a processor in communications with the memory, wherein the computer system is configured to perform a method, said method comprising2. The computer system of claim 1 , wherein the activating sampling comprises activating basic sampling claim 1 , and wherein the one or more control registers comprises information regarding one or more instructions executed by the central processing unit.3. The computer system of claim 1 , wherein the request block comprises at least one of:a size ...

12-12-2013 publication date

MODULATING DYNAMIC OPTIMIZATIONS OF A COMPUTER PROGRAM

Number: US20130332710A1
Author: Kruglick Ezekiel
Assignee: EMPIRE TECHNOLOGY DEVELOPMENT LLC

Technologies and implementations for modulating dynamic optimizations of a computer program during execution are generally disclosed. 1. A method comprising:receiving an intermediate representation (IR) of machine executable instructions;performing an optimization of the received IR to generate one or more intermediately optimized IRs, the one or more intermediately optimized IRs being a predetermined percentage below that of a fully optimized IR; andutilizing at least one of the one or more intermediately optimized IRs during an execution of the machine executable code.2. The method of further comprising storing the one or more of intermediately optimized IRs.3. The method of claim 1 , wherein receiving the IR comprises receiving the IR at a dynamic run-time compiler.4. The method of claim 2 , wherein the dynamic run-time compiler comprises a Just-in-Time (JIT) compiler.5. The method of claim 1 , wherein receiving the IR comprises receiving byte-code.6. The method of claim 1 , wherein receiving the IR comprise receiving virtual machine type instructions.7. The method of claim 1 , wherein utilizing the at least one of the one or more intermediately optimized IRs comprises utilizing claim 1 , in a random manner claim 1 , the at least one of the one or more intermediately optimized IRs.8. The method of claim 1 , wherein utilizing the at least one of the one or more intermediately optimized IRs comprises utilizing claim 1 , in a random time dependent manner claim 1 , the at least one of the one or more intermediately optimized IRs.9. The method of claim 1 , wherein utilizing the at least one of the one or more intermediately optimized IRs comprises adjusting claim 1 , in a random manner claim 1 , the predetermined percentage.10. A machine readable non-transitory medium having stored therein instructions that claim 1 , when executed by one or more processors claim 1 , operatively enable a programming translation module to:receive an intermediate representation (IR) of ...

12-12-2013 publication date

Systems and methods for efficient scheduling of concurrent applications in multithreaded processors

Number: US20130332711A1
Assignee: Convey Computer

Systems and methods which provide a modular processor framework and instruction set architecture designed to efficiently execute applications whose memory access patterns are irregular or non-unit stride as disclosed. A hybrid multithreading framework (HMTF) of embodiments provides a framework for constructing tightly coupled, chip-multithreading (CMT) processors that contain specific features well-suited to hiding latency to main memory and executing highly concurrent applications. The HMTF of embodiments includes an instruction set designed specifically to exploit the high degree of parallelism and concurrency control mechanisms present in the HMTF hardware modules. The instruction format implemented by a HMTF of embodiments is designed to give the architecture, the runtime libraries, and/or the application ultimate control over how and when concurrency between thread cache units is initiated. For example, one or more bit of the instruction payload may be designated as a context switch bit (CTX) for expressly controlling context switching.

19-12-2013 publication date

Selectively controlling instruction execution in transactional processing

Number: US20130339328A1
Assignee: International Business Machines Corp

Execution of instructions in a transactional environment is selectively controlled. A TRANSACTION BEGIN instruction initiates a transaction and includes controls that selectively indicate whether certain types of instructions are permitted to execute within the transaction. The controls include one or more of an allow access register modification control and an allow floating point operation control.

19-12-2013 publication date

INSTRUCTION EXECUTION UNIT THAT BROADCASTS DATA VALUES AT DIFFERENT LEVELS OF GRANULARITY

Number: US20130339664A1

An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The first data structure is four times as large as the second data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second instruction to create a second replication data structure. 1. An apparatus , comprising:an instruction execution pipeline coupled to register space, said instruction execution pipeline having an execution unit to execute a first instruction and a second instruction wherein:i) said register space is to store a first data structure to be replicated when said execution unit executes said first instruction and to store a second data structure to be replicated when said execution unit executes said second instruction, said first and second data structures both being packed data structures, data values of said first packed data structure being twice as large as data values of said second packed data structure, said first data structure being four times as large as said second data structure;ii) said execution unit includes replication logic circuitry to replicate said first data structure when executing said first instruction to create a first replication data structure, and, to replicate said second data structure when executing said second instruction to create a second replication data structure;iii) said instruction ...

Publication date: 19-12-2013

SPECIAL CASE REGISTER UPDATE WITHOUT EXECUTION

Number: US20130339667A1

A method of changing a value of associated with a logical address in a computing device. The method includes: receiving an instruction at an instruction decoder, the instruction including a target register expressed as a logical value; determining at an instruction decoder that a result of the instruction is to set the target register to a constant value, the target register being in a physical register file associated with an execution unit; and mapping, in a register mapper, the logical address to a location represented by a special register tag. 1. A computer program product for changing a value of associated with a logical address in a computing device , the computer program product comprising:a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:receiving an instruction at an instruction decoder, the instruction including a target register expressed as a logical value;determining at an instruction decoder that a result of the instruction is to set the target register to a constant value, the target register being in a physical register file associated with an execution unit; andmapping, in a register mapper, the logical address to a location represented by a special register tag.2. The computer program product of claim 1 , wherein the method further comprises:assigning one or more of the registers in the physical register file an unchangeable constant value; andwherein the special register tag is equal to an address of the register in the physical register file having an unchangeable constant value equal to the constant value.3. The computer program product of claim 1 , wherein the location represented by the special register tag is not contained in the physical register file.4. The computer program product of claim 3 , wherein the special register tag is equal to or can be converted to the constant value.5. The computer program product of claim 4 , the ...
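
To illustrate the idea of updating a register mapper instead of executing the instruction, here is a toy Python mapper that points a logical register at a constant tag when the decoder can prove the result is a constant; the tag names and the XOR r,r idiom used as the trigger are illustrative assumptions.

```python
# Toy rename/mapper sketch: a logical register can point either at a physical
# register or at a "special tag" standing for a known constant value.
ZERO_TAG = "CONST_0"          # assumed tag for an always-zero location

class RegisterMapper:
    def __init__(self):
        self.mapping = {}      # logical register name -> physical reg or tag

    def rename_for_constant(self, logical_reg, constant):
        # No physical register is allocated and no uop reaches an execution unit.
        self.mapping[logical_reg] = ZERO_TAG if constant == 0 else ("CONST", constant)

    def rename_normally(self, logical_reg, physical_reg):
        self.mapping[logical_reg] = physical_reg

def decode(opcode, dst, src1, src2, mapper):
    # e.g. "XOR r5, r5" always produces zero -> update the mapper, skip execution
    if opcode == "XOR" and src1 == src2:
        mapper.rename_for_constant(dst, 0)
        return None            # nothing dispatched to an execution unit
    return (opcode, dst, src1, src2)
```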

Publication date: 19-12-2013

SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING DELTA DECODING ON PACKED DATA ELEMENTS

Number: US20130339668A1
Assignee:

Embodiments of systems, apparatuses, and methods for performing delta decoding on packed data elements of a source and storing the results in packed data elements of a destination using a single vector packed delta decode instruction are described. 1. A method of performing delta decoding on packed data elements of a source and storing the results in packed data elements of a destination using a single vector packed delta decode instruction , the method comprising the steps of:executing, in execution resources of a processor core, a decoded vector packed delta decode instruction that includes a source operand and a destination operand each having a plurality of packed data elements to calculate for each packed data element position of the source operand a value that comprises a packed data element of that packed data element position and all packed data elements of packed data element positions that are of lesser significance; andfor each calculated value, storing the value into a packed data element position of the destination operand that corresponds to the packed data element position of the source operand.2. The method of claim 1 , wherein the source and destination operands are vector registers.3. The method of claim 2 , wherein the vector registers are 512-bit in size.4. The method of claim 1 , wherein the packed data elements are 32-bits in size.5. The method of claim 1 , wherein the values are calculated by adding all of the packed data elements of the source together and claim 1 , for each packed data element position claim 1 , subtracting all data elements that come from packed data element positions of equal or greater significance.6. The method of claim 2 , wherein the vector registers are 128-bit in size.7. The method of claim 2 , wherein the vector registers are 256-bit in size.8. A method of performing delta decoding on packed data elements of a source and storing the results in packed data elements of a destination using a single vector packed delta ...
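
The per-position semantics amount to an inclusive prefix sum over the packed source elements: each destination element is the sum of the corresponding source element and all less-significant ones. A scalar Python model (ignoring lane widths and wrap-around) is shown below.

```python
def vector_delta_decode(src):
    """Inclusive prefix sum: dst[i] = src[0] + src[1] + ... + src[i]."""
    dst, running = [], 0
    for element in src:
        running += element
        dst.append(running)
    return dst

# Deltas 5, +2, +1, +4 decode back to the absolute values 5, 7, 8, 12.
assert vector_delta_decode([5, 2, 1, 4]) == [5, 7, 8, 12]
```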

Publication date: 19-12-2013

PROCESSING APPARATUS, TRACE UNIT AND DIAGNOSTIC APPARATUS

Number: US20130339686A1
Assignee: ARM LIMITED

A processing circuit is responsive to at least one conditional instruction to perform a conditional operation in dependence on a current value of a subset of at least one condition flag. A trace circuit is provided for generating trace data elements indicative of operations performed by the processing circuit. When the processing circuit processes at least one selected instruction, then the trace circuit generates a trace data element including a traced condition value indicating at least the subset of condition flags required to determine the outcome of the conditional instruction. A corresponding diagnostic apparatus uses the traced condition value to determine a processing outcome of the at least one conditional instruction. 1. A processing apparatus comprising:processing circuitry configured to perform processing operations in response to program instructions;a condition status storage location configured to store at least one condition flag indicating a condition of said processing circuitry; andtrace circuitry configured to generate trace data elements indicative of said processing operations performed by said processing circuitry in response to said program instructions; wherein:said processing circuitry is responsive to at least one conditional instruction to perform a conditional operation in dependence on a current value of a subset of said at least one condition flag; andsaid trace circuitry is configured, in response to said processing circuitry processing at least one of said at least one conditional instruction, to generate a trace data element including a traced condition value indicative of at least said subset of said at least one condition flag, said traced condition value providing information for determining a processing outcome of said at least one conditional instruction.2. The processing apparatus according to claim 1 , wherein said traced condition value comprises an identifier identifying a value of at least said subset of said at least one ...

Publication date: 19-12-2013

METHOD AND SYSTEM FOR POLLING NETWORK CONTROLLERS

Number: US20130339710A1
Author: Ding Jianzu
Assignee: Fortinet, Inc.

Methods for improving the performance of multitasking processors are provided. For example, a subset of M processors within a Symmetric Multi-Processing System (SMP) with N processors is dedicated to a specific task. The M (M>0) of the N processors are dedicated to the task, thus leaving (N-M) processors for running the normal operating system (OS). The processors dedicated to the task may have their interrupt mechanism disabled to avoid interrupt handler switching overhead. Therefore, these processors run in an independent context and can communicate and cooperate with the normal OS to achieve higher network performance. 1. A method for improving the performance of a multi-processor system, the method comprising: dedicating M general-purpose processors from N general-purpose processors to perform a network polling task, wherein N is greater than M and the M processors are dedicated as network processors (NPs), the dedicating including disabling of interrupts to prevent context switching of the NPs and to prevent the NPs from performing tasks other than the network polling task. 2. The method of claim 1, further comprising: bypassing network interface controller (NIC) initialization during normal boot of an operating system; reserving memory in a shared memory as a pseudo NIC; and performing network polling by coupling the NPs and network interface controllers, via the pseudo NIC, to facilitate communication between the NPs and network interface controllers. 3. The method of claim 1, wherein dedicating the M general-purpose processors as NPs includes obtaining control of the M general-purpose processors such that the M general-purpose processors perform the network polling task. 4. The method of claim 1, wherein the task of polling comprises one or more of subtasks comprising: processing packets, forwarding packets, routing packets, processing content, sending packets to and from network interface controller and processing for other networks. 5. A computer-readable ...

Publication date: 26-12-2013

Optimizing Performance Of Instructions Based On Sequence Detection Or Information Associated With The Instructions

Number: US20130346728A1
Assignee:

In one embodiment, the present invention includes an instruction decoder that can receive an incoming instruction and a path select signal and decode the incoming instruction into a first instruction code or a second instruction code responsive to the path select signal. The two different instruction codes, both representing the same incoming instruction may be used by an execution unit to perform an operation optimized for different data lengths. Other embodiments are described and claimed. 1. A method comprising:determining whether an iterative copy instruction can be optimized based at least in part on information associated with the iterative copy instruction;if so performing a first portion of the iterative copy instruction by a first sequence of conditional copy operations using a power of two tree of copies to copy up to a first amount of data in up to a first number of chunks to first destination locations from first source locations;performing a second portion of the iterative copy instruction by copying a second amount of data via a fast loop of copy operations to second destination locations from second source locations if a remainder of the data to be copied is greater than a first threshold; andthereafter performing a third portion of the iterative copy instruction by a second sequence of conditional copy operations to copy up to a third amount of data in up to a third number of chunks to third destination locations from third source locations, if any of the data remains to be copied.2. The method of claim 1 , further comprising obtaining set up information for the fast loop and the second sequence of conditional copy operations before executing the first sequence of conditional copy operations.3. The method of claim 1 , further comprising determining if the second amount of data is greater than a second threshold claim 1 , and if so using a caching hint to copy the second amount of data directly to a memory without storage in a cache.4. The method of ...
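
A byte-array sketch in Python of the three-phase split described above: a branch-free sequence of conditional power-of-two copies for the head, a fast loop over whole chunks for the bulk, and a second conditional sequence for the tail. The chunk size, head limit and power-of-two sequence are made-up parameters rather than the thresholds the claims contemplate.

```python
CHUNK = 64        # assumed fast-loop granularity, in bytes
HEAD_MAX = 63     # assumed maximum size handled by the first conditional sequence

def iterative_copy(dst, src, length):
    """Three-phase copy over bytearrays: power-of-two head, chunk loop, power-of-two tail."""
    pos = 0
    # Phase 1: a fixed sequence of conditional copies moves up to HEAD_MAX
    # bytes with no looping, testing one power-of-two size per step.
    head = min(length, HEAD_MAX)
    for size in (32, 16, 8, 4, 2, 1):
        if head & size:
            dst[pos:pos + size] = src[pos:pos + size]
            pos += size
    # Phase 2: a tight loop copies the bulk of the data in whole chunks.
    while length - pos >= CHUNK:
        dst[pos:pos + CHUNK] = src[pos:pos + CHUNK]
        pos += CHUNK
    # Phase 3: a second conditional sequence copies whatever remains (< CHUNK bytes).
    tail = length - pos
    for size in (32, 16, 8, 4, 2, 1):
        if tail & size:
            dst[pos:pos + size] = src[pos:pos + size]
            pos += size
    return dst

src = bytearray(range(200))
dst = bytearray(len(src))
assert iterative_copy(dst, src, len(src)) == src
```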

Publication date: 09-01-2014

METHOD AND SYSTEM ADAPTED FOR CONVERTING SOFTWARE CONSTRUCTS INTO RESOURCES FOR IMPLEMENTATION BY A DYNAMICALLY RECONFIGURABLE PROCESSOR

Number: US20140013080A1
Author: MYKLAND ROBERT KEITH
Assignee:

A method and system are provided for deriving a resultant software code from an originating ordered list of instructions that does not include overlapping branch logic. The method may include deriving a plurality of unordered software constructs from a sequence of processor instructions; associating software constructs in accordance with an original logic of the sequence of processor instructions; determining and resolving memory precedence conflicts within the associated plurality of software constructs; resolving forward branch logic structures into conditional logic constructs; resolving back branch logic structures into loop logic constructs; and/or applying the plurality of unordered software constructs in a programming operation by a parallel execution logic circuitry. The resultant plurality of unordered software constructs may be converted into programming reconfigurable logic, computers or processors, and also by means of a computer network or an electronics communications network. 1. In an information technology system , a method comprising:a. accessing a first data flow model of a first software construct type, wherein the first data flow model includes at least one resource, the at least one resource modeling a component of a dynamically reconfigurable processor;b. initiating a compilation of a plurality of software constructs;c. determining a first instance of a software construct of the plurality of software constructs that conforms to the first software construct type; andd. expressing the first instance of the first software construct type as an instance of the first data flow model in a resultant data flow model generated from the compilation of the plurality of software constructs.2. The method of claim 1 , wherein the first data flow model comprises a plurality of resources claim 1 , wherein each resource represents at least one component of a dynamically reconfigurable processor.3. The method of claim 1 , wherein the first data flow model ...

Publication date: 16-01-2014

VECTOR FREQUENCY EXPAND INSTRUCTION

Number: US20140019714A1
Assignee:

A processor core that includes a hardware decode unit and an execution engine unit. The hardware decode unit to decode a vector frequency expand instruction, wherein the vector frequency compress instruction includes a source operand and a destination operand, wherein the source operand specifies a source vector register that includes one or more pairs of a value and run length that are to be expanded into a run of that value based on the run length. The execution engine unit to execute the decoded vector frequency expand instruction which causes, a set of one or more source data elements in the source vector register to be expanded into a set of destination data elements comprising more elements than the set of source data elements and including at least one run of identical values which were run length encoded in the source vector register. 1. A method of performing a vector frequency expand instruction in a computer processor , comprising:fetching the vector frequency expand instruction that includes a source operand and a destination operand, wherein the source operand specifies a source vector register that includes one or more pairs of a value and run length that are to be expanded into a run of that value based on the run length;decoding the fetched vector frequency expand instruction; andexecuting the decoded vector frequency expand instruction causing, a set of one or more source data elements in the source vector register to be expanded into a set of destination data elements comprising more elements than the set of source data elements and including at least one run of identical values which were run length encoded in the source vector register.2. The method of claim 1 , wherein the executing the decoded vector frequency expand instruction further causes an exception be raised when a source data element contains the value to be expanded into a run of without a run length pair.3. The method of claim 1 , wherein the executing the decoded vector frequency ...
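
A functional Python model of the expansion: each (value, run length) pair in the source becomes a run of identical destination elements. Register widths, pair packing and the exception case for a value with no run length are abstracted away.

```python
def vector_frequency_expand(pairs):
    """Expand (value, run_length) pairs into runs of repeated values."""
    dst = []
    for value, run_length in pairs:
        dst.extend([value] * run_length)
    return dst

# A source register holding three pairs expands into eight destination elements.
assert vector_frequency_expand([(7, 3), (0, 4), (9, 1)]) == [7, 7, 7, 0, 0, 0, 0, 9]
```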

Publication date: 16-01-2014

METHODS, APPARATUS, AND INSTRUCTIONS FOR CONVERTING VECTOR DATA

Number: US20140019720A1
Assignee:

A computer processor includes a decoder for decoding machine instructions and an execution unit for executing those instructions. The decoder and the execution unit are capable of decoding and executing vector instructions that include one or more format conversion indicators. For instance, the processor may be capable of executing a vector-load-convert-and-write (VLoadConWr) instruction that provides for loading data from memory to a vector register. The VLoadConWr instruction may include a format conversion indicator to indicate that the data from memory should be converted from a first format to a second format before the data is loaded into the vector register. Other embodiments are described and claimed. 1. A processor , comprising:a cache memory to store data;a memory controller to provide access to an external random access memory; instruction fetch logic to fetch one or more instructions;', 'instruction decode logic to decode one or more instructions;', 'a register file including a set of vector registers, each vector register to store a plurality of vector data elements;', 'an execution unit to execute a first instruction to read a single-precision floating point value from memory, convert the value to a double-precision floating point value, store the results in a vector register of the register file,, 'a plurality of processing cores in a single chip package, wherein each processing core is to execute multiple threads simultaneously, and wherein each processing core further compriseswherein the execution unit is to execute a second instruction to convert a single-precision floating point value to a signed integer value and store the results in a storage location,and wherein the execution unit is to execute a third instruction to convert a single-precision floating point value to an unsigned integer value and store the results in the storage location.2. The processor as in wherein the register file comprises one set of physical registers for storing ...

Publication date: 16-01-2014

BINARY TRANSLATION IN ASYMMETRIC MULTIPROCESSOR SYSTEM

Number: US20140019723A1
Assignee:

An asymmetric multiprocessor system (ASMP) may comprise computational cores implementing different instruction set architectures and having different power requirements. Program code for execution on the ASMP is analyzed and a determination is made as to whether to allow the program code, or a code segment thereof to execute on a first core natively or to use binary translation on the code and execute the translated code on a second core which consumes less power than the first core during execution. 1. A device comprising:a control unit to select whether to execute a code segment on a first core or translate the code segment for execution on a second core;a migration unit to accept the selection to execute the code segment on the first core and migrate the code segment to the first core; anda binary translator unit to accept the selection to translate the code segment and generate a binary translation of the code segment to execute on the second core;2. The device of claim 1 , the first core to execute instructions from a first instruction set architecture and the second core to execute instructions from a second instruction set architecture comprising a subset of the first instruction set architecture.3. The device of claim 1 , further comprising a translation blacklist unit to maintain a list of instructions to not perform binary translation on.4. The device of claim 1 , the selecting whether to execute or translate the code segment comprising determining a code segment length and translating when the code segment length is below a pre-determined length threshold.5. A processor comprising:a first core to operate at a first maximum power consumption rate;a second core to operate at a second maximum power consumption rate which is less than the first maximum power consumption rate; and when to execute program code on the first core without binary translation; and', 'when to apply binary translation to the program code to generate translated program code and execute ...
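
A small Python sketch of the core-selection control described above, deciding per code segment whether to run natively on the big core or translate for the lower-power core; the length threshold and blacklist contents are placeholders, not values from the disclosure.

```python
LENGTH_THRESHOLD = 256                       # assumed: only short segments are translated
TRANSLATION_BLACKLIST = {"RDTSC", "CPUID"}   # assumed contents of the blacklist unit

def translate_to_little_isa(instruction):
    """Placeholder for the binary translator unit."""
    return instruction

def choose_core(segment):
    """Return ('big', code) to run natively or ('little', code) to run translated."""
    if len(segment) >= LENGTH_THRESHOLD:
        return "big", segment                # long segment: migrate as-is to the first core
    if any(ins in TRANSLATION_BLACKLIST for ins in segment):
        return "big", segment                # contains an instruction we never translate
    return "little", [translate_to_little_isa(ins) for ins in segment]

core, code = choose_core(["MOV", "ADD", "CMP", "JNE"])   # short, translatable segment
```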

Publication date: 16-01-2014

COOPERATIVE THREAD ARRAY REDUCTION AND SCAN OPERATIONS

Number: US20140019724A1
Assignee: NVIDIA CORPORATION

One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread. 1. A method for performing a scan operation across multiple threads , the method comprising:receiving a barrier instruction that specifies the scan operation for execution by a first thread of the multiple threads;combining a value associated with the first thread with an scan result for the multiple threads;communicating the scan result to the first thread; andcausing another instruction to be executed without waiting until the barrier instruction is received by a second thread of the multiple threads.2. The method of claim 1 , further comprising the steps of:determining that the second thread is the last thread of the multiple threads to receive the barrier instruction; andinitializing the scan result.3. The method of claim 1 , wherein the communication of the scan result to the first thread occurs before the value associated with the first thread is combined with the scan result.4. The method of claim 1 , wherein the communication of the scan result to the first thread occurs after the value associated with the first thread is combined with the scan result.5. The method of claim 1 , ...

Publication date: 16-01-2014

GENERALIZED BIT MANIPULATION INSTRUCTIONS FOR A COMPUTER PROCESSOR

Number: US20140019731A1

Methods of bit manipulation within a computer processor are disclosed. Improved flexibility in bit manipulation proves helpful in computing elementary functions critical to the performance of many programs and for other applications. In one embodiment, a unit of input data is shifted/rotated and multiple non-contiguous bit fields from the unit of input data are inserted in an output register. In another embodiment, one of two units of input data is optionally shifted or rotated, the two units of input data are partitioned into a plurality of bit fields, bitwise operations are performed on each bit field, and pairs of bit fields are combined with either an AND or an OR bitwise operation. Embodiments are also disclosed to simultaneously perform these processes on multiple units and pairs of units of input data in a Single Input, Multiple Data processing environment capable of performing logical operations on floating point data. 1. A method for bit manipulation within a computer processor , the method comprising:provisioning at least one unit of input data; partition a unit of input data from the at least one unit of input data into a plurality of bit fields, where a length of a bit field is determined based on a capability of each bit in the bit field to be manipulated according to a common manipulation rule to achieve a particular result, and', 'manipulate the plurality of bit fields in accordance with at least one manipulation rule to accomplish the particular result;, 'provisioning control data, wherein the control data is configured to provide information necessary topartitioning the at least one unit of input data into the plurality of bit fields in accordance with the control data; andmanipulating each bit field in accordance with the control data to achieve the particular result.2. The method of claim 1 , further comprising:storing a vector of units of input data from the at least one unit of input data for parallel processing in a single instruction multiple ...
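
A Python sketch of the rotate-then-insert embodiment on 32-bit words: the source is rotated and several non-contiguous bit fields are deposited into the output register under control data. The (src_lsb, dst_lsb, length) field descriptors used here are illustrative, not the patented control-data encoding.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def rotate_left(value, amount):
    amount %= WIDTH
    return ((value << amount) | (value >> (WIDTH - amount))) & MASK

def insert_bit_fields(src, dst, rotate, fields):
    """Rotate src, then copy each (src_lsb, dst_lsb, length) field into dst."""
    rotated = rotate_left(src, rotate)
    for src_lsb, dst_lsb, length in fields:
        field_mask = (1 << length) - 1
        field = (rotated >> src_lsb) & field_mask
        dst = (dst & ~(field_mask << dst_lsb)) | (field << dst_lsb)
    return dst & MASK

# Two non-contiguous 4-bit fields of the rotated source land at bits 0 and 16.
result = insert_bit_fields(0x12345678, 0, rotate=8, fields=[(0, 0, 4), (8, 16, 4)])
```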

Publication date: 16-01-2014

REAL TIME INSTRUCTION TRACING COMPRESSION OF RET INSTRUCTIONS

Number: US20140019733A1
Assignee:

In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing Real Time Instruction Tracing compression of RET instructions For example, in one embodiment, such means may include an integrated circuit having means for initiating instruction tracing for instructions of a traced application, mode, or code region, as the instructions are executed by the integrated circuit; means for generating a plurality of packets describing the instruction tracing; and means for compressing a multi-bit RET instruction (RETurn instruction) to a single bit RET instruction. 1. A method in an integrated circuit , wherein the method comprises:initiating instruction tracing for instructions of a traced application, mode, or code region, as the instructions are executed by the integrated circuit;generating a plurality of packets describing the instruction tracing; andcompressing a multi-bit RET instruction (RETurn instruction) to a single bit RET instruction.2. The method of claim 1 , wherein compressing the multi-bit RET instruction to the single bit RET instruction comprises:the integrated circuit executing and retiring a near call;the integrated circuit storing a linear address of an instruction following the near call;the integrated circuit executing and retiring RET instruction (RETurn instruction);determining whether a target address of the RET instruction matches a current call NLIP (Next Linear Instruction Pointer) entry and storing “0” when not matching and storing “1” when matching, into a history buffer; anddetermining whether the history buffer is full and sending a TNT packet (Taken-Not-Taken packet) when full and waiting until the history buffer is full when the history buffer is not full.3. The method of claim 1 , wherein the multi-bit RET is one of a 24-bit or a 56-bit return instruction.4. The method of claim 1 , wherein the multi-bit RET instruction comprises an indirect jump whose target ...
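
A behavioural Python sketch of the RET compression scheme: calls push the next linear IP, returns record a single match/mismatch bit, and a TNT-style packet is emitted when the history buffer fills. The 8-bit buffer size and packet representation are assumptions, and the packet that would carry a mismatching return target is omitted.

```python
HISTORY_BITS = 8                      # assumed history-buffer capacity

class RetCompressor:
    def __init__(self, emit_packet):
        self.call_stack = []          # NLIP of the instruction after each near call
        self.history = []             # one bit per retired RET
        self.emit_packet = emit_packet

    def on_call(self, next_linear_ip):
        self.call_stack.append(next_linear_ip)

    def on_ret(self, target_ip):
        expected = self.call_stack.pop() if self.call_stack else None
        self.history.append(1 if target_ip == expected else 0)
        if len(self.history) == HISTORY_BITS:
            self.emit_packet(("TNT", tuple(self.history)))   # full buffer -> send packet
            self.history.clear()

packets = []
tracer = RetCompressor(packets.append)
tracer.on_call(0x401005)
tracer.on_ret(0x401005)               # records a single "matched" bit
```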

Publication date: 16-01-2014

DATA PROCESSING APPARATUS AND METHOD USING CHECKPOINTING

Number: US20140019734A1
Assignee: ARM LIMITED

A data processing apparatus and method of data processing are provided. The data processing apparatus comprises execution circuitry configured to execute a sequence of program instructions. Checkpoint circuitry is configured to identify an instance of a predetermined type of instruction in the sequence of program instructions and to store checkpoint information associated with that instance. The checkpoint information identifies a state of the data processing apparatus prior to execution of that instance of the predetermined type of instruction, wherein the predetermined type of instruction has an expected long completion latency. If the execution circuitry does not complete execution of that instance of the predetermined type of instruction due to occurrence of a predetermined event, the data processing apparatus is arranged to reinstate the state of the data processing apparatus with reference to the checkpoint information, such that the execution circuitry is then configured to recommence execution of the sequence of program instructions at that instance of the predetermined type of instruction. 1. A data processing apparatus comprising:execution circuitry configured to execute a sequence of program instructions;checkpoint circuitry configured to identify an instance of a predetermined type of instruction in said sequence of program instructions and to store checkpoint information associated with said instance of said predetermined type of instruction, said checkpoint information identifying a state of said data processing apparatus prior to execution of said instance of said predetermined type of instruction,wherein said predetermined type of instruction has an expected long completion latency, andwherein if said execution circuitry does not complete execution of said instance of said predetermined type of instruction due to occurrence of a predetermined event, said data processing apparatus is arranged to reinstate said state of said data processing apparatus ...

Publication date: 23-01-2014

CONTROL APPARATUS

Number: US20140025936A1
Author: Morikawa Daisuke
Assignee:

A control apparatus configured to receive instruction data from a transmission unit and to control a controlled apparatus based on the instruction data includes a determination unit configured to determine an error in reception of the instruction data from the transmission unit, a communication unit configured to receive the instruction data from the transmission unit and to transmit reply data according to a result of determination of the determination unit to the transmission unit, a module configured to control the controlled apparatus based on the instruction data, and a control unit configured to, if a content of current instruction data received by the communication unit matches a content of previous instruction data received by the communication unit, control the module not to control the controlled apparatus based on the current instruction data. 1. A control apparatus configured to receive instruction data from a transmission unit and to control a controlled apparatus based on the instruction data , the control apparatus comprising:a determination unit configured to determine an error in reception of the instruction data from the transmission unit;a communication unit configured to receive the instruction data from the transmission unit and to transmit reply data according to a result of determination of the determination unit to the transmission unit;a module configured to control the controlled apparatus based on the instruction data; anda control unit configured to, if a content of current instruction data received by the communication unit matches a content of previous instruction data received by the communication unit, control the module not to control the controlled apparatus based on the current instruction data.2. The control apparatus according to claim 1 , wherein the control unit is configured to claim 1 , if the current instruction data includes a specific instruction claim 1 , control the module to control the controlled apparatus based on the ...
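
A compact Python sketch of the duplicate-suppression behaviour: the unit acknowledges every transmission but only drives the module when the instruction data differs from the previously received data, unless it carries a specific always-apply instruction. The dictionary format and the RESET marker are illustrative assumptions.

```python
ALWAYS_APPLY = {"RESET"}              # assumed "specific instructions" that bypass the filter

class ControlUnit:
    def __init__(self, module):
        self.module = module
        self.previous = None

    def receive(self, instruction_data):
        """Reply to the transmitter but only drive the controlled apparatus on new data."""
        duplicate = (instruction_data == self.previous
                     and instruction_data.get("command") not in ALWAYS_APPLY)
        self.previous = instruction_data
        if not duplicate:
            self.module.apply(instruction_data)       # hypothetical module hook
        return {"ack": True, "applied": not duplicate}  # reply data to the transmission unit
```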

Publication date: 30-01-2014

METHODS AND APPARATUS TO MANAGE PARTIAL-COMMIT CHECKPOINTS WITH FIXUP SUPPORT

Number: US20140032885A1
Author: Borin Edson, Wu Youfeng
Assignee:

Example methods and apparatus to manage partial commit-checkpoints are disclosed. A disclosed example method includes identifying a commit instruction associated with a region of instructions executed by a processor, identifying candidate instructions from the region of instructions, and generating a processor partial commit-checkpoint to save a current state of the processor, the checkpoint based on calculated register values associated with live instructions, and including instruction reference addresses to link the candidate instructions. 1distinguishing between first instructions and candidate instructions from a region of instructions, the candidate instructions to produce an unused value in the region of instructions;preventing creation of a precise checkpoint state for a register associated with the candidate instructions in response to a speculative execution attempt; andgenerating code to calculate the precise checkpoint state of the register and associating the code with an instruction reference address to link to the candidate instructions.. A method to reduce processor resources during checkpoint creation, comprising: This patent arises from an application claiming priority as a continuation of U.S. application Ser. No. 12/644,151, (Now U.S. Pat. No. 8,549,267), entitled “METHODS AND APPARATUS TO MANAGE PARTIAL-COMMIT CHECKPOINTS WITH FIXUP SUPPORT” which was filed on Dec. 22, 2009, granted on Oct. 1, 2013 and is hereby incorporated herein by reference in its entirety.The present disclosure relates to speculative execution, and in particular, to methods and apparatus to manage partialcommit-checkpoints with fixup support.In the context of microprocessors, a speculative execution system (SES) is a system that enables the speculative execution of instructions. Speculative execution is typically leveraged to enable safe execution of dynamically optimized code (e.g., execution of optimized regions of code in a hardware (HW) and/or software (SW) co-designed ...

Publication date: 06-02-2014

Storage Method, Memory, and Storing System with Accumulated Write Feature

Number: US20140040602A1
Author: Jochen Hoffmann
Assignee: Xi'an Sinochip Semiconductors Co., Ltd.

A storage method, a memory and a storage system that have an accumulated write feature are provided in which the OR and AND operation are shifted from CPU/ALU (controller) to the memory, and the frequency for switching data transmission lines between read and write instructions can be reduced. In the memory, the interface unit includes a write arithmetic instruction interface, a write instruction interface, and an address instruction interface; the instruction/address decoder is configured to decode a write arithmetic instruction, a write instruction and an address instruction; and the pFET has a higher driving capability than the data switches, and the nFET has a lower driving capability than the data switches. The storage method, memory and storage system can reduce work load of CPU/ALU, and enable continuous data writing to the memory. 1. A storage method with an accumulated write feature , comprising steps of:1) providing a standard instruction interface between a controller or CPU and a memory, so that the controller or CPU can send a write instruction, an address instruction and a write arithmetic instruction to the memory, wherein the write arithmetic instruction comprises a “write_OR” instruction and/or a “write_AND” instruction;2) decoding the write instruction, the address instruction and the write arithmetic instruction by an instruction/address decoder in the memory;3) if a “write_OR” instruction is decoded, turning on a “write_OR” data switch of complementary data switches in a memory cell corresponding to the address instruction, wherein data written from a data transmission line can switch non-inverted data in cross-coupled inverters from 0 to 1, but not from 1 to 0; if a “write_AND” instruction is decoded, turning on a “write_AND” data switch of the complementary data switches in the memory cell corresponding to the address instruction, wherein the data written from the data transmission line can switch the non-inverted data in the cross-coupled ...
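
A word-level Python model of the three write commands, showing why an in-memory write_OR can only set bits (0 to 1) and write_AND can only clear them (1 to 0); the cell-level FET sizing details are, of course, not modelled.

```python
class AccumulatingMemory:
    def __init__(self, words):
        self.cells = [0] * words

    def write(self, addr, data):          # plain write: overwrite the cell
        self.cells[addr] = data

    def write_or(self, addr, data):       # accumulated write: can only set bits
        self.cells[addr] |= data

    def write_and(self, addr, data):      # accumulated write: can only clear bits
        self.cells[addr] &= data

mem = AccumulatingMemory(4)
mem.write(0, 0b0011)
mem.write_or(0, 0b0101)                   # -> 0b0111, no read-modify-write on the bus
mem.write_and(0, 0b0110)                  # -> 0b0110
assert mem.cells[0] == 0b0110
```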

Publication date: 13-02-2014

FUSING FLAG-PRODUCING AND FLAG-CONSUMING INSTRUCTIONS IN INSTRUCTION PROCESSING CIRCUITS, AND RELATED PROCESSOR SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA

Number: US20140047221A1
Assignee: QUALCOMM INCORPORATED

Fusing flag-producing and flag-consuming instructions in instruction processing circuits and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a flag-producing instruction indicating a first operation generating a first flag result is detected in an instruction stream by an instruction processing circuit. The instruction processing circuit also detects a flag-consuming instruction in the instruction stream indicating a second operation consuming the first flag result as an input. The instruction processing circuit generates a fused instruction indicating the first operation generating the first flag result and indicating the second operation consuming the first flag result as the input. In this manner, as a non-limiting example, the fused instruction eliminates a potential for a read-after-write hazard between the flag-producing instruction and the flag-consuming instruction. 1. An instruction processing circuit , configured to:detect a flag-producing instruction in an instruction stream indicating a first operation generating a first flag result;detect a flag-consuming instruction in the instruction stream indicating a second operation consuming the first flag result as an input; andgenerate a fused instruction indicating the first operation generating the first flag result and indicating the second operation consuming the first flag result as the input.2. The instruction processing circuit of claim 1 , configured to detect the flag-producing instruction indicating the first operation setting one or more condition code flags.3. The instruction processing circuit of claim 1 , configured to detect the flag-consuming instruction located adjacent to the flag-producing instruction in the instruction stream.4. The instruction processing circuit of claim 1 , further configured to:detect at least one intervening instruction in the instruction stream between the flag-producing instruction and the flag-consuming instruction; ...
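
A Python peephole sketch of the fusion step: a flag-producing instruction immediately followed by a flag-consuming one is replaced by a single fused entry, so the flag result never round-trips through an architected flag register. The mnemonics and the adjacent-only restriction are illustrative assumptions.

```python
FLAG_PRODUCERS = {"CMP", "ADDS", "SUBS"}      # assumed: instructions that set condition flags
FLAG_CONSUMERS = {"CSEL", "BEQ", "ADC"}       # assumed: instructions that read condition flags

def fuse_flag_ops(stream):
    """Replace adjacent producer/consumer pairs with a single fused operation."""
    fused, i = [], 0
    while i < len(stream):
        op = stream[i]
        nxt = stream[i + 1] if i + 1 < len(stream) else None
        if nxt and op[0] in FLAG_PRODUCERS and nxt[0] in FLAG_CONSUMERS:
            fused.append(("FUSED", op, nxt))  # one op both produces and consumes the flags
            i += 2
        else:
            fused.append(op)
            i += 1
    return fused

program = [("CMP", "r1", "r2"), ("BEQ", "label"), ("ADD", "r3", "r4", "r5")]
print(fuse_flag_ops(program))
```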

Publication date: 13-02-2014

METHOD AND DEVICE FOR RECOMBINING RUNTIME INSTRUCTION

Number: US20140047222A1
Author: Wang Jiaxiang

A method for recombining runtime instruction comprising: an instruction running environment is buffered; the machine instruction segment to be scheduled is obtained; the second jump instruction which directs an entry address of an instruction recombining platform is inserted before the last instruction of the obtained machine instruction segment to generate the recombined instruction segment comprising the address A″; the value A of the address register of the buffered instruction running environment is modified to the address A″; the instruction running environment is recovered. A device for recombining the runtime instruction comprising: an instruction running environment buffering and recovering unit suitable for buffering and recovering the instruction running environment; an instruction obtaining unit suitable for obtaining the machine instruction segment to be scheduled; an instruction recombining unit suitable for generating the recombined instruction segment comprised the address A″; and an instruction replacing unit suitable for modifying the value of the address register of the buffered instruction running environment to the address of the recombined instruction segment. The monitoring and control of the runtime instruction of the computing device is completed. 1. A runtime instruction recombination method , comprising:storing an instruction execution context;acquiring a machine instruction segment to be scheduled, inserting a second control transfer instruction before the last instruction of the machine instruction segment to be scheduled, the second control transfer instruction pointing to an entry address of an instruction recombination platform, which generates a recombined instruction segment, and modifying value of an address register in the instruction execution context to an address of the recombined instruction segment; andrestoring the instruction execution context, wherein the address register's value is updated.2. The runtime instruction ...

Publication date: 20-02-2014

TECHNIQUE TO PERFORM THREE-SOURCE OPERATIONS

Number: US20140052963A1
Assignee:

A technique to perform three-source instructions. At least one embodiment of the invention relates to converting a three-source instruction into at least two instructions identifying no more than two source values. 1. A processor comprising:a decoder unit to decode and convert a first instruction, in which is identified at least three source operands and a single mathematical operation to be performed on the at least three source operands, into at least two instructions, each identifying no more than two source operands, to perform a function prescribed by the first instruction; anda processing stage to perform the at least two instructions, the processing stage including a plurality of stages, each having no more than two read ports to receive the no more than two source operands of the at least two instructions2. The processor of claim 1 , wherein the first instruction corresponds to a three-source micro-operation (uop).3. The processor of claim 2 , wherein the decoder is to convert the three-source uop into at least two two-source uops.4. The processor of claim 1 , wherein the plurality of stages includes a reservation station claim 1 , a re-order buffer claim 1 , and an execution unit.5. The processor of claim 4 , wherein the execution unit is to perform the at least two instructions out of program order.6. The processor of claim 5 , further comprising a retirement unit to retire the at least two instructions in program order.7. The processor of claim 6 , wherein each of the at least two instructions includes a field to indicate correspondence to the first instruction.8. The processor of claim 6 , wherein the first one of the at least two instructions includes a pointer to a second one of the at least two instructions.9. The processor of claim 8 , wherein the retirement unit is to access the pointer to retire the at least two instructions in the program order.10. The processor of claim 4 , further comprising a register allocation table.11. The processor of claim ...
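
A sketch in Python of cracking a three-source multiply-add into two chained two-source uops, with a temporary destination and a link between the pair so they can be retired together in program order; the field names and MADD semantics (dst = a*b + c) are illustrative, not the patented micro-op format.

```python
import itertools
_temp_ids = itertools.count()

def crack_three_source(uop):
    """Split ('MADD', dst, a, b, c), meaning dst = a*b + c, into two 2-source uops."""
    opcode, dst, a, b, c = uop
    assert opcode == "MADD"
    temp = f"t{next(_temp_ids)}"           # rename-allocated temporary register
    first  = {"op": "MUL", "dst": temp, "srcs": (a, b), "parent": uop}
    second = {"op": "ADD", "dst": dst,  "srcs": (temp, c), "parent": uop}
    first["next"] = second                 # pointer used to retire the pair together
    return [first, second]

uops = crack_three_source(("MADD", "r0", "r1", "r2", "r3"))   # r0 = r1*r2 + r3
```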

Publication date: 20-02-2014

Programmable Logic Unit and Method for Translating and Processing Instructions Using Interpretation Registers

Number: US20140052964A1
Assignee: 3DLabs Ltd

An architecture for microprocessors and the like in which instructions include a type identifier, which selects one of several interpretation registers. The interpretation registers hold information for interpreting the opcode of each instruction, so that a stream of compressed instructions (with type identifiers) can be translated into a stream of expanded instructions. Preferably the type identifiers also distinguish sequencer instructions from processing-element instructions, and can even distinguish among different types of sequencer instructions (as well as among different types of processing-element instructions).

Publication date: 20-02-2014

OPCODE COUNTING FOR PERFORMANCE MEASUREMENT

Number: US20140052970A1

Methods, systems and computer program products are disclosed for measuring a performance of a program running on a processing unit of a processing system. In one embodiment, the method comprises informing a logic unit of each instruction in the program that is executed by the processing unit, assigning a weight to each instruction, assigning the instructions to a plurality of groups, and analyzing the plurality of groups to measure one or more metrics. In one embodiment, each instruction includes an operating code portion, and the assigning includes assigning the instructions to the groups based on the operating code portions of the instructions. In an embodiment, each type of instruction is assigned to a respective one of the plurality of groups. These groups may be combined into a plurality of sets of the groups. 1. A method of measuring a performance of a program running on a processing unit of a processing system , the method comprising:informing a logic unit of each instruction in the program that is executed by the processing unit;assigning a weight to said each instruction;assigning the instructions to a plurality of groups; andanalyzing said plurality of groups to measure one or more metrics of the program. the logic unit includes a first circuit portion and a second circuit portion; applying a first input to the first circuit portion of the logic unit each time the program executes a floating point operation, and', 'applying a second input to the second circuit portion of the logic unit each time the program executes an integer operation, 'the informing the logic unit of each instruction includes2. The method according to claim 1 , wherein each instruction includes an operating code portion claim 1 , and the assigning includes assigning the instructions to said groups based on the operating code portions of the instructions.3. The method according to claim 1 , wherein the assigning a weight includes using the first circuit portion of the logic unit to ...

Publication date: 27-02-2014

System Core for Transferring Data Between an External Device and Memory

Number: US20140059324A1
Assignee: Individual

Details of a highly cost effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.

Publication date: 27-02-2014

DETECTING CROSS-TALK ON PROCESSOR LINKS

Number: US20140059327A1

A first of a plurality of data lanes of a first of a plurality of processor links is determined to have a weakest of base performance measurements for the plurality of data lanes. A switching data pattern is transmitted via a first set of the remainder processor links and a quiet data pattern is transmitted via a second set of the remainder processor links. If performance of the first data lane increases vis-à-vis the corresponding base performance measurement, the first set of remainder processor links is eliminated from the remainder processor links. If performance of the first data lanes decreases vis-à-vis the corresponding base performance measurement, the second set of remainder processor links is eliminated from the remainder processor links. The above operations are repeatedly executed until an aggressor processor link that is determined to decrease performance of the first of the plurality of data lanes is identified. 1. A method comprising:determining that a first of a plurality of data lanes has a base performance measurement that is a weakest of base performance measurements for the plurality of data lanes, wherein a first processor link of a plurality of processor links comprises the plurality of data lanes;determining that other processor links of the plurality of processor links cause variation in performance of the first of the plurality of data lanes of the first processor link; transmitting a switching data pattern via a first set of remainder processor links and a quiet data pattern via a second set of the remainder processor links, wherein the remainder processor links initially comprise the plurality of processor links excluding the first processor link,', 'determining whether performance of the first data lane of the first processor link increases or decreases with respect to the base performance measurement of the first data lane,', 'eliminating the first set of remainder processor links from the remainder processor links if the performance of ...
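
A Python sketch of the halving search described above: drive a switching pattern on one half of the suspect links and a quiet pattern on the other, keep whichever half is implicated by the weak lane's measurement, and repeat until one aggressor remains. The drive_patterns and measure_lane callables stand in for the real link hardware and are placeholders.

```python
def find_aggressor(weak_lane_base, suspects, drive_patterns, measure_lane):
    """Repeatedly halve the suspect link set until a single aggressor remains.

    drive_patterns(switching, quiet) configures the data patterns on the links;
    measure_lane() returns the weak lane's current performance figure.
    """
    suspects = list(suspects)
    while len(suspects) > 1:
        half = len(suspects) // 2
        switching, quiet = suspects[:half], suspects[half:]
        drive_patterns(switching, quiet)
        current = measure_lane()
        if current > weak_lane_base:
            suspects = quiet          # improvement: the aggressor sits in the quiet half
        else:
            suspects = switching      # degradation: the aggressor sits in the switching half
    return suspects[0]
```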

Publication date: 06-03-2014

INSTRUCTION ADDRESS ENCODING AND DECODING BASED ON PROGRAM CONSTRUCT GROUPS

Number: US20140068229A1
Assignee: LSI Corporation

Coding circuitry comprises at least an encoder configured to encode an instruction address for transmission to a decoder. The encoder is operative to identify the instruction address as belonging to a particular one of a plurality of groups of instruction addresses associated with respective distinct program constructs, and to encode the instruction address based on the identified group. The decoder is operative to identify the encoded instruction address as belonging to the particular one of a plurality of groups of instruction addresses associated with respective distinct program constructs, and to decode the encoded instruction address based on the identified group. The coding circuitry may be implemented as part of an integrated circuit or other processing device that includes associated processor and memory elements. In such an arrangement, the processor may generate the instruction address for delivery over a bus to the memory. 1. A method comprising:obtaining an instruction address; andencoding the instruction address;wherein said encoding comprises:identifying the instruction address as belonging to a particular one of a plurality of groups of instruction addresses associated with respective distinct program constructs; andencoding the instruction address based on the identified group.2. The method of further comprising transmitting the encoded instruction address over a bus to a decoder.3. The method of wherein the encoded instruction address includes at least an identifier of the particular group.4. The method of wherein the identifier of the particular group specifies a corresponding branch target address.5. The method of wherein the distinct program constructs comprise two or more of a sequential construct claim 1 , a loop construct claim 1 , an if-then-else construct claim 1 , and a subroutine call/return construct.6. The method of wherein a given one of the groups is associated with a sequential construct having a starting address and a stride ...

Publication date: 06-03-2014

MICRO-ARCHITECTURE FOR ELIMINATING MOV OPERATIONS

Number: US20140068230A1
Assignee:

A computer system and processor for elimination of move operations include circuits that obtain a computer instruction and bypass execution units in response to determining that the instruction includes a move operation that involves a transfer of data from a logical source register to a logical destination register. Instead of executing the move operation, the transfer of the data is performed by tracking changes in data dependencies of the source and the destination registers, and assigning a physical register associated with the source register to the destination register based on the dependencies. 1. A computer system that is configured to perform the following:obtaining a computer instruction;responsive to determining that the instruction includes a move operation that involves a transfer of data from a logical source register to a logical destination register, bypassing any execution units in the system to prevent the execution units from executing the operation; and tracking changes in data dependencies of the source and the destination registers, and', 'assigning a physical register associated with the source register to the destination register based on the dependencies., 'performing the transfer of the data by2. The computer system of claim 1 , wherein:the system includes a renaming table containing pointers to the physical registers; andthe assigning of the physical register is performed by assigning a pointer associated with the source register to the destination register.3. The computer system of claim 2 , wherein the pointers to the physical registers include speculative pointers that point to physical registers containing speculative copies of data claim 2 , as well as architectural pointers that point to physical registers containing architectural copies of data.4. The computer system of claim 3 , wherein the system is configured to:when the instruction's existing speculative pointer is updated, change the instruction's architectural pointer to point ...
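
A toy Python rename table showing the pointer copy that replaces an executed MOV; physical-register reference counting and the speculative/architectural pointer split are omitted for brevity.

```python
class RenameTable:
    """Maps logical registers to the physical registers holding their values."""
    def __init__(self):
        self.pointer = {}                       # logical name -> physical register id

    def rename(self, logical, physical):
        self.pointer[logical] = physical

    def eliminate_move(self, dst_logical, src_logical):
        # No execution unit is involved: dst now shares src's physical register.
        self.pointer[dst_logical] = self.pointer[src_logical]

rt = RenameTable()
rt.rename("r1", "p17")                          # r1's value lives in p17
rt.eliminate_move("r2", "r1")                   # MOV r2, r1 becomes a table update
assert rt.pointer["r2"] == "p17"
```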

Publication date: 20-03-2014

Encoding to Increase Instruction Set Density

Number: US20140082334A1
Assignee:

A conventional instruction set architecture such, as the x86 instruction set architecture, may be reencoded to reduce the amount of memory used by the instructions. This may be particularly useful in applications that are memory sized limited, as is the case with microcontrollers. With a reencoded instruction set that is more dense, more functions can be implemented or a smaller memory size may be used. The encoded instructions are then naturally decoded at run time in the predecoder and decoder of the core pipeline. 1. A method comprising:compressing an instruction set for a processor.2. The method of including compressing instructions using Huffman coding.3. The method of including controlling compression based on a user input.4. The method of including controlling compression based on a user input about the number of new instructions.5. The method of including controlling compression based on a user input about the maximum compression.6. The method of including controlling compression based on a user input about a binary size goal.7. The method of including allow for some reserved instructions of a specified length based on user input.8. The method of including collecting information from a compiler and using that information to control compression.9. The method of including calculating information from the compiler about how many times an instruction was used to control compression.10. The method of including calculating information from the computer about an amount of memory used by an instruction.11. The method of including compressing more frequently used instructions more than less frequently used instructions.12. The method of including identifying new instructions with more efficient operand encoding.13. The method of including identifying new compact opcodes for instructions without using Huffman encoding.14. A non-transitory computer readable medium storing instructions to enable a processor to implement a method comprising:compressing an instruction set ...
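
As one way to realise the frequency-driven re-encoding, the following self-contained Python sketch Huffman-codes opcodes from a made-up, compiler-derived usage histogram, so the most common instructions receive the shortest encodings. This illustrates the principle only; it is not the claimed tooling or encoding format.

```python
import heapq

def huffman_code(frequencies):
    """Return {opcode: bitstring}, giving shorter codes to more frequent opcodes."""
    heap = [(count, index, {symbol: ""}) for index, (symbol, count)
            in enumerate(frequencies.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {symbol: "0" for symbol in frequencies}
    while len(heap) > 1:
        c0, _, codes0 = heapq.heappop(heap)
        c1, i1, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}      # prefix one subtree with 0
        merged.update({s: "1" + c for s, c in codes1.items()})  # and the other with 1
        heapq.heappush(heap, (c0 + c1, i1, merged))
    return heap[0][2]

# Made-up opcode usage counts, e.g. gathered from a compiler profile.
counts = {"MOV": 900, "ADD": 500, "CMP": 300, "MUL": 60, "DIV": 10}
print(huffman_code(counts))   # MOV gets the shortest code, DIV the longest
```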

Publication date: 27-03-2014

CACHING OPTIMIZED INTERNAL INSTRUCTIONS IN LOOP BUFFER

Number: US20140089636A1

Embodiments of the invention relate to a computer system for storing an internal instruction loop in a loop buffer. The computer system includes a loop buffer and a processor. The computer system is configured to perform a method including fetching instructions from memory to generate an internal instruction to be executed, detecting a beginning of a first instruction loop in the instructions, determining that a first internal instruction loop corresponding to the first instruction loop is not stored in the loop buffer, fetching the first instruction loop, optimizing one or more instructions corresponding to the first instruction loop to generate a first optimized internal instruction loop, and storing the first optimized internal instruction loop in the loop buffer based on the determination that the first internal instruction loop is not stored in the loop buffer. 1. A computer program product for implementing an instruction loop buffer , the computer program product comprising: fetching instructions from memory to generate an internal instruction to be executed;', 'determining, by a processor, that a first instruction from the instructions corresponds to a first instruction loop;', 'determining that a first internal instruction loop corresponding to the first instruction loop is not stored in a loop buffer;', 'optimizing one or more internal instructions of the first instruction loop; and', 'storing a resulting first optimized internal instruction loop in the loop buffer based on the determining that the first internal instruction loop is not stored in the loop buffer., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein optimizing the one or more instructions includes merging at least two machine instructions of the one or more instructions to generate an optimized internal instruction claim 1 , andthe ...

Publication date: 03-04-2014

Prefix Computer Instruction for Compatibly Extending Instruction Functionality

Number: US20140095833A1
Assignee: International Business Machines Corp

A prefix instruction is executed and passes operands to a next instruction without storing the operands in an architected resource such that the execution of the next instruction uses the operands provided by the prefix instruction to perform an operation, the operands may be prefix instruction immediate field or a target register of the prefix instruction execution.

Publication date: 10-04-2014

CODE COVERAGE FRAMEWORK

Number: US20140101417A1

An information processing system records an execution of a program instruction. A determination is made that a thread has entered a program unit. Another determination is made that that the thread is associated with at least one attribute that matches a set of thread recording criteria. An instruction recording mechanism for the thread is dynamically activated in response to the at least one attribute of the thread matching the set of thread recording criteria. 1. An information processing system for recording an execution of a program instruction , the information processing system comprising:a computer memory; and determining that a thread has entered a program unit;', 'determining that the thread is associated with at least one attribute that matches a set of thread recording criteria; and', 'at least one of dynamically activating and deactivating, in response to the at least one attribute of the thread matching the set of thread recording criteria, an instruction recording mechanism for the thread., 'a processor communicatively coupled to the computer memory, the processor configured to perform a method comprising2. The information processing system of claim 1 , the method further comprising:recording, by the instruction recording mechanism in response to being activated, each instruction that has executed within the program unit.3. The information processing system of claim 2 , wherein the recording comprises:maintaining a table of flags representing instructions of the program unit, where each flag in the table represents an address in memory;determining that an instruction has been executed within the program unit;identifying a flag within the table of flags that represents the instruction that has been executed; andchanging a value of the flag that has been identified to indicate that the instruction has been executed.4. The information processing system of claim 2 , wherein the recording comprises:maintaining a table of flags representing instructions of ...
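
A Python sketch of the flag-table recording scheme: one flag per instruction address in the program unit, set when that instruction executes, and only for threads whose attributes match the recording criteria. The attribute names used for matching are illustrative assumptions.

```python
class CoverageRecorder:
    def __init__(self, unit_addresses, criteria):
        self.flags = {addr: False for addr in unit_addresses}   # one flag per instruction
        self.criteria = criteria                                # e.g. {"pool": "worker"}

    def thread_matches(self, thread_attrs):
        return all(thread_attrs.get(k) == v for k, v in self.criteria.items())

    def on_instruction(self, thread_attrs, address):
        # Recording is active only for threads whose attributes match the criteria.
        if self.thread_matches(thread_attrs) and address in self.flags:
            self.flags[address] = True

    def coverage(self):
        return sum(self.flags.values()) / len(self.flags)

rec = CoverageRecorder(unit_addresses=range(0x100, 0x110), criteria={"pool": "worker"})
rec.on_instruction({"pool": "worker"}, 0x104)
print(rec.coverage())     # fraction of the unit's instructions executed so far
```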
