Total found: 1344. Displayed: 171.
Publication date: 06-05-2020

Optimization of instruction groups across group boundaries

Number: GB0002530454B
Assignee: GLOBAL FOUNDRIES INC, Global Foundries Inc.

Publication date: 20-07-2016

Supporting multiple types of guests by a hypervisor

Number: GB0002520909B

Publication date: 09-08-2007

System and Method for Executing Instructions Utilizing a Preferred Slot Alignment Mechanism

Number: US20070186077A1

A system and method for executing instructions utilizing a preferred slot alignment mechanism is presented. A processor architecture uses a vector register file, a shared data path, and instruction execution logic to process both single instruction multiple data (SIMD) instructions and scalar instructions. The processor architecture divides a vector into four “slots,” each including four bytes, and locates scalar data in “preferred slots” to ensure proper positioning. Instructions using the preferred slot mechanism include 1) shift and rotate instructions operating across an entire quad-word that specify a shift amount, 2) memory load and store instructions that require an address, and 3) branch instructions that use the preferred slot for branch conditions (conditional branches) and branch addresses (register-indirect branches). As a result, the processor architecture eliminates the requirement for separate issue slots, separate pipelines, and the control complexity for separate scalar ...
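
A minimal sketch of the preferred-slot idea described above: a 16-byte vector register is treated as four 4-byte slots, and scalar operands such as shift amounts or addresses are read from one designated slot. The choice of slot 0 and the little-endian packing are illustrative assumptions, not details taken from the patent.

```python
SLOT_BYTES = 4
NUM_SLOTS = 4

def put_scalar(register: bytearray, value: int, slot: int = 0) -> None:
    # Place a 32-bit scalar into the chosen 4-byte slot of a 16-byte register.
    register[slot * SLOT_BYTES:(slot + 1) * SLOT_BYTES] = value.to_bytes(4, "little")

def read_preferred_slot(register: bytes, slot: int = 0) -> int:
    # Execution logic reads shift amounts, addresses, or branch conditions
    # only from the preferred slot, so no separate scalar datapath is needed.
    return int.from_bytes(register[slot * SLOT_BYTES:(slot + 1) * SLOT_BYTES], "little")

reg = bytearray(SLOT_BYTES * NUM_SLOTS)
put_scalar(reg, 13)                    # e.g. a shift amount for a quad-word shift
assert read_preferred_slot(reg) == 13
```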

Publication date: 02-12-2014

Method and apparatus for the dynamic identification and merging of instructions for execution on a wide datapath

Number: US0008904151B2

A processing system and method includes a predecoder configured to identify instructions that are combinable. Instruction storage is configured to merge instructions that are combinable by replacing the combinable instructions with a wide data internal instruction for execution. An instruction execution unit is configured to execute the internal instruction on a wide datapath.
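
As an illustration of the predecode-and-merge step, the sketch below scans a hypothetical instruction list and fuses two adjacent single-word loads into one double-width internal operation when their destination registers and addresses are consecutive. The instruction forms ("load32", "load64.internal") are invented for the example and are not the patent's encoding.

```python
# Hypothetical instruction stream entries: ("load32", dest_register, address)
def predecode_merge(instrs):
    merged = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        combinable = (
            nxt is not None
            and cur[0] == nxt[0] == "load32"
            and nxt[1] == cur[1] + 1          # consecutive destination registers
            and nxt[2] == cur[2] + 4          # consecutive 4-byte addresses
        )
        if combinable:
            # Replace the pair with a single wide internal instruction.
            merged.append(("load64.internal", cur[1], cur[2]))
            i += 2
        else:
            merged.append(cur)
            i += 1
    return merged

print(predecode_merge([("load32", 4, 0x100), ("load32", 5, 0x104), ("add", 6, 7)]))
# -> [('load64.internal', 4, 256), ('add', 6, 7)]
```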

Publication date: 17-08-2004

Symmetric multi-processing system with attached processing units being able to access a shared memory without being structurally configured with an address translation mechanism

Number: US0006779049B2

A method and system for attached processing units accessing a shared memory in an SMP system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.
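
A small simulation of the arrangement described above, under an assumed page size and a toy page table: the attached unit issues a request with a virtual address range, and the DMA controller, the only component holding a translation mechanism, converts it to physical addresses before shared memory is touched.

```python
PAGE_SIZE = 4096

class DMAController:
    # The address translation mechanism lives only here, not in the attached units.
    def __init__(self, page_table):
        self.page_table = page_table          # virtual page number -> physical page number

    def translate_range(self, vaddr, length):
        phys = []
        for offset in range(0, length, PAGE_SIZE):
            v = vaddr + offset
            ppn = self.page_table[v // PAGE_SIZE]   # raises KeyError for an unmapped page
            phys.append(ppn * PAGE_SIZE + v % PAGE_SIZE)
        return phys

class AttachedProcessingUnit:
    def __init__(self, dmac):
        self.dmac = dmac

    def access_shared_memory(self, vaddr, length):
        # The attached unit only specifies virtual addresses; it never translates them itself.
        return self.dmac.translate_range(vaddr, length)

dmac = DMAController({0: 7, 1: 3})                       # toy two-page mapping
apu = AttachedProcessingUnit(dmac)
print(apu.access_shared_memory(0x0000, 2 * PAGE_SIZE))   # physical start of each page touched
```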

Publication date: 21-07-2005

SIMD-RISC microprocessor architecture

Number: US20050160097A1

A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A synchronized system and method for the coordinated reading and writing of data to and from the shared main memory by the processing units also are provided. A hardware sandbox structure is provided for security against the corruption of data among the programs being processed by the processing units. The uniform software cells contain both data and applications and are structured for processing by any of the processors of the network. Each software cell is uniquely identified on the network.

Publication date: 29-04-2004

Method and apparatus for setting breakpoints when debugging integrated executables in a heterogeneous architecture

Number: US20040083458A1

The present invention provides inserting and deleting a breakpoint in a parallel processing system. A breakpoint is inserted in a module loaded into the execution environment of an attached processor unit. The breakpoint can be inserted directly. Furthermore, the unloaded image of the module can also have a breakpoint associated with it. The breakpoint can be inserted directly into the module image, or a breakpoint request can be generated, and the breakpoint is inserted when the module is loaded into the execution environment of the attached processor unit.

Publication date: 29-04-2004

Method and apparatus for enabling access to global data by a plurality of codes in an integrated executable for a heterogeneous architecture

Number: US20040083342A1

In the present invention, global information is passed from a first execution environment to a second execution environment, wherein both the first and second processor units comprise separate memories. The global variable is transferred through the invocation of a memory flow controller by a stub function. The global descriptor has a plurality of field indicia that allow a binder to link separate object files bound to the first and second execution environments.

Publication date: 04-03-2004

Method and apparatus for transferring control in a computer system with dynamic compilation capability

Number: US20040044880A1

In a dynamically compiling computer system, a system and method for efficiently transferring control from execution of an instruction in a first representation to a second representation of the instruction is disclosed. The system and method include the setting of a tag for entry points of each instruction in a first representation that has been translated to a second representation. The tag is stored in memory in association with each such instruction. When a given instruction in a first representation is to be executed, the tag is examined, and if it indicates that a translated version of the instruction has previously been generated, control is passed to execution of the instruction in the second representation. The second representation can be a different instruction set representation, or an optimized representation in the same instruction set as the original instruction.
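
A sketch of the control-transfer check, with an interpreter and a translation cache invented for the example: before executing an instruction at a given entry point in its first representation, the tag for that address is consulted, and if a translated (second-representation) version exists, control passes to it instead.

```python
translated = {}          # entry-point address -> callable second representation
tag_set = set()          # addresses whose translation tag is set

def interpret(addr):
    return f"interpreted instruction at {addr:#x}"

def install_translation(addr, fn):
    translated[addr] = fn
    tag_set.add(addr)    # the tag stored in association with the instruction

def execute(addr):
    # Examine the tag first; transfer control to the translated version if present.
    if addr in tag_set:
        return translated[addr]()
    return interpret(addr)

install_translation(0x400, lambda: "optimized native code for 0x400")
print(execute(0x400))    # runs the second representation
print(execute(0x404))    # falls back to the first representation
```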

Publication date: 13-07-2006

Method and apparatus for control signals memoization in a multiple instruction issue microprocessor

Number: US20060155965A1

A dynamic predictive and/or exact caching mechanism is provided in various stages of a microprocessor pipeline so that various control signals can be stored and memorized in the course of program execution. Exact control signal vector caching may be done. Whenever an issue group is formed following instruction decode, register renaming, and dependency checking, an encoded copy of the issue group information can be cached under the tag of the leading instruction. The resulting dependency cache or control vector cache can be accessed right at the beginning of the instruction issue logic stage of the microprocessor pipeline the next time the corresponding group of instructions come up for re-execution. Since the encoded issue group bit pattern may be accessed in a single cycle out of the cache, the resulting microprocessor pipeline with this embodiment can be seen as two parallel pipes, where the shorter pipe is followed if there is a dependency cache or control vector cache hit.

Publication date: 27-06-2017

Computer instructions for limiting access violation reporting when accessing strings and similar data structures

Number: US0009690509B2

Embodiments are directed to a computer implemented method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes initiating, by a processor, an access of the data frame. The method further includes accessing, by the processor, the first portion of the data frame. The method further includes, based at least in part on a determination that the processor does not have access to the second memory block, accessing at least one default character as a replacement for accessing the second portion of the data frame.
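
A behavioural sketch of the access rule described above, using an invented block-permission table: the accessible first portion of the frame is returned as stored, and the portion in the inaccessible second block is replaced by a default character instead of reporting an access violation. The block size and the NUL default are assumptions for the illustration.

```python
BLOCK_SIZE = 16
DEFAULT_CHAR = b"\x00"           # assumed default character (e.g. a string terminator)

memory = bytearray(b"HELLO, WORLD!!!!" + b"SECRET-NEXT-PAGE")
readable_blocks = {0}            # the processor may read block 0 but not block 1

def read_frame(addr, length):
    out = bytearray()
    for i in range(length):
        block = (addr + i) // BLOCK_SIZE
        if block in readable_blocks:
            out += memory[addr + i:addr + i + 1]   # first portion: real data
        else:
            out += DEFAULT_CHAR                    # second portion: default character
    return bytes(out)

# A frame starting near the end of block 0 and crossing into the protected block 1.
print(read_frame(12, 8))   # b'!!!!\x00\x00\x00\x00' -- no access violation is reported
```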

Publication date: 25-05-2010

Method and apparatus for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit

Number: US0007725682B2

Methods and apparatus are provided for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit. A method for executing instructions in a processor having a polymorphic execution unit includes the steps of reloading a state associated with a first instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the first instruction class, when an instruction of the first instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with a second instruction class. The method also includes the steps of reloading a state associated with a second instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the second instruction class, when an instruction of the second instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with the first instruction class.

Publication date: 07-06-2007

Transient cache storage

Number: US20070130237A1

A method and apparatus for storing non-critical processor information without imposing significant costs on a processor design is disclosed. Transient data are stored in the processor-local cache hierarchy. An additional control bit forms part of cache addresses, where addresses having the control bit set are designated as “transient storage addresses.” Transient storage addresses are not written back to external main memory and, when evicted from the last level of cache, are discarded. Preferably, transient storage addresses are “privileged” in that they are either not accessible to software or only accessible to supervisory or administrator-level software having appropriate permissions. A number of management functions/instructions are provided to allow administrator/supervisor software to manage and/or modify the behavior of transient cache storage. This transient storage scheme allows the cache hierarchy to store data items that may be used by the processor core but that may be too ...
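
A toy model of the transient-storage rule: cache addresses carry an extra control bit, and on eviction a line whose transient bit is set is simply dropped rather than written back to main memory. The single-level, insertion-ordered cache and the bit position are assumptions made for the illustration.

```python
TRANSIENT_BIT = 1 << 63          # assumed position of the extra address control bit

class LastLevelCache:
    def __init__(self, capacity, main_memory):
        self.capacity = capacity
        self.main_memory = main_memory
        self.lines = {}                       # tagged address -> data (insertion ordered)

    def store(self, addr, data, transient=False):
        tagged = addr | TRANSIENT_BIT if transient else addr
        self.lines[tagged] = data
        if len(self.lines) > self.capacity:
            victim, vdata = next(iter(self.lines.items()))
            del self.lines[victim]
            if victim & TRANSIENT_BIT:
                pass                          # transient line: discarded, never written back
            else:
                self.main_memory[victim] = vdata

main_memory = {}
cache = LastLevelCache(capacity=1, main_memory=main_memory)
cache.store(0x10, "normal data")
cache.store(0x20, "scratch data", transient=True)    # evicts 0x10 -> written back
cache.store(0x30, "more data")                       # evicts transient 0x20 -> dropped
print(main_memory)                                   # {16: 'normal data'}
```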

Publication date: 12-07-2007

Mechanism and method for two level adaptive trace prediction

Number: US20070162895A1

A trace cache system is provided comprising a trace start address cache for storing trace start addresses with successor trace start addresses, a trace cache for storing traces of instructions executed, a trace history table (THT) for storing trace numbers in rows, a branch history shift register (BHSR) or a trace history shift register (THSR) that stores histories of branches or traces executed, respectively, a THT row selector for selecting a trace number row from the THT, the selection derived from a combination of a trace start address and history information from the BHSR or the THSR, and a trace number selector for selecting a trace number from the selected trace number row and for outputting the selected trace number as a predicted trace number.

Publication date: 04-01-2011

Method and apparatus to extend the number of instruction bits in processors with fixed length instructions, in a manner compatible with existing code

Number: US0007865699B2

This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.

Publication date: 24-06-2004

Method and apparatus for reducing power dissipation in latches during scan operation

Number: US20040123198A1
Author: Michael Gschwind

A method and apparatus for reducing power dissipation during a scan operation during testing of digital logic circuits which provides for scanning data while switching a limited number of nodes during scan-in and scan-out of input and result chains, and which isolates the logic circuit from random stimulation by scan chains as they are scanned. A scan chain includes a plurality of level sensitive scan design LSSD scan latches, each comprising a master latch M and a slave latch S. The master latch has a first input port D used for operation in a functional mode, and a second input port S used for operation in a scan mode, a scan enable input port, and a clock input port. The master latch M produces output scan data Sout which is directed to a slave latch S which produces a data output Q for the logic circuit under test.

Publication date: 03-06-2004

Symmetric multi-processing system

Number: US20040107321A1

A method and system for attached processing units accessing a shared memory in an SMP system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.

Publication date: 11-12-2008

METHODS AND APPARATUS FOR IMPLEMENTING POLYMORPHIC BRANCH PREDICTORS

Number: US20080307209A1
Author: Michael Gschwind

A polymorphic branch predictor and method includes a plurality of branch prediction methods. The methods are selectively enabled to perform branch prediction. A selection mechanism is configured to select one or more of the branch prediction methods in accordance with a dynamic setting to optimize performance of the branch predictor during operation in accordance with a current task.
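
A compact sketch of selecting among several prediction methods under a dynamic setting. The two concrete predictors here (an always-taken predictor and a simple two-bit counter table) and the mode names are illustrative stand-ins, not the methods enumerated by the patent.

```python
class TwoBitCounters:
    def __init__(self, entries=1024):
        self.table = [1] * entries                     # weakly not-taken
    def predict(self, pc):
        return self.table[pc % len(self.table)] >= 2
    def update(self, pc, taken):
        i = pc % len(self.table)
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

class AlwaysTaken:
    def predict(self, pc):
        return True
    def update(self, pc, taken):
        pass

class PolymorphicPredictor:
    # The dynamic setting selects which prediction method is enabled for the current task.
    def __init__(self):
        self.methods = {"static": AlwaysTaken(), "bimodal": TwoBitCounters()}
        self.mode = "bimodal"
    def set_mode(self, mode):
        self.mode = mode
    def predict(self, pc):
        return self.methods[self.mode].predict(pc)
    def update(self, pc, taken):
        self.methods[self.mode].update(pc, taken)

bp = PolymorphicPredictor()
bp.update(0x40, taken=True); bp.update(0x40, taken=True)
print(bp.predict(0x40))        # True after training the bimodal method
bp.set_mode("static")
print(bp.predict(0x123))       # True, regardless of history
```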

Publication date: 20-11-2003

Method and apparatus for software-assisted thermal management for electronic systems

Number: US20030217297A1

In a computer system, a device for measuring power dissipation (e.g., using on-die thermal sensors) is linked to both a hardware-based thermal management solution and with a means for causing a notification event to software, so that, initially, the operating system software and/or the application software modifies its behavior in response to the notification event to reduce overall system power dissipation and the hardware-based thermal management solution is only triggered if the software solution is not effective; with both operating system and application software resuming higher-performance algorithms when power dissipation is no longer critical.

Publication date: 29-04-2004

Method and apparatus for creating and executing integrated executables in a heterogeneous architecture

Number: US20040083462A1

The present invention provides a compilation system for compiling and linking an integrated executable adapted to execute on a heterogeneous parallel processor architecture. The compiler and linker compile different segments of the source code for a first and second processor architecture, and generate appropriate stub functions directed at loading code and data to remote nodes so as to cause them to perform operations described by the transmitted code on the data. The compiler and linker generate stub objects to represent remote execution capability, and stub objects encapsulate the transfers necessary to execute code in such environment.

Publication date: 30-06-2016

COMPUTER INSTRUCTIONS FOR LIMITING ACCESS VIOLATION REPORTING WHEN ACCESSING STRINGS AND SIMILAR DATA STRUCTURES

Number: US20160188496A1

Embodiments are directed to a computer implemented method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes initiating, by a processor, an access of the data frame. The method further includes accessing, by the processor, the first portion of the data frame. The method further includes, based at least in part on a determination that the processor does not have access to the second memory block, accessing at least one default character as a replacement for accessing the second portion of the data frame.

Publication date: 20-03-2003

Method and apparatus for aligning memory write data in a microprocessor

Number: US20030056064A1

There is provided a method for aligning and inserting data elements into a memory based upon an instruction sequence consisting of one or more alignment instructions and a single store instruction. Given a data item that includes a data element to be stored, the method includes the step of aligning the data element in another memory with respect to a predetermined position in the memory, in response to the one or more alignment instructions. A mask is dynamically generated to enable writing of memory bit lines that correspond to the aligned data element. The memory bit lines are written to the memory under a control of the mask. The generating and writing steps are performed in response to the single store instruction.
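
A sketch of the align-and-mask store described above, with an assumed 16-byte memory line: the data element is shifted to its target byte position, a write mask is generated dynamically for just those byte lanes, and only masked bytes of the line are updated.

```python
LINE_BYTES = 16

def aligned_store(line: bytearray, element: bytes, byte_offset: int) -> None:
    # Align the element to its target position within the line.
    aligned = bytearray(LINE_BYTES)
    aligned[byte_offset:byte_offset + len(element)] = element
    # Dynamically generate a per-byte write mask for exactly those lanes.
    mask = [byte_offset <= i < byte_offset + len(element) for i in range(LINE_BYTES)]
    # Write the line under control of the mask; unmasked bytes are untouched.
    for i in range(LINE_BYTES):
        if mask[i]:
            line[i] = aligned[i]

line = bytearray(b"................")
aligned_store(line, b"DATA", byte_offset=6)
print(line)     # bytearray(b'......DATA......')
```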

Publication date: 29-12-2020

Triggered sensor data capture in a mobile device environment

Number: US0010878947B2

Triggered sensor data capture in a mobile device environment. A method monitors primary sensor data obtained from first wearable sensor device(s) to determine whether trigger condition(s) are met for triggering supplemental sensor data capture. Based on recognizing a health event, the method obtains health status input from a user, configures second wearable sensor device(s) to obtain supplemental sensor data that includes additional data in addition to the primary sensor data, and obtains the supplemental sensor data. The method provides the health status input and the obtained supplemental sensor data as correlated health event data of the health event for analysis. Based on the analysis, the method tunes at least one trigger condition of the trigger condition(s) to adjust a scope of supplemental sensor data capture.

Publication date: 13-06-2017

Processing page fault exceptions in supervisory software when accessing strings and similar data structures using normal load instructions

Number: US0009678886B2

Embodiments are directed to a method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes determining that an access of the data frame crosses a boundary between the first and second memory blocks, determining that an attempted translation of an address of the first portion of the data frame in the first memory block did not result in a translation fault, and accessing the first portion of the data frame. The method further includes, based at least in part on a determination that an attempted translation of an address of the second portion of the data frame in the second memory block resulted in a translation fault, accessing at least one default character as a replacement for accessing the second portion of the data frame.

Publication date: 19-03-2015

METHOD AND APPARATUS FOR THE DYNAMIC CREATION OF INSTRUCTIONS UTILIZING A WIDE DATAPATH

Number: US20150082009A1

A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a hardware wide datapath.

Publication date: 12-07-2007

Method and apparatus for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit

Number: US20070162726A1

Methods and apparatus are provided for sharing storage and execution resources between architectural units in a microprocessor using a polymorphic function unit. A method for executing instructions in a processor having a polymorphic execution unit includes the steps of reloading a state associated with a first instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the first instruction class, when an instruction of the first instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with a second instruction class. The method also includes the steps of reloading a state associated with a second instruction class and reconfiguring the polymorphic execution unit to operate in accordance with the second instruction class, when an instruction of the second instruction class is encountered and the polymorphic execution unit is configured to operate in accordance with the first instruction class.

Publication date: 31-05-2007

Compilation for a SIMD RISC processor

Number: US20070124722A1
Author: Michael Gschwind

A computer implemented method, data processing system, and computer usable code are provided for generating code to perform scalar computations on a Single-Instruction Multiple-Data (SIMD) Reduced Instruction Set Computer (RISC) architecture. The illustrative embodiments generate code directed at loading at least one scalar value and generate code using at least one vector operation to generate a scalar result, wherein all scalar computation for integer and floating point data is performed in a SIMD vector execution unit.

Publication date: 19-03-2009

Method and Apparatus for Detection of Data Errors in Tag Arrays

Number: US20090077425A1

A method for detecting errors in a tag array includes accessing the tag array with an index, retrieving at least one tag from the tag array, and computing a parity bit based on the expected tag.
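
A minimal illustration of the parity check: a parity bit is computed from the expected tag and compared against the bit stored with the tag array entry, and a mismatch flags a data error. Even parity and the field widths are assumptions for the example.

```python
def parity(value: int) -> int:
    # Even parity over all tag bits.
    return bin(value).count("1") & 1

class TagArray:
    def __init__(self, size):
        self.entries = [(0, 0)] * size            # (tag, parity bit stored with the tag)
    def write(self, index, tag):
        self.entries[index] = (tag, parity(tag))
    def check(self, index, expected_tag):
        # Compute a parity bit from the expected tag and compare it with the stored bit.
        _stored_tag, stored_parity = self.entries[index]
        return parity(expected_tag) != stored_parity   # True means a data error was detected

tags = TagArray(256)
tags.write(5, 0x3A7)
print(tags.check(5, expected_tag=0x3A7))         # False: parity agrees, no error
tags.entries[5] = (0x3A7, parity(0x3A7) ^ 1)     # simulate a flipped parity bit in the array
print(tags.check(5, expected_tag=0x3A7))         # True: error detected
```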

Publication date: 29-09-2005

Method and apparatus for directory-based coherence with distributed directory management utilizing prefetch caches

Number: US20050216672A1

The present invention provides for a type of parallel processing architecture in which a plurality of processors has access to a shared memory hierarchy level. A memory hierarchy level has a coherence directory and associated directory data with a plurality of cachelines each associated with different data. Prefetch caches are interconnected to processor memory and a plurality of processor elements, each element interconnected to different buffers. Cache lines are requested from memory, and the requests, responses, and detections therein are available for particular access modes, therein providing additional coherence of the data. Processing of said directory data is performed by processing elements. In one embodiment, the system comprises an integrated prefetch cache.

Publication date: 11-07-2017

Processing page fault exceptions in supervisory software when accessing strings and similar data structures using normal load instructions

Number: US0009703721B2

Embodiments are directed to a method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes determining that an access of the data frame crosses a boundary between the first and second memory blocks, determining that an attempted translation of an address of the first portion of the data frame in the first memory block did not result in a translation fault, and accessing the first portion of the data frame. The method further includes, based at least in part on a determination that an attempted translation of an address of the second portion of the data frame in the second memory block resulted in a translation fault, accessing at least one default character as a replacement for accessing the second portion of the data frame.

Publication date: 19-10-2006

Method and apparatus for predictive scheduling of memory accesses based on reference locality

Number: US20060236036A1

There are provided methods and apparatus for accessing a memory array. A method for accessing a memory array includes the step of predicting whether at least two memory references can be satisfied by a single array access based on one of an instruction address, local instruction history and global instruction history.

Publication date: 11-01-2007

Method and system for data-driven runtime alignment operation

Number: US20070011441A1

A method for processing instructions and data in a processor includes steps of: preparing an input stream of data for processing in a data path in response to a first set of instructions specifying a dynamic parameter; and processing the input stream of data in the same data path in response to a second set of instructions. A common portion of a dataflow is used for preparing the input stream of data for processing in response to a first set of instructions under the control of a dynamic parameter specified by an instruction of the first set of instructions, and for operand data routing based on the instruction specification of a second set of instructions during the processing of the input stream in response to the second set of instructions.

Publication date: 01-11-2007

SYSTEM AND METHOD FOR GARBAGE COLLECTION IN HETEROGENEOUS MULTIPROCESSOR SYSTEMS

Number: US20070255909A1

A system and method for garbage collection in heterogeneous multiprocessor systems are provided. In some illustrative embodiments, garbage collection operations are distributed across a plurality of the processors in the heterogeneous multiprocessor system. Portions of a global mark queue are assigned to processors of the heterogeneous multiprocessor system along with corresponding chunks of a shared memory. The processors perform garbage collection on their assigned portions of the global mark queue and corresponding chunk of shared memory marking memory object references as reachable or adding memory object references to a non-local mark stack. The marked memory objects are merged with a global mark stack and memory object references in the non-local mark stack are merged with a “to be traced” portion of the global mark queue for re-checking using a garbage collection operation.

Publication date: 24-08-2006

Non-homogeneous multi-processor system with shared memory

Number: US20060190614A1

A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A synchronized system and method for the coordinated reading and writing of data to and from the shared main memory by the processing units also are provided. A hardware sandbox structure is provided for security against the corruption of data among the programs being processed by the processing units. The uniform software cells contain both data and applications and are structured for processing by any of the processors of the network. Each software cell is uniquely identified on the network.

Publication date: 03-01-2008

Polymorphic Branch Predictor And Method With Selectable Mode Of Prediction

Number: US20080005542A1
Author: Michael Gschwind

A polymorphic branch predictor and method includes a plurality of branch prediction methods. The methods are selectively enabled to perform branch prediction. A selection mechanism is configured to select one or more of the branch prediction methods in accordance with a dynamic setting to optimize performance of the branch predictor during operation in accordance with a current task.

Publication date: 01-03-2007

Method and apparatus for accessing misaligned data streams

Number: US20070050592A1

One embodiment of the present method and apparatus for accessing misaligned data streams includes receiving a data request, where the data request includes a request for misaligned data, and retrieving at least a portion of the requested data from a data stream buffer associated with the data stream. If the data retrieved from the data stream buffer does not comprise all of the requested data, the remainder of the requested data is retrieved from memory and combined with the data stream buffer data. In this manner, the number of memory accesses necessary to retrieve the requested misaligned data is reduced. Additional embodiments of the present invention include mechanisms for ensuring data coherence with respect to write updates and protocol requests. Moreover, the present invention advantageously reduces the need for pipeline upset events/pipeline hazards that typically result in performance degradation in pipelined microprocessors.

Publication date: 28-02-2008

SYSTEM AND METHOD OF EXECUTION OF REGISTER POINTER INSTRUCTIONS AHEAD OF INSTRUCTION ISSUES

Number: US20080052495A1

A pipeline system and method includes a plurality of operational stages. The stages include a pointer register stage which stores pointer information and updates, and a rename and dependence checking stage located downstream of the pointer register stage, which renames registers and determines if dependencies exist. A functional unit provides pointer information updates to the pointer register stage such that pointer information is processed and updated to the pointer register stage before or in parallel with the register dependency checking.

Publication date: 12-06-2018

Tracking a user based on an electronic noise profile

Number: US0009997050B2

An electronic device includes a device code, a processor, a wireless protocol transceiver, a motion detector, an alarm and a state machine. The electronic device has a device code associated with an owner. The electronic device's wireless protocol transceiver establishes a communication link with another wireless protocol transceiver associated with the owner. The motion detector detects movement of the electronic device. The state machine, operated by the processor, may stay at a first state or advance to a second state based on signals received from the wireless protocol transceiver and the motion detector. The second state signifies a reminder condition and upon arriving at the second state, the alarm is activated.

Publication date: 15-02-2007

Methods for generating code for an architecture encoding an extended register specification

Number: US20070038984A1

There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.
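
An illustration of a non-contiguous register specifier in a fixed-width instruction word: a 6-bit register number is split across two separate bit fields of a 32-bit word and reassembled at decode. The particular field positions (bits 21-25 for the low part, bit 10 for the high bit) are invented for this sketch, not the encoding the patent defines.

```python
# Assumed split: low 5 bits of the register number in bits 21-25,
# the 6th (high) bit in bit 10 of the 32-bit instruction word.
LOW_SHIFT, LOW_MASK = 21, 0x1F
HIGH_SHIFT = 10

def encode_register(instruction_word: int, regnum: int) -> int:
    low, high = regnum & LOW_MASK, (regnum >> 5) & 0x1
    return instruction_word | (low << LOW_SHIFT) | (high << HIGH_SHIFT)

def decode_register(instruction_word: int) -> int:
    low = (instruction_word >> LOW_SHIFT) & LOW_MASK
    high = (instruction_word >> HIGH_SHIFT) & 0x1
    return (high << 5) | low

word = encode_register(0x0, 45)        # register 45 needs 6 bits: 0b101101
assert decode_register(word) == 45     # decode reassembles the non-contiguous fields
print(hex(word))
```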

Publication date: 02-02-2016

Method and apparatus for spatial register partitioning with a multi-bit cell register file

Number: US0009250899B2

There is provided a multi-bit storage cell for a register file. The storage cell includes a first set of storage elements for a vector slice. Each storage element respectively corresponds to a particular one of a plurality of thread sets for the vector slice. The storage cell includes a second set of storage elements for a scalar slice. Each storage element in the second set respectively corresponds to a particular one of at least one thread set for the scalar slice. The storage cell includes at least one selection circuit for selecting, for an instruction issued by a thread, a particular one of the storage elements from any of the first set and the second set based upon the instruction being a vector instruction or a scalar instruction and based upon a corresponding set from among the pluralities of thread sets to which the thread belongs.

Publication date: 06-07-2010

Method and apparatus for detection of data errors in tag arrays

Number: US0007752505B2

A method for detecting errors in a tag array includes accessing the tag array with an index, retrieving at least one tag from the tag array, and computing a parity bit based on the expected tag.

Publication date: 19-02-2002

Methods and apparatus for reordering and renaming memory references in a multiprocessor computer system

Number: US0006349361B1

There is provided a method for reordering and renaming memory references in a multiprocessor computer system having at least a first and a second processor. The first processor has a first private cache and a first buffer, and the second processor has a second private cache and a second buffer. The method includes the steps of, for each of a plurality of gated store requests received by the first processor to store a datum, exclusively acquiring a cache line that contains the datum by the first private cache, and storing the datum in the first buffer. Upon the first buffer receiving a load request from the first processor to load a particular datum, the particular datum is provided to the first processor from among the data stored in the first buffer based on an in-order sequence of load and store operations. Upon the first cache receiving a load request from the second cache for a given datum, an error condition is indicated and a current state of at least one of the processors is reset ...

Publication date: 29-04-2004

Method and apparatus for overlay management within an integrated executable for a heterogeneous architecture

Number: US20040083455A1

The present invention provides for creating and employing code and data partitions in a heterogeneous environment. This is achieved by separating source code and data into at least two partitioned sections and at least one unpartitioned section. Generally, a partitioned section is targeted for execution on an independent memory device, such as an attached processor unit. Then, at least two overlay sections are generated from at least one partition section. The plurality of partition sections are pre-bound to each other. A root module is also created, associated with both the pre-bound plurality of partitions and the overlay sections. The root module is employable to exchange the at least two overlay sections between the first and second execution environments. The pre-bound plurality of partition sections are then bound to the at least one unpartitioned section. The binding produces an integrated executable.

Publication date: 20-02-2003

Processor implementation having unified scalar and SIMD datapath

Number: US20030037221A1

An improved processor implementation is described in which scalar and vector processing components are merged to reduce complexity. In particular, the implementation includes a scalar-vector register file for storing scalar and vector instructions, as well as a parallel vector unit comprising functional units that can process vector or scalar instructions as required. A further aspect of the invention provides the ability to disable unused functional units in the parallel vector unit, such as during a scalar operation, to achieve significant power savings.

Publication date: 13-03-2008

Method And Apparatus To Extend The Number Of Instruction Bits In Processors With Fixed Length Instructions, In A Manner Compatible With Existing Code

Number: US20080065861A1

This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.

Publication date: 19-08-2004

Symmetric multi-processing system

Number: US20040160835A1

A method and system for attached processing units accessing a shared memory in an SMT system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.

Publication date: 16-08-2007

Low complexity speculative multithreading system based on unmodified microprocessor core

Number: US20070192545A1

A system, method and computer program product for supporting thread level speculative execution in a computing environment having multiple processing units adapted for concurrent execution of threads in speculative and non-speculative modes. Each processing unit includes a cache memory hierarchy of caches operatively connected therewith. The apparatus includes an additional cache level local to each processing unit for use only in a thread level speculation mode, each additional cache for storing speculative results and status associated with its associated processor when handling speculative threads. The additional local cache levels at each processing unit are interconnected so that speculative values and control data may be forwarded between parallel executing threads. A control implementation is provided that enables speculative coherence between speculative threads executing in the computing environment.

Publication date: 21-04-2009

Polymorphic branch predictor and method with selectable mode of prediction

Number: US0007523298B2

A polymorphic branch predictor and method includes a plurality of branch prediction methods. The methods are selectively enabled to perform branch prediction. A selection mechanism is configured to select one or more of the branch prediction methods in accordance with a dynamic setting to optimize performance of the branch predictor during operation in accordance with a current task.

Publication date: 30-06-2016

COMPUTER INSTRUCTIONS FOR LIMITING ACCESS VIOLATION REPORTING WHEN ACCESSING STRINGS AND SIMILAR DATA STRUCTURES

Number: US20160188242A1

Embodiments are directed to a method of accessing a data frame. The method includes, based at least in part on a determination that the data frame spans first and second memory blocks, and further based at least in part on a determination that the processor has access to the first and second memory blocks, accessing the data frame. The method includes, based at least in part on a determination that the data frame spans the first and second memory blocks, and based at least in part on a determination that the processor has access to the first memory block but does not have access to the second memory block, accessing a first portion of the data frame that is in the first memory block, and accessing at least one default character as a replacement for accessing a second portion of the data frame that is in the second memory block.

Publication date: 26-05-2005

Method and apparatus to extend the number of instruction bits in processors with fixed length instructions, in a manner compatible with existing code

Number: US20050114629A1

This invention pertains to apparatus, method and a computer program stored on a computer readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page is comprised instead of a single section storing only instruction words.

Publication date: 19-08-2014

Method and apparatus for employing multi-bit register file cells and SMT thread groups

Number: US0008812824B2
Author: Michael Gschwind

There are provided methods and apparatus for multi-bit cell and SMT thread groups. An apparatus for a register file includes a plurality of multi-bit storage cells for storing a plurality of bits respectively corresponding to a plurality of threads. The apparatus further includes a plurality of port groups, operatively coupled to the plurality of multi-bit storage cells, responsive to physical register identifiers. The plurality of port groups is responsive to respective ones of a plurality of thread identifiers. Each of the plurality of thread identifiers is for uniquely identifying a particular thread from among a plurality of threads.

Publication date: 06-03-2003

Configurable memory array

Number: US20030046492A1

There is provided a memory system on a chip. The memory system includes a configurable memory having a first mode of operation wherein the configurable memory is configured as a cache and a second mode of operation wherein the configurable memory is configured as a local, non-cache memory. A selection of any of the first mode of operation and the second mode of operation is capable of being overridden by an other selection of an other of the first mode of operation and the second mode of operation. The configurable memory may be configured at manufacture time, at burn-in time, and/or during program execution. Moreover, an access mode of the configurable memory may be determined from an address corresponding to a memory access instruction.

Publication date: 04-04-2013

COMPILING CODE FOR AN ENHANCED APPLICATION BINARY INTERFACE (ABI) WITH DECODE TIME INSTRUCTION OPTIMIZATION

Number: US20130086563A1

A code sequence made up of multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or has fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.

1. A computer program product comprising a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: reading, by a computer, an object file comprising a plurality of code sequences; identifying a code sequence in the object file that specifies an offset from a base address, the offset from the base address corresponding to an offset location in a memory configured for storing one of an address of a variable and data, the identified code sequence comprising a plurality of instructions and configured to perform one of a memory reference function and a memory address computation function; determining that the offset location is within a specified distance of the base address; verifying that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics; replacing the identified code sequence in the object file with the replacement code sequence, the replacement code sequence including a no-operation (NOP) instruction or having fewer ...
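
A simplified model of the rewrite described above, on an invented two-instruction addressing idiom: when the referenced offset lies within a specified distance of the base address, the pair is replaced by one equivalent instruction plus a NOP, so code size and program semantics are preserved. The mnemonics and the 32 KiB reach are assumptions for the sketch, not the ABI's actual sequences.

```python
SPECIFIED_DISTANCE = 1 << 15          # assumed reach of a single-instruction displacement

def rewrite(code, offset):
    # 'code' is a hypothetical sequence: build the high half of an offset, then load.
    out = []
    i = 0
    while i < len(code):
        two = code[i:i + 2]
        if (two == [("addis", "r2", offset >> 16), ("ld", "r2", offset & 0xFFFF)]
                and abs(offset) < SPECIFIED_DISTANCE):
            # Offset is close enough to the base: one load suffices; pad with a NOP
            # so that code size and the following offsets are unchanged.
            out += [("ld", "r2", offset), ("nop",)]
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

seq = [("addis", "r2", 0), ("ld", "r2", 0x120)]
print(rewrite(seq, 0x120))   # [('ld', 'r2', 288), ('nop',)]
```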

Publication date: 30-06-2016

PROCESSING PAGE FAULT EXCEPTIONS IN SUPERVISORY SOFTWARE WHEN ACCESSING STRINGS AND SIMILAR DATA STRUCTURES USING NORMAL LOAD INSTRUCTIONS

Number: US20160188483A1

Embodiments are directed to a method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes determining that an access of the data frame crosses a boundary between the first and second memory blocks, determining that an attempted translation of an address of the first portion of the data frame in the first memory block did not result in a translation fault, and accessing the first portion of the data frame. The method further includes, based at least in part on a determination that an attempted translation of an address of the second portion of the data frame in the second memory block resulted in a translation fault, accessing at least one default character as a replacement for accessing the second portion of the data frame.

Publication date: 13-07-2006

SIMD-RISC processor module

Number: US20060155955A1

A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A synchronized system and method for the coordinated reading and writing of data to and from the shared main memory by the processing units also are provided. A hardware sandbox structure is provided for security against the corruption of data among the programs being processed by the processing units. The uniform software cells contain both data and applications and are structured for processing by any of the processors of the network. Each software cell is uniquely identified on the network.

Publication date: 29-04-2004

Method and apparatus for mapping debugging information when debugging integrated executables in a heterogeneous architecture

Number: US20040083331A1

The present invention provides for the employment of a dynamic debugger for a parallel processing environment. This is achieved by dynamically updating mapping information at run-time in a mapping table, wherein the mapping table is read by the dynamic debugger.

Publication date: 15-02-2007

Implementing instruction set architectures with non-contiguous register file specifiers

Number: US20070038848A1

There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing a fixed-width instruction of a fixed-width instruction set using a non-contiguous register specifier of a non-contiguous register specification. The fixed-width instruction includes the non-contiguous register specifier.

Publication date: 09-10-2018

Method and apparatus for dynamically replacing legacy instructions with a single executable instruction utilizing a wide datapath

Number: US0010095524B2

A processing system and method includes a predecoder configured to identify instructions that are combinable to form a single, executable internal instruction. Instruction storage is configured to merge instructions that are combinable. An instruction execution unit is configured to execute the single, executable internal instruction on a hardware wide datapath.

Publication date: 08-11-2007

Method and apparatus for the dynamic creation of instructions utilizing a wide datapath

Number: US20070260855A1

A processing system and method includes a predecoder configured to identify instructions that are combinable. Instruction storage is configured to merge instructions that are combinable by replacing the combinable instructions with a wide data internal instruction for execution. An instruction execution unit is configured to execute the internal instruction on a wide datapath.

Publication date: 28-08-2003

Method and apparatus for prioritized instruction issue queue

Number: US20030163671A1

An apparatus and method in a high performance processor for issuing instructions, comprising: classification logic for sorting instructions into a number of priority categories, a plurality of instruction queues storing the instructions of differing priorities, and issue logic selecting from which queue to dispatch instructions for execution. This apparatus and method can be implemented in both in-order and out-of-order execution processor architectures. The invention also involves instruction cloning and the use of various predictive techniques.
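
A sketch of the classify-then-issue structure: classification logic sorts incoming instructions into priority categories, each category feeds its own queue, and issue logic drains higher-priority queues first. The two-level priority scheme and the classification rule (branches and loads ahead of everything else) are assumptions for the example.

```python
from collections import deque

HIGH, LOW = 0, 1

def classify(instr):
    # Classification logic: assumed rule -- branches and loads are latency critical.
    return HIGH if instr.startswith(("branch", "load")) else LOW

class PrioritizedIssueQueue:
    def __init__(self):
        self.queues = {HIGH: deque(), LOW: deque()}
    def insert(self, instr):
        self.queues[classify(instr)].append(instr)
    def issue(self):
        # Issue logic selects the highest-priority non-empty queue.
        for priority in (HIGH, LOW):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None

iq = PrioritizedIssueQueue()
for instr in ["add r1,r2,r3", "load r4,0(r5)", "branch label", "mul r6,r7,r8"]:
    iq.insert(instr)
print([iq.issue() for _ in range(4)])
# ['load r4,0(r5)', 'branch label', 'add r1,r2,r3', 'mul r6,r7,r8']
```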

Publication date: 03-08-2006

Method and apparatus for embedding wide instruction words in a fixed-length instruction set architecture

Number: US20060174089A1

A method, system, and computer program product for mixing of conventional and augmented instructions within an instruction stream, wherein control may be directly transferred, without operating system intervention, between one type of instruction to another. Extra instruction word bits are added in a manner that is designed to minimally interfere with the encoding, decoding, and instruction processing environment in a manner compatible with existing conventional fixed instruction width code. A plurality of instruction words are inserted into an instruction word oriented architecture to form an encoding group of instruction words. The instruction words in the encoding group are dispatched and executed either independently or in parallel based on a specific microprocessor implementation. The encoding group does not indicate any form of required parallelism or sequentiality. One or more indicators for the encoding group are created, wherein one indicator is used to indicate presence of the ...

Publication date: 30-06-2016

PROCESSING PAGE FAULT EXCEPTIONS IN SUPERVISORY SOFTWARE WHEN ACCESSING STRINGS AND SIMILAR DATA STRUCTURES USING NORMAL LOAD INSTRUCTIONS

Number: US20160188485A1

Embodiments are directed to a method of accessing a data frame, wherein a first portion of the data frame is in a first memory block, and wherein a second portion of the data frame is in a second memory block. The method includes determining that an access of the data frame crosses a boundary between the first and second memory blocks, determining that an attempted translation of an address of the first portion of the data frame in the first memory block did not result in a translation fault, and accessing the first portion of the data frame. The method further includes, based at least in part on a determination that an attempted translation of an address of the second portion of the data frame in the second memory block resulted in a translation fault, accessing at least one default character as a replacement for accessing the second portion of the data frame.

Publication date: 14-06-2005

Symmetric multi-processing system utilizing a DMAC to allow address translation for attached processors

Number: US0006907477B2

A method and system for attached processing units accessing a shared memory in an SMT system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.

Publication date: 29-09-2005

Method and apparatus for directory-based coherence with distributed directory management

Number: US20050216675A1

The present invention provides for a type of parallel processing architecture in which a plurality of processors has access to a shared memory hierarchy level. A memory hierarchy level has a coherence directory and associated directory data with a plurality of cachelines each associated with different data. Buffers are interconnected to processor memory and a plurality of processor elements, each element interconnected to different buffers. Cache lines are requested from memory, and the requests, responses, and detections therein are available for particular access modes, therein providing additional coherence of the data. Processing of the directory data is performed by processing elements.

More details
10-11-2005 publication date

System and method of execution of register pointer instructions ahead of instruction issue

Number: US20050251654A1
Assignee:

A pipeline system and method include a plurality of operational stages. The stages include a pointer register stage, which stores pointer information and updates, and a rename and dependence checking stage located downstream of the pointer register stage, which renames registers and determines whether dependencies exist. A functional unit provides pointer information updates to the pointer register stage such that pointer information is processed and updated in the pointer register stage before, or in parallel with, the register dependency checking.

More details
04-11-2004 publication date

Processor with low overhead predictive supply voltage gating for leakage power reduction

Number: US20040221185A1

An integrated circuit (IC) including unit power control, a leakage reduction circuit for controllably reducing leakage power with reduced L·dI/dt noise in the IC, and an activity prediction unit invoking active/dormant states in IC units. The prediction unit determines turn-on and turn-off times for each IC unit. The prediction unit controls a supply voltage select circuit that selectively passes a supply voltage to a separate supply line at the predicted turn-on time and selectively blocks the supply voltage at the predicted turn-off time.

More details
20-06-2002 publication date

Symmetric multi-processing system

Number: US20020078308A1

A method and system for attached processing units accessing a shared memory in an SMP system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.

More details
04-03-2008 publication date

Extending the number of instruction bits in processors with fixed length instructions, in a manner compatible with existing code

Number: US0007340588B2

This invention pertains to an apparatus, a method and a computer program stored on a computer-readable medium. The computer program includes instructions for use with an instruction unit having a code page, and has computer program code for partitioning the code page into at least two sections for storing in a first section thereof a plurality of instruction words and, in association with at least one instruction word, for storing in a second section thereof an extension to each instruction word in the first section. The computer program further includes computer program code for setting a state of at least one page table entry bit for indicating, on a code page by code page basis, whether the code page is partitioned into the first and second sections for storing instruction words and their extensions, or whether the code page instead comprises a single section storing only instruction words.
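To make the layout concrete, here is a rough C model of an instruction fetch from such a partitioned code page, assuming a 4 KiB page whose last quarter holds one 8-bit extension per instruction word stored in the first three quarters; the split ratio, the extension width and the pte_extended flag are placeholders for whatever the page table entry bit actually selects.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define PAGE_BYTES 4096u
    #define WORD_BYTES 4u
    /* Assumed layout: three quarters of the page hold instruction words and the
     * last quarter holds one 8-bit extension per word.  The real split and the
     * extension width are whatever the architecture defines.                    */
    #define EXT_SECTION_OFFSET (PAGE_BYTES * 3u / 4u)
    #define WORDS_PER_PAGE     (EXT_SECTION_OFFSET / WORD_BYTES)

    struct fetched { uint32_t word; uint8_t ext; bool has_ext; };

    /* pte_extended stands in for the page table entry bit that says whether this
     * code page is partitioned into word and extension sections at all.         */
    struct fetched fetch_word(const uint8_t *page, uint32_t index, bool pte_extended)
    {
        struct fetched f = { 0, 0, false };
        memcpy(&f.word, page + index * WORD_BYTES, WORD_BYTES);
        if (pte_extended && index < WORDS_PER_PAGE) {
            f.ext = page[EXT_SECTION_OFFSET + index];
            f.has_ext = true;
        }
        return f;
    }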

More details
18-10-2007 publication date

Memory compression method and apparatus for heterogeneous processor architectures in an information handling system

Number: US20070245097A1
Assignee: IBM Corporation

The disclosed heterogeneous processor compresses information to more efficiently store the information in a system memory coupled to the processor. The heterogeneous processor includes a general purpose processor core coupled to one or more processor cores that exhibit an architecture different from the architecture of the general purpose processor core. In one embodiment, the processor dedicates a processor core other than the general purpose processor core to memory compression and decompression tasks. In another embodiment, system memory stores both compressed information and uncompressed information.

More details
12-01-2012 publication date

Matrix Multiplication Operations Using Pair-Wise Load and Splat Operations

Number: US20120011348A1

Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand. 1. A method , in a data processing system comprising a processor , for performing a matrix multiplication operation , comprising:performing, by the processor, a vector load operation to load a first vector operand of the matrix multiplication operation to a first target vector register of the data processing system, the first vector operand comprising one or more values;performing, by the processor, a load pair and splat operation to load a pair of values of a second vector operand and replicate the pair of values within a second target vector register of the data processing system;performing, by the processor, an operation on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation;accumulating, by the processor, the partial product of the matrix multiplication operation with other partial products of the matrix multiplication operation to generate a result of the matrix multiplication operation.2. The method of claim 1 , wherein performing the load pair and splat operation comprises:loading a first scalar value of the second vector operand into a first vector ...
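As a plain-C illustration of the data movement only (not of any particular vector instruction set), the sketch below treats a 4-element array as a vector register: a pair of adjacent scalars of the second operand is loaded once, each scalar is replicated into its own register, and two partial products are accumulated. VLEN and the function names are assumptions made for the example.

    #define VLEN 4                        /* elements per vector register (assumed) */

    /* Replicate one scalar across a whole "register" (splat). */
    static void splat(double v[VLEN], double s)
    {
        for (int i = 0; i < VLEN; i++) v[i] = s;
    }

    /* acc += a * b, element by element: one partial-product accumulation step. */
    static void fmadd(double acc[VLEN], const double a[VLEN], const double b[VLEN])
    {
        for (int i = 0; i < VLEN; i++) acc[i] += a[i] * b[i];
    }

    /* c(4x1) += A(4x2) * b(2x1): the pair b[0], b[1] is loaded once, each value is
     * splat into its own register, and two partial products are accumulated.      */
    void mm_pair_splat(double c[VLEN], const double a_col0[VLEN],
                       const double a_col1[VLEN], const double b_pair[2])
    {
        double s0[VLEN], s1[VLEN];
        splat(s0, b_pair[0]);             /* load pair and splat, first scalar  */
        splat(s1, b_pair[1]);             /* load pair and splat, second scalar */
        fmadd(c, a_col0, s0);
        fmadd(c, a_col1, s1);
    }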

More details
08-03-2012 publication date

Vector Loads with Multiple Vector Elements from a Same Cache Line in a Scattered Load Operation

Number: US20120060015A1

Mechanisms for performing a scattered load operation are provided. With these mechanisms, an extended address is received in a cache memory of a processor. The extended address has a plurality of data element address portions that specify a plurality of data elements to be accessed using the single extended address. Each of the plurality of data element address portions is provided to corresponding data element selector logic units of the cache memory. Each data element selector logic unit in the cache memory selects a corresponding data element from a cache line buffer based on a corresponding data element address portion provided to the data element selector logic unit. Each data element selector logic unit outputs the corresponding data element for use by the processor. 1. A method , in a processor , for performing a scattered load operation , comprising:receiving, in a cache memory of the processor, an extended address having a base address portion and a plurality of data element address portions that specifies a plurality of data elements to be accessed using the single extended address;providing each of the plurality of data element address portions to corresponding data element selector logic units of the cache memory;selecting, by each data element selector logic unit in the cache memory, a corresponding data element from a cache line buffer based on a corresponding data element address portion provided to the data element selector logic unit; andoutputting, by each data element selector logic unit, the corresponding data element.2. The method of claim 1 , wherein the extended address is received by a load/store unit of the processor claim 1 , prior to being received in the cache memory claim 1 , as part of a load instruction for loading a plurality of non-contiguous data elements from a same cache line of the cache memory.3. The method of claim 2 , wherein the load instruction is received in the load/store unit from a load generating unit in response to the ...
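A software model of the access pattern, assuming a 128-byte cache line, four selectors per load and a byte-granular offset encoding; these parameters and the struct layout are placeholders rather than the patented format.

    #include <stdint.h>
    #include <string.h>

    #define LINE_BYTES 128u              /* assumed cache-line size                  */
    #define NSEL       4u                /* data-element selectors per extended load */

    /* "Extended address": one base plus NSEL offsets, all falling in the same cache
     * line.  The packing shown here is illustrative.  Offsets are assumed to leave
     * room for a whole 32-bit element within the line.                             */
    struct ext_addr {
        const uint8_t *base;             /* an address inside the target line        */
        uint8_t        off[NSEL];        /* byte offsets of the wanted elements      */
    };

    /* The line is read once into a buffer; each selector then extracts one element
     * from that buffer, so all NSEL elements cost a single cache-line access.       */
    void scattered_line_load(uint32_t out[NSEL], const struct ext_addr *ea)
    {
        const uint8_t *line =
            (const uint8_t *)((uintptr_t)ea->base & ~(uintptr_t)(LINE_BYTES - 1u));
        uint8_t buffer[LINE_BYTES];
        memcpy(buffer, line, LINE_BYTES);                     /* cache line buffer  */
        for (unsigned i = 0; i < NSEL; i++)
            memcpy(&out[i], buffer + (ea->off[i] % LINE_BYTES), sizeof out[0]);
    }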

More details
08-03-2012 publication date

Vector Loads from Scattered Memory Locations

Number: US20120060016A1

Mechanisms for performing a scattered load operation are provided. With these mechanisms, a gather instruction is receive in a logic unit of a processor, the gather instruction specifying a plurality of addresses in a memory from which data is to be loaded into a target vector register of the processor. A plurality of separate load instructions for loading the data from the plurality of addresses in the memory are automatically generated within the logic unit. The plurality of separate load instructions are sent, from the logic unit, to one or more load/store units of the processor. The data corresponding to the plurality of addresses is gathered in a buffer of the processor. The logic unit then writes data stored in the buffer to the target vector register. 1. A method , in a logic unit of a processor , for performing a load operation into a target vector register , comprising:receiving, in the logic unit of the processor, a gather instruction specifying a plurality of addresses in a memory from which data is to be loaded into the target vector register of the processor;automatically generating, within the logic unit of the processor, a plurality of separate load instructions for loading the data from the plurality of addresses in the memory based on the gather instruction;sending, from the logic unit within the processor, the plurality of separate load instructions to one or more load/store units of the processor;gathering, within the logic unit of the processor, the data corresponding to the plurality of addresses in a buffer of the processor; andwriting, by the logic unit of the processor, data stored in the buffer to the target vector register.2. The method of claim 1 , wherein the gather instruction specifies a base address register in which a base address for the plurality of addresses is stored claim 1 , and an offset address vector register in which a plurality of address offsets corresponding to the plurality of addresses is stored.3. The method of claim 2 ...
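The cracking flow is easy to show in plain C; the register width, the offset form and the function name below are assumptions, and the hardware gather buffer is modeled by a local array.

    #include <stdint.h>
    #include <string.h>

    #define VLEN 4                         /* elements per vector register (assumed) */

    /* Software rendering of the described flow: the gather request is cracked into
     * VLEN ordinary loads, the results are collected in a buffer, and the buffer is
     * written to the target register in a single final step.                        */
    void gather_load(uint32_t target[VLEN], const uint8_t *base,
                     const uint32_t offsets[VLEN])
    {
        uint32_t buffer[VLEN];                     /* gather buffer                  */
        for (int i = 0; i < VLEN; i++)             /* one separate load per element  */
            memcpy(&buffer[i], base + offsets[i], sizeof buffer[0]);
        memcpy(target, buffer, sizeof buffer);     /* single write to the target     */
    }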

More details
07-06-2012 publication date

CONTROL SIGNAL MEMOIZATION IN A MULTIPLE INSTRUCTION ISSUE MICROPROCESSOR

Number: US20120144166A1

A dynamic predictive and/or exact caching mechanism is provided in various stages of a microprocessor pipeline so that various control signals can be stored and memorized in the course of program execution. Exact control signal vector caching may be done. Whenever an issue group is formed following instruction decode, register renaming, and dependency checking, an encoded copy of the issue group information can be cached under the tag of the leading instruction. The resulting dependency cache or control vector cache can be accessed right at the beginning of the instruction issue logic stage of the microprocessor pipeline the next time the corresponding group of instructions come up for re-execution. Since the encoded issue group bit pattern may be accessed in a single cycle out of the cache, the resulting microprocessor pipeline with this embodiment can be seen as two parallel pipes, where the shorter pipe is followed if there is a dependency cache or control vector cache hit. 120-. (canceled)21. A microprocessor for multiple instruction issue in the microprocessor , the microprocessor comprising:an instruction buffer;instruction decode and issue logic;a dependency cache; anda plurality of functional units, identifies an instruction group to be issued to the plurality of functional units in the microprocessor;', 'determines whether a dependency cache entry exists for the instruction group in the dependency cache, wherein the dependency cache entry includes control signals for executing the instruction group in a pipe of the microprocessor;', 'uses the control signals in the dependency cache entry to control execution of the instruction group in the microprocessor in response to a dependency cache entry existing for the instruction group in the dependency cache; and', 'computes control signals for the instruction group to form computed control signals and stores the computed control signals in the dependency cache in association with the instruction group in response ...
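A small direct-mapped cache keyed by the address of the issue group's leading instruction captures the control flow; the entry count, the 32-bit encoding of the control signals and the function names are placeholders chosen for the sketch.

    #include <stdint.h>
    #include <stdbool.h>

    #define DCACHE_ENTRIES 256u           /* size of the control-vector cache (assumed) */

    struct dcache_entry {
        bool     valid;
        uint64_t tag;                     /* address of the group's leading instruction */
        uint32_t control;                 /* encoded issue-group / dependency signals   */
    };

    static struct dcache_entry dcache[DCACHE_ENTRIES];

    /* On re-execution, a hit returns the memoized control signals in one step;
     * otherwise the caller recomputes them (decode, rename, dependence check)
     * and installs them for the next time the same group comes around.           */
    bool dcache_lookup(uint64_t lead_pc, uint32_t *control)
    {
        struct dcache_entry *e = &dcache[(lead_pc / 4) % DCACHE_ENTRIES];
        if (e->valid && e->tag == lead_pc) {
            *control = e->control;
            return true;
        }
        return false;
    }

    void dcache_install(uint64_t lead_pc, uint32_t control)
    {
        struct dcache_entry *e = &dcache[(lead_pc / 4) % DCACHE_ENTRIES];
        e->valid = true;
        e->tag = lead_pc;
        e->control = control;
    }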

More details
16-08-2012 publication date

STATE RECOVERY AND LOCKSTEP EXECUTION RESTART IN A SYSTEM WITH MULTIPROCESSOR PAIRING

Number: US20120210162A1

System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each paired microprocessor or processor cores that provide one highly reliable thread for high-reliability connect with a system components such as a memory “nest” (or memory hierarchy), an optional system controller, and optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus. Each selectively paired processor core is includes a transactional execution facility, wherein the system is configured to enable processor rollback to a previous state and reinitialize lockstep execution in order to recover from an incorrect execution when an incorrect execution has been detected by the selective pairing facility. 1. A multiprocessing computer system comprising:a transactional memory system including a memory storage device;at least two processor cores in communication with said transactional memory system;a pairing sub-system adapted to pair at least two of said at least two processor cores for fault tolerant operations of a current transaction in response to receipt of configuration information signals, said pairing sub-system providing a common signal path for forwarding identical input data signals to each said paired two processor cores for simultaneous pairwise processing thereat, said pairwise processing performing a lock-step execution of said transaction, said transactional memory storage device adapted to store error-free transaction state information used in configuring each paired core said pairing sub-system for said simultaneous pairwise processing;decision logic device, in said pairing sub-system, for receiving transaction output ...

More details
16-08-2012 publication date

MULTIPROCESSOR SWITCH WITH SELECTIVE PAIRING

Number: US20120210172A1

System, method and computer program product for a multiprocessing system to offer selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). Each paired microprocessor or processor cores that provide one highly reliable thread for high-reliability connect with a system components such as a memory “nest” (or memory hierarchy), an optional system controller, and optional interrupt controller, optional I/O or peripheral devices, etc. The memory nest is attached to a selective pairing facility via a switch or a bus. 1. A multiprocessing computer system comprising:a memory system including a memory storage device;at least two processor cores in communication with said memory system;a pairing sub-system adapted to dynamically configure two of said at least two processor cores for independent parallel operation in response to receipt of first configuration information, said pairing sub-system providing at least two separate signal I/O paths between said memory system and each respective one of said at least two processor cores for said independent parallel operation, and,said pairing sub-system adapted to pair at least two of said at least two processor cores for fault tolerant operations in response to receipt of second configuration information, said pairing sub-system providing a common signal path for forwarding identical input data to each said paired two processor cores for simultaneous processing thereat; and,decision logic device, in said pairing sub-system, for receiving an output of each said paired two processor devices and comparing respective output results of each, said decision logic device generating error indication upon detection of non-matching output results.2. The multiprocessing system as in claim 1 , wherein said at least two separate signal I/O paths in ...

More details
18-10-2012 publication date

IMPLEMENTING INSTRUCTION SET ARCHITECTURES WITH NON-CONTIGUOUS REGISTER FILE SPECIFIERS

Number: US20120265967A1

There are provided methods and computer program products for implementing instruction set architectures with non-contiguous register file specifiers. A method for processing instruction code includes processing an instruction of an instruction set using a non-contiguous register specifier of a non-contiguous register specification. The instruction includes the non-contiguous register specifier. 1. A non-transitory computer program product for employing extended registers , wherein instructions include at least one register field having a register extension bit , the register field being non-contiguous with the register extension bit , the non-transitory computer program product comprising a tangible storage medium readable by a processor and storing instructions for execution by the computer for performing a method comprising:decoding an instruction for execution, by the processor, the instruction comprising an opcode, one or more register fields for indexing into a corresponding register, a plurality of register extension bits, and an extended opcode, each register extension bit for effectively concatenating as a high order bit to a register field of a corresponding location in the instruction to form an extended register field; for each of said one or more register fields, effectively concatenating, by the processor, a register extension bit as high order bit to a corresponding, non-contiguous register field to form an effectively contiguous extended register field; and', 'performing a function defined by the opcode fields of the instruction using operands of said one or more extended registers corresponding to said extended register fields., 'executing the instruction comprising2. The non-transitory computer program product according to claim 1 , wherein the plurality of register extension bits are located in a single field of the instruction claim 1 , wherein a first extension bit at a first location in the single field is an register extension bit for any ...
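The concatenation itself can be shown in a few lines of C. The bit positions below are invented for the example and do not reflect the actual instruction format; only the idea of forming an extended register index from a non-contiguous field and its extension bit is illustrated.

    #include <stdint.h>

    /* Invented field positions within a 32-bit instruction word: a 5-bit register
     * field at bits 16..20 and its extension bit at bit 3.                        */
    #define RT_SHIFT   16
    #define RT_MASK    0x1Fu
    #define RT_EXT_BIT 3

    /* The extension bit is effectively concatenated as the high-order bit of the
     * otherwise non-contiguous register field, yielding a 6-bit specifier that
     * can index a 64-entry register file.                                         */
    unsigned extended_register(uint32_t insn)
    {
        unsigned low5 = (insn >> RT_SHIFT) & RT_MASK;
        unsigned ext  = (insn >> RT_EXT_BIT) & 1u;
        return (ext << 5) | low5;
    }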

More details
22-11-2012 publication date

METHODS FOR GENERATING CODE FOR AN ARCHITECTURE ENCODING AN EXTENDED REGISTER SPECIFICATION

Number: US20120297171A1

There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier. 1. A method for generating code for a fixed-width instruction set , comprising:identifying a non-contiguous register specifier; andgenerating a fixed-width instruction word that includes the non-contiguous register specifier.2. The method of claim 1 , wherein the non-contiguous register specifier includes at least two sets of contiguous bits separated by at least one bit not part of the register specifier claim 1 , and the method further comprises encoding a single logical register specifier into at least two non-contiguous fields of the non-contiguous register specifier.3. The method of claim 2 , wherein the single logical register specifier is represented using a generic intrinsic that provides no indication of a partitioning of an operand specification into the non-contiguous register specifier.4. The method of claim 1 , wherein a first set of bits in the non-contiguous register specifier is specified directly by an instruction field in the fixed-width instruction claim 1 , and a second set of bits in the non-contiguous register specifier is specified directly by another instruction field in the fixed-width instruction.5. The method of claim 1 , wherein a first set of bits in the non-contiguous register specifier is specified directly by an instruction field in the fixed-width instruction claim 1 , a second set of bits in the non-contiguous register specifier is specified using a deep encoding claim 1 , and the method further comprises generating a set of n bits for inclusion as part of the non-contiguous register specifier from a set of m bits encoded in the fixed-width instruction ...

More details
22-11-2012 publication date

Methods for generating code for an architecture encoding an extended register specification

Number: US20120297373A1
Assignee: International Business Machines Corp

There are provided methods and computer program products for generating code for an architecture encoding an extended register specification. A method for generating code for a fixed-width instruction set includes identifying a non-contiguous register specifier. The method further includes generating a fixed-width instruction word that includes the non-contiguous register specifier.

More details
21-03-2013 publication date

FINE-GRAINED INSTRUCTION ENABLEMENT AT SUB-FUNCTION GRANULARITY

Number: US20130073836A1

Fine-grained enablement at sub-function granularity. An instruction encapsulates different sub-functions of a function, in which the sub-functions use different sets of registers of a composite register file, and therefore, different sets of functional units. At least one operand of the instruction specifies which set of registers, and therefore, which set of functional units, is to be used in performing the sub-function. The instruction can perform various functions (e.g., move, load, etc.) and a sub-function of the function specifies the type of function (e.g., move-floating point; move-vector; etc.). 1. A computer program product for executing a machine instruction , said computer program product comprising: [ at least one opcode field identifying a particular instruction; and', 'at least one field used to indicate one set of registers of multiple sets of registers; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, obtaining from the at least one field at least one value;', 'determining that the at least one value indicates the one set of registers;', 'checking whether one or more control indicators are enabled; and', 'performing an operation specified by the opcode field using the one set of registers, based on the checking indicating that at least one control indicator is enabled., 'executing, by the processor, the machine instruction, the executing comprising], 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the multiple sets of registers comprise a floating point set of registers and a vector set of registers.3. The computer program product of claim 1 , wherein the one or more control indicators comprise a floating point enable indicator ...
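A minimal sketch of the enablement check, assuming two register sets and two corresponding control indicators; the names, the interrupt reaction mentioned in the comment and the selection mechanism are illustrative only.

    #include <stdbool.h>

    enum regset { REGSET_FP, REGSET_VECTOR };     /* selected by an operand field     */

    struct ctrl_indicators {                      /* per-thread control indicators    */
        bool fp_enable;
        bool vec_enable;
    };

    /* Returns true if the sub-function may proceed with the selected register set;
     * otherwise the hardware would refuse the operation (for example by taking a
     * facility-unavailable type interrupt) so supervisory software can enable and
     * restore just that set of registers and functional units.                       */
    bool subfunction_enabled(const struct ctrl_indicators *c, enum regset set)
    {
        switch (set) {
        case REGSET_FP:     return c->fp_enable;
        case REGSET_VECTOR: return c->vec_enable;
        }
        return false;
    }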

More details
21-03-2013 publication date

MULTI-ADDRESSABLE REGISTER FILES AND FORMAT CONVERSIONS ASSOCIATED THEREWITH

Number: US20130073838A1

A multi-addressable register file is addressed by a plurality of types of instructions, including scalar, vector and vector-scalar extension instructions. It may be determined that data is to be translated from one format to another format. If so determined, a convert machine instruction is executed that obtains a single precision datum in a first representation in a first format from a first register; converts the single precision datum of the first representation in the first format to a converted single precision datum of a second representation in a second format; and places the converted single precision datum in a second register. 1. A computer program product for executing a machine instruction , said computer program product comprising: [ at least one opcode field identifying a convert instruction;', 'at least one field used to specify a first register; and', 'at least one other field used to specify a second register;, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, obtaining from the first register a single precision binary floating point datum in a first representation in a first format;', 'converting the single precision binary floating point datum of the first representation in the first format to a converted single precision binary floating point datum of a second representation in a second format; and', 'placing the converted single precision binary floating point datum in the second register., 'executing, by the processor, the machine instruction, the executing comprising], 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the method further comprises:determining that the single precision binary floating point datum of the first ...

More details
28-03-2013 publication date

FINE-GRAINED INSTRUCTION ENABLEMENT AT SUB-FUNCTION GRANULARITY

Number: US20130080745A1

Fine-grained enablement at sub-function granularity. An instruction encapsulates different sub-functions of a function, in which the sub-functions use different sets of registers of a composite register file, and therefore, different sets of functional units. At least one operand of the instruction specifies which set of registers, and therefore, which set of functional units, is to be used in performing the sub-function. The instruction can perform various functions (e.g., move, load, etc.) and a sub-function of the function specifies the type of function (e.g., move-floating point; move-vector; etc.). 1. A method of executing a machine instruction , said method comprising: at least one opcode field identifying a particular instruction; and', 'at least one field used to indicate one set of registers of multiple sets of registers; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising obtaining from the at least one field at least one value;', 'determining that the at least one value indicates the one set of registers;', 'checking whether one or more control indicators are enabled;', 'performing an operation specified by the opcode field using the one set of registers, based on the checking indicating at least one control indicator is enabled., 'executing, by the processor, the machine instruction, the executing comprising2. The method of claim 1 , wherein the multiple sets of registers comprise a floating point set of registers and a vector set of registers.3. The method of claim 1 , wherein the one or more control indicators comprise a floating point enable indicator or a vector enable indicator depending on the value of the at least one field.4. The method of claim 1 , wherein the one or more control indicators comprise a floating point enable indicator claim 1 , a vector enable indicator or a vector-scalar enable ...

More details
04-04-2013 publication date

LINKING CODE FOR AN ENHANCED APPLICATION BINARY INTERFACE (ABI) WITH DECODE TIME INSTRUCTION OPTIMIZATION

Number: US20130086338A1

A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted. 1. A method comprising:reading, by a computer, an object file comprising a plurality of code sequences;identifying a code sequence in the object file that specifies an offset from a base address, the offset from the base address corresponding to an offset location in a memory configured for storing one of an address of a variable and data, the identified code sequence comprising a plurality of instructions and configured to perform one of a memory reference function and a memory address computation function;determining that the offset location is within a specified distance of the base address;verifying that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics;replacing the identified code sequence in the object file with the replacement code sequence, the replacement code sequence including a no-operation (NOP) instruction or having fewer instructions than the identified code sequence;generating linked executable code responsive to the object file; andemitting the linked executable code.2. The method of claim 1 , ...
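In spirit this resembles a linker relaxation. The sketch below uses invented encodings and field positions to show the decision: when the resolved offset fits in the dependent instruction's signed 16-bit displacement, the preceding high-part instruction is overwritten with a NOP and the displacement is patched, so instruction addresses and program semantics are preserved.

    #include <stdint.h>
    #include <stdbool.h>

    /* Toy fixup record for a two-instruction address computation
     *   high-part add   rT, rBase, hi(offset)   <- candidate for the NOP
     *   load            rD, lo(offset)(rT)
     * If the whole offset fits in the load's signed 16-bit displacement, the
     * high-part instruction is unnecessary.  Encodings are placeholders.        */
    struct fixup { uint32_t *hi_insn; uint32_t *lo_insn; int64_t offset; };

    #define NOP 0x60000000u               /* placeholder NOP encoding */

    static bool fits16(int64_t v) { return v >= -32768 && v <= 32767; }

    void relax(struct fixup *f)
    {
        if (!fits16(f->offset))
            return;                                   /* keep the long form       */
        *f->hi_insn = NOP;                            /* drop the high-part add   */
        *f->lo_insn = (*f->lo_insn & ~0xFFFFu) | (uint16_t)f->offset;
        /* A real linker would also rewrite the load's base register to rBase;
         * that step is omitted to keep the sketch short.                          */
    }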

More details
04-04-2013 publication date

Scalable Decode-Time Instruction Sequence Optimization of Dependent Instructions

Number: US20130086361A1

Producer-consumer instructions, comprising a first instruction and a second instruction in program order, are fetched requiring in-order execution, the second instruction is modified by the processor so that the first instruction and second instruction can be completed out-of-order, the modification comprising any one of extending an immediate field of the second instruction using immediate field information of the first instruction or providing a source location of the first instruction as an additional source location to source locations of the second instruction. 124-. (canceled)25. A computer implemented method for executing dependent instructions of an instruction set architecture (ISA) out-of-order out-of-order , the computer implemented method comprising:fetching for execution, by the processor, a first instruction of the ISA and a second instruction of the ISA;determining in-order execution candidacy, by the processor, of the first instruction and the second instruction, wherein the first instruction and second instruction are configured to be executed out-of-order but are candidates to be effectively executed in-order, the determination comprising determining that the first instruction specifies a target operand location for a target operand and the second instruction specifies a source operand location for a source operand, wherein the first instruction is configured to store a target operand at the target operand location, wherein the source operand location is the same as the target operand location, wherein the second instruction is configured to obtain the source operand at the source operand location; andbased on the determining in-order execution candidacy, effectively executing, by the processor, the first instruction and the second instruction by executing, by the processor, the first instruction of the ISA and a new second instruction not of the ISA, wherein the new second instruction is not dependent on the target operand of the first instruction ...

More details
04-04-2013 publication date

Managing a Register Cache Based on an Architected Computer Instruction Set Having Operand First-Use Information

Number: US20130086362A1

A prefix instruction is executed and passes operands to a net instruction without storing the operands in an architected resource such that the execution of the next instruction uses the operands provided by the prefix instruction to perform an operation, the operands may be prefix instruction immediate field or a target register of the prefix instruction execution. 17-. (canceled)8. A computer implemented method for executing instructions , the method comprising:obtaining a first instruction and a second instruction for execution, the first instruction preceding the second instruction in program order;determining, by a processor, that the first instruction is a prefix instruction, the prefix instruction specifying a first value to be used m executing the second instruction, the second instruction specifying a second value to be used in executing the second instruction;effectively executing the first instruction absent storing the first value, at an instruction specified location; andeffectively executing the second instruction using the first value absent fetching the first value at a second instruction specified location.9. The method according to claim 8 , the determining further comprising determining that there is no intervening interruption event between the effective execution of the first instruction and the second instruction.10. The method according to claim 9 , wherein the first value to be used in executing the second instruction is identified as a result register of the first instruction claim 9 , wherein the result register of the first instruction is a source register of the second instruction.113. The method according to claim claim 9 , wherein the result register is an architected register associated with an architected instruction set claim 9 , consisting of any one of a general register or a floating point register.12. The method according to claim 9 , wherein the first value to be used in executing the second instruction is identified as a result ...

More details
04-04-2013 publication date

Computer Instructions for Activating and Deactivating Operands

Number: US20130086363A1

An instruction set architecture (ISA) includes instructions for selectively indicating last-use architected operands having values that will not be accessed again, wherein architected operands are made active or inactive after an instruction specified last-use by an instruction, wherein the architected operands are made active by performing a write operation to an inactive operand, wherein the activation/deactivation may be performed by the instruction having the last-use of the operand or another (prefix) instruction. 1. A computer implemented method for deactivating an active operand , the active operand being an instruction accessible architecturally defined operand , wherein access to a deactivated operand need not return values previously stored to the operand , the method comprising:executing, by a processor, an operand deactivating (OD) instruction, the OD instruction comprising an opcode field having an opcode value, the OD instruction having an associated operand to be deactivated, the execution comprising:determining a last-use of the associated operand;performing an opcode defined function using the associated operand; andplacing the associated operand in a deactivated state.2. The method according to claim 1 , further comprising executing claim 1 , by the processor claim 1 , any one of the OD instruction or another instruction to indicate the use of the associated operand by the OD instruction is the last-use.3. The method according to claim 2 , wherein the associated Operand consists of any one of an architected general register claim 2 , an architected adjunct register or an architected floating point register claim 2 , wherein a read of a deactivated operand returns a default value claim 2 , the default value being any one of a value that is architecturally undefined or an architected default value.4. The method according to claim 3 , further comprising:performing a write to an architected operand in a deactivated state causes the architected operand ...

More details
04-04-2013 publication date

Managing a Register Cache Based on an Architected Computer Instruction Set Having Operand Last-User Information

Number: US20130086364A1

A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register. 114-. (canceled)15. A computer implemented method for managing a multi-level register hierarchy , comprising a first level pool , of registers for caching registers of a second level pool of registers , the method comprising:assigning, by a processor, architected registers to available entries of one of said first level pool or said second level pool, wherein architected registers are defined by an instruction set architecture (ISA) and addressable by register field values of instructions of the ISA, wherein the assigning comprises associating each assigned architected register to a corresponding an entry of a pool of registers;moving architected register values to said first level pool from said second level pool according to a first level pool replacement algorithm;based on instructions being executed, accessing architected register values of the first level pool of registers corresponding to said architected registers;based on executing a last-use instruction for using an architected register identified as a last-use architected register, un-assigning the last-use architected register from both the first level pool and the second level pool, wherein un-assigned entries are available for assigning to architected registers.16. The method according to claim 15 , further comprising:based on ...

More details
04-04-2013 publication date

Exploiting an Architected List-Use Operand Indication in a Computer System Operand Resource Pool

Number: US20130086365A1

A pool of available physical registers are provided for architected registers, wherein operations are performed that activate and deactivate selected architected registers, such that the deactivated selected architected registers need not retain values, and physical registers can be deallocated to the pool, wherein deallocation of physical registers is performed after a last-use by a designated last-use instruction, wherein the last-use information is provided either by the last-use instruction or a prefix instruction, wherein reads to deallocated architecture registers return an architected default value. 1. A computer implemented method for managing a pool of available physical registers , the method comprising:performing, by a processor, operations that activate and deactivate selected architected registers of a set of architected registers, wherein a selected architected register is deactivated after a last-use of a value of the selected architected register; andbased on an instruction being executed requesting a read of a first value from an architected register of a set of architected registers, performing a)-c) comprising:a) determining whether the architected register is activated; andb) based on the determining the architected register being read is deactivated, returning an architecture defined default value; andc) based on the determining the architected register being a read is activated, returning the first value, wherein the first value is a previously stored value in said architected register.2. The method according to claim 1 , further comprising;based on an instruction being executed requesting a write of a second value to an architected register of the set of architected registers, performing d)-f) comprising:d) determining whether the architected register is activated; ande) based on the determining the architected register being written to is deactivated, activating the architected register; andf) writing the second value to the architected ...
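A toy rename table makes the described lifecycle visible; the pool sizes, the default value and the handling of pool exhaustion are assumptions for the sketch, not details from the patent.

    #include <stdint.h>

    #define NARCH 32             /* architected registers                     */
    #define NPHYS 48             /* physical register pool (assumed size)     */
    #define DEFAULT_VALUE 0      /* architecture-defined default value        */

    static uint64_t phys[NPHYS];
    static int      map[NARCH];  /* architected -> physical, -1 = deactivated */
    static int      busy[NPHYS];

    void regpool_init(void)
    {
        for (int a = 0; a < NARCH; a++) map[a] = -1;
        for (int p = 0; p < NPHYS; p++) busy[p] = 0;
    }

    /* Deactivate an architected register after its designated last use: the
     * physical register returns to the pool and its value need not be kept.   */
    void deactivate(int arch)
    {
        if (map[arch] >= 0) busy[map[arch]] = 0;
        map[arch] = -1;
    }

    /* Reads of a deactivated register return the architected default value.   */
    uint64_t read_reg(int arch)
    {
        return map[arch] < 0 ? (uint64_t)DEFAULT_VALUE : phys[map[arch]];
    }

    /* A write re-activates the register by allocating from the pool.          */
    void write_reg(int arch, uint64_t value)
    {
        if (map[arch] < 0) {
            for (int p = 0; p < NPHYS; p++)
                if (!busy[p]) { busy[p] = 1; map[arch] = p; break; }
            if (map[arch] < 0) return;   /* pool exhausted; real hardware stalls */
        }
        phys[map[arch]] = value;
    }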

More details
04-04-2013 publication date

Tracking operand liveliness information in a computer system and performance function based on the liveliness information

Number: US20130086367A1
Assignee: International Business Machines Corp

Operand liveness state information is maintained during context switches for current architected operands of executing programs, the current operand state information indicating whether corresponding current operands are any one of enabled or disabled for use by a first program module, the first program module comprising machine instructions of an instruction set architecture (ISA) for disabling current architected operands, wherein a current operand is accessed by a machine instruction of said first program module, the accessing comprising using the current operand state information to determine whether a previously stored current operand value is accessible by the first program module.

More details
04-04-2013 publication date

Using Register Last Use Infomation to Perform Decode-Time Computer Instruction Optimization

Number: US20130086368A1

Two computer machine instructions are fetched for execution, but replaced by a single optimized instruction to be executed, wherein a temporary register used by the two instructions is identified as a last-use register, where a last-use register has a value that is not to be accessed by later instructions, whereby the two computer machine instructions are replaced by a single optimized internal instruction for execution, the single optimized instruction not including the last-use register. 1. A computer implemented method for optimizing instructions to be executed , the method comprising:determining, by a processor, two instructions to be executed are candidates for optimization to a single optimized internal instruction, the two instructions comprising a first instruction identifying a first operand as a target operand and a second instruction identifying the first operand as a source operand, the first instruction preceding the second instruction in program order;determining that the first operand is specified as a last-use operand;creating, by the processor, the single optimized internal instruction based on the two instructions, wherein the single optimized internal instruction does not specify the first operand; andexecuting the single optimized internal instruction instead of the two instructions.2. The method according to claim 1 , wherein the first instruction includes a first immediate field. and the second instruction comprises a second immediate field further comprising;concatenating at least part of the first immediate field with at least a part of the second immediate field to create a combined immediate field of the single optimized internal instruction.3. The method according to claim 2 , wherein the single optimized internal instruction is created based on the at least part of the first immediate field or the at least a part of the second immediate field forming most significant bits of the combined immediate field have a predetermined number of high ...
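To make the decision concrete, the sketch below uses an invented instruction pair, a "load upper immediate" followed by a dependent "or immediate": when the intermediate register is flagged as last-use, the pair is replaced by a single internal "load 32-bit immediate" operation that never names that register. The opcodes and structures are illustrative only.

    #include <stdint.h>
    #include <stdbool.h>

    struct insn { int opcode; int rt; int ra; uint16_t imm; bool ra_last_use; };
    struct iop  { int opcode; int rt; uint32_t imm32; };   /* internal operation */

    enum { OP_LUI, OP_ORI, IOP_LI32 };     /* illustrative opcodes only */

    /* If i1 produces a register that i2 consumes, i2's source is marked as a
     * last use, and the pair matches a known pattern, emit one internal op
     * that no longer references the temporary register at all.               */
    bool try_fuse(const struct insn *i1, const struct insn *i2, struct iop *out)
    {
        if (i1->opcode == OP_LUI && i2->opcode == OP_ORI &&
            i2->ra == i1->rt && i2->ra_last_use) {
            out->opcode = IOP_LI32;
            out->rt     = i2->rt;
            out->imm32  = ((uint32_t)i1->imm << 16) | i2->imm;
            return true;                    /* single optimized internal op */
        }
        return false;                       /* execute the two as written   */
    }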

More details
04-04-2013 publication date

COMPILING CODE FOR AN ENHANCED APPLICATION BINARY INTERFACE (ABI) WITH DECODE TIME INSTRUCTION OPTIMIZATION

Number: US20130086369A1

Compiling code for an enhanced application binary interface (ABI) including identifying, by a computer, a code sequence configured to perform a variable address reference table function including an access to a variable at an offset outside of a location in a variable address reference table. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A scheduler cost function associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified scheduler cost function that is configured to place the first instruction next to the second instruction. An object file is generated responsive to the modified scheduler cost function. The object file includes the first instruction placed next to the second instruction. The object file is emitted. 1. A method comprising:identifying, by a computer, a code sequence configured to perform a variable address reference table function including an access to a variable at an offset outside of a location in a variable address reference table, the code sequence comprising an internal representation (IR) of a first instruction and an IR of a second instruction, the second instruction dependent on the first instruction;modifying a scheduler cost function associated with at least one of the IR of the first instruction and the IR of the second instruction, the modifying including generating a modified scheduler cost function that is configured to place the first instruction next to the second instruction;generating an object file responsive to the modified scheduler cost function, the object file including the first instruction placed next to the second instruction; andemitting the object file.2. The method of claim 1 , wherein the code sequence is a destructive code sequence.3. The method of claim 1 , wherein the code sequence is a ...

More details
04-04-2013 publication date

Generating compiled code that indicates register liveness

Number: US20130086548A1
Assignee: International Business Machines Corp

Object code is generated from an internal representation that includes a plurality of source operands. The generating includes determining, for each source operand in the internal representation, whether a last use has occurred for the source operand. The determining includes accessing a data flow graph to determine whether all uses of a live range have been emitted. If it is determined that a last use has occurred for the source operand, an architected resource associated with the source operand is marked for last-use indication. A last-use indication is then generated for the architected resource. Instructions and the last-use indications are emitted into the object code.

More details
04-04-2013 publication date

LINKING CODE FOR AN ENHANCED APPLICATION BINARY INTERFACE (ABI) WITH DECODE TIME INSTRUCTION OPTIMIZATION

Number: US20130086570A1

A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted. 1. A computer program product comprising a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:reading, by a computer, an object file comprising a plurality of code sequences;identifying a code sequence in the object file that specifies an offset from a base address, the offset from the base address corresponding to an offset location in a memory configured for storing one of an address of a variable and data, the identified code sequence comprising a plurality of instructions and configured to perform one of a memory reference function and a memory address computation function;determining that the offset location is within a specified distance of the base address;verifying that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics;replacing the identified code sequence in the object file with the replacement code sequence, the replacement code sequence including a no-operation (NOP) instruction or having fewer ...

More details
04-04-2013 publication date

Privilege level aware processor hardware resource management facility

Number: US20130086581A1
Assignee: International Business Machines Corp

Multiple machine state registers are included in a processor core to permit distinction between use of hardware facilities by applications, supervisory threads and the hypervisor. All facilities are initially disabled by the hypervisor when a partition is initialized. When any access is made to a disabled facility, the hypervisor receives an indication of which facility was accessed and sets a corresponding hardware flag in the hypervisor's machine state register. When an application attempts to access a disabled facility, the supervisor managing the operating system image receives an indication of which facility was accessed and sets a corresponding hardware flag in the supervisor's machine state register. The multiple register implementation permits the supervisor to determine whether particular hardware facilities need to have their state saved when an application context swap occurs and the hypervisor can determine which hardware facilities need to have their state saved when a partition swap occurs.

More details
04-04-2013 publication date

GENERATING COMPILED CODE THAT INDICATES REGISTER LIVENESS

Number: US20130086598A1

Object code is generated from an internal representation that includes a plurality of source operands. The generating includes performing for each source operand in the internal representation determining whether a last use has occurred for the source operand. The determining includes accessing a data flow graph to determine whether all uses of a live range have been emitted. If it is determined that a last use has occurred for the source operand, an architected resource associated with the source operand is marked for last-use indication. A last-use indication is then generated for the architected resource. Instructions and the last-use indications are emitted into the object code. 1. A computer program product comprising a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: [ 'determining whether a last use has occurred for the source operand, the determining comprising accessing a data flow graph to determine whether all uses of a live range have been emitted;', 'performing, for each source operand in the internal representation, generating a last-use indication for the architected resource responsive to the marking; and', 'emitting instructions and the last-use indications into the object code., 'responsive to determining that a last use has occurred for the source operand, marking an architected resource associated with the source operand for last-use indication; and'}], 'generating, by a computer, object code from an internal representation, the internal representation comprising a plurality of source operands, the generating comprising2. The computer program product of claim 1 , wherein the architected resource is a register.3. The computer program product of claim 1 , wherein an instruction associated with the last use of the source operand is modified to include the last-use indication.4. The computer program product of claim 3 , wherein the instruction ...

More details
25-04-2013 publication date

Multi-addressable register files and format conversions associated therewith

Number: US20130103932A1
Assignee: International Business Machines Corp

A multi-addressable register file is addressed by a plurality of types of instructions, including scalar, vector and vector-scalar extension instructions. It may be determined that data is to be translated from one format to another format. If so determined, a convert machine instruction is executed that obtains a single precision datum in a first representation in a first format from a first register; converts the single precision datum of the first representation in the first format to a converted single precision datum of a second representation in a second format; and places the converted single precision datum in a second register.

More details
11-07-2013 publication date

TICKET CONSOLIDATION

Number: US20130179736A1

A method of ticket consolidation in computing environment may in one aspect analyze problem reports, determine whether problems reported by machines are caused by the same or substantially the same run-time configuration error or are occurring on the same physical server, and are within the given sensitivity time window, consolidate the problem tickets and increase the priority of the consolidated ticket. 1. A method of monitoring and consolidating problems in a computing environment comprising:analyzing a problem report generated associated with a machine in the computing environment;identifying if a problem indicated in the problem report has also been reported in another machine having same runtime configuration as the machine;if said problem has been reported in said another machine having the same runtime configuration, determining whether the generated problem report and the problem reported in said another machine are within a given sensitivity time window;if the generated problem report and the problem reported in said another machine are determined to be within the given sensitivity time window, consolidating the generated problem report and the problem reported in said another machine into a single representative problem ticket;if the generated problem report and the problem reported in said another machine are determined to be outside the given sensitivity time window, generating a new problem ticket including information associated with the generated problem report.2. The method of claim 1 , wherein the single representative problem ticket is prioritized.3. The method of claim 1 , wherein said single representative problem ticket or said generated new problem ticket is entered into an incident monitoring and handling system and incident management system.4. The method of claim 1 , further including maintaining and updating a list of problem reports including associated timestamps indicating when the problem reports were generated and identification of ...
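A possible shape of the consolidation test, with the matching keys and the window length chosen arbitrarily for the example; a real system would also merge the ticket records and raise the priority of the representative ticket.

    #include <stdbool.h>
    #include <string.h>
    #include <time.h>

    #define WINDOW_SECONDS 300            /* sensitivity time window (assumed) */

    struct report {
        char   config_error[64];          /* normalized runtime-configuration error id */
        char   server[64];                /* physical server hosting the machine       */
        time_t when;                      /* timestamp of the problem report           */
    };

    /* Two reports are consolidated into one representative ticket when they
     * describe the same configuration error or occur on the same physical
     * server, and they arrive within the sensitivity time window.             */
    bool should_consolidate(const struct report *a, const struct report *b)
    {
        time_t newer = a->when > b->when ? a->when : b->when;
        time_t older = a->when > b->when ? b->when : a->when;
        if (difftime(newer, older) > WINDOW_SECONDS)
            return false;
        return strcmp(a->config_error, b->config_error) == 0 ||
               strcmp(a->server, b->server) == 0;
    }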

More details
15-08-2013 publication date

MIXED PRECISION ESTIMATE INSTRUCTION COMPUTING NARROW PRECISION RESULT FOR WIDE PRECISION INPUTS

Number: US20130212139A1

A technique is provided for performing a mixed precision estimate. A processing circuit receives an input of a first precision having a wide precision value. The processing circuit computes an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value. 1. A computer program product for performing a mixed precision estimate , the computer program product comprising:a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:receiving, by the processing circuit, an input of a wide precision having a wide precision value; andcomputing, by the processing circuit, an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value.2. The computer program product of claim 1 , wherein the method further comprises storing claim 1 , by the processing circuit claim 1 , the output in a register having an architected register storage format in a wide precision format.3. The computer program product of claim 1 , wherein the method further comprises based on the wide precision value of the input having an input exponent failing to correspond to the output exponent range claim 1 , generating the output as an out of range value.4. The computer program product of claim 3 , wherein the out of range value comprises at least one of zero and infinity.5. The computer program product of claim 1 , wherein the method further comprises based on the input comprising a wide not a number (NaN) claim 1 , converting the wide not a number to a narrow not a number in which not a number properties are preserved.6. The computer program product of claim 1 , wherein the method further comprises based on the input having the wide precision value with an input exponent failing to adhere to a valid exponent range of a valid single precision value claim 1 , generating a mantissa ...
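A rough numerical model of the range behavior, assuming the wide format is IEEE double and the narrow range is that of IEEE single precision; the plain division stands in for the actual estimate algorithm, which is not described here.

    #include <math.h>
    #include <float.h>

    /* Reciprocal estimate on a wide (double) input, with the result confined to
     * the narrow (single-precision) exponent range but stored back in the wide
     * register format.  Inputs whose result would fall outside that range yield
     * an out-of-range value (zero or infinity); NaN properties are preserved.   */
    double recip_estimate_mixed(double x)
    {
        if (isnan(x))
            return nan("");                       /* NaN in, NaN out            */
        double r = 1.0 / x;                       /* stand-in for the estimate  */
        if (fabs(r) > FLT_MAX)
            return copysign(INFINITY, r);         /* out of range -> infinity   */
        if (r != 0.0 && fabs(r) < FLT_MIN)
            return copysign(0.0, r);              /* out of range -> zero       */
        return r;
    }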

Publication date: 19-09-2013

COMPARING SETS OF CHARACTER DATA HAVING TERMINATION CHARACTERS

Number: US20130243325A1

Multiple sets of character data having termination characters are compared using parallel processing and without causing unwarranted exceptions. Each set of character data to be compared is loaded within one or more vector registers. In particular, in one embodiment, for each set of character data to be compared, an instruction is used that loads data in a vector register to a specified boundary, and provides a way to determine the number of characters loaded. Further, an instruction is used to find the index of the first delimiter character, i.e., the first zero or null character, or the index of unequal characters. Using these instructions, a location of the end of one of the sets of data or a location of an unequal character is efficiently provided. 1. A computer program product for comparing characters of a plurality of sets of data , the computer program product comprising: loading from memory to a first register, first data that is within a first specified block of memory, the first data being at least a portion of a first set of data to be compared;', 'loading from memory to a second register, second data that is within a second specified block of memory, the second data being at least a portion of a second set of data to be compared;', 'obtaining a first count of an amount of the first data loaded in the first register and a second count of an amount of second data loaded in the second register;', A) comparing the first data loaded in the first register with the second data loaded in the second register searching for an unequal character; and', 'B) searching at least one of the first register and the second register for a termination character; and', 'based on at least one of the comparing and the searching, setting the value to one of a location of the unequal character, a location of the termination character, or a pre-specified value based on not finding an unequal character or a termination character;, 'determining, by a processor, a value, the ...
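
Below is a scalar C sketch of the comparison loop described above: each iteration handles only the bytes that can be loaded without crossing the next block boundary, then looks for an unequal byte or a terminator. The 16-byte block size is an assumption standing in for the vector register width.

    #include <stddef.h>
    #include <stdint.h>

    #define BLOCK 16  /* assumed block size / vector width */

    /* Compare two NUL-terminated strings; returns <0, 0 or >0. No load ever
     * crosses a BLOCK boundary before the terminator has been checked. */
    int compare_terminated(const unsigned char *a, const unsigned char *b)
    {
        for (;;) {
            size_t na = BLOCK - ((uintptr_t)a % BLOCK);   /* loadable from a */
            size_t nb = BLOCK - ((uintptr_t)b % BLOCK);   /* loadable from b */
            size_t n  = na < nb ? na : nb;

            for (size_t i = 0; i < n; i++) {
                if (a[i] != b[i])
                    return (int)a[i] - (int)b[i];         /* first unequal byte */
                if (a[i] == '\0')
                    return 0;                             /* both strings ended */
            }
            a += n;
            b += n;
        }
    }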

Publication date: 19-09-2013

FINDING THE LENGTH OF A SET OF CHARACTER DATA HAVING A TERMINATION CHARACTER

Number: US20130246699A1

The length of character data having a termination character is determined. The character data for which the length is to be determined is loaded, in parallel, within one or more vector registers. An instruction is used that loads data in a vector register to a specified boundary, and provides a way to determine the number of characters loaded, using, for instance, another instruction. Further, an instruction is used to find the index of the first termination character, e.g., the first zero or null character. This instruction searches the data in parallel for the termination character. By using these instructions, the length of the character data is determined using only one branch instruction. 1. A computer program product for determining a length of a set of data , the computer program product comprising: loading from memory to a register data that is within a specified block of memory, the data being at least a portion of the set of data for which the length is to be determined;', 'obtaining a count of an amount of data loaded in the register;', 'determining, by a processor, a termination value for the data loaded in the register, the determining comprising checking the data to determine whether the register includes a termination character, and based on the register including a termination character, setting the termination value to a location of the termination character, and based on the register not including the termination character, setting the termination value to a pre-specified value;', 'checking whether there is additional data to be counted based on at least one of the count and the termination value;', 'based on the checking indicating additional data is to be counted, incrementing the count based on the additional data, the count providing the length of the set of data; and', 'based on the checking indicating additional data is not to be counted, using the count as a length of the set of data., 'a computer readable storage medium readable by a ...
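
Below is a scalar C sketch of the length computation described above: each iteration "loads" only up to the next block boundary and searches that block for the terminator, so the routine never touches memory beyond the block that holds the terminating byte. The 16-byte block size is an assumption.

    #include <stddef.h>
    #include <stdint.h>

    #define BLOCK 16  /* assumed block size / vector width */

    /* strlen-style sketch that never reads across a BLOCK boundary, so it
     * cannot fault on a page lying beyond the terminating NUL. */
    size_t terminated_length(const unsigned char *s)
    {
        size_t len = 0;
        for (;;) {
            size_t loaded = BLOCK - ((uintptr_t)(s + len) % BLOCK);  /* bytes in this block */
            for (size_t i = 0; i < loaded; i++) {
                if (s[len + i] == '\0')
                    return len + i;        /* terminator found in this block */
            }
            len += loaded;                 /* no terminator yet: keep counting */
        }
    }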

Publication date: 19-09-2013

INSTRUCTION TO LOAD DATA UP TO A SPECIFIED MEMORY BOUNDARY INDICATED BY THE INSTRUCTION

Number: US20130246738A1

A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary. 1. A computer program product for executing a machine instruction in a central processing unit , the computer program product comprising: [ at least one opcode field to provide an opcode, the opcode identifying a load to block boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand;', 'at least one field for locating a second operand in main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, 'only loading bytes of the first operand with corresponding bytes of the second operand that are within an instruction specified block of main memory.', 'executing the machine instruction, the execution comprising], 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the at least one field comprises a displacement field claim 1 , a base field and an index field claim 1 , the base field and index field for locating general registers having contents to be added to contents of the displacement field to form an address of the second operand claim 1 , and wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the block boundary.3. The computer program product of claim 2 , wherein the block boundary is one block boundary of a plurality of block boundaries specifiable by the mask field. ...
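
Below is a C sketch of the load-to-block-boundary semantics described above, modelling the target register as a small byte buffer. The power-of-two boundary is passed in directly; the instruction's mask-field encodings are not modelled.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Load at most reg_size bytes from src into reg, but never past the next
     * boundary-aligned address (boundary must be a power of two). Returns the
     * number of bytes actually loaded. Illustrative sketch only. */
    size_t load_to_block_boundary(unsigned char *reg, size_t reg_size,
                                  const unsigned char *src, size_t boundary)
    {
        size_t to_boundary = boundary - ((uintptr_t)src & (boundary - 1));
        size_t n = to_boundary < reg_size ? to_boundary : reg_size;
        memcpy(reg, src, n);   /* only bytes within the specified block are loaded */
        return n;
    }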

Publication date: 19-09-2013

Copying character data having a termination character from one memory location to another

Number: US20130246739A1
Assignee: International Business Machines Corp

Copying characters of a set of terminated character data from one memory location to another memory location using parallel processing and without causing unwarranted exceptions. The character data to be copied is loaded within one or more vector registers. In particular, in one embodiment, an instruction (e.g., a Vector Load to block Boundary instruction) is used that loads data in parallel in a vector register to a specified boundary, and provides a way to determine the number of characters loaded. To determine the number of characters loaded (a count), another instruction (e.g., a Load Count to Block Boundary instruction) is used. Further, an instruction (e.g., a Vector Find Element Not Equal instruction) is used to find the index of the first delimiter character, i.e., the first termination character, such as a zero or null character within the character data. This instruction checks a plurality of bytes of data in parallel.
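
Below is a scalar C sketch of the copy loop described above: bytes are read only up to the next block boundary, each block is scanned for the terminator, and copying stops once the terminator has been written. The 16-byte block size is an assumption standing in for the vector register width.

    #include <stddef.h>
    #include <stdint.h>

    #define BLOCK 16  /* assumed block size / vector width */

    /* Copy a NUL-terminated string block by block without reading past a
     * BLOCK boundary before the terminator has been checked. */
    void copy_terminated(unsigned char *dst, const unsigned char *src)
    {
        for (;;) {
            size_t loaded = BLOCK - ((uintptr_t)src % BLOCK);  /* safely loadable bytes */
            for (size_t i = 0; i < loaded; i++) {
                dst[i] = src[i];
                if (src[i] == '\0')
                    return;            /* terminator copied: done */
            }
            dst += loaded;
            src += loaded;
        }
    }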

Publication date: 19-09-2013

INSTRUCTION TO LOAD DATA UP TO A DYNAMICALLY DETERMINED MEMORY BOUNDARY

Number: US20130246740A1

A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor. 1. A computer program product for executing a machine instruction in a central processing unit , the computer program product comprising: [ at least one opcode field to provide an opcode, the opcode identifying a load to block boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand;', 'at least one field for locating a second operand in main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, 'only loading bytes of the first operand with corresponding bytes of the second operand that are within a block of main memory dynamically determined based on a specified type of block boundary and one or more characteristics of the processor.', 'executing the machine instruction, the execution comprising], 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the at least one field comprises a displacement field claim 1 , a base field and an index field claim 1 , the base field and index field for locating general registers having contents to be added to contents of the displacement field to form an address of the second operand.3. The computer program product of claim 1 , wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the type of boundary.4. The computer program ...
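
Compared with the instruction-specified variant earlier in this list, the boundary here is taken from processor characteristics. Below is a small C sketch of that selection step, which could feed the load_to_block_boundary sketch above; the type encoding is hypothetical.

    #include <stddef.h>

    enum boundary_type { BOUNDARY_CACHE_LINE, BOUNDARY_PAGE };  /* hypothetical encoding */

    /* Choose the block size from processor characteristics, as described in
     * the abstract (e.g. cache line size or page size of the processor). */
    size_t dynamic_boundary(enum boundary_type type,
                            size_t cache_line_size, size_t page_size)
    {
        return (type == BOUNDARY_CACHE_LINE) ? cache_line_size : page_size;
    }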

Publication date: 19-09-2013

Vector find element not equal instruction

Number: US20130246751A1
Assignee: International Business Machines Corp

Processing of character data is facilitated. A Find Element Not Equal instruction is provided that compares data of multiple vectors for inequality and provides an indication of inequality, if inequality exists. An index associated with the unequal element is stored in a target vector register. Further, the same instruction, the Find Element Not Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare.
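
Below is a C sketch of the Find Element Not Equal behaviour described above, reduced to byte elements: it reports the index of the first mismatch between two vectors and optionally treats a zero element in the first operand as a hit as well. The 16-element length and the single boolean control are assumptions.

    #include <stddef.h>

    #define VLEN 16  /* assumed number of byte elements per vector */

    /* Return the index of the first unequal element, or of the first zero
     * element in a if search_zero is set and it comes first; VLEN means
     * "nothing found". Illustrative sketch only. */
    size_t find_element_not_equal(const unsigned char a[VLEN],
                                  const unsigned char b[VLEN],
                                  int search_zero)
    {
        for (size_t i = 0; i < VLEN; i++) {
            if (search_zero && a[i] == 0)
                return i;              /* null element found first */
            if (a[i] != b[i])
                return i;              /* first unequal element */
        }
        return VLEN;                   /* all elements equal, no zero seen */
    }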

Publication date: 19-09-2013

VECTOR FIND ELEMENT EQUAL INSTRUCTION

Number: US20130246752A1

Processing of character data is facilitated. A Find Element Equal instruction is provided that compares data of multiple vectors for equality and provides an indication of equality, if equality exists. An index associated with the equal element is stored in a target vector register. Further, the same instruction, the Find Element Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare. 1. A computer program product for executing a machine instruction in a central processing unit , the computer program product comprising: [ at least one opcode field to provide an opcode, the opcode identifying a Vector Find Element Equal operation;', 'an extension field to be used in designating one or more registers;', 'a first register field combined with a first portion of the extension field to designate a first register, the first register comprising a first operand;', 'a second register field combined with a second portion of the extension field to designate a second register, the second register comprising a second operand;', 'a third register field combined with a third portion of the extension field to designate a third register, the third register comprising a third operand;', 'a mask field, the mask field comprising one or more controls to be used during execution of the machine instruction; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, determining whether the mask field includes a zero element control set to indicate a search for a zero element;', 'based on the mask field including the zero element control set to indicate the search for a zero element, searching the second operand for a zero element, the searching providing a null index, the null index including one of ...
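
The counterpart of the previous sketch for the Find Element Equal case: the index of the first position where the two vectors hold equal elements, with the same optional zero-element search. Again a byte-element C illustration, not the architected instruction definition.

    #include <stddef.h>

    #define VLEN 16  /* assumed number of byte elements per vector */

    /* Return the index of the first equal element, or of the first zero
     * element in a if search_zero is set and it comes first; VLEN means
     * "nothing found". Illustrative sketch only. */
    size_t find_element_equal(const unsigned char a[VLEN],
                              const unsigned char b[VLEN],
                              int search_zero)
    {
        for (size_t i = 0; i < VLEN; i++) {
            if (search_zero && a[i] == 0)
                return i;              /* null element found first */
            if (a[i] == b[i])
                return i;              /* first equal element */
        }
        return VLEN;                   /* no equal element, no zero seen */
    }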

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION INDIRECT SAMPLING BY ADDRESS

Number: US20130246754A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by address. An aspect of the invention includes reading sample-point addresses from a sample-point address array, and comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor. A sample point is recognized upon execution of the instruction associated with the address matching one of the sample-point addresses. Run-time instrumentation information is obtained from the sample point. The run-time instrumentation information is stored in a run-time instrumentation program buffer as a reporting group. 1. A computer program product for implementing run-time instrumentation indirect sampling by address , the computer program product comprising: reading sample-point addresses from a sample-point address array;', 'comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor;', 'recognizing a sample point upon execution of the instruction associated with the address matching one of the sample-point addresses, wherein run-time instrumentation information is obtained from the sample point; and', 'storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the address associated with the instruction claim 1 , based on address type claim 1 , is one of: an address of the instruction and an address of an operand of the instruction.3. The computer program product of claim 1 , further comprising:initializing a run-time-instrumentation control based on executing a load run-time instrumentation controls (LRIC) instruction, the LRIC instruction ...
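
Below is a C sketch of the sampling test described above: the address of each executed instruction is compared against the sample-point address array, and a match causes a (here greatly simplified) reporting group to be stored in the program buffer. The structure and names are hypothetical.

    #include <stddef.h>
    #include <stdint.h>

    #define RI_BUFFER_SLOTS 64  /* assumed program-buffer capacity */

    struct ri_buffer {                         /* hypothetical program buffer */
        uint64_t reporting_group[RI_BUFFER_SLOTS];
        size_t   used;
    };

    /* Recognize a sample point when the instruction address matches one of
     * the sample-point addresses, and store a reporting group. */
    void check_sample_point(struct ri_buffer *buf, uint64_t insn_address,
                            const uint64_t *sample_points, size_t count)
    {
        for (size_t i = 0; i < count; i++) {
            if (sample_points[i] == insn_address && buf->used < RI_BUFFER_SLOTS) {
                buf->reporting_group[buf->used++] = insn_address;
                break;
            }
        }
    }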

Publication date: 19-09-2013

VECTOR FIND ELEMENT EQUAL INSTRUCTION

Number: US20130246757A1

Processing of character data is facilitated. A Find Element Equal instruction is provided that compares data of multiple vectors for equality and provides an indication of equality, if equality exists. An index associated with the equal element is stored in a target vector register. Further, the same instruction, the Find Element Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare. 1. A method of executing a machine instruction in a central processing unit , the method comprising: at least one opcode field to provide an opcode, the opcode identifying a Vector Find Element Equal operation;', 'an extension field to be used in designating one or more registers;', 'a first register field combined with a first portion of the extension field to designate a first register, the first register comprising a first operand;', 'a second register field combined with a second portion of the extension field to designate a second register, the second register comprising a second operand;', 'a third register field combined with a third portion of the extension field to designate a third register, the third register comprising a third operand;', 'a mask field, the mask field comprising one or more controls to be used during execution of the machine instruction; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising determining whether the mask field includes a zero element control set to indicate a search for a zero element;', 'based on the mask field including the zero element control set to indicate the search for a zero element, searching the second operand for a zero element, the searching providing a null index, the null index including one of an index of a zero element found in the ...

Publication date: 19-09-2013

VECTOR FIND ELEMENT NOT EQUAL INSTRUCTION

Number: US20130246759A1

Processing of character data is facilitated. A Find Element Not Equal instruction is provided that compares data of multiple vectors for inequality and provides an indication of inequality, if inequality exists. An index associated with the unequal element is stored in a target vector register. Further, the same instruction, the Find Element Not Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare. 1. A method of executing a machine instruction in a central processing unit , the method comprising: at least one opcode field to provide an opcode, the opcode identifying a Vector Find Element Not Equal operation;', 'an extension field to be used in designating one or more registers;', 'a first register field combined with a first portion of the extension field to designate a first register, the first register comprising a first operand;', 'a second register field combined with a second portion of the extension field to designate a second register, the second register comprising a second operand;', 'a third register field combined with a third portion of the extension field to designate a third register, the third register comprising a third operand;', 'a mask field, the mask field comprising one or more controls to be used during execution of the machine instruction; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising determining whether the mask field includes a zero element control set to indicate a search for a zero element;', 'based on the mask field including the zero element control set to indicate the search for a zero element, searching the second operand for a zero element, the searching providing a null index, the null index including one of an index of a zero ...

Publication date: 19-09-2013

INSTRUCTION TO LOAD DATA UP TO A DYNAMICALLY DETERMINED MEMORY BOUNDARY

Number: US20130246762A1

A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor. 1. A method of executing a machine instruction in a central processing unit , the method comprising: at least one opcode field to provide an opcode, the opcode identifying a load to block boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand;', 'at least one field for locating a second operand in main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising 'only loading bytes of the first operand with corresponding bytes of the second operand that are within a block of main memory dynamically determined based on a specified type of block boundary and one or more characteristics of the processor.', 'executing the machine instruction, the execution comprising2. The method of claim 1 , wherein the at least one field comprises a displacement field claim 1 , a base field and an index field claim 1 , the base field and index field for locating general registers having contents to be added to contents of the displacement field to form an address of the second operand.3. The method of claim 1 , wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the type of boundary.4. The method of claim 1 , wherein the one or more characteristics comprises one of a cache line size of the processor or a page size of the processor.5. The method of claim 1 , wherein the executing comprises determining a boundary of the block using an address of the ...

Publication date: 19-09-2013

INSTRUCTION TO COMPUTE THE DISTANCE TO A SPECIFIED MEMORY BOUNDARY

Number: US20130246763A1

A Load Count to Block Boundary instruction is provided that provides a distance from a specified memory address to a specified memory boundary. The memory boundary is a boundary that is not to be crossed in loading data. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary; or it may be dynamically determined. 1. A method of executing a machine instruction in a central processing unit , the method comprising: at least one opcode field to provide an opcode, the opcode identifying a Load Count to Block Boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand; and', 'at least one field for indicating a location of a second operand, the second operand comprising at least a portion of a block of main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising determining a distance from the location of the second operand to a boundary of the block of main memory; and', 'placing a value representing the distance in the first operand., 'executing the machine instruction, the execution comprising2. The method of claim 1 , wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the boundary.3. The method of claim 2 , wherein the block boundary is one boundary of a plurality of boundaries specifiable by the mask field.4. The method of claim 1 , wherein the executing further comprises dynamically determining the boundary claim 1 , the dynamically determining using a specified type of boundary and one or more characteristics of the processor.5. The method of claim 1 , wherein the location of the second operand is a starting address in memory from which data is to be counted.6. The ...
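
For a power-of-two boundary, the distance described above is simple mask arithmetic. Below is a one-function C sketch; the mask-field and dynamically determined variants mentioned in the claims are not modelled.

    #include <stddef.h>
    #include <stdint.h>

    /* Number of bytes from addr to the next boundary-aligned address
     * (boundary must be a power of two). Illustrative sketch of the count
     * a Load Count to Block Boundary instruction would provide. */
    size_t load_count_to_block_boundary(const void *addr, size_t boundary)
    {
        return boundary - ((uintptr_t)addr & (boundary - 1));
    }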

Publication date: 19-09-2013

INSTRUCTION TO LOAD DATA UP TO A SPECIFIED MEMORY BOUNDARY INDICATED BY THE INSTRUCTION

Number: US20130246764A1

A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary. 1. A method of executing a machine instruction in a central processing unit , the method comprising: at least one opcode field to provide an opcode, the opcode identifying a load to block boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand;', 'at least one field for locating a second operand in main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising 'only loading bytes of the first operand with corresponding bytes of the second operand that are within an instruction specified block of main memory.', 'executing the machine instruction, the execution comprising2. The method of claim 1 , wherein the at least one field comprises a displacement field claim 1 , a base field and an index field claim 1 , the base field and index field for locating general registers having contents to be added to contents of the displacement field to form an address of the second operand claim 1 , and wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the block boundary.3. The method of claim 2 , wherein the block boundary is one block boundary of a plurality of block boundaries specifiable by the mask field.4. The method of claim 1 , wherein the address of the second operand is a starting address in memory from which data is to be loaded in the first operand.5. The method of claim 4 , wherein the executing further comprises determining an ending ...

Publication date: 19-09-2013

TRANSFORMING NON-CONTIGUOUS INSTRUCTION SPECIFIERS TO CONTIGUOUS INSTRUCTION SPECIFIERS

Number: US20130246766A1
Author: Gschwind Michael K.

Emulation of instructions that include non-contiguous specifiers is facilitated. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction. Non-contiguous specifiers of instructions defined in one computer system architecture are transformed to contiguous specifiers usable by instructions defined in another computer system architecture. The instructions defined in the another computer system architecture emulate the instructions defined for the one computer system architecture. 1. A method of transforming instruction specifiers of a computing environment , the method comprising:obtaining, by a processor, from a first instruction defined for a first computer architecture, a non-contiguous specifier, the non-contiguous specifier having a first portion and a second portion, wherein the obtaining comprises obtaining the first portion from a first field of the instruction and the second portion from a second field of the instruction, the first field separate from the second field;generating a contiguous specifier using the first portion and the second portion, the generating using one or more rules based on the opcode of the first instruction; andusing the contiguous specifier to indicate a resource to be used in execution of a second instruction, the second instruction defined for a second computer architecture different from the first computer architecture and emulating a function of the first instruction.2. The method of claim 1 , wherein the processor comprises an emulator claim 1 , and wherein the first portion includes a first one or more bits claim 1 , and the second portion includes a second one or more bits claim 1 , and the generating comprises concatenating the second one or more bits with the first one or more bits to form the ...
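
Below is a C sketch of the specifier transformation described above, assuming a 4-bit register field plus a 1-bit extension field; the widths are illustrative, not taken from the patent. For example, a register field of 0x9 with the extension bit set yields contiguous register number 25.

    #include <stdint.h>

    /* Concatenate a 1-bit extension field with a 4-bit register field to form
     * one contiguous 5-bit register specifier, as when emulating instructions
     * whose specifier is split across separate fields of the instruction text. */
    static inline uint8_t contiguous_register(uint8_t reg_field4, uint8_t ext_bit)
    {
        return (uint8_t)(((ext_bit & 0x1u) << 4) | (reg_field4 & 0xFu));
    }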

Publication date: 19-09-2013

INSTRUCTION TO COMPUTE THE DISTANCE TO A SPECIFIED MEMORY BOUNDARY

Number: US20130246767A1

A Load Count to Block Boundary instruction is provided that provides a distance from a specified memory address to a specified memory boundary. The memory boundary is a boundary that is not to be crossed in loading data. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary; or it may be dynamically determined. 1. A computer program product for executing a machine instruction in a central processing unit , the computer program product comprising: [ at least one opcode field to provide an opcode, the opcode identifying a Load Count to Block Boundary operation;', 'a register field to be used to designate a register, the register comprising a first operand; and', 'at least one field for indicating a location of a second operand, the second operand comprising at least a portion of a block of main memory; and, 'obtaining, by a processor, a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction comprising, determining a distance from the location of the second operand to a boundary of the block of main memory; and', 'placing a value representing the distance in the first operand., 'executing the machine instruction, the execution comprising], 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the machine instruction further comprises a mask field claim 1 , the mask field specifying the boundary.3. The computer program product of claim 2 , wherein the block boundary is one boundary of a plurality of boundaries specifiable by the mask field.4. The computer program product of claim 1 , wherein the executing further comprises dynamically determining the ...

Publication date: 19-09-2013

TRANSFORMING NON-CONTIGUOUS INSTRUCTION SPECIFIERS TO CONTIGUOUS INSTRUCTION SPECIFIERS

Number: US20130246768A1
Author: Gschwind Michael K.

Emulation of instructions that include non-contiguous specifiers is facilitated. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction. Non-contiguous specifiers of instructions defined in one computer system architecture are transformed to contiguous specifiers usable by instructions defined in another computer system architecture. The instructions defined in the another computer system architecture emulate the instructions defined for the one computer system architecture. 1. A computer program product for transforming instruction specifiers of a computing environment , the computer program product comprising: obtaining, by a processor, from a first instruction defined for a first computer architecture, a non-contiguous specifier, the non-contiguous specifier having a first portion and a second portion, wherein the obtaining comprises obtaining the first portion from a first field of the instruction and the second portion from a second field of the instruction, the first field separate from the second field;', 'generating a contiguous specifier using the first portion and the second portion, the generating using one or more rules based on the opcode of the first instruction; and', 'using the contiguous specifier to indicate a resource to be used in execution of a second instruction, the second instruction defined for a second computer architecture different from the first computer architecture and emulating a function of the first instruction., 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the processor comprises an emulator claim 1 , and wherein the first portion ...

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION INDIRECT SAMPLING BY INSTRUCTION OPERATION CODE

Number: US20130246772A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by instruction operation code. An aspect of the invention includes a method for implementing run-time instrumentation indirect sampling by instruction operation code. The method includes reading sample-point instruction operation codes from a sample-point instruction array, and comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor. The method also includes recognizing a sample point upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes. The run-time instrumentation information is obtained from the sample point. The method further includes storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group. 1. A computer implemented method for implementing run-time instrumentation indirect sampling by instruction operation code , the method comprising:reading sample-point instruction operation codes from a sample-point instruction array;comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor;recognizing a sample point upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes, wherein run-time instrumentation information is obtained from the sample point; andstoring the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group.2. The method of claim 1 , wherein the run-time instrumentation information comprises run-time instrumentation event records collected in a collection buffer of the processor and the reporting group further comprises system information records in combination with the run-time instrumentation event records.3. The method of ...

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION SAMPLING IN TRANSACTIONAL-EXECUTION MODE

Number: US20130246774A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by address. An aspect of the invention includes a method for implementing run-time instrumentation indirect sampling by address. The method includes reading sample-point addresses from a sample-point address array, and comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor. The method further includes recognizing a sample point upon execution of the instruction associated with the address matching one of the sample-point addresses. Run-time instrumentation information is obtained from the sample point. The method also includes storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group. 1. A computer implemented method for implementing run-time instrumentation indirect sampling by address , the method comprising:reading sample-point addresses from a sample-point address array;comparing, by a processor, the sample-point addresses to an address associated with an instruction from an instruction stream executing on the processor;recognizing a sample point upon execution of the instruction associated with the address matching one of the sample-point addresses, wherein run-time instrumentation information is obtained from the sample point; andstoring the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group.2. The method of claim 1 , wherein the address associated with the instruction claim 1 , based on address type claim 1 , is one of: an address of the instruction and an address of an operand of the instruction.3. The method of claim 1 , further comprising:initializing a run-time-instrumentation control based on executing a load run-time instrumentation controls (LRIC) instruction, the LRIC instruction establishing a sampling mode and a sample-point address (SPA) control.4. The ...

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION SAMPLING IN TRANSACTIONAL-EXECUTION MODE

Number: US20130246775A1

Embodiments of the invention relate to implementing run-time instrumentation sampling in transactional-execution mode. An aspect of the invention includes a method for implementing run-time instrumentation sampling in transactional-execution mode. The method includes determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction. The method also includes interlocking completion of storage operations of the instructions to prevent instruction-directed storage until completion of the transaction. The method further includes recognizing a sample point during execution of the instructions while in the transactional-execution mode. The method additionally includes run-time-instrumentation-directed storing, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point. 1. A computer implemented method for implementing run-time instrumentation sampling in transactional-execution mode , the method comprising:determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction;interlocking completion of storage operations of the instructions to prevent instruction-directed storage until completion of the transaction;recognizing a sample point during execution of the instructions while in the transactional-execution mode; andrun-time-instrumentation-directed storing, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point.2. The method of claim 1 , wherein run-time-instrumentation-directed storing the run-time instrumentation information obtained at the sample point further comprises:collecting run-time instrumentation events in a collection buffer while in the transactional-execution mode;deferring storage of the collected run-time ...
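
Below is a C sketch of the defer-and-flush behaviour described above: while the transaction runs, instrumentation events are only collected; they are written to the program buffer when the transaction completes successfully and are discarded on abort. All names and the flat buffer layout are assumptions.

    #include <stddef.h>
    #include <stdint.h>

    #define MAX_TX_EVENTS 32  /* assumed collection-buffer capacity */

    struct tx_collection {              /* events gathered while transactional */
        uint64_t events[MAX_TX_EVENTS];
        size_t   count;
    };

    /* Record a sample point seen inside the transaction; storage is deferred. */
    void collect_event(struct tx_collection *c, uint64_t event)
    {
        if (c->count < MAX_TX_EVENTS)
            c->events[c->count++] = event;
    }

    /* At transaction end, store the collected events only on successful
     * completion; on abort they are simply dropped. Illustrative sketch. */
    void end_transaction(struct tx_collection *c, int committed,
                         uint64_t *program_buffer, size_t *used, size_t capacity)
    {
        if (committed) {
            for (size_t i = 0; i < c->count && *used < capacity; i++)
                program_buffer[(*used)++] = c->events[i];
        }
        c->count = 0;
    }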

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION INDIRECT SAMPLING BY INSTRUCTION OPERATION CODE

Number: US20130247009A1

Embodiments of the invention relate to implementing run-time instrumentation indirect sampling by instruction operation code. An aspect of the invention includes reading sample-point instruction operation codes from a sample-point instruction array, and comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor. A sample point is recognized upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes. The run-time instrumentation information is obtained from the sample point. The run-time instrumentation information is stored in a run-time instrumentation program buffer as a reporting group. 1. A computer program product for implementing run-time instrumentation indirect sampling by instruction operation code , the computer program product comprising: reading sample-point instruction operation codes from a sample-point instruction array;', 'comparing, by a processor, the sample-point instruction operation codes to an operation code of an instruction from an instruction stream executing on the processor;', 'recognizing a sample point upon execution of the instruction with the operation code matching one of the sample-point instruction operation codes, wherein run-time instrumentation information is obtained from the sample point; and', 'storing the run-time instrumentation information in a run-time instrumentation program buffer as a reporting group., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the run-time instrumentation information comprises run-time instrumentation event records collected in a collection buffer of the processor and the reporting group further comprises system information records in combination with the run-time ...

Publication date: 19-09-2013

RUN-TIME INSTRUMENTATION SAMPLING IN TRANSACTIONAL-EXECUTION MODE

Number: US20130247010A1

Embodiments of the invention relate to implementing run-time instrumentation sampling in transactional-execution mode. An aspect of the invention includes run time instrumentation sampling in transactional execution mode. The method includes determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction. Completion of storage operations of the instructions is interlocked to prevent instruction-directed storage until completion of the transaction. A sample point is recognized during execution of the instructions while in the transactional-execution mode. Run-time-instrumentation-directed storing is performed, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point. 1. A computer program product for implementing run-time instrumentation sampling in transactional-execution mode , the computer program product comprising: determining, by a processor, that the processor is configured to execute instructions of an instruction stream in a transactional-execution mode, the instructions defining a transaction;', 'interlocking completion of storage operations of the instructions to prevent instruction-directed storage until completion of the transaction;', 'recognizing a sample point during execution of the instructions while in the transactional-execution mode; and', 'run-time-instrumentation-directed storing, upon successful completion of the transaction, run-time instrumentation information obtained at the sample point., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein run-time-instrumentation-directed storing the run-time instrumentation information obtained at the sample point further comprises:collecting run-time instrumentation ...

Publication date: 19-09-2013

TRANSFORMATION OF A PROGRAM-EVENT-RECORDING EVENT INTO A RUN-TIME INSTRUMENTATION EVENT

Number: US20130247011A1

Embodiments of the invention relate to transforming a program-event-recording event into a run-time instrumentation event. An aspect of the invention includes enabling run-time instrumentation for collecting instrumentation information of an instruction stream executing on a processor. Detecting is performed, by the processor, of a program-event-recording (PER) event, the PER event associated with the instruction stream executing on the processor. A PER event record is written to a collection buffer as a run-time instrumentation event based on detecting the PER event, the PER event record identifying the PER event. 1. A computer program product for transforming a program-event-recording event into a run-time instrumentation event , the computer program product comprising: enabling run-time instrumentation for collecting instrumentation information of an instruction stream executing on a processor;', 'detecting, by the processor, a program-event-recording (PER) event, the PER event associated with the instruction stream executing on the processor; and', 'writing a PER event record to a collection buffer as a run-time instrumentation event based on detecting the PER event, the PER event record identifying the PER event., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein detecting the PER event further comprises:intercepting a PER interruption condition associated with the PER event such that the instruction stream executing on the processor is not interrupted by a PER interruption associated with the PER interruption condition.3. The computer program product of claim 1 , further comprising:configuring a run-time-instrumentation control based on executing a load run-time instrumentation controls (LRIC) instruction, the LRIC instruction enabling run-time instrumentation PER controls.4. The computer program ...

Publication date: 19-09-2013

TRANSFORMATION OF A PROGRAM-EVENT-RECORDING EVENT INTO A RUN-TIME INSTRUMENTATION EVENT

Number: US20130247012A1

Embodiments of the invention relate to transforming a program-event-recording event into a run-time instrumentation event. An aspect of the invention includes a method for transforming a program-event-recording event into a run-time instrumentation event. The method includes enabling run-time instrumentation for collecting instrumentation information of an instruction stream executing on a processor. The method also includes detecting, by the processor, a program-event-recording (PER) event, the PER event associated with the instruction stream executing on the processor. The method further includes writing a PER event record to a collection buffer as a run-time instrumentation event based on detecting the PER event, the PER event record identifying the PER event. 1. A computer implemented method for transforming a program-event-recording event into a run-time instrumentation event , the method comprising:enabling run-time instrumentation for collecting instrumentation information of an instruction stream executing on a processor;detecting, by the processor, a program-event-recording (PER) event, the PER event associated with the instruction stream executing on the processor; andwriting a PER event record to a collection buffer as a run-time instrumentation event based on detecting the PER event, the PER event record identifying the PER event.2. The method of claim 1 , wherein detecting the PER event further comprises:intercepting a PER interruption condition associated with the PER event such that the instruction stream executing on the processor is not interrupted by a PER interruption associated with the PER interruption condition.3. The method of claim 1 , further comprising:configuring a run-time-instrumentation control based on executing a load run-time instrumentation controls (LRIC) instruction, the LRIC instruction enabling run-time instrumentation PER controls.4. The method of claim 3 , further comprising:setting a K-bit based on executing the LRIC ...

Publication date: 03-10-2013

HYBRID ADDRESS TRANSLATION

Number: US20130262815A1

Embodiments of the invention relate to hybrid address translation. An aspect of the invention includes receiving a first address, the first address referencing a location in a first address space. The computer searches a segment lookaside buffer (SLB) for a SLB entry corresponding to the first address; the SLB entry comprising a type field and an address field and determines whether a value of the type field in the SLB entry indicates a hashed page table (HPT) search or a radix tree search. Based on determining that the value of the type field indicates the HPT search, a HPT is searched to determine a second address, the second address comprising a translation of the first address into a second address space; and based on determining that the value of the type field indicates the radix tree search, a radix tree is searched to determine the second address. 1. A computer implemented method for hybrid address translation in a computer , the method comprising:receiving a first address, the first address referencing a location in a first address space;searching, by the computer, a segment lookaside buffer (SLB) for a SLB entry corresponding to the first address, the SLB entry comprising a type field and an address field;determining whether a value of the type field in the SLB entry indicates a hashed page table (HPT) search or a radix tree search;based on determining that the value of the type field indicates the HPT search, searching a HPT to determine a second address, the second address comprising a translation of the first address into a second address space; andbased on determining that the value of the type field indicates the radix tree search, searching a radix tree to determine the second address.2. The method of claim 1 , wherein searching the HPT to determine the second address comprises:extracting a virtual address associated with the first address from the address field of the SLB entry corresponding to the first address; andsearching the HPT for the virtual ...
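
Below is a C sketch of the dispatch described above: the type field of the matching SLB entry selects either a hashed-page-table walk or a radix-tree walk. The two walkers are trivial placeholders so the sketch is self-contained; real implementations would search the HPT or descend the radix tree.

    #include <stdint.h>

    enum slb_type { SLB_HPT, SLB_RADIX };   /* value of the SLB entry's type field */

    struct slb_entry {
        enum slb_type type;     /* selects the second-level translation mechanism */
        uint64_t      address;  /* address field: virtual address base or radix root */
    };

    /* Placeholder walkers, standing in for the HPT search and radix search. */
    static uint64_t hpt_translate(uint64_t va, uint64_t base)   { return base + va; }
    static uint64_t radix_translate(uint64_t va, uint64_t root) { return root + va; }

    /* Hybrid translation: the per-segment type field decides which walk runs. */
    uint64_t translate(const struct slb_entry *e, uint64_t effective_address)
    {
        return (e->type == SLB_HPT)
            ? hpt_translate(effective_address, e->address)
            : radix_translate(effective_address, e->address);
    }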

Publication date: 03-10-2013

HYBRID ADDRESS TRANSLATION

Number: US20130262817A1

Embodiments of the invention relate to hybrid address translation. An aspect of the invention includes receiving a first address, the first address referencing a location in a first address space. The computer searches a segment lookaside buffer (SLB) for a SLB entry corresponding to the first address; the SLB entry comprising a type field and an address field and determines whether a value of the type field in the SLB entry indicates a hashed page table (HPT) search or a radix tree search. Based on determining that the value of the type field indicates the HPT search, a HPT is searched to determine a second address, the second address comprising a translation of the first address into a second address space; and based on determining that the value of the type field indicates the radix tree search, a radix tree is searched to determine the second address. 1. A computer program product for implementing hybrid address translation , the computer program product comprising: receiving a first address, the first address referencing a location in a first address space;', 'searching a segment lookaside buffer (SLB) for a SLB entry corresponding to the first address; the SLB entry comprising a type field and an address field;', 'determining whether a value of the type field in the SLB entry indicates a hashed page table (HPT) search or a radix tree search;', 'based on determining that the value of the type field indicates the HPT search, searching a HPT to determine a second address, the second address comprising a translation of the first address into a second address space; and', 'based on determining that the value of the type field indicates the radix tree search, searching a radix tree to determine the second address., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein searching the HPT to determine the second ...

Publication date: 03-10-2013

PERFORMING PREDECODE-TIME OPTIMIZED INSTRUCTIONS IN CONJUNCTION WITH PREDECODE TIME OPTIMIZED INSTRUCTION SEQUENCE CACHING

Number: US20130262821A1

A method for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to the determining that the first instruction and second instruction can be optimized, the method includes, preforming a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction can not be optimized, the method includes, storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache. 1. A computer program product for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching , the computer program product comprising:a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence;determining if the first instruction and the second instruction can be optimized; preforming a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction; and', 'storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache;, 'responsive to the determining that the first instruction and second instruction can be optimizedresponsive to the determining ...

Publication date: 03-10-2013

CACHING OPTIMIZED INTERNAL INSTRUCTIONS IN LOOP BUFFER

Number: US20130262822A1

Embodiments of the invention relate to a computer system for storing an internal instruction loop in a loop buffer. The computer system includes a loop buffer and a processor. The computer system is configured to perform a method including fetching instructions from memory to generate an internal instruction to be executed, detecting a beginning of a first instruction loop in the instructions, determining that a first internal instruction loop corresponding to the first instruction loop is not stored in the loop buffer, fetching the first instruction loop, optimizing one or more instructions corresponding to the first instruction loop to generate a first optimized internal instruction loop, and storing the first optimized internal instruction loop in the loop buffer based on the determination that the first internal instruction loop is not stored in the loop buffer. 1. A computer program product for implementing an instruction loop buffer , the computer program product comprising: fetching instructions from memory to generate an internal instruction to be executed;', 'determining, by a processor, that a first instruction from the instructions corresponds to a first instruction loop;', 'determining that a first internal instruction loop corresponding to the first instruction loop is not stored in a loop buffer;', 'optimizing one or more internal instructions of the first instruction loop; and', 'storing a resulting first optimized internal instruction loop in the loop buffer based on the determining that the first internal instruction loop is not stored in the loop buffer., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein optimizing the one or more instructions includes merging at least two machine instructions of the one or more instructions to generate an optimized internal instruction claim 1 , andthe ...

Publication date: 03-10-2013

INSTRUCTION MERGING OPTIMIZATION

Number: US20130262823A1

A computer system for optimizing instructions includes a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize instructions and memory to store machine instructions to be executed by the instruction execution unit. The computer system is configured to perform a method including analyzing machine instructions from among a stream of instructions to be executed by the instruction execution unit, the machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the machine instructions as being eligible for optimization, merging the machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction. 1. A computer system for optimizing instructions , the computer system comprising:a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize two or more instructions; and 'analyzing the two or more machine instructions from among a stream of instructions to be executed by the instruction execution unit, the two or more machine instructions including a memory load instruction and a data processing instruction to perform a data processing function based on the memory load instruction, identifying the two or more machine instructions as being eligible for optimization, merging the two or more machine instructions into a single optimized internal instruction, and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.', 'memory to store two or more ...
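
Below is a C sketch of the eligibility test and merge described above, using a made-up instruction record: a memory load followed by a data-processing instruction that consumes the loaded register is fused into one internal load-and-operate instruction. The encoding and opcode names are illustrative only.

    #include <stdbool.h>
    #include <stdint.h>

    enum opcode { OP_LOAD, OP_ADD, OP_LOAD_ADD /* merged internal operation */ };

    struct insn {                 /* simplified, hypothetical instruction record */
        enum opcode op;
        uint8_t     target;       /* target register */
        uint8_t     source;       /* source register (or load base) */
    };

    /* Merge a load and a dependent data-processing instruction into a single
     * optimized internal instruction when they are eligible. */
    bool try_merge(const struct insn *load, const struct insn *use,
                   struct insn *merged)
    {
        if (load->op == OP_LOAD && use->op == OP_ADD &&
            use->source == load->target) {
            merged->op     = OP_LOAD_ADD;
            merged->target = use->target;
            merged->source = load->source;   /* base of the original memory load */
            return true;
        }
        return false;
    }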

Подробнее
03-10-2013 дата публикации

DECODE TIME INSTRUCTION OPTIMIZATION FOR LOAD RESERVE AND STORE CONDITIONAL SEQUENCES

Номер: US20130262829A1
Автор: Gschwind Michael K.

A technique is provided for replacing an atomic sequence. A processing circuit receives the atomic sequence. The processing circuit detects the atomic sequence. The processing circuit generates an internal atomic operation to replace the atomic sequence. 1. A computer program product for replacing an atomic sequence , the computer program product comprising:a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:receiving, by the processing circuit, the atomic sequence;detecting, by the processing circuit, the atomic sequence; andgenerating, by the processing circuit, an internal atomic operation to replace the atomic sequence.2. The computer program product of claim 1 , wherein the method further comprises executing the internal atomic operation in place of the atomic sequence.3. The computer program product of claim 1 , wherein the atomic sequence comprises a load reserve instruction and a store conditional instruction.4. The computer program product of claim 3 , wherein detecting the atomic sequence comprises recognizing the load reserve instruction and the store conditional instruction to detect the atomic sequence that needs to be replaced.5. The computer program product of claim 1 , further comprising:based on separate instructions of the atomic sequence not being a same group and based on the separate instructions being positioned to execute separately, configuring an instruction decoder to perform instruction cache marking of the separate instructions that are not in the same group to force a load reserve instruction and a store conditional instruction into the same group;configuring the instruction decoder to initially mark the load reserve instruction based on the load reserve instruction being detected first in the separate instructions or initially mark the store conditional instruction based on the store conditional instruction being detected first in ...
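The following hedged C sketch models how a decoder might recognize the canonical load-reserve/store-conditional retry pattern and substitute one internal atomic operation for it; the opcode enumeration and the replace_atomic_sequence helper are assumptions made for illustration, not the patent's implementation.

#include <stdio.h>

/* Minimal opcode model for the pattern the decoder looks for. */
enum op { OP_LARX, OP_ADD, OP_STCX, OP_BNE, OP_IATOMIC_ADD, OP_OTHER };

struct insn { enum op op; };

/* Recognize the larx / modify / stcx. / retry-branch sequence and stand in a
 * single internal atomic read-modify-write operation for it. */
static int replace_atomic_sequence(const struct insn *seq, int n, struct insn *out)
{
    if (n >= 4 &&
        seq[0].op == OP_LARX &&
        seq[1].op == OP_ADD  &&
        seq[2].op == OP_STCX &&
        seq[3].op == OP_BNE) {
        out->op = OP_IATOMIC_ADD;     /* one internal op, no retry loop needed */
        return 1;
    }
    return 0;
}

int main(void)
{
    struct insn seq[4] = { {OP_LARX}, {OP_ADD}, {OP_STCX}, {OP_BNE} };
    struct insn internal;
    printf("replaced: %d\n", replace_atomic_sequence(seq, 4, &internal));
    return 0;
}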

Подробнее
03-10-2013 дата публикации

INSTRUCTION MERGING OPTIMIZATION

Номер: US20130262839A1

A computer system for optimizing instructions is configured to identify two or more machine instructions as being eligible for optimization, to merge the two or more machine instructions into a single optimized internal instruction that is configured to perform functions of the two or more machine instructions, and to execute the single optimized internal instruction to perform the functions of the two or more machine instructions. Being eligible includes determining that the two or more machine instructions include a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed. 1. A computer system for optimizing instructions , the computer system comprising:a processor including an instruction execution unit configured to execute instructions and an instruction optimization unit configured to optimize two or more instructions; andmemory to store two or more machine instructions to be executed by the instruction execution unit, identifying the two or more machine instructions as being eligible for optimization, wherein the being eligible comprises determining that the two or more machine instructions comprise a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register, wherein the second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed;', 'merging the two or more machine instructions into a single optimized internal instruction that is configured to ...
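As an illustration of the eligibility test, here is a small C sketch assuming a PowerPC-like addis/addi pair; the fused opcode name and the try_merge_pair helper are hypothetical. The second instruction names the first instruction's target as both source and target, so the pair can be merged into one internal operation with a single target register and a combined immediate.

#include <stdio.h>
#include <stdint.h>

enum op { OP_ADDIS, OP_ADDI, OP_FUSED_LI32, OP_OTHER };

/* rt = target register, ra = source register, imm = 16-bit immediate. */
struct insn { enum op op; int rt, ra; int32_t imm; };

/* Eligible when the second instruction names the first instruction's target
 * as both its source and its target; the merged internal op then has a
 * single target register and performs both functions at once. */
static int try_merge_pair(const struct insn *a, const struct insn *b, struct insn *out)
{
    if (a->op == OP_ADDIS && b->op == OP_ADDI &&
        b->ra == a->rt && b->rt == a->rt) {
        out->op  = OP_FUSED_LI32;
        out->rt  = a->rt;                        /* single target register */
        out->ra  = a->ra;
        out->imm = (a->imm << 16) + b->imm;      /* combined 32-bit immediate */
        return 1;
    }
    return 0;
}

int main(void)
{
    struct insn hi = { OP_ADDIS, 4, 0, 0x1234 };  /* r4 <- 0x1234 << 16 */
    struct insn lo = { OP_ADDI,  4, 4, 0x0056 };  /* r4 <- r4 + 0x56    */
    struct insn fused;
    if (try_merge_pair(&hi, &lo, &fused))
        printf("fused immediate = 0x%x\n", (unsigned)fused.imm);
    return 0;
}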

Подробнее
03-10-2013 дата публикации

INSTRUCTION MERGING OPTIMIZATION

Номер: US20130262840A1

A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization. Eligibility is based on a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register. The method includes merging the two or more machine instructions into a single optimized internal instruction that is configured to perform first and second functions of two or more machine instructions employing operands specified by the two or more machine instructions. The single optimized internal instruction specifies the first target register only as a single target register and the single optimized internal instruction specifies the first and second functions to be performed. The method includes executing the single optimized internal instruction to perform the first and second functions of the two or more instructions. 1. A computer-implemented method comprising:determining that two or more instructions of an instruction stream are eligible for optimization, wherein the being eligible comprises determining that the two or more machine instructions comprise a first instruction specifying a first target register and a second instruction specifying the first target register as a source register and a target register, wherein the second instruction is a next sequential instruction of the first instruction in program order, wherein the first instruction specifies a first function to be performed, and the second instruction specifies a second function to be performed;merging the two or more machine instructions into a single optimized internal instruction that is configured to perform the first and second functions of the two or more machine instructions employing operands specified by the two or more machine instructions, wherein the single optimized internal instruction specifies the first target register only as a single target register, wherein ...

Подробнее
03-10-2013 дата публикации

INSTRUCTION MERGING OPTIMIZATION

Номер: US20130262841A1

A computer-implemented method includes determining that two or more instructions of an instruction stream are eligible for optimization, where the two or more instructions include a memory load instruction and a data processing instruction to process data based on the memory load instruction. The method includes merging, by a processor, the two or more instructions into a single optimized internal instruction and executing the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction. 1. A computer-implemented method comprising:determining that two or more instructions of an instruction stream are eligible for optimization, the two or more instruction including a memory load instruction and a data processing instruction to process data based on the memory load instruction;merging, by a processor, the two or more instructions into a single optimized internal instruction; andexecuting the single optimized internal instruction to perform a memory load function and a data processing function corresponding to the memory load instruction and the data processing instruction.2. The computer-implemented method of claim 1 , wherein executing the single optimized internal instruction includes executing the single optimized internal instruction instead of two or more separate internal instructions corresponding to the two or more instructions of the instruction stream.3. The computer-implemented method of claim 1 , further comprising storing the single optimized internal instruction in a single instruction slot of a queue claim 1 ,wherein executing the single optimized internal instruction includes fetching the single optimized internal instruction from the queue and generating from the single optimized internal instruction two or more separate internal instructions corresponding to the memory load instruction and the data processing instruction.4. The ...

Подробнее
03-10-2013 дата публикации

METHOD AND APPARATUS FOR EFFICIENT INTER-THREAD SYNCHRONIZATION FOR HELPER THREADS

Номер: US20130263145A1

A monitor bit per hardware thread in a memory location may be allocated, in a multiprocessing computer system having a plurality of hardware threads, the plurality of hardware threads sharing the memory location, and each of the allocated monitor bits corresponding to one of the plurality of hardware threads. A condition bit may be allocated for each of the plurality of hardware threads, the condition bit being allocated in each context of the plurality of hardware threads. In response to detecting the memory location being accessed, it is determined whether a monitor bit corresponding to a hardware thread in the memory location is set. In response to determining that the monitor bit corresponding to a hardware thread is set in the memory location, a condition bit corresponding to a thread accessing the memory location is set in the hardware thread's context. 1. A method of synchronizing threads, comprising: allocating a bit per hardware thread in a memory location, in a multiprocessing computer system having a plurality of hardware threads, the plurality of hardware threads sharing the memory location, and each of the allocated bits corresponding to one of the plurality of hardware threads; allocating a condition bit for each of the plurality of hardware threads, the condition bit being allocated in each context of the plurality of hardware threads; in response to detecting the memory location being accessed, determining whether a bit corresponding to a hardware thread in the memory location is set; in response to determining that the bit corresponding to a hardware thread is set in the memory location, setting a condition bit corresponding to a thread accessing the memory location, in the hardware thread's context. 2. The method of claim 1, wherein the memory location is a cache line in cache memory. 3. The method of claim 2, wherein a helper hardware thread performing data prefetching for an application hardware thread sets the bit in the memory location to monitor ...
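A simplified C sketch of the monitor-bit/condition-bit handshake, with invented names (monitored_line, arm_monitor, access_line) standing in for the hardware structures: a helper thread sets its monitor bit on a shared location, and any later access to that location sets the condition bit in the watcher's context.

#include <stdint.h>
#include <stdio.h>

#define NUM_HW_THREADS 4

/* A monitored memory location: the data plus one monitor bit per hardware
 * thread that asked to be notified when the location is touched. */
struct monitored_line {
    uint64_t data;
    uint8_t  monitor_bits;               /* bit t set => thread t is watching */
};

/* Per-hardware-thread context; a single condition bit suffices here. */
struct hw_thread_ctx { uint8_t condition_bit; };

static struct hw_thread_ctx ctx[NUM_HW_THREADS];

/* A helper thread arms the monitor, e.g. a prefetching helper watching for
 * the application thread to reach a shared work-queue entry. */
static void arm_monitor(struct monitored_line *line, int hw_thread)
{
    line->monitor_bits |= (uint8_t)(1u << hw_thread);
}

/* Any access to the line notifies every watching thread by setting the
 * condition bit in that thread's context. */
static uint64_t access_line(struct monitored_line *line, int accessor)
{
    (void)accessor;                      /* identity of the accessor not needed here */
    for (int t = 0; t < NUM_HW_THREADS; t++)
        if (line->monitor_bits & (1u << t))
            ctx[t].condition_bit = 1;
    return line->data;
}

int main(void)
{
    struct monitored_line line = { 42, 0 };
    arm_monitor(&line, 2);               /* helper on hardware thread 2 watches */
    access_line(&line, 0);               /* application thread 0 accesses       */
    printf("thread 2 condition bit = %d\n", ctx[2].condition_bit);
    return 0;
}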

Подробнее
03-10-2013 дата публикации

OPTIMIZING SUBROUTINE CALLS BASED ON ARCHITECTURE LEVEL OF CALLED SUBROUTINE

Номер: US20130263153A1
Автор: Gschwind Michael K.

A technique is provided for generating stubs. A processing circuit receives a call to a called function. The processing circuit retrieves a called function property of the called function. The processing circuit generates a stub for the called function based on the called function property. 1. A computer implemented method for generating stubs , the method comprising:receiving, by a processing circuit, a call to a called function;retrieving, by the processing circuit, a called function property of the called function; andgenerating, by the processing circuit, a stub for the called function based on the called function property.2. The computer implemented method of claim 1 , further comprising:determining that the called function and a calling function are together in a shared library; ordetermining that the called function is in another shared library, the another shared library being external to the shared library.3. The computer implemented method of claim 2 , further comprising optimizing instructions in the stub based on the called function being in the shared library with the calling function.4. The computer implemented method of claim 2 , further comprising optimizing instructions in the stub based on the called function being in the another shared library.5. The computer implemented method of claim 1 , further comprising:optimizing instructions in the stub based on a near call distance responsive to determining that the called function is reachable with a memory address offset from a branch in the stub, wherein the near call distance does not require full address bits; andoptimizing the stub based on a far call distance responsive to determining that the called function requires more address bits than provided in the near call distance, wherein the far call distance requires the full address bits.6. The computer implemented method of claim 1 , wherein the stub is generated by an operating system service.7. The computer implemented method of claim 1 , further ...
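One way to picture the stub-generation decision is the following C sketch; the structure fields, the 26-bit branch reach and the choose_stub helper are assumptions used only for illustration. Same-library callees get a direct call, external callees get a near stub when the displacement fits, and a far stub (requiring the full address bits) otherwise.

#include <stdint.h>
#include <stdio.h>

/* Property of the called function as seen by the stub generator. */
struct callee_prop {
    uint64_t address;
    int      same_shared_library;     /* callee lives in the caller's library */
};

enum stub_kind { STUB_LOCAL_DIRECT, STUB_NEAR, STUB_FAR };

/* Pick a stub flavor: direct call for same-library targets, a short
 * PC-relative stub when the displacement fits, a full-address stub otherwise. */
static enum stub_kind choose_stub(uint64_t call_site, const struct callee_prop *p)
{
    if (p->same_shared_library)
        return STUB_LOCAL_DIRECT;
    int64_t disp = (int64_t)(p->address - call_site);
    if (disp >= -(1LL << 25) && disp < (1LL << 25))   /* assumed 26-bit branch reach */
        return STUB_NEAR;
    return STUB_FAR;
}

int main(void)
{
    struct callee_prop near_fn = { 0x10000400ull,     0 };
    struct callee_prop far_fn  = { 0x7f0000000000ull, 0 };
    printf("%d %d\n", choose_stub(0x10000000ull, &near_fn),
                      choose_stub(0x10000000ull, &far_fn));
    return 0;
}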

Подробнее
28-11-2013 дата публикации

COMPILING CODE FOR AN ENHANCED APPLICATION BINARY INTERFACE (ABI) WITH DECODE TIME INSTRUCTION OPTIMIZATION

Номер: US20130318510A1
Принадлежит:

Generating decode time instruction optimization (DTIO) object code that enables a DTIO enabled processor to optimize execution of DTIO instructions. A code sequence configured to facilitate DTIO in a DTIO enabled processor is identified by a computer. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A schedule associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified schedule that is configured to place the first instruction next to the second instruction. An object file is generated based on the modified schedule. The object file includes the first instruction placed next to the second instruction. The object file is emitted. 1. A computer program product for generating decode time instruction optimization (DTIO) object code , wherein the DTIO object code enables a DTIO enabled processor to optimize execution of DTIO instructions , the computer program product comprising:a tangible storage medium readable by a processing circuit and storing instructions for execution by processing circuit for performing a method comprising:identifying, by a computer, a code sequence configured to facilitate DTIO in a DTIO enabled processor, the code sequence comprising an internal representation (IR) of a first instruction and an IR of a second instruction, the second instruction dependent on the first instruction;modifying a schedule associated with at least one of the IR of the first instruction and the IR of the second instruction, the modifying including generating a modified schedule that is configured to place the first instruction next to the second instruction;generating a DTIO configured object file based on the modified schedule, the object file including the first instruction placed next to the second instruction; andemitting the object file.2. ...
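A toy C sketch of the scheduling step, assuming a drastically simplified IR; the schedule_for_dtio helper ignores every hazard other than the single recorded dependence. A consumer that depends on a DTIO-candidate producer is pulled next to that producer so the emitted object code presents the pair adjacently to a DTIO-enabled decoder.

#include <stdio.h>

#define MAX_IR 16

/* Toy IR node: an id and the id of the producer it depends on (-1 = none). */
struct ir { int id; int depends_on; };

/* Reorder the schedule so that each DTIO-candidate consumer is emitted
 * immediately after its producer. */
static int schedule_for_dtio(const struct ir *in, int n, struct ir *out)
{
    int emitted[MAX_IR] = {0}, m = 0;
    for (int i = 0; i < n; i++) {
        if (emitted[i]) continue;
        out[m++] = in[i];
        emitted[i] = 1;
        for (int j = i + 1; j < n; j++)          /* pull the consumer forward */
            if (!emitted[j] && in[j].depends_on == in[i].id) {
                out[m++] = in[j];
                emitted[j] = 1;
                break;
            }
    }
    return m;
}

int main(void)
{
    struct ir in[4] = { {0, -1}, {1, -1}, {2, 0}, {3, -1} };  /* node 2 depends on node 0 */
    struct ir out[MAX_IR];
    int m = schedule_for_dtio(in, 4, out);
    for (int i = 0; i < m; i++) printf("%d ", out[i].id);
    printf("\n");                                             /* prints: 0 2 1 3 */
    return 0;
}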

Подробнее
19-12-2013 дата публикации

MANAGING PAGE TABLE ENTRIES

Номер: US20130339651A1

Embodiments relate to managing page table entries in a processing system. A first page table entry (PTE) of a page table for translating virtual addresses to main storage addresses is identified. The page table includes a second page table entry contiguous with the first page table entry. It is determined whether the first PTE may be joined with the second PTE, based on the respective pages of main storage being contiguous. A marker is set in the page table for indicating that the main storage pages identified by the first and second PTEs are contiguous. 1. A computer program product for managing page table entries in a processing system, the computer program product comprising: a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: identifying, by a processor, a first page table entry (PTE) of a page table for translating virtual addresses to main storage addresses, the page table comprising a second page table entry contiguous with the first page table entry; determining, with the processor, whether the first PTE may be joined with the second PTE, the determining based on the respective pages of main storage being contiguous; and setting a marker in the page table for indicating that the main storage pages identified by the first and second PTEs are contiguous. 2. The computer program product of claim 1, further comprising performing an address translation of a virtual address comprising: based on the virtual address, obtaining the first PTE; and based on the marker, using the first PTE to translate virtual addresses to both the first page and the second page absent accessing the second PTE. 3. The computer program product of claim 1, wherein the method further comprises executing a translation lookaside buffer (TLB) invalidate instruction for invalidating TLB entries associated with the first PTE and second PTE. 4. The computer program product of ...
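The joining test can be sketched in C as follows, with an assumed 64-bit PTE layout; the PTE_JOINED marker bit and the try_join helper are illustrative, not the architected format. Two adjacent entries are joined only when their real pages are contiguous, and the marker is then set in the first entry.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE    4096ull
#define PTE_VALID    (1ull << 0)
#define PTE_JOINED   (1ull << 1)    /* marker: this PTE and the next map contiguous storage */
#define PTE_RPN_MASK (~0xfffull)    /* real page number field (4 KB aligned) */

typedef uint64_t pte_t;

/* Join two adjacent PTEs when they map adjacent pages of main storage, so a
 * later translation can cover both pages from the first entry alone. */
static int try_join(pte_t *pte0, const pte_t *pte1)
{
    if (!(*pte0 & PTE_VALID) || !(*pte1 & PTE_VALID))
        return 0;
    uint64_t real0 = *pte0 & PTE_RPN_MASK;
    uint64_t real1 = *pte1 & PTE_RPN_MASK;
    if (real1 != real0 + PAGE_SIZE)
        return 0;                   /* main-storage pages are not contiguous */
    *pte0 |= PTE_JOINED;            /* set the marker in the page table */
    return 1;
}

int main(void)
{
    pte_t table[2] = { 0x20000 | PTE_VALID, 0x21000 | PTE_VALID };
    printf("joined: %d, pte0 = 0x%llx\n", try_join(&table[0], &table[1]),
           (unsigned long long)table[0]);
    return 0;
}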

Подробнее
19-12-2013 дата публикации

Radix Table Translation of Memory

Номер: US20130339652A1

Embodiments relate to managing memory page tables in a processing system. A request to access a desired block of memory is received. The request includes an effective address that includes an effective segment identifier (ESID) and a linear address, the linear address including a most significant portion and a byte index. An entry in a buffer that includes the ESID of the effective address is located. Based on the entry including a radix page table pointer (RPTP), performing: using the RPTP to locate a translation table of a hierarchy of translation tables, using the located translation table to translate the most significant portion of the linear address to obtain an address of a block of memory, and based on the obtained address, performing the requested access to the desired block of memory. 1. A computer program product for accessing a memory location in a processing system , the computer program product comprising:a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:receiving a request to access a desired block of memory, the request comprising an effective address that includes an effective segment identifier (ESID) and a linear address, the linear address comprising a most significant portion and a byte index;locating, by a processor, an entry, in a buffer, the entry including the ESID of the effective address; andbased on the entry including a radix page table pointer (RPTP), performing:using the RPTP to locate a translation table of a hierarchy of translation tables;using the located translation table to translate the most significant portion of the linear address to obtain an address of a block of memory; andbased on the obtained address, performing the requested access to the desired block of memory.2. The computer product of claim 1 , wherein based on the entry including a VSID claim 1 , performing locating a page table entry of a group of ...
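A compact C sketch of the lookup, under invented names (seg_entry, radix_walk, hash_lookup) and arbitrary bit splits: the ESID selects a buffer entry, and the entry either supplies an RPTP for a radix walk of the linear address or a VSID for the hashed path.

#include <stdint.h>
#include <stdio.h>

/* Buffer entry found by the effective segment id (ESID): it either points at
 * a radix page table (RPTP) or carries a VSID for the hashed page table. */
struct seg_entry {
    uint64_t esid;
    int      has_rptp;
    uint64_t rptp;        /* root of a hierarchy of translation tables */
    uint64_t vsid;        /* used for the hashed translation path instead */
};

/* Placeholder walkers; real ones would follow the tables in memory. */
static uint64_t radix_walk(uint64_t rptp, uint64_t linear)  { return rptp + (linear >> 12); }
static uint64_t hash_lookup(uint64_t vsid, uint64_t linear) { return (vsid ^ linear) >> 12; }

static uint64_t translate(const struct seg_entry *e, uint64_t effective)
{
    uint64_t linear     = effective & 0x0fffffffffull;   /* bits below the ESID */
    uint64_t byte_index = effective & 0xfff;
    uint64_t block = e->has_rptp ? radix_walk(e->rptp, linear)
                                 : hash_lookup(e->vsid, linear);
    return (block << 12) | byte_index;                   /* address of the desired block */
}

int main(void)
{
    struct seg_entry e = { 0x1, 1, 0x5000, 0 };
    printf("real = 0x%llx\n", (unsigned long long)translate(&e, 0x123456ull));
    return 0;
}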

Подробнее
19-12-2013 дата публикации

Managing Accessing Page Table Entries

Номер: US20130339653A1

A system for accessing memory locations includes translating, by a processor, a virtual address to locate a first page table entry (PTE) in a page table. The first PTE includes a marker and an address of a page of main storage. It is determined whether a marker is set in the first PTE. The system identifies a large page size of a large page associated with the first PTE based on determining that the marker is set in the first PTE. The large page consists of contiguous pages of main storage. An origin address of the large page is determined based on determining that the marker is set in the first PTE. The virtual address is used to index into the large page at the origin address to access main storage. 1. A computer program product for managing page table entries in a processing system , the computer program product comprising: translating, by a processor, a virtual address to locate a first page table entry (PTE) in the page table, the first PTE comprising a marker and an address of a page of main storage;', 'determining, with the processor, whether a marker is set in the first PTE;', 'identifying a large page size of a large page associated with the first PTE based on determining that the marker is set in the first PTE, wherein the large page consists of contiguous pages of main storage;', 'determining an origin address of the large page based on determining that the marker is set in the first PTE; and', 'using the virtual address to index into the large page at the origin address to access main storage., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein a range of virtual addresses identify corresponding PTEs comprising said first PTE claim 1 , wherein each PTE of said corresponding PTEs is configured to address a page of main storage claim 1 , each of said pages of main storage being contiguous.3. The ...
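A minimal C sketch of the marker-driven translation, with an assumed PTE layout and a 64 KB large page chosen only for the example: when the marker is set, the origin of the large page is derived from the entry and the low virtual-address bits index into it.

#include <stdint.h>
#include <stdio.h>

#define BASE_PAGE_SHIFT 12
#define PTE_MARKER      (1ull << 1)   /* entry is part of a large, contiguous page */

struct pte { uint64_t real_addr; uint64_t flags; };

/* If the marker is set, the PTE belongs to a large page built from contiguous
 * 4 KB pages; translate by indexing from the large page's origin address. */
static uint64_t translate(const struct pte *p, uint64_t va,
                          unsigned large_page_shift /* e.g. 16 for 64 KB */)
{
    if (p->flags & PTE_MARKER) {
        uint64_t mask   = (1ull << large_page_shift) - 1;
        uint64_t origin = p->real_addr & ~mask;        /* origin of the large page */
        return origin | (va & mask);                   /* index into the large page */
    }
    return p->real_addr | (va & ((1ull << BASE_PAGE_SHIFT) - 1));
}

int main(void)
{
    struct pte p = { 0x240000, PTE_MARKER };
    printf("real = 0x%llx\n", (unsigned long long)translate(&p, 0x7fff3a84, 16));
    return 0;
}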

Подробнее
19-12-2013 дата публикации

Radix Table Translation of Memory

Номер: US20130339654A1

A method includes receiving a request to access a desired block of memory. The request includes an effective address that includes an effective segment identifier (ESID) and a linear address, the linear address comprising a most significant portion and a byte index. Locating an entry, in a buffer, the entry including the ESID of the effective address. Based on the entry including a radix page table pointer (RPTP), performing, using the RPTP to locate a translation table of a hierarchy of translation tables, using the located translation table to translate the most significant portion of the linear address to obtain an address of a block of memory, and based on the obtained address, performing the requested access to the desired block of memory. 1. A computer implemented method for accessing memory locations , the method comprising:receiving a request to access a desired block of memory, the request comprising an effective address that includes an effective segment identifier (ESID) and a linear address, the linear address comprising a most significant portion and a byte index;locating, by a processor, an entry, in a buffer, the entry including the ESID of the effective address;based on the entry including a radix page table pointer (RPTP), performing;using the RPTP to locate a translation table of a hierarchy of translation tables;using the located translation table to translate the most significant portion of the linear address to obtain an address of a block of memory; andbased on the obtained address, performing the requested access to the desired block of memory.2. The method of claim 1 , wherein based on the entry including a VSID claim 1 , performing locating a page table entry of a group of translation table entries using a hash function to obtain an address of a block of memory.3. The method of claim 1 , wherein the using the obtained address comprises using the byte index of the linear address and the obtained address to form an address of the desired block ...

Подробнее
19-12-2013 дата публикации

MANAGING PAGE TABLE ENTRIES

Номер: US20130339658A1

A method includes identifying, by a processor, a first page table entry (PTE) of a page table for translating virtual addresses to main storage addresses, the page table comprising a second page table entry contiguous with the first page table entry, determining with the processor whether the first PTE may be joined with the second PTE, the determining based on the respective pages of main storage being contiguous, and setting a marker in the page table for indicating that the main storage pages identified by the first and second PTEs are contiguous. 1. A computer implemented method for accessing memory locations, the method comprising: identifying, by a processor, a first page table entry (PTE) of a page table for translating virtual addresses to main storage addresses, the page table comprising a second page table entry contiguous with the first page table entry; determining, with the processor, whether the first PTE may be joined with the second PTE, the determining based on the respective pages of main storage being contiguous; and setting a marker in the page table for indicating that the main storage pages identified by the first and second PTEs are contiguous. 2. The method of claim 1, wherein the method further comprises performing an address translation of a virtual address comprising: based on the virtual address, obtaining the first PTE; and based on the marker, using the first PTE to translate virtual addresses to both the first page and the second page absent accessing the second PTE. 3. The method of claim 1, wherein the method further comprises executing a translation lookaside buffer (TLB) invalidate instruction for invalidating TLB entries associated with the first PTE and second PTE. 4. The method of claim 1, wherein the method further comprises starting a memory access routine for the first virtual address stored in the first page table entry (PTE) in the page table, wherein the memory access routine performs: ...

Подробнее
19-12-2013 дата публикации

MANAGING ACCESSING PAGE TABLE ENTRIES

Номер: US20130339659A1

A method for accessing memory locations includes translating, by a processor, a virtual address to locate a first page table entry (PTE) in a page table. The first PTE includes a marker and an address of a page of main storage. It is determined, by the processor, whether a marker is set in the first PTE. A large page size of a large page associated with the first PTE is identified based on determining that the marker is set in the first PTE. The large page is made up of contiguous pages of main storage. An origin address of the large page is determined based on determining that the marker is set in the first PTE. The virtual address is used to index into the large page at the origin address to access main storage. 1. A computer implemented method for accessing memory locations , the method comprising:translating, by a processor, a virtual address to locate a first page table entry (PTE) in the page table, the first PTE comprising a marker and an address of a page of main storage;determining, with the processor, whether a marker is set in the first PTE;identifying a large page size of a large page associated with the first PTE based on determining that the marker is set in the first PTE, wherein the large page consists of contiguous pages of main storage;determining an origin address of the large page based on determining that the marker is set in the first PTE; andusing the virtual address to index into the large page at the origin address to access main storage.2. The method of claim 1 , wherein a range of virtual addresses identify corresponding PTEs comprising said first PTE claim 1 , wherein each PTE of said corresponding PTEs is configured to address a page of main storage claim 1 , each of said pages of main storage being contiguous.3. The method of claim 1 , wherein the method further comprises:storing virtual address information and an address for locating said large page, in a translation look-aside buffer (TLB); andusing the TLB to translate virtual ...

Подробнее
23-01-2014 дата публикации

REDUCING REGISTER READ PORTS FOR REGISTER PAIRS

Номер: US20140025927A1

Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. It is determined if a pairing indicator associated with the pair of registers has a first value or a second value. The first value indicates that the wide operand is stored in a wide register, and the second value indicates that the wide operand is not stored in the wide register. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value, the wide operand is read from the pair of registers. An operation is performed using the wide operand. 1. A system for reducing a number of read ports for register pairs , the system comprising:a set of registers, and a set of wide registers, the set of registers and the set of wide registers being addressable by register fields of instructions; anda processing circuit coupled to said set of registers and said set of wide registers, configured to perform a method comprising:executing an instruction, the instruction identifying a pair of registers as containing a wide operand, the wide operand spanning the pair of registers, the executing comprising:determining whether a pairing indicator associated with the pair of registers, has a first value or a second value, the first value indicating the wide operand is stored in a wide register, the second value indicating the wide operand is not stored in the wide register;based on the pairing indicator having the first value, reading the wide operand from the wide register;based on the pairing indicator having the second value, reading the wide operand from the pair of registers; andperforming an operation using the wide operand.2. The system of claim 1 , wherein dataflow of execution units of the processor is as wide as the wide operand.3. The system of claim 1 , ...
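The read-port saving can be illustrated with the following C sketch; the regfile layout and the read_wide helper are assumptions. When the pairing indicator for a register pair is set, the 128-bit operand comes from one wide register read; otherwise both halves of the pair are read.

#include <stdint.h>
#include <stdio.h>

#define NUM_REGS 16

typedef struct { uint64_t hi, lo; } wide_t;   /* a 128-bit wide operand */

/* 64-bit architected registers plus wide registers; one pairing bit per
 * even/odd pair says whether the wide value is already in the wide register. */
struct regfile {
    uint64_t gpr[NUM_REGS];
    wide_t   wide[NUM_REGS / 2];
    uint8_t  paired[NUM_REGS / 2];     /* 1 = wide register holds the operand */
};

/* Read the wide operand named by even register r (i.e. the pair r, r+1):
 * a single wide read port suffices when the pairing indicator is set,
 * otherwise both halves are read from the register pair. */
static wide_t read_wide(const struct regfile *rf, int r)
{
    int pair = r / 2;
    if (rf->paired[pair])
        return rf->wide[pair];
    wide_t w = { rf->gpr[r], rf->gpr[r + 1] };
    return w;
}

int main(void)
{
    struct regfile rf = {0};
    rf.gpr[4] = 0x1111; rf.gpr[5] = 0x2222;   /* pairing bit clear for pair 2 */
    wide_t v = read_wide(&rf, 4);
    printf("hi=0x%llx lo=0x%llx\n", (unsigned long long)v.hi, (unsigned long long)v.lo);
    return 0;
}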

Подробнее
23-01-2014 дата публикации

PREDICTING REGISTER PAIRS

Номер: US20140025928A1

Embodiments relate to reducing a number of read ports for register pairs. An aspect includes executing an instruction. The instruction identifies a pair of registers as containing a wide operand which spans the pair of registers. The executing of the instruction includes determining whether a pairing indicator associated with the pair of registers has a first value, a second value or a third value. Based on the pairing indicator having the first value, the wide operand is read from the wide register. Based on the pairing indicator having the second value the wide operand is read from the pair of registers. Based on the pairing indicator having the third value, the wide operand is speculatively read from a predetermined register. The predetermined register consists of the wide register or the pair of registers. 1. A system for reducing a number of read ports for register pairs , the system comprising:a set of registers, and a set of wide registers, the set of registers and the set of wide registers being addressable by register fields of instructions; anda processing circuit coupled to said set of registers and said set of wide registers, configured to perform a method comprising:executing an instruction, the instruction identifying a pair of registers as containing a wide operand, the wide operand spanning the pair of registers, the executing comprising;determining whether a pairing indicator associated with the pair of registers has a first value, a second value or a third value, the first value indicating the wide operand is stored in a wide register, the second value indicating the wide operand is not stored in the wide register and the third value indicating it is not known whether the wide operand is stored in the wide register;based on the pairing indicator having the first value, reading the wide operand from the wide register;based on the pairing indicator having the second value, reading the wide operand from the pair of registers; andbased on the pairing ...

Подробнее
23-01-2014 дата публикации

MANAGING REGISTER PAIRING

Номер: US20140025929A1

Embodiments relate to reducing a number of read ports for register pairs. An aspect includes maintaining an active pairing indicator that is configured to have a first value or a second value. The first value indicates that the wide operand is stored in a wide register. The second value indicates that the wide operand is not stored in the wide register. The operand is read from either the wide register or a pair of registers based on the active pairing indicator. The active pairing indicator and the values of the set of wide registers are stored to a storage based on a request to store a register pairing status. A saved pairing indicator and saved values of the set of wide registers is loaded from the storage respectively into an active pairing register and wide registers. 1. A system for reducing a number of read ports for register pairs , the system comprising:a set of registers, and a set of wide registers, the set of registers and the set of wide registers being addressable by register fields of instructions; and maintaining an active pairing indicator configured to have a first value or a second value, the first value indicating the wide operand is stored in a wide register, the second value indicating the wide operand is not stored in the wide register;', 'based on the active pairing indicator, determining whether to read the wide operand from the wide register or a pair of registers;', 'storing the active pairing indicator and values of the set of wide registers to a storage based on a request to store a register pairing status; and', 'loading a saved pairing indicator and saved values of the set of wide registers from the storage respectively into an active pairing register and wide registers., 'a processing circuit coupled to said set of registers and said set of wide registers, configured to perform a method comprising2. The system of claim 1 , wherein the storing and loading are performed by any one of executing load or store instructions claim 1 , or by ...

Подробнее
13-02-2014 дата публикации

Scalable Decode-Time Instruction Sequence Optimization of Dependent Instructions

Номер: US20140047216A1

Producer-consumer instructions, comprising a first instruction and a second instruction in program order, are fetched requiring in-order execution, the second instruction is modified by the processor so that the first instruction and second instruction can be completed out-of-order, the modification comprising any one of extending an immediate field of the second instruction using immediate field information of the first instruction or providing a source location of the first instruction as an additional source location to source locations of the second instruction. 1. A computer system for executing dependent machine instructions of an instruction set architecture (ISA) out-of-order , the system comprising:a processor configured to communicate with a main storage, the processor comprising an instruction fetcher, an instruction modifier and one or more execution units, the processor configured to perform a method comprising:fetching for execution, by the processor, a first instruction of the ISA and a second instruction of the ISA;determining in-order execution candidacy, by the processor, of the first instruction and the second instruction, wherein the first instruction and second instruction are configured to be executed out-of-order but are candidates to be effectively executed in-order, the determination comprising determining that the first instruction specifies a target operand location for a target operand and the second instruction specifies a source operand location for a source operand, wherein the first instruction is configured to store a target operand at the target operand location, wherein the source operand location is the same as the target operand location, wherein the second instruction is configured to obtain the source operand at the source operand location; andbased on the determining in-order execution candidacy, effectively executing, by the processor, the first instruction and the second instruction by executing, by the processor, the first ...
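As a concrete example of the first modification path, here is a C sketch assuming a PowerPC-like addis/load pair; the widen_consumer helper is hypothetical. The producer's immediate is folded into the consumer's displacement and the producer's source register becomes the consumer's base, removing the dependence between the two internal operations so they may complete out of order.

#include <stdint.h>
#include <stdio.h>

enum op { OP_ADDIS, OP_LWZ, OP_OTHER };

struct insn {
    enum op op;
    int rt, ra;
    int32_t imm;          /* internal form: wide enough to hold an extended value */
};

/* If the load's base register is produced by an addis, fold the addis
 * immediate into the load's displacement and re-base the load on the addis
 * source; the two internal ops then have no dependence between them. */
static int widen_consumer(const struct insn *prod, struct insn *cons)
{
    if (prod->op != OP_ADDIS || cons->op != OP_LWZ || cons->ra != prod->rt)
        return 0;
    cons->imm += prod->imm << 16;     /* extended immediate field */
    cons->ra   = prod->ra;            /* producer's source becomes the load base */
    return 1;
}

int main(void)
{
    struct insn hi = { OP_ADDIS, 9, 2, 0x12 };      /* r9 <- r2 + (0x12 << 16) */
    struct insn ld = { OP_LWZ,   3, 9, 0x40 };      /* r3 <- mem[r9 + 0x40]    */
    if (widen_consumer(&hi, &ld))
        printf("load now uses r%d with displacement 0x%x\n", ld.ra, (unsigned)ld.imm);
    return 0;
}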

Подробнее
13-02-2014 дата публикации

Managing A Register Cache Based on an Architected Computer Instruction Set having Operand Last-User Information

Номер: US20140047219A1

A multi-level register hierarchy is disclosed comprising a first level pool of registers for caching registers of a second level pool of registers in a system wherein programs can dynamically release and re-enable architected registers such that released architected registers need not be maintained by the processor, the processor accessing operands from the first level pool of registers, wherein a last-use instruction is identified as having a last use of an architected register before being released, the last-use architected register being released causes the multi-level register hierarchy to discard any correspondence of an entry to said last use architected register. 1. A computer program product for managing a multi-level register hierarchy , comprising a first level pool of registers for caching registers of a second level pool of registers , the computer program product comprising a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising:assigning, by a processor, architected registers to available entries of one of said first level pool or said second level pool, wherein architected registers are defined by an instruction set architecture (ISA) and addressable by register field values of instructions of the ISA, wherein the assigning comprises associating each assigned architected register to a corresponding an entry of a pool of registers;moving architected register values to said first level pool from said second level pool according to a first level pool replacement algorithm;based on instructions being executed, accessing architected register values of the first level pool of registers corresponding to said architected registers;based on executing a last-use instruction for using an architected register identified as a last-use architected register, un-assigning the last-use architected register from both the first level pool and the second level pool, ...

Подробнее
20-02-2014 дата публикации

PRIVILEGE LEVEL AWARE PROCESSOR HARDWARE RESOURCE MANAGEMENT FACILITY

Номер: US20140053154A1

Multiple machine state registers are included in a processor core to permit distinction between use of hardware facilities by applications, supervisory threads and the hypervisor. All facilities are initially disabled by the hypervisor when a partition is initialized. When any access is made to a disabled facility, the hypervisor receives an indication of which facility was accessed and sets a corresponding hardware flag in the hypervisor's machine state register. When an application attempts to access a disabled facility, the supervisor managing the operating system image receives an indication of which facility was accessed and sets a corresponding hardware flag in the supervisor's machine state register. The multiple register implementation permits the supervisor to determine whether particular hardware facilities need to have their state saved when an application context swap occurs and the hypervisor can determine which hardware facilities need to have their state saved when a partition swap occurs. 1. A method of tracking usage of a hardware execution facility within a processor core of a computer system by processes executing at different privilege levels , the method comprising:at a first privilege level, first maintaining a first hardware flag in a first register of the processor core that indicates whether or not a corresponding particular hardware facility is enabled for access at another privilege level lower than the first privilege level; andat a second privilege level lower than the first privilege level, second maintaining a second hardware flag in a second register of the processor core that indicates whether or not the particular hardware facility is enabled for access at a third privilege level lower than the second privilege level.2. The method of claim 1 , wherein the first privilege level is a hypervisor privilege level and wherein the second privilege level is a supervisory privilege level.3. The method of claim 2 , wherein the first ...
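A rough C sketch of the per-privilege-level bookkeeping, with invented flag names and an access_facility helper standing in for the facility-unavailable interrupt flow: the first touch of a disabled facility records, at each level, that the facility's state now has to be saved on the corresponding context or partition swap.

#include <stdio.h>
#include <stdint.h>

#define FAC_VECTOR  (1u << 0)      /* example hardware facility: vector unit */
#define FAC_FPU     (1u << 1)

/* One enable/usage mask per privilege level, modeled after per-level
 * machine state registers. */
struct msr_set {
    uint32_t hv_enabled;           /* hypervisor: enabled for the partition    */
    uint32_t os_enabled;           /* supervisor: enabled for the application  */
    uint32_t hv_used, os_used;     /* which facilities actually need save/restore */
};

/* An access to a disabled facility traps to the level that disabled it;
 * that level records the use and enables the facility for lower levels. */
static void access_facility(struct msr_set *m, uint32_t fac)
{
    if (!(m->hv_enabled & fac)) {          /* facility unavailable at the hypervisor */
        m->hv_used    |= fac;
        m->hv_enabled |= fac;
    }
    if (!(m->os_enabled & fac)) {          /* facility unavailable at the supervisor */
        m->os_used    |= fac;
        m->os_enabled |= fac;
    }
    /* ...facility is now usable by the application... */
}

int main(void)
{
    struct msr_set m = {0};
    access_facility(&m, FAC_VECTOR);
    printf("save vector state on partition swap: %s\n",
           (m.hv_used & FAC_VECTOR) ? "yes" : "no");
    return 0;
}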

Подробнее
27-03-2014 дата публикации

CACHING OPTIMIZED INTERNAL INSTRUCTIONS IN LOOP BUFFER

Номер: US20140089636A1

Embodiments of the invention relate to a computer system for storing an internal instruction loop in a loop buffer. The computer system includes a loop buffer and a processor. The computer system is configured to perform a method including fetching instructions from memory to generate an internal instruction to be executed, detecting a beginning of a first instruction loop in the instructions, determining that a first internal instruction loop corresponding to the first instruction loop is not stored in the loop buffer, fetching the first instruction loop, optimizing one or more instructions corresponding to the first instruction loop to generate a first optimized internal instruction loop, and storing the first optimized internal instruction loop in the loop buffer based on the determination that the first internal instruction loop is not stored in the loop buffer. 1. A computer program product for implementing an instruction loop buffer , the computer program product comprising: fetching instructions from memory to generate an internal instruction to be executed;', 'determining, by a processor, that a first instruction from the instructions corresponds to a first instruction loop;', 'determining that a first internal instruction loop corresponding to the first instruction loop is not stored in a loop buffer;', 'optimizing one or more internal instructions of the first instruction loop; and', 'storing a resulting first optimized internal instruction loop in the loop buffer based on the determining that the first internal instruction loop is not stored in the loop buffer., 'a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein optimizing the one or more instructions includes merging at least two machine instructions of the one or more instructions to generate an optimized internal instruction claim 1 , andthe ...

Подробнее
03-04-2014 дата публикации

Prefix Computer Instruction for Compatibly Extending Instruction Functionality

Номер: US20140095833A1
Принадлежит: International Business Machines Corp

A prefix instruction is executed and passes operands to a next instruction without storing the operands in an architected resource, such that the execution of the next instruction uses the operands provided by the prefix instruction to perform an operation. The operands may be a prefix instruction immediate field or a target register of the prefix instruction execution.
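A small C sketch of the forwarding idea, assuming a transient prefix_state latch that is not an architected resource: the prefix captures extra immediate bits, the next instruction consumes them, and the latch is cleared so the prefix pairs with exactly one instruction.

#include <stdint.h>
#include <stdio.h>

/* Non-architected latch that only lives between the prefix and the next
 * instruction; nothing is written to an architected register. */
struct prefix_state { int valid; int32_t extra_imm; };

/* Execute a prefix: just capture its immediate for the following instruction. */
static void exec_prefix(struct prefix_state *p, int32_t imm_hi)
{
    p->valid = 1;
    p->extra_imm = imm_hi;
}

/* Execute the prefixed add-immediate: the prefix operand widens its immediate. */
static int64_t exec_addi(struct prefix_state *p, int64_t ra_val, int32_t imm_lo)
{
    int64_t imm = imm_lo;
    if (p->valid) {
        imm += (int64_t)p->extra_imm << 16;   /* operand supplied by the prefix      */
        p->valid = 0;                         /* prefix pairs with one instruction   */
    }
    return ra_val + imm;
}

int main(void)
{
    struct prefix_state p = {0};
    exec_prefix(&p, 0x0012);                  /* prefix carrying high immediate bits */
    printf("result = 0x%llx\n", (unsigned long long)exec_addi(&p, 0x1000, 0x0034));
    return 0;
}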

Подробнее
03-04-2014 дата публикации

PERFORMING PREDECODE-TIME OPTIMIZED INSTRUCTIONS IN CONJUNCTION WITH PREDECODE TIME OPTIMIZED INSTRUCTION SEQUENCE CACHING

Номер: US20140095835A1

A method for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching. The method includes receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence and determining if the first instruction and the second instruction can be optimized. In response to determining that the first instruction and second instruction can be optimized, the method includes performing a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction, and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache. In response to determining that the first instruction and second instruction cannot be optimized, the method includes storing the pre-decoded first instruction and a pre-decoded second instruction in the instruction cache. 1. A computer program product for performing predecode-time optimized instructions in conjunction with predecode time optimized instruction sequence caching, the computer program product comprising: a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: receiving a first instruction of an instruction sequence and a second instruction of the instruction sequence; determining if the first instruction and the second instruction can be optimized; responsive to the determining that the first instruction and second instruction can be optimized, performing a pre-decode optimization on the instruction sequence and generating a new second instruction, wherein the new second instruction is not dependent on a target operand of the first instruction, and storing a pre-decoded first instruction and a pre-decoded new second instruction in an instruction cache; responsive to the determining ...

Подробнее
03-04-2014 дата публикации

Tracking Operand Liveliness Information in a Computer System and Performing Function Based on the Liveliness Information

Номер: US20140095848A1

Operand liveness state information is maintained during context switches for current architected operands of executing programs the current operand state information indicating whether corresponding current operands are any one of enabled or disabled for use by a first program module, the first program module comprising machine instructions of an instruction set architecture (ISA) for disabling current architected operands, wherein a current operand is accessed by a machine instruction of said first program module, the accessing comprising using the current operand state information to determine whether a previously stored current operand value is accessible by the first program module. 1. A computer system for maintaining liveness information for executing programs , the system comprising:processor configured to communicate with a main storage, the processor comprising an instruction fetcher, an instruction optimizer and one or more execution units for executing optimized instructions, the processor configured to perform a method comprising:maintaining, by a processor, current operand state information, the current operand state information for indicating whether corresponding current operands are any one of enabled or disabled for use by a first program module, the first program module comprising machine instructions of an instruction set architecture (ISA), the first program module currently being executed by the processor;accessing a current operand, by a machine instruction of said first program module, the accessing comprising using the current operand state information to determine whether a previously stored current operand value is accessible by the first program module.2. The computer system according to claim 1 , further comprising:based on the current operand being disabled, the accessing comprising at least one of a) and b) comprising:a) returning an architecture-specified value, and where the architecture-specified value is any one of an undefined ...

Подробнее
10-04-2014 дата публикации

MIXED PRECISION ESTIMATE INSTRUCTION COMPUTING NARROW PRECISION RESULT FOR WIDE PRECISION INPUTS

Номер: US20140101216A1

A technique is provided for performing a mixed precision estimate. A processing circuit receives an input of a first precision having a wide precision value. The processing circuit computes an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value. 1. A computer system configured to perform a mixed precision estimate , the system comprising:a processing circuit, the system configured to perform a method comprising:receiving, by the processing circuit, an input of a wide precision having a wide precision value; andcomputing, by the processing circuit, an output in an output exponent range corresponding to a narrow precision value based on the input having the wide precision value.2. The computer system of claim 1 , wherein the method further comprises storing claim 1 , by the processing circuit claim 1 , the output in a register having an architected register storage format in a wide precision format.3. The computer system of claim 1 , wherein the method further comprises based on the wide precision value of the input having an input exponent failing to correspond to the output exponent range claim 1 , generating the output as an out of range value.4. The computer system of claim 3 , wherein the out of range value comprises at least one of zero and infinity.5. The computer system of claim 1 , wherein the method further comprises based on the input comprising a wide not a number (NaN) claim 1 , converting the wide not a number to a narrow not a number in which not a number properties are preserved.6. The computer system of claim 1 , wherein the method further comprises based on the input having the wide precision value with an input exponent failing to adhere to a valid exponent range of a valid single precision value claim 1 , generating a mantissa mask based on the input exponent to be applied to a mantissa of the output.7. The computer system of claim 6 , wherein the method further comprises ...
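The clamping behaviour can be sketched in C as follows, using 1.0/x as a stand-in for the hardware estimate and the single-precision exponent range as the narrow range: results outside that range are forced to zero or infinity, and a wide NaN is narrowed while preserving its NaN property.

#include <math.h>
#include <float.h>
#include <stdio.h>

/* Reciprocal estimate that accepts a double-precision input but yields a
 * result confined to the single-precision exponent range: inputs whose
 * reciprocal would overflow or underflow that range give infinity or zero. */
static double mixed_precision_recip_estimate(double x)
{
    if (isnan(x))
        return (double)(float)x;          /* wide NaN narrowed, NaN-ness preserved */
    double est = 1.0 / x;                 /* stand-in for the hardware estimate */
    if (fabs(est) > (double)FLT_MAX)      /* above the narrow exponent range */
        return copysign(INFINITY, est);
    if (est != 0.0 && fabs(est) < (double)FLT_MIN)
        return copysign(0.0, est);        /* below the narrow exponent range */
    return est;                           /* stored back in the wide register format */
}

int main(void)
{
    printf("%g %g %g\n",
           mixed_precision_recip_estimate(4.0),      /* 0.25             */
           mixed_precision_recip_estimate(1e-300),   /* +inf (overflow)  */
           mixed_precision_recip_estimate(1e300));   /* +0   (underflow) */
    return 0;
}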

Подробнее
10-04-2014 дата публикации

ASYMMETRIC CO-EXISTENT ADDRESS TRANSLATION STRUCTURE FORMATS

Номер: US20140101359A1

An address translation capability is provided in which translation structures of different types are used to translate memory addresses from one format to another format. Multiple translation structure formats (e.g., multiple page table formats, such as hash page tables and hierarchical page tables) are concurrently supported in a system configuration. This facilitates provision of guest access in virtualized operating systems, and/or the mixing of translation formats to better match the data access patterns being translated. 1. A computer program product for facilitating translation of memory addresses , said computer program product comprising: determining, by a processor, whether a first address translation structure of a first type is to be used to translate a memory address;', 'based on the determining that a first address translation structure of the first type is to be used, accessing a second address translation structure of a second type, the second type being different from the first type, to determine a particular first address translation structure to be used and to obtain an origin address of that particular first address translation structure; and', 'using the particular first address translation structure in translating the memory address., 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the determining comprises checking an indicator to determine whether the first address translation structure is to be used claim 1 , the indicator located in an entry of a data structure located using a portion of the memory address to be translated.3. The computer program product of claim 2 , wherein the indicator is located in a segment lookaside buffer entry (SLBE) claim 2 , the SLBE located using an effective segment identifier field of the memory address.4. The computer program product of ...

Подробнее
10-04-2014 дата публикации

Adjunct component to provide full virtualization using paravirtualized hypervisors

Номер: US20140101360A1
Автор: Michael K. Gschwind
Принадлежит: International Business Machines Corp

A system configuration is provided with a paravirtualizing hypervisor that supports different types of guests, including those that use a single level of translation and those that use a nested level of translation. When an address translation fault occurs during a nested level of translation, an indication of the fault is received by an adjunct component. The adjunct component addresses the address translation fault, at least in part, on behalf of the guest.

Подробнее
10-04-2014 дата публикации

SYSTEM SUPPORTING MULTIPLE PARTITIONS WITH DIFFERING TRANSLATION FORMATS

Номер: US20140101361A1
Автор: Gschwind Michael K.

A system configuration is provided with multiple partitions that support different types of address translation structure formats. The configuration may include partitions that use a single level of translation and those that use a nested level of translation. Further, differing types of translation structures may be used. The different partitions are supported by a single hypervisor. 1. A computer program product for facilitating memory access, said computer program product comprising: a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: providing a first partition within a system configuration, the first partition configured to support an operating system (OS) designed for a first address translation architecture, the first partition not supporting an OS designed for a second address translation architecture; and providing a second partition within the system configuration, the second partition configured to support the OS designed for the second address translation architecture, the second partition not supporting the OS designed for the first address translation architecture, wherein the first address translation architecture is structurally different from the second address translation architecture. 2. The computer program product of claim 1, wherein the first address translation architecture uses a hash structure and the second address translation architecture uses a hierarchical table structure. 3. The computer program product of claim 1, wherein the first partition is a paravirtualized partition in which a guest of the first partition assists in handling address translation faults corresponding to host translations, and the second partition is a virtualized partition in which handling of address translation faults corresponding to host translations is independent of assistance from a guest of the second partition. 4. The computer ...

Подробнее
10-04-2014 дата публикации

SUPPORTING MULTIPLE TYPES OF GUESTS BY A HYPERVISOR

Номер: US20140101362A1
Автор: Gschwind Michael K.

A system configuration is provided that includes multiple partitions that have differing translation mechanisms associated therewith. For instance, one partition has associated therewith a single level translation mechanism for translating guest virtual addresses to host physical addresses, and another partition has a nested level translation mechanism for translating guest virtual addresses to host physical addresses. The different translation mechanisms and partitions are supported by a single hypervisor. Although the hypervisor is a paravirtualized hypervisor, it provides full virtualization for those partitions using nested level translations. 1. A computer program product for facilitating translation of memory addresses , said computer program product comprising: providing, by a hypervisor executing within a computing environment, a first type of translation support for a first type of guest operating system supported by the hypervisor, the first type of translation support comprising a paravirtualization support in which the first type of guest operating system assists in handling address translation faults corresponding to host translations of guest memory addresses; and', 'providing, by the hypervisor, a second type of translation support for a second type of guest operating system supported by the hypervisor, the second type of translation support comprising a virtualization support in which handling address translation faults corresponding to host translations of guest memory addresses are handled entirely by a host, the host including at least the hypervisor., 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the first type of guest operating system uses a single level of address translation to translate a first guest virtual address claim 1 , and the second type of guest operating ...

Подробнее
10-04-2014 дата публикации

SELECTABLE ADDRESS TRANSLATION MECHANISMS WITHIN A PARTITION

Номер: US20140101363A1
Автор: Gschwind Michael K.

An address translation capability is provided in which translation structures of different types are used to translate memory addresses from one format to another format. Multiple translation structure formats (e.g., multiple page table formats, such as hash page tables and hierarchical page tables) are concurrently supported in a system configuration. For a system configuration that includes partitions, the translation mechanism to be used for a partition or a portion thereof is selectable and may be different for different partitions or even portions within a partition. 1. A computer program product for facilitating translation of memory addresses , said computer program product comprising: selecting, by a monitor executing on a processor, a first address translation structure format for a first portion of a partition managed by the monitor; and', 'selecting, by the monitor, a second address translation structure format for a second portion of the partition, the first address translation structure format being different from the second address translation structure format., 'a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising2. The computer program product of claim 1 , wherein the first address translation structure format uses a hierarchical structure and the second address translation structure format uses a hash structure.3. The computer program product of claim 1 , wherein the monitor comprises one of a hypervisor claim 1 , or a virtual machine monitor running on an operating system.4. The computer program product of claim 1 , wherein at least one of the selecting the first address translation structure format and the selecting the second address translation structure format is based on one or more of: available address translation structure formats claim 1 , implementation choice claim 1 , or memory characteristics.5. The computer program ...
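A schematic C sketch of per-portion selection by the monitor, with invented structures (portion, partition, format_for); the address ranges and the default are illustrative only. Each portion of a partition's address space records whether it is translated through a hash page table or a hierarchical (radix) tree.

#include <stdint.h>
#include <stdio.h>

enum xlate_fmt { FMT_HASH_PAGE_TABLE, FMT_RADIX_TREE };

/* The monitor records, per portion of a partition's address space, which
 * translation structure format it selected for that portion. */
struct portion { uint64_t base, size; enum xlate_fmt fmt; };

struct partition {
    int            nportions;
    struct portion portions[8];
};

static enum xlate_fmt format_for(const struct partition *p, uint64_t guest_addr)
{
    for (int i = 0; i < p->nportions; i++)
        if (guest_addr >= p->portions[i].base &&
            guest_addr <  p->portions[i].base + p->portions[i].size)
            return p->portions[i].fmt;
    return FMT_HASH_PAGE_TABLE;           /* arbitrary default for the sketch */
}

int main(void)
{
    struct partition part = { 2, {
        { 0x00000000, 0x40000000, FMT_RADIX_TREE },       /* e.g. one OS-managed range */
        { 0x40000000, 0x40000000, FMT_HASH_PAGE_TABLE },  /* e.g. a legacy range       */
    }};
    printf("%d %d\n", format_for(&part, 0x1000), format_for(&part, 0x50000000));
    return 0;
}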

More details
Publication date: 10-04-2014

SELECTABLE ADDRESS TRANSLATION MECHANISMS WITHIN A PARTITION

Number: US20140101364A1
Author: Gschwind Michael K.

An address translation capability is provided in which translation structures of different types are used to translate memory addresses from one format to another format. Multiple translation structure formats (e.g., multiple page table formats, such as hash page tables and hierarchical page tables) are concurrently supported in a system configuration. For a system configuration that includes partitions, the translation mechanism to be used for a partition or a portion thereof is selectable and may be different for different partitions or even portions within a partition.

1. A method of facilitating translation of memory addresses, said method comprising: selecting, by a monitor executing on a processor, a first address translation structure format for a first portion of a partition managed by the monitor; and selecting, by the monitor, a second address translation structure format for a second portion of the partition, the first address translation structure format being different from the second address translation structure format.
2. The method of claim 1, wherein the first address translation structure format uses a hierarchical structure and the second address translation structure format uses a hash structure.
3. The method of claim 1, wherein the monitor comprises one of a hypervisor, or a virtual machine monitor running on an operating system.
4. The method of claim 1, wherein at least one of the selecting the first address translation structure format and the selecting the second address translation structure format is based on one or more of: available address translation structure formats, implementation choice, or memory characteristics.
5. The method of claim 1, further comprising configuring at least one of selection of the first address translation structure format or selection of the second address translation structure format by providing an indication of the at least one selection.
6. The method of claim 5, wherein the ...

More details
Publication date: 10-04-2014

SUPPORTING MULTIPLE TYPES OF GUESTS BY A HYPERVISOR

Number: US20140101365A1
Author: Gschwind Michael K.

A system configuration is provided that includes multiple partitions that have differing translation mechanisms associated therewith. For instance, one partition has associated therewith a single level translation mechanism for translating guest virtual addresses to host physical addresses, and another partition has a nested level translation mechanism for translating guest virtual addresses to host physical addresses. The different translation mechanisms and partitions are supported by a single hypervisor. Although the hypervisor is a paravirtualized hypervisor, it provides full virtualization for those partitions using nested level translations.

1. A method of facilitating translation of memory addresses, said method comprising: providing, by a hypervisor executing within a computing environment, a first type of translation support for a first type of guest operating system supported by the hypervisor, the first type of translation support comprising a paravirtualization support in which the first type of guest operating system assists in handling address translation faults corresponding to host translations of guest memory addresses; and providing, by the hypervisor, a second type of translation support for a second type of guest operating system supported by the hypervisor, the second type of translation support comprising a virtualization support in which handling address translation faults corresponding to host translations of guest memory addresses are handled entirely by a host, the host including at least the hypervisor.
2. The method of claim 1, wherein the first type of guest operating system uses a single level of address translation to translate a first guest virtual address, and the second type of guest operating system uses a nested level of address translation to translate a second guest virtual address.
3. The method of claim 2, wherein the single level of address translation comprises using a hash structure to translate the first guest ...

More details
Publication date: 10-04-2014

SYSTEM SUPPORTING MULTIPLE PARTITIONS WITH DIFFERING TRANSLATION FORMATS

Number: US20140101402A1
Author: Gschwind Michael K.

A system configuration is provided with multiple partitions that supports different types of address translation structure formats. The configuration may include partitions that use a single level of translation and those that use a nested level of translation. Further, differing types of translation structures may be used. The different partitions are supported by a single hypervisor.

1. A method of facilitating memory access, said method comprising: providing a first partition within a system configuration, the first partition configured to support an operating system (OS) designed for a first address translation architecture, the first partition not supporting an OS designed for a second address translation architecture; and providing a second partition within the system configuration, the second partition configured to support the OS designed for the second address translation architecture, the second partition not supporting the OS designed for the first address translation architecture, wherein the first address translation architecture is structurally different from the second address translation architecture.
2. The method of claim 1, wherein the first address translation architecture uses a hash structure and the second address translation architecture uses a hierarchical table structure.
3. The method of claim 1, wherein the first partition is a paravirtualized partition in which a guest of the first partition assists in handling address translation faults corresponding to host translations, and the second partition is a virtualized partition in which handling of address translation faults corresponding to host translations is independent of assistance from a guest of the second partition.
4. The method of claim 1, wherein the first partition uses a single level address translation mechanism for translating guest virtual addresses to host physical addresses, and the second partition uses a nested level address translation mechanism for ...

More details
Publication date: 10-04-2014

SELECTABLE ADDRESS TRANSLATION MECHANISMS

Number: US20140101404A1

An address translation capability is provided in which translation structures of different types are used to translate memory addresses from one format to another format. Multiple translation structure formats (e.g., multiple page table formats, such as hash page tables and hierarchical page tables) are concurrently supported in a system configuration, and the use of a particular translation structure format in translating an address is selectable.

1. A computer program product for facilitating translation of memory addresses, said computer program product comprising: a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: obtaining, by a processor, a memory address to be translated, the processor having a first type of address translation structure and a second type of address translation structure available for translating the memory address, wherein the first type is structurally different from the second type; and selecting, by the processor, one type of address translation structure of the first type of address translation structure and the second type of address translation structure to be used in translating the memory address.
2. The computer program product of claim 1, wherein the method further comprises: obtaining, by the processor, a first address to be translated to a second address; and using a third type of address translation structure to translate the first address to the second address, the second address being the memory address to be translated by the selected one type of address translation structure.
3. The computer program product of claim 2, wherein the selected one type of address translation structure is different from the third type of address translation structure.
4. The computer program product of claim 3, wherein the third type of address translation structure uses a hierarchical translation structure and the selected one ...

More details
Publication date: 10-04-2014

ADJUNCT COMPONENT TO PROVIDE FULL VIRTUALIZATION USING PARAVIRTUALIZED HYPERVISORS

Number: US20140101406A1
Author: Gschwind Michael K.

A system configuration is provided with a paravirtualizing hypervisor that supports different types of guests, including those that use a single level of translation and those that use a nested level of translation. When an address translation fault occurs during a nested level of translation, an indication of the fault is received by an adjunct component. The adjunct component addresses the address translation fault, at least in part, on behalf of the guest.

1. A method of facilitating translation of a guest memory address, said method comprising: obtaining, by an adjunct component, an indication of an address translation fault related to the guest memory address, the adjunct component being separate and distinct from a guest operating system and executing on a processor of a system configuration, the system configuration comprising the guest operating system supported by a hypervisor, the hypervisor being a paravirtualized hypervisor configured such that address translation faults related to host translations of guest memory addresses are managed in part by the guest operating system; and based on obtaining the indication of the address translation fault, providing, by the adjunct component to the hypervisor, address translation information to enable successful performance of a host translation of the guest memory address.
2. The method of claim 1, wherein the hypervisor supports a first type of guest that uses a single level of translation and a second type of guest that uses a nested level of translation, the guest operating system being a second type of guest.
3. The method of claim 1, wherein the address translation fault is based on a translation from a guest physical address to a host physical address.
4. The method of claim 3, wherein the guest physical address is provided as a result of translating a guest virtual address to the guest physical address in a guest level translation.
5. The method of claim 1, wherein the providing comprises updating a ...
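A minimal C sketch of the adjunct idea, under the assumption that the hypervisor simply forwards a faulting guest physical address (GPA) to the adjunct, which supplies the missing guest-physical-to-host-physical mapping so that the guest itself never has to assist. The table layout and adjunct_lookup_hpa are invented stand-ins for a real backing-store lookup.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative sketch: the adjunct receives notice of a fault on a guest
     * physical address and hands the hypervisor the missing host mapping. */
    typedef struct { uint64_t gpa; uint64_t hpa; int valid; } host_map_entry;

    #define TABLE_SIZE 16
    static host_map_entry host_table[TABLE_SIZE];

    /* Hypothetical lookup for where a guest page really lives on the host. */
    static uint64_t adjunct_lookup_hpa(uint64_t gpa) {
        return 0x100000000ULL + gpa;   /* stand-in for a real lookup */
    }

    /* Called when the nested (GPA -> HPA) translation step faults. */
    static void adjunct_on_fault(uint64_t faulting_gpa) {
        int slot = (int)((faulting_gpa >> 12) % TABLE_SIZE);
        host_table[slot].gpa   = faulting_gpa;
        host_table[slot].hpa   = adjunct_lookup_hpa(faulting_gpa);
        host_table[slot].valid = 1;
        printf("adjunct installed GPA 0x%llx -> HPA 0x%llx\n",
               (unsigned long long)faulting_gpa,
               (unsigned long long)host_table[slot].hpa);
    }

    int main(void) {
        adjunct_on_fault(0x0000000000ABC000ULL);
        return 0;
    }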

More details
Publication date: 10-04-2014

ASYMMETRIC CO-EXISTENT ADDRESS TRANSLATION STRUCTURE FORMATS

Number: US20140101408A1

An address translation capability is provided in which translation structures of different types are used to translate memory addresses from one format to another format. Multiple translation structure formats (e.g., multiple page table formats, such as hash page tables and hierarchical page tables) are concurrently supported in a system configuration. This facilitates provision of guest access in virtualized operating systems, and/or the mixing of translation formats to better match the data access patterns being translated.

1. A method of facilitating translation of memory addresses, said method comprising: determining, by a processor, whether a first address translation structure of a first type is to be used to translate a memory address; based on the determining that a first address translation structure of the first type is to be used, accessing a second address translation structure of a second type, the second type being different from the first type, to determine a particular first address translation structure to be used and to obtain an origin address of that particular first address translation structure; and using the particular first address translation structure in translating the memory address.
2. The method of claim 1, wherein the determining comprises checking an indicator to determine whether the first address translation structure is to be used, the indicator located in an entry of a data structure located using a portion of the memory address to be translated.
3. The method of claim 2, wherein the indicator is located in a segment lookaside buffer entry (SLBE), the SLBE located using an effective segment identifier field of the memory address.
4. The method of claim 3, wherein the SLBE includes a virtual segment identifier (VSID) field, and wherein the accessing the second address translation structure comprises using the VSID to locate an entry in the second address translation structure that includes the origin ...
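The two-step selection described in the claims can be sketched roughly in C: a per-segment entry (standing in for an SLB entry) carries an indicator; when the indicator is set, a second-type structure is consulted to find the origin of the particular first-type structure that will perform the final translation. Every field name and the origin computation here are assumptions for illustration only.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t vsid;           /* virtual segment id */
        int use_first_type;      /* indicator: 1 = use first-type structure */
        uint64_t second_origin;  /* origin of the second-type structure */
    } segment_entry;

    /* Stand-in for walking the second-type structure keyed by the VSID
     * to obtain the origin of the particular first-type structure. */
    static uint64_t origin_from_second_structure(uint64_t second_origin, uint64_t vsid) {
        return second_origin + (vsid << 4);
    }

    static void translate(const segment_entry *e, uint64_t addr) {
        if (e->use_first_type) {
            uint64_t first_origin = origin_from_second_structure(e->second_origin, e->vsid);
            printf("addr 0x%llx: walk first-type structure at origin 0x%llx\n",
                   (unsigned long long)addr, (unsigned long long)first_origin);
        } else {
            printf("addr 0x%llx: walk second-type structure directly\n",
                   (unsigned long long)addr);
        }
    }

    int main(void) {
        segment_entry e = { 0x42, 1, 0x20000ULL };
        translate(&e, 0x1234000ULL);
        return 0;
    }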

More details
Publication date: 05-01-2017

INACCESSIBILITY STATUS INDICATOR

Number: US20170003913A1

Processing within a computing environment is facilitated by use of an inaccessibility status indicator. A processor determines whether a unit of memory to be accessed is inaccessible in that default data is to be used for the unit of memory. The determining is based on an inaccessibility status indicator in a selected location accessible to the processor. Based on the determining indicating the unit of memory is inaccessible, default data is provided to be used for a request associated with the unit of memory.

1. A computer program product for facilitating processing within a computing environment, said computer program product comprising: a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: attempting access to a portion of memory, the portion of memory including a first unit of memory and a second unit of memory; determining, by a processor, whether one unit of memory of the first unit of memory and the second unit of memory is inaccessible in that default data is to be used for the one unit of memory, the determining being based on an inaccessibility status indicator in a selected location accessible to the processor; providing default data to be used for a request associated with the portion of memory, based on the determining indicating the one unit of memory is inaccessible; obtaining data from another unit of memory of the first unit of memory and the second unit of memory that is accessible, the other unit of memory being different from the one unit of memory; and combining the data and the default data to provide combined data to be used for the request.
2. The computer program product of claim 1, wherein the combined data is to be used to populate a memory operand to be used by an instruction.
3. The computer program product of claim 2, wherein the method further comprises: receiving, by a control component, a fault for the one ...
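As a rough C sketch of the combining step: a requested portion of memory spans two units, a per-unit status flag marks one unit inaccessible, and the result combines real data from the accessible unit with default data (zeros here) for the inaccessible one. The unit size, the flag array, and the choice of zeros as default data are assumptions for the example, not details from the publication.

    #include <stdint.h>
    #include <stdio.h>

    #define UNIT 8

    static int unit_inaccessible[2] = { 0, 1 };   /* unit 1 marked inaccessible */
    static uint8_t memory[2][UNIT] = {
        { 1, 2, 3, 4, 5, 6, 7, 8 },
        { 9, 9, 9, 9, 9, 9, 9, 9 },
    };

    /* Build the combined result: real data where accessible, default data otherwise. */
    static void load_operand(uint8_t *operand, int first_unit, int len) {
        for (int i = 0; i < len; i++) {
            int unit = first_unit + i / UNIT;
            if (unit_inaccessible[unit])
                operand[i] = 0;                      /* default data */
            else
                operand[i] = memory[unit][i % UNIT]; /* real data */
        }
    }

    int main(void) {
        uint8_t operand[2 * UNIT];
        load_operand(operand, 0, (int)sizeof operand);  /* request crosses the unit boundary */
        for (size_t i = 0; i < sizeof operand; i++)
            printf("%d ", operand[i]);
        printf("\n");   /* prints 1..8 followed by zeros for the inaccessible unit */
        return 0;
    }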

More details
Publication date: 05-01-2017

Inaccessibility status indicator

Number: US20170003914A1
Assignee: International Business Machines Corp

Processing within a computing environment is facilitated by use of an inaccessibility status indicator. A processor determines whether a unit of memory to be accessed is inaccessible in that default data is to be used for the unit of memory. The determining is based on an inaccessibility status indicator in a selected location accessible to the processor. Based on the determining indicating the unit of memory is inaccessible, default data is provided to be used for a request associated with the unit of memory.

More details
Publication date: 05-01-2017

Initialization status of a register employed as a pointer

Number: US20170003941A1
Author: Michael K. Gschwind
Assignee: International Business Machines Corp

Initialization status of a register to be used as a pointer to a reference data structure is used to determine how a stub is to be generated to access the reference data structure. The register is one type of pointer configuration to be used to access the reference data structure, which is used to resolve a symbol associated with a function of a program. An indication is obtained as to whether the register has been initialized with a reference data structure pointer. Based on obtaining the indication, a stub is generated that is to be used to access the function. The generating depends on whether the register has been initialized. If the register has not been initialized, then the stub is generated to include another type of pointer configuration to be used to access the reference data structure.
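A small, purely illustrative C sketch of the decision described above: a stub generator checks whether the designated register is known to hold the reference-structure pointer and, if not, emits a stub that first materializes the pointer through another configuration (here, a PC-relative load). The emitted "instructions" are plain text, and all mnemonics and names are hypothetical, not the patent's or any toolchain's actual code sequences.

    #include <stdio.h>

    /* Emit one of two stub shapes depending on the register's initialization status. */
    static void generate_stub(const char *symbol, int reg_initialized) {
        if (reg_initialized) {
            /* Register pointer configuration: index directly off the register. */
            printf("stub for %s:\n", symbol);
            printf("  load  target, [ref_reg + offset_of(%s)]\n", symbol);
            printf("  branch target\n");
        } else {
            /* Non-register configuration: first obtain the reference structure address. */
            printf("stub for %s (register not initialized):\n", symbol);
            printf("  load  tmp, [pc_relative ref_structure_address]\n");
            printf("  load  target, [tmp + offset_of(%s)]\n", symbol);
            printf("  branch target\n");
        }
    }

    int main(void) {
        generate_stub("printf", 1);
        generate_stub("malloc", 0);
        return 0;
    }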

More details
Publication date: 05-01-2017

INITIALIZATION STATUS OF A REGISTER EMPLOYED AS A POINTER

Number: US20170003942A1
Author: Gschwind Michael K.

Initialization status of a register to be used as a pointer to a reference data structure is used to determine how a stub is to be generated to access the reference data structure. The register is one type of pointer configuration to be used to access the reference data structure, which is used to resolve a symbol associated with a function of a program. An indication is obtained as to whether the register has been initialized with a reference data structure pointer. Based on obtaining the indication, a stub is generated that is to be used to access the function. The generating depends on whether the register has been initialized. If the register has not been initialized, then the stub is generated to include another type of pointer configuration to be used to access the reference data structure.

1. A computer-implemented method of facilitating processing within a computing environment, the computer-implemented method comprising: obtaining, by a processor, an indication of whether a register has been initialized with a reference data structure pointer, the register being one type of pointer configuration to be used to access a reference data structure to be used to resolve a symbol associated with a function of a program; and based on obtaining the indication, generating a stub to be used to access the function, the generating depending on whether the indication indicates the register has been initialized, wherein based on the indication indicating the register has not been initialized, the generating comprises generating the stub to include another type of pointer configuration to be used to access the reference data structure, the other type of pointer configuration being different from the one type of pointer configuration.
2. The computer-implemented method of claim 1, wherein the other type of pointer configuration is a non-register pointer configuration.
3. The computer-implemented method of claim 1, wherein the other type of pointer configuration includes ...

More details
Publication date: 05-01-2017

NON-FAULTING COMPUTE INSTRUCTIONS

Number: US20170003961A1

A compute instruction to be executed is to use a memory operand in a computation. An address associated with the memory operand is to be used to locate a portion of memory from which data is to be obtained and placed in the memory operand. A determination is made as to whether the portion of memory extends across a specified memory boundary. Based on the portion of memory extending across the specified memory boundary, the portion of memory includes a plurality of memory units and a check is made as to whether at least one specified memory unit is accessible and whether at least one specified memory unit is inaccessible. Based on the checking indicating that at least one specified memory unit is accessible and at least one specified memory unit is inaccessible, the accessible memory unit is accessed and its data is placed in one or more locations in the memory operand, and, for the at least one unit of memory that is inaccessible, default data is placed in one or more other locations of the memory operand.

1. A computer-implemented method of facilitating processing of compute instructions in a computing environment, said computer-implemented method comprising: obtaining, by a processor, a compute instruction to be executed, the compute instruction to use a memory operand in a computation indicated by the compute instruction; obtaining an address associated with the memory operand, the address to be used to locate a portion of memory from which data is to be obtained and placed in the memory operand; determining whether the portion of memory extends across a specified memory boundary, wherein based on the portion of memory extending across the specified memory boundary, the portion of memory comprises a plurality of memory units; based on determining the portion of memory extends across the specified memory boundary, checking whether at least one specified memory unit of the plurality ...
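A minimal C sketch of the boundary check for such a non-faulting operand load: if the operand's byte range crosses a BOUNDARY-aligned line, each line is probed and bytes belonging to an inaccessible line are filled with default data instead of faulting; the operand is then used in a computation. The boundary size, the line_accessible probe, and the zero default are assumptions for the example.

    #include <stdint.h>
    #include <stdio.h>

    #define BOUNDARY 16

    /* Stand-in probe: pretend the 16-byte line starting at byte 16 is inaccessible. */
    static int line_accessible(uint64_t line_base) { return line_base != 16; }

    static void load_nonfaulting(uint8_t *dst, uint64_t addr, int len,
                                 const uint8_t *backing) {
        uint64_t first_line = addr / BOUNDARY;
        uint64_t last_line  = (addr + (uint64_t)len - 1) / BOUNDARY;
        int crosses = (first_line != last_line);
        for (int i = 0; i < len; i++) {
            uint64_t line = (addr + (uint64_t)i) & ~(uint64_t)(BOUNDARY - 1);
            if (!crosses || line_accessible(line))
                dst[i] = backing[i];   /* unit accessible: real data */
            else
                dst[i] = 0;            /* unit inaccessible: default data */
        }
    }

    int main(void) {
        uint8_t backing[8] = { 10, 20, 30, 40, 50, 60, 70, 80 };
        uint8_t operand[8];
        load_nonfaulting(operand, 12, 8, backing);   /* range 12..19 crosses a line */
        int sum = 0;
        for (int i = 0; i < 8; i++) sum += operand[i];   /* operand used in a computation */
        printf("sum = %d\n", sum);                       /* 10+20+30+40 = 100 */
        return 0;
    }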

More details
Publication date: 05-01-2017

Multi-Section Garbage Collection

Number: US20170004072A1

The embodiments relate to a method for managing a garbage collection process. The method includes executing a garbage collection process on a memory block of user address space. A load instruction is run. Running the load instruction includes loading content of a storage location into a processor. The loaded content corresponds to a memory address. It is determined if the garbage collection process is being executed at the memory address. The load instruction is diverted to a process to move an object at the memory address to a location outside of the memory block in response to determining that the garbage collection process is being executed at the memory address. The load instruction is continued in response to determining that the garbage collection process is not being executed at the memory address.

1. A computer program product for facilitating garbage collection within a computing environment, the computer program product comprising: obtaining processing control by a handler executing within a processor of the computing environment, the obtaining processing control being based on execution of a load instruction and a determination that an object pointer to be loaded indicates a location within a selected portion of memory undergoing garbage collection; based on obtaining processing control by the handler, obtaining by the handler an image of the instruction and calculating a pointer address from the image, the address specifying a location of the object pointer; based on obtaining the address of the object pointer, reading, by the handler, the object pointer, the object pointer indicating a location of an object pointed to by the object pointer; determining by the handler whether the object pointer is to be modified; modifying by the handler, based on determining the object pointer is to be modified, the object pointer to provide a modified object pointer; and storing, based on modifying the object pointer, the modified object pointer in ...
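A rough C sketch of the load diversion: a loaded reference is compared against the block currently under collection, and if it falls inside, control is diverted to a relocation path before the value is handed back to the program. The range check, relocate(), and the integer representation of references are stand-ins chosen for the example.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint64_t lo, hi; } gc_block;

    static gc_block in_collection = { 0x10000, 0x20000 };   /* block being collected */

    /* Stand-in for moving the object out of the block and returning its new address. */
    static uint64_t relocate(uint64_t obj) {
        uint64_t moved = obj + 0x100000;
        printf("diverted: moved object 0x%llx -> 0x%llx\n",
               (unsigned long long)obj, (unsigned long long)moved);
        return moved;
    }

    static uint64_t load_reference(const uint64_t *slot) {
        uint64_t ref = *slot;                               /* the ordinary load */
        if (ref >= in_collection.lo && ref < in_collection.hi)
            ref = relocate(ref);                            /* diversion path */
        return ref;                                         /* load then continues normally */
    }

    int main(void) {
        uint64_t slot_in_block  = 0x10040;
        uint64_t slot_elsewhere = 0x90000;
        printf("got 0x%llx\n", (unsigned long long)load_reference(&slot_in_block));
        printf("got 0x%llx\n", (unsigned long long)load_reference(&slot_elsewhere));
        return 0;
    }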

More details
Publication date: 05-01-2017

GARBAGE COLLECTION HANDLER TO UPDATE OBJECT POINTERS

Number: US20170004073A1

Garbage collection processing is facilitated. Based on execution of a load instruction and determining that an object pointer to be loaded indicates a location within a selected portion of memory undergoing garbage collection, processing control is obtained by a handler executing within a processor of the computing environment. The handler obtains an address of the object pointer from a pre-defined location, reads the object pointer, and determines whether the object pointer is to be modified. If the object pointer is to be modified, the handler modifies the object pointer. The handler then stores the modified object pointer in a selected location.

1. A computer program product for facilitating garbage collection within a computing environment, said computer program product comprising: a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: obtaining processing control by a handler executing within a processor of the computing environment, the obtaining processing control being based on execution of a load instruction and a determination that an object pointer to be loaded indicates a location within a selected portion of memory undergoing garbage collection; based on obtaining processing control by the handler, obtaining by the handler from a pre-defined location an address of the object pointer, the address specifying a location of the object pointer; based on obtaining the address of the object pointer, reading, by the handler, the object pointer, the object pointer indicating a location of an object pointed to by the object pointer; determining by the handler whether the object pointer is to be modified; modifying by the handler, based on determining the object pointer is to be modified, the object pointer to provide a modified object pointer; and storing, based on modifying the object pointer, the modified object pointer in a selected location.
...
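A minimal C sketch of the handler path described above: a pre-defined location (a stand-in for a save area filled in when the load was intercepted) holds the address of the object pointer; the handler reads the pointer, consults a forwarding function to see whether the object moved, and stores the updated pointer back. The structure names, forwarding() and the fixed offset are illustrative assumptions.

    #include <stdint.h>
    #include <stdio.h>

    /* Stand-in for the pre-defined location populated when control is obtained. */
    typedef struct { uint64_t pointer_address; } predefined_area;

    /* Stand-in forwarding lookup: pretend every collected object moved by 0x1000. */
    static uint64_t forwarding(uint64_t old_obj) { return old_obj + 0x1000; }

    static void gc_handler(predefined_area *area) {
        uint64_t *slot = (uint64_t *)(uintptr_t)area->pointer_address;
        uint64_t obj   = *slot;               /* read the object pointer */
        uint64_t moved = forwarding(obj);     /* decide whether/where it moved */
        if (moved != obj)
            *slot = moved;                    /* store the modified pointer back */
        printf("handler rewrote pointer 0x%llx -> 0x%llx\n",
               (unsigned long long)obj, (unsigned long long)moved);
    }

    int main(void) {
        uint64_t slot = 0x00AB0000ULL;        /* an object pointer held in a heap slot */
        predefined_area area = { (uint64_t)(uintptr_t)&slot };
        gc_handler(&area);
        return 0;
    }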

More details
Publication date: 05-01-2017

GARBAGE COLLECTION ABSENT USE OF SPECIAL INSTRUCTIONS

Number: US20170004074A1

Garbage collection processing is facilitated. Based on execution of a load instruction and determining that an address of an object pointer to be loaded is located in a pointer storage area and the object pointer indicates a location within a selected portion of memory undergoing garbage collection, processing control is obtained by a handler executing within a processor of the computing environment. The handler obtains the object pointer from the pointer storage area, and determines whether the object pointer is to be modified. If the object pointer is to be modified, the handler modifies the object pointer. The handler may then store the modified object pointer in a selected location.

1. A computer program product for facilitating garbage collection within a computing environment, said computer program product comprising: a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: obtaining processing control by a handler executing within a processor of the computing environment, the obtaining processing control being based on execution of a load instruction and a determination that an address of an object pointer to be loaded is located in a pointer storage area and the object pointer indicates a location within a selected portion of memory undergoing garbage collection; based on obtaining processing control by the handler, obtaining by the handler from the pointer storage area the object pointer, the object pointer indicating a location of an object pointed to by the object pointer; determining by the handler whether the object pointer is to be modified; modifying by the handler, based on determining the object pointer is to be modified, the object pointer to provide a modified object pointer; and storing, based on modifying the object pointer, the modified object pointer in a selected location.
2. The computer program product of claim 1, ...
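To contrast with the preceding entry, the C sketch below keys the interception on an ordinary load: the handler steps in only when the address being loaded from lies inside a registered pointer storage area and the loaded pointer falls in the region under collection, so no special GC load instruction is required. The ranges, the in() helper, and the fix-up offset are assumptions for the example.

    #include <stdint.h>
    #include <stdio.h>

    typedef struct { uint64_t lo, hi; } range;

    static range pointer_area = { 0x1000,  0x2000  };   /* where object pointers live */
    static range gc_region    = { 0x80000, 0x90000 };   /* portion being collected */

    static int in(range r, uint64_t a) { return a >= r.lo && a < r.hi; }

    /* Ordinary load path with the handler check folded in for illustration. */
    static uint64_t handled_load(uint64_t slot_addr, uint64_t slot_value) {
        if (in(pointer_area, slot_addr) && in(gc_region, slot_value)) {
            uint64_t updated = slot_value + 0x10000;     /* pretend the object moved */
            printf("handler updated pointer at 0x%llx: 0x%llx -> 0x%llx\n",
                   (unsigned long long)slot_addr,
                   (unsigned long long)slot_value, (unsigned long long)updated);
            return updated;
        }
        return slot_value;                               /* plain load result */
    }

    int main(void) {
        printf("-> 0x%llx\n", (unsigned long long)handled_load(0x1040, 0x80100));
        printf("-> 0x%llx\n", (unsigned long long)handled_load(0x3040, 0x80100));
        return 0;
    }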

More details