Total found: 1508. Displayed: 197.
Publication date: 05-06-2019

Programmable devices for processing memory data transfer requests

Number: RU2690751C2

The invention relates to devices for processing memory data transfer requests. The technical result is an increase in memory throughput. The device includes a programmable memory data transfer request processing unit and a programmable direct memory access (DMA) unit. The request processing unit includes at least one programmable region descriptor. The DMA unit includes at least one programmable memory-to-memory transfer control descriptor. The DMA unit is adapted to send a memory data transfer request to the request processing unit. The request processing unit is adapted to receive and successfully process a memory data transfer request, sent by the DMA unit, that is addressed to a memory area associated with a part of at least one region descriptor of the request processing unit. 6 independent and 23 dependent claims, 24 figures.

Publication date: 05-07-2006

Multiple contexts for efficient use of translation lookaside buffer

Number: GB2421821A
Assignee:

The present invention provides a method and apparatus for increasing the efficiency of translation lookaside buffers by collapsing redundant translation table entries into a single translation table entry (TTE). In the present invention, each thread of a multithreaded processor is provided with multiple context registers. Each of these context registers is compared independently to the context of the TTE. If any of the contexts match (and the other match conditions are satisfied), then the translation is allowed to proceed. Two applications that share one page but otherwise keep separate pages can then employ three contexts in total: one context is for one application's private use; one is for the other application's private use; and a third is for the shared page. In one embodiment of the invention, two contexts are implemented per thread. However, the teachings of the present invention can be extended to a higher number of contexts per thread.
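The matching rule described in the abstract — allow the translation if any of a thread's context registers matches the TTE's context — can be sketched as follows. The `TTE` fields, the tuple of per-thread contexts, and the example values are illustrative assumptions, not the patented hardware design.

```python
# Illustrative sketch of TLB matching with multiple context registers
# per thread (layout and names are assumptions, not the patented design).
from dataclasses import dataclass

@dataclass
class TTE:
    vpn: int      # virtual page number
    ppn: int      # physical page number
    context: int  # single context field stored in the translation table entry

def tlb_lookup(tlb, vpn, thread_contexts):
    """Allow the translation if ANY of the thread's context registers
    matches the TTE's context (in addition to the usual VPN match)."""
    for tte in tlb:
        if tte.vpn == vpn and tte.context in thread_contexts:
            return tte.ppn
    return None  # TLB miss

# Two applications sharing one page: contexts 1 and 2 are private, and
# context 3 tags the shared page, so a single TTE serves both threads.
tlb = [TTE(vpn=0x10, ppn=0xA0, context=3)]
print(tlb_lookup(tlb, 0x10, thread_contexts=(1, 3)))  # hit for one app
print(tlb_lookup(tlb, 0x10, thread_contexts=(2, 3)))  # hit for the other
```

Without the shared third context, the two applications would each need their own copy of the TTE for the shared page, which is exactly the redundancy the scheme collapses.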

Publication date: 02-12-2015

An apparatus and method for operating a virtually indexed physically tagged cache

Number: GB0201518257D0
Author:
Assignee:

Publication date: 22-02-2023

Memory management unit

Number: GB0002610023A
Assignee:

The present disclosure advantageously provides a memory management unit and methods for invalidating cache lines in address translation caches. The memory management unit has one or more address translation caches, and each address translation cache has a plurality of cache lines. The memory management unit receives transactions from a source of transactions. The transactions include, inter alia, memory transactions and a set-aside translation transaction. The memory transactions include at least a first memory transaction and a last memory transaction, and each memory transaction includes the same virtual memory address and the same translation context identifier. The set-aside translation transaction also includes the same virtual memory address and the same translation context identifier. In response to receiving the set-aside translation transaction, the memory management unit deallocates each cache line that stores an address translation for the same virtual memory address and the ...

Publication date: 30-11-1978

Number: CH0000607139A5
Assignee: SPERRY RAND CORP

Publication date: 14-07-1978

Number: CH0000601859A5
Assignee: SIEMENS AG

Publication date: 19-01-2018

Memory physical-address query method and memory physical-address query devices

Number: CN0107608912A
Assignee:

Publication date: 07-11-2014

METHOD FOR FILTERING TRAFFIC TO A PHYSICALLY-TAGGED DATA CACHE

Number: KR0101458928B1
Author:
Assignee:

Publication date: 12-07-2012

MEMORY ADDRESS TRANSLATION

Number: WO2012094481A2
Assignee:

The present disclosure includes devices, systems, and methods for memory address translation. One or more embodiments include a memory array and a controller coupled to the array. The array includes a first table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a data segment stored in the array and a logical address. The controller includes a second table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a record in the first table and a logical address. The controller also includes a third table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a record in the second table and a logical address.
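The three-level chain of tables described above might be sketched like this. The dict-based tables and the example addresses are illustrative assumptions, not the actual record and entry layout.

```python
# Sketch of the three-level lookup chain: each entry pairs a logical
# address with the physical address of a record one level down, and the
# first table finally points at data segments stored in the array.
first_table  = {0x100: 0xD000}  # logical addr -> physical addr of data segment
second_table = {0x100: 0x100}   # logical addr -> record in the first table
third_table  = {0x100: 0x100}   # logical addr -> record in the second table

def translate(logical):
    rec2 = third_table[logical]   # controller-resident third table
    rec1 = second_table[rec2]     # controller-resident second table
    return first_table[rec1]      # array-resident first table -> data segment

print(hex(translate(0x100)))  # 0xd000
```

Keeping the upper two tables in the controller while the largest table lives in the array is what lets the controller resolve a logical address with a single array access at the end of the chain.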

Publication date: 02-07-1991

Coherent cache structures and methods

Number: US0005029070A
Author:
Assignee:

A multiprocessing system includes a cache coherency technique that ensures that every access to a line of data obtains the most up-to-date copy of that line, without storing cache coherency status bits in a global memory or any reference thereto. An operand cache includes a first directory which directly, on a one-to-one basis, maps a range of physical address bits into a first section of the operand cache storage. An associative directory multiply maps physical addresses outside of the range into a second section of the operand cache storage. All stack frames of user programs to be executed on a time-shared basis are stored in the first section, so cache misses due to stack operations are avoided. An instruction cache having various categories of instructions stores, with each instruction, a group of status bits identifying the instruction category. When a context switch occurs, only instructions of the category least likely to be used in the near future are cleared, decreasing delays ...

Publication date: 27-06-2000

In-line bank conflict detection and resolution in a multi-ported non-blocking cache

Number: US0006081873A
Author:
Assignee:

A data cache unit associated with a processor, the data cache unit including a multi-ported non-blocking cache receiving a data access request from a lower level device in the processor. A memory scheduling window includes at least one row of entries, wherein each entry includes an address field holding an address of the access request. A conflict map field within at least some of the entries is coupled to a conflict checking unit. The conflict checking unit responds to the address fields by setting bits in the conflict map fields to indicate intra-row conflicts between entries. A picker coupled to the memory scheduling window responds to the conflict map fields so as to identify groups of non-conflicting entries to launch in parallel at the multi-ported non-blocking cache.

Publication date: 03-07-2001

System and method of performing gateway access

Number: US0006256715B1

A virtual memory system including a local-to-global virtual address translator for translating local virtual addresses having associated task specific address spaces into global virtual addresses corresponding to an address space associated with multiple tasks, and a global virtual-to-physical address translator for translating global virtual addresses to physical addresses. Protection information is provided by each of the local virtual-to-global virtual address translator, the global virtual-to-physical address translator, the cache tag storage, or a protection information buffer depending on whether a cache hit or miss occurs during a given data or instruction access. Memory area priority protection is achieved by employing a gateway instruction which includes a gateway register pointer and the priority level of the instruction. The gateway register holds a pointer to a specific entry point within a high priority area within memory at which the lower priority gateway instruction may ...

Publication date: 05-07-1983

Diagnostic subsystem for a cache memory

Number: US0004392201A
Author:
Assignee:

A cache memory wherein data words identified by odd address numbers are stored separately from data words identified by even address numbers. A group of diagnostic control registers supply signals for controlling the testing of the cache within the cache memory to determine the operability of the individual elements included in the cache memory.

Publication date: 16-09-2003

System and methods using a system-on-a-chip with soft cache

Number: US0006622208B2

A soft cache system compares tag bits of a virtual address with tag fields of a plurality of soft cache register entries, each entry associated with an index to a corresponding cache line in virtual memory. A cache line size for the cache line is programmable. When the tag bits of the virtual address match the tag field of one of the soft cache entries, the index from that entry is selected for generating a physical address. The physical address is generated using the selected index as an offset to a corresponding soft cache space in memory.

Publication date: 24-02-2015

Access optimization method for main memory database based on page-coloring

Number: US0008966171B2

An access optimization method for a main memory database based on page-coloring is described. An access sequence of all data pages of a weak locality dataset is ordered by page-color, and all the data pages are grouped by page-color, and then all the data pages of the weak locality dataset are scanned in a sequence of page-color grouping. Further, a number of memory pages having the same page-color are preset as a page-color queue, in which the page-color queue serves as a memory cache before a memory page is loaded into a CPU cache; the data page of the weak locality dataset first enters the page-color queue in an asynchronous mode, and is then loaded into the CPU cache to complete data processing. Accordingly, cache conflicts between datasets with different data locality strengths can be effectively reduced.
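The color-ordered scan described above can be sketched as follows. The cache geometry (2 MiB, 16-way, 4 KiB pages) and the page-color formula for a physically indexed cache are illustrative assumptions.

```python
# Sketch of ordering a weak-locality dataset scan by page color
# (cache geometry values below are illustrative assumptions).
PAGE_SIZE  = 4096
CACHE_SIZE = 2 * 1024 * 1024          # 2 MiB last-level cache
NUM_WAYS   = 16
NUM_COLORS = CACHE_SIZE // (NUM_WAYS * PAGE_SIZE)  # distinct page colors

def page_color(page_frame_number):
    # Pages whose frame numbers share this value map to the same cache sets.
    return page_frame_number % NUM_COLORS

def order_by_color(pages):
    """Group the dataset's pages by color, then scan color group by
    group, so pages contending for the same cache sets are processed
    together rather than interleaved with other colors."""
    return sorted(pages, key=page_color)

pages = [7, 3, 35, 64, 96, 2]
print(order_by_color(pages))  # pages grouped by color class
```

Processing one color group at a time is what bounds which cache sets the weak-locality scan can evict from, so it stops polluting the sets used by stronger-locality datasets.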

Publication date: 15-11-2001

Method and apparatus for determining interleaving schemes in a computer system that supports multiple interleaving schemes

Number: US2001042174A1
Author:
Assignee:

A method and apparatus determines interleaving schemes in a computer system that supports multiple interleaving schemes. In one embodiment, a memory interleaving scheme lookup table is used to assign memory interleaving schemes based on the number of available bank bits. In another embodiment, the percentage of concurrent memory operations is increased by assigning memory interleaving schemes to bank bits based on the classification of bank bits. The present invention supports a memory organization that provides separate memory busses that support independent simultaneous memory transactions, and memory bus segments that allow memory read operations to be overlapped with memory write operations, with each memory bus segment capable of carrying a single memory operation at any given time. Bank bits that distinguish between memory busses are classified as class A, bank bits that distinguish between memory bus segments are classified as class B, and bank bits that distinguish between banks on ...

Publication date: 20-06-2019

TWO ADDRESS TRANSLATIONS FROM A SINGLE TABLE LOOK-ASIDE BUFFER READ

Number: US20190188151A1
Assignee:

A streaming engine employed in a digital data processor specifies a fixed read only data stream. An address generator produces virtual addresses of data elements. An address translation unit converts these virtual addresses to physical addresses by comparing the most significant bits of a next address N with the virtual address bits of each entry in an address translation table. Upon a match, the translated address is the physical address bits of the matching entry and the least significant bits of address N. The address translation unit can generate two translated addresses. If the most significant bits of address N+1 match those of address N, the same physical address bits are used for translation of address N+1. The sequential nature of the data stream increases the probability that consecutive addresses match the same address translation entry and can use this technique.
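The reuse of a single address translation entry for two consecutive stream addresses might look like this in outline. The 12-bit page offset and the table layout are assumed field widths, not the patent's actual parameters.

```python
# Sketch of producing two translated addresses from a single address
# translation table read (12-bit page offset is an assumed field width).
PAGE_BITS = 12
OFFSET_MASK = (1 << PAGE_BITS) - 1

def translate_pair(table, addr_n, addr_n1):
    """Translate address N; reuse the matching entry for address N+1
    when the upper (page) bits of N+1 match those of N."""
    vpn_n = addr_n >> PAGE_BITS
    for entry_vpn, entry_ppn in table:
        if entry_vpn == vpn_n:
            phys_n = (entry_ppn << PAGE_BITS) | (addr_n & OFFSET_MASK)
            phys_n1 = None
            if (addr_n1 >> PAGE_BITS) == vpn_n:  # sequential stream: same page
                phys_n1 = (entry_ppn << PAGE_BITS) | (addr_n1 & OFFSET_MASK)
            return phys_n, phys_n1
    return None, None  # miss for address N

table = [(0x12345, 0xABCDE)]
print(translate_pair(table, 0x12345678, 0x1234567C))  # both in one page
```

Because a fixed read-only stream walks addresses sequentially, consecutive addresses usually fall in the same page, so the second translation is frequently free.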

Publication date: 29-04-2014

Image forming apparatus and method of translating virtual memory address into physical memory address

Number: US0008711418B2

An image forming apparatus includes a function unit to perform functions of the image forming apparatus, and a control unit to control the function unit to perform the functions of the image forming apparatus. The control unit includes a processor core to operate in a virtual memory address, a main memory to operate in a physical memory address and store data used in the functions of the image forming apparatus, and a plurality of input/output (I/O) logics to operate in the virtual memory address and control at least one of the functions performed by the image forming apparatus. Each of the plurality of I/O logics translates the virtual memory address into the physical memory address corresponding to the virtual memory address and accesses the main memory.

Publication date: 21-08-2012

Operating system virtual memory management for hardware transactional memory

Number: US0008250331B2

Operating system virtual memory management for hardware transactional memory. A method may be performed in a computing environment where an application running on a first hardware thread has been in a hardware transaction, with transactional memory hardware state in cache entries correlated by memory hardware when data is read from or written to data cache entries. The data cache entries are correlated to physical addresses in a first physical page mapped from a first virtual page in a virtual memory page table. The method includes an operating system deciding to unmap the first virtual page. As a result, the operating system removes the mapping of the first virtual page to the first physical page from the virtual memory page table. As a result, the operating system performs an action to discard transactional memory hardware state for at least the first physical page. Embodiments may further suspend hardware transactions in kernel mode. Embodiments may further perform soft page fault handling ...

Publication date: 03-10-2002

Descriptor table storing segment descriptors of varying size

Number: US2002144080A1
Author:
Assignee:

A segment descriptor table stores segment descriptors of different sizes. Smaller segment descriptors may be similar to the x86 architecture definition, and larger segment descriptors may be used to provide virtual addresses (e.g. base addresses or offsets) having more than 32 bits. By providing a segment descriptor table that stores different-sized segment descriptors, maintaining multiple segment descriptor tables for different operating modes may be avoided while providing support for segment descriptors having addresses greater than 32 bits. In one embodiment, the larger segment descriptors may be twice the size of the smaller segment descriptors. The segment descriptor table may comprise entries, each capable of storing a smaller segment descriptor, and a larger segment descriptor may occupy two entries of the table.

Publication date: 05-04-2022

Method and apparatus for vector permutation

Number: US0011294826B2
Assignee: Texas Instruments Incorporated

A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.

Publication date: 26-10-2023

TRANSLATION TABLE ADDRESS STORAGE CIRCUITRY

Number: US20230342303A1
Assignee:

An apparatus has address translation circuitry to translate a target virtual address (VA) specified by a memory access request into a target physical address, first/second translation table address storage circuitry to store first/second translation table addresses; and protected region defining data storage circuitry to store region defining data specifying at least one protected region of virtual address space. In response to the memory access request: when the target VA is in the protected region(s), the address translation circuitry translates the target VA based on address translation data from a first translation table structure identified by the first translation table address. When the target VA is outside the protected region(s), the target VA is translated based on address translation data from a second translation table structure identified by the second translation table address.
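Selecting between the two translation table bases by protected region can be sketched as below. The names (`ttbr0_protected`, `ttbr1_normal`) and the region list are hypothetical, not the patent's terminology.

```python
# Sketch of choosing which translation table structure to walk based on
# whether the target VA lies in a protected region (names are assumptions).
def select_table_base(va, protected_regions, ttbr0_protected, ttbr1_normal):
    """Return the translation table base address to walk for this access."""
    for lo, hi in protected_regions:
        if lo <= va < hi:
            return ttbr0_protected  # first translation table structure
    return ttbr1_normal             # second translation table structure

regions = [(0x4000_0000, 0x5000_0000)]  # one protected VA region
print(hex(select_table_base(0x4800_0000, regions, 0x8000, 0x9000)))  # 0x8000
print(hex(select_table_base(0x1000_0000, regions, 0x8000, 0x9000)))  # 0x9000
```

The point of the split is that accesses inside the protected region can be forced through a table the less-trusted software cannot modify, while everything else uses the ordinary table.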

Publication date: 07-06-2006

Microprocessor with variable latency stack cache

Number: EP0001555617A3
Author: Hooker, Rodney E.
Assignee:

A variable latency cache memory is disclosed. The cache memory includes a plurality of storage elements for storing stack memory data in a first-in-first-out manner. The cache memory distinguishes between pop and load instruction requests and provides pop data faster than load data by speculating that pop data will be in the top cache line of the cache. The cache memory also speculates that stack data requested by load instructions will be in the top one or more cache lines of the cache memory. Consequently, if the source virtual address of a load instruction hits in the top of the cache memory, the data is speculatively provided faster than the case where the data is in a lower cache line or where a full physical address compare is required or where the data must be provided from a non-stack cache memory in the microprocessor, but slower than pop data.

Publication date: 16-11-2005

Controller for instruction cache and instruction translation look-aside buffer, and method of controlling the same

Number: GB0000520272D0
Author:
Assignee:

Publication date: 14-03-2001

Method and system for providing a high bandwidth cache that enables simultaneous reads and writes within the cache

Number: GB0000102442D0
Author:
Assignee:

Publication date: 05-10-2016

Memory management

Number: GB0002536880A
Assignee:

A multiple stage memory management unit (MMU) 10 comprises a first MMU stage 12 to translate an input virtual memory address to an intermediate memory address, generating a set (burst) of intermediate memory addresses including the corresponding intermediate memory address; and a second MMU stage 14 to translate an intermediate memory address received from the first MMU stage to a physical memory address, providing a set (burst) of physical memory addresses including the physical memory address corresponding to the intermediate memory address. The first MMU stage is configured to provide a subset of the burst of intermediate memory addresses to the second MMU stage for translation. The subset does not include intermediate memory addresses, such as those within a threshold separation 1030, whose corresponding physical memory address will be provided by the second MMU stage in response to translation of another of the intermediate memory addresses in the set. A multiple stage memory management ...

Publication date: 19-12-1991

METHOD AND APPARATUS FOR CONTROLLING THE CONVERSION OF VIRTUAL TO PHYSICAL MEMORY ADDRESSES IN A DIGITAL COMPUTER SYSTEM

Number: AU0005395090A
Author: NAME NOT GIVEN
Assignee:

Publication date: 10-01-1984

VIRTUAL MEMORY TERMINAL

Number: CA1160351A

A microprocessor-controlled terminal (1) with keyboard input (4) and CRT display output (6) employs virtual memory and store management techniques to enable an operator to run any of a wide variety of different applications (e.g. text, image, graphics, data entry, reservation, payroll, etc.) in the terminal. Application procedure records and data records relating to a selected application are copied as required from a local backing store (14) and/or over a communications link (9) from a main store (16) of a central processing unit (12) into a dynamically managed region (20) of random access memory (RAM) (3), under control of primitive microprocessor control instructions permanently held in read-only store (10). In view of the virtual memory techniques employed, the backing store (14) and main store (16) appear to be logical extensions of RAM (3). The records copied to region (20) are contiguously stored as variable length segments in successive free storage ...

Publication date: 26-10-1993

PROCESSING OF MEMORY ACCESS EXCEPTIONS WITH PRE-FETCHED INSTRUCTIONS WITHIN THE INSTRUCTION PIPELINE OF A VIRTUAL MEMORY SYSTEM-BASED DIGITAL COMPUTER

Number: CA0001323701C

A technique for processing memory access exceptions along with pre-fetched instructions in a pipelined instruction processing computer system is based upon the concept of pipelining exception information along with other parts of the instruction being executed. In response to the detection of access exceptions at a pipeline stage, corresponding fault information is generated and transferred along the pipeline. The fault information is acted upon only when the instruction reaches the execution stage of the pipeline. Each stage of the instruction pipeline is ported into the front end of a memory unit adapted to perform the virtual-to-physical address translation; each port being provided with means for storing virtual addresses accompanying an instruction as well as means for storing corresponding fault information. When a memory access exception ...

Publication date: 16-12-1994

Cache memory for a digital processor with translation of virtual addresses into physical addresses

Number: FR0002682507B1
Author:
Assignee: INTEL CORP

Publication date: 05-03-2014

OPTIMIZATIONS FOR AN UNBOUNDED TRANSACTIONAL MEMORY (UTM) SYSTEM

Number: KR0101370314B1
Author:
Assignee:

Publication date: 06-08-1998

A CACHE SYSTEM

Number: WO1998034172A1
Assignee:

A cache system is provided which includes a cache memory and a cache refill mechanism which allocates one or more of a set of cache partitions in the cache memory to an item in dependence on the address of the item in main memory. This is achieved in one of the described embodiments by including with the address of an item a set of partition selector bits which allow a partition mask to be generated to identify into which cache partition the item may be loaded.

Publication date: 22-06-2017

TRANSLATION ENTRY INVALIDATION IN A MULTITHREADED DATA PROCESSING SYSTEM

Number: US20170177501A1
Assignee:

In a multithreaded data processing system including a plurality of processor cores, storage-modifying requests, including a translation invalidation request of an initiating hardware thread, are received in a shared queue. The translation invalidation request is removed and buffered in sidecar logic. While the translation invalidation request is buffered in the sidecar logic, the sidecar logic broadcasts the translation invalidation request so that it is received and processed by the plurality of processor cores. In response to confirmation of completion of processing of the translation invalidation request by the initiating processor core, the sidecar logic removes the translation invalidation request from the sidecar. Completion of processing of the translation invalidation request at all of the plurality of processor cores is ensured by a broadcast synchronization request. Subsequent memory referent instructions are ordered with respect to the broadcast synchronization request by a synchronization ...

Publication date: 06-07-2021

Valid bits of a translation lookaside buffer (TLB) for checking multiple page sizes in one probe cycle and reconfigurable sub-TLBs

Number: US0011055232B2
Assignee: Intel Corporation

A processor includes a translation lookaside buffer (TLB) to store a TLB entry, wherein the TLB entry comprises a first set of valid bits to identify if the first TLB entry corresponds to a virtual address from a memory access request, wherein the valid bits are set based on a first page size associated with the TLB entry from a first set of different page sizes assigned to a first probe group; and a control circuit to probe the TLB for each page size of the first set of different page sizes assigned to the first probe group in a single probe cycle to determine if the TLB entry corresponds to the virtual address from the memory access request.
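One way to picture checking several page sizes in a single probe cycle is below. The two-size probe group, the entry encoding, and the tag convention are illustrative assumptions, not the patented circuit.

```python
# Sketch of a TLB entry whose valid bits indicate which page sizes in a
# probe group the entry may match, so one probe cycle checks every size
# (two-size group and encoding are illustrative assumptions).
PAGE_SHIFTS = {0: 12, 1: 21}  # probe group: 4 KiB and 2 MiB pages

# Entry installed for a 2 MiB page: only the size-1 valid bit is set,
# and the tag is the virtual address shifted by that page's size.
entry = {"tag": 0x12345678 >> 21, "valid": {0: False, 1: True}}

def probe(entry, va):
    """Single probe cycle: test the entry against every page size whose
    valid bit is set, instead of issuing one probe per page size."""
    for size_idx, shift in PAGE_SHIFTS.items():
        if entry["valid"][size_idx] and (va >> shift) == entry["tag"]:
            return 1 << shift  # hit: the page size that matched
    return None  # miss

print(probe(entry, 0x12345678))  # hit as a 2 MiB mapping
print(probe(entry, 0x0))         # miss
```

The win over a conventional design is that a mixed-page-size workload no longer needs one probe cycle per candidate page size, or one rigidly partitioned sub-TLB per size.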

Publication date: 01-06-2017

HIGHLY INTEGRATED SCALABLE, FLEXIBLE DSP MEGAMODULE ARCHITECTURE

Number: US20170153890A1
Assignee:

This invention implements a range of interesting technologies in a single block. Each DSP CPU has a streaming engine. The streaming engines include: an SE-to-L2 interface that can request 512 bits/cycle from L2; a loose binding between the SE and the L2 interface, to allow a single stream to peak at 1024 bits/cycle; one-way coherence, where the SE sees all earlier writes cached in the system, but not writes that occur after the stream opens; and full protection against single-bit data errors within its internal storage via single-bit parity with semi-automatic restart on parity error.

Publication date: 20-07-2006

Multiple contexts for efficient use of translation lookaside buffer

Number: US20060161760A1
Assignee: Sun Microsystems, Inc.

The present invention provides a method and apparatus for increased efficiency for translation lookaside buffers by collapsing redundant translation table entries into a single translation table entry (TTE). In the present invention, each thread of a multithreaded processor is provided with multiple context registers. Each of these context registers is compared independently to the context of the TTE. If any of the contexts match (and the other match conditions are satisfied), then the translation is allowed to proceed. Two applications attempting to share one page but that still keep separate pages can then employ three total contexts. One context is for one application's private use; one of the contexts is for the other application's private use; and a third context is for the shared page. In one embodiment of the invention, two contexts are implemented per thread. However, the teachings of the present invention can be extended to a higher number of contexts per thread.

Publication date: 01-10-2020

PERFORMANCE MANAGEMENT UNIT (PMU) AIDED TIER SELECTION IN HETEROGENEOUS MEMORY

Number: US20200310957A1
Assignee: Intel Corporation

A processor including a processing core to execute an instruction prior to executing a memory allocation call; one or more last branch record (LBR) registers to store one or more recently retired branch instructions; a performance monitoring unit (PMU) comprising a logic circuit to: retrieve the one or more recently retired branch instructions from the one or more LBR registers; identify, based on the retired branch instructions, a signature of the memory allocation call; provide the signature to software to determine a memory tier to allocate memory for the memory allocation call.

Publication date: 09-04-2019

Method and apparatus for sub-page write protection

Number: US0010255196B2
Assignee: Intel Corporation

An apparatus and method for sub-page extended page table protection. For example, one embodiment of an apparatus comprises: a page miss handler to perform a page walk using a guest physical address (GPA) and to detect whether a page identified with the GPA is mapped with sub-page permissions; a sub-page control storage to store at least one GPA and other data related to a sub-page; the page miss handler to determine whether the GPA is programmed in the sub-page control storage; and the page miss handler to send a translation to a translation lookaside buffer (TLB) with a sub-page protection indication set to cause a matching of the sub-page control storage when an access matches a TLB entry with sub-page protection indication.

Publication date: 18-07-2017

Hierarchical translation structures providing separate translations for instruction fetches and data accesses

Number: US0009710382B2

Hierarchical address translation structures providing separate translations for instruction fetches and data accesses. An address is to be translated to another address using a hierarchy of address translation structures. The hierarchy of address translation structures includes a plurality of levels, and a determination is made as to the level of the plurality of levels at which translation through the hierarchy is indicated to split into a plurality of translation paths. The hierarchy of address translation structures is traversed to obtain information to be used to translate the address, in which the traversing selects, based on the determined split level and on an attribute of the address to be translated, one translation path of the plurality of translation paths from which to obtain the information. The information is then used ...

Publication date: 24-04-1984

Data steering logic for the output of a cache memory having an odd/even bank structure

Number: US0004445172A
Author:
Assignee:

A cache memory including an even data store for storing data words associated with even address numbers and an odd data store for storing data words associated with odd address numbers. A local bus transfers a low-order data word and a high-order data word simultaneously from the cache memory to a system element that requests the transfer of a pair of data words by supplying a single address number. A data steering multiplexer supplies the data word associated with the requested address number, as output from either the odd or the even cache data store, to the low-order data word transfer portion of the local bus, and supplies the other data word of the pair to the high-order data word transfer portion of the local bus.

Publication date: 29-05-2001

Prefetching hints

Number: US0006240488B1

A processor capable of executing prefetching instructions containing hint fields is provided. The hint fields contain a first portion which enables the selection of a destination indicator for refill operations, and a second portion which identifies a destination.

05-04-2016 publication date

Mixed memory type hybrid cache

Number: US0009304913B2
Assignee: QUALCOMM INCORPORATED, QUALCOMM INC

A hybrid cache includes a static random access memory (SRAM) portion and a resistive random access memory portion. Cache lines of the hybrid cache are configured to include both SRAM macros and resistive random access memory macros. The hybrid cache is configured so that the SRAM macros are accessed before the resistive random access memory macros in each cache access cycle. While the SRAM macros are accessed, the slower resistive random access memory macros reach a data access ready state.

26-01-2021 publication date

Write data allocation in storage system

Number: US0010901906B2

This disclosure provides a method, a computing system and a computer program product for allocating write data in a storage system. The storage system comprises a Non-Volatile Write Cache (NVWC) and a backend storage subsystem, and the write data comprises first data whose addresses are not in the NVWC. The method includes checking fullness of the NVWC, and determining at least one of a write-back mechanism or a write-through mechanism as a write mode for the first data based on the checked fullness.

13-06-2024 publication date

Accessing a Cache Based on an Address Translation Buffer Result

Number: US20240193097A1
Assignee: Advanced Micro Devices, Inc.

Address translation is performed to translate a virtual address targeted by a memory request (e.g., a load or memory request for data or an instruction) to a physical address. This translation is performed using an address translation buffer, e.g., a translation lookaside buffer (TLB). One or more actions are taken to reduce data access latencies for memory requests in the event of a TLB miss where the virtual address to physical address translation is not in the TLB. Examples of actions that are performed in various implementations in response to a TLB miss include bypassing level 1 (L1) and level 2 (L2) caches in the memory system, and speculatively sending the memory request to the L2 cache while checking whether the memory request is satisfied by the L1 cache.

28-04-1999 publication date

Cache memory with reduced access time

Number: EP0000911737A1
Author: Naffziger, Samuel D.

A cache with a translation lookaside buffer (TLB) (210) that eliminates the need for retrieval of a physical address tag from the TLB when accessing the cache. The TLB includes two content addressable memories (CAM's) (206, 208). For each new cache line, in the tag portion of the cache (204), instead of storing physical tags, the cache stores vectors called physical hit vectors. Physical hit vectors are generated by a first TLB CAM (206). Each physical hit vector indicates all locations in the first TLB CAM containing the physical tag (203) of the cache line. For a cache access, a second TLB CAM (208) receives a virtual tag (202) and generates a vector called a virtual hit vector (214). The virtual hit vector indicates the location in the second TLB CAM of the corresponding virtual tag. Then, instead of retrieving and comparing physical tags, the cache compares a virtual hit vector to a set of physical hit vectors without having to retrieve a physical tag. As a result, one operation is ...
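A rough Python sketch of the hit-vector comparison described here (names and sizes are assumptions; the real design does this with CAMs in a single operation):

```python
# Illustrative model: instead of physical tags, each cache line's tag entry
# holds a "physical hit vector" (bitmask of TLB slots whose physical tag
# matches the line). A lookup turns the virtual tag into a "virtual hit
# vector" (bitmask of TLB slots holding that virtual tag); the cache hits
# when the two vectors share a set bit, so no physical tag is ever read out
# and compared.

def hit_vector(slots, tag):
    """Bitmask with bit i set when TLB slot i holds `tag`."""
    v = 0
    for i, t in enumerate(slots):
        if t == tag:
            v |= 1 << i
    return v

def cache_hits(virtual_hit_vector, physical_hit_vector):
    # A single AND replaces the physical-tag retrieval and comparison.
    return (virtual_hit_vector & physical_hit_vector) != 0

# Toy TLB: slot i maps virtual_tags[i] -> physical_tags[i].
virtual_tags = [0x10, 0x20, 0x30]
physical_tags = [0xA0, 0xB0, 0xA0]
```

For a line whose physical tag 0xA0 sits in slots 0 and 2, the stored physical hit vector is 0b101; a lookup of virtual tag 0x10 (slot 0) hits, while 0x20 (slot 1) misses.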

26-11-2018 publication date

Number: RU2017118316A3

20-02-2002 publication date

Cache chain structure to implement high bandwidth low latency cache memory subsystem

Number: GB0002365591A

The inventive cache 100 uses a queuing structure 206, 209 which provides out-of-order cache memory access support for multiple accesses, as well as support for managing bank conflicts and address conflicts. The inventive cache can support four data accesses that are hits per clock, support one access that misses the L1 cache every clock, and support one instruction access every clock. The responses are interspersed in the pipeline, so that conflicts in the queue are minimized. Non-conflicting accesses are not inhibited; however, conflicting accesses are held up until the conflict clears. The inventive cache provides out-of-order support after the retirement stage of a pipeline.

11-11-2015 publication date

Hazard Checking

Number: GB0201516967D0

19-09-1979 publication date

DATA PROCESSING SYSTEMS

Number: GB0001553048A

16-07-2014 publication date

Processor with kernel mode access to user space virtual addresses

Number: GB0201409727D0

05-11-2014 publication date

Cache hashing

Number: GB0201416619D0

16-02-2018 publication date

Access System And Method For Data Storage

Number: CN0107710172A

03-08-2018 publication date

Address caching in switches

Number: KR1020180088525A

... Methods, systems, and apparatus, including computer programs encoded on computer storage media, are disclosed for storing an address in a memory of a switch. One of the systems includes: a switch that receives packets from devices connected to a bus and forwards packets to the devices, with no components on the bus between the switch and each of the devices; a memory integrated into the switch for storing mappings of virtual addresses to physical addresses; and a storage medium integrated into the switch that stores instructions executable by the switch, the instructions causing the switch to perform operations including: receiving a response to an address translation request for a device connected to the switch by the bus, the response including a mapping of a virtual address to a physical address; and, in response to receiving the response, storing the mapping of the virtual address to the physical address in the memory.

20-02-2018 publication date

Programmable memory transfer request units

Number: KR1020180016982A
Author: Gittins, Benjamin

... An apparatus (100) includes a programmable memory transfer request processing (PMTRP) unit (120) and a programmable direct memory access (PDMA) unit (140). The PMTRP unit (120) includes at least one programmable region descriptor (123). The PDMA unit (140) includes at least one programmable memory-to-memory transfer control descriptor (148, 149, 150). The PDMA unit (140) is adapted to send (143) a memory transfer request to the PMTRP unit (120). The PMTRP unit (120) is adapted to receive (134) and successfully process a memory transfer request issued by the PDMA unit (120) that is addressed to a memory location associated with a portion of at least one of the at least one region descriptor (123) of the PMTRP unit (120).

03-04-2015 publication date

Number: KR1020150034661A

13-06-2013 publication date

PROCESSING UNIT AND METHOD FOR CONTROLLING PROCESSING UNIT

Number: WO2013084315A1

A processing unit set forth in one aspect of the present invention is provided with: a plurality of processors containing a first cache memory; a second cache memory that holds data resulting from operations performed by each of the plurality of processors; an acquisition unit that acquires attribute information pertaining to the first cache memory, which is to be controlled, of the source of an access request to be controlled, said attribute information containing pathway information representing the pathway of a cache block of the first cache memory; a holding unit that holds address information and the attribute information; and a controller that controls an access request relating to a replacement request for a cache block of a second cache memory specified by address information and attribute information to be controlled, on the basis of an address to be replaced, which is included in an access request relating to a replacement request issued by one of the plurality of processors pertaining ...

02-07-1991 publication date

Coherent cache structures and methods

Number: US0005029070A1
Assignee: Edge Computer Corporation

A multiprocessing system includes a cache coherency technique that ensures that every access to a line of data is the most up-to-date copy of that line without storing cache coherency status bits in a global memory and any reference thereto. An operand cache includes a first directory which directly, on a one-to-one basis, maps a range of physical address bits into a first section of the operand cache storage. An associative directory multiply maps physical addresses outside of the range into a second section of the operand cache storage section. All stack frames of user programs to be executed on a time-shared basis are stored in the first section, so cache misses due to stack operations are avoided. An instruction cache having various categories of instructions stores a group of status bits identifying the instruction category with each instruction. When a context switch occurs, only instructions of the category least likely to be used in the near future are cleared decreasing delays ...

17-08-1993 publication date

Translation lookaside buffer shutdown scheme

Number: US0005237671A

Apparatus for temporarily disabling a translation lookaside buffer in a computer system upon the occurrence of certain predefined system conditions. Such conditions may be of a first type which have been predetermined to indicate a greater risk that two or more virtual addresses stored in the TLB will simultaneously match the incoming virtual address, and/or of a second type in which access to the TLB is not needed. An example of the first type is a reference to an unmapped segment of memory. An example of the second type is the processing of a non-memory-access instruction. The apparatus may further include failsafe circuitry to shut down the TLB if at least a given number of matches occur at any time and for any reason, the given number being greater than 1. The apparatus prevents loss of data or damage to the chip where match comparisons are performed in parallel.

30-08-2016 publication date

Adjunct component to provide full virtualization using paravirtualized hypervisors

Number: US0009430398B2

A system configuration is provided with a paravirtualizing hypervisor that supports different types of guests, including those that use a single level of translation and those that use a nested level of translation. When an address translation fault occurs during a nested level of translation, an indication of the fault is received by an adjunct component. The adjunct component addresses the address translation fault, at least in part, on behalf of the guest.

11-05-2017 publication date

TWO ADDRESS TRANSLATIONS FROM A SINGLE TABLE LOOK-ASIDE BUFFER READ

Number: US20170132149A1

A streaming engine employed in a digital data processor specifies a fixed read only data stream. An address generator produces virtual addresses of data elements. An address translation unit converts these virtual addresses to physical addresses by comparing the most significant bits of a next address N with the virtual address bits of each entry in an address translation table. Upon a match, the translated address is the physical address bits of the matching entry and the least significant bits of address N. The address translation unit can generate two translated addresses. If the most significant bits of address N+1 match those of address N, the same physical address bits are used for translation of address N+1. The sequential nature of the data stream increases the probability that consecutive addresses match the same address translation entry and can use this technique. 1. An address translation unit comprising:an address translation table storing a plurality of entries each including a set of a first plurality of most significant virtual address bits and a set of a second plurality of most significant physical address bits;a first comparator connected to said address generator and said address translation table, said comparator comparing said first plurality of most significant bits of said next sequential address N with said first plurality of most significant virtual address bits of each entry of said address translation table, said comparator upon detecting a match generating an entry select signal indicating which entry matched;a multiplexer connected to said address translation table and said comparator, said multiplexer having plural input each receiving said second plurality of most significant physical address bits of a corresponding entry of said address translation table, an output and a control input receiving said entry select signal, said multiplexer outputting said second plurality of most significant physical address bits of one entry of said 
...
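The mechanism in this abstract — reusing one translation-table match for sequential addresses — can be illustrated with a short Python sketch (the page size and table layout are assumptions for illustration):

```python
# Illustrative model: split each virtual address into page bits (most
# significant) and offset bits (least significant); a translation-table entry
# pairs virtual page bits with physical page bits. When address N+1 has the
# same page bits as address N, the entry matched for N is reused.

OFFSET_BITS = 12  # assumed 4 KiB pages for the sketch

def translate(table, vaddr):
    """table: dict of virtual page number -> physical page number."""
    vpn = vaddr >> OFFSET_BITS
    ppn = table[vpn]  # stands in for the comparator match
    return (ppn << OFFSET_BITS) | (vaddr & ((1 << OFFSET_BITS) - 1))

def translate_pair(table, vaddr_n, vaddr_n1):
    """Translate addresses N and N+1, reusing N's entry when page bits match."""
    pa_n = translate(table, vaddr_n)
    if (vaddr_n1 >> OFFSET_BITS) == (vaddr_n >> OFFSET_BITS):
        # Same entry: splice N+1's offset onto N's physical page bits.
        low_mask = (1 << OFFSET_BITS) - 1
        pa_n1 = (pa_n & ~low_mask) | (vaddr_n1 & low_mask)
    else:
        pa_n1 = translate(table, vaddr_n1)
    return pa_n, pa_n1
```

Sequential stream addresses usually fall on the same page, so the second lookup is avoided most of the time; only a page-boundary crossing needs a second table match.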

18-06-2020 publication date

TRANSLATION LOOKASIDE BUFFER CACHE MARKER SCHEME FOR EMULATING SINGLE-CYCLE PAGE TABLE ENTRY INVALIDATION

Number: US20200192818A1

A system and method for emulating single-cycle translation lookaside buffer invalidation are described. One embodiment of a method comprises defining a translation lookaside buffer (TLB) cache marking variable comprising a first marker value and a second marker value. A context bank marker associated with a translation context bank is initiated with one of the first marker value and the second marker value. A TLB cache entry table specifies whether each of a plurality of TLB cache entries associated with the translation context bank has a corresponding entry marker set to the first marker value or the second marker value. In response to a TLB invalidate command associated with the translation context bank, the context bank marker is changed from the one of the first marker value and the second marker value to the other of the first marker value and the second marker value prior to initiating TLB invalidation.
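A minimal Python model of the marker scheme this abstract describes (the structure is simplified: the two marker values and the lazy effect of flipping them are modeled, the hardware tables are not):

```python
# Each cached entry records the context bank's marker value at fill time, and
# an entry only counts as valid while its marker equals the bank's current
# marker. "Invalidate all" is then just flipping the bank marker -- no
# per-entry work -- which is what emulates a single-cycle invalidation; stale
# entries can be scrubbed later.

class MarkedTLB:
    def __init__(self):
        self.bank_marker = 0   # flips between the two marker values
        self.entries = {}      # vpn -> (ppn, entry_marker)

    def fill(self, vpn, ppn):
        self.entries[vpn] = (ppn, self.bank_marker)

    def lookup(self, vpn):
        entry = self.entries.get(vpn)
        if entry is not None and entry[1] == self.bank_marker:
            return entry[0]
        return None            # miss, or marker mismatch (stale entry)

    def invalidate_all(self):
        self.bank_marker ^= 1  # constant-time, regardless of entry count
```

After `invalidate_all()`, every pre-existing entry stops matching on lookup even though nothing was erased; new fills use the new marker and are immediately visible.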

05-04-2018 publication date

QUEUING MEMORY ACCESS REQUESTS

Number: US20180095893A1
Assignee: ARM LTD

A data processing apparatus is provided including queue circuitry to respond to control signals each associated with a memory access instruction, and to queue a plurality of requests for data, each associated with a reference to a storage location. Resolution circuitry acquires a request for data, and issues the request for data, the resolution circuitry having a resolution circuitry limit. When a current capacity of the resolution circuitry is below the resolution circuitry limit, the resolution circuitry acquires the request for data by receiving the request for data from the queue circuitry, stores the request for data in association with the storage location, issues the request for data, and causes a result of issuing the request for data to be provided to said storage location. When the current capacity of the resolution circuitry meets or exceeds the resolution circuitry limit, the resolution circuitry acquires the request for data by examining a next request for data in the queue circuitry and issues a further request for the data based on the request for data.

22-10-2015 publication date

MANAGING TRANSLATIONS ACROSS MULTIPLE CONTEXTS USING A TLB WITH ENTRIES DIRECTED TO MULTIPLE PRIVILEGE LEVELS AND TO MULTIPLE TYPES OF ADDRESS SPACES

Number: US20150301951A1

For a current context in control of a processor requesting access to a particular address, a translation lookaside buffer (TLB) controller specifies a virtual address with a logical partition identifier value indicating a privilege setting of the current context, a process identifier value indicating whether the address is within shared address space, and an effective address comprising at least a portion of the particular address. In response to the virtual address not matching at least one entry within a TLB comprising at least one entry stored for at least one previous translation of at least one previous address, the TLB controller translates the virtual address into a real page number using at least one page table and adding a new entry to the TLB with the virtual address and the real page number, wherein each at least one entry within the TLB identifies a separate privilege setting from among a plurality of privilege settings and a separate indicator of whether the address is within ...

08-09-2015 publication date

Coherent memory scheme for heterogeneous processors

Number: US0009128849B2

Systems, methods, and devices for maintaining cache coherence between two or more heterogeneous processors are provided. In accordance with one embodiment, such an electronic device may include memory, a first processing unit having a first characteristic memory usage rate, and a second processing unit having a second characteristic memory usage rate lower than the first. The first and second processing units may share at least a portion of the memory and one or both of the first and second processing units may maintain internal cache coherence at a first granularity, while maintaining cache coherence between the first processing unit and the second processing unit at a second granularity. The first granularity may be finer than the second granularity.

08-06-2006 publication date

System, method and computer program product for application-level cache-mapping awareness and reallocation requests

Number: US20060123196A1

In view of the foregoing shortcomings of the prior art cache optimization techniques, the present invention provides an improved method, system, and computer program product that can optimize cache utilization. In one embodiment, an application requests a kernel cache map from a kernel service and receives the kernel cache map. The application designs an optimum cache footprint for a data set from said application. The objects, advantages and features of the present invention will become apparent from the following detailed description. In one embodiment of the present invention, the application transmits a memory reallocation order to a memory manager. In one embodiment of the present invention, the step of the application transmitting a memory reallocation order to the memory manager further comprises the application transmitting a memory reallocation order containing the optimum cache footprint to the memory manager. In one embodiment of the present invention, the step of ...

02-04-2015 publication date

MULTI-STAGE ADDRESS TRANSLATION FOR A COMPUTING DEVICE

Number: US20150095610A1
Assignee: APPLIED MICRO CIRCUITS CORPORATION

Providing for address translation in a virtualized system environment is disclosed herein. By way of example, a memory management apparatus is provided that comprises a shared translation look-aside buffer (TLB) that includes a plurality of translation types, each supporting a plurality of page sizes, one or more processors, and a memory management controller configured to work with the one or more processors. The memory management controller includes logic configured for caching virtual address to physical address translations and intermediate physical address to physical address translations in the shared TLB, logic configured to receive a virtual address for translation from a requester, logic configured to conduct a table walk of a translation table in the shared TLB to determine a translated physical address in accordance with the virtual address, and logic configured to transmit the translated physical address to the requester. 1. A shared translation look-aside buffer (TLB) in a computer system , comprising:a memory cache;a memory cache controller operably connected to the memory cache and configured to store multiple distinct types of memory address translation entries within the memory cache, the multiple distinct types comprising multiple stage-1 translation entries, or multiple stage-2 translation entries, or a combination of a stage-1 and a stage-2 translation entry; andan interface operably connected to the memory cache controller and configured to receive a first memory address and transmit a response comprising a second memory address retrieved from the memory cache by the memory cache controller.2. The TLB of claim 1 , wherein the multiple stage-1 translation entries comprise a virtual address to physical address hypervisor memory address translation entry and a non-virtualized virtual address to physical address memory address translation entry.3. The TLB of claim 1 , wherein the multiple stage-1 translation entries comprise a virtual address to ...

15-11-2022 publication date

Apparatus and method for controlling input/output throughput of a memory system

Number: US0011500720B2
Author: Jeen Park
Assignee: SK hynix Inc.

A memory system includes a memory device including a plurality of memory units capable of inputting or outputting data individually, and a controller coupled with the plurality of memory units via a plurality of data paths. The controller is configured to perform a correlation operation on two or more read requests among a plurality of read requests input from an external device, so that the plurality of memory units output plural pieces of data corresponding to the plurality of read requests via the plurality of data paths based on an interleaving manner. The controller is configured to determine whether to load map data associated with the plurality of read requests before a count of the plurality of read requests reaches a threshold, to divide the plurality of read request into two groups based on whether to load the map data, and to perform the correlation operation per group.

27-06-2024 publication date

BACKWARD COMPATIBILITY TESTING OF SOFTWARE IN A MODE THAT ATTEMPTS TO INDUCE SKEW

Number: US20240211380A1

A device and computer program product including one or more processors and a memory coupled to the one or more processors. The device being configured to selectively run in a timing testing mode or in a mode of operation other than the timing testing mode, wherein in the timing testing mode the device is configured to attempt to induce skew.

18-11-1992 publication date

PROCESSOR FOR MULTIPLE CACHE COHERENT PROTOCOLS

Number: GB0009220788D0

31-05-2006 publication date

Sharing a block of memory between processes on a portable electronic device

Number: GB0002420642A

Process 305 creates and names a block of shared memory 314 in memory 300. This block is mapped onto process 305's virtual address space 330 at a random address 370. The MMU stores a mapping adjustment 325. Process 305 creates a pointer and stores it in shared memory 315 which includes virtual address 370. Process 310 wishing to access a specific data address in shared memory 315 maps the block of shared memory into its own virtual space 340 at address 375. To make use of a pointer stored by process 305, it must first correct the pointer by adding or subtracting the difference between its offset value 375 (as maintained by the MMU) and that value 370 of process 305 calculated with respect to some baseline 360. When memory is always allocated in pages (e.g. 4096 bytes), such as in the Symbian (r.t.m.) operating system, as an alternative to the above method the pointers may be corrected by creating a certain number of the most significant bits of an address, as a number of the least significant ...
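The pointer correction described above reduces to simple offset arithmetic; a minimal sketch (variable names are illustrative, not from the patent):

```python
# Process A maps the shared block at base_a and stores raw pointers that are
# virtual addresses in its own address space. Process B, which mapped the
# same block at base_b, corrects a stored pointer by applying the difference
# between the two mapping offsets.

def correct_pointer(stored_ptr, base_a, base_b):
    """Translate a pointer stored by the process mapped at `base_a`
    into the address space of the process mapped at `base_b`."""
    return stored_ptr - base_a + base_b
```

For example, a pointer 0x20 bytes into a block mapped at 0x50000000 by one process becomes 0x70000020 for a process that mapped the same block at 0x70000000.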

22-11-1978 publication date

COMPUTER MEMORY SYSTEMS

Number: GB0001532798A

... 1532798 Computer memory systems SPERRY RAND CORP 10 Nov 1975 [11 Nov 1974] 46376/75 Heading G4C In a computer memory system comprising a relatively large capacity slow cycle time main memory 36 for storing data blocks at addressable locations therein, a relatively small capacity fast cycle time buffer memory 24, 26 for storing a subset of the data blocks stored in the main memory 36, and a plurality of requestor units (which may be central processing units and/or input/output devices) in communication with the buffer memory 24, 26, when a particular address is requested by a requestor unit a check is made to determine if that address is resident in the buffer memory 24, 26 and if so the corresponding data is available for reading or writing; if not a block in the buffer memory 24, 26 is selected for replacement, e.g. on a least recently accessed basis, and the block in the main memory 36 at the requested address is transferred to the buffer memory 24, 26, the block in buffer memory 24, ...

15-11-1988 publication date

DATA PROCESSING SYSTEM.

Number: AT0000038442T

15-01-2002 publication date

SINGLE-SEQUENCE TO MULTIPLE-ACCESSIBLE INTERLOCKED CACHE

Number: AT0000211276T

23-11-2017 publication date

Programmable memory transfer request units

Number: AU2016245421A1
Assignee: Benjamin Aaron Gittins

An apparatus (100) comprising a programmable memory transfer request processing (PMTRP) unit (120) and a programmable direct memory access (PDMA) unit (140). The PMTRP unit (120) comprises at least one programmable region descriptor (123). The PDMA unit (140) comprises at least one programmable memory-to-memory transfer control descriptor (148, 149, 150). The PDMA unit (140) is adapted to send (143) a memory transfer request to the PMTRP unit (120). The PMTRP unit (120) is adapted to receive (134) and successfully process a memory transfer request issued by the PDMA unit (120) that is addressed to a memory location that is associated with a portion of at least one of the at least one region descriptor (123) of the PMTRP unit (120).

22-02-2017 publication date

ASSOCIATING CACHE MEMORY WITH A WORK PROCESS

Number: CN0106462599A
Author: MORETTI MICHAEL J

12-09-2018 publication date

Cache system having a primary cache and an overflow cache that use different indexing schemes

Number: KR0101898322B1

... A cache memory system includes a primary cache and an overflow cache that are searched together using a search address. The overflow cache operates as an eviction array for the primary cache. The primary cache is addressed using bits of the search address, and the overflow cache is addressed by a hash index generated by a hash function applied to bits of the search address. To improve overall cache utilization, the hash function operates to distribute victims evicted from the primary cache to different sets of the overflow cache. A hash generator is included to perform the hash function. A hash table is included to store the hash indices of valid entries in the primary cache. The cache memory system is used to implement a translation lookaside buffer for a microprocessor.
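A toy Python model of the two indexing schemes in this abstract (the set counts and the XOR-fold hash are assumptions for illustration, not the patented function):

```python
# The primary cache set is taken directly from low bits of the search
# address, while the overflow (eviction) cache set comes from a hash of the
# address bits. Victims evicted from one primary set can therefore be spread
# across several overflow sets instead of piling into one.

PRIMARY_SETS = 16
OVERFLOW_SETS = 16

def primary_index(addr):
    return addr % PRIMARY_SETS                    # plain low-bit indexing

def overflow_index(addr):
    # Illustrative hash: XOR-fold two higher bit fields of the address.
    return ((addr >> 4) ^ (addr >> 8)) % OVERFLOW_SETS

# Addresses that collide in the primary cache need not collide in the
# overflow cache:
a, b = 0x013, 0x123  # same low 4 bits -> same primary set
```

Here `a` and `b` map to the same primary set but to different overflow sets, which is the utilization benefit the abstract describes.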

23-12-1994 publication date

Number: KR19940011668B1

07-03-2013 publication date

PROGRAMMABLY PARTITIONING CACHES

Number: WO2013032437A1
Author: KACEVAS, Nicolas

Agents may be assigned to discrete portions of a cache. In some cases, more than one agent may be assigned to the same cache portion. The size of the portion, the assignment of agents to the portion and the number of agents may be programmed dynamically in some embodiments.

30-07-1987 publication date

PAGED MEMORY MANAGEMENT UNIT CAPABLE OF SELECTIVELY SUPPORTING MULTIPLE ADDRESS SPACES

Number: WO1987004544A1

A paged memory management unit (PMMU) (20) adapted to selectively access a plurality of pointer tables (PT) and page tables (PG) stored in a memory (18) to translate a selected logical address (LA) into a corresponding physical address (PA) by first combining a first portion of the logical address (PA) and a first table pointer to access a first one of the pointer tables (PT) to obtain therefrom a page table pointer to a selected one of the page tables (PG) and then combining a second portion of the logical address (LA) and the page table pointer to access the selected page table (PG) to obtain therefrom the physical address (PA). If desired, an address space selector may be considered as an extension of the logical address (LA). If so, the PMMU (20) may be selectively enabled to initially combine the first table pointer and the address space selector to access a second one of the pointer tables (PT) to obtain therefrom a second table pointer and then combine the first portion of the logical ...

22-03-2005 publication date

Cache system

Number: US0006871266B2

A cache system is provided which includes a cache memory and a cache refill mechanism which allocates one or more of a set of cache partitions in the cache memory to an item in dependence on the address of the item in main memory. This is achieved in one of the described embodiments by including with the address of an item a set of partition selector bits which allow a partition mask to be generated to identify into which cache partition the item may be loaded.

22-02-2000 publication date

Cache bank conflict avoidance and cache collision avoidance

Number: US0006029225A

The inventive mechanism determines whether memory source and destination addresses map to the same or nearly the same cache address. If they map to different addresses, then loads and stores are ordered so that loads to one cache bank are performed on the same clock cycles as the stores to another cache bank. After a group of loads and stores are completed, then load and store operations for each bank are switched. If the source and destination addresses map to nearly the same cache address and if the source address is prior to the destination address, then a group of cache lines is loaded into registers and stored to memory without any interleaving of other loads and stores. If the source and destination addresses map to the same cache location, then an initial load of data into registers is performed. After that, additional loads are interleaved with non-cache conflicting stores to move new values into memory. Thus, loads and stores to matching cache addresses are separated by time.
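The address-mapping test this abstract relies on can be sketched as follows (the cache geometry is an assumed example, not the patent's parameters):

```python
# Source and destination of a memory copy conflict when they fall on the same
# cache index; with even/odd line interleaving, adjacent lines land in
# different banks, which is what lets loads to one bank overlap stores to the
# other.

LINE_SIZE = 32      # bytes per cache line (assumption)
NUM_INDICES = 1024  # indices per bank (assumption)

def cache_index(addr):
    return (addr // LINE_SIZE) % NUM_INDICES

def same_cache_index(src, dst):
    """True when a copy from src to dst would thrash one cache location."""
    return cache_index(src) == cache_index(dst)

def bank(addr):
    # Even/odd line interleaving across two banks (assumption).
    return (addr // LINE_SIZE) % 2
```

Addresses exactly `LINE_SIZE * NUM_INDICES` bytes apart alias to the same index (the conflicting case), while addresses one line apart map to different indices and different banks (the case where loads and stores can be paired per clock).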

27-02-2007 publication date

Configurable cache allowing cache-type and buffer-type access

Number: US000RE39500E1
Author: Craig C. Hansen
Assignee: MicroUnity Systems Engineering, Inc.

A virtual memory system including a local-to-global virtual address translator for translating local virtual addresses having associated task specific address spaces into global virtual addresses corresponding to an address space associated with multiple tasks, and a global virtual-to-physical address translator for translating global virtual addresses to physical addresses. Protection information is provided by each of the local virtual-to-global virtual address translator, the global virtual-to-physical address translator, the cache tag storage, or a protection information buffer depending on whether a cache hit or miss occurs during a given data or instruction access. The cache is configurable such that it can be configured into a buffer portion or a cache portion for faster cache accesses.

01-12-2020 publication date

Arithmetic processing apparatus and method of controlling arithmetic processing apparatus

Number: US0010853072B2
Assignee: FUJITSU LIMITED, FUJITSU LTD

An arithmetic processing apparatus includes: an instruction controller, a first level cache, and a second level cache. The instruction controller, for a memory access instruction to be speculatively executed while a branch destination of a branch instruction is undetermined, adds a valid speculation flag and an instruction identifier of the branch instruction to the memory access instruction and issues it to the first level cache. The first level cache controller interrupts execution of the memory access instruction when the virtual address of the memory access instruction hits in a TLB of the first level cache, the speculation flag of the memory access instruction is valid, and an entry having a virtual address matching that of the memory access instruction stores a speculative access prohibition flag prohibiting speculative access.

02-05-2013 publication date

METHOD AND SYSTEM FOR CACHING ATTRIBUTE DATA FOR MATCHING ATTRIBUTES WITH PHYSICAL ADDRESSES

Number: US20130111184A1
Assignee: Intellectual Venture Funding LLC

A method for caching attribute data for matching attributes with physical addresses. The method includes storing a plurality of attribute entries in a memory, wherein the memory is configured to provide at least one attribute entry when accessed with a physical address, and wherein the attribute entry provided describes characteristics of the physical address. 120-. (canceled)21. A method comprising:accessing a logic to compute at least one attribute for a physical address in response to a translation look aside buffer (TLB) miss and an attribute cache miss; andstoring the at least one attribute and the physical address in an attribute cache.22. The method of claim 21 , further comprising:storing the at least one attribute and the physical address in a translation look aside buffer (TLB).23. The method of claim 22 , wherein the attribute cache comprises a greater number of entries than the TLB.24. The method of claim 21 , wherein the attribute cache is operable to cache a first plurality of attributes that are time-consuming to obtain.25. The method of claim 21 , further comprising:determining whether the attribute cache is configured to speculatively load a plurality of attributes for a plurality of speculative physical addresses.26. The method of claim 25 , further comprising:responsive to the attribute cache being configured to speculatively load, accessing the logic to compute the plurality of attributes for the plurality of speculative physical addresses.27. The method of claim 26 , further comprising:storing the plurality of attributes and the plurality of speculative physical addresses in the attribute cache.28. An apparatus comprising:attribute logic operable to generate at least one attribute for a physical address in response to a translation look aside buffer (TLB) miss and an attribute cache miss; andan attribute cache operable to store the least one attribute and the physical address.29. The apparatus of claim 28 , wherein the attribute cache is ...

09-05-2013 publication date

Managing Chip Multi-Processors Through Virtual Domains

Number: US20130117521A1
Assignee: Hewlett Packard Development Co LP

A chip multi-processor (CMP) with virtual domain management. The CMP has a plurality of tiles each including a core and a cache, a mapping storage, a plurality of memory controllers, a communication bus interconnecting the tiles and the memory controllers, and machine-executable instructions. The tiles and memory controllers are responsive to the instructions to group the tiles into a plurality of virtual domains, each virtual domain associated with at least one memory controller, and to store a mapping unique to each virtual domain in the mapping storage.
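The grouping described above — tiles partitioned into virtual domains, each bound to at least one memory controller, with a per-domain mapping held in a mapping storage — might be modeled as follows. The class and field names are invented for illustration.

```python
class CMP:
    def __init__(self, n_tiles, n_controllers):
        self.tiles = list(range(n_tiles))
        self.controllers = list(range(n_controllers))
        self.mapping_storage = {}   # domain id -> mapping unique to that domain

    def group(self, domains):
        # domains: {domain_id: (tile_ids, controller_ids)}
        for dom, (tiles, ctrls) in domains.items():
            # each virtual domain must be associated with >= 1 memory controller
            assert ctrls, "virtual domain needs a memory controller"
            self.mapping_storage[dom] = {"tiles": tiles, "controllers": ctrls}

chip = CMP(n_tiles=16, n_controllers=4)
chip.group({0: ([0, 1, 2, 3], [0]),
            1: ([4, 5, 6, 7], [1, 2])})
```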

18-07-2013 publication date

Determining Cache Hit/Miss of Aliased Addresses in Virtually-Tagged Cache(s), and Related Systems and Methods

Number: US20130185520A1
Assignee: Qualcomm Inc

Apparatuses and related systems and methods for determining cache hit/miss of aliased addresses in virtually-tagged cache(s) are disclosed. In one embodiment, a virtual aliasing cache hit/miss detector for a VIVT cache is provided. The detector comprises a TLB configured to receive a first virtual address and a second virtual address from the VIVT cache resulting from an indexed read into the VIVT cache based on the first virtual address. The TLB is further configured to generate first and second physical addresses translated from the first and second virtual addresses, respectively. The detector further comprises a comparator configured to receive the first and second physical addresses and effectuate a generation of an aliased cache hit/miss indicator based on a comparison of the first and second physical addresses. In this manner, the virtual aliasing cache hit/miss detector correctly generates cache hits and cache misses, even in the presence of aliased addressing.
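The comparator idea reduces to translating both virtual addresses and comparing the physical results: equal physical addresses mean the two virtual names alias the same page, so the access is really a hit. A minimal sketch, with an invented TLB mapping:

```python
# vaddr page -> paddr page; 0x1000 and 0x5000 alias the same physical page
tlb = {0x1000: 0x9000, 0x5000: 0x9000, 0x2000: 0x7000}

def aliased_hit(lookup_vaddr, tagged_vaddr):
    pa1 = tlb[lookup_vaddr]   # translation of the incoming address
    pa2 = tlb[tagged_vaddr]   # translation of the address tagged in the cache line
    return pa1 == pa2         # comparator output: aliased cache hit/miss indicator

hit = aliased_hit(0x1000, 0x5000)   # two virtual names of one physical page
miss = aliased_hit(0x1000, 0x2000)  # genuinely different physical pages
```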

29-08-2013 publication date

MEMORY ADDRESS TRANSLATION

Number: US20130227247A1
Assignee: MICRON TECHNOLOGY, INC.

The present disclosure includes devices, systems, and methods for memory address translation. One or more embodiments include a memory array and a controller coupled to the array. The array includes a first table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a data segment stored in the array and a logical address. The controller includes a second table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a record in the first table and a logical address. The controller also includes a third table having a number of records, wherein each record includes a number of entries, wherein each entry includes a physical address corresponding to a record in the second table and a logical address.

1. A memory device, comprising: a memory array including a first table having a number of records, wherein each of the number of records includes a number of entries; and each of the number of entries includes a physical address corresponding to a data segment stored in the array and a logical address; and a controller coupled to the array and including a second table having a number of records, wherein each of the number of records includes a number of entries; and each of the number of entries includes a physical address corresponding to a record in the first table and a logical address.
2. The memory device of claim 1, wherein the controller includes a third table having a number of records.
3. The memory device of claim 2, wherein: each of the number of records in the third table includes a number of entries; and each of the number of entries in the third table includes a physical address corresponding to a record in the second table and a logical address.
4. The memory device of claim 1, wherein the controller includes a cache storing one or more of the number of records in the first ...
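The three-table chain can be modeled as a hierarchical lookup: the controller's third table points at records of the second table, the second at records of the first table in the array, and the first table's entries hold the physical address of the data segment. The record names and layout below are invented to show the indirection only.

```python
first  = {"r0": {"lba0": "seg_phys_0", "lba1": "seg_phys_1"}}   # in the memory array
second = {"s0": {"lba0": "r0", "lba1": "r0"}}                   # in the controller
third  = {"t0": {"lba0": "s0", "lba1": "s0"}}                   # in the controller

def translate(lba, third_record="t0"):
    second_record = third[third_record][lba]   # third table -> second-table record
    first_record = second[second_record][lba]  # second table -> first-table record
    return first[first_record][lba]            # first table -> data segment address

pa = translate("lba1")
```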

17-10-2013 publication date

Programmably Partitioning Caches

Number: US20130275683A1
Author: Kacevas Nicolas
Assignee: Intel Corporation

Agents may be assigned to discrete portions of a cache. In some cases, more than one agent may be assigned to the same cache portion. The size of the portion, the assignment of agents to the portion, and the number of agents may be programmed dynamically in some embodiments.

1. A method comprising: programmably assigning agents to discrete portions of a cache.
2. The method of including programmably assigning more than one agent to the same discrete cache portion.
3. The method of including programmably setting the size of a cache portion.
4. The method of including dynamically changing the assignments of one or more agents to a cache portion.
5. The method of including assigning agents to discrete portions of a cache in the form of a translation lookaside buffer.
6. The method of including using a cache having an associativity greater than four ways.
7. A non-transitory computer readable medium storing instructions to cause a core to: assign more than one agent to a discrete part of a cache.
8. The medium of further storing instructions to dynamically change the assignment of more than one agent to said discrete part of said cache.
9. The medium of further storing instructions to programmably set the size of a cache part.
10. The medium of further storing instructions to assign agents to discrete parts of a cache.
11. The medium of further storing instructions to change the assignments of one or more agents to a cache part.
12. The medium of further storing instructions to assign agents to discrete parts of a cache in the form of a translation lookaside buffer.
13. The medium of further storing instructions to use a cache having an associativity greater than four ways.
14. An apparatus comprising: a processor core; and a cache coupled to said core, said core to assign agents to discrete portions of a cache.
15. The apparatus of claim 14, said core to programmably assign more than one agent to the same discrete cache portion.
16. The apparatus of claim 14, said core to ...
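A minimal sketch of the programmable partitioning (class and agent names invented): more than one agent may share a discrete portion, and both portion sizes and assignments can be reprogrammed at run time.

```python
class PartitionedCache:
    def __init__(self, n_ways=8):
        self.n_ways = n_ways
        self.assignment = {}      # agent -> portion id
        self.portion_ways = {}    # portion id -> number of ways in the portion

    def set_portion(self, portion, ways):
        self.portion_ways[portion] = ways   # programmably set the portion size

    def assign(self, agent, portion):
        self.assignment[agent] = portion    # dynamic (re)assignment of an agent

cache = PartitionedCache()
cache.set_portion(0, ways=2)
cache.assign("core0", 0)
cache.assign("dma", 0)    # two agents share one discrete portion
cache.assign("dma", 1)    # reassignment changes the mapping dynamically
```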

28-11-2013 publication date

Apparatus and method for accelerating operations in a processor which uses shared virtual memory

Number: US20130318323A1
Assignee: Intel Corp

An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory.

09-01-2014 publication date

COMBINING A REMOTE TLB LOOKUP AND A SUBSEQUENT CACHE MISS INTO A SINGLE COHERENCE OPERATION

Number: US20140013074A1
Assignee: ORACLE INTERNATIONAL CORPORATION

The disclosed embodiments provide techniques for reducing address-translation latency and the serialization latency of combined TLB and data cache misses in a coherent shared-memory system. For instance, the last-level TLB structures of two or more multiprocessor nodes can be configured to act together as either a distributed shared last-level TLB or a directory-based shared last-level TLB. Such TLB-sharing techniques increase the total amount of useful translations that are cached by the system, thereby reducing the number of page-table walks and improving performance. Furthermore, a coherent shared-memory system with a shared last-level TLB can be further configured to fuse TLB and cache misses such that some of the latency of data coherence operations is overlapped with address translation and data cache access latencies, thereby further improving the performance of memory operations.

1. A computer-implemented method that combines a remote TLB lookup and a subsequent cache miss into a single coherence operation, wherein the method operates in a shared-memory multiprocessor system which partitions a virtual address space and a physical address space across two or more nodes of the shared-memory multiprocessor system, wherein a last-level TLB structure in each node is responsible for TLB entries for a subset of the virtual address space, and wherein the last-level TLB structures of the nodes collectively form a shared last-level TLB, the method comprising: receiving a memory operation with a virtual address; determining that one or more TLB levels in a first node will miss for the virtual address; sending a TLB request to a second node associated with the virtual address, wherein the last-level TLB structure in the second node is responsible for a subset of the virtual address space that includes the virtual address; and upon determining a hit for the virtual address in the last-level TLB structure of the second node, ensuring that a corresponding TLB entry is sent ...

27-02-2014 publication date

Method, apparatus, and system for speculative abort control mechanisms

Number: US20140059333A1
Assignee: Intel Corp

An apparatus and method is described herein for providing robust speculative code section abort control mechanisms. Hardware is able to track speculative code region abort events, conditions, and/or scenarios, such as an explicit abort instruction, a data conflict, a speculative timer expiration, a disallowed instruction attribute or type, etc. And hardware, firmware, software, or a combination thereof makes an abort determination based on the tracked abort events. As an example, hardware may make an initial abort determination based on one or more predefined events or choose to pass the event information up to a firmware or software handler to make such an abort determination. Upon determining an abort of a speculative code region is to be performed, hardware, firmware, software, or a combination thereof performs the abort, which may include following a fallback path specified by hardware or software. And to enable testing of such a fallback path, in one implementation, hardware provides software a mechanism to always abort speculative code regions.
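The abort determination described above — some tracked events abort in hardware directly, others are passed up to a software handler to decide — might be sketched like this. The event names and handler policy are invented for illustration.

```python
HW_ABORT_EVENTS = {"data_conflict", "explicit_abort", "timer_expired"}
HANDLER_EVENTS = {"disallowed_instruction"}

def software_handler(event):
    # Stand-in policy: the handler refuses to continue past this event.
    return event != "disallowed_instruction"

def should_abort(tracked_events):
    for ev in tracked_events:
        if ev in HW_ABORT_EVENTS:
            return True    # hardware makes the initial abort determination
        if ev in HANDLER_EVENTS and not software_handler(ev):
            return True    # decision deferred to the firmware/software handler
    return False           # region may commit

r1 = should_abort(["data_conflict"])
r2 = should_abort([])
r3 = should_abort(["disallowed_instruction"])
```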

10-04-2014 publication date

Adjunct component to provide full virtualization using paravirtualized hypervisors

Number: US20140101360A1
Author: Michael K. Gschwind
Assignee: International Business Machines Corp

A system configuration is provided with a paravirtualizing hypervisor that supports different types of guests, including those that use a single level of translation and those that use a nested level of translation. When an address translation fault occurs during a nested level of translation, an indication of the fault is received by an adjunct component. The adjunct component addresses the address translation fault, at least in part, on behalf of the guest.

01-01-2015 publication date

Shared Virtual Memory Between A Host And Discrete Graphics Device In A Computing System

Number: US20150002526A1
Author: Ginzburg Boris
Assignee:

In one embodiment, the present invention includes a device that has a device processor and a device memory. The device can couple to a host with a host processor and host memory. Both of the memories can have page tables to map virtual addresses to physical addresses of the corresponding memory, and the two memories may appear to a user-level application as a single virtual memory space. Other embodiments are described and claimed.

1. A system on chip (SoC) comprising: a plurality of cores; a host memory controller to couple to a host memory; a plurality of graphics units coupled to the plurality of cores; and a device memory controller to couple to a device memory, the plurality of graphics units and the plurality of cores having a shared virtual address space, wherein on a page fault in a first graphics unit, the first graphics unit is to request a missing page from the host memory via a host page table that maps first virtual addresses to physical addresses of the host memory, the first graphics unit having a device page table to map second virtual addresses to physical addresses of the device memory.
2. The SoC of claim 1, wherein the host memory and the device memory appear to a user-level application as a single virtual memory space.
3. The SoC of claim 1, wherein the device memory is to act as a page-based cache memory of the host memory.
4. The SoC of claim 3, wherein coherency between the device memory and the host memory is maintained implicitly without programmer interaction.
5. The SoC of claim 1, wherein one of the plurality of cores is to provide the missing page from the host memory to the device processor if present therein, and to set a not present indicator in the host memory for the corresponding page if the missing page is write enabled, wherein when the not present indicator is set, the plurality of cores is prevented from accessing the corresponding page in the host memory.
6. The SoC of claim 1, wherein one of the ...
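The fault path in claims 1 and 5 can be modeled as follows (all data structures invented): a device page fault fetches the missing page via the host page table, and a writable page is marked not-present on the host so the two copies do not silently diverge.

```python
host_page_table = {0x1000: {"pa": 0x9000, "present": True, "writable": True}}
host_memory = {0x9000: "page-data"}
device_page_table = {}
device_memory = {}

def device_access(va):
    if va not in device_page_table:                  # device-side page fault
        entry = host_page_table[va]
        device_memory[va] = host_memory[entry["pa"]] # request the missing page
        device_page_table[va] = {"present": True}
        if entry["writable"]:
            entry["present"] = False   # host may no longer access this page
    return device_memory[va]

data = device_access(0x1000)
```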

04-01-2018 publication date

READ FROM MEMORY INSTRUCTIONS, PROCESSORS, METHODS, AND SYSTEMS, THAT DO NOT TAKE EXCEPTION ON DEFECTIVE DATA

Number: US20180004595A1
Assignee: Intel Corporation

A processor of an aspect includes a decode unit to decode a read from memory instruction. The read from memory instruction is to indicate a source memory operand and a destination storage location. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the read from memory instruction, is to read data from the source memory operand, store an indication of defective data in an architecturally visible storage location, when the data is defective, and complete execution of the read from memory instruction without causing an exceptional condition, when the data is defective. Other processors, methods, systems, and instructions are disclosed.

1. A processor comprising: a decode unit to decode a read from memory instruction, the read from memory instruction to indicate a source memory operand and a destination storage location; and an execution unit coupled with the decode unit, the execution unit, in response to the read from memory instruction, to: read data from the source memory operand; store an indication of defective data in an architecturally visible storage location, when the data is defective; and complete execution of the read from memory instruction without causing an exceptional condition, when the data is defective.
2. The processor of claim 1, wherein the execution unit, in response to the read from memory instruction, is to read the data from a block storage memory location, which is to be in a physical memory address space that is addressable by the read from memory instruction, and wherein the processor comprises a general-purpose central processing unit (CPU).
3. The processor of claim 1, wherein the decode unit is to decode a second read from memory instruction, which is to have a same opcode as the read from memory instruction, and which is to indicate a second source memory operand and a second destination storage location, and wherein ...
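The instruction's behavior can be sketched as below (storage names invented): the read always completes, and instead of raising an exceptional condition on defective data it records an indication in an architecturally visible location for software to inspect.

```python
DEFECT_STATUS = {"defective": False}        # architecturally visible storage

def read_from_memory(memory, addr):
    data, defective = memory[addr]          # (value, poisoned?) pair
    if defective:
        DEFECT_STATUS["defective"] = True   # record the defect, do not fault
    return data                             # execution always completes

mem = {0x10: (0xAB, False), 0x20: (0x00, True)}
v1 = read_from_memory(mem, 0x10)
v2 = read_from_memory(mem, 0x20)            # defective, but no exception raised
```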

20-01-2022 publication date

Write broadcast operations associated with a memory device

Number: US20220020424A1
Assignee: Micron Technology Inc

Methods, systems, and devices related to write broadcast operations associated with a memory device are described. In one example, a memory device in accordance with the described techniques may include a memory array, a sense amplifier array, and a signal development cache configured to store signals (e.g., cache signals, signal states) associated with logic states (e.g., memory states) that may be stored at the memory array (e.g., according to various read or write operations). The memory device may enable write broadcast operations. A write broadcast may occur from one or more signal development components or from one or more multiplexers to multiple locations of the memory array.

11-01-2018 publication date

INFORMATION PROCESSING APPARATUS AND CACHE INFORMATION OUTPUT METHOD

Number: US20180011795A1
Author: SUGISAKI Yoshinori
Assignee: FUJITSU LIMITED

An information processing apparatus includes a memory, and a processor coupled to the memory and configured to count a first number indicating storing of a plurality of arrays of data to each of the cache lines, the data being accessed in accordance with execution of a program, and count a second number indicating cache thrashing to the cache lines when the first number exceeds the number of ways of the cache.

1. An information processing apparatus comprising: a memory; and a processor coupled to the memory and configured to: count a first number indicating storing of a plurality of arrays of data to each of the cache lines, the data being accessed in accordance with execution of a program; and count a second number indicating cache thrashing to the cache lines when the first number exceeds the number of ways of the cache.
2. The information processing apparatus according to claim 1, wherein the plurality of arrays are contained in a predetermined instruction enclosed by a loop instruction in a source code of the program; and the processor is configured to count the second number in accordance with execution of the predetermined instruction.
3. The information processing apparatus according to claim 1, the processor further configured to select the plurality of arrays to be monitored before counting the first number.
4. The information processing apparatus according to claim 1, the processor further configured to output the second number after counting the second number.
5. The information processing apparatus according to claim 1, the processor further configured to: output information indicating the cache line storing the plurality of arrays of data in accordance with execution of the program; and judge an occurrence of the cache thrashing on the basis of the information indicating the cache line.
6. The information processing apparatus according to claim 1, the processor further configured to: count a third number indicating data of an array is accessed in accordance with execution of the program; count a fourth number ...
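The two counters can be illustrated with a small model (cache geometry and mapping are assumptions): the first counter tracks how many distinct arrays map to each cache line, and accesses are counted as thrashing once that count exceeds the number of ways.

```python
N_WAYS = 2   # assumed cache associativity

def count_thrashing(accesses, line_of):
    arrays_per_line = {}   # cache line -> set of arrays stored there (first number)
    thrashing = 0          # second number
    for array in accesses:
        line = line_of[array]
        seen = arrays_per_line.setdefault(line, set())
        seen.add(array)
        if len(seen) > N_WAYS:   # more competing arrays than ways on this line
            thrashing += 1
    return thrashing

# three arrays competing for cache line 0 with only two ways
t = count_thrashing(["a", "b", "c", "a"], {"a": 0, "b": 0, "c": 0})
```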

10-01-2019 publication date

Buffer Management in a Data Storage Device

Number: US20190012114A1
Assignee: SEAGATE TECHNOLOGY LLC

Method and apparatus for managing data buffers in a data storage device. In some embodiments, a write manager circuit stores user data blocks in a write cache pending transfer to a non-volatile memory (NVM). The write manager circuit sets a write cache bit value in a forward map describing the NVM to a first value upon storage of the user data blocks in the write cache, and subsequently sets the write cache bit value to a second value upon transfer of the user data blocks to the NVM. A read manager circuit accesses the write cache bit value in response to a read command for the user data blocks. The read manager circuit searches the write cache for the user data blocks responsive to the first value, and retrieves the requested user data blocks from the NVM without searching the write cache responsive to the second value.
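The read path above hinges on a single forward-map bit per block: search the write cache only when the bit says the block may still be there, otherwise go straight to the NVM. A sketch under invented names:

```python
class Device:
    def __init__(self):
        self.forward_map = {}   # lba -> {"in_write_cache": bool, "nvm": data}
        self.write_cache = {}

    def write(self, lba, data):
        self.write_cache[lba] = data
        self.forward_map[lba] = {"in_write_cache": True, "nvm": None}  # first value

    def flush(self, lba):
        self.forward_map[lba]["nvm"] = self.write_cache.pop(lba)
        self.forward_map[lba]["in_write_cache"] = False                # second value

    def read(self, lba):
        entry = self.forward_map[lba]
        if entry["in_write_cache"]:        # bit set: search the write cache
            return self.write_cache[lba]
        return entry["nvm"]                # bit clear: skip the write-cache search

d = Device()
d.write(7, "blockA")
before = d.read(7)   # served from the write cache
d.flush(7)
after = d.read(7)    # served from NVM without searching the write cache
```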

03-02-2022 publication date

Dynamic translation lookaside buffer (tlb) invalidation using virtually tagged cache for load/store operations

Number: US20220035748A1
Assignee: International Business Machines Corp

Translation lookaside buffer (TLB) invalidation using virtual addresses is provided. A cache is searched for a virtual address matching the input virtual address. Based on a matching virtual address in the cache, the corresponding cache entry is invalidated. The load/store queue is searched for a set and a way corresponding to the set and the way of the invalidated cache entry. Based on an entry in the load/store queue matching the set and the way of the invalidated cache entry, the entry in the load/store queue is marked as pending. Indicating a completion of the TLB invalidate instruction is delayed until all pending entries in the load/store queues are complete.

17-01-2019 publication date

ADDRESS TRANSLATION CACHE PARTITIONING

Number: US20190018777A1
Assignee:

An apparatus has an address translation cache with entries for storing address translation data. Partition configuration storage circuitry stores multiple sets of programmable configuration data, each corresponding to a partition identifier identifying a corresponding software execution environment or master device and specifying a corresponding subset of entries of the cache. In response to a translation lookup request specifying a target address and a requesting partition identifier, control circuitry triggers a lookup operation to identify whether the target address hits or misses in the corresponding subset of entries specified by the set of partition configuration data for the requesting partition identifier.

1. An apparatus comprising: an address translation cache comprising a plurality of entries, each entry to store address translation data for a corresponding block of addresses; partition configuration storage circuitry to store a plurality of sets of programmable partition configuration data, each set of programmable partition configuration data corresponding to a partition identifier identifying a corresponding software execution environment or master device and specifying a corresponding subset of entries of the address translation cache; and control circuitry responsive to a translation lookup request specifying a target address and a requesting partition identifier identifying the software execution environment or master device associated with the translation lookup request, to perform a lookup operation to identify whether the target address hits or misses in the corresponding subset of entries specified by the set of programmable partition configuration data corresponding to the requesting partition identifier.
2. The apparatus according to claim 1, wherein the control circuitry is configured to allocate address translation data to one of the corresponding subset of entries when the target address misses in the corresponding subset of entries.
3. The ...
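A minimal model of the partitioned lookup (entry layout and partition identifiers invented): each partition identifier selects a programmable subset of cache entries, and a translation lookup hits only within that subset, so the same virtual address can resolve differently per partition.

```python
partition_config = {"vm0": {0, 1, 2, 3},      # partition id -> subset of entries
                    "vm1": {4, 5, 6, 7}}
cache = {1: (0x1000, 0x9000),                 # entry index -> (vaddr, paddr)
         5: (0x1000, 0x7000)}

def lookup(partition_id, target_va):
    for idx in partition_config[partition_id]:   # search only the own subset
        entry = cache.get(idx)
        if entry and entry[0] == target_va:
            return entry[1]                      # hit in this partition's subset
    return None                                  # miss (would trigger allocation)

pa0 = lookup("vm0", 0x1000)
pa1 = lookup("vm1", 0x1000)   # same address, different partition, different result
```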

10-02-2022 publication date

Signal development caching in a memory device

Number: US20220044723A1
Assignee: Micron Technology Inc

Methods, systems, and devices related to signal development caching in a memory device are described. In one example, a memory device in accordance with the described techniques may include a memory array, a sense amplifier array, and a signal development cache configured to store signals (e.g., cache signals, signal states) associated with logic states (e.g., memory states) that may be stored at the memory array (e.g., according to various read or write operations). In various examples, accessing the memory device may include accessing information from the signal development cache, or the memory array, or both, based on various mappings or operations of the memory device.

02-02-2017 publication date

ADDRESS CACHING IN SWITCHES

Number: US20170031835A1
Author: SEREBRIN Benjamin C.
Assignee:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for storing an address in a memory of a switch. One of the systems includes a switch that receives packets from and delivers packets to devices connected to a bus without any components on the bus between the switch and each of the devices, a memory integrated into the switch to store a mapping of virtual addresses to physical addresses, and a storage medium integrated into the switch storing instructions executable by the switch to cause the switch to perform operations including receiving a response to an address translation request for a device connected to the switch by the bus, the response including a mapping of a virtual address to a physical address, and storing, in the memory, the mapping of the virtual address to the physical address in response to receiving the response.

1. A system comprising: a switch that receives packets from and delivers packets to one or more devices connected to a bus without any components on the bus between the switch and each of the devices; a memory integrated into the switch to store a mapping of virtual addresses to physical addresses; and a non-transitory computer readable storage medium integrated into the switch storing instructions executable by the switch and upon such execution cause the switch to perform operations comprising: receiving, by the switch, a response to an address translation request for a device connected to the switch by the bus, the response including a mapping of a virtual address to a physical address; and storing, in the memory, the mapping of the virtual address to the physical address in response to receiving the response to the address translation request for the device.
2. The system of claim 1, comprising: an input/output memory management unit (IOMMU) integrated into the switch and including an IOMMU memory, wherein: the memory comprises the IOMMU memory; receiving, by the switch, the response to the ...

17-02-2022 publication date

Content-addressable memory for signal development caching in a memory device

Number: US20220050776A1
Assignee: Micron Technology Inc

Methods, systems, and devices related to content-addressable memory for signal development caching are described. In one example, a memory device in accordance with the described techniques may include a memory array, a sense amplifier array, and a signal development cache configured to store signals (e.g., cache signals, signal states) associated with logic states (e.g., memory states) that may be stored at the memory array (e.g., according to various read or write operations). The memory device may also include storage, such as a content-addressable memory, configured to store a mapping between addresses of the signal development cache and addresses of the memory array. In various examples, accessing the memory device may include determining and storing a mapping between addresses of the signal development cache and addresses of the memory array, or determining whether to access the signal development cache or the memory array based on such a mapping.

17-02-2022 publication date

DETERMINING PAGE SIZE VIA PAGE TABLE CACHE

Number: US20220050792A1
Assignee:

A page directory entry cache (PDEC) can be checked to potentially rule out one or more possible page sizes for a translation lookaside buffer (TLB) lookup. Information gained from the PDEC lookup can reduce the number of TLB checks required to conclusively determine if the TLB lookup is a hit or a miss.

1. A method, comprising: receiving a request, the request including a virtual address; identifying a set of possible page sizes; performing a Page Directory Entry Cache (PDEC) lookup based on the virtual address; updating, based on the PDEC lookup, the set of possible page sizes; and performing a Translation Lookaside Buffer (TLB) lookup based on the set of possible page sizes.
2. The method of claim 1, further comprising: performing a page walk; identifying page size indicator information based on the page walk; and updating the PDEC to include a hint bit based on the page size indicator information.
3. The method of claim 2, wherein: the performing the PDEC lookup includes identifying the hint bit; and the updating the set of possible page sizes includes removing one or more of the possible page sizes from the set based on the hint bit.
4. The method of claim 3, wherein: the performing the TLB lookup includes performing a first TLB check with a first page size; and the method further comprises determining, based on the hint bit, a correct page size, wherein the first TLB check is performed with the first page size of the correct page size.
5. The method of claim 1, wherein: the performing the PDEC lookup includes identifying a level of a returned page directory entry (PDE); and the updating the set of possible page sizes includes removing one or more of the possible page sizes from the set based on the level.
6. The method of claim 1, wherein the PDEC lookup is performed contemporaneously with the TLB lookup.
7. The method of claim 1, wherein the PDEC lookup is performed before the TLB lookup.
8. A system, comprising: a memory including instructions; and ...
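The pruning idea reduces to: without a hint, every possible page size may need its own TLB check; a PDEC hit that identifies the page size eliminates the rest. A toy version with an invented hint encoding (the hint is stored per high-order address region and names a single page size):

```python
POSSIBLE_SIZES = [4096, 2 * 1024 * 1024, 1024 * 1024 * 1024]  # 4 KiB, 2 MiB, 1 GiB

def tlb_checks_needed(vaddr, pdec):
    sizes = list(POSSIBLE_SIZES)        # initial set of possible page sizes
    hint = pdec.get(vaddr >> 30)        # PDEC lookup on high-order address bits
    if hint is not None:
        sizes = [s for s in sizes if s == hint]   # rule out the other sizes
    return len(sizes)                   # one TLB check per remaining size

pdec = {0: 4096}                        # hint: this region uses 4 KiB pages
pruned = tlb_checks_needed(0x1000, pdec)        # 1 check instead of 3
unpruned = tlb_checks_needed(0x40000000, pdec)  # no hint: all sizes tried
```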

31-01-2019 publication date

Precise invalidation of virtually tagged caches

Number: US20190034349A1
Assignee: Qualcomm Inc

A translation lookaside buffer (TLB) index valid bit is set in a first line of a virtually indexed, virtually tagged (VIVT) cache. The first line of the VIVT cache is associated with a first TLB entry which stores a virtual address to physical address translation for the first cache line. The TLB index valid bit of the first line is cleared upon determining that the translation is no longer stored in the first TLB entry. An indication of a received invalidation instruction is stored. When a context synchronization instruction is received, the first line of the VIVT cache is cleared based on the TLB index valid bit being cleared and the stored indication of the invalidate instruction.
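A toy model of this deferred-invalidation scheme (field and function names invented): a line's TLB index valid bit is cleared when its backing TLB entry is replaced, the invalidate instruction is merely recorded, and the actual flush of imprecisely-tracked lines happens at the next context synchronization.

```python
cache = [{"valid": True, "tlb_index_valid": True},
         {"valid": True, "tlb_index_valid": True}]
invalidate_seen = False

def tlb_entry_evicted(line):
    cache[line]["tlb_index_valid"] = False   # translation no longer in the TLB

def tlb_invalidate():
    global invalidate_seen
    invalidate_seen = True                   # store an indication, defer the work

def context_synchronize():
    global invalidate_seen
    if invalidate_seen:
        for line in cache:
            if not line["tlb_index_valid"]:  # only imprecisely-tracked lines
                line["valid"] = False
        invalidate_seen = False

tlb_entry_evicted(0)
tlb_invalidate()
context_synchronize()   # line 0 is flushed, line 1 survives
```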

31-01-2019 publication date

Auxiliary processor resources

Number: US20190034350A1
Assignee: Intel Corp

Apparatuses, systems and methods associated microprocessor segment registers are disclosed herein. More particularly, the present disclosure relates to providing an auxiliary segment register(s) and/or auxiliary segment descriptor table(s), and various ways for their use, for example, providing new instructions for their access, or remapping existing processor resources. A machine might provide isolated execution regions and/or protected memory by associating or exclusively reserving some or all of the auxiliary segment register(s)/table(s) with a specific task, program, instruction sequence, etc. In some embodiments, such as in Internet of Things (IoT) or wearable devices, auxiliary resources may be employed to isolate mutually-distrustful code regions to facilitate engaging unknown devices. Other embodiments are also described and/or claimed.

30-01-2020 publication date

METHODS AND APPARATUS OF MAPPING OR REPLACEMENT FOR DATA ARRAY LOCATIONS OF A CACHE MEMORY

Number: US20200034303A1
Assignee:

Aspects of the present disclosure relate to an apparatus comprising a data array having locality-dependent latency characteristics such that an access to an open unit of the data array has a lower latency than an access to a closed unit of the data array. Set associative cache indexing circuitry determines, in response to a request for data associated with a target address, a cache set index. Mapping circuitry identifies, in response to the index, a set of data array locations corresponding to the index, according to a mapping in which a given unit of the data array comprises locations corresponding to a plurality of consecutive indices, and at least two locations of the set of locations corresponding to the same index are in different units of the data array. Cache access circuitry accesses said data from one of the set of data array locations. 1. An apparatus comprising:a data array having locality-dependent latency characteristics such that an access to an open unit of the data array has a lower latency than an access to a closed unit of the data array;set associative cache indexing circuitry to determine, in response to a request for data associated with a target address, a cache set index corresponding to the target address;mapping circuitry to identify, in response to the cache set index, a set of data array locations of the data array corresponding to the cache set index, according to a mapping in which a given unit of the data array comprises data array locations corresponding to a plurality of consecutive set indices, and at least two memory locations of the set of data array locations corresponding to the same cache set index are in different units of the data array; andcache access circuitry to access said data associated with the target address from one of the set of data array locations identified by the mapping circuitry.2. 
An apparatus according to claim 1 , wherein the mapping is such that data array locations corresponding to different ways of the ...
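The abstract above describes a mapping in which one unit of the data array (e.g. a DRAM row) holds locations for a run of consecutive set indices, while the ways of any single set are spread across different units. A minimal sketch of one such mapping — all sizes and the `locations_for_set` helper are invented for illustration, not taken from the claims:

```python
# Hypothetical layout: each unit packs SETS_PER_UNIT consecutive set indices
# for one way, and the NUM_WAYS ways of a set land in NUM_WAYS distinct units.
SETS_PER_UNIT = 4
NUM_WAYS = 4

def locations_for_set(set_index):
    """Return (unit, offset) for every way of the given set index."""
    group = set_index // SETS_PER_UNIT   # which run of consecutive sets
    slot = set_index % SETS_PER_UNIT     # position inside the run
    return [(group * NUM_WAYS + way, slot) for way in range(NUM_WAYS)]

# Ways of one set sit in different units...
units = {u for u, _ in locations_for_set(5)}
assert len(units) == NUM_WAYS

# ...while a neighbouring set index reuses the same (possibly open) units.
assert {u for u, _ in locations_for_set(4)} == units
```

Under this toy layout, sequential index streams keep hitting units that an earlier access already opened, which is exactly the locality the abstract's open/closed-unit latency distinction rewards.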

More details
04-02-2021 publication date

METHOD, APPARATUS, DEVICE AND COMPUTER-READABLE STORAGE MEDIUM FOR STORAGE MANAGEMENT

Number: US20210034517A1
Assignee:

Example embodiments of the present disclosure provide a method, an apparatus, a device and a computer-readable storage medium for storage management. The method for storage management includes: obtaining an available channel mode of a plurality of channels in a memory of a data processing system, the available channel mode indicating availabilities of the plurality of channels, and each of the plurality of channels being associated with a set of addresses in the memory; obtaining a channel data-granularity of the plurality of channels, the channel data-granularity indicating a size of a data block that can be carried on each channel; obtaining a target address of data to be transmitted in the memory; and determining a translated address corresponding to the target address based on the available channel mode and the channel data-granularity. 1. A method for storage management , comprising:obtaining an available channel mode of a plurality of channels in a memory of a data processing system, the available channel mode indicating availabilities of the plurality of channels, and each of the plurality of channels being associated with a set of addresses in the memory;obtaining a channel data-granularity of the plurality of channels, the channel data-granularity indicating a size of a data block that can be carried on each channel;obtaining a target address of data to be transmitted in the memory; anddetermining a translated address corresponding to the target address based on the available channel mode and the channel data-granularity.2. The method according to claim 1 , wherein obtaining the available channel mode comprises:obtaining information related to unavailable channels in the plurality of channels; anddetermining the available channel mode based on the information related to the unavailable channels.3. The method according to claim 1 , wherein determining the translated address comprises:dividing the target address into a high-order portion and a low-order ...
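The translation step described above can be sketched as striping granularity-sized blocks across whichever channels are available; the `translate` helper, its return shape, and the striping layout are assumptions for illustration, not the claimed algorithm:

```python
# Toy channel-aware translation: the high-order portion of the target address
# selects a granularity-sized block, the low-order portion is kept as-is, and
# blocks are striped over only the channels marked available.
def translate(target_addr, available_channels, granularity):
    """available_channels: list of usable channel ids, e.g. [0, 1, 3]."""
    num_avail = len(available_channels)
    block = target_addr // granularity   # high-order portion: block index
    offset = target_addr % granularity   # low-order portion: unchanged
    channel = available_channels[block % num_avail]
    row = block // num_avail             # block index within that channel
    return channel, row * granularity + offset

# With channel 2 of 4 unavailable, blocks stripe over channels 0, 1 and 3.
assert translate(0, [0, 1, 3], 64) == (0, 0)
assert translate(64, [0, 1, 3], 64) == (1, 0)
assert translate(128, [0, 1, 3], 64) == (3, 0)
assert translate(192, [0, 1, 3], 64) == (0, 64)
```

The point of deriving the mapping from the available-channel mode is that a failed channel changes only the stripe pattern, not the software-visible target addresses.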

More details
04-02-2021 publication date

REDUCING IMPACT OF CONTEXT SWITCHES THROUGH DYNAMIC MEMORY-MAPPING OVERALLOCATION

Number: US20210034545A1
Assignee:

A method including: receiving, via a processor, established upper bounds for dynamic structures in a multi-tenant system; creating, via the processor, arrays comprising related memory-management unit (MMU) mappings to be placed together; and placing the dynamic structures within the arrays, the placing comprising for each array: skipping an element of the array based on determining that placing a dynamic structure in that element would cause the array to become overcommitted and result in a layout where accessing all elements would impose a translation look aside buffer (TLB) replacement action; and scanning for an array-start entry by placing the start of a first element at an address from which an entire array can be placed without TLB contention, and accessing, via the processors, all non-skipped elements without incurring TLB replacements. 1. A computer-implemented method comprising:receiving, via a processor, established upper bounds for dynamic structures in a multi-tenant system;creating, via the processor, arrays comprising related memory-management unit (MMU) mappings to be placed together; and skipping an element of the array based on determining that placing a dynamic structure in that element would cause the array to become overcommitted and result in a layout where accessing all elements would impose a translation look aside buffer (TLB) replacement action; and', 'scanning for an array-start entry by placing the start of a first element at an address from which an entire array can be placed without TLB contention, and', 'accessing, via the processors, all non-skipped elements without incurring TLB replacements., 'placing the dynamic structures within the arrays, the placing comprising for each array2. The method according to claim 1 , wherein the related MMU mappings to be placed together all have the same size.3. The method according to claim 1 , wherein the established upper bounds for dynamic structures include an amount of memory which must be ...
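The skip-and-scan placement above can be modelled with a toy direct-mapped TLB: an element slot is skipped whenever touching its page would evict a mapping an earlier element already needs. Everything here (page size, TLB geometry, the `place_array` helper) is a loose illustration, not the patented layout:

```python
PAGE = 4096      # assumed page size
TLB_SETS = 8     # toy direct-mapped TLB

def tlb_set(addr):
    return (addr // PAGE) % TLB_SETS

def place_array(start, num_elems, elem_size, max_slots=64):
    """Place elements at stride elem_size from start, skipping any slot whose
    page maps to a TLB set an earlier element already claimed."""
    used, placed = set(), []
    addr, slots = start, 0
    while len(placed) < num_elems and slots < max_slots:
        s = tlb_set(addr)
        if s not in used:           # safe: accessing it needs no TLB replacement
            used.add(s)
            placed.append(addr)
        # else: skip this slot -- placing here would overcommit the mapping
        addr += elem_size
        slots += 1
    return placed

# A stride of 2 pages only ever reaches TLB sets {0, 2, 4, 6}: after four
# placements every further slot is skipped.
placed = place_array(0, 5, 2 * PAGE, max_slots=16)
assert placed == [0, 8192, 16384, 24576]
```

Scanning for an array-start address then amounts to retrying `place_array` from successive starts until no element has to be skipped, mirroring the "array-start entry" step in the claim.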

More details
30-01-2020 publication date

METHOD AND APPARATUS FOR STACKING CORE AND UNCORE DIES HAVING LANDING SLOTS

Number: US20200035659A1
Author: RUSU Stefan
Assignee: Intel Corporation

A method is described for stacking a plurality of cores. For example, one embodiment comprises: mounting an uncore die on a package, the uncore die comprising a plurality of exposed landing slots, each landing slot including an inter-die interface usable to connect vertically to a cores die, the uncore die including a plurality of uncore components usable by cores within the cores die; and vertically coupling a first cores die comprising a first plurality of cores on top of the uncore die, the cores spaced on the first cores die to correspond to all or a first subset of the landing slots on the uncore die, each of the cores having an inter-die interface positioned to be communicatively coupled to a corresponding inter-die interface within a landing slot on the uncore die when the first cores die is vertically coupled on top of the uncore die. 1. A method comprising:providing a secure website to a user for configuring a custom server processor, the secure website including a graphical user interface (GUI);providing a first graphical element in the GUI, the first graphical element including a plurality of base die options from which a user is to select a base die and an associated package, each of the base die options associated with a respective number of landing slots and a respective plurality of supported I/O interface options;providing a second graphical element in the GUI, the second graphical element including a plurality of building block options selectable by the user to populate the landing slots in the base die, each of the building block options associated with a respective building block and includes one or more fields for the user to specify a number of horizontal landing slots and a number of vertical landing slots to be occupied by the building block on the base die;providing a third graphical element in the GUI, the third graphical element including a visual representation of a user configuration, the visual representation comprising one or more ...

More details
24-02-2022 publication date

ADDRESSING SCHEME FOR LOCAL MEMORY ORGANIZATION

Number: US20220058126A1
Assignee:

A memory tile, in a local memory, may be considered to be a unit of memory structure that carries multiple memory elements, wherein each memory element is a one-dimensional memory structure. Multiple memory tiles make up a memory segment. By structuring the memory tiles, and a mapping matrix to the memory tiles, within a memory segment, non-blocking, concurrent write and read accesses to the local memory for multiple requestors may be achieved with relatively high throughput. The accesses may be either row-major or column-major for a two-dimensional memory array. 1. A method of memory access , the method comprising: [ a memory bank among a plurality of memory banks; and', 'a memory sub-bank among a plurality of memory sub-banks;, 'a plurality of memory tiles, each memory tile among the plurality of memory tiles designated as belonging to, 'a plurality of memory entries, each memory entry among the plurality of memory entries extending across the plurality of memory tiles;', 'each memory tile among the plurality of memory tiles having plurality of memory lines that are associated with a respective memory entry of the plurality of memory entries; and', 'each memory line among the plurality of memory lines having a plurality of memory elements, wherein each memory element is a one-dimensional memory structure;, 'establishing an addressing scheme for a memory segment, the addressing scheme definingselecting, using the addressing scheme, a memory element among the plurality of memory elements in a first memory line among the plurality of memory lines, in a first entry of the plurality of memory entries, of a first memory tile in a first memory bank and a first memory sub-bank, thereby establishing a first selected memory element;selecting, using the addressing scheme, a memory element among the plurality of memory elements in a second memory line among the plurality of memory lines, in the first entry, of a second memory tile in a second memory bank, thereby 
establishing ...
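The addressing scheme above decomposes a flat element index into bank / sub-bank / tile / entry / line / element coordinates. A mixed-radix sketch with invented field widths, ordered so that consecutive tiles land in different banks (mirroring the claim's selection of a first tile in a first bank and a second tile in a second bank):

```python
# Hypothetical field widths for the toy scheme.
BANKS, SUB_BANKS, TILES_PER = 2, 2, 4   # tiles per (bank, sub-bank) pair
ENTRIES, LINES, ELEMS = 8, 4, 16        # entries x lines x 1-D elements

def decode(index):
    """Split a flat element index into (bank, sub_bank, tile, entry, line, elem)."""
    index, elem = divmod(index, ELEMS)
    index, line = divmod(index, LINES)
    index, entry = divmod(index, ENTRIES)
    index, bank = divmod(index, BANKS)       # bank varies before tile, so
    index, sub_bank = divmod(index, SUB_BANKS)  # successive tiles alternate banks
    tile = index % TILES_PER
    return bank, sub_bank, tile, entry, line, elem

assert decode(0) == (0, 0, 0, 0, 0, 0)
# The next tile-sized chunk (ENTRIES * LINES * ELEMS = 512 elements) sits in
# the other bank, allowing the two selections to proceed concurrently.
assert decode(512) == (1, 0, 0, 0, 0, 0)
```

Placing adjacent tiles in different banks is what lets row-major and column-major requestors access the segment concurrently without blocking each other.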

More details
07-02-2019 publication date

System, Apparatus And Method For Data Driven Low Power State Control Based On Performance Monitoring Information

Number: US20190041950A1
Assignee:

In one embodiment, a processor includes one or more cores including a cache memory hierarchy; a performance monitor coupled to the one or more cores, the performance monitor to monitor performance of the one or more cores, the performance monitor to calculate pipeline cost metadata based at least in part on count information associated with the cache memory hierarchy; and a power controller coupled to the performance monitor, the power controller to receive the pipeline cost metadata and determine a low power state for the one or more cores to enter based at least in part on the pipeline cost metadata. Other embodiments are described and claimed. 1. A processor comprising:at least one core to execute instructions, the at least one core including a cache memory hierarchy including at least one translation lookaside buffer (TLB) and at least one core-included cache memory;a performance monitor coupled to the at least one core, the performance monitor to monitor performance of the at least one core, the performance monitor including a first counter to count misses in the at least one TLB and a second counter to count misses in the at least one core-included cache memory, the performance monitor to calculate pipeline cost metadata based at least in part on the first counter and the second counter; anda power controller coupled to the performance monitor, the power controller to receive the pipeline cost metadata and determine a low power state for the at least one core to enter based at least in part on the pipeline cost metadata.2. The processor of claim 1 , wherein the power controller is to receive a software request for the at least one core to enter into a second low power state and cause the at least one core to enter into a different low power state claim 1 , when the pipeline cost metadata indicates that a second pipeline cost subsequent to the at least one core being in the second low power state exceeds by at least a first threshold a first pipeline cost ...
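The pipeline-cost idea above can be sketched numerically: weight the TLB-miss and cache-miss counters into a refill-cost estimate, and let the power controller demote a deep low-power-state request when that cost is too high. The weights, threshold, and state numbering below are invented for illustration:

```python
# Assumed per-miss refill costs, in cycles.
TLB_MISS_COST = 30
CACHE_MISS_COST = 100

def pipeline_cost(tlb_misses, cache_misses):
    """Combine the two performance-monitor counters into cost metadata."""
    return tlb_misses * TLB_MISS_COST + cache_misses * CACHE_MISS_COST

def select_low_power_state(requested_state, tlb_misses, cache_misses,
                           budget=50_000):
    """Demote a deep C-state request to a shallow one when the measured
    post-wake refill cost exceeds the budget."""
    if requested_state >= 6 and pipeline_cost(tlb_misses, cache_misses) > budget:
        return 1    # shallow state: TLBs and caches stay warm
    return requested_state

assert select_low_power_state(6, tlb_misses=100, cache_misses=1000) == 1
assert select_low_power_state(6, tlb_misses=10, cache_misses=100) == 6
```

This matches the abstract's data-driven framing: the chosen state follows observed miss behaviour rather than blindly honouring the software request.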

More details
07-02-2019 publication date

Method and apparatus for multi-level memory early page demotion

Number: US20190042145A1
Assignee: Intel Corp

An apparatus is described that includes a memory controller to couple to a multi-level memory characterized by a faster higher level and a slower lower level. The memory controller having early demotion logic circuitry to demote a page from the higher level to the lower level without system software having to instruct the memory controller to demote the page and before the system software promotes another page from the lower level to the higher level.
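A toy model of the early-demotion idea: rather than waiting for system software to promote a page and only then make room, the controller demotes a cold page from the fast level as soon as space runs low, so a later promotion finds room immediately. Capacities and the LRU policy are invented for illustration:

```python
class TwoLevelMemory:
    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = []       # pages in the faster level, coldest first
        self.slow = set()    # pages in the slower level

    def touch(self, page):
        if page in self.fast:            # move to the hot end
            self.fast.remove(page)
            self.fast.append(page)

    def early_demote_if_needed(self):
        # Controller-initiated: no instruction from system software required.
        while len(self.fast) >= self.fast_capacity:
            self.slow.add(self.fast.pop(0))

    def promote(self, page):
        self.early_demote_if_needed()    # room was made ahead of time
        self.slow.discard(page)
        self.fast.append(page)

mem = TwoLevelMemory(fast_capacity=2)
mem.promote("A"); mem.promote("B")
mem.touch("A")          # "B" is now the coldest fast page
mem.promote("C")        # the controller demoted "B" on its own
assert "B" in mem.slow and "C" in mem.fast and "A" in mem.fast
```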

More details
07-02-2019 publication date

STORAGE MODEL FOR A COMPUTER SYSTEM HAVING PERSISTENT SYSTEM MEMORY

Number: US20190042415A1
Assignee:

A processor is described. The processor includes register space to accept input parameters of a software command to move a data item out of computer system storage and into persistent system memory. The input parameters include an identifier of a software process that desires access to the data item in the persistent system memory and a virtual address of the data item referred to by the software process. 1. A processor , comprising:register space to accept input parameters of a software command to move a data item out of computer system storage and into persistent system memory, the input parameters comprising an identifier of a software process that desires access to the data item in the persistent system memory and a virtual address of the data item referred to by the software process.2. The processor of in which the processor further comprises register space to return claim 1 , in response to the command claim 1 , a different virtual address to use when accessing the data item in the persistent system memory.3. The processor of in which memory management unit (MMU) logic circuitry of the processor is to determine the new virtual address in response to the request.4. The processor of in which the MMU logic circuitry is to enter a new entry in a translation look-aside buffer (TLB) of the processor for translating the new virtual address to an address of the persistent system memory useable to access the data item in the persistent system memory.5. The processor of in which the processor is to move the data item from a mass storage cache region of system memory to the persistent system memory claim 1 , if the data item resides in the mass storage cache region.6. The processor of in which claim 1 , if the data item resides in a mass storage cache region of the system memory claim 1 , re-characterize the address where the data item resides as being associated with persistent system memory instead of the mass storage cache.7. 
The processor of in which a mass storage ...
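The command described above — register inputs naming a process and a virtual address, data moved into persistent memory, a new virtual address and TLB entry produced — can be modelled end to end. All structures and the `move_to_pmem` interface are invented for illustration:

```python
class PersistentMemoryModel:
    def __init__(self):
        self.storage = {}    # storage offset -> data
        self.pmem = {}       # persistent-memory physical address -> data
        self.tlb = {}        # (pid, virtual address) -> physical address
        self.next_pa = 0x1000
        self.next_va = 0x7000

    def move_to_pmem(self, pid, va, storage_offset):
        """Execute the command: inputs arrive via register space (pid, va),
        the data item leaves storage, and a new virtual address is returned."""
        data = self.storage.pop(storage_offset)
        pa, self.next_pa = self.next_pa, self.next_pa + 0x1000
        self.pmem[pa] = data
        new_va, self.next_va = self.next_va, self.next_va + 0x1000
        self.tlb[(pid, new_va)] = pa   # MMU enters the new TLB translation
        return new_va                  # returned via register space

    def load(self, pid, va):
        return self.pmem[self.tlb[(pid, va)]]

m = PersistentMemoryModel()
m.storage[0] = "record"
va = m.move_to_pmem(pid=42, va=0x4000, storage_offset=0)
assert m.load(42, va) == "record"
```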

More details
07-02-2019 publication date

PAUSE COMMUNICATION FROM I/O DEVICES SUPPORTING PAGE FAULTS

Number: US20190042461A1
Assignee:

A processing device includes a core to execute instructions, and memory management circuitry coupled to memory, the core, and an I/O device that supports page faults. The memory management circuitry includes an express invalidations circuitry, and a page translation permission circuitry. The memory management circuitry is to, while the core is executing the instructions, receive a command to pause communication between the I/O device and the memory. In response to receiving the command to pause the communication, modify permissions of page translations by the page translation permission circuitry and transmit an invalidation request, by the express invalidations circuitry to the I/O device, to cause cached page translations in the I/O device to be invalidated. 1. A processing device comprising:a core to execute instructions; and an express invalidations circuitry; and', 'a page translation permission circuitry, wherein the memory management circuitry is to:', receive a command to pause communication between the I/O device and the memory; and', modify permissions of page translation responses by the page translation permission circuitry; and', 'transmit an invalidation request, by the express invalidations circuitry to the I/O device, to cause cached page translations in the I/O device to be invalidated., 'in response to receiving the command to pause the communication], 'while the core is executing the instructions], 'memory management circuitry coupled to memory, the core, and an I/O device that supports page faults, the memory management circuitry comprising2. The processing device of claim 1 , wherein the memory management circuitry is further to:transmit the page translations comprising the modified permissions to the I/O device.3. The processing device of claim 1 , wherein the memory management circuitry is further to:forgo transmitting a response to a page fault request from the I/O device.4. The processing device of claim 1 , wherein the memory management ...
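The pause sequence can be sketched as three coordinated steps: strip permissions from the translations, invalidate the device's cached (ATS-style) translations, and withhold page-fault responses while paused. The `IommuModel` class and its interfaces are invented for illustration:

```python
class IommuModel:
    def __init__(self):
        self.translations = {}   # virtual page -> (physical page, writable)
        self.device_tlb = {}     # translations cached inside the I/O device
        self.paused = False

    def map(self, v, p):
        self.translations[v] = (p, True)
        self.device_tlb[v] = (p, True)

    def pause_device(self):
        self.paused = True
        for v, (p, _) in list(self.translations.items()):
            self.translations[v] = (p, False)  # modify permissions
        self.device_tlb.clear()                # express invalidation request

    def device_write(self, v):
        if v in self.device_tlb and self.device_tlb[v][1]:
            return "write ok"
        if self.paused:
            # The device page-faults; no response is sent while paused.
            return "page fault (no response while paused)"
        return "refetch translation"

iommu = IommuModel()
iommu.map(0x10, 0x90)
assert iommu.device_write(0x10) == "write ok"
iommu.pause_device()
assert iommu.device_write(0x10) == "page fault (no response while paused)"
```

Because the device supports page faults, stalling its fault responses is enough to quiesce it without tearing down the mappings themselves.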

More details
18-02-2021 publication date

DEVICES, SYSTEMS, AND METHODS FOR DYNAMICALLY REMAPPING MEMORY ADDRESSES

Number: US20210049094A1
Assignee: SMART IOPS, INC.

In certain aspects, dynamic remapping of memory addresses is provided and includes initiating a remapping of a logical block from a “mapped block” to a “remapped block.” Logical address locations for the logical block are mapped to physical address locations in the mapped block. The mapped and remapped blocks include non-volatile memory. A read command is received and determined to be for reading from a logical address location of the logical block, and the logical address location is determined to be mapped to a physical address location. Data is read from the physical address location of the mapped block. A write command is received and determined to be for writing data to the logical address location. Data is written to the physical address location of the remapped block. The read command is received after the initiation of the remapping and before the writing of the data to the remapped block. 1. A data storage system , comprising:non-volatile memory; and initiate a remapping of a first logical block from a mapped block to a remapped block, wherein a plurality of logical address locations for the first logical block is mapped to a plurality of physical address locations in the mapped block, and wherein the mapped block and the remapped block comprise the non-volatile memory;', 'receive a first read command;', 'determine that the first read command is for reading from a first logical address location of the first logical block;', 'determine that the first logical address location is mapped to a first physical address location of the plurality of physical address locations;', 'read first data from the first physical address location of the mapped block;', 'receive a first write command;', 'determine that the first write command is for writing second data to the first logical address location of the first logical block; and', 'write the second data to the first physical address location of the remapped block;', after the initiating of the remapping of the first ...
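The read/write routing during the remapping window described above can be captured in a few lines: reads of locations not yet rewritten are served from the old ("mapped") block, while new writes land directly in the new ("remapped") block. The `RemappingBlock` structure is an illustrative assumption, not the claimed implementation:

```python
class RemappingBlock:
    def __init__(self, mapped, remapped):
        self.mapped = mapped        # old physical block (list of values)
        self.remapped = remapped    # new physical block
        self.rewritten = set()      # offsets written since the remap began

    def read(self, offset):
        if offset in self.rewritten:
            return self.remapped[offset]
        return self.mapped[offset]  # not yet moved: read the old copy

    def write(self, offset, value):
        self.remapped[offset] = value   # writes go straight to the new block
        self.rewritten.add(offset)

blk = RemappingBlock(mapped=["a", "b"], remapped=[None, None])
assert blk.read(0) == "a"    # read after remap starts, before any write
blk.write(0, "A")
assert blk.read(0) == "A"    # after the write: served from the new block
assert blk.read(1) == "b"    # untouched offset still comes from the old block
```

This is exactly the window the abstract describes: a read command received after the remapping is initiated but before the corresponding write still returns valid data from the mapped block.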

More details
18-02-2021 publication date

SYSTEMS AND METHODS FOR HYPERVISOR-BASED PROTECTION OF CODE

Number: US20210049263A1
Assignee:

Systems and methods for protecting vulnerable code by obtaining an input file comprising code representing executable files; generating a protected executable file by replacing an unencrypted version of each vulnerable function of the input file with a VM-exit generating instruction; and generating a database file including an encrypted version of each vulnerable function deleted from the input file. The protected executable file and the database file are stored on a target device. A UEFI application initializes a hypervisor which accesses the decryption key using a TPM device and loads an operating system. When the hypervisor detects an attempt to execute an encrypted version of a vulnerable function, it decrypts the encrypted version of the vulnerable function. 1. (canceled)3. The method for protecting computer code of wherein the step of said hypervisor accessing said decryption key comprises said TPM unsealing said decryption key.4. The method for protecting computer code of wherein the step of said TPM unsealing said decryption key further comprises said TPM transitioning to an inaccessible state such that the decryption key is indecipherable.6. The method for protecting computer code of further comprising said secondary level address translation table mapping the hypervisor address space and the operating system address space to different groups of cache sets.7. The method for protecting computer code of further comprising said input-output memory management unit mapping the hypervisor address space and the operating system address space to different groups of cache sets.8. The method for protecting computer code of further comprising said secondary level address translation table assigning access rights to said real physical addresses.9. The method for protecting computer code of further comprising said input-output memory management unit assigning access rights to said real physical addresses.10. The method for protecting computer code of wherein said secondary ...

More details
16-02-2017 publication date

ASSOCIATING CACHE MEMORY WITH A WORK PROCESS

Number: US20170046275A1
Author: Moretti Michael J.
Assignee:

Systems, methods, and software described herein provide accelerated input and output of data in a work process. In one example, a method of operating a support process within a computing system for providing accelerated input and output for a work process includes monitoring for a file mapping attempt initiated by the work process. The method further includes, in response to the file mapping attempt, identifying a first region in memory already allocated to a cache service, and associating the first region in memory with the work process. 1. A method of operating a support process on a computing system for providing accelerated input and output for a work process , the method comprising:monitoring for a file mapping attempt initiated by the work process;in response to the file mapping attempt, identifying a first region in memory already allocated to a cache service; andassociating the first region in memory with the work process.2. The method of wherein the support process comprises a kernel process for the computing system.3. The method of wherein the work process comprises a Java virtual machine.4. The method of wherein the file mapping attempt initiated by the work process comprises a new input/output channel request initiated by the Java virtual machine.5. The method of wherein the cache service comprises a distributed cache service for providing data to one or more work processes.6. The method of further comprising:monitoring for a second file mapping attempt initiated by a second work process;in response to the second file mapping attempt, identifying a second region in memory already allocated to the cache service; andassociating the second region in memory with the second work process.7. The method of wherein the work process comprises a work process initiated by a Hadoop framework.8. The method of wherein the work process comprises a work process initiated by a map reduce framework.9. 
The method of wherein the computing system comprises a virtual computing ...
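A minimal model of the association step above: a support process watches for a work process's file-mapping attempt and, instead of allocating fresh pages, hands back a memory region the cache service has already populated. All class and path names are invented for illustration:

```python
class CacheService:
    """Stands in for the (possibly distributed) cache service."""
    def __init__(self):
        self.regions = {}               # file path -> already-allocated region

    def region_for(self, path):
        return self.regions.get(path)

class SupportProcess:
    """Stands in for the kernel-side support process."""
    def __init__(self, cache):
        self.cache = cache
        self.associations = {}          # (work process, path) -> region

    def on_mmap_attempt(self, work_proc, path):
        region = self.cache.region_for(path)
        if region is not None:          # cached: associate, no disk I/O needed
            self.associations[(work_proc, path)] = region
        return region

cache = CacheService()
cache.regions["/data/part-0"] = "region-7"
support = SupportProcess(cache)
assert support.on_mmap_attempt("jvm-1", "/data/part-0") == "region-7"
```

Associating the pre-filled region with the work process (e.g. a JVM opening a new I/O channel) is what delivers the "accelerated input and output": the data is mapped in rather than re-read.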

More details
15-02-2018 publication date

SYSTEMS AND METHODS FOR FASTER READ AFTER WRITE FORWARDING USING A VIRTUAL ADDRESS

Number: US20180046575A1
Assignee:

Methods for read after write forwarding using a virtual address are disclosed. A method includes determining when a virtual address has been remapped from corresponding to a first physical address to a second physical address and determining if all stores occupying a store queue before the remapping have been retired from the store queue. Loads that are younger than the stores that occupied the store queue before the remapping are prevented from being dispatched and executed until the stores that occupied the store queue before the remapping have left the store queue and become globally visible. 1. A method for read-after-write forwarding using a virtual address , said method comprising:determining if a load and a store are to a same page; andif said load and said store are to said same page, completing said load with data acquired via virtual-address-based forwarding while said load and said store are delayed for acquisition of their respective physical addresses and while the physical address of said load is cross-checked against the physical address of said store.2. The method of claim 1 , further comprising retiring said load if said cross checking indicates that said load and said store have different physical addresses.3. The method of claim 1 , further comprising flushing the instruction pipeline if said cross checking indicates that said load and said store have the same physical address claim 1 , wherein said load claim 1 , and every instruction subsequent to it claim 1 , is flushed.4. The method of claim 1 , wherein said same page is 4k in size.5. The method of claim 1 , further comprising signaling a hazard during said completing.6. A cache system claim 1 , comprising:data storage components; and a determining component for determining if a load and a store are to a same page;', 'a hazard signaling/load completing component, wherein responsive to a determination that said load and said store are to said same page, said load completing component is ...
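The same-page test that gates the speculative forwarding can be sketched directly; the physical-address cross-check then runs off the critical path. The helpers below are illustrative only, and deliberately model just the forwarding decision, not the retire/flush resolution detailed in the claims:

```python
PAGE = 4096   # the claims mention a 4k page

def same_page(va1, va2):
    return va1 // PAGE == va2 // PAGE

def try_forward(load_va, store_va, store_data):
    """Complete the load early with the store's data when both fall on the
    same virtual page and the same virtual address; the physical-address
    cross-check happens later, while the load is already complete."""
    if same_page(load_va, store_va) and load_va == store_va:
        return store_data   # virtual-address-based forwarding
    return None             # otherwise wait for the physical addresses

assert try_forward(0x1010, 0x1010, 42) == 42    # same page, same VA: forward
assert try_forward(0x1010, 0x2010, 42) is None  # different page: no forward
```

The benefit is latency: the load does not wait for either translation to finish before consuming the store's data.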

More details
13-02-2020 publication date

DETECTING BUS LOCKING CONDITIONS AND AVOIDING BUS LOCKS

Number: US20200050471A1
Assignee:

A processor may include a register to store a bus-lock-disable bit and an execution unit to execute instructions. The execution unit may receive an instruction that includes a memory access request. The execution may further determine that the memory access request requires acquiring a bus lock, and, responsive to detecting that the bus-lock-disable bit indicates that bus locks are disabled, signal a fault to an operating system. 1. (canceled)2. A processor comprising:a register to store a bus-lock-disable bit, wherein the register comprises an architectural model-specific register (MSR) visible from outside of the processor; and receive an instruction that includes a memory access request;', 'determine that the memory access request requires acquiring a bus lock; and', 'responsive to detecting that the bus-lock-disable bit indicates that bus locks are disabled, signal a fault to an operating system., 'an execution unit to execute instructions, wherein the execution unit is to3. The processor of claim 2 , wherein the fault is a general protection fault.4. The processor of claim 2 , wherein the execution unit is further to terminate execution of the instruction responsive to detecting that the bus-lock-disable bit is enabled.5. The processor of claim 2 , wherein the execution unit is further to claim 2 , responsive to the fault claim 2 , execute a fault handler of the operating system to:disable memory accesses by other agents; andemulate execution of the instruction without requiring a bus lock.6. The processor of claim 2 , wherein the memory access request comprises a locked operation to uncacheable memory.7. The processor of claim 2 , wherein the memory access request comprises a locked operation that spans multiple cache lines.8. The processor of claim 2 , wherein the memory access request comprises a page-walk from a page table in uncacheable memory.9. The processor of claim 2 , wherein the execution unit is further to claim 2 , responsive to a determination ...
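The fault-on-bus-lock behaviour can be sketched as a small decision procedure: a locked access that spans cache lines (a split lock) or targets uncacheable memory needs a bus lock, and with the bus-lock-disable bit set it faults instead. The line size, function names, and return strings are invented for illustration:

```python
CACHE_LINE = 64   # assumed cache-line size

def needs_bus_lock(addr, size, uncacheable=False):
    """Locked operations that span cache lines or hit uncacheable memory
    cannot be satisfied with a cache lock."""
    spans_lines = addr // CACHE_LINE != (addr + size - 1) // CACHE_LINE
    return spans_lines or uncacheable

def execute_locked_access(addr, size, bus_lock_disable, uncacheable=False):
    if needs_bus_lock(addr, size, uncacheable):
        if bus_lock_disable:
            return "#GP fault"          # signal a fault to the OS
        return "bus lock acquired"
    return "cache lock"                 # within one line: no bus lock needed

# An 8-byte locked access at offset 60 crosses a 64-byte line boundary.
assert execute_locked_access(60, 8, bus_lock_disable=True) == "#GP fault"
assert execute_locked_access(0, 8, bus_lock_disable=True) == "cache lock"
```

The fault handler can then, as the claims describe, emulate the access without a bus lock, keeping one misbehaving agent from stalling the whole system.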

More details
13-02-2020 publication date

Write data allocation in storage system

Number: US20200050552A1
Author: Gang Lyu, HUI Zhang
Assignee: International Business Machines Corp

This disclosure provides a method, a computing system and a computer program product for allocating write data in a storage system. The storage system comprises a Non-Volatile Write Cache (NVWC) and a backend storage subsystem, and the write data comprises first data whose addresses are not in the NVWC. The method includes checking fullness of the NVWC, and determining at least one of a write-back mechanism or a write-through mechanism as a write mode for the first data based on the checked fullness.
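The fullness check above reduces to a watermark decision: route first data (addresses not already in the NVWC) through the cache while there is headroom, and straight through to the backend once the NVWC is too full. The threshold and function name are invented for illustration:

```python
def choose_write_mode(nvwc_used, nvwc_capacity, high_watermark=0.8):
    """Pick write-back (via the NVWC) or write-through (straight to the
    backend storage subsystem) based on NVWC fullness."""
    fullness = nvwc_used / nvwc_capacity
    return "write-through" if fullness >= high_watermark else "write-back"

assert choose_write_mode(nvwc_used=70, nvwc_capacity=100) == "write-back"
assert choose_write_mode(nvwc_used=90, nvwc_capacity=100) == "write-through"
```

Switching to write-through under pressure keeps the NVWC from becoming a bottleneck: new writes stop competing with destage traffic for cache space.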

More details
10-03-2022 publication date

MEMORY ARRAY PAGE TABLE WALK

Number: US20220075733A1
Author: Lea Perry V.
Assignee:

An example memory array page table walk can include using an array of memory cells configured to store a page table. The page table walk can include using sensing circuitry coupled to the array. The page table walk can include using a controller coupled to the array. The controller can be configured to operate the sensing circuitry to determine a physical address of a portion of data by accessing the page table in the array of memory cells. The controller can be configured to operate the sensing circuitry to cause storing of the portion of data in a buffer. 120-. (canceled)21. A system , comprising:a host; and an array of memory cells coupled to a plurality of sense lines, the array configured to store a page table;', 'a plurality of sense amplifiers coupled to the plurality of sense lines;', 'a compute component coupled to at least one of the plurality of sense amplifiers; and', 'a memory controller coupled to the compute component and configured to control the compute component to perform a number of operations to determine a physical address of a portion of data by accessing the page table;, 'a memory device coupled to the host, the memory device comprising the host is configured to send a request to access the portion of data; and', receive the request; and', 'perform the number of operations in response to receiving the request., 'the memory device is configured to], 'wherein22. The system of claim 21 , wherein the host comprises a processing resource configured to generate the request to send to the memory device.23. The system of claim 21 , wherein the memory controller is further configured to store the portion of data in a buffer claim 21 , wherein the buffer is a translation lookaside buffer (TLB).24. The system of claim 21 , wherein the memory controller is further configured to access the portion of data using the determined physical address.25. 
The system of claim 24 , wherein the memory controller is further configured to send the accessed portion of ...
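The in-array walk above — resolving a virtual page through a page table stored in the memory array itself, then buffering the translation — can be modelled with a flat list standing in for the array. The two-level layout, field widths, and `page_table_walk` helper are invented for illustration:

```python
def page_table_walk(array, virtual_page, levels=2, fanout=4):
    """array: flat list acting as the memory; the table occupies its head,
    one node of `fanout` slots per level, each slot holding a next index
    (or, at the leaf level, the physical address)."""
    node = 0                                   # root node lives at index 0
    for level in range(levels - 1, -1, -1):
        slot = (virtual_page >> (level * 2)) & (fanout - 1)  # 2 bits/level
        node = array[node + slot]
    return node                                # resolved physical address

# Two-level table: root slot 0 points to a node at index 4; that node's
# slot 1 holds physical address 0x80.
array = [4, 0, 0, 0,  0, 0x80, 0, 0] + [0] * 8
tlb = {}                                       # buffer for translations
vpn = 0b0001                                   # level-1 slot 0, level-0 slot 1
pa = page_table_walk(array, vpn)
tlb[vpn] = pa                                  # store in the TLB-like buffer
assert pa == 0x80
```

Doing the walk with sensing circuitry next to the array spares the host the round trips a conventional walker would spend fetching each table level.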

More details
01-03-2018 publication date

DETECTING BUS LOCKING CONDITIONS AND AVOIDING BUS LOCKS

Number: US20180060099A1
Assignee:

A processor may include a register to store a bus-lock-disable bit and an execution unit to execute instructions. The execution unit may receive an instruction that includes a memory access request. The execution may further determine that the memory access request requires acquiring a bus lock, and, responsive to detecting that the bus-lock-disable bit indicates that bus locks are disabled, signal a fault to an operating system. 1. A processor comprising:a register to store a bus-lock-disable bit; and receive an instruction that includes a memory access request;', 'determine that the memory access request requires acquiring a bus lock; and', 'responsive to detecting that the bus-lock-disable bit indicates that bus locks are disabled, signal a fault to an operating system., 'an execution unit to execute instructions, wherein the execution unit is to2. The processor of claim 1 , wherein the fault is a general protection fault.3. The processor of claim 1 , wherein the register is a model-specific register.4. The processor of claim 1 , wherein the execution unit is further to terminate execution of the instruction responsive to detecting that the bus-lock-disable bit is enabled.5. The processor of claim 1 , wherein the execution unit is further to claim 1 , responsive to the fault claim 1 , execute a fault handler of the operating system to:disable memory accesses by other agents; andemulate execution of the instruction without requiring a bus lock.6. The processor of claim 1 , wherein the memory access request comprises a locked operation to uncacheable memory.7. The processor of claim 1 , wherein the memory access request comprises a locked operation that spans multiple cache lines.8. The processor of claim 1 , wherein the memory access request comprises a page-walk from a page table in uncacheable memory.9. 
A system on a chip (SoC) comprising:a memory to store a virtual machine control structure (VMCS); anda core coupled to the memory, wherein the core is to execute ...

More details
02-03-2017 publication date

GPU SHARED VIRTUAL MEMORY WORKING SET MANAGEMENT

Number: US20170060743A1
Author: Kumar Derek R.
Assignee:

A method and apparatus of a device that manages virtual memory for a graphics processing unit is described. In an exemplary embodiment, the device manages a graphics processing unit working set of pages. In this embodiment, the device determines the set of pages of the device to be analyzed, where the device includes a central processing unit and the graphics processing unit. The device additionally classifies the set of pages based on a graphics processing unit activity associated with the set of pages and evicts a page of the set of pages based on the classifying.

1. A non-transitory machine-readable medium having executable instructions to cause one or more processing units to perform a method to manage a graphics processing unit working set of pages, the method comprising: determining the set of pages of a device to be analyzed, wherein the device includes a central processing unit and the graphics processing unit; classifying the set of pages based on a graphics processing unit activity associated with the set of pages; and evicting a page of the set of pages based on the classifying.
2. The non-transitory machine-readable medium of claim 1, further comprising: clearing a graphics processing unit reference bit from each page table entry associated with the set of pages.
3. The non-transitory machine-readable medium of claim 1, further comprising: waiting a time period before re-determining the set of pages to be analyzed.
4. The non-transitory machine-readable medium of claim 1, wherein each page in the set of pages is a contiguous block of virtual memory.
5. The non-transitory machine-readable medium of claim 1, wherein the classification of each page is based on if a graphics processing unit reference bit is set in a page table entry associated with that page.
6. The non-transitory machine-readable medium of claim 5, wherein the graphics processing unit reference bit for a page table entry indicates whether a graphics processing unit has accessed a virtual memory ...
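The scan-and-evict cycle described in these claims (clear reference bits, wait an interval, classify by GPU activity, evict an unreferenced page) resembles a classic clock-style working-set scan. A minimal Python sketch under assumed data structures — a dict-based page table with a per-page `gpu_referenced` bit; all names are illustrative, not from the patent:

```python
# Illustrative model: each page table entry carries a GPU reference bit.
# Pages whose bit is still clear after a scan interval are eviction candidates.

def scan_working_set(page_table):
    """Classify pages by GPU activity into active and inactive sets."""
    active, inactive = [], []
    for page, entry in page_table.items():
        (active if entry["gpu_referenced"] else inactive).append(page)
    return active, inactive

def evict_and_rearm(page_table):
    """Evict one inactive page, then clear reference bits for the next scan."""
    active, inactive = scan_working_set(page_table)
    victim = inactive[0] if inactive else None
    if victim is not None:
        del page_table[victim]
    for entry in page_table.values():
        entry["gpu_referenced"] = False   # re-arm bits for the next interval
    return victim

pages = {
    0x1000: {"gpu_referenced": True},
    0x2000: {"gpu_referenced": False},  # not touched by the GPU -> candidate
    0x3000: {"gpu_referenced": True},
}
victim = evict_and_rearm(pages)
```

A real driver would read the reference bits out of hardware page table entries and honor the waiting period of claim 3 between scans; the dict stands in for both here.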

Publication date: 11-03-2021

Processing method and apparatus for translation lookaside buffer flush instruction

Number: US20210073144A1
Author: Ren Guo
Assignee: Alibaba Group Holding Ltd

The present invention discloses an instruction processing apparatus, including: a first register adapted to store address information; a second register adapted to store address space identification information; a decoder adapted to receive and decode a translation lookaside buffer flush instruction, where the translation lookaside buffer flush instruction indicates that the first register serves as a first operand and the second register serves as a second operand; and an execution unit coupled to the first register, the second register, and the decoder, which executes the decoded translation lookaside buffer flush instruction so as to acquire address information from the first register, acquire address space identification information from the second register, and broadcast the acquired address information and address space identification information on a bus coupled to the instruction processing apparatus, so that another processing unit coupled to the bus purges the translation lookaside buffer entries corresponding to the address information in the address space indicated by the address space identification information. The present invention also discloses a corresponding instruction processing method, a computing system, and a system-on-chip.
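The broadcast behavior described here — one unit executes the flush with (address, address-space ID) operands, and every unit on the bus purges the matching entry — can be modeled in a few lines. A toy Python sketch; the class and method names are illustrative, not from the patent:

```python
# Toy model of the broadcast TLB flush: one core executes the instruction
# with (vaddr, asid) register operands; the bus delivers the pair to every
# attached unit, which drops the matching entry from its own TLB.

class Core:
    def __init__(self):
        self.tlb = {}  # (asid, vaddr) -> paddr

    def on_bus_flush(self, vaddr, asid):
        self.tlb.pop((asid, vaddr), None)   # purge only the matching entry

class Bus:
    def __init__(self, cores):
        self.cores = cores

    def broadcast_flush(self, vaddr, asid):
        for core in self.cores:
            core.on_bus_flush(vaddr, asid)

cores = [Core(), Core()]
for c in cores:
    c.tlb[(7, 0x4000)] = 0x9000
    c.tlb[(8, 0x4000)] = 0xA000   # same address, different ASID: must survive

bus = Bus(cores)
bus.broadcast_flush(0x4000, 7)    # one core executes the flush instruction
```

The point of keying on the ASID is visible in the example: the flush removes the translation only in the indicated address space, leaving ASID 8's entry intact everywhere.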

Publication date: 17-03-2016

Cache Bank Spreading For Compression Algorithms

Number: US20160077973A1
Assignee: Qualcomm Inc

Aspects include computing devices, systems, and methods for implementing cache memory access requests for compressed data using cache bank spreading. In an aspect, cache bank spreading may include determining whether the compressed data of the cache memory access fits on a single cache bank. In response to determining that the compressed data fits on a single cache bank, a cache bank spreading value may be calculated to replace/reinstate bank selection bits of the physical address for a cache memory of the cache memory access request that may be cleared during data compression. A cache bank spreading address in the physical space of the cache memory may include the physical address of the cache memory access request plus the reinstated bank selection bits. The cache bank spreading address may be used to read compressed data from or write compressed data to the cache memory device.
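The abstract does not say how the spreading value is derived, only that it reinstates the cleared bank-select bits. One plausible construction — assumed here purely for illustration — is to XOR-fold the upper address bits into the bank-select field, so that compressed lines whose bank bits were cleared no longer all land on bank 0:

```python
# Sketch of cache bank spreading. Geometry (4 banks, 64-byte lines) and the
# XOR-fold derivation of the spreading value are assumptions for illustration.

BANK_BITS = 2          # 4 banks
BANK_SHIFT = 6         # bank-select bits sit above a 64-byte line offset

def bank_spread_address(paddr):
    """Reinstate cleared bank-select bits from a fold of the upper bits."""
    upper = paddr >> (BANK_SHIFT + BANK_BITS)
    spread = 0
    while upper:
        spread ^= upper & ((1 << BANK_BITS) - 1)   # XOR-fold upper bits
        upper >>= BANK_BITS
    return paddr | (spread << BANK_SHIFT)          # paddr plus reinstated bits

def bank_of(paddr):
    return (paddr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)

# Compressed lines whose bank bits were cleared all map to bank 0 ...
addrs = [0x1000, 0x2000, 0x3000, 0x4000]
banks_before = {bank_of(a) for a in addrs}
# ... but spread across banks once the spreading value is reinstated.
banks_after = {bank_of(bank_spread_address(a)) for a in addrs}
```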

Publication date: 15-03-2018

DDR STORAGE ADAPTER

Number: US20180074971A1
Assignee:

A method of accessing a persistent memory over a memory interface is disclosed. In one embodiment, the method includes allocating a virtual address range comprising virtual memory pages to be associated with physical pages of a memory buffer and marking each page table entry associated with the virtual address range as not having a corresponding one of the physical pages of the memory buffer. The method further includes generating a page fault when one or more of the virtual memory pages within the virtual address range is accessed and mapping page table entries of the virtual memory pages to the physical pages of the memory buffer. The method further includes transferring data between a physical page of the persistent memory and one of the physical pages of the memory buffer mapped to a corresponding one of the virtual memory pages.

1. A method of accessing a persistent memory over a memory interface comprising: allocating a virtual address range comprising virtual memory pages to be associated with physical pages of a memory buffer; marking each page table entry associated with the virtual address range as not having a corresponding one of the physical pages of the memory buffer; generating a page fault when one or more of the virtual memory pages within the virtual address range is accessed; mapping page table entries of the virtual memory pages to the physical pages of the memory buffer; and transferring data between a physical page of the persistent memory and one of the physical pages of the memory buffer mapped to a corresponding one of the virtual memory pages.
2. The method of claim 1, wherein the memory buffer comprises at least one of a host buffer and a DIMM buffer.
3. The method of claim 2, wherein the mapping comprises: detecting if a physical page of the DIMM buffer is available; and if the physical page of the DIMM buffer is available, updating the page table entry of one of the virtual memory pages with the physical page of the DIMM buffer.
4. The method of ...
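The demand-paging flow of claim 1 — entries start "not present", the first access faults, the handler maps a free buffer page and transfers data in from persistent storage — can be simulated compactly. A Python sketch; the class, field names, and page sizes are illustrative assumptions:

```python
# Toy model of fault-driven mapping: virtual pages start with no backing
# buffer page; the first access "faults", grabs a free buffer page, and
# fills it from persistent storage.

class DDRAdapter:
    def __init__(self, persistent, n_buffer_pages):
        self.persistent = persistent                 # persistent page -> data
        self.free_buffer_pages = list(range(n_buffer_pages))
        self.page_table = {}                         # vpage -> buffer page
        self.buffer = {}                             # buffer page -> data
        self.faults = 0

    def read(self, vpage):
        if vpage not in self.page_table:             # "not present" -> fault
            self.faults += 1
            bpage = self.free_buffer_pages.pop()
            self.buffer[bpage] = self.persistent[vpage]   # transfer in
            self.page_table[vpage] = bpage           # map the PTE
        return self.buffer[self.page_table[vpage]]

adapter = DDRAdapter({0: b"alpha", 1: b"beta"}, n_buffer_pages=4)
first = adapter.read(0)    # faults, maps, and transfers
second = adapter.read(0)   # already mapped: served from the buffer
```

On a real system the fault would come from the MMU and the transfer would run over the memory interface; the dict lookups stand in for both.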

Publication date: 15-03-2018

SELECTIVE PURGING OF PCI I/O ADDRESS TRANSLATION BUFFER

Number: US20180074972A1
Assignee:

Embodiments relate to enhancing a refresh PCI translation (RPCIT) instruction to refresh a translation lookaside buffer (TLB). A computer processor determines a request to purge a translation for a single frame of the TLB in response to executing an enhanced RPCIT instruction. The enhanced RPCIT instruction is configured to selectively perform one of a single-frame TLB refresh operation or a range-bounded TLB refresh operation. The computer processor determines an absolute storage frame based on a translation of a PCI virtual address in response to the request to purge a translation for a single frame of the TLB. The computer processor further performs the single-frame TLB refresh operation to purge the translation for the single frame.

1. A method of improving efficiency of a refresh operation on a translation lookaside buffer (TLB) using an enhanced refresh PCI translation (RPCIT) operation, the method comprising: determining a data storage access request corresponding to a PCI function; in response to the data storage access request, performing, by a computer processor, a plurality of storage access operations to access storage data stored in the TLB; analyzing, by the computer processor, an enhanced RPCIT instruction block indicating a request to perform at least one RPCIT instruction for performing a series of refresh operations to purge at least one translation from the TLB; and purging, by the computer processor, the at least one translation from the TLB in response to executing the at least one RPCIT instruction, wherein the computer processor selectively sets a bit value of a synchronization bypass (SB) control bit included in the enhanced RPCIT instruction block to perform a synchronization bypass operation, wherein the computer processor selectively sets the SB control bit value to a first bit value to bypass the synchronization operation such that the execution of the at least one RPCIT instruction does not wait ...
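The two modes of the enhanced operation — purge a single frame, or purge a bounded range of frames — reduce to a simple loop over frame-aligned addresses. A toy Python sketch with a dict standing in for the PCI I/O TLB; frame size and function names are assumptions:

```python
# Toy I/O TLB keyed by 4 KiB frame address; the "enhanced" operation either
# purges one frame or every frame in a bounded range, mirroring the
# single-frame and range-bounded refresh modes described above.

PAGE = 0x1000

def rpcit_purge(tlb, start, end=None):
    """Purge one frame (end is None) or the frames in [start, end]."""
    if end is None:
        tlb.pop(start & ~(PAGE - 1), None)       # single-frame refresh
        return
    frame = start & ~(PAGE - 1)
    while frame <= end:                          # range-bounded refresh
        tlb.pop(frame, None)
        frame += PAGE

tlb = {0x0000: "t0", 0x1000: "t1", 0x2000: "t2", 0x3000: "t3"}
rpcit_purge(tlb, 0x1000)                 # purge just one translation
rpcit_purge(tlb, 0x2000, end=0x3000)     # purge a bounded range
```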

Publication date: 07-03-2019

USING REAL SEGMENTS AND ALTERNATE SEGMENTS IN NON-VOLATILE STORAGE

Number: US20190073317A1
Assignee:

Provided are techniques for using real segments and alternate segments in Non-Volatile Storage (NVS). One or more write requests for a track are executed by alternating between storing data in one or more sectors of real segments and one or more sectors of alternate segments for each of the write requests, while setting indicators in a real sector structure and an alternate sector structure. In response to determining that the one or more write requests for the track have completed, the data stored in the one or more sectors of the real segments and in the one or more sectors of the alternate segments are merged to form newly written data. In response to determining that a hardened, previously written data of a track does exist in Non-Volatile Storage (NVS), the newly written data is merged with the hardened, previously written data in the NVS. The merged data is committed.

1. A computer program product, the computer program product comprising a computer readable storage medium having program code embodied therewith, the program code executable by at least one processor to perform: executing one or more write requests for a track by alternating between storing data in one or more sectors of real segments and one or more sectors of alternate segments for each of the write requests, while setting one or more corresponding indicators for the one or more sectors in a real sector structure and an alternate sector structure to indicate whether the data for that sector is stored in the real segments or in the alternate segments; and in response to determining that the one or more write requests for the track have completed: merging the data stored in the one or more sectors of the real segments and in the one or more sectors of the alternate segments to form newly written data; in response to determining that a hardened, previously written data of a track does exist in Non-Volatile Storage (NVS), merging the newly written data with the hardened, previously written data in the NVS; and committing the merged data. ...
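The two merge steps above — build the new track from whichever segment type each sector's indicator points at, then overlay it onto any hardened copy — can be sketched with plain lists. All structure names and the list-of-sectors representation are illustrative assumptions:

```python
# Toy merge: per-sector indicator bits say whether the latest data for a
# sector lives in the real segments or the alternate segments; None marks
# a sector that was not written.

def merge_track(real, alternate, alt_indicator):
    """Build the newly written track, taking each sector from the segment
    type its indicator selects (True -> alternate segments)."""
    return [alternate[i] if alt_indicator[i] else real[i]
            for i in range(len(real))]

def commit(hardened, newly_written):
    """Overlay newly written sectors onto previously hardened track data."""
    return [new if new is not None else old
            for old, new in zip(hardened, newly_written)]

real      = ["r0", None, "r2", None]
alternate = [None, "a1", None, None]
indicator = [False, True, False, False]   # sector 1 was written to alternate
newly_written = merge_track(real, alternate, indicator)
committed = commit(["h0", "h1", "h2", "h3"], newly_written)
```

Sector 3 was never written in this batch, so the committed track keeps the hardened value there while sectors 0-2 take the newly written data.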

Publication date: 24-03-2022

Unified Memory Management for a Multiple Processor System

Number: US20220091981A1
Assignee:

Various multi-processor unified memory management systems and methods are detailed herein. In embodiments detailed herein, inter-chip memory management modules may be executed by processors that are in communication via an inter-chip link. A flat memory map may be used across the multiple processors of the system. Each inter-chip memory management module may analyze memory transactions. If the memory transaction is directed to a portion of the flat memory map managed by another processor, the memory transaction may be translated to a non-memory mapped transaction and transmitted via an inter-chip communication link.

1. A multi-processor unified memory management system, comprising: a first programmable processor system that communicates via an inter-chip link with a second programmable processor system, wherein the first programmable processor system comprises a first inter-chip memory management module configured to: analyze memory access transactions; translate outbound memory-mapped transactions into non-memory mapped transactions comprising coded memory address data; and translate inbound non-memory mapped transactions into memory-mapped transactions based on coded memory address data; and the second programmable processor system that communicates via the inter-chip link with the first programmable processor system, wherein the second programmable processor system comprises a second inter-chip memory management module configured to: analyze memory access transactions; translate outbound memory-mapped transactions into non-memory mapped transactions comprising coded memory address data; and translate inbound non-memory mapped transactions into memory-mapped transactions based on coded memory address data.
2. The multi-processor unified memory management system of claim 1, wherein the first inter-chip memory management module is further configured to: analyze a memory access transaction; determine that the memory access transaction involves a ...
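The routing decision at the heart of this scheme — local flat-map addresses go to local memory, remote ones are wrapped into a non-memory-mapped link message carrying coded address data — is easy to model. A Python sketch; the address split, message format, and function names are illustrative assumptions:

```python
# Toy router for a flat memory map split between two chips: chip A owns the
# low half of the map; accesses to the high half are encoded into an
# inter-chip link message instead of a memory-mapped access.

LOCAL_BASE, LOCAL_LIMIT = 0x0000_0000, 0x8000_0000

def route(addr, payload):
    """Outbound side: keep local accesses, encode remote ones for the link."""
    if LOCAL_BASE <= addr < LOCAL_LIMIT:
        return ("local", addr, payload)
    return ("inter_chip", {"coded_addr": addr - LOCAL_LIMIT, "data": payload})

def decode_inbound(msg):
    """Remote side: turn the link message back into a memory-mapped access."""
    return LOCAL_LIMIT + msg["coded_addr"], msg["data"]

kind, msg = route(0x9000_0000, b"x")   # falls in the remote half of the map
addr, data = decode_inbound(msg)       # remote chip reconstructs the access
```

The round trip recovers the original flat-map address, which is what lets both chips present one unified address space while the link itself carries only coded offsets.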

Publication date: 18-03-2021

MEMORY SYSTEM

Number: US20210081329A1
Assignee:

A memory system is connectable to the host. The memory system includes a nonvolatile first memory, a second memory in which a plurality of pieces of first information each correlating a logical address indicating a location in a logical address space of the memory system with a physical address indicating a location in the first memory are stored, a volatile third memory including a first cache and a second cache, a compressor configured to perform compression on the plurality of pieces of first information, and a memory controller. The memory controller stores the first information not compressed by the compressor in the first cache, stores the first information compressed by the compressor in the second cache, and controls a ratio between a first capacity, which is a capacity of the first cache, and a second capacity, which is a capacity of the second cache.

1. A memory system connectable to a host, comprising: a nonvolatile first memory; a second memory in which a plurality of pieces of first information each correlating a logical address indicating a location in a logical address space of the memory system with a physical address indicating a location in the first memory are stored; a volatile third memory including a first cache and a second cache; a compressor configured to perform compression on the plurality of pieces of first information; and a memory controller configured to store the first information not compressed by the compressor in the first cache, store the first information compressed by the compressor in the second cache, and control a ratio between a first capacity, which is a capacity of the first cache, and a second capacity, which is a capacity of the second cache.
2. The memory system according to claim 1, wherein the memory controller is configured to determine a frequency of sequential writes based on a plurality of write commands from the host, and adjust the ratio according to the frequency of sequential writes.
3. The memory system according to ...
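Claim 2 ties the split between the uncompressed and compressed caches to the observed frequency of sequential writes, without fixing a policy. One illustrative policy — an assumption here, giving the compressed cache a larger share as sequential writes dominate — looks like this in Python:

```python
# Toy controller: measure what fraction of recent writes were sequential
# and derive the compressed-cache share of a fixed mapping-cache budget.
# The policy direction (more sequential -> more compressed capacity) and
# the 25%-75% bounds are assumptions for illustration.

def sequential_fraction(write_lbas):
    """Fraction of consecutive write pairs whose LBAs are adjacent."""
    seq = sum(1 for a, b in zip(write_lbas, write_lbas[1:]) if b == a + 1)
    return seq / max(len(write_lbas) - 1, 1)

def split_budget(total_entries, seq_frac, min_share=0.25):
    """Give the compressed cache between 25% and 75% of the budget."""
    comp_share = min_share + (1 - 2 * min_share) * seq_frac
    compressed = int(total_entries * comp_share)
    return total_entries - compressed, compressed  # (uncompressed, compressed)

frac = sequential_fraction([100, 101, 102, 103, 500])
uncomp, comp = split_budget(1000, frac)
```

Sequential runs compress well, so shifting capacity toward the compressed cache when the workload turns sequential lets the same RAM cover more of the logical-to-physical table.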

Publication date: 22-03-2018

CLOUD STORAGE SYSTEM

Number: US20180081562A1
Author: Vasudevan Suresh
Assignee:

Methods, systems, and computer readable media for execution by a cloud storage system are provided. One example method is for storage processing on a cloud system. The method includes executing a storage application on a compute node of the cloud system, and the storage application is configured to process write commands and read commands to and from storage of the cloud system. The write commands and the read commands are from an application. The method includes processing, by the storage application, a write command from the application. The processing includes writing data blocks to memory cache provided by the compute node for the storage application; writing data blocks written to memory cache to a write cache of a block storage that is part of the storage of the cloud system; and writing select data blocks written to memory cache to a read cache of block storage that is part of storage of the cloud system. The method further includes coalescing, by the storage application, the data blocks to produce data segments and writing, by the storage application, the data segments to object storage that is part of storage of the cloud system. The methods also include management of a read path via the storage application.

1. A method for storage processing on a cloud system, comprising: executing a storage application on a compute node of the cloud system, the storage application configured to process write commands and read commands to and from storage of the cloud system, the write commands and the read commands being from an application; processing, by the storage application, a write command from the application, the processing comprising: writing data blocks to memory cache provided by the compute node for the storage application; writing data blocks written to memory cache to a write cache of a block storage that is part of the storage of the cloud system; writing select data blocks written to memory cache to a read cache of block storage that is part of storage of the cloud system; and coalescing, by the storage application, data blocks to produce data ...

Publication date: 24-03-2016

IMMEDIATE BRANCH RECODE THAT HANDLES ALIASING

Number: US20160085550A1
Assignee:

A system and method for efficiently indicating branch target addresses. A semiconductor chip predecodes instructions of a computer program prior to installing the instructions in an instruction cache. In response to determining a particular instruction is a control flow instruction with a displacement relative to a program counter address (PC), the chip replaces a portion of the PC relative displacement in the particular instruction with a subset of a target address. The subset of the target address is an untranslated physical subset of the full target address. When the recoded particular instruction is fetched and decoded, the remaining portion of the PC relative displacement is added to a virtual portion of the PC used to fetch the particular instruction. The result is concatenated with the portion of the target address embedded in the fetched particular instruction to form a full target address.

1. A processor comprising: an interface to a memory located external to a cache subsystem, wherein the interface is configured to send requests comprising physical fetch addresses to the memory for instructions; and control logic configured to: receive one or more instructions from the memory; and in response to determining a first instruction of the received one or more instructions comprises a control flow instruction with a program counter (PC) relative displacement, replace a lower portion of the relative displacement in the first instruction with a lower portion of a virtual target address for a next instruction to fetch while an upper portion of the relative displacement is not replaced in the first instruction, wherein the upper portion of the relative displacement corresponds to bit positions of a virtual portion of the virtual target address and the lower portion of the relative displacement corresponds to bit positions of a physical portion of the virtual target address.
2. The processor as recited in claim 1, wherein the processor further comprises an ...
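The recode arithmetic above splits the target address at the translation boundary: the untranslated low bits are embedded at predecode, and at fetch the kept upper displacement is added to the PC's upper bits and concatenated with the embedded bits. A Python sketch assuming 4 KiB pages (12 untranslated bits); the field widths and function names are illustrative:

```python
# Sketch of the recode/fetch math with 4 KiB pages: the low 12 bits of an
# address are identical in virtual and physical space, so they can be
# embedded verbatim at predecode and concatenated back at fetch time.

PAGE_BITS = 12                      # assumed: low 12 bits are untranslated
MASK = (1 << PAGE_BITS) - 1

def predecode(pc, disp):
    """Recode a PC-relative branch: keep an upper (page-granular)
    displacement, embed the target's untranslated low bits."""
    target = pc + disp
    upper_disp = (target >> PAGE_BITS) - (pc >> PAGE_BITS)
    embedded_low = target & MASK
    return upper_disp, embedded_low

def fetch_target(pc, upper_disp, embedded_low):
    """Rebuild the full target: add the upper displacement to the PC's
    upper (virtual) bits, concatenate the embedded (physical) low bits."""
    return (((pc >> PAGE_BITS) + upper_disp) << PAGE_BITS) | embedded_low

pc, disp = 0x0040_1FF8, 0x20        # a branch that crosses a page boundary
upper, low = predecode(pc, disp)
target = fetch_target(pc, upper, low)
```

The page-crossing example shows why the upper displacement is computed from the final target rather than taken verbatim: the carry out of the low bits is folded into it at predecode, so no addition on the embedded bits is needed at fetch.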

Publication date: 24-03-2016

Cache Hashing

Number: US20160085672A1
Author: Simon Fenney
Assignee: Imagination Technologies Ltd

Cache logic generates a cache address from an input memory address that includes a first binary string and a second binary string. The cache logic includes a hashing engine configured to generate a third binary string from the first binary string and to form each bit of the third binary string by combining a respective subset of bits of the first binary string by a first bitwise operation, wherein the subsets of bits of the first binary string are defined at the hashing engine such that each subset is unique and comprises approximately half of the bits of the first binary string; and a combination unit arranged to combine the third binary string with the second binary string by a reversible operation so as to form a binary output string for use as at least part of a cache address in a cache memory.
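The abstract's construction — each output bit XORs a unique subset of roughly half the first string's bits, and the result is combined with the second string by a reversible operation — can be sketched directly. A Python version; the subset selection via a seeded RNG and the concrete widths are illustrative assumptions:

```python
# Sketch of the hashing engine: each output bit XORs a unique subset of
# about half the upper-address bits; the hash is then XOR-combined (a
# reversible operation) with the lower bits to form the cache index.

import random

def make_subsets(n_in, n_out, seed=42):
    """Pick n_out distinct subsets, each of n_in // 2 input bit positions."""
    rng = random.Random(seed)
    subsets = set()
    while len(subsets) < n_out:
        subsets.add(frozenset(rng.sample(range(n_in), n_in // 2)))
    return list(subsets)

def hash_bits(upper, subsets):
    out = 0
    for i, subset in enumerate(subsets):
        bit = 0
        for b in subset:
            bit ^= (upper >> b) & 1      # XOR the chosen subset of bits
        out |= bit << i
    return out

def cache_index(addr, subsets, index_bits=8):
    upper, lower = addr >> index_bits, addr & ((1 << index_bits) - 1)
    return hash_bits(upper, subsets) ^ lower   # reversible combine

subsets = make_subsets(n_in=24, n_out=8)
idx_a = cache_index(0x12345678, subsets)
idx_b = cache_index(0x12345678, subsets)
```

Because XOR with a fixed hash is a bijection on the lower bits, addresses that differ only in the lower string still map to distinct cache sets, while the upper-bit hash breaks up pathological strides.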

Publication date: 26-03-2015

Translation Bypass In Multi-Stage Address Translation

Number: US20150089150A1
Assignee: Cavium, Inc.

A computer system that supports virtualization may maintain multiple address spaces. Each guest operating system employs guest virtual addresses (GVAs), which are translated to guest physical addresses (GPAs). A hypervisor, which manages one or more guest operating systems, translates GPAs to root physical addresses (RPAs). A merged translation lookaside buffer (MTLB) caches translations between the multiple addressing domains, enabling faster address translation and memory access. The MTLB can be logically addressable as multiple different caches, and can be reconfigured to allot different spaces to each logical cache. Lookups to the caches of the MTLB can be selectively bypassed based on a control configuration and the attributes of a received address.

1. A circuit comprising: a cache configured to store translations between address domains, the cache addressable as a first logical portion and a second logical portion, the first logical portion configured to store translations between a first address domain and a second address domain, the second logical portion configured to store translations between the second address domain and a third address domain; and a processor configured to 1) control a bypass of at least one of the first and second logical portions with respect to an address request and 2) match the address request against a non-bypassed portion of the cache in accordance with the bypass and output a corresponding address result.
2. The circuit of claim 1, wherein the processor is further configured to control the bypass based on an address indicated by the address request.
3. The circuit of claim 2, wherein the processor is further configured to control the bypass based on whether the address indicated by the address request specifies a subset of a memory excluded from a given address translation.
4. The circuit of claim 2, wherein the processor is further configured to control the bypass based on whether the address indicated by the address request is ...
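The two logical portions and their selective bypass reduce to a two-stage lookup in which either stage can be skipped. A toy Python model of the GVA → GPA → RPA path; the dict-backed stages and flag names are illustrative assumptions:

```python
# Toy two-stage lookup (GVA -> GPA -> RPA) over one merged TLB, where
# either logical portion can be bypassed by configuration.

class MergedTLB:
    def __init__(self, gva_to_gpa, gpa_to_rpa):
        self.stage1 = gva_to_gpa     # first logical portion
        self.stage2 = gpa_to_rpa     # second logical portion

    def translate(self, addr, bypass_stage1=False, bypass_stage2=False):
        gpa = addr if bypass_stage1 else self.stage1[addr]
        return gpa if bypass_stage2 else self.stage2[gpa]

mtlb = MergedTLB({0x1000: 0x5000}, {0x5000: 0x9000, 0x2000: 0x7000})
full = mtlb.translate(0x1000)                            # guest VA -> root PA
root_only = mtlb.translate(0x2000, bypass_stage1=True)   # guest PA -> root PA
untranslated = mtlb.translate(0x3000, bypass_stage1=True,
                              bypass_stage2=True)        # identity-mapped
```

The three calls correspond to a guest user access, a guest access to what it believes is physical memory, and an address range excluded from translation altogether, as in the claims.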

Publication date: 23-03-2017

METHOD AND APPARATUS FOR STACKING CORE AND UNCORE DIES HAVING LANDING SLOTS

Number: US20170084593A1
Author: RUSU Stefan
Assignee:

A method is described for stacking a plurality of cores. For example, one embodiment comprises: mounting an uncore die on a package, the uncore die comprising a plurality of exposed landing slots, each landing slot including an inter-die interface usable to connect vertically to a cores die, the uncore die including a plurality of uncore components usable by cores within the cores die; and vertically coupling a first cores die comprising a first plurality of cores on top of the uncore die, the cores spaced on the first cores die to correspond to all or a first subset of the landing slots on the uncore die, each of the cores having an inter-die interface positioned to be communicatively coupled to a corresponding inter-die interface within a landing slot on the uncore die when the first cores die is vertically coupled on top of the uncore die.

1. A method comprising: mounting an uncore die on a package, the uncore die comprising a plurality of exposed landing slots, each landing slot including an inter-die interface usable to connect vertically to a cores die, the uncore die including a plurality of uncore components usable by cores within the cores die, including a memory controller component, a level 3 (L3) cache, a system memory or system memory interface, and a core interconnect fabric or bus; and vertically coupling a first cores die comprising a first plurality of cores on top of the uncore die, the cores spaced on the first cores die to correspond to all or a first subset of the landing slots on the uncore die, each of the cores having an inter-die interface positioned to be communicatively coupled to a corresponding inter-die interface within a landing slot on the uncore die when the first cores die is vertically coupled on top of the uncore die, wherein the communicative coupling between the inter-die interface of a core and the inter-die interface of its corresponding landing slot communicatively couples the core to the uncore components of the uncore die.
2. ...

Publication date: 12-03-2020

Unified address space for multiple hardware accelerators using dedicated low latency links

Number: US20200081850A1
Assignee: Xilinx Inc

A system may include a host processor coupled to a communication bus, a first hardware accelerator communicatively linked to the host processor through the communication bus, and a second hardware accelerator communicatively linked to the host processor through the communication bus. The first hardware accelerator and the second hardware accelerator are directly coupled through an accelerator link independent of the communication bus. The host processor is configured to initiate a data transfer between the first hardware accelerator and the second hardware accelerator directly through the accelerator link.

Publication date: 19-06-2014

SYSTEM AND METHOD FOR VERSIONING BUFFER STATES AND GRAPHICS PROCESSING UNIT INCORPORATING THE SAME

Number: US20140168227A1
Author: Meixner Albert
Assignee: NVIDIA CORPORATION

A system and method for versioning states of a buffer. In one embodiment, the system includes: (1) a page table lookup and coalesce circuit operable to provide a page table directory request for a translatable virtual address of the buffer to a page table stored in a virtual address space and (2) a page directory processing circuit associated with the page table lookup and coalesce circuit and operable to provide a translated virtual address based on the virtual address and a page table load response received from the page table.

1. A system for versioning states of a buffer, comprising: a page table lookup and coalesce circuit operable to provide a page table directory request for a translatable virtual address of said buffer to a page table stored in a virtual address space; and a page directory processing circuit associated with said page table lookup and coalesce circuit and operable to provide a translated virtual address based on said virtual address and a page table load response received from said page table.
2. The system as recited in wherein said translatable virtual address is associated with a cache memory line of said buffer.
3. The system as recited in wherein a virtual base address in said translatable virtual address identifies a version of said buffer.
4. The system as recited in wherein said page directory processing circuit employs a two-level translation.
5. The system as recited in further comprising a constant translation cache lookup circuit operable to employ a cache of recently translated addresses to determine if a translation request can be fulfilled from a cache.
6. The system as recited in further comprising a translation miss buffer configured to contain miss request information.
7. The system as recited in wherein said translated virtual address is a sum of a page directory entry employed as a base address and a field of said translatable virtual address employed as an offset.
8. A method of versioning states of a buffer, comprising: ...

Publication date: 25-03-2021

DATA CONSISTENCY TECHNIQUES FOR PROCESSOR CORE, PROCESSOR, APPARATUS AND METHOD

Number: US20210089469A1
Assignee:

A processor core, a processor, an apparatus, and a method are disclosed. The processor core is coupled to a translation lookaside buffer and a first memory. The processor core further includes a memory processing module that includes: an instruction processing unit, adapted to identify a virtual memory operation instruction and send the virtual memory operation instruction to a bus request transceiver module; the bus request transceiver module, adapted to send the virtual memory operation instruction to an external interconnection unit; a forwarding request transceiver unit, adapted to receive the virtual memory operation instruction broadcast by the interconnection unit and send the virtual memory operation instruction to the virtual memory operation unit; and the virtual memory operation unit, adapted to perform a virtual memory operation according to the virtual memory operation instruction. An initiation core sends the virtual memory operation instruction to the interconnection unit. The interconnection unit determines, based on an operation address, to broadcast the virtual memory operation instruction to at least one of a plurality of processor cores, so that all the cores can process the virtual memory operation instruction using the same hardware logic, thereby reducing hardware logic of the processor core.

1. A processor core, wherein the processor core is coupled to a translation lookaside buffer and a first memory, and the processor core further comprises a memory processing module implemented in hardware logic, wherein the memory processing module comprises: an instruction processing unit, adapted to identify a virtual memory operation instruction from received instructions, and send the virtual memory operation instruction to a bus request transceiver module; the bus request transceiver module, adapted to send the virtual memory operation instruction to an interconnection unit; a forwarding request transceiver unit, adapted to receive the virtual memory ...

Publication date: 29-03-2018

PROCESSOR EXTENSIONS TO IDENTIFY AND AVOID TRACKING CONFLICTS BETWEEN VIRTUAL MACHINE MONITOR AND GUEST VIRTUAL MACHINE

Number: US20180088976A1
Assignee:

A processing system includes an execution unit, communicatively coupled to an architecturally-protected memory, the execution unit comprising a logic circuit to execute a virtual machine monitor (VMM) that supports a virtual machine (VM) comprising a guest operating system (OS) and to implement an architecturally-protected execution environment. The logic circuit is to: responsive to executing a blocking instruction by the guest OS directed at a first page stored in the architecturally-protected memory during a first time period identified by a value stored in a first counter, copy the value from the first counter to a second counter; responsive to executing a first tracking instruction issued by the VMM, increment the value stored in the first counter; and set a flag to indicate successful execution of the second tracking instruction.

1. A processing system, comprising: an execution unit, communicatively coupled to an architecturally-protected memory, the execution unit comprising a logic circuit to execute a virtual machine monitor (VMM) that supports a virtual machine (VM) comprising a guest operating system (OS) and to implement an architecturally-protected execution environment, wherein the logic circuit is to: responsive to executing a blocking instruction by the guest OS directed at a first page stored in the architecturally-protected memory during a first time period identified by a value stored in a first counter, copy the value from the first counter to a second counter; responsive to executing a first tracking instruction issued by the VMM, increment the value stored in the first counter; and responsive to receiving a request to execute a second tracking instruction issued by the guest OS directed to a second page stored in the architecturally-protected memory and responsive to determining that the value stored in the first counter is greater than the value stored in the second counter, set a flag to indicate successful execution of the second tracking instruction. ...
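The counter protocol in this claim is a small epoch scheme: blocking a page snapshots the current epoch, the VMM's tracking instruction advances it, and a later tracking check succeeds only once the epoch has moved past the snapshot. A toy Python model; the class and method names are illustrative, not the architectural instruction names:

```python
# Toy model of the two-counter tracking protocol: the first counter is the
# current tracking epoch, the second holds the epoch snapshot taken when a
# protected page was blocked.

class EpochTracker:
    def __init__(self):
        self.first_counter = 0       # current tracking epoch
        self.second_counter = 0      # snapshot taken at block time

    def guest_block_page(self):
        self.second_counter = self.first_counter   # copy on blocking

    def vmm_track(self):
        self.first_counter += 1                    # VMM advances the epoch

    def guest_track_check(self):
        """Success flag for the guest's own tracking instruction: true only
        once the epoch has advanced past the blocking-time snapshot."""
        return self.first_counter > self.second_counter

t = EpochTracker()
t.guest_block_page()
before = t.guest_track_check()   # epoch not advanced yet: would conflict
t.vmm_track()
after = t.guest_track_check()    # VMM tracked in between: flag can be set
```

The comparison is what identifies a tracking conflict: if the guest checks before the VMM has advanced the epoch, the flag stays clear and the conflicting operation is avoided.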

Publication date: 30-03-2017

DYNAMIC RELEASING OF CACHE LINES

Number: US20170090977A1
Assignee:

A computer-implemented method includes, in a transactional memory environment, identifying a transaction and identifying one or more cache lines. The cache lines are allocated to the transaction. A cache line record is stored. The cache line record includes a reference to the one or more cache lines. An indication is received. The indication denotes a request to demote the one or more cache lines. The cache line record is retrieved, and the one or more cache lines are released. A corresponding computer program product and computer system are also disclosed.

1. A computer-implemented method comprising, in a transactional memory environment: identifying a transaction; identifying one or more cache lines, said one or more cache lines being allocated to said transaction; storing a cache line record, said cache line record comprising a reference to said one or more cache lines; receiving an indication, said indication denoting a request to demote said one or more cache lines; retrieving said cache line record; and releasing said one or more cache lines.
2. The computer-implemented method of claim 1, wherein said indication is provided by at least one element selected from the group consisting of: one or more machine-level instructions to a computer hardware component; one or more values in one or more computer control registers; and detection of one or more conflict conditions by a contention management policy.
3. The computer-implemented method of claim 1, wherein: storing said cache line record comprises storing a reference to load cache lines in a level one cache; and retrieving said cache line record comprises accessing said level one cache.
4. The computer-implemented method of claim 1, wherein: storing said cache line record comprises storing a reference to store cache lines in a store buffer; and retrieving said cache line record comprises accessing said store buffer.
5. The computer-implemented method of claim 1, wherein: storing said cache line record comprises storing a ...

30-03-2017 publication date

PROVIDING MEMORY MANAGEMENT FUNCTIONALITY USING AGGREGATED MEMORY MANAGEMENT UNITS (MMUs)

Number: US20170091116A1
Assignee:

Providing memory management functionality using aggregated memory management units (MMUs), and related apparatuses and methods are disclosed. In one aspect, an aggregated MMU is provided, comprising a plurality of input data paths, each including a plurality of input transaction buffers, and a plurality of output paths, each including a plurality of output transaction buffers. Some aspects of the aggregated MMU additionally provide one or more translation caches and/or one or more hardware page table walkers. The aggregated MMU further includes an MMU management circuit configured to retrieve a memory address translation request (MATR) from an input transaction buffer, perform a memory address translation operation based on the MATR to generate a translated memory address field (TMAF), and provide the TMAF to an output transaction buffer. The aggregated MMU also provides a plurality of output data paths, each configured to output transactions with resulting memory address translations. 1. An aggregated memory management unit (MMU), comprising: a plurality of input data ports each configured to convey pre-translation transactions to a plurality of input data paths configured to receive a plurality of memory address pre-translation read transactions and a plurality of memory address pre-translation write transactions, the plurality of input data paths comprising a corresponding plurality of input transaction buffers each comprising a plurality of input transaction buffer slots configured to store a respective pre-translation transaction among the plurality of pre-translation transactions; and a plurality of output data paths comprising a corresponding plurality of output transaction buffers each comprising a plurality of output transaction buffer slots configured to store a respective post-translation transaction of a plurality of post-translation transactions; retrieve a memory address translation request (MATR) of a pre-translation transaction from an input ...

19-03-2020 publication date

PREFETCH KILL AND REVIVAL IN AN INSTRUCTION CACHE

Number: US20200089622A1
Assignee:

A system comprises a processor including a CPU core, first and second memory caches, and a memory controller subsystem. The memory controller subsystem speculatively determines a hit or miss condition of a virtual address in the first memory cache and speculatively translates the virtual address to a physical address. Associated with the hit or miss condition and the physical address, the memory controller subsystem configures a status to a valid state. Responsive to receipt of a first indication from the CPU core that no program instructions associated with the virtual address are needed, the memory controller subsystem reconfigures the status to an invalid state and, responsive to receipt of a second indication from the CPU core that a program instruction associated with the virtual address is needed, the memory controller subsystem reconfigures the status back to a valid state. 1. A data processing apparatus comprising:a memory; and receive a first address and a pre-fetch count value;', 'compute a second address based on the first address;', 'determine a hit/miss condition of the second address;', 'set a status of the second address to a valid state;', 'after setting the status of the second address to a valid state, determine whether the pre-fetch count value is zero; and', 'change the status of the second address to an invalid state when the pre-fetch count value is zero., 'a memory controller coupled to the memory and configured to2. The data processing apparatus of claim 1 , wherein the memory controller includes a register and setting the status of the second address to valid includes storing a first value that corresponds to a valid state to a first field of the register.3. The data processing apparatus of claim 2 , wherein the first field is a single bit of the register.4. 
The data processing apparatus of claim 2 , wherein the memory controller is configured to store the hit/miss condition of the second address into a second field of the register and store ...
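The claimed sequence (compute the second address, mark its status valid, then invalidate it when the pre-fetch count reaches zero) can be modeled as follows; the fixed cache-line stride used to compute the second address is an illustrative assumption.

```python
def prefetch_status(first_addr, prefetch_count, line_size=64):
    """Models the claimed valid/invalid status handling: the speculative
    prefetch of the next address is set valid, then killed (invalidated)
    when the remaining pre-fetch count is zero."""
    second_addr = first_addr + line_size        # compute second address
    status = {"addr": second_addr, "valid": True}
    if prefetch_count == 0:                     # nothing left to prefetch
        status["valid"] = False                 # kill the speculative fetch
    return status
```

In the patent the status lives in a register field and can later be revived when the CPU core signals that the instruction is needed after all.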

12-05-2022 publication date

PROCESS DEDICATED IN-MEMORY TRANSLATION LOOKASIDE BUFFERS (TLBs) (mTLBs) FOR AUGMENTING MEMORY MANAGEMENT UNIT (MMU) TLB FOR TRANSLATING VIRTUAL ADDRESSES (VAs) TO PHYSICAL ADDRESSES (PAs) IN A PROCESSOR-BASED SYSTEM

Number: US20220147463A1
Assignee: Microsoft Technology Licensing LLC

Process dedicated in-memory translation lookaside buffers (TLBs) (mTLBs) for augmenting a memory management unit (MMU) TLB for translating virtual addresses (VAs) to physical addresses (PA) in a processor-based system is disclosed. In disclosed examples, a dedicated in-memory TLB is supported in system memory for each process so that one process's cached page table entries do not displace another process's cached page table entries. When a process is scheduled to execute in a central processing unit (CPU), the in-memory TLB address stored for such process can be used by page table walker circuit in the CPU MMU to access the dedicated in-memory TLB for executing the process to perform VA to PA translations in the event of a TLB miss to the MMU TLB. If a TLB miss occurs to the in-memory TLB, the page table walker circuit can walk the page table in the MMU.
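A minimal sketch of the lookup order described in the abstract: shared MMU TLB first, then the scheduled process's dedicated in-memory TLB (mTLB), then a page table walk. Plain dicts stand in for the hardware structures, and all names are illustrative.

```python
def translate(va, pid, mmu_tlb, mtlbs, page_table, page=4096):
    """MMU TLB miss -> per-process mTLB -> page table walk. Because each
    process has its own mTLB, one process's cached entries never displace
    another's."""
    vpn, offset = divmod(va, page)
    if vpn in mmu_tlb:                     # MMU TLB hit
        return mmu_tlb[vpn] * page + offset
    mtlb = mtlbs.setdefault(pid, {})       # dedicated mTLB for this process
    if vpn in mtlb:                        # mTLB hit: no page table walk
        ppn = mtlb[vpn]
    else:
        ppn = page_table[vpn]              # double miss: walk the page table
        mtlb[vpn] = ppn                    # fill only this process's mTLB
    mmu_tlb[vpn] = ppn
    return ppn * page + offset
```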

28-03-2019 publication date

AREA-EFFICIENT IMPLEMENTATIONS OF GRAPHICS INSTRUCTIONS

Number: US20190096024A1
Assignee: Intel Corporation

Embodiments are generally directed to area-efficient implementations of graphics instructions. An embodiment of an apparatus includes a graphics subsystem, the graphics subsystem including one or more of a first logic for processing of memory read-return data for single-instruction-multiple-data instructions, the first logic to store data for a message in raw data format and delay conversion into shader format until all cache line requests for the message have been received; a second logic for assembly of memory read-return data for media block instructions into shader register format, the logic to provide for storage of valid bytes from a cache fragment in a register; or a third logic to remap scatter or gather instructions to untyped surface instruction types. An embodiment of an apparatus includes a graphics subsystem, the graphics subsystem including a translation lookaside buffer (TLB) and a data port controller to control the TLB, the data port controller including an incoming request pipeline to receive an incoming request with virtual address and generate a response, an incoming response pipeline to receive the response and generate a cache request, and an invalidation flow pipeline. 1. An apparatus comprising: a first logic for processing of memory read-return data for single-instruction-multiple-data instructions, the first logic to store data for a message in raw data format and delay conversion into shader format until all cache line requests for the message have been received;', 'a second logic for assembly of memory read-return data for media block instructions into shader register format, the second logic to provide for storage of valid bytes from a cache fragment in a register; or', 'a third logic to remap scatter or gather instructions to untyped surface instruction types., 'a graphics subsystem, the graphics subsystem including one or more of the following logics for handling of graphical data2. 
The apparatus of claim 1 , wherein the first logic ...

16-04-2015 publication date

Computer Processor Employing Dedicated Hardware Mechanism Controlling The Initialization And Invalidation Of Cache Lines

Number: US20150106566A1
Assignee: Mill Computing Inc

A computer processing system includes execution logic that generates memory requests that are supplied to a hierarchical memory system. The computer processing system includes a hardware map storing a number of entries associated with corresponding cache lines, where each given entry of the hardware map indicates whether a corresponding cache line i) currently stores valid data in the hierarchical memory system, or ii) does not currently store valid data in hierarchical memory system and should be interpreted as being implicitly zero throughout.
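The hardware map described here reduces to a per-line valid flag where a read of an invalid line returns implicit zeros instead of fetching anything from memory. A toy model, with illustrative names and sizes:

```python
class ZeroMapCache:
    """Sketch of the hardware map: one entry per cache line indicating
    whether the line holds valid data or is implicitly all-zero."""

    def __init__(self, num_lines, line_size=64):
        self.valid = [False] * num_lines    # the hardware map itself
        self.lines = [None] * num_lines
        self.line_size = line_size

    def write(self, idx, data):
        self.lines[idx] = data
        self.valid[idx] = True

    def read(self, idx):
        if not self.valid[idx]:
            # No backing data: interpreted as implicitly zero, so no
            # memory traffic is needed to initialize the line.
            return bytes(self.line_size)
        return self.lines[idx]
```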

16-04-2015 publication date

Computer Processor With Deferred Operations

Number: US20150106597A1
Assignee: Mill Computing Inc

A computer processor and corresponding method of operation employs execution logic that includes at least one functional unit and operand storage that stores data that is produced and consumed by the at least one functional unit. The at least one functional unit is configured to execute a deferred operation whose execution produces result data. The execution logic further includes a retire station that is configured to store and retire the result data of the deferred operation in order to store such result data in the operand storage, wherein the retire of such result data occurs at a machine cycle following issue of the deferred operation as controlled by statically-assigned parameter data included in the encoding of the deferred operation.

26-03-2020 publication date

EXTERNAL MEMORY BASED TRANSLATION LOOKASIDE BUFFER

Number: US20200097413A1
Assignee: ATI TECHNOLOGIES ULC

Methods, devices, and systems for virtual address translation. A memory management unit (MMU) receives a request to translate a virtual memory address to a physical memory address and searching a translation lookaside buffer (TLB) for a translation to the physical memory address based on the virtual memory address. If the translation is not found in the TLB, the MMU searches an external memory translation lookaside buffer (EMTLB) for the physical memory address and performs a page table walk, using a page table walker (PTW), to retrieve the translation. If the translation is found in the EMTLB, the MMU aborts the page table walk and returns the physical memory address. If the translation is not found in the TLB and not found in the EMTLB, the MMU returns the physical memory address based on the page table walk. 1. A method for virtual address translation , the method comprising:receiving, by a memory management unit (MMU), a request to translate a virtual memory address to a physical memory address;searching, by the MMU, based on the virtual memory address, a translation lookaside buffer (TLB), for a translation to the physical memory address; fetching the domain identification from the TLB;', 'searching, by the MMU, an external memory translation lookaside buffer (EMTLB) for the translation;', 'performing, by a page table walker (PTW), a page table walk to retrieve the translation from a page table;', 'if the translation is found in the EMTLB, aborting the page table walk and returning the physical memory address; and', 'if the translation is not found in the EMTLB, returning the physical memory address based on the page table walk., 'in response to the translation not being found in the TLB and a domain identification being found in the TLB, wherein the domain identification identifies an address space for a virtual machine2. The method of claim 1 , wherein the EMTLB comprises a region of memory that is external to the MMU.3. 
The method of claim 1 , wherein the ...
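A sequential sketch of the described translation order; real hardware probes the EMTLB and starts the page table walk concurrently, aborting the walk on an EMTLB hit, whereas this model simply checks the EMTLB first. Dicts stand in for the TLB, the external-memory EMTLB, and the page table.

```python
def emtlb_translate(va, tlb, emtlb, page_table, page=4096):
    """TLB -> EMTLB -> page table walk, per the abstract's lookup order."""
    vpn, offset = divmod(va, page)
    if vpn in tlb:                 # on-chip TLB hit
        return tlb[vpn] * page + offset
    if vpn in emtlb:
        ppn = emtlb[vpn]           # EMTLB hit: the page table walk is aborted
    else:
        ppn = page_table[vpn]      # double miss: the walk supplies the translation
        emtlb[vpn] = ppn
    tlb[vpn] = ppn
    return ppn * page + offset
```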

02-06-2022 publication date

SYSTEM AND METHOD FOR MULTIMODAL COMPUTER ADDRESS SPACE PROVISIONING

Number: US20220171700A1
Assignee: UNIFABRIX LTD.

A computer based system and method for managing memory resources in a computing system may include, using a computer processor, receiving, from a computing system, a memory transaction request originating from a process executing on the computing system. Translation of a memory address associated with the request, or provisioning of memory for a translated address, may be determined based on various memory-transactions-related metadata—such as the service level of the process; the service level of other processes; access patterns of memory resources; a prediction of future memory requests of a process, and the like. 1. A method of managing memory resources in a computing system , the method comprising:receiving from a computing system comprising a memory translation entity, a memory transaction request originating from a process executing on the computing system; anddetermining a translation of a memory address associated with the memory transaction request or a provisioning of memory for a translated address, based on one or more of metadata associated with memory transactions; the service level of the process; the service level of other processes executing on the computing system; access patterns of memory systems among a plurality of memory systems; and a prediction of future memory requests of the process.2. The method of claim 1 , wherein the translation or provisioning is transparent to an operating system executing on the computing system.3. The method of claim 1 , wherein the translation or provisioning comprise unifying several address spaces into a single address space.4. The method of claim 3 , wherein the unifying produces a unified single address space comprising an address space spanning a plurality of storage resources.5. The method of claim 1 , wherein the translation or provisioning is to two or more address spaces.6. 
The method of claim 1 , wherein the translation or provisioning are performed based on a machine learning model trained using one or ...

03-07-2014 publication date

APPARATUS AND METHOD FOR A MULTIPLE PAGE SIZE TRANSLATION LOOKASIDE BUFFER (TLB)

Number: US20140189192A1
Assignee:

An apparatus and method for implementing a multiple page size translation lookaside buffer (TLB). For example, a method according to one embodiment comprises: reading a first group of bits and a second group of bits from a linear address; determining whether the linear address is associated with a large page size or a small page size; identifying a first cache set using the first group of bits if the linear address is associated with a first page size and identifying a second cache set using the second group of bits if the linear address is associated with a second page size; and identifying a first cache way if the linear address is associated with a first page size and identifying a second cache way if the linear address is associated with a second page size. 1. A method comprising:reading a first group of bits and a second group of bits from a linear address;determining whether the linear address is associated with a large page size or a small page size;identifying a first cache set using the first group of bits if the linear address is associated with a first page size and identifying a second cache set using the second group of bits if the linear address is associated with a second page size; andidentifying a first cache way if the linear address is associated with a first page size and identifying a second cache way if the linear address is associated with a second page size.2. The method as in wherein the set and way identify an entry within a translation lookaside buffer (TLB).3. The method as in wherein determining comprises identifying an entry in the TLB using the first or second groups of bits and reading a bit from the TLB entry indicating whether the linear address is associated with a large page or a small page.4. The method as in further comprising:determining that a TLB miss has occurred when no TLB entry is identified; andreading a physical address translation for the linear address from a page table in memory.5. 
The method as in further comprising ...
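The set-selection step of the claimed method can be sketched as follows; the 4 KiB/2 MiB page sizes, shift amounts, and 64-set geometry are illustrative assumptions, not taken from the patent.

```python
def tlb_set_index(linear_addr, large_page,
                  small_shift=12, large_shift=21, num_sets=64):
    """Reads two candidate groups of index bits from the linear address
    and picks one by page size, as in the claimed multiple-page-size TLB."""
    first_group = (linear_addr >> small_shift) % num_sets    # small-page index bits
    second_group = (linear_addr >> large_shift) % num_sets   # large-page index bits
    return second_group if large_page else first_group
```

Way selection follows the same pattern: the page-size bit stored in the matching entry decides which group of tag bits to compare.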

03-07-2014 publication date

IMAGE FORMING APPARATUS AND METHOD OF TRANSLATING VIRTUAL MEMORY ADDRESS INTO PHYSICAL MEMORY ADDRESS

Number: US20140189193A1
Author: CHO Byoung-tae
Assignee: Samsung Electronics, Co., Ltd.

An image forming apparatus includes a function unit to perform functions of the image forming apparatus, and a control unit to control the function unit to perform the functions of the image forming apparatus. The control unit includes a processor core to operate in a virtual memory address, a main memory to operate in a physical memory address and store data used in the functions of the image forming apparatus, and a plurality of input/output (I/O) logics to operate in the virtual memory address and control at least one of the functions performed by the image forming apparatus. Each of the plurality of I/O logics translates the virtual memory address into the physical memory address corresponding to the virtual memory address and accesses the main memory. 1. An image forming apparatus , comprising:a function unit to perform functions of the image forming apparatus; anda control unit to control the function unit to perform the functions of the image forming apparatus, a process core to operate in a virtual memory address;', 'a main memory to operate in a physical memory address and store data used in the functions of the image forming apparatus; and', 'a plurality of input/output (I/O) logics to operate in the virtual memory address and control at least one of the functions performed by the image forming apparatus,', 'wherein each of the plurality of I/O logics translates the virtual memory address into the physical memory address corresponding to the virtual memory address to access the main memory., 'wherein the control unit includes2. The image forming apparatus of claim 1 , wherein each of the plurality of I/O logics includes:a function core to control at least one of the functions performed by the image forming apparatus;an address translation unit to translate a virtual memory address required by the function core into a physical memory address using a translation look-aside buffer (TLB); anda direct memory access (DMA) to perform access using a translated ...

20-04-2017 publication date

APPARATUS AND METHOD FOR ACCELERATING OPERATIONS IN A PROCESSOR WHICH USES SHARED VIRTUAL MEMORY

Number: US20170109294A1
Assignee:

An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory. 1. A system comprising: an accelerator functional unit, and', 'context save/restore circuitry to save and restore a context of the accelerator', 'functional unit, and, 'an accelerator to perform data operations associated with one or more tasks, the accelerator comprising a translation lookaside buffer (TLB) to store virtual-to-physical address mappings; and', 'page walker circuitry to provide page walk services to the accelerator to', 'determine virtual-to-physical address mappings., 'front end hardware logic coupled to the accelerator, the front end hardware logic to receive and schedule tasks for execution on the accelerator, the front end hardware logic comprising2. The system as in wherein the front end hardware logic further comprises:address translation circuitry to perform a virtual-to-physical address translation for one or more of the accelerator execution circuits by querying the TLB containing the mapping of virtual-to-physical addresses.3. The system as in wherein the address translation circuitry is to cause the page walker circuitry to access a page table from a memory hierarchy of one or more processor cores if the query to the TLB fails to locate a translation for a particular virtual address claim 2 , the memory hierarchy comprising a system memory and a plurality of cache levels.4. 
The system ...

30-04-2015 publication date

INPUT/OUTPUT MEMORY MAP UNIT AND NORTHBRIDGE

Number: US20150120978A1
Assignee:

The present invention provides for page table access and dirty bit management in hardware via a new atomic Test[0]-and-OR-and-Mask operation. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling, which can deadlock the unified northbridge. These solutions include a dedicated writeback virtual channel, probes for IO requests using a 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests. 1. A method for handling dirty bits, the method comprising: modifying a memory address, if bit[0] = test bit, the modification being based on operation operands Op1 and Op2 operating on the memory address. 2. The method of claim 1, wherein the method is performed according to Mem[Addr] = Mem[Addr][0] ? ((Mem[Addr] | Op1) & Op2) : Mem[Addr]. 3. The method of claim 1, wherein AP[2]=1 if the page is clean. 4. The method of claim 1, wherein AP[2]=0 if the page is dirty. 5. The method of wherein the modifying is performed on a single translation table entry in a single Test[0]SetandClr atomic operation. 6. The method of wherein the modifying occurs in one atomic update. 7. The method of wherein the modifying is managed in hardware. 8. The method of claim 1, wherein if a translation table entry with the access flag set to 0 is read into the translation lookaside buffer (TLB), the hardware writes 1 to the access flag bit of the translation table entry in memory. 9. The method of wherein the hardware Access flag field is used to indicate the Secure and Non-secure PL1&0 stage 1 translations. 10. The method of claim 9, wherein the access flag field is ID_MMFR2[31:28]. 11. The method of claim 1, wherein when hardware management of the ...

07-05-2015 publication date

BOUNDED CACHE SEARCHES

Number: US20150127911A1
Author: Steiss Donald Edward
Assignee: CISCO TECHNOLOGY, INC.

Cache lines of a data cache may be assigned to a specific page type or color. In addition, the computing system may monitor when a cache line assigned to the specific page color is allocated in the cache. As each cache line assigned to a particular page color is allocated, the computing system may compare a respective index associated with each of the cache lines to determine maximum and minimum indices for that page color. These indices define a block of the cache that stores the data assigned to the page color. Thus, when the data of a page color is evicted from the cache, instead of searching the entire cache to locate the cache lines, the computing system uses the maximum and minimum indices as upper and lower bounds to reduce the portion of the cache that is searched. 1. A memory system , comprising:a data cache comprising a plurality of cache lines, wherein each of the plurality of cache lines is associated with a respective tag that assigns the plurality of cache lines to a same page type;a first register configured to store a minimum cache index associated with the plurality of cache lines;a second register configured to store a maximum cache index associated with the plurality of cache lines; and perform a plurality of cache allocation operations to store data to the plurality of cache lines,', 'determine the minimum cache index stored in the first register and the maximum cache index stored in the second register based upon index values associated with the plurality of cache allocation operations, and', 'upon receiving a prompt to remove the plurality of cache lines from the data cache, search a subset of the data cache for the plurality of cache lines assigned to the same page type, wherein an upper boundary of the subset is determined by the maximum cache index and a lower boundary of the subset is determined by the minimum cache index., 'logic circuitry configured to2. 
The memory system of claim 1 , further comprising a network device comprising a first ...
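The min/max bookkeeping of the two registers, and the bounded search range they define at eviction time, can be sketched as below; a dict stands in for the per-color register pairs, and the names are illustrative.

```python
class ColorBounds:
    """Tracks, per page color, the minimum and maximum cache index seen at
    allocation time, so eviction scans only that bounded slice of the cache."""

    def __init__(self):
        self.bounds = {}  # color -> [min_index, max_index]

    def on_allocate(self, color, index):
        lo_hi = self.bounds.setdefault(color, [index, index])
        lo_hi[0] = min(lo_hi[0], index)   # first register: minimum cache index
        lo_hi[1] = max(lo_hi[1], index)   # second register: maximum cache index

    def search_range(self, color, cache_size):
        # Fall back to the whole cache if the color was never allocated.
        lo, hi = self.bounds.get(color, (0, cache_size - 1))
        return range(lo, hi + 1)
```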

24-07-2014 publication date

System, method, and computer program product for graphics processing unit (GPU) demand paging

Number: US20140204098A1
Assignee: Nvidia Corp

A system, method, and computer program product are provided for GPU demand paging. In operation, input data is addressed in terms of a virtual address space. Additionally, the input data is organized into one or more pages of data. Further, the input data organized as the one or more pages of data is at least temporarily stored in a physical cache. In addition, access to the input data in the physical cache is facilitated.

03-05-2018 publication date

MEMORY ACCESS SYSTEM AND METHOD

Number: US20180121126A1
Assignee:

A memory access system includes a memory, a controller, and a redundancy elimination unit. The memory is a multi-way set associative memory, and the redundancy elimination unit records M record items. Each record item is used to store a tag of a stored data block in one of the storage sets. The controller determines a read data block and a target storage set of the read data block and sends a query message to the redundancy elimination unit. The query message carries a set identifier of the target storage set of the read data block and a tag of the read data block. The redundancy elimination unit determines a record item corresponding to the set identifier of the target storage set and matches the tag of the read data block with a tag of a stored data block in the record item corresponding to the target storage set of the read data block. 1. A memory access system, comprising: a memory, a controller, and a redundancy elimination unit, wherein the memory comprises M×N storage blocks, wherein M rows of the storage blocks form M storage sets, N columns of the storage blocks form N storage ways, each storage set is provided with a set identifier, and at least one of M or N is a positive integer greater than or equal to 2; the redundancy elimination unit is configured to record M record items, wherein each record item corresponds to one of the M storage sets, and each record item is used to store a tag of a stored data block in one of the M storage sets; the controller is configured to receive a data read request, determine a read data block and a target storage set of the read data block, and send a query message to the redundancy elimination unit, wherein the query message carries a set identifier of the target storage set of the read data block and a tag of the read data block; and determine, according to the set identifier of the target storage set of the read data block, a record item corresponding to the set identifier of the target storage set; match the ...
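The query path of the redundancy elimination unit reduces to a two-level lookup: record item by set identifier, then stored tag within that record item. An illustrative sketch, with dicts standing in for the record items:

```python
def lookup_way(record_items, set_id, tag):
    """Sketch of the query flow: one record item per storage set maps
    stored tags to the way holding them. A match means the requested
    data block is already stored (redundant)."""
    record = record_items.get(set_id, {})   # record item for the target set
    return record.get(tag)                  # way number, or None on a miss
```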

03-05-2018 publication date

IDENTIFYING STALE ENTRIES IN ADDRESS TRANSLATION CACHE

Number: US20180121365A1
Assignee:

A mapping may be changed in a table stored in memory. The table may map a first set of addresses, for a set of data, to a second set of addresses. The changing of the mapping may including mapping the first set of addresses to a third set of addresses. In response to the changing of the mapping, one or more flush operations may be executed to invalidate one or more entries within one or more address translation caches. The one or more entries may include the second set of addresses. In response to the executing of the one or more flush operations, a first test case may be run. The first test case may be to test whether any of the first set of addresses are mapping to the second set of addresses. 1. A computer-implemented method comprising:mapping a first set of addresses to a second set of addresses in a memory;changing the mapping by mapping the first set of addresses to a third set of addresses;executing, in response to the changing of the mapping, one or more operations to invalidate one or more entries within one or more address translation caches, the one or more entries including at least one of the second set of addresses; andrunning a first test case, the first test case tests whether any of the first set of addresses are mapping to the second set of addresses.2. The method of claim 1 , further comprising running claim 1 , prior to the changing of the mapping claim 1 , a second test case claim 1 , the second test case to determine which particular set of addresses the first set of addresses are mapping to claim 1 , the running of the second test case including determining that the first set of addresses are mapped to the second set of addresses.3. 
The method of claim 2 , further comprising:determining, subsequent to the executing the one or more operations and running the first test case, that at least one of the first set of addresses are mapping to at least one of the second set of addresses; anddetermining, based on comparing the first test case with the ...
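A sketch of the described test setup: remap an address in the in-memory table, flush the stale cached entry, and verify that a later lookup returns only the new target. Dicts stand in for the table and the address translation cache; all names are illustrative.

```python
def remap_and_flush(page_table, translation_cache, addr, new_target):
    """Change the mapping (first set -> third set of addresses), then
    execute the flush so the cache cannot return the stale second set."""
    old_target = page_table.get(addr)
    page_table[addr] = new_target         # remap in the in-memory table
    translation_cache.pop(addr, None)     # flush invalidates the stale entry
    return old_target

def lookup(page_table, translation_cache, addr):
    """Test-case style lookup: fills the translation cache on a miss."""
    if addr not in translation_cache:
        translation_cache[addr] = page_table[addr]
    return translation_cache[addr]
```

A lookup that still returned the old target after the flush would indicate a stale entry, which is exactly what the patent's first test case checks for.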

04-05-2017 publication date

MECHANISM FOR CREATING FRIENDLY TRANSACTIONS WITH CREDENTIALS

Number: US20170123843A1
Assignee:

A transactional memory execution environment receives a first request from a first transaction to access a cache line. A first request is received from a first transaction to access a cache line. The cache line is determined to be used by a second transaction. The first transaction and the second transaction opt-in to a transaction potential conflict check. The transaction potential conflict check determines if the first transaction and the second transaction are in a conflicting coherent state. The conflicting coherent state occurs when the first transaction is modifying the cache line used by the second transaction. The first transaction is allowed access to the cache line without aborting the second transaction in response to a determination that the first transaction and the second transaction are compatible from the transaction potential conflict check. 1. A computer-implemented method for granting access to a cache line in a transactional memory execution environment , the method comprising:receiving a first request from a first transaction to access a cache line;determining, in response to receiving the first request, that the cache line is used by a second transaction;determining if the first transaction and the second transaction opt-in to a transaction potential conflict check;performing, based on an opt-in of the first transaction and the second transaction for the transaction potential conflict check, the transaction potential conflict check between first transaction and the second transaction; andallowing, in response to performing the transaction potential conflict check, access of the cache line for the first transaction without aborting the second transaction.2. The method of claim 1 , wherein performing the transaction potential conflict check further comprises:obtaining a first token type for the first transaction and a second token type for the second transaction based on the opt-in of the first transaction and the second transaction for the ...

04-05-2017 publication date

Backward compatibility testing of software in a mode that disrupts timing

Number: US20170123961A1

A device may be run in a timing testing mode in which the device is configured to disrupt timing of processing that takes place on the one or more processors while running an application with the one or more processors. The application may be tested for errors while the device is running in the timing testing mode.

Подробнее
25-04-2019 дата публикации

SYSTEM AND METHOD FOR IDENTIFYING HOT DATA AND STREAM IN A SOLID-STATE DRIVE

Number: US20190121742A1
Assignee:

A method for providing a Bloom filter for a multi-stream enabled solid-state drive (SSD) is disclosed. The Bloom filter includes two Bloom filter arrays, a counter corresponding to the two Bloom filter arrays, and a masking logic. The method includes: inserting an element in one or more of the two Bloom filter arrays using a plurality of hash functions; and updating the counter based on the insertion of the element. The method further includes: updating the Bloom filter by inserting one or more additional elements in one or more of the two Bloom filter arrays and updating the counter; and masking a data stored in the Bloom filter with the one or more additional elements to pseudo delete the data using the masking logic and reduce a false positive rate of the Bloom filter. 1. A method comprising:providing a Bloom filter including two Bloom filter arrays, a counter corresponding to the two Bloom filter arrays, and a masking logic;inserting an element in one or more of the two Bloom filter arrays using a plurality of hash functions;updating the counter based on the insertion of the element;updating the Bloom filter by inserting one or more additional elements in one or more of the two Bloom filter arrays and updating the counter; andmasking a data stored in the Bloom filter with the one or more additional elements to pseudo delete the data using the masking logic and reduce a false positive rate of the Bloom filter.2. The method of claim 1 , further comprising:performing an OR operation between the two Bloom filter arrays to generate a third Bloom filter array; andsaving the third Bloom filter array and a counter value of the counter in a history data structure.3. The method of claim 2 , wherein the third Bloom filter array and the counter value are saved in the history data structure periodically.4. 
The method of claim 2 , wherein the third Bloom filter array and the counter value are saved in the history data structure prior to switching a mode of operation of the ...
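The two-array Bloom filter above can be sketched in Python. This is an illustrative model under assumptions: the class and method names, the use of SHA-256 for the hash family, and the interpretation of "masking" as veto bits that pseudo-delete an element are not prescribed by the patent.

```python
import hashlib

# Illustrative sketch of the two-array Bloom filter with a counter and a
# masking step for pseudo-deletion. Structure is an assumption.
class TwoArrayBloom:
    def __init__(self, m=256, k=3):
        self.m, self.k = m, k
        self.arrays = [[0] * m, [0] * m]
        self.mask = [0] * m        # set bits veto lookups (pseudo delete)
        self.counter = 0           # updated on every insertion

    def _indices(self, key):
        return [int(hashlib.sha256(f"{key}:{i}".encode()).hexdigest(), 16) % self.m
                for i in range(self.k)]

    def insert(self, key, which=0):
        for i in self._indices(key):
            self.arrays[which][i] = 1
        self.counter += 1

    def pseudo_delete(self, key):
        for i in self._indices(key):
            self.mask[i] = 1       # mask the stored data with this element

    def query(self, key):
        return all((self.arrays[0][i] or self.arrays[1][i]) and not self.mask[i]
                   for i in self._indices(key))

    def snapshot(self):
        # OR the two arrays into a third one, as in dependent claim 2
        return [a | b for a, b in zip(*self.arrays)], self.counter

bf = TwoArrayBloom()
bf.insert("lba:42")
print(bf.query("lba:42"))   # True
bf.pseudo_delete("lba:42")
print(bf.query("lba:42"))   # False: masked out
```

`snapshot()` mirrors the claimed periodic save of the OR-merged array plus counter value into a history data structure.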

16-04-2020 publication date

Configuration Cache For The ARM SMMUv3

Number: US20200117613A1
Authors: Ma Albert, Salvi Manan
Assignee:

A method of translating a virtual address into a physical memory address in an ARM SMMUv3 system may comprise searching a Configuration Cache memory for a matching tag that matches the associated tag upon receiving the virtual address and an associated tag, and extracting, in a single memory lookup cycle, a matching data field associated with the matching tag when the matching tag is found in the Configuration Cache memory. The matching data field of the Configuration Cache may comprise a matching Stream Table Entry (STE) and a matching Context Descriptor (CD), both associated with the matching tag. The Configuration Cache may be configured as a content-addressable memory. The method may further comprise storing entries associated with a multiple memory lookup cycle virtual address-to-physical address translation into the Configuration Cache memory, each of the entries comprising a tag, an associated STE and an associated CD. 1. A method of translating a virtual address into a physical memory address in an ARM SMMUv3 memory management system , comprising:upon receiving the virtual address and an associated tag, searching a Configuration Cache memory for a matching tag that matches the associated tag;extracting, in a single memory lookup cycle, a matching data field associated with the matching tag when the matching tag is found in the Configuration Cache memory, the matching data field of the Configuration Cache comprising a matching Stream Table Entry (STE) and a matching Context Descriptor (CD), both associated with the matching tag.2. The method of claim 1 , further comprising organizing the Configuration Cache as a content-addressable memory (CAM).3. The method of claim 1 , further comprising storing one or more entries associated with a multiple memory lookup cycle virtual address-to-physical address translation into the Configuration Cache memory claim 1 , each of the one or more entries comprising a tag claim 1 , an associated STE and an associated CD.4. 
The ...
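The single-lookup idea above reduces to a content-addressable store whose data field carries both configuration structures at once. A minimal software model, with illustrative names (`ConfigCache`, `fill`, `lookup`) and dictionary-as-CAM as assumptions:

```python
# Minimal model of a configuration cache keyed by a tag (e.g. a StreamID)
# whose data field holds both the Stream Table Entry (STE) and the Context
# Descriptor (CD), so one lookup replaces two table walks.
class ConfigCache:
    def __init__(self):
        self.cam = {}                     # models content-addressable storage

    def fill(self, tag, ste, cd):
        self.cam[tag] = (ste, cd)         # one entry carries STE and CD

    def lookup(self, tag):
        return self.cam.get(tag)          # single-cycle hit: (STE, CD) or None

cache = ConfigCache()
cache.fill(tag=0x12, ste={"s2_ttb": 0x8000}, cd={"ttbr0": 0x4000})
print(cache.lookup(0x12))   # both structures from one lookup
print(cache.lookup(0x13))   # None -> fall back to the multi-cycle walk
```

A miss falls back to the normal multi-lookup translation, after which the resulting (tag, STE, CD) entry would be filled into the cache.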

10-05-2018 publication date

PROGRAMMABLE MEMORY TRANSFER REQUEST PROCESSING UNITS

Number: US20180129620A1
Assignee:

An apparatus comprising a programmable memory transfer request processing (PMTRP) unit and a programmable direct memory access (PDMA) unit. The PMTRP unit comprises at least one programmable region descriptor. The PDMA unit comprises at least one programmable memory-to-memory transfer control descriptor. The PDMA unit is adapted to send a memory transfer request to the PMTRP unit. The PMTRP unit is adapted to receive and successfully process a memory transfer request issued by the PDMA unit that is addressed to a memory location that is associated with a portion of at least one of the at least one region descriptor of the PMTRP unit. 1. An apparatus comprising: a first port, which is a target port, adapted to receive a memory transfer request associated with a first address space and send a corresponding memory transfer response; programmable configuration data, in which the programmable configuration data comprises at least one region descriptor that encodes at least one policy that is associated with a region of the first address space, in which, for each of the at least one region descriptors, the type of that region descriptor is selected from one of the 7 following types: a page descriptor with a fixed length page; a page descriptor with a variable length page; a segment descriptor; a translation look aside buffer descriptor; a range descriptor; a range descriptor that has been adapted with a programmable memory address translation policy; a cache tag descriptor; a second port, which is a master port, adapted to send a memory transfer request associated with a second address space and receive a corresponding memory transfer response; means to process a memory transfer request associated with the first address space received on the first port in accordance with the at least one policy associated with the first address space that are encoded in the programmable configuration ...
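The core of the PMTRP behavior above is region matching: a request succeeds only when its target address is covered by a programmed region descriptor. A hedged sketch; the `Pmtrp` class, its `(base, length, policy)` descriptor fields, and the fault result are illustrative assumptions, not the patent's encoding:

```python
# Sketch of the PMTRP check: a memory transfer request succeeds only if its
# target address falls inside a region covered by one of the programmable
# region descriptors. Descriptor fields are illustrative.
class Pmtrp:
    def __init__(self, regions):
        self.regions = regions            # list of (base, length, policy)

    def handle(self, addr):
        for base, length, policy in self.regions:
            if base <= addr < base + length:
                return ("ok", policy)     # process per the region's policy
        return ("fault", None)            # no descriptor covers the address

unit = Pmtrp([(0x1000, 0x100, "translate"), (0x8000, 0x400, "passthrough")])
print(unit.handle(0x1040))   # ('ok', 'translate')
print(unit.handle(0x5000))   # ('fault', None)
```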

23-04-2020 publication date

OPERATION OF A MULTI-SLICE PROCESSOR IMPLEMENTING A UNIFIED PAGE WALK CACHE

Number: US20200125496A1
Assignee:

Operation of a multi-slice processor that includes a plurality of execution slices, a plurality of load/store slices, and one or more page walk caches, where operation includes: receiving, at a load/store slice, an instruction to be issued; determining, at the load/store slice, a process type indicating a source of the instruction to be a host process or a guest process; and determining, in accordance with an allocation policy and in dependence upon the process type, an allocation of an entry of the page walk cache, wherein the page walk cache comprises one or more entries for both host processes and guest processes. 1. A method of operation of a multi-slice processor , the multi-slice processor including a page walk cache , a plurality of execution slices , and a plurality of load/store slices , the method comprising:receiving, at a load/store slice, an instruction to be issued;determining, at the load/store slice, a process type indicating a source of the instruction to be a host process or a guest process; anddetermining, in accordance with an allocation policy and in dependence upon the process type, an allocation of an entry of the page walk cache, wherein the page walk cache comprises one or more entries for both host processes and guest processes, wherein the allocation policy specifies a first portion of the page walk cache to be dedicated to one or more host processes, and wherein the allocation policy specifies a second portion of the page walk cache to be dedicated to one or more guest processes.2. The method of claim 1 , further comprising:storing, within the entry of the page walk cache, a flag indicating the process type, address bits, and a process identification.3. 
The method of claim 2 , further comprising:receiving, at the load/store slice, a second instruction to be issued, wherein the second instruction comprises an effective address field;indexing the page walk cache according to the effective address field of the second instruction;determining, ...
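The host/guest allocation policy above partitions one page walk cache by process type. A small model under assumptions: the cache size, the 50/50 split, and the "first free slot, else slot 0 of the partition" victim rule are illustrative, not from the patent.

```python
# Sketch of the allocation policy: a fixed first portion of the page walk
# cache is dedicated to host-originated walks, the remainder to guest
# walks. Sizes and the replacement rule are illustrative assumptions.
class PageWalkCache:
    def __init__(self, size=8, host_slots=4):
        self.entries = [None] * size
        self.ranges = {"host": range(0, host_slots),
                       "guest": range(host_slots, size)}

    def allocate(self, process_type, addr_bits, pid):
        slots = self.ranges[process_type]
        victim = next((i for i in slots if self.entries[i] is None), slots[0])
        # each entry records the process-type flag, address bits and PID
        self.entries[victim] = (process_type, addr_bits, pid)
        return victim

pwc = PageWalkCache()
print(pwc.allocate("host", 0xAB, pid=1))    # lands in slots 0..3
print(pwc.allocate("guest", 0xCD, pid=7))   # lands in slots 4..7
```

Storing the process-type flag alongside the address bits and PID mirrors the claimed entry format, so a later lookup can disambiguate host and guest translations that share address bits.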

02-05-2019 publication date

SPECULATIVE SIDE CHANNEL ATTACK MITIGATION USING UNCACHEABLE MEMORY

Number: US20190130102A1
Assignee:

Speculative side channels exist when memory is accessed by speculatively-executed processor instructions. Embodiments use uncacheable memory mappings to close speculative side channels that could allow an unprivileged execution context to access a privileged execution context's memory. Based on allocation of memory location(s) to the unprivileged execution context, embodiments map these memory location(s) as uncacheable within first page table(s) corresponding to the privileged execution context, but map those same memory locations as cacheable within second page table(s) corresponding to the unprivileged execution context. This prevents a processor from carrying out speculative execution of instruction(s) from the privileged execution context that access any of this memory allocated to the unprivileged execution context, due to the unprivileged execution context's memory being mapped as uncacheable for the privileged execution context. Performance for the unprivileged execution context is substantially unaffected, however, since this memory is mapped as cacheable for the unprivileged execution context. 1. A method , implemented at a computer system that includes one or more processors , for managing uncacheable memory mappings for an unprivileged execution context , the method comprising: mapping the memory locations corresponding to the unprivileged execution context as uncacheable within one or more first page tables corresponding to a privileged execution context; and', 'mapping the memory locations corresponding to the unprivileged execution context as cacheable within one or more second page tables corresponding to the unprivileged execution context., 'based at least on allocating memory locations to an unprivileged execution context2. 
The method as recited in claim 1 , wherein claim 1 , based at least on the memory locations allocated to the unprivileged execution context being mapped as uncacheable within the one or more first page tables claim 1 , the one ...
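The dual-mapping idea above can be shown with a toy model: the same page carries different cacheability attributes in the two contexts' page tables, and speculative loads are modeled as possible only through cacheable mappings. Everything here (`UC`/`WB` labels, function names) is an illustrative assumption.

```python
# Toy model of the mitigation: the same physical page is mapped uncacheable
# (UC) in the privileged context's page tables but cacheable (WB) in the
# unprivileged context's own tables.
page_tables = {"privileged": {}, "unprivileged": {}}

def allocate_to_unprivileged(page):
    page_tables["privileged"][page] = "UC"    # kernel view: uncacheable
    page_tables["unprivileged"][page] = "WB"  # owner view: full speed

def can_speculatively_load(context, page):
    # model: processors refuse to speculate through uncacheable mappings
    return page_tables[context].get(page) == "WB"

allocate_to_unprivileged(0x7000)
print(can_speculatively_load("privileged", 0x7000))    # False: channel closed
print(can_speculatively_load("unprivileged", 0x7000))  # True: no perf penalty
```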

19-05-2016 publication date

TRANSLATION LOOKASIDE BUFFER INVALIDATION SUPPRESSION

Number: US20160140051A1
Assignee:

Managing a plurality of translation lookaside buffers (TLBs) includes: issuing, at a first processing element, a first instruction for invalidating one or more TLB entries associated with a first context in a first TLB associated with the first processing element. The issuing includes: determining whether or not a state of an indicator indicates that all TLB entries associated with the first context in a second TLB associated with a second processing element are invalidated; if not: sending a corresponding instruction to the second processing element, causing invalidation of all TLB entries associated with the first context in the second TLB, and changing a state of the indicator; and if so: suppressing sending of any corresponding instructions for causing invalidation of any TLB entries associated with the first context in the second TLB to the second processing element. 1. A method for managing a plurality of translation lookaside buffers , each translation lookaside buffer including a plurality of translation lookaside buffer entries and being associated with a corresponding processing element of a plurality of processing elements , the method comprising: determining, at the first processing element, whether or not a state of an indicator indicates that all translation lookaside buffer entries associated with the first context in a second translation lookaside buffer associated with a second processing element are invalidated;', sending a corresponding instruction to the second processing element of the plurality of processing elements, the corresponding instruction causing invalidation of all translation lookaside buffer entries associated with the first context in the second translation lookaside buffer, and', 'changing a state of the indicator to indicate that all translation lookaside buffer entries associated with the first context in the second translation lookaside buffer are invalidated; and, 'if the state of the indicator indicates that all translation 
...
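The suppression logic above hinges on an indicator per remote TLB recording "already fully invalidated for this context". A sketch under assumptions: the per-PE set of clean contexts, the broadcast counter, and the `on_tlb_fill` hook that re-arms the indicator are illustrative stand-ins for the hardware state.

```python
# Sketch of the suppression: indicator state records that a remote PE's TLB
# holds no entries for a context, so a repeat invalidation broadcast to it
# can be suppressed. Structure is an illustrative assumption.
class TlbShootdown:
    def __init__(self, num_pes):
        self.num_pes = num_pes
        self.all_invalid = [set() for _ in range(num_pes)]  # clean contexts per PE
        self.broadcasts = 0

    def invalidate_context(self, issuing_pe, context):
        for pe in range(self.num_pes):
            if pe == issuing_pe:
                continue
            if context in self.all_invalid[pe]:
                continue                      # suppressed: already clean
            self.broadcasts += 1              # costly cross-PE instruction
            self.all_invalid[pe].add(context)

    def on_tlb_fill(self, pe, context):
        self.all_invalid[pe].discard(context)  # PE cached the context again

s = TlbShootdown(num_pes=2)
s.invalidate_context(0, context=5)   # broadcast needed
s.invalidate_context(0, context=5)   # suppressed by the indicator
print(s.broadcasts)  # 1
```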

21-05-2015 publication date

SYSTEMS AND METHODS FOR REDUCING FIRST LEVEL CACHE ENERGY BY ELIMINATING CACHE ADDRESS TAGS

Number: US20150143046A1
Assignee:

Methods and systems which, for example, reduce energy usage in cache memories are described. Cache location information regarding the location of cachelines which are stored in a tracked portion of a memory hierarchy is stored in a cache location table. Address tags are stored with corresponding location information in the cache location table to associate the address tag with the cacheline and its cache location information. When a cacheline is moved to a new location in the memory hierarchy, the cache location table is updated so that the cache location information indicates where the cacheline is located within the memory hierarchy. 1. A method of tracking the location of a cacheline in a memory hierarchy including one or more levels of cache memory comprising the steps of:storing cache location information about the cacheline in a cache location table;storing an address tag in the cache table to associate the address tag with the cacheline and its cache location information; andupdating the cache location information when the cacheline is moved to a new location in the memory hierarchy,wherein the cache location information indicates where the cacheline is located within the memory hierarchy.2. The method of claim 1 , further comprising:comparing at least some of the stored address tags in the cache location table with a tag portion of an address upon receiving an access request to the memory hierarchy for the cacheline.3. The method of claim 1 , wherein the memory hierarchy includes only one level of cache memory and main memory.4. The method of claim 1 , wherein the memory hierarchy includes at least two levels of cache memory and main memory.5. The method of claim 1 , wherein the at least one level of cache memory does not have address tags stored therein.6. The method of claim 3 , wherein the cache location information is a way indication which indicates which way within a set of the one level of cache memory that the cacheline currently resides.7. The ...
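The cache location table described above replaces per-way tag comparisons with a direct map from address tag to the cacheline's current location. A minimal sketch; the `(level, way)` location tuple and the method names are illustrative assumptions.

```python
# Sketch of a tag-less lookup: a cache location table maps an address tag
# directly to (level, way), so the data arrays need no tag compare of
# their own, saving the energy those compares would burn.
class CacheLocationTable:
    def __init__(self):
        self.clt = {}                     # address tag -> (level, way)

    def track(self, tag, level, way):
        self.clt[tag] = (level, way)

    def move(self, tag, level, way):
        self.clt[tag] = (level, way)      # update on eviction/promotion

    def locate(self, tag):
        return self.clt.get(tag)          # direct read, no per-way probing

clt = CacheLocationTable()
clt.track(tag=0x1F, level=1, way=3)
clt.move(tag=0x1F, level=2, way=0)        # cacheline migrated to L2
print(clt.locate(0x1F))  # (2, 0)
```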

17-05-2018 publication date

DYNAMIC RELEASING OF CACHE LINES

Number: US20180136966A1
Assignee:

A computer-implemented method includes, in a transactional memory environment, identifying a transaction and identifying one or more cache lines. The cache lines are allocated to the transaction. A cache line record is stored. The cache line record includes a reference to the one or more cache lines. An indication is received. The indication denotes a request to demote the one or more cache lines. The cache line record is retrieved, and the one or more cache lines are released. A corresponding computer program product and computer system are also disclosed. 1. A method comprising:starting a first hardware transaction that shares data between a plurality of central processing units (CPUs);committing, by a first CPU of the plurality of CPUs, stores of the first CPU to the first hardware transaction;responsive to committing the stores of the first CPU to the first hardware transaction, acquiring, by the first CPU, ownership of a set of write line(s) associated with the first hardware transaction so that other CPUs of the plurality of CPUs must send a protocol request to the first CPU and receive a response to the protocol request prior to accessing any write line of the set of write line(s);determining that the first hardware transaction has been successfully completed; andresponsive to the determination that the first hardware transaction has been successfully completed, giving up ownership of the set of write line(s) by the first CPU.2. The method of further comprising:subsequent to the giving up of the ownership of the set of write line(s) by the first CPU, accessing a write line, by a second CPU of the plurality of CPUs, without sending a protocol request to the first CPU.3.
A computer program product (CPP) comprising:a computer readable storage medium; starting a first hardware transaction that shares data between a plurality of central processing units (CPUs),', 'committing, by a first CPU of the plurality of CPUs, stores of the first CPU to the first hardware ...

09-05-2019 publication date

LOCALITY DOMAIN-BASED MEMORY POOLS FOR VIRTUALIZED COMPUTING ENVIRONMENT

Number: US20190138436A1
Author: Gschwind Michael K.
Assignee:

Processing within a non-uniform memory access (NUMA) computing environment is facilitated by obtaining memory for a memory heap for an application of a virtualized environment of the NUMA computing environment, and assigning portions of memory of the obtained memory to locality domain-based freelists. The assigning including obtaining, for a selected portion of memory of the portions of memory, a locality domain within the NUMA computing environment with which the portion of memory is associated, and adding the selected portion of memory to a corresponding locality domain-based freelist of the locality domain-based freelists based on the associated locality domain of the portion of memory. Domain locality is then used in allocating the memory from the locality domain-based freelists to processors of the NUMA computing environment performing processing of the application. 1. A computer program product for facilitating processing within a non-uniform memory access (NUMA) computing environment , the computer program product comprising: obtaining memory for a memory heap for an application of a virtualized environment of the NUMA computing environment;', obtaining, for a selected portion of memory of the portions of memory, a locality domain within the NUMA computing environment with which the selected portion of memory is associated;', 'adding the selected portion of memory to a corresponding locality domain-based freelist of the locality domain-based freelists based on the associated locality domain of the portion of memory; and, 'assigning portions of memory of the obtained memory to locality domain-based freelists, the assigning comprising, 'using domain locality in allocating the memory from the locality domain-based freelists to processors of the NUMA computing environment performing processing of the application., 'a computer readable storage medium readable by a processing circuit and storing instructions which, when executed, perform a method comprising2. 
The ...
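The locality domain-based freelist scheme above assigns heap chunks to per-domain freelists and serves allocations from the requesting processor's own domain first. A hedged sketch; the `DomainHeap` class, the remote-fallback order, and treating domains as plain integers are illustrative assumptions.

```python
from collections import defaultdict

# Sketch of locality domain-based freelists: each heap chunk is appended
# to the freelist of the NUMA domain that physically backs it, and
# allocations prefer the requesting processor's own domain.
class DomainHeap:
    def __init__(self):
        self.freelists = defaultdict(list)   # domain id -> free chunks

    def assign(self, chunk, domain):
        self.freelists[domain].append(chunk)

    def alloc(self, cpu_domain):
        if self.freelists[cpu_domain]:
            return self.freelists[cpu_domain].pop()    # local memory, fast
        for domain, chunks in self.freelists.items():  # remote fallback
            if chunks:
                return chunks.pop()
        return None

heap = DomainHeap()
heap.assign("chunk-a", domain=0)
heap.assign("chunk-b", domain=1)
print(heap.alloc(cpu_domain=1))  # chunk-b: same-domain memory preferred
```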

09-05-2019 publication date

Data processing systems

Number: US20190138458A1
Assignee: ARM LTD

When writing data to memory via a write buffer including a write cache containing a plurality of lines for storing data to be written to memory and an address-translation cache that stores a list of virtual address to physical address translations, a record of a set of lines of the write cache that are available to be evicted to the memory is maintained, and the evictable lines in the record of evictable lines are processed by requesting from the address-translation cache a respective physical address for each virtual address associated with an evictable line. The address-translation cache returns a hit or a miss status to the write buffer for each evictable line that is checked, and the write buffer writes out to memory at least one of the evictable lines for which a hit status was returned.
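The drain path above can be sketched as a pass over the evictable-line record: each line's virtual address is checked against the address-translation cache, hits are written out, and misses stay queued. Function and structure names are illustrative assumptions.

```python
# Sketch of the evict path: evictable write-cache lines are checked against
# an address-translation cache; hits get a physical address and are written
# to memory, misses remain queued while the translation is fetched.
def drain_evictable(evictable, translation_cache, memory):
    still_waiting = []
    for virt, data in evictable:
        phys = translation_cache.get(virt)      # hit or miss status
        if phys is not None:
            memory[phys] = data                 # hit: write line to memory
        else:
            still_waiting.append((virt, data))  # miss: request translation
    return still_waiting

tlb = {0x1000: 0x9000}
memory = {}
pending = drain_evictable([(0x1000, "A"), (0x2000, "B")], tlb, memory)
print(memory)    # line A landed at its physical address
print(pending)   # line B still awaits a translation
```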

30-04-2020 publication date

Data merge method, memory storage device and memory control circuit unit

Number: US20200133844A1
Assignee: Phison Electronics Corp

A data merge method for a rewritable non-volatile memory module including a plurality of physical units is provided according to an exemplary embodiment of the disclosure. The method includes: obtaining a first logical distance value between a first physical unit and a second physical unit among the physical units, and the first logical distance value reflects a logical dispersion degree between at least one first logical unit mapped by the first physical unit and at least one second logical unit mapped by the second physical unit; and performing a data merge operation according to the first logical distance value, so as to copy valid data from a source node to a recycling node.

30-04-2020 publication date

SELECTIVELY ENABLED RESULT LOOKASIDE BUFFER

Number: US20200133880A1
Assignee:

A processing system selectively enables and disables a result lookaside buffer (RLB) based on a hit rate tracked by a counter, thereby reducing power consumption for lookups at the result lookaside buffer during periods of low hit rates and improving the overall hit rate for the result lookaside buffer. A controller increments the counter in the event of a hit at the RLB and decrements the counter in the event of a miss at the RLB. If the value of the counter falls below a threshold value, the processing system temporarily disables the RLB for a programmable period of time. After the period of time, the processing system re-enables the RLB and resets the counter to an initial value. 1. A method comprising:storing, at a tag portion of a buffer, first instruction information comprising a first opcode and a first set of operands, the first set comprising at least one operand;storing, at a data portion of the buffer, a first result of a first operation performed on the first set of operands based on the first opcode;in response to receiving an instruction for execution at a computation unit, the instruction comprising a second opcode and a second set of operands, the second set comprising at least one operand, comparing the second opcode and the second set of operands to the first instruction information stored at the tag portion of the buffer;accessing the first result at the data portion of the buffer in response to the second opcode and the second set of operands matching the first instruction information;tracking, at a counter, an instance of the second opcode and the second set of operands matching the first instruction information; anddisabling the buffer in response to a value of the counter being less than a threshold value.2. The method of claim 1 , wherein tracking comprises:incrementing the counter by a first number in response to an instance of the second opcode and the second set of operands matching the first instruction information.3. 
The method of claim ...
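The hit-rate gating above reduces to a counter that rises on hits, falls on misses, and disables lookups for a programmable period once it drops below a threshold. A sketch; the parameter values, the countdown-based disable window, and the table keyed by (opcode, operands) are illustrative assumptions.

```python
# Sketch of the hit-rate gating: a counter goes up on result-lookaside-
# buffer hits and down on misses; dropping below the threshold disables
# lookups for a programmable period, after which the counter is reset.
class ResultLookasideBuffer:
    def __init__(self, threshold=0, disable_period=3, initial=2):
        self.table = {}                  # (opcode, operands) -> cached result
        self.counter = initial
        self.initial = initial
        self.threshold = threshold
        self.disable_period = disable_period
        self.disabled_for = 0

    def lookup(self, key):
        if self.disabled_for > 0:
            self.disabled_for -= 1
            if self.disabled_for == 0:
                self.counter = self.initial   # re-enable with a reset counter
            return None                       # no lookup power spent while off
        result = self.table.get(key)
        self.counter += 1 if result is not None else -1
        if self.counter < self.threshold:
            self.disabled_for = self.disable_period
        return result

rlb = ResultLookasideBuffer()
rlb.table[("add", (1, 2))] = 3
print(rlb.lookup(("add", (1, 2))))   # 3: hit, counter increments
```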

10-06-2021 publication date

DYNAMIC MEMORY ADDRESS ENCODING

Number: US20210173777A1
Authors: Meng Pingfan, ZHANG Yubo
Assignee:

Described herein is a memory architecture that is configured to dynamically determine an address encoding to use to encode multi-dimensional data such as multi-coordinate data in a manner that provides a coordinate bias corresponding to a current memory access pattern. The address encoding may be dynamically generated in response to receiving a memory access request or may be selected from a set of preconfigured address encodings. The dynamically generated or selected address encoding may apply an interleaving technique to bit representations of coordinate values to obtain an encoded memory address. The interleaving technique may interleave a greater number of bits from the bit representation corresponding to the coordinate direction in which a coordinate bias is desired than from bit representations corresponding to other coordinate directions. 1. A computer-implemented method for dynamic memory address encoding of coordinate data , the method comprising:receiving a memory access request for the coordinate data;identifying one or more constraints associated with the coordinate data;predicting a coordinate bias for the coordinate data based at least in part on the one or more constraints;determining an address encoding for the coordinate data that provides the predicted coordinate bias;applying the address encoding to the coordinate data to obtain an encoded memory address; andstoring the coordinate data at a memory location corresponding to the encoded memory address.2. The computer-implemented method of claim 1 , wherein determining the address encoding comprises dynamically generating the address encoding responsive claim 1 , at least in part claim 1 , to receiving the memory access request for the coordinate data.3. The computer-implemented method of claim 1 , wherein determining the address encoding comprises selecting the address encoding from a set of preconfigured address encodings.4. The computer-implemented method of claim 1 , wherein determining the ...

10-06-2021 publication date

MEMORY ARCHITECTURE FOR EFFICIENT SPATIAL-TEMPORAL DATA STORAGE AND ACCESS

Number: US20210173778A1
Authors: Meng Pingfan, ZHANG Yubo
Assignee:

Described herein are systems, methods, and non-transitory computer readable media for memory address encoding of multi-dimensional data in a manner that optimizes the storage and access of such data in linear data storage. The multi-dimensional data may be spatial-temporal data that includes two or more spatial dimensions and a time dimension. An improved memory architecture is provided that includes an address encoder that takes a multi-dimensional coordinate as input and produces a linear physical memory address. The address encoder encodes the multi-dimensional data such that two multi-dimensional coordinates close to one another in multi-dimensional space are likely to be stored in close proximity to one another in linear data storage. In this manner, the number of main memory accesses, and thus, overall memory access latency is reduced, particularly in connection with real-world applications in which the respective probabilities of moving along any given dimension are very close. 1. 
A computer-implemented method for memory address encoding of multi-dimensional data that includes first multi-dimensional data and second multi-dimensional data , the method comprising:applying an address encoding to the first multi-dimensional data to obtain a first memory address for the first multi-dimensional data;applying the address encoding to the second multi-dimensional data to obtain a second memory address for the second multi-dimensional data, wherein the second multi-dimensional data is obtained from the first multi-dimensional data by incrementing by one unit respective data of the first multi-dimensional data that corresponds to a particular dimension;storing the first multi-dimensional data in a memory at the first memory address; andstoring the second multi-dimensional data in the memory at the second memory address,wherein the address encoding ensures that a linear difference between the second memory address and the first memory address is bounded independently of ...

25-05-2017 publication date

Optimizing page table manipulations

Number: US20170147500A1
Assignee: International Business Machines Corp

A computer program product for optimizing page table manipulations is provided and includes a computer readable storage medium having program instructions that are readable and executable by a processing circuit to cause the processing circuit to create and maintain a translation table with a translation look-aside buffer (TLB) disposed to cache priority translations, update the translation table upon de-registration of a DMA address, allocate entries in the translation table from low to high memory addresses during memory registration, maintain a cursor for identifying where to search for available entries upon performance of a new registration, advance the cursor from entry-to-entry in the translation table and wrap the cursor from an end of the translation table to a beginning of the translation table and issue a synchronous TLB invalidation instruction to invalidate the TLB upon at least one wrapping and an entry being identified and updated.
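The cursor policy above can be modeled end to end: entries are allocated low-to-high, deregistration clears an entry with no per-entry invalidate, the cursor wraps from the table's end to its beginning, and one synchronous full-TLB invalidation is issued once a wrap has occurred and an entry is then identified and updated. The class layout and the exact point at which the flush fires are assumptions of this sketch.

```python
# Sketch of the wrap-triggered invalidation: deregistration is cheap (no
# invalidate), and a single synchronous TLB flush is charged only after
# the allocation cursor has wrapped and a free entry is reused.
class TranslationTable:
    def __init__(self, size):
        self.entries = [None] * size
        self.cursor = 0
        self.wrapped = False
        self.tlb_invalidations = 0

    def register(self, addr):
        for _ in range(len(self.entries)):
            i = self.cursor
            self.cursor += 1
            if self.cursor == len(self.entries):   # wrap end -> beginning
                self.cursor = 0
                self.wrapped = True
            if self.entries[i] is None:
                if self.wrapped:
                    self.tlb_invalidations += 1    # one synchronous flush
                    self.wrapped = False
                self.entries[i] = addr
                return i
        raise MemoryError("translation table full")

    def deregister(self, i):
        self.entries[i] = None   # no invalidation issued here

tt = TranslationTable(size=3)
tt.register("dma-a"); tt.register("dma-b")
tt.deregister(0)
tt.register("dma-c")            # takes the last slot, cursor wraps
tt.register("dma-d")            # reuses slot 0 after the wrap
print(tt.tlb_invalidations)     # 1
```

The design point is amortization: many deregistrations cost nothing individually, and at most one whole-buffer invalidation is paid per trip of the cursor around the table.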

16-05-2019 publication date

OPTIMIZING PAGE TABLE MANIPULATIONS

Number: US20190146928A1
Assignee:

A computer program product for optimizing page table manipulations is provided and includes a computer readable storage medium having program instructions that are readable and executable by a processing circuit to cause the processing circuit to create and maintain a translation table with a translation look-aside buffer (TLB) disposed to cache priority translations, update the translation table upon de-registration of a DMA address, allocate entries in the translation table from low to high memory addresses during memory registration, maintain a cursor for identifying where to search for available entries upon performance of a new registration, advance the cursor from entry-to-entry in the translation table and wrap the cursor from an end of the translation table to a beginning of the translation table and issue a synchronous TLB invalidation instruction to invalidate the TLB upon at least one wrapping and an entry being identified and updated. 1. A computer program product for optimizing page table manipulations , the computer program product comprising a computer readable storage medium having program instructions embodied therewith , the program instructions being readable and executable by a processing circuit to cause the processing circuit to:create and maintain a table for translating first addresses to second addresses with a buffer disposed to cache priority translations;update the table upon de-registration of one of the first addresses without issuance of a corresponding invalidation instruction;allocate entries in the table from low to high memory addresses during memory registration;maintain a cursor for identifying where to search for available entries upon performance of a new registration, the cursor being configured for entry-to-entry advancement and end-to-beginning wrapping in the table; andissue a synchronous invalidation instruction to invalidate an entirety of the buffer upon at least one wrapping of the cursor and an entry being identified 
...

31-05-2018 publication date

MULTI-MODE CACHE INVALIDATION

Number: US20180150394A1
Assignee:

Systems and methods for cache invalidation, with support for different modes of cache invalidation, include receiving a matchline signal, wherein the matchline signal indicates whether there is a match between a search word and an entry of a tag array of the cache. The matchline signal is latched in a latch controlled by a function of a single bit mismatch clock, wherein a rising edge of the single bit mismatch clock is based on delay for determining a single bit mismatch between the search word and the entry of the tag array. An invalidate signal for invalidating a cacheline corresponding to the entry of the tag array is generated at an output of the latch. Circuit complexity is reduced by gating a search word with a search-invalidate signal, such that the gated search word corresponds to the search word for a search-invalidate and to zero for a flash-invalidate.

1. A method of cache invalidation, the method comprising: receiving a matchline signal, wherein the matchline signal indicates whether there is a match between a search word and an entry of a tag array of the cache; latching the matchline signal in a latch controlled by a latch clock, wherein the latch clock is a function of a single bit mismatch clock, wherein a rising edge of the single bit mismatch clock is based on delay for determining a single bit mismatch between the search word and the entry of the tag array; and generating an invalidate signal at an output of the latch.
2. The method of claim 1, further comprising invalidating a cacheline in a data array of the cache based on the invalidate signal, wherein the cacheline is associated with the entry of the tag array.
3. The method of claim 2, wherein invalidating the cacheline comprises setting a valid bit associated with the cacheline to zero.
4. The method of claim 1, wherein the invalidate signal is decoupled from the matchline signal by the latch.
5. The method of claim 1, wherein the latch clock is a delayed and stretched version of the ...
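As a behavioral illustration of the two invalidation modes (not the patent's latch/timing circuit, which operates on matchline delays in hardware), the search-word gating can be sketched in Python; the `TagArray` class and its method names are illustrative:

```python
class TagArray:
    """Behavioral model of a tag array with per-entry valid bits.

    A search-invalidate clears only entries whose tag matches the search
    word; a flash-invalidate gates the search word away so that every
    entry matches, clearing all valid bits.
    """
    def __init__(self, tags):
        self.tags = list(tags)
        self.valid = [True] * len(tags)

    def invalidate(self, search_word=0, search_invalidate=False):
        for i, tag in enumerate(self.tags):
            # gated compare: with the search word gated off (flash mode),
            # every entry "matches"
            match = (tag == search_word) if search_invalidate else True
            if match:  # the latched matchline would drive this invalidate
                self.valid[i] = False
```

Running a search-invalidate for one tag followed by a flash-invalidate shows the first clearing only matching entries and the second clearing everything.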

More details
17-06-2021 publication date

TRACKING STREAMING ENGINE VECTOR PREDICATES TO CONTROL PROCESSOR EXECUTION

Number: US20210182210A1

Assignee:

In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.

1. A circuit comprising:
a processor; and
a memory circuit coupled to the processor and configured to couple to a memory, wherein the memory circuit includes:
an address generator configured to generate a set of addresses;
a memory interface coupled to the address generator and configured to retrieve a set of data elements associated with the set of addresses from the memory;
a head register coupled to the memory interface and to the processor and configured to: store the set of data elements; and provide a data stream that includes the set of data elements to the processor; and
a valid register coupled to the memory interface and to the processor and configured to: store a set of valid indicators; and provide a first valid indicator of the set of valid indicators to the processor, wherein the first valid indicator is associated with a first portion of the data stream and indicates whether the first portion is valid.
2. The circuit of claim 1, wherein: the head register includes a plurality of bytes grouped in lanes such that each lane stores one data element of the set of data elements; and the valid register includes a plurality of bits associated with the plurality of bytes of the head register such that each bit of the plurality of bits of the valid register indicates whether one respective byte of the plurality of bytes of the head register is valid.
3. The circuit of claim 1, wherein: the head register includes a plurality of bytes grouped in lanes such that each lane ...
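The loop-repeat/loop-exit behavior driven by the vector predicate can be sketched in Python (a real streaming engine does this per vector lane in hardware; the function and parameter names are illustrative):

```python
def process_stream(vectors, predicates, op):
    """Apply op to the valid lanes of each data vector.

    The loop repeats while the current vector predicate marks at least
    one lane valid, and exits as soon as a predicate marks none -- the
    exit condition described in the abstract.
    """
    results = []
    for vec, pred in zip(vectors, predicates):
        if not any(pred):  # no valid data elements -> exit the loop
            break
        # keep only lanes the predicate marks valid
        results.append([op(x) for x, p in zip(vec, pred) if p])
    return results
```

With an all-false predicate on the third vector, processing stops after the second iteration even though more data follows.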

More details
01-06-2017 publication date

Apparatus and method for accelerating operations in a processor which uses shared virtual memory

Number: US20170153984A1

Assignee: Individual

An apparatus and method are described for coupling a front end core to an accelerator component (e.g., such as a graphics accelerator). For example, an apparatus is described comprising: an accelerator comprising one or more execution units (EUs) to execute a specified set of instructions; and a front end core comprising a translation lookaside buffer (TLB) communicatively coupled to the accelerator and providing memory access services to the accelerator, the memory access services including performing TLB lookup operations to map virtual to physical addresses on behalf of the accelerator and in response to the accelerator requiring access to a system memory.
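A minimal sketch of the translation service a front end core could provide to an accelerator, in Python (the class, field names, and page size are illustrative assumptions, not the patent's implementation):

```python
PAGE_SIZE = 4096  # illustrative page size

class FrontEndTLB:
    """Toy page-granular TLB maintained by a front end core on behalf
    of an attached accelerator: virtual addresses are translated via
    cached entries, falling back to a page-table walk on a miss."""
    def __init__(self, page_table):
        self.page_table = page_table  # authoritative VPN -> PPN mapping
        self.entries = {}             # cached translations (the TLB proper)

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:   # TLB miss: walk the page table
            self.entries[vpn] = self.page_table[vpn]
        return self.entries[vpn] * PAGE_SIZE + offset
```

The first lookup for a page fills the TLB; subsequent accesses to the same page hit the cached entry without touching the page table.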

More details
17-06-2021 publication date

CACHE STORAGE FOR STREAMING DATA

Number: US20210185142A1

Author: Paduroiu Andrei

Assignee:

Described herein are systems and techniques to efficiently cache data for streaming applications. A cache can be organized to include multiple cache segments, and each cache segment can include multiple cache blocks. A cache entry can be created for streaming data, and the streaming data can be streamed directly into a first cache block. When the first cache block is full, a next cache block can be identified, in a same cache segment or in a new cache segment. The streaming data can be streamed directly into the next cache block, and into any further cache blocks as needed.

1. A method, comprising: identifying, by a device comprising a processor, a first cache block in a first cache segment of multiple cache segments of a cache storage, wherein each cache segment of the multiple cache segments comprises multiple cache blocks; storing, by the device, in the first cache block, a first portion of a cache entry comprising a first portion of a data stream; in response to an indication that the first cache block is full, identifying, by the device, a second cache block in the cache storage; storing, by the device, in the second cache block, a second portion of the cache entry comprising a second portion of the data stream; and storing, by the device, a pointer in metadata for the second cache block, wherein the pointer points to the first cache block.
2. The method of claim 1, wherein the multiple cache segments are equal sized and wherein the multiple cache blocks are equal sized.
3. The method of claim 1, wherein the pointer in metadata for the second cache block is stored in a metadata block of a cache segment of the multiple cache segments comprising the second cache block.
4. The method of claim 1, wherein identifying the first cache block in the cache storage comprises checking a cache index for a last used cache block identification, and wherein the method further comprises updating, by the device, the last used cache block identification in ...
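The block-chaining idea (each block's metadata points back to the previous block of the same cache entry) can be sketched in Python; `CacheStore`, the tiny block size, and the method names are illustrative assumptions:

```python
BLOCK_SIZE = 4  # illustrative; real cache blocks would be far larger

class CacheStore:
    """Cache blocks chained by back-pointers in metadata: each block
    records the index of the previous block of the same cache entry."""
    def __init__(self):
        self.blocks = []  # block index -> block contents
        self.meta = []    # block index -> index of previous block, or None

    def append_stream(self, data):
        """Stream data directly into blocks; when one fills, open the
        next and record a pointer back to its predecessor. Returns the
        index of the entry's tail block."""
        prev = None
        for i in range(0, len(data), BLOCK_SIZE):
            self.blocks.append(data[i:i + BLOCK_SIZE])
            self.meta.append(prev)  # back-pointer to the previous block
            prev = len(self.blocks) - 1
        return prev

    def read_entry(self, tail):
        """Walk back-pointers from the tail block, then reassemble."""
        chunks = []
        while tail is not None:
            chunks.append(self.blocks[tail])
            tail = self.meta[tail]
        return b"".join(reversed(chunks))
```

Streaming a payload longer than one block spreads it across several blocks whose back-pointers let the whole entry be reassembled from the tail.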

More details
07-06-2018 publication date

DDR STORAGE ADAPTER

Number: US20180157597A1

Assignee:

A method of accessing a persistent memory over a memory interface is disclosed. In one embodiment, the method includes allocating a virtual address range comprising virtual memory pages to be associated with physical pages of a memory buffer and marking each page table entry associated with the virtual address range as not having a corresponding one of the physical pages of the memory buffer. The method further includes generating a page fault when one or more of the virtual memory pages within the virtual address range is accessed and mapping page table entries of the virtual memory pages to the physical pages of the memory buffer. The method further includes transferring data between a physical page of the persistent memory and one of the physical pages of the memory buffer mapped to a corresponding one of the virtual memory pages.

1. A method of accessing a DIMM-attached storage subsystem over a DIMM interface configured to be communicatively coupled to a memory buffer, a DIMM controller, and the DIMM-attached storage subsystem, comprising:
generating a page fault when a virtual address within a virtual memory address space that is not mapped to a corresponding physical address within a physical memory address space of the memory buffer is accessed; and
in response to the page fault: mapping the virtual address within the virtual memory address space to a physical address within the physical memory address space of the memory buffer; and queuing one or more commands in a command buffer of the DIMM controller to write existing data in the mapped physical address within the physical memory address space of the memory buffer to the DIMM-attached storage subsystem and erase the existing data in the mapped physical memory address within the physical memory address space of the memory buffer.
2. The method of claim 1, further comprising: writing data from the DIMM-attached storage subsystem to the mapped physical memory address within the physical memory address ...
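The fault-then-map-then-queue flow can be sketched in Python as simple demand paging; dicts stand in for the DIMM-attached storage and the memory buffer, and a list stands in for the DIMM controller's command buffer (all names are illustrative):

```python
class DimmAdapter:
    """Demand-paging sketch of the DDR storage adapter's access path:
    an unmapped virtual address raises a (simulated) page fault, which
    maps the address to a buffer page, queues controller commands, and
    fills the page from persistent storage."""
    def __init__(self, storage):
        self.storage = storage  # persistent pages: vaddr -> page data
        self.page_table = {}    # vaddr -> buffer slot (mapped pages)
        self.buffer = {}        # memory buffer: slot -> page data
        self.commands = []      # DIMM controller command buffer
        self.next_slot = 0

    def access(self, vaddr):
        if vaddr not in self.page_table:  # page fault
            slot, self.next_slot = self.next_slot, self.next_slot + 1
            self.page_table[vaddr] = slot  # map VA -> buffer page
            # queue write-back and erase of the buffer page's old contents
            self.commands.append(("write_erase", slot))
            self.buffer[slot] = self.storage[vaddr]
        return self.buffer[self.page_table[vaddr]]
```

A first access faults, maps, and queues one command; a second access to the same address hits the mapped buffer page with no further fault.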

More details
07-06-2018 publication date

TRANSLATION LOOKASIDE BUFFER SWITCH BANK

Number: US20180157599A1

Assignee:

Example devices are disclosed. For example, a device may include a processor, a plurality of translation lookaside buffers, a plurality of switches, and a memory management unit. Each of the translation lookaside buffers may be assigned to a different process of the processor, each of the plurality of switches may include a register for storing a different process identifier, and each of the plurality of switches may be associated with a different one of the translation lookaside buffers. The memory management unit may be for receiving a virtual memory address and a process identifier from the processor and forwarding the process identifier to the plurality of switches. Each of the plurality of switches may be for connecting the memory management unit to the translation lookaside buffer associated with the switch when there is a match between the process identifier and the different process identifier stored by the register of the switch.

1. A device comprising:
a processor;
a plurality of translation lookaside buffers, wherein each of the plurality of translation lookaside buffers is to be assigned to a different process of a plurality of processes of the processor;
a plurality of switches, wherein each of the plurality of switches comprises a register for storing a different process identifier of a plurality of process identifiers, wherein each of the plurality of switches is associated with a different translation lookaside buffer of the plurality of translation lookaside buffers; and
a memory management unit for: ranking the plurality of processes based on a memory utilization, wherein the ranking is used to assign one or more of the plurality of translation lookaside buffers to one or more of the plurality of processes; receiving a virtual memory address and a process identifier from the processor; and forwarding the process identifier to the plurality of switches, wherein each switch of the plurality of switches is further for connecting the memory management unit to a translation lookaside buffer ...
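The switch-bank selection can be sketched in Python: each "switch" pairs a process-identifier register with its own TLB, the MMU broadcasts the incoming process identifier, and only the switch whose register matches connects its TLB for the lookup (class and method names are illustrative):

```python
class TlbSwitchBank:
    """Sketch of per-process TLB selection: a list of
    (pid_register, tlb) pairs models the bank of switches, each
    guarding one translation lookaside buffer."""
    def __init__(self):
        self.switches = []  # list of (pid_register, tlb) pairs

    def assign(self, pid, tlb):
        # program a switch's register and attach its dedicated TLB
        self.switches.append((pid, dict(tlb)))

    def lookup(self, pid, vaddr):
        for pid_register, tlb in self.switches:  # broadcast pid to all
            if pid_register == pid:  # match: this switch connects its TLB
                return tlb.get(vaddr)
        return None  # no TLB assigned to this process
```

Two processes sharing a virtual address resolve to different physical addresses through their own TLBs, while an unassigned process gets no translation.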

More details