Total found: 3574. Displayed: 199.
Publication date: 24-10-2001

Caching data

Number: GB0000121115D0

Publication date: 21-12-2011

Cache for a multiprocessor system which can treat a local access operation as a shared access operation

Number: GB0002481232A

A data processing system has several processors 300, 320, 340, each with its own cache 310, 330, 350. Accesses by a processor to its cache may be local or shared. The processor contains a flag 307, which causes a local access to be treated as a global access. The operation may be a clean operation, an invalidate operation or a memory barrier operation issued by an operating system. The flag may be set when a hypervisor moves a virtual machine from one processor to another. Local operations by the hypervisor may be treated as being local when the flag is set. The cache may be a data cache or a translation lookaside buffer.

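A minimal sketch in C of the mechanism above, with hypothetical names (cpu_state, cache_clean, broadcast_clean) standing in for the hardware: when the flag is set, a clean issued as a local operation is carried out as a shared (global) one.

#include <stdbool.h>
#include <stdio.h>

/* Per-processor state: the flag (307 in the figure) may be set, e.g.
 * when a hypervisor has moved a virtual machine to another processor. */
struct cpu_state {
    bool treat_local_as_global;
};

static void clean_local_cache(int cpu) { printf("clean local cache of cpu %d\n", cpu); }
static void broadcast_clean(void)      { printf("broadcast clean to all caches\n"); }

/* A clean issued by the operating system as a local operation is
 * promoted to a shared (global) operation whenever the flag is set. */
void cache_clean(struct cpu_state *st, int cpu)
{
    if (st->treat_local_as_global)
        broadcast_clean();
    else
        clean_local_cache(cpu);
}

int main(void)
{
    struct cpu_state st = { .treat_local_as_global = true };
    cache_clean(&st, 0);   /* takes the broadcast path */
    return 0;
}
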
Publication date: 02-03-1993

CACHE WITH AT LEAST TWO DIFFERENT FILL SIZES

Number: CA0001314107C

A method and apparatus for optimizing the performance of a cache memory system is disclosed. During the operation of a computer system whose processor is supported by virtual cache memory, the cache must be cleared and refilled to allow the replacement of old data and instructions with more current instructions and data. The cache is filled with either P or N blocks of data. Numerous methods for dynamically selecting N or P blocks of data are possible. Pursuant to one embodiment, immediately after the cache has been flushed, the miss is refilled with N blocks, allowing data and instructions to be moved to the cache at high speed. Once the cache is mostly full, the miss tends to be refilled with P blocks, which is less than N. This maintains the currency of the data in the cache, while simultaneously avoiding writing-over of data or instructions already in the cache. The invention is particularly advantageous in a multi-user/multi-tasking system where the program being run changes frequently ...

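A back-of-the-envelope sketch of the two fill sizes; the values of N and P and the 75% "mostly full" threshold are illustrative assumptions (the abstract notes that numerous selection methods are possible).

#include <stdio.h>

/* Right after a flush the cache is refilled N blocks at a time; once it
 * is mostly full, smaller P-block fills preserve resident data. */
enum { N_BLOCKS = 8, P_BLOCKS = 2 };

unsigned refill_size(unsigned lines_valid, unsigned lines_total)
{
    /* "mostly full" taken as >= 75% occupancy for illustration */
    return (lines_valid * 4 >= lines_total * 3) ? P_BLOCKS : N_BLOCKS;
}

int main(void)
{
    printf("after flush: fill %u blocks\n", refill_size(10, 1024));
    printf("mostly full: fill %u blocks\n", refill_size(900, 1024));
    return 0;
}
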
Publication date: 04-05-2021

INSTRUCTION CACHE IN A MULTI-THREADED PROCESSOR

Number: CA3040901C
Assignee: GRAPHCORE LIMITED

A processor comprising: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.

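A rough model of the claim-and-replay behaviour described above; the repeat_cache structure and its fields are assumptions, with a string standing in for the cached portion of code.

#include <stdbool.h>
#include <stdio.h>

/* The first thread to execute the repeat instruction claims the cache
 * and fills it; threads arriving later replay the cached code. */
struct repeat_cache {
    bool claimed;
    const char *code;   /* stands in for the cached instructions */
};

void repeat_exec(struct repeat_cache *rc, const char *code, int repeat_count)
{
    if (!rc->claimed) {          /* first thread: cache and claim */
        rc->code = code;
        rc->claimed = true;
    }
    for (int i = 0; i < repeat_count; i++)   /* per-thread count operand */
        printf("executing %s, iteration %d\n", rc->code, i);
}

int main(void)
{
    struct repeat_cache rc = { false, NULL };
    repeat_exec(&rc, "loop-body", 2);   /* first thread claims and runs */
    repeat_exec(&rc, "unused", 3);      /* further thread replays cached code */
    return 0;
}
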
Publication date: 13-02-2018

MEMORY RESOURCE OPTIMIZATION METHOD AND APPARATUS

Number: CA0002927372C

Embodiments of the present invention provide a memory resource optimization method and apparatus. They relate to the computer field, solve the problem that existing multi-level memory resources affect each other, and optimize an existing single partitioning mechanism. A specific solution is: obtaining performance data of each program in a working set by using a page coloring technology, obtaining a category of each program in light of a memory access frequency, selecting, according to the category of each program, a page coloring-based partitioning policy corresponding to the working set, and writing the page coloring-based partitioning policy to an operating system kernel, to complete corresponding coloring-based partitioning processing. The present invention is used to eliminate or reduce mutual interference of processes or threads on a storage resource in light of a feature of the working set, thereby improving overall performance of a computer.

Publication date: 01-09-2020

Dynamic sharing buffer supporting multiple protocols

Number: CN0111611180A

Publication date: 28-11-2017

Method for determining shared virtual memory page management mode and related devices

Number: CN0107402891A

Publication date: 08-06-2018

Distributed cache dynamic migration

Number: CN0108139974A

Publication date: 29-03-2019

Systems, methods and interfaces for adaptive persistence

Number: CN0104903872B

Publication date: 01-10-2020

ONBOARD SECURE ELEMENT

Number: WO2020193663A1

The present invention concerns an onboard secure element (E) comprising a virtual memory (VRAM), and being configured to implement at least part of a first application (App20) adapted to be implemented by at least one low level operating system (113) of the onboard secure element (E), wherein execution data relating to one or more secondary tasks of said first application (App20) are stored in part of said virtual memory (VRAM) when the execution of said part of the first application (App20) is interrupted by the execution of at least part of a second application (App21).

Publication date: 27-04-2017

SYSTEM AND METHOD FOR A SHARED CACHE WITH ADAPTIVE PARTITIONING

Number: WO2017069907A1

A cache controller adaptively partitions a shared cache. The adaptive partitioning cache controller includes tag comparison and staling logic and selection logic that are responsive to client access requests and various parameters. A component cache is assigned a target occupancy which is compared to a current occupancy. A conditional identification of stale cache lines is used to manage data stored in the shared cache. When a conflict or cache miss is identified, selection logic identifies candidates for replacement preferably among cache lines identified as stale. Each cache line is assigned to a bucket with a fixed number of buckets per component cache. Allocated cache lines are assigned to a bucket as a function of the target occupancy. After a select number of buckets are filled, subsequent allocations result in oldest cache lines being marked stale. Cache lines are deemed stale when their respective component cache active indicator is de-asserted.

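A simplified sketch of the bucket scheme above; the rotating-bucket bookkeeping and all names are assumptions, with the bucket capacity derived from the target occupancy and the oldest bucket's lines reported as stale once allocations wrap around.

#include <stdio.h>

enum { NUM_BUCKETS = 4 };   /* fixed number of buckets per component cache */

struct component_cache {
    unsigned target_occupancy;
    unsigned bucket_fill[NUM_BUCKETS];
    unsigned newest, oldest;
};

static unsigned bucket_capacity(const struct component_cache *cc)
{
    return cc->target_occupancy / NUM_BUCKETS;
}

/* Returns the bucket receiving the new line; *stale_bucket is set to the
 * bucket whose lines should now be marked stale, or -1 if none. */
int allocate_line(struct component_cache *cc, int *stale_bucket)
{
    *stale_bucket = -1;
    if (cc->bucket_fill[cc->newest] >= bucket_capacity(cc)) {
        cc->newest = (cc->newest + 1) % NUM_BUCKETS;
        if (cc->newest == cc->oldest) {        /* wrapped: oldest ages out */
            *stale_bucket = (int)cc->oldest;
            cc->oldest = (cc->oldest + 1) % NUM_BUCKETS;
            cc->bucket_fill[*stale_bucket] = 0;
        }
    }
    cc->bucket_fill[cc->newest]++;
    return (int)cc->newest;
}

int main(void)
{
    struct component_cache cc = { .target_occupancy = 8 };
    for (int i = 0; i < 12; i++) {
        int stale;
        int b = allocate_line(&cc, &stale);
        if (stale >= 0)
            printf("line -> bucket %d, bucket %d marked stale\n", b, stale);
    }
    return 0;
}
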
Publication date: 24-11-2016

SYSTEMS AND METHODS FOR ADDRESSING A CACHE WITH SPLIT-INDEXES

Number: WO2016185272A1
Author: RICHMOND, Richard

Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding a video frame in parallel.

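A small sketch of the split-index combination; the field layout of the index configuration register is an assumption.

#include <stdint.h>
#include <stdio.h>

/* The configuration register selects an upper and a lower bit field of
 * the address; the two fields are concatenated into the set index. */
struct index_config {
    unsigned lo_shift, lo_bits;   /* lower index portion */
    unsigned hi_shift, hi_bits;   /* upper index portion */
};

uint32_t combined_index(uint64_t addr, const struct index_config *cfg)
{
    uint32_t lo = (uint32_t)(addr >> cfg->lo_shift) & ((1u << cfg->lo_bits) - 1);
    uint32_t hi = (uint32_t)(addr >> cfg->hi_shift) & ((1u << cfg->hi_bits) - 1);
    return (hi << cfg->lo_bits) | lo;   /* upper bits above lower bits */
}

int main(void)
{
    struct index_config cfg = { 6, 4, 20, 3 };   /* example split */
    printf("set %u\n", combined_index(0x12345678ULL, &cfg));
    return 0;
}
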
Publication date: 09-06-2020

Dynamic adjustment of a process scheduler in a data storage system based on loading of the data storage system during a preceding sampling time period

Number: US0010678480B1

Technology for dynamically adjusting a process scheduler in a storage processor of a data storage system. An average amount of host data contained in sets of host data processed by host I/O request processing threads is calculated. An average amount of time required for each host I/O request processing thread to execute to completely process the average amount of host data contained in a set of host data is also calculated. Operation of the process scheduler in the storage processor is then adjusted to cause the process scheduler to subsequently allocate the processor in the storage processor to host I/O request processing threads in timeslices having a duration that is at least as large as the average amount of time required for each host I/O request processing thread to execute to completely process the average amount of host data contained in a set of host data.

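A sketch of the adjustment rule in C; the structure and parameter names are assumptions, and the timeslice is simply raised to at least the measured average completion time.

#include <stdio.h>

struct sched_params { double timeslice_us; };

/* Timeslices granted to host I/O request processing threads are made at
 * least as large as the average time a thread needs to completely
 * process an average-sized set of host data. */
void adjust_timeslice(struct sched_params *sp,
                      double total_exec_us, unsigned long sets_processed)
{
    double avg_us = total_exec_us / sets_processed;
    if (sp->timeslice_us < avg_us)
        sp->timeslice_us = avg_us;
}

int main(void)
{
    struct sched_params sp = { .timeslice_us = 100.0 };
    adjust_timeslice(&sp, 5000000.0, 20000);   /* average: 250 us */
    printf("new timeslice: %.0f us\n", sp.timeslice_us);
    return 0;
}
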
Publication date: 21-05-2019

System, apparatus and method for dynamic profiling in a processor

Number: US0010296464B2
Assignee: Intel Corporation

In one embodiment, an apparatus includes: a storage having a plurality of entries each to store address information of an instruction and a count value of a number of executions of the instruction during execution of code including the instruction; and at least one comparator circuit to compare a count value from one of the plurality of entries to a threshold value, where the instruction is a tagged instruction of the code, the tagged instruction tagged by a static compiler prior to execution of the code. Other embodiments are described and claimed.

Publication date: 04-01-2022

Hardware accelerator automatic detection of software process migration

Number: US0011216377B2
Assignee: NXP USA, Inc.

A mechanism is provided by which a hardware accelerator detects migration of a software process among processors and uses this information to write operation results to an appropriate cache memory for faster access by the current processor. This mechanism is provided, in part, by incorporation within the hardware accelerator of a mapping table having entries including a cache memory identifier associated with a processor identifier. The hardware accelerator further includes circuitry configured to receive a processor identifier from a calling processor, and to perform a look-up in the mapping table to determine the cache memory identifier associated with the processor identifier. The hardware accelerator uses the associated cache memory identifier to write results of called operations to the cache memory associated with the calling processor, thereby accelerating subsequent operations by the calling processor that rely upon the hardware accelerator results.

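A minimal sketch of the mapping-table lookup; the table size and names are assumptions.

#include <stdio.h>

enum { MAX_CPUS = 8 };

/* The accelerator keeps a table associating each processor identifier
 * with the identifier of that processor's cache; results of a called
 * operation are written to the caller's cache. */
struct mapping_table { int cache_id[MAX_CPUS]; };

int cache_for_cpu(const struct mapping_table *mt, int cpu_id)
{
    return mt->cache_id[cpu_id];   /* look-up keyed by processor id */
}

int main(void)
{
    struct mapping_table mt = { .cache_id = { 0, 0, 1, 1, 2, 2, 3, 3 } };
    /* a call arriving from processor 5 is steered to cache 2 */
    printf("write results to cache %d\n", cache_for_cpu(&mt, 5));
    return 0;
}
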
Publication date: 18-09-2018

Using leases for entries in a translation lookaside buffer

Number: US0010078588B2

The described embodiments include a computing device with two or more translation lookaside buffers (TLBs) that performs operations for handling entries in the TLBs. During operation, the computing device maintains lease values for entries in the TLBs, the lease values representing times until leases for the entries expire, wherein a given entry in the TLB is invalid when the associated lease has expired. The computing device uses the lease value to control operations that are allowed to be performed using information from the entries in the TLBs. In addition, the computing device maintains, in a page table, longest lease values for page table entries indicating when corresponding longest leases for entries in TLBs expire. The longest lease values are used to determine when and if a TLB shootdown is to be performed.

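A sketch of the two lease checks under assumed structures: an entry is usable only while its lease is unexpired, and a shootdown is needed only while the page table's longest lease still lies in the future.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct tlb_entry { uint64_t lease_expires; /* plus tag and translation */ };

bool tlb_entry_usable(const struct tlb_entry *e, uint64_t now)
{
    return now < e->lease_expires;   /* expired lease => invalid entry */
}

bool shootdown_needed(uint64_t longest_lease, uint64_t now)
{
    return now < longest_lease;   /* some TLB may still hold the entry */
}

int main(void)
{
    struct tlb_entry e = { .lease_expires = 1000 };
    printf("usable at t=900: %d\n", tlb_entry_usable(&e, 900));
    printf("shootdown at t=1200 (longest lease 1000): %d\n",
           shootdown_needed(1000, 1200));
    return 0;
}
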
Publication date: 04-10-2018

EFFICIENT MULTI-CONTEXT THREAD DISTRIBUTION

Number: US20180285110A1
Assignee: Intel Corporation

Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.

Publication date: 07-01-2020

Increased bandwidth of ordered stores in a non-uniform memory subsystem

Number: US0010528253B2

A method, computer program product, and system for maintaining a proper ordering of a data stream that includes two or more sequentially ordered stores, the data stream being moved to a destination memory device, the two or more sequentially ordered stores including at least a first store and a second store, wherein the first store is rejected by the destination memory device. A computer-implemented method includes sending the first store to the destination memory device. A conditional request is sent to the destination memory device for approval to send the second store to the destination memory device, the conditional request dependent upon successful completion of the first store. The second store is cancelled responsive to receiving a reject response corresponding to the first store.

Publication date: 07-03-2019

DEFERRED RESPONSE TO A PREFETCH REQUEST

Number: US2019073309A1

Modifying prefetch request processing. A prefetch request is received by a local computer from a remote computer. The local computer responds to a determination that execution of the prefetch request is predicted to cause an address conflict during an execution of a transaction of the local processor by comparing a priority of the prefetch request with a priority of the transaction. Based on a result of the comparison, the local computer modifies program instructions that govern execution of the program instructions included in the prefetch request to include program instructions to perform one or both of: (i) a quiesce of the prefetch request prior to execution of the prefetch request, and (ii) a delay in execution of the prefetch request for a predetermined delay period.

Publication date: 07-03-2019

CONTROL DEVICE, METHOD AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM

Number: US2019073147A1

A control device includes a nonvolatile memory, a first processor, a first volatile memory coupled to the first processor, a second processor, and a second volatile memory coupled to the second processor, wherein the first processor is configured to transmit first data stored in the first volatile memory to the second processor by using electric power supplied from a backup power supply, the second processor is configured to store the first data in the second volatile memory, after storing the first data in the second volatile memory, the backup power supply stops supplying the electric power to at least one of the first volatile memory and the first processor, and the second processor is configured to store, in the nonvolatile memory, the first data stored in the second volatile memory.

Publication date: 01-03-2018

METHOD AND SYSTEMS FOR MASTER ESTABLISHMENT USING SERVICE-BASED STATISTICS

Number: US20180060236A1

A method and apparatus are described for assigning mastership of nodes to data blocks. A method involves connecting each session of a plurality of sessions to a particular node of a cluster of nodes based on services associated with the plurality of sessions. Each session of the plurality of sessions is associated with a respective service of a plurality of services. The method also involves collecting service-based access statistics aggregated by service and ranges of data block addresses. Each range corresponds to one or more contiguous subranges of data block addresses. The method further involves assigning mastership of the nodes to the data blocks having addresses within said ranges of data block addresses based on services associated with the nodes and the service-based access statistics.

Publication date: 13-02-2018

Method and apparatus for improving non-uniform memory access

Number: US0009892030B2

A method, computer readable medium and apparatus for improving non-uniform memory access are disclosed. For example, the method divides a plurality of stream processing jobs into a plurality of groups of stream processing jobs to match a topology of a non-uniform memory access platform. The method sets a parameter in an operating system kernel of the non-uniform memory access platform to favor an allocation of a local memory, and defines a plurality of processor sets. The method binds one of the plurality of groups to one of the plurality of processor sets, and runs the one group of stream processing jobs on the one processor set.

Publication date: 03-07-2018

Transactional execution processor having a co-processor accelerator, both sharing a higher level cache

Number: US0010013351B2

A higher level shared cache of a hierarchical cache of a multi-processor system utilizes transaction identifiers to manage memory conflicts in corresponding transactions. The higher level cache is shared with two or more processors. A processor may have a corresponding accelerator that performs operations on behalf of the processor. Transaction indicators are set in the higher level cache corresponding to the cache lines being accessed. The transaction aborts if a memory conflict with the transaction's cache lines from another transaction is detected, and the corresponding cache lines are invalidated. For a successfully completing transaction, the corresponding cache lines are committed and the data from store operations is stored.

Publication date: 03-01-2023

Semiconductor device, control system, and control method of semiconductor device

Number: US0011544192B2
Assignee: RENESAS ELECTRONICS CORPORATION

A semiconductor device includes first and second CPUs, first and second SPUs for controlling a snoop operation, a controller supporting ASIL D of a functional safety standard, and a memory. The controller sets permission of the snoop operation to the first and second SPUs when a software lock-step is not performed. The controller sets prohibition of the snoop operation to the first and second SPUs when the software lock-step is performed. The first CPU executes a first software for the software lock-step, and writes an execution result in a first area of the memory. The second CPU executes a second software for the software lock-step, and writes an execution result in a second area of the memory. The execution result written in the first area is compared with the execution result written in the second area.

Publication date: 07-04-2022

CACHE PROBE TRANSACTION FILTERING

Number: US20220107897A1

Examples described herein relate to circuitry to selectively disable cache snoop operations issued by a particular processor or its cache manager based on data in a memory address range, to be accessed by the particular processor, having been flushed from one or more other cache devices accessible to other processors. At or after completion of flushing or scrubbing data in the memory address range to memory, the particular processor or its cache manager does not issue snoop operations for accesses to the memory address range. In response to an access by some other device to the memory address range, the processor or cache manager may resume issuing snoop operations.

Publication date: 16-07-2014

Partitioning a shared cache using masks associated with threads to avoid thrashing

Number: GB0002509755A

The invention relates to fill partitioning of a shared cache. In one embodiment, all threads running on a processor are able to access any data stored in the shared cache; however, in the event of a cache miss, a thread may be restricted such that it can only store data in a portion of the shared cache. The restrictions may be implemented for all cache miss events or for only a subset of those events. For example, the restrictions may be implemented only when the shared cache is full and/or only for particular threads. The restrictions may also be applied dynamically, for example, based on conditions associated with the cache. Different portions may be defined for different threads (e.g. in a multi-threaded processor) and these different portions may, for example, be separate and non-overlapping. Fill partitioning may be applied to any on-chip cache, for example, an L1 cache.

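A sketch of the fill restriction with hypothetical names: lookups may hit in any way, but a miss may only allocate into the ways enabled in the thread's fill mask while the restriction (here, a full cache) is active.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum { NUM_WAYS = 8 };

/* Returns the set of ways a miss by this thread may fill. */
uint8_t fill_ways(uint8_t thread_fill_mask, bool cache_full)
{
    return cache_full ? thread_fill_mask
                      : (uint8_t)((1u << NUM_WAYS) - 1);  /* unrestricted */
}

int main(void)
{
    /* this thread restricted to ways 0-3 once the cache is full */
    printf("victim ways: 0x%02x\n", fill_ways(0x0F, true));
    printf("victim ways: 0x%02x\n", fill_ways(0x0F, false));
    return 0;
}
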
Publication date: 26-09-2012

Scheduling graphics processor tasks in parallel with corresponding cache coherency operations based on data dependencies

Number: GB0002489278A

A data processing system has several processors of different types, such as a central processing unit (CPU) and a graphics processing unit (GPU). Each processor has a cache for data from the main memory. Cache consistency operations are used to ensure that data in the cache of one processor can be correctly accessed by other processors. Tasks T1–T7, without data dependencies, are scheduled for one of the processors. All the consistency operations C1–C7 for a task are executed before a task starts. The consistency operations for one task are performed while another task is being executed. The data dependencies of the tasks may be re-evaluated after the execution of a task. The tasks may be split into sub-tasks without data dependencies. The first task may be selected to minimise the initial latency due to its consistency operations.

Publication date: 24-05-1995

Cache affinity scheduler

Number: GB0002284081A

A computing system 50 includes N symmetrical computing engines having N cache memories joined by a system bus 12. The computing system includes a global run queue (54), an FPA global run queue, and N affinity run queues (58). Each engine is associated with one affinity run queue, which includes multiple slots. When a process first becomes runnable, it is typically attached to one of the global run queues. A scheduler allocates engines to processes and schedules the processes to run on the basis of priority and engine availability. The system keeps track of the number of processes queued to each processor and can transfer a process from one processor to another with a shorter queue. ...

Publication date: 03-08-2005

A simultaneous multi-threading processor accessing a cache in different power modes according to a number of threads

Number: GB0002410584A

A cache memory associated with a Simultaneous Multi-Threading (SMT) processor, which comprises a tag memory and a data memory. The tag and data memories are accessed in two modes: concurrently, with each accessed at the same time, or sequentially, with the tag memory being accessed before the data memory. The mode of memory access is chosen according to the number of threads running on the processor, allowing the processor to operate in a high-power or low-power mode, thus scaling power consumption according to activity.

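A sketch of the mode selection; the enum and the single-thread threshold are assumptions reflecting the low-power sequential case and the high-power concurrent case.

#include <stdio.h>

/* With one thread, tags are read before data so only the matching way is
 * accessed (low power); with several threads both arrays are accessed
 * concurrently to minimise latency (high power). */
enum access_mode { TAG_THEN_DATA, TAG_AND_DATA_CONCURRENT };

enum access_mode choose_access_mode(int running_threads)
{
    return running_threads > 1 ? TAG_AND_DATA_CONCURRENT : TAG_THEN_DATA;
}

int main(void)
{
    printf("1 thread  -> mode %d (sequential, low power)\n",
           choose_access_mode(1));
    printf("2 threads -> mode %d (concurrent, high power)\n",
           choose_access_mode(2));
    return 0;
}
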
Publication date: 01-06-2005

Caching data

Number: GB0002379294B
Author: DANAN ITAI
Assignee: DISCREET LOGIC INC

Publication date: 11-05-2011

Improving the scheduling of tasks to be performed by a non-coherent device

Number: GB0201104958D0

Publication date: 30-01-2019

Cache memory access

Number: GB0002564994A

A multiprocessor data processing system includes multiple vertical cache hierarchies supporting a plurality of processor cores, a system memory, and a system interconnect. In response to a load-and-reserve request from a first processor core, a first cache memory supporting the first processor core issues on the system interconnect a memory access request for a target cache line of the load-and-reserve request. Responsive to the memory access request and prior to receiving a system-wide coherence response for the memory access request, the first cache memory receives from a second cache memory in a second vertical cache hierarchy, by cache-to-cache intervention, the target cache line and an early indication of the system-wide coherence response for the memory access request. In response to the early indication and prior to receiving the system-wide coherence response, the first cache memory initiates processing to update the target cache line in the first cache memory.

Publication date: 14-09-2022

Providing direct data access between accelerators and storage in computing environment

Number: GB0002604785A

A method for providing direct access to non-volatile memory in a computing environment by a processor comprises providing one or more accelerators, via an application programming interface ("API"), direct access to non-volatile storage independent of a host central processing unit ("CPU") on a control path or data path to perform a read operation and write operation of data.

Publication date: 04-06-2001

Buffer memories, methods and systems for buffering having separate buffer memories for each of a plurality of tasks

Number: AU0007728300A
Author: DENT PAUL W

Publication date: 27-06-2020

INSTRUCTION CACHE IN A MULTI-THREADED PROCESSOR

Number: CA0003040901A1
Assignee: BERESKIN & PARR LLP/S.E.N.C.R.L.,S.R.L.

A processor comprising: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.

Publication date: 27-02-2013

Method and apparatus for memory utilization

Number: CN101196833B

Publication date: 11-08-2023

Network-on-chip system and control method thereof

Number: CN116578523A
Author: ZHU HAIJIE

The embodiment of the invention discloses a network-on-chip system and a control method thereof. The network-on-chip system comprises a first network layer and a second network layer, wherein the first network layer comprises a routing node array, a processing node array and a cache consistency node array, and each routing node in the routing node array is respectively connected with one corresponding processing node in the processing node array and one corresponding cache consistency node in the cache consistency node array; the routing node is used for forwarding the communication transaction request of the processing node to the cache consistency node or the cache consistency nodes corresponding to other routing nodes; the second network layer is connected with the first network layer through a bonding layer and comprises a cache node array, and cache nodes in the cache node array are connected with one cache consistency node in the cache consistency node array through bonding contacts ...

Publication date: 14-05-2021

CONFIDENTIAL COMPUTING MECHANISM

Number: WO2021091744A1

According to a first aspect, execution logic is configured to perform a linear capability transfer operation which transfers a physical capability from a partition of a first software module to a partition of a second software module without retaining it in the partition of the first. According to a second, alternative or additional aspect, the execution logic is configured to perform a sharding operation whereby a physical capability is divided into at least two instances, which may later be combined.

Publication date: 30-03-2017

METHOD AND APPARATUS FOR CACHE LINE DEDUPLICATION VIA DATA MATCHING

Number: WO2017053109A1

A cache fill line is received, including an index, a thread identifier, and cache fill line data. The cache is probed, using the index and a different thread identifier, for a potential duplicate cache line. The potential duplicate cache line includes cache line data and the different thread identifier. Upon the cache fill line data matching the cache line data, duplication is identified. The potential duplicate cache line is set as a shared resident cache line, and the thread share permission tag is set to a permission state.

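A sketch of the duplicate check under assumed structures: a fill for one thread probes the resident line stored under a different thread identifier and, on a data match, marks it shared instead of filling a duplicate.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct cache_line {
    unsigned thread_id;
    unsigned char data[64];
    bool shared;               /* thread share permission tag */
};

/* Returns true when the resident line can serve the fill as a shared
 * resident line, so the duplicate fill is dropped. */
bool try_dedup(struct cache_line *resident,
               unsigned fill_thread, const unsigned char *fill_data)
{
    if (resident->thread_id != fill_thread &&
        memcmp(resident->data, fill_data, sizeof resident->data) == 0) {
        resident->shared = true;   /* set share permission, skip the fill */
        return true;
    }
    return false;
}

int main(void)
{
    struct cache_line resident = { .thread_id = 0, .data = { 1, 2, 3 } };
    unsigned char fill[64] = { 1, 2, 3 };
    bool dedup = try_dedup(&resident, 1, fill);
    printf("dedup=%d shared=%d\n", dedup, resident.shared);
    return 0;
}
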
Publication date: 12-05-2020

Image processing device, image processing method, and non-transitory computer readable medium for image processing

Number: US0010650481B2

An image processing device executes image processing by each object of an object group in which plural objects are connected to each other in a directed acyclic graph form. The image processing device includes: a division unit that divides image data as an image processing target into division image data having a first size; a subdivision unit that subdivides the division image data into subdivision image data having a second size smaller than the first size for each partial processing which is image processing to be performed on the division image data, the division image data corresponding to the partial processing which is determined as executable processing based on a pre-and-post dependency relationship; and a control unit that performs control for causing plural computation devices to execute subdivision partial processing which is image processing to be performed on the subdivision image data, in parallel.

Publication date: 29-12-2015

Data cache method, device, and system in a multi-node system

Number: US0009223712B2

A data cache method, device, and system in a multi-node system are provided. The method includes: dividing a cache area of a cache medium into multiple sub-areas, where each sub-area corresponds to a node in the system; dividing each of the sub-areas into a thread cache area and a global cache area; when a process reads a file, detecting a read frequency of the file; when the read frequency of the file is greater than a first threshold and the size of the file does not exceed a second threshold, caching the file in the thread cache area; or when the read frequency of the file is greater than the first threshold and the size of the file exceeds the second threshold, caching the file in the global cache area. Thus overheads of remote access of a system are reduced, and I/O performance of the system is improved.

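A sketch of the two-threshold placement rule; names and the example thresholds are assumptions.

#include <stdio.h>

/* Cold files are not cached; hot files go to the thread cache area when
 * small and to the global cache area when large. */
enum placement { DONT_CACHE, THREAD_AREA, GLOBAL_AREA };

enum placement place_file(unsigned read_freq, unsigned long long size,
                          unsigned freq_thresh, unsigned long long size_thresh)
{
    if (read_freq <= freq_thresh)
        return DONT_CACHE;                  /* not read often enough */
    return size <= size_thresh ? THREAD_AREA : GLOBAL_AREA;
}

int main(void)
{
    /* hot 4 KiB file vs hot 1 GiB file, thresholds 10 reads / 64 MiB */
    printf("%d %d\n",
           place_file(50, 4096ULL, 10, 64ULL << 20),
           place_file(50, 1ULL << 30, 10, 64ULL << 20));
    return 0;
}
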
Publication date: 03-11-2020

Balanced, opportunistic multicore I/O scheduling from non-SMP applications

Number: US0010826848B2
Assignee: NETAPP, INC.

A system for dynamically configuring and scheduling input/output (I/O) workloads among processing cores is disclosed. Resources for an application that are related to each other and/or not multicore safe are grouped together into work nodes. When these need to be executed, the work nodes are added to a global queue that is accessible by all of the processing cores. Any processing core that becomes available can pull and process the next available work node through to completion, so that the work associated with that work node software object is all completed by the same core, without requiring additional protections for resources that are not multicore safe. Indexes track the location of both the next work node in the global queue for processing and the next location in the global queue for new work nodes to be added for subsequent processing.

Publication date: 21-11-2017

Per thread cacheline allocation mechanism in shared partitioned caches in multi-threaded processors

Number: US0009824013B2

Systems and methods for allocation of cache lines in a shared partitioned cache of a multi-threaded processor. A memory management unit is configured to determine attributes associated with an address for a cache entry associated with a processing thread to be allocated in the cache. A configuration register is configured to store cache allocation information based on the determined attributes. A partitioning register is configured to store partitioning information for partitioning the cache into two or more portions. The cache entry is allocated into one of the portions of the cache based on the configuration register and the partitioning register.

Publication date: 16-04-2020

EFFICIENT MULTI-CONTEXT THREAD DISTRIBUTION

Number: US20200117455A1
Assignee: Intel Corporation

Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.

Publication date: 23-11-2023

COMPILE TIME LOGIC FOR DETECTING AND RESOLVING MEMORY LAYOUT CONFLICTS

Number: US20230376292A1
Assignee: SambaNova Systems, Inc.

The technology disclosed relates to automatically assigning and optimizing the physical memory layouts of all intermediate dense tensor data in a program. The technology disclosed is an implementation of a compiler analysis and transformation pass which automatically determines required physical layouts in light of kernel operation and performance requirements. The proposed solution also inserts physical layout conversion operations where necessary in cases of unresolvable incompatibilities. The pass takes as input a program acyclic dataflow graph and a set of physical layout constraints for every known operation.

Publication date: 25-07-2018

DISTRIBUTED CACHE LIVE MIGRATION

Number: EP3350713A1

Publication date: 23-11-2000

Method for increasing the useful performance of multiprocessor systems

Number: DE0019833221C2

The principle of "natural affinity" is supplemented by a further control mechanism: program sections of different processes that require data from the same memory regions are uniformly marked and, on the basis of this marking, are assigned to the same processor (CPU) when their accesses compete in time. The affinity between different processes created in this way further reduces the number of memory accesses to the main memory (MM) and to the caches (SIC) of the other processors (CPU), and thus increases the useful performance of the multiprocessor system. If no affine process can be assigned to an idle processor (CPU), a non-affine process is assigned instead in order to avoid idle losses, and an indicator is set which leads to an interruption as soon as an affine process becomes available.

Publication date: 26-02-2015

IMPROVED USE OF MEMORY RESOURCES

Number: DE102014012155A1

Methods for increasing the efficiency of memory resources in a processor are described. In one embodiment, instead of including a dedicated DSP indirect-register resource for storing data associated with DSP instructions, this data is stored in an allocated and locked region of the cache. The state of all cache lines used for storing DSP data is then set so as to prevent the data from being written out to memory. The size of the allocated region in the cache can vary according to the amount of DSP data to be stored, and when no DSP instructions are running, no cache resources are allocated for storing the DSP data.

Publication date: 13-10-2010

Keeping a cache consistent with a main memory in a system with simultaneous multithreading

Number: GB0002469299A

A multithreaded system has a processor core, a cache and a main memory. Between the cache and the main memory are an incoherency detection module and a memory arbiter. The incoherency detection module has a memory which stores the addresses of write requests. When the incoherency detection module receives a read request, it compares the address with the write addresses for other tasks stored in the memory. If there is a match, it sends a barrier request to the memory arbiter for each of the tasks with a matching write. It then annotates the read request with sideband data to indicate which tasks had matching writes. The memory arbiter does not send the read request to main memory until all the requests for the tasks indicated in the sideband data which are ahead of the barrier requests have been processed.

Publication date: 04-10-1995

Computing system

Number: GB0002284081B

Publication date: 24-12-2014

Speculative load issue

Number: GB0002501582B
Author: JACKSON HUGH

Publication date: 28-08-2013

Virtual Machine Backup

Number: GB0201312422D0

Publication date: 15-09-2007

INTEGRATED CIRCUIT WITH DYNAMIC MEMORY DISPATCHING

Number: AT0000372548T

Publication date: 25-01-2019

Method and apparatus for distributing load instructions in a program to the data cache

Number: CN0105808211B

Publication date: 07-08-2018

Microprocessor and method for managing its performance and power consumption

Number: CN0104572500B

Publication date: 08-06-2018

System and method for split-index cache addressing

Number: CN0108139975A

Publication date: 16-05-2023

Caching method, caching architecture, heterogeneous architecture and electronic equipment

Number: CN116126747A

The invention provides a caching method, a caching architecture, a heterogeneous architecture and electronic equipment, applied to the technical field of computers and chips. The caching architecture comprises a cache read processing module, a cache write processing module, a cold item detection module, a memory read processing module and a memory write processing module. The cache read processing module, the cache write processing module and the cold item detection module are realized on the coprocessor side, and the memory read processing module and the memory write processing module are realized on the general processor side. A novel cache architecture is arranged in a heterogeneous architecture, so that part of the functions of the coprocessor are moved up to the general processor, the overall process of data caching is smooth and highly efficient, the influence of bottlenecks such as area, power consumption and electric leakage on the coprocessor is reduced, the cost ...

Publication date: 06-07-2017

MEMORY NODE WITH CACHE FOR EMULATED SHARED MEMORY COMPUTERS

Number: WO2017115007A1
Author: FORSELL, Martti

Data memory node (400) for ESM (Emulated Shared Memory) architectures (100, 200), comprising a data memory module (402) containing data memory for storing input data therein and retrieving stored data therefrom responsive to predetermined control signals, a multi-port cache (404) for the data memory, said cache being provided with at least one read port (404A, 404B) and at least one write port (404C, 404D, 404E), said cache (404) being configured to hold recently and/or frequently used data stored in the data memory (402), and an active memory unit (406) at least functionally connected to a plurality of processors via an interconnection network (108), said active memory unit (406) being configured to operate the cache (404) upon receiving a multioperation reference (410) incorporating a memory reference to the data memory of the data memory module from a number of processors of said plurality, wherein responsive to the receipt of the multioperation reference the active memory unit (406) ...

Publication date: 04-02-2021

CACHE USAGE MEASURE CALCULATION DEVICE, CACHE USAGE MEASURE CALCULATION METHOD, AND CACHE USAGE MEASURE CALCULATION PROGRAM

Number: WO2021019674A1

A cache usage measure calculation device (1) is provided with: a memory from which data is read and to which data is written; a cache which can be accessed at a higher speed than the memory; a central processing unit which performs processing by reading from and writing to the memory and the cache; a usage status measurement unit which measures the status of the usage of the cache by applications (11a, 11b) executed by the central processing unit; a performance measurement unit which measures the cache sensitivity and/or the cache pollution level with respect to the applications (11a, 11b); and a measure calculation unit which calculates a measure of the cache sensitivity and/or the cache pollution level for each of a plurality of pre-selected applications from performance degradation of the pre-selected applications and the usage status of the cache.

Publication date: 19-07-2018

PARTITIONING TLB OR CACHE ALLOCATION

Number: WO2018130802A1

A request for data from a cache (TLB or data/instruction cache) specifies a partition identifier allocated to a software execution environment associated with the request. Allocation of data to the cache is controlled based on a set of configuration information selected based on the partition identifier specified by the request. For a TLB, this allows different allocation policies to be used for requests associated with different software execution environments. In one example, the cache allocation is controlled based on an allocation threshold specified by the selected set of configuration information, which limits the maximum number of cache entries allowed to be allocated with data associated with the corresponding partition identifier.

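A minimal sketch of the threshold check, assuming a configuration record selected by partition identifier; the names are assumptions.

#include <stdbool.h>
#include <stdio.h>

/* Each partition identifier selects a set of configuration information
 * whose allocation threshold caps the number of cache (or TLB) entries
 * the corresponding software execution environment may occupy. */
struct partition_config { unsigned alloc_threshold; };

bool may_allocate(const struct partition_config *cfg, unsigned entries_in_use)
{
    return entries_in_use < cfg->alloc_threshold;
}

int main(void)
{
    struct partition_config cfg = { .alloc_threshold = 256 };
    printf("allocate? %d\n", may_allocate(&cfg, 100));   /* below the cap */
    printf("allocate? %d\n", may_allocate(&cfg, 256));   /* at the cap: no */
    return 0;
}
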
Publication date: 31-01-2019

LOCK ADDRESS CONTENTION PREDICTOR

Number: WO2018057293A3

Techniques for selectively executing a lock instruction speculatively or non-speculatively based on lock address prediction and/or temporal lock prediction, including methods and devices for locking an entry in a memory device. In some techniques, a lock instruction executed by a thread for a particular memory entry of a memory device is detected. Whether contention occurred for the particular memory entry during an earlier speculative lock is detected on a condition that the lock instruction comprises a speculative lock instruction. The lock is executed non-speculatively if contention occurred for the particular memory entry during an earlier speculative lock. The lock is executed speculatively if contention did not occur for the particular memory entry during an earlier speculative lock.

Publication date: 07-04-2020

Tracking modifications to a virtual machine image that occur during backup of the virtual machine

Number: US0010613940B2

A computer system comprises a processor unit arranged to run a hypervisor running one or more virtual machines; a cache connected to the processor unit and comprising a plurality of cache rows, each cache row comprising a memory address, a cache line and an image modification flag; and a memory connected to the cache and arranged to store an image of at least one virtual machine. The processor unit is arranged to define a log in the memory and the cache further comprises a cache controller arranged to set the image modification flag for a cache line modified by a virtual machine being backed up, but not for a cache line modified by the hypervisor operating in privilege mode; periodically check the image modification flags; and write only the memory address of the flagged cache rows in the defined log.

Publication date: 23-03-2021

Temporarily suppressing processing of a restrained storage operand request

Number: US0010956337B2

Processing of a storage operand request identified as restrained is selectively, temporarily suppressed. The processing includes determining whether a storage operand request to a common storage location shared by multiple processing units of a computing environment is restrained, and based on determining that the storage operand request is restrained, then temporarily suppressing requesting access to the common storage location pursuant to the storage operand request. The processing unit performing the processing may proceed with processing of the restrained storage operand request, without performing the suppressing, where the processing can be accomplished using cache private to the processing unit. Otherwise the suppressing may continue until an instruction, or operation of an instruction, associated with the storage operand request is next to complete.

Publication date: 01-02-2018

MULTIPLE CHANNEL CACHE MEMORY AND SYSTEM MEMORY DEVICE UTILIZING A PSEUDO-MULTIPLE PORT FOR COMMANDS AND ADDRESSES AND A MULTIPLE FREQUENCY BAND QAM SERIALIZER/DESERIALIZER FOR DATA

Number: US20180032436A1
Author: Sheau-Jiung Lee

A high performance, low power, and cost effective multiple channel cache-system memory system is disclosed.

1. A computing device comprising: a first chip comprising one or more CPU cores, a memory controller coupled to the one or more CPU cores, and a first serializer-deserializer device; a second chip comprising cache memory managed by the memory controller, a data router, and a second serializer-deserializer device; system memory separate from the first chip and the second chip and managed by the memory controller; and an analog interface coupled to the first chip and the second chip, wherein the first serializer-deserializer device and the second serializer-deserializer device exchange data over the interface using quadrature amplitude modulation; wherein a memory request from the one or more CPU cores is serviced by the memory controller and the data router by providing data to the one or more CPU cores from the cache memory or the system memory.
2. The computing device of claim 1, wherein the memory controller is coupled to the one or more CPU cores with a processor bus.
3. The computing device of claim 2, further comprising a system bus coupled to the memory controller.
4. The computing device of claim 3, wherein the system bus is coupled to one or more graphics processor unit (GPU) cores.
5. The computing device of claim 4, wherein the memory controller comprises an arbiter for managing control of the analog interface.
6. A computing device comprising: a first chip comprising one or more CPU cores, a memory controller coupled to the one or more CPU cores, and a first serializer-deserializer device; a second chip comprising cache memory managed by the memory controller, a data router, and a second serializer-deserializer device; system memory separate from the first chip and the second chip and managed by the memory controller; an analog interface coupled to the first chip and the second chip, wherein the first serializer-deserializer device and the second serializer-...

Publication date: 22-05-2018

Parallel computing apparatus, compiling apparatus, and parallel processing method for enabling access to data in stack area of thread by another thread

Number: US0009977759B2
Assignee: FUJITSU LIMITED

A parallel computing apparatus includes a first processor that executes a first thread, a second processor that executes a second thread, and a memory. The memory includes a first private area that corresponds to the first thread, a second private area that corresponds to the second thread, and a shared area. The first processor stores first data in the first private area and stores address information that enables access to the first data in the shared area. The second processor stores second data in the second private area, accesses the first data based on the address information, and generates third data based on the first and second data.

Publication date: 26-06-2018

System and method for cache replacement using conservative set dueling

Number: US0010007620B2
Assignee: Intel Corporation

A processor includes a set associative cache and a cache controller. The cache controller makes an initial association between first and second groups of sampled sets in the cache and first and second cache replacement policies. Follower sets in the cache are initially associated with the more conservative of the two policies. Following cache line insertions in a first epoch, the associations between the groups of sampled sets and cache replacement policies are swapped for the next epoch. If the less conservative policy outperforms the more conservative policy during two consecutive epochs, the follower sets are associated with the less conservative policy for the next epoch. Subsequently, if the more conservative policy outperforms the less conservative policy during any epoch, the follower sets are again associated with the more conservative policy. Performance may be measured based on the number of cache misses associated with each policy.

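A simplified model of the conservative set-dueling decision at the end of each epoch; the structure and the per-policy miss counters are assumptions.

#include <stdbool.h>
#include <stdio.h>

/* Two groups of sampled sets run the two policies and swap roles each
 * epoch; follower sets adopt the less conservative policy only after it
 * wins two consecutive epochs, and fall back as soon as it loses one. */
struct dueling {
    bool groups_swapped;        /* which sampled group runs which policy */
    int  less_conservative_wins;
    bool followers_less_conservative;
};

void end_of_epoch(struct dueling *d,
                  unsigned misses_conservative, unsigned misses_less)
{
    if (misses_less < misses_conservative) {
        if (++d->less_conservative_wins >= 2)
            d->followers_less_conservative = true;   /* two wins in a row */
    } else {
        d->less_conservative_wins = 0;
        d->followers_less_conservative = false;      /* immediate fallback */
    }
    d->groups_swapped = !d->groups_swapped;   /* swap group/policy mapping */
}

int main(void)
{
    struct dueling d = { 0 };
    end_of_epoch(&d, 120, 100);
    end_of_epoch(&d, 130, 100);   /* second consecutive win */
    printf("followers use less conservative policy: %d\n",
           d.followers_less_conservative);
    return 0;
}
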
Publication date: 23-05-2023

Device and method for maintaining summary consistency in caches

Number: US0011656991B2
Author: Aviv Kuvent, Yair Toaff
Assignee: Huawei Technologies Co., Ltd.

An information processing device comprises: a memory comprising a cache for storing information related to an object from a plurality of objects, and a summary structure configured to store a summary for the object; a volume configured to store a merge file including the plurality of objects, and a set of dump-files, each dump-file being associated with a specific cache-dump operation of the cache; and a processor configured to assign, to the cache, a first identifier; perform a cache-dump operation based on generating a dump-file associated with the first identifier and storing the information related to the object from the cache to the generated dump-file; and assign, to the cache, a second identifier, wherein the second identifier is larger than the first identifier.

Publication date: 03-10-2023

Extended tags for speculative and normal executions

Number: US0011775308B2
Assignee: Micron Technology, Inc.

A cache system having cache sets, registers associated with the cache sets respectively, and a logic circuit coupled to a processor to control the cache sets according to the registers. When a connection to an address bus of the system receives a memory address from the processor, the logic circuit can be configured to: generate an extended tag from at least the memory address; and determine whether the generated extended tag matches with a first extended tag for a first cache set or a second extended tag for a second cache set of the system. Also, the logic circuit can also be configured to implement a command received from the processor via the first cache set in response to the generated extended tag matching with the first extended tag and via the second cache set in response to the generated extended tag matching with the second extended tag.

Publication date: 26-09-2023

Scheduling of threads for execution utilizing load balancing of thread groups

Number: US0011768687B2
Assignee: Intel Corporation

An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.

Publication date: 16-05-2024

CACHE OPTIMIZATION MECHANISM

Number: US20240160581A1
Assignee: Intel Corporation

An apparatus includes a central processing unit (CPU) including a plurality of processing cores, each having a cache memory; a fabric interconnect coupled to the plurality of processing cores; and cryptographic circuitry coupled to the fabric interconnect, including a mesh stop station to receive memory data and determine a destination of the memory data, and encryption circuitry to encrypt/decrypt the memory data based on the destination of the memory data.

Publication date: 03-07-2019

MULTI-LEVEL SYSTEM MEMORY CONFIGURATIONS TO OPERATE HIGHER PRIORITY USERS OUT OF A FASTER MEMORY LEVEL

Number: EP3506112A1

A method is described. The method includes recognizing higher priority users of a multi-level system memory characterized by a faster higher level and a slower lower level, in which the higher level is to act as a cache for the lower level and in which a first capacity of the higher level is less than a second capacity of the lower level, such that caching resources of the higher level are oversubscribe-able. The method also includes performing at least one of: declaring an amount of the second capacity unusable to reduce oversubscription of the caching resources; allocating system memory address space of the multi-level system memory so that requests associated with lower priority users will not compete with requests associated with the higher priority users for the caching resources.

Publication date: 15-01-2020

MEMORY RESOURCE OPTIMIZATION METHOD AND APPARATUS

Number: EP3388947B1
Assignee: Huawei Technologies Co., Ltd.

Publication date: 01-03-2018

SYSTEMS AND METHODS FOR ADDRESSING A CACHE WITH SPLIT INDEXES

Number: DE112016002247T5

Cache mapping techniques are presented. A cache may contain an index configuration register. The register may configure the positions of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processes decoding a video frame in parallel.

Publication date: 15-07-2011

CIRCUIT AND METHOD WITH CACHE COHERENCE LOAD CONTROL

Number: AT0000516542T

Publication date: 31-05-2017

Address access method and device

Number: CN0106776366A

Publication date: 07-09-2018

INFORMATION PROCESSING DEVICE, METHOD FOR CONTROL OF INFORMATION PROCESSING DEVICE, AND PROGRAM FOR CONTROL OF INFORMATION PROCESSING DEVICE

Number: WO2018159365A1
Author: MAEDA, Munenori

[Problem] To suppress the occurrence of memory leaks. [Solution] Provided is an information processing device comprising an execution unit, a first storage unit, a second storage unit, a migration processing unit, and an information handoff unit. The execution unit executes threads. The first storage unit stores thread information which may be used in the execution of the threads. The second storage unit stores the thread information which is migrated from the first storage unit. From among the threads which the execution unit executes, after completion of the execution of a thread which is subject to migration for which the thread information is migrated, the migration processing unit migrates the thread information of the thread which is subject to migration from the first storage unit to the second storage unit. In a case where the thread being executed uses the thread information which has been migrated to the second storage unit, the information handoff unit transfers the thread information ...

Publication date: 16-05-2017

System and method for managing a cache pool

Number: US0009652394B2
Assignee: Dell Products L.P.

In one embodiment, a system includes a processor and a memory communicatively coupled to the processor. The processor is configured to receive a write request associated with a cache pool, which comprises a plurality of disks. The write request comprises data associated with the write request. The processor is additionally configured to select a first disk from the plurality of disks using a life parameter associated with the first disk. The processor is further configured to cause the data associated with the write request to be written to the first disk.

Подробнее
22-03-2005 дата публикации

Cache system

Номер: US0006871266B2

A cache system is provided which includes a cache memory and a cache refill mechanism which allocates one or more of a set of cache partitions in the cache memory to an item in dependence on the address of the item in main memory. This is achieved in one of the described embodiments by including with the address of an item a set of partition selector bits which allow a partition mask to be generated to identify into which cache partition the item may be loaded.
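
A short sketch of how partition selector bits could yield a partition mask; the two-bit selector and the one-partition-per-value rule are assumptions, not the patent's exact encoding.

# Sketch: partition selector bits -> partition mask -> allowed cache partitions.
NUM_PARTITIONS = 4

def partition_mask(selector):
    """Selector bits name one partition; each 2-bit selector value enables
    exactly one of four partitions into which the item may be loaded."""
    return 1 << (selector & (NUM_PARTITIONS - 1))

def allowed_partitions(mask):
    return [p for p in range(NUM_PARTITIONS) if mask & (1 << p)]

addr_selector = 0b10   # carried alongside the item's main-memory address
print(allowed_partitions(partition_mask(addr_selector)))   # [2]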

Подробнее
18-09-2018 дата публикации

Multiple-core computer processor for reverse time migration

Номер: US0010078593B2

A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores, wherein at least one of a number of the processor cores, a size of each of the plurality of caches, or a size of each of the plurality of memories is configured for performing a reverse-time-migration (RTM) computation.

Подробнее
14-09-2021 дата публикации

Synchronized access to data in shared memory by protecting the load target address of a fronting load

Номер: US0011119781B2

A data processing system includes multiple processing units all having access to a shared memory. A processing unit of the data processing system includes a processor core including an upper level cache, core reservation logic that records addresses in the shared memory for which the processor core has obtained reservations, and an execution unit that executes memory access instructions including a fronting load instruction. Execution of the fronting load instruction generates a load request that specifies a load target address. The processing unit further includes lower level cache that, responsive to receipt of the load request and based on the load request indicating an address match for the load target address in the core reservation logic, protects the load target address against access by any conflicting memory access request during a protection interval following servicing of the load request.

Подробнее
24-10-2019 дата публикации

STORAGE DEVICE AND OPERATING METHOD THEREOF

Номер: US20190324693A1
Принадлежит:

A memory controller having improved cache program operation performance controls a memory device. The memory controller includes: a command queue for sequentially storing commands to be executed by the memory device; a cache program determiner for determining, when a first command that is a program command stored in the command queue is provided to the memory device, whether a second command to be executed next after the first command is a program command; and a program operation controller for controlling the memory device to perform the program operation according to the first command as either a normal program operation or a cache program operation, depending on whether the second command is a program command.
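
A minimal sketch of the determiner's lookahead, assuming a simple list-backed command queue and hypothetical command tags.

# Sketch: decide between normal and cache program based on the next command.
from collections import deque

def issue_program(queue):
    """When the command at the head is a program command, peek at the next
    queued command: a second program command lets the device overlap data
    loading with programming (cache program); otherwise program normally."""
    first = queue.popleft()
    assert first == "PROGRAM"
    next_is_program = bool(queue) and queue[0] == "PROGRAM"
    return "CACHE_PROGRAM" if next_is_program else "NORMAL_PROGRAM"

q = deque(["PROGRAM", "PROGRAM", "READ"])
print(issue_program(q))   # CACHE_PROGRAM
print(issue_program(q))   # NORMAL_PROGRAM (next command is READ)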

Подробнее
03-10-2017 дата публикации

Techniques for surfacing host-side storage capacity to virtual machines when performing VM suspend or snapshot operations

Номер: US0009778847B2
Принадлежит: VMware, Inc., VMWARE INC

Techniques for surfacing host-side flash storage capacity to a plurality of VMs running on a host system are provided. In one embodiment, the host system creates, for each VM in the plurality of VMs, a flash storage space allocation in a flash storage device that is locally attached to the host system. The host system then causes the flash storage space allocation to be readable and writable by the VM as a virtual flash memory device.

Подробнее
08-12-2016 дата публикации

STORE FORWARDING CACHE

Номер: US20160357679A1
Принадлежит:

A load request is received to retrieve a piece of data from a location in memory, where the load request follows one or more store requests, in a set of instructions, to store data in that location in memory. One or more possible locations in a cache for data corresponding to the location in memory are determined. It is then determined whether at least one of the possible locations contains data to be stored in the location in memory. Data in one such location is loaded; the data in that location comes from the store request, of the one or more store requests, that is closest in the set of instructions to the load request.
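
A minimal sketch of the final selection step, using a plain list of pending stores as a stand-in for the store-forwarding cache; the sequence numbers model position in the set of instructions.

# Sketch: forward data from the youngest store, older than the load,
# that targets the same memory location.
def forward(pending_stores, load_addr, load_seq):
    """pending_stores: list of (seq, addr, data) in program order.
    Return the data of the matching store closest (in the instruction
    sequence) to the load, or None to read from cache/memory instead."""
    hits = [(seq, data) for seq, addr, data in pending_stores
            if addr == load_addr and seq < load_seq]
    return max(hits)[1] if hits else None

stores = [(1, 0x100, b"old"), (3, 0x100, b"new"), (4, 0x200, b"x")]
print(forward(stores, 0x100, load_seq=5))   # b'new' (youngest older store)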

Подробнее
03-01-2002 дата публикации

Cache system for concurrent processes

Номер: US2002002657A1
Автор:
Принадлежит:

A method of operating a cache memory is described in a system in which a processor is capable of executing a plurality of processes, each process including a sequence of instructions. In the method a cache memory is divided into cache partitions, each cache partition having a plurality of addressable storage locations for holding items in the cache memory. A partition indicator is allocated to each process identifying which, if any, of said cache partitions is to be used for holding items for use in the execution of that process. When the processor requests an item from main memory during execution of a current process and that item is not held in the cache memory, the item is fetched from main memory and loaded into one of the plurality of addressable storage locations in the identified cache partition.

Подробнее
04-05-2021 дата публикации

Asymmetric coherency protocol for first and second processing circuitry having different levels of fault protection or fault detection

Номер: US0010997076B2
Принадлежит: ARM Limited, ADVANCED RISC MACH LTD, ARM LIMITED

An apparatus has first processing circuitry and second processing circuity. The second processing circuitry has at least one hardware mechanism providing a greater level of fault protection or fault detection than is provided for the first processing circuitry. Coherency control circuitry controls access to data from at least part of a shared address space by the first and second processing circuitry according to an asymmetric coherency protocol in which a local-only update of data in a local cache of the first processing circuitry is restricted in comparison to a local-only update of data in a local cache of the second processing circuitry.

Подробнее
08-03-2018 дата публикации

MULTITHREADED TRANSACTIONS

Номер: US20180067762A1
Принадлежит:

Embodiments relate to multithreaded transactions. An aspect includes assigning a same transaction identifier (ID) corresponding to the multithreaded transaction to a plurality of threads of the multithreaded transaction, wherein the plurality of threads execute the multithreaded transaction in parallel. Another aspect includes determining one or more memory areas that are owned by the multithreaded transaction. Another aspect includes receiving a memory access request from a requester that is directed to a memory area that is owned by the transaction. Yet another aspect includes based on determining that the requester has a transaction ID that matches the transaction ID of the multithreaded transaction, performing the memory access request without aborting the multithreaded transaction.

Подробнее
17-05-2018 дата публикации

SEQUENTIAL DATA WRITES TO INCREASE INVALID TO MODIFIED PROTOCOL OCCURRENCES IN A COMPUTING SYSTEM

Номер: US20180137053A1
Принадлежит:

An example system on a chip (SoC) includes a cache, a processor, and a predictor circuit. The cache may store data. The processor may be coupled to the cache and store a first data set at a first location in the cache and receive a first request from an application to write a second data set to the cache. The predictor circuit may be coupled to the processor and determine that a second location where the second data set is to be written to in the cache is nonconsecutive to the first location, where the processor is to perform a request-for-ownership (RFO) operation for the second data set and write the second data set to the cache.
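
A minimal sketch of the predictor's consecutive/nonconsecutive test, assuming 64-byte lines and treating the first write, like any jump, as requiring an RFO; the policy details are illustrative.

# Sketch: predict whether a write needs a request-for-ownership (RFO).
LINE = 64

class WritePredictor:
    def __init__(self):
        self.last_line = None

    def needs_rfo(self, addr):
        """Sequential writes that fill consecutive lines can skip the
        ownership read; a jump to a nonconsecutive line triggers an RFO."""
        line = addr // LINE
        consecutive = (self.last_line is not None
                       and line in (self.last_line, self.last_line + 1))
        self.last_line = line
        return not consecutive

p = WritePredictor()
print(p.needs_rfo(0x0000))   # True  (first write: RFO)
print(p.needs_rfo(0x0040))   # False (next line, consecutive)
print(p.needs_rfo(0x1000))   # True  (nonconsecutive: RFO issued)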

Подробнее
03-03-2020 дата публикации

Reducing overhead of managing cache areas

Номер: US0010579529B2

Maintaining multiple cache areas in a storage device having multiple processors includes loading data from a specific portion of non-volatile storage into a local cache slot in response to a specific processor of a first subset of the processors performing a read operation to the specific portion of non-volatile storage, where the local cache slot is accessible to the first subset of the processors and is inaccessible to a second subset of the processors that is different than the first subset of the processors, and includes converting the local cache slot into a global cache slot in response to one of the processors performing a write operation to the specific portion of non-volatile storage, wherein the global cache slot is accessible to the first subset of the processors and to the second subset of the processors. Different ones of the processors may be placed on different directors.

Подробнее
07-08-2018 дата публикации

Control system and method for cache coherency

Номер: US0010044829B2

Control systems and methods for cache coherency are provided. One control method includes the steps of: transmitting, by a first electrical device, a link-connect request to a second electrical device when the first electrical device is coupled to the second electrical device by a cache coherency (CC) interface; establishing a link between the first electrical device and the second electrical device according to the link-connect request via the CC interface; and operating a first operating system of the first electrical device by a second processing unit of the second electrical device after establishing the link.

Подробнее
25-07-2018 дата публикации

INSTRUCTION AND LOGIC FOR MEMORY ACCESS IN A CLUSTERED WIDE-EXECUTION MACHINE

Номер: RU2662394C2
Принадлежит: ИНТЕЛ КОРПОРЕЙШН (US)

The group of inventions relates to the field of information processing logic. The technical result is increased performance. A processor includes a level-two (L2) cache, first and second execution unit clusters, and first and second data cache units (DCUs) communicatively coupled to the respective execution unit clusters and to the L2 cache. Each DCU includes a data cache and logic to receive a memory operation from an execution unit, respond to the memory operation with information from the data cache when the information is available in the data cache, and retrieve the information from the L2 cache when the information is not available in the data cache. The processor further includes logic to keep the data cache of the first DCU equal to the content of the data cache of the second DCU on all clock cycles of processor operation. 3 independent and 17 dependent claims, 31 drawings.

Подробнее
26-09-2017 дата публикации

METHOD AND APPARATUS FOR MEMORY RESOURCE OPTIMIZATION

Номер: RU2631767C1

The invention relates to the field of memory resource usage. The technical result is optimization of memory resource usage. A memory resource optimization method comprises the steps of: acquiring performance data of each program in a working set; categorizing each program according to the performance data of the program and the memory access frequency of the program obtained through statistics collection, the performance data of each program being the variation generated when a preset performance indicator of the program changes depending on the capacity of the allocated last-level cache (LLC) resource; and selecting, based on the categorization of each program in the working set and a preset decision-making policy, a page-coloring-based partitioning policy corresponding to the working set, the page-coloring-based partitioning policy comprising a page-coloring-based ...
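
Page coloring in general can be sketched briefly; the color count, the program-class names, and the color split below are hypothetical, not the patent's decision policy.

# Sketch: page coloring, steering a program's pages to an LLC partition.
# Illustrative parameters: 8 colors taken from low physical-frame bits.
COLORS = 8

def page_color(phys_frame):
    """The color comes from physical-frame bits that also feed the LLC set
    index, so frames of one color occupy one slice of the LLC."""
    return phys_frame % COLORS

def frames_for(program_class, free_frames):
    """Cache-sensitive programs get most colors; programs categorized as
    thrashing are confined to a small partition (split is illustrative)."""
    allowed = range(0, 6) if program_class == "sensitive" else range(6, 8)
    return [f for f in free_frames if page_color(f) in allowed]

free = list(range(64))
print(len(frames_for("sensitive", free)), len(frames_for("thrashing", free)))   # 48 16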

Подробнее
04-02-1999 дата публикации

Performance improvement method using natural affinity for multi-processor cache system

Номер: DE0019833221A1
Принадлежит:

The method involves increasing the performance of a multi-processor system with a common working memory (MM) and processor-specific cache memories (SIC). An interrupted program execution (PL) is continued on the same processor (CPU) on which it was started if that processor becomes free within a predetermined period of time after the interruption. Another one of the processors may take over the interrupted program execution only after the predetermined period of time has elapsed. Program parts of different program executions that require data from the same memory areas of the working memory are marked and, because of this marking, are assigned to the same processor.
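
A minimal sketch of the affinity window, with a hypothetical 5 ms grace period; the dispatch policy is illustrative only.

# Sketch: natural-affinity dispatch: keep an interrupted execution on its
# last CPU if that CPU frees up within a grace period.
import time

GRACE = 0.005   # 5 ms affinity window (assumed)

def pick_cpu(task, cpus, now=None):
    """cpus: dict cpu_id -> free (bool). The task remembers where it last ran
    and when it was interrupted; within the grace period only that CPU,
    whose cache is still warm, may take it."""
    now = time.monotonic() if now is None else now
    last = task["last_cpu"]
    if now - task["interrupted_at"] < GRACE:
        return last if cpus.get(last) else None   # hold for the warm cache
    return next((c for c, free in cpus.items() if free), None)

t = {"last_cpu": 2, "interrupted_at": 0.0}
print(pick_cpu(t, {1: True, 2: False}, now=0.001))   # None: wait for CPU 2
print(pick_cpu(t, {1: True, 2: False}, now=0.010))   # 1: window expired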

Подробнее
07-06-2012 дата публикации

Apparatus, method, and system for instantaneous cache state recovery from speculative abort/commit

Номер: US20120144126A1
Принадлежит: Intel Corp

An apparatus and method are described herein for providing instantaneous, efficient cache state recovery upon an end of speculative execution. Speculatively accessed entries of a cache memory are marked as speculative, which may be on a thread-specific basis. Upon an end of speculation, the speculatively marked entries are transitioned in parallel by a speculative port to their appropriate, thread-specific, non-speculative coherency state; these parallel transitions allow for instantaneous commit or recovery of speculative memory state.

Подробнее
21-06-2012 дата публикации

Direct Access To Cache Memory

Номер: US20120159082A1
Принадлежит: International Business Machines Corp

Methods and apparatuses are disclosed for direct access to cache memory. Embodiments include receiving, by a direct access manager that is coupled to a cache controller for a cache memory, a region scope zero command describing a region scope zero operation to be performed on the cache memory; in response to receiving the region scope zero command, generating a direct memory access region scope zero command, the direct memory access region scope zero command having an operation code and an identification of the physical addresses of the cache memory on which the operation is to be performed; sending the direct memory access region scope zero command to the cache controller for the cache memory; and performing, by the cache controller, the direct memory access region scope zero operation in dependence upon the operation code and the identification of the physical addresses of the cache memory.

Подробнее
12-07-2012 дата публикации

Global instructions for spiral cache management

Номер: US20120179872A1
Автор: Volker Strumpen
Принадлежит: International Business Machines Corp

A method of operation of a pipelined cache memory supports global operations within the cache. The cache may be a spiral cache, with a move-to-front M2F network for moving values from a backing store to a front-most tile coupled to a processor or lower-order level of a memory hierarchy and a spiral push-back network for pushing out modified values to the backing-store. The cache controller manages application of global commands by propagating individual commands to the tiles. The global commands may provide zeroing, flushing and reconciling of the given tiles. Commands for interrupting and resuming interrupted global commands may be implemented, to reduce halting or slowing of processing while other global operations are in process. A line detector within each tile supports reconcile and flush operations, and a line patcher in the controller provides for initializing address ranges with no processor intervention.

Подробнее
12-07-2012 дата публикации

Mechanism to support flexible decoupled transactional memory

Номер: US20120179877A1
Принадлежит: UNIVERSITY OF ROCHESTER

The present invention employs three decoupled hardware mechanisms: read and write signatures, which summarize per-thread access sets; per-thread conflict summary tables, which identify the threads with which conflicts have occurred; and a lazy versioning mechanism, which maintains the speculative updates in the local cache and employs a thread-private buffer (in virtual memory) only in the rare event of an overflow. The conflict summary tables allow lazy conflict management to occur locally, with no global arbitration (they also support eager management). All three mechanisms are kept software-accessible, to enable virtualization and to support transactions of arbitrary length.
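
The signature mechanism resembles a small Bloom filter; a sketch under that reading, with toy hash functions and a 64-bit vector (false positives are possible, false negatives are not).

# Sketch: a per-thread access signature as a small Bloom-style bit vector.
def h(addr, seed):   # two cheap hash functions over the line address
    return ((addr >> 4) * (seed * 2 + 1)) % 64

class Signature:
    def __init__(self):
        self.bits = 0
    def add(self, addr):
        self.bits |= (1 << h(addr, 1)) | (1 << h(addr, 2))
    def might_contain(self, addr):
        need = (1 << h(addr, 1)) | (1 << h(addr, 2))
        return (self.bits & need) == need   # conservative membership test

# Conflict check: another thread's write against this thread's read signature.
reads = Signature()
reads.add(0x1000)
print(reads.might_contain(0x1000))   # True: potential conflict recorded
print(reads.might_contain(0x9990))   # usually False: no conflict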

Подробнее
26-07-2012 дата публикации

Managing Access to a Cache Memory

Номер: US20120191917A1
Принадлежит: International Business Machines Corp

Managing access to a cache memory includes dividing said cache memory into multiple of cache areas, each cache area having multiple entries; and providing at least one separate lock attribute for each cache area such that only a processor thread having possession of the lock attribute corresponding to a particular cache area can update that cache area.
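
A minimal sketch of per-area lock attributes (striped locking); the area count and the hash-based area assignment are illustrative.

# Sketch: one lock attribute per cache area; a thread must hold the area's
# lock to update entries in that area.
import threading

AREAS = 8
locks = [threading.Lock() for _ in range(AREAS)]
cache = [dict() for _ in range(AREAS)]

def area_of(key):
    return hash(key) % AREAS

def update(key, value):
    a = area_of(key)
    with locks[a]:             # only the lock holder may update this area;
        cache[a][key] = value  # threads touching other areas run in parallel

update("alpha", 1)
update("beta", 2)
print(sum(len(c) for c in cache))   # 2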

Подробнее
02-08-2012 дата публикации

Address-based hazard resolution for managing read/write operations in a memory cache

Номер: US20120198178A1
Принадлежит: International Business Machines Corp

One embodiment provides a cached memory system including a memory cache and a plurality of read-claim (RC) machines configured for performing read and write operations dispatched from a processor. According to control logic provided with the cached memory system, a hazard is detected between first and second read or write operations being handled by first and second RC machines. The second RC machine is suspended and a subset of the address bits of the second operation at specific bit positions are recorded. The subset of address bits of the first operation at the specific bit positions are broadcast in response to the first operation being completed. The second operation is then re-requested.

Подробнее
23-08-2012 дата публикации

Cache and a method for replacing entries in the cache

Номер: US20120215985A1
Автор: Douglas B. Hunt
Принадлежит: Advanced Micro Devices Inc

A processor is provided. The processor includes a cache having a plurality of entries, each of the plurality of entries having a tag array and a data array, and a remapper configured to create at least one identifier, each identifier being unique to a process of the processor, and to assign a respective identifier to the tag array for the entries related to a respective process; the remapper is further configured to determine a replacement value for the entries related to each identifier.

Подробнее
13-09-2012 дата публикации

Managing shared memory used by compute nodes

Номер: US20120233409A1
Автор: Jonathan Ross, Jork Loeser
Принадлежит: Microsoft Corp

A technology can be provided for managing shared memory used by a plurality of compute nodes. An example system can include a shared globally addressable memory to enable access to shared data by the plurality of compute nodes. A memory interface can process memory requests sent to the shared globally addressable memory from the plurality of processors. A memory write module can be included for the memory interface to allocate memory locations in the shared globally addressable memory and write read-only data to the globally addressable memory from a writing compute node. In addition, a read module for the memory interface can map read-only data in the globally addressable shared memory as read-only for subsequent accesses by the plurality of compute nodes.

Подробнее
04-10-2012 дата публикации

Method of generating code executable by processor

Номер: US20120254551A1
Принадлежит: WASEDA UNIVERSITY

Provided is a method of generating code with a compiler, including the steps of: analyzing a program executed by a processor; analyzing the data necessary to execute the respective tasks included in the program; determining, based on the results of the analysis, whether a boundary of the data used by the divided tasks is consistent with a management unit of a cache memory; and, in a case where it is determined that the boundary of the data used by the divided tasks is not consistent with the management unit of the cache memory, generating code that provides a non-cacheable area so that data belonging to the management unit containing the boundary is not temporarily stored in the cache memory, and code that stores an arithmetic processing result belonging to the management unit containing the boundary into the non-cacheable area.

Подробнее
06-12-2012 дата публикации

Multiprocessor and image processing system using the same

Номер: US20120311266A1
Автор: Hirokazu Takata
Принадлежит: Renesas Electronics Corp

To provide a multiprocessor capable of easily sharing data and buffering data to be transferred. Each of a plurality of shared local memories is connected to two processors of a plurality of processor units, and the processor units and the shared local memories are connected in a ring. Consequently, it becomes possible to easily share data and buffer data to be transferred.

Подробнее
17-01-2013 дата публикации

Multi-core processor system, memory controller control method, and computer product

Номер: US20130019069A1
Принадлежит: Fujitsu Ltd

A multi-core processor system includes a memory controller that includes multiple ports and shared memory that includes physical address spaces divided among the ports. A CPU acquires from a parallel degree information table, the number of CPUs to which software that is to be executed by the multi-core processor system, is to be assigned. After this acquisition, the CPU determines the CPUs to which the software to be executed is to be assigned and sets for each CPU, physical address spaces corresponding to logical address spaces defined by the software to be executed. After this setting, the CPU notifies an address converter of the addresses and notifies the software to be executed of the start of execution.

Подробнее
24-01-2013 дата публикации

Method and apparatus for adaptive cache frame locking and unlocking

Номер: US20130024620A1
Принадлежит: Agere Systems LLC

Most recently accessed frames are locked in a cache memory. The most recently accessed frames are likely to be accessed by a task again in the near future and may be locked at the beginning of a task switch or interrupt to improve cache performance. The list of most recently used frames is updated as a task executes and may be embodied as a list of frame addresses or a flag associated with each frame. The list of most recently used frames may be separately maintained for each task if multiple tasks may interrupt each other. An adaptive frame unlocking mechanism is also disclosed that automatically unlocks frames that may cause a significant performance degradation for a task. The adaptive frame unlocking mechanism monitors a number of times a task experiences a frame miss and unlocks a given frame if the number of frame misses exceeds a predefined threshold.
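
A minimal sketch combining both ideas, with a hypothetical MRU depth of four frames and a miss threshold of three.

# Sketch: lock most-recently-used frames on a task switch; adaptively unlock
# a frame when it causes too many misses.
from collections import OrderedDict

class FrameLocker:
    def __init__(self, keep=4, miss_limit=3):
        self.mru = OrderedDict()          # frame -> None, most recent last
        self.locked = set()
        self.misses = {}
        self.keep, self.miss_limit = keep, miss_limit

    def touch(self, frame):
        self.mru.pop(frame, None)
        self.mru[frame] = None

    def on_task_switch(self):
        self.locked = set(list(self.mru)[-self.keep:])   # pin the MRU frames

    def on_frame_miss(self, frame):
        """Count misses attributed to a locked frame; unlock it past the limit."""
        self.misses[frame] = self.misses.get(frame, 0) + 1
        if self.misses[frame] > self.miss_limit:
            self.locked.discard(frame)

fl = FrameLocker()
for f in (1, 2, 3, 4, 5):
    fl.touch(f)
fl.on_task_switch()
print(sorted(fl.locked))   # [2, 3, 4, 5]
for _ in range(4):
    fl.on_frame_miss(2)
print(sorted(fl.locked))   # [3, 4, 5] (frame 2 adaptively unlocked)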

Подробнее
04-04-2013 дата публикации

System and method for supporting a tiered cache

Номер: US20130086326A1
Автор: Naresh Revanuru
Принадлежит: Oracle International Corp

A computer-implemented method and system can support a tiered cache, which includes a first cache and a second cache. The first cache operates to receive a request to at least one of update and query the tiered cache; and the second cache operates to perform at least one of an updating operation and a querying operation with respect to the request via at least one of a forward strategy and a listening scheme.

Подробнее
11-04-2013 дата публикации

Early Cache Eviction in a Multi-Flow Network Processor Architecture

Номер: US20130091330A1
Принадлежит: LSI Corporation

Described embodiments provide an input/output interface of a network processor that generates a request to store received packets to a system cache. If an entry associated with the received packet does not exist in the system cache, the system cache determines whether a backpressure indicator of the system cache is set. If the backpressure indicator is set, the received packet is written to the shared memory. If the backpressure indicator is not set, the system cache determines whether to evict data from the system cache in order to store the received packet. If an eviction rate of the system cache has reached a threshold, the system cache sets a backpressure indicator and writes the received packet to the shared memory. If the eviction rate has not reached the threshold, the system cache determines an available entry and writes the received packet to the available entry in the system cache.
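
The admit-or-spill decision can be sketched directly from the abstract; the capacity, the way the eviction rate is measured (evictions per request), and the threshold are illustrative assumptions.

# Sketch: system-cache admission with an eviction-rate backpressure flag.
class SystemCache:
    def __init__(self, capacity=4, rate_threshold=0.3):
        self.entries, self.capacity = {}, capacity
        self.evictions = self.requests = 0
        self.backpressure = False
        self.rate_threshold = rate_threshold

    def store(self, flow, packet, shared_memory):
        self.requests += 1
        if flow in self.entries:                   # existing entry: update
            self.entries[flow] = packet
            return "cache"
        if self.backpressure:                      # spill while backpressured
            shared_memory.append((flow, packet))
            return "shared memory"
        if len(self.entries) < self.capacity:      # free entry available
            self.entries[flow] = packet
            return "cache"
        if self.evictions / self.requests >= self.rate_threshold:
            self.backpressure = True               # evicting too often: spill
            shared_memory.append((flow, packet))
            return "shared memory"
        self.entries.pop(next(iter(self.entries))) # evict oldest, then admit
        self.evictions += 1
        self.entries[flow] = packet
        return "cache"

mem = []
sc = SystemCache(capacity=1)
print([sc.store(f, b"pkt", mem) for f in "abc"])   # cache, cache, shared memory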

Подробнее
18-04-2013 дата публикации

MULTI-CORE PROCESSOR SYSTEM, COMPUTER PRODUCT, AND CONTROL METHOD

Номер: US20130097382A1
Принадлежит: FUJITSU LIMITED

A multi-core processor system includes a first processor that among cores of the multi-core processor, identifies other cores having a cache miss-hit rate lower than that of a given core storing a specific program in a cache, based on a task information volume of each core; a control circuit that migrates the specific program from the cache of the given core to a cache of the identified core; and a second processor that, after the specific program is migrated to the cache of the identified core, sets as a write-inhibit area, an area that is of the cache of the identified core and to which the specific program is stored.

Подробнее
25-04-2013 дата публикации

Optimizing Memory Copy Routine Selection For Message Passing In A Multicore Architecture

Номер: US20130103905A1
Принадлежит:

In one embodiment, the present invention includes a method to obtain topology information regarding a system including at least one multicore processor, provide the topology information to a plurality of parallel processes, generate a topological map based on the topology information, access the topological map to determine a topological relationship between a sender process and a receiver process, and select a given memory copy routine to pass a message from the sender process to the receiver process based at least in part on the topological relationship. Other embodiments are described and claimed.
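
A minimal sketch of the selection logic, with a hypothetical topology map of rank -> (package, shared-cache id) and invented routine names; the size thresholds are illustrative.

# Sketch: pick a memory-copy routine from sender/receiver topology.
def select_copy_routine(topo, sender, receiver, msg_size):
    """topo maps a process rank to (package id, shared-cache id)."""
    s_pkg, s_cache = topo[sender]
    r_pkg, r_cache = topo[receiver]
    if s_cache == r_cache:                 # share a cache: cheapest path
        return "cache_copy" if msg_size < 64 * 1024 else "cache_copy_large"
    if s_pkg == r_pkg:                     # same package, different cache
        return "intra_package_copy"
    return "ntstore_copy" if msg_size >= 256 * 1024 else "inter_package_copy"

topo = {0: (0, 0), 1: (0, 0), 2: (0, 1), 3: (1, 2)}
print(select_copy_routine(topo, 0, 1, 4096))      # cache_copy
print(select_copy_routine(topo, 0, 3, 1 << 20))   # ntstore_copy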

Подробнее
02-05-2013 дата публикации

METHOD FOR ACCESSING CACHE AND PSEUDO CACHE AGENT

Номер: US20130111142A1
Принадлежит: Huawei Technologies Co., Ltd.

Embodiments of the present invention disclose a method for accessing a cache and a pseudo cache agent (PCA). The method of the present invention is applied to a multiprocessor system, where the system includes at least one NC, at least one PCA conforming to a processor micro-architecture level interconnect protocol is embedded in the NC, the PCA is connected to at least one PCA storage device, and the PCA storage device stores data shared among memories in the multiprocessor system. The method of the present invention includes: if the NC receives a data request, obtaining, by the PCA, target data required in the data request from the PCA storage device connected to the PCA; and sending the target data to a sender of the data request. Embodiments of the present invention are mainly applied to a process of accessing cache data in the multiprocessor system.

Подробнее
23-05-2013 дата публикации

INFORMATION PROCESSING SYSTEM

Номер: US20130132678A1
Принадлежит: FUJITSU LIMITED

An information processing system has a plurality of nodes, each of which uses a snoop cache memory. A directory, which maintains cache coherence across the snoop cache memories of the plurality of nodes, has a first directory and a second directory; the second directory has a format different from that of the first directory and is used only for the shared state. A node searches the first and second directories and determines the other node to which to transmit a snoop.

Подробнее
23-05-2013 дата публикации

STORAGE SYSTEM, CONTROL PROGRAM AND STORAGE SYSTEM CONTROL METHOD

Номер: US20130132679A1
Принадлежит: Hitachi, Ltd.

There is provided a storage system including one or more LDEVs, one or more processors, a local memory or memories corresponding to the processor or processors, and a shared memory, which is shared by the processors, wherein control information on I/O processing or application processing is stored in the shared memory, and the processor caches a part of the control information in different storage areas on a type-by-type basis in the local memory or memories corresponding to the processor or processors in referring to the control information stored in the shared memory.

Подробнее
13-06-2013 дата публикации

INTERFACE AND METHOD FOR INTER-THREAD COMMUNICATION

Номер: US20130151783A1

The interface for inter-thread communication between a plurality of threads including a number of producer threads for producing data objects and a number of consumer threads for consuming the produced data objects includes a specifier and a provider. The specifier is configured to specify a certain relationship between a certain producer thread of the number of producer threads which is adapted to produce a certain data object and a consumer thread of the number of consumer threads which is adapted to consume the produced certain data object. Further, the provider is configured to provide direct cache line injection of a cache line of the produced certain data object to a cache allocated to the certain consumer thread related to the certain producer thread by the specified certain relationship.

Подробнее
11-07-2013 дата публикации

TECHNIQUE FOR PRESERVING CACHED INFORMATION DURING A LOW POWER MODE

Номер: US20130179639A1
Принадлежит:

A technique to retain cached information during a low power mode, according to at least one embodiment. In one embodiment, information stored in a processor's local cache is saved to a shared cache before the processor is placed into a low power mode, such that other processors may access information from the shared cache instead of causing the low power mode processor to return from the low power mode to service an access to its local cache.

Подробнее
01-08-2013 дата публикации

SYSTEMS AND METHODS FOR A DE-DUPLICATION CACHE

Номер: US20130198459A1
Принадлежит: Fusion-io, Inc.

A de-duplication cache is configured to cache data for access by a plurality of different storage clients, such as virtual machines. A virtual machine may comprise a virtual machine de-duplication module configured to identify data for admission into the de-duplication cache. Data admitted into the de-duplication cache may be accessible by two or more storage clients. Metadata pertaining to the contents of the de-duplication cache may be persisted and/or transferred with respective storage clients such that the storage clients may access the contents of the de-duplication cache after rebooting, being power cycled, and/or being transferred between hosts.
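
A minimal sketch of a de-duplication cache in which a single stored copy is shared by multiple clients; keying by a SHA-256 content hash is an assumption, not the patent's stated mechanism, and byte-by-byte verification of matches is omitted.

# Sketch: a de-duplication cache keyed by content-derived identifiers.
import hashlib

class DedupCache:
    def __init__(self):
        self.by_id = {}                  # content id -> single cached copy

    def admit(self, data):
        cid = hashlib.sha256(data).hexdigest()   # context-independent id
        if cid not in self.by_id:                # already admitted? share it
            self.by_id[cid] = bytes(data)
        return cid                               # clients keep only the id

    def read(self, cid):
        return self.by_id.get(cid)

cache = DedupCache()
id_vm1 = cache.admit(b"guest OS library page")
id_vm2 = cache.admit(b"guest OS library page")   # second VM, same content
print(id_vm1 == id_vm2, len(cache.by_id))        # True 1: one shared copy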

Подробнее
08-08-2013 дата публикации

MULTICORE COMPUTER SYSTEM WITH CACHE USE BASED ADAPTIVE SCHEDULING

Номер: US20130205092A1
Автор: Datta Soumya, Roy Shaibal
Принадлежит: EMPIRE TECHNOLOGY DEVELOPMENT LLC

An example multicore environment generally described herein may be adapted to improve use of a shared cache by a plurality of processing cores in a multicore processor. For example, where a producer task associated with a first core of the multicore processor places data in a shared cache at a faster rate than a consumer task associated with a second core of the multicore processor, relative task execution rates can be adapted to prevent eventual increased cache misses by the consumer task.
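
One way to adapt the rates is to count "just-missed" misses, misses that land on data recently evicted from the shared cache, and throttle the producer when the count rises; the victim-list depth and the throttle policy below are illustrative.

# Sketch: counting just-missed misses against a recent-victim list.
from collections import deque

class JustMissedTracker:
    def __init__(self, victim_depth=4):
        self.recent_victims = deque(maxlen=victim_depth)  # recently evicted lines
        self.just_missed = 0

    def on_evict(self, line_addr):
        self.recent_victims.append(line_addr)

    def on_miss(self, line_addr):
        if line_addr in self.recent_victims:   # missed only because the line
            self.just_missed += 1              # was evicted moments ago

t = JustMissedTracker()
t.on_evict(0x40); t.on_evict(0x80)
t.on_miss(0x80)                    # consumer asks for freshly evicted data
if t.just_missed > 0:              # scheduler reaction (illustrative policy)
    print("throttle producer task relative to consumer")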

Подробнее
15-08-2013 дата публикации

SELECTIVELY READING DATA FROM CACHE AND PRIMARY STORAGE

Номер: US20130212332A1
Принадлежит: ORACLE INTERNATIONAL CORPORATION

Techniques are provided for using an intermediate cache to provide some of the items involved in a scan operation, while other items involved in the scan operation are provided from primary storage. Techniques are also provided for determining whether to service an I/O request for an item with a copy of the item that resides in the intermediate cache based on factors such as a) an identity of the user for whom the I/O request was submitted, b) an identity of a service that submitted the I/O request, c) an indication of a consumer group to which the I/O request maps, or d) whether the intermediate cache is overloaded. Techniques are also provided for determining whether to store items in an intermediate cache in response to the items being retrieved, based on logical characteristics associated with the requests that retrieve the items.

Подробнее
15-08-2013 дата публикации

INFORMATION PROCESSING APPARATUS, METHOD OF CONTROLLING MEMORY, AND MEMORY CONTROLLING APPARATUS

Номер: US20130212333A1
Принадлежит: FUJITSU LIMITED

An information processing apparatus provided with a plurality of nodes each including at least one processor, a system controller, and a main memory, includes a status storage unit that stores statuses of a plurality of cache lines and that is capable of reading statuses of a plurality of cache lines by one reading operation, a recording unit that is provided in a system controller in at least one node and that records all or part of the statuses stored in the status storage unit, wherein the system controller records obtained statuses in the recording unit on a condition that all of the statuses of the plurality of cache lines obtained by reading the status storage unit are invalid statuses or shared statuses in different nodes when the system controller has read the status storage unit in response to a request.

Подробнее
19-09-2013 дата публикации

METHOD AND SYSTEM FOR DYNAMICALLY POWER SCALING A CACHE MEMORY OF A MULTI-CORE PROCESSING SYSTEM

Номер: US20130246825A1
Принадлежит: RESEARCH IN MOTION LIMITED

A system and method of power scaling cache memory of a multi-core processing system includes a plurality of core processors, a cache memory and a controller. The cache memory includes partitioned cache and shared cache. The shared cache can be partitioned into the partitioned cache. Each core processor is communicatively coupled to at least one corresponding partitioned cache and the shared cache. The controller is communicatively coupled to each of the core processors, to the partitioned cache, and to the shared cache. The controller is configured to cause the at least one corresponding partitioned cache to power down in response to the corresponding core processor powering down. The controller can also be configured to flush the cache lines of the partitioned cache prior to powering down the partitioned cache in response to the corresponding processor powering down.

Подробнее
03-10-2013 дата публикации

Translation lookaside buffer for multiple context compute engine

Номер: US20130262816A1
Принадлежит: Intel Corp

Some implementations disclosed herein provide techniques and arrangements for a specialized logic engine that includes a translation lookaside buffer to support multiple threads executing on multiple cores. The translation lookaside buffer enables the specialized logic engine to directly access a virtual address of a thread executing on one of the plurality of processing cores. For example, an acceleration compute engine may receive one or more instructions from a thread executed by a processing core. The acceleration compute engine may retrieve, based on an address space identifier associated with the one or more instructions, a physical address associated with the one or more instructions from the translation lookaside buffer to execute the one or more instructions using the physical address.

Подробнее
17-10-2013 дата публикации

CACHING FOR HETEROGENEOUS PROCESSORS

Номер: US20130275681A1
Принадлежит:

A multi-core processor providing heterogeneous processor cores and a shared cache is presented.

Подробнее
24-10-2013 дата публикации

MANAGING CONCURRENT ACCESSES TO A CACHE

Номер: US20130282985A1

Various embodiments of the present invention allow concurrent accesses to a cache. A request to update an object stored in a cache is received. A first data structure comprising a new value for the object is created in response to receiving the request. A cache pointer is atomically modified to point to the first data structure. A second data structure comprising an old value for the cached object is maintained until a process which holds a pointer to the old value of the cached object either ends or indicates that the old value is no longer needed.
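
A minimal sketch of the pointer-swap update; in CPython a single attribute rebind is atomic, while a real system would use an atomic compare-and-swap plus reference counting or RCU-style grace periods, so this is an analogy, not the patent's mechanism.

# Sketch: update a cached object by atomically swapping a pointer to a new
# value while readers holding the old pointer keep a stable snapshot.
class CachedObject:
    def __init__(self, value):
        self.value = value                    # treated as an immutable snapshot

class Cache:
    def __init__(self):
        self.slot = CachedObject("v1")

    def update(self, new_value):
        self.slot = CachedObject(new_value)   # one-shot pointer swap; the old
                                              # object lives on while readers
                                              # still hold a reference to it

c = Cache()
reader_view = c.slot                 # a concurrent reader grabs the pointer
c.update("v2")                       # writer swaps in the new structure
print(reader_view.value, c.slot.value)   # v1 v2; the old value is retained
                                         # until the reader drops its reference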

05-12-2013 publication date

INFORMATION PROCESSING APPARATUS, MEMORY APPARATUS, AND DATA MANAGEMENT METHOD

Number: US20130326146A1
Author: ABE Tomonori
Assignee:

An information processing apparatus that appropriately manages data of an auxiliary memory apparatus is provided to prevent data from leaking. The information processing apparatus includes a first memory apparatus, a second memory apparatus, and a caching unit. The caching unit stores write data to be written on the second memory apparatus in a cache area ensured on the first memory apparatus. When a first event occurs, the caching unit initializes a management information table, in which the address of the cache area in which the write data is stored is associated with the address of the second memory apparatus in which the write data is to be stored, and restores the second memory apparatus to a state previous to the state in which the data was written.

1. An information processing apparatus comprising: a first memory apparatus; a second memory apparatus; and a caching unit that stores write data to be written on the second memory apparatus in a cache area ensured on the first memory apparatus, wherein, when a first event occurs, the caching unit initializes a management information table, in which an address of the cache area in which the write data is stored is associated with an address of the second memory apparatus in which the write data is to be stored, so as to restore the second memory apparatus to the state before the writing of the write data.
2. The information processing apparatus according to claim 1, wherein the second memory apparatus is a removable device which is able to be ejected, and the first event is a request for detaching the second memory apparatus.
3. The information processing apparatus according to claim 1, wherein, when a second event occurs, the caching unit writes the write data stored in the cache area on the second memory apparatus based on the management information table and updates the second memory apparatus to a latest state.
4. The information processing apparatus according to claim 3, ...
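A short Python sketch of the management-information-table idea follows; every name here is an assumption for illustration, with dict-like objects standing in for the cache area and the removable device:

    class CachingUnit:
        def __init__(self, cache_area):
            self.cache_area = cache_area   # first memory apparatus
            self.table = {}                # device address -> cache-area address

        def write(self, device_addr, cache_addr, data):
            self.cache_area[cache_addr] = data    # data lands only in the cache
            self.table[device_addr] = cache_addr

        def on_detach_request(self):
            # First event: initializing the table discards the pending writes,
            # leaving the removable device exactly as it was before any write.
            self.table.clear()

        def on_commit_request(self, device):
            # Second event: replay the table so the device reaches its latest state.
            for device_addr, cache_addr in self.table.items():
                device[device_addr] = self.cache_area[cache_addr]
            self.table.clear()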

05-12-2013 publication date

Short circuit of probes in a chain

Number: US20130326147A1
Assignee: Intel Corp

A multi-core processing apparatus may provide a cache probe and data retrieval method. The method may comprise sending a memory request from a requester to a record keeping structure. The memory request may have a memory address of a memory that stores requested data. The method may further comprise determining that a local last accessor of the memory address may have a copy of the requested data up to date with the memory. The local last accessor may be within a local domain that the requester belongs to. The method may further comprise sending a cache probe to the local last accessor and retrieving a latest value of the requested data from the local last accessor to the requester.

19-12-2013 publication date

IDENTIFYING AND PRIORITIZING CRITICAL INSTRUCTIONS WITHIN PROCESSOR CIRCUITRY

Number: US20130339595A1
Assignee:

In one embodiment, the present invention includes a method for identifying a memory request corresponding to a load instruction as a critical transaction if an instruction pointer of the load instruction is present in a critical instruction table associated with a processor core, sending the memory request to a system agent of the processor with a critical indicator to identify the memory request as a critical transaction, and prioritizing the memory request ahead of other pending transactions responsive to the critical indicator. Other embodiments are described and claimed. 1. A processor comprising:a first core to execute instructions, the first core including a pipeline having a reorder buffer (ROB) including a plurality of entries each associated with an instruction received in the pipeline, and a critical instruction logic to determine whether a load instruction is a critical instruction and if so to send a memory request transaction associated with the load instruction to a system agent of the processor with a critical indicator to indicate the critical instruction; andthe system agent coupled to the first core and including a distributed cache controller having a plurality of portions each associated with a corresponding portion of a distributed shared cache memory, a memory controller to interface with a system memory coupled to the processor, and an interconnect to couple the distributed shared cache memory and the distributed cache controller with the first core, wherein the system agent is to prioritize the memory request transaction when indicated to be a critical instruction.2. The processor of claim 1 , wherein the system agent comprises a first arbiter associated with the interconnect claim 1 , the first arbiter to prioritize the memory request transaction including the critical indicator ahead of at least one other transaction present in a buffer to store pending transactions for insertion onto the interconnect.3. The processor of claim 2 , wherein ...
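The mechanism is easy to picture in software terms. Below is a hedged Python sketch (the table contents, request format, and agent queue are illustrative assumptions, not the patent's microarchitecture) of tagging a load whose instruction pointer appears in a critical instruction table and arbitrating it ahead of other pending transactions:

    # instruction pointers previously identified as critical (illustrative values)
    critical_instruction_table = {0x400A10, 0x400B54}

    class SystemAgent:
        def __init__(self):
            self.pending = []

        def enqueue(self, request):
            self.pending.append(request)
            # critical transactions move ahead of non-critical ones;
            # the sort is stable, so each class stays in arrival order
            self.pending.sort(key=lambda r: not r["critical"])

    def issue_load(ip, address, agent):
        agent.enqueue({
            "addr": address,
            "critical": ip in critical_instruction_table,  # the critical indicator
        })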

19-12-2013 publication date

MANAGING TRANSACTIONAL AND NON-TRANSACTIONAL STORE OBSERVABILITY

Number: US20130339616A1

Embodiments relate to controlling observability of transactional and non-transactional stores. An aspect includes receiving one or more store instructions. The one or more store instructions are initiated within an active transaction and include store data. The active transaction effectively delays committing stores to memory until successful completion of the active transaction. The store data is stored in a local storage buffer causing alterations to the local storage buffer from a first state to a second state. A signal is received that the active transaction has terminated. If the active transaction has terminated abnormally then: the local storage buffer is reverted back to the first state if the store data was stored by a transactional store instruction, and is propagated to a shared cache if the store instruction is non-transactional. 1. A method for controlling observability of transactional and non-transactional stores , the method comprising:receiving, by a processing circuit, one or more store instructions, the one or more store instructions initiated within an active transaction and including store data, the active transaction effectively delaying committing stores to memory until successful completion of the active transaction;storing the store data in a local storage buffer, the storing causing alterations to the local storage buffer from a first state to a second state;receiving a signal that the active transaction has terminated;based on determining that the active transaction terminated abnormally, for each stored data of the one or more store instructions performing:based on determining that the store data was stored by a transactional store instruction, reverting the local storage buffer back to the first state; andbased on determining that the stored data was stored by a non-transactional store instruction, propagating the second state having the stored data to a shared cache.2. The method of claim 1 , wherein all storage alterations by all of ...

02-01-2014 publication date

Method and Apparatus For Bus Lock Assistance

Number: US20140006661A1
Assignee: Intel Corp

A method is described that includes detecting that an instruction of a thread is a locked instruction. The method also includes determining that execution of said instruction includes imposing a bus lock. The method also includes executing a bus lock assistance function in response to said determining, said bus lock assistance function including a function associated with said bus lock other than implementation of a bus lock protocol.

02-01-2014 publication date

Data control using last accessor information

Number: US20140006716A1
Assignee: Intel Corp

In some implementations, a shared cache structure may be provided for sharing data among a plurality of processor cores. A data structure may be associated with the shared cache structure, and may include a plurality of entries, with each entry corresponding to one of the cache lines in the shared cache. Each entry in the data structure may further include a field to identify a processor core that most recently requested the data of the cache line corresponding to the entry. When a request for a particular cache line is received, a request for the data may be sent to a particular processor core identified in the data structure as the last accessor of the data.
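A compact Python sketch of the last-accessor bookkeeping is given below (structure and names are assumptions, not the patent's implementation):

    class SharedCacheDirectory:
        """One entry per cache line, recording the core that last requested it."""
        def __init__(self, num_lines):
            self.last_accessor = [None] * num_lines

        def request(self, line, requester, send_probe):
            owner = self.last_accessor[line]
            if owner is not None and owner != requester:
                send_probe(owner, line)   # fetch the data from the last accessor
            self.last_accessor[line] = requester

    directory = SharedCacheDirectory(num_lines=4)
    directory.request(0, requester=1, send_probe=lambda c, l: None)
    directory.request(0, requester=2, send_probe=lambda c, l: print("probe core", c))

This prints "probe core 1": the second request is steered to the core named in the directory entry rather than broadcast to all cores.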

09-01-2014 publication date

ENSURING CAUSALITY OF TRANSACTIONAL STORAGE ACCESSES INTERACTING WITH NON-TRANSACTIONAL STORAGE ACCESSES

Number: US20140013055A1

A data processing system implements a weak consistency memory model for a distributed shared memory system. The data processing system concurrently executes, on a plurality of processor cores, one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions. The one or more non-transactional memory instructions include a non-transactional store instruction. The data processing system commits the memory transaction to the distributed shared memory system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction. 1. A method , comprising:in a data processing system implementing a weak consistency memory model for a distributed shared memory system, concurrently executing on a plurality of processor cores one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions, wherein the one or more non-transactional memory instructions include a non-transactional store instruction; andcommitting the memory transaction to the distributed shared memory system of the data processing system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction.2. The method of claim 1 , wherein:the data processing system includes a system fabric that distributes memory access requests to the shared distributed memory system;the memory transaction includes a transactional load instruction that reads a value stored to the distributed shared memory system by the non-transactional store instruction;the method further includes snooping, on the system fabric, a store request corresponding to the non-transactional store instruction and snooping, on the system fabric, a read request corresponding to the transactional load instruction; and 'in response to snooping the read request while the store request is pending, preventing the memory ...

09-01-2014 publication date

Systems, methods and apparatus for cache transfers

Number: US20140013059A1
Assignee: Fusion IO LLC

A virtual machine cache provides for maintaining a working set of the cache during a transfer between virtual machine hosts. In response to a virtual machine transfer, the previous host of the virtual machine is configured to retain cache data of the virtual machine, which may include both cache metadata and data that has been admitted into the cache. The cache data may be transferred to the destination host via a network (or other communication mechanism). The destination host populates a virtual machine cache with the transferred cache data to thereby reconstruct the working state of the cache.

09-01-2014 publication date

Ensuring causality of transactional storage accesses interacting with non-transactional storage accesses

Number: US20140013060A1
Assignee: International Business Machines Corp

A data processing system implements a weak consistency memory model for a distributed shared memory system. The data processing system concurrently executes, on a plurality of processor cores, one or more transactional memory instructions within a memory transaction and one or more non-transactional memory instructions. The one or more non-transactional memory instructions include a non-transactional store instruction. The data processing system commits the memory transaction to the distributed shared memory system only in response to enforcement of causality of the non-transactional store instruction with respect to the memory transaction.

23-01-2014 publication date

Technique for using memory attributes

Number: US20140025901A1
Assignee: Individual

A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.

30-01-2014 publication date

Sharing Pattern-Based Directory Coherence for Multicore Scalability ("SPACE")

Number: US20140032848A1
Assignee: UNIVERSITY OF ROCHESTER

A method and directory system that recognizes and represents the subset of sharing patterns present in an application is provided. As used herein, the term sharing pattern refers to a group of processors accessing a single memory location in an application. The sharing pattern is decoupled from each cache line and held in a separate directory table. The sharing pattern of a cache block is the bit vector representing the processors that share the block. Multiple cache lines that have the same sharing pattern point to a common entry in the directory table. In addition, when the table capacity is exceeded, patterns that are similar to each other are dynamically collated into a single entry.

1. A system for decoupling the metadata representing sharing patterns from the address tags representing the data blocks, in a cache coherent shared memory computer, comprising: a directory table for storing in each entry of the directory table a unique sharing pattern that an application exhibits when it is executing; and a cache having a pointer associated with the address tag of each of its data blocks (cache lines), where each pointer points to an entry in the directory table.
2. A method of decoupling the metadata representing sharing patterns from the address tags representing the data blocks, in a cache coherent shared memory computer, comprising: providing a directory table for storing in each entry of the directory table a unique sharing pattern that an application exhibits when it is executing; and providing a cache having a pointer associated with the address tag of each of its data blocks (cache lines), where each pointer points to an entry in the directory table.
3. The method of claim 2, further comprising determining when a sharing pattern in the directory table is no longer in use in the pattern table.
4. The method of claim 3, wherein determining when a sharing pattern in the directory table is no longer in use in the pattern table comprises: providing a reference counter ...
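The decoupling amounts to interning sharing bit-vectors. The Python sketch below (names are ours; real hardware would bound the table and collate similar patterns when it fills) shows multiple lines sharing one table entry, with a reference counter detecting when a pattern falls out of use:

    class PatternTable:
        def __init__(self):
            self.ids = {}        # sharing pattern (bit vector as int) -> entry id
            self.refcount = {}   # entry id -> number of cache lines pointing at it
            self.next_id = 0

        def intern(self, pattern):
            """Return the table entry id for a pattern; lines store this pointer."""
            if pattern not in self.ids:
                self.ids[pattern] = self.next_id
                self.refcount[self.next_id] = 0
                self.next_id += 1
            entry = self.ids[pattern]
            self.refcount[entry] += 1
            return entry

        def release(self, pattern):
            entry = self.ids[pattern]
            self.refcount[entry] -= 1
            if self.refcount[entry] == 0:      # pattern no longer in use
                del self.refcount[entry]
                del self.ids[pattern]

    table = PatternTable()
    a = table.intern(0b0011)   # line shared by processors 0 and 1
    b = table.intern(0b0011)   # second line with the same pattern
    assert a == b              # both lines point at one directory entry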

06-02-2014 publication date

SYSTEM AND METHOD OF CACHING INFORMATION

Number: US20140040559A1
Assignee: GOOGLE INC.

A system and method is provided wherein, in one aspect, a currently-requested item of information is stored in a cache based on whether it has been previously requested and, if so, the time of the previous request. If the item has not been previously requested, it may not be stored in the cache. If the subject item has been previously requested, it may or may not be cached based on a comparison of durations, namely (1) the duration of time between the current request and the previous request for the subject item and (2) for each other item in the cache, the duration of time between the current request and the previous request for the other item. If the duration associated with the subject item is less than the duration of another item in the cache, the subject item may be stored in the cache. 1. A method comprising:determining whether an item of information has been previously requested within a predetermined period;processing, with one or more processors, the item of information without storing the item in a cache when the item has not been previously requested within the predetermined period;processing, with the one or more processors, the item of information without storing the item of information in the cache when the item of information has been previously requested within the predetermined period and the time of the previous request is earlier than a latest request for each item within a set of items stored in the cache; andprocessing, with the one or more processors, the item of information and storing the item of information in the cache when the item of information has been previously requested within the predetermined period and the time of the previous request is later than the latest request of at least one item within the set of items stored in the cache.2. The method of claim 1 , wherein the item of information comprises audio or visual data to be rendered at the client device.3. The method of claim 1 , wherein the item of information comprises a file. ...
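The admission rule is concrete enough to simulate. Here is a hedged Python sketch (structure and names assumed) that caches a re-requested item only if its request-to-request interval beats the staleness of some already-cached item:

    import time

    class AdmissionCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.cached = {}    # item -> time of its latest request
            self.history = {}   # item -> time of its previous request

        def request(self, item):
            """Returns True on a cache hit, False otherwise."""
            now = time.monotonic()
            previous = self.history.get(item)
            self.history[item] = now
            if item in self.cached:
                self.cached[item] = now
                return True
            if previous is None:
                return False                     # first request: do not cache
            if len(self.cached) < self.capacity:
                self.cached[item] = now
                return False
            # compare the item's request interval against the time since the
            # least-recently-requested cached item was last requested
            stalest = min(self.cached, key=self.cached.get)
            if now - previous < now - self.cached[stalest]:
                del self.cached[stalest]
                self.cached[item] = now
            return False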

13-02-2014 publication date

System and method of caching information

Number: US20140047191A1
Assignee: Google LLC

A system and method is provided wherein, in one aspect, a currently-requested item of information is stored in a cache based on whether it has been previously requested and, if so, the time of the previous request. If the item has not been previously requested, it may not be stored in the cache. If the subject item has been previously requested, it may or may not be cached based on a comparison of durations, namely (1) the duration of time between the current request and the previous request for the subject item and (2) for each other item in the cache, the duration of time between the current request and the previous request for the other item. If the duration associated with the subject item is less than the duration of another item in the cache, the subject item may be stored in the cache.

13-02-2014 publication date

Providing service address space for diagnostics collection

Number: US20140047204A1
Assignee: International Business Machines Corp

A method and system are provided for providing a service address space for diagnostics collection. The system includes: a service co-processor attached to a main processor, wherein the service co-processor maintains an independent copy of the main processor's address space in the form of a service address space; and a storage update receiving component for updating the service address space by receiving storage update packets from the main processor and applying these to the service address space. An instruction pipe may be provided between the main processor and the service co-processor. The main processor may include: a service delegation component for delegating collection of diagnostic data to the co-processor by sending a collection command from the main processor to the service co-processor for collection of data from the service address space.

20-02-2014 publication date

PROCESSOR AND CONTROL METHOD FOR PROCESSOR

Number: US20140052923A1
Author: IKEDA Yoshiro
Assignee:

A processor includes a plurality of nodes arranged two dimensionally in the X-axis direction and in the Y-axis direction, and each of the nodes includes a processor core and a distributed shared cache memory. The processor also includes a first connecting unit and a second connecting unit. The first connecting unit connects adjacent nodes in the X-axis direction among the nodes, in a ring shape. The second connecting unit connects adjacent nodes in the Y-axis direction among the nodes, in a ring shape. The cache memories included in the respective nodes are divided into banks in the Y-axis direction. Coherency of the cache memories in the X-axis direction is controlled by a snoop system. The cache memories are shared by the nodes. 1. A processor in which a plurality of nodes , each including a processor core and a distributed shared cache memory , are arranged two-dimensionally in an X-axis direction and a Y-axis direction , the processor comprising:a first connecting unit that connects adjacent nodes in the X-axis direction among the nodes, in a ring shape; anda second connecting unit that connects adjacent nodes in the Y-axis direction among the nodes, in a ring shape, whereinthe cache memories included in the respective nodes are divided into banks in the Y-axis direction, coherency of the cache memories in the X-axis direction is controlled by a snoop system, and the cache memories are shared by the nodes.2. The processor according to claim 1 , wherein connects a node located at a position other than both ends in the X-axis direction and a node located adjacent to the node located at the position other than both ends in the X-axis direction,', 'connects a node located at the either end in the X-axis direction and a node adjacent to the node located at the either end in the X-axis direction and connects the node located at the either end in the X-axis direction and a node adjacent to the node adjacent to the node located at the either end in the X-axis direction, ...

06-03-2014 publication date

Systems, methods, and interfaces for adaptive cache persistence

Number: US20140068197A1
Assignee: Fusion IO LLC

A storage module may be configured to service I/O requests according to different persistence levels. The persistence level of an I/O request may relate to the storage resource(s) used to service the I/O request, the configuration of the storage resource(s), the storage mode of the resources, and so on. In some embodiments, a persistence level may relate to a cache mode of an I/O request. I/O requests pertaining to temporary or disposable data may be serviced using an ephemeral cache mode. An ephemeral cache mode may comprise storing I/O request data in cache storage without writing the data through (or back) to primary storage. Ephemeral cache data may be transferred between hosts in response to virtual machine migration.

03-04-2014 publication date

PERFORMANCE-DRIVEN CACHE LINE MEMORY ACCESS

Number: US20140095796A1

According to one aspect of the present disclosure, a method and technique for performance-driven cache line memory access is disclosed. The method includes: receiving, by a memory controller of a data processing system, a request for a cache line; dividing the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request; servicing the high priority data request; and delaying servicing of the low priority data request until a low priority condition has been satisfied. 1. A method , comprising:receiving, by a memory controller of a data processing system, a request for a cache line;dividing the request into a plurality of cache subline requests, wherein at least one of the cache subline requests comprises a high priority data request and at least one of the cache subline requests comprises a low priority data request;servicing the high priority data request; anddelaying servicing of the low priority data request until a low priority condition has been satisfied.2. The method of claim 1 , further comprising:placing the low priority data request into a queue;initiating a timer; andresponsive to expiration of the timer, servicing the low priority data request.3. The method of claim 1 , further comprising:placing the low priority data request into a queue;determining bus utilization; andresponsive to the bus utilization being below a threshold, servicing the low priority data request.4. The method of claim 1 , further comprising:placing the low priority data request into a queue; andresponsive to a processor core cancelling the low priority data request, removing the low priority data request from the queue.5. The method of claim 1 , further comprising:placing the low priority data request into a low priority queue; andresponsive to receiving a sector address request corresponding to the low priority data ...
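As a rough illustration, the split can be sketched in a few lines of Python (the sector count, sizes, and queue policy below are assumptions, not the patent's parameters):

    def split_request(line_addr, critical_sector, sectors=4, sector_bytes=32):
        """Split one cache-line request into prioritized subline requests."""
        return [
            {"addr": line_addr + s * sector_bytes,
             "priority": "high" if s == critical_sector else "low"}
            for s in range(sectors)
        ]

    requests = split_request(0x1000, critical_sector=1)
    high = [r for r in requests if r["priority"] == "high"]
    low = [r for r in requests if r["priority"] == "low"]
    # `high` is serviced immediately; per the claims, `low` waits in a queue
    # until a timer expires, bus utilization drops below a threshold, or the
    # requesting core cancels it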

05-01-2017 publication date

SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS

Number: US20170004080A1
Assignee: Advanced Micro Devices, Inc.

Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.

1. A method for managing performance of a processor having multiple compute units, the method comprising: determining an effective number of the multiple compute units to designate as having priority; and, when the effective number is nonzero, designating the effective number of the multiple compute units each as a priority compute unit.
2. The method of claim 1, further comprising allowing each priority compute unit to allocate into a shared cache.
3. The method of claim 1, further comprising disallowing a compute unit which is not a priority compute unit to allocate into a shared cache.
4. The method of claim 1, further comprising prioritizing access to a memory by a priority compute unit over a compute unit which is not a priority compute unit.
5. The method of claim 1, further comprising serving a pending request for access to a memory by a priority compute unit prior to serving any pending request for access to the memory by a compute unit which is not a priority compute unit.
6. The method of claim 1, wherein the determining is performed dynamically.
7. The method of claim 1, wherein the determining comprises set dueling.
8. The method of claim 1, further comprising dispatching a workgroup to a priority compute unit preferentially to dispatching the workgroup to a compute unit which is not a priority compute unit.
9. A ...

07-01-2016 publication date

Detecting cache conflicts by utilizing logical address comparisons in a transactional memory

Number: US20160004643A1
Assignee: International Business Machines Corp

A processor in a multi-processor configuration is configured to perform dynamic address translation from logical addresses to real addresses and to detect memory conflicts for shared logical memory in transactional memory based on logical (virtual) address comparisons.

07-01-2016 publication date

SYSTEM AND METHOD OF ARBITRATING CACHE REQUESTS

Number: US20160004651A1
Author: WANG CHUNLIN
Assignee:

This disclosure relates to arbitration of different types of requests to access a cache. Features of this disclosure can be implemented in a graphics processing unit (GPU). In one embodiment, an arbiter can receive requests from a color processor and a depth processor and determine which of the received requests has the highest priority. The request with the highest priority can then be provided to the cache. The priority can be configurable. The arbiter can determine priority, for example, based on whether a location in the cache associated with a request is available, a weight associated with the request, a number of requests of a particular type processed by the arbiter, or any combination thereof.

1. An apparatus comprising: a cache configured to store data; and an arbiter comprising electronic hardware, the arbiter configured to: assign weights to different types of cache requests based on information received by the arbiter; receive a request of a first type to access the cache; receive a request of a second type to access the cache; determine which of the received requests has a higher priority based at least partly on the weights assigned to the first type of request and the second type of request; and provide the cache with the received request determined to have the higher priority.
2. The apparatus of claim 1, wherein the arbiter is configured to determine the higher priority based at least partly on an indication of whether a location in the cache associated with the first request is available.
3. The apparatus of claim 1, wherein the arbiter comprises a plurality of input counters, each of the plurality of input counters configured to count a number of requests of a respective one of the different types of cache requests processed by the arbiter.
4. The apparatus of claim 3, wherein the arbiter is configured to determine the higher priority based at least partly on a comparison of a selected ...
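A toy Python model of such an arbiter is shown below (the weight scheme and tie-breaking are our assumptions; real hardware would use counters and comparators):

    class Arbiter:
        def __init__(self, weights):
            self.weights = dict(weights)            # request type -> weight
            self.served = {t: 0 for t in weights}   # per-type input counters

        def pick(self, pending):
            """pending: request type -> request or None; returns the winner."""
            candidates = [t for t, r in pending.items() if r is not None]
            # favor the type whose share of service is furthest below its weight
            winner = min(candidates, key=lambda t: self.served[t] / self.weights[t])
            self.served[winner] += 1
            return pending[winner]

    arb = Arbiter({"color": 2, "depth": 1})
    for _ in range(6):
        print(arb.pick({"color": "color-req", "depth": "depth-req"}))
    # color requests win roughly twice as often as depth requests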

04-01-2018 publication date

PROGRESSIVE FINE TO COARSE GRAIN SNOOP FILTER

Number: US20180004663A1
Assignee: ARM LIMITED

A data processing system includes a snoop filter organized as a number of lines, each storing an address tag associated with the address of data stored in one or more caches of the system, a coherency state of the data, and presence data. A snoop controller sends snoop messages in response to data access requests. The presence data is configurable in a first format, in which the value of a bit in the presence data is indicative of a subset of the nodes for which at least one node in the subset has a copy of the data in its local cache, and in a second format, in which the presence data comprises a unique identifier of a node having a copy of the data in its local cache. The snoop controller sends snoop messages to the nodes indicated by the presence data.

1. A method of operation of a snoop filter of a data processing system having a plurality of nodes, where each request node has a local cache, where the plurality of nodes are grouped in a plurality of subsets and where each subset consists of one or more nodes, the method comprising: accessing the snoop filter, dependent upon a data address received from a first node of the plurality of nodes, to retrieve format data and presence data, where the format data is indicative of a format of the presence data; when the format data indicates a first format for the presence data: identifying, from positions of set bits within the retrieved presence data, one or more subsets of the plurality of subsets, and sending a snoop message to each node in each subset of the identified one or more subsets; and when the format data indicates a second format for the presence data: determining, from the retrieved presence data, one or more unique identifiers of nodes, each of the one or more identified nodes having a copy of data associated with the data address in its local cache, and sending a snoop message to each of the one or more nodes.
2. The method of claim 1, where the format data ...
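The two presence-data formats decode naturally as shown below; this Python sketch uses an invented encoding purely for illustration (the actual bit layout is not specified in the abstract):

    def snoop_targets(format_bit, presence, subsets):
        """Decode presence data into the list of nodes to snoop."""
        if format_bit == 0:
            # coarse format: bit i set => snoop every node in subsets[i]
            targets = []
            for i, subset in enumerate(subsets):
                if presence & (1 << i):
                    targets.extend(subset)
            return targets
        # fine format: presence is the unique identifier of a single node
        return [presence]

    subsets = [[0, 1], [2, 3], [4, 5]]
    print(snoop_targets(0, 0b101, subsets))   # coarse: nodes 0, 1, 4, 5
    print(snoop_targets(1, 3, subsets))       # fine: node 3 only

The "progressive fine to coarse" of the title suggests a line can start in the precise single-node format and fall back to the coarse subset format as more sharers appear.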

04-01-2018 publication date

IDENTIFICATION OF A COMPUTING DEVICE ACCESSING A SHARED MEMORY

Number: US20180004665A1
Assignee:

A method for identifying, in a system including two or more computing devices that are able to communicate with each other, with each computing device having a cache and connected to a corresponding memory, a computing device accessing one of the memories, includes monitoring memory access to any of the memories; monitoring cache coherency commands between computing devices; and identifying the computing device accessing one of the memories by using information related to the memory access and cache coherency commands.

1. A method for identifying, in a system including two or more computing devices that are able to communicate with each other, with each computing device having a cache and connected to a corresponding memory, the computing device accessing one of the memories, the method comprising: monitoring memory access to any of the memories, wherein monitoring memory access comprises identifying respective computing devices that access respective memories; monitoring cache coherency commands between computing devices; and identifying the computing device accessing one of the memories by using information related to the memory access and cache coherency commands.
2. The method of claim 1, wherein monitoring memory access to any of the memories further comprises collecting an access time, a type of command, and a first memory address from a first memory read access to a first memory based on information acquired from a first probe attached to a first bus connecting the first memory to a first computing device, wherein the first memory is remote from the first computing device, and wherein a cache line in the first computing device is in an invalid state; wherein monitoring cache coherency commands further comprises monitoring a first cache coherency command sent from the first computing device to at least a second computing device at a first cache coherency time based on information acquired from a second probe attached to ...

04-01-2018 publication date

IDENTIFICATION OF A COMPUTING DEVICE ACCESSING A SHARED MEMORY

Number: US20180004666A1
Assignee:

A method for identifying, in a system including two or more computing devices that are able to communicate with each other, with each computing device having a cache and connected to a corresponding memory, a computing device accessing one of the memories, includes monitoring memory access to any of the memories; monitoring cache coherency commands between computing devices; and identifying the computing device accessing one of the memories by using information related to the memory access and cache coherency commands.

1. A method for identifying, in a system including two or more computing devices that are able to communicate with each other via an interconnect, with each computing device provided with a cache and connected to the corresponding memory, the computing device accessing a first memory being one of the memories, the method comprising: monitoring memory access to the first memory via a memory device connected to the first memory, wherein monitoring memory access comprises identifying respective computing devices that access respective memories; monitoring cache coherency commands between computing devices via an interconnect between the computing devices and storing information related to the commands; identifying a command from a history of information related to the commands including a memory address identical to the memory address in the memory access to the first memory; and identifying, as the computing device accessing the first memory, the computing device issuing the identified command at the timing closest to the timing of the memory access to the first memory.
2. The method of claim 1, wherein monitoring memory access to the first memory further comprises collecting an access time, a type of command, and a first memory address from a first memory read access to the first memory based on information acquired from a first probe attached to a first bus connecting the first memory to a first computing device, wherein the ...

07-01-2021 publication date

HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS

Number: US20210004328A1
Assignee:

Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture systems, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.

1. A processor, configured to be implemented in a computer system, comprising: a plurality of cores, each having at least one associated cache occupying a respective level in a cache hierarchy; a last level cache (LLC), communicatively coupled to the plurality of cores; and a memory controller, communicatively coupled to the plurality of cores, configured to support access to external system memory when the processor is installed in the computer system; wherein each of the caches associated with a core, and the LLC, include a plurality of cacheline slots for storing cacheline data, and wherein the processor is further configured to support a machine instruction that when executed causes the processor to demote a cacheline from a lower-level cache to a higher-level cache.

The present application claims priority to U.S. patent application Ser. No. 14/583,389, entitled "HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS," and filed on Dec. 26, 2014, the entirety of which is incorporated by reference ...

02-01-2020 publication date

CACHE MANAGEMENT IN A STREAM COMPUTING ENVIRONMENT THAT USES A SET OF MANY-CORE HARDWARE PROCESSORS

Number: US20200004682A1
Assignee:

Disclosed aspects relate to cache management in a stream computing environment that uses a set of many-core hardware processors to process a stream of tuples by a plurality of processing elements which operate on the set of many-core hardware processors. The stream of tuples to be processed by the plurality of processing elements which operate on the set of many-core hardware processors may be received. A tuple-processing hardware-route on the set of many-core hardware processors may be determined based on a cache factor associated with the set of many-core hardware processors. The stream of tuples may be routed based on the tuple-processing hardware-route on the set of many-core hardware processors. The stream of tuples may be processed by the plurality of processing elements which operate on the set of many-core hardware processors. 1. A computer-implemented method for cache management in a stream computing environment that uses a set of many-core hardware processors to process a stream of tuples by a plurality of processing elements which operate on the set of many-core hardware processors , the method comprising:determining, based on a cache factor associated with the set of many-core hardware processors, a tuple-processing hardware-route on the set of many-core hardware processors; andprocessing, utilizing the set of many-core hardware processors, a stream of tuples by the plurality of processing elements which operate on the set of many-core hardware processors, wherein the stream of tuples is routed based on the tuple-processing hardware-route.2. The method of claim 1 , further comprising:computing, for cache management in the stream computing environment, a first cache utilization factor for a first cache of a first core of the set of many-core hardware processors;computing, for cache management in the stream computing environment, a second cache utilization factor for a second cache of the first core of the set of many-core hardware processors;resolving, by ...

02-01-2020 publication date

CACHE PARTITIONING MECHANISM

Number: US20200004683A1
Assignee: Intel Corporation

An apparatus to facilitate cache partitioning is disclosed. The apparatus includes a set associative cache to receive access requests from a plurality of agents and partitioning logic to partition the set associative cache by assigning sub-components of a set address to each of the plurality of agents.

1. An apparatus to facilitate cache partitioning, comprising: a set associative cache to receive access requests from a plurality of agents; and partitioning logic to partition the set associative cache by assigning sub-components of a set address to each of the plurality of agents.
2. The apparatus of claim 1, wherein the partitioning logic comprises target conversion logic to assign the sub-components of the set address by separating address bits of a received memory request.
3. The apparatus of claim 2, wherein the partitioning logic separates the address bits of a received memory request into original tag bits, fixed set bits, and variable set bits.
4. The apparatus of claim 3, wherein the target conversion logic calculates updated set bits based on the variable set bits.
5. The apparatus of claim 4, wherein the partitioning logic further comprises a partition assignment table having entries associated with each of the plurality of agents.
6. The apparatus of claim 5, wherein the target conversion logic calculates the updated set bits based on a received client ID associated with an agent and an entry in the partition assignment table associated with the client ID.
7. The apparatus of claim 4, wherein the target conversion logic further calculates updated tag bits.
8. The apparatus of claim 7, wherein the target conversion logic calculates the updated tag bits by adding the variable set bits to the original tag bits.
9. The apparatus of claim 8, wherein the partitioning logic accesses the set associative cache using the updated tag bits and the updated set bits.
10. The apparatus of claim 3, wherein the set associative cache is a translation ...
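The bit manipulation in claims 3 through 9 can be sketched directly; the field widths and table contents below are assumptions chosen for illustration:

    def convert_address(addr, client_id, partition_table,
                        set_bits=10, var_bits=2, offset_bits=6):
        """Replace the variable set bits with a per-agent value and fold the
        displaced bits into the tag, so different agents land in disjoint sets."""
        set_index = (addr >> offset_bits) & ((1 << set_bits) - 1)
        original_tag = addr >> (offset_bits + set_bits)

        fixed = set_index & ((1 << (set_bits - var_bits)) - 1)   # fixed set bits
        variable = set_index >> (set_bits - var_bits)            # variable set bits

        updated_set = (partition_table[client_id] << (set_bits - var_bits)) | fixed
        updated_tag = (original_tag << var_bits) | variable      # keeps lookups unique
        return updated_tag, updated_set

    partition_table = {0: 0b00, 1: 0b01}   # agent -> assigned set region
    print(convert_address(0x12345, client_id=1, partition_table=partition_table))

Folding the variable bits into the tag matters: two addresses that differ only in those bits would otherwise collide once the set index is overwritten.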

02-01-2020 publication date

CACHE REPLACING METHOD AND APPARATUS, HETEROGENEOUS MULTI-CORE SYSTEM AND CACHE MANAGING METHOD

Number: US20200004692A1
Assignee:

This disclosure provides a cache replacing method applied to a heterogeneous multi-core system, the method including: determining whether a first application currently running is an application running on the GPU; when it is determined that the first application currently running is an application running on the GPU, determining a cache priority of first data accessed by the first application according to a performance parameter of the first application, the cache priority of the first data including a priority other than a predefined highest cache priority; and caching the first data into a cache queue of the shared cache according to a predetermined cache replacement algorithm and the cache priority of the first data, and replacing data in the cache queue. 1. A cache replacing method applied to a heterogeneous multi-core system , the heterogeneous multi-core system including at least one central processing unit CPU , at least one graphic processing unit GPU and a shared cache , the method including:determining whether a first application currently running is an application running on the GPU;when it is determined that the first application currently running is an application running on the GPU, determining a cache priority of first data accessed by the first application according to a performance parameter of the first application, the cache priority of the first data including a priority other than a predefined highest cache priority; andcaching the first data into a cache queue of the shared cache according to a predetermined cache replacement algorithm and the cache priority of the first data, and replacing data in the cache queue.2. The method of claim 1 , further including:determining whether a second application currently running is an application running on the CPU;when it is determined that the second application currently running is an application running on the CPU, determining a cache priority of second data accessed by the second application according ...

03-01-2019 publication date

VIRTUAL MACHINE BACKUP

Number: US20190004902A1
Assignee:

A computer system comprises a processor unit arranged to run a hypervisor running one or more virtual machines; a cache connected to the processor unit and comprising a plurality of cache rows, each cache row comprising a memory address, a cache line and an image modification flag; and a memory connected to the cache and arranged to store an image of at least one virtual machine. The processor unit is arranged to define a log in the memory and the cache further comprises a cache controller arranged to set the image modification flag for a cache line modified by a virtual machine being backed up, but not for a cache line modified by the hypervisor operating in privilege mode; periodically check the image modification flags; and write only the memory address of the flagged cache rows in the defined log.

1. A computer system for virtual machine backup, the computer system comprising: a processor unit arranged to run a hypervisor running one or more virtual machines, to run multiple execution threads, and to define a log in memory; a cache connected to the processor unit and comprising a plurality of cache rows, each cache row comprising a memory address, a cache line, and an image modification flag; and a memory connected to the cache, wherein the hypervisor is arranged to maintain a thread mask flagging those threads that relate to one or more virtual machines being backed up; and the cache further comprises a cache controller arranged to: set the image modification flag for a cache line modified by a virtual machine being backed up, by reference to the thread mask; and write only the memory address of the flagged cache rows in the defined log.
2. The computer system of claim 1, wherein the cache controller is further arranged to write the memory address of a flagged cache line in the defined log upon the eviction of the flagged cache row from the cache.
3. The computer system of claim 2, wherein the cache controller is further arranged to write a thread ID of a ...
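A small Python sketch of the flag-and-log behavior follows (the thread mask, row layout, and periodic sweep are modeled loosely; the names are ours):

    class BackupCache:
        def __init__(self):
            self.rows = {}   # memory address -> {"line": data, "modified": flag}
            self.log = []    # the log the processor unit defined in memory

        def write(self, addr, data, thread_id, backup_thread_mask):
            row = self.rows.setdefault(addr, {"line": None, "modified": False})
            row["line"] = data
            # flag only writes by threads of a VM being backed up; writes by
            # the hypervisor in privilege mode leave the flag untouched
            if backup_thread_mask & (1 << thread_id):
                row["modified"] = True

        def periodic_check(self):
            for addr, row in self.rows.items():
                if row["modified"]:
                    self.log.append(addr)   # only the address goes in the log
                    row["modified"] = False

Logging addresses rather than data keeps the log small; a backup process can read the current line contents for each logged address afterwards.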

03-01-2019 publication date

MEMORY SYSTEM, MEMORY CONTROLLER FOR MEMORY SYSTEM, OPERATION METHOD OF MEMORY CONTROLLER, AND OPERATION METHOD OF USER DEVICE INCLUDING MEMORY DEVICE

Number: US20190004949A1
Assignee:

A system includes: a nonvolatile memory; a memory controller configured to control the nonvolatile memory, the memory controller including a first buffer memory for temporarily storing write data to be written to the nonvolatile memory; and a second buffer memory having a lower operational speed and a higher memory capacity than the first buffer memory. The memory controller is configured to transmit the write data from the first buffer memory to the second buffer memory and to the nonvolatile memory, and to release an operational state of the first buffer memory after transmitting the write data from the first buffer memory to the second buffer memory and to the nonvolatile memory. Writing additional write data to the first buffer memory is prohibited prior to the release of the operational state of the first buffer memory, and is permitted after the release of the operational state of the first buffer memory. 1. A system , comprising:a nonvolatile memory;a memory controller configured to control the nonvolatile memory, the memory controller including a first buffer memory for temporarily storing write data to be written to the nonvolatile memory; anda second buffer memory having a lower operational speed and a higher memory capacity than the first buffer memory,the memory controller being configured to transmit the write data from the first buffer memory to the second buffer memory and to the nonvolatile memory, and to release an operational state of the first buffer memory after transmitting the write data from the first buffer memory to the second buffer memory and to the nonvolatile memory,wherein writing additional write data to the first buffer memory is prohibited prior to the release of the operational state of the first buffer memory, and writing the additional write data to the first buffer memory is permitted after the release of the operational state of the first buffer memory.2. The system of claim 1 , wherein the memory controller is configured to ...

03-01-2019 publication date

MEMORY NODE WITH CACHE FOR EMULATED SHARED MEMORY COMPUTERS

Number: US20190004950A1
Author: Forsell Martti
Assignee: TEKNOLOGIAN TUTKIMUSKESKUS VTT OY

Data memory node for ESM (Emulated Shared Memory) architectures, comprising a data memory module containing data memory for storing input data therein and retrieving stored data therefrom responsive to predetermined control signals, a multi-port cache for the data memory, said cache being provided with at least one read port and at least one write port, said cache being configured to hold recently and/or frequently used data stored in the data memory, and an active memory unit at least functionally connected to a plurality of processors via an interconnection network, said active memory unit being configured to operate the cache upon receiving a multioperation reference incorporating a memory reference to the data memory of the data memory module from a number of processors of said plurality, wherein responsive to the receipt of the multioperation reference the active memory unit is configured to process the multioperation reference according to the type of the multioperation indicated in the reference, utilizing cached data in accordance with the memory reference and data provided in the multioperation reference. A method to be performed by the memory node is also presented.

1. A data memory node for use in ESM (Emulated Shared Memory) architectures, comprising: a data memory module containing data memory for storing input data therein and retrieving stored data therefrom responsive to predetermined control signals; a multi-port cache for the data memory, said cache being provided with at least one read port and at least one write port, said cache being configured to hold recently and/or frequently used data stored in the data memory; and an active memory unit at least functionally connected to a plurality of processors via an interconnection network, said active memory unit being configured to operate the cache upon receiving a multioperation reference incorporating a memory reference to the data memory of the data memory module from a number of processors of said plurality, wherein responsive to the receipt of the multioperation reference the active memory unit is configured to process the multioperation reference according to the type of the multioperation indicated in the reference, utilizing cached data in accordance with the memory reference and data provided in the multioperation reference.

01-01-2015 publication date

CACHING DATA BETWEEN A DATABASE SERVER AND A STORAGE SYSTEM

Number: US20150006813A1
Assignee:

Techniques are provided for using an intermediate cache between the shared cache of an application and the non-volatile storage of a storage system. The application may be any type of application that uses a storage system to persistently store data. The intermediate cache may be local to the machine upon which the application is executing, or may be implemented within the storage system. In one embodiment where the application is a database server, the database system includes both a DB server-side intermediate cache, and a storage-side intermediate cache. The caching policies used to populate the intermediate cache are intelligent, taking into account factors that may include which object an item belongs to, the item type of the item, a characteristic of the item, or the type of operation in which the item is involved.

1. A method comprising: at a storage system, responding to input/output (I/O) requests from one or more database servers by retrieving requested disk blocks from one or more storage devices within the storage system, the requested disk blocks storing data representative of database objects with respect to which the one or more database servers perform database operations; for a given disk block of the requested disk blocks, the storage system determining whether to cache the given disk block in an intermediate cache within the storage system, the determining being based at least partially upon one or more of: whether a given database object, for which the given disk block stores data, is associated with a particular designation; whether the given disk block is of an index block type; whether the given disk block is of a data block type; whether the given disk block is of an undo block type; whether the given disk block is encrypted; whether the given disk block is a secondary copy of a mirrored item; or whether the given disk block is involved in a table scan operation; when a particular disk block is cached in the intermediate ...
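The listed factors suggest a simple admission function. The sketch below is illustrative only: the field names, and which way each factor cuts, are our assumptions rather than the patent's actual policy.

    def should_cache(block, keep_objects):
        """Decide whether a retrieved disk block enters the intermediate cache."""
        if block["object"] in keep_objects:   # object carries a special designation
            return True
        if block["in_table_scan"]:            # large scans would thrash the cache
            return False
        if block["mirror_secondary"]:         # secondary mirror copies add little value
            return False
        if block["type"] == "index":          # index blocks tend to be re-read
            return True
        return block["type"] == "data" and not block["encrypted"]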

27-01-2022 publication date

OUT OF ORDER MEMORY REQUEST TRACKING STRUCTURE AND TECHNIQUE

Number: US20220027160A1
Assignee:

In a streaming cache, multiple, dynamically sized tracking queues are employed. Request tracking information is distributed among the plural tracking queues to selectively enable out-of-order memory request returns. A dynamically controlled policy assigns pending requests to tracking queues, providing for example in-order memory returns in some contexts and/or for some traffic and out of order memory returns in other contexts and/or for other traffic. 1. A memory request tracking circuit for use with a streaming cache memory , the memory request tracking circuit comprising:a tag check configured to detect cache misses;plural tracking queues; anda queue mapper coupled to the tag check and the plural tracking queues, the queue mapper being configured to distribute request tracking information to the plural tracking queues to enable in-order and out-of-order memory request returns.2. The memory request tracking circuit of wherein the queue mapper is programmable to preserve in-order memory request return handling for a first type of memory requests and to enable out-of-order memory request return handling for a second type of memory requests different from the first type of memory requests.3. The memory request tracking circuit of wherein the first and second types of memory requests are selected from the group consisting of loads from local or global memory; texture memory/storage; and acceleration data structure storage.4. The memory request tracking circuit of wherein the plural tracking queues comprise first through N tracking queues claim 1 , and the queue mapper allocates a first tracking queue to a particular warp and distributes certain types of memory requests evenly across second through N tracking queues.5. The memory request tracking circuit of wherein the plural tracking queues each comprise a first-in-first-out storage.6. The memory request tracking circuit of further includes a pipelined checker picker that selects tracking queue outputs for application ...
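The queue-mapper idea can be miniaturized in Python as follows (the in-order/out-of-order policy shown, and all names, are assumptions for illustration; the claims describe FIFO tracking queues and a per-warp or per-type distribution):

    from collections import deque

    class QueueMapper:
        def __init__(self, n_queues):
            self.queues = [deque() for _ in range(n_queues)]
            self.rr = 0   # round-robin cursor for the out-of-order queues

        def track(self, miss, in_order):
            if in_order:
                q = 0   # a single FIFO preserves request-return order
            else:
                # spread across the remaining queues; returns from different
                # queues may then complete out of order with respect to issue
                q = 1 + self.rr % (len(self.queues) - 1)
                self.rr += 1
            self.queues[q].append(miss)
            return q

    mapper = QueueMapper(n_queues=4)
    print(mapper.track("texture fetch", in_order=False))   # queue 1
    print(mapper.track("local load", in_order=True))       # queue 0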

27-01-2022 publication date

CONCURRENT MEMORY MANAGEMENT IN A COMPUTING SYSTEM

Number: US20220027264A1
Assignee:

An example method of memory management in a computing system having a plurality of processors includes: receiving a first memory allocation request at a memory manager from a process executing on a processor of the plurality of processors in the computing system; allocating a local memory pool for the processor from a global memory pool for the plurality of processors in response to the first memory allocation request; and allocating memory from the local memory pool for the processor in response to the first memory allocation request without locking the local memory pool. 1. A method of memory management in a computing system having a plurality of processors , the method comprising:receiving a first memory allocation request at a memory manager from a process executing on a processor of the plurality of processors in the computing system;allocating a local memory pool for the processor from a global memory pool for the plurality of processors in response to the first memory allocation request; andallocating memory from the local memory pool for the processor in response to the first memory allocation request without locking the local memory pool.2. The method of claim 1 , wherein the step of allocating the local memory pool comprises:locking the global memory pool;allocating an amount of memory from the global memory pool to the local memory pool; andreducing the global memory pool by the amount.3. The method of claim 1 , wherein the step of allocating the local memory pool comprises:determining insufficient memory in the global memory pool to satisfy allocation of the local memory pool;adding a request for allocation of the local memory pool to a global wait queue; andallocating an amount of memory from the global memory pool to the local memory pool in response to the request in the global wait queue and in response to sufficient memory becoming available in the global memory pool.4. The method of claim 1 , further comprising:receiving a second memory allocation ...
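The local/global split is easy to express in code. In this hedged Python sketch (names and the refill size are assumptions), only refills from the global pool take a lock; allocations from a processor's own local pool do not:

    import threading

    class GlobalPool:
        def __init__(self, total_bytes):
            self.free = total_bytes
            self.lock = threading.Lock()

        def carve(self, amount):
            with self.lock:                  # the global pool is locked...
                if self.free < amount:
                    # a real implementation would park the request on a
                    # global wait queue until memory is returned
                    raise MemoryError("global pool exhausted")
                self.free -= amount
                return amount

    class LocalPool:
        """Owned by one processor, so allocations need no locking."""
        def __init__(self, global_pool, chunk=1 << 20):
            self.global_pool = global_pool
            self.chunk = chunk
            self.free = 0

        def alloc(self, size):
            if self.free < size:             # refill from the global pool on demand
                self.free += self.global_pool.carve(max(self.chunk, size))
            self.free -= size                # ...but the local pool is not locked
            return size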

More
Publication date: 27-01-2022

Memory pipeline control in a hierarchical memory system

Number: US20220027275A1
Assignee: Texas Instruments Inc

In described examples, a processor system includes a processor core generating memory transactions, a lower level cache memory with a lower memory controller, and a higher level cache memory with a higher memory controller having a memory pipeline. The higher memory controller is connected to the lower memory controller by a bypass path that skips the memory pipeline. The higher memory controller: determines whether a memory transaction is a bypass write, which is a memory write request indicated not to result in a corresponding write being directed to the higher level cache memory; if the memory transaction is determined to be a bypass write, determines whether a memory transaction that prevents passing is in the memory pipeline; and if no transaction that prevents passing is determined to be in the memory pipeline, sends the memory transaction to the lower memory controller using the bypass path.

More
Publication date: 12-01-2017

SYSTEMS AND METHODS FACILITATING REDUCED LATENCY VIA STASHING IN SYSTEM ON CHIPS

Number: US20170010966A1
Author: Mittal Millind
Assignee:

Systems and methods that facilitate reduced latency via stashing in multi-level cache memory architectures of systems on chips (SoCs) are provided. One method involves stashing, by a device comprising a plurality of multi-processor central processing unit cores, first data into a first cache memory of a plurality of cache memories, the plurality of cache memories being associated with a multi-level cache memory architecture. The method also includes generating control information including: a first instruction to cause monitoring contents of a second cache memory of the plurality of cache memories to determine whether a defined condition is satisfied for the second cache memory; and a second instruction to cause prefetching the first data into the second cache memory of the plurality of cache memories based on a determination that the defined condition is satisfied.

1. A method, comprising: stashing, by a device comprising a plurality of multi-processor central processing unit cores, first data into a first cache memory of a plurality of cache memories, the plurality of cache memories being associated with a multi-level cache memory architecture; generating control information comprising a first instruction to cause monitoring contents of a second cache memory of the plurality of cache memories to determine whether a defined condition is satisfied for the second cache memory; and prefetching the first data from the first cache memory to the second cache memory based on execution of the first instruction.
2. The method of claim 1, wherein the prefetching and the generating are performed concurrently.
3. The method of claim 1, wherein the defined condition comprises the second cache memory failing to store the first data associated with a defined address.
4. The method of claim 1, wherein the first cache memory is a shared cache memory for two or more of the plurality of multi-processor CPU cores.
5. The method of claim 1, wherein the second cache memory is a per ...

More
Publication date: 12-01-2017

SYSTEM AND METHOD FOR DATA CACHING IN PROCESSING NODES OF A MASSIVELY PARALLEL PROCESSING (MPP) DATABASE SYSTEM

Number: US20170010968A1
Assignee:

The present technology relates to managing data caching in processing nodes of a massively parallel processing (MPP) database system. A directory is maintained that includes a list and a storage location of the data pages in the MPP database system. Memory usage is monitored in processing nodes by exchanging memory usage information with each other. Each of the processing nodes manages a list and a corresponding amount of available memory in each of the processing nodes based on the memory usage information. Data pages are read from a memory of the processing nodes in response to receiving a request to fetch the data pages, and a remote memory manager is queried for available memory in each of the processing nodes in response to receiving the request. The data pages are distributed to the memory of the processing nodes having sufficient space available for storage during data processing.

1. A method of managing data caching in processing nodes of a massively parallel processing (MPP) database system, comprising: maintaining a directory including a list of data pages, the list of data pages stored in one or more data tables, and a storage location of the data pages in the MPP database system; monitoring memory usage in one or more of the processing nodes of the MPP database system by exchanging memory usage information with each of the one or more processing nodes in the MPP database system, each of the one or more processing nodes managing a list of the one or more processing nodes and a corresponding amount of available memory in each of the one or more processing nodes based on the memory usage information; reading data pages from a memory of the one or more processing nodes in response to receiving a request to fetch the data pages; and querying a remote memory manager for available memory in each of the one or more processing nodes in response to receiving a request and distributing the data pages to the memory of one of the one or more processing nodes having ...

More
Publication date: 12-01-2017

FACILITATING PREFETCHING FOR DATA STREAMS WITH MULTIPLE STRIDES

Number: US20170010970A1
Author: Chou Yuan C.
Assignee: ORACLE INTERNATIONAL CORPORATION

The disclosed embodiments relate to a system that generates prefetches for a stream of data accesses with multiple strides. During operation, while a processor is generating the stream of data accesses, the system examines a sequence of strides associated with the stream of data accesses. Next, upon detecting a pattern having a single constant stride in the examined sequence of strides, the system issues prefetch instructions to prefetch a sequence of data cache lines consistent with the single constant stride. Similarly, upon detecting a recurring pattern having two or more different strides in the examined sequence of strides, the system issues prefetch instructions to prefetch a sequence of data cache lines consistent with the recurring pattern having two or more different strides.

1. A method for generating prefetches for a stream of data accesses with multiple strides, comprising: while a processor is generating the stream of data accesses, examining a sequence of strides associated with data addresses for the stream of data accesses; upon detecting a pattern having a single constant stride in the examined sequence of strides, issuing prefetch instructions to prefetch a sequence of data cache lines consistent with the single constant stride; and upon detecting a recurring pattern having two or more different strides in the examined sequence of strides, issuing prefetch instructions to prefetch a sequence of data cache lines consistent with the recurring pattern having two or more different strides.
2. The method of claim 1, wherein prior to examining the sequence of strides, the method further comprises generating the sequence of strides, wherein each stride indicates a distance between addresses for consecutive memory references associated with the stream of data accesses.
3. The method of claim 2, wherein while generating the sequence of strides, the method keeps track of data cache misses in a prefetch learning table (PLT), wherein each entry ...
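
The detection step can be modeled compactly. This is an illustrative reading rather than the patented logic: it finds the shortest stride pattern that repeats over the last two full periods (period 1 being the single-constant-stride case) and then issues addresses that continue the pattern. max_period, the two-period confirmation window, and the assumption that the history ends on a period boundary are all invented parameters.

def detect_pattern(strides, max_period=4):
    # Return the shortest repeating stride pattern, or None.
    for period in range(1, max_period + 1):
        if len(strides) < 2 * period:
            continue
        pattern = strides[-period:]
        if all(strides[-i - 1] == pattern[-(i % period) - 1]
               for i in range(2 * period)):
            return pattern
    return None

def issue_prefetches(addr, strides, count=4):
    # Assumes the observed history ends on a period boundary.
    pattern = detect_pattern(strides)
    if pattern is None:
        return []
    out = []
    for i in range(count):
        addr += pattern[i % len(pattern)]
        out.append(addr)
    return out

# 64, 64, 192 repeating: three loads within a region, then a jump
print(issue_prefetches(0x1000, [64, 64, 192, 64, 64, 192]))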

More
Publication date: 08-01-2015

REDUCING MEMORY TRAFFIC IN DRAM ECC MODE

Number: US20150012705A1
Assignee: NVIDIA CORPORATION

A method for managing memory traffic includes causing first data to be written to a data cache memory, where a first write request comprises a partial write and writes the first data to a first portion of the data cache memory, and further includes tracking the number of partial writes in the data cache memory. The method further includes issuing a fill request for one or more partial writes in the data cache memory if the number of partial writes in the data cache memory is greater than a predetermined first threshold.

1. A method for managing memory traffic, comprising: causing first data to be written to a data cache memory, wherein a first write request comprises a partial write and writes the first data to a first portion of the data cache memory; tracking the number of partial writes in the data cache memory; and issuing a fill request for one or more partial writes in the data cache memory if the number of partial writes in the data cache memory is greater than a predetermined first threshold.
2. The method of claim 1, further comprising causing second data to be written to the data cache memory, wherein a second write request comprises a partial write and writes the second data to a second portion of the data cache memory to create a full write in the data cache memory.
3. The method of claim 1, further comprising causing second data to be written to the data cache memory, wherein a second write request comprises a partial write and writes the second data to a second portion of the data cache memory without creating a full write in the data cache memory.
4. The method of claim 1, wherein a fill request is not issued for any partial write in the data cache memory if the number of partial writes in the data cache memory is less than the predetermined first threshold.
5. The method of claim 2, further comprising transmitting the data in the full write to a memory.
6. The method of claim 5, further comprising computing an error control checksum for ...
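
A toy model of the thresholding policy, assuming four sectors per cache line and an invented threshold: partial writes accumulate until later partials complete a line (no fill needed, and ECC can be computed over full data) or until the count of outstanding partials crosses the threshold, at which point the oldest partial gets a fill request.

class WriteCoalescer:
    def __init__(self, threshold=8):
        self.partials = {}  # line -> set of written sector offsets
        self.threshold = threshold
        self.fills = []     # fill requests issued so far

    def write(self, line, sectors, sectors_per_line=4):
        written = self.partials.setdefault(line, set())
        written.update(sectors)
        if len(written) == sectors_per_line:
            del self.partials[line]   # merged into a full write
        elif len(self.partials) > self.threshold:
            victim = next(iter(self.partials))  # oldest partial
            del self.partials[victim]
            self.fills.append(victim)  # a fill completes the line

wc = WriteCoalescer(threshold=2)
wc.write("A", {0}); wc.write("B", {1}); wc.write("C", {2})
print(wc.fills)  # ['A']: the oldest partial got a fill request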

More
Publication date: 08-01-2015

CACHE STICKINESS INDEX FOR CONTENT DELIVERY NETWORKING SYSTEMS

Number: US20150012710A1
Assignee: FACEBOOK, INC.

Various embodiments of the present disclosure relate to a cache stickiness index for providing measurable metrics associated with caches of a content delivery networking system. In one embodiment, a method for generating a cache stickiness index, including a cluster stickiness index and a region stickiness index, is disclosed. In embodiments, the cluster stickiness index is generated by comparing cache keys shared among a plurality of front-end clusters. In embodiments, the region stickiness index is generated by comparing cache keys shared among a plurality of data centers. In one embodiment, a system comprising means for generating a stickiness index is disclosed.

1. A method comprising: determining a plurality of working sets, each associated with one or more cache devices of a networking system; generating an index based on the plurality of working sets by identifying a shared cache key across the plurality of working sets; and storing the index in a database of the networking system for assisting content delivery.
2. A method according to claim 1, wherein generating the index based on the plurality of working sets includes generating a cluster stickiness index, wherein generating the cluster stickiness index comprises: determining a shared percentage for each working set of the plurality of working sets, the shared percentage being a percent of each working set that contains a cache key shared with remaining working sets from the plurality of working sets; and computing a cluster index value for each working set based on the shared percentage.
3. A method according to claim 2, wherein each working set corresponds to a front-end cluster of the networking system, wherein the front-end cluster is implemented by the one or more cache devices, such that each working set contains an aggregation of unique cached items of the one or more cache devices.
4. A method according to claim 3, wherein determining the plurality of working sets includes: ...
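
The cluster stickiness computation of claim 2 can be sketched directly: for each working set, the shared percentage is the fraction of its keys that also appear in some other cluster's working set. The dict-of-sets representation of working sets is an assumption made for illustration.

def cluster_stickiness(working_sets):
    # working_sets maps cluster name -> set of cache keys
    index = {}
    for name, keys in working_sets.items():
        others = set().union(*(v for k, v in working_sets.items()
                               if k != name))
        index[name] = len(keys & others) / len(keys) if keys else 0.0
    return index

ws = {"c1": {"a", "b", "c"}, "c2": {"b", "c"}, "c3": {"c", "d"}}
print(cluster_stickiness(ws))  # c1: 2/3 shared, c2: 1.0, c3: 0.5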

More
Publication date: 08-01-2015

System and method for atomically updating shared memory in multiprocessor system

Number: US20150012711A1
Assignee: Individual

A system for operating a shared memory of a multiprocessor system includes a set of processor cores and a corresponding set of core local caches, a set of I/O devices and a corresponding set of I/O device local caches. Read and write operations performed on a core local cache, an I/O device local cache, and the shared memory are governed by a cache coherence protocol (CCP) that ensures that the shared memory is updated atomically.

More
Publication date: 14-01-2016

MEMORY SEQUENCING WITH COHERENT AND NON-COHERENT SUB-SYSTEMS

Number: US20160011977A1
Assignee:

Operations associated with a memory and operations associated with one or more functional units may be received. A dependency between the operations associated with the memory and the operations associated with one or more of the functional units may be determined. A first ordering may be created for the operations associated with the memory. Furthermore, a second ordering may be created for the operations associated with one or more of the functional units based on the determined dependency and the first ordering of the operations associated with the memory.

1. A processor comprising: a memory; one or more functional units coupled to the memory; and a memory stream module coupled to the memory and the one or more functional units and to: receive a plurality of operations associated with the memory; receive a plurality of operations associated with the one or more functional units; determine a dependency between the plurality of operations associated with the memory and the plurality of operations associated with the one or more functional units; create a first ordering of the plurality of operations associated with the memory; and create a second ordering of the plurality of operations associated with the one or more functional units based on the dependency and the first ordering of the plurality of operations associated with the memory.
2. The processor of claim 1, wherein the dependency between the plurality of operations associated with the memory and the plurality of operations associated with the one or more functional units specifies that at least one operation of the plurality of operations associated with the one or more functional units is not to be executed until an operation of the plurality of operations associated with the memory has been executed.
3. The processor of claim 1, wherein the plurality of operations associated with the memory comprises read and write operations associated with the memory and wherein the plurality of operations ...

More
Publication date: 11-01-2018

MEMORY RESOURCE OPTIMIZATION METHOD AND APPARATUS

Number: US20180011638A1
Assignee: Huawei Technologies CO.,Ltd.

Embodiments of the present invention provide a memory resource optimization method and apparatus that relate to the computer field, solve the problem that existing multi-level memory resources affect each other, and optimize an existing single partitioning mechanism. A specific solution is: obtaining performance data of each program in a working set by using a page coloring technology, obtaining a category of each program in light of a memory access frequency, selecting, according to the category of each program, a page coloring-based partitioning policy corresponding to the working set, and writing the page coloring-based partitioning policy to an operating system kernel, to complete corresponding page coloring-based partitioning processing. The present invention is used to eliminate or reduce mutual interference of processes or threads on a memory resource in light of a feature of the working set, thereby improving overall performance of a computer.

1. A computer system, comprising an allocated last level cache (LLC), a dynamic random access memory bank (DRAM Bank), and a processor coupled to the LLC and the DRAM Bank, the processor to execute a first working set of programs via the LLC and the DRAM Bank, wherein the processor is configured to: partition processing resources according to a page coloring-based collaborative partitioning policy for the first working set, the processing resources including both the LLC and the DRAM Bank.
2. The computer system according to claim 1, wherein the page coloring-based collaborative partitioning policy is based on overlapped address bits (O-bits) of index bits of the LLC and index bits of the DRAM Bank in a physical page frame, the O-bits to index page coloring-based partitioning for capacity of the LLC and capacity of the DRAM Bank.
3. The computer system according to claim 2, wherein the processor is further configured to: acquire performance data of each program in the first working set of programs, wherein ...
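
A sketch of how O-bit page coloring might drive allocation, with invented bit positions: the color is read from physical-address bits that index both the LLC set and the DRAM bank, so restricting a program's pages to a set of colors partitions both resources at once. Real bit positions depend on the LLC geometry and the DRAM address mapping of the target machine.

def page_color(phys_addr, o_bit_lo=14, o_bit_mask=0x3):
    # Extract the overlapped address bits (O-bits); positions assumed.
    return (phys_addr >> o_bit_lo) & o_bit_mask

def pick_page(free_pages, allowed_colors, page_shift=12):
    # Return a free physical page whose color falls in the partition
    # assigned to the requesting program's category.
    for pfn in free_pages:
        if page_color(pfn << page_shift) in allowed_colors:
            return pfn
    return None

print(pick_page(range(16), allowed_colors={2, 3}))  # 8: first PFN with color 2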

More
Publication date: 11-01-2018

Apparatus to optimize gpu thread shared local memory access

Number: US20180011711A1
Assignee: Intel Corp

One embodiment provides for a graphics processor comprising first logic coupled with a first execution unit, the first logic to receive a first single instruction multiple data (SIMD) message from the first execution unit; second logic coupled with a second execution unit, the second logic to receive a second SIMD message from the second execution unit; and third logic coupled with a bank of shared local memory (SLM), the third logic to receive a first request to access the bank of SLM from the first logic, a second request to access the bank of SLM from the second logic, and in a single access cycle, schedule a read access to a read port for the first request and a write access to a write port for the second request.

More
Publication date: 11-01-2018

CONTROL STATE PRESERVATION DURING TRANSACTIONAL EXECUTION

Number: US20180011765A1
Assignee:

A method includes saving a control state for a processor in response to commencing a transactional processing sequence, wherein saving the control state produces a saved control state. The method also includes permitting updates to the control state for the processor while executing the transactional processing sequence. Examples of updates to the control state include key mask changes, primary region table origin changes, primary segment table origin changes, CPU tracing mode changes, and interrupt mode changes. The method also includes restoring the control state for the processor to the saved control state in response to encountering a transactional error during the transactional processing sequence. In some embodiments, saving the control state comprises saving the current control state to memory corresponding to internal registers for an unused thread or another level of virtualization. A corresponding computer system and computer program product are also disclosed herein.

1. A method comprising: saving a control state for a processor in response to commencing a transactional processing sequence, wherein saving the control state produces a saved control state; permitting updates to the control state for the processor while executing the transactional processing sequence; and restoring the control state for the processor to the saved control state in response to encountering a transactional error during the transactional processing sequence.
2. The method of claim 1, wherein saving the control state comprises saving the current control state to a backup set of internal control registers or registers corresponding to an unused thread or another level of virtualization.
3. The method of claim 1, wherein saving the control state comprises saving the current control state to a private location in memory.
4. The method of claim 3, wherein the private location is owned by an operating system thread or the central processing unit (CPU).
5. The method of claim 1, wherein ...

More
Publication date: 09-01-2020

Method and apparatus for caching data

Number: US20200012438A1
Assignee: EMC IP Holding Co LLC

Embodiments of the present disclosure relate to methods and apparatuses for caching data. A method comprises writing data into a first cache module on a first processor in response to receiving a first request for caching the data from a client module running on the first processor. The method further comprises transmitting, to the client module, a first indication that the data has been written into the first cache module. The method further comprises, in response to receiving from the client module a second request for synchronizing the data to a second processor, transmitting to the second processor a first command for causing the data to be written into a second cache module on the second processor. In addition, the method further comprises transmitting to the client module a second indication that the data has been synchronized.

More
Publication date: 19-01-2017

Compensating for Aging in Integrated Circuits

Number: US20170017572A9
Assignee:

An age compensation method and apparatus for an integrated circuit (IC). An IC may be configured to operate at an initial operating voltage at the beginning of its operational life. Various circuits may be used to detect aging of the IC, and indications of aging may be stored to determine the aging of the IC. The information indicative of the determined aging of the IC may be compared to an aging threshold. If the information indicates that the aging is greater than or equal to the determined aging threshold, the operating voltage of the IC may be increased. This process may be repeated over the life of the IC, increasing the operating voltage as the IC ages. Raising the operating voltage in response to aging may compensate for various age related degradation mechanisms that can occur over the operational life of the IC.

1-20. (canceled)
21. A method comprising: operating an integrated circuit (IC) at a first operating voltage; monitoring aging of the IC using one or more age detection circuits; determining if the aging of the IC is greater than or equal to a first aging threshold; and operating the IC at a second operating voltage responsive to determining that the aging of the IC is greater than or equal to the first aging threshold, wherein the second operating voltage is greater than the first operating voltage.
22. The method as recited in claim 21, further comprising: monitoring the aging of the IC subsequent to operating the IC at the second operating voltage; determining if the aging of the IC is greater than or equal to a second aging threshold; and operating the IC at a third operating voltage responsive to determining that the aging of the IC is greater than or equal to the second aging threshold, wherein the third operating voltage is greater than the second operating voltage.
23. The method as recited in claim 21, further comprising continuing operation of the IC at the first operating voltage responsive to determining that the aging of the IC is less than the ...

More
Publication date: 21-01-2016

Method and Apparatus For Flexible Cache Partitioning By Sets And Ways Into Component Caches

Number: US20160019157A1
Assignee:

Aspects include computing devices, systems, and methods for partitioning a system cache by sets and ways into component caches. A system cache memory controller may manage the component caches and manage access to the component caches. The system cache memory controller may receive system cache access requests specifying component cache identifiers, and match the component cache identifiers with records correlating traits of the component cache identifiers within a component cache configuration table. The component cache traits may include a set shift trait, set offset trait, and target ways, which may define the locations of the component caches in the system cache. The system cache memory controller may also receive a physical address for the system cache in the system cache access request, determine an indexing mode for the component cache, and translate the physical address for the component cache.
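
One plausible reading of the set shift and set offset traits, sketched below with invented encodings: set shift scales a component cache down to a power-of-two fraction of the system cache's sets, and set offset relocates it, so each component occupies a disjoint window of sets (ways would be masked separately by the target-ways trait, omitted here).

def component_cache_index(phys_addr, set_shift, set_offset, num_sets,
                          line_bytes=64):
    # Assumed semantics: the component owns num_sets >> set_shift sets,
    # starting at set_offset within the system cache.
    global_set = (phys_addr // line_bytes) % num_sets
    component_sets = num_sets >> set_shift
    return (set_offset + (global_set % component_sets)) % num_sets

# a component using 1/4 of a 1024-set cache, placed at set 512
print(component_cache_index(0x12345678, set_shift=2, set_offset=512,
                            num_sets=1024))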

More
Publication date: 21-01-2016

Method And Apparatus For A Shared Cache With Dynamic Partitioning

Number: US20160019158A1
Assignee:

Aspects include computing devices, systems, and methods for dynamically partitioning a system cache by sets and ways into component caches. A system cache memory controller may manage the component caches and manage access to the component caches. The system cache memory controller may receive system cache access requests and reserve locations in the system cache corresponding to the component caches correlated with component cache identifiers of the requests. Reserving locations in the system cache may activate the locations in the system cache for use by a requesting client, and may also prevent other clients from using the reserved locations in the system cache. Releasing the locations in the system cache may deactivate the locations in the system cache and allow other clients to use them. A client reserving locations in the system cache may change the amount of locations it has reserved within its component cache.

1. A method for dynamically partitioning a system cache, comprising: receiving a system cache access request comprising a component cache identifier from a client; retrieving a set shift trait and a set offset trait from a component cache configuration table correlated with the component cache identifier in the component cache configuration table; and activating locations in the system cache correlating to at least a portion of a group of ways of a component cache within a group of sets indicated by the set shift trait and the set offset trait.
2. The method of claim 1, wherein activating the locations in the system cache correlating to at least the portion of the group of ways of the component cache within the group of sets indicated by the set shift trait and the set offset trait comprises reserving the locations in the system cache.
3. The method of claim 2, wherein reserving the locations in the system cache comprises setting reserved indicators in a component cache reserve table for the locations in the system cache.
4. The method of claim 3, further ...

More
Publication date: 15-01-2015

PREFETCHING FOR MULTIPLE PARENT CORES IN A MULTI-CORE CHIP

Number: US20150019819A1
Assignee:

Embodiments relate to a method and computer program product for prefetching data on a chip. The chip has at least one scout core, multiple parent cores that cooperate together to execute various tasks, and a shared cache that is common between the scout core and the multiple parent cores. An aspect of the embodiments includes monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core. The method includes saving a fetch address by the at least one scout core based on the shared cache access occurring. The fetch address indicates a location of a specific line of cache requested by the base parent core.

1. A computer program product for prefetching data on a chip having at least one scout core, multiple parent cores that cooperate together to execute various tasks, and a shared cache that is common between the at least one scout core and the multiple parent cores, the computer program product comprising: a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: monitoring the multiple parent cores by the at least one scout core through the shared cache for a shared cache access occurring in a base parent core; saving a fetch address by the at least one scout core based on the shared cache access occurring, the fetch address indicating a location of a specific line of cache requested by the base parent core; determining an existence of a specific pattern by the at least one scout core, the specific pattern based on the fetch address, the specific pattern indicating that a mirroring parent core has a cache miss pattern correlating to a shared cache access pattern of the base parent core; and sending a prefetch request by the at least one scout core on behalf of the mirroring parent core based on determining the existence of the specific pattern, the prefetch request for fetching at least one projected future missing line of cache. ...

More
Publication date: 15-01-2015

PREFETCHING FOR A PARENT CORE IN A MULTI-CORE CHIP

Number: US20150019820A1
Assignee:

Embodiments of the invention relate to prefetching data on a chip having at least one scout core, at least one parent core, and a shared cache that is common between the at least one scout core and the at least one parent core. A prefetch code is executed by the scout core for monitoring the parent core. The prefetch code executes independently from the parent core. The scout core determines that at least one specified data pattern has occurred in the parent core based on monitoring the parent core. A prefetch request is sent from the scout core to the shared cache. The prefetch request is sent based on the at least one specified pattern being detected by the scout core. A data set indicated by the prefetch request is sent to the parent core by the shared cache.

1. A computer program product for prefetching data on a chip having at least one scout core, at least one parent core, and a shared cache that is common between the at least one scout core and the at least one parent core, the computer program product comprising: a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: executing a prefetch code by the at least one scout core for monitoring the at least one parent core, the prefetch code executing independently from the at least one parent core; determining by the at least one scout core that at least one specified data pattern has occurred in the at least one parent core based on monitoring the at least one parent core; sending a prefetch request from the at least one scout core to the shared cache, the sending based on the determining; and sending, by the shared cache, a data set indicated by the prefetch request to the at least one parent core.
2. The computer program product as claimed in claim 1, further comprising informing the at least one parent core that the prefetch request was made on behalf of the at least one parent core.
3. The computer ...

More
Publication date: 15-01-2015

SPECIFIC PREFETCH ALGORITHM FOR A CHIP HAVING A PARENT CORE AND A SCOUT CORE

Number: US20150019821A1
Assignee:

Embodiments relate to a method and computer program product for prefetching data on a chip having at least one scout core and a parent core. The method includes saving a prefetch code start address by the parent core. The prefetch code start address indicates where a prefetch code is stored. The prefetch code is specifically configured for monitoring the parent core based on a specific application being executed by the parent core. The method includes sending a broadcast interrupt signal by the parent core to the at least one scout core. The broadcast interrupt signal is sent based on the prefetch code start address being saved. The method includes monitoring the parent core by the prefetch code executed by the at least one scout core. The scout core executes the prefetch code based on receiving the broadcast interrupt signal.

1. A computer program product for prefetching data on a chip having at least one scout core and a parent core, the computer program product comprising: a tangible storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: saving a prefetch code start address by the parent core, the prefetch code start address indicating where a prefetch code is stored, and the prefetch code specifically configured for monitoring the parent core based on a specific application being executed by the parent core; sending a broadcast interrupt signal by the parent core to the at least one scout core, the broadcast interrupt signal being sent based on the prefetch code start address being saved; and monitoring the parent core by the at least one scout core, the at least one scout core executing the prefetch code to monitor the parent core, the executing of the prefetch code based on receiving the broadcast interrupt signal.
2. The computer program product as claimed in claim 1, wherein the parent core saves the prefetch code start address based on a task swap occurring ...

More
Publication date: 18-01-2018

METHOD FOR INCREASING CACHE SIZE

Number: US20180018265A1
Assignee:

A method for increasing storage space in a system containing a block data storage device, a memory, and a processor is provided. Generally, the processor is configured by the memory to tag metadata of a data block of the block storage device indicating the block as free, used, or semifree. The free tag indicates the data block is available to the system for storing data when needed, the used tag indicates the data block contains application data, and the semifree tag indicates the data block contains cache data and is available to the system for storing application data if no blocks marked with the free tag are available to the system.

1. A method for using a resource by one or more applications, the resource comprising multiple resource components that are individually accessed and controlled by an operating system for being used by the one or more applications, each of the resource components is tagged using a first tag, a second tag, or a third tag, and each of the resource components is capable of being used by the one or more applications for a first purpose and a second purpose, for use with a request from an application by an operating system to use two resource components respectively for the first and second purposes, the method comprising the steps of: determining if a resource component associated with the first tag or with the second tag is available for use; responsive to the determining, notifying the application if no resource component in the resource is associated with the first tag or with the second tag; determining, by the operating system, if a first resource component associated with the first tag is available in the resource; if a first resource component associated with the first tag is available, then: selecting the first resource component associated with the first tag; using the selected first resource component by the application for the first purpose; and tagging the first resource component with the third tag; determining, by the ...
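
The three-tag allocation order implied by the abstract, sketched with invented helper names: application data prefers free blocks and falls back to semifree ones (reclaiming their cache contents), while cache data only ever takes free blocks, which then become semifree.

FREE, USED, SEMIFREE = "free", "used", "semifree"

def claim_block(blocks, purpose):
    # blocks: dict block_id -> tag; purpose: "application" or "cache"
    order = [FREE, SEMIFREE] if purpose == "application" else [FREE]
    for tag in order:
        for blk, t in blocks.items():
            if t == tag:
                blocks[blk] = USED if purpose == "application" else SEMIFREE
                return blk
    return None  # caller is notified that no block is available

blocks = {1: USED, 2: SEMIFREE, 3: FREE}
print(claim_block(blocks, "cache"))        # 3: cache data, block now SEMIFREE
print(claim_block(blocks, "application"))  # 2: no FREE left, falls back to SEMIFREE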

More
Publication date: 28-01-2016

STORAGE SYSTEM AND METHOD FOR MIGRATING THE SAME

Number: US20160026409A1
Assignee: Hitachi, Ltd.

The present invention provides a storage system capable of performing data migration without stopping the IO access from a host computer, and without having to copy data in an external connection storage subsystem to a migration destination storage subsystem. Configuration information of a migration source volume in a migration source storage subsystem is migrated to the migration destination storage subsystem, and during migration, cache memories of the respective subsystems are used to duplicate and store write data from the host computer, while the migration destination storage subsystem reflects the write data to the external connection storage subsystem. Reading of data is also performed by the migration destination storage subsystem.

1. A storage system comprising a migration source storage subsystem, a migration destination storage subsystem and an external storage subsystem, wherein: the storage system has a host computer and a management device connected thereto; the migration source storage subsystem includes a migration source cache memory, which is associated with an external volume composed of a storage device of an external storage subsystem and provided virtually to the host computer as a migration source volume; and the migration destination storage subsystem includes a migration destination cache memory; wherein, based on a data migration instruction output by the management device for migration from the migration source storage subsystem to the migration destination storage subsystem, the storage system: associates and allocates the migration source volume to the migration destination storage subsystem as a migration destination volume; associates the migration destination volume to the external volume and virtually provides the same to the host computer; based on a purge command to the migration source cache memory, deletes stored data after reflecting the stored data in the external volume; and after completing the purge command, restricts the access from the host computer to the ...

More
Publication date: 26-01-2017

PARALLEL COMPUTER, INITIALIZATION METHOD OF PARALLEL COMPUTER, AND NON-TRANSITORY MEDIUM FOR STORING A PROGRAM

Number: US20170024222A1
Author: MATSUMORI Hitoshi
Assignee: FUJITSU LIMITED

A parallel computer includes a first processor, a second processor, and a first storage device. The first processor outputs, in response to an instruction for starting up the parallel computer, a first read-out request causing the first storage device to transmit a command of an initialization process to the first processor. The first processor executes the initialization process of the first processor by using the command received from the first storage device. The second processor monitors, in response to the instruction for starting up the parallel computer, a signal transmitted between the first processor and the first storage device. The second processor detects, from the signal monitored, the command output from the first storage device. And, the second processor is configured to execute the initialization process of the second processor by using the detected command.

1. A parallel computer, comprising: a first processor configured to: output, in response to an instruction for starting up the parallel computer, a first read-out request to a first storage device, the first read-out request causing the first storage device to transmit a command of an initialization process to the first processor; and execute the initialization process of the first processor by using the command received from the first storage device; and a second processor configured to: monitor, in response to the instruction for starting up the parallel computer, a signal transmitted between the first processor and the first storage device; detect, from the signal transmitted between the first processor and the first storage device, the command output from the first storage device; and execute the initialization process of the second processor by using the detected command.
2. The parallel computer according to claim 1, wherein the second processor comprises a cache memory, and the first read-out request output from the first processor includes an address at which the command is stored in the first storage device, ...

More
Publication date: 26-01-2017

SYSTEMS AND METHODS FOR SCHEDULING TASKS IN A HETEROGENEOUS PROCESSOR CLUSTER ARCHITECTURE USING CACHE DEMAND MONITORING

Number: US20170024316A1
Assignee:

Systems, methods, and computer programs are disclosed for scheduling tasks in a heterogeneous processor cluster architecture in a portable computing device. One embodiment is a system comprising a first processor cluster and a second processor cluster. The first processor cluster comprises a first shared cache, and the second processor cluster comprises a second shared cache. The system further comprises a controller in communication with the first and second processor clusters for performing task migration between the first and second processor clusters. The controller initiates execution of a task on a first processor in the first processor cluster. The controller monitors a processor workload for the first processor and a cache demand associated with the first shared cache while the task is running on the first processor in the first processor cluster. The controller migrates the task to the second processor cluster based on the processor workload and the cache demand.

1. A method for scheduling tasks in a heterogeneous processor cluster architecture in a portable computing device, the method comprising: running a task on a first processor in a first processor cluster in a heterogeneous processor cluster architecture comprising the first processor cluster having a first shared cache and a second processor cluster having a second shared cache; while the task is running on the first processor in the first processor cluster, monitoring a processor workload for the first processor and a cache demand associated with the first shared cache; and migrating the task to the second processor cluster based on the processor workload and the cache demand.
2. The method of claim 1, wherein the first processor comprises a dedicated cache miss counter in communication with the first shared cache.
3. The method of claim 1, wherein each processor in the first and second processor clusters comprises a dedicated cache miss counter for receiving cache miss signals from the corresponding ...
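
A toy migration policy combining the two monitored signals, with invented thresholds: a high load or a high shared-cache miss rate on a little-cluster core argues for moving the task to the big cluster, while a task that is light on both signals can move back.

def should_migrate(load, miss_rate, on_big_cluster,
                   load_up=0.7, load_down=0.3, miss_hot=0.05):
    # load: CPU utilization 0..1; miss_rate: shared-cache misses per access.
    # All thresholds are illustrative, not from the patent.
    if not on_big_cluster and (load > load_up or miss_rate > miss_hot):
        return "migrate to big cluster"    # starved for cycles or cache
    if on_big_cluster and load < load_down and miss_rate < miss_hot:
        return "migrate to little cluster"  # a small core suffices
    return "stay"

print(should_migrate(load=0.9, miss_rate=0.01, on_big_cluster=False))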

More
Publication date: 28-01-2016

Using a decrementer interrupt to start long-running hardware operations before the end of a shared processor dispatch cycle

Number: US20160026573A1
Assignee: International Business Machines Corp

Method to perform an operation, the operation comprising processing a first logical partition on a shared processor for the duration of a dispatch cycle, issuing, by a hypervisor, at a predefined time prior to completion of the dispatch cycle, a lightweight hypervisor decrementer (HDEC) interrupt specifying a cache line address buffer location in a virtual processor, and responsive to the lightweight HDEC, writing, by the shared processor, a set of cache line addresses used by the first logical partition to the cache line address buffer location in the virtual processor.

More
Publication date: 28-01-2016

Using a decrementer interrupt to start long-running hardware operations before the end of a shared processor dispatch cycle

Number: US20160026586A1
Assignee: International Business Machines Corp

Systems, methods, and computer program products to perform an operation, the operation comprising processing a first logical partition on a shared processor for the duration of a dispatch cycle, issuing, by a hypervisor, at a predefined time prior to completion of the dispatch cycle, a lightweight hypervisor decrementer (HDEC) interrupt specifying a cache line address buffer location in a virtual processor, and responsive to the lightweight HDEC, writing, by the shared processor, a set of cache line addresses used by the first logical partition to the cache line address buffer location in the virtual processor.

More
Publication date: 25-01-2018

Modified query execution plans in hybrid memory systems for in-memory databases

Number: US20180024928A1
Author: Ahmad Hassan
Assignee: SAP SE

Implementations of the present disclosure include methods, systems, and computer-readable storage mediums for receiving a query from an application, processing a query execution plan (QEP) of the query using a cache simulator to simulate queries to an in-memory database in a hybrid memory system, providing a miss-curve based on the QEP, the miss-curve relating miss-ratios to memory sizes, and determining relative sizes of a first type of memory and a second type of memory in the hybrid memory system at least partially based on the miss-curve.
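
The sizing decision driven by a miss-curve can be sketched as a simple cost minimization; the cost weights below are invented, not from the paper: each candidate fast-tier (e.g., DRAM) size is charged for its capacity plus a penalty proportional to the miss ratio, since misses spill to the slower tier (e.g., NVM).

def split_memory(miss_curve, total, fast_cost=4.0, slow_cost=1.0):
    # miss_curve maps fast-tier size -> miss ratio (from the simulator);
    # returns the (fast_size, slow_size) split with the lowest cost.
    best = None
    for fast, miss in miss_curve.items():
        cost = fast * fast_cost + (total - fast) * slow_cost + miss * 1e3
        if best is None or cost < best[0]:
            best = (cost, fast, total - fast)
    return best[1:]

curve = {1: 0.60, 2: 0.25, 4: 0.08, 8: 0.02}  # GiB -> miss ratio
print(split_memory(curve, total=16))  # (8, 8) under these weights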

More
Publication date: 23-01-2020

NETWORK INTERFACE DEVICE AND HOST PROCESSING DEVICE

Number: US20200026557A1
Assignee: Solarflare Communications, Inc.

A network interface device has an input configured to receive data from a network. The data is for one of a plurality of different applications. The network interface device also has at least one processor configured to determine into which of a plurality of available different caches in a host system the data is to be injected by accessing a receive queue comprising at least one descriptor indicating a cache location in one of said plurality of caches to which data is to be injected, wherein said at least one descriptor, which indicates the cache location, has an effect on subsequent descriptors of said receive queue until a next descriptor indicates another cache location. The at least one processor is also configured to cause the data to be injected to the cache location in the host system.

1. A network interface device comprising: an input configured to receive data from a network, said data being for one of a plurality of different applications; and at least one processor configured to: determine into which of a plurality of available different caches in a host system said data is to be injected by accessing a receive queue comprising at least one descriptor indicating a cache location in one of said plurality of caches to which data is to be injected, wherein said at least one descriptor, which indicates the cache location, has an effect on subsequent descriptors of said receive queue until a next descriptor indicates another cache location; and cause said data to be injected to the cache location in said host system.
2. The network interface device as claimed in claim 1, wherein at least two of said plurality of caches are associated with different CPU cores and/or different physical dies.
3. The network interface device as claimed in claim 1, wherein said at least one processor is configured to determine which of said plurality of caches in a host system is to be injected in dependence on cache information provided by an application thread of said application.
4. The ...

More
Publication date: 23-01-2020

REGULATING HARDWARE SPECULATIVE PROCESSING AROUND A TRANSACTION

Number: US20200026558A1
Assignee:

A transaction is detected. The transaction has a begin-transaction indication and an end-transaction indication. If it is determined that the begin-transaction indication is not a no-speculation indication, then the transaction is processed.

1. A method comprising: determining, by one or more computer processors, that instructions preceding a transaction have not completed; prohibiting, by one or more computer processors, the transaction from being processed until a determination is made that indicates that all pending outside instructions are not, or are no longer, being processed in a speculative manner; and determining, by one or more computer processors, that an end-transaction indication associated with the transaction indicates an end to a period of no-speculation transaction processing.
2. The method of claim 1, wherein the transaction comprises two or more instructions to be processed atomically on a data structure in a memory.
3. The method of claim 1, the method comprising: determining, by one or more computer processors, that a begin-transaction indication associated with a transaction is a no-speculation indication, wherein the begin-transaction indication is selected from the group consisting of: a new instruction, a new prefix instruction, or a variant of an instruction in a current instruction set architecture.
4. The method of claim 1, the method comprising: determining, by one or more computer processors, that the instructions preceding the transaction have completed; and responsive to determining that the instructions preceding the transaction have completed, processing, by one or more computer processors, the transaction.
5. The method of claim 4, the method comprising: responsive to processing the transaction, determining, by one or more computer processors, whether an end-transaction indication associated with the transaction is a no-speculation indication; and responsive to determining that the end-transaction indication is the no-speculation ...

More
Publication date: 23-01-2020

PREFETCH PROTOCOL FOR TRANSACTIONAL MEMORY

Number: US20200026651A1
Assignee:

Providing control over processing of a prefetch request in response to conditions in a receiver of the prefetch request and to conditions in a source of the prefetch request. A processor generates a prefetch request and a tag that dictates processing the prefetch request. A processor sends the prefetch request and the tag to a second processor. A processor generates a conflict indication based on whether a concurrent processing of the prefetch request and an atomic transaction by the second processor would generate a conflict with a memory access that is associated with the atomic transaction. Based on an analysis of the conflict indication and the tag, a processor processes (i) either the prefetch request or the atomic transaction, or (ii) both the prefetch request and the atomic transaction.

1. A method to control processing of a prefetch request, the method comprising: generating, by a first processor in a multiprocessor system, a prefetch request and a tag that dictates processing the prefetch request; sending, by one or more processors in the multiprocessor system, the prefetch request and the tag to a second processor in the multiprocessor system; generating, by one or more processors in the multiprocessor system, a conflict indication based on whether a concurrent processing of the prefetch request and an atomic transaction by the second processor would generate a conflict with a memory access that is associated with the atomic transaction; and based on an analysis of the conflict indication and the tag, processing, by the second processor in the multiprocessor system, (i) either the prefetch request or the atomic transaction, or (ii) both the prefetch request and the atomic transaction.
2. The method of claim 1, further comprising: generating, by the first processor in the multiprocessor system, the tag of the prefetch request according to a prefetch protocol, wherein the prefetch request includes (a) a description of at least one prefetch request operation and (b) ...

More
Publication date: 28-01-2021

NETWORK INTERFACE DEVICE AND HOST PROCESSING DEVICE

Number: US20210026689A1
Assignee: XILINX, INC.

A network interface device has an input configured to receive data from a network. The data is for one of a plurality of different applications. The network interface device also has at least one processor configured to determine into which of a plurality of available different caches in a host system the data is to be injected by accessing a receive queue comprising at least one descriptor indicating a cache location in one of said plurality of caches to which data is to be injected, wherein said at least one descriptor, which indicates the cache location, has an effect on subsequent descriptors of said receive queue until a next descriptor indicates another cache location. The at least one processor is also configured to cause the data to be injected to the cache location in the host system.

1. A network interface device configured to interface a host system with a network, the network interface device comprising: an input configured to receive data from the network, said data being for one of a plurality of different applications; and circuitry configured to: determine at least one of a plurality of caches into which at least part of said data is to be injected; and cause the at least part of said data to be injected to the determined at least one of the caches, wherein each of the plurality of caches is associated with one or more of a plurality of processors, and wherein the network interface device comprises at least one of the plurality of processors and one or more of the determined at least one of the caches.
2. The network interface device as claimed in claim 1, wherein the network interface device comprises the plurality of processors and the plurality of caches.
3. The network interface device as claimed in claim 1, wherein the host system comprises at least one of the plurality of processors and a further one or more of the determined at least one of the caches.
4. The network interface device as claimed in claim 1, wherein the determined at least one of ...

More
Publication date: 28-01-2021

TRANSLATION SUPPORT FOR A VIRTUAL CACHE

Number: US20210026783A1
Assignee:

Disclosed herein is a virtual cache and method in a processor for supporting multiple threads on the same cache line. The processor is configured to support virtual memory and multiple threads. The virtual cache directory includes a plurality of directory entries, each entry associated with a cache line. Each cache line has a corresponding tag. The tag includes a logical address, an address space identifier, a real address bit indicator, and a per-thread validity bit for each thread that accesses the cache line. When a subsequent thread determines that the cache line is valid for that thread, the validity bit for that thread is set, while no validity bits for other threads are invalidated.

1. A method of operating a primary processor cache for a processor with virtual memory support and multiple threads, wherein a logically indexed and logically tagged cache directory is used, and wherein an entry in the directory contains an absolute memory address in addition to a corresponding logical memory address, each entry including a valid bit for each thread that accesses each entry, comprising: creating a new entry for the cache line in the primary cache in response to the cache line being in a secondary cache, and not in the primary cache; determining by a second thread that an entry for the cache line is present in the primary cache; in response to determining that the entry for the cache line is present in the primary cache, determining that the cache line is not valid for the second thread; executing a lookup to determine an address for the cache line in the primary cache; determining that the address for the cache line and the entry are the same cache line; in response to determining that the address and the entry are the same, setting the valid bit associated with the second thread to valid, and not invalidating the valid bit associated with other threads in the cache entry that have a valid bit in the cache entry, wherein the valid bit is independent of other ...
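
A behavioral sketch of the per-thread validity bits, assuming a translate() callback stands in for the second-level address lookup: a thread that re-verifies the line sets its own bit without clearing anyone else's, so later accesses by either thread hit without translation.

class VirtualCacheEntry:
    def __init__(self, num_threads, logical_addr, real_addr):
        self.logical_addr = logical_addr
        self.real_addr = real_addr
        self.valid = [False] * num_threads  # one bit per thread

    def lookup(self, thread, logical_addr, translate):
        if logical_addr != self.logical_addr:
            return False
        if self.valid[thread]:
            return True                      # fast hit, no translation
        if translate(logical_addr) == self.real_addr:
            self.valid[thread] = True        # other threads' bits untouched
            return True
        return False

e = VirtualCacheEntry(4, logical_addr=0x1000, real_addr=0x9000)
e.lookup(0, 0x1000, lambda la: 0x9000)
e.lookup(1, 0x1000, lambda la: 0x9000)
print(e.valid)  # [True, True, False, False]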

More
Publication date: 29-01-2015

ELECTRONIC DEVICES HAVING SEMICONDUCTOR MEMORY UNITS AND METHOD OF FABRICATING THE SAME

Number: US20150032960A1
Author: DONG Cha-Deok
Assignee: SK HYNIX INC.

Electronic devices have a semiconductor memory unit including a magnetization compensation layer in a contact plug. One implementation of the semiconductor memory unit includes a variable resistance element having a stacked structure of a first magnetic layer, a tunnel barrier layer, and a second magnetic layer, and a contact plug arranged in at least one side of the variable resistance element and comprising a magnetization compensation layer. Another implementation includes a variable resistance element having a stacked structure of a first magnetic layer having a variable magnetization, a tunnel barrier layer, and a second magnetic layer having a pinned magnetization; and a contact plug arranged at one side of and separated from the variable resistance element to include a magnetization compensation layer that produces a magnetic field to reduce an influence of a magnetic field of the second magnetic layer on the first magnetic layer.

1. An electronic device comprising a semiconductor memory unit that includes: a variable resistance element having a stacked structure of a first magnetic layer having a variable magnetization, a tunnel barrier layer, and a second magnetic layer having a pinned magnetization; and a contact plug arranged at one side of the variable resistance element and separated from the variable resistance element, the contact plug comprising a magnetization compensation layer that produces a magnetic field at the variable resistance element to reduce an influence of a magnetic field of the second magnetic layer on the first magnetic layer.
2. The electronic device of claim 1, wherein the magnetization compensation layer has a thickness greater than a critical dimension (CD) thereof.
3. The electronic device of claim 1, wherein the magnetization compensation layer comprises a conductive material having a horizontal magnetic property in that a magnetization of the magnetization compensation layer is in a plane of the magnetization compensation layer. ...

More
Publication date: 02-02-2017

NUMA SCHEDULING USING INTER-VCPU MEMORY ACCESS ESTIMATION

Number: US20170031819A1
Assignee:

In a system having non-uniform memory access architecture, with a plurality of nodes, memory access by entities such as virtual CPUs is estimated by invalidating a selected sub-set of memory units, and then detecting and compiling access statistics, for example by counting the page faults that arise when any virtual CPU accesses an invalidated memory unit. The entities, or pairs of entities, may then be migrated or otherwise co-located on the node for which they have greatest memory locality.

1. A method for managing memory in a system, said system including a plurality of software entities that each access the memory and having a non-uniform memory access architecture (NUMA) with a plurality of nodes, the method comprising: assigning a first software entity to a first node; computing a first metric measuring pair-wise memory sharing between pairs of the software entities; selecting a second software entity having the highest memory sharing with the first software entity based on the computed first metric; and assigning the second software entity to the first node.
2. The method of claim 1, wherein the software entities are virtual CPUs of virtual machines.
3. The method of claim 2, wherein the process of computing the first metric is performed for each pair of the virtual CPUs within a single virtual machine.
4. The method of claim 2, wherein the system includes a virtual machine.
5. The method of claim 1, further comprising: selecting a sample set of memory units; and invalidating the sample set of memory units and detecting accesses by any of the software entities to the invalidated memory units; wherein the process of computing the first metric is based at least in part on the detected accesses by any of the software entities to the invalidated memory units.
6. The method of claim 5, wherein the software entities are virtual CPUs of virtual machines, the method further comprising: invalidating the sample set of memory units by invalidating the sample set in a ...
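
The sampling-based estimate can be sketched in a few lines, assuming faults maps each vCPU to the set of sampled (invalidated) pages it touched during the window: pages in the sample set fault on first touch, so two vCPUs faulting on the same page are sharing it, and the pair with the highest overlap is the co-location candidate.

from itertools import combinations

def pairwise_sharing(faults):
    # faults: dict vcpu -> set of sampled pages it faulted on
    score = {}
    for a, b in combinations(faults, 2):
        score[(a, b)] = len(faults[a] & faults[b])
    return score

faults = {"vcpu0": {1, 2, 3}, "vcpu1": {2, 3, 4}, "vcpu2": {9}}
scores = pairwise_sharing(faults)
print(max(scores, key=scores.get))  # ('vcpu0', 'vcpu1'): co-locate them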

04-02-2016 publication date

Slice-Based Random Access Buffer for Data Interleaving

Number: US20160034393A1
Assignee: LSI Corporation

The disclosure is directed to a system and method for interleaving data utilizing a random access buffer that includes a plurality of independently accessible memory slots. The random access buffer is configured to store slices of incoming data sectors in free memory slots, where a free memory slot is identified by a status flag associated with a logical address of the free memory slot. Meanwhile, a label buffer is configured to store labels associated with the slices of the incoming data sectors in a sequence based upon an interleaving scheme. Media sectors including the interleaved data slices are read out from the memory slots of the random access buffer in order of the sequence of labels stored by the label buffer. As the media sectors are read out of the random access buffer, the corresponding memory slots are freed up for incoming slices of the next super-sector.

1. A system for interleaving data, comprising: a slice divider configured to receive incoming data sectors of a super-sector, the slice divider being further configured to divide the incoming data sectors into slices; a random access buffer including memory slots for storing data sector slices, the random access buffer being configured to store the slices of the incoming data sectors in free memory slots, wherein a free memory slot is identified by a status flag associated with a logical address of the free memory slot; a label buffer configured to store labels associated with the slices of the incoming data sectors in a sequence based upon an interleaving scheme; and a processor in communication with the random access buffer and the label buffer, the processor being configured to read out media sectors corresponding to the super-sector, wherein a media sector includes interleaved data slices read out from the memory slots of the random access buffer in order of the sequence of labels stored by the label buffer.
2. The system of claim 1, wherein the processor is configured to read out the media sectors ...
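A small C sketch of the slot-and-label mechanism described above, using a toy super-sector of two sectors with two slices each; the struct slot, store_slice, and the particular interleaving order pos[] are illustrative assumptions, not the patent's actual scheme.

#include <stdio.h>
#include <string.h>

#define NSLOTS 8
#define SLICE_LEN 4

/* One independently accessible memory slot of the random access buffer. */
struct slot {
    char data[SLICE_LEN + 1];
    int  in_use;   /* status flag: 0 = free, 1 = occupied */
};

static struct slot buf[NSLOTS];   /* random access buffer */
static int labels[NSLOTS];        /* label buffer: slot addresses in read-out order */

/* Store one incoming slice in the first free slot and return its
 * logical address; -1 if the buffer is full. */
static int store_slice(const char *slice)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (!buf[i].in_use) {
            strncpy(buf[i].data, slice, SLICE_LEN);
            buf[i].data[SLICE_LEN] = '\0';
            buf[i].in_use = 1;
            return i;
        }
    }
    return -1;
}

int main(void)
{
    /* Two incoming data sectors A and B, two slices each. */
    const char *slices[] = {"A0", "A1", "B0", "B1"};
    /* Interleaving scheme: slice i is read out at sequence position pos[i]. */
    int pos[] = {0, 2, 1, 3};

    for (int i = 0; i < 4; i++)
        labels[pos[i]] = store_slice(slices[i]);

    /* Read out the media sector in label-sequence order, freeing each
     * slot for slices of the next super-sector. */
    for (int i = 0; i < 4; i++) {
        int addr = labels[i];
        printf("%s ", buf[addr].data);   /* prints: A0 B0 A1 B1 */
        buf[addr].in_use = 0;
    }
    printf("\n");
    return 0;
}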

04-02-2016 publication date

Method and Apparatus for Ensuring Data Cache Coherency

Number: US20160034395A1
Assignee: Imagination Technologies Ltd

A multithreaded processor can concurrently execute a plurality of threads in a processor core. The threads can access a shared main memory through a memory interface, generating read and write transactions against that shared memory. An incoherency detection module prevents incoherency by maintaining a record of outstanding global writes and detecting a global read that conflicts with one of them. A barrier is sequenced behind the conflicting global write, and the conflicting global read is allowed to proceed only after that write and its barrier have cleared. The sequence can be maintained by a separate queue for each thread of the plurality.
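A toy C model of this mechanism, simulating in software the per-thread queue of outstanding global writes; write_queue, post_write, and drain_through are invented names for illustration, not the patent's hardware interfaces.

#include <stdio.h>

#define QLEN 8

/* Per-thread record of outstanding (not yet completed) global writes. */
struct write_queue {
    unsigned addr[QLEN];
    int head, tail;   /* FIFO order preserves write/barrier sequence */
};

static int outstanding(const struct write_queue *q, unsigned addr)
{
    for (int i = q->head; i != q->tail; i = (i + 1) % QLEN)
        if (q->addr[i] == addr)
            return 1;
    return 0;
}

static void post_write(struct write_queue *q, unsigned addr)
{
    q->addr[q->tail] = addr;
    q->tail = (q->tail + 1) % QLEN;
}

/* Drain the queue up to and including `addr`: models waiting for the
 * conflicting write, and the barrier sequenced behind it, to clear. */
static void drain_through(struct write_queue *q, unsigned addr)
{
    while (q->head != q->tail) {
        unsigned done = q->addr[q->head];
        q->head = (q->head + 1) % QLEN;
        printf("  write to %#x completed\n", done);
        if (done == addr)
            break;   /* barrier point reached */
    }
}

int main(void)
{
    struct write_queue t0 = {{0}, 0, 0};   /* thread 0's queue */

    post_write(&t0, 0x100);                /* thread 0: global writes */
    post_write(&t0, 0x200);

    unsigned read_addr = 0x100;            /* thread 1: global read */
    if (outstanding(&t0, read_addr)) {
        printf("conflicting read of %#x: insert barrier, stall read\n",
               read_addr);
        drain_through(&t0, read_addr);     /* wait for write + barrier */
    }
    printf("read of %#x proceeds\n", read_addr);
    return 0;
}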

04-02-2016 publication date

Method and Apparatus for Processing Data and Computer System

Number: US20160034397A1
Assignee:

A method and an apparatus for processing data and a computer system are provided. The method includes copying a shared virtual memory page to which a first process requests access into off-chip memory of a computing node, and using the shared virtual memory page copied into the off-chip memory as a working page of the first process; and, before the first process performs a write operation on the working page, creating, in on-chip memory of the computing node, a backup page of the working page, so as to back up original data of the working page. Because the page data is backed up in on-chip memory before any write to the working page, data consistency is ensured when multiple processes operate on a shared virtual memory page, while off-chip memory is accessed as little as possible and program speed is improved.

1. A method for processing data, comprising: copying a shared virtual memory page to which a first process requests access into off-chip memory of a computing node, and using the shared virtual memory page copied into the off-chip memory as a working page of the first process, wherein the shared virtual memory page is a virtual memory page in shared virtual memory of an application program to which the first process belongs, and wherein the application program runs on the computing node; and creating a backup page of the working page, before the first process performs a write operation on the working page, and storing the created backup page into on-chip memory of the computing node, so as to back up original data of the working page.
2. The method according to claim 1, wherein a quantity of working pages of the first process is M, wherein M is a positive integer greater than or equal to 1; and wherein, before the storing of the created backup page into the on-chip memory of the computing node, the method further comprises determining whether remaining space of the on-chip memory is less than a first ...
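A minimal C sketch of the backup-before-write step, assuming the working page already resides in an off-chip buffer; working_page, backup_page, and write_page are illustrative names, and the real mechanism would operate on hardware pages rather than plain arrays.

#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 16

/* Off-chip working copy of a shared virtual memory page. */
static char working_page[PAGE_SIZE] = "original data";

/* On-chip backup created before the first write, so the original
 * contents remain available for consistency handling. */
static char backup_page[PAGE_SIZE];
static int  backed_up = 0;

static void write_page(int off, char value)
{
    if (!backed_up) {   /* back up original data before the first write */
        memcpy(backup_page, working_page, PAGE_SIZE);
        backed_up = 1;
    }
    working_page[off] = value;
}

int main(void)
{
    write_page(0, 'X');
    printf("working: %s\n", working_page);   /* Xriginal data */
    printf("backup : %s\n", backup_page);    /* original data */
    return 0;
}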

01-02-2018 publication date

TECHNIQUES TO ALLOCATE REGIONS OF A MULTI-LEVEL, MULTI-TECHNOLOGY SYSTEM MEMORY TO APPROPRIATE MEMORY ACCESS INITIATORS

Number: US20180032429A1
Assignee:

A method is described. The method includes recognizing different latencies and/or bandwidths between different levels of a system memory and different memory access requestors of a computing system. The system memory includes the different levels and different technologies. The method also includes allocating each of the memory access requestors with a respective region of the system memory having an appropriate latency and/or bandwidth.

1. A method, comprising: recognizing different latencies and/or bandwidths between different levels of a system memory and different memory access requestors of a computing system, the system memory comprising the different levels and different technologies; and allocating each of the memory access requestors with a respective region of the system memory having an appropriate latency and/or bandwidth.
2. The method of claim 1, wherein the different technologies comprise DRAM and an emerging non-volatile memory technology.
3. The method of claim 2, wherein the emerging non-volatile memory technology comprises chalcogenide.
4. The method of claim 1, wherein the different latencies and/or bandwidths further comprise different latencies and/or bandwidths between a read operation and a write operation.
5. The method of claim 1, wherein the different levels of the system memory comprise a level that is integrated in a same semiconductor chip package as a processor having CPU cores.
6. The method of claim 1, wherein the recognizing further comprises analyzing attributes of the different levels of the system memory from a record kept in BIOS of the computing system.
7. The method of claim 6, wherein the attributes are compatible with any of the following standards: ACPI; NVDIMM.
8. A machine readable storage medium having contained thereon program code that when processed by a computing system causes the computing system to perform a method, comprising: recognizing different latencies and/or bandwidths between different levels of a system memory and different memory access requestors of a ...
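A short C sketch of the allocation idea, assuming latency attributes like those platform firmware might record for each memory level; the region and requestor tables, the numbers, and the "slowest region that still meets the need" rule are all illustrative assumptions, not the patent's policy.

#include <stdio.h>

struct region {                 /* one level of the multi-level memory */
    const char *name;
    int latency_ns;             /* lower is faster */
};

struct requestor {              /* a memory access initiator */
    const char *name;
    int max_latency_ns;         /* what the initiator can tolerate */
};

int main(void)
{
    struct region regions[] = {
        {"on-package DRAM",  40},
        {"DDR DRAM",         80},
        {"non-volatile far", 300},
    };
    struct requestor reqs[] = {
        {"CPU cores",        50},
        {"GPU",             100},
        {"network offload", 400},
    };

    /* Give each requestor the slowest region that still meets its
     * latency need, keeping faster levels free for demanding initiators. */
    for (int i = 0; i < 3; i++) {
        for (int j = 2; j >= 0; j--) {
            if (regions[j].latency_ns <= reqs[i].max_latency_ns) {
                printf("%s -> %s\n", reqs[i].name, regions[j].name);
                break;
            }
        }
    }
    return 0;
}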

31-01-2019 publication date

Transfer track format information for tracks in cache at a first processor node to a second processor node to which the first processor node is failing over

Number: US20190034303A1
Assignee: International Business Machines Corp

Provided are a computer program product, system, and method for managing failover from a first processor node including a first cache to a second processor node including a second cache. Storage areas assigned to the first processor node are reassigned to the second processor node. For each track indicated in a cache list of tracks in the first cache for the reassigned storage areas, the first processor node adds a track identifier of the track and track format information indicating a layout and format of data in the track to a cache transfer list. The first processor node transfers the cache transfer list to the second processor node. The second processor node uses the track format information transferred with the cache transfer list to process read and write requests to tracks in the reassigned storage areas staged into the second cache.
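A minimal C sketch of the cache transfer list itself, assuming the track format can be summarized in a single code; transfer_entry, add_track, and format_code are invented names for illustration, and a real implementation would carry richer layout metadata.

#include <stdio.h>

/* One entry of the cache transfer list: enough for the takeover node to
 * process reads/writes without rediscovering the track layout. */
struct transfer_entry {
    unsigned track_id;
    unsigned format_code;   /* encodes layout/format of data in the track */
};

#define MAX_ENTRIES 16

struct transfer_list {
    struct transfer_entry e[MAX_ENTRIES];
    int n;
};

/* First node: for each track in its cache belonging to a reassigned
 * storage area, record the track identifier and format information. */
static void add_track(struct transfer_list *l, unsigned id, unsigned fmt)
{
    if (l->n < MAX_ENTRIES) {
        l->e[l->n].track_id = id;
        l->e[l->n].format_code = fmt;
        l->n++;
    }
}

int main(void)
{
    struct transfer_list list = { .n = 0 };

    add_track(&list, 1001, 0x2);   /* failing-over node builds the list */
    add_track(&list, 1002, 0x5);

    /* Second node: use the transferred format info when staging these
     * tracks into its own cache for subsequent read/write requests. */
    for (int i = 0; i < list.n; i++)
        printf("stage track %u with format %#x\n",
               list.e[i].track_id, list.e[i].format_code);
    return 0;
}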
