Total found: 3090. Displayed: 199.
Publication date: 07-10-2019

SCALABLE DATA STORAGE POOLS

Number: RU2702268C2

The invention relates to computer engineering. The technical result is an increase in the maximum number of storage devices in a storage pool. A method of managing the storage resources of a storage pool by means of a computing device comprises the steps of: obtaining data that describe fault domains in a storage hierarchy and the storage resources available in the storage pool; assigning the distribution of metadata to one or more specific storage devices in the storage pool based on the described fault domains; and determining operating characteristics of the devices associated with said available storage resources within one or more levels of the storage hierarchy, the operating characteristics including health status, data connection type, media type, usage types within the storage pool, or whether the devices are currently used for ...

Publication date: 20-12-2012

MOBILE CORE NETWORK NODE REDUNDANCY

Number: RU2470484C2

The invention relates to communication networks. The technical result is the prevention of interface conflicts between proxy servers and other network nodes. An apparatus configured to operate within a communication network as a pool proxy routes signalling traffic between a node of a first network and one of a set of pool nodes within a second network. The apparatus is further configured to operate in one of an active state and a passive state with respect to the first network and, when in said active state, to send a periodic signal to at least one of said pool nodes for relaying to a peer pool proxy and, when in said passive state, to receive a periodic signal from a peer pool proxy relayed via at least one of said pool nodes. In the event that no periodic signal is received while in ...

Publication date: 08-06-2018

Self-diagnosing on-board computing system with standby redundancy

Number: RU2657166C1

The invention relates to computer engineering and can be used in systems of various purposes that require high reliability and radiation hardness. The technical result is a reduction in the time needed to activate a standby system that is powered off, while ensuring high reliability, fault tolerance and radiation hardness. An analogous standby system is added to the self-diagnosing on-board computing system, which contains a main system; each of the systems has two identical channels, a main one and a standby one, and each channel is provided with a secondary power connection circuit and a redundancy device. In each channel, the inputs of the redundancy device are connected to the output of the processor, to the output of the system clock generator and to the output of the initial setup circuit. The output of the processor is connected to an input of the switch, the second input of which is connected to the output of the initial setup circuit. The output of the secondary power source is connected to the input of the circuit for connecting secondary ...

Publication date: 20-03-2009

SYSTEM AND METHOD FOR ON-BOARD PROCESSING OF FLIGHT TEST DATA

Number: RU2007133839A
Owner:

... 1. A system for on-board processing of flight test data, characterized in that it comprises a set of generic, interchangeable computing units connected to an Ethernet network, identical software being installed on each computing unit, and in that each computing unit includes means for monitoring the other computing units of the set, which make it possible to know at any moment the state of the cooperating computing units and the tasks being performed by each computing unit, these monitoring means using a strategy file, identical on every computing unit, which exhaustively describes all the data of the set of computing units and lists the tasks for each computing unit. 2. The system according to claim 1, in which the software installed on each computing unit includes the following: first means, called ...

Publication date: 10-11-2016

Method for system redundancy and device for implementing it

Number: RU2015114879A
Owner:

... 1. A method for system redundancy, consisting in comparing the output parameters (OP) of the working element of the system with the arithmetic mean of the output parameters of the working element and the comparison elements, detecting a failure and, based on the result, disconnecting the working element from the system output, characterized in that, before the arithmetic mean of the OP is calculated, a check is made as to whether the OP value of the working element and of each comparison element of the system lies within the range of admissible values, and if the condition is not met, that element is excluded from further consideration; the element whose OP value is closest to the arithmetic mean is determined and connected to the output of the redundant system. 2. A device for implementing the method for system redundancy, comprising a working element (WE), n=1…N comparison elements (CE) and, connected in series, a summing unit (SU), a divider (D), a subtracting unit (SubU), a control unit (CU) and a switching unit (SW), ...
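
The selection rule in claim 1 can be illustrated with a minimal sketch. The function name, the list layout (index 0 as the working element) and the inclusive range bounds are assumptions made for illustration; the claim itself does not specify them, nor what happens when no element remains in range.

```python
def select_output_element(outputs, low, high):
    """Return the index of the element to connect to the system output.

    outputs: output-parameter values; index 0 is the working element and
             the rest are comparison elements (hypothetical layout).
    low, high: bounds of the admissible range (assumed inclusive).
    """
    # Step 1: exclude every element whose output lies outside the admissible range.
    valid = {i: v for i, v in enumerate(outputs) if low <= v <= high}
    if not valid:
        return None  # case not covered by the claim; returning None is an assumption
    # Step 2: arithmetic mean over the remaining elements only.
    mean = sum(valid.values()) / len(valid)
    # Step 3: choose the element closest to the mean (ties go to the lowest index).
    return min(valid, key=lambda i: (abs(valid[i] - mean), i))
```

Filtering before averaging keeps a grossly faulty element from pulling the mean toward itself, which appears to be the point of the distinguishing feature.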

Publication date: 05-04-2012

Dual, independent, non-volatile memory systems

Number: DE112008003990T5
Owner: SIERRA WIRELESS INC, SIERRA WIRELESS, INC.

A method, a system and a computer-readable medium are disclosed for restoring a faulty non-volatile memory (NVM) system in a wireless device (100) that has a primary NVM system (110) and a secondary NVM system (112). The method does not require a restart of the wireless device (100). An NVM manager application (106) detects a failure (28) in one of the two NVM systems (110, 112) and determines which NVM system has failed (30). If the primary NVM system (110) has failed, the NVM manager (106) switches the wireless device (100) over (32) to operate using the secondary NVM system (112), restores the primary NVM system (110) using data from the secondary NVM system (112) (34), and then switches the wireless device (100) back to the primary NVM system (110) (36) once it has been restored. If the secondary NVM system (112) has failed, then ...

Publication date: 16-09-2015

Distributed computing system

Number: GB0002524000A
Owner:

A distributed computing system 10 comprises a plurality of computing environments 12, 14, each of the plurality of individual computing environments having: one or more computing resources 20, 22, 24, 26, 28, 30; a local controller 32, 34; and a local repository (50, Fig. 2), the distributed computing system further comprising a global repository 40, wherein each of the local controllers is configured to manage all of the computing resources in the computing system. In the event of a failure of a controller, computing resources may fail over to another local controller. Each controller provides itself with cached metadata copies from an API of the controller's environment, where the cached copy is used for failover in the event of controller failure. Local controllers may be linked to a global version control system, committing identifiers associated with resources to a global repository. The resources may be provided with one or more of a secure configuration management repository, a secure ...

Publication date: 16-10-2013

Method for virtual machine failover management and system supporting the same

Number: GB0002501204A
Owner:

In a mirrored virtual machine environment utilising a checkpoint process to control the transfer of data from primary (402) to secondary (406) virtual machines, an internal network (400) enables primary virtual machines to exchange network packets with other virtual machines (404) without having to wait for a checkpoint to occur. A mechanism is provided to ensure that all primary virtual machines that can see the network traffic of a particular virtual machine cannot affect the outside environment until the checkpoint has been completed. This is achieved by synchronising checkpoints between all primary virtual machines and ensuring that, if one fails, then all failover to their respective secondary.

Publication date: 12-06-2013

Computer system

Number: GB0201308167D0
Author:
Owner:

Publication date: 27-11-1991

COMPUTING SYSTEMS AND METHODS

Number: GB0009121540D0
Author:
Owner:

Publication date: 15-01-2011

RANGES FOR THE SAFETY TRANSMISSION OF KNOTS IN A CLUSTER SYSTEM

Number: AT0000491989T
Owner:

Publication date: 15-02-2011

DEVICE AND PROCEDURE FOR A SINGLE ZERO ERROR LOAD DISTRIBUTOR

Number: AT0000496336T
Owner:

Publication date: 25-10-1973

TELEPHONE SUBSCRIBERS' METERING SYSTEM

Number: AU0000465529B2
Author:
Owner:

Publication date: 29-11-1999

Controlling a bus with multiple system hosts

Number: AU0003974699A
Owner:

Publication date: 20-07-2006

A GROUND-BASED SOFTWARE TOOL FOR CONTROLLING REDUNDANCY MANAGEMENT SWITCHING OPERATIONS

Number: CA0002594607A1
Owner:

Publication date: 20-05-1986

METHOD AND APPARATUS FOR THE SELECTION OF REDUNDANT SYSTEM MODULES

Number: CA0001204875A1
Author: TOWNSEND GREG M
Owner:

Publication date: 18-01-2018

METHOD AND ARCHITECTURE FOR CRITICAL SYSTEMS UTILIZING MULTI-CENTRIC ORTHOGONAL TOPOLOGY AND PERVASIVE RULES-DRIVEN DATA AND CONTROL ENCODING

Number: CA0003069419A1
Owner: RICHES, MCKENZIE & HERBERT LLP

The present disclosure relates to novel and advantageous systems and methods of processing and managing data in critical or large-scale systems, such as airliner, automobile, space station, power plant, and healthcare systems. Particularly, the present disclosure relates to a rules-driven data and control method mapped onto complementary physical architecture for a more reliable operational system. By maintaining an algebraic encoding of control and application data at fine granularities, whether static or in transit, it is possible to detect, isolate, and correct many errors that would otherwise go undetected. This more dynamic and precise method addresses cases where deteriorating conditions or cataclysmic events affect much of the system simultaneously, including the control system itself.

Publication date: 07-01-2016

SYSTEMS AND METHODS FOR FAULT TOLERANT COMMUNICATIONS

Number: CA0002957749A1
Owner: BORDEN LADNER GERVAIS LLP

Publication date: 16-02-2021

METHOD FOR PROCESSING ACQUIRE LOCK REQUEST AND SERVER

Number: CA2960982C

The present invention provides a technique for processing a lock request. A first lock server is a takeover lock server of a second lock server. The first lock server enters a silent state after learning that a fault occurs in the second lock server, where a silent range is a resource for which the second lock server has assigned permission. The first lock server receives an acquire lock request that is originally sent to the second lock server, and the first lock server assigns lock permission for a corresponding resource according to the acquire lock request if the second lock server has not assigned permission for the resource. By means of this solution, an impact range of a fault occurring in a lock server can be reduced, and stability of a lock management system is improved.
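
As a rough illustration of the takeover behaviour described in this abstract, the sketch below models the silent range as a set of resource names. The class, its method names and the way the takeover server learns which resources the failed server had granted are all hypothetical; the abstract does not specify them.

```python
class TakeoverLockServer:
    """Toy model of a lock server that takes over for a failed peer."""

    def __init__(self):
        self.silent_range = set()  # resources the failed peer had already granted
        self.granted = {}          # resource -> client holding the lock here

    def on_peer_fault(self, peer_assigned_resources):
        # Enter the silent state: only resources the failed peer had
        # already assigned permission for are frozen (assumed to be known).
        self.silent_range = set(peer_assigned_resources)

    def acquire(self, resource, client):
        # Requests inside the silent range are refused during the silent state.
        if resource in self.silent_range:
            return False
        # Outside the silent range, grant the lock unless it is already held.
        if resource in self.granted:
            return False
        self.granted[resource] = client
        return True
```

Because only the silent range is frozen, requests for other resources that were addressed to the failed server can still be served, which matches the stated goal of reducing the impact range of a fault.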

Publication date: 18-03-2003

METHOD AND APPARATUS FOR PROVIDING FAILURE DETECTION AND RECOVERY WITH PREDETERMINED REPLICATION STYLE FOR DISTRIBUTED APPLICATIONS IN A NETWORK

Number: CA0002273523C

An application module (A) running on a host computer in a computer network is failure-protected with one or more backup copies that are operative on other host computers in the network. In order to effect fault protection, the application module registers itself with a ReplicaManager daemon process (112) by sending a registration message, which message, in addition to identifying the registering application module and the host computer on which it is running, includes the particular replication strategy (cold backup, warm backup, or hot backup) and the degree of replication associated with that application module. The backup copies are then maintained in a fail-over state according to the registered replication strategy. A WatchDog daemon (113), running on the same host computer as the registered application, periodically monitors the registered application to detect failures. When a failure, such as a crash or hangup of the application module, is detected, the failure is reported to ...

Publication date: 30-05-1975

Number: CH0000562476A5
Author:
Owner: BURROUGHS CORP, BURROUGHS CORP.

Publication date: 31-01-1975

ELECTRICAL CIRCUIT ARRANGEMENT FOR CONTROLLING A COMPUTER.

Number: CH0000558569A
Author:
Owner: KENT LTD G, KENT, GEORGE, LTD.

Publication date: 31-05-1977

Number: CH0000588121A5
Author:
Owner: BURROUGHS CORP, BURROUGHS CORP.

Publication date: 15-06-2016

Flight Data-Interface Device.

Number: CH0000710509A2
Owner:

The invention relates to a flight data interface device (1) with a terminal interface for coupling the flight data interface device (1) to a terminal (3), in particular a check-in counter and/or a workstation of a gate agent, and a departure control system (DCS) interface (10), the DCS interface (10) being designed to couple the flight data interface device (1) to at least one remote DCS (2). The flight data interface device (1) further comprises a backup system (12). The backup system (12) is operatively coupled to the terminal interface (11) and the DCS interface (10), the backup system (12) being designed to operate in a standby mode when an operative coupling to the DCS (2) is present, and further designed, in a case where the operative coupling to the DCS (2) is temporarily interrupted, to operate temporarily in an active mode ...

Publication date: 15-06-2017

Flight Data Interface Device.

Number: CH0000710509B1
Owner: ZAMAR AG

The invention relates to a flight data interface device (1) with a terminal interface for coupling the flight data interface device (1) to a terminal (3), in particular a check-in counter and/or a workstation of a gate agent, and a departure control system (DCS) interface (10), the DCS interface (10) being designed to couple the flight data interface device (1) to at least one remote DCS (2). The flight data interface device (1) further comprises a backup system (12). The backup system (12) is operatively coupled to the terminal interface (11) and the DCS interface (10), the backup system (12) being designed to operate in a standby mode when an operative coupling to the DCS (2) is present, and further designed, in a case where the operative coupling to the DCS (2) is temporarily interrupted, to operate temporarily in an active mode ...

Publication date: 02-04-2014

Thread sparing between cores in a multi-threaded processor

Number: CN103699512A
Owner:

The present invention relates generally to error recovery in a multi-threaded processor, and more specifically to a multi-threaded, multi-core processor configured to spare threads between cores. The method includes determining, by a first core of the processor, that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold; sending, by the first core to a processor controller in the processor, a request to transfer the first thread to another core of the processor; based on receiving the request, selecting, by the processor controller, a second core from a plurality of cores of the processor to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread; transferring a last good architected state of the first thread from an error recovery logic of the first core to the second core; loading the last good architected state of the first thread by the idle thread on the ...

Publication date: 07-12-1984

MULTIPROCESSOR PROCESSING SYSTEM

Number: FR0002547082A
Owner:

The present invention relates to a multiprocessor processing system. The system comprises several processor modules 33 interconnected by two lines 35, each controlled by a controller 37. Control units 41 with multiple ports 43 are connected by lines 39 to the processor modules 33. Each control unit includes logic that ensures that only one access port is selected at a time. The processor modules each contain an interprocessor control 55, a central processing unit 105, a memory 107 and an input/output channel 109. A power supply system ensures continuous operation of the rest of the processing system if the power supply to one part of the system fails. In multiprogramming operation, programs are protected against user actions. The present invention is applicable in particular to the management of automated points of sale, to inventories and to operations of ...

Publication date: 22-12-1978

IMPROVED CONTROL DEVICE FOR SWITCHING EQUIPMENT

Number: FR0002392569A2
Owner:

Publication date: 29-11-1985

INPUT/OUTPUT SYSTEM FOR A MULTIPROCESSOR PROCESSING SYSTEM

Number: FR0002485227B1
Author:
Owner:

Publication date: 21-07-2005

DUAL-AAA SERVER USING OMP AUDITOR FOR AUDITING ACTIVE/STANDBY STATE OF AAA SERVER AND OPERATION METHOD THEREOF

Number: KR1020050075484A
Author: KIM, YEON JUNG
Owner:

PURPOSE: A dual-AAA(Authentication Authorization Account) server using an OMP(Operation and Maintenance Processor) auditor for auditing an active/standby state of the AAA server and an operation method thereof are provided to quickly manage a failure by making the OMP auditor continuously detect the active/standby state of the dual-AAA server. CONSTITUTION: The first and the second AAA server(20,30) exclusively have the active state or the standby state. The first OMP(21) executes an instruction inputted to the first AAA server from a user interface(10) and returns an execution result. The first OMP auditor(22) checks and transfers the active/standby state of the first AAA server to the user interface. The second OMP(32) executes the instruction inputted to the second AAA server from the user interface and returns the execution result. The second OMP auditor(33) checks and transfers the active/standby state of the second AAA server to the user interface. © KIPO 2006 ...

Publication date: 02-09-2005

BACKUP FIRMWARE IN A DISTRIBUTED SYSTEM

Number: KR1020050088172A
Owner:

In a distributed system of modules in a network, each module having an associated processor node comprising a processing unit for operating the associated module. The processing unit comprises a processor interface for communication in the network; and nonvolatile memory for storing code for the processing unit for operating the associated module, and for storing backup code for at least one other processing unit of another processor node in the network, the backup code for operating an associated module of the another processor node. In response to a request, the processing unit supplies the backup code to a processor node to be used to rest ...

Publication date: 17-11-2011

METHOD FOR VISUALIZING SERVER RELIABILITY, COMPUTER SYSTEM, AND MANAGEMENT SERVER

Number: WO2011142042A1
Owner:

The disclosed method quantifies the reliability of hardware and software installed in a physical server, and calculates an indication of the reliability of each of a plurality of physical servers. Configuration information, failure information, and running information of the hardware and software installed in the physical servers are collected while taking into account lifecycle information of the physical servers, and an indicator for the reliability of the hardware and software is quantified and calculated. Furthermore, an indicator of the reliability of a physical server as a whole is determined based on the indicators for the reliability of the hardware and the software.

Publication date: 17-08-2006

TEST FLIGHT ON-BOARD PROCESSING SYSTEM AND METHOD

Number: WO2006085028A2
Owner:

The invention relates to a test flight on-board processing system, comprising an assembly of simple interchangeable processors (50) connected to an Ethernet connection (51), with identical software installed on each processor.

Publication date: 29-11-2001

A NETWORK DEVICE FOR SUPPORTING MULTIPLE UPPER LAYER NETWORK PROTOCOLS OVER A SINGLE NETWORK CONNECTION

Number: WO0000190843A2
Owner:

The present invention provides a network device with at least one physical interface or port that is capable of transferring network packets including data organized into one or more upper layer network protocols (e.g., ATM, MPLS, IP, Frame Relay, Voice, Circuit Emulation). The port is capable of being connected to a network attachment to allow the network device to transfer network packets with other network devices. Network packets are received by the port and a port subsystem in accordance with a physical layer network protocol and transferred to forwarding subsystems within the network device in accordance with the upper layer protocols into which the network packet data has been organized. For example, data organized in accordance with ATM over SONET, MPLS over SONET and IP over SONET may be transferred over one network attachment to one network device port. Network packets including data organized in accordance with ATM are then transferred to one or more ATM forwarding subsystems, ...

Publication date: 11-04-2017

Biasing active-standby determination

Number: US0009619349B2

In computing systems that provide multiple computing domains configured to operate according to an active-standby model, techniques are provided for intentionally biasing the race to gain mastership between competing computing domains, which determines which computing domain operates in the active mode, in favor of a particular computer domain. The race to gain mastership may be biased in favor of a computing domain operating in a particular mode prior to the occurrence of the event that triggered the race to gain mastership. For example, in certain embodiments, the race to mastership may be biased in favor of the computing domain that was operating in the active mode prior to the occurrence of an event that triggered the race to gain mastership.
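
One way to picture a biased race to mastership is a deterministic model in which every domain that was not active before the triggering event pays a fixed handicap before it may claim mastership. The handicap mechanism, the function name and the timing values below are assumptions for illustration; the abstract does not state how the bias is implemented.

```python
def decide_active(ready_times, previous_active, bias_delay=0.5):
    """Return the name of the domain that wins the race to mastership.

    ready_times: mapping of domain name -> seconds until that domain is
                 ready to claim mastership (hypothetical measurements).
    previous_active: domain that was active before the triggering event.
    bias_delay: extra wait imposed on every other domain (assumed mechanism).
    """
    def claim_time(name):
        # The previously active domain claims as soon as it is ready;
        # every other domain waits an additional bias_delay first.
        handicap = 0.0 if name == previous_active else bias_delay
        return ready_times[name] + handicap

    return min(ready_times, key=claim_time)
```

With `ready_times = {"A": 0.30, "B": 0.10}` and `previous_active="A"`, domain A wins even though B is ready first. The bias tilts the race rather than fixing its outcome: if A needed a full second, B would still take over.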

Publication date: 02-08-2007

Software duplication

Number: US20070180312A1
Owner: Avaya Technology LLC

In one embodiment, the present invention is directed to a software duplication process in which write faults are used to track memory areas that have been changed by the active processor.

Publication date: 15-07-2004

Backup firmware in a distributed system

Number: US20040139294A1

In a distributed system of modules in a network, each module having an associated processor node comprising a processing unit for operating the associated module. The processing unit comprises a processor interface for communication in the network; and nonvolatile memory for storing code for the processing unit for operating the associated module, and for storing backup code for at least one other processing unit of another processor node in the network, the backup code for operating an associated module of the another processor node. In response to a request, the processing unit supplies the backup code to a processor node to be used to restore the code for operating the module associated with the requesting processor node.

Publication date: 12-04-1994

Network management system capable of easily switching from an active to a backup manager

Number: US0005303243A1
Author: Anezaki; Akihiro
Owner: NEC Corporation

In a network system comprising first through N-th agents, each of which performs a management operation on a management object (N representing a predetermined natural number which is not less than two), an active manager for managing the first through the N-th agents, and a backup manager for managing the first through the N-th agents when a fault occurs in the active manager, the active manager includes a fault detecting unit for detecting the fault in the active manager to produce a fault detection signal. Supplied with the fault detection signal, the backup managing process delivers a name signal representative of a backup manager name assigned to the backup manager as an indicated name to the first through the N-th agents through backup transmitting/receiving unit and the circuit switching device. In each of the first through the N-th agents, an agent process stores, as a stored manager name, the indicated name in a memory unit on reception of the name signal.

Publication date: 30-08-1983

Automatic line termination in distributed industrial process control system

Number: US0004402082A
Author:
Owner:

A control system for controlling an industrial process includes a plurality of remotely located process control units (remotes) each coupled to an associated input/output device(s) and adapted to communicate with one another through a dual channel communications link. Digital information in the form of data and control information blocks is transmitted between remotes with the blocks transmitted twice on each channel of the dual channel communications link. The destination remote checks the block validity on one of the dual channels and, if valid, responds with an acknowledgement signal (ACK), and, if invalid, tests the blocks on the other, alternate channel and then responds with an acknowledgement or non-acknowledgement signal (NAK) depending on whether the data blocks tested on the alternate channel are found valid or invalid. Each remote in the system is adapted to test the communication integrity of both channels of the communication link between it and its immediately adjacent remotes ...
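
The receive-side decision described above (check one channel, fall back to the alternate channel, then answer ACK or NAK) can be sketched as follows. The abstract does not say how block validity is tested, so the CRC-32 trailer used here is purely an assumption, as are the function names.

```python
import zlib


def frame(payload: bytes) -> bytes:
    # Append a 4-byte CRC-32 trailer (assumed validity check, not from the patent).
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_valid(block: bytes) -> bool:
    # A block is valid when its trailer matches the CRC-32 of its payload.
    if len(block) < 4:
        return False
    payload, trailer = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


def respond(primary: bytes, alternate: bytes) -> str:
    # Check the first channel; only if it is invalid test the alternate one.
    if is_valid(primary):
        return "ACK"
    return "ACK" if is_valid(alternate) else "NAK"
```

A NAK is sent only when the copies on both channels fail the check, so a fault confined to one channel does not force a retransmission.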

Publication date: 21-04-2009

Cascading failover of a data management application for shared disk file systems in loosely coupled node clusters

Number: US0007523345B2

Disclosed is a mechanism for handling failover of a data management application for a shared disk file system in a distributed computing environment having a cluster of loosely coupled nodes which provide services. According to the mechanism, certain nodes of the cluster are defined as failover candidate nodes. Configuration information for all the failover candidate nodes is stored preferably in a central storage. Message information including but not limited to failure information of at least one failover candidate node is distributed amongst the failover candidate nodes. By analyzing the distributed message information and the stored configuration information it is determined whether to take over the service of a failure node by a failover candidate node or not. After a take-over of a service by a failover candidate node the configuration information is updated in the central storage.

Publication date: 08-11-2018

DISASTER RECOVERY SERVICE

Number: US20180322022A1
Owner:

A customer may use a disaster recovery service to generate a disaster recovery scenario in order to make certain resources available to the customer in the event of a data region failure. The customer may specify a recovery point objective, a recovery time objective and a recovery data region for the scenario. Accordingly, the disaster recovery service may coordinate with one or more other services provided by the computing resource service provider to reproduce the customer resources and other resources necessary to support the customer resources. These reproduced resources may be transferred to the recovery data region based at least in part on the parameters specified by the customer. In the event of a data region failure, the disaster recovery service may update the domain name system to resolve any customer requests for the customer resources to the recovery data region.

Publication date: 05-11-2019

High availability state machine and recovery

Number: US0010467100B2

Embodiments of the present invention provide systems and methods for recovering a high availability storage system. The storage system includes a first layer and a second layer, each layer including a controller board, a router board, and storage elements. When a component of a layer fails, the storage system continues to function in the presence of a single failure of any component, up to two storage element failures in either layer, or a single power supply failure. While a component is down, the storage system will run in a degraded mode. The passive zone is not serving input/output requests, but is continuously updating its state in dynamic random access memory to enable failover within a short period of time using the layer that is fully operational. When the issue with the failed zone is corrected, a failback procedure brings the system back to a normal operating state.

Publication date: 19-07-2007

Electronic device for automatically continuing to provide service

Number: US2007169128A1
Owner:

An object of the present invention is to provide an electronic device capable of continuing to provide a service. The electronic device of the present invention comprises an application recognizing section for recognizing an application held by the other electronic device, an application unexecutability detecting section for detecting whether or not the application recognized by the application recognizing section is unexecutable in the other electronic device, an application execution determining section for determining whether or not a substitute application, which can substitute for an application that the application unexecutability detecting section has determined to be unexecutable, is to be executed, a substitute application holding determining section for determining whether or not the substitute application which can substitute for the application determined to be unexecutable is held in the electronic device, and an application executing section for executing the substitute ...

Publication date: 28-03-2019

SCALABLE BYZANTINE FAULT-TOLERANT PROTOCOL WITH PARTIAL TEE SUPPORT

Number: US2019097790A1

A method for establishing consensus between a plurality of distributed nodes connected via a data communication network includes preparing a set of random numbers, wherein each of the random numbers is a share of an initial secret, wherein each share of the initial secret corresponds to one of a plurality of active nodes; encrypting, in order to generate encrypted shares of the initial secret, each respective share of the initial secret with a shared key corresponding to the respective one of the plurality of active nodes to which the respective share corresponds; applying a bitwise XOR function to the set of random numbers to provide the initial secret; and binding the initial secret to a last counter value to provide a commitment and a signature for the last counter. The method includes generating shares of a second and of a plurality of subsequent additional secrets by iteratively applying a hash function.
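The XOR-based sharing described in this abstract can be sketched roughly as follows. This is only an illustration: the 32-byte share size, the use of SHA-256 as the hash function, and the helper names (`make_shares`, `combine`, `next_shares`) are assumptions, and the encryption of shares and the binding to a counter value are left out.

```python
import hashlib
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_shares(n: int, size: int = 32) -> list[bytes]:
    """Draw n random shares, one per active node (share size is an assumption)."""
    return [secrets.token_bytes(size) for _ in range(n)]


def combine(shares: list[bytes]) -> bytes:
    """XOR all shares together to obtain the secret they encode."""
    return reduce(xor_bytes, shares)


def next_shares(shares: list[bytes]) -> list[bytes]:
    """Derive the next round of shares by hashing each share (SHA-256 assumed)."""
    return [hashlib.sha256(s).digest() for s in shares]


shares = make_shares(4)
initial_secret = combine(shares)
second_secret = combine(next_shares(shares))
```

Because the secret is the XOR of every share, all shares are needed to reconstruct it; XOR-ing the secret back into the full set of shares yields all zero bytes.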

Publication date: 25-11-2004

Disaster recovery for processing resources using configurable deployment platform

Number: US20040236987A1
Assignee: Egenera, Inc.

A system and method for disaster recovery for processing resources using a configurable deployment platform. A primary site has a configuration of processing resources. A specification of the configuration of processing resources of the primary site is generated. The specification is provided to a fail-over site that has a configurable processing platform capable of deploying processing area networks in response to software commands. Using the specification, software commands are generated to the configurable platform to deploy a processing area network corresponding to the specification.

Publication date: 31-10-2013

Redundant Automation System and Method for Operating the Redundant Automation System

Number: US20130290776A1
Assignee: Siemens Aktiengesellschaft

A redundant automation system and a method for operating the redundant automation system which is provided with a first subsystem and a second subsystem that each process a control program while controlling a technical process, one of these subsystems operating as a master and the other subsystem operating as a slave, and the slave assuming the function of the master if the master fails such that it becomes possible to dispense with temporally synchronous communication between the participants with regard to the synchronization of the program processing in the two subsystems, thus reducing the communication load.

Publication date: 17-09-2020

CONTAINER DOCKERFILE AND CONTAINER MIRROR IMAGE QUICK GENERATION METHODS AND SYSTEMS

Number: US20200293354A1
Assignee: GENETALKS BIO-TECH (CHANGSHA) CO., LTD.

The invention discloses quick generation methods and systems for a container Dockerfile and a container mirror image. The container Dockerfile quick generation method includes the steps of: for a to-be-packaged target application, running the target application with execution tracking and recording its operating system dependencies during the run; organizing and constructing the list of files required to package the target application into a container mirror image; and, according to that file list, generating a Dockerfile and a container mirror image file creation directory used to package the target application into the container mirror image. The invention can automatically package any target application into a container; it completes the construction of an executable minimal environmental closure of the target application; the packaged container is smaller than a manually made ...

Publication date: 30-04-2019

Distributed computing system failure detection

Number: US0010275326B1
Assignee: Amazon Technologies, Inc., AMAZON TECH INC

A technology is described for detecting a failure of a distributed system component. An example method may include registering a declarative file that may identify a distributed computing cluster in a service provider environment and provide failure criteria used to detect a failure of a distributed system component included in the distributed computing cluster. Distributed system components included in the distributed computing cluster may then be identified using information included in the declarative file. A distributed system component included in the distributed computing cluster may then be queried according to query criteria provided by the declarative file and a failure state of the distributed system component included in the distributed computing cluster may be identified based in part on a result of querying the distributed system component.

Publication date: 04-04-2017

Managing server processes with proxy files

Number: US0009612927B1

Computer-implemented methods and systems are provided for detecting a failed server. The computer-implemented method includes creating a proxy file for each server of a plurality of servers in an active state and assigning a timestamp to each proxy file of each server of the plurality of servers. The computer-implemented method further includes permitting each server to inspect each timestamp of each proxy file of each server of the plurality of servers and determining whether the timestamp assigned to each proxy file of each server of the plurality of servers exceeds a predetermined threshold. The computer-implemented method further includes, in response to a timestamp of a proxy file of a failed server exceeding the predetermined threshold, allowing another server of the plurality of servers to complete remaining work of the failed server.
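The timestamp check in this abstract can be illustrated with a minimal sketch. It assumes the proxy file's modification time serves as its timestamp and that the threshold is given in seconds; the function names and server names are hypothetical and not taken from the patent.

```python
import os
import tempfile
import time


def touch_proxy(directory: str, server: str, when: float) -> None:
    """Create or refresh a server's proxy file, using mtime as its timestamp."""
    path = os.path.join(directory, server)
    with open(path, "a"):
        pass
    os.utime(path, (when, when))


def failed_servers(directory: str, now: float, threshold: float) -> list[str]:
    """Return servers whose proxy-file timestamp is older than the threshold."""
    return sorted(
        name
        for name in os.listdir(directory)
        if now - os.path.getmtime(os.path.join(directory, name)) > threshold
    )


now = time.time()
with tempfile.TemporaryDirectory() as d:
    touch_proxy(d, "server-a", now)        # recently refreshed
    touch_proxy(d, "server-b", now - 120)  # not refreshed for two minutes
    stale = failed_servers(d, now, threshold=30.0)
```

Any server can run `failed_servers` against the shared directory; a peer that finds a stale entry would then take over the remaining work of that server.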

Publication date: 04-05-2021

Inter-application communication via signal-routes

Number: US0010999125B1
Assignee: Juniper Networks, Inc., JUNIPER NETWORKS INC

A system and method for communicating between applications using a routing process. A set of one or more signal-routes are defined on a network device, including a first signal-route. Each signal-route is associated with a state of an application to be executed on the network device, wherein the first signal-route is associated with a first application state of the application. The network device detects, within the application executing within an application layer of the network device, a change in the first application state and notifies other applications of the change in the first application state. Notifying includes modifying the first signal-route, wherein modifying includes adding the first signal-route to or removing the first signal-route from a Routing Information Base (RIB) and advertising the change in the RIB.

Publication date: 01-06-2021

Information processing system and control apparatus

Number: US0011023337B2
Assignee: FUJITSU LIMITED, FUJITSU LTD

An information processing system includes a plurality of control apparatuses communicably coupled to each other. A first control apparatus of the plurality of control apparatuses includes a first memory configured to store first instructions and a first processor configured to operate using standby power before a power-on selection is made. The first processor executes the first instructions causing a process including collecting first identification information of each of the plurality of control apparatuses other than the first control apparatus. The process includes storing the first identification information in the first memory. The process includes determining a role of the first control apparatus based on a comparison result derived by comparing second identification information of the first control apparatus with the first identification information.

Publication date: 12-05-2015

Continuous data replication

Number: US0009032160B1

In a first embodiment, a method and computer program product for use in a storage system comprising quiescing IO commands at the sites of an active/active storage system, the active/active storage system having at least two storage sites communicatively coupled via a virtualization layer, creating a change set, unquiescing IO commands by the virtualization layer, transferring data of the change set to the other sites of the active/active storage system by the virtualization layer, and flushing the data by the virtualization layer. In a second embodiment, a method and computer program product for use in a storage system comprising fracturing a cluster of an active/active storage system, wherein the cluster includes at least two sites; stopping IO on a first site of the cluster; and rolling to a point in time on the first site.

Publication date: 12-05-2015

Guaranteed in-flight SQL insert operation support during an RAC database failover

Number: US0009031969B2

The present invention is directed to methods and systems of implementing a guaranteed SQL insert operation. In one embodiment, the method may include initiating an SQL insert operation for a database, receiving an SQL exception indicating that a failover for the database has occurred, and in response to the SQL exception, caching the SQL insert operation and caching the SQL insert operation as an SQL merge operation. The method further includes determining that a primary key is associated with the SQL insert operation, and in response to determining that a primary key is associated with the SQL insert operation, executing the SQL merge operation.
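The insert-then-merge idea can be sketched as follows. This is a rough illustration only: it uses Python's `sqlite3` module with `INSERT OR REPLACE` as a stand-in for an SQL merge, and `FailoverError`, `guaranteed_insert`, and the table layout are hypothetical names introduced for the example.

```python
import sqlite3


class FailoverError(Exception):
    """Hypothetical stand-in for the SQL exception raised on failover."""


def guaranteed_insert(conn, key, value, fail_after_commit=False):
    """Insert a row; if a failover is signalled, replay it as an idempotent merge."""
    try:
        conn.execute("INSERT INTO t (id, v) VALUES (?, ?)", (key, value))
        conn.commit()
        if fail_after_commit:
            # Simulate a failover where the client cannot tell if the insert landed.
            raise FailoverError("connection lost; insert outcome unknown")
    except FailoverError:
        # The primary key makes the replay safe: the row is either created
        # or overwritten with identical data, never duplicated.
        conn.execute("INSERT OR REPLACE INTO t (id, v) VALUES (?, ?)", (key, value))
        conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
guaranteed_insert(conn, 1, "a", fail_after_commit=True)
rows = conn.execute("SELECT id, v FROM t").fetchall()
```

The merge form is what makes the retry safe: a plain second `INSERT` would fail on the primary key if the first attempt had in fact been committed.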

Publication date: 02-02-2021

Fault tolerance method and system for virtual machine group

Number: US0010909002B2

A fault tolerance method and system for a virtual machine group is proposed. The method includes: establishing fault tolerance backup connections of a plurality of virtual machines between a virtual machine hypervisor of at least one primary host and a virtual machine hypervisor of at least one backup host to perform fault tolerance backups of the virtual machines, wherein the plurality of virtual machines are included in a fault tolerance group; and, when a synchronizer determines that a failover of at least one first virtual machine among the primary virtual machines in the fault tolerance group is being performed, informing, by the synchronizer, to perform a failover of the other remaining primary virtual machines in the fault tolerance group, or to return the other remaining primary virtual machines in the fault tolerance group back to a last fault tolerance backup state of each and continue performing fault tolerance backups of the other remaining ...

Publication date: 27-04-2006

Consistent cluster operational data in a server cluster using a quorum of replicas

Number: US2006090095A1

A method and system for increasing server cluster availability by requiring at a minimum only one node and a quorum replica set of replica members to form and operate a cluster. Replica members maintain cluster operational data. A cluster operates when one node possesses a majority of replica members, which ensures that any new or surviving cluster includes consistent cluster operational data via at least one replica member from the immediately prior cluster. Arbitration provides exclusive ownership by one node of the replica members, including at cluster formation, and when the owning node fails. Arbitration uses a fast mutual exclusion algorithm and a reservation mechanism to challenge for and defend the exclusive reservation of each member. A quorum replica set algorithm brings members online and offline with data consistency, including updating unreconciled replica members, and ensures consistent read and update operations.
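The majority rule in this abstract can be illustrated with a short sketch. The function name `has_quorum` and the replica names are hypothetical; arbitration and the reservation mechanism are not modelled.

```python
def has_quorum(owned: set[str], replica_set: set[str]) -> bool:
    """True when a node owns a strict majority of the replica members."""
    return len(owned & replica_set) > len(replica_set) // 2


replicas = {"r1", "r2", "r3"}
old_cluster = {"r1", "r2"}  # majority held by the previous owning node
new_cluster = {"r2", "r3"}  # majority held by the surviving node

# Any two majorities of the same replica set overlap in at least one member,
# which is how the new cluster can see the prior cluster's operational data.
shared = old_cluster & new_cluster
```

With three replica members, a node needs two of them to operate, and a node that holds only one cannot form a cluster.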

Publication date: 04-05-2023

MODULAR ARCHITECTURE AVIONICS

Number: US20230136484A1
Assignee: Maxar Space LLC

A distributed computer system for a spacecraft is disclosed. The system has multiple computer nodes, each controlling a different aspect of a mission of the spacecraft. Each node includes a control circuit(s) that controls a set of components, a router processor, and a programmable processor. The programmable processor of each respective computer node issues commands to the control circuit(s) of the respective computer node to carry out an aspect of the mission associated with the respective computer node. Upon failure of the programmable processor in a particular computer node, a healthy programmable processor sends commands to the router processor in the particular computer node. The router processor of the particular computer node routes the commands received from the remote programmable processor to the control circuit(s) in the particular computer node to control the set of components to carry out the aspect of the mission associated with the particular computer node.

Publication date: 12-12-2023

Method and system for active failure recovery of single node improved based on PBFT algorithm, computer device and storage medium

Number: US0011841778B2
Assignee: HANGZHOU QULIAN TECHNOLOGY CO., LTD.

A method for active failure recovery of a single node, improved based on the PBFT algorithm, is disclosed. The abnormal node first initiates a view change request; if (2f+1) view change requests containing the same view value cannot be received within a specified period of time, the abnormal node enters a to-be-recovered state. The node to be recovered then initiates a recovery request to all nodes of the whole network, waits for replies from normal nodes and counts the number of replies, calculates the height of the stable checkpoint of the whole network after receiving replies containing the same view value from (2f+1) nodes, and updates its state to finally complete the recovery. This method solves an inherent problem of the PBFT algorithm, namely that a failure in a single node cannot be recovered autonomously, so that the practicability of the PBFT algorithm is greatly improved.
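The (2f+1) reply-counting step can be sketched as below. This is a simplified illustration: replies are reduced to bare view values, and the function name `agreed_view` is hypothetical. Checkpoint-height calculation and state transfer are not shown.

```python
from collections import Counter
from typing import Optional


def agreed_view(reply_views: list[int], f: int) -> Optional[int]:
    """Return the view value reported by at least 2f+1 nodes, or None."""
    if not reply_views:
        return None
    view, count = Counter(reply_views).most_common(1)[0]
    return view if count >= 2 * f + 1 else None


# With f = 1 faulty node tolerated, three matching replies are required.
enough = agreed_view([5, 5, 5, 4], f=1)
not_enough = agreed_view([5, 5, 4, 4], f=1)
```

Here `enough` is the agreed view value and `not_enough` is `None`, in which case the recovering node would keep waiting for more replies.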

Publication date: 20-08-2008

NON-STOP TRANSACTION PROCESSING SYSTEM

Number: EP0001959347A1

... [Object] In a business application system, it is important to shorten the time a service is halted due to a failure in a server, a network or the like. Current failure handling is generally based on failure detection and take-over. As failure detection requires at least ten seconds to a few minutes, the service halt time is longer than that, which causes a significant problem. [Solution] The present invention proposes a system for resending a process to a backup server farm from a client without waiting for failure detection, if no reply is received for a certain time. The transaction processing mechanism of the present invention has a transaction start processing mechanism in which an exclusive control using a processing authority Token and data consistency are combined, and a commit processing mechanism in which determination on whether a commit is available or not based on a distributed agreement and replication of updated data are combined. With the mechanisms, a system for shortening a service ...

Publication date: 08-01-1986

Computer or processor control systems

Number: EP0000062463B1
Assignee: BRITISH TELECOMMUNICATIONS

Publication date: 16-08-2007

COMPUTER SYSTEM MANAGEMENT METHOD, MANAGEMENT SERVER, COMPUTER SYSTEM, AND PROGRAM

Number: JP2007207219A

PROBLEM TO BE SOLVED: To provide a method of controlling switching of computers according to a cause of failure without preparing one standby node for each active node. SOLUTION: For n active nodes (200), m standby nodes (300) of different characteristics (in terms of CPU performance, I/O performance, communication performance, and the like) are prepared. The m standby nodes (300) are assigned in advance with priority levels to be failover targets for each cause of failure. When a failure occurs in one active node (200), a standby node that can remove the cause of the failure is selected out of the m standby nodes (300) to take over data processing. COPYRIGHT: (C)2007,JPO&INPIT ...
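The cause-dependent selection of a standby node can be sketched as a priority lookup. The failure causes, node names, and the `pick_standby` helper below are hypothetical and chosen only to illustrate the idea of per-cause priority lists.

```python
from typing import Optional

# Hypothetical priority lists: for each cause of failure, standby nodes are
# ordered from most to least suitable as the failover target.
PRIORITY = {
    "cpu_overload": ["standby-fast-cpu", "standby-general"],
    "io_bottleneck": ["standby-fast-io", "standby-general"],
}


def pick_standby(cause: str, available: set[str]) -> Optional[str]:
    """Return the highest-priority available standby node for this cause."""
    for node in PRIORITY.get(cause, []):
        if node in available:
            return node
    return None


free_nodes = {"standby-fast-io", "standby-general"}
for_cpu = pick_standby("cpu_overload", free_nodes)
for_io = pick_standby("io_bottleneck", free_nodes)
```

Because the preferred CPU-oriented node is not free in this example, a CPU-related failure falls back to the general-purpose standby, while an I/O-related failure gets the I/O-oriented node.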

Publication date: 10-12-2016

METHOD FOR SYSTEM REDUNDANCY AND DEVICE FOR ITS IMPLEMENTATION

Number: RU2604335C2

The group of inventions relates to the field of computer engineering and can be used in complex radio-engineering systems and automated control systems. The technical result is improved reliability. The device comprises a working element, comparison elements, an adder, a divider, subtractors, a control unit, switching devices, and a controller. 2 independent claims, 5 figures, 1 appendix.

Publication date: 27-01-2012

REDUNDANCY OF MOBILE CORE NETWORK NODES

Number: RU2010129966A

... 1. An apparatus configured to operate in a communication network as a pool proxy (1) that routes signalling traffic between a first node and one of a set of second pool nodes, the apparatus being further configured to operate in one of an active state and a passive state with respect to the first node, and, when in said active state, to send a periodic signal (2, 3, 7) to at least one of said second pool nodes for relaying to a peer pool proxy, and, when in said passive state, to receive (2) a periodic signal from the peer pool proxy relayed via at least one of said second pool nodes, wherein, in the event that no periodic signal is received or the received signal does not satisfy some minimum criterion while in the passive state, the apparatus is configured (2) to activate ...

Publication date: 15-01-2024

System and method for providing fault-tolerant interaction of network nodes with a file storage

Number: RU2811674C1

The invention relates to systems and methods for providing fault-tolerant interaction of network nodes with a file storage. The technical result is fault-tolerant interaction of network nodes with the file storage. The technical result is achieved by using a primary file and an additional file from the file storage to switch network nodes from a passive state to an active state and back. For its part, the file storage controls how long the files are in use and forcibly releases the primary and additional files from use. 2 independent and 6 dependent claims, 3 figures.

Publication date: 24-06-2021

Improving the operating parameters of a computing system in a vehicle

Number: DE102019134872A1

The invention relates to a computer system (10) for assigning tasks in the computer system (10) and to an associated method. The computer system (10) can be arranged in a vehicle and has a plurality of interconnected computer components (20, 30, 40), a plurality of monitoring sensors (22, 32, 42), a plurality of system information components (24, 34, 46), and a plurality of control components (26, 36, 46).

Publication date: 22-07-2021

Improving the operating parameters of a computing system in a vehicle

Number: DE102019134872B4
Assignee: HELLA GMBH & CO KGAA, HELLA GmbH & Co. KGaA

The invention relates to a computer system (10) for assigning tasks in the computer system (10) and to an associated method. The computer system (10) can be arranged in a vehicle and has a plurality of interconnected computer components (20, 30, 40), a plurality of monitoring sensors (22, 32, 42), a plurality of system information components (24, 34, 46), and a plurality of control components (26, 36, 46).

Publication date: 26-03-1997

Fault tolerant remote procedure call system

Number: GB0002305087A

When a client sends an RPC (remote procedure call) request that requests a server for a service, the client adds identification information to the RPC request. When an active server receives an RPC request, it stores the identification information thereof in a stable area that is not destroyed even if a defect takes place in the client or the server, and executes the requested service. When a defect takes place in the active server, the backup server takes over the process of the active server. A PALIB of the backup server compares the identification information of the RPC request resent from the client with the identification information in the stable area. When they match, the PALIB determines that the RPC request is redundant. The backup server performs a redundant process and sends back the correct result to the client.
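The duplicate-request check described here can be sketched as follows. The sketch models the stable area as a dictionary shared by the active and backup servers and stores the result next to the request identifier; both choices, along with the `Server` class and the `deposit` service, are assumptions made for illustration.

```python
class Server:
    """Active or backup server sharing a stable area (here, a plain dict)."""

    def __init__(self, stable_area: dict):
        self.stable_area = stable_area

    def handle(self, request_id: str, service, *args):
        # A resent request whose identifier is already recorded is redundant:
        # return the saved result instead of executing the service again.
        if request_id in self.stable_area:
            return self.stable_area[request_id]
        result = service(*args)
        self.stable_area[request_id] = result
        return result


calls = []


def deposit(amount: int) -> int:
    """Example service with a side effect that must not run twice."""
    calls.append(amount)
    return sum(calls)


stable = {}
active, backup = Server(stable), Server(stable)
first = active.handle("req-1", deposit, 10)  # executed by the active server
retry = backup.handle("req-1", deposit, 10)  # resent to the backup after failover
```

The backup recognises `req-1` from the stable area, so the client receives the same result and the deposit is applied only once.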

Publication date: 07-05-2003

Failover mechanism involving blocking of access of a malfunctioning server and continuing monitoring to enable unblocking of access if server recovers

Number: GB0002381713A

A method and apparatus for blocking access of a malfunctioning server to a data storage facility. Characteristics such as IAmAlive signals from a server are monitored and when out of profile a malfunction is indicated and data access of that server is inhibited. Characteristics continue to be monitored for a return from the malfunction. The system is used in a resilient cluster of servers to shut out a malfunctioning server and enable its recovery to be indicated so as to enable readmittance to the cluster.

Publication date: 08-08-2007

Software duplication

Number: GB0002434890A

Software duplication system (100) having an active processing system (104a) comprising a processor (112) and a memory system (108) having at least one memory area (120a-y), a duplication system (128) and a standby processing system (104b-n) operable to perform the functions of the active processing system (104a). The duplication system (128) is operable to set a number of memory areas (120a-y) to a read-only state. In response to an attempt to write to these memory areas (120a-y), a notification that a write fault has occurred is produced. The state of the selected memory areas (120a-y) is changed to a writeable state and at least some of the alterations to the selected memory areas (120a-y) are provided to the standby processing system (104b-n), which then replicates the changes in appropriate locations in its memory. In this way, write faults are used to track memory areas that have been changed by application processes (116a-y) in the active processor (104a) during a selected interval.

Publication date: 05-11-1980

Transfer system for multi-variable control units

Number: GB0002045968A

An automatic transfer method and apparatus for a multi-variable control unit having at least two process control units is disclosed which initiates transfer from the selected controller to the non-selected controller through failure of the selected controller to produce transfer preventing signals.

Publication date: 28-09-2005

Method for validating system changes by use of a replicated system as a system testbed

Number: GB0000516997D0

Publication date: 24-04-2008

Mux BOP database mirroring

Number: AU2007311054A1
Inventor: MILNE ERIC, ERIC MILNE

Publication date: 19-10-2017

Fast crash recovery for distributed database systems

Number: AU2014235433C1
Assignee: Phillips Ormonde Fitzpatrick

A distributed database system may implement fast crash recovery. Upon recovery from a database head node failure, a connection with one or more storage nodes of a distributed storage system storing data for a database implemented by the database head node may be established. Upon establishment of the connection with the storage nodes, that database may be made available for access, such as for various access requests. In various embodiments, redo log records may not be replayed in order to provide access to the database. In at least some embodiments, the storage nodes may provide a current state of data stored for the database in response to requests.

Publication date: 16-05-1989

MULTICOMPUTER DIGITAL PROCESSING SYSTEM

Number: CA0001254304A1

Publication date: 16-08-1977

DATA PROCESSING CONTROL SYSTEM

Number: CA1015861A

Publication date: 16-05-1989

MULTICOMPUTER DIGITAL PROCESSING SYSTEM

Number: CA1254304A

A multiple computer digital processing system including several Local Buses positioned orthogonally to a Common Bus. Each Local Bus is connected to the Common Bus through a plugably connected Common Bus interface card to provide a transfer of information between Local Buses across the Common Bus. Computer cards, memory cards and other device cards may be plugably connected to the Local Bus to communicate with each other via the Local Buses and Common Bus. The number and types of cards connected and even the number of Local Buses connected to the Common Bus may be varied according to the requirements of each application. Additionally, the Common Bus includes a shared memory accessible by all devices and an InterComputer Interrupt circuit providing interrupts to the computer cards. Further the computer cards are plugably connectable to a Peripheral Bus to provide communications with peripheral devices located externally to the system. All cards connected to the Local Buses and Common Bus ...

Publication date: 16-02-2018

Cluster system and method for providing service availability in the cluster system

Number: CN0104427002B

Publication date: 27-10-2010

Device and method for configuring redundancy in a supervisory process control system

Number: CN0101076736B

A redundant host pair runtime arrangement is disclosed for a process control network environment. The environment includes a primary network through which process control information is transmitted. An active partner of a fail-over host pair operates on a first machine communicatively connected to the primary network, and the active partner hosts a set of executing application components. A standby partner of the fail-over host pair operates on a second machine communicatively connected to the primary network. The standby partner receives updates including engine synchronization data associated with the set of executing application components to facilitate taking over an active partner role in response to a fail-over event. The environment also includes a redundancy message channel, separate and distinct from the primary network. The redundancy message channel provides a communications path between the first machine and second machine facilitating passing the updates including engine synchronization ...

Publication date: 24-12-1981

INPUT/OUTPUT SYSTEM FOR A MULTIPROCESSOR PROCESSING SYSTEM

Number: FR0002485227A

The present invention relates to an input/output system for a multiprocessor processing system. The input/output system comprises several separate processor modules, each processor comprising a central processing unit and a memory, at least some of the processor modules having an input/output channel; at least one controller for controlling the transfer of data between a processor module and a peripheral device; several inputs in each controller and several input/output lines for connecting each controller so that it is accessible to the different processor modules; and common interface logic means for ensuring that only one input at a time operationally connects the controller to the multiprocessor processing system. In multiprogramming operation, programs are protected against the actions of users. The present invention is in particular ...

Publication date: 24-12-1981

MEMORY SYSTEM FOR MODULE PROCESSORS

Number: FR0002485228A1

Publication date: 04-11-2016

Redundant system and communication unit

Number: KR1020160127835A
Inventor: 다지마 나오야

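... The present invention provides a redundant system comprising a first control device, a second control device that is connected to the first control device and serves as its backup, and a plurality of subordinate communication units connected to the first control device and the second control device. The first control device includes a storage unit that stores a parameter containing information for identifying, among the plurality of subordinate communication units, the monitored units to be watched for communication abnormalities, together with alive-check data for the plurality of subordinate communication units, and a control unit that switches the system from the first control device to the second control device when it can determine, based on the alive-check data, that a unit whose liveness is unconfirmed exists among the plurality of subordinate communication units and, based on the parameter, that this unit is a monitored unit.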

Publication date: 13-11-2008

DYNAMIC CLI MAPPING FOR CLUSTERED SOFTWARE ENTITIES

Number: WO000002008135875A1
Inventor: TOEROE, Maria

Techniques for mapping availability management (AM) functions to software installation locations are described. An availability management function (AMF) can look-up a component type and determine software associated with that component. For a selected AMF node, the AMF software entity can then determine a pathname prefix associated with that software. The pathname prefix can then be used for various AM functions, e.g., instantiation of a new component or service unit.

Publication date: 26-01-2012

DUAL-CHANNEL HOT STANDBY SYSTEM AND METHOD FOR ACHIEVING DUAL-CHANNEL HOT STANDBY

Number: WO2012009960A1

Disclosed are a dual-channel hot standby system and a method for achieving dual-channel hot standby, said system comprising a hot standby state management layer, an application processing layer and a data communication layer. The hot standby state management layer comprises two hot standby management units, the application processing layer comprises two application processors and the data communication layer comprises two communicators. The hot standby state management layer is used for controlling the settings of and switching between the main and standby states of the two application processors, for monitoring the working state of the data communication layer, and for achieving the control cycle synchronization of the two channels of the system. One hot standby management unit controls one application processor and constitutes a channel with the application processor. The data communication layer is used for receiving data from the outside and forwarding the data to the application processing ...

Publication date: 13-05-2014

Intra-realm AAA fallback mechanism

Number: US0008726068B2

There is provided an intra-realm AAA (authentication, authorization and accounting) fallback mechanism, wherein the single global realm may be divided in one or more sub-realms. The thus presented mechanism exemplarily comprises detecting a failure of an authentication server serving at least one authentication client within a first sub-realm of a single-realm authentication system, and routing authentication messages of the at least one authentication client to a fallback authentication server within a second sub-realm of the single-realm authentication system, wherein routing may exemplarily comprise sub-realm based source routing.

Publication date: 12-06-2018

System and method for hierarchical interception with isolated environments

Number: US0009996399B1

A system, method, computer program, and/or computer readable medium for providing hierarchical interception for applications within isolated environments. The computer readable medium includes computer-executable instructions for execution by a processing system. The computer-executable instructions may be for installing interceptors, configuring interceptors, preloading shared libraries, using trampoline functions, removal of interceptors, mapping between resources inside and outside the isolated environment, providing an interception database, loading the interception database, redirection of resources, and providing the hierarchy of interceptors.

Publication date: 21-05-2015

DATA CONFIGURATION AND MIGRATION IN A CLUSTER SYSTEM

Number: US20150143066A1

A cluster system includes a plurality of computing nodes connected to a network. Each node is configured to access its own storage device, and to send and receive input/output (I/O) operations associated with its own storage device. Further, each node of the plurality of nodes may be configured to have a function of acting as a first node, which sends a first message to other nodes of the plurality of nodes. The first message may include configuration information indicative of a data placement of data on the plurality of nodes in the cluster system according to an event. Following receipt of the first message from the first node, each of the other nodes may be configured to determine, based at least in part on the configuration information, whether data stored on its own storage device is affected by the event.

Publication date: 03-06-2014

Warm standby appliance

Number: US0008745171B1

A warm standby appliance is described herein. The warm standby appliance is coupled to a storage server which is coupled to one or more servers. When a server fails, the storage server transfers a backed up image to the warm standby appliance, so that the warm standby appliance is able to replicate the failed server. While the failed server is inaccessible, the warm standby appliance is able to mimic the functionality of the failed server. When a new server or repaired server is available, the warm standby appliance is no longer needed. To incorporate the new server into the system quickly and easily, the server image of the warm standby appliance is sent to the new server. After transferring the image, the warm standby appliance is cleaned and returns back to a dormant state, waiting to be utilized again.
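The life cycle described here (dormant, taking over from a failed server, handing the image to a replacement, dormant again) can be sketched as a small state machine. This is an illustrative sketch only: the class `WarmStandbyAppliance`, its method names, and the use of a plain string as the "server image" are assumptions, not details from the patent.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    DORMANT = "dormant"   # waiting to be utilized
    ACTIVE = "active"     # mimicking a failed server


class WarmStandbyAppliance:
    """Hypothetical model of the appliance's dormant/active cycle."""

    def __init__(self) -> None:
        self.state = State.DORMANT
        self.image: Optional[str] = None

    def take_over(self, backed_up_image: str) -> None:
        """Receive the failed server's backed-up image from the storage server."""
        if self.state is not State.DORMANT:
            raise RuntimeError("appliance is already standing in for a server")
        self.image = backed_up_image
        self.state = State.ACTIVE

    def hand_off(self) -> str:
        """Send the image to the new or repaired server, then clean up."""
        if self.state is not State.ACTIVE or self.image is None:
            raise RuntimeError("nothing to hand off")
        image = self.image
        self.image = None            # the appliance is cleaned
        self.state = State.DORMANT   # and returns to the dormant state
        return image
```

A single appliance can therefore cover only one failed server at a time in this sketch, which is why `take_over` refuses a second image while the appliance is active.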

Publication date: 02-02-2012

Method and apparatus for managing data of operation system

Number: US20120030323A1
Author: Akinori Matsuno
Assignee: Fujitsu Ltd

A server for an operation system includes a monitor to monitor the status of another server, a first storage to retain first network configuration information, a second storage to copy the first network configuration information when an abnormality is detected in the other server, a third storage to retain first update history information including update information of network configuration information obtained from a client in the operation system, and an operation configuration manager to update the first network configuration information and second network configuration information retained in the other server when the other server recovers from the abnormality. The operation configuration manager is configured to update the first network configuration information and the second network configuration information based on the first update history information and second update history information retained in the other server.

Publication date: 23-05-2013

Mechanism to Provide Assured Recovery for Distributed Application

Number: US20130132765A1
Assignee: CA Inc

A system and method are provided for assured recovery of a distributed application. Replica servers associated with the distributed application may be coordinated to perform integrity testing together for the whole distributed application. The replica servers connect to each other in a manner similar to the connection between master servers associated with the distributed application, thereby preventing the replica servers from accessing and/or changing application data on the master servers during integrity testing.

Publication date: 06-06-2013

Law breaking/behavior sensor

Number: US20130144459A1
Author: Christopher P. Ricci
Assignee: FLEXTRONICS AP LLC

Methods and systems for a complete vehicle ecosystem are provided. Specifically, systems that when taken alone, or together, provide an individual or group of individuals with an intuitive and comfortable vehicular environment. The present disclosure builds on integrating existing technology with new devices, methods, and systems to provide a complete vehicle ecosystem.

Publication date: 06-06-2013

Configurable vehicle console

Number: US20130144463A1
Assignee: FLEXTRONICS AP LLC

Methods and systems for a configurable vehicle console are provided. Specifically, a configurable console may comprise one or more displays that are capable of receiving input from a user. At least one of these displays may be removed from the console of a vehicle and operated as a stand-alone computing platform. Moreover, it is anticipated that each one or more of the displays of the console may be configured to present a plurality of custom applications that, when manipulated by at least one user, are adapted to control functions associated with a vehicle and/or associated peripheral devices.

Publication date: 11-07-2013

DUAL-CHANNEL HOT STANDBY SYSTEM AND METHOD FOR CARRYING OUT DUAL-CHANNEL HOT STANDBY

Number: US20130179723A1
Assignee: Beijing Jiaotong University

A dual-channel hot standby system and a method for carrying out dual-channel hot standby. The system comprises a hot standby status management layer including two hot standby management units, an application processing layer including two application processors, and a data communication layer including two communicators. The hot standby status management layer is used for controlling the setting of, and switching between, an active status and a standby status of the two application processors, monitoring the working status of the data communication layer, and synchronizing the control cycles of the two channels of the system. One of the hot standby management units controls one of the application processors and, together with it, constitutes a channel of the system. The data communication layer is used for receiving data from outside and forwarding the data to the application processing layer. The present invention avoids the occurrence of a “dual-channel-active” or “dual-channel-standby” status; ensures synchronization of the control cycles of the two channels; reduces the time the system takes to respond to breakdowns; meets real-time requirements; enhances the reliability and availability of the system; and ensures seamless switching between active and standby statuses.

1. A dual-channel hot standby system, characterized in that it comprises a hot standby status management layer including two hot standby management units, an application processing layer including two application processors, and a data communication layer including two communicators; the hot standby status management layer is used for controlling the setting of, and switching between, an active status and a standby status of the two application processors, monitoring the working status of the data communication layer, and carrying out synchronization of the control cycles for the two channels of the system; wherein one of the hot standby management units controls one of the ...

Publication date: 18-07-2013

QUERY EXECUTION AND OPTIMIZATION WITH AUTONOMIC ERROR RECOVERY FROM NETWORK FAILURES IN A PARALLEL COMPUTER SYSTEM WITH MULTIPLE NETWORKS

Number: US20130185588A1

A database query execution monitor determines if a network error or low performance condition exists and then where possible modifies the query. The query execution monitor then determines an alternate query execution plan to continue execution of the query. The query optimizer can re-optimize the query to use a different network or node. Thus, the query execution monitor allows autonomic error recovery for network failures using an alternate query execution. The alternate query execution could also be determined at the initial optimization time and then this alternate plan used to execute a query in the case of a particular network failure.

1. A computer apparatus comprising: a plurality of nodes each having a memory and at least one processor; a database residing in the memory; a plurality of networks connecting the plurality of nodes; a network monitor that periodically monitors the plurality of networks to determine network loading and maintains a network file that contains information about network utilization; a query optimizer and a query to the database residing in the memory; a query execution monitor residing in the memory and executed by the at least one processor, the query execution monitor detecting a network failure during execution of the query and invoking the query optimizer to re-optimize the query to use a different network to execute the query, the query execution monitor detecting poor performance of execution of the query and invoking the query optimizer to re-optimize the query to use a different network to execute the query.

2. The computer apparatus of wherein the query execution monitor determines part of the query executed prior to the network failure and then modifies the query to utilize data from the part of the query that executed prior to the network failure.

3. The computer apparatus of wherein the network file maintained by the network monitor is used by the query execution monitor and wherein the network file contains network file ...

Publication date: 08-08-2013

Redundant computer control method and device

Number: US20130205162A1
Assignee: Fujitsu Ltd

Disclosed is a non-transitory computer-readable medium storing a program, which causes a computer to execute a sequence of processing. The sequence of processing includes receiving status information by a second server device from a client device, the status information being collected by the client device, and including a status of a first server device and statuses of one or more standby servers configured to operate when the first server device fails, and causing the second server device to operate, when the status information indicates a predetermined first status, as at least one of the first server device and the one or more standby servers in a failure status.

Publication date: 15-08-2013

COMPUTER SYSTEM AND BOOT CONTROL METHOD

Number: US20130212424A1
Assignee: Hitachi, Ltd.

When a secondary computer takes over from a primary computer in a redundancy configuration computer system where booting is performed via a storage area network (SAN), a management server delivers an information collecting/setting program to the secondary computer before the user's operating system of the secondary computer is started. This program sets the unique ID (World Wide Name) that was assigned to the fibre channel port of the failed primary computer on the fibre channel port of the secondary computer, allowing a software image to be taken over from the primary computer to the secondary computer.

1. A boot control method for a computer system having a plurality of computers, a management server that controls said plurality of computers, and a storage device that is shared by said plurality of computers, each computer having a port, a program for each computer is stored in a logical unit of said storage device, the computer system is configured to boot the program for each computer by using a unique ID that is set on the port of each computer, the unique ID that is set on the port of each computer is associated with the program stored in the logical unit, said boot control method comprising the steps of: managing, by said management server, unique IDs assigned to ports of the computers, and delivering a first unique ID assigned to a port of a failed computer to a secondary computer among said plurality of computers; setting, by said secondary computer, the delivered first unique ID on the port of the secondary computer, and notifying the management server of the first unique ID set on the port of the secondary computer; managing, by the management server, the notified first unique ID as a unique ID assigned to the port of the secondary computer; accessing, by said secondary computer, a logical unit associated with the failed computer using a logical connection newly created between the secondary computer and the storage device based on the setting of the first unique ID, and booting the program for the failed ...

Publication date: 29-08-2013

Failover Processing

Number: US20130227339A1
Author: Lund Christian
Assignee: METASWITCH NETWORKS LTD.

A method of providing failover processing between a first element and a second element in a data communications network, the method comprising: configuring a first channel and a second channel between the first and second elements, the first and second channels comprising different physical data paths; receiving at the first element, via the first channel, first data signals representative of functioning statuses of the second element, the first channel being configured to allow a non-optimal, partly functioning status of the second element to be communicated to the first element; receiving at the first element, via the second channel, second data signals representative of functioning statuses of the second element, the second channel being configured to allow a failed functioning status of the second element to be communicated to the first element; and conducting failover processing based on both the first and second data signals.

1. A method of providing failover processing between a first element and a second element, the first element and the second element each being suitable for performing a data processing function in a data communications network, the method comprising: configuring a first channel and a second channel between the first and second elements, said first and second channels comprising different physical data paths; receiving at the first element, via the first channel, first data signals representative of functioning statuses of the second element, the first channel being configured to allow a non-optimal, partly functioning status of the second element to be communicated to the first element; receiving at the first element, via the second channel, second data signals representative of functioning statuses of the second element, the second channel being configured to allow a failed functioning status of the second element to be communicated to the first element; and conducting failover processing based on both the first and second data signals. ...

Publication date: 26-09-2013

Facility Control System and Facility Control Method

Number: US20130253665A1
Author: Kazuto Mori, Kouichi Ikawa
Assignee: Daifuku Co Ltd

A facility control system comprises a selection processing portion that selects, based on a manual operation and when an abnormal condition occurs in a first-layer computer that executes a first-layer program which issues an apparatus operating command to an apparatus controller, whether to cause a second-layer computer to execute the first-layer program that had been executed by the first-layer computer, and a substitute command output processing portion which outputs a substitute command in accordance with selection information selected by the selection processing portion. The second-layer computer executes the first-layer program that had been executed by the first-layer computer in which the abnormal condition occurred based on a substitute command outputted by the substitute command output processing portion.

Publication date: 26-09-2013

STANDBY SYSTEM DEVICE, A CONTROL METHOD, AND A PROGRAM THEREOF

Number: US20130254588A1
Author: FUJIEDA Tsuyoshi

A standby system device which is connected to an active system device includes a process information sharing unit B and a standby process management unit C. The process information sharing unit B receives, from the active system device, active side process information indicating usage of resources of an active system process A operating on the active system device. The standby process management unit C terminates a standby process A before activating a takeover process D used for taking over processing of the active system process A when a takeover of the active system process is requested on the standby system device, the standby process A referring to the active side process information and acquiring resources in such a way that usage of resources of the standby process A is equal to or greater than the usage of resources of the active system process A.

1. A standby system device which is connected to an active system device, comprising: a process information sharing unit which receives active side process information indicating usage of resources of an active system process operating on said active system device from said active system device; and a standby process management unit which terminates a standby process before activating a takeover process that is used for taking over processing of said active system process when a takeover of said active system process is requested on said standby system device, said standby process referring to said active side process information and acquiring resources in such a way that usage of resources of said standby process is equal to or greater than said usage of resources of said active system process.

2. The standby system device according to claim 1, wherein said standby process management unit activates said standby process before said takeover of said active system process is requested.

3. The standby system device according to claim 1, wherein said standby process management unit activates said standby process at a time at ...

Publication date: 03-10-2013

CLUSTER MONITOR, METHOD FOR MONITORING A CLUSTER, AND COMPUTER-READABLE RECORDING MEDIUM

Number: US20130262916A1
Author: SATO Yoichi
Assignee: NEC Corporation

A cluster monitor controls activation of a business application program and a monitoring agent in a cluster system that includes a plurality of servers. The cluster monitor includes a business server identifying unit that identifies a server on which the business application program is operating among the servers, and an agent server selecting unit that selects a server for activating the monitoring agent from among the servers based on the identified server.

1. A cluster monitor for controlling activation of a business application program and a monitoring agent in a cluster system including a plurality of servers, comprising: a business server identifying unit that identifies a server on which the business application program is operating from among the plurality of servers; and an agent server selecting unit that selects a server for activating the monitoring agent from among the plurality of servers, based on the identified server.

2. The cluster monitor according to claim 1, wherein in a case where the monitoring agent is activated on one of the plurality of servers, if a failure occurs in the server on which the monitoring agent is activated, the business server identifying unit identifies, in response to the occurrence of the failure, the server on which the business application program is operating, and the agent server selecting unit selects a server for activating the monitoring agent.

3. The cluster monitor according to claim 1, wherein in a case where the monitoring agent is activated on one of the plurality of servers, if a failure relating to the business application program occurs and fail-over of the business application program is executed, the business server identifying unit identifies, in response to the execution of the fail-over, a server to take over the business application program due to the fail-over, and the agent server selecting unit selects a server for activating the monitoring agent.

4. The cluster monitor according to claim ...

Publication date: 14-11-2013

NETWORK TRAFFIC ROUTING

Number: US20130305085A1

A service appliance is installed between production servers running service applications and service users. The production servers and their service applications provide services to the service users. In the event that a production server is unable to provide its service to users, the service appliance can transparently intervene to maintain service availability. To maintain transparency to service users and service applications, service users are located on a first network and production servers are located on a second network. The service appliance assumes the addresses of the service users on the second network and the addresses of the production servers on the first network. Thus, the service appliance obtains all network traffic sent between the production server and service users. While the service application is operating correctly, the service appliance forwards network traffic between the two networks using various network layers.

1. Apparatus comprising a storage medium storing a program for maintaining availability of a first service on a first server to plural client systems via a network, the instructions of the program for: synchronizing a second service provided by a second server with the first service provided by the first server; monitoring availability of the first service; if the first service is unavailable, causing the second service to be substituted in place of the first service and monitoring a third service; if the third service is available and capable of handling access by client systems, causing the third service to synchronize with the second service; monitoring synchronization of the third service with the second service; if the third service is synchronized with the second service, causing the third service to be substituted in place of the second service, such that the third service is responsive to communications from the client systems directed to the first service.

2. The apparatus of claim 1, further comprising the second server, the ...

Publication date: 21-11-2013

Resiliency to memory failures in computer systems

Number: US20130311823A1
Assignee: Cray Inc

A resiliency system detects and corrects memory errors reported by a memory system of a computing system using previously stored error correction information. When a program stores data into a memory location, the resiliency system executing on the computing system generates and stores error correction information. When the program then executes a load instruction to retrieve the data from the memory location, the load instruction completes normally if there is no memory error. If, however, there is a memory error, the computing system passes control to the resiliency system (e.g., via a trap) to handle the memory error. The resiliency system retrieves the error correction information for the memory location and re-creates the data of the memory location. The resiliency system stores the data as if the load instruction had completed normally and passes control to the next instruction of the program.
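The store/load/trap flow in this abstract can be sketched in a few lines of Python. The sketch is a simulation under stated assumptions, not the patented method: the abstract does not say what form the error correction information takes, so a plain redundant copy is used here, and the names `ResilientMemory`, `MemoryFault`, and `inject_fault` are hypothetical.

```python
class MemoryFault(Exception):
    """Raised by the simulated memory system when a location is corrupted."""


class ResilientMemory:
    """Hypothetical simulation of a resiliency layer over faulty memory."""

    def __init__(self, size: int) -> None:
        self._cells = [0] * size
        self._corrupt = set()   # addresses the simulated hardware reports as bad
        self._recovery = {}     # address -> error correction information

    def store(self, addr: int, value: int) -> None:
        """Store data and record error correction information for it."""
        self._cells[addr] = value
        self._corrupt.discard(addr)
        self._recovery[addr] = value  # assumption: a plain redundant copy

    def inject_fault(self, addr: int) -> None:
        """Test helper: make the memory system report an error at addr."""
        self._cells[addr] = None
        self._corrupt.add(addr)

    def _raw_load(self, addr: int) -> int:
        """The plain load instruction; traps on a memory error."""
        if addr in self._corrupt:
            raise MemoryFault(addr)
        return self._cells[addr]

    def load(self, addr: int) -> int:
        """Load data; on a memory error, re-create it and continue."""
        try:
            return self._raw_load(addr)
        except MemoryFault:
            value = self._recovery[addr]   # retrieve the correction information
            self._cells[addr] = value      # re-create the data in place
            self._corrupt.discard(addr)
            return value                   # as if the load had completed normally
```

The caller of `load` never sees the fault, which mirrors the abstract's point that control passes to the next instruction as if the load had completed normally.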

Publication date: 19-12-2013

Recovery of a System for Policy Control and Charging, Said System Having a Redundancy of Policy and Charging Rules Function

Number: US20130339783A1
Assignee: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

A first Policy and Charging Rules Function “PCRF” server for recovery of a Policy and Charging Control “PCC” system. The PCC system also has a PCRF-client and a second PCRF server that was previously in charge of controlling an Internet Protocol Connectivity Access Network “IP-CAN” session established with a UE. The first PCRF server includes a network interface unit arranged for receiving a modification request of the IP-CAN session from the PCRF-client after failure of the second PCRF server, which was in active mode. The first PCRF server has a PCRF identifier which is shared with the second PCRF server that has failed, and the first PCRF server is now in active mode. The modification request requests new rules for the IP-CAN session; it includes modification data and excludes access data and supported features for the IP-CAN session. The first PCRF server also includes a processing unit arranged for determining that the IP-CAN session is unknown, and for submitting a request from the network interface unit of the first PCRF server to the PCRF-client to provide all information that the PCRF-client has regarding the IP-CAN session. The information includes all data required to be sent for the IP-CAN session establishment and synchronization data. Also disclosed are a PCRF-client for recovery of a PCC system; methods for recovery of a PCC system with a first PCRF server in standby mode, a second PCRF server in active mode, and a PCRF-client, wherein an IP-CAN session is already established with a UE and controlled by the second PCRF server; and a computer program embodied on a computer readable medium for recovery of a PCC system.

1-21. (canceled)

22. A method for recovery of a Policy and Charging Control (PCC) system with a first Policy and Charging Rules Function (PCRF) ...

Publication date: 13-02-2014

SYNCHRONOUS LOCAL AND CROSS-SITE FAILOVER IN CLUSTERED STORAGE SYSTEMS

Number: US20140047263A1

Synchronous local and cross-site switchover and switchback operations of a node in a disaster recovery (DR) group are described. In one embodiment, during switchover, a takeover node receives a failover request and responsively identifies a first partner node in a first cluster and a second partner node in a second cluster. The first partner node and the takeover node form a first high-availability (HA) group and the second partner node and a third partner node in the second cluster form a second HA group. The first and second HA groups form the DR group and share a storage fabric. The takeover node synchronously restores client access requests associated with a failed partner node at the takeover node.

1. A method comprising: receiving, by a takeover node in a first cluster at a first site of a cross-site clustered storage system, a failover request; processing, by the takeover node, the failover request to identify a first partner node in the first cluster and a second partner node in a second cluster at a second site, the first partner node and the takeover node forming a first high-availability (HA) group, the second partner node and a third partner node in the second cluster forming a second HA group, the first HA group and the second HA group forming a disaster recovery (DR) group and sharing a storage fabric with each other; and resuming, by the takeover node, client access requests associated with a failed partner node synchronously at the takeover node.

2. The method of claim 1, further comprising: synchronously replicating, by the takeover node, cache data associated with the takeover node to the first partner node at the first site and the second partner node at the second site during non-failover conditions, wherein the first site and the second site are geographically remote with respect to each other.

3. The method of claim 2, wherein synchronously replicating the cache data comprises synchronously replicating, by the takeover node, the ...

Publication date: 20-02-2014

Techniques for performing processing for database

Number: US20140052826A1
Author: Masahiro Ohkawa
Assignee: International Business Machines Corp

Embodiments relate to a method, system and program product for performing data processing. The system includes a plurality of computer servers configured to perform data processing, a client in processing communication with the computer servers and enabled to request data processing from any of the servers, and a storing component included in the client for storing information relating to requested data to be processed. A processing component included in each computer server applies a control lock to data being processed. A reprocessing request component included in the client enables a new server to take over processing of the requested data upon failure of the computer server that was previously processing it. The new computer server obtains information relating to the requested data from the storing component and control lock information from the processing component, so that it commences processing at the processing point immediately prior to the failure.

Publication date: 27-03-2014

THREAD SPARING BETWEEN CORES IN A MULTI-THREADED PROCESSOR

Number: US20140089732A1

Embodiments relate to thread sparing between cores in a processor. An aspect includes determining that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold, and sending a request to transfer the first thread. Another aspect includes selecting a second core from a plurality of cores to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread. Another aspect includes transferring a last good architected state of the first thread from the first core to the second core. Another aspect includes loading the last good architected state of the first thread by the idle thread on the second core. Yet another aspect includes resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread.

1. A computer implemented method for thread sparing between cores in a processor, the method comprising: determining, by a first core of the processor, that a number of recovery attempts made by a first thread on the first core has exceeded a recovery attempt threshold; sending, by the first core to a processor controller in the processor, a request to transfer the first thread to another core of the processor; based on receiving the request, selecting, by the processor controller, a second core from a plurality of cores of the processor to receive the first thread from the first core, wherein the second core is selected based on the second core having an idle thread; transferring a last good architected state of the first thread from an error recovery logic of the first core to the second core; loading the last good architected state of the first thread by the idle thread on the second core; and resuming execution of the first thread on the second core from the last good architected state of the first thread by the idle thread.

2. The method of claim 1, wherein the recovery attempts made by the ...

Publication date: 04-01-2018

DUAL-PORT NON-VOLATILE DUAL IN-LINE MEMORY MODULES

Number: US20180004422A1

According to an example, a dual-port non-volatile dual in-line memory module (NVDIMM) includes a first port to provide a central processing unit (CPU) with access to universal memory of the dual-port NVDIMM and a second port to provide an external NVDIMM manager circuit with access to the universal memory of the dual-port NVDIMM. Accordingly, a media controller of the dual-port NVDIMM may store data received from the CPU through the first port in the universal memory, control dual-port settings received from the CPU, and transmit the stored data to the NVDIMM manager circuit through the second port of the dual-port NVDIMM.

1. A dual-port non-volatile dual in-line memory module (NVDIMM), comprising: a first port to provide a central processing unit (CPU) with access to universal memory of the dual-port NVDIMM; a second port to provide an external NVDIMM manager circuit with access to the universal memory of the dual-port NVDIMM, wherein the NVDIMM manager circuit interfaces with remote storage; and a media controller to: store data received from the CPU through the first port of the dual-port NVDIMM in the universal memory, control dual-port settings for the dual-port NVDIMM received from the CPU through the first port of the dual-port NVDIMM, wherein the dual-port settings include at least one of an active-active redundancy flow and an active-passive redundancy flow, and transmit the stored data to the NVDIMM manager circuit through the second port of the dual-port NVDIMM.

2. The dual-port NVDIMM of claim 1, wherein responsive to controlling the dual-port settings to be the active-active redundancy flow, the media controller is to set both the first port and the second port of the dual-port NVDIMM to an active state so that the CPU and NVDIMM manager circuit can simultaneously access the dual-port NVDIMM.

3. The dual-port NVDIMM of claim 2, wherein the media controller comprises an integrated direct memory access (DMA) engine to migrate the stored ...

Publication date: 02-01-2020

FAULT TOLERANCE METHOD AND SYSTEM FOR VIRTUAL MACHINE GROUP

Number: US20200004642A1

A fault tolerance method and system for a virtual machine group is proposed. The method includes: establishing fault tolerance backup connections of virtual machines between a virtual machine hypervisor of at least one primary host and a virtual machine hypervisor of at least one backup host to perform fault tolerance backups of the virtual machines, wherein the plurality of virtual machines are included in a fault tolerance group; and, when a synchronizer determines that a failover of at least one first virtual machine among the primary virtual machines in the fault tolerance group is being performed, informing, by the synchronizer, to perform a failover of other remaining primary virtual machines among the primary virtual machines in the fault tolerance group, or to return other remaining primary virtual machines among the primary virtual machines in the fault tolerance group back to a last fault tolerance backup state of each and continue performing fault tolerance backups of the other remaining primary virtual machines.

1. A fault tolerance method for a virtual machine group, applicable to a fault tolerance system, and comprising: establishing fault tolerance backup connections of a plurality of primary virtual machines between a virtual machine hypervisor of at least one primary host and a virtual machine hypervisor of at least one backup host to perform fault tolerance backups of the primary virtual machines, wherein the primary virtual machines are included in a fault tolerance group; and when a synchronizer determines that a failover of at least one first virtual machine among the primary virtual machines in the fault tolerance group is being performed, informing, by the synchronizer, to perform a failover of one or more other remaining primary virtual machines among the primary virtual machines in the fault tolerance group, or informing, by the synchronizer, to return the one or more other remaining primary virtual machines among the primary virtual machines in ...

Publication date: 13-01-2022

INDEXING BACKUP DATA GENERATED IN BACKUP OPERATIONS

Number: US20220012135A1

In certain embodiments, a tiered storage system is disclosed that provides for failover protection during data backup operations. The system can provide for an index, or catalog, for identifying and enabling restoration of backup data located on a storage device. The system further maintains a set of transaction logs generated by media agent modules that identify metadata with respect to individual data chunks of a backup file on the storage device. A copy of the catalog and transaction logs can be stored at a location accessible by each of the media agent modules. In this manner, in case of a failure of one media agent module during backup, the transaction logs and existing catalog can be used by a second media agent module to resume the backup operation without requiring a restart of the backup process.

1. A non-transitory computer-readable medium that stores instructions which, when executed by a first computing device comprising one or more hardware processors, cause the first computing device to: receive from a second computing device one or more first transaction logs, wherein the one or more first transaction logs are based on one or more of: (a) first backup data generated by the second computing device, wherein the first backup data comprises a plurality of data chunks, and (b) storage of the first backup data as the plurality of data chunks in one or more second storage devices that are communicatively coupled to the second computing device, wherein once a given data chunk from the plurality of data chunks is stored in the one or more second storage devices, the second computing device transmits a corresponding transaction log to the first computing device, and wherein the second computing device comprises one or more hardware processors; after the first backup data is generated, apply the one or more first transaction logs to an index; and wherein the index is configured to enable restoring backup data generated by at least the second ...

Publication date: 10-01-2019

Technique For Higher Availability In A Multi-Node System

Number: US20190012244A1

Techniques are described herein for quick identification of a set of units of data for which recovery operations are to be performed to redo or undo changes made by the failed node. When a lock is requested by an instance, lock information for the lock request is replicated by another instance. If the instance fails, the other instance may use the replicated lock information to determine a set of data blocks for recovery operations. The set of data blocks is available in memory of a recovery instance when a given node fails, and does not have to be completely generated by scanning a redo log.

1. A method comprising: generating, at a first node of a multi-node database system, a plurality of lock requests; for each lock request of the plurality of lock requests: storing, in a redo log associated with the first node, changes to a target data block and a change number associated with the changes; receiving, at a second node of the multi-node database system, a request to replicate lock information for the lock request; and storing, in a memory of the second node, the change number and a location of the target data block.
2. The method of wherein only the second node is assigned to replicate lock information for the first node.
3. The method of claim 1, wherein a plurality of nodes are assigned to replicate lock information for the first node, and the plurality of nodes includes the second node.
4. The method of further comprising sending the request to replicate lock information asynchronously to the second node.
5. The method of further comprising: in response to a failure of the first node, sending a recovery request to the second node; determining at the second node, based on replicated lock information, a set of one or more data blocks to recover, wherein said replicated lock information includes replicated lock information for a plurality of lock requests.
6. The method of claim 5, wherein determining the set of one or more data blocks comprises: determining a ...

Publication date: 10-01-2019

Failover Method, Apparatus and System

Number: US20190012245A1

A failover method, apparatus and system to implement fast failover between a primary processor and a secondary processor, where the method includes receiving, by a first device, transaction content of a transaction and transaction status data of the transaction, the transaction status data being used to resume the transaction when the transaction is interrupted by a failure of a second device, and continuing to process, by the first device, the transaction according to the transaction content and the transaction status data when detecting that the second device fails.

1. A failover method, comprising: receiving, by a first device, transaction content of a transaction and transaction status data of the transaction, the transaction status data being used to resume the transaction when the transaction is interrupted by a failure of a second device; and continuing to process, by the first device, the transaction according to the transaction content and the transaction status data when detecting that the second device fails.
2. The failover method of claim 1, wherein the transaction status data comprises a transaction processing location identifier, and continuing to process the transaction comprises: determining, by the first device according to the transaction processing location identifier, a location at which the transaction is interrupted; and continuing to process, by the first device, the interrupted transaction from the location at which the transaction is interrupted.
3. The failover method of claim 1, wherein continuing to process the transaction comprises processing, by the first device, the interrupted transaction again from a start position of the transaction.
4. The failover method of claim 1, wherein the transaction status data comprises a transaction completion identifier, and the failover method further comprises deleting, by the first device, information corresponding to the transaction ...
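
The resume behaviour described in claims 2 and 3 of US20190012245A1 can be illustrated with a minimal Python sketch. It assumes the transaction content is an ordered list of callable steps and that the status data carries a step index; `TransactionStatus`, `next_step`, `completed` and `continue_transaction` are hypothetical names chosen for the illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class TransactionStatus:
    """Status data replicated from the second device (hypothetical layout)."""
    txn_id: str
    next_step: int = 0       # stands in for the processing location identifier
    completed: bool = False  # stands in for the completion identifier


def continue_transaction(steps, status, resume=True):
    """Run the remaining steps of an interrupted transaction on the first device.

    steps  -- transaction content: an ordered list of zero-argument callables
    status -- TransactionStatus received before the second device failed
    resume -- True: continue from the recorded location (as in claim 2);
              False: process again from the start position (as in claim 3)
    """
    start = status.next_step if resume else 0
    results = []
    for index in range(start, len(steps)):
        results.append(steps[index]())
        status.next_step = index + 1  # record progress after each step
    status.completed = True
    return results
```

With three steps and a recorded location of 1, the first device would run only the second and third steps; passing `resume=False` would run all three again.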

Publication date: 09-01-2020

ROLE MANAGEMENT OF COMPUTE NODES IN DISTRIBUTED CLUSTERS

Number: US20200012577A1

In one example, a distributed cluster may include compute nodes having a master node and a replica node, an in-memory data grid formed from memory associated with the compute nodes, a first high availability agent running on the replica node, and a second high availability agent running on the master node. The first high availability agent may determine a failure of the master node by accessing data in the in-memory data grid and designate a role of the replica node as a new master node to perform cluster management tasks of the master node. The second high availability agent may determine that the new master node is available in the distributed cluster by accessing the data in the in-memory data grid when the master node is restored after the failure and demote a role of the master node to a new replica node.

1. A distributed cluster comprising: a plurality of compute nodes comprising a master node and a replica node; an in-memory data grid formed from memory associated with the plurality of compute nodes; a first high availability agent running on the replica node to: determine a failure of the master node by accessing data in the in-memory data grid; and designate a role of the replica node as a new master node to perform cluster management tasks of the master node upon determining the failure; and a second high availability agent running on the master node to: determine that the new master node is available in the distributed cluster by accessing the data in the in-memory data grid when the master node is restored after the failure; and demote a role of the master node to a new replica node upon determining the new master node.
2. The distributed cluster of claim 1, wherein the first high availability agent running on the replica node is to: initiate a failover operation to failover the replica node to the new master node upon determining the failure of the master node; upon completion of the failover operation, determine whether the master node is restored ...
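
The promote and demote logic of US20200012577A1 can be sketched in Python as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the in-memory data grid, failure is detected through a stale heartbeat entry (the listing does not say which data the agents read), and `HighAvailabilityAgent`, `HEARTBEAT_TIMEOUT` and the key names are hypothetical.

```python
HEARTBEAT_TIMEOUT = 5.0  # seconds; an assumed value, not from the patent


class HighAvailabilityAgent:
    """One agent per node; all agents share the same grid object."""

    def __init__(self, node_id, role, grid):
        self.node_id = node_id
        self.role = role  # "master" or "replica"
        self.grid = grid  # dict standing in for the in-memory data grid

    def heartbeat(self, now):
        """Publish this node's liveness timestamp to the grid."""
        self.grid["heartbeat:" + self.node_id] = now

    def check_master(self, now):
        """Replica side: take the master role if the master's heartbeat is stale."""
        master = self.grid.get("master")
        last_seen = self.grid.get("heartbeat:" + str(master), float("-inf"))
        if self.role == "replica" and now - last_seen > HEARTBEAT_TIMEOUT:
            self.grid["master"] = self.node_id
            self.role = "master"
        return self.role

    def on_restore(self):
        """Restored node: become a replica if the grid names another master."""
        if self.role == "master" and self.grid.get("master") != self.node_id:
            self.role = "replica"
        return self.role
```

Timestamps are passed in explicitly so the behaviour can be checked without waiting on a real clock.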

Publication date: 19-01-2017

DETECTING HIGH AVAILABILITY READINESS OF A DISTRIBUTED COMPUTING SYSTEM

Number: US20170017535A1

Technology is disclosed for determining high availability readiness of a distributed computing system (“system”). A confidence measure (CM) can be computed for a particular controller in the system to determine whether a takeover by the particular controller from a first controller would be successful. The CM can be a percentage value. A CM of 0% indicates that a takeover would be a failure, which results in loss of access to data managed by the first controller. A CM of 100% indicates a successful takeover with no performance impact on the system. A CM between 0% and 100% indicates a successful takeover but with a performance impact. The CM can be computed based on events occurring in the system, e.g., veto and non-veto events. The CM is computed as a function of various weights and/or indices associated with the veto events and/or non-veto events.

1. A method, comprising: receiving, by a computing device, a list of historical events related to a high availability pair comprising a first node and a second node of a distributed computing system; determining, by the computing device, a set of non-veto events and a set of veto events related to at least one of the nodes from the list of historical events; obtaining, by the computing device, a severity index and a compliance factor for each event of the set of non-veto events; and generating and outputting, by the computing device, a confidence measure for the at least one of the nodes based on the set of veto events, the severity index, and the compliance factor.
2. The computer-implemented method of claim 1, wherein the confidence measure indicates a magnitude of an impact on a performance of the distributed computing system if the at least one of the nodes takes over from another one of the nodes.
3. The computer-implemented method of claim 1, wherein the severity index of an event of the set of non-veto events indicates a magnitude of performance impact on the distributed computing system due to the occurrence of the ...
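
The listing for US20170017535A1 does not give the formula behind the confidence measure, so the Python sketch below uses an assumed one purely to show the shape of the computation: any veto event forces 0%, and each non-veto event scales the remaining confidence by one minus the product of its severity index and compliance factor. The function name, the [0, 1] value range and the multiplicative aggregation are all assumptions, not the patented method.

```python
def confidence_measure(veto_events, non_veto_events):
    """Return a takeover confidence measure in percent (0.0 to 100.0).

    veto_events     -- iterable of events that block a takeover outright
    non_veto_events -- iterable of (severity_index, compliance_factor) pairs,
                       each value assumed to lie in [0, 1]
    """
    if list(veto_events):
        return 0.0  # a veto event means the takeover would fail
    remaining = 1.0
    for severity, compliance in non_veto_events:
        # Assumed aggregation: each event removes a share of the confidence.
        remaining *= 1.0 - severity * compliance
    return round(100.0 * remaining, 2)
```

With no events the result is 100%, matching the "no performance impact" case in the abstract; values strictly between 0% and 100% correspond to a takeover that succeeds with some performance impact.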

Publication date: 18-01-2018

Virtual Machine Seed Image Replication through Parallel Deployment

Number: US20180018191A1

Generating secondary virtual machine seed image storage is provided. An input is received to deploy a primary virtual machine and a secondary virtual machine based on a golden virtual machine image. In response, the primary virtual machine from the golden virtual machine image on a primary data processing site and the secondary virtual machine from the golden virtual machine image on a secondary data processing site are deployed. Execution of the secondary virtual machine is suspended on the secondary data processing site. Using the golden virtual machine image, a seed image corresponding to the secondary virtual machine is generated that is up-to-date at that point in time in storage at the secondary data processing site to form the secondary virtual machine seed image storage. The secondary virtual machine seed image storage is enabled to receive state data updates from the primary virtual machine on the primary data processing site.

1. A computer-implemented method for generating secondary virtual machine seed image storage, the computer-implemented method comprising: receiving, by a computer, an input to deploy a primary virtual machine and a secondary virtual machine based on a golden virtual machine image; responsive to the computer receiving the input, deploying, by the computer, the primary virtual machine from the golden virtual machine image on a primary data processing site and the secondary virtual machine from the golden virtual machine image on a secondary data processing site; suspending, by the computer, execution of the secondary virtual machine on the secondary data processing site; generating, by the computer, using the golden virtual machine image, a seed image corresponding to the secondary virtual machine that is up-to-date at that point in time in storage at the secondary data processing site to form the secondary virtual machine seed image storage; and enabling, by the computer, the secondary virtual machine seed image storage to receive state ...

Publication date: 18-01-2018

NODE SYSTEM, SERVER APPARATUS, SCALING CONTROL METHOD, AND PROGRAM

Number: US20180018244A1
Assignee: NEC Corporation

A system includes an active system that executes processing, a standby system that is able to perform at least one of scale-up and scale-down, and a control apparatus that controls system switching to set the standby system undergoing the scale-up or scale-down as a new active system.

1. A node system comprising: an active system that executes processing; a standby system that is able to perform at least one of scale-up and scale-down; and a control apparatus that controls system switching to switch the standby system undergoing the scale-up or scale-down to a new active system.
2. The node system according to claim 1, wherein the control apparatus is configured to instruct the standby system to perform scale-up or scale-down, when performing scale-up or scale-down of the active system, and upon reception of a completion notification from the standby system with the scale-up or scale-down completed, the control apparatus controls the system switching to switch the standby system undergoing the scale-up or scale-down to the new active system, and to switch the active system before the system switching to a new standby system.
3. The node system according to claim 2, wherein the control apparatus controls the new standby system to perform scale-up or scale-down in the same way as the standby system switched to the new active system.
4. The node system according to claim 2, wherein, when the active system needs to be scaled up, the control apparatus instructs the standby system that completes the scale-up in response to a scale-up instruction from the control apparatus, to switch to the new active system, after the system switching, the new standby system imposes processing restriction on the new active system, the control apparatus instructs the new standby system to perform scale-up in the same way as the standby system that transitions to the new active system, and the control apparatus, upon reception of a scale-up completion ...

Publication date: 16-01-2020

STORAGE SYSTEM AND CONFIGURATION INFORMATION CONTROL METHOD

Number: US20200019478A1
Assignee: Hitachi, Ltd.

Proposed is a scale-out-type storage system which implements high-availability, high-speed failover. In a scale-out-type storage system, two or more nodes each comprise a cluster controller, a node controller, a plurality of subcluster processes (subclusters and the like) which are processes which execute I/O processing in their own node, which form a subcluster between processes in their own node, and which are synchronized with work-type (active)/standby-type (passive) corresponding processes in the other nodes, and a nonvolatile data store (SODB). The configuration information of the storage system is held partitioned into global configuration information of the SODB and local configuration information and the like of the subclusters and the like, and thereupon the working-type subcluster is capable of executing I/O processing without accessing the SODB.

1. A scale-out-type storage system in which a cluster is constructed by linking a plurality of nodes, at least two or more nodes among the plurality of nodes each comprising: a cluster controller which controls processing spanning the whole cluster; a node controller which performs closed processing control on its own node; a plurality of subcluster processes which are processes which execute I/O processing in their own node, which form a subcluster between processes in their own node, and which are synchronized with work-type/standby-type corresponding processes in the other nodes; and a nonvolatile data store which is shared by the whole cluster, wherein the data store holds, as global configuration information, configuration information which includes information that must be shared by the whole cluster among the configuration information of the storage system, wherein the subcluster processes hold, as local configuration information, configuration information which is required for their own subcluster process to operate among the configuration information of the storage system, and wherein the work-type subcluster ...

Publication date: 16-01-2020

DISASTER RECOVERY DEPLOYMENT METHOD, APPARATUS, AND SYSTEM

Number: US20200019479A1

This application discloses a disaster recovery deployment method, apparatus, and system, and relates to the field of network application technologies. The method includes: obtaining, by a master data center and a backup data center, disaster recovery control information; sending, by the master data center, the data corresponding to the service of the master data center to the at least one backup data center based on the disaster recovery control information; and deploying, by the backup data center, a disaster recovery resource for the master data center based on the disaster recovery control information, and backing up the received data. In other words, the master data center and the backup data center automatically back up resources and data based on the disaster recovery control information, and therefore, manual operation steps in a disaster recovery deployment process are simplified, and efficiency of disaster recovery deployment is improved.

1. A method for disaster recovery deployment, the method comprising: obtaining, by a master data center, disaster recovery control information, wherein the disaster recovery control information indicates a disaster recovery resource to be deployed by at least one backup data center for a service of the master data center and a backup relationship of data corresponding to the service of the master data center in the at least one backup data center, and wherein the disaster recovery resource is a resource used to perform disaster recovery and backup on the service of the master data center; sending, by the master data center, the data corresponding to the service of the master data center to the at least one backup data center based on the disaster recovery control information; obtaining, by the backup data center, the disaster recovery control information; deploying, by the backup data center, a first disaster recovery resource for the master data center based on the disaster recovery control information; and receiving, by the ...

Publication date: 25-01-2018

FAULT MONITORING DEVICE, VIRTUAL NETWORK SYSTEM, AND FAULT MONITORING METHOD

Number: US20180024898A1
Author: YOSHIKAWA Naoya
Assignee: NEC Corporation

A fault monitoring device includes a notice reception part configured to receive a notice indicating occurrence of faults from a virtual network device, and a recovery process part configured to carry out a recovery process for one device having the highest priority of fault response among the virtual network device producing the notice, a physical device implementing the virtual network device, and another virtual network device involved in dependency with the virtual network device.

1. A fault monitoring device comprising: a notice reception part configured to receive a notice indicating occurrence of a fault from a virtual network device; and a recovery process part configured to carry out a recovery process for one device having a highest priority of fault response among the virtual network device producing the notice, a physical device implementing the virtual network device, and another virtual network device involved in dependency with the virtual network device.
2. The fault monitoring device according to claim 1, further comprising a configuration storage unit configured to store the virtual network device in correlation with at least one of the physical device implementing the virtual network device and another virtual network device involved in dependency with the virtual network device, wherein the recovery process part carries out the recovery process for one device having the highest priority of fault response among the virtual network device and each device stored on the configuration storage unit in correlation with the virtual network device.
3. The fault monitoring device according to claim 2, further comprising an instruction part configured to send an instruction to detect presence/absence of the fault to each device stored on the configuration storage unit in correlation with the virtual network device producing the notice, and a result retrieval part configured to retrieve a detection result concerning the presence/absence of ...
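
The target selection described in claims 1 and 2 of US20180024898A1 can be sketched in Python. The sketch assumes the configuration storage is a dictionary from a virtual network device to its correlated devices and that priorities are plain integers where a larger number means a higher priority; `select_recovery_target` and the device names in the usage note are hypothetical.

```python
def select_recovery_target(notifier, config, priority):
    """Pick the device to recover after `notifier` reports a fault.

    notifier -- name of the virtual network device that produced the notice
    config   -- maps a virtual network device to the devices stored in
                correlation with it (its physical host, dependent devices)
    priority -- maps a device name to its fault-response priority;
                a larger number means a higher priority (assumed convention)
    """
    candidates = [notifier] + list(config.get(notifier, []))
    # Devices without a recorded priority default to 0; on a tie the
    # earliest candidate wins, so the notifier itself is preferred.
    return max(candidates, key=lambda device: priority.get(device, 0))
```

For example, with `config = {"vRouter1": ["server1", "vFirewall1"]}` and `server1` holding the largest priority value, a notice from `vRouter1` would direct the recovery process at `server1`.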

Publication date: 23-01-2020

SYSTEMS, METHODS, AND APPARATUSES FOR IMPLEMENTING A SCHEDULER AND WORKLOAD MANAGER WITH WORKLOAD RE-EXECUTION FUNCTIONALITY FOR BAD EXECUTION RUNS

Number: US20200026571A1
Assignee: SALESFORCE.COM, INC.

In accordance with disclosed embodiments, there are provided systems, methods, and apparatuses for implementing a stateless, deterministic scheduler and work discovery system with interruption recovery. For instance, according to one embodiment, there is disclosed a system to implement a stateless scheduler service, in which the system includes: a processor and a memory to execute instructions at the system; a compute resource discovery engine to identify one or more computing resources available to execute workload tasks; a workload discovery engine to identify a plurality of workload tasks to be scheduled for execution; a cache to store information on behalf of the compute resource discovery engine and the workload discovery engine; a scheduler to request information from the cache specifying the one or more computing resources available to execute workload tasks and the plurality of workload tasks to be scheduled for execution; and further in which the scheduler is to schedule at least a portion of the plurality of workload tasks for execution via the one or more computing resources based on the information requested. Other related embodiments are disclosed.

1. A method performed by a system having at least a processor and a memory therein, wherein the method comprises: allocating a cache within the memory of the system; identifying, via a workload discovery engine, pending workload tasks to be scheduled for execution from one or more workload queues and updating the cache; identifying, via a compute resource discovery engine, a plurality of computing resources available to execute the workload tasks and updating the cache; identifying, via an external services monitor, a plurality of external services accessible to the workload tasks and updating the cache; executing a scheduler via the processor of the system, wherein the scheduler performs at least the following operations: scheduling the workload tasks for execution on the plurality of computing resources; ...

Publication date: 23-01-2020

OPPORTUNISTIC OFFLINING FOR FAULTY DEVICES IN DATACENTERS

Number: US20200026591A1

Embodiments relate to determining whether to take a resource distribution unit (RDU) of a datacenter offline when the RDU becomes faulty. RDUs in a cloud or datacenter supply a resource such as power, network connectivity, and the like to respective sets of hosts that provide computing resources to tenant units such as virtual machines (VMs). When an RDU becomes faulty some of the hosts that it supplies may continue to function and others may become unavailable for various reasons. This can make a decision of whether to take the RDU offline for repair difficult, since in some situations countervailing requirements of the datacenter may be at odds. To decide whether to take an RDU offline, the potential impact on availability of tenant VMs, unused capacity of the datacenter, a number or ratio of unavailable hosts on the RDU, and other factors may be considered to make a balanced decision.

1. A method performed by one or more control server devices participating in a cloud fabric that controls a compute cloud, the compute cloud comprised of a plurality of cloud server devices, the cloud further comprising cloud hardware assets, each cloud hardware asset servicing a respective set of the cloud server devices, the cloud server devices hosting tenant components of tenants of the cloud, the method comprising: receiving a notification that a cloud hardware asset is in a failure state, the cloud hardware asset servicing a set of the cloud server devices; and determining how making the set of cloud server devices unavailable would affect: (i) a measure of availability of a population of the tenant components, wherein making the set of cloud server devices unavailable would render some of the tenant components unavailable, and (ii) a measure and/or prediction of cloud capacity for a resource of the cloud; based on the notification, determining whether to place the cloud hardware asset in a state of unavailability, wherein when the cloud hardware asset enters the ...

Publication date: 23-01-2020

CONTROLLING PROCESSING ELEMENTS IN A DISTRIBUTED COMPUTING ENVIRONMENT

Number: US20200026605A1

A computer system controls processing elements associated with a stream computing application. A stream computing application is monitored for the occurrence of one or more conditions. One or more processing element groups are determined to be restarted based on occurrence of the one or more conditions, wherein the processing element groups each include a plurality of processing elements associated with the stream computing application. Each processing element of the determined one or more processing element groups is concurrently restarted. Embodiments of the present invention further include a method and program product for controlling processing elements within a stream computing application in substantially the same manner described above.

1. A computer-implemented method of controlling processing elements associated with a stream computing application comprising: monitoring a stream computing application for occurrence of one or more conditions; determining one or more processing element groups to restart based on occurrence of the one or more conditions, wherein the processing element groups each include a plurality of processing elements associated with the stream computing application; and concurrently restarting each processing element of the determined one or more processing element groups.
2. The computer-implemented method of claim 1, wherein determining one or more processing element groups further comprises: establishing at least one processing element group based on a configuration attribute.
3. The computer-implemented method of claim 1, wherein the one or more processing elements include a plurality of operators, and determining one or more processing element groups further comprises: establishing at least one processing element group based on locations of processing elements within an operator graph indicating a flow through the operators.
4. The computer-implemented method of claim 1, wherein the one or more processing elements include a plurality of ...

Publication date: 23-01-2020

INTELLIGENT LOG GAP DETECTION TO ENSURE NECESSARY BACKUP PROMOTION

Number: US20200026622A1

An intelligent log gap detection to ensure necessary backup promotion. Specifically, a method and system are disclosed, which entail determining whether to pursue a differential database backup or promote the differential database backup to a full database backup, in order to preclude data loss across high availability databases. The deduction pivots on a matching or mismatching between log sequence numbers (LSNs).

1. A method for intelligent log gap detection, comprising: receiving a first database backup request for a first differential database backup on a database availability cluster (DAC); making a first determination that a first full database backup has already been performed; obtaining, based on the first determination, a checkpoint log sequence number (LSN) associated with the first full database backup; making a second determination that the checkpoint LSN mismatches a first differential base LSN (DBL); detecting, based on the second determination, a log gap across the DAC; and promoting, based on the detecting the log gap, the first differential database backup to a second full database backup.
2. The method of claim 1, wherein making the first determination comprises: performing a search of a cluster backup chain table (BCT) in reverse chronological order, wherein the cluster BCT comprises a plurality of cluster backup chain records (BCRs); and identifying a cluster BCR of the plurality of cluster BCRs based on a backup identifier (ID) specified therein, wherein the backup ID identifies the cluster BCR as being associated with a full database backup.
3. The method of claim 2, wherein the cluster BCR comprises an object ID, the backup ID, a first LSN, a last LSN, the checkpoint LSN, and a database backup LSN.
4. The method of claim 1, further comprising: issuing, based on the promoting, a full backup command (FBC).
5. The method of claim 1, further comprising: receiving a second database backup request for a ...
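The decision described in claims 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the `"full"` / `"differential"` encoding of the backup ID, the oldest-first ordering of the chain, and the fallback to a full backup when no full backup exists are all hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BackupChainRecord:
    """One cluster backup chain record (BCR); field names are illustrative."""
    backup_id: str       # hypothetical encoding: "full" or "differential"
    checkpoint_lsn: int  # checkpoint LSN recorded for this backup


def latest_full_backup(chain: Sequence[BackupChainRecord]) -> Optional[BackupChainRecord]:
    """Search the chain in reverse chronological order for a full backup."""
    for record in reversed(chain):  # chain is assumed to be stored oldest-first
        if record.backup_id == "full":
            return record
    return None


def plan_backup(chain: Sequence[BackupChainRecord], differential_base_lsn: int) -> str:
    """Return "differential", or "full" when the backup must be promoted."""
    full = latest_full_backup(chain)
    if full is None:
        # Assumption: without a prior full backup there is nothing to diff against.
        return "full"
    if full.checkpoint_lsn != differential_base_lsn:
        # LSN mismatch is treated as a log gap, so the backup is promoted.
        return "full"
    return "differential"
```

For example, with a chain whose latest full backup has checkpoint LSN 100, `plan_backup(chain, 100)` keeps the differential backup, while `plan_backup(chain, 140)` promotes it to a full backup.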

28-01-2021 publication date

METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR PROVIDING BYZANTINE FAULT TOLERANCE

Number: US20210026745A1
Author: Wang Yongge
Assignee:

Methods, systems, and computer readable media for providing Byzantine fault tolerance (BFT) are disclosed. According to one method, a method for providing BFT occurs at a computing platform executing a BFT protocol, wherein the computing platform is acting as a leader participant of a round of the BFT protocol. The method comprising: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

1. A method for providing Byzantine fault tolerance (BFT), the method comprising: at a computing platform executing a BFT protocol, wherein the computing platform is acting as a leader participant of a round of the BFT protocol: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.
2. The method of claim 1, wherein the predetermined number of the participants includes at least 2t+1 participants, where t represents an amount of malicious participants in the round.
3. The method of wherein a participant in the round receives the decide message from the leader ...
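The 2t+1 threshold from claim 2 can be illustrated with a small vote-counting sketch in Python. It covers only the leader's check that enough round-change messages name the same candidate block. Signature verification and networking are left out, and the function names, the `(participant, block)` vote format, and the rule that only a participant's first vote counts are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def quorum_size(t: int) -> int:
    """Smallest vote count accepted when up to t participants are malicious."""
    return 2 * t + 1


def candidate_to_lock(votes: Iterable[Tuple[str, str]], t: int) -> Optional[str]:
    """Return the candidate block backed by at least 2t+1 distinct participants.

    votes holds (participant_id, candidate_block_id) pairs. Only the first
    vote from each participant is counted (an assumption of this sketch).
    """
    first_vote: Dict[str, str] = {}
    for participant, block in votes:
        first_vote.setdefault(participant, block)
    counts = Counter(first_vote.values())
    for block, count in counts.most_common():
        if count >= quorum_size(t):
            return block
    return None
```

With t = 1 the leader needs three matching votes, so three participants voting for the same block are enough to broadcast a lock message in this model, while repeated votes from one participant are not.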

28-01-2021 publication date

One-sided reliable remote direct memory operations

Number: US20210026799A1
Assignee: Oracle International Corp

Techniques are provided to allow more sophisticated operations to be performed remotely by machines that are not fully functional. Operations that can be performed reliably by a machine that has experienced a hardware and/or software error are referred to herein as Remote Direct Memory Operations or “RDMOs”. Unlike RDMAs, which typically involve trivially simple operations such as the retrieval of a single value from the memory of a remote machine, RDMOs may be arbitrarily complex. The techniques described herein can help applications run without interruption when there are software faults or glitches on a remote system with which they interact.

24-04-2014 publication date

FAILOVER SYSTEM AND METHOD

Number: US20140115380A1
Assignee: TSX Inc.

One aspect of the present invention provides a system for failover comprising at least one client selectively connectable to one of at least two interconnected servers via a network connection. In a normal state, one of the servers is designated a primary server when connected to the client and a remainder of the servers are designated as backup servers when not connected to the client. The at least one client is configured to send messages to the primary server. The servers are configured to process the messages using at least one service that is identical in each of the servers. The services are unaware of whether a server respective to the service is operating as the primary server or the backup server. The servers are further configured to maintain a library, or the like, that indicates whether a server is the primary server or a server is the backup server. The services within each server are to make external calls via its respective library. The library in the primary server is configured to complete the external calls and return results of the external calls to the service in the primary server and to forward results of the external calls to the service in the backup server. The library in the secondary server does not make external calls but simply forwards the results of the external calls, as received from the primary server, to the service in the secondary server when requested to do so by the service in the secondary server.

1-15. (canceled)
16. A system for failover comprising: a primary server and at least one backup server; determine a plurality of processing inputs, wherein said plurality of processing inputs guarantees deterministic processing of said plurality of processing inputs by said primary server and said backup server, wherein said plurality of processing inputs includes a call result from an external resource, said external resource resident on said primary server; and forward said plurality of processing inputs to said backup server prior ...

04-02-2016 publication date

MANAGING BACKUP OPERATIONS FROM A CLIENT SYSTEM TO A PRIMARY SERVER AND SECONDARY SERVER

Number: US20160034357A1
Assignee:

Provided are techniques for managing backup operations from a client system to a primary server and secondary server. A determination is made at the client system of whether a state of the data on the secondary server permits a backup operation in response to determining that the primary server is unavailable when a force failover parameter is not set. The client system reattempts to connect to the primary server to perform the backup operation at the primary server in response to determining that the state of the data on the secondary server does not permit the backup operation. The client system performs the backup operation at the secondary server in response to determining that the state of the secondary server permits the backup operation.

1-15. (canceled)
16. A method for replicating client data from a client system between a primary server and a secondary server, comprising: determining whether a state of the data on the secondary server permits a backup operation in response to determining that the primary server is unavailable when a force failover parameter is not set; reattempting to connect to the primary server to perform the backup operation at the primary server in response to determining that the state of the data on the secondary server does not permit the backup operation; and performing the backup operation at the secondary server in response to determining that the state of the secondary server permits the backup operation.
17. The method of claim 16, further comprising: starting a failover delay timer in response to determining that the force failover parameter is not set and that the primary server is not available; and reattempting to connect to the primary server to perform the backup operation at the primary server in response to determining that the failover delay timer has not expired when the state of the secondary server permits the backup operation, wherein the backup operation is performed with respect to the secondary server in response to ...

04-02-2016 publication date

MANAGING BACKUP OPERATIONS FROM A CLIENT SYSTEM TO A PRIMARY SERVER AND SECONDARY SERVER

Number: US20160034366A1
Assignee:

Provided are techniques for managing backup operations from a client system to a primary server and secondary server. A determination is made at the client system of whether a state of the data on the secondary server permits a backup operation in response to determining that the primary server is unavailable when a force failover parameter is not set. The client system reattempts to connect to the primary server to perform the backup operation at the primary server in response to determining that the state of the data on the secondary server does not permit the backup operation. The client system performs the backup operation at the secondary server in response to determining that the state of the secondary server permits the backup operation.

1. A computer program product for replicating client data from a client system between a primary server and a secondary server, wherein the computer program product comprises at least one computer readable storage medium including a client program embodied therewith, wherein the client program is executable by a processor to cause operations, the operations comprising: determining, by the client program, whether a state of the data on the secondary server permits a backup operation in response to determining that the primary server is unavailable when a force failover parameter is not set; reattempting, by the client program, to connect to the primary server to perform the backup operation at the primary server in response to determining that the state of the data on the secondary server does not permit the backup operation; and performing, by the client program, the backup operation at the secondary server in response to determining that the state of the secondary server permits the backup operation.
2. The computer program product of claim 1, wherein the at least one computer readable storage medium includes a primary server program and a secondary server program, wherein the operations further comprise: replicating ...

30-01-2020 publication date

Virtual network system, vim, virtual network control method and recording medium

Number: US20200034180A1
Author: Yoshihiko HOSHINO
Assignee: NEC Corp

A virtual network system includes a first virtualized network function (VNF), a second VNF, a network functions virtualization infrastructure (NFVI), and a virtualized infrastructure manager (VIM). The first VNF performs a network function. The second VNF provides a backup of the first VNF, and provides a redundant configuration with the first VNF. The NFVI provides a virtual resource that is a virtualization of a physical resource. The VIM instructs the NFVI to provide the virtual resource as a resource for performing the first VNF and the second VNF, and instructs the NFVI to cancel provision of the virtual resource to the first VNF and to provide the virtual resource as a resource for performing the second VNF when the second VNF is performed and the first VNF is made a backup of the second VNF.

04-02-2021 publication date

FAILURE SHIELD

Number: US20210034480A1

An example graphics system can include a first portion including a graphics driver and graphics hardware and a second portion communicatively coupled to the first portion. The second portion can include a display system communicatively coupled to a GUI application and a shim layer to shield the second portion from failure responsive to failure of the first portion.

1. A graphics system, comprising: a first portion comprising a graphics driver and graphics hardware; and a second portion communicatively coupled to the first portion and comprising: a display system communicatively coupled to a GUI application; and a shim layer to shield the second portion from failure responsive to failure of the first portion.
2. The graphics system of claim 1, wherein the shim layer to shield the second portion from failure comprises the shim layer to shield the second portion from failure responsive to failure of the graphics hardware of the first portion.
3. The graphics system of claim 2, the second portion further comprising drivers to reset the failed graphics hardware responsive to the failure.
4. The graphics system of claim 1, further comprising the second portion to: store display information associated with the GUI application; and display the stored display information responsive to restoration of the failed first portion.
5. The graphics system of claim 1, further comprising a Linux graphics system.
6. A non-transitory machine-readable medium storing instructions executable by a processing resource to cause a computing system to: receive notification of a hardware failure associated with a first portion of a graphics system; shield a second portion of the graphics system from failure responsive to the hardware failure using a shim layer of the second portion, the shim layer comprising an application programming interface (API) and a buffer; and configure a first display of a first GUI application communicatively coupled to a display system of the second portion based on ...

04-02-2021 publication date

METHODS, SYSTEMS, AND COMPUTER READABLE STORAGE DEVICES FOR MANAGING FAULTS IN A VIRTUAL MACHINE NETWORK

Number: US20210034481A1
Assignee:

Faults are managed in a virtual machine network. Failure of operation of a virtual machine among a plurality of different types of virtual machines operating in the virtual machine network is detected. The virtual machine network operates on network elements connected by transport mechanisms. A cause of the failure of the operation of the virtual machine is determined, and recovery of the virtual machine is initiated based on the determined cause of the failure.

1. A method comprising: detecting, by a processor, a failure of operation of a virtual machine among a plurality of different types of virtual machines operating in a virtual machine network, wherein the virtual machine network comprises a plurality of network elements; determining, by the processor, a cause of the failure of operation of the virtual machine via a fault signature, wherein the determining the cause of the failure includes identifying the cause of the failure from among a plurality of possible causes which include: a fault of a network element of the plurality of network elements; a fault of the virtual machine; a fault of a virtual application being executed by the virtual machine; and a fault of a transport mechanism serving the virtual machine network; and initiating, by the processor, a recovery of the virtual machine based on the cause of the failure that is determined, wherein when the cause of the failure is determined to be the fault of the virtual machine, the initiating the recovery of the virtual machine includes selecting between whether to: restore operation of the virtual machine, or repair a network infrastructure of the plurality of network elements.
2. The method of claim 1, further comprising, responsive to a determination to repair a network infrastructure of the plurality of network elements, determining a hardware resource of the network infrastructure to be repaired.
3. The method of claim 2, wherein the hardware resource comprises at least one of: ...

31-01-2019 publication date

Cluster failover to avoid network partitioning

Number: US20190036765A1
Author: Sudip Ghosal, Yi-Nan Lee

During a synchronization technique, states of a primary cluster in a computer system, with multiple primary controllers that provide controllers for access points, and a backup cluster in the computer system, with multiple backup controllers that independently provide controllers for the access points, may be dynamically synchronized. In particular, the primary cluster may receive configuration requests with configuration information for the access points on an input node of the primary cluster. In response, the primary cluster may store the configuration requests in a replay queue in the computer system. Then, the primary cluster may play back the configuration requests in the replay queue for the backup cluster to synchronize the states of the primary cluster and the backup cluster. For example, the configuration requests may be played back within a time interval associated with a service level agreement of a service provider of a service for the access points.
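The replay-queue idea in this abstract can be shown with a minimal Python sketch. The `Cluster` and `ReplaySynchronizer` classes, the `(access_point, settings)` request format, and the in-memory queue are all assumptions made for illustration; the time interval tied to the service level agreement is not modeled.

```python
from collections import deque
from typing import Deque, Dict, Tuple

ConfigRequest = Tuple[str, dict]  # hypothetical format: (access point ID, settings)


class Cluster:
    """Holds per-access-point configuration; stands in for a controller cluster."""

    def __init__(self) -> None:
        self.config: Dict[str, dict] = {}

    def apply(self, request: ConfigRequest) -> None:
        access_point, settings = request
        self.config[access_point] = settings


class ReplaySynchronizer:
    """Applies requests to the primary and queues them for later replay."""

    def __init__(self, primary: Cluster, backup: Cluster) -> None:
        self.primary = primary
        self.backup = backup
        self.replay_queue: Deque[ConfigRequest] = deque()

    def submit(self, request: ConfigRequest) -> None:
        self.primary.apply(request)
        self.replay_queue.append(request)  # kept in arrival order

    def play_back(self) -> int:
        """Replay queued requests on the backup; return how many were replayed."""
        replayed = 0
        while self.replay_queue:
            self.backup.apply(self.replay_queue.popleft())
            replayed += 1
        return replayed
```

Replaying in arrival order matters in this sketch because a later request for the same access point overwrites an earlier one, so the backup ends in the same state as the primary.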

24-02-2022 publication date

Log-structured formats for managing archived storage of objects

Number: US20220058094A1
Assignee: VMware LLC

Solutions for managing archived storage include receiving, at a first node, a snapshot comprising object data (e.g., a virtual machine disk snapshot) from a second node (e.g., a software defined data center), and storing the snapshot in a tiered structure that includes a data tier and a metadata tier. Snapshots may be used for fail-over operations and/or backups, to support disaster recovery. The data tier comprises a log-structured file system (LFS), and the metadata tier comprises a content addressable storage (CAS) identifying addresses within the LFS. The metadata tier also comprises a logical layer indicating content in the CAS. Segment cleaning of the data tier is performed using a segment usage table (SUT). Some examples include performing a fail-over operation from the second node to a third node using at least the stored snapshot for workload recovery. In some examples, the CAS comprises a log-structured merge-tree (LSM-tree).

24-02-2022 publication date

REDUNDANT PROCESSING FABRIC FOR AUTONOMOUS VEHICLES

Number: US20220058096A1
Author: Hayes John, UHLIG VOLKMAR
Assignee:

A redundant processing fabric in an autonomous vehicle may include: processing, by a first processing unit of a plurality of processing units, sensor data from a first sensor of a plurality of sensors, where the plurality of processing units are coupled to the plurality of sensors via a switched fabric, wherein the plurality of processing units and plurality of sensors are included in the autonomous vehicle, wherein the sensor data corresponds to an environment external to the autonomous vehicle; determining a failure in processing the sensor data by the first processing unit; and severing, in the switched fabric, a first communications path between the first sensor and the first processing unit; and establishing, in the switched fabric, a second communications path between the first sensor and a redundant processing unit.

1. A method for a redundant processing fabric in an autonomous vehicle, comprising: processing, by a first processing unit of a plurality of processing units, sensor data from a first sensor of a plurality of sensors, where the plurality of processing units are coupled to the plurality of sensors via a switched fabric, wherein the plurality of processing units and plurality of sensors are included in the autonomous vehicle, wherein the sensor data corresponds to an environment external to the autonomous vehicle; determining a failure in processing the sensor data by the first processing unit; and severing, in the switched fabric, a first communications path between the first sensor and the first processing unit; and establishing, in the switched fabric, a second communications path between the first sensor and a redundant processing unit.
2. The method of claim 1, wherein the redundant processing unit comprises a processing unit not designated for processing sensor data from any of the plurality of sensors prior to the failure.
3. The method of claim 1, wherein determining the failure in processing the sensor data comprises determining that the first ...

12-02-2015 publication date

SYSTEM AND METHOD FOR PROCESSING WEB SERVICE TRANSACTIONS USING TIMESTAMP DATA

Number: US20150046744A1
Author: Frerking John Randy
Assignee:

A system is provided that is adapted to service web-based service requests. In one implementation, a caching service is provided for storing and servicing web service requests. In one implementation, virtual computer systems may be used to service requests in a more reliable manner. Different operating modes may be configured for backup redundancy and the caching service may be scaled to meet service requests for a particular application. Also, methods are provided for exchanging timestamp information among web service transaction systems to reduce the amount of processing capability and bandwidth for ensuring database consistency.

1. A system for processing web service transactions, the system comprising: a plurality of servers each adapted to receive and process one or more web service requests, the plurality of servers comprising: a first and second server of the plurality of servers are each configured to compare time stamps associated with at least one database record of a common database associated with a web service application.
2. The system according to claim 1, wherein the first server is adapted to delete the at least one database record of the common database associated with the web service application, if it is determined that timestamps of the first and second servers have expired, the timestamps being associated with the at least one database record of the common database.
3. The system according to claim 1, wherein the first and second servers are configured to update a timestamp associated with the at least one database record of the common database associated with the web service application responsive to an access to the at least one database record.
4. The system according to claim 1, wherein the first and second servers are located in a first and a second datacenter, respectively.
5. The system according to claim 1, wherein the plurality of servers include a plurality of virtual servers.
6. The system according to ...

07-02-2019 publication date

Distributed dynamic architecture for error correction

Number: US20190042378A1
Assignee: Intel Corp

Various systems and methods may be used to implement a software defined industrial system. For example, an orchestrated system of distributed nodes may run an application, including modules implemented on the distributed nodes. The orchestrated system may include an orchestration server, a first node executing a first module, and a second node executing a second module. In response to the second node failing, the second module may be redeployed to a replacement node (e.g., the first node or a different node). The replacement node may be determined by the first node or another node, for example based on connections to or from the second node.

06-02-2020 publication date

ROLE DESIGNATION IN A HIGH AVAILABILITY NODE

Number: US20200042410A1
Assignee:

A high-availability network device cluster role synchronization technique for devices configured with multiple network controllers is disclosed. An HA node may contain information regarding a role within a cluster for that HA node. This information should properly be maintained or erased based on a type of failover for an HA device. For example, if there is a loss of the active controller that causes only a controller failover, changes to the role of the HA node may not be necessary. Thus, an election process within a cluster may be avoided. However, if a failover of an entire HA node occurs (or restart of an HA node), role information prior to the restart may not be applicable and an election process may need to be initiated such that the cluster may continue to function. Different types of roles may exist for nodes within a cluster.

1. A computer-implemented method of maintaining a role designation for a high availability (HA) node having a first and second controller, the method comprising: receiving an indication of a cluster role designation for the HA node at the first controller of the HA node when the first controller is active, the first controller to act as a backup for the second controller when the second controller is active, and the second controller to act as a backup for the first controller when the first controller is active; providing information regarding the cluster role designation from the first controller to the second controller, acting as a backup for the first controller, via a mechanism used for exchanging HA state information independent of a clustering protocol, the HA state information including an indication of which of the first controller and the second controller is active; responsive to receiving the cluster role designation at the second controller, storing information in non-persistent memory of the HA node accessible to the second controller, the information sufficient for the second controller to become active using the cluster ...

01-05-2014 publication date

High availability system allowing conditionally reserved computing resource use and reclamation upon a failover

Number: US20140122920A1
Assignee: VMware LLC

In one embodiment, a method determines a first set of virtual machines and a second set of virtual machines. The first set of virtual machines is associated with a first priority level and the second set of virtual machines is associated with a second priority level. A first set of computing resources and a second set of computing resources are associated with hosts. Upon determining a failure of a host, the method performs: generating a power off request for one or more of the second set of virtual machines powered on the second set of computing resources and generating a power on request for one or more virtual machines from the first set of virtual machines that were powered on the failed host, the power on request powering on the one or more virtual machines from the first set of virtual machines on the second set of computing resources.
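The reclamation step in this abstract can be sketched as a function that turns a host failure into a list of power requests. This is a simplified Python illustration, not the patented method: the `VM` fields, the request tuples, the convention that priority 1 denotes the first (higher-priority) set, and the use of a single `reserve_host` to stand for the second set of computing resources are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Request = Tuple[str, str, str]  # hypothetical shape: (action, vm_name, host)


@dataclass(frozen=True)
class VM:
    name: str
    priority: int  # assumption: 1 = first priority level, 2 = second priority level
    host: str
    powered_on: bool = True


def plan_failover(vms: List[VM], failed_host: str, reserve_host: str) -> List[Request]:
    """Build power-off requests first, then power-on requests, for one host failure."""
    # Free the conditionally reserved resources by powering off second-set VMs.
    power_off = [
        ("power_off", vm.name, reserve_host)
        for vm in vms
        if vm.priority == 2 and vm.host == reserve_host and vm.powered_on
    ]
    # Restart the first-set VMs that were running on the failed host.
    power_on = [
        ("power_on", vm.name, reserve_host)
        for vm in vms
        if vm.priority == 1 and vm.host == failed_host and vm.powered_on
    ]
    return power_off + power_on
```

Emitting the power-off requests before the power-on requests is a design choice of this sketch, so that capacity is released before it is claimed.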

15-02-2018 publication date

High availability state machine and recovery

Number: US20180046549A1
Assignee: Western Digital Technologies Inc

Embodiments of the present invention provide systems and methods for recovering a high availability storage system. The storage system includes a first layer and a second layer, each layer including a controller board, a router board, and storage elements. When a component of a layer fails, the storage system continues to function in the presence of a single failure of any component, up to two storage element failures in either layer, or a single power supply failure. While a component is down, the storage system will run in a degraded mode. The passive zone is not serving input/output requests, but is continuously updating its state in dynamic random access memory to enable failover within a short period of time using the layer that is fully operational. When the issue with the failed zone is corrected, a failback procedure brings the system back to a normal operating state.

19-02-2015 publication date

MANAGEABILITY REDUNDANCY FOR MICRO SERVER AND CLUSTERED SYSTEM-ON-A-CHIP DEPLOYMENTS

Number: US20150052389A1
Assignee:

Technologies for providing manageability redundancy for micro server and clustered System-on-a-Chip (SoC) deployments are presented. A configurable multi-processor apparatus may include multiple integrated circuit (IC) blocks where each IC block includes a task block to perform one or more assignable task functions and a management block to perform management functions with respect to the corresponding IC block. Each task block and each management block may include one or more instruction processors and corresponding memory. Each IC block may be controllable to perform a function of one or more other IC blocks. The IC blocks may communicate with each other via a management communication infrastructure that may include a communication path from each of the management blocks to each of the other management blocks. Via the management communication infrastructure, the management blocks may bridge communication paths between pairs of management blocks.

1-15. (canceled)
16. A multi-processor system having a dynamically re-configurable multi-processor support system, comprising: a first set of one or more instruction processors and corresponding memory; a user interface to interface between the first set of one or more instruction processors and one or more human interface devices; and a set of multiple integrated circuit (IC) blocks to perform task functions in support of the first set of one or more instruction processors; wherein each IC block includes a task block to perform one or more assignable task functions, and a management block to perform management functions with respect to the corresponding IC block; wherein each task block and each management block includes one or more instruction processors and corresponding memory, and wherein each IC block is controllable to perform a function of one or more other IC blocks.
17. The system of claim 16, wherein each management block is controllable to perform a management function of one or more other management blocks.
18. The ...

22-02-2018 publication date

Selecting Master Time of Day for Maximum Redundancy

Number: US20180052746A1
Assignee:

An approach is provided in which a system selects a first processor as a master Time of Day (TOD) processor in a first TOD topology in response to determining that the first processor is directly connected to an oscillator. The system then assigns a second processor as an alternate master TOD processor to a second TOD topology based upon determining that the second processor is on a different node than the first processor. The system configures to the first TOD topology and, when the system detects a TOD failure, the system re-configures to the second TOD topology.

1. A method implemented by an information handling system that includes a memory and a plurality of processors, the method comprising: selecting a first one of a plurality of processors as a master time of day (TOD) processor in a master TOD topology based on determining that the first processor is directly connected to an oscillator, wherein the first processor is located on a first node; assigning a second one of the plurality of processors as an alternate master TOD processor to a backup TOD topology based upon determining that the second processor is on a second node that is different than the first node; configuring the information handling system to the master TOD topology; and in response to detecting a TOD failure on the master TOD topology, re-configuring the information handling system to the backup TOD topology.
2. The method of wherein the selection of the master TOD processor further comprises: gathering virtual product data from the plurality of processors; identifying an amount of functioning cores within each of the plurality of processors; determining that the first processor has a largest amount of functioning cores out of the plurality of processors; and using the largest amount of functioning cores in the first processor as a factor during the selection of the first processor as the master TOD processor.
3. The method of wherein the assigning of the alternate master TOD processor further ...
Подробнее
23-02-2017 дата публикации

TRANSACTIONAL DISTRIBUTED LIFECYCLE MANAGEMENT OF DIVERSE APPLICATION DATA STRUCTURES

Number: US20170052856A1
Assignee:

A state manager provides transactional distributed lifecycle management of a group of different application-level state providers, namely, differently structured application program data structures. The state providers are atomic with respect to one another. The state provider is replicated to one or more secondary nodes of a distributed network. The state providers are persistent despite one or more node operational failures. State provider lifecycle operations include creation of a transactional distributed state provider as a member of a group of different application-level state providers which include differently structured application program data structures, deletion of a previously created transactional distributed state provider, and/or enumeration of any previously created transactional distributed state providers. A given state provider may be read or written by one or more applications. Implementation restrictions and other avoidance conditions are satisfied in particular cases.

1. A system, comprising: a memory; at least one processor in operable communication with at least a portion of the memory; a state manager which upon execution provides transactional distributed lifecycle management of a group of different application-level state providers, namely, differently structured application program data structures which are atomic with respect to one another, replicated, and persistent despite node operational failure.
2. The system of claim 1, further comprising a key-value store containing tuples of the form {state provider name, state provider}.
3. The system of claim 1, further comprising a replicator which is independent of state provider structural differences.
4. The system of claim 1, further comprising at least two different applications, each of which upon execution accesses the same state provider.
5. The system of claim 1, wherein the system comprises at a primary replica of a state provider on one node and at least one ...
Publication date: 20-02-2020

System and Method for a Vehicle Mediating Zone-Specific Control of a Communication Device

Number: US20200059413A1
Author: Ricci Christopher
Assignee:

A method and system of providing zone-specific control of one or more of features of at least one communication device by a vehicle, the system comprising: a microprocessor executable feature control module connected with at least communication device (connected device), wherein the feature control module is configured to receive input from at least one of a sensor of the communication device or specific zone/s to locate the at least one communication device with respect to the zone/s; at least one of the feature control module or device location module determining for the at least one connected device, the specific zone associated with the at least one connected device, wherein the specific zone is labeled with a rule or condition for degree of restriction of one or more features of the at least one connected device; and the feature control module limiting access or control of one or more features of the at least one connected device based on the rules or conditions assigned to the determined zone, wherein the rules or condition are defined by at least one of a group, entity, individual, manufacture, in response to a vehicle or communication device detected input or condition, or any combination thereof. 1. A method of providing zone-specific control of one or more of features of at least one communication device by a vehicle, said method comprising the steps of: establishing, by a microprocessor executable feature control module, a connection with the at least one communication device, wherein the feature control module is configured to receive input from at least one of a sensor of the communication device or specific zone/s to locate the at least one communication device with respect to the zone/s; determining for the at least one connected device, the specific zone associated with the at least one connected device, wherein the specific zone is at least one of high-restricted, restricted, low-restricted, or unrestricted; controlling, via the feature control module ...

Publication date: 17-03-2022

METHOD OF USING A SINGLE CONTROLLER (ECU) FOR A FAULT-TOLERANT/FAIL-OPERATIONAL SELF-DRIVING SYSTEM

Number: US20220080992A1
Assignee:

In a self-driving autonomous vehicle, a controller architecture includes multiple processors within the same box. Each processor monitors the others and takes appropriate safe action when needed. Some processors may run dormant or low priority redundant functions that become active when another processor is detected to have failed. The processors are independently powered and independently execute redundant algorithms from sensor data processing to actuation commands using different hardware capabilities (GPUs, processing cores, different input signals, etc.). Intentional hardware and software diversity improves fault tolerance. The resulting fault-tolerant/fail-operational system meets ISO26262 ASIL-D specifications based on a single electronic controller unit platform that can be used for self-driving vehicles. 1. A control system comprising: a first sensor, a second sensor, a third sensor, at least one input bus connected to the first, second and third sensors, an electronic controller comprising a first processor, a second processor and a third processor each coupled to the at least one input bus, wherein the first, second and third processors each independently process signals from the at least one input bus to provide control signals, the first processor providing first control signals in response to a first combination of the first, second and third sensors, the second processor providing second control signals in response to a second combination of the first, second and third sensors different from the first combination, the third processor providing third control signals in response to a third combination of the first, second and third sensors different from at least one of the first combination and different from the second combination, and an intelligent control signal arbitrator that receives the first, second and third control signals and arbitrates between them to perform at least one control function. 2. The system of wherein the third processor is configured to ...

Publication date: 17-03-2022

All flash array server and control method thereof

Number: US20220083438A1
Author: Li-Sheng Kan
Assignee: Silicon Motion Inc

The present invention provides a control method of a server, wherein the control method includes the steps of: periodically controlling a first register and a second register of a first node to have a first value and a second value, respectively; periodically controlling a third register and a fourth register of a second node to have a third value and a fourth value, respectively; controlling the first register and the fourth register to synchronize with each other, wherein the first value is different from the fourth value; controlling the second register and the third register to synchronize with each other, wherein the second value is different from the third value; and periodically checking if the third register has the third value and the fourth register has the fourth value to determine if the first node fails to work.

Publication date: 08-03-2018

Scalable Data Storage Pools

Number: US20180067829A1
Assignee:

Scalable data storage techniques are described. In one or more implementations, data is obtained by one or more computing devices that describes fault domains in a storage hierarchy and available storage resources in a data storage pool. Operational characteristics are ascertained, by the one or more computing devices, of devices associated with the available storage resources within one or more levels of the storage hierarchy. Distribution of metadata is assigned by the one or more computing devices to one or more particular data storage devices within the data storage pool based on the described fault domains and the ascertained operational characteristics of devices within one or more levels of the storage hierarchy. 1. A method performed by one or more processors when executing computer-executable instructions for the method, the method comprising: obtaining data by one or more computing devices that describes fault domains in a storage hierarchy and available storage resources in a data storage pool; ascertaining operational characteristics, by the one or more computing devices, of devices associated with the available storage resources within one or more levels of the storage hierarchy; and assigning distribution of metadata by the one or more computing devices to one or more particular data storage devices within the data storage pool based on the described fault domains and the ascertained operational characteristics of devices within one or more levels of the storage hierarchy, and wherein the assigning is performed to maximize usage of a number of fault domains for a specified number of data storage devices in the data storage pool that are to receive at least a portion of the metadata. This application is a continuation of U.S. patent application Ser. No. 14/485,497 filed on Sep. 12, 2014, entitled “Scalable Data Storage Pools,” and which application is expressly incorporated herein by reference in its entirety. The pervasiveness of data storage “in the cloud ...
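
The assignment step in claim 1 (spread metadata over as many fault domains as possible for a given number of devices) can be illustrated with a minimal sketch. It is not the patented method: the tuple shape `(device_id, fault_domain, healthy)`, the function name `pick_metadata_devices`, and the round-robin policy are all assumptions made for illustration, and health is the only operational characteristic considered.

```python
from collections import defaultdict
from itertools import zip_longest


def pick_metadata_devices(devices, count):
    """Pick up to `count` healthy devices, touching as many fault domains as possible.

    `devices` is a list of (device_id, fault_domain, healthy) tuples
    (a hypothetical shape, not taken from the patent).
    """
    by_domain = defaultdict(list)
    for device_id, domain, healthy in devices:
        if healthy:  # unhealthy devices never receive metadata in this sketch
            by_domain[domain].append(device_id)

    chosen = []
    # Round-robin: first device of every domain, then second device of every domain, ...
    for layer in zip_longest(*(by_domain[d] for d in sorted(by_domain))):
        for device_id in layer:
            if device_id is not None and len(chosen) < count:
                chosen.append(device_id)
    return chosen


devices = [
    ("d1", "rackA", True),
    ("d2", "rackA", True),
    ("d3", "rackB", True),
    ("d4", "rackC", False),
    ("d5", "rackC", True),
]
selected = pick_metadata_devices(devices, 3)  # one device from each of the three racks
```

With three devices requested and three fault domains available, each domain contributes one device; a fourth request would fall back to a second device in `rackA`.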

Publication date: 08-03-2018

HEALING CLOUD SERVICES DURING UPGRADES

Number: US20180067830A1
Assignee:

Embodiments described herein are directed to migrating affected services away from a faulted cloud node and to handling faults during an upgrade. In one scenario, a computer system determines that virtual machines running on a first cloud node are in a faulted state. The computer system determines which cloud resources on the first cloud node were allocated to the faulted virtual machine, allocates the determined cloud resources of the first cloud node to a second, different cloud node and re-instantiates the faulted virtual machine on the second, different cloud node using the allocated cloud resources. 1. A computer system comprising: one or more processors; one or more computer-readable storage media having stored thereon computer-executable instructions that, when executed by the one or more processors, causes the computing system to perform a method for migrating affected services away ... determining that a virtual machine running on a first cloud node is in a faulted state; determining computing and network resources that were allocated to the virtual machine in the faulted state; obtaining a set of constraints by combining information based on the determined computing and network resources and model of service established for a user of the virtual machine in the faulted state; in order to reduce time to heal in response to the faulted state as well as more rapidly notifying service instances of any topology changes, using the set of constraints to perform an incremental allocation process for only instances of the determined computing and network resources that need migration rather than migrating all the determined computing and network resources; and re-instantiating the virtual machine in the faulted state on a virtual machine at a second, different cloud node using only the instances of the determined computing and network resources that need migration in accordance with the set of constraints used to perform the incremental allocation process.

Publication date: 11-03-2021

REDUCING FAILOVER TIME BETWEEN DATA NODES

Number: US20210073088A1
Assignee: EMC IP Holding Company LLC

A storage node that maintains a replica of a logical volume for use in response to a failover trigger includes a data node with volatile memory in which a filesystem and its metadata and a VDM and its metadata associated with the replica are maintained prior to the failover trigger. The storage node also includes a SAN node in which data associated with the replica is maintained. The data is maintained in a RW (read-write) state by the SAN node prior to the failover trigger. However, the replica is presented in a RO (read-only) state by the storage node prior to the failover trigger. The storage node changes the in-memory state of the filesystem and VDM to RW responsive to the failover trigger. Because the filesystem and its metadata and VDM and its metadata are already in memory and the data is in a RW state in block storage the failover is completed relatively quickly. 1. An apparatus comprising: a storage node that maintains a replica of a logical volume for use in response to a failover trigger, the storage node comprising: a data node comprising a volatile memory in which a filesystem and its metadata associated with the replica is maintained prior to the failover trigger; and a SAN (storage area network) node in which data associated with the replica is maintained. 2. The apparatus of wherein the data node maintains a VDM (virtual data mover) and its metadata associated with the replica in the volatile memory prior to the failover trigger. 3. The apparatus of wherein the data is maintained in a RW (read-write) state by the SAN node prior to the failover trigger. 4. The apparatus of wherein the replica is presented in a RO (read-only) state by the storage node prior to the failover trigger. 5. The apparatus of wherein the storage node changes in-memory state of the filesystem and VDM to RW responsive to the failover trigger. 6. The apparatus of wherein SDNAS (software-defined network attached storage) applications synchronize the replica with a primary site ...

Publication date: 11-03-2021

Techniques for providing intersite high availability of data nodes in a virtual cluster

Number: US20210073089A1
Assignee: EMC IP Holding Co LLC

Creating and using a virtual cluster may include: creating a first cluster logical device on a first data storage system including data nodes; creating a second cluster logical device on a second data storage system including data nodes; configuring the first cluster logical device and the second cluster logical device as a same first logical device; establishing bidirectional remote replication between the first and second cluster logical devices; determining pairs of data nodes including a data node from the first data storage system and another data node from the second data storage system; determining a failure of a first data node on the first data storage system, wherein one of the pairs of data nodes includes the first data node and a second data node of the second data storage system; and responsive to determining the failure of the first data node, performing failover processing by the second data node.

Publication date: 11-03-2021

ADJUSTMENT OF SAFE DATA COMMIT SCAN BASED ON OPERATIONAL VERIFICATION OF NON-VOLATILE MEMORY

Number: US20210073090A1
Assignee:

A first non-volatile dual in-line memory module (NVDIMM) of a first server and a second NVDIMM of a second server are armed during initial program load in a dual-server based storage system to configure the first NVDIMM and the second NVDIMM to retain data on power loss. Prior to initiating a safe data commit scan to destage modified data from the first server to a secondary storage, a determination is made as to whether the first NVDIMM is armed. In response to determining that the first NVDIMM is not armed, a failover is initiated to the second server. 1. A method comprising:arming a first non-volatile dual in-line memory module (NVDIMM) of a first server and a second NVDIMM of a second server during initial program load in a dual-server based storage system to configure the first NVDIMM and the second NVDIMM to retain data on power loss;prior to initiating a safe data commit scan to destage modified data from the first server to a secondary storage, determining whether the first NVDIMM is armed; andin response to determining that the first NVDIMM is not armed, initiating a failover to the second server.2. The method of claim 1 , the method further comprising:in response to determining that the second NVDIMM is not armed, decreasing a time interval between successive safe data commit scans in the second server.3. The method of claim 2 , the method further comprising:in response to determining that the first NVDIMM has become armed once again in the first server and the first server has become operational, changing the time interval between successive safe data commit scans to a predetermined time that is a standard time between successive safe data commit scans.4. The method of claim 2 , the method further comprising:in response to completion of a safe data commit scan in the second server, and in response to determining that NVDIMM usage in the second server is greater than a predetermined threshold or a predetermined time that is a standard time between ...
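
The decision described in claims 1 and 2 can be sketched as a small pure function. This is only an illustration under stated assumptions: the name `plan_commit_scan`, the one-hour standard interval, and the choice to halve the interval when the second NVDIMM is not armed are all hypothetical; the source only says the interval is decreased.

```python
STANDARD_INTERVAL_S = 3600  # assumed standard time between safe data commit scans


def plan_commit_scan(first_armed, second_armed, interval_s=STANDARD_INTERVAL_S):
    """Return (action, scan interval in seconds) before a safe data commit scan."""
    if first_armed:
        # The first NVDIMM will retain data on power loss, so scan as usual.
        return ("scan_on_first", interval_s)
    # The first NVDIMM is not armed: fail over to the second server.
    if not second_armed:
        # Assumption: halve the interval; the source only says "decreasing".
        interval_s = interval_s // 2
    return ("failover_to_second", interval_s)


normal = plan_commit_scan(first_armed=True, second_armed=True)
failover = plan_commit_scan(first_armed=False, second_armed=True)
degraded = plan_commit_scan(first_armed=False, second_armed=False)
```

Keeping the decision free of side effects makes it easy to check the three cases separately from the code that actually destages data.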

Publication date: 27-02-2020

SYSTEM AND METHOD FOR A RECONFIGURABLE VEHICLE DISPLAY

Number: US20200067786A1
Author: Ricci Christopher
Assignee:

A system or method for reconfiguring (dynamically) a vehicle display may comprise: a Graphical User Interface (“GUI”) including a first display area; an input gesture area of the first display area; a HUD unit; a non-transitory computer readable medium having instructions stored thereon that, when executed by a processor, configure the system to: (1) display, at a first time, a configuration area to a portion of the GUI, wherein the configuration area includes at least one of a vehicle dash information, readouts, instruments, indicators, or controls arranged as a visual representation of a virtual dash display for the HUD unit; (2) receive a gesture input at the GUI, wherein the gesture input corresponds to an instruction to reconfigure at least one of a layout, size, position, features, instruments, indicators, color schemes, or controls for display on at least one of an above a vehicle dash by the HUD unit, a reconfigurable dash display, a reconfigurable console display, or a reconfigurable user device display; and wherein the gesture input is at least one of hand gesture or touch gesture received through at least one of a gesture capture region or image capture disposed on at least one of a dash, console, dash display, or console display. 1. A system for a reconfigurable vehicle display, said system comprising: a Graphical User Interface (“GUI”) including a first display area; an input gesture area of the first display area; a HUD unit; a non-transitory computer readable medium having instructions stored thereon that, when executed by a processor, configure the system to: display, at a first time, a configuration area to a portion of the GUI, wherein the configuration area includes at least one of a vehicle dash information, readouts, instruments, indicators, or controls arranged as a visual representation of a virtual dash display for the HUD unit; receive a gesture input at the GUI, wherein the gesture input corresponds to an instruction to reconfigure at least one ...

Publication date: 11-03-2021

SCALABLE BYZANTINE FAULT-TOLERANT PROTOCOL WITH PARTIAL TEE SUPPORT

Number: US20210075598A1
Author: Karame Ghassan, Li Wenting
Assignee:

A method is provided for preparing a plurality of distributed nodes to perform a protocol to establish a consensus on an order of received requests. The plurality of distributed nodes includes a plurality of active nodes, the plurality of active nodes including a primary node, each of the plurality of distributed nodes including a processor and computer readable media. The method includes preparing a set of random numbers, each being a share of an initial secret. Each share of the initial secret corresponds to one of the plurality of active nodes. The method further includes encrypting each respective share of the initial secret, binding the initial secret to a last counter value to provide a commitment and a signature for the last counter value, and generating shares of a second and of a plurality of subsequent additional secrets by iteratively applying a hash function to shares of each preceding secret. 1. A method for preparing a plurality of distributed nodes connected via a data communication network to perform a protocol to establish a consensus on an order of received requests , the plurality of distributed nodes including a plurality of active nodes , the plurality of active nodes including a primary node , each of the plurality of distributed nodes including a processor and computer readable media , the method comprising:preparing a set of random numbers, wherein each of the random numbers is a share of an initial secret, wherein each share of the initial secret corresponds to one of the plurality of active nodes;encrypting, in order to generate encrypted shares of the initial secret, each respective share of the initial secret;binding the initial secret to a last counter value to provide a commitment and a signature for the last counter value;generating shares of a second and of a plurality of subsequent additional secrets by iteratively applying a hash function to shares of each preceding secret;binding the second secret to a second-to-last counter value 
...
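
The share-generation step (each later secret's shares are obtained by hashing the shares of the preceding secret) can be sketched as a hash chain per node. This is a minimal illustration, not the protocol itself: SHA-256, the 32-byte share size, the three-node setup, and the name `derive_share_chain` are assumptions, and the encryption, commitment, and signature steps are left out.

```python
import hashlib
import secrets


def derive_share_chain(initial_share, extra_secrets):
    """Return one node's shares: [share of secret 1, share of secret 2, ...]."""
    chain = [initial_share]
    for _ in range(extra_secrets):
        # Each later share is the hash of the share that precedes it.
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


# One random initial share per active node (three active nodes assumed here).
initial_shares = [secrets.token_bytes(32) for _ in range(3)]
chains = [derive_share_chain(share, 4) for share in initial_shares]
```

Per the abstract, the initial secret is bound to the last counter value and the second secret to the second-to-last, so index 0 of each chain would pair with the highest counter.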

Publication date: 15-03-2018

FAILOVER SYSTEMS AND METHODS FOR PERFORMING BACKUP OPERATIONS

Number: US20180074914A1
Assignee:

In certain embodiments, a tiered storage system is disclosed that provides for failover protection during data backup operations. The system can provide for an index, or catalog, for identifying and enabling restoration of backup data located on a storage device. The system further maintains a set of transaction logs generated by media agent modules that identify metadata with respect to individual data chunks of a backup file on the storage device. A copy of the catalog and transaction logs can be stored at a location accessible by each of the media agent modules. In this manner, in case of a failure of one media agent module during backup, the transaction logs and existing catalog can be used by a second media agent module to resume the backup operation without requiring a restart of the backup process. 1. A method for performing an operation in a data storage system, the method comprising: with a first computing device comprising one or more hardware processors: receiving a plurality of data units from a client computing device to store on at least one first storage device as part of a data protection operation; storing at least a first data unit of the plurality of data units on the at least one first storage device; and prior to completion of the data protection operation, storing metadata to at least one second storage device in association with the storing of the first data unit and prior to, concurrently with, or subsequent to the storing of the first data unit, wherein the at least one second storage device is accessible by the first computing device and at least one other computing device; and ... receiving an instruction to take over control of the data protection operation partially performed, but not yet completed, by the first computing device; obtaining, from the at least one second storage device, the metadata associated with the storing of the first data unit on the at least one first storage device; using at least the metadata, ...

Publication date: 07-03-2019

INFORMATION PROCESSING SYSTEM

Number: US20190073281A1
Assignee: Hitachi, Ltd.

A first system receives values with identifiers of the values from one or more clients. The first system enters the values sequentially into a first data store. The first system associates each of the values with a sequence ID indicating a position in entry sequence of the values into the first data store. The first system transmits a first identifier of a first value and a first sequence ID associated with the first value to a second system. The first system transmits the first sequence ID and the first value to the second system after transmitting the first identifier and the first sequence ID. The second system holds the first identifier and the first sequence ID transmitted from the first system in a first queue. The second system enters the first value received after the first identifier from the first system into a second data store. 1. An information processing system comprising: a first system; and a second system, wherein the first system is configured to: receive values with identifiers of the values from one or more clients; enter the values sequentially into a first data store; associate each of the values with a sequence ID indicating a position in entry sequence of the values into the first data store; transmit a first identifier of a first value and a first sequence ID associated with the first value to the second system; and transmit the first sequence ID and the first value to the second system after transmitting the first identifier and the first sequence ID, and wherein the second system is configured to: hold the first identifier and the first sequence ID transmitted from the first system in a first queue; and enter the first value received after the first identifier from the first system into a second data store. 2. The information processing system according to claim 1, wherein, after failover from the first system to the second system, the second system is configured to issue an error message when a second identifier of a second value ...
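
The two-message exchange in claim 1 (identifier plus sequence ID first, then sequence ID plus value) can be sketched from the second system's side. This is an illustration under assumptions: the class and method names are hypothetical, a dictionary stands in for the "first queue", and removing an entry from the queue once its value arrives is a guess the source does not state. The failover error condition of claim 2 is truncated in the source and is not modeled.

```python
class SecondSystem:
    """Receiver side of the exchange; all names here are illustrative."""

    def __init__(self):
        self.pending = {}  # sequence ID -> identifier; stands in for the "first queue"
        self.store = {}    # identifier -> value; stands in for the "second data store"

    def on_identifier(self, identifier, seq_id):
        # First message: remember which identifier belongs to this sequence ID.
        self.pending[seq_id] = identifier

    def on_value(self, seq_id, value):
        # Second message: match the value to its identifier and store it.
        identifier = self.pending.pop(seq_id)  # assumption: entry leaves the queue
        self.store[identifier] = value


second = SecondSystem()
second.on_identifier("key-a", 1)
second.on_identifier("key-b", 2)
second.on_value(1, "value-a")
```

After these calls, `key-a` is stored while `key-b` is still waiting for its value, which is the state a failover would have to account for.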

Publication date: 24-03-2022

DISTRIBUTED SYSTEM, MESSAGE PROCESSING METHOD, NODES, CLIENT, AND STORAGE MEDIUM

Number: US20220091918A1

The present disclosure discloses a client device having a digital signature. The client device includes processing circuitry configured to send a message to be stored in nodes after the nodes reach a consensus on the message. The message includes the digital signature of the client device. The processing circuitry obtains results from a subset of the nodes that receive the message. The results have respective digital signatures of the subset of the nodes. The nodes are in a first consensus mode for reaching the consensus on the message. After verifying the digital signatures, the processing circuitry determines whether one or more of the nodes has malfunctioned based on the results. Based on a determination that the one or more of the nodes has malfunctioned, the processing circuitry triggers the nodes to switch from the first consensus mode to a second consensus mode for reaching the consensus on the message. 1. A client device having a digital signature, the client device comprising: processing circuitry configured to: send a message to be stored in nodes after the nodes reach a consensus on the message, the message including the digital signature of the client device; obtain results from a subset of the nodes that receive the message, the results having respective digital signatures of the subset of the nodes, the nodes being in a first consensus mode for reaching the consensus on the message; after verifying the digital signatures of the subset of the nodes, determine, by the processing circuitry of the client device, based on the results, whether one or more of the nodes has malfunctioned; and based on a determination that the one or more of the nodes has malfunctioned, trigger, by the processing circuitry of the client device, the nodes to switch from the first consensus mode to a second consensus mode for reaching the consensus on the message. 2. The client device according to claim 1, wherein: the nodes include a leader node and follower nodes; the ...

Publication date: 15-03-2018

REDUNDANT STORAGE SOLUTION

Number: US20180077007A1
Author: Olson Eric
Assignee: SOFTNAS OPERATING INC.

Method and apparatus for switching between a first server and a second server, each located within a virtual private cloud and the first server being located within a first zone and the second server being located within a second zone that is physically separate from the first zone. The method and apparatus further configured to determine that the first server has experienced a failure to send or receive data. The method and apparatus further configured to enable a second port on the second server. The method and apparatus further configured to create a new route table by the second server and flush the previous route table. The method and apparatus further configured to transmit, via the second port, a request to a virtual private cloud controller to update an elastic internet protocol address with the second port information and receive data from the virtual private cloud controller. 1. A method of switching between a first node communicatively coupled to a virtual cloud controller and a second node coupled to the first node , each located within a virtual cloud , the nodes each comprising pools and volumes , the method comprising:determining, at the second node, that the first node has experienced a failure to send or receive data;determining the second node has proper connectivity;fencing off the first node;creating a new local route table by the second node;flushing the previous local route table;transmitting a request to a virtual cloud controller to update a virtual internet protocol address to point to the second node; andreceiving, at the second node, data via the updated virtual internet protocol address.2. The method according to claim 1 , further comprising:initiating, at the first node, a boot sequence;querying a third party witness for a current role of the first node;determine the current role of the first node; andbegin processing at the first node based on the current role.3. The method according to claim 2 , wherein the third party witness is a ...

Publication date: 05-03-2020

MANAGEMENT OF CLUSTERS

Number: US20200073769A1
Assignee:

Example implementations relate to management of clusters. A cluster recovery manager may comprise a processing resource; and a memory resource storing machine-readable instructions to cause the processing resource to: adjust, based on a monitored degree of performance of a controller of a controller cluster, a state of the controller to one of a first state and a second state; and reassign a corresponding portion of a plurality of APs managed by the controller periodically to a different controller until the state of the controller is determined to be adjustable to the first state. The reassignment can be triggered responsive to a state adjustment of the controller from the first state to the second state. 1. A cluster recovery manager, comprising: a processing resource; and a memory resource storing machine-readable instructions to cause the processing resource to: adjust, based on a monitored degree of performance of a controller of a controller cluster, a state of the controller from a first state to a second state; reassign, responsive to a determination that the state of the controller is adjusted to the second state, a first portion of a plurality of access points (AP) managed by the controller to a respective controller corresponding to each AP of the first portion during a first period; and determine whether to reassign a remaining portion of the plurality of APs based on the degree of performance monitored subsequent to the first portion of the plurality of APs being reassigned. 2. The cluster recovery manager of claim 1, wherein the degree of performance of the controller is evaluated based on information collected from the controller. 3. The cluster recovery manager of claim 1, wherein the remaining portions of the plurality of APs including the second portion is periodically reassignable in a respective period until the state of the controller is determined to be adjustable from the second state to the first state. 4. The cluster recovery manager of ...
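
The periodic, portion-by-portion reassignment can be sketched as a loop that moves one batch of access points per period and re-checks the controller after each batch. This is a sketch under assumptions: the name `reassign_in_batches`, the fixed batch size, and modeling the "degree of performance" as a callback on the remaining AP count are all hypothetical.

```python
def reassign_in_batches(aps, batch_size, is_recovered):
    """Move APs away from a degraded controller one batch per period.

    `is_recovered(remaining_count)` is a hypothetical health check that returns
    True once the controller could return to the first (healthy) state.
    """
    remaining = list(aps)
    moved = []
    while remaining and not is_recovered(len(remaining)):
        # One period: reassign the next portion, then re-evaluate performance.
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        moved.extend(batch)
    return moved, remaining


aps = [f"ap{i}" for i in range(10)]
# Assumed rule for the example: the controller copes once it manages at most 5 APs.
moved, kept = reassign_in_batches(aps, batch_size=3, is_recovered=lambda n: n <= 5)
```

In this run two batches are moved (six APs) and four stay with the original controller, because the check passes before a third batch is needed.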

Publication date: 14-03-2019

INFORMATION PROCESSING SYSTEM AND CONTROL APPARATUS

Number: US20190079838A1
Assignee: FUJITSU LIMITED

An information processing system includes a plurality of control apparatuses communicably coupled to each other. A first control apparatus of the plurality of control apparatuses includes a first memory configured to store first instructions and a first processor configured to operate using standby power before a power-on selection is made. The first processor executes the first instructions causing a process including collecting first identification information of each of the plurality of control apparatuses other than the first control apparatus. The process includes storing the first identification information in the first memory. The process includes determining a role of the first control apparatus based on a comparison result derived by comparing second identification information of the first control apparatus with the first identification information. 1. An information processing system comprising:a plurality of control apparatuses communicably coupled to each other, a first control apparatus of the plurality of control apparatuses including:a first memory configured to store first instructions; anda first processor configured to operate using standby power, before a power-on selection is made, and execute the first instructions causing a process including:collecting first identification information of each of the plurality of control apparatuses other than the first control apparatus;storing the first identification information in the first memory; anddetermining a role of the first control apparatus based on a comparison result derived by comparing second identification information of the first control apparatus with the first identification information.2. The information processing system according to claim 1 , whereinthe first and second identification information is a numerical value, andthe process further includes:determining the role of the first control apparatus based on a magnitude relationship between a value of the second identification ...
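
Role determination by comparing identification values can be sketched in a few lines. The source says only that the role follows from a magnitude relationship between numerical identifiers; the rule used here (the smallest identifier becomes the primary), the role names, and the function name `determine_role` are assumptions for illustration.

```python
def determine_role(own_id, peer_ids):
    """Decide this control apparatus's role from numeric identifiers.

    Assumed rule: the apparatus with the smallest identifier is the primary.
    """
    return "primary" if all(own_id < peer for peer in peer_ids) else "secondary"


collected_ids = [17, 42, 23]  # identifiers collected from the other apparatuses
role_low = determine_role(5, collected_ids)
role_high = determine_role(30, collected_ids)
```

Because every apparatus applies the same comparison to the same set of identifiers, each one can reach a consistent answer without a separate election message.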

Publication date: 26-03-2015

System and method for providing administration command progress status in a cloud platform environment

Number: US20150089039A1
Assignee: Oracle International Corp

In accordance with an embodiment, described herein is a system and method for providing administrative command progress status for use with a cloud computing environment. In accordance with an embodiment, a job manager service provides an application program interface which receives administrative commands to be processed within the cloud environment as jobs, wherein each instance of the administrative commands is associated with a unique job identifier. A command line interface allows a user to issue a command to be processed within the cloud environment as a job. During progress of a job associated with an annotated command, a status associated with the progress of the job is determined and provided to the command line interface. For example, the system can provide job progress status during these operations, to reassure the user that the operation is proceeding normally.

Publication date: 14-03-2019

OPERATING A HIGHLY AVAILABLE AUTOMATION SYSTEM

Number: US20190081812A1
Assignee:

To achieve an automatic adjustment of a monitoring time in an automation system with a first automation device and a second automation device, at least one of the two automation devices operates a measuring program. A desired ring interruption is carried out by the measuring program by blocking a ring port in order thus to provoke a ring reconfiguration that utilizes a reconfiguration time. The blockage of the ring port is canceled again after the reconfiguration time has elapsed. The ring port is blocked again if the ring port has been opened by the ring reconfiguration, and all routing tables are deleted. As a result of this, at least the peripheral units are triggered to learn new network routes. Runtimes of test telegrams are measured, and a maximum value of the measured runtimes is stored. The measured maximum value is used for a dynamic adjustment of the monitoring time.
1. A method for operating a program-controlled highly available automation system configured redundantly with a first automation device and a second automation device, for a technical process, wherein one automation device of the first automation device and the second automation device preferentially controls the technical process via peripheral units, and the first automation device and the second automation device mutually monitor for failure of the respective other automation device of the first automation device and the second automation device, wherein a monitoring query from the first automation device to the second automation device, and vice versa, is to be responded to within a monitoring time, wherein for a communication, the first automation device and the second automation device, and the peripheral units are connected with one another via a ring, wherein the first automation device and the second automation device each have a first ring port and a second ring port in order to form the ring, wherein one ring port of the first ring ports and the second ring ports is ...
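The final step of the abstract, deriving the monitoring time from the largest measured test-telegram runtime, could be sketched in Python as follows. The function name, the millisecond unit, and the `margin` safety factor are assumptions for illustration; the source only says that the measured maximum is used for the adjustment.

```python
def adjusted_monitoring_time(runtimes_ms: list[float], margin: float = 1.5) -> float:
    """Derive a monitoring time from runtimes measured after a forced ring reconfiguration.

    Assumption: the stored maximum is scaled by a safety margin; the source
    does not specify how the maximum is turned into a monitoring time.
    """
    if not runtimes_ms:
        raise ValueError("at least one measured runtime is required")
    worst_case = max(runtimes_ms)  # maximum value of the measured runtimes
    return worst_case * margin
```

With `margin=1.0` the monitoring time equals the worst observed runtime; a larger margin trades slower failure detection for fewer false alarms.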

Publication date: 26-03-2015

Management device, data acquisition method, and recording medium

Number: US20150089271A1
Author: Takeru SHIMIZU
Assignee: Fujitsu Ltd

A management device includes a processor that executes a process. The process includes: saving a conversion table when an information processing apparatus that performs a memory access by the conversion table, in which an active absolute address that is used by the processor to specify data is associated with an active physical address that indicates a storage area in a memory that stores therein the data, has failed; creating a second conversion table in which a standby absolute address that is different from the active absolute address is associated with the active physical address used at the time of a failure and a standby physical address that is different from the active physical address used at the time of the failure is associated with the active absolute address; setting the second conversion table; and acquiring the data from the storage area that is indicated by the physical address.

Publication date: 31-03-2022

STORAGE SYSTEM AND CONTROL METHOD THEREFOR

Number: US20220100616A1
Assignee: Hitachi, Ltd.

Each redundancy group is constituted by one active program (storage control software of the active program) and N standby programs (N is an integer of two or more). Each of the N standby programs is associated with a priority to be determined as a failover (FO) destination. In the same redundancy group, FO is performed from the active program to the standby program based on the priority. For the plurality of pieces of storage control software including the active programs and the standby programs that change to be active by FO in the plurality of redundancy groups arranged in the same node, standby storage control software that can set each of the programs as a FO destination are arranged in different nodes.
1. A storage system comprising: a plurality of storage nodes each having a memory and a processor; and a storage device, wherein a plurality of redundancy groups each of which is constituted by (N+1)-multiplexed pieces of storage control software are arranged in the plurality of storage nodes (N is an integer of two or more), and for each of the redundancy groups, the (N+1)-multiplexed pieces of storage control software, which constitute the redundancy group and are executed by processors to perform storage control, are arranged in different (N+1) storage nodes among the plurality of storage nodes, one piece of storage control software out of the (N+1)-multiplexed pieces of storage control software is an active program which is active storage control software, and each of the remaining N pieces of storage control software is a standby program which is standby storage control software, each of the N standby programs is associated with a priority to be determined as a failover destination, when a storage node where the active program is arranged fails, failover within the redundancy group from the active program to a standby program with a highest priority is performed, an arrangement condition of a redundancy group α is that at most k standby programs among the N standby ...
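The priority-based failover within one redundancy group can be pictured with a small Python sketch. The `Standby` class, the node names, and the convention that a larger number means a higher priority are assumptions for this example and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Standby:
    node: str      # storage node hosting this standby program
    priority: int  # assumption: a larger number means a higher priority


def failover_target(standbys: list[Standby], failed_nodes: set[str]) -> Optional[Standby]:
    """Pick the surviving standby program with the highest priority."""
    alive = [s for s in standbys if s.node not in failed_nodes]
    return max(alive, key=lambda s: s.priority, default=None)
```

For a group with standbys on `node-b` (priority 2) and `node-c` (priority 1), a failure of the active program's node selects `node-b`; if `node-b` is also down, the choice falls through to `node-c`.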

Publication date: 31-03-2022

Storage system and control method therefor

Number: US20220100617A1
Assignee: HITACHI LTD

Each redundancy group is constituted by one active program (storage control software of the active program) and N standby programs (N is an integer of two or more). Each of the N standby programs is associated with a priority to be determined as a failover (FO) destination. In the same redundancy group, FO is performed from the active program to the standby program based on the priority. For the plurality of pieces of storage control software including the active programs and the standby programs that change to be active by FO in the plurality of redundancy groups arranged in the same node, standby storage control software that can set each of the programs as a FO destination are arranged in different nodes.

Publication date: 31-03-2022

COMPUTER CLUSTER USING EXPIRING RECOVERY RULES

Number: US20220100619A1
Author: Gusev Andrey, Wang Tak
Assignee: ORACLE INTERNATIONAL CORPORATION

The fail-over computer cluster enables multiple computing devices to operate using adaptive quorum rules to dictate which nodes are in the fail-over cluster at any given time. The adaptive quorum rules provide requirements for communications between nodes and connections with voting file systems. The adaptive quorum rules include particular recovery rules for unplanned changes in node configuration, such as due to a disruptive event. Such recovery quorum rules enable the fail-over cluster to continue to operate with various changed configurations of its node members as a result of the disruptive event. In the changed configuration, access to voting file systems may not be required for a majority-group subset of nodes. If no majority-group subset remains, nodes may need direct or indirect access to voting file systems.
1. A computer-implemented method to operate a computer cluster having a plurality of nodes according to quorum rules, the method comprising: prior to a disruptive event, the computer cluster operates according to formation quorum rules in which each initial node that operates in the computer cluster is in communication with at least a majority of one or more voting file systems; determining a failure status of at least one of the plurality of nodes of the computer cluster in response to a disruptive event; and maintaining the computer cluster according to at least one recovery quorum rule for an expiration time, under which a subset of remaining nodes operates, after which the formation quorum rules or revised quorum rules apply instead of the recovery quorum rules, wherein the at least one recovery quorum rule is different from the formation quorum rules and the revised quorum rules.
2. The computer-implemented method of claim 1, wherein the expiration time is extendable based, at least in part, on an additional time to address a result of the disruptive event.
3. The computer-implemented method of claim 1, wherein the at least one ...
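The idea of a recovery rule that applies only for a limited time after a disruptive event can be sketched in Python as below. The function name, the use of seconds on a shared clock, and the choice to fall back to the formation rules (rather than revised rules) after expiry are assumptions for illustration.

```python
from typing import Optional


def applicable_rules(now: float, event_time: Optional[float], expiration: float) -> str:
    """Return which quorum rule set governs the cluster at time `now`.

    Assumption: times are seconds on a shared clock, and the formation
    rules apply again once the recovery window has ended.
    """
    if event_time is None:
        return "formation"   # no disruptive event has occurred
    if now - event_time < expiration:
        return "recovery"    # recovery rule is still within its expiration time
    return "formation"       # recovery rule has expired
```

Extending the expiration time, as claim 2 describes, would correspond to calling the function with a larger `expiration` value.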

Publication date: 25-03-2021

HIGH AVAILABILITY FOR A RELATIONAL DATABASE MANAGEMENT SYSTEM AS A SERVICE IN A CLOUD PLATFORM

Number: US20210089415A1
Assignee:

A Relational Database Management System (“RDBMS”) as a service cluster may include a master RDBMS Virtual Machine (“VM”) node associated with an Internet Protocol (“IP”) address and a standby RDBMS VM node associated with an IP address. The RDBMS as a service (e.g., PostgreSQL as a service) may also include n controller VM nodes each associated with an IP address. An internal load balancer may receive requests from cloud applications and include a frontend IP address different than the RDBMS as a service IP addresses and a backend pool including indications of the master RDBMS VM node and the standby RDBMS VM node. A Hyper-Text Transfer Protocol (“HTTP”) custom probe may transmit requests for the health of the master RDBMS VM node and the standby RDBMS VM node via the associated IP addresses, and responses to the requests may be used in connection with a failover operation.
1. A system associated with a cloud-based computing environment, comprising: a Relational Database Management System (“RDBMS”) as a service cluster, including: a master RDBMS Virtual Machine (“VM”) node associated with an Internet Protocol (“IP”) address, a standby RDBMS VM node associated with an IP address, and n controller VM nodes each associated with an IP address, where n is greater than 1; and an internal load balancer to receive requests from cloud applications, including: a frontend IP address different than the RDBMS as a service IP addresses, a backend pool including indications of the master RDBMS VM node and the standby RDBMS VM node, a Hyper-Text Transfer Protocol (“HTTP”) custom probe to transmit requests for the health of the master RDBMS VM node and the standby RDBMS VM node via the associated IP addresses, wherein responses to the requests are used in connection with a failover operation.
2. The system of claim 1, wherein the RDBMS as a service comprises PostgreSQL as a service.
3. The system of claim 1, wherein a HTTP 200 OK response is received from a ...
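How probe responses might feed a routing decision can be sketched in Python without any network calls. The function name and the rule that only an HTTP 200 response counts as healthy are assumptions; the listing truncates the claim that describes which node returns 200, so this is one plausible reading rather than the patented logic.

```python
from typing import Optional

HTTP_OK = 200


def select_backend(master_status: int, standby_status: int) -> Optional[str]:
    """Choose which backend-pool member should receive traffic.

    Assumption: a node is treated as healthy only when its probe returns 200.
    """
    if master_status == HTTP_OK:
        return "master"
    if standby_status == HTTP_OK:
        return "standby"  # failover: the master probe did not return 200
    return None           # no healthy node in the backend pool
```

The inputs stand for the status codes a load balancer would collect from the custom probe for each node's IP address.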

Publication date: 31-03-2016

Evidence-based replacement of storage nodes

Number: US20160092287A1
Assignee: Intel Corp

Apparatus, systems, and methods for Recovery algorithm in memory are described. In one embodiment, a controller comprises logic to receive reliability information from at least one component of a storage device coupled to the controller, store the reliability information in a memory communicatively coupled to the controller, generate at least one reliability indicator for the storage device, and forward the reliability indicator to an election module. Other embodiments are also disclosed and claimed.

Publication date: 31-03-2016

MULTI-PARTITION NETWORKING DEVICE AND METHOD THEREFOR

Number: US20160092323A1
Assignee: Freescale Semiconductor, Inc.

A multi-partition networking device comprising a primary partition running on a first set of hardware resources and a secondary partition running on a further set of hardware resources. The multi-partition networking device is arranged to operate in a first operating state, whereby the first set of hardware resources are in an active state and the primary partition is arranged to process network traffic, and the further set of hardware resources are in a standby state. The multi-partition networking device is further arranged to transition to a second operating state upon detection of a suspicious condition within the primary partition, whereby the further set of hardware resources are transitioned from a standby state to an active state, and to transition to a third operating state upon detection of a failure condition within the primary partition, whereby processing of network traffic is transferred to the secondary partition.
1. A multi-partition networking device comprising at least one primary partition running on a first set of hardware resources and at least one secondary partition running on at least one further set of hardware resources; wherein the multi-partition networking device is arranged to: operate in a first operating state, whereby the first set of hardware resources are in an active state and the primary partition is arranged to process network traffic and the at least one further set of hardware resources are in a standby state; transition to a second operating state upon detection of a suspicious condition within the primary partition, whereby the at least one further set of hardware resources are transitioned from a standby state to an active state; and transition to a third operating state upon detection of a failure condition within the primary partition, whereby processing of network traffic is transferred to the secondary partition.
2. The multi-partition networking device of claim 1, wherein the multi-partition networking device further ...
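The three operating states described above form a small state machine, sketched here in Python. The enum member names, the event strings, and the choice to let a failure move the device to the third state from any state are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    FIRST = 1   # primary partition active, further hardware in standby
    SECOND = 2  # suspicious condition seen: further hardware made active
    THIRD = 3   # failure: network traffic handled by the secondary partition


def next_state(state: State, event: str) -> State:
    """Advance the device state for an event detected in the primary partition."""
    if event == "failure":
        return State.THIRD
    if event == "suspicious" and state is State.FIRST:
        return State.SECOND
    return state  # unknown events, or no further escalation possible
```

Waking the standby hardware on a merely suspicious condition means that, if a failure follows, the hand-over to the secondary partition does not have to wait for that hardware to start.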

Publication date: 02-04-2015

Redundant Automation System

Number: US20150095690A1
Assignee:

A redundant automation system having a plurality of automation devices which are connected to one another comprises a plurality of master devices and a slave device. Each of the plurality of automation devices processes a control program in order to control a technical process. At least one of the plurality of automation devices operates as a slave and at least two of the plurality of automation devices each operate as a master. The plurality of master devices is each configured to run a respective master program and to process processing sections of the respective master program, and the slave device is configured to process a corresponding slave control program for each master control program run by the plurality of master devices and, if one of the plurality of master devices fails, to assume the function of the failed master.
1. A redundant automation system having a plurality of automation devices which are connected to one another, wherein each of the plurality of automation devices processes a control program in order to control a technical process and wherein at least one of the plurality of automation devices operates as a slave and at least two of these automation devices each operate as a master, the redundant automation system comprising: a plurality of master devices, each configured to run a respective master program and to process processing sections of the respective master program; and a slave device configured to process a corresponding slave control program for each master control program run by the plurality of master devices and, if one of the plurality of master devices fails, to assume the function of the failed master, wherein each of the plurality of master devices is further configured to transmit a master release to the slave device after an event has occurred or after the expiry of a predefined interval of time, wherein the master release indicates to the slave device the ...

Publication date: 05-05-2022

Transforming application-instance specific data

Number: US20220137828A1
Assignee: EMC IP Holding Co LLC

Transforming data that is provided by a first instance of an application that uses application-instance specific data includes determining if a component of the data is an application-instance specific component and, if the component is an application-instance specific component, transforming the component either at a storage system containing the data or as the component is being accessed by a second instance of the application, different from the first instance. Transforming the component at a storage system containing the data may be performed independently of any accesses of the data. Transforming the component at a storage system containing the data may be performed by the storage system. The first instance of the application may run on a first host and the second instance of the application may run on a second host different from the first host. The first and second instances of the application may run on a same host.

Publication date: 30-03-2017

VEHICLE MIDDLEWARE

Number: US20170093643A1
Assignee:

The present disclosure describes a vehicle implementing one or more processing modules. These modules are configured to connect and interface with the various buses in the vehicle, where the various buses are connected with the various components of the vehicle to facilitate information transfer among the vehicle components. Each processing module is further modularized with the ability to add and replace other functional modules now or in the future. These functional modules can themselves act as distinct vehicle components. Each processing module may hand off processing to other modules depending on its health, processing load, or by third-party control. Thus, the plurality of processing modules helps to implement a middleware point of control to the vehicle with redundancy in processing and safety and security awareness in their applications.
1-20. (canceled)
21. A vehicle, comprising: a non-transient, tangible computer-readable memory; a computational module selector stored in the non-transient, tangible computer-readable memory to identify and select a computational module from among a plurality of computational modules in communication with the computational module selector to perform a selected task, operation, and/or function, the selected task, operation, and/or function having performance requirements, wherein each one of the plurality of computational modules has processing capabilities; wherein at least a pair of the plural computational modules comprises a first computational module with a cellular capability and a second computational module without a cellular capability, wherein the computational module selector selects a computational module from among the plurality of computational modules, wherein the selected computational module has processing capabilities that satisfy the performance requirements of the selected task, operation, and/or function; a network selector module stored in the non-transient, tangible computer-readable memory to select one ...

Publication date: 19-03-2020

DISTRIBUTED PROCESSING SYSTEM AND METHOD FOR MANAGEMENT OF DISTRIBUTED PROCESSING SYSTEM

Number: US20200089585A1
Assignee: Hitachi, Ltd.

A distributed processing system includes a plurality of information processing apparatuses communicably coupled to one another and is capable of performing parallel processing in which the information processing apparatus performs predetermined processing in parallel with the other information processing apparatuses. The distributed processing system includes a configuration-information storing part that stores configuration information concerning the number of the information processing apparatuses configuring the distributed processing system and a combination of the information processing apparatuses, a state monitoring part that monitors an operation state of each of the information processing apparatuses, and a system reconfiguring part that, when detecting a change of the operation state of the information processing apparatus, changes the configuration information based on the number and combination of information processing apparatuses in operation and causes, based on the changed configuration information, at least one or more of the information processing apparatuses to perform the predetermined processing.
1. A distributed processing system that is configured to include a plurality of information processing apparatuses communicably coupled to one another and each including a processor and a memory and is capable of performing parallel processing in which the information processing apparatus performs predetermined processing in parallel with the other information processing apparatuses, the distributed processing system comprising: a configuration-information storing part that stores configuration information, which is information concerning a number of the information processing apparatuses configuring the distributed processing system and a combination of the information processing apparatuses; a state monitoring part that monitors an operation state of each of the information processing apparatuses; and a system reconfiguring part that, when detecting a ...

Publication date: 19-03-2020

DISASTER RECOVERY SPECIFIC CONFIGURATIONS, MANAGEMENT, AND APPLICATION

Number: US20200089587A1
Assignee:

A mechanism for disaster recovery configurations and management in virtual tape applications. Specifically, the introduction of an additional computer process executing at an active datacenter site and at another active (or alternatively, a standby) datacenter site permits: (i) the generation and management of global configurations implemented on the active datacenter site prior to the occurrence of a failover event; and (ii) the implementation of global configurations on the other active (or standby) datacenter site after the occurrence of the failover event.
1-20. (canceled)
21. A method for mitigating disaster recovery in virtual tape applications, comprising: receiving, by a first virtual tape solution (VTS), a configuration override instruction from a first mainframe; obtaining, in response to the receiving, a disaster recovery configuration (DRC) from a configurations repository, wherein the DRC is a latest global configuration implemented on a second VTS prior to a failover event occurring on a second mainframe; and processing the DRC to configure a set of virtual tape engines (VTEs) of the first VTS, wherein, for each VTE of the set of VTEs, processing the DRC comprises: segmenting a portion of the DRC to define a VTE specific configuration (VSC).
22. The method of claim 21, wherein the configuration override instruction comprises a command for the first VTS to utilize the DRC.
23. The method of claim 22, wherein the DRC replaces an existing global configuration of the first VTS.
24. The method of claim 21, wherein processing the DRC further comprises: generating, based on a first portion of the VSC, a first library batch request (LBR) comprising a set of tape library addition requests.
25. The method of claim 24, wherein processing the DRC further comprises: processing the first LBR to create a first set of virtual tape libraries.
26. The method of claim 25, wherein processing the DRC further comprises: generating, based on a second portion of the VSC, a first ...

Publication date: 19-03-2020

Memory system and operating method thereof

Number: US20200090778A1
Author: Jong-Min Lee
Assignee: SK hynix Inc

A memory system includes: a memory device including a plurality of memory blocks; a memory; a data classifier suitable for classifying check-pointing information stored in the memory as selective information and mandatory information; and a check-pointing component suitable for performing a control to periodically perform a check-pointing operation of programming the selective information and the mandatory information in a memory block, wherein the check-pointing component performs the check-pointing operation by performing a control to program the mandatory information after programming the selective information.
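The ordering described in the abstract, selective information programmed first and mandatory information last, can be shown with a short Python sketch. The function name and the `(name, is_mandatory)` tuple layout are hypothetical; the boolean flag merely stands in for the data classifier, whose actual criteria are not given in the listing.

```python
def checkpoint_order(entries: list[tuple[str, bool]]) -> list[str]:
    """Return entry names in programming order: selective first, mandatory last.

    Each entry is (name, is_mandatory); the flag is a stand-in for the
    data classifier, whose real criteria are not described in the source.
    """
    selective = [name for name, is_mandatory in entries if not is_mandatory]
    mandatory = [name for name, is_mandatory in entries if is_mandatory]
    return selective + mandatory
```

For example, `checkpoint_order([("map", True), ("stats", False), ("log", True)])` yields `["stats", "map", "log"]`, keeping the original relative order within each class.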

Publication date: 05-04-2018

RESTORING DISTRIBUTED SHARED MEMORY DATA CONSISTENCY WITHIN A RECOVERY PROCESS FROM A CLUSTER NODE FAILURE

Number: US20180095848A1
A DSM component is organized as a matrix of pages. A data structure of a set of data structures occupies a column in the matrix of pages. A recovery file is maintained in a persistent storage. The recovery file consists of entries and each one of the entries corresponds to a column in the matrix of pages by a location of each one of the entries. The set of data structures is stored in the DSM component and in the persistent storage. Incorporated into each one of the plurality of entries in the recovery file is an indication if an associated column in the matrix of pages is assigned with the data structure of the set of data structures; and additionally incorporated into each one of the plurality of entries in the recovery file are identifying key properties of the data structure of the set of data structures.
1. A method for restoring distributed shared memory (DSM) data consistency within a recovery process from a failure of a node in a cluster of nodes by a processor device, comprising: organizing a DSM component as a matrix of pages, wherein a data structure of a set of data structures occupies a column in the matrix of pages; maintaining a recovery file in a persistent storage, wherein the recovery file consists of a plurality of entries and each one of the plurality of entries corresponds to a column in the matrix of pages by a location of each one of the plurality of entries; storing the set of data structures in the DSM component and in the persistent storage; incorporating into each one of the plurality of entries in the recovery file an indication if an associated column in the matrix of pages is assigned with the data structure of the set of data structures; and incorporating into each one of the plurality of entries in the recovery file identifying key properties of the data structure of the set of data structures and a specification of the location of the data structure in the persistent storage if the associated column in the matrix of pages is assigned.
2. ...

Publication date: 05-04-2018

HANDLING A VIRTUAL DATA MOVER (VDM) FAILOVER SITUATION BY PERFORMING A NETWORK INTERFACE CONTROL OPERATION THAT CONTROLS AVAILABILITY OF NETWORK INTERFACES PROVIDED BY A VDM

Number: US20180095851A1
Assignee:

A technique handles a VDM failover situation. The technique involves adjusting a configuration file on a first platform to indicate whether data managed by an initial VDM on that platform is being replicated to a second platform. The technique further involves, following a VDM failover event, creating a replacement VDM on the first platform to replace the initial VDM. The technique further involves, after the replacement VDM is created, performing an operation that controls interfaces provided by the replacement VDM. The operation enables the interfaces when the operation determines that the data managed by the initial VDM on the first platform was not being replicated to the second platform at the time of the event, and disables the interfaces when the operation determines that the data managed by the initial VDM on the first platform was being replicated to the second platform at the time of the event.
1. A method of handling a virtual data mover (VDM) failover situation, the method comprising: electronically adjusting a configuration file on a first physical data mover platform to indicate whether data managed by an initial VDM on the first physical data mover platform is being replicated from the first physical data mover platform to a second physical data mover platform; following a VDM failover event in which the initial VDM on the first physical data mover platform fails, electronically creating a replacement VDM on the first physical data mover platform that replaces the initial VDM; and after the replacement VDM is created, performing a network interface control operation that controls availability of network interfaces provided by the replacement VDM, the network interface control operation (i) enabling a set of network interfaces of the replacement VDM when the network interface control operation determines from the configuration file that the data managed by the initial VDM on the first physical data mover platform was not being replicated from the first ...
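The enable/disable decision for the replacement VDM's interfaces reduces to a single check against the configuration file, sketched here in Python. The dictionary stands in for the configuration file, and the key name `replicated` is hypothetical.

```python
def interfaces_enabled(config: dict) -> bool:
    """Decide whether the replacement VDM's network interfaces are enabled.

    `config` mimics the configuration file; the key "replicated" is a
    hypothetical name for the replication indicator described above.
    """
    # Interfaces stay disabled when the data was being replicated
    # to the second platform at the time of the failover event.
    return not config["replicated"]
```

The flag has to be recorded before the failover event, which is why the technique adjusts the configuration file ahead of time rather than querying replication state afterwards.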

Publication date: 19-03-2020

Systems and/or methods for intelligent and resilient failover for cloud computing environments

Number: US20200092404A1
Assignee: Software AG

A cloud computing system includes computing nodes that execute a shared application and/or service accessible by client computing devices over a network. A resilience multiplexer is configured to: receive signals (e.g., from a cloud controller, registry service, error handler, and/or failover service) indicative of potential problems with components of the system and/or network; identify a rule to be executed to determine how to respond to the potential problem, based on attributes of the received signal including which component generated it and what information is included in / otherwise associated with it, and other network-related data; execute the identified rule to determine whether a failover is or might be needed; if a failover is needed, selectively trigger a failover sequence; and if a failover only might be needed, initiate a resilience mode. In resilience mode, information regarding the potential problem is communicated to other components, without immediately initiating a failover sequence.
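The rule lookup performed by the resilience multiplexer might be pictured with the following Python sketch. The rule table, the signal source and kind strings, and the three outcome labels are all hypothetical; the abstract only says that a rule is chosen from attributes of the signal and that the outcome is either a failover or a resilience mode.

```python
# Hypothetical rule table: (signal source, signal kind) -> action.
RULES = {
    ("cloud_controller", "node_down"): "failover",
    ("error_handler", "timeout"): "resilience_mode",
}


def handle_signal(source: str, kind: str) -> str:
    """Look up the rule for a received signal and return the action to take.

    "failover" triggers the failover sequence; "resilience_mode" only
    informs other components; unmatched signals are ignored (an assumption).
    """
    return RULES.get((source, kind), "ignore")
```

Keeping "a failover might be needed" as a separate outcome lets the system share information about a potential problem without immediately starting a failover sequence.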

Publication date: 12-05-2022

Accelerating Segment Metadata Head Scans For Storage System Controller Failover

Number: US20220147365A1
Assignee: Pure Storage Inc

Accelerating segment metadata head scans for storage system controller failover, including: receiving, by a secondary storage unit corresponding to a primary storage unit, a request to store a data segment; storing the data segment and segment metadata at the head of the data segment; and storing, in a data structure, data indicating an erase block storing the segment metadata and indicating an offset in the erase block where the segment metadata is stored.
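The data structure that records where each segment's head metadata lives can be sketched as a small Python index. The class and method names and the use of string segment identifiers are assumptions for illustration, not details from the patent.

```python
class SegmentMetadataIndex:
    """Remembers the erase block and offset of each segment's head metadata."""

    def __init__(self) -> None:
        self._locations: dict[str, tuple[int, int]] = {}

    def record(self, segment_id: str, erase_block: int, offset: int) -> None:
        # Called when a data segment and its head metadata are stored.
        self._locations[segment_id] = (erase_block, offset)

    def locate(self, segment_id: str) -> tuple[int, int]:
        # Returns (erase block, offset) so metadata can be read directly.
        return self._locations[segment_id]
```

With such an index, a controller taking over after failover could read the recorded locations instead of scanning storage to find segment heads.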

Publication date: 28-03-2019

Storage unit for high performance computing system, storage network and methods

Number: US20190095294A1
Assignee: SEAGATE TECHNOLOGY LLC

There is disclosed a storage unit for high performance computing system, a storage network and a method of providing storage and of accessing storage. The storage unit includes an enclosure constructed and arranged to receive plural storage devices to provide high density, high capacity storage. The unit also includes a network connector and at least one integrated application controller constructed and arranged to run a scalable parallel file system for accessing data stored on the storage devices and providing server functionality to provide file access to a client via the network connector.

Publication date: 28-03-2019

METHOD AND SYSTEM FOR AUTOMATIC MAINTENANCE OF STANDBY DATABASES FOR NON-LOGGED WORKLOADS

Number: US20190095297A1
Assignee: ORACLE INTERNATIONAL CORPORATION

A computer program product, system, and computer implemented method for automatic maintenance of standby databases for non-logged workloads, the process comprising: maintaining a redo stream of redo records sent from a primary database to a standby database, identifying a change made at the primary database for which a redo record was not created, inserting a placeholder redo record into the redo stream corresponding to the change identified at the primary database for which the redo record was not created, sending, to the standby database, a copy of one or more data blocks corresponding to the change that is associated with the placeholder redo record, receiving the placeholder redo record from the redo stream, identifying the copy of the one or more data blocks sent from the primary database corresponding to the placeholder redo record, and applying the copy of one or more data blocks to update the standby database.
1. A computer implemented method for automatic maintenance of standby databases for a non-logged workload, the method comprising: at a primary database, performing: maintaining a redo stream of redo records to be sent from the primary database to a standby database; identifying a change made at the primary database for which a redo record was not created; inserting a placeholder redo record into the redo stream corresponding to the change identified at the primary database for which the redo record was not created; sending, to the standby database, a copy of one or more data blocks corresponding to the change that is associated with the placeholder redo record; and at the standby database, performing: receiving the placeholder redo record from the redo stream; identifying the copy of the one or more data blocks sent from the primary database corresponding to the placeholder redo record; and applying the copy of one or more data blocks to update the standby database.
2. The computer implemented method of claim 1, wherein the placeholder ...
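
A minimal sketch of the placeholder mechanism follows, assuming a toy model in which a database is a dictionary of block ids to values. The names `RedoRecord`, `apply_on_standby`, and `shipped_blocks` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RedoRecord:
    block_id: int
    new_value: Optional[str]   # None when the record is only a placeholder
    placeholder: bool = False  # True for a change that produced no redo


def apply_on_standby(
    standby: Dict[int, str],
    redo_stream: List[RedoRecord],
    shipped_blocks: Dict[int, str],
) -> None:
    """Apply redo in order; a placeholder is resolved from the shipped block copy."""
    for rec in redo_stream:
        if rec.placeholder:
            standby[rec.block_id] = shipped_blocks[rec.block_id]
        else:
            standby[rec.block_id] = rec.new_value


# Primary side (assumed scenario): block 2 was changed by a non-logged
# operation, so a placeholder is inserted and the block copy is sent separately.
stream = [
    RedoRecord(1, "a1"),
    RedoRecord(2, None, placeholder=True),
    RedoRecord(3, "c1"),
]
shipped = {2: "b-bulk-loaded"}

standby_db: Dict[int, str] = {1: "a0", 2: "b0", 3: "c0"}
apply_on_standby(standby_db, stream, shipped)
```

After the call, the standby holds the logged changes for blocks 1 and 3 and the shipped copy for block 2, in stream order.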

Publication date: 28-03-2019

PORTABLE COMPUTING SYSTEM AND PORTABLE COMPUTER FOR USE WITH SAME

Number: US20190095374A1
Inventor: Arnouse Michael
Assignee:

A computing system comprising a portable computer and a reader are disclosed. The portable computer is pocket-sized and comprises flash memory, and optionally a processor and a GPS chip. The reader includes a monitor, a keyboard with docking port and an optional processor and at least one input/output USB connector. A user cannot interact with the portable computer without the reader. The reader is a non-functioning “shell” without the portable computer, however, when they are connected the system becomes a fully functional personal computer. To log on, a user provides security information, for example, a password or biometrics, such as fingerprints. The credit card size and capabilities of the portable computer allows a user to easily carry virtually their entire computer in a pocket for use anywhere there is a reader. In addition, the portable computer provides security against unauthorized use, even if lost or stolen.
1. A portable computer comprising: a processor; a controller; at least one integrated circuit for storing data; a battery; a metal casing portion; a non-metal casing portion; a printed circuit board; at least one antenna; a thermally conductive coating; and at least one connector, wherein: the portable computer is no greater in size than 100 mm by 60 mm by 6 mm, the printed circuit board is enclosed within the metal casing portion and the non-metal casing portion, the thermally conductive coating is disposed on a surface of the processor and a surface of the metal casing portion, and the antenna is enclosed within the non-metal casing portion.
2. The portable computer of claim 1, further comprising a power management integrated circuit.
3. The portable computer of claim 1, further comprising a trusted platform module.
4. The portable computer of claim 1, further comprising a wireless communication interface, wherein the wireless communication interface supports a wireless communication protocol including at least one of wireless fidelity (WiFi), ...

Publication date: 14-04-2016

DETECTING HIGH AVAILABILITY READINESS OF A DISTRIBUTED COMPUTING SYSTEM

Number: US20160103720A1
Assignee:

Technology is disclosed for determining high availability readiness of a distributed computing system (“system”). A confidence measure (CM) can be computed for a particular controller in the system to determine whether a takeover by the particular controller from a first controller would be successful. The CM can be a percentage value. A CM of 0% indicates that a takeover would be a failure, which results in loss of access to data managed by the first controller. A CM of 100% indicates a successful takeover with no performance impact on the system. A CM between 0% and 100% indicates a successful takeover but with a performance impact. The CM can be computed based on events occurring in the system, e.g., veto and non-veto events. The CM is computed as a function of various weights and/or indices associated with the veto events and/or non-veto events.
1. A computer-implemented method, comprising: receiving a list of multiple events that have occurred in a distributed computing system over a specified period, the events related to a first computer node and a second computer node of the distributed computing system, the first computer node configured to manage a data access request received from a client computer node for data stored at a storage system associated with the first computer node, the second computer node configured to take over from the first computer node in case the first computer node becomes unavailable; determining, based on an event classification policy, a set of non-veto events and a set of veto events related to the second computer node from the events; retrieving, based on the event classification policy, a severity index and a compliance factor for each event of the set of non-veto events; and computing a confidence measure of the second computer node as a function of the set of veto events and the severity index and the compliance factor of the set of non-veto events.
2. The computer-implemented method of claim 1, wherein the confidence measure ...

Publication date: 28-03-2019

SCALABLE BYZANTINE FAULT-TOLERANT PROTOCOL WITH PARTIAL TEE SUPPORT

Number: US20190097790A1
Inventors: Karame Ghassan, Li Wenting
Assignee:

A method for establishing consensus between a plurality of distributed nodes connected via a data communication network includes preparing a set of random numbers, wherein each of the random numbers is a share of an initial secret, wherein each share of the initial secret corresponds to one of a plurality of active nodes; encrypting, in order to generate encrypted shares of the initial secret, each respective share of the initial secret with a shared key corresponding to respective one of the plurality of active nodes to which the respective share corresponds; applying a bitwise xor function to the set of random numbers to provide the initial secret; and binding the initial secret to a last counter value to provide a commitment and a signature for the last counter. The method includes generating shares of a second and of a plurality of subsequent additional secrets by iteratively applying a hash function.
1. A method for establishing consensus between a plurality of distributed nodes connected via a data communication network, the plurality of distributed nodes including a plurality of active nodes, the plurality of active nodes including a primary node, each of the plurality of distributed nodes including a processor and computer readable media, the method comprising: preparing a set of random numbers, wherein each of the random numbers is a share of an initial secret, wherein each share of the initial secret corresponds to one of the plurality of active nodes; encrypting, in order to generate encrypted shares of the initial secret, each respective share of the initial secret with a shared key corresponding to a respective one of the plurality of active nodes to which the respective share corresponds; applying a bitwise xor function to the set of random numbers to provide the initial secret; binding the initial secret to a last counter value to provide a commitment and a signature for the last counter; generating shares of a second and of a plurality of subsequent
...
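
The XOR-based share construction in this abstract can be sketched in a few lines. The function names, the 32-byte share size, and the use of SHA-256 are assumptions; the sketch also assumes the hash is applied to each previous share to derive the next round of shares, which the truncated text does not confirm. Per-node encryption, the commitment, and the signature are omitted.

```python
import hashlib
import secrets
from functools import reduce
from typing import List


def make_shares(n_nodes: int, size: int = 32) -> List[bytes]:
    """Prepare one random share per active node."""
    return [secrets.token_bytes(size) for _ in range(n_nodes)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def combine(shares: List[bytes]) -> bytes:
    """The secret is the bitwise XOR of all shares."""
    return reduce(xor_bytes, shares)


def next_shares(shares: List[bytes]) -> List[bytes]:
    """Assumed derivation: hash each share to obtain the next round's shares."""
    return [hashlib.sha256(s).digest() for s in shares]


shares = make_shares(4)
secret = combine(shares)
```

Because XOR is its own inverse, any single missing share leaves the secret undetermined: combining three of the four shares and then XOR-ing in the fourth is the only way to reach `secret`.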

Publication date: 26-03-2020

RELAY SYSTEM

Number: US20200097375A1
Inventor: Eguchi Hiroyuki
Assignee: FUJI XEROX CO., LTD.

A relay system includes a detector that detects occurrence of failure related to a processor that reads and processes requests from a memory, a monitor that monitors statuses of the requests stored in the memory and gives an instruction to increase or reduce a computational resource depending on the statuses of the requests stored in the memory, and a controller that performs control in response to detection of the failure by the detector so that a status of the memory is set to a state in which the monitor determines that addition of the computational resource is unnecessary.
1. A relay system, comprising: a detector that detects occurrence of failure related to a processor that reads and processes requests from a memory; a monitor that monitors statuses of the requests stored in the memory and gives an instruction to increase or reduce a computational resource depending on the statuses of the requests stored in the memory; and a controller that performs control in response to detection of the failure by the detector so that a status of the memory is set to a state in which the monitor determines that addition of the computational resource is unnecessary.
2. The relay system according to claim 1, wherein the controller performs control so that a status of a request that is being processed in association with the processor related to the detected failure is set to a state in which the processor is not able to acquire the request that is being processed from the memory.
3. The relay system according to claim 1, wherein the controller performs control so that a status of a request that is stored in the memory and is waiting for execution of the processor in which the failure has been detected is set to a state in which the request is not acquirable from the memory.
4. The relay system according to claim 2, wherein the controller performs control so that a status of a request that is stored in the memory and is waiting for execution of the processor in which the failure ...

Publication date: 04-04-2019

FAULT-TOLERANT STREAM PROCESSING

Number: US20190102266A1
Assignee: ORACLE INTERNATIONAL CORPORATION

Techniques for providing fault-tolerant stream processing. An exemplary technique includes writing primary output events to a primary target and secondary output events to one or more secondary targets, where the primary output events are written by a primary server and the secondary output events are written by one or more secondary servers. The technique further includes receiving an election of a new primary server from a synchronization system upon a failure of the primary server, where the new primary server is elected from the one or more secondary servers. The technique further includes determining, by the new primary server, the primary output events that failed to be written to the primary target because of the failure of the primary server, and writing, by the new primary server, the failed primary output events to the primary target using the secondary output events read from the one or more secondary targets.
1. A method, comprising: processing, at a data processing system, input events to generate primary output events and secondary output events, wherein the primary output events are generated by a primary server of the data processing system and the secondary output events are generated by one or more secondary servers of the data processing system; writing, by the data processing system, the primary output events to a primary target and the secondary output events to one or more secondary targets, wherein the primary output events are written by the primary server and the secondary output events are written by the one or more secondary servers; receiving, at the data processing system, an election of a new primary server from a synchronization system upon a failure of the primary server, wherein the new primary server is elected from the one or more secondary servers; reading, by the new primary server of the data processing system, the secondary output events from the one or more secondary targets; determining, by the new primary server of the data ...
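
The recovery step performed by the newly elected primary can be sketched as a difference between the two targets. This sketch assumes every output event carries a sequence number under a `seq` key, which is not stated in the abstract; `recover_missing` and `take_over` are hypothetical names, and the targets are modeled as plain lists.

```python
from typing import Dict, List

Event = Dict[str, object]  # assumed shape: {"seq": int, "value": ...}


def recover_missing(primary_target: List[Event], secondary_target: List[Event]) -> List[Event]:
    """Return events present in the secondary target but absent from the primary target."""
    written = {e["seq"] for e in primary_target}
    return [e for e in secondary_target if e["seq"] not in written]


def take_over(primary_target: List[Event], secondary_target: List[Event]) -> None:
    """Run by the new primary: write the failed events to the primary target in order."""
    missing = recover_missing(primary_target, secondary_target)
    for event in sorted(missing, key=lambda e: e["seq"]):
        primary_target.append(event)


# The old primary failed after writing events 1 and 2; the secondary wrote 1 to 4.
primary = [{"seq": 1, "value": "a"}, {"seq": 2, "value": "b"}]
secondary = [
    {"seq": 1, "value": "a"},
    {"seq": 2, "value": "b"},
    {"seq": 3, "value": "c"},
    {"seq": 4, "value": "d"},
]
take_over(primary, secondary)
```

Running `take_over` a second time writes nothing, since no secondary event is missing from the primary target any longer.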

Publication date: 04-04-2019

SESSION TEMPLATES

Number: US20190102267A1
Assignee:

Techniques are disclosed herein for identifying, recording and restoring the state of a database session and various aspects thereof. A session template data structure is generated that includes session attribute values describing various aspects of the session that is established between a client system and a database management system (DBMS) and enables the client system to issue to the DBMS commands for execution. Based on the session attribute values, the DBMS may generate a template identifier corresponding to the session template data structure. The template identifier may be stored in association with the session state that it partially (or wholly) represents. In an embodiment, when another state of a session is captured, if the template identifier for the state is the same, then rather than storing the attribute-value pairs for the other state, the template identifier is further associated with the other state. In an embodiment, a request boundary is detected where the session is known to be at a recoverable point. If recovery of the session is needed, the session state is restored, and replay of commands starts from this point. Each command replayed is verified to produce the same session state as it produced at original execution. If the session is determined to be at a safe point, then all the commands recorded for replay prior to the safe point may be deleted.
1. A computer-implemented method comprising: generating a first session template data structure that includes a first plurality of session attribute values, each of the first plurality of session attribute values describing an aspect of a first state of a first session that is established between a client system and a database management system (DBMS), and through which one or more first commands by the client system are issued to the database management system for a first execution; based at least in part on the first plurality of session attribute values, generating a first template identifier ...
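
One way to picture the template identifier is as a hash over the canonicalized attribute values, so that two captured states with identical attributes share one stored template. The hashing scheme (SHA-256 over sorted JSON) and the names `template_id`, `capture`, `templates`, and `state_to_template` are assumptions for illustration; the abstract does not say how the identifier is derived.

```python
import hashlib
import json
from typing import Dict

templates: Dict[str, Dict[str, str]] = {}  # template id -> attribute values
state_to_template: Dict[str, str] = {}     # captured state id -> template id


def template_id(attrs: Dict[str, str]) -> str:
    """Derive an identifier from the attribute values, independent of key order."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def capture(state_id: str, attrs: Dict[str, str]) -> str:
    """Store the attribute values once per template; later states only reference it."""
    tid = template_id(attrs)
    templates.setdefault(tid, dict(attrs))
    state_to_template[state_id] = tid
    return tid


t1 = capture("s1", {"lang": "en", "tz": "UTC"})
t2 = capture("s2", {"tz": "UTC", "lang": "en"})  # same values, so same template
t3 = capture("s3", {"lang": "de", "tz": "UTC"})  # different values, so new template
```

Here three states are captured but only two attribute sets are stored, which mirrors the storage saving the abstract describes.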

Publication date: 23-04-2015

SWITCH PROVIDED FAILOVER

Number: US20150113315A1
Assignee:

A system is configured to: transmit requests to a first device and a second device; receive a first reply from the first device in response to one of the requests; determine an address of the first device based on the first reply; assign a first port to a first network when the first device is a first one of one or more devices that replied to the requests and have a same address as the first device; receive a second reply from the second device in response to another one of the requests; assign a second port to a second network when the address of the second device is the same as the address of the first device; and reassign the second port, from the second network, to the first network when a failure of the first device occurs.
1-20. (canceled)
21. A method comprising: sending, by a network device, a first request to a first device; sending, by the network device, a second request to a second device; receiving, by the network device and from the first device, a first reply to the first request; determining, by the network device, that the network device received the first reply from the first device before receiving a second reply to the second request from the second device; and identifying, by the network device, the first device as a master device based on determining that the network device received the first reply from the first device before receiving the second reply to the second request from the second device.
22. The method of claim 21, further comprising: receiving, by the network device and from the second device, the second reply to the second request; determining, by the network device and after receiving the second reply, that the first device is acting as the master device; and identifying, by the network device, the second device as a slave device based on the second reply and based on determining that the first device is acting as the master device.
23. The method of claim 21, further comprising: receiving traffic; determining an address based on the ...
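
The first-reply assignment and the later port reassignment can be modeled with a small sketch. The network labels `"primary"`, `"standby"`, and `"down"`, the device names, and the functions `assign_ports` and `on_failure` are all hypothetical; the sketch also assumes each device sits behind exactly one port and that all devices share the same address.

```python
from typing import Dict, List


def assign_ports(reply_order: List[str]) -> Dict[str, str]:
    """Place the first replier's port on the primary network, later repliers on standby."""
    networks: Dict[str, str] = {}
    for device in reply_order:
        if device not in networks:
            networks[device] = "primary" if not networks else "standby"
    return networks


def on_failure(networks: Dict[str, str], failed_device: str) -> None:
    """If the primary device fails, move one standby port onto the primary network."""
    if networks.get(failed_device) != "primary":
        return
    networks[failed_device] = "down"
    for device, net in networks.items():
        if net == "standby":
            networks[device] = "primary"  # values change, keys do not
            break


nets = assign_ports(["dev_a", "dev_b"])
before = dict(nets)
on_failure(nets, "dev_a")
```

In this model, traffic addressed to the shared address keeps flowing to whichever port is on the primary network, so the switch alone carries out the failover.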

Publication date: 02-06-2022

MAINTAINING COMMUNICATIONS IN A FAILOVER INSTANCE VIA NETWORK ADDRESS TRANSLATION

Number: US20220174036A1
Assignee:

Described herein are systems, methods, and software to enhance failover operations in a cloud computing environment. In one implementation, a method of operating a first service instance in a cloud computing environment includes obtaining a communication from a computing asset, wherein the communication comprises a first destination address. The method further provides replacing the first destination address with a second destination address in the communication, wherein the second destination address comprises a shared address for failover from a second service instance. After replacing the address, the method determines whether the communication is permitted based on the second destination address, and if permitted, processes the communication in accordance with a service executing on the service instance.
1. A method comprising: maintaining session information for one or more sessions established with a primary service instance, wherein the maintained session information indicates a first network address that is shared between the primary service instance and a backup service instance as a destination address for the one or more sessions; periodically providing the maintained session information to the backup service instance; identifying a failover condition for the primary service instance; and transitioning the one or more sessions from the primary service instance to the backup service instance.
2. The method of claim 1, wherein transitioning the one or more sessions comprises, based on receiving a first packet from a source computing asset, determining if the first packet corresponds to any of the one or more sessions based on the maintained session information.
3. The method of further comprising translating a destination address of the first packet to the first network address shared between the primary and backup service instances.
4. The method of claim 3, wherein determining if the first packet corresponds to any of the one or more sessions ...

Publication date: 29-04-2021

METHODS AND APPARATUS FOR DETECTING, ELIMINATING AND/OR MITIGATING SPLIT BRAIN OCCURRENCES IN HIGH AVAILABILITY SYSTEMS

Number: US20210124656A1
Assignee:

The present invention relates to communications methods and apparatus for detecting and/or mitigating split brain occurrences in high availability systems. A split brain condition is a condition in which a standby processing node and another processing node of a cluster of processing nodes included in a high availability system are both operating in an active mode of operation at the same time. An exemplary method embodiment of operating a high availability system including a plurality of processing nodes includes the steps of: determining at a standby processing node that a failure condition exists, said standby processing node being one of the cluster of processing nodes; switching the standby processing node from a standby mode of operation to an active mode of operation in response to determining that a failure condition exists; and determining whether the high availability system is experiencing a split brain condition.
1. A method of operating a high availability system including a cluster of processing nodes comprising: determining by a standby processing node that a failure condition exists, said standby processing node being one of the cluster of processing nodes; in response to determining by the standby processing node that a failure condition exists, making a determination, by the standby processing node, whether switching from a standby mode of operation to an active mode of operation, will result in a split brain condition for the high availability system; and when the determination is that switching from a standby mode of operation to an active mode of operation will result in a split brain condition for the high availability system, refraining, by the standby processing node, from switching from a standby mode of operation to an active mode of operation.
2. The method of claim 1, wherein said split brain condition is a condition wherein both the standby processing node and an active processing node of the cluster of processing nodes are both ...