## UDC 004.8:004.7

## FAULT-TOLERANCE MODELS FOR AI CHIPS OF INTELLECTUAL SYSTEMS

Anatoli Kovalenko Ph.D., Associate Professor

National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" Kyiv, Ukraine

The fault tolerance of a system is its ability to keep performing its functions when some of its elements fail. In a fault-tolerant system, the failure of one or more elements leads to only a partial decrease in the quality of work. Fault tolerance is ensured by introducing redundancy, for example through backup methods or parallel computing.

Hardware and software resources in fault-tolerant systems are used most rationally when methods and tools of system diagnostics are applied.

AI chips increasingly focus on implementing neural computing at low power and cost. Intelligent sensing, automation, and edge computing applications have been the market drivers for AI chips [2]. Several generations of AI chips are expected to appear in the near future. Artificial general intelligence (AGI) systems are expected to be deployed in all areas of human existence, so AGI chips could solve many urgent problems.

Figure 1 shows the four stages of AI chip development. The current stage, AI 1.0, is focused on specific applications. In the AI 2.0 stage, the focus will be on building general intelligence chips that could further assist humans in solving difficult social and ethical questions. The AI 3.0 stage could witness a collective form of AGI, in which intelligence expands in applications and complexity beyond what humans currently imagine. The AI 4.0 stage and beyond would see the integration of AGI chips with the human brain, ultimately allowing humans to access all forms of electronic signals and senses available to machines.

Thus, we can expect a radical change in modern digital circuit design methods [3]. A significant number of current design methods will give way to methods for creating self-organizing circuits with a whole range of functions for adapting to operating modes and conditions of use. Smart chips based on neural networks can radically change the functions of modern CPUs and GPUs.

Some approaches to self-surviving technical systems with long lifetimes, based on self-diagnosable system models, are considered. AI models for distributed computing resources may thus be an efficient mechanism for system reconfiguration under multiple failures. The fault information exchanged between interacting cells in such systems may be treated as syndromes [1].



Fig. 1. Development stages of AI chips [2]

The structure of an AI chip system may be defined by means of a diagnostic graph G(V,E), where V is the set of autonomous system units (elements, components, processes, nodes, vertices) and E is the set of directed links $(v_i, v_j)$, $v_i, v_j \in V$, between these units [1].
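The diagnostic graph G(V,E) described above can be held in a very small data structure. The following is only a minimal sketch: the class name `DiagnosticGraph`, its methods, and the integer unit labels are hypothetical and are not taken from [1].

```python
from dataclasses import dataclass, field


@dataclass
class DiagnosticGraph:
    """Directed diagnostic graph G(V, E) (illustrative names, not from [1])."""
    units: set = field(default_factory=set)   # V: autonomous units
    links: set = field(default_factory=set)   # E: directed links (v_i, v_j)

    def add_link(self, v_i, v_j):
        """Add the directed link (v_i, v_j) and register both units in V."""
        self.units.update((v_i, v_j))
        self.links.add((v_i, v_j))

    def successors(self, v_i):
        """Units v_j for which the link (v_i, v_j) belongs to E."""
        return sorted(b for (a, b) in self.links if a == v_i)


# Assumed example: three units connected in a directed ring.
g = DiagnosticGraph()
for link in [(0, 1), (1, 2), (2, 0)]:
    g.add_link(*link)
```

Storing E as a set of ordered pairs keeps the sketch close to the notation $(v_i, v_j)$ used in the text.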

Every graph G(V,E) may be decomposed into regular subgraphs $G_j$ (structures $L_j$) such that

$$G_{j} = (V_{j}, E_{j}), \quad n_{j} = |V_{j}|, \quad V_{j} \subseteq V, \quad E_{j} \subseteq E.$$
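The subgraph conditions $V_j \subseteq V$ and $E_j \subseteq E$ can be checked directly with set operations. The sketch below is illustrative only: the function name `is_subgraph` and the example sets are assumptions, and the extra requirement that every link of $E_j$ joins units of $V_j$ is added here because $G_j$ must itself be a graph.

```python
def is_subgraph(V_j, E_j, V, E):
    """Return True if G_j = (V_j, E_j) is a subgraph of G = (V, E)."""
    return (
        V_j <= V                      # V_j is a subset of V
        and E_j <= E                  # E_j is a subset of E
        # assumed closure condition: links of E_j stay inside V_j
        and all(a in V_j and b in V_j for a, b in E_j)
    )


# Assumed example: a four-unit ring and a three-unit chain cut out of it.
V = {0, 1, 2, 3}
E = {(0, 1), (1, 2), (2, 3), (3, 0)}
V_1 = {0, 1, 2}
E_1 = {(0, 1), (1, 2)}
n_1 = len(V_1)                        # n_j = |V_j|
```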

Chain, star and tree structures are the simplest types of such structures. Every unit in a chain structure has exactly two neighbors, except the two end units, which have one each. A star structure has a single central unit and a set of other units or chains connected to this central unit.
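The neighbor counts that distinguish a chain from a star can be illustrated with a short sketch. The function names, the integer unit labels, and the chosen link directions (left to right in the chain, from the center outward in the star) are assumptions made only for this example.

```python
def chain_links(n):
    """Directed links of a chain 0 -> 1 -> ... -> n-1 (assumed direction)."""
    return {(i, i + 1) for i in range(n - 1)}


def star_links(n):
    """Directed links from the central unit 0 to units 1..n-1 (assumed direction)."""
    return {(0, i) for i in range(1, n)}


def neighbor_counts(links):
    """Number of distinct neighbors of each unit, ignoring link direction."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {v: len(s) for v, s in neighbors.items()}


chain = neighbor_counts(chain_links(5))   # end units 0 and 4 have one neighbor
star = neighbor_counts(star_links(5))     # central unit 0 has four neighbors
```

In the five-unit chain only the two end units have a single neighbor, while in the five-unit star every unit except the center has a single neighbor.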

The diagnostic syndrome is defined over the links of a structure as

$$A_{i}^{i} = \left\{ a_{xy} \mid \exists (v_{x}, v_{y}) \in E_{i} \right\}$$

For every structure $L_j$ there are syndrome-compatible sets (SCS) of unit states, that is, sets of unit states that do not contradict the given syndrome. The number of SCS for a given $L_i$ and system syndrome $A_i^i$ can be defined as $N(L_i)_i$.
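For small structures the number of SCS can be obtained by exhaustive enumeration. The sketch below is not the method of [1]. It assumes a PMC-style test model that the text does not state: $a_{xy} = 1$ means unit $v_x$ reports $v_y$ as faulty, a fault-free unit always reports correctly, and a faulty unit may report anything. All function names are hypothetical.

```python
from itertools import product


def is_compatible(faulty, syndrome):
    """Check one assignment of unit states against a syndrome.

    faulty[v] is True if unit v is assumed faulty; syndrome maps a link
    (x, y) to a_xy in {0, 1}. Under the assumed PMC-style model only
    fault-free testers constrain the result.
    """
    for (x, y), a_xy in syndrome.items():
        if not faulty[x] and a_xy != int(faulty[y]):
            return False
    return True


def count_scs(units, syndrome):
    """Count assignments of unit states that do not contradict the syndrome."""
    units = sorted(units)
    count = 0
    for bits in product([False, True], repeat=len(units)):
        if is_compatible(dict(zip(units, bits)), syndrome):
            count += 1
    return count


# Assumed example: chain 0 -> 1 -> 2 with a_01 = 0 and a_12 = 1.
syndrome = {(0, 1): 0, (1, 2): 1}
n_scs = count_scs({0, 1, 2}, syndrome)   # 4 of the 8 assignments are compatible
```

The enumeration visits all $2^{n}$ assignments, so it is suitable only for illustrating the definition on small structures.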

<sup>[1]</sup> Kovalenko A.E. Rozpodileni informacijni systemy [Distributed information systems]. Kyiv: NTUU «KPI», 2008. 244 p. (in Ukrainian)

<sup>[2]</sup> The Why, What and How of Artificial General Intelligence Chip Development. arXiv:2012.06338v2 [cs.LG], 30 Mar 2021. URL: https://arxiv.org/pdf/2012.06338

<sup>[3]</sup> Kovalenko A. Kompjuterna schemotechnika i architektura kompjuteriv [Computer circuit engineering and computer architecture. Manual]. Kyiv: NTUU «KPI», 2016. (in Ukrainian) URL: https://ela.kpi.ua/handle/123456789/16577