Learnable Wavelet Scattering Networks: Applications to Fault Diagnosis of Analog Circuits and Rotating Machinery

Varun Khemani *, Michael H. Azarian and Michael G. Pecht

Center for Advanced Life Cycle Engineering (CALCE), University of Maryland, College Park, MD 20742, USA; mazarian@umd.edu (M.H.A.); pecht@umd.edu (M.G.P.)
* Correspondence: vkheman@umd.edu

Abstract: Analog circuits are a critical part of industrial electronics and systems. Estimates in the literature show that, even though analog circuits comprise less than 20% of all circuits, they are responsible for more than 80% of faults. Hence, analog circuit fault diagnosis and isolation can be a valuable means of ensuring the reliability of circuits. This paper introduces a novel technique of learning time–frequency representations, using learnable wavelet scattering networks, for the fault diagnosis of circuits and rotating machinery. Wavelet scattering networks, which are fixed time–frequency representations based on existing wavelets, are modified to be learnable so that they can learn features that are optimal for fault diagnosis. The learnable wavelet scattering networks are developed using the genetic algorithm-based optimization of second-generation wavelet transform operators. The simulation and experimental results for the diagnosis of analog circuit faults demonstrate that the developed diagnosis scheme achieves greater fault diagnosis accuracy than other methods in the literature, even while considering a larger number of fault classes. The performance of the diagnosis scheme on benchmark datasets of bearing faults and gear faults shows that the developed method generalizes well to fault diagnosis in multiple domains and has good transfer learning performance, too.

Keywords: wavelet scattering networks; analog circuits; rotating machinery; fault diagnosis; scattering networks; fault isolation; second-generation wavelet transform

1. Introduction

Electronic circuits are ubiquitous in our everyday lives, in applications ranging from the commercial domain to the safety-critical domain. As a result, unforeseen circuit failures can have enormous consequences for the safety and financial well-being of their users and producers [1,2]. Analog circuit failures can be attributed to interconnect failures or component faults, which are associated with either parametric drift (soft faults) or short circuit/open circuit (hard faults) [3]. Analog circuits have become increasingly complex and, consequently, fault diagnosis is increasingly difficult, due to: (a) component tolerances, (b) interactions among components, (c) inadequate accessible measurement nodes; and (d) the inherent non-linearity in the behavior of analog circuits. Compared to digital circuits, analog circuits are more susceptible to interference and have fewer measurement nodes.
Interestingly, even though analog circuits account for less than 20% of all circuits, they are responsible for more than 80% of circuit faults [4,5]. Therefore, the fault diagnosis of analog circuits has become a highly important research area in recent years.

There are two broad categories of fault diagnosis approaches for circuits: analytical methods and data-driven methods. Circuit transfer function equations are required to apply analytical methods [6]. If these equations are unavailable, they can be determined using design principles or parameter identification techniques [7], and fault diagnosis is then achieved by exposing the circuit to a test stimulus and using the response to estimate the circuit parameters. This technique is suitable for linear analog circuits but is not feasible for nonlinear analog circuits because of the complexity involved [8].

Data-driven methods [9–12] require data obtained under faulty conditions to be available either through testing, operation, or simulation, such that a comparison can be made to data obtained under healthy conditions for fault diagnosis. Features of the data are used for this comparison and can be time domain, frequency domain, or time–frequency domain. Various machine learning approaches such as neural networks, support vector machines, the Naïve Bayes classifier, etc., have been used for fault diagnosis under the broad umbrella of data-driven methods. Neural-network-based fault-diagnosis approaches [13,14] have included, for feature generation: kurtosis and entropy [15], wavelet transforms [16], and fractional wavelet transforms [17]; and for dimensionality reduction: kernel PCA (kPCA) [16,17]. Support vector machine (SVM)-based [18] fault-diagnosis approaches have further included, for feature generation: the fractional Fourier transform [19], the cross-wavelet transform [20,21], deep belief networks (DBN) [22,23], and empirical mode decomposition [24]; for dimensionality reduction: parametric t-SNE [20] and principal component analysis [21]; and for SVM hyperparameter optimization: the double-chains quantum genetic algorithm [24], the fruit fly algorithm [25], the barnacles mating optimizer algorithm [26], and the firefly algorithm [27]. Naïve-Bayes-classifier-based [28] fault-diagnosis approaches include, for feature generation: the cross-wavelet transform [29]; and for dimensionality reduction: bilateral 2D linear discriminant analysis.

The standard approach that the vast majority of these methods follow is to extract features and apply a dimensionality reduction algorithm to obtain a lower-dimensional feature set, which is then fed to a classification algorithm. Extracting features informative for fault diagnosis requires technical expertise, which restricts its application as a generalized method. Recently, techniques have been proposed involving the direct application of deep learning methods for fault diagnosis. These techniques use input data to learn features autonomously through a multi-layered neural network. This avoids the need for manual feature extraction and feature selection.
For example, different 2D representations [30,31] have been developed for circuit outputs for use with state-of-the-art deep learning networks such as ResNet50 [32] to achieve fault diagnosis. However, the creation of an optimal custom deep learning network structure for the problem at hand requires subject matter expertise and extensive trial-and-error [33]. Inspired by wavelet scattering theory [34] and the second-generation wavelet transform [35], we propose a novel technique that does not need to be optimized for structure and learns wavelet filters, instead of random filters, from the data. Hence, it overcomes the shortcomings of deep learning networks.

The remainder of the paper is organized as follows: Section 2 presents a theoretical background of the techniques involved in the approach. Section 3 details the developed fault diagnosis methodology. Section 4 details the application of the approach to the fault diagnosis of two circuits, a bearing dataset, and a gear dataset. The conclusions follow in Section 5.

2. Theoretical Background

As mentioned earlier, in this paper, time–frequency representations are learnt from the circuit outputs for fault diagnosis using learnable wavelet scattering networks (LWSNs). This involves modifying wavelet scattering networks, which are fixed time–frequency representations based on existing wavelets, such that they can learn features that are optimal for fault diagnosis. Learnable wavelet scattering networks are developed using the genetic-algorithm-based optimization of second-generation wavelet transform operators. Support vector machines (SVMs) are used as classifiers for the features learned by the LWSN. In the following subsections, we review the basics of a wavelet transform, a wavelet scattering network, a genetic algorithm, and a support vector machine, and introduce the concept of learnable wavelet scattering networks.

2.1. Wavelet Transform

A wavelet transform is a collection of bandpass filters with progressively broader bandwidths at higher frequencies. A wavelet is a time-limited waveform that has a non-zero norm and a zero average value. Often, signals are piecewise smooth but have momentary transients; for example, edges in images or transients caused by rapid changes in economic conditions in financial time series. The Fourier basis is not suited to the sparse representation of these signals, as its sinusoids have infinite duration and would require sine waves of various frequencies for representation. Wavelets, being irregular and of limited time, require the break-up of a signal into a limited number of variations of the original wavelet,

\psi_{u,s}(t) = \frac{1}{\sqrt{s}} \, \psi\left(\frac{t-u}{s}\right)

The scale parameter s is inversely proportional to the frequency. A small scale s leads to a compressed wavelet, which is ideal for high-frequency signals with rapidly changing details. A long scale s leads to a stretched wavelet, which is ideal for slowly changing signals with coarse features, i.e., a low-frequency signal. This increases the flexibility of the time–frequency analysis. The wavelet transform (1) has scale-varying basis functions:

W f(u, s) = \int_{-\infty}^{\infty} f(t) \, \frac{1}{\sqrt{s}} \, \psi^{*}\left(\frac{t-u}{s}\right) dt \quad (1)

The continuous wavelet transform (CWT) (2) compares a signal with shifted and scaled versions of the mother wavelet:

\psi_{(u,s)} = \frac{1}{\sqrt{2^{j/v}}} \, \psi\left(\frac{t-m}{2^{j/v}}\right) \quad (2)

Here, v is the number of voices per octave, as it requires v intermediate scales to increase the scale by an octave. Higher values of v result in a finer discretization of the scale parameter s and an increase in the amount of computation required.
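As a concrete illustration of Equations (1) and (2), the following is a minimal NumPy sketch, written for this presentation rather than taken from the paper, of a Morlet-based CWT with v voices per octave; the wavelet choice, signal, and parameter names are our own assumptions.

```python
import numpy as np

def morlet(t, w0=6.0):
    # Complex Morlet mother wavelet (zero mean in practice, time-limited envelope)
    return np.pi**-0.25 * np.exp(1j * w0 * t) * np.exp(-t**2 / 2)

def cwt(x, num_octaves=5, v=8, fs=1.0):
    """Continuous wavelet transform with v voices per octave (cf. Eq. (2))."""
    n = len(x)
    t = (np.arange(n) - n // 2) / fs
    coeffs = []
    for j in range(num_octaves * v):
        s = 2.0 ** (j / v)                   # scale grows by one octave every v steps
        psi = morlet(t / s) / np.sqrt(s)     # scaled, normalized wavelet
        # Convolution evaluates the translation u over all positions at once
        coeffs.append(np.convolve(x, np.conj(psi[::-1]), mode="same"))
    return np.array(coeffs)                  # shape: (num_octaves * v, n)

# Example: a transient burst shows up at the fine (small) scales
fs = 1000.0
sig = np.sin(2 * np.pi * 50 * np.arange(1024) / fs)
sig[500:520] += 2.0
W = cwt(sig, num_octaves=5, v=8, fs=fs)
print(W.shape)  # (40, 1024)
```

Doubling v doubles the number of scales per octave, and hence the computation, which is the trade-off noted above.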
The discrete wavelet transform (DWT) has a much coarser discretization of the scale parameter, such that the number of voices per octave is always one. Depending on the translation parameter discretization, there are two broad types of DWT: the decimated DWT and the non-decimated DWT.

Decimated DWT (3): The translation parameter is 2^j m, where m is a non-negative integer and j is the scale. The decimated DWT is a sparse representation; hence, it is used for compression, denoising, signal transmission, etc.

\psi_{(u,s)} = \frac{1}{\sqrt{2^{j}}} \, \psi\left(\frac{t - 2^{j} m}{2^{j}}\right) \quad (3)

Non-decimated DWT (4): As in the case of the CWT, the translation parameter is independent of the scale parameter. The non-decimated DWT is a more redundant representation than the decimated DWT and is translation invariant.

\psi_{(u,s)} = \frac{1}{\sqrt{2^{j}}} \, \psi\left(\frac{t - m}{2^{j}}\right) \quad (4)

2.2. Wavelet Scattering Networks (WSNs)

In an effort to create interpretable networks that mimic human performance on vision and auditory tasks, some researchers use wavelet-transform-based methods, as wavelets are an approximation of the response of the human visual cortex and cochlea to stimuli [36]. For example, the wavelet transform renders a time domain signal to the time–frequency plane with a decreasing frequency resolution with increasing frequency, which is similar to the human cochlear response. Mallat [37] proposed WSNs (Figure 1) as a first step in understanding the success of Convolutional Neural Networks (CNNs). A wavelet scattering network computes a representation that preserves high-frequency information, is stable to deformations, and is translation invariant, which makes it a good feature extractor for classification. It is a cascade (tree) of convolutions between Gabor wavelet transforms (represented by ψ in Figure 1) and non-linear modulus and averaging operators (represented by φ in Figure 1), which "scatter" the signal along multiple paths. The number of paths at each node of the WSN is the scale of the wavelet transform (scale = 3 in Figure 1), and the number of layers of wavelet transforms is typically two. Discrete versions of WSNs were proposed by Wiatowski [36] and involve existing discrete orthogonal and biorthogonal wavelets.

Figure 1. Wavelet scattering network.
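The cascade in Figure 1 can be sketched by chaining wavelet convolution, modulus, and averaging. The snippet below is a rough, assumption-laden illustration of a zeroth- and first-order scattering path, not the paper's implementation; it reuses the cwt function and sig signal from the previous sketch and assumes a Gaussian averaging window phi.

```python
import numpy as np

def lowpass(n, width):
    # Gaussian averaging window phi (illustrative choice), width in samples
    t = np.arange(n) - n // 2
    phi = np.exp(-0.5 * (t / width) ** 2)
    return phi / phi.sum()

def scattering_order1(x, v=8, num_octaves=5, fs=1.0):
    """First-order scattering: S1 = |x * psi_j| * phi (modulus, then averaging)."""
    W = cwt(x, num_octaves=num_octaves, v=v, fs=fs)   # from the sketch above
    phi = lowpass(len(x), width=2 ** num_octaves)     # averaging at the coarsest scale
    S0 = np.convolve(x, phi, mode="same")             # zeroth order: averaged signal
    S1 = np.array([np.convolve(np.abs(w), phi, mode="same") for w in W])
    return S0, S1

S0, S1 = scattering_order1(sig, fs=fs)
print(S1.shape)  # one averaged modulus envelope per wavelet scale
```

A second layer would repeat the wavelet/modulus/averaging step on each first-order envelope, producing the two-layer tree of Figure 1.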
Unlike CNNs, a scattering network outputs coefficients at all layers, not just the last layer, and its filters are not learned from data but are predefined wavelets. Thus, the filters retain their physical meaning, which cannot be said of the filters that are developed through the learning process in a typical convolutional neural network. Operations in both CNNs and wavelet scattering networks can be represented as P(σ(x ⋆ w)), where x is the input signal, w is the filter, σ is the nonlinearity, and P is the pooling operator. In CNNs, the weights w are those of learned (randomly initialized) filters, while in WSNs, they are those of the fixed wavelet filters. Scattering networks provide state-of-the-art classification accuracies on simple to moderately complex datasets, such as textures in the CUReT dataset [34], musical genre and environmental sound classification [37], and images in the MNIST dataset [38]. However, for extremely complex datasets such as ImageNet [39] or the TIMIT Acoustic-Phonetic Continuous Speech Corpus [40], CNNs are still more accurate than scattering networks. A major reason for this is that scattering networks are fixed feature generators, while CNNs learn features from the data. As a result, an effort is made here to give the discrete wavelet scattering networks the learnability property, such that they can learn features from the data.

2.3. Learnable Wavelet Scattering Networks (LWSNs)

Instead of the fixed wavelet filters of the WSN, the wavelet filters in the LWSN are learnable using the second-generation wavelet transform (SGWT). The classical wavelet transform is realized through the translation and expansion of the mother wavelet function. This definition is very restrictive, so the SGWT does away with it. The lifting method [35], or lifting scheme (Figure 2), is a space domain wavelet construction method used to construct the SGWT filters, and it builds sparse representations by exploiting the correlation inherent in most real-world data. It consists of three basic steps:

1. Split: Let x(n) be an original signal. In this step, x(n) is divided into two subsets: the even subset x_e(n) and the odd subset x_o(n). The subsets are correlated according to the correlation structure of the original signal.

x_e(n) = x(2n) \quad (5)

x_o(n) = x(2n + 1) \quad (6)
2. Predict: The odd coefficients x_o(n) are predicted from the neighboring even coefficients x_e(n), and the prediction differences d(n) are defined as the detail signal,

d(n) = x_o(n) - P(x_e(n)) \quad (7)

where P = [p(1), \ldots, p(N)]^{T} is the prediction operator.

3. Update: A coarse approximation c(n) to the original signal is created by combining the even coefficients x_e(n) with a linear combination of the prediction differences,

c(n) = x_e(n) + U(d(n)) \quad (8)

where U = [u(1), \ldots, u(N)]^{T} is the update operator.

By iterating on the approximation signal c(n) using the three steps, the approximation and the detail signals are obtained at different levels (a code sketch of one lifting step follows Figure 2). The optimization of the lifting scheme's Update (U) and Predict (P) operators in the LWSN is carried out using the genetic algorithm (GA). The optimized Update (U) and Predict (P) operators are converted to the wavelet (ψ) and averaging (φ) operators using Claypoole's algorithm [35], such that the structure in Figure 1 can be used to learn time–frequency representations from the data.

Figure 2. Lifting scheme.
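As a concrete illustration of Equations (5)–(8), here is a minimal NumPy sketch, a simplification rather than the authors' implementation, of one lifting step with length-8 Predict and Update operators applied as FIR filters; the Haar-like initial coefficients are placeholders that the GA would evolve.

```python
import numpy as np

def lifting_step(x, p, u):
    """One split/predict/update lifting step (Eqs. (5)-(8)).

    p, u: coefficient vectors of the Predict and Update operators
    (length 8 in this paper; here treated as FIR filter taps).
    """
    xe, xo = x[0::2], x[1::2]                  # split into even/odd samples
    pred = np.convolve(xe, p, mode="same")     # predict odds from neighboring evens
    d = xo - pred                              # detail signal, Eq. (7)
    c = xe + np.convolve(d, u, mode="same")    # coarse approximation, Eq. (8)
    return c, d

# Placeholder initial operators; the GA evolves these 16 coefficients
p0 = np.zeros(8); p0[3] = 1.0
u0 = np.zeros(8); u0[3] = 0.5
x = np.random.randn(1024)
c, d = lifting_step(x, p0, u0)
print(c.shape, d.shape)  # (512,), (512,)
```

Iterating lifting_step on the returned approximation c yields the multi-level decomposition described above.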
Table 1 illustrates the differences between deep learning networks, wavelet scattering networks, and learnable wavelet scattering networks.

Table 1. Differences between networks.

| | Deep Learning Networks | Wavelet Scattering Networks | Learnable Wavelet Scattering Networks |
| Features | Learnt from data | Fixed wavelet type and coefficients (not learnt from data) | Wavelet type and coefficients learnt from data |
| Features output at | Last layer | Every layer | Every layer |
| Number of layers | Variable number of hidden (convolutional) layers | Two layers (typically) of fixed wavelets | Two layers of learned wavelets |
| Nonlinearity | Rectified Linear Unit/Hyperbolic Tangent, etc. | Modulus | Modulus |
| Pooling | Max/Averaging, etc. | Averaging | Averaging |
| Learning algorithm | Gradient descent and backpropagation | NA | Lifting method and genetic algorithm |
| Classifier | SoftMax | Any (e.g., SVM) | Any (e.g., SVM) |
| Architecture | Various architectures, e.g., ResNet [32], AlexNet [41], Recurrent Neural Network [42], etc. | See Figure 1 | See Figure 1 |

2.4. Genetic Algorithm (GA)

The GA [43] mimics the theory of natural selection. As is the case with evolution, a population consists of individuals which reproduce to create the next generation. This reproduction involves the combination of genetic material from parents to create an offspring. Each subsequent generation is created by parent individuals combining their genes. The selection of parents (individuals) to combine is based on their fitness, and the fitness of an individual is based on the fitness function. A total of 10% of the individuals with the best fitness move on to the next generation. This mechanism is called elitism, and the percentage of elite individuals can be changed. The remaining individuals take part in crossover, where the genes of two individuals (parents) are combined to create the genes of an individual of the next generation (child). Crossover is carried out until the required number of individuals (children) is created in the next generation. Analogous to mutation in natural reproduction, random changes are added to the genes of a fraction of the children created. This helps to avoid getting stuck in local minima during the optimization of the fitness function. The process repeats for the new generation and the subsequent generations until the predefined maximum number of generations is reached or there is no improvement in the fitness in consecutive generations. A minimal sketch of this loop is shown below.
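The following is a minimal, self-contained sketch of the GA loop described above, evolving a real-valued gene vector such as the 16 P and U coefficients used in Section 3; the fitness function, population operators, and numeric settings here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_minimize(fitness, n_genes=16, pop_size=100, elite_frac=0.10,
                mutation_rate=0.05, max_gens=200, patience=30):
    pop = rng.normal(size=(pop_size, n_genes))
    best, stall = np.inf, 0
    for gen in range(max_gens):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)                      # lower fitness is better
        if scores[order[0]] < best - 1e-9:
            best, stall = scores[order[0]], 0
        else:
            stall += 1
            if stall >= patience:                       # no improvement: stop
                break
        n_elite = int(elite_frac * pop_size)
        elite = pop[order[:n_elite]]                    # elitism: best 10% survive
        children = []
        while len(children) < pop_size - n_elite:
            pa, pb = pop[rng.choice(order[:pop_size // 2], size=2)]
            mask = rng.random(n_genes) < 0.5            # uniform crossover
            child = np.where(mask, pa, pb)
            mut = rng.random(n_genes) < mutation_rate   # random mutation
            child[mut] += rng.normal(scale=0.1, size=mut.sum())
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    return pop[np.argmin([fitness(ind) for ind in pop])], best

# Toy fitness: a sphere function standing in for the DB index of Section 3
sol, val = ga_minimize(lambda g: np.sum(g ** 2))
print(val)
```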
2.5. Support Vector Machine (SVM)

An SVM is based on the concept of finding decision planes or hyperplanes that maximize the separation between classes. If the classes are not linearly separable, a kernel trick is used to map the data into higher dimensions in an effort to separate them. To find the support vectors and hence construct an optimal hyperplane, the following optimization problem [44] is solved:

\min_{w} \Phi(w) = \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i\left(w^{T}\varphi(x_i) + b\right) \geq 1 - \xi_i \quad (9)

where C is the penalty parameter that guards against overfitting, and ξ_i are the slack variables introduced to handle inseparable data. The input data consist of x_i and y_i, which are the independent and the dependent variables (class labels), respectively. The kernel function φ transforms the input data x_i into higher dimensions.

3. Fault Diagnosis Methodology

The implementation of the diagnostic scheme is depicted in Figure 3. Firstly, a dataset of signals acquired while the circuit components are degrading is obtained via simulation or experimentation. This dataset is randomly split into a training dataset [XTrain, YTrain] and a testing dataset [XTest, YTest], where XTrain and XTest represent the circuit output signals in the training and the testing dataset, respectively, and YTrain and YTest represent the corresponding labels (degrading components). A subset of signals (30%), XTrain′, is randomly selected from the entire training dataset to be used with the GA. This is done to prevent overfitting to the training dataset and to reduce the time taken for GA optimization. The fitness function used is the Davies–Bouldin (DB) index [45], as it considers the ratio of within-class and between-class distances. As a result, the minimization of the DB index leads to maximum separation between the classes. The GA is used to optimize the Predict and Update operators of the SGWT, such that the DB index is minimized. The genes of each individual in the GA are the coefficients of the P and U operators that need to be optimized by the GA. The P and U operators are assumed to be of length 8; hence, the number of genes in each individual is 16. Other hyperparameters chosen for the GA include a population size of 100, an elite count of 10%, a crossover fraction of 90%, and a mutation rate of 5%; the stopping criterion of the GA is when there is no appreciable improvement in the fitness function for 30 consecutive generations. The feature space (XTrainMod) created by the LWSN, with the optimized P and U operators, is classified using the SVM as the classifier. Since SVM hyperparameter optimization is not the focus of this paper, the hyperparameter optimization was carried out using built-in MATLAB functions.

Figure 3. Fault diagnosis methodology.
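A compact sketch of the loop in Figure 3, assuming scikit-learn in place of the paper's built-in MATLAB functions: the GA minimizes the Davies–Bouldin index of the feature space, and the resulting features are classified with an SVM. The lwsn_features stand-in below reuses lifting_step and ga_minimize from the earlier sketches and is only a placeholder for the full LWSN.

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def lwsn_features(X, genes):
    # Illustrative stand-in: one lifting step per signal, simple statistics as features
    p, u = genes[:8], genes[8:]
    feats = []
    for x in X:
        c, d = lifting_step(x, p, u)      # from the earlier lifting sketch
        feats.append([c.mean(), c.std(), np.abs(d).mean(), d.std()])
    return np.array(feats)

def fitness(genes, X_sub, y_sub):
    # GA fitness: DB index of the learned feature space (lower = better separated)
    return davies_bouldin_score(lwsn_features(X_sub, genes), y_sub)

# X: (n_signals, n_samples) raw signals; y: integer fault labels (toy data here)
X = np.random.randn(200, 1024); y = np.repeat(np.arange(4), 50)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y)
genes, _ = ga_minimize(lambda g: fitness(g, X_tr[:60], y_tr[:60]))  # ~30% GA subset
clf = SVC().fit(lwsn_features(X_tr, genes), y_tr)
print(clf.score(lwsn_features(X_te, genes), y_te))
```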
4. Experiments and Results

The proposed method was verified using two analog circuits, the Sallen–Key bandpass filter circuit and the two-switch forward convertor circuit, and two rotating machinery datasets, the CWRU bearing faults dataset and the UoC gear faults dataset. Fault data for the circuits were generated by varying component values around their nominal values within SPICE; i.e., if the nominal value of a component is Y, the lower range and the upper range of the deviation constituting the parametric fault of the component are [0.25·Y, 0.9·Y] and [1.1·Y, 1.75·Y], respectively. When the component value is between 0.9·Y and 1.1·Y, it is considered to be within its tolerance range, i.e., a tolerance range of 10%. The training data were obtained by conducting 1000 SPICE simulations, where components are varied in the aforementioned ranges one at a time, while the other components are held at their nominal values. A sketch of this sampling procedure is shown below.
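For concreteness, a small sketch of drawing faulty component values from the parametric fault ranges described above; the component names and the commented simulator call are placeholders, not the actual SPICE interface used.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_faulty_value(nominal):
    """Draw a parametric fault value from [0.25Y, 0.9Y] or [1.1Y, 1.75Y]."""
    if rng.random() < 0.5:
        return rng.uniform(0.25 * nominal, 0.90 * nominal)   # drift below tolerance
    return rng.uniform(1.10 * nominal, 1.75 * nominal)       # drift above tolerance

nominals = {"R1": 1e3, "R2": 1e3, "R3": 2e3, "C1": 5e-9}     # e.g., Table 2 values
for comp, Y in nominals.items():
    faulty = sample_faulty_value(Y)
    # run_spice(circuit, overrides={comp: faulty})  # placeholder for the simulator call
    print(comp, faulty)
```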
4.1. Sallen–Key Bandpass Filter

The first circuit under test (CUT1) is the Sallen–Key bandpass filter (Figure 4), which is the most frequently studied circuit for analog circuit fault diagnosis. Unlike other papers that only consider the fault diagnosis of four of the seven passive components, we considered all seven passive components for fault diagnosis. The parametric fault ranges for the seven components considered are shown in Table 2. As can be seen from Table 2, we considered a single class for each component, as opposed to other papers in the literature that consider two classes for each component. The data for each class were split into training and testing datasets via a 75%–25% split. The LWSN was trained on the training data, and the testing accuracy of the LWSN is reported in Table 3, along with the testing accuracy of the original wavelet scattering network and the Gaussian–Bernoulli Deep Belief Network (GB-DBN)-based approach [22], which was used for comparison. This paper was used for comparison because it uses a deep-learning-based feature extractor, the DBN, along with an SVM for classification. Hence, it is conceptually similar to our paper. The confusion matrix for the fault diagnosis of the Sallen–Key bandpass filter using the LWSN is shown in Table 4.

Figure 4. Sallen–Key bandpass filter.

Table 2. Nominal values and parametric fault range of Sallen–Key bandpass filter components.

| Fault Class | Fault Code | Nominal Value | Faulty Range |
| Healthy | F0 | NA | NA |
| R1 | F1 | 1 kΩ | [0.25 k, 0.9 k] and [1.1 k, 1.75 k] |
| R2 | F2 | 1 kΩ | [0.25 k, 0.9 k] and [1.1 k, 1.75 k] |
| R3 | F3 | 2 kΩ | [0.5 k, 1.8 k] and [2.2 k, 3.5 k] |
| R4 | F4 | 2 kΩ | [0.5 k, 1.8 k] and [2.2 k, 3.5 k] |
| R5 | F5 | 2 kΩ | [0.5 k, 1.8 k] and [2.2 k, 3.5 k] |
| C1 | F6 | 5 nF | [1.25 n, 4.50 n] and [5.50 n, 8.75 n] |
| C2 | F7 | 5 nF | [1.25 n, 4.50 n] and [5.50 n, 8.75 n] |

Table 3. Fault diagnosis accuracy of LWSN and comparison with other methods.

| Circuit | Literature (GB-DBN) [22] | Wavelet Scattering Networks | Proposed Method (LWSN) |
| CUT1 | 99.12% | 90.01% | 99.72% |
| CUT2 | 84.34% | 82.45% | 92.93% |
| CUT2 (Experimental Validation) | NA | 81.12% | 90.71% |

Table 4. Confusion matrix summary for LWSN for the Sallen–Key bandpass filter (per-class accuracy and largest single misclassification, %).

| True Class | Correct | Largest Misclassification |
| F0 | 99.4 | 0.6 |
| F1 | 99.8 | 0.2 |
| F2 | 99.8 | 0.2 |
| F3 | 100 | – |
| F4 | 100 | – |
| F5 | 100 | – |
| F6 | 98.2 | 1.8 |
| F7 | 100 | – |
The Sallen–Key bandpass filter circuit involved seven fault types and one healthy class to detect and identify, which correspond to the 14 fault types for methods used in the literature. From Table 3, it can be seen that the proposed LWSN method achieved a marginal improvement of 0.7% in the fault diagnosis accuracy over comparable methods in the literature [18] and a 9% improvement in the fault diagnosis accuracy over a traditional WSN. As can be seen from the confusion matrix in Table 4, fault type F6, which corresponds to capacitor C1, was misdiagnosed most often; however, the diagnosis of the other fault types was almost perfect.

4.2. Two-Switch Forward Convertor

The second circuit under test (CUT2) is the two-switch forward convertor circuit (Figure 5). A forward convertor is a switching power supply circuit that is used for energy transfer when the two switches (transistors) are simultaneously turned on. The parametric fault ranges for the components considered after sensitivity analysis are shown in Table 5, along with the values used for experimental verification. As can be seen from Table 5, we considered a single class for each single fault (single component degradation), as opposed to other papers in the literature that consider two classes for each single fault. The advantage of doing so is that we could consider one class for every double fault (two components degrading simultaneously), as can be seen from fault codes F14 and F15. If we were to consider two classes for each single fault, we would have to consider four classes for every double fault. The data for each class were split into training and testing datasets via a 75%–25% split. The testing accuracy of the LWSN on both the simulation and experimental data is reported in Table 3, along with the testing accuracy of the original wavelet scattering network and the Gaussian–Bernoulli Deep Belief Network (GB-DBN)-based approach [22], which were used for comparison. The confusion matrix for the fault diagnosis of the two-switch forward convertor circuit using the LWSN is shown in Table 6.

Figure 5. Two-switch forward convertor circuit.
Table 5. Nominal values and parametric fault range of two-switch forward convertor circuit components.

| Fault Class | Fault Code | Nominal Value | Faulty Range | Experimental Values |
| Healthy | F0 | NA | NA | NA |
| R1 | F1 | 33 Ω | [8.25 Ω, 29.7 Ω] and [36.3 Ω, 57.75 Ω] | 10 Ω, 20 Ω, 40 Ω, 50 Ω |
| C4 | F2 | 0.1 µF | [0.025 µF, 0.09 µF] and [0.11 µF, 0.175 µF] | 0.025 µF, 0.05 µF, 0.12 µF, 0.15 µF |
| RL | F3 | 100 Ω | [25 Ω, 90 Ω] and [110 Ω, 175 Ω] | 30 Ω, 80 Ω, 120 Ω, 170 Ω |
| L3 | F4 | 100 µH | [25 µH, 90 µH] and [110 µH, 175 µH] | 30 µH, 75 µH, 156 µH, 170 µH |
| R5 | F5 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R6 | F6 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R7 | F7 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R8 | F8 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R10 | F9 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R11 | F10 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R12 | F11 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R13 | F12 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| R16 | F13 | 0 Ω | [0.1 Ω, 10 Ω] | 2 Ω, 4 Ω, 6 Ω, 8 Ω |
| RL & C4 | F14 | 100 Ω & 0.1 µF | ([25 Ω, 90 Ω] and [110 Ω, 175 Ω]) × ([0.025 µF, 0.09 µF] and [0.11 µF, 0.175 µF]) | (30 Ω, 0.025 µF), (30 Ω, 0.175 µF), (170 Ω, 0.025 µF), (170 Ω, 0.175 µF) |
| R1 & R2 | F15 | 33 Ω & 33 Ω | ([8.25 Ω, 29.7 Ω] and [36.3 Ω, 57.75 Ω]) × ([8.25 Ω, 29.7 Ω] and [36.3 Ω, 57.75 Ω]) | (10 Ω, 20 Ω), (10 Ω, 40 Ω), (30 Ω, 10 Ω), (50 Ω, 50 Ω) |
| R2 | F16 | 33 Ω | [8.25 Ω, 29.7 Ω] and [36.3 Ω, 57.75 Ω] | 10 Ω, 20 Ω, 40 Ω, 50 Ω |

Table 6. Confusion matrix summary for LWSN for the two-switch forward convertor circuit (per-class accuracy and largest single misclassification, %).

| True Class | Correct | Largest Misclassification |
| F0 | 91.7 | 3.5 |
| F1 | 94.4 | 5.3 |
| F2 | 89.9 | 9.6 |
| F3 | 78.4 | 14.0 |
| F4 | 89.7 | 5.7 |
| F5 | 98.2 | 0.5 |
| F6 | 92.0 | 3.9 |
| F7 | 94.8 | 3.1 |
| F8 | 81.9 | 14.4 |
| F9 | 99.2 | 0.3 |
| F10 | 88.3 | 9.9 |
| F11 | 97.1 | 1.0 |
| F12 | 90.3 | 4.2 |
| F13 | 98.3 | 0.5 |
| F14 | 100 | – |
| F15 | 94.9 | 4.1 |
| F16 | 83.7 | 4.7 |

The experimental setup that was used to demonstrate our approach is shown in Figure 6. The two-switch forward convertor circuit (CUT2) was used with pulse width waveforms to trigger the two switches, generated using an Agilent Arbitrary Waveform Generator 33250A. The circuit components were swapped out with components with the values shown in the Experimental Values column of Table 5.
1F.roFmro mTabTlaeb 3le, it3 ,caitn cbaen seben steheant tthaet pthroepporsoepdo LseWdSLNW mSNethmoedt haocdhiaecvheide vae sdigansifiigcnainfitc iamnpt rimovpermoveenmt oefn 8t.o9%f 8 i.9n% thien ftahuelt faduialtgndoiasgisn oascicsuaraccyu roavcyero tvheer ctohme cpoamrapbaler ambleethmoedt hino dthine ltihterlaitueraet u[2r2e] [2a2n]da an d10a.91%0. 9im%- imprporvoevmemenetn itn itnheth feaufaltu dltiadginagosnios saisccaucrcaucrya ocyveorv tehre tthraedtirtaiodnitaiol nWaSl NW. SANs. cAans bcaen sebeens fereonm frtohme ctohnefucosinofnu smioantrmixa itnri TxainblTea 6b, lfea6u,ltf atuypltet yFp3e, wF3h,icwhh cicohrrceosprroenspdos ntod sretsoisrteosris RtoLr, wRLa,s wmais- mdiisadginaogsneods eads fasuflta utylpt ety Fp8e (Fre8si(srteosris Rto8r).R O8t)h.eOr tnhoetrabnloet ambilseclmasisicfliacsastifioncas tiinocnlsudinec tlhued seinthgele sfinauglte Ffa1u (lrteFs1is(troers iRs1to) raRn1d) tahned dtohuebdloe ufbauleltf aFu1l5t F(r1e5si(srteosris Rto1r aRn1da nRd2)R. 2T)h. iTs hhiisghilgighhlitgsh tthse thcoemcopmlepxlietyx itoyf oafnaanloaglo gcircciruciut itfafuaultl tddiaiaggnnoosissi.s .HHoowweevveerr, , the developed LLWSSNm meeththoodd sstatannddsso ouut ti nint etermrmsso of ff afauultltd diaiaggnnoosissisp perefroformrmaanncecei ninc ocommpparairsiosonnt otoe xeixsitsitninggm metehthodods.s. 44.3.3. .B BeeaarrininggF FaauultltD Diaiaggnnoosissis InInr orotatatitninggm maacchhinineeryrya apppplilcicaatitoionnss, ,r orolllilninggb beeaarrininggf afauultlstsa arreet thheem moossttc coommmmoonn,,l leeaadd-- ininggt otot htheep peerrffoorrmmaannccee ddeetteerriioorraattiioonn ooff mmaacchhiinneerryy.. HHeennccee,, bbeeaarriinngg ffaauulltt ddiiaaggnnoossiiss ppllaayyss a avvitiatal lrroolele iinn tthhee hheeaalltthh mmaannaaggeemmeenntt ooff mmaacchhiinneerryy [[4466]].. TToo tteesstt tthhee eefffefecctitviveenneessss oofft hthee mmeeththoodda accrroossss ddiiffffeerreenntt ddoommaaiinnss ooff ffaauulltt ddiiaaggnnoossisis, ,ththee ddeveveleolpopeded mmetehtohdo dwwasa tsestteesdte odn oan baeabreinargi nfaguflatsu bltesnbcehnmcahrmk adraktadsaetta. Tsehte. TChaeseC WaseestWerens RteersnerRvees UernvieveUrsniitvye (rCsiWtyR(UC)W mRoUto)r mboetaorrinbge adraitnagsedta wtaasse tgewnaesragteende urastiendg uas tiensgt raigte csotnrisgisctionngs iosft ian 2g hopf aR2elhiapnRcee lEialenccteriEc lmecotrtoicr, mao ttoorrq, uaet otrrqaunesdtruacnesrd/euncceord/eern,c ao ddeyrn, aamdyonmaemteorm, aentedr ,darnivded-ernivde -aenndd faannd-efnand- eSnvdenSsvkean Kskual- Klaugllearg-Fera-bFraikberink edneedpe-egpr-ogorvoeo vbealbl ablelabreinargisn. gIsn.nIenrn reirngri,n ogu, toeur treirngri,n agn,da nrodllrionlgli neglemeleemnte dnet- dfeefcetcst swwereer emmaannuufafactcutureredd inintoto tthhee bbeeaarriinnggss.. TThhee mmoottoorr wwaass rruunn aatt aa nneeaarr--ccoonnsstatanntts sppeeeedd (1(1772200??11779977r /r/mmiinn)) wwiitthh ddiiffffeerreennttl looaaddss( (00??33h hpp))p prroovvidideeddb byyt thheed dyynnaammoommeeteter.r.V Vibibraratitoionn ddaatataw weererec ocollleleccteteddu usisningga acccceelelerorommeeteterrss, ,w whhicichhw weererev veertritcicaalllylya atttatacchheeddt otot htheeh hoouusisningg wwitihthm maaggnneetitcicb baaseses.s.S Saammpplilninggf rfereqquueennccieiessw weerere1 122k kHHzzf oforrs osommeeo offt htheet etestsstsa anndd4 488k kHHzz foforrt htheeo tohtehresr.sF. uFruthrtehredr edtaeitlasiclsa ncabne bfoeu fnodunatdt haet tChWe RCUWBReUa rBinegarDinagta DCaetnat eCrewnteebrs iwtee[b4s7i]t.e A[s47s]h. 
As shown in Table 7, one healthy bearing and three fault modes, including the inner ring fault, the rolling element fault, and the outer ring fault, were classified into ten categories (one health state and nine fault states) according to different fault sizes. A plot of the data can be seen in Figure 7. The data were resampled such that the entire dataset had a constant sampling rate, and then the data were split into chunks with sizes of 1024. The dataset was then split into training and testing datasets in the ratio of 75%:25% using stratified sampling (a sketch of this preprocessing is shown after Figure 7). The LWSN achieved 99.2% accuracy on the testing dataset, which is comparable to the state-of-the-art methods [48]. The confusion matrix is shown in Table 8.

Table 7. CWRU faults.

| Fault Mode | Description |
| Health state | The normal bearing at 1791 rpm and 0 HP |
| Inner ring 1 | 0.007-inch inner ring fault at 1797 rpm and 0 HP |
| Inner ring 2 | 0.014-inch inner ring fault at 1797 rpm and 0 HP |
| Inner ring 3 | 0.021-inch inner ring fault at 1797 rpm and 0 HP |
| Rolling element 1 | 0.007-inch rolling element fault at 1797 rpm and 0 HP |
| Rolling element 2 | 0.014-inch rolling element fault at 1797 rpm and 0 HP |
| Rolling element 3 | 0.021-inch rolling element fault at 1797 rpm and 0 HP |
| Outer ring 1 | 0.007-inch outer ring fault at 1797 rpm and 0 HP |
| Outer ring 2 | 0.014-inch outer ring fault at 1797 rpm and 0 HP |
| Outer ring 3 | 0.021-inch outer ring fault at 1797 rpm and 0 HP |

Figure 7. Vibration signals of the different faults in the CWRU dataset.
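The following is a brief sketch, assuming SciPy and scikit-learn are available, of the resampling, chunking, and stratified splitting steps described above; the target sampling rate and the toy recordings are illustrative.

```python
import numpy as np
from scipy.signal import resample
from sklearn.model_selection import train_test_split

CHUNK = 1024
TARGET_FS = 12_000  # assumed common rate for the mixed 12/48 kHz recordings

def make_chunks(recordings):
    """recordings: list of (signal, fs, label) tuples -> fixed-size labeled chunks."""
    X, y = [], []
    for sig, fs, label in recordings:
        if fs != TARGET_FS:  # resample to a constant sampling rate
            sig = resample(sig, int(len(sig) * TARGET_FS / fs))
        n = len(sig) // CHUNK
        X.extend(np.split(sig[:n * CHUNK], n))  # non-overlapping 1024-sample chunks
        y.extend([label] * n)
    return np.array(X), np.array(y)

# Toy stand-in for the CWRU recordings
recs = [(np.random.randn(48_000), 48_000, 0), (np.random.randn(12_000), 12_000, 1)]
X, y = make_chunks(recs)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                          random_state=0)  # stratified 75:25 split
print(X_tr.shape, X_te.shape)
```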
Table 8. Confusion matrix summary for LWSN for the CWRU dataset (per-class accuracy and largest single misclassification, %).

| True Class | Correct | Largest Misclassification |
| Healthy | 100.0 | – |
| Inner ring 1 | 100.0 | – |
| Inner ring 2 | 96.7 | 3.3 |
| Inner ring 3 | 100.0 | – |
| Rolling element 1 | 100.0 | – |
| Rolling element 2 | 100.0 | – |
| Rolling element 3 | 100.0 | – |
| Outer ring 1 | 100.0 | – |
| Outer ring 2 | 96.0 | 4.0 |
| Outer ring 3 | 100.0 | – |

The CWRU bearing dataset involves nine fault classes and one healthy class. As can be seen from the confusion matrix in Table 8, fault types F3 (inner ring 2) and F9 (outer ring 2) were misdiagnosed most often; however, the diagnosis of the other fault types was perfect.

4.4. Gear Fault Diagnosis

The second rotating machinery fault diagnosis dataset considered was the University of Connecticut (UoC) gear fault dataset [49]. The CWRU dataset and the UoC dataset have been ranked the simplest and the most difficult benchmark datasets, respectively [48], for rotating machinery fault diagnosis. The average RMS and the average power of the signals in the CWRU and the UoC datasets were 0.27, −9.36 dB and 0.07, −21.91 dB, respectively. Preprocessing methods such as stochastic resonance [50] can be used to enhance weak fault characteristics in datasets such as UoC; however, in this paper, the LWSN method was applied directly to the raw vibration data. In the UoC dataset, nine different gear conditions were introduced to the pinions on the input shaft, including the healthy condition, root crack, missing tooth, spalling, and chipping tip with five different levels of severity. All the collected data were used and classified into nine categories (one health state and eight fault states: missing, crack, spall, chip5a, chip4a, chip3a, chip2a, and chip1a) to test the performance. The data were resampled such that the entire dataset had a constant sampling rate, and then the data were split into chunks with sizes of 1024. The dataset was then split into training and testing datasets in the ratio 75%:25% using stratified sampling. The LWSN achieved 96.51% accuracy for the testing dataset, and the confusion matrix is shown in Table 9. Our result is marginally better than the state of the art, as the best result reported in [48] was 96.19%. Since the UoC dataset has 3600 samples per fault class and there are nine fault classes, the developed method is able to process the big data of rotating machinery.

Table 9. Confusion matrix summary for LWSN for the UoC dataset (per-class accuracy and largest single misclassification, %).

| True Class | Correct | Largest Misclassification |
| Healthy | 99.0 | 0.3 |
| Missing tooth | 98.6 | 0.3 |
| Root crack | 91.6 | 1.4 |
| Spalling | 98.2 | 0.8 |
| Chipping tip 1a | 95.5 | 1.0 |
| Chipping tip 2a | 98.2 | 0.6 |
| Chipping tip 3a | 99.0 | 0.1 |
| Chipping tip 4a | 98.2 | 0.6 |
| Chipping tip 5a | 99.5 | 0.1 |

4.5. Transfer Learning

In recent years, transfer learning has been gaining importance, as it enables knowledge acquired through training on data from a source domain to be transferred to gain insight in a target domain. This importance arises from the fact that it is very challenging to collect data from all possible conditions that machinery may encounter.
Umdale et al. [51] created different datasets by dividing the original CWRU dataset based on speed and load, as can be seen in Table 10. For instance, in dataset D1, the goal was to determine whether training on lower speeds in the source dataset would still enable acceptable fault diagnosis on a dataset with higher rotational speeds, as can be seen from the target dataset of D1. In dataset D2, the opposite was true: the goal was to determine whether datasets with higher speeds would carry vital information for fault diagnosis at lower speeds, whereas mixtures of speeds were considered in datasets D3 and D4. The maximum training and testing accuracies reported by [51] are shown in Table 10, where the testing accuracies are an indication of the effectiveness of transfer learning. As can be seen from Table 10, the developed LWSN is more effective for transfer learning across all four datasets. Exploratory work suggests that the LWSN can perform at least as well as deep learning networks at transfer learning, but further work needs to be undertaken to determine if there is a fundamental improvement. A sketch of this source-to-target evaluation protocol is shown after Table 10.

Table 10. Comparison of transfer learning accuracies across different datasets.

| Dataset | Source Dataset | Target Dataset | Training Accuracy [51] | Testing Accuracy [51] | Training Accuracy (LWSN) | Testing Accuracy (LWSN) |
| D1 | 1730 RPM and 3 HP, 1750 RPM and 2 HP | 1772 RPM and 1 HP, 1797 RPM and 0 HP | 97.22 | 97.02 | 100 | 99.96 |
| D2 | 1772 RPM and 1 HP, 1797 RPM and 0 HP | 1730 RPM and 3 HP, 1750 RPM and 2 HP | 94.17 | 92.88 | 100 | 99.87 |
| D3 | 1730 RPM and 3 HP, 1797 RPM and 0 HP | 1750 RPM and 2 HP, 1772 RPM and 1 HP | 96.92 | 95.77 | 100 | 99.39 |
| D4 | 1750 RPM and 2 HP, 1772 RPM and 1 HP | 1730 RPM and 3 HP, 1797 RPM and 0 HP | 95.77 | 94.48 | 100 | 99.93 |

These results imply that the LWSN network can extract discriminative information from raw data effectively and achieve fault classification with high accuracy, irrespective of the complexity and domain of the dataset.
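A minimal sketch of this source-to-target protocol; it assumes recordings tagged with operating conditions and reuses make_chunks, lwsn_features, genes, and SVC from the earlier sketches, so it is an outline rather than a reproduction of [51] or of the paper's pipeline.

```python
from sklearn.svm import SVC

def transfer_eval(recordings, genes, source_conds, target_conds):
    """Train on source operating conditions, test on unseen target conditions.

    recordings: iterable of (signal, fs, label, (speed_rpm, load_hp)) tuples.
    """
    src = [(s, fs, lab) for s, fs, lab, cond in recordings if cond in source_conds]
    tgt = [(s, fs, lab) for s, fs, lab, cond in recordings if cond in target_conds]
    Xs, ys = make_chunks(src)
    Xt, yt = make_chunks(tgt)
    clf = SVC().fit(lwsn_features(Xs, genes), ys)        # fit on source domain only
    return (clf.score(lwsn_features(Xs, genes), ys),     # training accuracy
            clf.score(lwsn_features(Xt, genes), yt))     # transfer (testing) accuracy

# Dataset D1: train at lower speeds/higher loads, test at higher speeds/lower loads
# train_acc, test_acc = transfer_eval(recs, genes,
#                                     {(1730, 3), (1750, 2)}, {(1772, 1), (1797, 0)})
```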
5. Conclusions

Traditional fault diagnosis methods involve the extraction of fixed representations in the time domain, frequency domain, or time–frequency domain. These methods require technical expertise for designing appropriate features from the fixed representations. In this paper, a new feature extraction technique based on learnable wavelet scattering networks was developed to diagnose faults, primarily in analog circuits and rotating machinery. By learning a time–frequency representation from the data, the developed method has a better ability to extract the essential features of fault signals. This results in better fault diagnosis accuracy, by almost 9%, compared to the state-of-the-art fault diagnosis method in the literature. By considering more classes for fault diagnosis than any other paper in the literature, a more thorough fault diagnosis was demonstrated. The fault diagnosis performance of this method was verified by experiments on the two-switch forward convertor circuit. The experiments indicated that the fault diagnosis model trained on simulation data is able to effectively diagnose faults in the actual circuit.

Analog circuits and gears/bearings are the predominant sources of faults in electronic systems and rotary mechanical systems, respectively. The developed fault diagnosis approach was applied to the CWRU bearing faults and the UoC gear faults benchmark datasets and achieved fault diagnosis accuracy that is comparable to state-of-the-art methods. Since the UoC gear faults benchmark dataset is considered the most challenging benchmark dataset in rotating machinery fault diagnosis, this speaks to the ability of the developed method to extract weak fault signatures. Hence, the generalizability of the developed fault diagnosis approach across the most common industrial fault diagnosis domains was demonstrated. Initial experiments indicated that the developed approach is also effective in transfer learning; however, further experiments need to be carried out to confirm these observations.

The incorporation of learnability in traditional wavelet scattering networks resulted in a 10% improvement in fault diagnosis accuracy. As opposed to deep learning networks, the developed learnable wavelet scattering networks do not require an extensive trial-and-error process to optimize their structure. Additionally, the developed learnable wavelet scattering networks learn wavelet filters, as opposed to the random filters learnt in deep learning networks. Hence, the filters learnt by learnable wavelet scattering networks are interpretable, which enables the wavelets to be used to gain further insight into circuit faults. The interpretability of the wavelets learnt by the learnable wavelet scattering networks and digital circuit fault diagnosis are possible avenues for future research.

Author Contributions: Conceptualization, methodology, investigation, software, writing (original draft), V.K.; writing (review and editing), M.H.A.; writing (review and editing), supervision, M.G.P. All authors have read and agreed to the published version of the manuscript.

Funding: The Center for Advanced Life Cycle Engineering (CALCE) and the Center for Advances in Reliability and Safety (CAiRS) in Hong Kong provided financial support for this research work.

Data Availability Statement: Publicly available datasets were analyzed in this study. The data can be found here: https://figshare.com/articles/dataset/Gear_Fault_Data/6127874/1 (accessed on 31 December 2021) and https://engineering.case.edu/bearingdatacenter (accessed on 31 December 2021).

Acknowledgments: The authors thank the Center for Advanced Life Cycle Engineering (CALCE) and its over 150 funding companies and the Center for Advances in Reliability and Safety (CAiRS) in Hong Kong for supporting research into advanced topics in reliability, safety, and sustainment.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Pecht, M.; Jaai, R. A prognostics and health management roadmap for information and electronics-rich systems. Microelectron. Reliab. 2010, 50, 317–323.
2. Binu, D.; Kariyappa, B.S. A survey on fault diagnosis of analog circuits: Taxonomy and state of the art. AEU-Int. J. Electron. Commun. 2017, 73, 68–83.
3. Vasan, A.S.S.; Long, B.; Pecht, M. Diagnostics and prognostics method for analog electronic circuits. IEEE Trans. Ind. Electron. 2013, 60, 5277–5291.
4. Yang, H.; Meng, C.; Wang, C. Data-driven feature extraction for analog circuit fault diagnosis using 1-D convolutional neural network. IEEE Access 2020, 8, 18305–18315.
5. Li, F.; Woo, P.Y. Fault detection for linear analog IC: The method of short-circuit admittance parameters. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2002, 49, 105–108.
6. Tadeusiewicz, M.; Halgas, S.; Korzybski, M. An algorithm for soft-fault diagnosis of linear and nonlinear circuits. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2002, 49, 1648–1653.
7. Luo, H.; Wang, Y.; Lin, H.; Jiang, Y. Module level fault diagnosis for analog circuits based on system identification and genetic algorithm. Meas. J. Int. Meas. Confed. 2012, 45, 769–777. [CrossRef]
8. Cannas, B.; Fanni, A.; Montisci, A. Algebraic approach to ambiguity-group determination in nonlinear analog circuits. IEEE Trans. Circuits Syst. I Regul. Pap. 2010, 57, 438–447. [CrossRef]
9. Dai, X.; Gao, Z. From model, signal to knowledge: A data-driven perspective of fault detection and diagnosis. IEEE Trans. Ind. Inform. 2013, 9, 2226–2238. [CrossRef]
10. Bandyopadhyay, I.; Purkait, P.; Koley, C. Performance of a classifier based on time-domain features for incipient fault detection in inverter drives. IEEE Trans. Ind. Inform. 2019, 15, 3–14. [CrossRef]
11. Queiroz, L.P.; Rodrigues, F.C.M.; Gomes, J.P.P.; Brito, F.T.; Chaves, I.C.; Paula, M.R.P.; Salvador, M.R.; Machado, J.C. A fault detection method for hard disk drives based on mixture of Gaussians and nonparametric statistics. IEEE Trans. Ind. Inform. 2017, 13, 542–550. [CrossRef]
12. Nasser, A.R.; Azar, A.T.; Humaidi, A.J.; Al-Mhdawi, A.K.; Ibraheem, I.K. Intelligent fault detection and identification approach for analog electronic circuits based on fuzzy logic classifier. Electronics 2021, 10, 2888. [CrossRef]
13. Shi, J.; Deng, Y.; Wang, Z. Analog circuit fault diagnosis based on density peaks clustering and dynamic weight probabilistic neural network. Neurocomputing 2020, 407, 354–365. [CrossRef]
14. Aizenberg, I.; Belardi, R.; Bindi, M.; Grasso, F.; Manetti, S.; Luchetta, A.; Piccirilli, M.C. A neural network classifier with multi-valued neurons for analog circuit fault diagnosis. Electronics 2021, 10, 349. [CrossRef]
15. Yuan, L.; He, Y.; Huang, J.; Sun, Y. A new neural-network-based fault diagnosis approach for analog circuits by using kurtosis and entropy as a preprocessor. IEEE Trans. Instrum. Meas. 2010, 59, 586–595. [CrossRef]
16. Xiao, Y.; He, Y. A novel approach for analog fault diagnosis based on neural networks and improved kernel PCA. Neurocomputing 2011, 74, 1102–1115. [CrossRef]
17. Xiao, Y.; Feng, L. A novel linear ridgelet network approach for analog fault diagnosis using wavelet-based fractal analysis and kernel PCA as preprocessors. Meas. J. Int. Meas. Confed. 2012, 45, 297–310. [CrossRef]
18. Zhang, A.; Chen, C.; Jiang, B. Analog circuit fault diagnosis based UCISVM. Neurocomputing 2016, 173, 1752–1760. [CrossRef]
19. Song, P.; He, Y.; Cui, W. Statistical property feature extraction based on FRFT for fault diagnosis of analog circuits. Analog Integr. Circuits Signal Process. 2016, 87, 427–436. [CrossRef]
20. He, W.; He, Y.; Li, B.; Zhang, C. Analog circuit fault diagnosis via joint cross-wavelet singular entropy and parametric t-SNE. Entropy 2018, 20, 604. [CrossRef]
21. Cui, J.; Wang, Y. A novel approach of analog circuit fault diagnosis using support vector machines classifier. Meas. J. Int. Meas. Confed. 2011, 44, 281–289. [CrossRef]
22. Liu, Z.; Jia, Z.; Vong, C.M.; Bu, S.; Han, J.; Tang, X. Capturing high-discriminative fault features for electronics-rich analog system via deep learning. IEEE Trans. Ind. Inform. 2017, 13, 1213–1226. [CrossRef]
23. Zhao, G.; Liu, X.; Zhang, B.; Liu, Y.; Niu, G.; Hu, C. A novel approach for analog circuit fault diagnosis based on Deep Belief Network. Meas. J. Int. Meas. Confed. 2018, 121, 170–178. [CrossRef]
24. Chen, P.; Yuan, L.; He, Y.; Luo, S. An improved SVM classifier based on double chains quantum genetic algorithm and its application in analogue circuit diagnosis. Neurocomputing 2016, 211, 202–211. [CrossRef]
25. Wenxin, Y. Analog circuit fault diagnosis via FOA-LSSVM. Telkomnika 2020, 18, 251. [CrossRef]
26. Liang, H.; Zhu, Y.; Zhang, D.; Chang, L.; Lu, Y.; Zhao, X.; Guo, Y. Analog circuit fault diagnosis based on support vector machine classifier and fuzzy feature selection. Electronics 2021, 10, 1496. [CrossRef]
27. Gao, T.Y.; Yang, J.L.; Jiang, S.D.; Yang, C. A novel fault diagnostic method for analog circuits using frequency response features. Rev. Sci. Instrum. 2019, 90, 104708. [CrossRef]
28. He, W.; He, Y.; Li, B.; Zhang, C. A naive-Bayes-based fault diagnosis approach for analog circuit by using image-oriented feature extraction and selection technique. IEEE Access 2020, 8, 5065–5079. [CrossRef]
29. He, W.; He, Y.; Luo, Q.; Zhang, C. Fault diagnosis for analog circuits utilizing time-frequency features and improved VVRKFA. Meas. Sci. Technol. 2018, 29, 045004. [CrossRef]
30. Ji, L.; Fu, C.; Sun, W. Soft fault diagnosis of analog circuits based on a ResNet with circuit spectrum map. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 2841–2849. [CrossRef]
31. Khemani, V.; Azarian, M.H.; Pecht, M.G. Electronic circuit diagnosis with no data. In Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), San Francisco, CA, USA, 17–20 June 2019; pp. 1–7. [CrossRef]
32. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
33. Elsken, T.; Metzen, J.H.; Hutter, F. Simple and efficient architecture search for convolutional neural networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), Workshop Track Proceedings, Vancouver, BC, USA, 30 April–3 May 2018.
34. Bruna, J.; Mallat, S. Invariant scattering convolution networks. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1872–1886. [CrossRef]
35. Sweldens, W. The lifting scheme: A construction of second generation wavelets. SIAM J. Math. Anal. 1998, 29, 511–546. [CrossRef]
36. Wiatowski, T.; Tschannen, M.; Stanic, A.; Grohs, P.; Bolcskei, H. Discrete deep feature extraction: A theory and new architectures. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; Volume 5, pp. 3168–3183.
37. Andén, J.; Lostanlen, V.; Mallat, S. Joint time-frequency scattering. IEEE Trans. Signal Process. 2019, 67, 3704–3718. [CrossRef]
38. LeCun, Y.; Cortes, C.; Burges, C. The MNIST Database of Handwritten Digits. Courant Inst. Math. Sci. 1998. Available online: http://yann.lecun.com/exdb/mnist/ (accessed on 25 January 2022).
39. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009.
40. Garofolo, J.S.; Lamel, L.F.; Fisher, W.M.; Fiscus, J.G.; Pallett, D.S.; Dahlgren, N.L.; Zue, V. TIMIT Acoustic-Phonetic Continuous Speech Corpus; Linguistic Data Consortium: Philadelphia, PA, USA, 1993.
41. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012.
42. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef]
43. Holland, J.H. Genetic Algorithms. Sci. Am. 1992, 267, 66–73. [CrossRef]
44. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
45. Davies, D.L.; Bouldin, D.W. A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 1979, PAMI-1, 224–227. [CrossRef]
46. Mao, W.; Wang, L.; Feng, N. A new fault diagnosis method of bearings based on structural feature selection. Electronics 2019, 8, 1406. [CrossRef]
47. Bearing Data Center | Case School of Engineering | Case Western Reserve University. Available online: https://engineering.case.edu/bearingdatacenter (accessed on 25 January 2022).
48. Zhao, Z.; Li, T.; Wu, J.; Sun, C.; Wang, S.; Yan, R.; Chen, X. Deep learning algorithms for rotating machinery intelligent diagnosis: An open source benchmark study. ISA Trans. 2020, 107, 224–255. [CrossRef]
49. Gear Fault Data. Available online: https://figshare.com/articles/dataset/Gear_Fault_Data/6127874/1 (accessed on 25 January 2022).
50. Qiao, Z.; Elhattab, A.; Shu, X.; He, C. A second-order stochastic resonance method enhanced by fractional-order derivative for mechanical fault detection. Nonlinear Dyn. 2021, 106, 707–723. [CrossRef]
51. Udmale, S.S.; Singh, S.K.; Singh, R.; Sangaiah, A.K. Multi-fault bearing classification using sensors and ConvNet-based transfer learning approach. IEEE Sens. J. 2020, 20, 1433–1444. [CrossRef]