The Transactions of the Korean Institute of Electrical Engineers

  1. (Medical Research Institute, School of Medicine, Chungbuk National University, Korea.)
  2. (Electronics and Telecommunications Research Institute, Korea.)
  3. (Dept. of Biomedical Engineering, School of Medicine, Chungbuk National University, Korea.)
  4. (Institute for Trauma Research, College of Medicine, Korea University, Korea.)



Kidney cancer, Deep learning, Generative models, Autoencoder

1. Introduction

Kidney cancer is among the ten most common cancers worldwide (1), (2); unfortunately, it is hard to detect early through routine clinical means. It is not a single disease; instead, it comprises several histologically and genetically distinct cancer types, each with its own clinical course and response to therapy (3), (4). The Cancer Genome Atlas (TCGA) Research Network has conducted a series of comprehensive molecular characterizations of the distinct histologic types of kidney cancer (5).

Kidney cancer is the 6th most frequent cancer in males and the 10th in females, representing 5% and 3% of all new cases, respectively (6). Gender disparities in kidney cancer incidence have been reported, with a higher incidence and worse outcome in males (7). Over half of all people aged 50 have cysts, which are fluid-filled and are usually benign (noncancerous) and do not need treatment (8). Solid tumors of the kidney are rare; however, approximately three-quarters of these tumors are cancerous, with a potential to spread (9). According to the Centers for Disease Control (CDC), in the United States in 2014, black men were the most likely to get kidney cancer (24.7 per 100,000), followed by white men (22.0 per 100,000). Among women, African-American women are the most likely to get kidney and renal pelvis cancers (12.4 per 100,000), followed by Hispanic women (11.9 per 100,000) (10). Studies suggest that the distribution of kidney cancer subtypes differs between racial groups (11), (12). Race and ethnicity cause inter-tumoral heterogeneity in cancers, ranging from disease incidence, morbidity, and mortality rates to treatment outcomes (13), (14). Therefore, the identification of population-specific molecular biomarkers is essential (15).

Identifying genes that contribute to the prognosis of cancer patients is one of the challenges in providing appropriate treatment. Two critical challenges in bioinformatics are finding biomarkers that represent the state of patients and predicting the prognosis of cancer patients. The number of genes is enormous compared with the number of patients, which makes the data difficult to analyze. To solve these problems, significant genes that represent the state of patients must be extracted. In addition, developing a classification model from the extracted genes may help with early diagnosis and prognosis prediction. Cancer is caused by gene variation that damages the genes regulating orderly cell replication, so that cells multiply without limit. As a result, the cells invade adjacent normal tissues and spread throughout the body. Because cancer stems from mutated genes, it is regarded as a genetic disorder, although only a small number of cancers are hereditary. When a mutation occurs in a reproductive cell, it is passed on over generations and is present in every somatic cell (16). To predict the state of patients, researchers have applied deep learning techniques to analyze mutations in sequences, and studies have accurately predicted major mutations that cause diseases such as spinal muscular atrophy, hereditary nonpolyposis colon cancer, and autism (17).

Kidney cancer is a primary tumor arising in the kidney; renal cell carcinoma, a malignant tumor, accounts for over 90% of cases (2), (18). Because kidney cancer has no symptoms in the early stages, it has often already progressed by the time it is discovered. According to the national cancer registration statistics published in 2020, among the 243,837 cases of cancer in 2018, 5,456 were attributed to kidney cancer, accounting for 2.2% of all cancer cases. By gender, kidney cancer ranked eighth among male cancers, with 3.0% (3,806 cases) (19). In addition, the symptoms and treatment of kidney cancer decrease patients' quality of life by increasing the disease burden and medical costs. Risk factors for kidney cancer include environmental factors, lifestyle factors, genetic factors, and existing kidney diseases. Among the lifestyle factors, smoking, obesity, high blood pressure, and eating habits are possible causes (20). Recently, researchers extracted features from the genetic data of kidney cancer patients and applied classification algorithms using neighborhood component analysis (21). Furthermore, big data from a large cohort (KOTCC database) of kidney cancer patients collected from eight domestic medical institutions were used to extract variables affecting kidney cancer recurrence, and a machine learning algorithm was applied to predict recurrence within five years of surgery (22).

In this study, we propose a method that uses a deep learning algorithm to extract genes affecting prognosis prediction in kidney cancer patients and then applies a classification algorithm to the extracted genes to predict prognosis. We combined gene expression data and clinical data from kidney cancer patients obtained from the TCGA data portal to extract genes that contribute to patient prognosis and applied classification techniques to demonstrate their utility (23). Next, we selected gender (male, female), sample type (primary cancer, normal), and race (white, black, Asian) as the target variables for analysis. Notably, we extracted genes from kidney cancer patients based on gender, sample type, and race to overcome heterogeneity and to obtain genetic biomarkers that could allow more accurate prognosis prediction. After testing the functionality of the genes, we presented their applicability and developed the optimal prediction model by comparing and analyzing classification algorithms using the extracted genes.

2. Related works

Machine learning and deep learning algorithms are being applied to various analyses of biological data. Some studies predicted the risk of 20 cancers by applying machine learning techniques and artificial intelligence methods to genetic big data analysis (24). The Bayesian classifier has been applied to the problem of classifying proteins that have sequence and structural information, and studies have also used Bayesian networks to combine various details related to proteins and genes to improve the predictive performance for gene function (25). In a study utilizing the TCGA-KIRC database, CT and MRI scan data and clinical data of 227 kidney cancer patients were used to classify the cancer stage by applying deep learning (26). In another study, significant genes were extracted from TCGA kidney cancer data using a deep autoencoder and compared with traditional methods such as the least absolute shrinkage and selection operator (LASSO); the classification accuracies were then compared with those of conventional state-of-the-art classification methods (27). Researchers integrated data from various cancer patient types, conducted the analysis using autoencoder (AE) structures, presented their usefulness in clinical applications, and suggested ways to efficiently perform posterior inference via stochastic variational inference and learning algorithms in the presence of intractable posterior distributions and continuous latent variables for large datasets (28). An AE is a neural network in which the target output is set to the input x in order to extract features. By learning how to reconstruct an input, the AE extracts basic or abstract properties that facilitate accurate prediction.
In principle, a linear AE with a single hidden layer in a multi-layer perceptron is the same as principal component analysis (PCA) (29), (30). More generally, nonlinear autoencoders have been studied to extract key properties, including high-level features and Gabor-filter features (31), (32).

The variational autoencoder (VAE) is a generative model that, given training data, produces new data by sampling from a distribution that approximates the actual distribution of the training data. A VAE compresses high-dimensional input data into a smaller representation in stochastic form. Unlike the conventional AE, which maps inputs to latent vectors, the VAE maps input data to the parameters of a probability distribution, such as the mean and variance of a Gaussian distribution. This method produces structured latent spaces and is therefore helpful for image generation (33-35).

A supervised autoencoder (SAE) is a neural network that jointly predicts the targets and the inputs (reconstruction). For a single hidden layer, this simply means that a classification loss is added to the output layer. For a deeper AE, the classification loss is added to the innermost layer, which is usually handed off to the supervised learner after training the AE. The SAE uses the unsupervised auxiliary task to improve generalization performance (36-38).

The conditional variational autoencoder (CVAE) is a modification of the VAE structure that enables supervised learning: a class label y is added to the encoder and decoder so that category information is considered when learning the data distribution. The CVAE is a deep conditional generative model for structured output prediction using Gaussian latent variables. The model is trained efficiently in the stochastic gradient variational Bayes framework and allows fast prediction using stochastic feed-forward inference (39-41).

We validated the performance of the proposed framework by comparing it with traditional data mining and classification methods. The proposed framework employs various AE-based deep learning techniques and takes advantage of pre-training and fine-tuning strategies. The experimental results show that the AE-based deep learning methods perform better than combinations of traditional data mining and classification methods.

3. Methodologies

3.1 Architecture

The main challenge for general analysis is a characteristic of genetic data: they contain far more gene expression values than samples. We propose a novel deep learning-based framework that combines various AE-based techniques for cancer analysis, compare it with existing feature extraction methods (PCA and NMF), and demonstrate its superiority. The following section describes a pre-training method for autoencoder-based feature extraction. Proper training of neural networks requires a large amount of training data; however, we often have a small quantity of labeled data and large amounts of unlabeled data. In this case, the unlabeled data are used to pre-train each layer of a neural network, which is called unsupervised pre-training. AE and VAE have only a reconstruction loss in pre-training, whereas SAE and CVAE also include a classification loss. Once the parameters of each layer have been determined to some extent, the classification performance can be improved through fine-tuning with labeled data.

In particular, feature extraction was performed first so that traditional classification algorithms could be compared with deep learning techniques. We used conventional dimension reduction techniques, namely principal component analysis (PCA) and non-negative matrix factorization (NMF), followed by state-of-the-art classification algorithms. We also used deep learning techniques, namely the autoencoder (AE), variational autoencoder (VAE), supervised autoencoder (SAE, reported as CAE in the result tables), and conditional variational autoencoder (CVAE), followed by a neural network classifier. For significant gene selection with traditional classification algorithms, we used PCA and NMF, addressed the data imbalance problem, applied various classification techniques, and compared and analyzed the results. For deep learning-based significant gene selection, we used all of the improved algorithms based on the AE. We compared the extracted genes and analyzed the classification accuracy using a multi-layer perceptron (MLP).

3.1.1 Autoencoder

The AE is a deep learning structure for efficiently coding data. Coding refers to compressing data; in other words, dimensionality reduction transforms data from a high-dimensional space into a low-dimensional space so that the data are represented efficiently. The AE network has the same input and output and is constructed symmetrically, as shown in Fig. 1. Because dimensionality reduction is the goal of our study, we take data X and obtain the hidden-layer node values Z by applying a weighted sum followed by an activation function; this part of the network is called the encoder.

Fig. 1. Architecture of the autoencoder

The AE has the same structure as an MLP, except that the input and output layers have the same number of neurons. Because the AE reconstructs the input, the output is called the reconstruction, and the loss function is calculated from the difference between the input and the reconstruction. The AE is trained by unsupervised learning, and the loss follows the maximum likelihood (ML) criterion. Once the parameters of the hidden layer Z have been determined to some extent, the labeled training data can be used to perform supervised fine-tuning.
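As a minimal sketch of the encoder, decoder, and reconstruction loss described above, assuming a single hidden layer and a squared-error loss (the symbols $W$, $b$, $W'$, $b'$, $f$, and $g$ are illustrative and do not appear in the original), the mapping can be written as:

$z=f(Wx+b),\qquad \hat{x}=g(W'z+b'),\qquad L_{rec}(x,\hat{x})=\lVert x-\hat{x}\rVert^{2}$

Under the additional assumption of Gaussian reconstruction noise with fixed variance, minimizing this squared error corresponds to maximizing the likelihood of the input given the code.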

3.1.2 Variational autoencoder

A VAE maps the input data X through the encoder to two output vectors, the mean (μ) and the variance (σ²), which define a normal distribution. Sampling from this distribution yields a latent vector Z, which is passed through the decoder to produce new data similar to the existing input data. The VAE is therefore a generative model developed to generate new data using probability distributions. The structure of the VAE is shown in Fig. 2.

We used the ideal sampling function, the approximate posterior $q_{\Phi}(z|x)$, to draw samples, which allows the generator to learn the input data well. Equation (1) is used to make the value generated from the sampled latent vector equal to the input value. Maximizing (1), in the sense of maximum likelihood estimation, measures how well the reconstruction restores the input data when the latent vector Z is drawn from the ideal sampling function.

Fig. 2. Architecture of a variational autoencoder

(1)
$\mathbb{E}_{q_{\Phi}(z|x)}[\log p(x|z)]$

Searching for an objective that satisfies these conditions leads to the evidence lower bound (ELBO). When x is given to the network as evidence, the log-evidence $\log p(x)$ decomposes into the three terms in (2).

(2)
$\mathbb{E}_{q_{\Phi}(z|x)}[\log p(x|z)]-KL(q_{\Phi}(z|x)\,\|\,p(z))+KL(q_{\Phi}(z|x)\,\|\,p(z|x))$

The first term of (2) is the reconstruction term, indicating how well the input is restored from samples drawn by the ideal sampling function. The second term is a regularization term, which keeps the ideal sampling function as close as possible to the prior $p(z)$, so that sampled values resemble draws from the prior. The third term represents the distance between two probability distributions: the ideal sampling function $q_{\Phi}(z|x)$ and the true posterior $p(z|x)$. Because this KL divergence is non-negative and cannot be computed directly, it is dropped, and the first two terms form the lower bound that is maximized during training.
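The reconstruction and regularization terms can be illustrated with a short sketch. It assumes a diagonal Gaussian encoder output, a standard normal prior, and a squared-error reconstruction term; the function names and toy values are hypothetical and are not taken from the original implementation.

```python
import math
import random


def kl_to_standard_normal(mu, var):
    """Closed-form KL(N(mu, diag(var)) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mu, var))


def reparameterize(mu, var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]


def negative_elbo(x, x_hat, mu, var):
    """Squared-error reconstruction loss plus the KL regularizer (assumed form)."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + kl_to_standard_normal(mu, var)


rng = random.Random(0)          # fixed seed so the sketch is repeatable
mu, var = [0.5, -0.2], [0.8, 1.5]  # toy encoder outputs (illustrative)
z = reparameterize(mu, var, rng)
print(len(z), kl_to_standard_normal(mu, var))
```

The KL term is zero only when the encoder output matches the prior exactly (zero mean, unit variance), which is why it acts as a regularizer on the latent space.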

3.1.3 Supervised autoencoder

An SAE is an AE with a classification loss added to the representation layer. For a single hidden layer, this means that a classification loss is added to the output layer. For a deeper AE, the classification loss is added to the innermost layer, which is usually handed off to the supervised learner after training the AE, as illustrated in Fig. 3.
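A minimal sketch of such a joint objective is given below, assuming a mean-squared-error reconstruction loss, a cross-entropy classification loss, and a hypothetical weighting factor `weight`; none of these specifics are stated in the original.

```python
import math


def mse(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


def supervised_ae_loss(x, x_hat, probs, label, weight=1.0):
    """Reconstruction loss plus a weighted classification loss (assumed form)."""
    return mse(x, x_hat) + weight * cross_entropy(probs, label)


# Toy values (illustrative): a near-perfect reconstruction, uncertain class prediction.
loss = supervised_ae_loss([1.0, 2.0], [1.1, 1.9], probs=[0.5, 0.5], label=0)
print(loss)
```

Both terms are computed from the same hidden representation, so the reconstruction term acts as an auxiliary task that constrains the features used by the classifier.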

Fig. 3. Architecture of the supervised autoencoder

3.1.4 Conditional variational autoencoder

The CVAE is a modification of the existing VAE structure that enables supervised learning. The CVAE adds a class label y to the encoder and decoder so that category information is considered when learning the data distribution. Thus, when the label information is known, it is supplied as a condition to both the encoder and the decoder. In the encoder, y is given along with x to find the latent vector z. Similarly, the decoder receives y along with z to generate data. The loss function therefore consists of the reconstruction loss and the classification loss. The structure is shown in Fig. 4.
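One common way to supply the label as a condition is to append a one-hot encoding of y to the encoder input and to the latent code. The sketch below assumes this concatenation scheme; the original does not state how y is injected, and the helper names are hypothetical.

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer class index."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def condition(vector, label, num_classes):
    """Append a one-hot class label to an encoder input x or a latent code z."""
    return list(vector) + one_hot(label, num_classes)


x = [0.3, 1.2, 0.7]  # toy expression values (illustrative)
z = [0.1, -0.4]      # toy latent code (illustrative)
encoder_input = condition(x, label=2, num_classes=3)
decoder_input = condition(z, label=2, num_classes=3)
print(encoder_input, decoder_input)
```

With three race labels, for example, both the encoder and the decoder would see three extra input units that identify the class of the sample.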

Fig. 4. Architecture of the conditional variational autoencoder

3.2 Classifier

To establish a classification model, we employ a multilayer perceptron (MLP) classifier on top of the various autoencoder-based techniques. A multilayer perceptron is a neural network that connects multiple layers in a directed graph, which means that the signal path through the nodes goes only one way. Each node, apart from the input nodes, has a nonlinear activation function. An MLP is trained with backpropagation as a supervised learning technique.

3.3 Training

3.3.1 Generative pre-training

The number of samples for a given phenotype prediction task is generally small; however, many other gene expression profiles unrelated to this phenotype are available. These profiles were grouped to form a large dataset of samples without labels. This unlabeled dataset cannot predict the phenotype but helps construct a hierarchical representation of gene expressions in the neural network. The idea is to find nonlinear combinations of inputs that provide functional patterns for gene expression analysis. The unlabeled dataset is used to initialize the weights of the MLP before supervised learning. We pre-trained the AE models iteratively for each hidden layer to learn a denoising AE that reconstructs the previous layer’s output.

In the current setting, a generative approach is one that, given a training dataset (that is, an empirical distribution), can generate synthetic observations that exhibit the essential structural properties of that distribution. The generative training strategies of the VAE and CVAE ultimately yield a pre-trained model that has learned a good representation and can reproduce the relevant features of the given data.

3.3.2 Fine-tuning for classification

Fine-tuning involves adjusting parameters that were pre-trained on large-scale data using small-scale data. We fine-tuned the encoders of the VAE and CVAE, which had been pre-trained on a large amount of imbalanced data. We added a supervised neural network classifier after the encoder of the VAE and CVAE and discarded the decoder. The model was trained with the model loss and a cross-entropy loss, using the Adam optimizer to update the weights.

4. Experiments

4.1 Dataset

TCGA has collected cancer data from various platforms worldwide and has produced a dataset of immense value using standardized analysis methods. These data were obtained through TCGA's data portal. In this study, we collected 1,157 kidney cancer samples from TCGA. We used data in the transcriptome profiling file format, which came with both case files and clinical information files for the samples. Next, we combined the clinical, expression, and case data into a single file based on the case ID and file name using Python. The resulting dataset comprised 1,157 samples and 60,483 gene expression values per sample. The frequencies of the target variables are shown in Table 1. To address the class imbalance problem, we applied AE-based nonlinear data transformation and generation techniques during training.

By gender, the samples comprised 407 women (35.2%) and 750 men (64.8%). By race, they comprised 940 white (81.2%), 150 black or African-American (13.0%), and 17 Asian (1.5%) samples, and the remaining 50 were not reported (4.3%). By sample type, 1,010 samples were primary tumors (87.3%), 139 were solid tissue normal (12%), and the remaining 8 had missing values.

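Table 1. Frequency according to class label (tumor, gender, race)

| Variable | Class label | Frequency | Percentage | Cumulative (%) |
|---|---|---|---|---|
| Sample type | Primary Tumor | 1010 | 87.9 | 87.6 |
| Sample type | Solid Tissue Normal | 139 | 12.1 | 100.0 |
| Sample type | Total | 1149 | 100.0 | |
| Gender | Female | 407 | 35.2 | 35.2 |
| Gender | Male | 750 | 64.8 | 100.0 |
| Gender | Total | 1157 | 100.0 | |
| Race | Asian | 17 | 1.5 | 1.6 |
| Race | Black or African-American | 150 | 13.6 | 14.6 |
| Race | White | 940 | 84.9 | 100.0 |
| Race | Total | 1107 | 100.0 | |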

4.2 Overall analytical structure

We leveraged the integrated data obtained from TCGA to conduct classification analysis on gene expression data using traditional classification techniques and a deep learning-based MLP. Fig. 5 shows the overall framework used with traditional classification algorithms. We calculated the interquartile range (IQR) for outlier detection, a widely used technique for finding outliers in continuously distributed data. We used the IQR in our preprocessing because it is a reasonably robust measure of variability: it is not affected by outliers, since it uses only the middle 50% of the distribution, and it is computationally cheap. First, we eliminated noise and outliers from the kidney cancer genetic data and selected 5,000 genes via chi-square tests. We performed 5-fold cross-validation (80% for training and 20% for testing) on the 5,000 selected genes and used PCA and NMF as data transformation methods. Subsequently, we used the SMOTE algorithm to address the data imbalance problem for the gender, race, and sample type variables. Finally, the classification accuracy for these variables was compared and analyzed by applying k-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), random forest (RF), AdaBoost (AB), naive Bayes (NB), and MLP classifiers.
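The IQR-based outlier rule can be sketched as follows. This is an illustration under assumptions: the fence multiplier `k=1.5` is the conventional choice and is not stated in the original, and the toy values are invented.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the lower and upper fences q1 - k*IQR and q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Toy expression values for one gene (illustrative); 9.8 is an obvious outlier.
expression = [2.1, 2.4, 2.2, 2.5, 2.3, 2.6, 9.8]
print(remove_outliers(expression))
```

Because the fences depend only on the first and third quartiles, a single extreme value such as 9.8 does not shift them, which is the robustness property mentioned above.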

Fig. 5. Traditional classification for gene expression of kidney cancer
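The over-sampling step in this pipeline relies on SMOTE, whose core idea is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below illustrates only that idea; it is not the implementation used in the study (the original does not say which one was used), and the function name and toy data are hypothetical.

```python
import random


def smote_sample(minority, k, rng):
    """Create one synthetic point between a minority sample and a near neighbor."""
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbor = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbor)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]  # toy minority class
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
print(synthetic)
```

Every synthetic point lies on a line segment between two existing minority samples, so the minority class is enlarged without simply duplicating records.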

The AE-based deep learning framework for classifying race, gender, and sample type is shown in Fig. 6. In the AE-based techniques, we likewise eliminated noise and outliers and selected 5,000 genes via chi-square tests. We performed 5-fold cross-validation (80% for training and 20% for testing) on the 5,000 selected features and the corresponding data samples, applying AE, VAE, SAE, and CVAE during the pre-training and training phases. Finally, we extracted 100 latent variables. We also addressed the imbalanced data problem by fine-tuning the generatively pre-trained encoder on the highly imbalanced data. The encoder and MLP were combined as a classifier to predict race, gender, and sample type.

In contrast to Fig. 5, the experiments assess two different approaches to training the classification model: fine-tuning the entire network, and embedding the AEs into the classification network by importing only the encoding layers. Unsupervised pre-training on the gene expression data followed by fine-tuning on specific tasks affects the classification performance. The experimental results, reported in the following sections, show that the autoencoder-based approaches achieved higher classification performance than the traditional classification approaches.

Fig. 6. Autoencoder-based classification for gene expression data of kidney cancer

4.3 Evaluation measures

To evaluate classification performance, we computed precision, recall, and F1-score from the confusion matrix. Precision is the proportion of predicted positives that are truly positive, and recall is the proportion of actual positives that are correctly predicted. The F1-score is the harmonic mean of precision and recall, so the larger the gap between the two, the greater the penalty and the closer the score is to the smaller value. We compared the macro-average and micro-average because our target data had an imbalance problem. The macro-average is used to verify whether a classifier works well for all classes, and it is appropriate when all classes are of the same size. The micro-average is used when the class sizes differ, that is, when the independently measured confusion matrices have different sizes; it can therefore be used more effectively on datasets with class imbalance. The abbreviations used in the confusion matrix are true positive (TP), false positive (FP), false negative (FN), and true negative (TN). For example, with two class labels, the micro-precision, micro-recall, and micro-F1-score are given by equations (3) to (5).

(3)
$\text{Micro-Precision}=\dfrac{TP_{1}+TP_{2}}{TP_{1}+FP_{1}+TP_{2}+FP_{2}}$

(4)
$\text{Micro-Recall}=\dfrac{TP_{1}+TP_{2}}{TP_{1}+FN_{1}+TP_{2}+FN_{2}}$

(5)
$\text{Micro-F1-score}=2\times\dfrac{\text{Micro-Precision}\times\text{Micro-Recall}}{\text{Micro-Precision}+\text{Micro-Recall}}$

Macro-averaging computes each metric per class and then takes the unweighted mean. Thus, macro-averaging does not consider the number of samples in each class. Macro-precision, macro-recall, and macro-F1-score can be expressed using equations (6) to (10).

(6)
$\text{Precision}=\dfrac{TP}{TP+FP}$

(7)
$\text{Recall}=\dfrac{TP}{TP+FN}$

(8)
$\text{Macro-Precision}=\dfrac{\text{Precision for Class 1}+\text{Precision for Class 2}}{2}$

(9)
$\text{Macro-Recall}=\dfrac{\text{Recall for Class 1}+\text{Recall for Class 2}}{2}$

(10)
$\text{Macro-F1-score}=2\times\dfrac{\text{Macro-Precision}\times\text{Macro-Recall}}{\text{Macro-Precision}+\text{Macro-Recall}}$
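Equations (3) to (10) can be sketched in a few lines. The per-class counts below are hypothetical and chosen only to show how the two averages diverge on imbalanced data; the function names are illustrative.

```python
def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f1(precision, recall):
    """Harmonic mean of precision and recall, as in equations (5) and (10)."""
    return safe_div(2 * precision * recall, precision + recall)


def micro_macro(counts):
    """counts: list of per-class dicts with keys 'tp', 'fp', 'fn'."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    micro_p, micro_r = safe_div(tp, tp + fp), safe_div(tp, tp + fn)

    precisions = [safe_div(c["tp"], c["tp"] + c["fp"]) for c in counts]
    recalls = [safe_div(c["tp"], c["tp"] + c["fn"]) for c in counts]
    macro_p = sum(precisions) / len(counts)
    macro_r = sum(recalls) / len(counts)

    return {
        "micro": (micro_p, micro_r, f1(micro_p, micro_r)),
        "macro": (macro_p, macro_r, f1(macro_p, macro_r)),
    }


# Hypothetical two-class counts with a 9:1 imbalance (not from the paper).
counts = [
    {"tp": 90, "fp": 8, "fn": 0},  # majority class
    {"tp": 2, "fp": 0, "fn": 8},   # minority class
]
scores = micro_macro(counts)
print(scores)
```

In this toy case the minority class has a recall of only 0.2, which pulls the macro-recall down to 0.6, while the micro-average stays at 0.92 because it is dominated by the majority class. Note that equation (10) defines the macro-F1-score as the harmonic mean of macro-precision and macro-recall; scikit-learn's `average="macro"` option instead averages the per-class F1-scores, so the two conventions can give different values.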

All experiments were executed on a machine with an Intel Xeon E5-2698 v4 CPU @ 2.20 GHz, 256 GB of memory, and an NVIDIA Tesla V100 32 GB GPU, running the Ubuntu 18.04 operating system. We used the Scikit-Learn (42) and PyTorch (43) libraries with the Python programming language for all analyses.

5. Results

This section evaluates our approach and compares it with other unsupervised feature extraction techniques followed by over-sampling and state-of-the-art classifiers. We also report on an ablation study conducted to explore the 20 most significant genes for each clinical variable.

Tables 2-4 show the performance comparison among all methods according to gender, race, and sample type, respectively. Micro-averaged scores are higher than the corresponding macro-averaged scores. The AE-based methods achieved higher performance than the conventional feature extraction methods, which suggests that AE-based methods better capture the complexity of cancer and produce more meaningful features. We used only an MLP classifier for the features extracted by AE-based methods because of their neural network structure, and we did not apply any sampling to them. In handling the data imbalance problem, the generative AE-based methods achieved higher performance than the traditional algorithms combined with SMOTE over-sampling.

Table 2 presents the classification results for gender. The results show that VAE achieved a macro-F1-score of 0.958 and a micro-F1-score of 0.962, indicating higher classification performance than the other methods. It offers results comparable with the other AE-based methods and improves on the best conventional result, PCA+SVM with SMOTE over-sampling, by 0.021 macro- and 0.020 micro-F1-score. A gender disparity exists in the incidence of kidney carcinomas, with a higher incidence reported in men (44). Men are at a higher risk of developing kidney cancer and usually have more aggressive disease at the time of diagnosis. Females generally show more favorable kidney cancer histology and have better oncological outcomes than males (45). Extracting valuable features with VAE or other AE-based methods may provide deeper insight into gender-related differences in kidney cancer therapy.

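Table 2. Classification performance evaluation according to gender

| Feature Extraction | Sampling | Classifier | Micro-Precision | Micro-Recall | Micro-F1-score | Macro-Precision | Macro-Recall | Macro-F1-score |
|---|---|---|---|---|---|---|---|---|
| AE | FALSE | MLP | 0.953 | 0.952 | 0.953 | 0.945 | 0.952 | 0.948 |
| VAE | FALSE | MLP | 0.963 | 0.962 | 0.962 | 0.958 | 0.960 | 0.958 |
| CAE | FALSE | MLP | 0.950 | 0.950 | 0.950 | 0.945 | 0.946 | 0.945 |
| CVAE | FALSE | MLP | 0.958 | 0.957 | 0.957 | 0.952 | 0.955 | 0.953 |
| NMF | FALSE | AB | 0.908 | 0.907 | 0.907 | 0.896 | 0.901 | 0.898 |
| NMF | FALSE | DT | 0.894 | 0.893 | 0.893 | 0.884 | 0.881 | 0.882 |
| NMF | FALSE | KNN | 0.657 | 0.671 | 0.659 | 0.630 | 0.615 | 0.616 |
| NMF | FALSE | MLP | 0.835 | 0.836 | 0.835 | 0.822 | 0.814 | 0.818 |
| NMF | FALSE | NB | 0.804 | 0.798 | 0.784 | 0.808 | 0.740 | 0.753 |
| NMF | FALSE | RF | 0.910 | 0.910 | 0.909 | 0.909 | 0.892 | 0.899 |
| NMF | FALSE | SVM | 0.781 | 0.777 | 0.759 | 0.785 | 0.710 | 0.722 |
| NMF | TRUE | AB | 0.909 | 0.908 | 0.908 | 0.897 | 0.903 | 0.900 |
| NMF | TRUE | DT | 0.868 | 0.867 | 0.867 | 0.854 | 0.856 | 0.854 |
| NMF | TRUE | KNN | 0.647 | 0.603 | 0.612 | 0.603 | 0.613 | 0.594 |
| NMF | TRUE | MLP | 0.847 | 0.847 | 0.847 | 0.834 | 0.829 | 0.831 |
| NMF | TRUE | NB | 0.815 | 0.813 | 0.805 | 0.814 | 0.768 | 0.780 |
| NMF | TRUE | RF | 0.916 | 0.915 | 0.915 | 0.908 | 0.908 | 0.907 |
| NMF | TRUE | SVM | 0.777 | 0.754 | 0.758 | 0.743 | 0.759 | 0.743 |
| PCA | FALSE | AB | 0.867 | 0.868 | 0.866 | 0.860 | 0.846 | 0.852 |
| PCA | FALSE | DT | 0.736 | 0.737 | 0.736 | 0.711 | 0.709 | 0.710 |
| PCA | FALSE | KNN | 0.737 | 0.744 | 0.732 | 0.725 | 0.689 | 0.697 |
| PCA | FALSE | MLP | 0.943 | 0.943 | 0.943 | 0.939 | 0.936 | 0.937 |
| PCA | FALSE | NB | 0.657 | 0.677 | 0.639 | 0.640 | 0.585 | 0.578 |
| PCA | FALSE | RF | 0.845 | 0.833 | 0.822 | 0.858 | 0.778 | 0.797 |
| PCA | FALSE | SVM | 0.940 | 0.939 | 0.939 | 0.936 | 0.932 | 0.933 |
| PCA | TRUE | AB | 0.867 | 0.864 | 0.865 | 0.850 | 0.858 | 0.853 |
| PCA | TRUE | DT | 0.762 | 0.759 | 0.759 | 0.738 | 0.739 | 0.737 |
| PCA | TRUE | KNN | 0.736 | 0.692 | 0.699 | 0.692 | 0.710 | 0.685 |
| PCA | TRUE | MLP | 0.940 | 0.940 | 0.940 | 0.934 | 0.934 | 0.934 |
| PCA | TRUE | NB | 0.756 | 0.764 | 0.748 | 0.746 | 0.708 | 0.712 |
| PCA | TRUE | RF | 0.857 | 0.858 | 0.856 | 0.851 | 0.834 | 0.841 |
| PCA | TRUE | SVM | 0.943 | 0.942 | 0.942 | 0.936 | 0.938 | 0.937 |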

Table 3 presents the results for race, where the labels are white, black or African-American, and Asian and the class imbalance is very severe. Our data included 940 white (81.2%), 150 black or African-American (13.0%), and 17 Asian (1.5%) samples. Race prediction is clearly a more challenging task than the other clinical prediction tasks, and it shows a much lower macro-averaged performance. The results show that CVAE achieved a macro-F1-score of 0.763 and a micro-F1-score of 0.959, indicating higher classification performance than the other methods. It offers a micro-F1-score comparable with the other AE-based methods and improves on the best conventional result, PCA+SVM with SMOTE over-sampling, by 0.121 macro- and 0.018 micro-F1-score. It also improves on the macro-F1-score of the AE method by 0.076. We can conclude that CVAE captures the complexity of cancer and handles more complex tasks better than the other AE-based methods. Surveillance and epidemiology data indicate that kidney cancer incidence and mortality rates are higher among African-American patients than among white patients (46).

White and Asian patients (age 63.9 and 62.6 years, respectively) had a slightly older age of onset than Black and Native American patients (age 60.7 and 60.3 years) (47). Although feature extraction for racial information is challenging, we achieved a macro-F1-score above 0.70 using the CVAE method.

Table 3. Classification performance evaluation of race

| Feature Extraction | Sampling | Classifier | Micro-Precision | Micro-Recall | Micro-F1-score | Macro-Precision | Macro-Recall | Macro-F1-score |
|---|---|---|---|---|---|---|---|---|
| AE | FALSE | MLP | 0.953 | 0.961 | 0.956 | 0.743 | 0.660 | 0.687 |
| VAE | FALSE | MLP | 0.956 | 0.964 | 0.958 | 0.767 | 0.663 | 0.685 |
| CAE | FALSE | MLP | 0.955 | 0.960 | 0.956 | 0.720 | 0.662 | 0.678 |
| CVAE | FALSE | MLP | 0.961 | 0.959 | 0.958 | 0.832 | 0.753 | 0.763 |
| NMF | FALSE | AB | 0.857 | 0.849 | 0.848 | 0.515 | 0.479 | 0.489 |
| NMF | FALSE | DT | 0.874 | 0.880 | 0.877 | 0.530 | 0.517 | 0.523 |
| NMF | FALSE | KNN | 0.790 | 0.844 | 0.809 | 0.480 | 0.402 | 0.416 |
| NMF | FALSE | MLP | 0.845 | 0.873 | 0.852 | 0.505 | 0.456 | 0.468 |
| NMF | FALSE | NB | 0.807 | 0.589 | 0.657 | 0.384 | 0.409 | 0.355 |
| NMF | FALSE | RF | 0.889 | 0.914 | 0.889 | 0.576 | 0.508 | 0.516 |
| NMF | FALSE | SVM | 0.772 | 0.867 | 0.812 | 0.373 | 0.385 | 0.369 |
| NMF | TRUE | AB | 0.866 | 0.845 | 0.853 | 0.512 | 0.507 | 0.506 |
| NMF | TRUE | DT | 0.871 | 0.842 | 0.855 | 0.508 | 0.515 | 0.509 |
| NMF | TRUE | KNN | 0.814 | 0.622 | 0.685 | 0.421 | 0.535 | 0.405 |
| NMF | TRUE | MLP | 0.847 | 0.857 | 0.851 | 0.608 | 0.518 | 0.540 |
| NMF | TRUE | NB | 0.800 | 0.594 | 0.656 | 0.376 | 0.413 | 0.349 |
| NMF | TRUE | RF | 0.901 | 0.921 | 0.905 | 0.586 | 0.537 | 0.550 |
| NMF | TRUE | SVM | 0.810 | 0.601 | 0.671 | 0.416 | 0.475 | 0.388 |
| PCA | FALSE | AB | 0.819 | 0.852 | 0.827 | 0.468 | 0.415 | 0.424 |
| PCA | FALSE | DT | 0.822 | 0.836 | 0.828 | 0.457 | 0.433 | 0.442 |
| PCA | FALSE | KNN | 0.826 | 0.858 | 0.816 | 0.510 | 0.378 | 0.387 |
| PCA | FALSE | MLP | 0.933 | 0.946 | 0.937 | 0.626 | 0.590 | 0.604 |
| PCA | FALSE | NB | 0.797 | 0.834 | 0.809 | 0.456 | 0.402 | 0.414 |
| PCA | FALSE | RF | 0.847 | 0.860 | 0.807 | 0.575 | 0.364 | 0.364 |
| PCA | FALSE | SVM | 0.935 | 0.947 | 0.939 | 0.629 | 0.594 | 0.608 |
| PCA | TRUE | AB | 0.847 | 0.849 | 0.846 | 0.506 | 0.488 | 0.492 |
| PCA | TRUE | DT | 0.825 | 0.799 | 0.811 | 0.448 | 0.472 | 0.456 |
| PCA | TRUE | KNN | 0.817 | 0.669 | 0.722 | 0.412 | 0.478 | 0.408 |
| PCA | TRUE | MLP | 0.940 | 0.950 | 0.945 | 0.625 | 0.610 | 0.616 |
| PCA | TRUE | NB | 0.891 | 0.883 | 0.885 | 0.543 | 0.533 | 0.534 |
| PCA | TRUE | RF | 0.886 | 0.900 | 0.878 | 0.601 | 0.471 | 0.503 |
| PCA | TRUE | SVM | 0.938 | 0.948 | 0.941 | 0.691 | 0.622 | 0.642 |

Table 4 presents the breakdown results for the sample types. The sample-type labels comprised 1,010 primary tumors (87.3%) and 139 solid tissue normal samples (12%); the remaining samples had missing values. All AE-based methods achieved comparable results, with a macro-F1-score of 0.996 and a micro-F1-score of 0.998, and a higher classification performance than the other methods. They improve on the best conventional results (PCA+KNN and PCA+MLP without over-sampling, and PCA+MLP with SMOTE over-sampling) by 0.002 in macro-F1-score and 0.001 in micro-F1-score. Survival in patients with kidney cancer can be correlated with the expression of various genes based solely on the expression profile in the primary kidney tumor (48). Compared with the other tasks, the extracted features are easier to distinguish, and predicting sample type is much easier. For sample types, classifiers based on both traditional techniques and deep learning performed well. Overall, the AE-based pre-training approach performs slightly better than the other compared methods. There are several methods to predict cancer subtypes or sample types using deep learning techniques on gene expression data (49-51). To the best of our knowledge, methods identifying kidney cancer biomarkers by combining AE-based methods and model interpretation techniques are still lacking.
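SMOTE over-sampling, used with the conventional baselines, creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates that idea in plain Python. The function name `smote_sketch`, the value `k=3`, and the toy points are assumptions made for illustration; the text does not state which implementation or settings were used.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(x, m))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical 2-D features for a rare class (illustrative values only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points), "synthetic samples generated")
```

Each synthetic point lies on a line segment between two existing minority samples, so over-sampling enlarges the rare class without simply duplicating rows.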

In general, unsupervised learning algorithms applied to gene expression data extract both the biological and the technical signals present in the input samples. It is best to compress gene expression data using several algorithms and many different latent space dimensionalities. The resulting compressed features represent important biological signals, including gender, race, and presence of tumor. Through several experiments tracking lower-dimensional gene expression representations and supervised learning performance, we showed that optimal biological features are learned using a variety of latent space dimensionalities and different compression algorithms.
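The idea of tracking supervised performance across latent dimensionalities can be sketched as a simple sweep. The example below is only a schematic under stated assumptions: `top_variance_compress` is a hypothetical stand-in for a real compressor (AE, VAE, PCA, or NMF), the nearest-centroid scorer replaces the classifiers used in the study, and the data are invented so that the highest-variance feature is pure noise.

```python
import statistics


def top_variance_compress(X, k):
    """Stand-in compressor (assumption): keep the k highest-variance columns."""
    n_features = len(X[0])
    variances = [statistics.pvariance([row[j] for row in X]) for j in range(n_features)]
    keep = sorted(range(n_features), key=lambda j: variances[j], reverse=True)[:k]
    return [[row[j] for j in keep] for row in X]


def nearest_centroid_accuracy(Z, y):
    """Training-set accuracy of a nearest-centroid classifier (toy metric)."""
    centroids = {}
    for label in sorted(set(y)):  # sorted for deterministic tie-breaking
        rows = [z for z, t in zip(Z, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(z):
        return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(z, centroids[c])))

    return sum(predict(z) == t for z, t in zip(Z, y)) / len(y)


# Invented data: column 0 is high-variance noise, column 1 carries the
# class signal, column 2 is nearly constant.
X = [
    [10.0, 3.0, 1.0], [-10.0, 3.2, 1.1], [10.0, 2.9, 0.9], [-10.0, 3.1, 1.0],
    [10.0, 0.0, 1.0], [-10.0, 0.1, 1.1], [10.0, -0.1, 0.9], [-10.0, 0.2, 1.0],
]
y = ["tumor"] * 4 + ["normal"] * 4

# Sweep the latent dimensionality and record the downstream score for each.
results = {k: nearest_centroid_accuracy(top_variance_compress(X, k), y) for k in (1, 2, 3)}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

In this toy setting a single latent dimension keeps only the noisy feature, and the score improves once the dimensionality is large enough to retain the informative one, which is why several dimensionalities are worth comparing.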

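Table 4. Classification performance evaluation according to sample type (presence of tumor)

| Feature Extraction | Sampling | Classifier | Micro-Precision | Micro-Recall | Micro-F1-score | Macro-Precision | Macro-Recall | Macro-F1-score |
|---|---|---|---|---|---|---|---|---|
| AE | FALSE | MLP | 0.998 | 0.998 | 0.998 | 0.996 | 0.996 | 0.996 |
| VAE | FALSE | MLP | 0.998 | 0.998 | 0.998 | 0.996 | 0.996 | 0.996 |
| CAE | FALSE | MLP | 0.998 | 0.998 | 0.998 | 0.996 | 0.996 | 0.996 |
| CVAE | FALSE | MLP | 0.998 | 0.998 | 0.998 | 0.996 | 0.996 | 0.996 |
| NMF | FALSE | AB | 0.989 | 0.989 | 0.989 | 0.970 | 0.978 | 0.974 |
| NMF | FALSE | DT | 0.984 | 0.983 | 0.984 | 0.951 | 0.975 | 0.962 |
| NMF | FALSE | KNN | 0.976 | 0.974 | 0.974 | 0.929 | 0.957 | 0.941 |
| NMF | FALSE | MLP | 0.991 | 0.990 | 0.990 | 0.983 | 0.973 | 0.977 |
| NMF | FALSE | NB | 0.995 | 0.995 | 0.995 | 0.982 | 0.994 | 0.988 |
| NMF | FALSE | RF | 0.995 | 0.995 | 0.995 | 0.991 | 0.984 | 0.988 |
| NMF | FALSE | SVM | 0.964 | 0.963 | 0.960 | 0.963 | 0.863 | 0.897 |
| NMF | TRUE | AB | 0.992 | 0.991 | 0.991 | 0.974 | 0.986 | 0.980 |
| NMF | TRUE | DT | 0.977 | 0.975 | 0.975 | 0.929 | 0.961 | 0.943 |
| NMF | TRUE | KNN | 0.959 | 0.943 | 0.948 | 0.845 | 0.955 | 0.887 |
| NMF | TRUE | MLP | 0.992 | 0.992 | 0.992 | 0.984 | 0.980 | 0.981 |
| NMF | TRUE | NB | 0.995 | 0.995 | 0.995 | 0.985 | 0.991 | 0.988 |
| NMF | TRUE | RF | 0.994 | 0.994 | 0.994 | 0.987 | 0.984 | 0.986 |
| NMF | TRUE | SVM | 0.981 | 0.978 | 0.979 | 0.931 | 0.978 | 0.952 |
| PCA | FALSE | AB | 0.993 | 0.993 | 0.993 | 0.987 | 0.980 | 0.983 |
| PCA | FALSE | DT | 0.980 | 0.979 | 0.979 | 0.962 | 0.942 | 0.950 |
| PCA | FALSE | KNN | 0.997 | 0.997 | 0.997 | 0.993 | 0.995 | 0.994 |
| PCA | FALSE | MLP | 0.997 | 0.997 | 0.997 | 0.992 | 0.995 | 0.994 |
| PCA | FALSE | NB | 0.985 | 0.984 | 0.984 | 0.961 | 0.966 | 0.964 |
| PCA | FALSE | RF | 0.987 | 0.987 | 0.987 | 0.989 | 0.949 | 0.968 |
| PCA | FALSE | SVM | 0.996 | 0.996 | 0.996 | 0.988 | 0.991 | 0.990 |
| PCA | TRUE | AB | 0.991 | 0.991 | 0.991 | 0.980 | 0.979 | 0.979 |
| PCA | TRUE | DT | 0.986 | 0.985 | 0.985 | 0.968 | 0.963 | 0.965 |
| PCA | TRUE | KNN | 0.995 | 0.995 | 0.995 | 0.983 | 0.994 | 0.988 |
| PCA | TRUE | MLP | 0.997 | 0.997 | 0.997 | 0.992 | 0.995 | 0.994 |
| PCA | TRUE | NB | 0.969 | 0.967 | 0.968 | 0.914 | 0.941 | 0.925 |
| PCA | TRUE | RF | 0.993 | 0.993 | 0.993 | 0.993 | 0.974 | 0.983 |
| PCA | TRUE | SVM | 0.996 | 0.996 | 0.996 | 0.988 | 0.991 | 0.990 |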

6. Conclusions

We combined kidney cancer clinical data and gene expression data collected from the TCGA database to extract genes significant for gender, race, and sample type, and conducted a classification analysis on these data. We compared traditional classification techniques with deep learning approaches based on pre-training processes such as AE, VAE, SAE, and CVAE. For the traditional techniques, PCA and NMF were used for feature extraction, whereas in the proposed deep learning–based techniques, important genes were extracted through pre-training (AE, VAE, SAE, and CVAE) followed by fine-tuning. The deep learning–based gene extraction methods performed better.

There are several methods to predict cancer subtypes or sample types using deep learning techniques on gene expression data. To the best of our knowledge, there is a lack of methods for identifying kidney cancer biomarkers that combine AE-based methods and model interpretation techniques. As shown in Tables 2-4, extracting race-related features is the most challenging task, and sample type feature extraction is much easier than the other tasks. For the challenging tasks, CVAE outperforms the other methods.

Furthermore, we compared micro- and macro-averaged measures according to the number of class labels of the target variables. The micro-averaged measures were higher than the macro-averaged ones. In the future, the functions of the extracted genes can be confirmed through verification, which may help predict the prognosis of kidney cancer patients. In further work, we will consider other data types, such as clinical, RNA, and DNA methylation data.

Acknowledgements

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, under Grant Nos. 2019R1F1A1051569, 2020R1I1A1A01065199, and 2020R1A6A1A12047945.

References

1
V. M. G. Olivares, L. M. G. Torres, G. H. Cuartas, M. C. N. De la Hoz, 2019, Immunohistochemical profile of renal cell tumours, Revista Espanola De Patologia, Vol. 52, No. 4, pp. 214-221
2
J. J. Hsieh, M. P. Purdue, S. Signoretti, C. Swanton, L. Albiges, M. Schmidinger, D. Y. Heng, J. Larkin, V. Ficarra, 2017, Renal cell carcinoma, Nat. Rev. Dis. Primers, Vol. 3, No. 17010, pp. 1-19
3
W. M. Linehan, M. M. Walther, B. Zbar, 2003, The genetic basis of cancer of the kidney, J. Urol., Vol. 170, pp. 2163-2172
4
W. M. Linehan, B. Zbar, 2004, Focus on kidney cancer, Cancer Cell, Vol. 6, No. 3, pp. 223-228
5
Cancer Genome Atlas Research Network, 2016, Comprehensive molecular characterization of papillary renal-cell carcinoma, N. Engl. J. Med., Vol. 374, No. 2, pp. 135-145
6
L. A. Torre, B. Trabert, C. E. DeSantis, K. D. Miller, G. Samimi, C. D. Runowicz, M. M. Gaudet, A. Jemal, R. L. Siegel, 2018, Ovarian cancer statistics, 2018, CA Cancer J. Clin., Vol. 68, pp. 284-296
7
A. J. Peired, R. Campi, M. L. Angelotti, G. Antonelli, C. Conte, E. Lazzeri, F. Becherucci, L. Calistri, S. Serni, P. Romagnani, 2021, Sex and Gender Differences in Kidney Cancer: Clinical and Experimental Evidence, Cancers, Vol. 13, No. 18, pp. 4588
8
Y. Zhan, C. Pan, Y. Zhao, J. Li, B. Wu, S. Bai, 2021, Systematic Analysis of the Global, Regional and National Burden of Kidney Cancer from 1990 to 2017: Results from the Global Burden of Disease Study 2017, Eur. Urol. Focus, Vol. 8, No. 1, pp. 302-319
9
N. Chowdhury, C. G. Drake, 2020, Kidney cancer: an overview of current therapeutic approaches, Urol. Clin., Vol. 47, No. 4, pp. 419-431
10
D. A. Siegel, S. J. Henley, J. Li, L. A. Pollack, E. A. Van Dyne, A. White, 2017, Rates and trends of pediatric acute lymphoblastic leukemia - United States, 2001–2014, Morb. Mortal. Wkly. Rep., Vol. 66, No. 36, pp. 950-954
11
A. F. Olshan, Y. M. Kuo, A. M. Meyer, M. E. Nielsen, M. P. Purdue, W. K. Rathmell, 2013, Racial difference in histologic subtype of renal cell carcinoma, Cancer Med., Vol. 2, No. 5, pp. 744-749
12
L. Lipworth, A. K. Morgans, T. L. Edwards, D. A. Barocas, S. S. Chang, S. D. Herrell, D. F. Penson, M. J. Resnick, J. A. Smith, P. E. Clark, 2016, Renal cell cancer histological subtype distribution differs by race and sex, BJU Int., Vol. 117, No. 2, pp. 260-265
13
T. R. Rebbeck, 2018, Prostate cancer disparities by race and ethnicity: from nucleotide to neighborhood, Cold Spring Harbor Persp. Med., Vol. 8, No. 9, pp. a030387
14
S. J. O. Nomura, Y. T. Hwang, S. L. Gomez, T. T. Fung, S. L. Yeh, C. Dash, L. Allen, S. Philips, L. Hilakivi-Clarke, Y. L. Zheng, J. H. Y. Wang, 2017, Dietary intake of soy and cruciferous vegetables and treatment-related symptoms in Chinese-American and non-Hispanic White breast cancer survivors, Breast Cancer Res. Treat., Vol. 168, No. 2, pp. 467-479
15
P. Mamoshina, K. Kochetov, E. Putin, F. Cortese, A. Aliper, W. S. Lee, S. M. Ahn, L. Uhn, N. Skjodt, O. Kovalchuk, M. Scheibye-Knudsen, 2018, Population specific biomarkers of human aging: a big data study using South Korean, Canadian, and Eastern European patient populations, J. Gerontology: Ser. A, Vol. 73, No. 11, pp. 1482-1490
16
H. Y. Xiong, B. Alipanahi, L. J. Lee, H. Bretschneider, D. Merico, R. K. Yuen, Y. Hua, S. Gueroussov, H. S. Najafabadi, T. R. Hughes, Q. Morris, Y. Barash, A. R. Krainer, N. Jojic, S. W. Scherer, B. J. Blencowe, B. J. Frey, 2015, RNA splicing. The human splicing code reveals new insights into the genetic determinants of disease, Science, Vol. 347, No. 6218, pp. 1-20
17
M. Amgad, H. Elfandy, H. Hussein, L. A. Atteya, M. A. T. Elsebaie, L. S. A. Elnasr, R. A. Sakr, H. S. E. Salem, A. F. Ismail, A. M. Saad, J. Ahmed, M. A. T. Elsebaie, M. Rahman, I. A. Ruhban, N. M. Elgazar, Y. Alagha, M. H. Osman, A. M. Alhusseiny, M. M. Khalaf, A. F. Younes, A. Abdulkarim, D. M. Younes, A. M. Gadallah, A. M. Elkashash, S. Y. Fala, B. M. Zaki, J. Beezley, D. R. Chittajallu, D. Manthey, D. A. Gutman, L. A. D. Cooper, 2019, Structured crowdsourcing enables convolutional segmentation of histology images, Bioinformatics, Vol. 35, No. 18, pp. 3461-3467
18
V. M. G. Olivares, L. M. G. Torres, G. H. Cuartas, M. C. N. De la Hoz, 2019, Immunohistochemical profile of renal cell tumours, Rev. Esp. Patol., Vol. 52, No. 4, pp. 214-221
19
National Cancer Center. Available online: https://ncc.re.kr/index (accessed on 17 August 2021)
20
B. H. Chi, I. H. Chang, 2018, The overdiagnosis of kidney cancer in Koreans and the active surveillance on small renal mass, Korean J. Urol. Oncol., Vol. 16, No. 1, pp. 15-24
21
A. M. Ali, H. Zhuang, A. Ibrahim, O. Rehman, M. Huang, A. Wu, 2018, A machine learning approach for the classification of kidney cancer subtypes using miRNA genome data, Appl. Sci., Vol. 8, No. 12, pp. 1-14
22
H. M. Kim, S. J. Lee, S. J. Park, I. Y. Choi, S. H. Hong, 2021, Machine learning approach to predict the probability of recurrence of renal cell carcinoma after surgery: Prediction model development study, JMIR Med. Inform., Vol. 9, No. 3, pp. e25635
23
Genomic Data Commons. Available online: https://portal.gdc.cancer.gov (accessed on 17 August 2021)
24
B. J. Kim, S. H. Kim, 2018, Prediction of inherited genomic susceptibility to 20 common cancer types by a supervised machine-learning method, PNAS USA, Vol. 115, No. 6, pp. 1322-1327
25
O. G. Troyanskaya, K. Dolinski, A. B. Owen, R. B. Altman, D. Botstein, 2003, A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae), PNAS USA, Vol. 100, No. 14, pp. 8348-8353
26
N. Hadjiyski, 2020, Kidney cancer staging: Deep learning neural network based approach, 2020 International Conference on E-Health and Bioengineering (EHB 2020)
27
H. S. Shon, E. Batbaatar, K. O. Kim, E. J. Cha, K. A. Kim, 2020, Classification of kidney cancer data using cost-sensitive hybrid deep learning approach, Symmetry, Vol. 12, No. 1, 154
28
N. Simidjievski, C. Bodnar, I. Tariq, P. Scherer, H. A. Terre, Z. Shams, M. Jamnik, P. Liò, 2019, Variational autoencoders for cancer data integration: Design principles and computational practice, Front. Genet., Vol. 10, No. 1205
29
P. Baldi, K. Hornik, 1989, Neural networks and principal component analysis: Learning from examples without local minima, Neural Netw., Vol. 2, pp. 53-58
30
M. Mohri, A. Rostamizadeh, A. Talwalkar, 2012, Foundations of Machine Learning, MIT Press
31
P. Vincent, H. Larochelle, L. Lajoie, Y. Bengio, P. A. Manzagol, 2010, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res., Vol. 11, pp. 3371-3408
32
M. A. Ranzato, C. S. Poultney, S. Chopra, Y. LeCun, 2007, Efficient learning of sparse representations with an energy-based model, Adv. Neural Inf. Process. Syst., Vol. 19, pp. 1137-1144
33
D. P. Kingma, M. Welling, 2014, Auto-encoding variational bayes, Proceedings of the 2nd International Conference on Learning Representations
34
Y. Pu, Z. Gan, R. Henao, X. Yuan, C. Li, A. Stevens, L. Carin, 2016, Variational autoencoder for deep learning of images, labels and captions, 30th Conference on Neural Information Processing Systems (NIPS 2016)
35
K. Simonyan, A. Zisserman, 2015, Very deep convolutional networks for large-scale image recognition, The 3rd International Conference on Learning Representations (ICLR)
36
L. Le, A. Patterson, M. White, 2018, Supervised autoencoders: Improving generalization performance with unsupervised regularizers, 32nd Conference on Neural Information Processing Systems (NIPS 2018)
37
M. Mohri, A. Rostamizadeh, D. Storcheus, 2015, Generalization bounds for supervised dimensionality reduction, JMLR: Workshop and Conf. Proc., Vol. 44, pp. 226-241
38
L. A. Gottlieb, A. Kontorovich, R. Krauthgamer, 2016, Adaptive metric dimensionality reduction, Theor. Comput. Sci., Vol. 620, pp. 105-118
39
K. Sohn, H. Lee, X. Yan, 2015, Learning structured output representation using deep conditional generative models, Proceedings of the 28th International Conference on Neural Information Processing Systems, Vol. 2, pp. 3483-3491
40
S. Belharbi, R. Hérault, C. Chatelain, S. Adam, 2018, Deep neural networks regularization for structured output prediction, Neurocomputing, Vol. 281, pp. 169-177
41
Y. Bengio, E. Laufer, G. Alain, J. Yosinski, 2014, Deep generative stochastic networks trainable by backprop, Proceedings of the 31st International Conference on Machine Learning, Vol. 32, pp. 226-234
42
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, 2011, Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., Vol. 12, pp. 2825-2830
43
A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, 2019, Pytorch: An imperative style, high-performance deep learning library, Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 8026-8037
44
I. Lucca, T. Klatte, H. Fajkovic, M. De Martino, S. F. Shariat, 2015, Gender differences in incidence and outcomes of urothelial and kidney cancer, Nat. Rev. Urol., Vol. 12, No. 12, pp. 585-592
45
M. Mancini, M. Righetto, G. Baggio, 2020, Gender-related approach to kidney cancer management: Moving forward, Int. J. Mol. Sci., Vol. 21, No. 9, pp. 3378
46
D. Hepps, A. Chernoff, 2006, Risk of renal insufficiency in African-Americans after radical nephrectomy for kidney cancer, Urologic Oncology: Seminars and Original Investigations, Vol. 24, No. 5, pp. 391-395
47
B. Shuch, S. Vourganti, C. J. Ricketts, L. Middleton, J. Peterson, M. J. Merino, A. R. Metwalli, R. Srinivasan, W. M. Linehan, 2014, Defining early-onset kidney cancer: implications for germline and somatic mutation testing and clinical management, J. Clin. Oncol., Vol. 32, No. 5, pp. 431-437
48
J. R. Vasselli, J. H. Shih, S. R. Iyengar, J. Maranchie, J. Riss, R. Worrell, C. Torres-Cabala, R. Tabios, A. Mariotti, R. Stearman, M. Merino, W. M. Linehan, 2003, Predicting survival in patients with metastatic kidney cancer by gene-expression profiling in the primary tumor, Proceedings of the National Academy of Sciences, Vol. 100, No. 12, pp. 6958-6963
49
M. Mostavi, Y. C. Chiu, Y. Huang, Y. Chen, 2020, Convolutional neural network models for cancer type prediction based on gene expression, BMC Med. Genom., Vol. 13, No. 44, pp. 1-13
50
N. E. M. Khalifa, M. H. N. Taha, D. E. Ali, A. Slowik, A. E. Hassanien, 2020, Artificial intelligence technique for gene expression by tumor RNA-Seq data: a novel optimized deep learning approach, IEEE Access, Vol. 8, pp. 22874-22883
51
R. Tabares-Soto, S. Orozco-Arias, V. Romero-Cano, V. S. Bucheli, J. L. Rodríguez-Sotelo, C. F. Jiménez-Varón, 2020, A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data, PeerJ Comput. Sci., Vol. 6, No. e270

About the Authors

손호선(Ho Sun Shon)

2010 : Ph.D in Computer Science, Chungbuk National University, Korea.

2012 to present : Visiting professor in Medical Research Institute, School of Medicine, Chungbuk National University, Korea.

Erdenebileg Batbaatar

2019 : Ph.D in Computer Science, Chungbuk National University, Korea

2021 to present : Researcher in Electronics and Telecommunications Research Institute, Korea.

차은종(Eun Jong Cha)

1987 : Ph.D in Biomedical Engineering, University of Southern California, U. S. A.

1988 to present : Professor in Department of Biomedical Engineering, School of Medicine, Chungbuk National University, Korea.

강태건(Tae Gun Kang)

2000 : Ph.D in Industrial Engineering, Dongguk University, Korea.

2021 to present : Research professor in Institute for Trauma Research, College of Medicine, Korea University, Korea.

최성곤(Seong Gon Choi)

2004 : Ph.D in Information Communications University, Korea

2004 to present : Professor in College of Electrical and Computer Engineering, Chungbuk National University, Korea.

김경아(Kyung Ah Kim)

2001 : Ph.D in Biomedical Engineering, Chungbuk National University, Korea.

2005 to present : Professor in Department of Biomedical Engineering, School of Medicine, Chungbuk National University, Korea.