Publications
Below is a list of preprints and publications.
2025
- SPEED-TR: a self-distilled and pre-trained transformer model for enhanced ECG detection of tricuspid regurgitation. Xiaolin Diao#, Wei Xu#, Huaibing Cheng#, Ya Zhou#, Yang Liu, Yanni Huo, Jianli Lu, Jinghan Huang, Jia He, Fang Liu, and 4 more authors. npj Digital Medicine, 2025
Tricuspid regurgitation (TR) remains underdiagnosed due to the lack of effective screening tools. We developed a self-distilled and pre-trained transformer model for detecting TR (SPEED-TR) from electrocardiography. The model was trained using 466,149 electrocardiogram-echocardiogram pairs from 291,673 patients and validated in one internal (63,925 patients) and two external cohorts (44,951 and 21,300 patients). SPEED-TR accurately detected moderate-to-severe TR in the hold-out set (AUROC 0.945, NPV 0.983; specificity 0.973) and maintained stable performance in multi-center testing sets (AUROCs 0.939–0.943; NPVs 0.978–0.988). Three thresholds enabled SPEED-TR severity grading: none (0–0.008), mild (0.008–0.255), moderate (0.255–0.755), and severe (0.755–1), achieving accuracies of 0.749 (hold-out), 0.730 (internal), 0.775 and 0.726 (external), with overall accuracy of 0.744. SPEED-TR remained robust in patients with 1 to ≥3 risk factors and 1 to ≥2 valvular diseases. SPEED-TR demonstrated potential as a screening tool and may provide reference for TR severity assessment.
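The severity grading described in this abstract amounts to a lookup against the three quoted cut-points. The following is a minimal sketch, not the authors' code: the function name `grade_tr` is hypothetical, and the choice to assign a score lying exactly on a cut-point to the higher grade is an assumption, since the abstract does not say how boundaries are handled.

```python
from bisect import bisect_right

# Cut-points quoted in the abstract: none (0-0.008), mild (0.008-0.255),
# moderate (0.255-0.755), severe (0.755-1).
THRESHOLDS = (0.008, 0.255, 0.755)
GRADES = ("none", "mild", "moderate", "severe")


def grade_tr(score: float) -> str:
    """Map a model output in [0, 1] to a TR severity grade.

    Assumption: a score exactly on a cut-point is assigned to the higher
    grade; the abstract does not specify boundary handling.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    # bisect_right counts how many cut-points are <= score,
    # which is the index of the matching grade.
    return GRADES[bisect_right(THRESHOLDS, score)]


if __name__ == "__main__":
    for s in (0.001, 0.1, 0.5, 0.9):
        print(s, grade_tr(s))
```

Using a sorted tuple with `bisect_right` keeps the cut-points in one place, so a different set of thresholds could be swapped in without touching the lookup logic.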
- Enhancing automatic multilabel diagnosis of electrocardiogram signals: A masked transformer approach. Ya Zhou#*, Yanni Diao#, Yang Liu, Zhaohong Sun, Xiaohan Fan*, and Wei Zhao*. Computers in Biology and Medicine, 2025
Background and Objective: Electrocardiogram (ECG) is one of the most important diagnostic tools in clinical applications. Although deep learning models have been widely applied to ECG classification tasks, their accuracy remains limited, especially in handling complex signal patterns in real-world clinical settings. This study explores the potential of Transformer models to improve ECG classification accuracy. Methods: We present Masked Transformer for ECG classification (MTECG), a simple yet effective method which adapts the image-based masked autoencoders to self-supervised representation learning from ECG time series. The model is evaluated on the Fuwai dataset, comprising 220,251 ECG recordings annotated by medical experts, and compared with six recent state-of-the-art methods. Ablation studies are conducted to identify key components contributing to the model’s performance. Additionally, the method is evaluated on two public datasets to assess its broader applicability. Results: Experiments show that the proposed method increases the macro F1 scores by 2.8%–28.6% on the Fuwai dataset, 10.4%–46.2% on the multicenter dataset and 19.1%–46.9% on the PTB-XL dataset for common ECG diagnoses recognition tasks, compared to six alternative methods. Additionally, the proposed method consistently achieves state-of-the-art performance on the PTB-XL superclass task in both linear probing and fine-tuning evaluations. The masked pre-training strategy significantly enhances classification performance, with key contributing factors including the masking ratio, training schedule length, fluctuating reconstruction targets, layer-wise learning rate decay, and DropPath rate. Conclusion: The Masked Transformer model exhibits superior performance in ECG classification, highlighting its potential to advance ECG-based diagnostic systems.
- Bridging Performance Gaps for Foundation Models: A Post-Training Strategy for ECGFounder. Ya Zhou#*, Yujie Yang#, Xiaohan Fan*, and Wei Zhao*. arXiv, 2025
ECG foundation models are increasingly popular due to their adaptability across various tasks. However, their clinical applicability is often limited by performance gaps compared to task-specific models, even after pre-training on large ECG datasets and fine-tuning on target data. This limitation is likely due to the lack of an effective post-training strategy. In this paper, we propose a simple yet effective post-training approach to enhance ECGFounder, a state-of-the-art ECG foundation model pre-trained on over 7 million ECG recordings. Experiments on the PTB-XL benchmark show that our approach improves the baseline fine-tuning strategy by 1.2%-3.3% in macro AUROC and 5.3%-20.9% in macro AUPRC. Additionally, our method outperforms several recent state-of-the-art approaches, including task-specific and advanced architectures. Further evaluation reveals that our method is more stable and sample-efficient compared to the baseline, achieving a 9.1% improvement in macro AUROC and a 34.9% improvement in macro AUPRC using just 10% of the training data. Ablation studies identify key components, such as stochastic depth and preview linear probing, that contribute to the enhanced performance. These findings underscore the potential of post-training strategies to improve ECG foundation models, and we hope this work will contribute to the continued development of foundation models in the ECG domain.
- Abnormal Electrocardiogram Event Recognition Method and System. Wei Zhao, Zhihui Cao, Ya Zhou, and Yanni Huo. China Patent, 2025
- Multi-scale Masked Autoencoder for Electrocardiogram Anomaly Detection. Ya Zhou#*, Yujie Yang#, Jianhuang Gan, Xiangjie Li, Jing Yuan, and Wei Zhao*. arXiv, under review, 2025
Electrocardiogram (ECG) analysis is a fundamental tool for diagnosing cardiovascular conditions, yet anomaly detection in ECG signals remains challenging due to their inherent complexity and variability. We propose Multi-scale Masked Autoencoder for ECG anomaly detection (MMAE-ECG), a novel end-to-end framework that effectively captures both global and local dependencies in ECG data. Unlike state-of-the-art methods that rely on heartbeat segmentation or R-peak detection, MMAE-ECG eliminates the need for such pre-processing steps, enhancing its suitability for clinical deployment. MMAE-ECG partitions ECG signals into non-overlapping segments, with each segment assigned learnable positional embeddings. A novel multi-scale masking strategy and multi-scale attention mechanism, along with distinct positional embeddings, enable a lightweight Transformer encoder to effectively capture both local and global dependencies. The masked segments are then reconstructed using a single-layer Transformer block, with an aggregation strategy employed during inference to refine the outputs. Experimental results demonstrate that our method achieves performance comparable to state-of-the-art approaches while significantly reducing computational complexity, requiring approximately 1/78 of the floating-point operations (FLOPs) for inference. Ablation studies further validate the effectiveness of each component, highlighting the potential of multi-scale masked autoencoders for anomaly detection.
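The first step named in this abstract, partitioning a signal into non-overlapping segments and hiding some of them from the encoder, can be illustrated with a small sketch. This is a generic masked-autoencoder preprocessing step under stated assumptions, not the paper's method: the uniform random masking below stands in for the multi-scale masking strategy, which the abstract does not specify, and the names `segment` and `random_mask`, the segment length, and the masking ratio are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def segment(signal: Sequence[float], seg_len: int) -> List[Sequence[float]]:
    """Split a 1-D signal into non-overlapping segments.

    Trailing samples that do not fill a whole segment are dropped
    (an assumption made for simplicity).
    """
    n = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n)]


def random_mask(num_segments: int, mask_ratio: float,
                seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (visible, masked) segment indices, each sorted.

    Uniform random masking is a placeholder for the paper's
    multi-scale strategy.
    """
    rng = random.Random(seed)
    num_masked = int(num_segments * mask_ratio)
    masked = sorted(rng.sample(range(num_segments), num_masked))
    masked_set = set(masked)
    visible = [i for i in range(num_segments) if i not in masked_set]
    return visible, masked


if __name__ == "__main__":
    toy_signal = list(range(100))          # stand-in for one ECG lead
    segments = segment(toy_signal, seg_len=8)
    visible, masked = random_mask(len(segments), mask_ratio=0.75)
    print(len(segments), len(visible), len(masked))
```

In a masked autoencoder, only the visible segments would be passed to the encoder, and the reconstruction loss would be computed on the masked ones.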
2024
- Multivariate Varying-coefficient Models via Tensor Decomposition. Fengyu Zhang#, Ya Zhou#, Raymond K W Wong, and Kejun He. Statistica Sinica, 2024
Multivariate varying-coefficient models (MVCM) are popular statistical tools for analyzing the relationship between multiple responses and covariates. Nevertheless, estimating large numbers of coefficient functions is challenging, especially with a limited amount of samples. In this work, we propose a reduced-dimension model based on the Tucker decomposition, which unifies several existing models. In addition, sparse predictor effects, in the sense that only a few predictors are related to the responses, are exploited to achieve an interpretable model and sufficiently reduce the number of unknown functions to be estimated. All the above dimension-reduction and sparsity considerations are integrated into a penalized least squares problem on the constraint domain of 3rd-order tensors. To compute the proposed estimator, we propose a block updating algorithm with ADMM and manifold optimization. We also establish the oracle inequality for the prediction risk of the proposed estimator. A real data set from Framingham Heart Study is used to demonstrate the good predictive performance of the proposed method.
- Broadcasted Nonparametric Tensor Regression. Ya Zhou, Raymond K W Wong, and Kejun He. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2024
We propose a novel use of a broadcasting operation, which distributes univariate functions to all entries of the tensor covariate, to model the nonlinearity in tensor regression nonparametrically. A penalized estimation and the corresponding algorithm are proposed. Our theoretical investigation, which allows the dimensions of the tensor covariate to diverge, indicates that the proposed estimation yields a desirable convergence rate. We also provide a minimax lower bound, which characterizes the optimality of the proposed estimator for a wide range of scenarios. Numerical experiments are conducted to confirm the theoretical findings, and they show that the proposed model has advantages over its existing linear counterparts.
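The broadcasting operation mentioned in this abstract, applying one univariate function to every entry of a tensor covariate, can be sketched for the matrix case. This is an illustrative simplification rather than the paper's estimator: it uses a single broadcast function and a single coefficient matrix, omits the penalized estimation entirely, and the names `broadcast`, `inner`, and `predict` are hypothetical.

```python
from typing import Callable, List

Matrix = List[List[float]]


def broadcast(f: Callable[[float], float], X: Matrix) -> Matrix:
    """Apply the univariate function f to every entry of X."""
    return [[f(x) for x in row] for row in X]


def inner(A: Matrix, B: Matrix) -> float:
    """Entrywise inner product of two matrices of the same shape."""
    return sum(a * b for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))


def predict(X: Matrix, B: Matrix, f: Callable[[float], float],
            intercept: float = 0.0) -> float:
    """One-term sketch: intercept + <B, f broadcast over X>.

    Assumption: a single nonlinear component; with f(x) = x this
    reduces to a linear tensor regression prediction.
    """
    return intercept + inner(B, broadcast(f, X))


if __name__ == "__main__":
    X = [[0.0, 1.0], [2.0, 3.0]]
    B = [[1.0, 0.0], [0.0, 1.0]]
    print(predict(X, B, lambda x: x ** 2))  # nonlinear entrywise effect
    print(predict(X, B, lambda x: x))       # linear special case
```

Taking `f` as the identity recovers the linear counterpart that the abstract compares against, which is why the nonlinearity enters only through the broadcast function.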
2023
- Quality Evaluation Method and Device of Electrocardiosignal, Storage Medium and Electronic Equipment. Wei Zhao, Jing Yuan, Xiaolin Diao, Yi Zhang, Ya Zhou, and Yanni Huo. China Patent, 2023
- Type Identification Method, System and Auxiliary System for Multi-lead Electrocardiosignal. Wei Zhao, Ya Zhou, Jing Yuan, Xiaolin Diao, and Yanni Huo. China Patent, 2023
- Intelligent Electrocardiosignal Analysis Method and System and Intelligent Electrocardio Auxiliary System. Wei Zhao, Xiaohan Fan, Xiaolin Diao, Yanni Huo, Jing Yuan, Ya Zhou, and Huaibing Cheng. China Patent, 2023
- Masked Transformer for Electrocardiogram Classification. Ya Zhou#*, Xiaolin Diao#, Yanni Huo, Yang Liu, Xiaohan Fan*, and Wei Zhao*. arXiv, 2023
Electrocardiogram (ECG) is one of the most important diagnostic tools in clinical applications. With the advent of advanced algorithms, various deep learning models have been adopted for ECG tasks. However, the potential of Transformers for ECG data is not yet realized, despite their widespread success in computer vision and natural language processing. In this work, we present a useful masked Transformer method for ECG classification referred to as MTECG, which expands the application of masked autoencoders to ECG time series. We construct a dataset comprising 220,251 ECG recordings with a broad range of diagnoses annotated by medical experts to explore the properties of MTECG. Under the proposed training strategies, a lightweight model with 5.7M parameters performs stably well on a broad range of masking ratios (5%-75%). The ablation studies highlight the importance of fluctuated reconstruction targets, training schedule length, layer-wise LR decay and DropPath rate. The experiments on both private and public ECG datasets demonstrate that MTECG-T significantly outperforms the recent state-of-the-art algorithms in ECG classification.
2021
- Tensor Linear Regression: Degeneracy and Solution. Ya Zhou, Raymond K. W. Wong, and Kejun He. IEEE Access, 2021
Tensor regression is an important and useful tool for analyzing multidimensional array data. To deal with high dimensionality, CANDECOMP/PARAFAC (CP) low-rank constraints are often imposed on the coefficient tensor parameter in the (penalized) loss functions. However, besides the well-known non-identifiability issue of the CP parameters, we demonstrate that the corresponding optimization may not have any attainable solutions, and thus the estimation of the coefficient tensor is not well-defined when this happens. This is closely related to a phenomenon, called CP degeneracy, in low-rank tensor approximation problems. In this article, we show some useful results of CP degeneracy in the context of tensor regression problems. To overcome the theoretical and numerical issues associated with the degeneracy, we provide a general penalized strategy as a solution to the degeneracy. The related results also explain why some of the existing methods are more stable than the others. The asymptotic properties of the resulting estimation are also studied. Numerical experiments are conducted to illustrate our findings.
- An Improved Tensor Regression Model via Location Smoothing. Ya Zhou and Kejun He. Stat, 2021
Many applications of regression study the predictors with complex forms such as tensors. Besides low dimensional assumption, the effects of predictors with a tensor structure typically show clustered or smooth phenomena in the sense that adjacent elements have a similar effect on the response. To simultaneously incorporate the low-rank and smoothness in tensor regression, we generalize the CANDECOMP/PARAFAC (CP) decomposition to a smooth version and propose a novel regression model based on the smoothed decomposition. The asymptotic theory of the proposed method is studied, which shows a faster rate of convergence than the one without incorporating the smoothness. The experiments on both synthetic and real data confirm that the proposed method has advantages over the alternative especially when the sample size is small.