This collection of preprints explores diverse applications of machine learning and signal processing in wireless communications, sensing, and biomedical signal analysis. A significant focus lies in reconfigurable intelligent surfaces (RIS), particularly beyond-diagonal RIS (BD-RIS), for enhancing wireless communication systems. Liu et al. (2024) investigate the potential of non-reciprocal BD-RIS in full-duplex (FD) systems, proposing an iterative algorithm based on block coordinate descent (BCD) and penalty dual decomposition (PDD) to maximize sum-rates. Complementing this, Björnson & Demir (2024) derive a closed-form capacity-maximizing BD-RIS reflection matrix for MIMO channels, offering geometrical insights. Semmler et al. (2024) propose decoupling networks for RIS arrays to mitigate mutual coupling, demonstrating potential for super-quadratic channel gains. Zhou et al. (2024) introduce a novel Q-stem connected BD-RIS architecture offering a performance-complexity trade-off, along with efficient scattering matrix design algorithms. Finally, Ginige et al. (2024) tackle channel estimation and prediction in BD-RIS-assisted MIMO systems with channel aging, proposing a Tucker2 decomposition and convolutional neural network (CNN) based approach.
Beyond RIS, several papers explore machine learning in various wireless scenarios. Djuhera et al. (2024) investigate resilient multi-task large language model fusion (R-MTLLMF) at the wireless edge, addressing adversarial attacks. Cong et al. (2024) propose a two-timescale digital twin for resource allocation and model retraining using incremental learning and deep reinforcement learning (DRL). Xu et al. (2024) present channel measurements for evaluating spatial separation in distributed massive MIMO. Cheng et al. (2024) introduce an auto-encoder for learning rate-compatible linear block codes (RC-LBCs). Nguyen et al. (2024) propose UPGANet, a deep unfolding network for hybrid beamforming in mmWave massive MIMO joint communications and sensing (JCAS). Huang et al. (2024) investigate beam switching for high-speed train mmWave communications, while Ci et al. (2024) focus on hybrid beamforming for covert mmWave MIMO.
Biomedical signal analysis, particularly of EEG data, forms another key area. Del Pup et al. (2024) evaluate EEG preprocessing for deep learning. Several papers address specific EEG applications: emotion detection (Chandanwala et al., 2024), Alzheimer's risk (Henao Isaza et al., 2024), and general EEG pathology (Poziomska et al., 2024). Oliver et al. (2024) analyze brain responses during human-robot interaction, while Iskarous et al. (2024) develop neuromorphic representations of tactile stimuli. Delorme et al. (2024) introduce a Python ICLabel implementation for automatic EEG component classification. Wang et al. (2024) present AnyECG, a foundational model for ECG analysis. Jeong et al. (2024) investigate contrastive learning for ECG anomaly detection.
Finally, various other applications are explored. These include signal detection in colored noise (Udupitiya et al., 2024), biosignal analysis (Jo et al., 2024), indoor localization (Etiabi et al., 2024), DoA estimation (Zhang et al., 2024), acoustic emission analysis (Muthumala et al., 2024), soil characterization (Rahman et al., 2024), audio quality assessment (Delgado & Herre, 2024), semantic edge computing (Zhang et al., 2024), analytic continuation (Zhao et al., 2024), AI testing (Guerci et al., 2024), spatial signal detection (Zahra et al., 2024), UV communications (Wu et al., 2024), heat load profiling (Michalakopoulos et al., 2024), multicore fiber switching (Melo et al., 2024), robot positioning (Etzion & Klein, 2024), Bayesian optimization (Kim et al., 2024), MI-EEG classification (Peng et al., 2024), VLC beamforming (Qiu et al., 2024), and time series clustering (Afzali et al., 2024).
AnyECG: Foundational Models for Electrocardiogram Analysis by Yue Wang, Xu Cao, Yaojun Hu, Haochao Ying, James Matthew Rehg, Jimeng Sun, Jian Wu, Jintai Chen https://arxiv.org/abs/2411.17711
Caption: The architecture of AnyECG, a novel family of foundation models for ECG analysis, is depicted. The top section illustrates the Patient Attribute Tokenizer training process, including temporal and spatial encoding, a rhythm codebook, and decoders for signal and demographic reconstruction. The bottom section shows the ECGFM pre-training stage, which uses a masked modeling approach with a Patient Attribute Tokenizer to learn robust representations from ECG data.
AnyECG introduces a family of foundational models designed to tackle the challenges of ECG analysis. ECGs are crucial for cardiac monitoring, but their analysis is often complex due to data heterogeneity (varying sampling rates and noise levels), demographic shifts, and intricate rhythm-event associations. Existing machine learning methods often struggle with these complexities.
AnyECG's strength lies in its two-stage pre-training approach. The first stage employs a tailored ECG Tokenizer, which segments ECG signals into fixed-duration fragments, converting them into tokens. A novel hierarchical modeling approach effectively handles ultra-long ECG recordings. A crucial component, the Rhythm Codebook, captures essential local morphological and frequency features, effectively reducing noise and enhancing signal quality. The ECG Tokenizer also recovers demographic information, facilitating generalization across diverse populations.
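To make the fragment-and-tokenize step concrete, here is a minimal sketch of cutting an ECG record into fixed-duration fragments that can then be mapped to Rhythm Codebook tokens; the 1-second fragment length, 500 Hz sampling rate, and function names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def segment_ecg(signal: np.ndarray, fs: int = 500, fragment_s: float = 1.0) -> np.ndarray:
    """Split a (leads, samples) ECG array into non-overlapping fixed-duration fragments."""
    frag_len = int(fs * fragment_s)
    n_frags = signal.shape[-1] // frag_len
    trimmed = signal[..., : n_frags * frag_len]
    # -> (leads, n_frags, frag_len); each fragment becomes one token position
    return trimmed.reshape(signal.shape[0], n_frags, frag_len)

ecg = np.random.randn(12, 10 * 500)   # 12-lead, 10-second placeholder record at 500 Hz
fragments = segment_ecg(ecg)          # shape (12, 10, 500)
```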
The second pre-training stage leverages masked modeling, where the model predicts Rhythm Code indices to fill masked patches. This process encourages the model to learn cardiac event semantics by capturing relationships between different ECG patches. The overall loss function for the ECG tokenizer is a combination of several reconstruction and regularization losses:
L_T = L_morphology + L_frequency + L_demography + L_codebook + L_commitment
Each term in the loss function contributes to the model's ability to accurately represent and reconstruct ECG signals while capturing relevant demographic and rhythmic information.
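As a rough illustration of how these objectives might be combined in practice, the sketch below sums the five tokenizer terms and computes the masked-modeling cross-entropy over Rhythm Code indices at masked patch positions only; the function names, tensor shapes, and equal weighting are assumptions for illustration, not AnyECG's actual implementation.

```python
import torch
import torch.nn.functional as F

def tokenizer_loss(morphology, frequency, demography, codebook, commitment):
    """Unweighted sum of the five tokenizer terms in the total loss above (names are illustrative)."""
    return morphology + frequency + demography + codebook + commitment

def masked_modeling_loss(logits, target_indices, mask):
    """Cross-entropy over masked patches only.

    logits:         (batch, num_patches, codebook_size) predicted Rhythm Code distributions
    target_indices: (batch, num_patches) Rhythm Code indices produced by the tokenizer
    mask:           (batch, num_patches) boolean, True where a patch was masked
    """
    return F.cross_entropy(logits[mask], target_indices[mask])

# Toy usage with placeholder tensors
logits = torch.randn(4, 32, 256)        # batch of 4, 32 patches, 256 codebook entries
targets = torch.randint(0, 256, (4, 32))
mask = torch.rand(4, 32) < 0.4          # roughly 40% of patches masked
loss = masked_modeling_loss(logits, targets, mask)
```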
The researchers rigorously evaluated AnyECG on four distinct downstream tasks: anomaly detection, arrhythmia detection, corrupted lead generation, and ultra-long ECG signal recognition. Across all these tasks, AnyECG consistently surpassed state-of-the-art methods, demonstrating its robustness and generalizability. The largest model, AnyECG-XL, achieved particularly impressive results, highlighting the potential of scaling these foundational models for even greater performance gains. For instance, in anomaly detection, it achieved 82.55% accuracy and a 0.9538 AUC-PR. In arrhythmia detection, it reached 34.49% accuracy and a 0.1635 AUC-PR. For corrupted lead generation, AnyECG-L achieved a PSNR of 32.74 dB and SSIM of 0.8738. Finally, in ultra-long ECG recognition, AnyECG-XL achieved 80.55% accuracy and a 0.9088 AUC-PR.
Quantity versus Diversity: Influence of Data on Detecting EEG Pathology with Advanced ML Models by Martyna Poziomska, Marian Dovgialo, Przemysław Olbratowski, Paweł Niedbalski, Paweł Ogniewski, Joanna Zych, Jacek Rogala, Jarosław Żygierewicz https://arxiv.org/abs/2411.17709
Caption: This figure displays the AUC scores of various machine learning models for EEG pathology detection across different datasets derived from TUH and ELM19. It demonstrates the impact of data quantity and diversity, showing that increasing data quantity in the diverse ELM19 dataset leads to improved AUC, particularly for complex models like the meta-model (META), which consistently outperforms others. The error bars represent standard deviations across cross-validation folds.
This study explores the crucial interplay between data quantity and diversity in the performance of machine learning models for detecting general EEG pathology. Using two distinct datasets – the relatively homogeneous TUH Abnormal EEG Corpus (2,993 recordings) and the significantly larger and heterogeneous ELM19 dataset (55,787 recordings) from Elmiko Biosignals – the research investigates how these data characteristics affect various ML architectures, including classical models with handcrafted features and neural networks designed for automatic feature extraction. The introduction of the ELM19 dataset, the largest publicly available EEG corpus, provides a rich and diverse representation of patient conditions and recording protocols.
A standardized preprocessing pipeline ensured data consistency, involving notch filtering, band-pass filtering, resampling, and re-referencing. Classical models utilized handcrafted features derived from time and frequency domain analyses. Neural networks, built upon an EEGNet frame encoder, included variations with attention mechanisms and transformer architectures (miNet, MINet, and TransNet). A meta-model combining the strengths of top-performing neural and classical models further enhanced performance.
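For readers who want a concrete picture of such a pipeline, a minimal MNE-Python sketch is given below; the notch frequency, band-pass cutoffs, and target sampling rate are placeholder values, not the exact settings used in the study.

```python
import mne

raw = mne.io.read_raw_edf("recording.edf", preload=True)  # hypothetical input file
raw.notch_filter(freqs=50.0)           # remove power-line interference
raw.filter(l_freq=0.5, h_freq=45.0)    # band-pass filter
raw.resample(sfreq=250.0)              # bring all recordings to a common sampling rate
raw.set_eeg_reference("average")       # re-reference to the common average
```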
The study's findings reveal a substantial performance drop when models are trained on datasets of equal size but greater diversity, highlighting the challenges posed by heterogeneous data sources. However, increasing data quantity within the ELM19 dataset led to consistent AUC improvements, suggesting that quantity can indeed compensate for diversity, particularly for more complex models such as the meta-model, which consistently outperformed the individual models. This aligns with scaling laws observed in other domains of machine learning, where larger datasets often lead to improved performance.
Analyzing the asymptotic behavior of model performance using a saturation power law (ACC(n) = ACC∞ - αn⁻β) applied to both accuracy and AUC provides further insights. The meta-model's projected AUC of approximately 91% in the limit of infinite data suggests that even larger datasets could further enhance performance, though the rate of improvement might diminish due to inherent limitations posed by inter-rater agreement in EEG labeling.
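As a small worked example, the saturation law can be fit to performance-versus-dataset-size points with a standard nonlinear least-squares routine; the data points below are made-up placeholders, not the paper's measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def saturation_law(n, acc_inf, alpha, beta):
    # ACC(n) = ACC_inf - alpha * n^(-beta)
    return acc_inf - alpha * n ** (-beta)

n = np.array([1_000, 5_000, 10_000, 25_000, 55_000], dtype=float)  # dataset sizes (illustrative)
auc = np.array([0.80, 0.84, 0.86, 0.88, 0.89])                     # observed AUC (illustrative)

params, _ = curve_fit(saturation_law, n, auc, p0=[0.9, 1.0, 0.5])
acc_inf, alpha, beta = params
print(f"projected asymptotic AUC: {acc_inf:.3f}")
```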
The more, the better? Evaluating the role of EEG preprocessing for deep learning applications by Federico Del Pup, Andrea Zanola, Louis Fabrice Tshimanga, Alessandra Bertoldo, Manfredo Atzori https://arxiv.org/abs/2411.18392
Caption: This figure presents the best-performing preprocessing pipeline (Raw, Filt, ICA, ICA+ASR) for each of the four deep learning models (EEGNet, ShallowConvNet, DeepConvNet, FBCNet) across six EEG classification tasks. The percentage represents the proportion of tasks where each pipeline yielded the highest balanced accuracy for the respective model, demonstrating that minimal preprocessing (Filt) often outperformed more complex methods. These results highlight the importance of carefully considering the trade-off between artifact removal and information preservation when preprocessing EEG data for deep learning.
This study systematically investigates the impact of various EEG preprocessing pipelines on deep learning models, aiming to establish practical guidelines for researchers. The study evaluates four preprocessing pipelines, ranging from raw data to complex artifact removal techniques, across six diverse classification tasks (eye blinking, motor imagery, Parkinson's and Alzheimer's disease, sleep deprivation, and first episode psychosis) and four popular EEG-focused deep learning architectures (EEGNet, ShallowConvNet, DeepConvNet, and FBCNet). A total of 4800 models were trained and evaluated, providing a comprehensive analysis of the interplay between preprocessing and model performance.
The research employed a rigorous Nested Leave-N-Subject-Out (N-LNSO) cross-validation strategy. This nested approach ensures robust and generalizable results by mitigating biases associated with subject-specific characteristics. Data were carefully preprocessed, including segmentation, downsampling, and normalization. Model training used standard optimization techniques, and performance was evaluated using balanced accuracy.
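A simplified sketch of the nested subject-wise splitting idea is shown below, using scikit-learn's GroupKFold as a stand-in for the authors' exact N-LNSO procedure; the array shapes, fold counts, and training placeholder are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.random.randn(200, 64)                    # placeholder features (epochs x features)
y = np.random.randint(0, 2, size=200)           # placeholder binary labels
subjects = np.random.randint(0, 20, size=200)   # subject ID for each epoch

outer = GroupKFold(n_splits=5)                  # outer split: held-out test subjects
for train_val_idx, test_idx in outer.split(X, y, groups=subjects):
    inner = GroupKFold(n_splits=4)              # inner split: held-out validation subjects
    inner_groups = subjects[train_val_idx]
    for train_idx, val_idx in inner.split(X[train_val_idx], y[train_val_idx], groups=inner_groups):
        # train on the inner training subjects, select hyperparameters on the
        # validation subjects, then evaluate the chosen model on test_idx
        pass
```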
The results revealed significant differences between preprocessing pipelines. Raw data consistently underperformed, highlighting the importance of preprocessing. Surprisingly, minimal preprocessing involving only filtering often surpassed more complex pipelines, suggesting that EEG artifacts might contain valuable information leveraged by deep learning models. This challenges the conventional "more is better" assumption in preprocessing.
This newsletter highlights significant advancements in applying machine learning and signal processing to diverse fields. The development of foundational models like AnyECG demonstrates the potential of self-supervised learning for extracting robust representations from complex biomedical signals like ECGs. This approach promises to revolutionize automated cardiac diagnostics by overcoming challenges posed by data heterogeneity and noise. Simultaneously, the exploration of data quantity versus diversity in EEG pathology detection reveals a crucial scaling law: while data diversity poses challenges, increasing data quantity, particularly with advanced models like the meta-model, can significantly improve performance. This underscores the value of large, diverse datasets in training robust and generalizable EEG classifiers. Finally, the systematic evaluation of EEG preprocessing pipelines challenges conventional wisdom, suggesting that minimal preprocessing might be superior to complex artifact removal techniques for deep learning applications. This emphasizes the need for careful consideration of the trade-off between artifact removal and preservation of potentially informative signal components. Overall, these studies reveal exciting progress in leveraging data-driven approaches for enhanced signal analysis and interpretation across various domains.