Subject: Cutting-Edge Research in Machine Learning and Signal Processing
Hi Elman,
This newsletter covers recent preprints exploring the intersection of machine learning and signal processing across wireless communications, medical imaging, and sensor data analysis.
Several papers focus on advanced beamforming techniques, indicating a growing interest in optimizing antenna configurations for enhanced performance. Guo et al. (2025) introduce GPASS, a deep learning architecture for joint pinching beamforming and transmit beamforming in pinching-antenna systems (PASS). Concurrently, Bereyhi et al. (2025) investigate downlink beamforming in multiuser MIMO systems using PASS, developing a low-complexity algorithm for optimizing precoding and antenna locations. Nayak et al. (2025) propose a deep reinforcement learning approach for Dolph-Tschebyscheff beamforming, adapting beam patterns for mobile users in downlink transmission. These works collectively demonstrate the potential of data-driven methods for optimizing beamforming in complex scenarios. Wachowiak et al. (2025) explore beamforming further, analyzing oversampled time-modulated arrays (TMAs) for enhanced phase-shifting resolution and, in related work (Wachowiak et al., 2025), investigating frequency diverse array OFDM for joint communication and sensing. Mo et al. (2025) present a practical demonstration of joint phase-time arrays (JPTA), enabling frequency-dependent beamforming with a single RF chain.
Beyond beamforming, several contributions address signal processing challenges in various applications. Muzeau et al. (2025) leverage contrastive learning for general feature extraction in SAR target classification, achieving high accuracy with limited labeled data. Rai et al. (2025) propose a compressive sensing based multi-target localization algorithm for MIMO-FMCW radar, enabling efficient range, Doppler, and angle estimation. Ohayon et al. (2025) introduce Denoising Diffusion Codebook Models (DDCM) for compressed image generation. Mishaly et al. (2025) present a Multi-Band Mamba network for deep active speech cancellation.
The application of machine learning to medical and health-related data analysis is also prominent. Bhagubai et al. (2025) introduce SeizeIT2, a comprehensive wearable dataset for focal epilepsy research. Gradowski and Buchner (2025) develop a deep learning model for ECG reconstruction. Saha et al. (2025) present Pulse-PPG, an open-source field-trained PPG foundation model for wearable applications. Nguyen et al. (2025) introduce a wearable device dataset for mental health assessment. These contributions highlight the increasing use of wearable sensor data and deep learning for personalized healthcare.
Theoretical advancements are explored by Soleymani et al. (2025), who propose a novel framework for fractional matrix programming. Li et al. (2025) introduce a parallel coding strategy for orthogonal delay-Doppler division multiplexing (ODDM). Zhao et al. (2025) investigate near-field integrated sensing and communications. Mashayekh Bakhsh et al. (2025) analyze direct uplink connectivity in space MIMO systems.
Methodological advancements are presented by Liu et al. (2025), who propose a bilinear subspace variational Bayesian inference algorithm. Moshtaghpour and Kirkland (2025) analyze the impact of overlap ratio in defocused electron ptychography. Habi et al. (2025) introduce a learned Bayesian Cramér-Rao bound. Guo et al. (2025) present measurements of scattering from building surfaces. Additional contributions include work on EEG signal analysis (Roy et al., 2025), P300 extraction (Sarasa et al., 2025), and knowledge distillation for wearable sensor data (Jeon et al., 2025).
Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings by Mithun Saha, Maxwell A. Xu, Wanting Mao, Sameer Neupane, James M. Rehg, Santosh Kumar https://arxiv.org/abs/2502.01108
Caption: The diagram illustrates the training and evaluation process of Pulse-PPG, an open-source foundation model for PPG analysis. It shows the pre-training stage using a large unlabeled wearable field dataset and a relative contrastive learning approach with a learned motif-based distance function, followed by transfer learning to various downstream tasks in clinical, wearable lab, and wearable field settings. The model learns from raw PPG waveforms (A, B, C, D) to generate embeddings (Ea, Eb, Ec, Ed) used for downstream tasks like sleep staging, blood pressure estimation, stress detection, and activity recognition.
Pulse-PPG stands out as the first open-source PPG foundation model trained exclusively on raw, uncurated field data, addressing a critical gap in wearable health monitoring. Existing models are often limited by their reliance on clean clinical data or closed-source nature, hindering their real-world applicability. This work leverages a vast dataset of 21 billion data points collected over 100 days from 120 participants wearing smartwatches, embracing the inherent noise of real-world data as a strength rather than a weakness.
The key innovation lies in the novel pre-training task, employing a learnable motif-based distance function:
$d(X_{\mathrm{anchor}}, X_{\mathrm{cand}}) = \sum_{i=0}^{T} \underset{j \in [0,\ldots,T]}{\arg\min}\; d_m\big(\mathrm{Motif}(X_{\mathrm{anchor}}[i]),\, \mathrm{Motif}(X_{\mathrm{cand}}[j])\big)$
and a relative contrastive loss function. This approach allows Pulse-PPG to capture subtle yet significant patterns within raw PPG signals, leading to robust representations that generalize well across diverse real-world scenarios. Instead of relying on traditional noise filtering techniques, which can inadvertently discard valuable contextual information, the model learns directly from the noisy data, making it more adaptable to the complexities of real-world applications.
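To make the shape of the motif-based distance concrete, here is a minimal sketch. It is not the authors' implementation: in the paper the distance is learned, whereas this example assumes a fixed Euclidean distance for d_m, treats a "motif" as a plain sliding window (the hypothetical `extract_motifs` helper), and reads the inner operator as the smallest motif distance over j.

```python
import numpy as np

def extract_motifs(x, motif_len):
    """Hypothetical motif extractor: every length-`motif_len` sliding window."""
    return np.lib.stride_tricks.sliding_window_view(x, motif_len)

def motif_distance(x_anchor, x_cand, motif_len=4):
    """Sum, over anchor motifs, of the distance to the closest candidate motif.

    Assumption: d_m is Euclidean here; Pulse-PPG learns this function instead.
    """
    a = extract_motifs(np.asarray(x_anchor, dtype=float), motif_len)
    c = extract_motifs(np.asarray(x_cand, dtype=float), motif_len)
    # pairwise[i, j] = d_m(Motif(anchor[i]), Motif(cand[j]))
    pairwise = np.linalg.norm(a[:, None, :] - c[None, :, :], axis=-1)
    # For each anchor motif i, keep its best match over j, then sum over i.
    return float(pairwise.min(axis=1).sum())

# Toy signals standing in for PPG windows.
t = np.linspace(0.0, 4.0 * np.pi, 64)
anchor = np.sin(t)
shifted = anchor + 5.0  # same shape, large offset
```

Because each anchor motif searches the whole candidate for its best match, the measure tolerates temporal misalignment between the two windows. It is also asymmetric: swapping anchor and candidate can change the value.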
The evaluation spans 11 downstream tasks and 5 datasets, covering wearable field, wearable lab, and clinical settings. Pulse-PPG outperforms a state-of-the-art, clinically trained foundation model on a majority of tasks. Notably, pre-training on field data proves superior to pre-training on clinical data even for tasks in clinical settings, challenging conventional wisdom and highlighting the importance of real-world data diversity. This "field-to-lab" generalizability suggests a route to more robust and adaptable PPG-based models. The open-source release of Pulse-PPG's weights gives other researchers direct access to these representations for building mHealth applications.
Direct Uplink Connectivity in Space MIMO Systems with THz and FSO Inter-Satellite Links by Zohre Mashayekh Bakhsh, Yasaman Omid, Gaojie Chen, Farbod Kayhan, Yi Ma, Rahim Tafazolli https://arxiv.org/abs/2502.00824
This paper explores the potential of multi-satellite MIMO systems to enhance direct uplink connectivity from mobile phones to satellites. The focus is on the critical role of inter-satellite links (ISLs), considering both terahertz (THz) and free-space optical (FSO) technologies. The study grounds its analysis in a practical scenario derived from 3GPP standards for non-terrestrial networks (NTN), incorporating realistic parameters for frequency bands, bandwidths, antenna gains, power levels, and channel characteristics.
A key contribution is a proposed satellite selection method for identifying the optimal master node (MN) satellite for signal processing. This selection considers both the user-satellite link quality and ISL conditions, optimizing system performance. The research delves into the ergodic capacity of THz and FSO ISLs under non-ideal conditions. For THz ISLs, closed-form approximations for ergodic capacity are derived under scenarios with instantaneous and statistical CSI sharing between satellites. The capacity formula for the full CSI scenario is provided:
$C = \log_2\left(1 + \frac{p\, v\, (\mathrm{diag}(c))^2\, h h^H v^H}{\sigma_{n_{up}}^2\, v\, (\mathrm{diag}(c))^2\, v^H + v\, \Sigma_{n_{isl}}\, v^H}\right)$
where $p$ represents transmit power, $c$ accounts for uplink and ISL path loss, $h$ is the channel vector, $v$ is the detection vector, $\sigma_{n_{up}}^2$ is uplink noise power, and $\Sigma_{n_{isl}}$ represents ISL noise covariance. For FSO ISLs, a closed-form approximate upper bound for ergodic capacity is presented, accounting for pointing error loss.
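The full-CSI capacity expression can be evaluated numerically as a sanity check. The sketch below transcribes the formula as written in this summary; the function name `full_csi_capacity` and the convention of storing v as a 1-D array (treated as a row vector) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def full_csi_capacity(p, c, h, v, sigma2_up, Sigma_isl):
    """Evaluate C = log2(1 + SINR) for one channel realization.

    p          transmit power (scalar)
    c          path-loss coefficients, one per satellite
    h          channel vector
    v          detection vector, stored 1-D and used as a row vector (assumption)
    sigma2_up  uplink noise power (scalar)
    Sigma_isl  ISL noise covariance matrix
    """
    c = np.asarray(c, dtype=float)
    h = np.asarray(h, dtype=complex)
    v = np.asarray(v, dtype=complex)
    Sigma_isl = np.asarray(Sigma_isl, dtype=complex)

    D2 = np.diag(c ** 2)              # (diag(c))^2
    hhH = np.outer(h, h.conj())       # h h^H
    vH = v.conj()                     # v^H for a 1-D row vector

    signal = p * (v @ D2 @ hhH @ vH)
    noise = sigma2_up * (v @ D2 @ vH) + v @ Sigma_isl @ vH
    # Imaginary parts should vanish; keep the real part only.
    return float(np.log2(1.0 + (signal / noise).real))

# Single-satellite toy case: SINR = 1 / (0.5 + 0.5) = 1, so C = log2(2).
c_toy = full_csi_capacity(1.0, [1.0], [1.0], [1.0], 0.5, [[0.5]])
```

Averaging this quantity over many channel draws would give a Monte Carlo estimate of the ergodic capacity that the paper approximates in closed form.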
Simulations demonstrate the significant spectral efficiency gains of multi-satellite MIMO SatCom over single-satellite systems, highlighting the benefits of cooperation. The results emphasize the importance of optimal satellite selection, with the nearest satellite often being the best choice for the MN. The analysis reveals that instantaneous CSI sharing improves spectral efficiency compared to statistical CSI, albeit marginally. Furthermore, the study shows that reducing pointing error enhances spectral efficiency for FSO ISLs.
Compressed Image Generation with Denoising Diffusion Codebook Models by Guy Ohayon, Hila Manor, Tomer Michaeli, Michael Elad https://arxiv.org/abs/2502.01189
Caption: This diagram illustrates the Denoising Diffusion Codebook Model (DDCM) architecture for image compression. Rather than sampling continuous Gaussian noise, DDCM selects noise vectors from fixed codebooks (C<sub>i</sub> and C<sub>i+1</sub>) at each timestep i of the reverse diffusion process, guided by similarity to the target image. This sequence of selected indices forms a compressed representation, which can be used to reconstruct the image.
This paper introduces Denoising Diffusion Codebook Models (DDCM), an approach to image generation that produces high-quality images alongside their compressed bit-stream representations. DDCM departs from the conventional use of continuous Gaussian noise in denoising diffusion models (DDMs) by employing pre-defined codebooks of fixed i.i.d. Gaussian vectors. Surprisingly, even with small codebooks, DDCM maintains the quality and diversity of traditional DDMs, suggesting a potential redundancy in the infinite representation space of standard methods.
The core of DDCM lies in its discrete and finite representation. Instead of sampling noise from a continuous Gaussian distribution during reverse diffusion, DDCM selects noise vectors from codebooks C<sub>i</sub> at each timestep i. The generative process is defined as:
x<sub>i-1</sub> = μ<sub>i</sub>(x<sub>i</sub>) + σ<sub>i</sub>C<sub>i</sub>(k<sub>i</sub>)
where k<sub>i</sub> is the chosen index from the codebook. The sequence {k<sub>i</sub>} acts as a lossless compressed representation of the generated image. For image compression, the index selection is guided by minimizing the residual error between the target image and the denoiser's prediction. This gradient-free approach achieves state-of-the-art perceptual compression, particularly at lower bit rates.
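The update rule and the role of the index sequence can be illustrated with a toy example. This is a sketch under loud assumptions, not the DDCM implementation: `toy_mu` and `toy_sigma` are hypothetical stand-ins for a trained denoiser and noise schedule, the starting point is fixed to zeros, and the greedy rule (pick the codebook entry most aligned with the remaining residual) is one plausible reading of the selection described above.

```python
import math
import numpy as np

def make_codebooks(num_steps, codebook_size, dim, seed=0):
    """Fixed i.i.d. Gaussian codebooks, one per timestep, reproducible from a seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_steps, codebook_size, dim))

def toy_mu(x):
    # Hypothetical stand-in for mu_i(x_i); a real model would call a denoiser.
    return 0.9 * x

def toy_sigma(i, num_steps):
    # Hypothetical noise schedule that shrinks as i approaches 1.
    return 0.5 * i / num_steps

def reverse_process(codebooks, dim, choose_index):
    """Run x_{i-1} = mu_i(x_i) + sigma_i * C_i(k_i) for i = T, ..., 1."""
    num_steps = codebooks.shape[0]
    x = np.zeros(dim)  # fixed start shared by encoder and decoder (assumption)
    indices = []
    for i in range(num_steps, 0, -1):
        mu = toy_mu(x)
        k = choose_index(i, mu, codebooks[i - 1])
        x = mu + toy_sigma(i, num_steps) * codebooks[i - 1][k]
        indices.append(k)
    return indices, x

def compress(target, codebooks):
    # Greedy rule (assumption): entry most aligned with the residual target - mu.
    def choose(i, mu, codebook):
        return int(np.argmax(codebook @ (target - mu)))
    return reverse_process(codebooks, target.shape[0], choose)

def decompress(indices, codebooks, dim):
    # Replay the stored indices; the target image is not needed.
    stored = iter(indices)
    return reverse_process(codebooks, dim, lambda i, mu, cb: next(stored))[1]

T, K, DIM = 50, 64, 16
codebooks = make_codebooks(T, K, DIM)
target = np.random.default_rng(1).standard_normal(DIM)
indices, x_enc = compress(target, codebooks)
x_dec = decompress(indices, codebooks, DIM)
bits = T * math.log2(K)  # cost of storing one index per timestep
```

The decoder rebuilds exactly the sample the encoder produced because both sides share the codebooks and replay the same indices. In this toy setup the code costs T · log2(K) bits, so the codebook size and the number of timesteps together set the bit rate.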
The framework extends to compressed conditional image generation by generalizing the index selection rule to minimize a loss function incorporating conditioning information. This enables generating compressed representations conditioned on factors like degraded images for restoration tasks. Experiments across various datasets validate the effectiveness of DDCM for compression and restoration, showcasing its superior perceptual quality and competitive distortion metrics compared to existing methods. The authors also provide a mathematical interpretation linking DDCM to score-based posterior samplers, offering a theoretical foundation for the method.
This newsletter highlights a convergence of machine learning and signal processing across diverse domains. Pulse-PPG shows that noisy real-world wearable data can be used to build robust and generalizable health models. The analysis of multi-satellite MIMO systems with THz and FSO ISLs offers a promising path towards enhanced connectivity in future 6G networks. Finally, DDCM rethinks image generation and compression by replacing continuous noise sampling with a discrete and finite representation. Together, these works illustrate how data-driven approaches are being applied to complex problems across various fields.