Subject: Cutting-Edge Advances in Signal Processing and Machine Learning
Hi Elman,
This newsletter covers a diverse range of preprints exploring novel applications of signal processing and machine learning across various domains. From enhancing sensory evaluations and secure communication to refining medical diagnostics and optimizing network performance, these studies showcase the transformative potential of these technologies.
Several studies leverage deep learning for feature extraction and classification. Xia et al. (2024) propose CAM-Attention, a novel channel selection method for taste EEG data, combining CNN-CSA and Grad-CAM to improve the efficiency of food sensory evaluation. Similarly, Ventura et al. (2024) employ a CNN for location estimation in underwater acoustic networks, enabling context-based authentication by tracking variations in the underwater acoustic channel. Brown et al. (2024) demonstrate the efficacy of a CNN-layer-only autoencoder for mitigating interference in aircraft radar altimeters, enhancing ranging-estimate accuracy even in severe interference environments. Di Gennaro et al. (2024) utilize deep learning for user-centric clustering in cell-free massive MIMO systems, addressing the complex association problem between access points and users to maximize spectral efficiency.
Another prominent theme is the application of optimization techniques to enhance system performance. Hua (2024) revisits STEEP, a secure communication scheme, demonstrating its ability to achieve positive secrecy rates even under challenging eavesdropping scenarios across various channel models. Li et al. (2024) propose a consensus-ADMM approach for maximizing the minimum SINR in a downlink communication network enabled by a transmissive RIS transceiver, addressing the non-convex optimization problem with a linear-complexity algorithm. Ma et al. (2024) investigate secure transmission in RIS-ISAC systems aided by movable antennas, developing a two-layer penalty-based algorithm to jointly optimize beamformers, reflection coefficients, and antenna positions for enhanced physical layer security. Palmucci et al. (2024) address power minimization in multi-user MIMO systems with large-size RISs, employing an iterative alternating optimization approach to optimize precoding coefficients and RIS configuration while meeting SINR constraints.
Several contributions focus on specific application domains. Pereira et al. (2024) employ Q-learning for age-of-information (AoI) minimization in NOMA random access networks, demonstrating significant AoI performance improvements without sacrificing throughput. Farhat et al. (2024) evaluate a probabilistic strategy for code rate and header replica allocation in LR-FHSS networks, showing consistent performance gains over standardized data rates. Wei et al. (2024) propose an airborne maneuverable bi-static ISAC system for adaptive tracking and communication, formulating a trajectory optimization problem to minimize the Cramér-Rao bound while maintaining sufficient communication SNR. Karande et al. (2024) present a dataset and a Random Forest model for classifying stair and lift usage from wearable sensor data, achieving high accuracy in real-time activity detection.
Further contributions explore novel methodologies and applications. Esteban-Perez et al. (2024) utilize data-driven inverse optimization to estimate unobservable components of electricity demand response, offering insights into consumer behavior. Kołodziej et al. (2024) propose a calibration methodology for water delivery networks using short-burst hydrant trials, demonstrating improved accuracy compared to traditional calibration methods. Silva et al. (2024) introduce Logistic-NARX Multinomial, a classification algorithm combining NARX and logistic regression, for insightful railway track evaluation through feature importance analysis. Alghamdi et al. (2024) present a robust algorithm for correlation change detection in high-dimensional data, utilizing the maximum magnitude correlation coefficient as a summary statistic. Da Silva et al. (2024) propose a CSI acquisition scheme for cell-free massive MIMO surveillance systems, enabling accurate channel estimation for robust monitoring performance.
Finally, several papers focus on biomedical applications and signal enhancement. Soydan et al. (2024) introduce S7, a simplified state-space model for sequence modeling, demonstrating superior performance across various tasks, including neuromorphic event-based datasets. Rocha da Costa and O'Keeffe (2024) develop a digital front-end for electrooculography circuits, aiming to facilitate digital communication for individuals with disabilities. Torabi et al. (2024) present a manikin-recorded cardiopulmonary sounds dataset, providing a valuable resource for developing AI-driven diagnostic tools. Chen and Liu (2024) propose an LLM-based framework for remaining useful life prediction, achieving state-of-the-art results on the Turbofan engine dataset. Li et al. (2024) explore MRI quantification of liver fibrosis using diamagnetic susceptibility, offering a potential non-invasive diagnostic method. Hua (2024) proposes a simple method for secret-key generation between mobile users across networks, enhancing privacy and authentication.
Remaining Useful Life Prediction: A Study on Multidimensional Industrial Signal Processing and Efficient Transfer Learning Based on Large Language Models by Yan Chen, Cheng Liu https://arxiv.org/abs/2410.03134
Remaining Useful Life (RUL) prediction is critical for preventing costly downtime in industrial settings. Traditional methods, relying on small-scale deep learning or physics-based models, often struggle with the complexities of multidimensional sensor data and varying operating conditions, generalize poorly across datasets, and require task-specific tuning. This research introduces a framework that leverages Large Language Models (LLMs) for RUL prediction, aiming to overcome these limitations in predictive maintenance.
The proposed framework uses a pre-trained GPT-2 medium model, augmented with global pooling and additional attention mechanisms to handle the nuances of industrial sensor data. The input data, representing multidimensional time-series from sensors, undergoes a series of preprocessing steps. These include generating RUL labels using the formula RUL(i) = max(T – t(i),0), where T is the maximum operational cycle and t(i) is the current cycle. Further steps involve normalization based on operating conditions, exponential smoothing, and sliding window processing with a fixed window length L across all datasets. This unified preprocessing pipeline, using all available sensor signals, enhances the model's generalizability, a key advantage over existing methods that often require dataset-specific adjustments.
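To make the pipeline concrete, here is a minimal Python sketch of these preprocessing steps. The column names, smoothing factor, and window length are illustrative placeholders, and the per-operating-condition normalization is simplified to a single global condition; none of this is the authors' exact code.

```python
import numpy as np
import pandas as pd

def preprocess(df: pd.DataFrame, sensor_cols, window_len=30, alpha=0.3):
    # RUL(i) = max(T - t(i), 0), with T the last observed cycle per unit.
    T = df.groupby("unit")["cycle"].transform("max")
    df["rul"] = np.maximum(T - df["cycle"], 0)

    # Z-score normalization (simplified here to one global operating condition).
    df[sensor_cols] = (df[sensor_cols] - df[sensor_cols].mean()) / df[sensor_cols].std()

    # Exponential smoothing of each sensor channel within each unit.
    df[sensor_cols] = (df.groupby("unit")[sensor_cols]
                         .transform(lambda s: s.ewm(alpha=alpha).mean()))

    # Sliding windows of fixed length L over every unit's time series.
    windows, labels = [], []
    for _, unit in df.groupby("unit"):
        x, r = unit[sensor_cols].to_numpy(), unit["rul"].to_numpy()
        for end in range(window_len, len(unit) + 1):
            windows.append(x[end - window_len:end])
            labels.append(r[end - 1])
    return np.stack(windows), np.array(labels)
```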
A key innovation of this research is the introduction of a novel transfer learning strategy. After initial training on a source dataset, most layers of the GPT-2 model are frozen, and only the last few layers are fine-tuned using a smaller portion of data from the target dataset. This "partial layer freezing" approach aims to retain the general knowledge learned from the source domain while adapting quickly to the target task, significantly improving training efficiency and reducing computational costs. The model's performance is evaluated using Root Mean Squared Error (RMSE) and a Score function that penalizes late predictions more heavily than early ones.
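As a rough sketch, the partial layer freezing could look like the following with a HuggingFace GPT-2 backbone; the number of unfrozen top blocks, the trainable final layer norm, and the optimizer settings are illustrative assumptions, not the authors' configuration.

```python
import torch
from transformers import GPT2Model

# Sketch: freeze the pre-trained backbone, fine-tune only the top blocks.
# Unfreezing 2 of gpt2-medium's 24 blocks is an assumed choice.
model = GPT2Model.from_pretrained("gpt2-medium")

for p in model.parameters():
    p.requires_grad = False           # freeze everything first
for block in model.h[-2:]:            # then unfreeze the last few blocks
    for p in block.parameters():
        p.requires_grad = True
model.ln_f.requires_grad_(True)       # final layer norm trainable (assumption)

# Only the unfrozen parameters are handed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```

Freezing the lower blocks preserves the general sequence representations learned on the source dataset, while the small trainable head adapts to the target domain at a fraction of the cost of full fine-tuning.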
The results on the C-MAPSS dataset, a benchmark for RUL prediction, are impressive. The LLM-based framework outperforms existing state-of-the-art models on the FD001, FD002, and FD004 subsets and achieves near state-of-the-art results on FD003. Notably, the framework uses a consistent sliding window length and all sensor signals across all subsets, demonstrating its robust generalizability. Transfer learning experiments further validate the approach: fine-tuning with only 50% of the target domain's training data can outperform training from scratch on the full dataset. In the transfer from FD004 to FD002, for example, the resulting RMSE is lower than that obtained by training solely on the full FD002 dataset.

This research highlights the potential of LLMs to transform industrial signal processing and RUL prediction. The unified model structure and efficient transfer learning strategy address key challenges faced by existing methods, paving the way for more robust and adaptable predictive maintenance solutions. The ability to achieve high accuracy with limited target-domain data makes this approach particularly attractive for real-world industrial applications, where data acquisition can be costly and time-consuming. Future work will explore the application of this framework to larger industrial datasets and further refine the transfer learning strategy.
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders by Kosta Dakic, Kanchana Thilakarathna, Rodrigo N. Calheiros, Teng Joon Lim https://arxiv.org/abs/2410.04817
Caption: This figure showcases the semantic masking and masked autoencoder (MAE) approach for multiview perception. The top row displays an original image, a semantically masked version, and the MAE reconstruction. The bottom row shows the semantic segmentation mask, the patch activity heatmap, a randomly masked image, and the final MAE reconstruction. This method prioritizes transmitting informative patches, significantly reducing communication overhead while maintaining accuracy in multiview detection and tracking.
Multiview systems offer powerful capabilities for scene understanding, but the increasing volume of visual data presents significant bandwidth and computational challenges, especially for resource-constrained camera nodes. This paper introduces a novel approach for communication-efficient distributed multiview detection and tracking, leveraging semantic-guided masking and masked autoencoders (MAEs). The core idea is to prioritize and transmit only the most informative image patches, reducing communication overhead while preserving essential visual information for accurate target detection and tracking.
The proposed system architecture involves several key components. First, at the camera nodes, images are resized and then masked using the semantic-guided strategy. This strategy utilizes a pre-trained Detectron2 semantic segmentation network to identify regions of interest (pedestrians in this case). A heatmap of patch activity levels is generated based on the semantic masks, and a tunable power function (f(x) = x^κ) is applied to control the randomness of patch selection. A subset of patches is then randomly selected based on the normalized activity levels and a predefined masking ratio (r = N_masked / N_total). Only the selected (unmasked) patches are transmitted to the edge server.
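A minimal sketch of this selection logic follows, assuming a binary pedestrian mask from the segmentation network and a regular patch grid; the epsilon floor and sampling-without-replacement step are implementation assumptions rather than the authors' exact code.

```python
import numpy as np

def select_patches(seg_mask, patch=16, kappa=2.0, mask_ratio=0.7,
                   rng=np.random.default_rng(0)):
    H, W = seg_mask.shape
    gh, gw = H // patch, W // patch
    # Per-patch activity: fraction of pedestrian pixels inside each patch.
    act = (seg_mask[:gh * patch, :gw * patch]
           .reshape(gh, patch, gw, patch).mean(axis=(1, 3)).ravel())

    # Tunable power function f(x) = x^kappa controls selection randomness;
    # a small floor keeps empty patches selectable when needed.
    probs = (act + 1e-6) ** kappa
    probs /= probs.sum()

    # Keep (1 - r) * N_total patches, sampled by activity without replacement.
    n_keep = int(round((1 - mask_ratio) * probs.size))
    return rng.choice(probs.size, size=n_keep, replace=False, p=probs)
```

Larger κ concentrates the sampling on high-activity patches, while κ → 0 recovers uniform random masking.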
At the edge server, an MAE reconstructs the complete images from the received patches. These reconstructed images are then aggregated using perspective transformation to obtain a unified bird's-eye view (BEV) representation. Finally, a CNN-based module processes the BEV representation for target detection and tracking.

The system was evaluated on the MultiviewX and Wildtrack datasets using metrics such as MODA, MODP, MOTA, and MOTP. The results demonstrate that the proposed method achieves comparable or better performance than state-of-the-art techniques, even at high masking ratios (e.g., 70%). The semantic-guided masking strategy consistently outperformed random masking, particularly in the denser MultiviewX dataset, suggesting that prioritizing informative regions is crucial for maintaining accuracy at high compression levels. The approach also achieved a significant reduction in communication volume (13.33-fold at 70% masking with image resizing) compared to transmitting full-resolution, unmasked images. For instance, on the Wildtrack dataset with a 70% masking ratio, the method achieved 90.9% MODA, 79.4% MODP, 88.5% MOTA, and 86.8% MOTP while reducing the communication volume to 11.7 Mb. On the MultiviewX dataset with the same masking ratio, it achieved 89.9% MODA, 90.5% MODP, 81.0% MOTA, and 85.8% MOTP, with a communication volume of 10.0 Mb. Additional experiments demonstrated the robustness of the MAE component under camera dropout, further highlighting the system's practicality in real-world scenarios.

This work presents a promising direction for resource-efficient multiview systems, paving the way for wider deployment in bandwidth- and compute-constrained environments.
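As a rough illustration of the perspective-transformation step described above, the sketch below warps each camera view onto a common ground plane with OpenCV; the homographies, output size, and mean-based aggregation are placeholders for the paper's calibrated multiview setup.

```python
import cv2
import numpy as np

def to_bev(image, H_cam_to_ground, bev_size=(480, 480)):
    # Warp one (reconstructed) camera view onto the shared ground plane.
    return cv2.warpPerspective(image, H_cam_to_ground, bev_size)

def aggregate_bev(views, homographies):
    # Fuse the per-camera projections into a single BEV representation
    # (averaging here is a stand-in for the paper's aggregation module).
    return np.mean([to_bev(v, Hm) for v, Hm in zip(views, homographies)],
                   axis=0)
```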
Near-Field ISAC in 6G: Addressing Phase Nonlinearity via Lifted Super-Resolution by Sajad Daei, Amirreza Zamani, Saikat Chatterjee, Mikael Skoglund, Gabor Fodor https://arxiv.org/abs/2410.04930
Caption: This diagram illustrates a near-field ISAC scenario where a target (car) and a communication user (phone) are in close proximity to a multi-antenna ISAC receiver. The orange and green lines represent the communication and radar signal paths, respectively, highlighting the different angles (θ<sub>C</sub>, θ<sub>R</sub>) and distances (r<sub>C</sub>, r<sub>R</sub>) involved, which introduce phase nonlinearities addressed by the lifted super-resolution framework.
6G is poised to integrate sensing and communication (ISAC) functionalities, leveraging extremely large antenna arrays (ELAAs) for enhanced performance. Traditional ISAC models assume far-field operation, where signal wavefronts are planar. However, in realistic scenarios, targets and users are often in close proximity to the receiver, placing them in the near-field region where spherical wavefronts dominate. This shift introduces phase nonlinearities in the channel's steering vectors, a challenge that traditional Fourier analysis methods struggle to address. Near-field ISAC offers advantages like enhanced security, improved spatial resolution, and lower power consumption, making it ideal for applications like contactless payments and health monitoring. However, the nonlinear phase variations necessitate innovative signal processing techniques.
This paper introduces a novel lifted super-resolution framework to tackle the near-field ISAC challenge. The core idea is to transform the nonlinear problem into a linear one in a higher-dimensional space. This is achieved by employing the Jacobi-Anger expansion:
e<sup>iz cos(θ)</sup> = Σ<sub>l=-∞</sub><sup>∞</sup> i<sup>l</sup>J<sub>l</sub>(z)e<sup>ilθ</sup>
where J<sub>l</sub>(z) denotes the Bessel function of the first kind of order l. This expansion effectively linearizes the phase relationship with respect to the angle θ, enabling the application of linear processing techniques. The steering vector a(θ,r), which depends on both the angle θ and the distance r, is then approximated as the product of a distance-dependent matrix and a Vandermonde vector representing the far-field steering vector.
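The expansion is straightforward to verify numerically; the following snippet checks a truncated version against the exact phase term (the truncation order is an arbitrary choice):

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind

z, theta, L = 3.0, 0.7, 30
lhs = np.exp(1j * z * np.cos(theta))
rhs = sum((1j ** l) * jv(l, z) * np.exp(1j * l * theta)
          for l in range(-L, L + 1))
print(abs(lhs - rhs))  # ~1e-16: each term's phase is now linear in theta
```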
The proposed method then leverages super-resolution techniques to exploit the continuous angular sparsity inherent in the near-field channel. Specifically, an optimization problem is formulated to estimate the continuous-valued angle parameter. This problem is relaxed into a tractable semidefinite programming (SDP) formulation, which can be solved efficiently. Once the angle estimates are obtained, an alternating optimization procedure recovers the distances and channel gains.

Numerical experiments validate the effectiveness of the method. In a scenario with a target at θ<sub>R</sub> = π/3 and r<sub>R</sub> = 4, and a user at θ<sub>C</sub> = 1 and r<sub>C</sub> = 2, the method accurately estimates the angles and distances: the estimated distances were r<sub>C</sub> = 1.9 and r<sub>R</sub> = 4.05, demonstrating the accuracy of the approach. These results highlight the potential of lifted super-resolution for practical near-field ISAC applications in 6G, paving the way for enhanced sensing and communication capabilities in next-generation wireless systems. However, the computational complexity of the SDP, which depends on the frequency and distance parameters, remains a consideration for future research.
This newsletter highlights a convergence of advanced techniques in signal processing and machine learning to address critical challenges across diverse fields. From leveraging LLMs for predictive maintenance in industrial settings to optimizing resource allocation in multiview systems and tackling the complexities of near-field ISAC in 6G, these studies showcase innovative solutions with significant real-world implications. The common thread weaving through these diverse applications is the intelligent use of data and sophisticated algorithms to enhance performance, efficiency, and robustness. The advancements presented in these papers offer promising directions for future research and development, paving the way for smarter, more efficient, and more reliable systems across various domains.