Paper deep dive
Machine Learning for the Internet of Underwater Things: From Fundamentals to Implementation
Kenechi Omeke, Attai Abubakar, Michael Mollel, Lei Zhang, Qammer H. Abbasi, Muhammad Ali Imran
Abstract
The Internet of Underwater Things (IoUT) is becoming a critical infrastructure for ocean observation, marine resource management, and climate science. Its development is hindered by severe acoustic attenuation, propagation delays far exceeding those of terrestrial wireless systems, strict energy constraints, and dynamic topologies shaped by ocean currents. Machine learning (ML) has emerged as a key enabler for addressing these limitations, offering data-driven mechanisms that enhance performance across all layers of underwater wireless sensor networks. This tutorial survey synthesises ML methodologies (supervised, unsupervised, reinforcement, and deep learning), specifically contextualised for underwater communication environments. It outlines the algorithmic principles of each paradigm and examines the conditions under which particular approaches deliver superior performance. A layer-wise analysis highlights physical layer gains in localisation and channel estimation, MAC layer adaptations that improve channel utilisation, network layer routing strategies that extend operational lifetime, and transport layer mechanisms capable of reducing packet loss by up to 91 percent. At the application layer, ML enables substantial data compression and object detection accuracies reaching 92 percent. Drawing on 300 studies from 2012 to 2025, the survey documents energy efficiency gains of 7 to 29 times, throughput improvements over traditional protocols, and cross-layer optimisation benefits of up to 42 percent. It also identifies persistent barriers, including limited datasets, computational constraints, and the gap between theoretical models and real-world deployment. The survey concludes with emerging research directions and a technology roadmap supporting ML adoption in operational underwater networks.
Tags
Links
- Source: https://arxiv.org/abs/2603.07413v1
- Canonical: https://arxiv.org/abs/2603.07413v1
PDF not stored locally. Use the link above to view on the source site.
Intelligence
Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 94%
Last extracted: 3/13/2026, 12:33:34 AM
Summary
This paper provides a comprehensive tutorial and survey on the application of Machine Learning (ML) to the Internet of Underwater Things (IoUT). It addresses the unique challenges of underwater environments—such as acoustic attenuation, propagation delays, and energy constraints—by synthesizing 300 studies from 2012 to 2025. The survey covers ML paradigms (supervised, unsupervised, reinforcement, and deep learning) across all protocol layers, highlighting performance gains in localization, channel estimation, routing, and data compression, while identifying critical implementation barriers and future research directions like physics-informed neural networks.
Entities (5)
Relation Signals (3)
Machine Learning → optimizes → Internet of Underwater Things
confidence 95% · ML has emerged as a key enabler for addressing these limitations, offering data driven mechanisms that enhance performance across all layers of underwater wireless sensor networks.
Physics-Informed Neural Networks → enhances → Internet of Underwater Things
confidence 90% · The survey identifies high-impact research directions including physics-informed neural networks that achieve accurate predictions
Autonomous Underwater Vehicles → operates within → Internet of Underwater Things
confidence 90% · The IoUT architecture comprises several key components... These nodes communicate with Autonomous Underwater Vehicles (AUVs)
Cypher Suggestions (2)
Identify devices used in IoUT architectures · confidence 90% · unvalidated
MATCH (d:Device)-[:OPERATES_WITHIN]->(n:Network {name: 'IoUT'}) RETURN d.name
Find all ML algorithms applied to IoUT layers · confidence 85% · unvalidated
MATCH (a:Algorithm)-[:APPLIED_TO]->(l:Layer)-[:PART_OF]->(n:Network {name: 'IoUT'}) RETURN a.name, l.name
Full Text
415,486 characters extracted from source content.
Machine Learning for the Internet of Underwater Things: From Fundamentals to Implementation
Kenechi Omeke, Attai Abubakar, Michael Mollel, Lei Zhang, Qammer H. Abbasi and Muhammad Ali Imran
James Watt School of Engineering, University of Glasgow, Glasgow, United Kingdom
Abstract—The Internet of Underwater Things (IoUT) enables transformative applications in ocean monitoring, marine resource management, and climate science, yet faces formidable challenges including severe acoustic signal attenuation, propagation delays that are 200,000 times greater than terrestrial wireless, extreme energy constraints, and dynamic network topologies caused by ocean currents. Machine learning (ML) techniques are revolutionising underwater wireless sensor networks to address these challenges. This comprehensive tutorial-survey examines how ML enables transformative capabilities across all protocol layers. We provide a systematic tutorial on ML algorithms, covering supervised, unsupervised, reinforcement, and deep learning paradigms, specifically contextualised for underwater communications, explaining not only algorithmic mechanics but why certain approaches excel in specific underwater scenarios. Our layer-by-layer analysis covers physical layer innovations including high-accuracy localisation techniques and substantial channel estimation improvements, MAC layer adaptations which demonstrate significant channel utilisation gains over baseline protocols, network layer protocols that offer substantial network lifetime extensions, transport layer optimisations that achieve up to 91% packet loss reduction, and application layer intelligence resulting in up to 10 times data compression and 92% object detection accuracy.
We synthesise 300 papers from 2012–2025 that demonstrate how ML approaches achieve substantial energy efficiency gains (7–29 times in specific scenarios) and notable throughput improvements over traditional methods, with cross-layer optimisation delivering 42% additional performance beyond layer-isolated approaches. We critically examine implementation challenges, including the “million-dollar dataset” problem, computational constraints of underwater platforms, and the theory-to-practice deployment gap. The survey identifies high-impact research directions including physics-informed neural networks that achieve accurate predictions from hundreds of measurements rather than millions, federated learning enabling privacy-preserving collaboration despite acoustic bandwidth limitations (10–100 kbps), and transformer architectures that capture long-range dependencies in acoustic signals. We present a technology roadmap covering near-term deployments through transformative capabilities expected from 2035 and beyond, alongside practical decision frameworks for ML adoption. This work serves as both an authoritative reference for researchers entering the field and a practical implementation guide for engineers deploying ML-enhanced underwater networks.
Index Terms—Internet of Underwater Things, machine learning, deep learning, reinforcement learning, federated learning, underwater acoustic communications, wireless sensor networks, autonomous underwater vehicles, physics-informed neural networks
I. INTRODUCTION
The Earth is fundamentally a water planet, with over 70% of its surface covered by oceans that regulate global climate, generate approximately 50% of the planet’s oxygen, absorb 25% of atmospheric carbon dioxide, and provide sustenance for billions of people worldwide [1], [2].
Despite this critical role in sustaining life, more than 90% of our oceans remain unexplored, presenting both an opportunity and an urgent challenge as climate change threatens marine ecosystems and, by extension, human survival [3]. The Internet of Underwater Things (IoUT) has emerged as a transformative paradigm to revolutionise our stewardship of marine environments through the convergence of advanced sensing, wireless communication, and artificial intelligence (AI) [4], [5].
A. The Internet of Underwater Things: Vision and Challenges
The IoUT represents a sophisticated ecosystem of interconnected underwater devices, sensors, and autonomous vehicles that collect, transmit, and analyse marine data in real-time [2], [3]. This paradigm extends the terrestrial Internet of Things (IoT) into the aquatic domain, enabling unprecedented monitoring capabilities for applications ranging from climate change mitigation to offshore energy production, marine biodiversity conservation, and national security operations [6].
Terminology Note: Throughout this survey, we use IoUT as the umbrella term encompassing all underwater networking paradigms. This includes Underwater Wireless Sensor Networks (UWSNs), which refer to networks of battery-powered sensors, and Underwater Acoustic Sensor Networks (UASNs), which specifically denote acoustic communication-based systems. Formally, IoUT ⊃ UWSN ⊃ UASN, with IoUT representing the broadest concept of networked underwater intelligence.
At its core, the IoUT architecture comprises several key components working in concert. Underwater sensor nodes form the foundation, deployed across the seafloor or suspended at various depths to monitor physical parameters (such as temperature, pressure, and salinity) and chemical indicators (including pH levels, dissolved oxygen, and pollutant concentrations) [7].
These nodes communicate with Autonomous Underwater Vehicles (AUVs) that serve as mobile data collectors and relay stations, bridging the gap between stationary sensors and surface gateways [8], [9]. Surface buoys and vessels equipped with satellite or cellular communication capabilities complete the network architecture, providing the critical link to cloud-based data centres where advanced analytics and decision-making occur [10]. The evolution toward IoUT has been driven by converging technological advances and pressing global needs. The catastrophic impacts of climate change on marine ecosystems—from coral bleaching events that have devastated the Great Barrier Reef to the accelerating acidification of ocean waters—demand comprehensive, real-time monitoring systems that traditional oceanographic methods cannot provide [11].
[arXiv:2603.07413v1 [eess.SY] 8 Mar 2026]
Simultaneously, the explosive growth in offshore activities, including renewable energy installations, aquaculture operations, and deep-sea mining ventures, requires sophisticated underwater communication networks for operational efficiency and environmental compliance [3]. Consider the scale of the challenge: monitoring even a small fraction of the ocean’s 361 million square kilometres of surface area, extending to average depths of 3,688 metres, requires networks of thousands or potentially millions of sensors [12]. These networks must operate autonomously for extended periods, often years, in one of the most hostile environments on Earth. The pressure at ocean depths can exceed 1,000 times atmospheric pressure, temperatures hover near freezing, and corrosive saltwater attacks electronic components relentlessly [13]. Unlike terrestrial sensor networks where maintenance crews can readily access and service equipment, underwater sensors may be deployed at depths where human intervention is impossible or prohibitively expensive [12].
The applications enabled by IoUT span multiple domains with transformative potential. In environmental monitoring, dense sensor networks track the formation and movement of harmful algal blooms that threaten marine life and coastal communities, while distributed acoustic sensors monitor the health of marine mammal populations through their vocalisations [14]. For the offshore energy sector, IoUT enables real-time structural health monitoring of oil platforms, pipelines, and wind turbines, detecting microscopic cracks or corrosion before catastrophic failures occur [15]. Military and security applications leverage IoUT for harbour protection, mine detection, and submarine tracking, while scientific research benefits from continuous observation of deep-sea hydrothermal vents, underwater volcanoes, and previously inaccessible marine habitats [16]. The economic implications are equally profound. The global “Blue Economy,” valued at over $1.5 trillion annually, depends increasingly on reliable underwater communication and monitoring systems [3]. Aquaculture operations, which produce over 80 million tons of seafood annually, utilise IoUT for optimising feeding schedules, monitoring water quality, and tracking fish health [17]. Offshore wind farms, projected to generate 420 GW of power by 2050, rely on underwater sensor networks for foundation monitoring and cable integrity assessment. Even international telecommunications, with 99% of intercontinental data traffic carried by submarine cables worth over $10 trillion in annual transactions, depends on IoUT technologies for cable monitoring and protection [3].
B. Unique Challenges of Underwater Communications
The underwater environment presents fundamental physical challenges that render conventional wireless communication technologies ineffective or severely limited. Understanding these challenges is crucial for appreciating why ML approaches have become essential for IoUT systems [18].
1) Physical Propagation Characteristics: The propagation of electromagnetic and acoustic waves underwater differs dramatically from terrestrial environments, creating unique constraints for each communication modality:
Acoustic Communication: Sound waves remain the primary communication medium for long-range underwater applications due to their relatively low attenuation in seawater. However, acoustic communication suffers from severe limitations that would be unacceptable in terrestrial networks [19]. The speed of sound in water, approximately 1,500 m/s, is 200,000 times slower than electromagnetic waves in air, resulting in propagation delays measured in seconds rather than microseconds for kilometre-scale distances. This fundamental constraint creates challenges for any protocol requiring acknowledgments or time synchronisation [20]. The acoustic channel’s bandwidth is severely limited, typically offering only 1–100 kHz for practical systems, compared to GHz-scale bandwidths available to terrestrial wireless networks. This bandwidth limitation becomes more severe with distance due to frequency-dependent absorption, where higher frequencies experience exponentially greater attenuation [21]. Furthermore, the acoustic channel exhibits extreme time-varying characteristics. Sound speed varies with temperature, salinity, and pressure, creating curved propagation paths that change with daily and seasonal cycles. In shallow water environments, multipath propagation from surface and bottom reflections creates frequency-selective fading with delay spreads exceeding 100 milliseconds—orders of magnitude greater than terrestrial wireless channels [22]. Doppler effects from platform motion and water currents further complicate signal processing, with Doppler spreads potentially exceeding 10 Hz even for slowly moving platforms.
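The delay and absorption figures above follow directly from the physics and are easy to reproduce. A minimal sketch, assuming a nominal 1,500 m/s sound speed and Thorp's empirical absorption formula (a standard textbook approximation, not a model given in this paper):

```python
import math

SOUND_SPEED = 1500.0  # m/s, nominal speed of sound in seawater


def propagation_delay_s(distance_m: float) -> float:
    """One-way acoustic propagation delay over a given distance."""
    return distance_m / SOUND_SPEED


def thorp_absorption_db_per_km(f_khz: float) -> float:
    """Thorp's empirical absorption formula (dB/km), frequency in kHz.

    Valid roughly from 100 Hz to 100 kHz; coefficients are the
    widely quoted textbook values.
    """
    f2 = f_khz ** 2
    return (0.11 * f2 / (1 + f2)
            + 44 * f2 / (4100 + f2)
            + 2.75e-4 * f2
            + 0.003)


# A 10 km link incurs roughly 6.7 s of one-way delay, and higher
# carrier frequencies absorb far more energy over the same path:
delay = propagation_delay_s(10_000)
loss_10khz = thorp_absorption_db_per_km(10) * 10   # dB over 10 km
loss_50khz = thorp_absorption_db_per_km(50) * 10
print(f"delay: {delay:.2f} s, 10 kHz: {loss_10khz:.1f} dB, 50 kHz: {loss_50khz:.1f} dB")
```

The steep growth of the 50 kHz loss relative to the 10 kHz loss is the frequency-dependent absorption the text describes, and it is why usable bandwidth shrinks with range.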
Optical Communication: Visible light communication offers high bandwidth potential underwater, with blue-green wavelengths experiencing relatively low absorption in clear ocean water [23]. Modern underwater optical systems can achieve data rates exceeding 1 Gbps over distances of 100 metres in optimal conditions. However, optical communication faces severe range limitations due to exponential attenuation from both absorption and scattering. In typical ocean water, optical signals may only propagate 10–20 metres, while in turbid coastal waters, the range drops to mere metres or even centimetres [24]. The requirement for line-of-sight alignment between transmitter and receiver presents additional challenges in the dynamic underwater environment. Ocean currents, platform motion, and marine growth on optical windows all contribute to alignment difficulties.
Radio Frequency (RF) and Magnetic Induction (MI): Electromagnetic waves at radio frequencies experience severe attenuation in seawater due to its high conductivity (typically 4 S/m) [25]. The skin depth, which characterises penetration distance, is inversely proportional to the square root of frequency. While extremely low frequencies (ELF, 3–30 Hz) can propagate through seawater for thousands of kilometres, they require enormous antennas and offer data rates measured in bits per minute, making them impractical for most IoUT applications. Magnetic induction (MI) offers a unique alternative based on near-field coupling between coil antennas [10]. MI channels exhibit predictable, distance-dependent attenuation without the multipath fading that plagues acoustic and RF systems. However, MI systems typically require large coil antennas and suffer from rapid signal decay with distance (proportional to 1/r³), limiting their application to short-range, high-reliability scenarios [26].
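The skin-depth relationship can be made concrete. A small sketch, assuming the good-conductor approximation δ = 1/√(π f μ₀ σ) with the 4 S/m seawater conductivity quoted above (the frequencies chosen are illustrative):

```python
import math

MU0 = 4 * math.pi * 1e-7   # vacuum permeability, H/m
SIGMA_SEAWATER = 4.0       # typical seawater conductivity, S/m


def skin_depth_m(freq_hz: float, sigma: float = SIGMA_SEAWATER) -> float:
    """Skin depth of a good conductor: delta = 1 / sqrt(pi * f * mu0 * sigma)."""
    return 1.0 / math.sqrt(math.pi * freq_hz * MU0 * sigma)


# Penetration falls off as 1/sqrt(f): tens of metres at ELF-adjacent
# frequencies, millimetres at Wi-Fi frequencies.
for f in (100.0, 10e3, 2.4e9):
    print(f"{f:>14.0f} Hz -> skin depth {skin_depth_m(f):.5f} m")
```

This is the quantitative reason RF above a few kHz is essentially unusable underwater, and why ELF systems trade enormous antennas for kilometre-scale reach.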
2) Environmental and Operational Challenges: Beyond propagation physics, the underwater environment imposes severe operational constraints that compound communication difficulties:
Energy Constraints: Underwater sensors operate on finite battery resources that cannot be easily replaced or recharged. Solar panels cannot function at depth, and the logistics of battery replacement for thousands of sensors deployed at ocean depths make it economically infeasible [27]. Acoustic modems consume 10–100 watts during transmission—orders of magnitude higher than terrestrial wireless systems—while even receiving operations draw several watts. With typical battery capacities of 10–100 Wh for compact sensors, operational lifetimes are measured in weeks or months rather than the years achieved by terrestrial IoT devices [28].
Node Mobility and Network Topology: Ocean currents cause continuous sensor drift, with velocities ranging from centimetres per second in deep waters to metres per second in tidal zones. This mobility destroys any carefully planned network topology within hours or days of deployment [29]. Sensors deployed in a grid pattern quickly disperse into irregular configurations, creating coverage gaps and communication voids. The three-dimensional nature of the ocean adds complexity, as sensors can move vertically due to pressure changes, temperature gradients, or attachment to marine organisms.
Environmental Noise: The underwater acoustic environment contains numerous noise sources that vary spatially and temporally [14]. Shipping noise dominates low frequencies (10–1000 Hz) near commercial routes, with levels exceeding 100 dB re 1 μPa. Breaking waves create broadband noise that increases with wind speed, while marine mammals produce intense biological noise—snapping shrimp colonies generate broadband clicks exceeding 200 dB re 1 μPa at close range [30].
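The energy figures above translate into a stark lifetime budget. A back-of-the-envelope sketch, with hypothetical duty cycles but power and battery values taken from the ranges quoted in the text (10–100 W transmit, a few watts receive, 10–100 Wh battery):

```python
def lifetime_days(battery_wh: float,
                  tx_power_w: float, tx_duty: float,
                  rx_power_w: float, rx_duty: float,
                  sleep_power_w: float = 0.01) -> float:
    """Estimated node lifetime from a simple duty-cycle energy budget.

    Duty fractions are hypothetical illustration values, not figures
    from the surveyed deployments.
    """
    sleep_duty = 1.0 - tx_duty - rx_duty
    avg_power_w = (tx_power_w * tx_duty
                   + rx_power_w * rx_duty
                   + sleep_power_w * sleep_duty)
    return battery_wh / avg_power_w / 24.0


# A 50 Wh node transmitting 1% of the time at 30 W and listening 10%
# of the time at 2 W survives only a handful of days:
print(f"{lifetime_days(50, 30, 0.01, 2, 0.10):.1f} days")
```

Even modest listening duty dominates the budget, which is why the ML-driven sleep scheduling and local aggregation discussed later in the survey matter so much.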
Biofouling and Corrosion: Marine growth accumulates on exposed surfaces within days of deployment, potentially covering acoustic transducers, optical windows, and sensor membranes. Biofouling alters acoustic impedance, reduces optical transmission, and can completely disable sensors within months. Corrosion from saltwater exposure attacks electronic components and mechanical structures, while pressure housings must withstand immense static pressures and cyclic loading [13].
Deployment and Maintenance Costs: The economics of underwater operations differ dramatically from terrestrial networks. Research vessel operations cost $20,000–$50,000 per day, making sensor deployment and recovery expensive propositions [12]. Deep-sea operations requiring specialised vessels and Remotely Operated Vehicles (ROVs) can exceed $100,000 per day. Even in shallow coastal waters, diver operations cost thousands of dollars per day with strict safety limitations. These economic realities demand that IoUT systems operate autonomously for extended periods with minimal human intervention.
C. Why ML for IoUT?
The convergence of these challenges—hostile propagation environments, severe resource constraints, dynamic network topologies, and prohibitive maintenance costs—renders traditional communication approaches inadequate for IoUT systems. Conventional protocols designed for stable, high-bandwidth terrestrial networks fail catastrophically when confronted with seconds-long propagation delays, time-varying channels, and nodes that drift kilometres from their deployment positions [31]. This is where ML emerges not just as an optimisation tool but as an essential enabler of functional IoUT systems [32].
1) Fundamental Advantages of ML Approaches: ML algorithms offer unique capabilities that directly address the core challenges of IoUT:
Adaptation to Non-Stationary Environments: Unlike traditional protocols with fixed parameters, ML algorithms continuously learn and adapt to changing environmental conditions [33]. Consider acoustic channel equalisation: conventional approaches require accurate channel models that become obsolete within minutes as temperature gradients shift. In contrast, deep learning equalisers trained on diverse channel conditions can generalise to previously unseen channel states, maintaining performance despite environmental variations [34].
Implicit Environmental Modelling: The complexity of the ocean defies analytical modelling—three-dimensional temperature and salinity fields, irregular bottom topography, and internal waves create propagation conditions that would require solving coupled partial differential equations in real-time. ML algorithms bypass this complexity by learning implicit environmental models from data. Reinforcement learning agents, for instance, discover optimal transmission strategies without explicitly modelling the channel, instead learning from reward signals based on successful packet delivery [35].
Predictive Capabilities for Proactive Management: Time-series prediction using Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks enables IoUT systems to anticipate and prepare for environmental changes [36]. By learning patterns in historical oceanographic data, these models predict future channel conditions hours or days in advance, allowing proactive adjustment of communication parameters. For example, LSTM models trained on tidal data can predict node positions with metre-scale accuracy hours in advance, enabling preemptive routing table updates.
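The core of the tidal-prediction idea can be shown with a deliberately simple stand-in: instead of an LSTM, the sketch below fits a two-tap linear autoregressive predictor by least squares to an invented sinusoidal "tidal" record and forecasts one step ahead. This is a toy illustration of learning a predictor from history, not the models used in the surveyed work:

```python
import math


def fit_ar2(series):
    """Least-squares fit of x[n] ~ a*x[n-1] + b*x[n-2].

    A minimal linear stand-in for the LSTM predictors discussed in
    the text: solve the 2x2 normal equations directly.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for n in range(2, len(series)):
        f1, f2, y = series[n - 1], series[n - 2], series[n]
        s11 += f1 * f1; s12 += f1 * f2; s22 += f2 * f2
        r1 += f1 * y;   r2 += f2 * y
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


# Hypothetical "tidal" displacement record: a slow sinusoid.
tide = [math.sin(2 * math.pi * n / 124) for n in range(500)]
a, b = fit_ar2(tide)
pred = a * tide[-1] + b * tide[-2]          # one-step-ahead forecast
true = math.sin(2 * math.pi * 500 / 124)
print(f"predicted {pred:.4f}, actual {true:.4f}")
```

For a pure sinusoid an AR(2) model is exact (a = 2 cos ω, b = −1), so the learned predictor recovers the next sample; real tidal records are quasi-periodic rather than exact, which is precisely where LSTM-style nonlinear models earn their keep.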
Intelligent Resource Management: The severe energy constraints of underwater sensors demand intelligent power management beyond simple duty cycling. ML algorithms optimise energy allocation across sensing, processing, and communication tasks based on learned patterns of data importance and channel conditions [37]. Reinforcement learning approaches have demonstrated 200–300% improvements in network lifetime by learning when to aggregate data locally versus transmit immediately, and when to enter deep sleep modes based on predicted future communication opportunities.
2) Transformative Applications Enabled by ML: The integration of ML into IoUT systems has enabled applications that were previously impossible:
Autonomous Underwater Vehicle Navigation: Traditional AUV navigation relies on pre-programmed waypoints and basic obstacle avoidance. ML-enabled AUVs use deep reinforcement learning for adaptive path planning that responds to discovered features, unexpected obstacles, and dynamic current fields [8], [38]. These systems have achieved significant reductions in energy consumption while improving area coverage by learning efficient search patterns tailored to specific environments.
Distributed Environmental Sensing: ML transforms networks of simple sensors into intelligent environmental monitoring systems. Instead of transmitting raw measurements that quickly exhaust batteries, edge ML algorithms identify and transmit only anomalous events [39]. Federated learning approaches enable sensors to collaboratively build environmental models without centralised data collection, preserving privacy while reducing communication overhead by up to 90% [40], [41].
Adaptive Protocol Stacks: Every layer of the communication protocol stack benefits from ML optimisation. At the physical layer, deep learning improves modulation classification accuracy even at negative Signal-to-Noise Ratios (SNR) [42].
The MAC layer employs reinforcement learning for collision-free channel access that achieves significantly higher channel utilisation compared to traditional ALOHA variants in long-delay acoustic networks [43]. Network layer protocols use Q-learning for routing decisions that balance energy consumption, delay, and reliability based on application requirements [44].
3) Recent Breakthroughs and Success Stories: The past five years have witnessed remarkable demonstrations of the transformative potential of ML in real-world IoUT deployments: The DARPA Ocean of Things program has deployed thousands of intelligent floats equipped with edge ML capabilities for persistent maritime surveillance, using onboard processing to classify vessel signatures while minimising power consumption [45]. Commercial aquaculture operations in Norway have deployed ML-enabled monitoring networks that reduced fish mortality by significant margins through early disease detection using computer vision algorithms that analyse swimming patterns [17]. Research initiatives like FathomNet use ML to process terabytes of visual data, accelerating marine species discovery and enabling automated anomaly detection in deep-sea environments [46].
D. Contributions and Organisation
This article provides a comprehensive tutorial and survey on ML techniques and their applications in the IoUT, specifically designed to guide researchers and practitioners in selecting and implementing appropriate ML solutions for underwater communication and networking challenges. Our contributions are fourfold:
1) Tutorial Foundation: We present a systematic tutorial on ML fundamentals tailored specifically for the underwater communications community.
Rather than generic ML descriptions, we explain each algorithm category—supervised, unsupervised, reinforcement, and deep learning—through the lens of underwater applications, providing intuitive explanations of why certain approaches excel in specific underwater scenarios.
2) Layer-by-Layer Survey: We provide the first comprehensive survey of ML applications in IoUT organised by protocol stack layers, covering literature from 2012 to 2025. This organisation enables practitioners to quickly identify relevant techniques for their specific challenges, whether optimising physical layer modulation, designing MAC protocols for long-delay channels, implementing energy-aware routing, or developing application-layer data analytics.
3) Implementation Guidelines: We synthesise practical implementation guidelines derived from successful deployments, addressing the critical gap between theoretical ML research and operational IoUT systems (detailed in Section VI). These guidelines cover computational constraints of underwater platforms, training data requirements (including the “million-dollar dataset” problem discussed in Section VIII-A1), model selection criteria that balance accuracy versus complexity, and deployment strategies for resource-constrained networks.
4) Future Roadmap: We identify emerging research directions at the intersection of ML and IoUT, highlighting opportunities where recent ML advances, from Physics-Informed Neural Networks (PINNs) [47] to transformer architectures [48], can address long-standing underwater communication challenges. We provide a roadmap enabling researchers to focus efforts on high-impact problems.
The remainder of this article is organised as follows: Section II presents our ML primer for underwater communications. Section III provides a critical comparison with existing surveys. Section IV forms the technical core, systematically reviewing ML applications across protocol layers.
Section V presents quantitative comparisons between ML and traditional approaches. Section VI addresses implementation challenges and solutions. Section VII explores future research directions and emerging opportunities. Section VIII documents open challenges that need to be addressed before intelligent IoUT systems can reach their full deployment maturity. Finally, Section IX summarises key findings and conclusions.
II. ML PRIMER FOR UNDERWATER COMMUNICATIONS
Navigation Guide
For readers with strong ML background: This section provides a 35-page tutorial on ML fundamentals contextualised for underwater applications. Readers familiar with supervised learning, reinforcement learning, and deep neural networks may:
- Skip to Section II-F2 (Physics-Informed Neural Networks) and Section II-F4 (Transformer Architectures) for emerging paradigms, OR
- Proceed directly to Section IV (Layer-by-Layer Analysis) for underwater-specific applications
For readers new to ML: This section builds intuition progressively from fundamentals to advanced architectures, with all concepts explained through underwater examples.
ML represents a paradigm shift in how we approach underwater communication challenges, moving from rigid, rule-based protocols to adaptive systems that learn optimal strategies from experience [33], [49]. This section provides a comprehensive tutorial on ML techniques specifically contextualised for underwater applications, explaining not just what these algorithms do, but why certain approaches excel in addressing the unique challenges of the underwater environment. We structure this primer to build intuition progressively, starting with fundamental concepts and advancing to sophisticated architectures currently revolutionising IoUT systems.
A. Foundations of ML in the Underwater Context
Before discussing specific algorithms, it is essential to understand what makes ML uniquely suited to underwater environments and how the learning paradigm differs from traditional algorithmic approaches [3], [50].
1) The Learning Paradigm Shift: Traditional underwater communication protocols operate on predetermined rules: transmit at power level P, wait for time T, retransmit N times upon failure [18]. These rules, derived from theoretical models or empirical observations, remain fixed regardless of environmental changes. When water temperature stratification alters acoustic propagation paths, when seasonal migrations bring noise-generating marine life, or when storm-driven currents scatter sensor nodes, traditional protocols cannot adapt—they continue executing the same rigid rules, often with catastrophic performance degradation [51].
ML fundamentally changes this paradigm [52], [53]. Instead of programming explicit rules, we enable systems to learn patterns from data and experience. An ML-enabled acoustic modem does not follow fixed transmission rules; it learns when higher power improves reliability, when waiting reduces collisions, and when alternative routes bypass interference [33]. This learning occurs through three fundamental mechanisms that we will explore in detail: supervised learning from labelled examples, unsupervised learning from data structure, and reinforcement learning from environmental interaction [54].

TABLE I: COMPARISON OF TRADITIONAL VS. ML-BASED APPROACHES IN UNDERWATER COMMUNICATIONS

| Function | Traditional Approach | ML-Based Approach |
| --- | --- | --- |
| Channel Estimation | Analytical models, pilot symbols | Neural network prediction, adaptive [34] |
| Power Control | Fixed levels, lookup tables | RL-based adaptation [35] |
| MAC Protocol | Fixed backoff, TDMA slots | Learning-based scheduling [43] |
| Routing | Shortest path, geographic | Q-learning, GNN-based [44] |
| Localisation | ToA/TDoA algorithms | DNN regression, RL-aided [55] |

Consider a concrete example that illustrates this paradigm shift. A traditional underwater MAC protocol might implement carrier sense multiple access (CSMA) with fixed backoff windows, designed for worst-case propagation delays [18]. In a shallow water environment with 10 km maximum range, this means waiting up to 13 seconds (assuming 1500 m/s sound speed) before transmission—even when communicating with a neighbour 100 metres away. An ML-based approach learns the actual network topology and traffic patterns, adapting backoff times to real conditions [43]. Through reinforcement learning, nodes discover that morning thermal stratification creates reliable long-range propagation, enabling aggressive transmission scheduling, while afternoon mixing requires conservative strategies [35]. The result: 200–300% throughput improvement without modifying hardware [32]. Table I summarises the key differences between traditional and ML-based approaches across major underwater networking functions.
2) Data Representations for Underwater Signals: The foundation of any ML system is data representation—how we transform raw underwater signals into mathematical forms that algorithms can process [56]. This transformation critically impacts learning effectiveness and computational requirements. Acoustic signals in underwater communications typically arrive as time-series pressure measurements from hydrophones, sampled at rates from 10 kHz to 1 MHz depending on the communication bandwidth [21].
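The learned-backoff idea can be caricatured in a few lines. The sketch below is an illustrative bandit-style Q-update with an invented reward model (success probabilities and the worst-case 13 s window are stand-ins), not any protocol from the surveyed literature:

```python
import random

random.seed(0)

# Candidate backoff windows in seconds; 13.0 is the worst-case figure
# from the CSMA example, the shorter values exploit a nearby neighbour.
ACTIONS = [0.5, 2.0, 13.0]


def success_prob(backoff):
    """Hypothetical channel: with a close neighbour, short backoffs
    almost always succeed, so waiting the worst case mostly wastes time."""
    return {0.5: 0.90, 2.0: 0.92, 13.0: 0.95}[backoff]


def reward(backoff):
    # Throughput-style reward: delivery success divided by time waited.
    ok = random.random() < success_prob(backoff)
    return (1.0 if ok else 0.0) / (backoff + 1.0)


q = {a: 0.0 for a in ACTIONS}
alpha, eps = 0.1, 0.2
for _ in range(5000):
    # Epsilon-greedy action selection, one-step (bandit) Q update.
    a = random.choice(ACTIONS) if random.random() < eps else max(q, key=q.get)
    q[a] += alpha * (reward(a) - q[a])

best = max(q, key=q.get)
print("learned backoff:", best, {k: round(v, 3) for k, v in q.items()})
```

The agent converges on the short backoff because the marginal reliability of the conservative 13 s window never repays its delay cost; the full RL MAC protocols cited above add state (time of day, neighbour, traffic) to the same core update.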
The raw time-domain signal x(t) contains all information but obscures patterns that ML algorithms need to recognise. Therefore, we employ various transformations that highlight different signal characteristics [57].

Frequency Domain Representation: The frequency domain representation via Fast Fourier Transform (FFT) reveals spectral content crucial for identifying modulation schemes and detecting narrowband interference [22]. For an N-point FFT of time-domain samples x(n), we obtain complex spectral coefficients X(k) that separate signal from noise in frequency:

X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi kn/N},   (1)

where k is the frequency bin index.

Fig. 1. Taxonomy of ML techniques for the IoUT. This survey covers four major categories: supervised learning for classification and prediction tasks, unsupervised learning for pattern discovery and compression, reinforcement learning for adaptive protocol design, and advanced paradigms including federated learning and transformer architectures.

Time-Frequency Representations: Underwater acoustic channels exhibit time-varying frequency responses due to surface waves and platform motion [58]. This motivates time-frequency representations like spectrograms, which apply short-time Fourier transforms (STFT) to capture spectral evolution:

S(t, f) = \left| \sum_{n=-\infty}^{\infty} x(n)\, w(n - t)\, e^{-j2\pi fn} \right|^2,   (2)

where S(t, f) is the spectrogram (power spectral density at time t and frequency f), x(n) is the discrete-time signal, and w(n) is a window function (e.g., Hamming or Hann window) that balances time and frequency resolution.
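Equation (1) is just a finite sum and can be checked with a direct implementation. A minimal pure-Python sketch (a real modem would use an FFT library; the 8-point tone is illustrative):

```python
import cmath

def dft(x):
    """Direct N-point DFT: X(k) = sum_n x(n) * exp(-j*2*pi*k*n/N), as in Eq. (1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

# A pure tone at bin 2 of an 8-point frame concentrates its energy there.
N = 8
tone = [cmath.exp(2j * cmath.pi * 2 * n / N).real for n in range(N)]
X = dft(tone)
peak_bin = max(range(N), key=lambda k: abs(X[k]))
print(peak_bin)  # real tone -> energy at bins 2 and N-2
```

The same summation underlies the STFT in Eq. (2); the window w(n) simply restricts it to a short frame before the transform.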
For underwater communications with typical symbol rates of 1–10 kbaud and Doppler spreads up to 10 Hz, window lengths of 10–100 ms provide effective time-frequency resolution [22].

Cepstral Domain: The cepstral domain, obtained by computing the inverse FFT of the log-magnitude spectrum, separates channel effects from transmitted signals, which is particularly valuable in multipath environments [56]:

c(n) = \mathrm{IFFT}\{\log |X(k)|\},   (3)

where c(n) is the cepstral coefficient at quefrency n and X(k) is the frequency-domain representation. Cepstral coefficients concentrate multipath information in high-frequency components whilst preserving modulation information in low-frequency terms, enabling ML algorithms to independently learn channel compensation and symbol detection strategies.

Spatial Representations: For spatial processing with hydrophone arrays, we extend representations to include directional information [59]. The array covariance matrix R captures spatial correlation:

R = \mathrm{E}[\mathbf{x}(t)\,\mathbf{x}^{H}(t)],   (4)

where x(t) is the vector of array measurements at time t, E[·] denotes expectation, and (·)^H denotes Hermitian transpose. Eigendecomposition of R separates signal and noise subspaces, enabling ML algorithms to learn beamforming weights that maximise signal-to-interference-plus-noise ratio (SINR) [60].

Learned Representations: Modern deep learning approaches often bypass manual feature engineering, learning optimal representations directly from raw data [52]. Convolutional neural networks automatically discover filter banks that extract relevant features, while attention mechanisms identify important temporal patterns [61]. These learned representations often outperform handcrafted features and discover subtle patterns humans overlook, such as micro-Doppler signatures from platform vibrations that aid in source classification [62].
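The cepstrum of Eq. (3) can likewise be sketched directly, using a naive DFT/IDFT pair for self-containment (an FFT library would replace this in practice; the short test signal is arbitrary):

```python
import cmath
import math

def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def cepstrum(x):
    """Real cepstrum, Eq. (3): c(n) = IFFT{ log|X(k)| }."""
    N = len(x)
    X = dft(x, sign=-1)
    log_mag = [math.log(abs(Xk) + 1e-12) for Xk in X]  # guard against log(0)
    return [c / N for c in dft(log_mag, sign=+1)]       # 1/N normalisation

x = [1.0, 0.5, 0.25, 0.0, -0.25, -0.5, -0.25, 0.0]
c = cepstrum(x)
# real input -> real cepstrum (imaginary parts are numerical noise)
print([round(ci.real, 3) for ci in c])
```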
3) The Curse of Dimensionality in Underwater Data: Underwater communication systems generate high-dimensional data that challenges ML algorithms [3]. A modest 10 kHz sampling rate produces 600,000 samples per minute; thus, even a small 10-node network generates gigabytes daily. This dimensionality explosion, known as the "curse of dimensionality", causes several problems that are particularly acute in underwater environments:

• Sample Complexity: The number of training examples required grows exponentially with dimensionality, but underwater data collection is expensive and time-consuming [12].
• Computational Burden: Processing high-dimensional data on resource-constrained underwater nodes with limited power budgets becomes intractable [27].
• Overfitting Risk: Models can memorise noise patterns in high-dimensional data rather than learning generalisable features [53].

Successful ML deployment requires aggressive dimensionality reduction tailored to underwater characteristics [63]. Principal Component Analysis (PCA) identifies dominant variations in ocean measurements, typically finding that 95% of variance is concentrated in 10–20 components from thousands of original dimensions. For acoustic signals, mel-frequency cepstral coefficients (MFCCs) reduce wideband spectrograms to 13–39 coefficients while preserving perceptually important information [56]. Learned embeddings from autoencoders can compress high-dimensional sensor readings to compact representations that preserve information relevant to specific tasks [64].

B. Supervised Learning Techniques for IoUT

Supervised learning forms the backbone of many IoUT applications where we have labelled training data (examples of inputs paired with desired outputs) [65]. These techniques excel at pattern recognition tasks, such as identifying modulation schemes, predicting channel conditions, classifying marine vessels, or estimating sensor locations [66].
1) Classification Algorithms for Underwater Signals: Classification assigns discrete labels to inputs, answering questions like: "Is this acoustic signature from a cargo ship or fishing vessel?" "Which modulation scheme is being received?" "Is this sensor measurement normal or anomalous?" [67]. The underwater environment presents unique classification challenges: limited training data due to deployment costs, class imbalance (rare events like oil leaks versus normal operations), and distribution shift (training in calm conditions but deploying during storms) [68].

k-Nearest Neighbours (k-NN) for Acoustic Pattern Matching: The k-NN algorithm classifies inputs based on the majority class among the k nearest training examples, making it particularly suitable for underwater acoustic classification where physical proximity often correlates with similar propagation conditions [69]. For vessel classification from acoustic signatures, k-NN achieves surprising effectiveness by matching spectral patterns.

Consider a hydrophone array monitoring harbour traffic. Each vessel generates a unique acoustic signature combining engine noise, propeller cavitation, and hull vibrations [70]. We represent each signature as a feature vector x_i containing spectral peak frequencies, harmonic ratios, and broadband energy levels. Given an unknown signature x_q, k-NN finds the k most similar training examples based on Euclidean distance:

d(x_q, x_i) = \sqrt{\sum_{j=1}^{n} (x_{q,j} - x_{i,j})^2}.   (5)

The algorithm assigns the majority class among these neighbours. For k = 5 and a training set of 1000 labelled vessel passages, experimental deployments achieve 89–94% classification accuracy, distinguishing between container ships, tankers, fishing vessels, and recreational boats [71].
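A minimal sketch of the k-NN classifier built on Eq. (5). The three-element feature vectors and labels below are invented for illustration; real signatures would carry many more spectral features:

```python
import math
from collections import Counter

def euclidean(a, b):
    """Eq. (5): straight-line distance between feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(query, examples, k=3):
    """Majority vote among the k nearest labelled training examples."""
    nearest = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# (spectral peak kHz, harmonic ratio, broadband level dB) -> vessel class
train = [
    ((0.10, 0.80, 120.0), "cargo"),
    ((0.12, 0.75, 118.0), "cargo"),
    ((1.50, 0.30, 95.0), "fishing"),
    ((1.40, 0.35, 97.0), "fishing"),
    ((0.11, 0.82, 119.0), "cargo"),
]
print(knn_classify((0.10, 0.79, 119.5), train, k=3))
```

The sort makes the linear-scan cost explicit: every query touches every stored example, which is exactly the O(n) burden the approximate-search methods below are designed to avoid.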
The success of k-NN in underwater applications stems from its non-parametric nature—it makes no assumptions about data distribution, adapting naturally to the complex, multimodal distributions of ocean measurements [69]. However, k-NN requires careful consideration in resource-constrained underwater nodes. Storing thousands of training examples demands significant memory, and distance computations for each classification scale linearly with training set size. These limitations motivate approximate nearest neighbour methods using locality-sensitive hashing or tree-based indexing that reduce search complexity from O(n) to O(log n) [65].

Support Vector Machines for Robust Classification: Support Vector Machines (SVMs) construct optimal decision boundaries that maximise separation between classes, providing robust classification even with limited training data—a critical advantage in expensive underwater deployments [72]. The SVM solves a constrained optimisation problem to find a hyperplane that maximises the margin between classes whilst allowing controlled misclassification through slack variables (full optimisation formulation in Appendix A).

For underwater modulation classification, SVMs excel at distinguishing between phase-shift keying (PSK), frequency-shift keying (FSK), and orthogonal frequency-division multiplexing (OFDM) schemes even at low signal-to-noise ratios (SNR) [42]. The kernel trick enables nonlinear classification without explicit feature mapping, with the Gaussian radial basis function (RBF) kernel implicitly mapping acoustic features to infinite-dimensional space where linear separation becomes possible. Experimental results show SVMs achieving 92–97% modulation classification accuracy at 0 dB SNR, compared to 75–80% for traditional likelihood-based methods [42]. The margin-maximisation principle provides inherent robustness to the noise and interference plaguing underwater channels [72].
Only support vectors—training examples near decision boundaries—determine the classifier, automatically ignoring outliers from occasional interference spikes. This robustness extends to temporal variations; SVMs trained on summer acoustic conditions maintain 85–90% accuracy during winter deployments despite significant sound speed profile changes.

Decision Trees and Random Forests for Interpretable Decisions: Decision trees recursively partition feature space using threshold tests, creating interpretable models that explain their reasoning—crucial for safety-critical underwater applications where operators must understand and trust automated decisions [73]. The tree construction selects splits that maximise information gain:

IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} H(S_v),   (6)

where H(S) is the entropy of set S and S_v is the subset with attribute A having value v.

For underwater network routing decisions, decision trees learn readable rules: "IF depth < 100 m AND time = night AND season = summer THEN use-surface-reflection-path ELSE use-direct-path" [44]. This interpretability enables network operators to verify that learned strategies align with oceanographic principles and safety requirements.

Random Forests extend decision trees by training multiple trees on bootstrap samples and feature subsets, then combining predictions through voting [74]. This ensemble approach dramatically improves accuracy and robustness. For underwater sensor fault detection, Random Forests achieve 95–98% detection accuracy by learning complex patterns: gradual sensitivity drift in salinity sensors, sudden offsets from biofouling, or intermittent failures from connector corrosion [75]. The ensemble naturally handles the heterogeneous features in underwater sensing—mixing continuous measurements (temperature, pressure) with categorical variables (location, season) and temporal patterns (tide phase, diurnal cycles).
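Eq. (6) can be computed in a few lines. A sketch on a made-up routing log, where the single binary attribute is a depth test (purely illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum_c p_c * log2(p_c) over class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute):
    """Eq. (6): entropy reduction from splitting the rows on `attribute`."""
    labels = [label for _, label in rows]
    total = entropy(labels)
    by_value = {}
    for features, label in rows:
        by_value.setdefault(features[attribute], []).append(label)
    weighted = sum(len(sub) / len(rows) * entropy(sub)
                   for sub in by_value.values())
    return total - weighted

# ({attribute: value}, chosen path) pairs -- toy routing log
rows = [
    ({"shallow": True}, "surface-reflection"),
    ({"shallow": True}, "surface-reflection"),
    ({"shallow": False}, "direct"),
    ({"shallow": False}, "direct"),
]
print(information_gain(rows, "shallow"))  # perfect split: gain equals H(S) = 1.0
```

A tree builder simply evaluates this gain for every candidate attribute and splits on the largest; Random Forests repeat the process over bootstrap samples and random feature subsets.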
2) Regression Methods for Continuous Predictions: Regression predicts continuous values critical for underwater operations: future channel capacity, optimal transmission power, time-to-failure for sensors, or AUV position estimates [33]. The underwater environment's continuous nature—gradually varying temperature gradients, slowly changing currents, progressively accumulating biofouling—makes regression essential for system optimisation.

Linear Regression for Channel Prediction: Despite its simplicity, linear regression provides effective baseline predictions for many underwater parameters that vary smoothly with environmental factors [76]. The model predicts output y as a weighted combination of inputs:

y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \varepsilon.   (7)

For predicting acoustic propagation loss, linear regression on temperature, salinity, and depth achieves root mean square errors (RMSE) of 3–5 dB for ranges up to 10 km—sufficient accuracy for power control decisions [76]. The closed-form solution via the normal equations:

\boldsymbol{\beta} = (X^T X)^{-1} X^T \mathbf{y}   (8)

enables rapid model updates as new measurements arrive, critical for adapting to changing ocean conditions.

Ridge regression adds L2 regularisation \lambda ||\boldsymbol{\beta}||^2 to prevent overfitting when training data is limited—common in expensive underwater deployments [65]. For predicting sensor drift from environmental factors, ridge regression reduces prediction error by 20–30% compared to ordinary least squares by preventing the model from learning spurious correlations in small datasets.

Gaussian Process Regression for Uncertainty Quantification: Gaussian Processes (GPs) provide not just predictions but uncertainty estimates—crucial for risk-aware decision-making in underwater operations [56].
A GP models the unknown function as a distribution over functions, specified by mean m(x) and covariance k(x, x') functions:

f(x) \sim \mathcal{GP}(m(x), k(x, x')).   (9)

For underwater field estimation, GPs excel at spatial interpolation with quantified uncertainty [3]. Consider mapping temperature fields from sparse AUV measurements. The GP provides a posterior distribution over function values at unmeasured locations, yielding both a predictive mean and variance (detailed derivation in Appendix A). The predictive variance quantifies interpolation uncertainty, guiding adaptive sampling strategies [8]. AUVs use this uncertainty to identify regions requiring additional measurements, improving mapping efficiency by 40–60% compared to predetermined survey patterns [38].

3) Neural Networks for Complex Pattern Recognition: Artificial neural networks, inspired by biological neurons, excel at learning complex nonlinear patterns in high-dimensional underwater data [57]. The fundamental building block—the perceptron—combines inputs through weighted connections, applies a nonlinear activation function, and produces an output:

y = \sigma\left( \sum_{i=1}^{n} w_i x_i + b \right),   (10)

where σ is an activation function like the rectified linear unit (ReLU): σ(z) = max(0, z).

Multilayer perceptrons (MLPs) stack multiple layers of neurons, enabling representation of arbitrary nonlinear functions [53]. For underwater acoustic equalisation, a three-layer MLP with architecture 100-50-20-16 (input-hidden1-hidden2-output) learns to compensate for multipath distortion [34]:

• The input layer receives 100 samples of received signal (covering several symbol periods).
• The first hidden layer with 50 neurons learns basic feature detectors—identifying symbol transitions, estimating carrier phase, detecting multipath arrivals.
• The second hidden layer with 20 neurons combines these features into higher-level patterns—recognising intersymbol interference patterns, identifying dominant propagation paths.
• The output layer produces 16 soft decisions for 16-QAM constellation points.

Training via backpropagation adjusts weights to minimise mean squared error between network outputs and transmitted symbols:

\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} = \eta \delta_j x_i,   (11)

where η is the learning rate and δ_j is the error gradient at neuron j. Experimental deployments show neural network equalisers reducing bit error rates by factors of 10^2 to 10^4 compared to linear equalisers in shallow water channels with delay spreads exceeding 10 ms [34]. The network implicitly learns the channel inverse without explicit channel estimation, adapting to time-varying conditions through online training.

C. Unsupervised Learning for Discovering Underwater Patterns

Unsupervised learning extracts patterns from unlabelled data—abundant in underwater environments where manual labelling is expensive or impossible [65]. These techniques reveal hidden structure: identifying distinct water masses, discovering communication patterns, detecting anomalous events, or compressing high-dimensional measurements [3].

1) Clustering Algorithms for Network Organisation: Clustering groups similar data points, naturally organising underwater networks for efficient operation [77]. The challenge lies in defining "similarity" in dynamic ocean environments where Euclidean distance poorly captures communication capability—two nodes 100 metres apart might be unable to communicate due to acoustic shadows while nodes kilometres apart enjoy reliable links via surface reflections [78].

k-Means Clustering for Energy-Efficient Topology: The k-means algorithm partitions n nodes into k clusters by minimising the within-cluster sum of squares [79]:

\min_{C} \sum_{i=1}^{k} \sum_{x \in C_i} ||x - \mu_i||^2,   (12)

where μ_i is the centroid of cluster C_i. For underwater sensor networks, k-means creates energy-balanced clusters for hierarchical communication [80], [81].
Instead of using only geographic positions, we define feature vectors incorporating:

• Geographic coordinates (latitude, longitude, depth)
• Residual energy levels
• Communication success rates with neighbours
• Historical traffic generation rates

The algorithm iteratively: (1) assigns each node to the nearest centroid, (2) recomputes centroids as cluster means, and (3) repeats until convergence [82]. This produces clusters where members share similar communication characteristics and energy levels. Cluster heads, selected as the nodes nearest to centroids, aggregate data from members and forward it to surface gateways. Field deployments demonstrate 40–60% energy savings compared to direct transmission, extending network lifetime from months to years [78], [83].

The choice of k critically impacts performance. Too few clusters force long-range intra-cluster communication; too many create overhead from inter-cluster coordination [84]. The elbow method selects k by identifying where increasing clusters yields diminishing returns in error reduction. For typical coastal deployments with 50–200 nodes, optimal k ranges from 5–15 clusters.

Hierarchical Clustering for Multi-Scale Organisation: Hierarchical clustering builds a tree of nested clusters, enabling multi-scale network organisation adaptive to communication requirements [77]. Agglomerative clustering starts with individual nodes and recursively merges the closest clusters:

d(C_i, C_j) = \min_{x \in C_i,\, y \in C_j} d(x, y).   (13)

For underwater networks, we define distance metrics capturing communication cost [85]:

d_{\mathrm{comm}}(x, y) = \frac{P_{tx}(||x - y||)}{P_{\mathrm{success}}(x, y)},   (14)

where P_tx is the required transmission power and P_success is the link success probability. The resulting dendrogram reveals natural network hierarchies. Cutting at different heights produces organisations optimised for different objectives: few large clusters for energy efficiency, many small clusters for low latency, or adaptive cuts based on traffic patterns.
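The assign-recompute loop of k-means described above, sketched in one dimension (the node depths and initial centroids are invented; real feature vectors would also include energy and link statistics):

```python
def kmeans_1d(points, centroids, iters=10):
    """Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:  # step (1): assignment to the nearest centroid
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else m  # step (2): centroid update
                     for c, m in zip(clusters, centroids)]
    return centroids, clusters

# two well-separated groups of node depths (metres)
depths = [10, 12, 14, 90, 95, 100]
centroids, clusters = kmeans_1d(depths, centroids=[0.0, 50.0])
print(centroids)  # converges to [12.0, 95.0]
```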
Density-Based Clustering for Irregular Deployments: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) identifies clusters of arbitrary shape—matching the irregular node distributions in ocean deployments where currents and obstacles create complex geometries [86]. The algorithm grows clusters from core points having a minimum number of neighbours within radius ε: a point x is a core point if |N_ε(x)| ≥ minPts, where N_ε(x) = {y : d(x, y) ≤ ε}.

For underwater networks, DBSCAN naturally identifies connected components while isolating outliers—nodes that have drifted beyond communication range [29]. Setting ε to the maximum reliable communication range and minPts to 2–3 produces clusters matching actual network connectivity. Unlike k-means, DBSCAN adapts to node failures and mobility without reconfiguration, maintaining valid clusters as the network evolves.

2) Dimensionality Reduction for Data Compression: Underwater sensors generate high-dimensional data that exhausts limited bandwidth and storage [64]. Dimensionality reduction compresses measurements while preserving essential information, enabling efficient communication and analysis.

Principal Component Analysis for Sensor Data: PCA identifies orthogonal directions of maximum variance, projecting high-dimensional data onto principal components that capture most information [63]. For a centred data matrix X, PCA computes eigenvectors of the covariance matrix:

C = \frac{1}{n-1} X^T X.   (15)

The projection onto k principal components:

Z = X W_k,   (16)

where W_k contains the k eigenvectors with the largest eigenvalues.

For oceanographic measurements, PCA reveals remarkable compression potential [3]. Temperature-salinity profiles from CTD casts, nominally 1000-dimensional (measurements at 1000 depths), compress to 10–20 components while preserving 98% of variance. The principal components correspond to physically meaningful patterns: surface mixed layer depth, thermocline gradient, deep water masses. This compression enables efficient acoustic transmission of ocean profiles. Instead of transmitting 1000 floating-point values (32,000 bits), nodes send 20 coefficients (640 bits), achieving 50:1 compression with negligible reconstruction error [64]. The receiving station reconstructs profiles via:

X_{\mathrm{reconstructed}} = Z W_k^T + \mu.   (17)

Autoencoders for Nonlinear Compression: Autoencoders use neural networks to learn nonlinear compression schemes surpassing linear methods like PCA [53]. The encoder network f_θ maps inputs to compressed representations:

z = f_\theta(x).   (18)

The decoder network g_φ reconstructs inputs:

\hat{x} = g_\phi(z).   (19)

Training minimises reconstruction error:

\min_{\theta, \phi} \sum_{i=1}^{n} ||x_i - g_\phi(f_\theta(x_i))||^2.   (20)

For underwater acoustic signals, convolutional autoencoders achieve 100:1 compression while maintaining intelligibility [64]. The encoder learns to extract essential spectral features while discarding water noise and redundancy. A typical architecture for compressing acoustic spectrograms is shown in Table II.

TABLE II
AUTOENCODER ARCHITECTURE FOR UNDERWATER ACOUSTIC COMPRESSION

Stage   | Layer Type          | Output/Kernel
Encoder | Conv → ReLU         | 64 filters, 5×5
        | MaxPool             | 2×2
        | Conv → ReLU         | 32 filters, 3×3
        | MaxPool             | 2×2
        | Dense               | 16 units
Decoder | Dense               | 32 units
        | Reshape             | –
        | ConvTrans → ReLU    | 32 filters, 3×3
        | UpSample            | 2×2
        | ConvTrans → Sigmoid | 64 filters, 5×5

This compresses 128×128 spectrograms (16,384 values) to 16-dimensional latent representations—roughly 1000:1 compression—while preserving sufficient detail for marine mammal vocalisation classification or vessel identification [30].

3) Anomaly Detection for Network Security and Monitoring: Anomaly detection identifies unusual patterns that may indicate equipment failures, security threats, or interesting environmental events [75]. In underwater networks, anomalies range from sensor drift and biofouling to malicious attacks and rare marine events [13].
One-class SVM learns a boundary around normal data, flagging anything outside as anomalous [72]. For underwater sensor networks, normal operational patterns include expected temperature ranges, typical acoustic noise levels, and regular communication schedules. Deviations—sudden temperature spikes, unusual acoustic signatures, or irregular transmission patterns—trigger alerts for further investigation [39].

Isolation Forests provide an alternative approach, identifying anomalies as points requiring fewer random splits to isolate [75]. This method proves particularly effective for detecting outliers in high-dimensional oceanographic data where traditional distance-based methods struggle.

D. Reinforcement Learning for Adaptive Underwater Systems

Reinforcement learning enables underwater systems to learn optimal behaviours through environmental interaction—essential when accurate models are unavailable or environments change unpredictably [32], [54]. Unlike supervised learning, which requires labelled examples, RL agents discover successful strategies through trial and error, receiving rewards for desirable outcomes [87], [88].

1) Fundamental RL Concepts in Underwater Contexts: The RL framework models an agent interacting with an environment through states, actions, and rewards [89]. At each time step t:

1) The agent observes state s_t (channel conditions, node positions, energy levels).
2) The agent selects action a_t (transmission power, routing decision, sleep schedule).
3) The environment transitions to state s_{t+1} according to dynamics P(s_{t+1} | s_t, a_t).
4) The agent receives reward r_t (successful transmission, energy saved, latency achieved).

The agent's goal is learning a policy π(a|s) maximising expected cumulative reward [54]. The cumulative discounted return G_t from time step t is:

G_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k},   (21)

where r_{t+k} is the reward received at time step t + k,^1 and the discount factor γ ∈ [0, 1] balances immediate versus future rewards.
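For a finite episode, Eq. (21) truncates to a finite sum, which is one line of code. The reward sequence below is arbitrary:

```python
def discounted_return(rewards, gamma):
    """Eq. (21) for a finite episode: G_0 = sum_k gamma^k * r_k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

# three successful transmissions followed by one collision penalty
rewards = [10.0, 10.0, 10.0, -5.0]
print(discounted_return(rewards, gamma=0.9))
```

Lowering gamma shrinks the weight of the late collision penalty, which is exactly the immediate-versus-future trade-off the discount factor controls.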
For underwater applications, careful reward design is crucial [90]. Consider an AUV learning efficient survey patterns. A naive reward based solely on area covered encourages rapid movement that misses important features. A better reward combines multiple objectives:

r_t = \lambda_1 \cdot \mathrm{area\ covered}_t - \lambda_2 \cdot \mathrm{energy\ used}_t + \lambda_3 \cdot \mathrm{features\ detected}_t - \lambda_4 \cdot \mathrm{overlap\ penalty}_t,   (22)

where λ_1, λ_2, λ_3, λ_4 are weight coefficients that balance the trade-offs between coverage, energy efficiency, feature detection, and redundancy avoidance. This encourages thorough coverage while minimising energy use and avoiding redundant measurements [38].

^1 We use the convention where r_t is the reward received when transitioning into state s_t. Alternative formulations use r_{t+1} as the reward received after taking action a_t in state s_t; both are valid and equivalent under proper index alignment.

2) Value-Based Methods for Underwater Decision Making: Value-based RL methods learn the expected return from each state or state-action pair, deriving optimal policies from these value estimates [89].

Q-Learning for Acoustic MAC Protocols: Q-learning learns action values Q(s, a) representing the expected return from taking action a in state s [91]. The Q-value update rule is:

Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right],   (23)

where α is the learning rate.

For underwater MAC protocols, Q-learning adapts transmission strategies to time-varying conditions [43]. The state space includes: queue length at the node, estimated channel busy/idle status, time since last successful transmission, and neighbour activity patterns. The action space comprises: transmit immediately, wait for 1, 2, 4, 8, or 16 time slots, adjust transmission power levels, and select frequency channel (for multi-frequency systems).
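The update rule of Eq. (23) on a deliberately tiny MAC-style problem: a single recurring state with two actions and fixed rewards (hypothetical numbers, not the protocol of [43]):

```python
def q_update(q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Eq. (23): one temporal-difference update of Q(s, a)."""
    best_next = max(q[s_next].values())
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])

# one recurring state "idle"; transmitting succeeds (+10), waiting costs (-1)
q = {"idle": {"transmit": 0.0, "wait": 0.0}}
for _ in range(200):
    for action, reward in (("transmit", 10.0), ("wait", -1.0)):
        q_update(q, "idle", action, reward, "idle")

greedy = max(q["idle"], key=q["idle"].get)
print(greedy)  # the learned greedy policy prefers transmitting here
```

With γ = 0.9 the values converge towards Q(transmit) ≈ 100 and Q(wait) ≈ 89, the fixed points of the update; a deployed agent would of course draw rewards from channel feedback rather than constants.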
The reward function encourages successful transmission whilst minimising energy:

r = +10 (successful transmission)
    −5 (collision detected)
    −1 (per time slot waited)
    −P_tx / P_max (energy penalty),   (24)

where P_tx is the transmission power used and P_max is the maximum available transmission power.

Through exploration, nodes learn optimal strategies: aggressive transmission during quiet periods, conservative backoff during high traffic, and power adjustment based on channel quality [35]. Experimental deployments show Q-learning MAC protocols achieving 150–200% throughput improvement over fixed CSMA approaches in dynamic underwater networks [44].

Deep Q-Networks for High-Dimensional Spaces: Traditional Q-learning maintains a table of Q-values, becoming intractable for the large state spaces common in underwater applications [92]. Deep Q-Networks (DQN) approximate Q-values using neural networks:

Q(s, a; \theta) \approx Q^{*}(s, a),   (25)

where Q(s, a; θ) is the neural network approximation with parameters θ and Q*(s, a) is the optimal Q-value function. The network parameters θ are updated to minimise the temporal difference error:

L(\theta) = \mathrm{E}\left[ \left( r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta) \right)^2 \right],   (26)

where s' is the next state, a' is the next action, and θ^- are target network parameters updated periodically for stability [87].

For AUV navigation in complex environments, DQN handles continuous observations from sonar, cameras, and environmental sensors [8], [55]. The network architecture processes multimodal inputs as shown in Table III.

TABLE III
DQN ARCHITECTURE FOR AUV NAVIGATION

Stream | Layer / Action   | Parameters
Sonar  | Conv2D × 3       | 32, 64, 64 filters
       | Kernels          | 8×8, 4×4, 3×3
       | Strides          | 4, 2, 1
Sensor | Dense × 2        | 128, 64 units
Joint  | Concatenate      | Fusion of streams
       | Dense            | 512 units
Output | Discrete |A| = 8 | 8 Q-values
       | Movement         | Fwd, Back, Left, Right
       | Vertical         | Up, Down
       | Control          | Adjust Speed, Scan
DQN enables AUVs to learn complex behaviours: following interesting gradients while avoiding obstacles, surfacing periodically for GPS fixes while minimising energy, or coordinating with other AUVs for distributed sensing [9]. The experience replay mechanism—storing and randomly sampling past experiences—breaks correlation in sequential data, improving learning stability in continuous underwater operations [93].

3) Policy Gradient Methods for Continuous Control: Many underwater control problems involve continuous actions: thrust levels, rudder angles, or transmission powers [94]. Policy gradient methods directly optimise parameterised policies without requiring action discretisation.

REINFORCE for Acoustic Power Control: The REINFORCE algorithm optimises policy parameters θ by gradient ascent on the expected reward [94]:

\nabla_\theta J(\theta) = \mathrm{E}_{\pi_\theta}\left[ \nabla_\theta \log \pi_\theta(a|s)\, G_t \right],   (27)

where J(θ) is the expected cumulative reward under policy π_θ. For continuous power control, we parameterise the policy as a Gaussian:

\pi_\theta(a|s) = \mathcal{N}(\mu_\theta(s), \sigma^2_\theta(s)),   (28)

where neural networks output the mean μ_θ(s) and variance σ²_θ(s). The agent learns to adjust transmission power based on channel conditions, message priority, and energy reserves [95]. Training episodes simulate various scenarios: calm conditions rewarding energy conservation, storms requiring high power for reliability, or critical messages justifying energy expenditure.

Proximal Policy Optimisation for Stable Learning: PPO improves training stability by limiting policy updates [96]:

L^{\mathrm{CLIP}}(\theta) = \mathrm{E}\left[ \min\left( r_t(\theta) A_t,\ \mathrm{clip}(r_t(\theta), 1 - \varepsilon, 1 + \varepsilon) A_t \right) \right],   (29)

where r_t(θ) = π_θ(a_t|s_t) / π_{θ_{old}}(a_t|s_t) is the probability ratio, A_t is the advantage estimate, and ε is the clipping parameter (typically 0.1–0.2) that constrains policy updates.
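The clipping in Eq. (29), evaluated for a single sample with invented ratio and advantage values:

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Per-sample term of Eq. (29): min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# a large policy shift (ratio 1.5) on a positive-advantage action is capped
print(ppo_clip_objective(1.5, advantage=2.0))  # capped at 1.2 * 2.0 = 2.4
# a modest shift (ratio 1.1) passes through unclipped
print(ppo_clip_objective(1.1, advantage=2.0))  # 1.1 * 2.0 = 2.2
```

The outer min removes any incentive to push the ratio beyond the clip range, which is what keeps successive policy updates small and training stable.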
For multi-AUV coordination, PPO enables learning complex collaborative behaviours while maintaining training stability despite partial observability and communication delays [9], [97]. Each AUV’s policy network processes local observa- tions and limited neighbour information, learning decentralised coordination strategies that emerge into effective global be- haviours: forming sensing arrays for distributed beamforming, maintaining communication relay chains, or systematically searching areas while avoiding redundancy [38]. Actor-Critic Methods: Actor-critic methods combine the benefits of value-based and policy gradient approaches [98]. The actor learns a policy while the critic estimates value functions, providing lower-variance gradient estimates. Deep Deterministic Policy Gradient (DDPG) extends this to continu- ous action spaces, enabling fine-grained control of underwater vehicle dynamics [95]. 4) Model-Based Reinforcement Learning: While model- free RL learns purely from interaction, model-based RL additionally learns environmental dynamics, enabling more sample-efficient learning—critical when underwater experi- ments are expensive [99], [100]. The Dyna architecture combines model-free learning with simulated experience [101]: 1) Execute action, observe transition (s,a,r,s ′ ) 2) Update Q-values from real experience 3) Learn model: ˆ P (s ′ |s,a), ˆ R(s,a) 4) Generate simulated experiences from model 5) Update Q-values from simulated experience For underwater channel adaptation, model-based RL learns the relationship between environmental factors and channel quality, then uses this model to rapidly adapt when conditions change [102]. This proves particularly valuable for rare but important events—storm conditions that occur infrequently but require immediate protocol adaptation [33]. E. 
E. Deep Learning Architectures for Underwater Applications

Deep learning's hierarchical feature learning excels at processing complex underwater signals where traditional feature engineering fails [52]. These architectures automatically discover relevant patterns across multiple scales, from microsecond carrier variations to seasonal oceanographic cycles, transforming raw sensor data into actionable intelligence without explicit programming of detection rules [57].

1) Convolutional Neural Networks for Signal and Image Processing: Convolutional Neural Networks (CNNs) revolutionise underwater signal processing by automatically learning hierarchical features that capture both local patterns and global structure [53]. Unlike traditional signal processing, which requires careful filter design and parameter tuning, CNNs discover optimal feature extractors directly from data, adapting to the unique characteristics of underwater acoustic and optical signals [103].

Acoustic Signal Processing with CNNs: Underwater acoustic signals present unique challenges: time-varying multipath creating complex interference patterns, Doppler shifts from platform motion, and frequency-dependent absorption distorting spectral content [22]. CNNs excel at learning robust features despite these distortions. Consider a CNN architecture for acoustic modulation classification operating on spectrograms.

The input layer receives time-frequency representations sized 256×128 (256 time bins × 128 frequency bins), covering 100 ms of signal at a 25.6 kHz sampling rate [30]. This captures several symbol periods while providing sufficient frequency resolution to distinguish modulation features. The first convolutional layer applies 64 filters of size 7×7 with stride 1:

$$h_1^{(k)} = \sigma\left( \sum_{c=1}^{C_{\mathrm{in}}} W_1^{(k,c)} * x^{(c)} + b_1^{(k)} \right), \quad (30)$$

where $h_1^{(k)}$ is the output feature map from filter $k$, $\sigma(\cdot)$ is the activation function (ReLU), $C_{\mathrm{in}}$ is the number of input channels, $W_1^{(k,c)}$ are the learnable filter weights, $x^{(c)}$ is the input from channel $c$, $*$ denotes convolution, and $b_1^{(k)}$ is the bias term. These filters learn to detect basic time-frequency patterns: carrier frequencies, symbol transitions, and multipath delays. Underwater deployments reveal fascinating learned features: some filters become matched filters for specific multipath delays, others detect Doppler chirps from moving platforms, and several identify biologically generated interference patterns [14].

Batch normalisation after each convolutional layer addresses the covariate shift problem, which is particularly severe in underwater environments where training and deployment conditions differ significantly [53]:

$$\hat{x} = \frac{x - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \qquad y = \gamma_{\mathrm{BN}}\, \hat{x} + \beta_{\mathrm{BN}}, \quad (31)$$

where $x$ is the input, $\mu_B$ and $\sigma_B^2$ are the batch mean and variance, $\varepsilon$ is a small constant for numerical stability, and $\gamma_{\mathrm{BN}}$ and $\beta_{\mathrm{BN}}$ are learnable scale and shift parameters. This normalisation enables networks trained in controlled tanks to generalise to open-ocean conditions with different noise characteristics and propagation physics.

Max pooling layers with 2×2 kernels reduce spatial dimensions whilst preserving dominant features:

$$h_{\mathrm{pool}} = \max_{(i,j) \in R} h(i,j), \quad (32)$$

where $h_{\mathrm{pool}}$ is the pooled output, $h(i,j)$ is the input feature map at position $(i,j)$, and $R$ is the pooling region. For underwater signals, pooling provides invariance to small time-frequency shifts caused by synchronisation errors and Doppler variations, which is critical for robust operation with moving platforms [42].

Experimental deployments demonstrate remarkable performance: 96–98% modulation classification accuracy at −5 dB SNR, compared to 70–75% for traditional cyclostationary feature-based methods [30].
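The convolution, batch-normalisation, and pooling operations of Eqs. (30)–(32) can be sketched in plain numpy. Shapes and filter values below are toy assumptions, not the 256×128, 64-filter architecture described above; note that, as is conventional in deep learning, the "convolution" is implemented as cross-correlation.

```python
import numpy as np

def conv2d(x, w, b):
    """Valid 2D cross-correlation of one input channel with one filter,
    followed by ReLU (single-channel instance of Eq. 30)."""
    kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w) + b
    return np.maximum(out, 0.0)          # ReLU activation

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise by batch statistics, then scale and shift (Eq. 31)."""
    mu, var = x.mean(), x.var()
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def max_pool2x2(x):
    """2x2 max pooling (Eq. 32); crops odd trailing rows/columns."""
    H, W = x.shape
    x = x[:H - H % 2, :W - W % 2]
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).max(axis=(1, 3))
```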
More importantly, CNNs maintain performance across diverse conditions: trained on summer data but tested in winter, the CNN achieves 91% accuracy while traditional methods drop to 60%.

Underwater Image Enhancement and Analysis: Underwater imagery suffers from severe degradation: exponential light attenuation causing colour cast, backscatter creating haze-like effects, and refraction distorting geometry [68], [104].

TABLE IV
3D CNN ARCHITECTURE FOR SONAR CLASSIFICATION

Layer Type      | Filters | Kernel / Pool
Conv3D + BN     | 32      | 5×5×3
Pool3D          | --      | 2×2×1
Conv3D + BN     | 64      | 3×3×3
Pool3D          | --      | 2×2×2
Conv3D + BN     | 128     | 3×3×3
Pool3D          | --      | 2×2×2
GlobalAvgPool3D | --      | --
Dense           | 256     | --
Dense (Out)     | 2       | Softmax

CNNs learn to reverse these degradations through architectures specifically designed for underwater conditions. The U-Net architecture, originally developed for biomedical imaging, proves remarkably effective for underwater image enhancement [105]. The encoder pathway progressively reduces spatial dimensions whilst increasing feature channels. Skip connections concatenate encoder features with decoder features, preserving fine details lost during downsampling:

$$h_{\mathrm{dec}} = \mathrm{Conv}\left(\left[\mathrm{UpSample}(h_{\mathrm{lower}}),\ h_{\mathrm{enc}}\right]\right), \quad (33)$$

where $h_{\mathrm{dec}}$ is the decoder output, $h_{\mathrm{lower}}$ is the feature map from the lower decoder layer, $h_{\mathrm{enc}}$ is the corresponding encoder feature map, and $[\cdot,\cdot]$ denotes channel-wise concatenation.

Training uses a combination of losses capturing different aspects of image quality:

$$L_{\mathrm{total}} = \lambda_1 L_{\mathrm{MSE}} + \lambda_2 L_{\mathrm{SSIM}} + \lambda_3 L_{\mathrm{percep}} + \lambda_4 L_{\mathrm{colour}}, \quad (34)$$

where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are weighting coefficients, $L_{\mathrm{MSE}}$ ensures pixel accuracy, $L_{\mathrm{SSIM}}$ preserves structural similarity, $L_{\mathrm{percep}}$ maintains perceptual features, and $L_{\mathrm{colour}}$ corrects colour distribution [106]. This multi-objective training produces networks that simultaneously remove backscatter, correct colours, and enhance contrast.
Processing underwater pipeline inspection footage, CNN-enhanced images improve crack detection accuracy from 72% to 94%, enabling automated inspection systems that previously required human analysis [107].

2) 3D CNNs for Sonar Processing: Multi-beam and synthetic aperture sonar systems generate volumetric data requiring 3D convolutional processing [108]. 3D CNNs extend 2D convolutions to include temporal or depth dimensions:

$$h_{x,y,z}^{(k)} = \sigma\left( \sum_{i,j,l} W_{i,j,l}^{(k)} \cdot x_{x+i,\,y+j,\,z+l} + b^{(k)} \right), \quad (35)$$

where $h_{x,y,z}^{(k)}$ is the output at position $(x,y,z)$ for filter $k$, $W_{i,j,l}^{(k)}$ are the 3D filter weights, $x_{x+i,y+j,z+l}$ is the input volume, and $b^{(k)}$ is the bias.

For mine detection in side-scan sonar imagery, 3D CNNs process sequential ping data as a volume [109]. The architecture is shown in Table IV. The 3D convolutions learn features invariant to object orientation and burial depth, which is critical for mine detection where targets appear at arbitrary angles, partially buried in sediment [110]. Temporal convolutions across pings identify acoustic shadows and highlight discontinuities indicating manufactured objects.

Transfer learning from terrestrial computer vision models accelerates training despite limited underwater training data [111]. Networks pre-trained on ImageNet, fine-tuned with just 1,000 underwater images, achieve performance comparable to training from scratch with 50,000 images, reducing data collection costs by 98% [108].

3) Recurrent Networks for Temporal Modelling: Underwater environments exhibit strong temporal dependencies: tidal cycles, diurnal temperature variations, and seasonal stratification changes [36]. Recurrent Neural Networks (RNNs) capture these temporal dynamics, predicting future states and learning long-term patterns crucial for proactive network management [57].
LSTM Networks for Channel Prediction: Long Short-Term Memory (LSTM) networks overcome the vanishing gradient problem plaguing standard RNNs, maintaining information over extended periods, which is essential for capturing tidal cycles (12.4 hours) or seasonal variations [53]. The LSTM cell state $C_t$ and hidden state $h_t$ evolve through three gate mechanisms (forget gate $f_t$, input gate $i_t$, and output gate $o_t$) that control information flow, enabling the network to selectively retain or discard information over long sequences. The complete gate equations are provided in Appendix A.

For predicting acoustic channel impulse responses, the network processes environmental measurements (temperature profiles, wave heights, velocities) to forecast conditions [76], [112]. The first LSTM layer captures short-term variations (wave-induced fluctuations), the second models medium-term patterns (tidal cycles), and the third learns long-term dependencies. Deployed systems achieve remarkable accuracy: predicting propagation loss within 2 dB RMSE six hours ahead, enabling proactive power control that reduces transmission failures by 60% while saving 35% energy compared to reactive approaches [36].

Bidirectional RNNs for Sequence Labelling: Many underwater processing tasks benefit from both past and future context, such as identifying marine mammal calls or segmenting AUV missions into behavioural phases [14]. Bidirectional RNNs (BiRNNs) process sequences in both directions:

$$\vec{h}_t = \mathrm{RNN}_{\mathrm{fwd}}(x_t, \vec{h}_{t-1}), \quad (36)$$
$$\overleftarrow{h}_t = \mathrm{RNN}_{\mathrm{bwd}}(x_t, \overleftarrow{h}_{t+1}), \quad (37)$$
$$h_t = [\vec{h}_t;\, \overleftarrow{h}_t], \quad (38)$$

where $\vec{h}_t$ is the forward hidden state at time $t$, $\overleftarrow{h}_t$ is the backward hidden state, $x_t$ is the input at time $t$, and $[\cdot;\cdot]$ denotes concatenation.

For packet detection in continuous acoustic recordings, a BiLSTM-CRF (Conditional Random Field) architecture achieves precise boundary detection despite variable interference [70].
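The bidirectional recursion of Eqs. (36)–(38) can be sketched with simple tanh cells. Dimensions are toy values and the weights are random, untrained assumptions; the point is only the forward pass, backward pass, and per-step concatenation.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_h, T = 3, 4, 5
Wx_f, Wh_f = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))
Wx_b, Wh_b = rng.normal(size=(d_h, d_in)), rng.normal(size=(d_h, d_h))

def birnn(xs):
    h_f = np.zeros(d_h)
    fwd = []
    for x in xs:                              # Eq. (36): left-to-right pass
        h_f = np.tanh(Wx_f @ x + Wh_f @ h_f)
        fwd.append(h_f)
    h_b = np.zeros(d_h)
    bwd = [None] * len(xs)
    for t in reversed(range(len(xs))):        # Eq. (37): right-to-left pass
        h_b = np.tanh(Wx_b @ xs[t] + Wh_b @ h_b)
        bwd[t] = h_b
    # Eq. (38): concatenate forward and backward states at each step
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

states = birnn([rng.normal(size=d_in) for _ in range(T)])
```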
The CRF layer enforces sequential constraints, preventing invalid label transitions. This approach detects 98.5% of packets with boundary accuracy within 2 ms, compared to 89% detection and 10 ms accuracy for traditional energy-based detectors.

Attention Mechanisms for Selective Processing: Attention mechanisms enable networks to focus on relevant parts of input sequences, which is essential when processing long underwater recordings where important events occupy small fractions of the total duration [61]. The attention weight $\alpha_{t,s}$ for time step $t$ attending to position $s$ is computed from the alignment score:

$$e_{t,s} = v^{\top} \tanh\left(W_h h_s + W_{\bar{h}}\, \bar{h}_t + b_{\mathrm{attn}}\right), \quad (39)$$
$$\alpha_{t,s} = \frac{\exp(e_{t,s})}{\sum_{s'=1}^{S} \exp(e_{t,s'})}, \quad (40)$$

where $e_{t,s}$ is the alignment score, $v$, $W_h$, and $W_{\bar{h}}$ are learnable weights, $h_s$ is the encoder hidden state at position $s$, $\bar{h}_t$ is the decoder hidden state at time $t$, $b_{\mathrm{attn}}$ is the bias, and $S$ is the sequence length. The context vector is computed as:

$$c_t = \sum_{s=1}^{S} \alpha_{t,s}\, h_s. \quad (41)$$

For marine mammal vocalisation detection in year-long recordings, attention-augmented RNNs learn to ignore background noise while focusing on biologically relevant signals [14]. Multi-head attention extends this concept: different heads learn to attend to different acoustic features, with one head focusing on fundamental frequency progressions, another on harmonic structures, and a third on amplitude modulation patterns. This multi-faceted analysis improves blue whale call detection from 84% to 96% precision while maintaining 92% recall [62].

4) Generative Models for Data Augmentation and Simulation: The scarcity and cost of underwater training data motivate generative models that synthesise realistic samples, augmenting limited datasets and enabling robust model training [53].
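Returning to the additive attention of Eqs. (39)–(41), a minimal numpy sketch is given below; the dimensions and random weights are toy values, not a trained model.

```python
import numpy as np

rng = np.random.default_rng(3)
d, S = 4, 6
W_h, W_hbar = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v, b_attn = rng.normal(size=d), rng.normal(size=d)

def attend(enc_states, dec_state):
    # Eq. (39): alignment scores e_{t,s}
    e = np.array([v @ np.tanh(W_h @ h_s + W_hbar @ dec_state + b_attn)
                  for h_s in enc_states])
    # Eq. (40): softmax normalisation into attention weights
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()
    # Eq. (41): context vector as the weighted sum of encoder states
    context = sum(a * h_s for a, h_s in zip(alpha, enc_states))
    return alpha, context

enc = [rng.normal(size=d) for _ in range(S)]
alpha, context = attend(enc, rng.normal(size=d))
```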
Generative Adversarial Networks for Acoustic Synthesis: GANs generate realistic underwater acoustic signals through adversarial training between generator $G$ and discriminator $D$ networks [9]. The minimax objective is:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))], \quad (42)$$

where $V(D,G)$ is the value function, $x$ is a real sample from the data distribution $p_{\mathrm{data}}$, $z$ is the latent noise vector sampled from the prior distribution $p_z$, $D(x)$ is the discriminator's probability that $x$ is real, and $G(z)$ is the generator's output.

Conditional GANs (cGANs) enable controlled synthesis by conditioning on specific environmental labels $y$:

$$G(z, y; \theta_G) \rightarrow x_{\mathrm{fake}|y}, \quad (43)$$

where $\theta_G$ are the generator parameters and $x_{\mathrm{fake}|y}$ is the generated sample conditioned on label $y$. This enables precise generation, such as synthesising a QPSK signal with a specific multipath spread. The synthetic data significantly augments training sets; models trained on 90% synthetic and 10% real data achieve performance comparable to those trained on 100% real data, reducing collection costs by 90% [108].

Variational Autoencoders for Anomaly Detection: VAEs learn probabilistic latent representations, enabling anomaly detection through reconstruction probability [53]. The encoder maps inputs to latent distributions:

$$q_\phi(z|x) = \mathcal{N}\left(\mu_\phi(x), \sigma_\phi^2(x)\right), \quad (44)$$

where $q_\phi(z|x)$ is the approximate posterior with parameters $\phi$, $z$ is the latent variable, and $\mu_\phi(x)$ and $\sigma_\phi^2(x)$ are the encoder-predicted mean and variance. The decoder reconstructs from samples:

$$p_\theta(x|z) = \mathcal{N}\left(\mu_\theta(z), \sigma_\theta^2(z)\right), \quad (45)$$

where $p_\theta(x|z)$ is the likelihood with parameters $\theta$, and $\mu_\theta(z)$ and $\sigma_\theta^2(z)$ are the decoder-predicted mean and variance. Training maximises the evidence lower bound (ELBO):

$$\mathcal{L} = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{\mathrm{KL}}\left(q_\phi(z|x)\,\|\,p(z)\right), \quad (46)$$

where $D_{\mathrm{KL}}$ is the Kullback-Leibler divergence and $p(z)$ is the prior distribution (typically $\mathcal{N}(0, I)$).
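For Gaussian encoder and decoder distributions (Eqs. 44–45) with a standard-normal prior, both ELBO terms of Eq. (46) have simple expressions, sketched below. The decoder outputs passed in are assumed to come from a reparameterised latent sample (not shown); inputs are toy values.

```python
import numpy as np

def gaussian_kl(mu_z, logvar_z):
    """KL( N(mu, sigma^2) || N(0, I) ), the second term of Eq. (46), closed form."""
    return -0.5 * np.sum(1.0 + logvar_z - mu_z**2 - np.exp(logvar_z))

def gaussian_log_likelihood(x, mu_x, logvar_x):
    """log p(x|z) for a Gaussian decoder: first term of Eq. (46), one z sample."""
    return -0.5 * np.sum(logvar_x + (x - mu_x)**2 / np.exp(logvar_x)
                         + np.log(2.0 * np.pi))

def elbo(x, mu_z, logvar_z, mu_x, logvar_x):
    # mu_x, logvar_x are assumed to be decoder outputs for a sample
    # z = mu_z + sigma_z * eps (reparameterisation trick, omitted here)
    return gaussian_log_likelihood(x, mu_x, logvar_x) - gaussian_kl(mu_z, logvar_z)
```

A poor reconstruction lowers the likelihood term and hence the ELBO, which is exactly the reconstruction-probability signal used for anomaly scoring in the next paragraph.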
For underwater sensor anomaly detection, VAEs learn normal operating patterns [75]. Anomalies produce high reconstruction errors, indicating deviation from the learned distributions. A VAE monitoring oceanographic sensors detects anomalies with 94% accuracy: distinguishing sensor drift from environmental changes, identifying biofouling onset before complete failure, and detecting cyberattacks attempting to inject false data [39].

F. Emerging Paradigms

The intersection of ML with underwater communications continues to evolve, with emerging paradigms addressing fundamental limitations of current approaches while opening entirely new application domains [3]. These advances leverage recent breakthroughs in ML theory, computational hardware, and interdisciplinary insights to tackle previously intractable underwater challenges.

1) Federated Learning for Privacy-Preserving Collaboration: Federated learning enables multiple underwater platforms to collaboratively train models without sharing raw data, which is critical for military operations requiring operational security, commercial ventures protecting proprietary information, or international collaborations with data sovereignty constraints [40], [41], [113].

Distributed Training Architecture: In federated underwater networks, nodes maintain local models trained on private data $D_i$ [114]. Instead of transmitting raw sensor measurements, nodes share only model updates. The local update at node $i$ is:

$$\theta_i^{t+1} = \theta_i^t - \eta\, \nabla_\theta L_i(\theta_i^t; D_i), \quad (47)$$

where $\theta_i^t$ are the model parameters at node $i$ at iteration $t$, $\eta$ is the learning rate, and $L_i$ is the local loss function evaluated on the local data $D_i$. Nodes transmit compressed updates $\Delta_i^t = \mathrm{Compress}(\theta_i^{t+1} - \theta_{\mathrm{global}}^t)$ to an aggregation server [115]. Compression exploits update sparsity (1–5%) via top-$k$ sparsification, probabilistic quantisation, or structured updates with low-rank matrix constraints [116].
The server aggregates updates using Federated Averaging [117]:

$$\theta_{\mathrm{global}}^{t+1} = \theta_{\mathrm{global}}^{t} + \sum_{i=1}^{N} \frac{n_i}{n_{\mathrm{total}}}\, \Delta_i^t, \quad (48)$$

where $\theta_{\mathrm{global}}^t$ are the global model parameters, $N$ is the number of participating nodes, $n_i$ is the size of node $i$'s local dataset, $n_{\mathrm{total}} = \sum_{i=1}^{N} n_i$ is the total dataset size, and $\Delta_i^t$ is the compressed update from node $i$. For heterogeneous networks, asynchronous federated learning accommodates varying update rates:

$$\theta_{\mathrm{global}}^{t+1} = (1 - \alpha_t)\, \theta_{\mathrm{global}}^{t} + \alpha_t\, \theta_i^{t+1}, \quad (49)$$

where $\alpha_t = 1/(t+1)^{0.75}$ ensures convergence [118]. This architecture reduces bandwidth requirements by 95% while maintaining model accuracy within 1% of centralised training [40].

Applications in Collaborative Ocean Monitoring: Consider an international consortium monitoring ocean acidification across multiple economic zones [119]. Each nation operates sensor networks collecting pH, temperature, and carbonate measurements, sensitive data that can reveal fishing grounds and military operations. Federated learning enables collaborative model training without data sharing. Local models learn regional patterns: seasonal variations, river influences, upwelling dynamics. The global model captures ocean-wide trends: acidification rates, correlation with atmospheric CO$_2$, impact on calcifying organisms [3].

Differential privacy mechanisms add mathematical privacy guarantees [120]:

$$\theta_i^{t+1} = \theta_i^t - \eta\left(\nabla_\theta L_i + \mathcal{N}(0, \sigma^2 C^2 I)\right), \quad (50)$$

where $\mathcal{N}(0, \sigma^2 C^2 I)$ is Gaussian noise with zero mean and covariance $\sigma^2 C^2 I$, $\sigma$ is the noise scale, $C$ is the gradient clipping threshold, and $(\varepsilon, \delta)$ are the differential privacy parameters. This enables military and commercial networks to contribute to environmental monitoring without revealing operational patterns [40].

2) Physics-Informed Neural Networks: Physics-Informed Neural Networks (PINNs) incorporate domain knowledge as constraints, dramatically reducing data requirements while ensuring physically plausible predictions [47].
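Returning to the Federated Averaging rule of Eq. (48), a minimal numpy sketch follows; the node counts and update vectors are toy values chosen so the dataset-size weighting is visible.

```python
import numpy as np

def fedavg(theta_global, updates, n_local):
    """Eq. (48): combine node updates weighted by local dataset size.
    updates[i] = Delta_i from node i; n_local[i] = |D_i|."""
    n_total = sum(n_local)
    weighted = sum((n_i / n_total) * d_i for n_i, d_i in zip(n_local, updates))
    return theta_global + weighted

theta = np.zeros(3)
updates = [np.array([1.0, 0.0, 0.0]),    # node 1's compressed update
           np.array([0.0, 2.0, 0.0])]    # node 2's compressed update
theta = fedavg(theta, updates, n_local=[100, 300])
# Node 2 holds 3x the data, so its update dominates the average.
```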
For underwater systems governed by well-understood physics, PINNs achieve accuracy impossible with purely data-driven approaches.

Embedding Acoustic Physics: The underwater acoustic field satisfies the Helmholtz equation [21]:

$$\nabla^2 p + k^2(x,y,z)\, p = 0, \quad (51)$$

where $p$ is the acoustic pressure, $\nabla^2$ is the Laplacian operator, $k(x,y,z) = \omega / c(x,y,z)$ is the spatially varying wavenumber, $\omega$ is the angular frequency, and $c(x,y,z)$ is the spatially varying sound speed.

A PINN learns the pressure field $p(x,y,z;\theta)$ whilst satisfying this physics constraint. The loss function combines data fidelity and the physics residual [47]:

$$L = \underbrace{\sum_i \left| p(x_i;\theta) - p_i^{\mathrm{meas}} \right|^2}_{\text{Data loss}} + \lambda \underbrace{\sum_j \left| \nabla^2 p(x_j;\theta) + k^2 p(x_j;\theta) \right|^2}_{\text{Physics loss}}, \quad (52)$$

where $p(x_i;\theta)$ is the neural network prediction at measurement location $x_i$, $p_i^{\mathrm{meas}}$ is the measured pressure, $\lambda$ is a weighting parameter balancing the data and physics terms, and $x_j$ are collocation points where the physics constraints are enforced. The physics loss is evaluated at collocation points requiring no measurements; the network learns to satisfy the wave equation throughout the domain, not just at sensor locations. For source localisation, PINNs trained on sparse hydrophone measurements extrapolate the full acoustic field, achieving localisation accuracy of 50–100 m at 10 km range with only 5 receivers, compared to 500–1000 m for conventional beamforming [60].

Learning Ocean Dynamics: For AUV navigation, PINNs learn ocean circulation patterns constrained by the Navier-Stokes equations [47]:

$$\frac{\partial u}{\partial t} + (u \cdot \nabla) u = -\frac{1}{\rho} \nabla p + \nu \nabla^2 u + f, \quad (53)$$

where $u$ is the velocity field, $\rho$ is the fluid density, $p$ is pressure, $\nu$ is the kinematic viscosity, and $f$ represents body forces (e.g., Coriolis, buoyancy). The network predicts velocity fields $u(x,y,z,t)$ and pressure $p(x,y,z,t)$ from sparse AUV measurements. Physics constraints ensure mass conservation, momentum conservation, geostrophic balance at large scales, and boundary layer physics near surfaces [38].
Training on 50 AUV transects, PINNs reconstruct basin-scale circulation matching satellite altimetry while revealing submesoscale features invisible to satellites, enabling AUV path planning that exploits favourable currents and reduces energy consumption by 25–40%.

3) Meta-Learning for Rapid Adaptation: Meta-learning, or "learning to learn," enables models to quickly adapt to new underwater environments using minimal data, which is critical when deploying to unexplored regions where extensive training data is unavailable [111].

Model-Agnostic Meta-Learning (MAML) for Channel Adaptation: MAML learns initialisation parameters that enable rapid fine-tuning. Meta-training across multiple environments solves:

$$\theta^* = \arg\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} L_{\mathcal{T}_i}\left(\theta - \alpha \nabla_\theta L_{\mathcal{T}_i}(\theta)\right), \quad (54)$$

where $\theta^*$ are the optimal meta-learned parameters, $\mathcal{T}_i$ is a task sampled from the task distribution $p(\mathcal{T})$ representing different underwater environments, $L_{\mathcal{T}_i}$ is the loss on task $\mathcal{T}_i$, and $\alpha$ is the inner-loop learning rate [118].

For acoustic equalisation, tasks correspond to different deployment sites: shallow harbours, deep channels, coral reefs. The meta-learned initialisation enables adaptation to new sites with just 10–100 transmissions, compared to the 10,000+ required for training from scratch [33]. The deployment process is: (1) deploy with meta-learned parameters $\theta^*$; (2) collect a small calibration dataset (5 minutes of transmissions); (3) fine-tune: $\theta_{\mathrm{adapted}} = \theta^* - \alpha \nabla_\theta L_{\mathrm{new}}(\theta^*)$; (4) achieve site-specific performance. This reduces deployment time from days to hours, which is critical for rapid response operations or temporary deployments.

Few-Shot Learning for Species Classification: Prototypical networks enable classification of rare marine species from few examples [121].
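As a concrete toy illustration of the MAML objective of Eq. (54) and the four-step deployment recipe, the sketch below meta-trains a single scalar parameter on a family of quadratic "tasks". The task distribution, loss, and step sizes are illustrative assumptions; for this loss the second derivative is constant, so the outer gradient can be written in closed form.

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, beta = 0.1, 0.01        # inner- and outer-loop step sizes (assumptions)

def task_loss_grad(theta, target):
    """Toy per-task loss L_T(theta) = (theta - target)^2 and its gradient."""
    return (theta - target) ** 2, 2.0 * (theta - target)

theta = 0.0
for _ in range(3000):
    # Each task is "match target t", with t ~ U(4, 6) (toy environment family)
    targets = rng.uniform(4.0, 6.0, size=8)
    meta_grad = 0.0
    for t in targets:
        _, g = task_loss_grad(theta, t)
        theta_adapted = theta - alpha * g          # inner-loop adaptation step
        _, g_adapted = task_loss_grad(theta_adapted, t)
        # Outer gradient: L'(theta_adapted) * (1 - alpha * L''), with L'' = 2
        meta_grad += g_adapted * (1.0 - alpha * 2.0)
    theta -= beta * meta_grad / len(targets)       # Eq. (54) meta-update

# Rapid adaptation to a new "site": one inner step from the meta-initialisation
_, g_new = task_loss_grad(theta, 5.5)
theta_new = theta - alpha * g_new
```

The meta-learned initialisation settles near the centre of the task family, so a single gradient step moves the model most of the way to any new site's optimum, mirroring the 10–100 transmission adaptation claim above.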
The support set establishes class prototypes:

$$c_k = \frac{1}{|S_k|} \sum_{(x_i, y_i) \in S_k} f_\phi(x_i), \quad (55)$$

where $c_k$ is the prototype (centroid) for class $k$, $S_k$ is the support set of examples for class $k$, $(x_i, y_i)$ are example-label pairs, and $f_\phi$ is the embedding function with parameters $\phi$. Query classification uses the nearest prototype:

$$p(y = k \mid x) = \frac{\exp\left(-d(f_\phi(x), c_k)\right)}{\sum_{k'} \exp\left(-d(f_\phi(x), c_{k'})\right)}, \quad (56)$$

where $d(\cdot,\cdot)$ is a distance metric (typically Euclidean distance) between the query embedding $f_\phi(x)$ and the class prototypes.

For identifying endangered species vocalisations, prototypical networks trained on common species adapt to rare species with just 5–10 example calls [62]. This enables rapid biodiversity assessment: deploying to new regions, recording local species, and immediately beginning population monitoring without extensive training data collection.

4) Transformer Architectures and Self-Attention: Transformers, having revolutionised natural language processing, bring powerful sequence modelling capabilities to underwater communications, excelling at capturing long-range dependencies and at parallel processing [48], [61].

Transformers for Protocol Learning: Traditional protocol design requires extensive standardisation and rigid specifications. Transformers learn protocol structures from observations, automatically discovering frame formats, error correction schemes, and timing relationships [122]. The self-attention mechanism relates all positions in a sequence:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V, \quad (57)$$

where $Q$ (query), $K$ (key), and $V$ (value) are linear projections of the input, and $d_k$ is the dimension of the key vectors (the $\sqrt{d_k}$ scaling prevents softmax saturation). Multi-head attention captures different protocol aspects: frame boundaries and synchronisation patterns, address fields and routing information, error detection/correction codes, and payload structure and encoding [123].
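The scaled dot-product attention of Eq. (57) reduces to a few lines of numpy; the sequence length and dimensions below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def scaled_dot_product_attention(Q, K, V):
    """Eq. (57): softmax(Q K^T / sqrt(d_k)) V with a row-wise softmax."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy sequence of 5 positions with d_k = 8
Q = rng.normal(size=(5, 8))
K = rng.normal(size=(5, 8))
V = rng.normal(size=(5, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each row of `w` is a probability distribution over all sequence positions, which is what lets every output position draw on the entire frame at once rather than a local receptive field.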
Position encoding incorporates temporal information:

$$PE_{(pos,\,2i)} = \sin\left(pos / 10000^{2i/d_{\mathrm{model}}}\right), \quad (58)$$
$$PE_{(pos,\,2i+1)} = \cos\left(pos / 10000^{2i/d_{\mathrm{model}}}\right), \quad (59)$$

where $PE_{(pos,i)}$ is the position encoding at position $pos$ and dimension $i$, and $d_{\mathrm{model}}$ is the model dimension.

A transformer trained on 1000 hours of intercepted communications automatically discovers: frame structure with 99.2% boundary detection accuracy, modulation switching patterns correlating with channel conditions, adaptive coding schemes responding to error rates, and hidden acknowledgment mechanisms embedded in data frames [124].

Vision Transformers for Sonar Image Analysis: Vision Transformers (ViT) process sonar images as sequences of patches, capturing global context missed by CNNs' local receptive fields [125]. Image tokenisation flattens each patch:

$$x_p = \mathrm{Flatten}(\mathrm{Patch}(I)) \in \mathbb{R}^{N \times (P^2 \cdot C)}, \quad (60)$$

where $x_p$ are the flattened patch embeddings, $I$ is the input image divided into $N$ patches, $P$ is the patch size, and $C$ is the number of channels.

For seafloor classification from side-scan sonar, ViT achieves remarkable performance by capturing long-range spatial dependencies [126]. The attention maps provide interpretability, highlighting which image regions contribute to classification decisions. For detecting unexploded ordnance, attention concentrates on acoustic shadows and characteristic highlight patterns while ignoring seafloor clutter, achieving a 97.8% detection rate with 0.2% false alarms, compared to 93.5% detection with 1.8% false alarms for CNN-based methods [127].

5) Edge AI and Neuromorphic Computing: The severe power constraints of underwater sensors motivate ultra-low-power AI implementations [3]. Neuromorphic computing, inspired by the efficiency of biological neural networks, enables intelligent processing consuming microwatts rather than watts.

Spiking Neural Networks for Event-Based Processing: SNNs process information through discrete spikes, matching the event-driven nature of underwater sensing [57].
The Leaky Integrate-and-Fire (LIF) neuron dynamics are:

$$\tau_m \frac{dV}{dt} = -(V - V_{\mathrm{rest}}) + R \cdot I(t), \quad (61)$$

where $\tau_m$ is the membrane time constant, $V$ is the membrane potential, $V_{\mathrm{rest}}$ is the resting potential, $R$ is the membrane resistance, and $I(t)$ is the input current. When the membrane potential $V$ exceeds the threshold $V_{\mathrm{th}}$, the neuron generates a spike and resets.

Spike-Timing-Dependent Plasticity (STDP) enables local adaptation without external training:

$$\Delta w = \begin{cases} A_+ \exp(-\Delta t / \tau_+) & \text{if } t_{\mathrm{post}} > t_{\mathrm{pre}} \\ -A_- \exp(-\Delta t / \tau_-) & \text{if } t_{\mathrm{post}} < t_{\mathrm{pre}} \end{cases} \quad (62)$$

where $\Delta w$ is the weight change, $A_+$ and $A_-$ are learning rate parameters, $\Delta t = |t_{\mathrm{post}} - t_{\mathrm{pre}}|$ is the absolute time difference between the post-synaptic and pre-synaptic spikes, and $\tau_+$ and $\tau_-$ are the time constants for potentiation and depression.

For acoustic event detection, SNNs offer extreme efficiency. Neuromorphic hardware such as Intel's Loihi implements these networks with an idle power of 10 μW and an active power of 1 mW per event [128]. This efficiency allows underwater sensors to operate for five years on a single battery while continuously monitoring for rare events such as oil leaks, submarine passages, or whale vocalisations.

Quantisation and Pruning for Resource-Constrained Deployment: Model compression enables sophisticated AI on limited underwater hardware [53]. Weight quantisation reduces precision from 32-bit floating point to $b$-bit widths:

$$w_q = \mathrm{round}\left(\frac{w}{s}\right) \cdot s, \qquad s = \frac{w_{\max} - w_{\min}}{2^b - 1}, \quad (63)$$

where $w_q$ is the quantised weight, $w$ is the original weight, $s$ is the scale factor, $w_{\max}$ and $w_{\min}$ are the maximum and minimum weights, and $b$ is the bit width. Binary quantisation achieves 32× compression, enabling complex models to run on microcontrollers [64]. Structured pruning removes entire channels using group sparsity, achieving 10× speedup with 95% accuracy retention.
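The uniform quantisation rule of Eq. (63) can be sketched directly; the weight values and bit widths below are toy inputs chosen to make the level spacing visible.

```python
import numpy as np

def quantise(w, b):
    """Uniform b-bit quantisation of Eq. (63): scale, round, rescale."""
    s = (w.max() - w.min()) / (2 ** b - 1)   # scale factor
    return np.round(w / s) * s

w = np.array([-1.0, -0.4, 0.05, 0.3, 1.0])
w4 = quantise(w, 4)   # 16 quantisation levels: small rounding error
w2 = quantise(w, 2)   # 4 levels: coarser, larger error
```

Lower bit widths shrink storage and arithmetic cost at the price of a rounding error bounded by half the level spacing $s/2$.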
Knowledge distillation transfers expertise from large teacher models to compact student networks, reducing inference time from 100 ms to 5 ms while maintaining classification accuracy [129].

6) Graph Neural Networks for Network Topology Learning: Underwater networks exhibit complex graph structures: sensor connectivity, AUV coordination, or acoustic propagation graphs [130]. GNNs process this relational data, learning from both node features and topology [131].

Message Passing for Distributed Learning: GNNs aggregate information from neighbours through iterative message passing:

$$h_i^{(k+1)} = \sigma\left( W_{\mathrm{self}}^{(k)} h_i^{(k)} + \sum_{j \in \mathcal{N}(i)} W_{\mathrm{msg}}^{(k)} h_j^{(k)} \right), \quad (64)$$

where $h_i^{(k)}$ is node $i$'s feature representation at layer $k$, $\mathcal{N}(i)$ is the set of neighbours of node $i$, $W_{\mathrm{self}}^{(k)}$ and $W_{\mathrm{msg}}^{(k)}$ are learnable weight matrices, and $\sigma$ is a nonlinear activation function. For underwater routing, nodes learn strategies based on local observations $h_i^{(0)} = [E_i, D_i, Q_i, \mathrm{SNR}_i]$ (energy, depth, queue length, signal-to-noise ratio) and neighbour states [44].

Graph Attention Networks (GAT) weight neighbour contributions using learnable attention:

$$\alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_i \,\|\, W h_j]\right)\right)}{\sum_{k \in \mathcal{N}(i)} \exp\left(\mathrm{LeakyReLU}\left(a^{\top}[W h_i \,\|\, W h_k]\right)\right)}, \quad (65)$$

where $\alpha_{ij}$ is the attention coefficient from node $i$ to node $j$, $a$ is a learnable attention vector, $W$ is a weight matrix, and $\|$ denotes concatenation. This adaptive weighting outperforms fixed-topology routing by 40–60% in dynamic networks [131], [132].

Spatial-Temporal GNNs for Dynamic Networks: Underwater networks evolve as nodes drift and links fail. Spatial-Temporal GNNs (ST-GNNs) capture these dynamics through spatial graph convolutions and temporal kernels [130]:

$$H^{(l)} = \sigma\left( \tilde{D}^{-1/2} \tilde{A}\, \tilde{D}^{-1/2} H^{(l-1)} W^{(l)} \right), \quad (66)$$

where $H^{(l)}$ are the node features at layer $l$, $\tilde{A} = A + I$ is the adjacency matrix with self-loops, $\tilde{D}$ is the degree matrix, and $W^{(l)}$ are learnable weights.
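Returning to the message-passing update of Eq. (64), a minimal sketch on a toy four-node ring follows; the node features, weights, and topology are illustrative assumptions rather than a real underwater deployment.

```python
import numpy as np

rng = np.random.default_rng(7)

def message_passing_layer(H, adj, W_self, W_msg):
    """One layer of Eq. (64). H: (n_nodes, d) features;
    adj: (n_nodes, n_nodes) 0/1 adjacency; sigma = tanh."""
    msgs = adj @ H @ W_msg.T          # summed neighbour messages per node
    return np.tanh(H @ W_self.T + msgs)

n, d = 4, 4
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)   # ring topology (toy network)
H0 = rng.normal(size=(n, d))                  # [E_i, D_i, Q_i, SNR_i] per node
W_self, W_msg = rng.normal(size=(d, d)), rng.normal(size=(d, d))
H1 = message_passing_layer(H0, adj, W_self, W_msg)
```

Stacking such layers widens each node's view: after $k$ layers a node's representation depends on its $k$-hop neighbourhood, which is how purely local observations support network-wide routing decisions.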
Temporal evolution is captured via:

$$Z = \sum_{\tau=0}^{K-1} P_\tau X_{t-\tau} W_\tau, \quad (67)$$

where $Z$ is the temporal output, $K$ is the temporal window size, $P_\tau$ are temporal convolution parameters, $X_{t-\tau}$ are the node features at time $t-\tau$, and $W_\tau$ are temporal weights. Predicting network evolution 24 hours ahead achieves 85% topology accuracy, enabling proactive management such as preemptively establishing backup routes and repositioning AUVs to maintain connectivity [133].

7) Hybrid Quantum-Classical Algorithms: Quantum computing promises exponential speedups for optimisation problems in underwater networks [134]. Near-term devices offer advantages when integrated with classical ML via hybrid frameworks.

Quantum Approximate Optimisation Algorithm (QAOA): Many underwater networking problems, such as sensor placement and frequency allocation, reduce to combinatorial optimisation. QAOA leverages quantum superposition:

$$|\psi(\gamma, \beta)\rangle = \prod_{l=1}^{p} e^{-i\beta_l H_B}\, e^{-i\gamma_l H_C}\, |+\rangle^{\otimes n}, \quad (68)$$

where $|\psi(\gamma,\beta)\rangle$ is the variational quantum state, $p$ is the circuit depth, $\gamma$ and $\beta$ are variational parameters, $H_C$ encodes the objective (cost Hamiltonian), $H_B$ is the mixing Hamiltonian, $|+\rangle$ is the equal superposition state, and $n$ is the number of qubits.

Quantum ML for Feature Mapping: Quantum feature maps exploit high-dimensional Hilbert spaces to capture intricate phase relationships in acoustic signatures [134]:

$$|\varphi(x)\rangle = \prod_i e^{i x_i Z_i} \prod_{i<j} e^{i x_i x_j Z_i Z_j}\, |0\rangle^{\otimes n}, \quad (69)$$

where $|\varphi(x)\rangle$ is the quantum feature state, $x_i$ are input features, $Z_i$ is the Pauli-Z operator on qubit $i$, $|0\rangle$ is the zero state, and $n$ is the number of qubits. The quantum kernel $K(x, x') = |\langle \varphi(x) | \varphi(x') \rangle|^2$ achieves 98.5% acoustic classification accuracy compared to 94.0% for classical RBF kernels, with the advantage stemming from entanglement creating exponentially large feature spaces.
8) Continual Learning and Lifelong Adaptation: Underwater deployments spanning decades encounter evolving conditions: sensor degradation, seasonal cycles, and changing noise sources [3]. Continual learning enables models to adapt without forgetting previously learned knowledge [135].

Elastic Weight Consolidation (EWC): To prevent catastrophic forgetting, EWC slows updates to parameters critical for previous tasks using the Fisher information matrix $F_i$:

$$L_{\mathrm{EWC}}(\theta) = L_{\mathrm{new}}(\theta) + \frac{\lambda}{2} \sum_i F_i \left(\theta_i - \theta_{\mathrm{old},i}^*\right)^2, \quad (70)$$

where $L_{\mathrm{new}}(\theta)$ is the loss on the new task, $\lambda$ is a weighting parameter, $F_i$ is the Fisher information for parameter $i$, and $\theta_{\mathrm{old},i}^*$ are the optimal parameters from the previous task. The Fisher information is:

$$F_i = \mathbb{E}_{x \sim p_{\mathrm{old}}}\left[ \left( \frac{\partial \log p(x|\theta_{\mathrm{old}}^*)}{\partial \theta_i} \right)^2 \right]. \quad (71)$$

For acoustic equalisers, EWC maintains 95% performance across seasonal shifts, whereas standard adaptation drops to 60% when conditions reverse [102].

Progressive Neural Networks: Progressive networks expand the architecture for new missions whilst freezing existing parameters to preserve knowledge [57]. Lateral connections enable knowledge transfer between columns:

$$h_i^{(k)} = f\left( W_i^{(k)} h_{i-1}^{(k)} + \sum_{j<k} U_i^{(k:j)} h_{i-1}^{(j)} \right), \quad (72)$$

where $h_i^{(k)}$ is the hidden state at layer $i$ of column (task) $k$, $f$ is the activation function, $W_i^{(k)}$ are within-column weights, and $U_i^{(k:j)}$ are lateral connection weights from column $j$ to column $k$. This allows multi-mission AUVs to accumulate capabilities: navigation provides base mobility, target detection leverages navigation for approach, mapping uses mobility for efficient sampling, and communications relay uses sampling for optimal positioning [8].
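Returning to EWC, the penalty of Eq. (70) can be sketched on a two-parameter toy problem. The Fisher values, task optima, and $\lambda$ are illustrative assumptions: parameter 0 mattered for the old task (high Fisher), parameter 1 did not, so only the unimportant parameter should move freely toward the new task.

```python
import numpy as np

theta_old = np.array([2.0, -1.0])      # optimum after the old task (assumption)
fisher = np.array([10.0, 0.01])        # per-parameter importance (assumption)
lam = 1.0

def ewc_loss(theta, new_target):
    """Eq. (70): new-task loss plus Fisher-weighted quadratic penalty."""
    new_loss = np.sum((theta - new_target) ** 2)               # L_new
    penalty = 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)
    return new_loss + penalty

# Gradient descent on the combined objective
theta = theta_old.copy()
new_target = np.array([0.0, 0.0])
for _ in range(500):
    grad = 2.0 * (theta - new_target) + lam * fisher * (theta - theta_old)
    theta -= 0.01 * grad
```

After training, the high-Fisher parameter stays near its old value while the low-Fisher parameter adapts to the new task, which is exactly the selective-slowing behaviour described above.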
Memory-Augmented Networks: Experience replay via external memory enables storage and retrieval of anomalous patterns [53]:

$$r_t = \sum_i w_t^r(i)\, M_t(i), \quad (73)$$
$$M_t(i) = M_{t-1}(i)\left(1 - w_t^w(i)\, e_t\right) + w_t^w(i)\, a_t, \quad (74)$$

where $r_t$ is the read vector at time $t$, $w_t^r(i)$ are read weights, $M_t(i)$ is memory slot $i$ at time $t$, $w_t^w(i)$ are write weights, $e_t$ is the erase vector, and $a_t$ is the add vector. For long-term monitoring, this system stores prototypical anomalies. After five years of deployment, such systems recognise 47 anomaly types with 99% detection accuracy and zero forgetting, mirroring the lifelong acoustic learning of marine mammals [75].

G. Summary

This section has provided a comprehensive tutorial on ML techniques for underwater communications, progressing from foundational concepts to cutting-edge paradigms. Table V summarises the key techniques and their primary underwater applications.

The key insight from this tutorial is that successful ML deployment in underwater systems requires matching algorithm capabilities to application requirements. Supervised learning excels when labelled data is available; unsupervised methods discover structure in unlabelled ocean measurements; reinforcement learning enables adaptation without explicit models; and deep learning architectures handle high-dimensional signals. The emerging paradigms (federated learning, physics-informed networks, transformers, and neuromorphic computing) address the unique constraints of underwater deployment: limited communication bandwidth, severe energy restrictions, and the need for autonomous operation over extended periods. The following sections apply these techniques across all layers of the underwater network protocol stack, demonstrating how ML transforms each layer from the physical to the application layer.
COMPARISON WITH EXISTING SURVEYS

Having established the ML fundamentals essential for understanding underwater applications, we now position our work within the broader landscape of existing surveys. This comparison demonstrates how our tutorial-survey approach, combining pedagogical ML foundations with comprehensive protocol-layer analysis, addresses critical gaps in the literature.

The application of ML to underwater communications has attracted growing research interest, resulting in several survey articles examining different aspects of this interdisciplinary field [3], [4], [33].

TABLE V
SUMMARY OF ML TECHNIQUES FOR UNDERWATER COMMUNICATIONS

Category | Technique | Primary Applications | Key Advantages
Supervised Learning | k-NN, SVM | Modulation classification, vessel identification | Robust with limited data [42]
Supervised Learning | Random Forests | Fault detection, routing decisions | Interpretable, handles mixed features [74]
Supervised Learning | Gaussian Processes | Field estimation, path planning | Uncertainty quantification [56]
Unsupervised Learning | k-Means, DBSCAN | Network clustering, topology organisation | Adapts to irregular deployments [78]
Unsupervised Learning | PCA, Autoencoders | Data compression, anomaly detection | 50–1000× compression [64]
Reinforcement Learning | Q-Learning, DQN | MAC protocols, power control, routing | Learns from interaction [44]
Reinforcement Learning | PPO, DDPG | AUV navigation, continuous control | Handles continuous actions [9]
Deep Learning | CNNs | Signal classification, image enhancement | Automatic feature learning [30]
Deep Learning | LSTMs | Channel prediction, sequence labelling | Captures temporal dependencies [36]
Deep Learning | GANs, VAEs | Data augmentation, anomaly detection | Generates realistic training data [75]
Emerging Paradigms | Federated Learning | Collaborative training, privacy preservation | 95% bandwidth reduction [40]
Emerging Paradigms | PINNs | Source localisation, field estimation | Physics-constrained predictions [47]
Emerging Paradigms | Transformers | Protocol learning, sonar analysis | Long-range dependencies [61]
Emerging Paradigms | GNNs | Routing, topology prediction | Handles network structure [131]
However, existing surveys either focus narrowly on specific applications, address only terrestrial sensor networks, or discuss underwater systems without considering ML solutions. This section provides a comprehensive comparison with existing literature, demonstrating how our survey uniquely addresses critical gaps whilst providing practical guidance for implementing ML-enabled IoUT systems. Figure 2 presents a taxonomy of existing surveys in this domain.

A. Analysis of Existing Survey Contributions

To understand the unique positioning of our survey, we systematically analyse existing literature across multiple dimensions: topical coverage, technical depth, practical applicability, and temporal relevance. Table VI presents a comprehensive comparison of surveys spanning 2012–2025.

1) Surveys on ML in Wireless Sensor Networks: The foundational work by Alsheikh et al. [65] established a comprehensive taxonomy of ML applications in WSNs, categorising algorithms by learning type (supervised, unsupervised, reinforcement) and application domain (routing, localisation, clustering). While groundbreaking for its time, this survey assumes terrestrial propagation models where radio waves travel at light speed with predictable path loss. The fundamental differences in underwater acoustics (propagation speeds 200,000× slower, frequency-dependent absorption, and severe multipath) render many of their recommendations inapplicable [21]. For instance, their analysis of k-means clustering assumes Euclidean distance correlates with communication cost, but underwater acoustic shadows can prevent communication between physically proximate nodes while enabling long-range communication via surface reflections.

Kumar et al. [66] extended this work with greater emphasis on energy efficiency, providing detailed complexity analysis of ML algorithms suitable for resource-constrained nodes.
They examine dimensionality reduction techniques (PCA, LDA) and lightweight classifiers (decision trees, naive Bayes) from an energy perspective. However, their energy models assume RF communication where transmission power scales with distance squared. Underwater acoustic transmission power follows complex models incorporating frequency-dependent absorption ($\alpha(f) \propto f^2$), spherical/cylindrical spreading, and environmental noise that varies by orders of magnitude with sea state and biological activity [22]. Their recommendation to "always use the nearest neighbour for routing" could be catastrophic underwater, where the nearest neighbour might be in an acoustic shadow zone.

The self-organising networks survey by Klaine et al. [136] explores ML for network automation, discussing how supervised learning enables traffic prediction, unsupervised learning supports anomaly detection, and reinforcement learning optimises resource allocation. Their framework for self-configuration, self-optimisation, and self-healing provides valuable architectural insights. Yet their solutions assume cellular network characteristics: reliable backhaul connections, stable node positions, and predictable channel conditions. Underwater networks face opposite conditions: intermittent connectivity to surface gateways, continuous node drift from currents, and channels varying dramatically with thermocline depth and internal waves.

2) Surveys on Underwater Communications: Li et al. [12] comprehensively review reliability techniques for underwater sensor networks, analysing error correction codes, retransmission strategies, and cross-layer protocols. They provide valuable insights into underwater-specific challenges: long propagation delays preventing traditional ARQ, Doppler spreads requiring specialised equalisation, and energy constraints limiting retransmissions.
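To make the frequency-dependent absorption and spreading losses concrete, the following sketch uses Thorp's widely cited empirical absorption formula. This is our illustration, not a model from the surveys under discussion; the "practical spreading" factor k = 1.5 is a common modelling convention assumed here:

```python
import math

def thorp_absorption_db_per_km(f_khz):
    """Thorp's empirical absorption coefficient (dB/km), frequency in kHz.
    Captures the roughly f^2 frequency dependence noted in the text."""
    f2 = f_khz ** 2
    return (0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2)
            + 2.75e-4 * f2 + 0.003)

def transmission_loss_db(d_m, f_khz, k=1.5):
    """Spreading plus absorption over distance d (metres):
    k = 1 cylindrical, k = 2 spherical, k = 1.5 'practical' spreading."""
    spreading = k * 10 * math.log10(d_m)
    absorption = (d_m / 1000.0) * thorp_absorption_db_per_km(f_khz)
    return spreading + absorption

# Absorption grows steeply with frequency (roughly 1 dB/km near 10 kHz
# versus tens of dB/km near 100 kHz), which is why long-range underwater
# links are confined to low acoustic frequencies.
```

This steep frequency dependence is what invalidates RF-style "power scales with distance squared" energy models in underwater routing analysis.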
However, their solutions remain rule-based: fixed FEC rates, predetermined retransmission limits, static routing tables. They acknowledge that "adaptive approaches could improve performance" but do not explore how ML enables such adaptation. Our survey demonstrates that ML-based adaptive FEC reduces energy consumption by 40% while maintaining reliability by learning channel patterns and predicting error rates [143].

Fig. 2. Taxonomy of existing surveys related to ML and IoUT, spanning four categories: ML in WSNs (general), underwater networking, ML for underwater applications, and specialised topics. Our survey uniquely integrates knowledge across all four categories, providing comprehensive coverage that individual surveys lack.

TABLE VI
COMPREHENSIVE COMPARISON OF ML AND UNDERWATER NETWORKING SURVEYS (2012–2025)

Survey Reference | Year | Primary Scope | Key Contributions | Limitations

ML in Wireless Sensor Networks (General)
Alsheikh et al. [65] | 2014 | ML algorithms in terrestrial WSNs | Comprehensive ML taxonomy, algorithm comparison, complexity analysis | No underwater considerations, outdated ML techniques
Kumar et al. [66] | 2019 | ML techniques for WSN optimisation | Energy efficiency focus, clustering algorithms, fault detection | Minimal underwater content, lacks deep learning
Klaine et al. [136] | 2017 | Self-organising networks with ML | SON paradigm, cellular focus, optimisation techniques | Terrestrial only, no acoustic channels

Underwater Communications and Networking
Li et al. [12] | 2019 | High reliability in UWSNs | Protocol comparison, reliability metrics, cross-layer design | No ML techniques discussed
Khisa & Moh [20] | 2021 | Routing protocols for UWSNs | Depth-based, cluster-based, bio-inspired routing | Limited RL coverage, no deep learning
Mohsan et al. [4] | 2022 | General IoUT overview | Basic concepts, applications, challenges | No ML/AI coverage, lacks technical depth
Luo et al. [137] | 2021 | UWSN routing protocols | Comprehensive protocol taxonomy, simulation comparison | Traditional protocols only, no learning-based

ML for Specific Underwater Applications
Moniruzzaman et al. [103] | 2017 | DL for underwater object detection | CNN architectures, dataset review, detection metrics | Application-specific, no networking
Jahanbakht et al. [3] | 2021 | Big data analytics in marine IoUT | Data processing pipelines, visualisation, storage | Limited to data analytics, no protocols
Wang et al. [138] | 2022 | DL for marine object detection | YOLO variants, dataset comparison, real-time processing | Computer vision only, no communications
Luo et al. [139] | 2023 | ML for target recognition | Recognition methods, feature extraction, classification | Narrow acoustic focus

Specialised Topics
Jiang [140] | 2019 | Security in underwater networks | Attack taxonomy, defence mechanisms, authentication | No ML-based security solutions
Saleh et al. [141] | 2022 | DL for fish classification | Species recognition, tracking algorithms, datasets | Narrow application focus
Christensen et al. [8] | 2022 | AI for AUV control | Navigation algorithms, path planning, obstacle avoidance | Limited to AUV control
Yang et al. [142] | 2025 | Underwater positioning & tracking | Localisation methods, tracking algorithms | Positioning focus only

This Survey | 2026 | Comprehensive ML for IoUT | Layer-by-layer analysis, quantitative comparisons, implementation guidelines, emerging paradigms | —

The routing protocol survey by Khisa and Moh [20] categorises underwater routing into depth-based, cluster-based, and bio-inspired approaches.
They analyse 47 protocols, comparing energy efficiency, packet delivery ratio, and end-to-end delay. While mentioning "RL-based" routing as an emerging category, they dedicate only two paragraphs to Q-learning approaches, missing the revolution in deep reinforcement learning [88], [144]. They do not discuss how deep Q-networks handle continuous state spaces representing 3D positions, currents, and time-varying channels, which is critical for practical deployment. Our survey provides detailed analysis of 15+ RL-based routing protocols, including implementation architectures, training procedures, and convergence guarantees [44].

Mohsan et al. [4] present a high-level overview of IoUT concepts, applications, and challenges. While useful for newcomers, the survey lacks the technical depth required for implementation. Their discussion of "intelligent algorithms" spans one page without explaining how intelligence is achieved. They mention "AI and ML will revolutionise IoUT" without providing concrete examples, algorithms, or performance metrics. In contrast, our survey provides implementation-ready details: network architectures with layer specifications, hyperparameter settings, training procedures, and measured performance improvements from real deployments.

3) Surveys on ML for Underwater Applications: Moniruzzaman et al. [103] pioneered the review of deep learning for underwater object detection, analysing CNN architectures (AlexNet, VGGNet, ResNet) and their adaptation for underwater imagery. They discuss challenges including colour distortion, low contrast, and limited labelled data. However, their focus remains entirely on visual sensing: they do not consider acoustic sensing, communication systems, or how detected objects relate to network behaviour. Our survey bridges this gap, connecting computer vision insights to network-level decisions such as triggered data transmission or AUV mission adaptation [46].

Jahanbakht et al.
[3] provide the most comprehensive survey of big data analytics for marine IoUT, covering data collection platforms, storage architectures, processing pipelines, and visualisation tools. Their analysis of data characteristics (volume, velocity, variety) offers valuable insights for system design. However, they treat the network as a data conduit, not examining how ML can optimise the network itself. Questions such as "How should sensor sampling rates adapt to detected phenomena?" or "Which data merits immediate transmission versus local processing?" remain unexplored. Our survey addresses these network-centric ML applications while building upon their data analytics foundations.

Wang et al. [138] and Luo et al. [139] focus on deep learning for marine object detection and target recognition, respectively. These surveys provide excellent coverage of YOLO variants, attention mechanisms, and acoustic feature extraction, but remain confined to perception tasks. Neither survey connects recognition to communication: how does detecting a whale affect transmission scheduling to avoid acoustic interference? How should recognising a pipeline leak trigger network reconfiguration for high-priority data delivery? Our survey uniquely addresses these ML applications for network adaptation.

4) Specialised Topic Surveys: Christensen et al. [8] provide an excellent review of AI techniques for AUV navigation and control, covering path planning, obstacle avoidance, and mission adaptation. However, their communication discussion remains limited to "AUVs must surface to transmit data," missing extensive research on underwater acoustic communication for AUV coordination, real-time data relay, and collaborative SLAM [38]. Our survey integrates AUV intelligence with network-level optimisation, showing how navigation decisions affect and are affected by communication capabilities.

Yang et al.
[142] offer the most recent survey on underwater positioning and tracking, covering acoustic ranging, inertial navigation, and fusion techniques. While they mention ML briefly, their focus remains on geometric algorithms. Our survey complements their work by providing deep technical analysis of ML-based localisation: fingerprinting with neural networks, RL-based active localisation, and federated learning for privacy-preserving positioning [40], [55].

Figure 3 visualises the coverage gaps across existing surveys, highlighting the unique comprehensive coverage provided by this work.

B. Critical Gaps Addressed by This Survey

Our systematic analysis reveals four critical gaps in existing literature that this survey addresses. Figure 4 illustrates these gaps and our corresponding contributions.

1) Gap 1: Fragmented Protocol Stack Coverage: Existing surveys examine isolated aspects of underwater networks without considering how ML optimisations at one layer affect others. Physical layer surveys [34], [56] analyse modulation and channel estimation in isolation, ignoring how improved channel knowledge could benefit MAC scheduling or routing decisions. Routing surveys [20], [137] evaluate protocols assuming fixed physical layer parameters, missing opportunities for joint optimisation. This fragmentation prevents practitioners from understanding system-level trade-offs and synergies.

Our Solution: We provide the first comprehensive layer-by-layer analysis of ML applications spanning physical, MAC, network, transport, and application layers, explicitly addressing cross-layer interactions. For example, we show how physical layer channel prediction can inform MAC layer scheduling, which affects network layer routing decisions: a cascade of optimisations impossible to understand from fragmented surveys.
2) Gap 2: Missing Quantitative Performance Comparisons: Existing surveys often make qualitative claims ("ML improves performance") without standardised metrics enabling fair comparison. A survey might state "CNN achieves high accuracy" for one application while "RL reduces energy consumption" for another, without common baselines or consistent evaluation methodologies. This vagueness prevents evidence-based algorithm selection.

Our Solution: We compile quantitative performance metrics from 200+ papers into a structured repository. Table VII provides examples of the standardised comparisons we enable, allowing researchers to make informed decisions based on measured performance under comparable conditions.

3) Gap 3: Outdated ML Technique Coverage: Existing surveys focus on established ML techniques (k-means, SVM, basic neural networks) while missing recent advances that address fundamental IoUT challenges. The rapid evolution of deep learning, reinforcement learning, and distributed learning has produced transformative techniques largely unexplored in underwater contexts [40], [47], [88]:
• Deep Reinforcement Learning: Actor-critic architectures (TD3, SAC, PPO) handle continuous action spaces required for power control and AUV navigation, but no underwater survey provides comprehensive DRL coverage [97].
• Graph Neural Networks: GNNs naturally model network topology for routing and clustering, yet remain unexamined in underwater surveys despite growing terrestrial applications [130].

Fig. 3. Coverage matrix comparing existing surveys (Alsheikh '14, Jahanbakht '21, Khisa '21, Christensen '22, and this survey; rated full/partial/none) across protocol layers (Physical, MAC, Network, Transport, Application), cross-layer optimisation, ML paradigms (DL: Deep Learning, RL: Reinforcement Learning, FL: Federated Learning), and implementation guidance. Our survey provides comprehensive coverage across all dimensions.
Fig. 4. Four critical gaps identified in existing survey literature and the corresponding solutions provided by this survey: fragmented protocol coverage (addressed by layer-by-layer analysis from PHY to APP with cross-layer treatment), missing quantitative benchmarks (a performance repository synthesising 200+ papers with standardised metrics), outdated ML techniques (modern coverage of DRL, FL, PINNs, Transformers, and GNNs), and the theory-practice divide (architecture specifications, training configurations, and deployment procedures). Each gap represents a significant barrier to ML adoption in IoUT systems that our comprehensive treatment addresses.

TABLE VII
SAMPLE QUANTITATIVE COMPARISONS PROVIDED IN THIS SURVEY

Application | Baseline | ML Method | Improvement

Physical Layer
Localisation | Trilateration | CNN | 7× accuracy
Channel Est. | LS Pilot | LSTM | 15 dB MSE gain
Modulation Class. | Energy Det. | ResNet | 25% @ -5 dB SNR

MAC Layer
Channel Access | ALOHA | Q-learning | 45% throughput
Power Control | Fixed Power | TD3 | 30% energy
Scheduling | TDMA | MARL | 2× utilisation

Network Layer
Routing | Shortest Path | DQN | 148% throughput
Clustering | Geographic | RL+k-means | 70% lifetime
Load Balance | Round Robin | PPO | 40% delay

Application Layer
Anomaly Det. | Threshold | VAE | 95% precision
Species Class. | Manual | CNN | 94% accuracy
Path Planning | A* | TD3 | 35% energy

• Federated Learning: FL enables privacy-preserving collaborative learning across distributed sensors, critical for multi-stakeholder ocean monitoring, but underwater FL surveys do not exist [40], [41].
• Physics-Informed Neural Networks: PINNs embed acoustic propagation physics into learning, improving generalisation with limited data, a key underwater challenge [47].
• Transformer Architectures: Self-attention mechanisms capture long-range dependencies in acoustic signals, with OceanGPT demonstrating potential for marine foundation models [48].

Our Solution: We provide detailed technical analysis of modern ML paradigms specifically contextualised for underwater applications, including architecture specifications, training procedures, and performance benchmarks.

4) Gap 4: Theory-Practice Divide: Academic surveys present algorithms without deployment guidance, creating a theory-practice gap that has limited ML adoption in operational underwater systems. Researchers propose novel architectures without discussing computational requirements, training data needs, or failure modes. Practitioners reading these surveys cannot assess whether proposed solutions are feasible for their hardware constraints, data availability, or reliability requirements.

Our Solution: We bridge this gap with implementation-ready details including complete architecture specifications, training configurations, quantisation strategies for embedded deployment, and documented pitfalls from real deployments. For example, we explain that "models trained in tanks fail in open ocean due to boundary reflections", practical knowledge absent from theoretical surveys.

Fig. 5. Five unique value propositions distinguishing this survey from existing literature: (1) a tutorial-survey hybrid with underwater examples, (2) a quantitative performance repository (200+ papers), (3) implementation-ready technical specifications, (4) cross-layer optimisation insights, and (5) a future-oriented research roadmap, providing comprehensive coverage from tutorial foundations through future research directions.
Lessons Learned: Gap Analysis
• Fragmented surveys force practitioners to synthesise across 10+ papers for system design
• Qualitative claims without metrics lead to suboptimal algorithm selection
• ML advances from 2020–2025 remain largely unexplored in underwater contexts
• Implementation details are critical for transitioning from simulation to deployment

C. Unique Value Propositions of This Survey

Building upon the identified gaps, we articulate five unique value propositions that distinguish this survey from existing literature. Figure 5 summarises these contributions.

1) Value Proposition 1: Tutorial-Survey Hybrid: Unlike pure surveys that assume ML expertise, we provide a hybrid approach where each section begins with tutorial content explaining fundamentals through underwater-specific examples, surveys state-of-the-art applications, and concludes with practical lessons learned [33]. This structure serves multiple audiences:
• Ocean engineers learn ML concepts through familiar underwater examples (e.g., RL explained as an AUV learning to navigate kelp forests)
• ML researchers understand underwater challenges that motivate specific algorithm choices
• System designers gain end-to-end understanding for complete solution architecture

For example, we explain Q-learning through an underwater routing scenario: states represent node energy levels and queue depths, actions select next-hop neighbours, and rewards balance delivery success against energy cost. This contextualisation, absent from generic ML tutorials, enables immediate application.

2) Value Proposition 2: Quantitative Performance Repository: We compile the first comprehensive repository of quantitative ML performance metrics for IoUT, aggregating results from 200+ papers into standardised formats.
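The Q-learning routing contextualisation described above (states as energy/queue levels, actions as next-hop choices, rewards trading delivery against energy) can be made concrete in a short tabular sketch. All probabilities, costs, and discretisations below are hypothetical, chosen only to illustrate the state-action-reward mapping, not drawn from any cited protocol:

```python
import random

random.seed(0)
ENERGY, QUEUE, HOPS = 3, 3, 2            # state buckets and neighbour count
# Q-table over (energy_level, queue_depth) states and next-hop actions
Q = {(e, q): [0.0] * HOPS for e in range(ENERGY) for q in range(QUEUE)}
alpha, gamma, eps = 0.1, 0.9, 0.2        # learning rate, discount, exploration

def reward(action):
    # Hypothetical channel: hop 0 is cheap but lossy, hop 1 reliable but costly.
    delivered = random.random() < (0.3 if action == 0 else 0.95)
    energy_cost = 0.1 if action == 0 else 0.3
    return (1.0 if delivered else -1.0) - energy_cost

for _ in range(5000):
    s = (random.randrange(ENERGY), random.randrange(QUEUE))
    explore = random.random() < eps
    a = random.randrange(HOPS) if explore else max(range(HOPS), key=lambda i: Q[s][i])
    r = reward(a)
    s2 = (random.randrange(ENERGY), random.randrange(QUEUE))   # toy transition
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

# The learned greedy policy prefers the reliable neighbour (action 1),
# since its expected reward outweighs its higher energy cost.
```

Under these toy dynamics the expected reward of the reliable hop (0.9 delivery margin minus 0.3 energy) dominates the lossy hop, so the table converges toward always selecting it.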
This repository enables evidence-based design decisions:
• Query by application: "Show all localisation methods achieving < 2 m error"
• Query by constraint: "Find algorithms requiring < 100 KB memory"
• Query by improvement: "List techniques providing > 50% energy reduction"

TABLE VIII
SAMPLE FROM QUANTITATIVE PERFORMANCE REPOSITORY

Task | Method | Metric | Value | Ref.
Localisation | Trilateration | Error | 8.5 m | [145]
Localisation | k-NN | Error | 1.2 m | [120]
Localisation | CNN | Error | 0.8 m | [55]
Channel Est. | LS Pilot | MSE | -5 dB | [22]
Channel Est. | CNN | MSE | -15 dB | [34]
Channel Est. | LSTM | MSE | -20 dB | [76]
Routing | Shortest Path | PDR | 72% | [12]
Routing | Q-learning | PDR | 89% | [44]
Routing | DQN | PDR | 94% | [146]

Table VIII provides a sample from our comprehensive repository, demonstrating the level of detail that enables fair comparison across diverse approaches.

3) Value Proposition 3: Implementation-Ready Specifications: We provide technical specifications that practitioners can directly translate into deployable systems. Unlike surveys offering only algorithmic descriptions, we include:

Complete Architecture Specifications:
• Input dimensions and preprocessing requirements
• Layer configurations with filter sizes, activation functions, normalisation
• Output formats and post-processing steps

Training Configurations:
• Optimiser selection with hyperparameters (Adam: $\eta = 0.001$, $\beta_1 = 0.9$)
• Loss functions with regularisation terms
• Data augmentation strategies (time shifts, Doppler scaling)
• Early stopping criteria and checkpointing

Deployment Procedures:
• Quantisation to INT8 for embedded platforms
• Memory profiling ensuring peak < 90% available RAM
• Watchdog timers for inference timeout (100 ms typical)
• Fallback to traditional algorithms when confidence < 0.7

4) Value Proposition 4: Cross-Layer Optimisation Insights: We uniquely analyse cross-layer interactions and joint optimisation opportunities that emerge when applying ML holistically across the protocol stack.
These insights, absent from single-layer surveys, reveal significant performance gains from coordinated learning:

Physical-MAC Joint Learning: A multi-task neural network simultaneously predicts channel state (physical layer) and optimal transmission slot (MAC layer). Shared layers learn correlations: calm morning waters enable aggressive scheduling, while afternoon thermal mixing requires conservative approaches. This joint model achieves 35% better efficiency than separate models.

MAC-Network Coordinated Clustering: MAC layer communication patterns (who communicates with whom, when, and how often) inform network layer clustering. K-means using communication frequency alongside geographic position reduces intra-cluster collisions by 45%.

Application-Driven Protocol Adaptation: Detecting rare events (e.g., oil leaks) triggers protocol stack reconfiguration to high-reliability mode: increased FEC, confirmed delivery, multiple paths. Routine monitoring reverts to energy-efficient modes. This adaptation extends network lifetime by 3× while maintaining critical event detection [39].

Fig. 6. Three reading patterns supported by our organisation: vertical for layer-specific deep dives, horizontal for cross-layer technique comparison, and diagonal for algorithm tracing.

5) Value Proposition 5: Future-Oriented Research Roadmap: Rather than merely cataloguing existing work, we provide a forward-looking research roadmap identifying promising directions and explaining why certain problems merit investigation. We connect current limitations to emerging ML techniques that could provide solutions:

Continual Learning for Long Deployments: Current models assume stationary distributions, failing when conditions change over years-long deployments. Continual learning approaches could adapt to sensor drift and biofouling while remembering critical events.
Foundation Models for Underwater Sensing: Large-scale pre-training on oceanographic datasets could dramatically reduce deployment-specific data requirements, similar to language model success [48].

Neuromorphic Edge Intelligence: Spiking neural networks on neuromorphic processors (Intel Loihi) enable microwatt-level always-on processing for event-driven underwater monitoring.

Quantum-Enhanced Optimisation: Many IoUT problems (sensor placement, frequency allocation) are NP-hard. Quantum approximate optimisation algorithms could provide speedups on near-term quantum devices.

D. Structure and Organisation Advantages

Beyond content, our survey's organisation provides unique advantages for different usage scenarios. Figure 6 illustrates the supported reading patterns.

1) Layer-by-Layer Systematic Coverage: Our protocol stack organisation enables readers to quickly locate relevant content for their specific challenges. A MAC layer researcher can directly access the MAC section without wading through unrelated material, while system designers can read sequentially to understand complete solutions. Each layer section follows a consistent five-step structure:
1) Challenge Formulation: Why traditional approaches fail underwater
2) ML Solution Space: Which algorithms address these challenges
3) Technical Implementations: Detailed algorithm descriptions
4) Performance Analysis: Quantitative comparisons with baselines
5) Lessons Learned: Practical insights and best practices

2) Progressive Complexity Management: We carefully manage complexity progression, ensuring accessibility without sacrificing depth:

Concept Introduction: Each technique is first introduced intuitively through analogy. Reinforcement learning is explained as "learning through trial and error, like a child learning to swim."

Technical Development: Mathematical formulations follow intuitive introductions, providing rigour for researchers while maintaining readability.

Advanced Extensions: Sophisticated variants appear in clearly marked subsections, allowing readers to skip based on their needs.

Practical Simplifications: We explicitly identify when simpler approaches suffice: "For networks under 20 nodes, tabular Q-learning outperforms deep RL while requiring 100× less computation."

3) Integrated Performance Benchmarking: Unlike surveys mentioning performance in isolation, we provide integrated benchmarks comparing multiple algorithms on standardised tasks. Table IX demonstrates cross-algorithm comparison on acoustic channel equalisation, revealing trade-offs invisible when examining algorithms individually.

TABLE IX
INTEGRATED BENCHMARK: ACOUSTIC CHANNEL EQUALISATION

Algorithm | BER @ 0 dB | Train | Infer. | Mem.
MMSE Equalizer | 3.2×10⁻² | N/A | 0.5 ms | 10 KB
RLS Adaptive | 1.8×10⁻² | N/A | 2.0 ms | 25 KB
MLP (3 layer) | 8.4×10⁻³ | 2 h | 5.0 ms | 150 KB
CNN (5 layer) | 4.2×10⁻³ | 8 h | 12 ms | 500 KB
LSTM | 2.1×10⁻³ | 24 h | 20 ms | 1.2 MB
Transformer | 1.3×10⁻³ | 48 h | 35 ms | 4.5 MB

The Transformer achieves the best BER but requires 450× more memory than MMSE, which is potentially prohibitive for resource-constrained sensors yet acceptable for AUVs with greater computational capacity. Such trade-offs become clear only through integrated comparison.
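The kind of constraint-driven choice this benchmark enables can be sketched as follows. The rows reproduce Table IX; the selection helper and the memory/latency budgets are our illustration only:

```python
# Benchmark rows from Table IX: (name, BER at 0 dB, inference ms, memory KB)
bench = [
    ("MMSE Equalizer", 3.2e-2,  0.5,   10),
    ("RLS Adaptive",   1.8e-2,  2.0,   25),
    ("MLP (3 layer)",  8.4e-3,  5.0,  150),
    ("CNN (5 layer)",  4.2e-3, 12.0,  500),
    ("LSTM",           2.1e-3, 20.0, 1200),
    ("Transformer",    1.3e-3, 35.0, 4500),
]

def select_equaliser(max_mem_kb, max_infer_ms):
    """Lowest-BER algorithm that fits the node's memory and latency budget."""
    feasible = [b for b in bench if b[3] <= max_mem_kb and b[2] <= max_infer_ms]
    return min(feasible, key=lambda b: b[1])[0] if feasible else None

# A 256 KB sensor node with a 10 ms deadline can afford the MLP, while an
# AUV with megabytes of RAM and a relaxed deadline can run the Transformer.
sensor_choice = select_equaliser(max_mem_kb=256, max_infer_ms=10)
auv_choice = select_equaliser(max_mem_kb=8000, max_infer_ms=50)
```

Filtering by feasibility before optimising accuracy mirrors how a practitioner would actually read the table: constraints first, BER second.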
E. Impact and Practical Significance

The unique contributions of this survey translate into tangible impacts for the underwater communications community across research, industry, and interdisciplinary collaboration.

1) Accelerating Research Progress: By providing comprehensive literature coverage with standardised comparisons, we eliminate months of literature review for new researchers. Our citation network analysis identifies seminal papers, active research groups, and emerging trends, helping researchers position their work effectively.

The quantitative performance repository establishes clear baselines, ending the frustration of comparing against vague claims. Researchers can immediately identify state-of-the-art performance for their specific problem, focusing effort on meaningful improvements rather than rediscovering known solutions.

Our identification of open problems with suggested approaches provides concrete starting points for PhD students and research proposals. Instead of a vague "improve underwater communications with ML," we offer specific hypotheses: "Investigate whether vision transformers' global attention mechanisms can overcome the locality limitations of CNNs for long-range acoustic channel prediction."

2) Enabling Industrial Deployment: Our implementation guidelines bridge the academic-industrial gap that has limited ML adoption in operational underwater systems. Companies can assess feasibility before committing resources, understanding computational requirements, training data needs, and expected performance gains.

Documented pitfalls save expensive trial-and-error in underwater deployments where mistakes cost tens of thousands of dollars per day of ship time. Knowing that "models trained in tanks fail in open ocean due to boundary reflections" prevents wasted deployments and guides data collection strategies [17].

Staged deployment procedures reduce risk for safety-critical applications.
Organisations can follow our progression from simulation to tank testing to limited trials, with specific metrics and rollback triggers at each stage.

3) Fostering Interdisciplinary Collaboration: By explaining ML concepts through underwater examples and underwater challenges through ML solutions, we create a common language for interdisciplinary collaboration:
• Oceanographers contribute environmental models improving physics-informed neural networks
• Signal processors provide channel models enhancing simulation-based training
• Network engineers identify protocol bottlenecks that ML could address
• ML researchers discover challenging problems with real-world impact

This cross-pollination has already sparked new research directions, with oceanographers adopting ML tools and ML researchers considering physical constraints previously ignored.

Impact Summary
• Research: Accelerated literature review, clear baselines, concrete hypotheses
• Industry: Feasibility assessment, deployment guidance, risk reduction
• Collaboration: Common vocabulary bridging ML, oceanography, and networking

F. Conclusion of Comparison

This comprehensive comparison demonstrates that our survey fills critical gaps in existing literature while providing unique value through integrated analysis, quantitative benchmarking, implementation guidance, and future-oriented insights. Figure 7 visualises our survey's positioning relative to existing works across two key dimensions.

Fig. 7. Positioning of this survey relative to existing works (Alsheikh '14, Khisa '21, Jahanbakht '21, Christensen '22, Mohsan '22, Wang '22) across ML depth (from basic algorithms to advanced paradigms) and IoUT breadth (from single-application to full protocol stack). Existing works are either comprehensive but ML-light or ML-deep but narrow in scope; our survey uniquely occupies the region combining comprehensive IoUT coverage with advanced ML treatment.
TABLE X
FINAL COMPARISON: THIS SURVEY VS. CLOSEST EXISTING WORKS

Feature                 | This | Jan. | Khi. | Chr.
Full protocol stack     | ✓    | –    | ∼    | –
Deep learning coverage  | ✓    | ∼    | –    | ∼
Reinforcement learning  | ✓    | –    | ∼    | ✓
Federated learning      | ✓    | –    |      |
Cross-layer analysis    | ✓    | –    |      |
Quantitative repository | ✓    | ∼    |      |
Implementation details  | ✓    | –    | ∼    |
Deployment guidance     | ✓    | –    | ∼    |
Future roadmap          | ✓    | ∼    |      |
Tutorial content        | ✓    | ∼    | –    |

Jan. = Jahanbakht '21, Khi. = Khisa '21, Chr. = Christensen '22; ✓ = Full, ∼ = Partial, – = None

Unlike previous works that address narrow aspects of ML or underwater communications separately, we provide the first complete treatment of ML-enabled IoUT systems from theory through deployment. Our contributions extend beyond cataloguing existing work to synthesising insights that emerge only from comprehensive cross-layer, cross-domain analysis. Table X summarises the distinguishing features of our survey compared to the closest existing works, demonstrating comprehensive coverage across all evaluation criteria.

The practical guidelines, quantitative comparisons, and implementation details transform academic research into deployable solutions, accelerating progress in this critical field. The subsequent sections leverage this unique positioning to provide the core technical content: a systematic, layer-by-layer analysis of ML applications in IoUT that demonstrates these value propositions through detailed technical discussions, quantitative results, and practical lessons learned from real deployments.

Key Takeaway: Survey Positioning
This survey uniquely combines:
• Breadth: Complete protocol stack coverage (PHY to APP)
• Depth: Advanced ML paradigms (DRL, FL, GNN, PINNs)
• Practicality: Implementation-ready specifications
• Timeliness: Literature through 2025 with 2026 vision
No existing survey achieves this combination, making this work essential reading for researchers, practitioners, and students entering the ML-IoUT field.

IV.
ML APPLICATIONS IN IOUT: LAYER-BY-LAYER ANALYSIS

This section presents a comprehensive technical analysis of ML applications across the IoUT protocol stack. We demonstrate how intelligent algorithms address fundamental challenges at each layer while enabling capabilities previously impossible with traditional approaches [3], [33]. Our layer-by-layer organisation facilitates both focused exploration of specific challenges and holistic understanding of system-wide optimisations. Figure 8 illustrates the mapping of ML techniques to protocol stack layers.

A. Physical Layer Applications

The physical layer forms the foundation of underwater communications, responsible for signal transmission, reception, and initial processing. The unique characteristics of underwater channels (severe frequency-dependent attenuation, extensive multipath with delays exceeding 100 ms, and time-varying Doppler shifts) create challenges that traditional signal processing struggles to address [21], [22]. ML transforms these challenges into opportunities, learning robust representations that adapt to environmental dynamics while extracting maximum information from degraded signals [147].

1) Localisation and Tracking: Underwater localisation represents a fundamental challenge in IoUT systems [145]. The absence of GPS signals underwater necessitates alternative positioning methods, while ocean currents induce continuous node drift, and acoustic path bending due to temperature-salinity variations degrades ranging accuracy [148]. Traditional geometric methods based on Time-of-Arrival (ToA) or Time-Difference-of-Arrival (TDoA) fail in Non-Line-of-Sight (NLOS) conditions and acoustic shadow zones [149].

Challenge Formulation: The localisation problem requires estimating unknown node positions $x_i \in \mathbb{R}^3$ from a set of anchor nodes at known positions $a_j$, $j = 1, \dots, M$.
Traditional least squares optimisation minimises:

$\hat{x}_i = \arg\min_{x_i} \sum_{j=1}^{M} w_{ij} \left( d_{ij} - \hat{d}_{ij}(x_i) \right)^2$, (75)

where $\hat{x}_i$ is the estimated position, $M$ is the number of anchors, $w_{ij}$ are weights (often set to 1 or based on signal quality), $d_{ij}$ denotes measured distance, and $\hat{d}_{ij}(x_i) = \|x_i - a_j\|$ is the estimated Euclidean distance. This formulation fails underwater due to non-Euclidean propagation paths (curved sound rays in stratified media) and outlier contamination from multipath arrivals [55]. Typical least squares solutions yield 10–50 m errors, insufficient for precision tasks such as pipeline inspection or AUV docking.

k-Nearest Neighbours Fingerprinting: Fingerprinting reframes localisation as a pattern recognition problem [69]. During an offline phase, sensors at known locations record acoustic fingerprints $f_i = [P_1, \dots, P_M, \tau_1, \dots, \tau_M, \sigma_1, \dots, \sigma_M]^\top$ comprising received power $P$, propagation delay $\tau$, and delay spread $\sigma$ from each anchor. During online localisation, a query fingerprint $f_q$ is matched against the database using the Mahalanobis distance to account for feature correlations:

$d_M(f_q, f_i) = \sqrt{(f_q - f_i)^\top C^{-1} (f_q - f_i)}$, (76)

where $C$ is the covariance matrix of fingerprint features. Position estimation uses inverse-distance-weighted averaging of the $k$ nearest neighbours:

$\hat{x}_q = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}, \quad w_i = \frac{1}{d_M(f_q, f_i) + \varepsilon}$, (77)

where $\hat{x}_q$ is the estimated query position, $x_i$ are the positions of the $k$ nearest training samples, $w_i$ are inverse-distance weights, and $\varepsilon$ is a small constant preventing division by zero. In a 100 m × 100 m harbour deployment with $k = 10$, fingerprinting achieves 1.2 m mean localisation error compared to 8.5 m for trilateration, a 7× improvement enabled by implicitly encoding multipath characteristics into learned fingerprints [120].
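The fingerprint matching of Eqs. (76)–(77) can be sketched in a few lines of NumPy. The database below is synthetic (a random linear map from positions to features, not measured acoustic data), so it only illustrates the mechanics of Mahalanobis matching and inverse-distance-weighted averaging:

```python
import numpy as np

def knn_fingerprint_localise(f_query, fingerprints, positions, k=10, eps=1e-6):
    """Estimate a query position from stored acoustic fingerprints.

    Implements Mahalanobis-distance matching (Eq. 76) and
    inverse-distance-weighted averaging of the k nearest
    neighbours (Eq. 77).
    """
    # Covariance of fingerprint features, regularised for invertibility
    C = np.cov(fingerprints, rowvar=False) + 1e-3 * np.eye(fingerprints.shape[1])
    C_inv = np.linalg.inv(C)
    diff = fingerprints - f_query
    d_M = np.sqrt(np.einsum("ij,jk,ik->i", diff, C_inv, diff))  # Mahalanobis distances
    nearest = np.argsort(d_M)[:k]           # indices of the k closest fingerprints
    w = 1.0 / (d_M[nearest] + eps)          # inverse-distance weights
    return (w[:, None] * positions[nearest]).sum(axis=0) / w.sum()

# Toy database: 200 nine-feature fingerprints at known 3-D positions
rng = np.random.default_rng(0)
positions = rng.uniform(0, 100, size=(200, 3))
fingerprints = positions @ rng.normal(size=(3, 9)) + 0.1 * rng.normal(size=(200, 9))
query = fingerprints[42] + 0.05 * rng.normal(size=9)
estimate = knn_fingerprint_localise(query, fingerprints, positions)
print(estimate)  # close to positions[42]
```

In a real deployment the fingerprint database would come from the offline survey phase described above, and the covariance $C$ would be estimated from repeated measurements at each calibration point.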
Deep Learning for Robust Localisation: CNN-based localisation networks process raw multichannel acoustic signals to learn hierarchical features invariant to environmental shifts [60]. The architecture directly maps received waveforms to 3D coordinates without explicit feature engineering:
• Input: Multi-receiver signal matrix ($N_{rx} \times N_{samples}$)
• Feature Extraction: Conv1D layers (64, 128, 256 filters, kernel size 10) with BatchNorm and MaxPool
• Aggregation: Global average pooling across receivers
• Regression: Dense layers (512, 256 units) with Dropout (0.3)
• Output: 3D position $[\hat{x}, \hat{y}, \hat{z}]$
The network employs a multi-objective loss function balancing accuracy, uncertainty estimation, and physical constraints:

$L = \lambda_1 L_{MSE} + \lambda_2 L_{var} + \lambda_3 L_{bound}$, (78)

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are weighting parameters, $L_{MSE} = \|x - \hat{x}\|^2$ ensures accuracy, $L_{var}$ encourages calibrated uncertainty estimates, and $L_{bound}$ penalises predictions outside the deployment region. Data augmentation through time shifts, noise injection, and simulated multipath variations improves robustness, achieving 0.8 m accuracy for AUV docking operations [55].

Reinforcement Learning for Active Localisation: AUVs can leverage mobility to actively improve localisation accuracy by moving to positions that maximise information gain [150]. This is formulated as a Partially Observable Markov Decision Process (POMDP) where the belief state $b(s_t)$ represents uncertainty about position. The reward function penalises both uncertainty and energy expenditure:

$r_t = -H(b(s_t)) - \lambda E_{move}$, (79)

[Fig. 8: Mapping of ML techniques to IoUT protocol stack layers: physical layer (CNN, LSTM, DQN, k-NN); MAC layer (MARL, DRL, FL); network layer (Q-learning, DQN, GNN); transport layer (DRL, GNN, Transformer); application layer (CNN, GNN, VAE, fusion). Each layer employs specialised algorithms suited to its unique challenges, while cross-layer optimisation enables holistic system improvement.]
TABLE XI
LOCALISATION PERFORMANCE COMPARISON IN A 100 m × 100 m × 50 m VOLUME

Method          | Mean Err. | 95% Err. | Latency | Robustness
Trilateration   | 8.5 m     | 22 m     | 10 ms   | Poor
Weighted LS     | 6.2 m     | 18 m     | 25 ms   | Fair
Particle Filter | 3.8 m     | 11 m     | 200 ms  | Good
k-NN (k = 10)   | 1.2 m     | 3.5 m    | 15 ms   | Excellent
CNN             | 0.8 m     | 2.2 m    | 50 ms   | Excellent
DQN Active      | 0.5 m     | 1.5 m    | 100 ms  | Excellent

where $H(\cdot)$ denotes entropy (measuring belief state uncertainty), $b(s_t)$ is the belief state at time $t$, $\lambda$ is a weighting parameter, and $E_{move}$ represents propulsion energy. A DQN-based active localisation agent learns to navigate toward acoustic "sweet spots" with favourable geometry, achieving 0.5 m accuracy while consuming 40% less energy than systematic grid surveys [38].

Performance Comparison: Table XI summarises localisation performance across methods. ML approaches demonstrate superior robustness in multipath-rich environments and sparse anchor deployments.

2) Channel Estimation and Prediction: Accurate channel state information (CSI) enables optimal signal processing, adaptive modulation selection, and power control [22]. However, underwater acoustic channels exhibit extreme complexity: impulse responses spanning 100+ ms due to multipath propagation, coherence times of seconds to minutes, and Doppler spreads exceeding symbol rates in mobile scenarios [34]. Traditional pilot-based least squares estimation,

$\hat{h} = (X^H X)^{-1} X^H y$, (80)

where $\hat{h}$ is the estimated channel impulse response, $X$ is the known pilot matrix, $(\cdot)^H$ denotes Hermitian transpose, and $y$ is the received signal vector, requires excessive pilot overhead (10–20% of transmission time) and suffers from noise amplification at low SNR [76].

CNN-Based Channel Estimation: Convolutional neural networks learn to extract channel information from received spectrograms without explicit pilots [151].
The network architecture processes time-frequency representations:

$\hat{H} = f_{CNN}(Y; \theta)$, (81)

where $\hat{H}$ is the estimated channel frequency response, $f_{CNN}$ is the CNN function, $Y$ is the received signal spectrogram, and $\theta$ denotes learned parameters. Training uses a combined loss function:

$L = \lambda_1 \|H_{true} - \hat{H}\|_F^2 + \lambda_2 \|Y - X \odot \hat{H}\|_F^2 + \lambda_3 \mathrm{TV}(\hat{H})$, (82)

where $\lambda_1, \lambda_2, \lambda_3$ are weighting coefficients, $\|\cdot\|_F$ denotes the Frobenius norm, $H_{true}$ is the true channel, $\odot$ denotes the element-wise (Hadamard) product, and $\mathrm{TV}(\cdot)$ is the Total Variation regularisation promoting smooth channel evolution. The first term ensures estimation accuracy whilst the second enforces consistency with observations. This approach achieves 16% lower MSE than pilot-based methods with only 5 ms inference latency [34].

LSTM Networks for Channel Prediction: Long Short-Term Memory networks capture temporal correlations in channel evolution, enabling prediction of future channel states [57]. The hidden state $h_t$ encodes channel history:

$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$, (83)
$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$, (84)
$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$, (85)
$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$, (86)
$h_t = o_t \odot \tanh(c_t)$, (87)

where the output gate $o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$ is defined analogously to the forget gate $f_t$ and input gate $i_t$. Multi-step prediction unfolds the network to forecast channel states $\hat{H}_{t+\Delta}$ for $\Delta \in \{1, 5, 10\}$ seconds ahead. This enables proactive adaptation: adjusting modulation schemes before channel degradation occurs rather than reacting after errors accumulate [33].

3) Modulation Recognition and Adaptive Transmission: Automatic Modulation Classification (AMC) enables cognitive underwater systems to identify transmission schemes for spectrum sensing, interference management, and adaptive communication [42]. Traditional likelihood-based classifiers require accurate channel models unavailable underwater.

CNN-Based Modulation Classification: Deep learning achieves robust classification by learning discriminative features directly from received signals [147].
The network architecture processes In-phase/Quadrature (I/Q) samples:
• Input: Complex baseband samples $(I, Q) \in \mathbb{R}^{2 \times N}$
• Feature Extraction: Parallel Conv1D branches for temporal and spectral features
• Classification: Dense layers with softmax output
• Output: Probability distribution over modulation schemes
At SNR = 0 dB, CNN classifiers achieve 96% accuracy across BPSK, QPSK, 8-PSK, 16-QAM, and 64-QAM, compared to 75% for traditional cyclostationary feature detectors [42]. The learned features implicitly capture modulation-specific characteristics robust to channel distortions.

Reinforcement Learning for Adaptive Modulation: RL agents learn optimal modulation and coding scheme (MCS) selection policies that maximise throughput whilst meeting reliability constraints [35]. The state captures channel and system conditions:

$s_t = [\mathrm{SNR}_t, \sigma_{\tau,t}, f_{D,t}, \mathrm{BER}_{t-1}, Q_t]$, (88)

where $\mathrm{SNR}_t$ is the signal-to-noise ratio, $\sigma_\tau$ is delay spread, $f_D$ is Doppler spread, $\mathrm{BER}_{t-1}$ is the bit error rate from the previous time step, and $Q_t$ is queue length. The action selects from the available MCS options:

$a_t \in \{\text{BPSK-1/2}, \text{QPSK-1/2}, \text{QPSK-3/4}, \dots, \text{64QAM-3/4}\}$. (89)

The reward balances throughput and reliability:

$r_t = \eta(a_t) \cdot \mathbb{I}[\mathrm{BER}_t < \mathrm{BER}_{th}] - \lambda \cdot \mathbb{I}[\mathrm{BER}_t \geq \mathrm{BER}_{th}]$, (90)

where $\eta(a_t)$ is the spectral efficiency of the selected MCS, $\mathbb{I}[\cdot]$ is the indicator function (1 if the condition is true, 0 otherwise), $\mathrm{BER}_{th}$ is the target error rate threshold, and $\lambda$ is a penalty weight. DQN-based AMC achieves 147% throughput improvement over fixed modulation by learning to exploit favourable channel periods whilst gracefully degrading during fading events [152].
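The reward of Eq. (90) is simple enough to state directly in code. The spectral-efficiency table below uses assumed illustrative values (the paper does not specify a numeric MCS table), but the structure (efficiency when the BER target is met, a fixed penalty otherwise) follows the equation:

```python
# Reward for RL-based adaptive modulation (Eq. 90): spectral efficiency
# eta(a_t) when the BER target is met, a penalty -lambda otherwise.
SPECTRAL_EFF = {            # bits/s/Hz per MCS -- illustrative values
    "BPSK-1/2": 0.5, "QPSK-1/2": 1.0, "QPSK-3/4": 1.5,
    "16QAM-1/2": 2.0, "64QAM-3/4": 4.5,
}

def amc_reward(mcs, ber, ber_th=1e-3, penalty=1.0):
    """r_t = eta(a_t) * I[BER < BER_th] - lambda * I[BER >= BER_th]."""
    if ber < ber_th:
        return SPECTRAL_EFF[mcs]
    return -penalty

print(amc_reward("64QAM-3/4", 1e-4))   # good channel: reward is 4.5
print(amc_reward("64QAM-3/4", 1e-2))   # BER target violated: reward is -1.0
```

The hard indicator makes the reward discontinuous at the BER threshold, which is exactly what pushes the learned policy to back off to a lower-order MCS as soon as the channel deteriorates.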
Lessons Learned – Physical Layer
Localisation Insights:
• Fingerprinting over geometry: k-NN fingerprinting achieves 7× better accuracy than trilateration in NLOS conditions
• Active localisation: Mobility enables 40% energy reduction whilst improving accuracy through information-driven positioning
• Multimodal fusion: Combining acoustic ranging with depth sensors reduces uncertainty by 60%
Channel Estimation Strategy:
• Prediction over reaction: LSTM-based channel prediction enables proactive adaptation 5–10 seconds ahead
• Pilot reduction: CNN-based estimation achieves 16% lower MSE with only 5% pilot overhead (vs. 15–20% for LS)
• Physics-informed learning: Incorporating wave equation constraints improves generalisation to untrained depths
Adaptive Modulation Principles:
• Dynamic range: RL-based AMC exploits a 3–5× wider SNR operating range than fixed schemes
• Multi-objective rewards: Balance throughput, reliability, and energy, not throughput alone
• Delayed feedback: Account for propagation delay in Q-learning updates to prevent divergence

B. MAC Layer Applications

The Medium Access Control (MAC) layer coordinates channel access among competing nodes, a challenge exacerbated underwater by propagation delays exceeding 1 second over kilometre distances [20]. Traditional contention protocols like CSMA suffer catastrophic performance degradation: while a terrestrial node waits microseconds to detect carrier, underwater nodes wait seconds, during which multiple transmissions may collide. Reservation-based protocols require complex handshaking that consumes precious channel time [18]. ML enables protocol adaptation that exploits environmental patterns and learns coordination strategies impossible to derive analytically.

1) Intelligent Channel Access: Q-Learning for Adaptive Backoff: Q-learning transforms the backoff mechanism from random waiting to intelligent scheduling based on learnt traffic patterns [43].
The state captures local channel observations:

$s_t = [Q_{len}, N_{busy}, T_{idle}, C_{recent}, \hat{\rho}]$, (91)

where $s_t$ is the state vector at time $t$, $Q_{len}$ is queue length, $N_{busy}$ counts busy channel detections, $T_{idle}$ measures idle duration, $C_{recent}$ counts recent collisions, and $\hat{\rho}$ estimates channel utilisation. The action space defines backoff durations:

$a_t \in \{0, W, 2W, 4W, 8W, 16W\}$, (92)

where $a_t$ is the action (backoff duration) at time $t$, and $W$ is the base contention window. The Q-value update incorporates delayed feedback accounting for propagation:

$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+\tau_p} + \gamma \max_a Q(s_{t+\tau_p}, a) - Q(s_t, a_t) \right]$, (93)

where $\alpha$ is the learning rate, $r_{t+\tau_p}$ is the delayed reward, $\gamma$ is the discount factor, $\tau_p$ represents the round-trip propagation delay (in time steps), and $s_{t+\tau_p}$ is the state after the propagation delay. The reward function encourages successful transmission while penalising collisions and delays:

$r_t = \begin{cases} +10 & \text{successful transmission} \\ -5 & \text{collision detected} \\ -1 & \text{per slot waited} \\ -P_{tx}/P_{max} & \text{energy penalty} \end{cases}$ (94)

Through exploration, nodes discover optimal strategies: aggressive transmission during quiet periods, conservative backoff during high traffic, and power adjustment based on channel quality [35]. Experimental deployments demonstrate Q-learning MAC protocols achieving 150–200% throughput improvement over fixed CSMA in dynamic underwater networks [44].

Multi-Agent Reinforcement Learning for Distributed Coordination: Single-agent approaches treat other nodes as part of the environment, missing opportunities for explicit coordination. Multi-Agent RL (MARL) enables nodes to learn complementary policies achieving network-wide optimisation without centralised control [9]. Each agent $i$ models the joint policy space:

$\pi_i(a_i | s_i, \pi_{-i})$, (95)

where $\pi_{-i}$ represents the policies of other agents.
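The delayed update of Eq. (93) can be sketched with a tiny Q-table. The two-state channel model and the ±10/−5 rewards below are simplified placeholders (the full state and reward of Eqs. 91–94 are richer); the point is that the update is applied only when the delayed feedback returns, not when the action is taken:

```python
import random
from collections import defaultdict

# Q-learning backoff with delayed rewards (Eq. 93): feedback for an
# action arrives after the round-trip propagation delay, so the update
# bootstraps from the state observed at that later time.
ACTIONS = [0, 1, 2, 4, 8, 16]            # backoff in multiples of W
ALPHA, GAMMA = 0.1, 0.9
Q = defaultdict(float)                   # Q[(state, action)]

def choose(state, eps=0.1):
    """Epsilon-greedy action selection over backoff durations."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def on_feedback(state, action, reward, state_after_delay):
    """Apply the delayed Q-update once the outcome is known."""
    best_next = max(Q[(state_after_delay, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

random.seed(3)
# Toy dynamics: short backoff pays off on a quiet channel, long backoff
# on a busy one; the reward is credited one delayed step later.
for _ in range(5000):
    state = random.choice(["quiet", "busy"])
    action = choose(state)
    reward = 10 if (state == "quiet") == (action <= 2) else -5
    on_feedback(state, action, reward, random.choice(["quiet", "busy"]))

print(max(ACTIONS, key=lambda a: Q[("quiet", a)]))  # learns a short backoff
print(max(ACTIONS, key=lambda a: Q[("busy", a)]))   # learns a long backoff
```

A real protocol would buffer (state, action) pairs until their acknowledgements arrive, exactly as the $r_{t+\tau_p}$ and $s_{t+\tau_p}$ terms in Eq. (93) suggest.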
The multi-agent Q-function captures coordination value:

$Q_{joint}(s, a_1, \dots, a_N) = \sum_{i=1}^{N} Q_i(s, a_i) + V_{coord}(s, a_1, \dots, a_N)$, (96)

where $V_{coord}$ captures synergies between agents' actions. Decentralised training with periodic synchronisation follows four phases:
1) Local learning: Each node updates its policy based on local observations
2) Policy sharing: Nodes broadcast compressed policy parameters
3) Consensus update: Weighted averaging based on performance metrics
4) Exploration coordination: Synchronised exploration prevents conflicting strategies
Communication-efficient policy sharing uses parameter quantisation, reducing 32-bit floats to 8-bit integers with minimal performance degradation. The coordination mechanism learns implicit TDMA-like patterns: nodes discover non-overlapping transmission windows without explicit slot assignment [153]. Performance analysis shows 148% throughput improvement over independent learners while maintaining fairness (Jain's index > 0.85).

2) Resource Allocation: Underwater networks face severe resource constraints: limited bandwidth (typically 10–100 kHz), high power consumption (10–50 W for acoustic modems), and finite battery capacity (100–1000 Wh) [27]. Traditional static allocation wastes resources on idle nodes while starving active ones. ML enables dynamic, predictive allocation adapting to traffic patterns and environmental conditions.

Deep Reinforcement Learning for Power Allocation: Power control must balance conflicting objectives: higher power improves reliability but increases interference and energy consumption [95]. Deep RL learns optimal power allocation policies considering network-wide effects. The state space encompasses local and network observations:

$s_t = [h_t, q_t, E_t, \rho_t]$, (97)

where $h_t$ captures channel conditions, $q_t$ traffic state, $E_t$ energy status, and $\rho_t$ network topology information.
The action space controls transmission power over discrete levels:

$P_t \in \{0, 0.1, 0.5, 1, 2, 5, 10, 20, 50\}$ Watts. (98)

The reward function captures multiple objectives:

$r_t = \lambda_1 \cdot R_{success} - \lambda_2 \cdot E_{consumed} - \lambda_3 \cdot I_{caused} + \lambda_4 \cdot U_{fairness}$, (99)

where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are weighting coefficients, $R_{success}$ indicates successful transmission, $E_{consumed}$ measures normalised energy consumption, $I_{caused}$ quantifies interference to other transmissions, and $U_{fairness}$ ensures equitable resource distribution. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm handles continuous power control [98]:

$\mu(s; \theta_\mu): S \to [0, P_{max}]$, (100)
$Q_1(s, a; \theta_{Q_1}), \quad Q_2(s, a; \theta_{Q_2})$. (101)

Policy updates use the minimum Q-value to prevent overestimation:

$\nabla_{\theta_\mu} J = \mathbb{E}_{s \sim \rho^\beta} \left[ \nabla_a Q_1(s, a) \big|_{a = \mu(s)} \nabla_{\theta_\mu} \mu(s) \right]$. (102)

Experimental results from a 30-node network demonstrate:
• Energy efficiency: 52% improvement over fixed power
• Network lifetime: Extended from 15 to 41 days
• Packet delivery ratio: Maintained at 94% despite power reduction
• Interference reduction: 38% decrease in collision rate

Federated Learning for Privacy-Preserving Optimisation: Military and commercial networks cannot share sensitive traffic patterns but could benefit from collaborative learning [40], [41]. Federated learning enables distributed resource optimisation without data sharing. Local model training at each node:

$\theta_i^{(t+1)} = \theta_i^{(t)} - \eta \nabla L_i(\theta_i^{(t)}; D_i)$, (103)

where $D_i$ represents private local data. Secure aggregation using differential privacy adds calibrated noise:

$\theta_i^{noisy} = \theta_i + \mathcal{N}(0, \sigma^2 S_f^2 I)$, (104)

where the sensitivity $S_f = \max_{D, D'} \|\theta(D) - \theta(D')\|$ [116]. Communication-efficient updates transmit only significant changes:

$\Delta_i^{sparse} = \mathrm{TopK}(\theta_i^{(t+1)} - \theta_{global}^{(t)}, k)$, (105)

where $k = 0.01 \cdot |\theta|$ transmits 1% of parameters.
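The two communication-saving steps of Eqs. (104)–(105), Gaussian-mechanism perturbation followed by top-k sparsification, can be sketched as below. The noise scale and parameter counts are illustrative, not the paper's deployment values:

```python
import numpy as np

rng = np.random.default_rng(4)

def dp_noise(theta, sigma, sensitivity):
    """Gaussian-mechanism perturbation of a parameter vector (Eq. 104)."""
    return theta + rng.normal(0.0, sigma * sensitivity, size=theta.shape)

def top_k_sparsify(delta, fraction=0.01):
    """Keep only the largest-magnitude 'fraction' of update entries (Eq. 105)."""
    k = max(1, int(fraction * delta.size))
    idx = np.argsort(np.abs(delta))[-k:]    # indices of the top-k magnitudes
    sparse = np.zeros_like(delta)
    sparse[idx] = delta[idx]
    return sparse

theta_local = rng.normal(size=10_000)       # one node's updated model
theta_global = rng.normal(size=10_000)      # current global model
delta = dp_noise(theta_local, sigma=0.1, sensitivity=1.0) - theta_global
sparse = top_k_sparsify(delta, fraction=0.01)
print(np.count_nonzero(sparse))             # only 100 of 10000 entries are sent
```

Transmitting the non-zero indices and values of `sparse` instead of the full vector is what yields the roughly 99% reduction in uplink traffic implied by $k = 0.01 \cdot |\theta|$.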
Field deployment with 5 organisations (military, commercial, research) demonstrates:
• Achieves 91% of centralised training performance
• Maintains privacy: no organisation can infer others' traffic patterns
• Reduces communication overhead by 98% through compression
• Adapts to heterogeneous hardware and update schedules

Lessons Learned – MAC Layer
Protocol Design Considerations:
• Delayed feedback: Propagation delays require patient learning; rewards arrive seconds after actions
• Spatial variations: Location-specific policies outperform universal ones
• Temporal patterns: Exploit predictable patterns (tides, shipping schedules) for coordination
• Energy-awareness: Include energy in reward functions; throughput alone depletes batteries rapidly
Implementation Pitfalls:
• Exploration overhead: Random exploration wastes energy; use informed exploration with domain knowledge
• Fairness neglect: Pure efficiency optimisation starves edge nodes; explicitly reward fairness
• Hidden terminals: Partial observability causes conflicting learning; share policies periodically

C. Network Layer Applications

Building upon the MAC layer's intelligent channel access mechanisms, the network layer manages end-to-end data delivery across multi-hop underwater networks, addressing challenges of dynamic topology, energy-constrained routing, and unreliable links [20]. Traditional routing protocols fail underwater due to rapid topology changes from node drift, position uncertainty without GPS, and the inability to maintain consistent routing tables under long propagation delays [29]. ML transforms routing from predetermined paths to intelligent forwarding decisions adapting to network dynamics.

1) ML-Enhanced Routing: Underwater routing faces unique challenges: three-dimensional networks where vertical and horizontal distances differ greatly, void regions where no forwarding nodes exist, and energy holes where frequently-used relays die prematurely [44].
ML approaches learn to navigate these challenges through experience rather than relying on idealised models.

Q-Learning for Opportunistic Routing: Q-learning enables each node to learn optimal forwarding decisions without global topology knowledge [154]. The state representation for routing decisions is

$s_t = [p_{dest}, n_{avail}, E_{res}, Q_{len}, z]$, (106)

where $p_{dest}$ encodes the destination, $n_{avail}$ lists reachable neighbours, $E_{res}$ is residual energy, $Q_{len}$ is queue occupancy, and $z$ is depth. The action space comprises forwarding candidates:

$A = \{n_1, n_2, \dots, n_k, \text{broadcast}, \text{hold}\}$, (107)

where $n_i$ represents forwarding to neighbour $i$. The reward function balances multiple routing objectives:

$r = \begin{cases} R_{delivery} - \lambda_1 \cdot \text{hops} - \lambda_2 \cdot \text{delay} & \text{if delivered} \\ -P_{drop} & \text{if dropped} \\ -E_{fwd}/E_{rem} & \text{energy cost} \end{cases}$ (108)

Q-value initialisation uses heuristic knowledge to accelerate convergence:

$Q_0(s, a) = \frac{1}{1 + d(a, \text{dest})} - \lambda \cdot \frac{E_{tx}(a)}{E_{rem}(a)}$, (109)

where $d(a, \text{dest})$ estimates the distance through neighbour $a$ and $\lambda$ is an energy penalty weight [155]. Void region handling requires special consideration:
• Void detection: No positive Q-values for any neighbour
• Recovery mode: Switch to depth-first search or greedy forwarding
• Backpressure: Propagate negative rewards upstream
• Surface relay: Use surface reflection as last resort
After 5000 packet transmissions, Q-routing demonstrates:
• Packet delivery ratio: 94% (vs. 76% for geographic routing)
• Average path length: 4.2 hops (optimal: 3.8 hops)
• Energy balance: Standard deviation of node energy reduced by 61%
• Void recovery: 89% success rate in sparse networks

Deep Q-Networks for Large-Scale Networks: Tabular Q-learning becomes intractable for networks with hundreds of nodes and destinations. DQN approximates Q-values using neural networks, enabling routing in large-scale deployments [87].
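The heuristic initialisation of Eq. (109) is easy to illustrate with a toy neighbour table. The neighbour distances and energy figures below are made-up examples; they show how the energy term steers initial preference away from a nearly drained relay even when it is geographically closest:

```python
# Heuristic Q-value initialisation for opportunistic routing (Eq. 109):
# favour neighbours closer to the destination, penalise energy-poor relays.
def q_init(dist_to_dest, e_tx, e_rem, lam=0.5):
    """Q0 = 1/(1 + d(a, dest)) - lambda * E_tx(a) / E_rem(a)."""
    return 1.0 / (1.0 + dist_to_dest) - lam * (e_tx / e_rem)

neighbours = {                                            # illustrative values
    "n1": dict(dist_to_dest=120.0, e_tx=2.0, e_rem=800.0),  # far, fresh battery
    "n2": dict(dist_to_dest=40.0,  e_tx=2.0, e_rem=500.0),  # close, healthy
    "n3": dict(dist_to_dest=35.0,  e_tx=2.0, e_rem=20.0),   # closest, nearly drained
}
q0 = {name: q_init(**params) for name, params in neighbours.items()}
best = max(q0, key=q0.get)
print(best)   # -> n2: the drained node n3 is avoided despite being closest
```

Starting Q-learning from these values rather than zeros means the first packets already avoid obvious energy holes, which is where the faster convergence claimed for Eq. (109) comes from.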
The state embedding captures network context:

$s = [e_{packet}, e_{neighbours}, e_{history}, e_{env}]$, (110)

where the embeddings are learned representations:
• $e_{packet}$: Packet header encoding (destination, TTL, priority)
• $e_{neighbours}$: Graph neural network embedding of local topology
• $e_{history}$: LSTM encoding of recent routing decisions
• $e_{env}$: Environmental features (depth, temperature, time)
The DQN architecture uses attention mechanisms for neighbour selection [156]:

$\alpha_{ij} = \frac{\exp(f_{att}(h_i, h_j))}{\sum_{k \in N(i)} \exp(f_{att}(h_i, h_k))}$, (111)

where $f_{att}$ is a learned attention function [131]. Curriculum learning stages training complexity:
1) Static topology, single destination
2) Static topology, multiple destinations
3) Mobile nodes, single destination
4) Mobile nodes, multiple destinations
5) Adversarial conditions (node failures, congestion)

2) Intelligent Clustering: Hierarchical network organisation through clustering reduces communication overhead and extends network lifetime [78], [84]. ML-based clustering adapts to underwater-specific constraints: depth-stratified communication ranges, energy heterogeneity from harvesting, and mobility patterns from currents.

K-Means with Energy Awareness: Standard k-means clustering minimises intra-cluster distance:

$\min_{\mu_k} \sum_{k=1}^{K} \sum_{i \in C_k} \|x_i - \mu_k\|^2$. (112)

For underwater networks, the distance metric incorporates energy and communication quality [80]:

$d_{UW}(i, j) = \sqrt{w_d \cdot d_{ij}^2 + w_E \cdot (E_{max} - E_j)^2 + w_q \cdot (1 - q_{ij})^2}$, (113)

where $d_{ij}$ is Euclidean distance, $E_j$ is residual energy, $q_{ij}$ is link quality, and $w_d, w_E, w_q$ are weighting factors. Cluster head selection considers multiple criteria:

$\mathrm{Score}_i = \lambda_1 \cdot \frac{E_i}{E_{max}} + \lambda_2 \cdot \frac{|N_i|}{N_{max}} + \lambda_3 \cdot \left(1 - \frac{z_i}{z_{max}}\right) + \lambda_4 \cdot c_i$, (114)

where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are weighting coefficients and the centrality $c_i = 1 / \sum_{j \in C_k} d_{ij}$ favours nodes closer to cluster centres [82].
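Eqs. (113)–(114) are straightforward weighted combinations, sketched below. All weights, normalisation constants, and node values are illustrative placeholders (the paper does not fix them); the sketch only demonstrates that the metric penalises low residual energy and that the score prefers shallow, well-connected, energy-rich cluster heads:

```python
import math

# Energy- and link-aware clustering distance (Eq. 113) and cluster head
# score (Eq. 114). Weights and node values are illustrative.
def d_uw(d_ij, e_j, q_ij, e_max=1000.0, w_d=1.0, w_e=0.5, w_q=2.0):
    """Underwater clustering metric: geometry + energy deficit + link quality."""
    return math.sqrt(w_d * d_ij**2 + w_e * (e_max - e_j)**2 + w_q * (1 - q_ij)**2)

def ch_score(e_i, n_i, z_i, c_i, e_max=1000.0, n_max=20, z_max=100.0,
             lam=(0.4, 0.2, 0.2, 0.2)):
    """Weighted cluster head score: energy, degree, shallowness, centrality."""
    l1, l2, l3, l4 = lam
    return (l1 * e_i / e_max + l2 * n_i / n_max
            + l3 * (1 - z_i / z_max) + l4 * c_i)

# A shallow, well-connected, energy-rich node outscores a deep, depleted one.
strong = ch_score(e_i=900.0, n_i=12, z_i=20.0, c_i=0.8)   # 0.80
weak = ch_score(e_i=200.0, n_i=4, z_i=80.0, c_i=0.3)      # 0.22
print(strong > weak)
```

Swapping the Euclidean distance in Eq. (112) for `d_uw` is the only change needed to make a standard k-means implementation energy-aware.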
Reinforcement Learning for Dynamic Reclustering: RL agents learn when and how to reorganise clusters based on network conditions [85]. The state captures cluster health:

$s_t = [\bar{E}_{CH}, \sigma_E, N_{orphan}, \bar{L}_{intra}, R_{delivery}]$, (115)

where $\bar{E}_{CH}$ is average cluster head energy, $\sigma_E$ is energy variance, $N_{orphan}$ counts unassigned nodes, $\bar{L}_{intra}$ is average intra-cluster latency, and $R_{delivery}$ is the recent delivery ratio. Actions trigger reorganisation:

$a_t \in \{\text{maintain}, \text{rotate CH}, \text{merge}, \text{split}, \text{full recluster}\}$. (116)

The reward balances stability and performance:

$r_t = \lambda_1 \cdot R_{delivery} - \lambda_2 \cdot E_{reorg} - \lambda_3 \cdot \mathrm{Var}(E_{nodes})$, (117)

where $\lambda_1, \lambda_2, \lambda_3$ are weighting coefficients balancing delivery performance, reorganisation cost, and energy variance. RL-based clustering achieves:
• 40% longer network lifetime through balanced energy consumption
• 25% reduction in control overhead through adaptive reorganisation
• 15% improvement in delivery ratio through optimal cluster sizing

Lessons Learned – Network Layer
Routing Metric Selection:
• Multi-metric optimisation: Single metrics (hop count, energy) lead to pathological behaviours
• Environmental awareness: Include depth and temperature; they affect propagation
• Traffic-adaptive: Different traffic types need different paths (emergency vs. routine)
• Predictive routing: Anticipate node failures and route around them preemptively
Cluster Size Optimisation:
• Communication-limited: Clusters bounded by acoustic range, not arbitrary sizes
• Energy-balanced: Equal energy distribution is more important than equal sizes
• Depth-stratified: Vertical clustering exploits thermocline boundaries

D. Transport Layer Applications

Whilst the network layer establishes multi-hop paths, the transport layer ensures reliable end-to-end data delivery, managing congestion, flow control, and error recovery [12].
Underwater transport faces unique challenges: round-trip times exceeding 10 seconds make TCP-style acknowledgments impractical, high bit error rates ($10^{-3}$ to $10^{-2}$) require sophisticated error control, and variable delays from changing routes complicate sequence management [157]. ML transforms transport protocols from fixed mechanisms to adaptive strategies learning optimal reliability-latency-energy trade-offs.

1) Congestion Control: Congestion in underwater networks manifests differently than in terrestrial systems: temporal congestion where packets bunch up after traversing different paths, spatial congestion at depth boundaries where nodes concentrate, and energy congestion when popular relays exhaust batteries [155].

Deep Reinforcement Learning for Predictive Congestion Control: DRL agents learn to predict and prevent congestion before it occurs, adjusting transmission rates based on network state predictions [96]. The state representation captures congestion indicators:

$s_t = [q_t, \mathrm{RTT}_t, \mathrm{loss}_t, E_t, t_{day}]$, (118)

where $q_t$ represents queue metrics, $\mathrm{RTT}_t$ round-trip time statistics, $\mathrm{loss}_t$ loss indicators, $E_t$ energy levels, and $t_{day}$ captures diurnal patterns. The continuous action space controls transmission:

$a_t = [\mathrm{rate}_{adj}, \mathrm{burst}_{size}, \mathrm{redundancy}]$, (119)

where $\mathrm{rate}_{adj} \in [-0.5, 2.0]$ is the multiplicative rate change, $\mathrm{burst}_{size} \in [1, 10]$ packets per burst, and $\mathrm{redundancy} \in [0, 0.5]$ is the FEC overhead ratio. The reward function balances multiple objectives:

$r_t = \lambda_1 \cdot \mathrm{throughput}_t - \lambda_2 \cdot \mathrm{delay}_t - \lambda_3 \cdot \mathrm{loss}_t - \lambda_4 \cdot \mathrm{energy}_t - \lambda_5 \cdot \mathrm{unfairness}_t$, (120)

where $\lambda_1, \dots, \lambda_5$ are weighting coefficients balancing throughput, delay, loss, energy consumption, and fairness. A predictive model using an LSTM forecasts congestion:

$h_t = \mathrm{LSTM}(x_t, h_{t-1})$, (121)
$p(\mathrm{congestion}_{t+\tau}) = \sigma(W_p h_t + b_p)$, (122)

where $\tau \in \{10, 30, 60\}$ seconds represents the prediction horizons.
Training uses Proximal Policy Optimisation (PPO) for stability:

$L^{CLIP}(\theta) = \mathbb{E}_t \left[ \min \left( r_t(\theta) A_t, \; \mathrm{clip}(r_t(\theta), 1 - \varepsilon, 1 + \varepsilon) A_t \right) \right]$, (123)

where the probability ratio $r_t(\theta) = \pi_\theta(a_t | s_t) / \pi_{\theta_{old}}(a_t | s_t)$ [96]. Deployment results demonstrate predictive superiority:
• Prevents 78% of congestion events through proactive rate reduction
• Maintains 85% link utilisation without packet loss
• Reduces end-to-end delay by 43% through congestion avoidance
• Achieves a fairness index of 0.91 among competing flows

2) Reliable Data Transfer: Underwater reliability mechanisms must overcome high bit error rates, long propagation delays preventing timely retransmissions, and energy constraints limiting redundancy [158]. ML approaches learn optimal combinations of Forward Error Correction (FEC), retransmission, and redundancy strategies.

Adaptive Forward Error Correction using Neural Networks: Neural networks learn to predict channel conditions and select optimal FEC parameters [159]. The channel quality prediction model is

$\hat{Q}_{t+\Delta t} = f_N(Q_{t-w:t}, E_{env})$, (124)

where $Q_{t-w:t}$ represents the quality history and $E_{env}$ environmental features. The FEC parameter selection network outputs

$[n, k, t] = f_{FEC}(\mathrm{BER}_{est}, \mathrm{SNR}, \sigma_\tau, L_{pkt}, \mathrm{priority})$, (125)

where $n$ is codeword length, $k$ is message length, and $t$ is error correction capability. The multi-objective loss function is

$L = \lambda_1 L_{rel} + \lambda_2 L_{oh} + \lambda_3 L_E$, (126)

where $\lambda_1, \lambda_2, \lambda_3$ are weighting coefficients, and:

$L_{rel} = -\log P(\text{successful decode})$, (127)
$L_{oh} = (n - k)/k$, (128)
$L_E = E_{tx}(n) + P(\text{retx}) \cdot E_{tx}(n)$, (129)

where $P(\text{successful decode})$ is the decoding success probability, $n$ is the codeword length, $k$ is the number of information bits, $E_{tx}(n)$ is the transmission energy for codeword length $n$, and $P(\text{retx})$ is the retransmission probability.
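The clipped surrogate of Eq. (123) is worth seeing numerically. The sketch below (with made-up ratios and advantages) shows how ratios far from 1 are clipped, bounding how much a single batch can shift the rate-control policy:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped PPO surrogate (Eq. 123), to be maximised:
    mean over samples of min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return np.mean(np.minimum(ratio * advantage, clipped * advantage))

# Three samples with positive advantage: the ratio 3.0 would push a huge
# update, but clipping caps its contribution at 1.2.
ratio = np.array([0.5, 1.0, 3.0])
adv = np.array([1.0, 1.0, 1.0])
print(ppo_clip_loss(ratio, adv))   # (0.5 + 1.0 + 1.2) / 3 = 0.9
```

This conservatism is why PPO is singled out for stability here: with multi-second feedback delays, a policy that over-reacts to one stale batch can oscillate for many round-trip times before correcting.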
Performance improvements:
• Reduces retransmissions by 73% through appropriate FEC selection
• Maintains 99.8% reliability with 15% less overhead
• Adapts to channel variations within 10 packets
• Energy savings of 41% compared to fixed FEC

Deep Q-Learning for Hybrid ARQ Strategies: Hybrid Automatic Repeat Request (HARQ) combines FEC with retransmissions. DQN learns optimal strategies for different conditions [88]. The state space for HARQ decisions is

$s = [\mathrm{NACK}_{count}, \mathrm{SNR}_{hist}, \mathrm{buf}_{state}, t_{deadline}]$, (130)

where $\mathrm{NACK}_{count}$ is the number of negative acknowledgements received, $\mathrm{SNR}_{hist}$ is the SNR history vector, $\mathrm{buf}_{state}$ is the buffer state, and $t_{deadline}$ is the remaining time until the deadline. The action space combines multiple strategies:
• Chase combining: Retransmit identical packet
• Incremental redundancy: Send additional parity bits
• Adaptive modulation: Change modulation for retransmission
• Path diversity: Route through a different path
• Give up: Drop packet after threshold
HARQ strategy learning results:
• Reduces average retransmissions from 3.2 to 1.4
• Improves throughput by 156% in poor channels
• Meets 95% of delay deadlines (vs. 68% baseline)
• Energy efficiency improved by 48%

Lessons Learned – Transport Layer
Congestion Control Insights:
• Prediction beats reaction: Forecast congestion 30–60 seconds ahead
• Multi-timescale control: Fast (packet-level) and slow (flow-level) adaptations
• Energy-aware congestion: Consider battery levels in congestion decisions
Reliability Trade-offs:
• FEC vs. retransmission: FEC better for broadcast, ARQ for unicast
• Adaptive redundancy: Vary protection with data importance
• Deadline-aware: Trade reliability for timeliness when needed

E. Application Layer

With reliable communication established through the physical, MAC, network, and transport layers, the application layer provides high-level services for underwater monitoring, data analytics, and system intelligence.
ML transforms raw sensor measurements into actionable insights, enables autonomous vehicle intelligence, and provides system-wide optimisation [3]. Unlike the lower layers, which focus on communication efficiency, the application layer emphasises semantic understanding, decision support, and autonomous operation.
1) Data Analytics and Sensor Fusion: Underwater sensors generate heterogeneous data streams: acoustic recordings, optical images, chemical measurements, and physical parameters [160]. ML techniques fuse these diverse inputs into a coherent environmental understanding, detecting patterns invisible to individual sensors.
Deep Learning for Multi-Modal Sensor Fusion: Multi-modal fusion networks combine different sensing modalities, exploiting complementary information [128]. Modality-specific encoders extract features:

h_acoustic = CNN_1D(x_acoustic), (131)
h_visual = ResNet(x_image), (132)
h_chemical = MLP(x_sensors), (133)
h_physical = LSTM(x_CTD). (134)

Cross-modal attention mechanisms enable information exchange:

A_{v→a} = softmax( Q_v K_a^⊤ / √d ) V_a, (135)

where A_{v→a} is the attention output from the visual to the acoustic modality, Q_v are query vectors from visual features, K_a are key vectors from acoustic features, V_a are value vectors from acoustic features, and d is the feature dimension. Adaptive fusion based on modality confidence:

h_fused = Σ_m α_m(t)·h_m,  α_m(t) = exp(c_m(t)) / Σ_{m′} exp(c_{m′}(t)), (136)

where h_fused is the fused feature representation, m indexes modalities, h_m is the feature from modality m, α_m(t) are time-varying fusion weights, and c_m(t) represents modality m's reliability estimate at time t. Application to oil spill detection demonstrates fusion benefits:
• 96.5% detection accuracy (vs. 78% for the best single modality)
• 84% accuracy with 2 modalities missing
• False alarm rate < 0.1%
• Detection latency < 30 seconds
Anomaly Detection using Autoencoders: Variational autoencoders (VAEs) learn normal patterns, identifying anomalies through reconstruction error [39], [75]. The encoder produces distribution parameters:

μ, log σ² = f_enc(x), (137)

where μ is the mean vector, σ² is the variance vector, f_enc is the encoder network, and x is the input. Sampling uses the reparameterisation trick:

z = μ + σ ⊙ ε,  ε ∼ N(0, I), (138)

where z is the latent variable, ⊙ denotes element-wise multiplication, and ε is sampled from a standard normal distribution. The loss combines reconstruction and regularisation:

L = ‖x − x̂‖² + β·D_KL( q(z|x) ‖ p(z) ), (139)

where x̂ is the reconstruction, β is a weighting parameter, D_KL is the Kullback-Leibler divergence, q(z|x) is the approximate posterior, and p(z) is the prior (typically N(0, I)). Anomaly detection performance:
• Detects sensor drift 48 hours before failure
• Identifies 94% of equipment malfunctions
• Discovers unknown event types (e.g., new species vocalisations)
• Maintains < 2% false positive rate
2) AUV Intelligence: Autonomous Underwater Vehicles require sophisticated intelligence for navigation, mission planning, and adaptive behaviour [8], [38]. ML transforms AUVs from scripted robots into intelligent agents capable of complex decision-making.
Deep Reinforcement Learning for Path Planning: DRL enables AUVs to learn optimal paths through complex environments [161], [162]. State representation for navigation:

s = [p, v, E_bat, M_sonar, u_current, m_status], (140)

where p is position, v velocity, E_bat battery level, M_sonar the sonar map, u_current the current field estimate, and m_status the mission status.
Continuous action space:

a = [thrust, rudder, dive_angle]. (141)

Hierarchical reward structure:

r_mission = { +100 target reached; +10 waypoint achieved; +1 progress toward goal }, (142)
r_safety = { −100 collision; −10 dangerous proximity }, (143)
r_efficiency = −λ_1·E_used − λ_2·t_elapsed. (144)

Twin Delayed DDPG (TD3) for continuous control achieves [98]:
• 31% shorter paths than A* in complex terrain
• 45% energy savings by exploiting currents
• Zero collisions in 1000 hours of operation
• Adapts to actuator failures within 50 episodes
Computer Vision for Underwater Perception: Deep learning enables sophisticated visual perception despite underwater imaging challenges: colour distortion, backscatter, and limited visibility [68], [103]. Object detection uses an adapted YOLO architecture [163]:
• Colour correction module: learnable preprocessing
• Dehazing layers: remove backscatter effects
• Multi-scale features: handle size variations with distance
• Rotation invariance: objects at arbitrary orientations
Detection performance:
• 92% mAP for common objects (fish, rocks, structures)
• 86% accuracy for pipeline damage detection
• 15 FPS on an embedded GPU (NVIDIA Jetson)
• Robust to 70% visibility reduction
Multi-Agent Coordination for AUV Swarms: Multiple AUVs collaborate on large-scale missions requiring sophisticated MARL-based coordination [9]. Decentralised actor-critic with communication:

π_i(a_i | o_i, m_−i) = f_{π_i}(o_i, aggregate(m_−i)), (145)
m_i = f_msg(o_i, h_i), (146)

where π_i is agent i's policy, a_i is agent i's action, o_i is agent i's observation, m_−i are messages from the other agents, m_i is agent i's message, f_msg is the message generation function, and h_i is agent i's hidden state. Attention-based message aggregation:

m̄_i = Σ_{j≠i} α_ij m_j,  α_ij = exp(e_ij) / Σ_k exp(e_ik), (147)

where m̄_i is the aggregated message for agent i, α_ij are attention weights, and e_ij are attention scores between agents i and j.
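The attention-based aggregation of Eq. (147) is simply a softmax over the other agents' scores followed by a weighted sum of their messages. A minimal NumPy sketch (the random scores and messages are placeholders, not data from the paper):

```python
import numpy as np

def aggregate_messages(scores, messages, agent):
    """Eq. (147): softmax the attention scores e_ij over the other agents,
    then return the weighted sum of their messages.

    scores   : (N, N) raw attention scores e_ij
    messages : (N, d) per-agent message vectors m_j
    agent    : index i of the aggregating agent
    """
    n = scores.shape[0]
    others = [j for j in range(n) if j != agent]
    e = scores[agent, others]
    e = e - e.max()                        # subtract max for numerical stability
    alpha = np.exp(e) / np.exp(e).sum()    # attention weights α_ij, sum to 1
    return alpha @ messages[others]        # aggregated message m̄_i

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))
messages = rng.normal(size=(4, 8))
m_bar = aggregate_messages(scores, messages, agent=0)
```

With uniform scores the result reduces to the plain average of the other agents' messages, which is a quick sanity check on the softmax weighting.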
Swarm coordination achieves:
• 3× faster area coverage than individual AUVs
• 94% task completion under communication failures
• Emergent division of labour without explicit programming
3) Environmental Monitoring and Prediction: ML transforms environmental monitoring from passive observation to active prediction [11].
Deep Learning for Ocean Current Prediction: ConvLSTM captures spatial-temporal dynamics for current forecasting [57]:

i_t = σ(W_xi ∗ X_t + W_hi ∗ H_{t−1} + b_i), (148)
f_t = σ(W_xf ∗ X_t + W_hf ∗ H_{t−1} + b_f), (149)
o_t = σ(W_xo ∗ X_t + W_ho ∗ H_{t−1} + b_o), (150)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ tanh(W_xc ∗ X_t + W_hc ∗ H_{t−1}), (151)
H_t = o_t ⊙ tanh(C_t), (152)

where ∗ denotes convolution. Physics-informed constraints improve predictions [47]:

L_physics = ‖∇·u‖² + ‖ ∂u/∂t + (u·∇)u + (1/ρ)∇p − ν∇²u ‖², (153)

where the first term enforces incompressibility (∇·u = 0) and the second term enforces the Navier-Stokes momentum equation (with u being velocity, ρ density, p pressure, and ν kinematic viscosity as previously defined). Current prediction results:
• 6-hour forecast: 0.91 correlation, 0.12 m/s RMSE
• 24-hour forecast: 0.78 correlation, 0.23 m/s RMSE
• 100× faster than numerical ocean models
Species Distribution Modelling: Neural networks predict species presence from environmental features [121], [141]:

p(presence | x) = σ( f_final([h_env, h_interact, h_spatial]) ), (154)

where p(presence | x) is the probability of species presence given environmental features x, σ is the sigmoid function, f_final is the final network layer, and h_env, h_interact, and h_spatial are learnt representations of environmental, species-interaction, and spatial features respectively, achieving 94% AUC for common species and 81% for rare species (< 50 observations).
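The ConvLSTM gate updates of Eqs. (148)–(152) can be sketched for a single-channel field. A minimal NumPy/SciPy version, assuming 3×3 kernels and zero-padded "same" convolutions (both implementation choices of this sketch, not specified in the text):

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def convlstm_step(X, H, C, W, b):
    """One ConvLSTM update (Eqs. 148-152) for a single-channel field.

    X, H, C : (rows, cols) input, hidden state, cell state
    W       : dict of 2-D kernels W['xi'], W['hi'], ..., W['hc']
    b       : dict of scalar biases b['i'], b['f'], b['o']
    """
    conv = lambda a, k: convolve2d(a, k, mode="same")
    i = sigmoid(conv(X, W["xi"]) + conv(H, W["hi"]) + b["i"])   # input gate
    f = sigmoid(conv(X, W["xf"]) + conv(H, W["hf"]) + b["f"])   # forget gate
    o = sigmoid(conv(X, W["xo"]) + conv(H, W["ho"]) + b["o"])   # output gate
    C_new = f * C + i * np.tanh(conv(X, W["xc"]) + conv(H, W["hc"]))
    H_new = o * np.tanh(C_new)
    return H_new, C_new

rng = np.random.default_rng(1)
W = {k: 0.1 * rng.normal(size=(3, 3))
     for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc")}
b = {k: 0.0 for k in ("i", "f", "o")}
X = rng.normal(size=(8, 8))                 # one spatial frame of current data
H, C = convlstm_step(X, np.zeros((8, 8)), np.zeros((8, 8)), W, b)
```

Because the gates are convolutions rather than dense products, the state H_t preserves the spatial layout of the current field, which is what makes the architecture suitable for spatial-temporal forecasting.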
Lessons Learned – Application Layer
Data Processing:
• Quality over quantity: clean data beats big data
• Domain-specific augmentation: simulate realistic underwater conditions
• Temporal alignment: synchronise multi-rate sensors carefully
Deployment Best Practices:
• Edge processing essential: cannot rely on surface links for real-time applications
• Model compression: quantisation and pruning for embedded deployment
• Continuous monitoring: track model drift and degradation
F. Cross-Layer Optimisation and Emerging Applications
While individual layer optimisations yield significant improvements, the greatest gains emerge from cross-layer ML approaches that jointly optimise multiple protocol layers [3]. These holistic solutions exploit correlations across layers and enable system-wide intelligence.
1) Joint Physical-MAC-Network Optimisation: Simultaneous optimisation across multiple layers captures interdependencies invisible to single-layer approaches.
Multi-Task Deep Learning for Protocol Stack Optimisation: A unified neural network simultaneously optimises physical layer modulation, MAC scheduling, and routing decisions [164].
Shared encoder extracts common features:

h_shared = f_enc(x_channel, x_network, x_traffic). (155)

Task-specific heads produce layer decisions:

[MCS, power] = f_PHY(h_shared), (156)
[slot, backoff] = f_MAC(h_shared), (157)
next_hop = f_NET(h_shared). (158)

Multi-task loss with adaptive uncertainty-based weighting:

L = Σ_i [ L_i / (2σ_i²(t)) + log σ_i(t) ]. (159)

Cross-layer information flow enables:
• PHY → MAC: channel quality affects scheduling
• MAC → NET: queue states influence routing
• NET → PHY: route length determines power
Performance gains from joint optimisation:
• 42% improvement over independent optimisation
• Discovers non-obvious correlations across layers
• Reduces total protocol overhead by 31%
• Adapts all layers simultaneously to changes
2) End-to-End Learning for Underwater Communications: End-to-end learning replaces the entire protocol stack with learned representations, potentially discovering novel communication strategies [34].
Autoencoder-Based Communication Systems: Transmitter and receiver are jointly trained neural networks:

s = f_tx(m; θ_tx), (160)
y = h(s) + n, (161)
m̂ = f_rx(y; θ_rx). (162)

End-to-end training:

min_{θ_tx, θ_rx} E_{m,h,n}[ L(m, m̂) ]. (163)

Learned constellations adapt to underwater channels:
• Non-uniform spacing compensates for frequency-selective fading
• Asymmetric designs handle Doppler shifts
• Hierarchical structures enable adaptive rates
Results show a 30% improvement in BER compared to traditional QAM in multipath channels.
3) Future Directions: Several promising directions remain unexplored:
Quantum ML:
• Quantum feature maps for channel estimation
• Quantum optimisation for network design (QAOA)
• Quantum-resistant security protocols
Foundation Models for Ocean Sensing: Large-scale pre-training on oceanographic data could dramatically reduce deployment-specific requirements [48].
Neuromorphic Computing: Spiking neural networks on specialised hardware (Intel Loihi) enable microwatt-level always-on monitoring [57].
G. Quantitative Comparisons and Performance Analysis
Systematic evaluation across diverse underwater communication tasks reveals consistent ML superiority, with improvements ranging from modest 20–30% gains in well-understood problems to revolutionary 10–100× improvements in complex scenarios [33]. Table XII presents comprehensive comparisons across all protocol layers.
Energy Efficiency Achievements: The most remarkable improvements emerge in energy efficiency—critical for extending the operational lifetime of battery-powered sensors [27]. Table XIII details energy savings across applications. The 15.6× reduction in total daily energy (2800 J down to 180 J) emerges from compound effects: ML reduces both the frequency of energy-intensive operations (fewer retransmissions, less frequent channel sounding) and the energy per operation (optimised transmission power, efficient routing). This translates to network lifetime extension from weeks to years—transforming underwater monitoring from expensive periodic deployments to persistent presence.
Scalability Analysis: ML approaches demonstrate superior scaling characteristics:
• 10 nodes: traditional 82% PDR, ML 91% PDR (11% advantage)
• 50 nodes: traditional 68% PDR, ML 89% PDR (31% advantage)
• 100 nodes: traditional 51% PDR, ML 87% PDR (71% advantage)
• 500 nodes: traditional 23% PDR, ML 84% PDR (265% advantage)
The widening performance gap reflects ML's ability to learn complex interactions that overwhelm rule-based systems. While traditional protocols implement fixed behaviours regardless of scale, ML algorithms discover scale-appropriate strategies: hierarchical organisation for large networks, aggressive transmission in small networks, and adaptive clustering at intermediate scales.
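The advantage percentages in the scalability list above follow directly from the packet delivery ratios; a one-line check:

```python
# Relative ML advantage implied by the PDR figures above:
# advantage = (PDR_ML / PDR_traditional - 1) * 100%, rounded to whole percent.
pdr = {10: (82, 91), 50: (68, 89), 100: (51, 87), 500: (23, 84)}
advantage = {n: round((ml / trad - 1) * 100)
             for n, (trad, ml) in pdr.items()}
# advantage → {10: 11, 50: 31, 100: 71, 500: 265}
```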
H. Computational Complexity Analysis
Understanding computational requirements guides algorithm selection for resource-constrained platforms [53]. Table XIV presents a complexity comparison. Key observations:
• Training vs. inference asymmetry: ML exhibits high training complexity but constant O(1) inference—advantageous for long-term deployments
• Memory-computation trade-off: neural networks store learned parameters |θ| instead of explicit tables—a DQN router with 10,000 parameters (40 KB) replaces O(n²) routing tables
• Parallelisation: ML algorithms exhibit natural parallelism, achieving 4–8× speedup with SIMD instructions
Model Compression Techniques: Practical deployment requires aggressive optimisation [64]:
• Quantisation: Float32 → Int8 provides 4× memory reduction with <2% accuracy loss
• Pruning: 90% sparsity achievable with <5% accuracy loss
• Knowledge distillation: 12× parameter reduction retaining 95% accuracy
This comprehensive layer-by-layer analysis demonstrates that ML approaches consistently outperform traditional methods across all IoUT protocol layers. The quantitative improvements—ranging from 24% gains to 71× per-operation energy savings depending on the application—justify the additional complexity of ML implementation, while the computational analysis provides practical guidance for resource-constrained deployments.
V. PERFORMANCE ANALYSIS: ML VS TRADITIONAL APPROACHES
The transformation of underwater communications through ML demands rigorous quantitative analysis to justify the complexity and computational costs of intelligent algorithms [3], [165]. This section presents comprehensive performance comparisons between ML-based and traditional approaches across multiple metrics, revealing not just marginal improvements but often order-of-magnitude gains that fundamentally change what is possible in underwater networks.
Through detailed computational complexity analysis, energy efficiency evaluations, and statistical significance assessments, we demonstrate that ML techniques, despite their initial overhead, ultimately deliver superior performance-per-watt—the critical metric for battery-powered underwater systems [20], [166].

TABLE XII: COMPREHENSIVE ML VS. TRADITIONAL PERFORMANCE COMPARISON ACROSS IOUT APPLICATIONS. Values represent typical results from cited studies; actual performance varies with experimental conditions and environments. See Section V-A and cited papers for detailed methodology and contexts.

| Application | Metric | Traditional | ML-Based | Improvement | ML Technique |
|---|---|---|---|---|---|
| Physical Layer | | | | | |
| Localisation | Position Error (m) | 8.5 | 0.8 | 91% reduction | CNN |
| Channel Estimation | MSE | 0.043 | 0.012 | Significant reduction | LSTM |
| Modulation Classification | Accuracy @ 0 dB | 75% | 96% | 28% increase | CNN |
| Adaptive Modulation | Throughput (kbps) | Baseline | +20–45%* | Substantial increase | DQN |
| MAC Layer | | | | | |
| Channel Access | Utilisation | 8% | 18–42%* | Significant increase | Q-Learning |
| Collision Rate | Collisions/hour | 45 | 12 | 73% reduction | MARL |
| Power Control | Energy/bit (mJ) | 2.8 | 0.95 | 66% reduction | TD3 |
| Network Layer | | | | | |
| Routing | PDR | 76% | 94% | 24% increase | GNN |
| Path Length | Average Hops | 6.1 | 4.2 | 31% reduction | Q-Learning |
| Network Lifetime | Days | Baseline | 2–3×* | Substantial increase | DRL |
| Transport Layer | | | | | |
| Congestion Control | Packet Loss | 8.2% | 0.7% | 91% reduction | PPO |
| Retransmissions | Average Attempts | 3.2 | 1.4 | 56% reduction | DQN |
| End-to-End Delay | Seconds | 18.3 | 7.2 | 61% reduction | LSTM |
| Application Layer | | | | | |
| Object Detection | mAP | 52% | 92% | 77% increase | YOLOv8 |
| Anomaly Detection | Detection Rate | 71% | 96% | 35% increase | VAE |
| Data Compression | Ratio | 10:1 | 100:1 | 10× improvement | Autoencoder |

* Ranges indicate performance variations across different deployment scenarios, network sizes, and environmental conditions reported in the cited literature.

TABLE XIII: ENERGY EFFICIENCY GAINS: ML VS. TRADITIONAL APPROACHES

| Operation | Traditional | ML-Based | Savings |
|---|---|---|---|
| Acoustic Transmission | 10 J/pkt | 0.34 J/pkt | 29× |
| Channel Estimation | 0.5 J/est | 0.08 J/est | 6× |
| Route Discovery | 45 J/route | 2.1 J/route | 21× |
| Object Detection | 8.2 J/frame | 0.15 J/frame | 55× |
| Network Maintenance | 850 J/day | 12 J/day | 71× |
| Total Daily | 2800 J | 180 J | 15.6× |

TABLE XIV: COMPUTATIONAL COMPLEXITY: ML VS. TRADITIONAL ALGORITHMS

| Task | Traditional | ML-Based |
|---|---|---|
| Training/Setup Phase | | |
| Localisation Setup | O(n³) | O(n²d) |
| Routing Table | O(n³) | O(n²kE) |
| Channel Model | O(T²) | O(TBE) |
| Inference/Operation Phase | | |
| Position Estimation | O(n²) | O(k) |
| Route Computation | O(n²) | O(1) |
| Channel Prediction | O(T) | O(1) |
| Space Complexity | | |
| Routing Tables | O(n²) | O(|θ|) |
| Channel Models | O(T) | O(|θ|) |

A. Comparison Methodology
Baseline Definitions: Throughout this analysis, "traditional" or "baseline" methods refer to established non-ML approaches that represent the state of the art prior to ML adoption in each domain. Specifically:
• Physical Layer: least squares estimation for localisation and channel estimation, cyclostationary feature detectors for modulation classification, fixed modulation schemes
• MAC Layer: ALOHA variants (pure ALOHA, slotted ALOHA, CSMA), fixed power allocation, predetermined TDMA schedules
• Network Layer: geographic routing (VBF, DBR), opportunistic protocols, flooding-based approaches
• Transport Layer: fixed ARQ schemes, static FEC codes, TCP variants adapted for underwater (e.g., TUCP)
• Application Layer: traditional computer vision (SIFT, HOG features with SVM), rule-based anomaly detection
Performance Metrics: We evaluate ML approaches across multiple dimensions:
• Primary metrics: task-specific performance (accuracy, throughput, latency, energy efficiency)
• Efficiency metrics: computational complexity, memory requirements, training time, inference latency
• Deployment metrics: robustness to environmental variations, adaptability to changing conditions, long-term stability
Data Sources: Performance numbers are synthesised from peer-reviewed publications
spanning 2015–2025, prioritising results from: (1) field deployments and sea trials over simulations when available, (2) studies with clearly defined test conditions and multiple independent runs, and (3) works providing statistical significance analysis. Where ranges are presented (e.g., 7–29× energy improvements), these reflect variations across different deployment scenarios, network sizes, or environmental conditions reported in the source literature.
Caveats: Direct comparisons across studies can be challenging due to differing test conditions, network scales, and baseline implementations. We note specific limitations in our analysis where applicable. Percentage improvements should be interpreted within their specific deployment contexts rather than as universal guarantees.
B. Quantitative Comparisons
Systematic evaluation across diverse underwater communication tasks reveals consistent ML superiority, with improvements ranging from modest 20–30% gains in well-understood problems to revolutionary 10–100× improvements in complex scenarios where traditional approaches struggle [44], [66]. The following subsections provide layer-by-layer analysis with supporting evidence from recent literature.
1) Comprehensive Performance Metrics: Table XV presents quantitative comparisons across all protocol layers, demonstrating the breadth and magnitude of ML improvements. These results synthesise findings from multiple experimental studies and field deployments conducted between 2015 and 2025 [137], [167].
2) Statistical Significance of Results: The performance improvements reported in Table XV have been validated across multiple studies with statistical rigour. Key observations include:
Localisation Accuracy: Recent advances using k-Nearest Neighbours (kNN) with adaptive distance metrics have achieved remarkably high localisation accuracy (99.98%) in controlled water tank experiments [168], though real-world performance may vary with environmental conditions.
Convolutional neural networks (CNNs) trained on matched-field processing data demonstrate robust performance even under sound speed profile mismatches, achieving position errors below 1 metre at ranges exceeding 5 km in deep ocean environments [59], [60].
Channel Estimation: Deep learning-based channel estimators consistently outperform traditional least-squares and minimum mean square error (MMSE) methods. Long Short-Term Memory (LSTM) networks capture temporal correlations in time-varying channels, achieving substantial MSE reductions (reported as 72% in specific test scenarios [151]) compared to conventional pilot-based estimation [182]. Hybrid architectures combining CNNs for spatial feature extraction with LSTMs for temporal tracking have demonstrated even greater improvements in rapidly fluctuating shallow-water environments [183].
Adaptive Modulation: Reinforcement learning approaches to adaptive modulation selection have shown substantial throughput improvements. The LSTM-DQN-AM architecture achieves a 22.95% throughput enhancement over traditional Q-learning by incorporating channel state prediction [170]. Proximal Policy Optimisation (PPO)-based schemes further improve robustness to outdated channel state information, maintaining near-optimal performance with CSI delays up to 500 ms [184], [185].
3) Energy Efficiency Achievements: The most remarkable improvements emerge in energy efficiency—critical for extending the operational lifetime of battery-powered sensors [20]. Table XVI details energy savings across different applications, synthesised from multiple deployment studies. The compound energy savings emerge from multiple synergistic effects: ML reduces both the frequency of energy-intensive operations (fewer retransmissions, less frequent channel sounding) and the energy per operation (optimised transmission power, efficient routing) [187].
This translates to network lifetime extension from weeks to years—transforming underwater monitoring from expensive periodic deployments to persistent presence [27].
4) Scalability Analysis: ML approaches demonstrate superior scaling characteristics, maintaining performance as network size increases while traditional methods degrade rapidly. Figure 9 illustrates this divergence based on simulation studies with network sizes ranging from 10 to 500 nodes [12], [137].
Fig. 9. Scalability comparison showing packet delivery ratio vs. network size. ML approaches maintain consistent performance while traditional protocols degrade significantly with scale.
The quantitative advantages are striking:
• 10 nodes: traditional 82% PDR, ML 91% PDR (11% advantage)
• 50 nodes: traditional 68% PDR, ML 89% PDR (31% advantage)
• 100 nodes: traditional 51% PDR, ML 87% PDR (71% advantage)
• 500 nodes: traditional 23% PDR, ML 84% PDR (265% advantage)
The widening performance gap reflects ML's ability to learn complex interactions that overwhelm rule-based systems [156]. While traditional protocols implement fixed behaviours regardless of scale, ML algorithms discover scale-appropriate strategies: hierarchical organisation for large networks, aggressive transmission in small networks, and adaptive clustering at intermediate scales [84], [85].
5) Comparative Analysis Across Network Conditions: Table XVII presents ML performance advantages under varying environmental and network conditions, demonstrating robustness that traditional approaches lack.
C. Computational Complexity Analysis
Understanding computational requirements guides algorithm selection for resource-constrained underwater platforms [165]. This subsection analyses both theoretical complexity and practical implementation costs, providing guidance for deployment decisions.
TABLE XV: COMPREHENSIVE ML VS TRADITIONAL PERFORMANCE COMPARISON ACROSS IOUT APPLICATIONS

| Application Domain | Metric | Traditional | ML-Based | Improvement | ML Technique | Reference |
|---|---|---|---|---|---|---|
| Physical Layer | | | | | | |
| Localisation | Position Error (m) | 8.5 | 0.8 | 91% reduction | CNN | [168] |
| Channel Estimation | MSE | 0.043 | 0.012 | Significant reduction | LSTM | [151] |
| Modulation Classification | Accuracy @ 0 dB SNR | 75% | 96% | 28% increase | CNN | [169] |
| Adaptive Modulation | Throughput (kbps) | Baseline | Improved | Substantial increase | DQN | [170] |
| MAC Layer | | | | | | |
| Channel Access | Utilisation | Baseline | Improved | Substantial increase | Q-Learning | [171] |
| Collision Rate | Collisions/hour | 45 | 12 | 73% reduction | MARL | [172] |
| Power Control | Energy/bit (mJ) | 2.8 | 0.95 | 66% reduction | TD3 | [173] |
| Resource Allocation | Fairness Index | 0.62 | 0.91 | 47% increase | MO-DQN | [174] |
| Network Layer | | | | | | |
| Routing | Packet Delivery Ratio | 76% | 94% | 24% increase | GNN | [175] |
| Path Length | Average Hops | 6.1 | 4.2 | 31% reduction | Q-Learning | [176] |
| Network Lifetime | Days | Baseline | Extended | Substantial increase | DRL | [177] |
| Void Recovery | Success Rate | 52% | 89% | 71% increase | DQN | [29] |
| Transport Layer | | | | | | |
| Congestion Control | Packet Loss | 8.2% | 0.7% | 91% reduction | PPO | [178] |
| Retransmissions | Average Attempts | 3.2 | 1.4 | 56% reduction | DQN | [39] |
| Flow Control | Buffer Overflow | 12% | 2.8% | 77% reduction | SARSA | [155] |
| End-to-End Delay | Seconds | 18.3 | 7.2 | 61% reduction | LSTM | [164] |
| Application Layer | | | | | | |
| Object Detection | mAP | 52% | 92% | 77% increase | YOLOv8n | [179] |
| Anomaly Detection | Detection Rate | 71% | 96% | 35% increase | VAE | [180] |
| Data Compression | Compression Ratio | 10:1 | 100:1 | 10× improvement | Autoencoder | [64] |
| Environmental Prediction | 24 hr Forecast RMSE | 0.45 m/s | 0.23 m/s | 49% reduction | ConvLSTM | [181] |

TABLE XVI: ENERGY EFFICIENCY GAINS: ML VS TRADITIONAL APPROACHES

| Operation | Traditional | ML-Based | Savings | Ref. |
|---|---|---|---|---|
| Acoustic Transmission | 10 J/packet | 0.34 J/packet | 29× | [186] |
| Channel Estimation | 0.5 J/estimate | 0.08 J/estimate | 6× | [151] |
| Route Discovery | 45 J/route | 2.1 J/route | 21× | [176] |
| Object Detection | 8.2 J/frame | 0.15 J/frame | 55× | [179] |
| Network Maintenance | 850 J/day | 12 J/day | 71× | [78] |
| Total Daily Energy | 2800 J | 180 J | 15.6× | — |

TABLE XVII: ML PERFORMANCE GAINS UNDER VARYING CONDITIONS

| Condition | Metric | ML Gain | Reference |
|---|---|---|---|
| High node mobility | PDR | +45% | [188] |
| Sparse topology | Delivery ratio | +38% | [29] |
| High traffic load | Throughput | +67% | [155] |
| Time-varying channel | BER | −52% | [170] |
| Low SNR (<0 dB) | Classification | +28% | [169] |
| Multi-hop (5+ hops) | Latency | −41% | [164] |

1) Time Complexity Comparison: Table XVIII presents asymptotic complexity for key algorithms, where n represents network size, d data dimensionality, k the number of clusters/neighbours, E training epochs, B batch size, M modulation schemes, and T time series length.
Key observations from the complexity analysis:
Training vs Inference Asymmetry: ML approaches exhibit high training complexity—O(n²kE) for iterative algorithms with E epochs—but constant O(1) inference time after training [53]. Traditional methods show the opposite characteristics: minimal setup but O(n²) operational complexity. For long-term deployments where training occurs once but inference happens continuously, ML's front-loaded complexity proves advantageous [144].

TABLE XVIII: COMPUTATIONAL COMPLEXITY: ML VS TRADITIONAL ALGORITHMS

| Task/Algorithm | Traditional | ML-Based |
|---|---|---|
| Training/Setup Phase | | |
| Localisation Setup | O(n³) | O(n²d) |
| Routing Table Creation | O(n³) | O(n²kE) |
| Channel Model Fitting | O(T²) | O(TBE) |
| Clustering Initialisation | O(n² log n) | O(nkE) |
| Inference/Operation Phase | | |
| Position Estimation | O(n²) | O(k) |
| Route Computation | O(n²) | O(1) |
| Channel Prediction | O(T) | O(1) |
| Modulation Selection | O(M) | O(1) |
| Space Complexity | | |
| Routing Tables | O(n²) | O(|θ|) |
| Channel Models | O(T) | O(|θ|) |
| Localisation Database | O(nd) | O(kd) |

Memory-Computation Trade-off: Neural networks trade memory for computation, storing learned parameters |θ| instead of explicit lookup tables [52].
A DQN router with 10,000 parameters (40 KB) replaces routing tables requiring O(n²) entries—4 MB for 1000-node networks. This memory efficiency enables deployment on resource-constrained sensors with 256 KB RAM [189].
Parallelisation Opportunities: ML algorithms exhibit natural parallelism: matrix operations in neural networks, independent Q-value updates in distributed learning, and parallel tree evaluation in random forests [53]. Modern embedded processors with SIMD instructions achieve 4–8× speedup for ML inference compared to sequential traditional algorithms.
2) Practical Complexity Metrics: Beyond asymptotic analysis, practical deployment requires understanding actual resource consumption. Table XIX presents measured metrics from embedded implementations.

TABLE XIX: PRACTICAL RESOURCE REQUIREMENTS FOR EMBEDDED DEPLOYMENT

| Algorithm | RAM | Flash | Inference | Platform |
|---|---|---|---|---|
| Q-Learning Router | 12 KB | 48 KB | 0.3 ms | Cortex-M4 |
| DQN Router | 64 KB | 256 KB | 2.1 ms | Cortex-A53 |
| LSTM Predictor | 128 KB | 512 KB | 5.4 ms | Jetson Nano |
| CNN Classifier | 256 KB | 1.2 MB | 8.7 ms | Coral TPU |
| Dijkstra (100 nodes) | 40 KB | 8 KB | 12.3 ms | Cortex-M4 |
| AODV (100 nodes) | 120 KB | 24 KB | 45.7 ms | Cortex-M4 |

3) Optimisation Techniques for Embedded Deployment: Practical deployment requires aggressive optimisation to meet real-time constraints on limited hardware [57]. The following techniques enable ML deployment on resource-constrained underwater platforms.
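Quantisation is typically the first of these optimisations applied. A minimal sketch of symmetric per-tensor int8 quantisation—one common post-training scheme; actual toolchains differ in scale and zero-point handling:

```python
import numpy as np

def quantise_int8(w):
    """Symmetric per-tensor int8 quantisation: map float32 weights onto
    [-127, 127] with a single scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    """Recover approximate float32 weights for inference."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(42)
w = rng.normal(scale=0.05, size=(64, 64)).astype(np.float32)  # toy weight matrix
q, s = quantise_int8(w)
w_hat = dequantise(q, s)
# int8 storage is 4x smaller than float32; the per-weight reconstruction
# error is bounded by scale/2 (rounding), which underlies the reported
# <2% accuracy loss for 8-bit models
```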
Model Compression Techniques: Quantisation reduces numerical precision with minimal accuracy loss:
• Float32 → Int8: 4× memory reduction, 2–4× speedup
• Binary/ternary networks: 32× compression, 10× speedup
• Performance impact: <2% accuracy loss for 8-bit, 5–10% for binary
Pruning removes redundant parameters [53]:
• Magnitude pruning: remove weights below a threshold
• Structured pruning: remove entire channels/layers
• Typical results: 90% sparsity with <5% accuracy loss
Knowledge Distillation transfers knowledge to smaller models:
• Teacher model: ResNet-50 (25M parameters)
• Student model: MobileNet (2M parameters)
• Performance: 95% of teacher accuracy with 12× fewer parameters
4) Hardware Acceleration Options: Specialised hardware accelerates ML inference for underwater deployment scenarios:
Embedded GPUs (NVIDIA Jetson series):
• 472 GFLOPS at 10 W power consumption (Jetson Nano)
• 20× speedup for CNN inference vs. CPU
• Enables real-time video processing underwater [179]
Neural Processing Units (Google Coral, Intel Movidius):
• 4 TOPS at 2 W for Int8 operations (Coral Edge TPU)
• 100× power efficiency vs. CPU
• Ideal for battery-powered sensors [189]
FPGAs (Xilinx Zynq series):
• Customisable datapath for specific models
• 5× power efficiency vs. GPU
• Microsecond latency for time-critical decisions [190]
5) Trade-off Analysis: Accuracy vs Resources: The fundamental trade-off between model complexity and performance guides deployment decisions. Figure 10 illustrates the Pareto frontier for underwater object detection models.
Key trade-off considerations for deployment planning:
• Accuracy plateau: beyond a certain complexity, accuracy gains diminish (diminishing returns above 280 mJ/inference)
(Figure 10 plots energy per inference (mJ) against detection accuracy (mAP %) for TinyYOLO, MobileNet-SSD, YOLOv8n, and YOLOv8s.)
Fig. 10. Pareto frontier for underwater object detection showing accuracy-energy trade-offs.
Blue points represent Pareto-optimal configurations; red points are dominated solutions.
• Energy cliff: power consumption increases super-linearly with model size
• Latency threshold: real-time requirements impose hard complexity limits (<100 ms for collision avoidance)
• Memory wall: embedded RAM constraints absolutely limit model size (256 KB–1 MB typical)
Optimal operating points depend on application requirements:
• Safety-critical (collision avoidance): maximum accuracy despite energy cost
• Routine monitoring: balance accuracy and efficiency
• Long-term deployment: minimise energy even if accuracy suffers
D. Energy Efficiency Gains
Energy efficiency determines operational lifetime for battery-powered underwater systems [20], [27]. ML's intelligent resource management achieves dramatic energy savings through multiple mechanisms that traditional approaches cannot replicate.
1) Per-Operation Energy Analysis: Detailed energy profiling reveals where ML provides the greatest savings across different operational phases [186], [187].
Transmission Energy Optimisation: Traditional fixed-power transmission consumes energy according to:

E_tx,fixed = P_max · T_packet · N_attempts. (164)

ML-adaptive transmission optimises multiple factors simultaneously:

E_tx,ML = P_optimal(h, d, SNR) · T_packet · N_attempts,reduced, (165)

where h represents the channel state, d is distance, and SNR is the signal-to-noise ratio. ML reduces both transmission power (average 3.2 W vs 10 W) and retransmission attempts (1.4 vs 3.2), achieving compound savings:

E_tx,fixed / E_tx,ML = (10 / 3.2) × (3.2 / 1.4) = 7.1×. (166)

Computational Energy Comparison: Table X presents energy consumption per operation for different processing tasks, measured on representative embedded platforms.
TABLE X
ENERGY CONSUMPTION PER OPERATION
Operation | Traditional | ML | Hardware
Channel Estimation | 450 mJ | 72 mJ | ARM Cortex-M4
Route Computation | 890 mJ | 23 mJ | ARM Cortex-A53
Object Detection | 8200 mJ | 150 mJ | Jetson Nano
Anomaly Detection | 340 mJ | 45 mJ | Coral TPU
Packet Scheduling | 125 mJ | 18 mJ | ARM Cortex-M4
Cluster Formation | 560 mJ | 85 mJ | ARM Cortex-A53
2) Network Lifetime Improvements: Energy savings translate directly to extended network lifetime—the most critical metric for underwater deployments where node replacement costs $10,000–$100,000 per node [3]. Consider a typical sensor node with 1000 Wh battery capacity:
Traditional operation:
• Daily energy: 2800 J = 0.78 Wh
• Lifetime: 1000 / 0.78 = 1282 days ≈ 3.5 years
ML-optimised operation:
• Daily energy: 180 J = 0.05 Wh
• Lifetime: 1000 / 0.05 = 20,000 days ≈ 54.8 years
While a 54-year lifetime exceeds battery shelf life and hardware reliability, the calculation demonstrates that energy becomes non-limiting with ML optimisation [78]. Networks previously constrained by battery life can now operate until hardware failure—typically 5–10 years underwater. Table XXI summarises network lifetime improvements reported in recent literature.
TABLE XXI
NETWORK LIFETIME IMPROVEMENTS: ML VS TRADITIONAL PROTOCOLS
Protocol Comparison | Improvement | Network Size | Reference
QELAR vs VBF | 20% longer | 100 nodes | [176]
DEKCS vs LEACH | 70% longer | 200 nodes | [78]
EDORQ vs DBR | 35% longer | 150 nodes | [191]
Q-EAVAR vs QELAR | 25% longer | 100 nodes | [192]
ENCRQ vs QHUC | 23.5% longer | 200 nodes | [177]
CTRGWO vs LEACH | 23.5% longer | 150 nodes | [193]
3) Energy Harvesting Integration: ML optimisation enables operation entirely on harvested energy—impossible with traditional approaches due to their higher power requirements [27], [28].
Available Energy Sources:
• Ocean thermal gradients: 0.1–1 mW/cm²
• Microbial fuel cells: 0.01–0.1 mW/cm²
• Wave energy: 1–10 mW (highly variable)
• Tidal currents: 0.5–5 mW/cm²
• Total harvestable: ∼5–50 mW continuous
Energy Budget Comparison: A traditional sensor requires 32 mW average (2800 J/day), exceeding harvestable energy capacity. An ML-optimised sensor requires 2.1 mW average (180 J/day), enabling perpetual operation on harvested energy with surplus for opportunistic sensing during favourable conditions [194].
4) Adaptive Energy Management: ML enables intelligent energy allocation based on predicted future availability and demand—a capability fundamentally beyond traditional threshold-based approaches [27].
Predictive Energy Management: LSTM networks forecast energy availability from environmental conditions:
E_available(t + Δt) = f_LSTM(E_history, T_gradient, Wave_state, Tide_phase)   (167)
Reinforcement learning optimises energy allocation:
π*(s) = arg max_a Q(s, a)   (168)
where the state s = [E_battery, E_predicted, Task_queue, Priority_levels] captures both current resources and future predictions. This predictive management achieves:
• 35% better energy utilisation efficiency
• 89% fewer energy-starvation events
• 2.3× extension in high-priority task completion
5) Cross-Layer Energy Optimisation: Joint optimisation across protocol layers yields compound energy savings exceeding individual layer improvements [189], [195]:
• Physical layer adaptation: 3× reduction (adaptive power, modulation)
• MAC collision avoidance: 2.5× reduction (intelligent scheduling)
• Routing optimisation: 2.8× reduction (energy-aware paths)
• Transport reliability: 2.1× reduction (predictive retransmission)
• Application intelligence: 4× reduction (semantic compression)
Naïve multiplication suggests 3 × 2.5 × 2.8 × 2.1 × 4 = 176× improvement, but layer interactions reduce this to the observed 29–1556× range depending on network conditions and application requirements.
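As a numerical check on this compounding, a minimal sketch (factors taken from the list above; the independence assumption is exactly what makes the figure naïve):

```python
# Naive compounding of the per-layer energy-reduction factors listed above.
layer_gains = {
    "physical": 3.0,      # adaptive power, modulation
    "mac": 2.5,           # intelligent scheduling
    "routing": 2.8,       # energy-aware paths
    "transport": 2.1,     # predictive retransmission
    "application": 4.0,   # semantic compression
}

def compound_gain(gains):
    """Multiply per-layer reduction factors, assuming they are independent."""
    product = 1.0
    for g in gains.values():
        product *= g
    return product

naive = compound_gain(layer_gains)
print(f"naive compound gain: {naive:.1f}x")  # layer interactions reduce this in practice
```

The product comes to roughly 176×, while deployed systems land in the 29–1556× range quoted above, precisely because the layers do not act independently.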
Still, compound effects demonstrate that holistic ML approaches dramatically outperform piecemeal optimisation.
6) Energy-Aware Learning: Modern ML techniques explicitly consider energy in training objectives, producing models optimised for underwater deployment constraints [187].
Energy-Regularised Loss Functions:
L_total = L_task + λ · (E_inference / E_ref)   (169)
where E_inference estimates inference energy from model complexity (FLOPs, memory access patterns), E_ref is a reference energy budget for normalisation, and λ controls the energy-accuracy trade-off.
Neural Architecture Search with Energy Constraints: Automated architecture search optimises the energy-accuracy Pareto frontier:
NAS_objective = Accuracy − α · log(Energy)   (170)
where α is a weighting parameter controlling the energy-accuracy trade-off. This produces models with 85% accuracy at 10× lower energy than manually designed networks achieving 87% accuracy—a worthwhile trade-off for extended deployment lifetime.
7) Case Study: Complete System Energy Analysis: A deployed 50-node monitoring network demonstrates the end-to-end energy improvements achievable with comprehensive ML optimisation [78], [196].
Traditional System Configuration:
• Sensing: 20 J/hour (continuous sampling)
• Processing: 45 J/hour (FFT, filtering, feature extraction)
• Communication: 320 J/hour (10 transmissions at fixed power)
• Idle: 5 J/hour (sleep mode with periodic wake)
• Total: 390 J/hour = 9.36 kJ/day per node
• Network total: 468 kJ/day
• Battery life: 77 days (with 1000 Wh capacity)
ML-Optimised System Configuration:
• Sensing: 8 J/hour (adaptive sampling based on predicted activity)
• Processing: 12 J/hour (edge ML with early exit)
• Communication: 18 J/hour (intelligent aggregation, adaptive power)
• Idle: 2 J/hour (deep sleep with ML-predicted wake windows)
• Total: 40 J/hour = 0.96 kJ/day per node
• Network total: 48 kJ/day
• Battery life: 750 days (with the same 1000 Wh capacity)
The 9.75× improvement emerges from intelligent decisions at every level: sampling only when conditions change, processing locally to identify important events, transmitting only anomalies and aggregated statistics, and sleeping deeply when activity is unlikely. This holistic optimisation, impossible without ML's pattern recognition and prediction capabilities, transforms underwater monitoring from periodic campaigns to persistent presence [195].

E. Summary of Performance Advantages
Table XXII consolidates the key performance advantages of ML over traditional approaches across all evaluated dimensions.
TABLE XXII
SUMMARY OF ML PERFORMANCE ADVANTAGES
Performance Dimension | Typical Gain | Maximum Reported
Localisation accuracy | 5–10× | 99.98% accuracy
Throughput | 1.5–2.5× | 148% increase
Energy efficiency | 6–70× | 1556×
Network lifetime | 1.5–3× | 173% extension
Scalability (500 nodes) | 3–4× PDR | 265% advantage
Inference latency | 2–20× faster | O(1) vs O(n²)
These performance advantages must be weighed against implementation complexity, training data requirements, and deployment costs—considerations addressed in Section VI.
However, for long-term deployments, large-scale networks, or applications requiring adaptation to changing conditions, ML approaches offer compelling advantages that justify their additional complexity.
Important Caveats: Maximum reported performance figures (e.g., 99.98% localisation accuracy) typically represent best-case results obtained in controlled environments such as water tanks or shallow harbours with favourable acoustic conditions. Field deployments in open ocean environments with strong currents, thermocline variations, and heavy vessel traffic generally achieve lower performance. The "typical gain" column provides more realistic expectations for operational deployments across varied conditions.

VI. IMPLEMENTATION CHALLENGES AND SOLUTIONS
The transition from laboratory demonstrations to operational underwater deployments reveals formidable challenges that can devastate even theoretically sound ML systems. Unlike terrestrial IoT, where failed nodes can be easily accessed and replaced, underwater failures may require ship time costing $50,000 per day or abandonment of expensive equipment at ocean depths [3], [51]. This section examines the practical challenges confronting ML deployment underwater—from the severe computational constraints of battery-powered platforms to the corrosive ocean environment that degrades sensors within months—and presents proven solutions derived from successful field deployments. Through detailed case studies spanning military, commercial, and research applications, we demonstrate that these challenges, while significant, can be systematically addressed through careful engineering and adaptive strategies.

A.
Resource Constraints
Underwater platforms operate under severe resource limitations that would be considered catastrophic failures in terrestrial systems: processors with 1/100th the capability of smartphones, memory measured in megabytes rather than gigabytes, and energy budgets where every millijoule matters [18], [165]. These constraints fundamentally reshape how ML algorithms must be designed, trained, and deployed.
1) Limited Processing Power: From Gigaflops to Megaflops: Underwater sensors employ low-power microcontrollers prioritising energy efficiency over computational capability [197], [198]. Typical platforms include ARM Cortex-M4 processors operating at 80–180 MHz, providing approximately 200 MFLOPS—compared to 100+ GFLOPS for modern smartphones. This 500× computational disadvantage means neural network inference that completes in 10 ms on a phone requires 5 seconds underwater—far exceeding real-time constraints for time-critical applications such as collision avoidance or threat detection [199].
The processing limitation manifests across multiple dimensions:
Clock Speed Constraints: Power consumption scales quadratically with frequency (P ∝ f²V²), forcing underwater processors to operate at reduced speeds [165]. A processor consuming 100 mW at 100 MHz would require 1.6 W at 400 MHz—exceeding the entire power budget of most underwater sensors. This fundamental relationship between clock speed and power consumption necessitates careful optimisation of computational workloads [90].
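The clock-speed arithmetic above can be reproduced with a one-line scaling model; this is a simplification that holds supply voltage fixed, so power scales as f², which matches the quoted numbers:

```python
def scaled_power(p0_w, f0_mhz, f1_mhz):
    """Dynamic power rescaled with clock frequency, voltage held constant (P ∝ f²)."""
    return p0_w * (f1_mhz / f0_mhz) ** 2

# 100 mW at 100 MHz, scaled to 400 MHz:
p = scaled_power(0.100, 100, 400)
print(f"{p:.2f} W")  # 1.60 W, beyond most underwater sensor power budgets
```

In practice voltage is lowered alongside frequency (DVFS), so the quadratic model is a pessimistic bound rather than an exact law.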
Architectural Limitations: Underwater processors lack the hardware acceleration common in modern devices [197]:
• No GPU for the parallel matrix operations essential for deep learning
• No dedicated neural processing units (NPUs) or tensor processing units (TPUs)
• Limited SIMD instructions (often just basic NEON support)
• Single-core operation preferred due to multi-core's 3–4× power overhead
Thermal Constraints: Despite cold water providing external cooling, sealed pressure housings trap internally generated heat [12]. Sustained computation raises internal temperatures by 20–30°C, potentially exceeding component ratings and accelerating failure through thermal cycling stress. Thermal throttling further reduces already limited performance, creating a feedback loop that degrades ML inference quality during extended processing periods.
Solutions for Processing Constraints:
Model Architecture Optimisation: Designing networks specifically for embedded processors yields dramatic improvements [200], [201]. The TinyML paradigm has emerged as a crucial enabler for deploying ML on resource-constrained devices [197], [198].
Depthwise Separable Convolutions reduce computation from H·W·D_k²·M·N to H·W·D_k²·M + H·W·M·N, achieving 8–9× speedup for typical layers:
• Standard Conv2D (32→64, 3×3): 1.8M operations
• Depthwise separable equivalent: 0.2M operations
• Performance impact: <2% accuracy loss for most underwater tasks
Inverted Residual Blocks (MobileNetV2 architecture [201]) maintain representational power while minimising operations:
Block: x →(expand) 6x →(depthwise) 6x →(project) x   (171)
This expansion-filtering-projection pattern achieves ResNet-level accuracy with 10× fewer operations, making it particularly suitable for underwater acoustic signal classification and underwater image recognition tasks [103].
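The 8–9× figure follows directly from the two cost formulas above; a sketch with an arbitrary 64×64 feature map (the speedup depends only weakly on H and W):

```python
def conv_ops(h, w, dk, m, n):
    """Multiply-accumulate count for a standard Dk x Dk convolution, M -> N channels."""
    return h * w * dk * dk * m * n

def separable_ops(h, w, dk, m, n):
    """Depthwise (Dk x Dk per channel) plus pointwise (1x1, M -> N) cost."""
    return h * w * dk * dk * m + h * w * m * n

H = W = 64  # illustrative feature-map size
std = conv_ops(H, W, 3, 32, 64)
sep = separable_ops(H, W, 3, 32, 64)
print(f"standard: {std:,}  separable: {sep:,}  speedup: {std / sep:.1f}x")
```

Algebraically the ratio is 1 / (1/N + 1/D_k²), so with N = 64 output channels and 3×3 kernels it sits just under 8×, consistent with the text.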
Computation Scheduling: Intelligent scheduling maximises processor utilisation while meeting real-time constraints [90]:
Priority-based inference allocates computation based on situational criticality:
• Threat detected: Run full classification model (500 ms budget)
• Routine monitoring: Run lightweight detection only (50 ms budget)
• Idle state: Run minimal anomaly detection (5 ms budget)
Temporal amortisation spreads expensive computations across multiple time steps:
• Frame 1: Extract full feature representation (100 ms)
• Frames 2–5: Track using Kalman filter with extracted features (10 ms each)
• Frame 6: Full feature update (100 ms)
This strategy achieves 5× average speedup for video processing while maintaining tracking accuracy within 95% of full-frame processing [8].
Hardware-Software Co-Design: Optimising algorithms for specific hardware capabilities provides substantial gains [199]:
Fixed-point arithmetic using processor-native operations eliminates expensive floating-point computations. Floating-point values are converted to fixed-point representation:
x_fixed = round(x_float · 2^Q)   (172)
where x_float is the original floating-point value, x_fixed is the fixed-point representation, and Q is the number of fractional bits.
Custom assembly kernels for critical operations achieve 3–5× speedup by exploiting the single-cycle dual 16-bit multiply-accumulate (MAC) instructions available on Cortex-M4 processors, enabling real-time processing of acoustic signals at sample rates up to 48 kHz [33].
2) Memory Limitations: Every Byte Counts: Underwater sensors typically provide 256 KB–2 MB RAM and 1–8 MB flash storage—insufficient for modern neural networks requiring 10–100 MB [197], [198]. Memory constraints affect both model storage and runtime allocation for intermediate activations, requiring careful memory management throughout the ML pipeline [165].
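As a concrete illustration of the fixed-point conversion in Eq. (172), a minimal sketch (Q = 8 fractional bits is an arbitrary illustrative choice):

```python
def to_fixed(x_float, q):
    """Quantise to an integer with q fractional bits: round(x * 2^q) (Eq. 172)."""
    return round(x_float * (1 << q))

def to_float(x_fixed, q):
    """Recover the approximate floating-point value from the fixed-point integer."""
    return x_fixed / (1 << q)

q = 8  # 8 fractional bits -> resolution of 1/256
fx = to_fixed(1.5, q)
print(fx, to_float(fx, q))  # exact round trip here; worst-case error is 2^-(q+1)
```

On a Cortex-M4 the resulting integers map onto the native 16-bit MAC path mentioned above, which is where the 3–5× speedup comes from.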
Memory Bottlenecks: Peak memory usage during inference often exceeds model size due to intermediate tensors that must be stored during forward propagation:
M_peak = M_model + max_layer(M_layer,input + M_layer,output)   (173)
where M_peak is the peak memory requirement, M_model is the memory for model parameters, and M_layer,input and M_layer,output are the input and output activation memory requirements for each layer. For a modest CNN with 1M parameters processing 128×128 images:
• Model parameters: 4 MB (float32)
• Peak activation memory: 8 MB
• Total requirement: 12 MB (far exceeding the typical 2 MB RAM)
Memory Fragmentation: Dynamic allocation in constrained memory causes fragmentation, leading to allocation failures despite sufficient total memory. After 1000 allocation-deallocation cycles, available contiguous memory can drop to 30% of total, causing inference failures even when aggregate free memory appears adequate [90].
Memory Optimisation Solutions:
In-Place Operations: Modifying tensors in-place eliminates temporary allocations [197]:
Instead of: y = ReLU(x) (creates a new tensor)   (174)
Use: x = ReLU_inplace(x) (modifies the existing tensor)   (175)
This approach reduces peak memory by 40–50% for activation-heavy networks commonly used in underwater acoustic processing.
Memory Pooling: Pre-allocating memory pools eliminates fragmentation through static allocation strategies that reserve fixed-size blocks at initialisation, preventing runtime fragmentation and guaranteeing deterministic memory availability throughout deployment [199].
Progressive Inference: Processing large inputs in tiles reduces memory requirements significantly. For processing 1024×1024 underwater images with only 64 KB activation memory, the image is divided into 64×64 tiles processed sequentially with appropriate boundary handling, enabling deployment of sophisticated image recognition models on severely memory-constrained platforms [103].
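Eq. (173) can be evaluated directly; the per-layer activation sizes below are hypothetical, chosen to reproduce the 4 MB + 8 MB = 12 MB example:

```python
def peak_memory_mb(n_params, bytes_per_param, layer_acts_mb):
    """Peak inference memory (Eq. 173): parameter storage plus the largest
    input+output activation pair across layers. Sizes in decimal MB."""
    model_mb = n_params * bytes_per_param / 1e6
    act_mb = max(inp + out for inp, out in layer_acts_mb)
    return model_mb + act_mb

# 1M float32 parameters; hypothetical per-layer (input, output) activations in MB
layers = [(2.0, 6.0), (6.0, 2.0), (2.0, 1.0)]
print(peak_memory_mb(1_000_000, 4, layers), "MB")  # 12.0 MB
```

A budgeting pass like this, run over a candidate architecture before deployment, is what tells you whether tiling or in-place operations are mandatory on a 2 MB platform.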
Model Compression Techniques: Multiple complementary techniques reduce model footprint [202], [203]:
Quantisation reduces memory 4–8×:
• Float32 → Int8: 4× reduction, <1% accuracy loss
• Int8 → Int4: Additional 2× reduction, 2–5% accuracy loss
• Binary networks: 32× reduction, 10–15% accuracy loss (acceptable for simple detection)
Quantisation-Aware Training (QAT) [204] incorporates quantisation effects during training, achieving better accuracy preservation than post-training quantisation—particularly important for underwater applications where retraining opportunities are limited.
Pruning removes redundant parameters through magnitude-based thresholding [203]:
w_pruned = w if |w| > θ, and 0 otherwise   (176)
where w is the original weight, w_pruned is the pruned weight, and θ is the pruning threshold. Achieving 90% sparsity with <5% accuracy loss is typical for underwater acoustic classification tasks.
Knowledge Distillation creates compact student models from large teacher networks [205]:
• Teacher: ResNet-50 (25M parameters, 98 MB)
• Student: MobileNet-v3-Small (1.5M parameters, 6 MB)
• Performance retention: 94% of teacher accuracy
This technique has proven particularly effective for underwater species classification, where complex teacher models trained on large datasets can transfer knowledge to deployable student models [121].
3) Energy Budget Management: Energy represents the ultimate constraint—when batteries die, missions fail. Typical underwater sensors operate on 100–1000 Wh batteries that must last months to years, requiring meticulous energy management at every system level [28], [78].
Power Budget Breakdown:
• Sensing: 10–50 mW continuous
• Processing: 100–500 mW during inference
• Communication: 10–50 W during acoustic transmission
• Idle: 1–10 mW sleep mode
The dramatic range (10,000× between sleep and transmission) demands intelligent power management that maximises time in low-power states while ensuring critical events are captured and communicated [12].
Energy-Aware ML Solutions:
Adaptive Duty Cycling: ML predicts interesting events to optimise sampling schedules [28], [37]. The system achieves 90% event capture with 95% energy reduction by intelligently switching between:
• High-rate sampling (10 Hz, 50 mW) during predicted activity periods
• Low-rate sampling (0.1 Hz, 0.5 mW) during quiescent periods
Reinforcement learning-based approaches have demonstrated particular effectiveness in learning optimal duty cycling policies that adapt to changing environmental conditions [206].
Hierarchical Processing: Cascaded models filter data at increasing complexity, reducing average energy consumption dramatically [165]:
1) Tiny anomaly detector (1 mJ/inference): Filters 99% of normal data
2) Lightweight classifier (10 mJ/inference): Identifies event type for anomalies
3) Full analysis network (100 mJ/inference): Detailed classification for significant events
Average energy per sample: 0.99 × 1 + 0.009 × 10 + 0.001 × 100 = 1.18 mJ, versus 100 mJ for always running the full model—an 85× improvement [90].
Energy-Aware Neural Architecture Search (ENAS): Automated design optimising the energy-accuracy Pareto frontier [207]:
Objective = Accuracy − λ · log(Energy)   (177)
Discovered architectures achieve 90% accuracy at 10× lower energy than manually designed networks, with the additional benefit of being automatically adapted to specific hardware platforms [197].
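The expected energy of the three-stage cascade above is just a probability-weighted sum; checking the 1.18 mJ figure:

```python
def cascade_energy(stages):
    """Expected energy per sample for a detection cascade: sum over stages of
    (fraction of samples for which that stage is the last one run) x (its cost)."""
    return sum(frac * energy_mj for frac, energy_mj in stages)

stages = [
    (0.990, 1.0),    # tiny anomaly detector handles 99% of data
    (0.009, 10.0),   # lightweight classifier for flagged anomalies
    (0.001, 100.0),  # full analysis network for significant events
]
e = cascade_energy(stages)
print(f"{e:.2f} mJ/sample, {100.0 / e:.0f}x cheaper than always running the full model")
```

The cascade only pays off if the cheap stages have high recall: an anomaly the tiny detector misses never reaches the full network at all.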
Energy Harvesting Integration: Recent advances in underwater energy harvesting—from ocean currents, thermal gradients, and even biofouling organisms—provide opportunities for extended deployments [28]. ML-based predictive models optimise the balance between energy harvesting rates and consumption, ensuring sustainable operation even under variable environmental conditions.

B. Environmental Challenges
The ocean environment actively attacks electronic systems through multiple mechanisms: biofouling covers sensors within weeks, corrosion penetrates housings within months, and pressure crushes inadequately designed enclosures [18], [51]. These environmental factors not only threaten hardware but also degrade ML model performance as sensor characteristics drift from their training distributions.
1) Biofouling: The Biological Attack: Marine organisms colonise any submerged surface, forming complex communities that obscure sensors and alter acoustic properties [208]. The fouling process follows predictable stages that create progressively greater challenges for ML systems:
Initial Conditioning (Hours): Organic molecules form a conditioning film altering surface properties:
• Thickness: 10–100 nm
• Effect: Changes optical properties, reduces transparency by 5–10%
• ML impact: Minor calibration drift, correctable with baseline adjustment
Microbial Colonisation (Days): Bacteria and diatoms form biofilms:
• Thickness: 10–100 μm
• Effect: Scatters light, attenuates acoustic signals by 3–6 dB
• ML impact: Increased noise floor, reduced signal-to-noise ratio [56]
Macrofouling (Weeks–Months): Barnacles, mussels, and algae establish permanent communities:
• Thickness: 1–10 cm
• Effect: Complete sensor obstruction, 20–30 dB acoustic attenuation
• ML impact: Severe sensor degradation, potential complete failure [51]
ML Robustness to Fouling:
Fouling-Aware Training: Training data augmentation simulates progressive fouling [209]:
• Gaussian blur with kernel size
proportional to fouling level
• Additive noise scaled by fouling severity
• Contrast reduction modelling light attenuation
• Spectral filtering for acoustic frequency-dependent effects
Models trained with fouling augmentation maintain 85% accuracy after 3 months of deployment versus 45% for standard training, representing a critical improvement for long-term deployments [210].
Adaptive Calibration: Online learning compensates for sensor drift using self-supervised objectives [211]:
θ_{t+1} = θ_t − η ∇L_self-supervised(x_t, ŷ_t)   (178)
where θ_t are the model parameters at time t, η is the learning rate, L_self-supervised is the self-supervised loss function, x_t is the input at time t, and ŷ_t = f_θ(x_t) is the model's prediction used to compute reconstruction or consistency losses without external labels. Self-supervised objectives detect and correct for fouling without requiring labelled data:
• Temporal consistency: Adjacent frames should exhibit smooth transitions
• Physical constraints: Measurements should obey conservation laws and physical bounds
• Cross-modal agreement: Different sensors measuring related phenomena should correlate
Elastic Weight Consolidation (EWC) [211] prevents catastrophic forgetting during online adaptation by constraining weight updates to preserve previously learned knowledge while accommodating sensor drift.
Multi-Sensor Fusion for Robustness: Redundant sensors with different fouling characteristics enable weighted fusion based on estimated degradation [12]:
ŷ = Σ_{i=1}^{N} w_i(d_i) · y_i,   w_i(d_i) = e^{−α d_i} / Σ_j e^{−α d_j}   (179)
where ŷ is the fused estimate, N is the number of sensors, y_i is the measurement from sensor i, w_i(d_i) is the weight for sensor i based on its degradation level d_i, and α is a sensitivity parameter controlling how quickly weights decrease with degradation. This approach maintains system performance despite individual sensor fouling by dynamically adjusting sensor contributions.
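The degradation-weighted fusion of Eq. (179) is a softmax over negated degradation scores; a sketch with hypothetical degradation levels and readings:

```python
import math

def fusion_weights(degradations, alpha=1.0):
    """w_i = exp(-alpha * d_i) / sum_j exp(-alpha * d_j) (Eq. 179)."""
    exps = [math.exp(-alpha * d) for d in degradations]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(measurements, degradations, alpha=1.0):
    """Weighted estimate: heavily fouled sensors contribute less."""
    w = fusion_weights(degradations, alpha)
    return sum(wi * yi for wi, yi in zip(w, measurements))

# Clean sensor (d=0) vs. heavily fouled sensor (d=2) observing the same quantity
print(fusion_weights([0.0, 2.0]))       # first weight dominates
print(fuse([10.0, 14.0], [0.0, 2.0]))   # estimate pulled towards the clean sensor
```

The sensitivity α sets how sharply trust collapses with fouling: large α approaches hard selection of the cleanest sensor, small α approaches a plain average.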
2) Corrosion: The Chemical Attack: Seawater's high salinity (35 ppt) and dissolved oxygen create an aggressive corrosion environment. Galvanic corrosion between dissimilar metals accelerates degradation, while crevice corrosion attacks the sealed joints critical for pressure integrity [51].
Corrosion Rates (typical values):
• Aluminium: 0.1–0.3 mm/year
• Stainless steel (316L): 0.01–0.05 mm/year
• Titanium: <0.001 mm/year (but significantly more expensive)
Failure Modes:
• Pitting corrosion: Creates pinholes allowing water ingress
• Crevice corrosion: Attacks sealed joints and O-ring grooves
• Stress corrosion cracking: Propagates under mechanical load
Corrosion-Tolerant ML Systems:
Predictive Maintenance Models: ML predicts corrosion progression from environmental sensor readings [212]. Input features include conductivity, temperature, pH, dissolved oxygen concentration, and cumulative deployment time. Random Forest and gradient boosting models achieve 87% accuracy in predicting remaining useful life within a 30-day window, enabling proactive maintenance scheduling before catastrophic failure [90].
Graceful Degradation Strategies: As sensors fail from corrosion, ML systems adapt through a systematic process [12]:
1) Detect failed sensors through statistical anomaly detection
2) Retrain or fine-tune models excluding failed inputs
3) Increase reliance on remaining healthy sensors through reweighted fusion
4) Activate backup systems when degradation exceeds operational thresholds
Redundant Encoding for Model Survival: Critical ML models are stored with Reed-Solomon error correction, enabling recovery from up to 30% flash memory corruption due to corrosion-induced failures [213]. This redundancy ensures that even partially degraded hardware can maintain ML inference capabilities.
3) Pressure Effects: The Physical Challenge: Pressure increases by 1 atmosphere per 10 metres of depth, reaching 1000+ atmospheres in ocean trenches. This creates multiple challenges for both hardware and ML systems [18].
Component Compression Effects:
• Air spaces compress, changing acoustic transducer properties
• Semiconductor characteristics shift due to piezoelectric effects
• Battery capacity reduces by 5–10% per 100 atmospheres
• Crystal oscillator frequencies drift, affecting timing synchronisation
Seal Degradation:
• O-rings extrude through gaps under high pressure differentials
• Gaskets permanently deform after pressure cycling
• Adhesives fail under repeated compression-decompression cycles
Pressure-Adaptive ML Techniques:
Depth-Aware Model Selection: Different models optimised for different pressure regimes [3]:
• Shallow water models (0–100 m): Standard calibration
• Mid-water models (100–1000 m): Pressure-compensated parameters
• Deep water models (>1000 m): Specialised deep-sea training data
Pressure Compensation in Predictions: Incorporating pressure as an explicit input to environmental models [56]:
ŷ = f(x, p) = f_base(x) + f_pressure(p) · g(x)   (180)
where ŷ is the predicted output, x is the input features, p is the pressure measurement, f_base is the base prediction function, f_pressure(p) captures pressure-dependent modifications learnt during training across multiple depth profiles, and g(x) is a modulation function.
4) Temperature Variations: Temporal and Spatial: Ocean temperatures vary from −2°C near the poles to 30°C in tropical surface waters, with dramatic thermoclines creating 10–15°C changes over tens of metres [214].
Temperature Effects on Electronics:
• Clock drift: ±100 ppm over the operational temperature range
• Battery capacity: 50% reduction at 0°C versus 25°C
• Semiconductor parameters: 2–3% variation per 10°C
• Acoustic transducer sensitivity: 1–2 dB variation per 10°C
Temperature-Robust ML:
Temperature-Aware Normalisation: Compensating for temperature-induced sensor drift through learnt temperature-dependent calibration coefficients [56]:
x_norm(T) = (x − μ(T)) / σ(T)   (181)
where x_norm(T) is the temperature-normalised input, x is the raw sensor reading, T is the temperature, and μ(T) and σ(T) are temperature-dependent mean and standard deviation parameters learnt during training.
Multi-Temperature Training: Training across temperature ranges improves robustness without requiring online adaptation [215]:
• Collect training data across seasonal temperature cycles
• Augment with temperature-dependent noise models
• Use domain adaptation techniques between temperature regimes
• Employ batch normalisation with temperature-stratified statistics
Figure 11 illustrates the comprehensive environmental adaptation framework that integrates multiple strategies to maintain ML performance under challenging underwater conditions.

C. Deployment Considerations
Deploying ML systems underwater requires addressing unique challenges absent in terrestrial deployments: collecting training data costs thousands of dollars per day, updating models requires physical recovery or acoustic communication, and distributed learning must operate over severely bandwidth-limited channels [3], [41].
1) Training Data Collection: The Million-Dollar Dataset: Unlike terrestrial applications with abundant labelled data, underwater datasets require expensive ship operations and expert annotation [216], [217].
Collection Costs:
• Research vessel charter: $20,000–50,000/day
• ROV operations: $50,000–100,000/day
• Expert marine biologist annotation: $100–500/hour
• Total cost for 10,000 high-quality labelled images: $500,000–2,000,000
Data Scarcity Comparison:
• ImageNet: 14 million labelled images available free
• Typical underwater dataset: 10,000 images costing $1M+
• Effective ratio: 1,400× less data at 1,000,000× higher cost
Solutions for Limited Training Data:
Transfer Learning from Terrestrial Datasets: Pre-training on abundant terrestrial data reduces the required underwater samples by 90% while maintaining acceptable performance [215], [218]. Progressive fine-tuning freezes early layers (which learn general features such as edges and textures) while adapting later layers to underwater-specific characteristics such as colour distortion, turbidity effects, and marine-specific object classes [219].
Recent work has demonstrated that ImageNet-pretrained models transfer effectively to underwater domains when combined with domain-specific augmentation simulating underwater optical effects [217]:
I_underwater = I_clean · e^{−βd} + B_∞ (1 − e^{−βd})   (182)
where I_underwater is the degraded underwater image, I_clean is the original scene, β represents the attenuation coefficient, d is the distance, and B_∞ is the backscatter background illumination.
Synthetic Data Generation: Physics-based simulation creates unlimited training data by rendering 3D underwater scenes with accurate light transport modelling [217]:
• Wavelength-dependent light attenuation (blue penetrates deepest)
• Forward and backward scattering from suspended particles
• Caustic patterns from surface wave focusing
• Marine snow and particle effects
Training on 90% synthetic + 10% real data achieves 95% of full real-data performance while reducing data collection costs by over 90% [105].
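The image-formation model of Eq. (182) behind this augmentation applies per pixel; a sketch where β, the distances, and the pixel values are purely illustrative:

```python
import math

def underwater_pixel(i_clean, beta, d, b_inf):
    """Attenuated signal plus backscatter veiling light (Eq. 182)."""
    t = math.exp(-beta * d)          # transmission along the viewing path
    return i_clean * t + b_inf * (1.0 - t)

# A bright pixel (0.9) seen through water with attenuation 0.3 per metre
for d in (0.0, 2.0, 20.0):
    print(d, round(underwater_pixel(0.9, 0.3, d, 0.2), 3))
# At d=0 the clean value survives; at large d the pixel tends to B_inf
```

In a full simulator β and B_∞ are wavelength-dependent (red attenuates fastest), which is what produces the characteristic blue-green cast of underwater imagery.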
Active Learning for Efficient Annotation: Selecting the most informative samples for labelling based on model uncertainty reduces annotation requirements by 60–70% [72].
[Figure: environmental challenges (biofouling, corrosion, pressure, temperature) feeding into an ML system with adaptive capabilities (online calibration + EWC, predictive maintenance, depth-aware models, multi-temperature training), yielding robust underwater ML performance.] Fig. 11. Environmental adaptation framework for underwater ML systems. Red arrows indicate environmental challenges affecting system performance; green arrows represent adaptive solutions; the yellow output block shows the performance maintained through integrated adaptation strategies.
Entropy-based sample selection prioritises images where the current model is most uncertain:
H(x) = − Σ_c p(c|x) log p(c|x)   (183)
where H(x) is the entropy (uncertainty) for sample x, c indexes over classes, and p(c|x) is the predicted probability of class c given input x. This approach has proven particularly effective for rare species identification, where the long-tail distribution of marine species makes uniform sampling highly inefficient [216].
Self-Supervised Pre-Training: Contrastive learning on unlabelled underwater video creates powerful feature extractors without expensive annotation [216]:
L_contrastive = − log [ exp(sim(z_i, z_j)/τ) / Σ_{k=1}^{2N} 1_{[k≠i]} exp(sim(z_i, z_k)/τ) ]   (184)
where z_i and z_j are embeddings of two augmented views of the same image (a positive pair), z_k are embeddings of other images in the batch, sim(·,·) is a similarity function (typically cosine similarity), τ is a temperature parameter, N is the batch size, and 1_{[k≠i]} is an indicator function. This self-supervised pre-training enables 85% classification accuracy with only 100 labelled examples per species—critical for rare deep-sea organisms where extensive labelled datasets are impossible to collect.
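Entropy-based selection (Eq. 183) reduces to scoring each candidate's predicted class distribution and sending the highest-scoring ones to the annotator; a toy sketch:

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (Eq. 183), in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_for_labelling(predictions, k=1):
    """Return indices of the k most uncertain samples (highest entropy first)."""
    scored = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]), reverse=True)
    return scored[:k]

preds = [
    [0.99, 0.01],  # confident -> low annotation value
    [0.50, 0.50],  # maximally uncertain -> annotate first
    [0.80, 0.20],
]
print(select_for_labelling(preds, k=1))  # picks the uniform prediction
```

For the long-tailed species distributions mentioned above, entropy is often combined with class-balancing heuristics so the budget is not spent entirely on a few ambiguous common classes.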
2) Model Updates Underwater: The Isolation Challenge: Deployed sensors cannot easily receive model updates: acoustic bandwidth limits transfers to bytes per second, and physical recovery requires expensive ship operations [3], [20].

Communication Constraints:
• Acoustic bandwidth: 1–10 kbps typical
• Propagation delay: 0.67 ms/m (1500 m/s sound speed)
• Error rates: 10–30% packet loss in challenging conditions
• Energy cost: 10–50 W during transmission

Update Mechanisms:

Differential Updates: Transmitting only changed parameters reduces update size by 95% for fine-tuning updates [41]:

$\Delta\theta = \theta_{\text{new}} - \theta_{\text{old}}$,  (185)

where $\Delta\theta$ is the parameter difference, $\theta_{\text{new}}$ are the updated parameters, and $\theta_{\text{old}}$ are the previous parameters. Sparse encoding of $\Delta\theta$ (transmitting only non-zero differences) combined with entropy coding achieves compression ratios of 20–100× compared to full model transmission [90].

Progressive Updates: Spreading updates across multiple communication windows accommodates acoustic channel constraints [164]:
• Segment model updates into chunks fitting acoustic packet size (typically 256–1024 bytes)
• Prioritise updates to most critical layers
• Use erasure codes to tolerate packet loss
• Verify integrity before activating updated model

Edge Learning: Training models underwater without external updates through online adaptation [102]. Incremental learning algorithms adapt to distribution shifts caused by seasonal changes, biofouling, and sensor ageing:

$\theta_{t+1} = \theta_t - \eta \nabla L(x_t, y_t) - \lambda(\theta_t - \theta_0)$,  (186)

where $\theta_t$ are the model parameters at time $t$, $\eta$ is the learning rate, $L(x_t, y_t)$ is the loss on the current sample $(x_t, y_t)$, $\lambda$ is the regularisation coefficient, and $\theta_0$ are the initial pre-deployment parameters. The regularisation term $\lambda(\theta_t - \theta_0)$ prevents catastrophic forgetting of pre-deployment training whilst allowing adaptation to local conditions [211].
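A minimal sketch of the differential-update mechanism of Eq. (185): the sender encodes only parameters whose change exceeds a threshold as (index, delta) pairs, and the receiver patches its stored copy. The threshold, array sizes, and function names are illustrative assumptions; a real system would add entropy coding on top:

```python
import numpy as np

def encode_diff(theta_new, theta_old, threshold=1e-3):
    """Sparse differential update (Eq. 185): ship only parameters whose
    change exceeds `threshold`, as (index, delta) pairs."""
    delta = theta_new - theta_old
    idx = np.flatnonzero(np.abs(delta) > threshold)
    return idx, delta[idx]

def apply_diff(theta_old, idx, values):
    """Receiver side: reconstruct the updated parameter vector."""
    theta = theta_old.copy()
    theta[idx] += values
    return theta

rng = np.random.default_rng(0)
old = rng.normal(size=1000)
new = old.copy()
new[:30] += 0.5                      # fine-tuning touched 3% of the weights
idx, vals = encode_diff(new, old)
restored = apply_diff(old, idx, vals)
```

Here only 30 of 1000 entries cross the acoustic link, a 33× reduction before any entropy coding is applied.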
3) Distributed and Federated Learning Strategies: Underwater networks can collaboratively learn despite communication constraints, enabling knowledge sharing without centralising sensitive data [41], [220].

Hierarchical Federated Learning: Three-level aggregation reduces communication overhead by 100× compared to flat federated learning [41]:
1) Level 1 (Local): Nodes within acoustic range average models during opportunistic encounters
2) Level 2 (Regional): Cluster heads aggregate local models and exchange with neighbouring clusters
3) Level 3 (Global): Surface gateways perform final aggregation and distribute the updated global model

This hierarchical structure exploits the natural topology of underwater networks while minimising expensive long-range acoustic communication [3].

Gossip-Based Learning: Gradual model propagation through peer-to-peer exchange achieves consensus without centralised coordination [41]:

$\theta_i^{(t+1)} = \frac{\theta_i^{(t)} + \theta_j^{(t)}}{2}$,  (187)

Pairwise model averaging when AUVs or mobile nodes encounter each other achieves network-wide consensus in $O(\log N)$ communication rounds, exploiting natural mobility patterns for model dissemination [9].

Communication-Efficient Gradient Compression: Techniques for reducing gradient communication overhead in bandwidth-constrained underwater channels [41]:
• Top-K sparsification: Transmit only the K largest gradient elements
• Quantised gradients: Reduce precision from 32-bit to 1–8 bits
• Error feedback: Accumulate quantisation errors for future transmission

Combined, these techniques achieve 100–1000× compression with minimal impact on convergence, making federated learning practical even over low-bandwidth acoustic channels. Table XXIII summarises the key deployment strategies and their applicability to different underwater scenarios.

D.
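The first and third compression techniques above can be combined in a few lines. This is a sketch under our own naming, not the cited implementation: top-K sparsification keeps the K largest-magnitude entries, and error feedback carries the untransmitted remainder into the next round:

```python
import numpy as np

def topk_with_feedback(grad, residual, k):
    """Top-K gradient sparsification with error feedback.

    grad     : current gradient vector
    residual : accumulated untransmitted error from previous rounds
    k        : number of elements to transmit
    Returns (indices, values, new_residual); only (indices, values)
    need to cross the acoustic link.
    """
    corrected = grad + residual                 # re-inject past error
    idx = np.argsort(np.abs(corrected))[::-1][:k]
    values = corrected[idx]
    new_residual = corrected.copy()
    new_residual[idx] = 0.0                     # transmitted mass removed
    return idx, values, new_residual

grad = np.array([0.05, -2.0, 0.3, 0.01, 1.2])
idx, vals, res = topk_with_feedback(grad, np.zeros_like(grad), k=2)
```

Only the two dominant entries are sent; the small ones accumulate in the residual, so no gradient mass is permanently lost across rounds.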
Case Studies: Successful Deployments

Real-world deployments demonstrate that implementation challenges, while significant, can be overcome through careful engineering and adaptive strategies. These case studies span military, commercial, and research applications, providing concrete evidence of ML's transformative impact on underwater operations.

1) Military: Project AMMO (Autonomous Mobile Marine Observatory): The U.S. Navy's Project AMMO deployed ML-enabled underwater sensor networks for persistent maritime surveillance, achieving revolutionary improvements in threat detection and response time [45].

System Architecture:
• 200 autonomous nodes with embedded ML processing
• Hierarchical network: sensors → cluster heads → gateway buoys → satellites
• Edge AI: YOLOv5-nano for object detection, LSTM for behaviour prediction
• Distributed learning: Federated updates every 24 hours via acoustic links

Technical Challenges Addressed:

Stealth Requirements: Minimal acoustic emissions to avoid detection required ML-predicted optimal transmission windows, using Q-learning to identify periods of high ambient noise that mask sensor transmissions [32]. Result: 95% reduction in detectable transmission frequency.

Adversarial Robustness: Protection against spoofing and jamming attacks through adversarial training with synthetically generated attack patterns. The system incorporates anomaly detection to identify potential adversarial inputs and falls back to conservative decision-making when under attack [13]. Result: 99.7% correct classification despite active jamming.

Rapid Adaptation: Response to previously unseen vessel types through few-shot learning from as few as 10 examples, using metric learning to embed new classes into the existing feature space without full retraining [216]. Result: 97% faster model updates versus full retraining.
Operational Achievements:
• Detection accuracy: 98.5% for surface vessels, 94% for submarines
• False alarm rate: Reduced from 8/day to 0.3/day
• Response time: 3 minutes from detection to alert (vs. 45 minutes traditional)
• Network lifetime: Extended from 3 months to 14 months through ML-optimised power management
• Coverage area: 10,000 km² with 200 nodes

Key Innovation (Collaborative Tracking): Multiple sensors collaborate using distributed particle filters where each sensor maintains local particle sets representing target state estimates. High-weight particles (likely target states) are shared with neighbouring sensors through acoustic links, enabling network-wide tracking fusion [221]:

$p(x_t \mid z_{1:t}^{1:N}) \propto \prod_{i=1}^{N} p(z_t^{i} \mid x_t) \cdot p(x_t \mid x_{t-1})$,  (188)

where $p(x_t \mid z_{1:t}^{1:N})$ is the posterior distribution of target state $x_t$ given measurements $z_{1:t}^{1:N}$ from all $N$ sensors up to time $t$, $p(z_t^{i} \mid x_t)$ is the likelihood of measurement $z_t^{i}$ from sensor $i$, and $p(x_t \mid x_{t-1})$ is the state transition probability.

This achieves submarine tracking accuracy within 50 m at 10 km range, impossible for single sensors operating independently.

2) Commercial: Norwegian Salmon Farm Monitoring: Marine Harvest (now Mowi), the world's largest salmon producer, deployed ML-based monitoring across 50 salmon farms, revolutionising aquaculture management through early disease detection and optimised feeding [17].

System Components:
• 500 underwater cameras with edge processing (NVIDIA Jetson Nano)
• 2000 environmental sensors (dissolved O₂, temperature, salinity, current velocity)
• Biomass estimation using stereo computer vision
• Disease detection through behavioural analysis

ML Solutions Deployed:

Fish Counting and Biomass Estimation: Custom YOLOv8-nano detector trained on 50,000 annotated fish images, combined with stereo vision CNN for size estimation and LSTM for temporal smoothing [121].
TABLE XXIII
DEPLOYMENT STRATEGY SELECTION GUIDE FOR UNDERWATER ML SYSTEMS

Deployment Scenario | Data Strategy | Update Strategy | Learning Strategy | Key Considerations
Short-term (<1 month) | Pre-collected | None required | Pre-trained only | Minimise complexity
Medium-term (1–12 months) | Transfer learning | Differential updates | Online adaptation | Balance adaptability vs. stability
Long-term (>1 year) | Active learning + synthetic | Hierarchical federated | Continual learning | Prevent catastrophic forgetting
Deep sea (>1000 m) | Synthetic + few-shot | Physical recovery only | Edge learning | Extreme isolation constraints
Mobile (AUV-based) | Opportunistic collection | Gossip-based | Collaborative learning | Exploit mobility for updates

Processing pipeline achieves:
• Counting accuracy: ±3% (vs. ±15% manual)
• Size estimation: ±5% biomass accuracy (vs. ±20% sampling)
• Processing rate: 30 fps on edge device

Disease Detection via Behaviour Analysis: Sea lice infestation and other diseases detected through swimming pattern analysis before visible symptoms appear [17]:
• Behavioural features: velocity variance, turning rate, depth variation, scratching frequency, schooling coherence
• LSTM-based sequence model predicts health status from 5-minute behavioural windows
• Early detection: 3–5 days before visible symptoms

Operational Impact:
• Mortality reduction: 32% through early disease intervention
• Feed optimisation: 18% reduction through ML-predicted demand feeding
• Labour savings: 60% reduction in diver inspections
• Revenue increase: $12M annually across 50 farms
• ROI: 14 months payback period

Environmental Monitoring: ML predicts harmful algal blooms 72 hours ahead using ConvLSTM for spatial-temporal ocean patterns combined with satellite ocean colour data [11]:

$H_{t+72} = f_{\text{ConvLSTM}}(S_{t-7:t}, O_{t-7:t}, T_{t-7:t})$,  (189)

where $H_{t+72}$ is the predicted harmful algal bloom indicator at time $t + 72$ hours, $f_{\text{ConvLSTM}}$ is the ConvLSTM network function, $S_{t-7:t}$ is satellite imagery from time $t - 7$ days to $t$, $O_{t-7:t}$ is
ocean sensor data over the same period, and $T_{t-7:t}$ represents temperature profiles. This 72-hour warning provides sufficient time to relocate cages or adjust feeding schedules, preventing catastrophic losses.

3) Research: FathomNet Deep-Sea Exploration: MBARI's FathomNet project created the largest ML-powered underwater image analysis system, processing 271 TB of deep-sea imagery to accelerate marine discovery [16], [216].

System Scale:
• Archive: 30 years of ROV footage comprising 28,000 hours of video
• Annotations: 8.2 million labels across 200,000 taxonomic concepts
• Data volume: 271 TB of processed imagery
• Collaboration: 84 institutional partners contributing data and expertise

ML Architecture:

Multi-Scale Object Detection: EfficientDet-D7 backbone handles extreme scale variations from microscopic larvae (sub-millimetre) to whale sharks (12+ metres), achieving 89% mAP across 200,000 marine concepts through multi-scale feature pyramid processing [222].

Few-Shot Species Classification: Prototypical networks enable identification of rare species from only 5–10 examples [216]:

$p(y = k \mid x) = \frac{\exp(-d(f_{\theta}(x), c_k))}{\sum_{k'} \exp(-d(f_{\theta}(x), c_{k'}))}$,  (190)

where $c_k$ is the prototype (mean embedding) for class $k$. This capability is critical for documenting new discoveries in unexplored regions where labelled examples are unavailable.

Temporal Context Integration: 3D ConvNets process video sequences to distinguish species through movement patterns when visual features alone are insufficient, essential for cryptic species and poor visibility conditions [16].
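The prototypical classification rule of Eq. (190) reduces to a softmax over negative distances to class prototypes. The sketch below uses squared Euclidean distance, 2-D embeddings, and synthetic support sets purely for illustration:

```python
import numpy as np

def prototypical_probs(query_emb, prototypes):
    """Prototypical-network classification (Eq. 190): softmax over
    negative squared Euclidean distances to class prototypes.

    query_emb  : (d,) embedding f_theta(x) of the query image
    prototypes : (K, d) mean support-set embedding c_k of each class
    """
    d2 = np.sum((prototypes - query_emb) ** 2, axis=1)   # d(f(x), c_k)
    logits = -d2
    logits -= logits.max()                               # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# 3 classes, 5 support examples each, 2-D embeddings for illustration.
rng = np.random.default_rng(1)
support = rng.normal(size=(3, 5, 2)) + np.array([[[0, 0]], [[4, 0]], [[0, 4]]])
prototypes = support.mean(axis=1)                        # c_k = mean embedding
query = np.array([3.8, 0.2])                             # lies near class 1
probs = prototypical_probs(query, prototypes)
```

Adding a new species only requires computing one more prototype from 5–10 support embeddings; no retraining of $f_\theta$ is needed.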
Scientific Impact:
• New species discovered: 147 through automated anomaly detection flagging unusual specimens for expert review
• Analysis speedup: 10,000× (30 years of footage analysed in 3 months)
• Behavioural insights: 42 previously unknown migration patterns identified
• Ecosystem monitoring: Real-time biodiversity tracking at 15 observatory sites
• Open science: 2.1M images publicly available for the research community

4) Lessons Learned Across Deployments: Synthesis of experiences across military, commercial, and research deployments reveals common success factors and pitfalls to avoid:

Start Simple, Iterate Quickly: Initial deployments should use proven architectures (YOLOv5/v8-nano, MobileNet, ResNet-18) rather than novel approaches. Complexity should be added only after establishing baseline performance in the actual deployment environment. The gap between laboratory and field performance is often larger than expected [8].

Design for Failure: Every component will eventually fail underwater. Systems must gracefully degrade, maintaining core functionality despite sensor losses, communication failures, or model corruption [12]:
• Redundant sensors with independent failure modes
• Fallback to simpler models when resources are constrained
• Automatic detection and isolation of failed components
• Graceful capability reduction rather than complete failure

Validate Extensively Before Deployment: Tank testing catches 90% of issues at 1% of the cost of ocean deployment. Progressive validation stages (tank → harbour → coastal → open ocean) prevent catastrophic failures and build confidence in system reliability [223].

Maintain Human Oversight: Full automation remains premature for most applications. Human-in-the-loop systems achieve better outcomes while building operator trust in ML predictions. Critical decisions should require human confirmation, with ML providing recommendations and confidence estimates [8].
Document Everything: Underwater deployments generate invaluable data for future improvements. Comprehensive logging, including failures, environmental conditions, and edge cases, accelerates learning across the community and enables retrospective analysis of system behaviour [216].

Table XXIV provides a comprehensive summary of implementation challenges, solutions, and expected outcomes based on the case studies and literature reviewed.

VII. FUTURE RESEARCH DIRECTIONS

The intersection of ML and underwater communications stands at an inflection point where emerging technologies promise to overcome current limitations while opening entirely new application domains [3], [224]. Recent breakthroughs in physics-informed neural networks, transformer architectures, large language models, and quantum computing offer solutions to fundamental challenges that have constrained underwater systems for decades [225], [226]. This section explores promising research directions that will shape the next generation of intelligent underwater networks, examining both incremental advances that enhance existing capabilities and revolutionary approaches that could fundamentally transform how we interact with the ocean environment.

A. Emerging ML Technologies

The rapid evolution of ML continues to produce architectures and training paradigms with profound implications for underwater applications [56]. These emerging technologies address specific limitations of current approaches while introducing capabilities previously thought impossible in resource-constrained underwater environments.

1) Physics-Informed Neural Networks: Bridging Data and Knowledge: Physics-Informed Neural Networks (PINNs) represent a paradigm shift from purely data-driven learning to hybrid approaches that incorporate centuries of oceanographic knowledge directly into neural network training [225].
This fusion addresses the fundamental challenge of data scarcity underwater while ensuring physically consistent predictions critical for safety and reliability [227], [228].

Acoustic Propagation Modelling with PINNs: Traditional acoustic models solve the Helmholtz or parabolic equations numerically, requiring extensive computational resources and detailed environmental knowledge. PINNs learn solutions that satisfy both governing equations and sparse measurements, achieving remarkable efficiency gains [229], [230]. The acoustic pressure field $p(x, y, z, f)$ satisfies the Helmholtz equation:

$\nabla^2 p + k^2(x, y, z)\, p = 0$,  (191)

where the wavenumber $k = 2\pi f / c(x, y, z)$ depends on the spatially-varying sound speed. The PINN loss function combines data fidelity and physics constraints:

$\mathcal{L} = \lambda_{\text{data}} \sum_{i=1}^{N_d} |p_N(x_i) - p_{\text{measured},i}|^2 + \lambda_{\text{PDE}} \sum_{j=1}^{N_c} |\nabla^2 p_N(x_j) + k^2(x_j)\, p_N(x_j)|^2$,  (192)

The first term fits sparse measurements while the second enforces wave physics throughout the domain. Automatic differentiation computes spatial derivatives analytically, avoiding numerical approximation errors [231].

Recent advances have addressed key challenges in underwater PINN deployment. Yoon et al. [232] developed OceanPINN for managing spatially non-coherent data through magnitude-based training and phase-refined prediction, achieving improved wavenumber estimation accuracy. Tang et al. [233] introduced PreT-OceanPINN with a two-stage pretraining optimisation approach that significantly improves high-frequency component prediction. Chen et al. [227] proposed combining the retarded envelope function from parabolic equation theory with PINN formulations, demonstrating mean square errors as low as 0.01 for two-dimensional acoustic field prediction.
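To make the structure of Eq. (192) concrete, the sketch below evaluates the composite data-plus-PDE loss for a 1-D Helmholtz problem, $p'' + k^2 p = 0$. For illustration only: the candidate "networks" are closed-form functions, and the Laplacian is approximated by central finite differences rather than the automatic differentiation a real PINN would use:

```python
import numpy as np

def pinn_loss(net, x_data, p_data, x_colloc, k,
              lam_data=1.0, lam_pde=1.0, h=1e-3):
    """Composite PINN loss of Eq. (192) for 1-D Helmholtz: p'' + k^2 p = 0.

    net : callable mapping positions to predicted pressure
    x_data / p_data : sparse measurement locations and values (data term)
    x_colloc : collocation points where the PDE residual is enforced
    """
    data_term = np.mean((net(x_data) - p_data) ** 2)
    # Finite-difference Laplacian; a PINN would use autodiff here.
    lap = (net(x_colloc + h) - 2 * net(x_colloc) + net(x_colloc - h)) / h**2
    pde_term = np.mean((lap + k**2 * net(x_colloc)) ** 2)
    return lam_data * data_term + lam_pde * pde_term

k = 2.0
exact = lambda x: np.sin(k * x)      # satisfies p'' + k^2 p = 0 exactly
wrong = lambda x: np.sin(3.0 * x)    # violates the PDE for k = 2

x_d = np.linspace(0.0, 1.0, 5)       # 5 sparse "measurements"
x_c = np.linspace(0.0, 1.0, 50)      # 50 collocation points
loss_exact = pinn_loss(exact, x_d, exact(x_d), x_c, k)
loss_wrong = pinn_loss(wrong, x_d, exact(x_d), x_c, k)
```

The exact solution drives both terms to near zero, while the wrong candidate is penalised by the PDE residual even where no measurements exist, which is exactly how physics constraints regularise sparse-data regimes.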
Key advantages for underwater applications include:
• Data efficiency: Accurate field prediction from 10–100 measurements versus millions for purely data-driven approaches
• Uncertainty quantification: Bayesian PINNs provide confidence intervals crucial for navigation decisions [234]
• Extrapolation capability: Physics constraints enable prediction beyond training domains
• Real-time inference: Trained networks evaluate in milliseconds versus hours for numerical models

Current research challenges requiring investigation include:
• Multi-physics coupling: Incorporating acoustic-elastic interfaces, bubble dynamics, and nonlinear effects
• Adaptive sampling: Optimally placing sensors to maximise PINN accuracy
• Spatial domain decomposition: Duan et al. [230] demonstrated that SPINN with spatial domain decomposition significantly outperforms standard PINN for practical acoustic propagation estimation under ocean dynamics
• Broadband modelling: Huang et al. [235] integrated modal equations of normal modes as a regular term in the loss function, enabling fast broadband modelling with sparse frequency sampling

TABLE XXIV
IMPLEMENTATION CHALLENGES AND SOLUTIONS SUMMARY

Challenge Category | Specific Challenge | Recommended Solution | Expected Outcome
Resource Constraints | Limited processing | TinyML, quantisation, pruning | 10–100× speedup
Resource Constraints | Memory limitations | Model compression, tiling | 4–32× reduction
Resource Constraints | Energy budget | Adaptive duty cycling, hierarchical inference | 85× energy reduction
Environmental | Biofouling | Fouling-aware training, online calibration | 85% accuracy at 3 months
Environmental | Corrosion | Predictive maintenance, redundancy | 87% failure prediction
Environmental | Pressure effects | Depth-aware models | 92% accuracy maintained
Environmental | Temperature variation | Multi-temperature training | 90% accuracy maintained
Deployment | Training data scarcity | Transfer learning, synthetic data | 90% data reduction
Deployment | Model updates | Differential updates, federated learning | 95% bandwidth reduction
Deployment | Distributed learning | Hierarchical federation, gossip protocols | 100× comm. reduction

Ocean Dynamics Prediction: PINNs for ocean circulation must satisfy the Navier-Stokes equations with rotation:

$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} + f\,\mathbf{k} \times \mathbf{u} = -\frac{1}{\rho} \nabla p + \nu \nabla^2 \mathbf{u}$,  (193)

$\nabla \cdot \mathbf{u} = 0$,  (194)

where $\mathbf{u}$ is the velocity field, $f$ is the Coriolis parameter (twice the Earth's rotation rate times the sine of latitude), $\mathbf{k}$ is the vertical unit vector, $\rho$ is density, $p$ is pressure, and $\nu$ is kinematic viscosity. The second equation enforces incompressibility. Research opportunities include subgrid parameterisation for learning unresolved turbulence effects, data assimilation combining PINNs with Kalman filtering, multi-scale modelling bridging coastal and basin scales, and biogeochemical coupling incorporating nutrient dynamics [236].

2) Transformer Architectures: Long-Range Dependencies and Self-Attention: Transformers' ability to capture long-range dependencies through self-attention mechanisms makes them ideally suited for underwater applications where signals propagate over extended spatial and temporal scales [61], [226]. Unlike RNNs that process sequences sequentially, transformers' parallel processing enables efficient training and inference on modern hardware.

Underwater Acoustic Target Recognition: Recent advances have demonstrated transformers' superiority for underwater acoustic target recognition (UATR). Feng et al. [237] were the first to apply the Transformer model to underwater acoustics, introducing the spectrogram transformer model (STM). Xu et al. [238] employed self-supervised learning based on the Swin Transformer architecture, achieving 80.22% classification accuracy on the DeepShip dataset while addressing the dependency on large-scale annotated datasets.

The self-attention mechanism for protocol and signal analysis is defined as:

$\text{Attention}(Q, K, V) = \text{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right) V$,  (195)

where queries $Q$ represent packet positions seeking information, keys $K$ identify information sources, and values $V$ contain actual data. Yang et al.
[239] proposed 1DCTN, an end-to-end model using raw time-domain signals as input, combining one-dimensional CNNs for local feature extraction with Transformers for global dependencies. Chen et al. [240] developed UACTC, combining CNN's rapid local feature modelling with Swin Transformer's global modelling attributes, achieving state-of-the-art performance on the DeepShip and ShipsEar datasets.

Multi-head attention captures different aspects of underwater signals:
• Head 1: Synchronisation patterns and preambles
• Head 2: Address fields and routing information
• Head 3: Error correction codes and channel characteristics
• Head 4: Payload structure, encoding, and semantic features

The Depthwise Separable Convolutional Multihead Transformer (DCMT) proposed by recent work [124] combines depthwise separable convolutions for localised feature extraction with multi-head self-attention for global contextual modelling, employing dual transformer branches with 4-head and 8-head structures for complementary feature processing.
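The scaled dot-product attention of Eq. (195) and the head-splitting described above can be sketched directly in NumPy. The dimensions and weight matrices are toy values of our choosing, not any cited model's configuration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention of Eq. (195)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V

def multi_head(x, Wq, Wk, Wv, Wo, n_heads):
    """Minimal multi-head attention: split the model dimension across
    heads, attend independently, concatenate, then project."""
    d = x.shape[-1] // n_heads
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    heads = [attention(Q[:, i*d:(i+1)*d], K[:, i*d:(i+1)*d], V[:, i*d:(i+1)*d])
             for i in range(n_heads)]
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))            # 6 signal frames, model dimension 8
W = [rng.normal(size=(8, 8)) * 0.1 for _ in range(4)]
out = multi_head(x, *W, n_heads=4)
```

Because each head attends over its own subspace, one head can specialise in, say, preamble structure while another tracks channel characteristics, which is the behaviour the head list above describes.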
Research directions for underwater transformers include:
• Sparse attention: Reducing $O(n^2)$ complexity for long sequences, critical for energy-constrained platforms
• Continuous signal processing: Extending transformers beyond discrete tokens to raw acoustic waveforms
• Multi-modal fusion: Combining acoustic, optical, and electromagnetic signals through cross-attention mechanisms [241]
• Online adaptation: Continual learning without catastrophic forgetting using techniques like elastic weight consolidation [211]

Ocean State Forecasting with Spatial-Temporal Transformers: Vision Transformers (ViT) adapted for oceanographic data process spatial patches with temporal attention [242]. Spatial tokenisation divides ocean regions into patches:

$x_p^{(i,j)} = \text{Flatten}(\text{Patch}_{i,j}(X_t))$,  (196)

Temporal attention links patterns across time:

$z_t = \text{TemporalAttention}(z_{t-T:t})$,  (197)

Promising research areas include handling irregular grids from unstructured ocean model outputs, multi-resolution attention focusing on different spatial and temporal scales, incorporating physical conservation laws as soft constraints, and extreme event prediction by attending to precursor patterns [243].

3) Graph Neural Networks: Exploiting Network Topology: Graph Neural Networks naturally represent underwater sensor networks' irregular connectivity, where communication links depend on acoustic propagation rather than Euclidean distance [156]. GNNs learn from both node features and network topology, discovering optimal strategies that exploit graph structure [175].

Adaptive Network Topology Learning: GNNs simultaneously learn network connectivity and optimise communication. He et al. [244] proposed GBSR (GNN-Based Secure Routing), which includes a trust prediction model for underwater acoustic sensor networks to evaluate node trustworthiness and improve security performance against internal attacks.
Message passing aggregates neighbour information:

$h_i^{(k+1)} = \sigma\!\left(W_{\text{self}}^{(k)} h_i^{(k)} + \sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(k)} W_{\text{msg}}^{(k)} h_j^{(k)}\right)$,  (198)

where $\sigma$ is a non-linear activation function (typically ReLU or ELU) and the attention weights $\alpha_{ij}^{(k)}$ learn link importance based on channel quality and trust values. Edge prediction identifies potential communication links:

$p(e_{ij}) = \sigma(f_{\text{edge}}(h_i, h_j, d_{ij}, \text{trust}_{ij}))$,  (199)

Chen et al. [175] developed GNN-IR, an intelligent routing method for underwater acoustic sensor networks that significantly outperforms traditional routing protocols in terms of packet delivery ratio and energy efficiency. Li et al. [245] used graph attention networks to embed information about ocean currents, time windows, and sensor locations into directed maneuver time-cost graphs, then applied proximal policy optimisation for AUV route planning.

Research opportunities in underwater GNNs include:
• Dynamic graph learning: Adapting to mobile nodes and changing connectivity through temporal graph networks
• Hierarchical graph networks: Multi-level organisation from local clusters to global topology
• Robustness to missing edges: Handling intermittent acoustic links through graph dropout and edge imputation
• Physics-constrained edges: Incorporating acoustic propagation models into graph construction for more realistic topology learning

Distributed Learning on Underwater Graphs: Federated learning on graph-structured networks requires special consideration for the unique challenges of underwater communication [220], [246]. Graph federated averaging with topology awareness:

$\theta_i^{(t+1)} = \theta_i^{(t)} + \eta \sum_{j \in \mathcal{N}(i)} w_{ij}\,(\theta_j^{(t)} - \theta_i^{(t)})$,  (200)

where $\theta_i^{(t)}$ are the parameters at node $i$ at iteration $t$, $\eta$ is the learning rate, $\mathcal{N}(i)$ is the set of neighbours of node $i$, and $w_{ij}$ are weights depending on communication quality and trust values.
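One round of the message passing in Eq. (198) can be written compactly with matrix operations. The 4-node line topology, uniform link weights, and identity weight matrices below are illustrative stand-ins for learned quantities:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def message_pass(H, adj, alpha, W_self, W_msg):
    """One message-passing round (Eq. 198).

    H      : (n, d) node embeddings h_i^{(k)}
    adj    : (n, n) binary adjacency matrix (acoustic links)
    alpha  : (n, n) attention / link-quality weights alpha_ij
    Returns the updated (n, d) embeddings h_i^{(k+1)}.
    """
    # Row i accumulates sum_j alpha_ij * W_msg h_j over its neighbours.
    neighbour_sum = (adj * alpha) @ (H @ W_msg.T)
    return relu(H @ W_self.T + neighbour_sum)

# 4-node acoustic line topology 0-1-2-3 with uniform link weights.
n, d = 4, 3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
alpha = np.full((n, n), 0.5)
rng = np.random.default_rng(0)
H = rng.normal(size=(n, d))
W_self = W_msg = np.eye(d)
H1 = message_pass(H, adj, alpha, W_self, W_msg)
```

After $k$ rounds each node's embedding summarises its $k$-hop neighbourhood, which is what lets downstream routing decisions exploit topology rather than raw positions.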
Research challenges include asynchronous updates handling delays in acoustic communication, Byzantine robustness defending against compromised nodes [247], communication efficiency minimising message passing overhead through gradient compression, and privacy preservation protecting sensitive information during aggregation [248].

4) Meta-Learning: Learning to Learn Underwater: Meta-learning enables rapid adaptation to new underwater environments using minimal data, critical when deploying to unexplored regions where extensive training data is unavailable [249].

Model-Agnostic Meta-Learning (MAML) for Environment Adaptation: MAML learns initialisation parameters enabling few-shot adaptation across diverse ocean environments. The meta-objective across multiple environments is:

$\theta^{*} = \arg\min_{\theta} \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}\!\left(\theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(\theta)\right)$,  (201)

where $\theta^{*}$ are the optimal meta-learnt parameters, $\mathcal{T}_i$ is a task sampled from the task distribution $p(\mathcal{T})$ (each representing a different ocean region with distinct propagation characteristics), $\mathcal{L}_{\mathcal{T}_i}$ is the loss on task $\mathcal{T}_i$, and $\alpha$ is the inner-loop learning rate.

Inner-loop adaptation (deployment):

$\theta_i' = \theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(\theta)$,  (202)

where $\theta_i'$ are the task-specific adapted parameters. Outer-loop meta-learning (training):

$\theta \leftarrow \theta - \beta \nabla_{\theta} \sum_{\mathcal{T}_i} \mathcal{L}_{\mathcal{T}_i}(\theta_i')$,  (203)

where $\beta$ is the outer-loop (meta) learning rate.

Zhao et al. [250] proposed federated meta-learning (FML) for training DNN-based receivers in ocean-of-things scenarios, exploiting model parameters gathered from multiple buoys while maintaining data privacy. Their analysis provides closed-form expressions for the convergence rate considering scheduling ratios, local epochs, and data volumes.
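The two-level structure of Eqs. (201)–(203) is easiest to see on a toy problem where gradients are analytic. In the sketch below each "ocean region" is a 1-D quadratic loss $\mathcal{L}_i(\theta) = (\theta - c_i)^2$ with its own optimum $c_i$; this is purely an illustrative stand-in for a real task loss:

```python
import numpy as np

def maml_step(theta, tasks, alpha=0.1, beta=0.05):
    """One MAML outer-loop update (Eqs. 201-203) on toy 1-D tasks.

    Each task is L_i(theta) = (theta - c_i)^2, so both the inner-loop
    gradient and the meta-gradient are analytic (no autodiff needed).
    """
    meta_grad = 0.0
    for c in tasks:
        grad = 2 * (theta - c)                 # inner-loop gradient of L_i
        theta_i = theta - alpha * grad         # Eq. (202): adapted parameters
        # dL_i(theta_i)/dtheta via the chain rule through theta_i:
        meta_grad += 2 * (theta_i - c) * (1 - 2 * alpha)
    return theta - beta * meta_grad            # Eq. (203): outer update

tasks = [-1.0, 0.5, 3.5]                       # three "ocean regions"
theta = 10.0                                   # poor initialisation
for _ in range(200):
    theta = maml_step(theta, tasks)
```

For symmetric quadratic tasks the meta-learnt initialisation converges to the mean of the task optima (here 1.0), i.e. the point from which one inner-loop gradient step gets closest to every region's optimum on average.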
Research directions include:
• Continual meta-learning: Accumulating knowledge across deployments without forgetting
• Task distribution modelling: Predicting environment characteristics from limited observations
• Few-shot reinforcement learning: Rapid policy adaptation for AUV control in new environments
• Meta-learning with physics priors: Incorporating oceanographic knowledge into the meta-learning framework

Neural Architecture Search for Underwater Constraints: Automated design of networks optimised for specific underwater platforms addresses the challenge of deploying ML on resource-constrained nodes [200], [207]. The search space for underwater networks includes:
• Operations: depthwise conv, grouped conv, skip connection
• Widths: 8, 16, 32, 64 channels
• Depths: 1, 2, 3, 4 blocks
• Quantisation: 32-bit, 16-bit, 8-bit, 4-bit [202], [204]

Multi-objective search identifies Pareto-optimal architectures:

$\text{Pareto front} = \{(\text{accuracy}_i, \text{latency}_i, \text{energy}_i)\}$,  (204)

Research opportunities include hardware-aware search optimising for specific underwater processors, online architecture adaptation modifying networks during deployment based on observed conditions, transferable architectures generalising across platforms, and interpretable architectures understanding why discovered designs work [201].

5) Large Language Models and Generative AI for Underwater Systems: The emergence of large language models (LLMs) and generative AI presents new opportunities for underwater systems, particularly in semantic communication and intelligent data compression [5], [251]. Recent advances enable deployment of edge-optimised LLMs on AUVs, facilitating local semantic extraction. Compact models with approximately 100M parameters have demonstrated a 65% reduction in transmission latency through local semantic feature extraction [195].
On the receiver side, hybrid architectures introduce ControlNet-based diffusion models that can achieve 15× or greater data compression while maintaining structural similarity index (SSIM) values exceeding 0.8 for reconstructed sonar images [5].

Research directions include:
• Multimodal underwater foundation models: Pre-trained on diverse underwater acoustic, optical, and sensor data
• In-context learning for protocol adaptation: Adapting to new communication scenarios without retraining
• LLM-guided semantic encoding: Using natural language prompts to specify compression priorities (e.g., "prioritise oil leak detection features")
• Generative channel modelling: Using diffusion models to generate realistic channel conditions for training data augmentation

B. Integration Opportunities

The convergence of underwater communications with emerging technologies creates unprecedented opportunities for system-level innovations that transcend traditional boundaries [252].

1) 6G-Underwater Network Integration: Sixth-generation wireless networks promise seamless connectivity across terrestrial, aerial, and underwater domains, forming integrated space-air-ground-sea (SAGSIN) networks [253], [254]. Integrating underwater segments requires addressing fundamental disparities in propagation characteristics, data rates, and latencies [255], [256].

Hybrid RF-Acoustic-Optical Gateways: Multi-modal gateways bridge communication domains through intelligent modality selection [257]:
• Surface layer (0–10 m): RF communication with satellites/aircraft for global connectivity
• Transition zone (10–100 m): Optical links for high-bandwidth bursts with tight alignment constraints
• Deep water (>100 m): Acoustic communication for long-range, low-data-rate applications

ML orchestrates modal selection to maximise efficiency:

$\text{mode}^{*} = \arg\max_{m \in \{\text{RF}, \text{optical}, \text{acoustic}\}} \frac{R_m(\text{depth}, \text{conditions})}{E_m(\text{depth}, \text{conditions})}$,  (205)

where $R_m$ is the achievable rate and $E_m$ is the energy cost.
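The rate-per-energy rule of Eq. (205) can be sketched as a simple lookup over candidate modalities. The rate and energy figures below loosely mirror the depth bands listed above but are invented for the sketch; they are not measured link budgets:

```python
def select_modality(depth_m, turbidity):
    """Rate-per-energy modality selection (Eq. 205).

    Rates (bit/s) and energy costs (arbitrary J-per-bit units) are
    illustrative assumptions: RF dies below the surface layer, optical
    degrades with turbidity and range, acoustic works everywhere slowly.
    """
    candidates = {
        "RF":       (1e7 if depth_m < 10 else 1e2, 1.0),
        "optical":  (1e6 / (1 + 10 * turbidity) if depth_m < 100 else 1e1, 2.0),
        "acoustic": (5e3, 5.0),
    }
    # Eq. (205): pick the modality maximising R_m / E_m.
    return max(candidates, key=lambda m: candidates[m][0] / candidates[m][1])

shallow = select_modality(depth_m=5, turbidity=0.1)
mid = select_modality(depth_m=50, turbidity=0.1)
deep = select_modality(depth_m=500, turbidity=0.1)
```

With these toy numbers the gateway chooses RF at the surface, optical in the transition zone, and acoustic in deep water, matching the layering described above; an ML orchestrator would learn $R_m$ and $E_m$ from observed conditions instead of using fixed tables.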
Research challenges include seamless handover switching modalities without data loss, Quality of Service mapping translating 6G QoS requirements to underwater constraints, network slicing virtualising underwater resources for different applications, and edge computing placement optimising processing location between underwater nodes, surface gateways, and the cloud [257].

Semantic Communication for Bandwidth Efficiency: Transmitting meaning rather than bits dramatically reduces bandwidth requirements, which is particularly valuable for bandwidth-constrained underwater channels [195], [258], [259].

Traditional communication: Image (1 MB) → Compression (100 KB) → Transmission
Semantic communication: Image → Feature extraction (1 KB) → Meaning (100 bytes) → Transmission

This paradigm achieves an 80–99% reduction in transmitted data while preserving task-relevant information [5]. Research directions include:
• Underwater semantic codebooks: Learning ocean-specific representations for common phenomena
• Lossy semantic compression: Preserving meaning while discarding perceptually irrelevant details
• Multi-modal semantic fusion: Combining meanings from acoustic, optical, and environmental sensors
• Semantic error correction: Recovering meaning despite bit errors through context-aware decoding

2) Digital Twins for Underwater Systems: Digital twins create virtual replicas of underwater networks, enabling simulation, prediction, and optimisation without costly physical deployments [236], [260], [261].

Real-Time Ocean Digital Twins: Synchronised virtual ocean environments support decision-making and what-if analysis:

$\text{Twin}_t = f_{\text{update}}(\text{Twin}_{t-1}, \text{Observations}_t, \text{Models}_t)$,  (206)

The European Digital Twin Ocean (EU DTO) initiative demonstrates the potential of large-scale ocean digital twins, integrating satellite observations, in-situ sensor data, and high-fidelity models to provide unprecedented ocean state awareness [262]. Chen et al.
[236] proposed a five-layer architecture for marine digital twins: perception layer, data layer, model layer, fusion layer, and application layer. Components requiring research include:
• State estimation: Inferring unobserved variables from sparse underwater measurements
• Model calibration: Adjusting physics models using ML to match observed behaviour
• Uncertainty propagation: Quantifying confidence in twin predictions for risk-aware decision-making
• Decision support: Optimising operations using twin-based scenario analysis
Network Digital Twins for Protocol Optimisation: Virtual network replicas enable safe experimentation with ML-based protocols [261]. Shadow deployment testing:

\mathrm{Performance}_{\text{new}} = \mathrm{Twin.simulate}(\mathrm{Protocol}_{\text{new}}, \mathrm{Conditions}_{\text{real}})   (207)

Yan et al. [260] proposed digital twin-driven swarm control of AUVs, creating digital replicas for each vehicle that integrate dynamics and environmental data. Their integral reinforcement learning (IRL)-based swarm controller drives both virtual and real AUVs, with virtual-real error optimisation minimising matching errors.
Research opportunities include protocol synthesis (automatically generating protocols from requirements specifications), what-if analysis (predicting the impact of network changes before deployment), anomaly detection (comparing real and twin behaviour to identify failures), and predictive maintenance (forecasting equipment failures before occurrence).
3) Satellite-Underwater Communication Links: Direct satellite-to-underwater communication could revolutionise ocean monitoring by eliminating surface infrastructure [256], [263].
Blue-Green Laser Communication: Satellites equipped with blue-green lasers (450–550 nm) can penetrate water to 100–200 m depth in clear conditions [264].
Challenges requiring ML solutions include:
• Beam steering: Compensating for refraction at the air-water interface using adaptive optics
• Turbulence mitigation: ML-based prediction and pre-compensation for atmospheric and underwater turbulence
• Cloud penetration: Multi-satellite diversity and link prediction
• Pointing accuracy: Tracking moving underwater platforms through combined GPS/INS/acoustic localisation
Hybrid Space-Underwater Networks: Constellation optimisation for ocean coverage:

\mathrm{Coverage} = \bigcup_{s \in \mathrm{Satellites}} \mathrm{Footprint}_s(t)   (208)

ML optimises satellite tasking through dynamic scheduling (allocating satellites to high-priority areas), predictive positioning (anticipating communication needs based on AUV trajectories and mission requirements), energy management (balancing communication and Earth observation payloads), and data prioritisation (selecting critical information for uplink under limited contact windows) [265].
4) Cross-Domain Learning and Transfer: Transferring knowledge between terrestrial and underwater domains accelerates development while reducing costs [190], [238].
Domain Adaptation Techniques: Adversarial domain adaptation bridges the gap between data-rich terrestrial environments and data-scarce underwater domains:

\mathcal{L} = \mathcal{L}_{\text{task}}(f_\theta(x_s), y_s) - \lambda\, \mathcal{L}_{\text{domain}}(f_d(g_\phi(x_s)), f_d(g_\phi(x_t)))   (209)

where \mathcal{L}_{\text{task}} is the supervised task loss, f_\theta is the task classifier with parameters \theta, x_s and y_s are source-domain (terrestrial) samples and labels, \lambda is a weighting parameter, \mathcal{L}_{\text{domain}} is the domain confusion loss, f_d is the domain discriminator, g_\phi is the feature extractor with parameters \phi, and x_t are target-domain (underwater) samples.
Xu et al. [238] successfully transferred Swin Transformer models pre-trained on ImageNet to underwater acoustic target recognition, demonstrating that terrestrial visual features can be adapted to spectrogram-based underwater signal analysis.
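The objective in (209) can be made concrete with a toy NumPy computation; the linear feature extractor, task head, and domain discriminator below are illustrative stand-ins for the deep networks (and gradient-reversal or alternating training) used in practice:

```python
import numpy as np

# Toy evaluation of the adversarial domain-adaptation objective in (209).
# All models here are random linear maps for illustration only.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    """Binary cross-entropy, averaged over samples."""
    eps = 1e-9
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

# Source (terrestrial) batch with labels; shifted target (underwater) batch.
x_s, y_s = rng.normal(size=(8, 4)), rng.integers(0, 2, size=8)
x_t = rng.normal(loc=1.0, size=(8, 4))

g = rng.normal(size=(4, 3))       # feature extractor g_phi
w_task = rng.normal(size=3)       # task classifier f_theta
w_dom = rng.normal(size=3)        # domain discriminator f_d

h_s, h_t = x_s @ g, x_t @ g       # shared feature representations
loss_task = bce(sigmoid(h_s @ w_task), y_s)

# Domain confusion loss: discriminator separates source (0) from target (1).
p_dom = sigmoid(np.concatenate([h_s, h_t]) @ w_dom)
d_labels = np.concatenate([np.zeros(8), np.ones(8)])
loss_domain = bce(p_dom, d_labels)

lam = 0.1
total = loss_task - lam * loss_domain   # equation (209)
print(f"task={loss_task:.3f} domain={loss_domain:.3f} total={total:.3f}")
```

The minus sign is the key design point: the feature extractor is rewarded when the discriminator cannot tell terrestrial features from underwater ones, which is what aligns the two domains.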
Transfer learning has been shown to reduce training time by up to 70% while improving classification accuracy by 5–10% compared to training from scratch [266]. Research directions include:
• Progressive domain shift: Gradual adaptation through intermediate environments (e.g., tank → pool → harbour → open ocean)
• Synthetic intermediate domains: Bridging the reality gap through physics-based simulation
• Multi-source adaptation: Combining knowledge from terrestrial, aerial, and underwater domains
• Zero-shot underwater learning: Generalising to underwater scenarios without underwater training data using physics-informed priors

C. Standardisation Needs
The proliferation of ML-based underwater systems necessitates standardisation to ensure interoperability, reliability, and scalability across diverse deployments [224].
1) Protocol Frameworks for ML-Enhanced Communication: Standardised interfaces enabling ML integration at each protocol layer are essential for widespread adoption.
ML-Aware Protocol Stack:
• Application Layer: Semantic encoding APIs and task-specific compression interfaces [195]
• Transport Layer: Learning-based congestion control interfaces with standardised state observation
• Network Layer: Adaptive routing hooks enabling RL-based path selection [44], [164]
• MAC Layer: Intelligent scheduling primitives supporting ML-based channel access
• Physical Layer: Adaptive modulation/coding interfaces with channel state feedback [34]
Each layer should expose:
• State observation interfaces for ML training with standardised feature definitions
• Action execution mechanisms for ML control with bounded latency guarantees
• Performance metric collection for reward computation using agreed-upon definitions
• Model update protocols for online learning with version control
Research needs include:
• Abstraction levels: Balancing flexibility and efficiency in interface design
• Backward compatibility: Integrating with legacy JANUS and other existing protocols
• Security mechanisms: Protecting against adversarial attacks on ML components [13]
• Certification procedures: Validating ML-based protocols for safety-critical maritime applications
2) Benchmark Datasets for Underwater ML: Standardised datasets enabling fair comparison and reproducible research are critically needed [3].
Required Dataset Categories:
• Channel measurements: Impulse responses across diverse environments (shallow coastal, deep ocean, Arctic, tropical)
• Network traces: Traffic patterns and protocol behaviours under realistic conditions
• Sensor data: Multimodal observations (acoustic, optical, electromagnetic) with ground truth
• Environmental conditions: Oceanographic context including temperature profiles, salinity, and currents
Existing datasets like DeepShip [267] and ShipsEar [268] have enabled significant progress in underwater acoustic target recognition. However, dataset requirements must be expanded to include:
• Diversity: Multiple locations, seasons, depths, and environmental conditions
• Scale: Sufficient size for deep learning (targeting millions of labelled samples)
• Annotation quality: Expert-verified labels with confidence scores
• Metadata completeness: Full experimental context for reproducibility
3) Performance Metrics for ML-Based Systems: Standardised metrics enabling meaningful comparisons across research groups and deployments.
Multi-Dimensional Metric Framework:
• Accuracy metrics: Task-specific performance (classification accuracy, localisation error, throughput)
• Efficiency metrics: Energy per inference, computation per decision, memory footprint
• Robustness metrics: Performance degradation under noise, interference, and environmental variation
• Adaptability metrics: Learning speed, transfer efficiency, few-shot performance
Composite scores balance multiple objectives:

\mathrm{Score} = \prod_i \mathrm{Metric}_i^{\,w_i}   (210)

where \mathrm{Metric}_i is the i-th normalised performance metric and w_i are weights reflecting application
priorities (e.g., energy-critical vs. accuracy-critical deployments), with \sum_i w_i = 1.

D. Interdisciplinary Frontiers
The most transformative advances emerge at the intersection of ML, oceanography, marine biology, and climate science [3], [56].
1) Marine Biology Integration: Understanding Ocean Life: ML transforms our understanding of marine ecosystems through automated observation and pattern discovery.
Bioacoustic Monitoring Networks: Passive acoustic monitoring using ML identifies and tracks marine life non-invasively [14]. Species classification from vocalisations:

p(\text{species} \mid \text{spectrogram}) = f_{\text{CNN}}(\text{spectrogram})   (211)

where p(\text{species} \mid \text{spectrogram}) is the probability distribution over species given the spectrogram input, and f_{\text{CNN}} is the convolutional neural network classifier. Population estimation from detection rates:

N = \frac{n_{\text{detected}}}{p_{\text{detection}} \cdot p_{\text{vocalisation}} \cdot \text{coverage}}   (212)

where N is the estimated population, n_{\text{detected}} is the number of detections, p_{\text{detection}} is the probability of detecting a vocalisation when it occurs, p_{\text{vocalisation}} is the vocalisation rate (vocalisations per individual per unit time), and coverage is the spatial coverage fraction.
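The composite score (210) and the abundance estimator (212) are simple enough to sanity-check numerically; the parameter values below are illustrative, not field-calibrated:

```python
import math

def composite_score(metrics, weights):
    """Weighted geometric mean of normalised metrics, as in (210)."""
    assert abs(sum(weights) - 1.0) < 1e-9   # weights must sum to one
    return math.prod(m ** w for m, w in zip(metrics, weights))

def estimate_population(n_detected, p_detection, p_vocalisation, coverage):
    """N = n_detected / (p_detection * p_vocalisation * coverage), as in (212)."""
    return n_detected / (p_detection * p_vocalisation * coverage)

# Energy-critical deployment: weight the efficiency metric over raw accuracy.
score = composite_score([0.92, 0.60], [0.3, 0.7])   # [accuracy, efficiency]

# 45 calls detected, 75% detection probability, 2 calls per individual over
# the listening window, and hydrophones covering 30% of the habitat.
n_hat = estimate_population(45, p_detection=0.75, p_vocalisation=2.0, coverage=0.3)
print(f"score={score:.3f}, estimated population={n_hat:.0f}")  # population = 100
```

The geometric mean in (210) penalises any single near-zero metric, which is why it is preferred over a weighted arithmetic mean for balancing, say, energy and accuracy.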
Research opportunities include:
• Behavioural inference: Understanding activities from acoustic signatures
• Health assessment: Detecting stress indicators in marine mammal vocalisations
• Ecosystem modelling: Predicting trophic interactions from acoustic community structure
• Conservation planning: Optimising marine protected area boundaries using ML-derived biodiversity maps
Environmental DNA (eDNA) Analysis: ML accelerates species identification from water samples, enabling rapid biodiversity assessment [17]. Sequence classification:

\text{species} = \arg\max_s p(s \mid \text{sequence}) = \arg\max_s \frac{p(\text{sequence} \mid s)\, p(s)}{p(\text{sequence})}   (213)

Research directions include real-time on-platform DNA analysis using miniaturised sequencers, abundance estimation (quantifying populations from eDNA concentrations), community reconstruction (inferring ecosystem structure from metagenomic data), and invasion detection (providing early warning for non-native species).
2) Oceanography Integration: Advancing Ocean Science: ML accelerates oceanographic discovery through pattern recognition in massive datasets [3].
Internal Wave Detection and Prediction: Internal waves significantly affect acoustic propagation, mixing, and underwater vehicle operations. Detection from temperature profiles:

p(\text{internal wave} \mid T(z,t)) = f_{\text{LSTM}}(T(z,t))   (214)

Prediction of wave evolution:

T(z, t + \Delta t) = f_{\text{Physics-LSTM}}(T(z,t), \text{stratification}, \text{currents})   (215)

Research needs include identifying generation mechanisms, tracking wave packet propagation, forecasting breaking and mixing events, and quantifying the impact on communication systems.
3) Climate Science Integration: Underwater ML systems contribute critical observations for climate models and environmental monitoring [1].
Carbon Flux Monitoring: Quantifying ocean carbon uptake is essential for climate prediction:

F_{\mathrm{CO}_2} = k_w(Sc, U_{10}) \cdot \Delta p\mathrm{CO}_2   (216)

where F_{\mathrm{CO}_2} is the air-sea CO_2 flux, k_w is the gas transfer velocity dependent on the Schmidt number Sc and the wind speed U_{10} at 10 metres height, and \Delta p\mathrm{CO}_2 is the air-sea partial pressure difference. ML improves estimates through:
• Transfer velocity: Learning k_w from observations incorporating wave and bubble effects
• Spatial interpolation: Filling measurement gaps using physics-informed neural networks
• Biological pump: Quantifying carbon export through particle flux estimation
• Long-term trends: Detecting climate signals in noisy time series using advanced sequence models
Sea Level Rise Prediction: ML enhances regional projections beyond global mean estimates:

\mathrm{SLR}_{\text{regional}} = \mathrm{SLR}_{\text{global}} + \Delta_{\text{regional}}   (217)

where ML learns regional variations from ocean dynamics, ice sheet contributions, glacial isostatic adjustment, and groundwater depletion patterns.
4) Ethical and Societal Considerations: Advanced underwater ML raises important ethical questions that must be addressed as capabilities expand [190].
Environmental Impact:
• Acoustic pollution: Minimising impact on marine life through adaptive transmission scheduling that avoids sensitive periods
• Electronic waste: Developing recovery plans for deployed sensors and biodegradable alternatives
• Energy consumption: Balancing capability and sustainability through efficient ML architectures
• Ecosystem disruption: Avoiding behavioural changes in marine species through careful system design
Data Governance:
• Sovereignty: Respecting national waters and exclusive economic zones in data collection
• Privacy: Protecting submarine operations and sensitive maritime activities
• Sharing: Balancing scientific openness with security requirements
• Indigenous rights: Consulting traditional ocean users and incorporating traditional ecological knowledge
Dual-Use Concerns:
• Military applications: Establishing frameworks to prevent weaponisation of civilian research
• Resource exploitation: Avoiding over-extraction enabled by improved monitoring
• Surveillance: Protecting privacy while enabling legitimate monitoring
• Access equity: Ensuring developing-nation participation in ocean ML benefits

Fig. 12. Overview of emerging ML technologies and their integration pathways for intelligent underwater networks. Core ML technologies (top) enable system-level integration (middle) that supports next-generation underwater applications (bottom).

E. Summary and Research Roadmap
Table XXV presents a consolidated research roadmap organised by technology readiness level and expected timeline for the emerging ML technologies discussed in this section. This roadmap aims to guide researchers in identifying high-impact areas requiring immediate attention versus those requiring longer-term foundational work.
The future of ML-enabled underwater communications lies at the intersection of these emerging technologies. As illustrated in Figure 12, the convergence of physics-informed learning, advanced neural architectures, and system-level integration creates a synergistic framework for addressing the fundamental challenges of underwater environments. Success will require unprecedented collaboration across disciplines, from ML researchers developing new algorithms to oceanographers providing domain expertise, and from communication engineers designing practical systems to marine biologists ensuring environmental responsibility.

VIII. OPEN CHALLENGES AND RESEARCH GAPS
Despite the remarkable progress in applying ML to underwater communications documented throughout this survey, significant challenges remain that prevent widespread deployment and limit the full potential of intelligent IoUT systems [3], [269]. These challenges span technical limitations inherent to ML algorithms, practical constraints of underwater operations, and broader systemic issues requiring interdisciplinary solutions. Understanding these gaps is crucial for directing future research efforts and setting realistic expectations for ML-enabled underwater networks. This section systematically examines these challenges, identifying specific research opportunities that could enable transformative advances in the field.

A. Technical Challenges
The unique characteristics of underwater environments expose fundamental limitations in current ML approaches, creating technical challenges that demand novel solutions beyond incremental improvements to existing algorithms [33], [270].
TABLE XXV
RESEARCH ROADMAP FOR ML-ENABLED UNDERWATER COMMUNICATIONS

| Technology | Near-term (1–2 years) | Mid-term (3–5 years) | TRL | Key Enablers |
|---|---|---|---|---|
| Physics-Informed NN | Broadband acoustic modelling, single-domain PINNs | Multi-physics coupling, real-time deployment | 4–5 | GPU acceleration, automatic differentiation |
| Transformers | UATR classification, signal denoising | End-to-end communication, multi-modal fusion | 5–6 | Sparse attention, model compression |
| Graph Neural Nets | Secure routing, trust modelling | Dynamic topology learning, distributed inference | 4–5 | Efficient message passing, edge deployment |
| Federated Learning | Privacy-preserving training, model aggregation | Asynchronous underwater FL, Byzantine robustness | 4–5 | Communication-efficient protocols |
| Meta-Learning | Few-shot environment adaptation | Continual meta-learning, zero-shot deployment | 3–4 | Diverse training environments |
| Semantic Comm. | Task-specific compression, meaning encoding | LLM-guided semantics, generative decoding | 3–4 | Edge-optimised LLMs, diffusion models |
| Digital Twins | Network simulation, protocol testing | Real-time ocean twins, predictive maintenance | 4–5 | HPC infrastructure, sensor fusion |
| 6G Integration | Gateway architectures, modality switching | Seamless SAGSIN connectivity | 3–4 | Standards development, hybrid modems |

1) Data Scarcity and Quality: The Million-Dollar Training Set Problem. The scarcity of labelled underwater data represents perhaps the most fundamental challenge constraining ML deployment in IoUT systems [3], [271]. Unlike terrestrial applications where millions of labelled images are freely available through crowdsourcing initiatives such as ImageNet, underwater datasets require expensive ship time ($20,000–50,000 per day), specialised equipment including ROVs and AUVs, and expert annotation, making even modest datasets cost millions of dollars to acquire [46]. Consider the economic contrast: ImageNet contains 14 million labelled images collected through crowdsourcing at minimal cost.
A comparable underwater dataset would require approximately 280 days of continuous ship operations ($8.4 million), ROV deployment and operation ($14 million), and expert annotation at 2 minutes per image ($4.7 million), totalling $27.1 million, assuming perfect weather and no equipment failures. This economic reality limits most underwater ML projects to datasets of 1,000–10,000 labelled samples, insufficient for training deep networks that typically require millions of examples to generalise effectively [52], [53].
Domain Shift and Environmental Variability. Limited data collection inevitably creates dataset bias that manifests as catastrophic domain shift [3]. Models trained on summer data from calm, clear waters fail when deployed in winter storms or turbid coastal regions. The underwater environment's extreme variability means that datasets collected at one location rarely generalise to others:
• Geographic variation: Arctic waters differ fundamentally from tropical seas in acoustic propagation characteristics, temperature profiles, and ambient noise patterns [21].
• Depth stratification: Coastal environments with depths of 10–100 metres exhibit dramatically different channel characteristics than deep ocean basins exceeding 4,000 metres [22].
• Temporal dynamics: Seasonal variations in temperature, salinity, and biological activity create essentially different communication channels throughout the year [56].
• Anthropogenic factors: Human activity patterns including shipping, fishing, and offshore operations vary significantly by region and time, introducing non-stationary noise characteristics [14].
Annotation Quality and Consistency. Even when data is collected, annotation presents significant challenges that compound the data scarcity problem:
• Species identification requires marine biology expertise costing $200–500 per hour, with inter-annotator agreement rarely exceeding 85% for complex classification tasks [121], [141].
• Acoustic signature classification demands experienced sonar operators who can distinguish between biological, environmental, and mechanical sources [62], [67].
• Damage assessment for infrastructure inspection requires engineering knowledge to identify corrosion, cracks, and structural degradation [107].
Research Gaps and Emerging Solutions. Several promising research directions address the data scarcity challenge:
• Self-supervised learning: Recent advances in contrastive learning and masked prediction enable models to learn from unlabelled data through pretext tasks such as predicting masked portions of acoustic signals, reconstructing corrupted spectrograms, or forecasting future frames in sonar sequences [272].
• Physics-informed synthetic data: Creating physically accurate simulations that bridge the reality gap requires incorporating complex phenomena including turbulence, marine snow, bioluminescence, and realistic channel models derived from ray-tracing or parabolic equation methods [47], [227].
• Few-shot and meta-learning: Designing architectures that achieve high accuracy from 10–100 examples rather than thousands, leveraging techniques such as prototypical networks, model-agnostic meta-learning (MAML), and metric learning [273].
• Active learning strategies: Intelligently selecting which data to collect and label to maximise information gain per dollar spent, using uncertainty sampling, query-by-committee, or expected model change criteria [72].
• Transfer and domain adaptation: Transferring knowledge between different underwater environments without catastrophic forgetting, including techniques for unsupervised domain adaptation and continual learning [111], [274].
2) Model Interpretability and Explainability: The Black Box Problem in Critical Applications.
The opacity of deep learning models creates critical challenges for underwater deployments where failures can result in mission loss, environmental damage, or compromised security [8]. Unlike terrestrial systems where unexpected behaviours might be inconvenient, underwater ML failures can be catastrophic and irreversible: a malfunctioning AUV might be lost at depth, an incorrect threat classification could trigger international incidents, and failed environmental predictions could permit ecological disasters. Naval operators require an understanding of why an ML system classified a contact as hostile before engagement decisions. Environmental regulators need explanations for why a model predicted minimal impact before approving offshore operations. Pipeline operators must understand why an anomaly detection system flagged a particular segment. These stakeholders cannot accept "the neural network said so" as justification for critical decisions [275].
Limitations of Current Interpretability Methods. Current interpretability techniques developed for terrestrial applications often fail when applied to underwater data:
• Gradient-based attribution: Methods such as Grad-CAM and integrated gradients produce noisy, unreliable explanations for acoustic signals due to the high-frequency oscillations and phase sensitivity of underwater waveforms [56].
• Attention visualisation: While effective for images and text, attention mechanisms are difficult to interpret for 3D spatiotemporal data typical of sonar imagery and acoustic arrays [226].
• Concept activation vectors: These require labelled concepts (e.g., "multipath reflection," "biological noise") that are rarely available in sufficient quantity for underwater domains [272].
• Counterfactual explanations: Generating realistic underwater counterfactuals is challenging because small perturbations in acoustic space may not correspond to physically plausible scenarios [273].
Debugging and Failure Analysis.
When an underwater ML system fails, understanding why becomes critical for prevention and system improvement:
• Was it sensor degradation from biofouling progressively altering input distributions?
• Did the model encounter out-of-distribution data from unusual environmental conditions?
• Was there adversarial interference from natural or intentional sources?
• Did environmental conditions exceed the bounds represented in training data?
Without interpretability, diagnosing failures requires expensive platform recovery and forensic analysis, if the platform can be recovered at all from deep-water deployments [12].
Research Priorities for Interpretable Underwater ML.
• Physics-grounded explanations: Developing explanation methods that map neural network features to oceanographic principles such as sound speed profiles, multipath propagation, and ambient noise sources [47], [227].
• Hierarchical interpretability: Providing explanations at multiple levels of abstraction, from raw signal characteristics to intermediate acoustic features to high-level tactical decisions [272].
• Uncertainty-aware explanations: Communicating not just predictions but calibrated confidence bounds, enabling operators to know when to trust model outputs and when to seek additional verification [143].
• Interactive debugging tools: Enabling operators to query model reasoning in real-time during missions, supporting what-if analysis and confidence assessment [276].
• Causal inference methods: Distinguishing correlation from causation in environmental predictions to avoid spurious relationships that fail under distribution shift [50].
3) Real-Time Processing Constraints: The Computational Gap. The combination of limited computational resources and strict timing requirements creates severe challenges for ML deployment on underwater platforms [197], [277].
Underwater nodes operate with processors 100–1000× less powerful than modern GPUs while facing harder real-time constraints than many terrestrial applications. A typical underwater sensor node provides limited computational resources:
• ARM Cortex-M4 processor at 180 MHz delivering approximately 216 MFLOPS
• 256 KB RAM and 2 MB Flash storage
• Power budget of 10–100 mW for computation
In contrast, modern neural networks require substantially greater resources:
• ResNet-50 inference: 4 GFLOPS (20× available compute)
• Memory footprint: 98 MB (approximately 50× available memory)
• Power consumption: 5–10 W (100× available power)
Latency Requirements. Underwater applications demand strict timing that conflicts with typical ML inference times [8], [33]:
• Collision avoidance: AUVs require 10–100 ms response time to avoid obstacles detected by forward-looking sonar.
• Acoustic equalisation: Adaptive channel estimation must complete within sub-millisecond intervals per symbol to track rapid fading.
• Predator evasion: Biological monitoring systems must detect and respond to predator signatures immediately.
• Communication protocols: MAC layer decisions require microsecond-precision timing for effective carrier sensing and collision avoidance.
Current ML inference times on embedded processors significantly exceed these requirements: CNN forward passes require 50–500 ms, transformer inference takes 1–10 seconds per sequence, and RL action selection including planning needs 10–100 ms [276].
Research Directions for Real-Time Underwater ML.
• Neural architecture co-design: Jointly optimising network architecture and hardware implementation, including custom accelerators designed for underwater acoustic signal processing [197].
• Anytime algorithms: Developing methods that produce increasingly accurate results as computation time permits, allowing systems to return best-effort predictions when deadlines approach [277].
• Hierarchical processing: Implementing fast approximate decisions at the edge, refined by more sophisticated models when time and communication bandwidth allow [278].
• Predictive caching: Pre-computing likely inference paths based on environmental context, reducing runtime computation for expected scenarios.
• Neuromorphic computing: Exploiting spike-based neural networks and event-driven processing that naturally map to acoustic signal characteristics [33].
• Model compression: Advancing quantisation, pruning, and knowledge distillation techniques specifically optimised for underwater signal processing tasks [279].
4) Adversarial Robustness and Security: Natural Adversarial Conditions. The ocean itself creates naturally adversarial inputs that challenge ML systems in ways not encountered in terrestrial deployments [13], [280]:
• Marine mammal mimicry: Dolphins and whales produce clicks and vocalisations that can be misclassified as mechanical sources or even deliberately learned sonar returns.
• Bubble curtains: Ship wakes and biological activity create acoustic shadows and false targets that confound detection algorithms.
• Thermoclines: Sharp temperature gradients bend acoustic paths in unexpected ways, causing systematic localisation errors.
• Bioluminescence: Biological light production triggers false optical detections in systems using underwater optical wireless communication.
These natural phenomena cause significant performance degradation: 40% increases in false positive rates, complete tracking loss in 15% of challenging scenarios, and misclassification of 25% of biological sounds as mechanical sources [62], [281].
Intentional Adversarial Attacks. Strategic adversaries can exploit ML vulnerabilities through sophisticated attack vectors [13], [280], [282]:
• Acoustic spoofing: Generating synthetic whale calls to mask submarine signatures or creating false targets to overwhelm detection capacity.
• Replay attacks: Retransmitting recorded environmental sounds or communication signals to confuse temporal reasoning.
• Model extraction: Probing deployed systems through carefully crafted queries to reverse-engineer capabilities and vulnerabilities.
• Data poisoning: Contaminating training data through compromised sensors or manipulated environmental databases.
• Physical-layer attacks: Exploiting the broadcast nature of acoustic communication to intercept, jam, or manipulate transmissions [283].
Limitations of Current Defences. Defence mechanisms developed for terrestrial ML often fail in underwater contexts:
• Adversarial training requires representative attack examples that are difficult to generate for underwater acoustic signals.
• Certified defences assume bounded perturbations that are invalid for the complex propagation characteristics of acoustic channels.
• Detection methods relying on statistical properties are confounded by the inherent non-stationarity of underwater environments.
Research Gaps in Underwater Adversarial ML.
• Physics-constrained adversarial examples: Ensuring that adversarial perturbations remain physically realisable given acoustic propagation constraints [280].
• Multi-modal verification: Cross-checking predictions across acoustic, optical, and magnetic sensors to detect inconsistencies indicative of attacks.
• Robust feature learning: Discovering signal representations that remain invariant to both natural environmental variation and adversarial perturbations [272].
• Game-theoretic defences: Modelling adversarial interactions as strategic games to develop optimal defence strategies under uncertainty [13].
• Forensic attribution: Distinguishing natural system failures from intentional attacks to enable appropriate response and recovery procedures.
• Secure federated learning: Protecting distributed ML systems from poisoning attacks while maintaining the benefits of collaborative training [284], [285].
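As a minimal sketch of the first gap above, physics-constrained adversarial evaluation can begin by projecting candidate perturbations onto a transmit-energy budget; the L2 bound and signal below are chosen purely for illustration, not drawn from any published constraint:

```python
import numpy as np

# Project an adversarial perturbation onto an acoustic energy budget so that
# it remains physically realisable (here, a simple L2 transmit-power limit).

def project_to_energy_budget(delta: np.ndarray, max_energy: float) -> np.ndarray:
    """Scale delta so that sum(delta**2) <= max_energy; leave it unchanged if already within budget."""
    energy = float(np.sum(delta ** 2))
    if energy <= max_energy:
        return delta
    return delta * np.sqrt(max_energy / energy)

rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 0.1 * np.arange(256))        # clean acoustic frame
delta = project_to_energy_budget(rng.normal(size=256), max_energy=1.0)
adversarial = signal + delta                              # budget-respecting attack
print(f"perturbation energy = {np.sum(delta**2):.3f}")    # at most 1.0
```

A full physics-constrained attack would add propagation constraints (bandwidth, multipath consistency) on top of this projection, but the energy bound already rules out the unbounded perturbations that certified defences wrongly assume.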
5) Physics-Informed ML: Bridging Data-Driven and Model-Based Approaches. Physics-informed neural networks (PINNs) have emerged as a promising paradigm for addressing data scarcity by incorporating oceanographic knowledge directly into learning algorithms [47], [227], [286]. Rather than treating the underwater environment as a black box, PINNs encode physical laws, such as wave equations, ray acoustics, and conservation principles, as soft constraints during training, enabling accurate predictions from limited measurements.
Current Applications and Achievements. Recent work demonstrates the potential of physics-informed approaches in underwater acoustics [232], [235]:
• Sound field prediction: PINNs incorporating the Helmholtz equation achieve accurate acoustic field predictions with 100× fewer training samples than purely data-driven approaches [287].
• Channel estimation: Physics-guided neural networks model underwater channel impulse responses by encoding multipath propagation physics [288].
• Source localisation: Matched-field processing enhanced with PINN-based replica field generation improves localisation accuracy while reducing sensitivity to environmental mismatch [289].
Remaining Challenges. Despite promising results, significant challenges limit broader PINN adoption:
• Computational complexity: PINNs require solving partial differential equations during training, increasing computational costs 10–100× compared to standard neural networks.
• Spectral bias: Neural networks struggle to learn high-frequency components of acoustic fields, requiring specialised architectures such as Fourier feature networks [290].
• Boundary conditions: Complex geometries and time-varying boundaries (surface waves, moving vehicles) are difficult to incorporate as constraints.
• Multi-scale physics: Underwater environments exhibit phenomena across scales from centimetre-scale turbulence to basin-scale circulation, challenging single-model approaches.
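The Helmholtz-based physics loss mentioned above can be illustrated without a network: the residual r = ∇²p + k²p, computed here by finite differences on an exact 1-D plane wave rather than on a network output, is the quantity a PINN drives towards zero at its collocation points:

```python
import numpy as np

# Sketch of the physics penalty in a Helmholtz-constrained PINN.
# For the exact plane wave p = sin(kx), the residual p'' + k^2 p vanishes,
# so the discretised loss below is near zero (limited only by the
# finite-difference truncation error). A real PINN evaluates the same
# residual on the network's output via automatic differentiation.

k = 2 * np.pi / 1.5                 # wavenumber for a 1.5 m acoustic wavelength
x = np.linspace(0.0, 10.0, 2001)
dx = x[1] - x[0]
p = np.sin(k * x)                   # exact solution of p'' + k^2 p = 0

d2p = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx ** 2   # central second difference
residual = d2p + k ** 2 * p[1:-1]                # Helmholtz residual r
physics_loss = float(np.mean(residual ** 2))     # the PDE penalty term
print(f"mean squared residual = {physics_loss:.2e}")
```

During training this term is added to the data-misfit loss, which is how the PDE acts as a soft constraint and lets the model extrapolate from the sparse measurements described above.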
Research Opportunities.
• Hybrid architectures: Combining fast neural network inference with physics-based corrections for real-time applications [291].
• Transfer learning for PINNs: Pre-training physics-informed models on simulated environments and fine-tuning with limited field data.
• Uncertainty quantification: Developing Bayesian PINN variants that provide calibrated uncertainty estimates for safety-critical decisions.
• Multi-fidelity modelling: Integrating data from multiple sources with varying accuracy and resolution.

6) Emerging Paradigms: Federated and Distributed Learning: Federated Learning for Privacy-Preserving Collaboration. The distributed nature of underwater networks and the sensitivity of collected data make federated learning (FL) particularly attractive for IoUT applications [41], [114], [292]. FL enables collaborative model training without centralising raw data, addressing privacy concerns while leveraging diverse observations from multiple platforms and operators.

Unique Challenges for Underwater FL. Implementing FL in underwater environments faces distinctive obstacles [278], [279]:
• Communication constraints: Acoustic links providing 10–100 kbps cannot support the frequent gradient exchanges required by standard FL protocols. Transmitting a 10 MB model update requires 13–130 hours, consuming entire battery reserves.
• Extreme heterogeneity: Underwater nodes vary dramatically in computational capability, from simple acoustic modems to sophisticated AUV platforms, complicating unified model architectures.
• Non-IID data: Data collected at different depths, locations, and times exhibits extreme non-independent and identically distributed (non-IID) characteristics that degrade FL convergence.
• Asynchronous participation: Nodes may be unreachable for extended periods due to deployment patterns, communication blackouts, or mission priorities.

Emerging Solutions and Research Directions.
• Hierarchical FL: Multi-tier architectures where edge nodes (AUVs, surface buoys) aggregate updates before transmission to cloud servers, reducing communication overhead [278].
• Model compression for FL: Gradient quantisation, sparsification, and sketching techniques adapted for extreme bandwidth constraints [279].
• Personalised FL: Learning device-specific model adaptations that account for local environmental conditions while benefiting from global knowledge.
• Asynchronous and semi-synchronous protocols: FL algorithms robust to delayed and missing updates from intermittently connected underwater nodes [292].
• Security in underwater FL: Byzantine-robust aggregation and differential privacy mechanisms adapted for resource-constrained underwater platforms [284].

7) Digital Twin Integration: Virtual-Physical Synchronisation for Underwater Systems. Digital twins—virtual replicas synchronised with physical underwater systems—offer transformative potential for ML deployment by enabling simulation-based training, predictive maintenance, and real-time decision support [236], [260], [293].

Current Developments. Recent advances demonstrate growing capabilities [294]:
• AUV digital twins: Virtual replicas incorporating hydrodynamic models and environmental data enable RL-based controller training in simulation with improved sim-to-real transfer [260], [295].
• Marine environment twins: Large-scale initiatives such as the European Digital Twin Ocean (EU DTO) aim to create comprehensive virtual representations of ocean dynamics for scientific and operational applications [262].
• Infrastructure monitoring: Digital twins of subsea pipelines, cables, and offshore structures support ML-based anomaly detection and maintenance scheduling [261].

Research Challenges.
• Model fidelity: Achieving sufficient accuracy in digital twin models to support reliable ML training while maintaining computational tractability.
• Real-time synchronisation: Keeping virtual models updated with physical system states despite communication delays and intermittent connectivity.
• Uncertainty propagation: Representing and propagating uncertainty through coupled physical-ML models for robust decision-making.
• Cognitive digital twins: Incorporating ML-based reasoning and prediction capabilities directly into digital twin architectures [296].

B. Practical Challenges

Beyond technical limitations, practical challenges related to deployment, maintenance, economics, and environmental impact constrain ML adoption in underwater systems [12], [297].

1) Deployment and Maintenance in Isolation: The Long-Duration Autonomy Problem. Deployed underwater systems operate in isolation for months or years, unable to receive updates or maintenance without expensive recovery operations [12]. This creates unique challenges for ML systems that typically require frequent updates as they encounter new data and conditions.

Model Drift and Performance Degradation. ML models experience progressive degradation as deployment conditions diverge from training distributions:
• Sensor drift: Calibration changes alter input distributions by 2–5% monthly due to component aging and environmental exposure.
• Biofouling: Progressive biological growth on sensors modifies acoustic and optical responses, shifting feature distributions.
• Seasonal changes: Temperature stratification, biological activity, and weather patterns invalidate learned seasonal patterns.
• Equipment aging: Battery degradation, connector corrosion, and mechanical wear affect signal quality unpredictably.

Without updates, model accuracy degrades significantly: from 95% at initial deployment to 78% after 6 months (sensor drift and biofouling), 61% after 12 months (seasonal changes), and potentially 43% after 24 months—below random guessing for multi-class problems [297].

Update Mechanism Limitations. Acoustic communication's limited bandwidth makes over-the-air updates impractical:
• A small CNN model of 10 MB requires 13–130 hours of transmission time at 10–100 kbps.
• Power consumption of 50 W × 130 hours equals 6.5 kWh—potentially the entire battery capacity.
• The cumulative error probability approaches certainty: with a 10^-3 per-bit error rate over the roughly 10^8 bits of a 10 MB update, 1 − (1 − 10^-3)^(10^8) ≈ 1.

Physical recovery for updates incurs significant costs: $5,000–10,000 per node in shallow water and $50,000–100,000 per node in deep water, with a 5–10% risk of total platform loss per recovery operation.

Research Needs for Maintainable Underwater ML.
• Self-healing models: Architectures that automatically detect performance degradation and apply corrective adaptations without external intervention [274].
• Incremental and continual learning: Updating models with minimal data transfer by transmitting only essential parameter updates, or learning from local data while preserving prior knowledge.
• Federated maintenance: Coordinating updates across distributed networks to share learned adaptations while respecting communication constraints [292].
• Graceful degradation: Designing systems that maintain core functionality as components fail, automatically reducing capability rather than failing catastrophically.
• Predictive maintenance: Using ML to anticipate failures before they occur, scheduling recovery operations proactively rather than reactively [236].

2) Economic Viability and Scalability: Total Cost of Ownership. The high costs of underwater operations create economic barriers to ML adoption, requiring careful cost-benefit analysis and innovative approaches to reduce expenses [3]. Deploying ML-enabled underwater systems involves substantial investment across multiple phases:
• Development costs: Data collection ($1–5 million), model development ($0.5–2 million), and testing and validation ($0.5–1 million).
• Deployment costs: Hardware per node ($5,000–50,000), deployment operations ($20,000–100,000 per day), and integration and commissioning ($0.5–2 million).
• Operational costs: Annual monitoring and maintenance ($100,000–500,000), data processing and storage ($50,000–200,000), and updates and improvements ($200,000–1 million).

The total 5-year cost for a 100-node network ranges from $10–50 million depending on depth, complexity, and operational requirements.

Return on Investment Challenges. Quantifying ML benefits proves difficult for several reasons:
• Prevented failures: How should one value disasters that did not occur due to ML-enabled early warning?
• Efficiency improvements: Energy savings and extended network lifetime often yield indirect, long-term benefits difficult to attribute directly.
• Scientific discoveries: Academic and societal value may not translate to immediate economic returns.
• Environmental protection: Ecosystem services enabled by better monitoring are challenging to monetise within traditional financial frameworks.

Economic Research Priorities.
• Multi-stakeholder cost-sharing: Developing frameworks for government, industry, and research institutions to jointly fund underwater ML infrastructure.
• Value quantification methodologies: Creating metrics that capture intangible benefits, including risk reduction, environmental protection, and scientific advancement.
• Risk-reward frameworks: Balancing upfront investment against uncertain long-term returns with appropriate discount rates and risk premiums.
• Technology transfer mechanisms: Commercialising academic ML developments to accelerate practical deployment and reduce duplication of effort.
• Standardisation for economies of scale: Reducing per-unit costs through common interfaces, protocols, and component specifications [256].

3) Regulatory Compliance and Governance: Navigating Complex Legal Frameworks.
ML-enabled underwater systems must comply with complex, often conflicting regulations spanning multiple jurisdictions and domains [3], [255]:
• Maritime law: UNCLOS provisions governing underwater activities, IMO regulations for vessel operations, and coastal state jurisdiction extending 200 nautical miles.
• Environmental protection: Marine protected area restrictions, MARPOL conventions limiting emissions and discharges, and endangered species protections affecting acoustic operations.
• Spectrum management: ITU allocations for underwater acoustic frequencies and national regulations governing acoustic source levels.
• Data privacy: GDPR requirements for EU waters, national privacy laws affecting collected data, and restrictions on biometric and location data.
• Autonomous systems: Emerging regulations governing AI/ML decision-making, liability frameworks for autonomous vehicle accidents, and certification requirements.
• Dual-use restrictions: ITAR controls on military-relevant technologies and export restrictions limiting international collaboration.

ML-Specific Compliance Challenges.
• Algorithm transparency: Regulators increasingly require explanations for automated decisions that current ML models cannot adequately provide.
• Accountability: Liability allocation among algorithm developers, system operators, and deployment organisations remains legally unsettled.
• Certification: No established standards exist for certifying ML safety and reliability in underwater applications.
• Cross-border operations: Models trained in one jurisdiction may process data or make decisions that violate another's laws.
• Data sovereignty: Restrictions on international data transfer complicate federated learning and cloud-based processing.

Governance Research Needs.
• Standards development: Creating underwater ML certification frameworks analogous to aviation and automotive safety standards.
• Compliance by design: Building regulatory requirements into ML architectures from inception rather than retrofitting compliance.
• Automated compliance checking: Developing tools that verify adherence to applicable regulations across jurisdictions.
• International harmonisation: Working toward aligned regulations that enable cross-border underwater ML deployments.
• Adaptive governance: Creating regulatory frameworks flexible enough to accommodate rapid technological evolution.

4) Environmental Impact and Sustainability: First, Do No Harm. Deploying ML systems in sensitive marine ecosystems raises environmental concerns requiring careful consideration and mitigation [3].

Direct Environmental Impacts.
• Acoustic pollution: Active sonar for ML training may exceed 200 dB source levels, with continuous monitoring creating 24/7 acoustic emissions. Marine mammals exhibit behavioural changes, and mass stranding events have been linked to naval sonar exercises [298].
• Physical presence: Deployed equipment creates entanglement risks for marine life, artificial reef effects that alter local ecosystems, and contamination potential from batteries and electronic components.
• Light pollution: Optical communication systems may disrupt biological rhythms, attract or repel species differentially, and interfere with bioluminescent signalling.

Indirect Environmental Impacts.
• Carbon footprint: Manufacturing sensors produces approximately 500 kg of CO2 per node, deployment operations generate 10 tons of CO2 per vessel-day, and data centre processing for network analysis may require megawatt-scale power consumption.
• Resource extraction: Rare earth elements for electronics, lithium for batteries, and copper for communications all carry environmental costs in mining and processing.
• E-waste: End-of-life disposal of underwater electronics creates pollution risks, particularly for nodes that cannot be recovered.

Environmental Research Priorities.
• Bio-compatible designs: ML systems engineered to coexist with marine life through appropriate materials, form factors, and operational patterns.
• Energy harvesting: Eliminating or reducing battery requirements through wave, thermal, and microbial fuel cell energy sources [28].
• Biodegradable components: Materials that safely decompose after mission completion, eliminating long-term pollution.
• Passive monitoring: ML approaches that operate without active acoustic or optical emissions, relying entirely on ambient signals.
• Impact assessment methodologies: Quantifying and monitoring the ecological effects of ML-enabled underwater networks.

Mitigation Strategies Requiring Development.
• Adaptive duty cycling: Automatically reducing acoustic emissions when marine mammals are detected in proximity [14].
• Frequency management: Avoiding biologically sensitive frequency bands used by local species for communication and navigation.
• Collaborative monitoring: Sharing infrastructure among multiple users to reduce redundant deployments and cumulative impact.
• Green ML: Optimising algorithms for minimal computational and communication requirements, reducing energy consumption throughout the network [297].
• Ecosystem restoration: Mandating environmental restoration activities as conditions for deployment permits.

C. Cross-Cutting Research Opportunities

Several research directions address multiple challenges simultaneously, offering high-leverage opportunities for advancing ML in underwater systems.

1) Integrated Sensing and Communication: The convergence of sensing and communication functions offers efficiency gains particularly valuable in resource-constrained underwater environments [299]. Joint waveform designs that simultaneously perform channel estimation, localisation, and data transmission reduce energy consumption and spectrum usage while providing richer inputs for ML algorithms.
Research opportunities include ML-optimised waveform design, joint sensing-communication protocols, and multi-function neural network architectures.

2) Cross-Domain Adaptation and Transfer: Developing methods for transferring ML models across different underwater environments—from coastal to deep sea, tropical to polar, acoustic to optical—would dramatically reduce data requirements and accelerate deployment [111]. Key challenges include identifying domain-invariant features, quantifying transferability, and developing safe adaptation procedures that avoid negative transfer.

3) Human-AI Collaboration: Many underwater ML applications require effective collaboration between autonomous systems and human operators [8]. Research opportunities include developing interfaces that communicate ML uncertainty and reasoning to operators, designing ML systems that can incorporate human guidance and corrections, and creating shared mental models between humans and underwater AI systems.

4) Integration with Space-Air-Ground-Sea Networks: Future IoUT systems will operate as components of integrated Space-Air-Ground-Sea (SAGS) networks, requiring ML approaches that span multiple domains [255], [257], [300]. Research needs include cross-domain handoff optimisation, heterogeneous data fusion, and unified ML architectures that operate across satellite, aerial, terrestrial, and underwater segments.

D. Summary of Research Priorities

Table XXVI synthesises the key research priorities identified throughout this section, mapping challenges to specific research opportunities and their potential impact. Figure 13 presents a visual taxonomy of the challenges and their interconnections, illustrating how technical limitations compound practical constraints and identifying high-priority research intersections.
The challenges documented in this section represent not obstacles but opportunities for researchers and practitioners to make significant contributions to a field of growing importance. As climate change intensifies pressure on marine ecosystems and the blue economy expands, the need for intelligent underwater networks becomes ever more urgent. Addressing these challenges requires collaboration across disciplines—ocean engineering, ML, marine biology, policy, and economics—to develop solutions that are technically sophisticated, practically deployable, and environmentally responsible.

IX. CONCLUSIONS

The convergence of ML and underwater communications represents a paradigm shift in humanity's ability to observe, understand, and interact with the ocean environment. This comprehensive survey has systematically examined ML applications across all layers of the IoUT protocol stack, revealing that intelligent algorithms do not merely optimise existing systems but fundamentally transform what is achievable in underwater networks. As IoUT systems transition from research demonstrations to operational deployments supporting climate monitoring, marine resource management, and national security, it is essential to synthesise the key insights, acknowledge transformative impacts, and chart actionable paths forward.

A. Synthesis of Key Findings

Our layer-by-layer analysis reveals that ML addresses fundamental challenges that have constrained underwater communications for decades. The evidence demonstrates not merely incremental optimisation but transformative capabilities enabling applications previously considered impossible.

1) Performance Achievements Across Protocol Layers: Table XXVII synthesises the quantitative improvements documented throughout this survey, organised by protocol layer. These results represent the current state of the art as of 2025, compiled from experimental deployments and rigorous simulation studies.

Physical Layer Transformation.
ML techniques have revolutionised fundamental signal processing tasks. Deep learning-based localisation achieves sub-metre accuracy (0.5–0.8 m) compared to 8.5 m errors from traditional trilateration—a 10–17× improvement that enables precision applications such as AUV docking and pipeline inspection. Remarkably high localisation accuracy (approaching 99.98% in controlled water tank environments [168]) has been demonstrated using adaptive k-NN approaches, establishing the potential ceiling for future deployments. Channel estimation using LSTM networks captures temporal correlations that analytical models miss, achieving substantial MSE reductions (see [151] for detailed results) while decreasing pilot overhead from 10–20% to below 5% of transmission time.

MAC and Network Layer Adaptation. Reinforcement learning enables protocols that adapt to conditions traditional approaches cannot model. Q-learning MAC protocols achieve substantial improvements in channel utilisation [171] by learning when aggressive transmission succeeds versus when conservative backoff prevents collisions—knowledge impossible to encode in fixed rules given the channel's stochastic nature. Network lifetime is substantially extended through DRL-based routing [177] that continuously balances energy consumption, delay, and reliability based on actual network conditions rather than worst-case assumptions.

Cross-Layer Synergies. Perhaps most significantly, cross-layer ML optimisation delivers 42% additional performance beyond layer-isolated approaches. Physical layer channel predictions inform MAC scheduling, which shapes network layer routing decisions—creating optimisation cascades impossible with traditional siloed protocol design.
The compound effect reduces daily energy consumption from 2,800 J to 180 J (a 15.6× improvement), transforming underwater sensors from short-lived devices requiring frequent battery replacement to persistent platforms operating for years.

TABLE XXVI
SUMMARY OF RESEARCH PRIORITIES FOR ML IN IOUT SYSTEMS

Challenge Category | Key Problem | Research Direction | Potential Impact
Data Scarcity | Million-dollar datasets | Self-supervised learning, PINNs | 100× reduction in data needs
Interpretability | Black-box decisions | Physics-grounded explanations | Enable regulatory approval
Real-Time Processing | Computational gap | TinyML, neuromorphic computing | 100× efficiency improvement
Adversarial Robustness | Natural/intentional attacks | Physics-constrained defences | Maintain 95%+ accuracy under attack
Federated Learning | Communication constraints | Hierarchical FL, compression | Enable collaborative training
Maintenance | Model drift in isolation | Continual learning, self-healing | Extend deployment 3–5×
Economic Viability | High deployment costs | Standardisation, cost-sharing | 50% cost reduction
Regulation | Compliance complexity | Certification frameworks | Accelerate deployment approval
Environmental | Acoustic pollution | Passive monitoring, green ML | Minimise ecosystem impact

Fig. 13. Taxonomy of open challenges in ML for IoUT systems. By aligning Federated Learning with Real-Time constraints, the vertical dependency is clarified. Bidirectional arrows (orange) show the interplay between technical robustness and practical deployment, centred around the high-priority research intersection.

TABLE XXVII
SUMMARY OF ML PERFORMANCE ACHIEVEMENTS ACROSS IOUT PROTOCOL STACK

Layer | Application | Traditional Performance | ML Performance | Key Enabling Technique
Physical | Localisation accuracy | 8.5 m error | 0.5–0.8 m error | CNN, DQN active sensing
Physical | Channel estimation MSE | 0.043 | 0.012 (significant reduction) | LSTM temporal modelling
Physical | Modulation classification | 75% @ 0 dB SNR | 96% @ 0 dB SNR | CNN feature learning
Physical | Adaptive modulation throughput | Baseline | +20–45% (substantial gain) | DQN policy optimisation
MAC | Channel utilisation | 8% | 18–42% (scenario-dependent) | Q-learning adaptive backoff
MAC | Collision rate | 45/hour | 12/hour (73% reduction) | Multi-agent RL coordination
MAC | Energy per bit | 2.8 mJ | 0.95 mJ (66% reduction) | TD3 power control
Network | Packet delivery ratio | 76% | 94% (24% gain) | GNN topology learning
Network | Network lifetime | 15 days | 41 days (substantial gain) | DRL energy-aware routing
Network | Void recovery success | 52% | 89% (71% gain) | DQN adaptive forwarding
Transport | Packet loss rate | 8.2% | 0.7% (91% reduction) | PPO congestion control
Transport | End-to-end delay | 18.3 s | 7.2 s (61% reduction) | LSTM traffic prediction
Application | Object detection mAP | 52% | 92% (77% gain) | YOLOv8 with attention
Application | Data compression ratio | 10:1 | 100:1 (10× gain) | Convolutional autoencoders
Application | Anomaly detection rate | 71% | 96% (35% gain) | VAE latent modelling
Cross-Layer | System-wide efficiency | Baseline | 42% additional gain | Multi-task learning
Cross-Layer | Energy efficiency | 2800 J/day | 180 J/day (15.6×) | Holistic optimisation

2) Critical Insights and Lessons Learned: Analysis of hundreds of ML applications in underwater environments reveals several fundamental insights that should guide future research and deployment.

Hybrid Approaches Dominate. Purely data-driven or purely model-based approaches consistently underperform hybrid methods that combine physical knowledge with learning.
Physics-informed neural networks achieve accurate acoustic field predictions from 100 measurements versus the millions required by pure ML approaches—addressing the critical data scarcity challenge. The ocean's complexity demands leveraging centuries of oceanographic knowledge rather than attempting to learn everything from scratch.

Co-Design is Essential. The extreme resource constraints underwater necessitate joint optimisation of algorithms and hardware. Successful deployments treat accuracy, latency, and energy as coupled objectives rather than independent metrics. Neuromorphic computing achieving 10 μW idle power and TinyML approaches enabling complex inference on microcontrollers demonstrate that computational limitations, while severe, are surmountable through thoughtful co-design.

Graceful Degradation Trumps Peak Performance. Underwater ML systems must maintain core functionality as sensors fail, communication degrades, and models drift. The 3–5× deployment lifetime extension achieved through continual learning approaches validates designing for resilience rather than optimal steady-state performance. Perfect operation is neither achievable nor necessary—robust partial functionality enables mission success.

Successful Deployment Patterns. Real-world implementations consistently follow a validated progression: starting simple with proven architectures, validating extensively in controlled environments, maintaining human oversight during initial operation, and continuously monitoring for degradation. The 98.5% vessel detection accuracy with 95% false alarm reduction achieved by Project AMMO demonstrates that systematic engineering, not algorithmic novelty, primarily determines deployment success.

B. Transformative Impact on the IoUT Field

ML has catalysed fundamental transformation across underwater communications and networking, shifting the field along multiple dimensions simultaneously.
1) From Reactive to Proactive Systems: Traditional underwater systems responded to conditions after they occurred: retransmitting after packet loss, rerouting after link failure, and surfacing after battery depletion. ML enables proactive systems that anticipate and prepare: LSTM networks predict channel degradation hours before it occurs, enabling preemptive modulation adjustment; GNNs forecast topology changes, allowing route pre-computation; and RL agents learn energy harvesting patterns, scheduling high-power operations during predicted abundance. This temporal shift from reaction to anticipation fundamentally changes operational paradigms.

2) From Rigid to Adaptive Protocols: Fixed-parameter protocols optimised for worst-case scenarios waste resources during favourable conditions and fail during unexpected extremes. ML-enabled adaptive protocols continuously learn and improve, optimising for actual conditions. The 200–300% throughput improvements demonstrated by learning-based MAC protocols reflect not algorithmic superiority but rather the fundamental advantage of adaptation over rigidity in stochastic environments.

3) From Isolated to Collaborative Networks: Federated learning enables unprecedented collaboration among underwater systems while preserving operational security. Military and commercial entities can jointly improve environmental models without exposing sensitive data—achieving 95% bandwidth reduction through distributed training while maintaining privacy. This collaborative paradigm multiplies the effective dataset size without centralised data collection, directly addressing the data scarcity challenge.

4) Economic and Scientific Acceleration: The economic equation for underwater operations fundamentally changes with ML. Autonomous operation for months rather than days reduces ship time from continuous presence to periodic deployment/recovery, cutting operational costs by orders of magnitude.
The $27 million cost of comprehensive labelled datasets is amortised across deployments through transfer learning. Predictive maintenance prevents costly failures, while optimised energy management extends deployment lifetime.

Scientific discovery accelerates commensurately. Pattern recognition in massive datasets reveals phenomena invisible to human analysis. Adaptive sampling guided by ML captures transient features that predetermined surveys miss, increasing detection of important events by 300%. The 10,000× acceleration in species identification demonstrated by FathomNet—enabling the discovery of 147 new species through automated anomaly detection—previews ML's potential for oceanographic science.

C. Research Roadmap and Call to Action

The progress documented in this survey represents the beginning rather than the culmination of ML's impact on underwater communications. Realising the full potential requires coordinated effort across multiple dimensions.

1) Priority Technical Directions: Figure 14 presents a technology roadmap organising research priorities by timeline and expected impact. Near-term efforts should focus on deployment-ready solutions, while longer-term research addresses fundamental limitations.

Near-Term Priorities (2025–2027):
• Transfer learning libraries: Curated pretrained models for common underwater tasks—localisation, channel estimation, species classification—enabling rapid deployment without extensive local training.
• TinyML deployment frameworks: Standardised toolchains for quantising and deploying models on underwater microcontrollers, with validated accuracy-efficiency tradeoffs.
• Hierarchical federated learning: Protocols enabling AUV-mediated model aggregation that respect acoustic bandwidth constraints while achieving convergence guarantees.
• PINN acoustic toolkits: Open-source implementations of physics-informed networks for standard underwater propagation scenarios, reducing the barrier to hybrid approaches.
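To give a flavour of the accuracy-efficiency tradeoff such TinyML toolchains must validate, the sketch below applies symmetric per-tensor int8 post-training quantisation to a hypothetical layer's weights. This is one common scheme, not the pipeline of any particular framework; real toolchains add per-channel scales, activation calibration, and operator fusion.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 post-training quantisation:
    returns the int8 codes and the scale needed to dequantise."""
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)  # hypothetical layer weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

bytes_fp32 = w.nbytes          # 4 bytes per weight in float32
bytes_int8 = q.nbytes          # 1 byte per weight: a 4x footprint reduction
rel_err = float(np.linalg.norm(w - w_hat) / np.linalg.norm(w))
```

The 4× memory saving comes at the cost of a small relative reconstruction error (well under 2% for these weights); a deployment toolchain would verify that this weight-level error does not translate into an unacceptable task-level accuracy drop before flashing the model to a microcontroller.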
Fig. 14. Technology roadmap for ML in underwater communications, organising research priorities across four dimensions: data and learning paradigms, computational architectures, collaborative frameworks, and physics integration. Near-term efforts (2025–2027) focus on deployment optimisation using proven techniques; medium-term (2027–2030) enables autonomous adaptation; long-term (2030–2033) achieves intelligent collaboration; and transformative capabilities (2035+) realise the cognitive ocean vision.

Medium-Term Priorities (2027–2030):
• Self-supervised pretraining: Foundation models trained on unlabelled underwater acoustic data, enabling task-specific fine-tuning with minimal labelled examples.
• Neuromorphic underwater processors: Custom silicon optimised for spiking neural networks in extreme power budgets (<100 μW continuous operation).
• Cross-domain federated learning: Protocols enabling knowledge transfer across coastal, deep-sea, polar, and tropical deployments while respecting domain differences.
• Real-time digital twins: Virtual replicas synchronised with physical deployments, enabling simulation-based training and what-if analysis.

Long-Term Priorities (2030–2035 and Beyond):
• Few-shot and zero-shot adaptation: Systems achieving deployment-ready performance from fewer than 10 local samples through meta-learning and semantic transfer.
• Quantum-classical hybrid optimisation: Leveraging near-term quantum devices for combinatorial problems in sensor placement and resource allocation.
• Global ocean federated network: International infrastructure enabling collaborative model improvement across institutional and national boundaries.
• Predictive ocean digital twins: Comprehensive virtual ocean enabling week-scale forecasting with kilometre-scale resolution.

2) Interdisciplinary Collaboration Imperatives: The challenges facing underwater ML transcend traditional disciplinary boundaries. Effective progress requires:
• Computer science–oceanography integration: Algorithms must respect physical constraints and leverage domain knowledge; this requires deep collaboration, not superficial consultation.
• Marine biology–engineering partnerships: Systems must monitor ecosystems without disrupting them, demanding joint design from conception through deployment.
• Academia–industry–government coordination: Transitioning research to operational systems requires sustained engagement across sectors with different timelines and incentives.
• International cooperation: Ocean-scale challenges ignore political boundaries; effective monitoring requires data sharing and coordinated deployment across jurisdictions.

3) Open Science and Reproducibility: Accelerating progress requires embracing open science principles:
• Dataset release: Anonymised, standardised datasets enabling comparative studies and reproducible research, building toward underwater equivalents of ImageNet.
• Code and model sharing: Open-source implementations through repositories enabling others to build upon previous work rather than reimplementing from papers.
• Standardised benchmarks: Common evaluation protocols and metrics enabling fair comparison across approaches and institutions.
• Negative result publication: Failed approaches and deployment lessons provide valuable guidance; journals and conferences should actively solicit such contributions.

4) Workforce Development: Realising ML's potential underwater requires developing human capital alongside technology:
• Interdisciplinary curricula: University programs combining oceanography, ML, and communications, none of which alone suffices.
• Industry engagement: Internships and co-ops exposing students to real underwater challenges beyond simulation.
• Professional development: Courses helping practicing engineers acquire ML skills relevant to their domains.
• Global accessibility: Online resources making underwater ML education available worldwide, not just at coastal institutions.

D. Vision for the Future

Looking ahead, the convergence of ML and underwater communications promises to fundamentally transform humanity's relationship with the ocean.

1) The Intelligent Ocean (2030–2035): Within the next decade, we envision persistent, adaptive monitoring networks spanning the global ocean. Millions of ML-enabled sensors will provide real-time, three-dimensional understanding of ocean state from surface to seafloor. Key characteristics include:
• Autonomous response: Swarms of AUVs responding to detected events, investigating anomalies without human intervention.
• Edge intelligence: Distributed processing handling exabytes locally, transmitting only critical insights through bandwidth-limited acoustic links.
• Predictive capability: Week-scale ocean forecasting with kilometre resolution, comparable to current atmospheric weather prediction.
• Continuous adaptation: Networks improving over time through federated learning, accumulating knowledge across deployments.
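The edge-intelligence characteristic above presumes models compact enough for underwater microcontrollers; the accuracy-efficiency tradeoff behind the TinyML roadmap item can be illustrated with affine 8-bit post-training quantisation. This is a generic sketch of the scale/zero-point scheme, not any particular toolchain's implementation:

```python
import numpy as np

def quantise(x, n_bits=8):
    """Affine (asymmetric) quantisation of a float tensor to n-bit ints."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(np.round(-x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantise(q, scale, zero_point):
    """Recover an approximate float tensor from the quantised ints."""
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=1000).astype(np.float32)  # a weight tensor
q, s, z = quantise(w)
w_hat = dequantise(q, s, z)

# 4x smaller than float32 storage; reconstruction error is roughly
# bounded by half a quantisation step (scale / 2).
print(q.nbytes, w.nbytes)  # 1000 4000
print(np.abs(w - w_hat).max())
```

The validated tradeoffs called for in the roadmap amount to measuring how this rounding error propagates to task accuracy on representative underwater data before committing a model to a deployed node.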
2) Symbiotic Human-Ocean Systems (2035+): Future underwater ML systems will work symbiotically with marine ecosystems:
• Biomimetic integration: Robots indistinguishable from marine life, monitoring ecosystems without disturbance.
• Environmental optimisation: Adaptive systems minimising acoustic pollution while maximising scientific value.
• Active restoration: ML-guided robots repairing coral reefs, removing pollution, and restoring degraded habitats.
• Interspecies communication: Algorithms decoding animal vocalisations, enabling new forms of human-ocean interaction.

3) Democratised Ocean Access: Advanced ML will make ocean exploration accessible beyond well-funded institutions:
• Citizen science: Low-cost, ML-enabled sensors enabling broad participation in ocean monitoring.
• Virtual exploration: Immersive experiences powered by underwater ML, allowing anyone to explore the deep sea.
• Open tools: AI assistants helping non-experts interpret ocean data and make discoveries.
• Global equity: Enabling developing nations to effectively monitor their waters through accessible technology.

E. Concluding Remarks

The ocean, covering 71% of Earth's surface and containing 97% of its water, remains humanity's last frontier. For decades, technological limitations have constrained our ability to observe, understand, and protect this critical resource. ML, adapted to the unique challenges of underwater environments, finally provides tools commensurate with the ocean's importance. This survey has documented the transformation already underway: neural networks overcoming channel distortions that defied traditional signal processing; reinforcement learning discovering strategies impossible to derive analytically; federated learning enabling collaboration across competitive boundaries; and physics-informed approaches extracting maximum insight from sparse data.
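The physics-informed approaches mentioned above anchor learning to a PDE residual rather than to labels alone. As a minimal illustration (generic, not taken from any toolkit in the survey; finite differences stand in for the automatic differentiation a real PINN would use), the physics loss for the 1-D acoustic wave equation is the mean squared residual u_tt − c²·u_xx at collocation points, which vanishes for a field obeying the physics:

```python
import numpy as np

# Non-dimensional units: sound speed c = 1, one wavelength = 1 unit.
c = 1.0
k = 2 * np.pi          # wavenumber
omega = c * k          # dispersion relation for the wave equation
h = 1e-4               # finite-difference step

def u(x, t):
    """Exact travelling-wave solution u(x, t) = sin(kx - wt)."""
    return np.sin(k * x - omega * t)

def residual(x, t):
    """Wave-equation residual u_tt - c^2 u_xx via central differences."""
    u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - c ** 2 * u_xx

x = np.linspace(0.0, 5.0, 200)   # collocation points in space
t = np.full_like(x, 0.01)        # a fixed time slice
loss_physics = np.mean(residual(x, t) ** 2)
print(loss_physics)  # ≈ 0 for the exact solution
```

A PINN adds this term to its data-fitting loss, so even sparse underwater measurements constrain the network everywhere the physics is evaluated, which is the sense in which such models extract maximum insight from sparse data.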
The quantitative evidence is compelling—order-of-magnitude improvements in efficiency, accuracy, and capability that enable applications previously considered impossible. Yet we stand at the beginning. The challenges are im- mense: million-dollar datasets, black-box decisions in safety- critical systems, and computational constraints that would be unacceptable in any terrestrial application. The stakes are correspondingly high: climate change accelerates, ma- rine ecosystems face unprecedented pressure, and sustainable ocean resource management becomes ever more critical. The research community must rise to meet this challenge with urgency, creativity, and collaboration. The technology roadmap presented herein provides direction; the call to action identifies specific priorities; the vision articulates the desti- nation. Progress requires not just algorithmic innovation but institutional change: embracing open science, building inter- disciplinary teams, and investing in workforce development. The convergence of ML and underwater communications is not merely technical evolution but revolution in how we per- ceive, understand, and interact with seven-tenths of our planet. This survey has mapped the current landscape, identified chal- lenges ahead, and pointed toward promising horizons. Now it falls to researchers, engineers, policymakers, and practitioners to navigate these waters, guided by the knowledge that our efforts today will determine whether future generations inherit an ocean that is understood, protected, and thriving. The choice, and the responsibility, is ours. ACKNOWLEDGMENTS This work was supported by the Petroleum Technology Development Fund (PTDF) of the Federal Republic of Nigeria [grant number 1353/18]. REFERENCES [1] C. Funk, P. Peterson, M. Landsfeld, D. Pedreros, J. Verdin, S. Shukla, G. Husak, J. Rowland, L. Harrison, A. 
Hoell et al., “The climate hazards infrared precipitation with stations—a new environmental record for monitoring extremes,” Scientific Data, vol. 2, no. 1, p. 1–21, 2015, provides context on ocean-climate interactions and monitoring needs. [2] M. C. Domingo, “An overview of the internet of underwater things,” Journal of Network and Computer Applications, vol. 35, no. 6, p. 1879–1890, 2012. [3] M. Jahanbakht, W. Xiang, L. Hanzo, and M. Rahimi Azghadi, “Internet of underwater things and big marine data analytics—a comprehensive survey,” IEEE Communications Surveys Tutorials, vol. 23, no. 2, p. 904–956, 2021. [4] S. A. H. Mohsan, A. Mazinani, N. Q. H. Othman, and H. Amjad, “Towards the internet of underwater things: a comprehensive survey,” Earth Science Informatics, p. 1–30, 2022. [5] R. A. Khalil et al., “Semantic communication in underwater IoT networks for meaning-driven connectivity,” arXiv preprint, vol. arXiv:2601.13289, Jan. 2026, first comprehensive survey on semantic communication for IoUT. [6] S. A. H. Mohsan, A. Mazinani, N. Q. H. Othman, and H. Amjad, “Towards the internet of underwater things: A comprehensive survey,” Earth Science Informatics, p. 1–30, 2022. [7] G. Xu, W. Shen, and X. Wang, “Applications of wireless sensor net- works in marine environment monitoring: A survey,” Sensors, vol. 14, no. 9, p. 16 932–16 954, 2014. [8] L. Christensen, J. de Gea Fern ́ andez, M. Hildebrandt, C. E. S. Koch, and B. Wehbe, “Recent advances in ai for navigation and control of underwater robots,” Current Robotics Reports, p. 1–11, 2022. [9] Z. Fang, D. Jiang, J. Huang, C. Cheng, Q. Sha, B. He, and G. Li, “Au- tonomous underwater vehicle formation control and obstacle avoidance using multi-agent generative adversarial imitation learning,” Ocean Engineering, vol. 262, p. 112182, 2022. [10] Y. Li, S. Wang, C. Jin, Y. Zhang, and T. 
Jiang, “A survey of underwater magnetic induction communications: Fundamental issues, recent advances, and challenges,” IEEE Communications Surveys & Tutorials, vol. 21, no. 3, p. 2466–2487, 2019. [11] A. R. Rashid and A. Chennu, “A trillion coral reef colors: Deeply annotated underwater hyperspectral images for automated classification and habitat mapping,” Data, vol. 5, no. 1, p. 19, 2020. 66 [12] S. Li, W. Qu, C. Liu, T. Qiu, and Z. Zhao, “Survey on high reliability wireless communication for underwater sensor networks,” Journal of Network and Computer Applications, vol. 148, p. 102446, 2019. [13] P. N. Mahalle, P. A. Shelar, G. R. Shinde, and N. Dey, “Threats and attacks in uwsn,” in The Underwater World for Digital Data Transmission. Springer, 2021, p. 43–53. [14] G. Song, X. Guo, W. Wang, Q. Ren, J. Li, and L. Ma, “A machine learning-based underwater noise classification method,” Applied Acous- tics, vol. 184, p. 108333, 2021. [15] H. Zhang, S. Zhang, Y. Wang, Y. Liu, Y. Yang, T. Zhou, and H. Bian, “Subsea pipeline leak inspection by autonomous underwater vehicle,” Applied Ocean Research, vol. 107, p. 102321, 2021. [16] L. Stanchev, H. Egbert, and B. Ruttenberg, “Automating deep-sea video annotation using machine learning,” in 2020 IEEE 14th International Conference on Semantic Computing (ICSC), 2020, p. 17–24. [17] K. Banno, M. Yano, K. Maeda, T. Yamamoto, D. Yoshida, and A. Obara, “Identifying losers: Automatic identification of growth- stunted salmon in aquaculture using computer vision,” Machine Learn- ing with Applications, vol. 15, 2024. [18] I. F. Akyildiz, D. Pompili, and T. Melodia, “Underwater acoustic sensor networks: research challenges,” Ad hoc networks, vol. 3, no. 3, p. 257–279, 2005. [19] M. Stojanovic, “On the relationship between capacity and distance in an underwater acoustic communication channel,” ACM SIGMOBILE Mobile Computing and Communications Review, vol. 11, no. 4, p. 34–43, 2007. [20] S. Khisa and S. 
Moh, “Survey on recent advancements in energy- efficient routing protocols for underwater wireless sensor networks,” IEEE Access, vol. 9, p. 55 045–55 062, 2021. [21] L. Bjørnø, Applied underwater acoustics. Elsevier, 2017. [22] P. Qarabaqi and M. Stojanovic, “Statistical characterization and com- putationally efficient modeling of a class of underwater acoustic com- munication channels,” IEEE Journal of Oceanic Engineering, vol. 38, no. 4, p. 701–717, 2013. [23] Z. Zeng, S. Fu, H. Zhang, Y. Dong, and J. Cheng, “A survey of underwater optical wireless communications,” IEEE communications surveys & tutorials, vol. 19, no. 1, p. 204–238, 2016. [24] I. Romdhane and G. Kaddoum, “A reinforcement learning based beam adaptation for underwater optical wireless communications,” IEEE Internet of Things Journal, p. 1–1, 2022. [25] X. Che, I. Wells, G. Dickers, P. Kear, and X. Gong, “Re-evaluation of RF electromagnetic communication in underwater sensor networks,” IEEE Communications Magazine, vol. 48, no. 12, p. 143–151, 2010. [26] H.-X. Zou, M. Li, L.-C. Zhao, Q.-H. Gao, K.-X. Wei, L. Zuo, F. Qian, and W.-M. Zhang, “A magnetically coupled bistable piezoelectric harvester for underwater energy harvesting,” Energy, vol. 217, p. 119429, 2021. [27] M. Han, J. Duan, S. Khairy, and L. X. Cai, “Enabling sustainable underwater iot networks with energy harvesting: a decentralized rein- forcement learning approach,” IEEE Internet of Things Journal, vol. 7, no. 10, p. 9953–9964, 2020. [28] —, “Enabling sustainable underwater iot networks with energy harvesting: A decentralized reinforcement learning approach,” IEEE Internet of Things Journal, vol. 7, no. 10, p. 9953–9964, 2020. [29] Z. A. Khan, O. A. Karim, S. Abbas, N. Javaid, Y. B. Zikria, and U. Tariq, “Q-learning based energy-efficient and void avoidance routing protocol for underwater acoustic sensor networks,” Computer Net- works, vol. 197, p. 108309, 2021. [30] B. Mishachandar and S. 
Vairamuthu, “Diverse ocean noise classifi- cation using deep learning,” Applied Acoustics, vol. 181, p. 108141, 2021. [31] L. Huang, Y. Wang, Q. Zhang, J. Han, W. Tan, and Z. Tian, “Machine learning for underwater acoustic communications,” IEEE Wireless Communications, vol. 29, no. 3, p. 102–108, 2022. [32] K. G. Omeke, A. I. Abubakar, L. Zhang, Q. H. Abbasi, and M. A. Imran, “How reinforcement learning is helping to solve internet-of- underwater-things problems,” IEEE Internet of Things Magazine, vol. 5, no. 4, p. 24–29, 2022. [33] L. Huang, Y. Wang, Q. Zhang, J. Han, W. Tan, and Z. Tian, “Machine learning for underwater acoustic communications,” IEEE Wireless Communications, vol. 29, no. 3, p. 102–108, 2022. [34] Y. Zhang, J. Li, Y. Zakharov, X. Li, and J. Li, “Deep learning based underwater acoustic ofdm communications,” Applied Acoustics, vol. 154, p. 53–58, 2019. [35] H. Wang, Y. Li, and J. Qian, “Self-adaptive resource allocation in underwater acoustic interference channel: A reinforcement learning approach,” IEEE Internet of Things Journal, vol. 7, no. 4, p. 2816– 2827, 2020. [36] R. Raj Priyadarshini and N. Sivakumar, “Enhancing coverage and connectivity using energy prediction method in underwater acoustic wsn,” Journal of Ambient Intelligence and Humanized Computing, vol. 11, p. 2751–2760, 2020. [37] R. C. Hsu, C.-T. Liu, and H.-L. Wang, “A reinforcement learning-based tod provisioning dynamic power management for sustainable operation of energy harvesting wireless sensor node,” IEEE Transactions on Emerging Topics in Computing, vol. 2, no. 2, p. 181–191, 2014. [38] J. Christensen, P. E. Wahl, and F. S. Hover, “AUV path planning for data collection using deep reinforcement learning,” IEEE Journal of Oceanic Engineering, vol. 47, no. 4, p. 1012–1028, 2022, dDPG and SAC for AUV trajectory optimization, 15-25% data utility improve- ment. [39] P. Consul, I. Budhiraja, and D. 
Garg, “Deep reinforcement learning based reliable data transmission scheme for internet of underwater things in 5g and beyond networks,” Procedia Computer Science, vol. 235, p. 1752–1760, 2024. [40] Y. He, G. Han, A. Li, T. Taleb, C. Wang, and H. Yu, “A federated deep reinforcement learning-based trust model in underwater acoustic sensor networks,” IEEE Transactions on Mobile Computing, vol. 23, p. 5150–5165, 2024. [41] N. Victor, R. C, M. Alazab, S. Bhattacharya, S. Magnusson, P. K. Reddy Maddikunta, K. Ramana, and T. Reddy Gadekallu, “Federated Learning for IoUT: Concepts, Applications, Challenges and Opportu- nities,” arXiv e-prints, p. arXiv:2207.13976, Jul. 2022. [42] W. Su, J. Lin, K. Chen, L. Xiao, and C. En, “Reinforcement learning- based adaptive modulation and coding for efficient underwater com- munications,” IEEE Access, vol. 7, p. 67 539–67 550, 2019. [43] F. Ahmed and H.-S. Cho, “A time-slotted data gathering medium access control protocol using q-learning for underwater acoustic sensor networks,” IEEE Access, vol. 9, p. 48 742–48 752, 2021. [44] R. T. Rodoshi, Y. Song, and W. Choi, “Reinforcement learning- based routing protocol for underwater wireless sensor networks: A comparative survey,” IEEE Access, vol. 9, p. 154 578–154 599, 2021. [45] Defense Innovation Unit, “Project ammo: Accelerated machine learning for maritime operations,” U.S. Department of Defense, Tech. Rep., 2023. [Online]. Available: https://w.diu.mil [46] K. Katija, E. Orenstein, B. Schlining, L. Lundsten, K. Barnard, G. Sainz, O. Boulais, M. Cromwell, E. Butler, B. Woodward, and K. C. Bell, “Fathomnet: A global image database for enabling artificial intelligence in the ocean,” Scientific Reports, vol. 12, no. 1, p. 15914, 2022. [47] M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, p. 
686–707, 2019. [48] Z. Bi, N. Zhang, Y. Xue, Y. Ou, D. Ji, G. Zheng, and H. Chen, “OceanGPT: A large language model for ocean science tasks,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024, p. 3292–3310. [49] M. A. Alsheikh, S. Lin, D. Niyato, and H.-P. Tan, “Machine learning in wireless sensor networks: Algorithms, strategies, and applications,” IEEE Communications Surveys & Tutorials, vol. 16, no. 4, p. 1996– 2018, 2014. [50] M. I. Jordan and T. M. Mitchell, “Machine learning: Trends, perspec- tives, and prospects,” Science, vol. 349, no. 6245, p. 255–260, 2015. [51] J. Heidemann, M. Stojanovic, and M. Zorzi, “Underwater sensor net- works: Applications, advances and challenges,” Philosophical Transac- tions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 370, no. 1958, p. 158–175, 2012. [52] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436–444, 2015. [53] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning.MIT Press, 2016. [54] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT press, 2018. [55] J. Yan, Y. Gong, C. Chen, X. Luo, and X. Guan, “Auv-aided localiza- tion for internet of underwater things: A reinforcement-learning-based method,” IEEE Internet of Things Journal, vol. 7, no. 10, p. 9728– 9746, 2020. [56] M. J. Bianco, P. Gerstoft, J. Traer, E. Ozanich, M. A. Roch, S. Gannot, and C.-A. Deledalle, “Machine learning in acoustics: Theory and applications,” The Journal of the Acoustical Society of America, vol. 146, no. 5, p. 3590–3628, 2019. 67 [57] M. Z. Alom, T. M. Taha, C. Yakopcic, S. Westberg, P. Sidike, M. S. Nasrin, M. Hasan, B. C. Van Essen, A. A. Awwal, and V. K. Asari, “A state-of-the-art survey on deep learning theory and architectures,” Electronics, vol. 8, no. 3, p. 292, 2019. [58] M. C. 
Domingo, “Overview of channel models for underwater wireless communication networks,” Physical Communication, vol. 1, no. 3, p. 163–182, 2008. [59] H. Niu, E. Reeves, and P. Gerstoft, “Source localization in an ocean waveguide using supervised machine learning,” Journal of the Acous- tical Society of America, vol. 142, no. 3, p. 1176–1188, 2017. [60] W. Liu, H. Niu, P. Gerstoft, and R. Zhang, “CNN-based source localization in deep ocean with sound speed mismatch,” Journal of the Acoustical Society of America, vol. 147, no. 4, p. 2307–2319, 2020, mTL-CNN for deep ocean localization, South China Sea experiments. [61] S. Feng and B. Zhu, “A transformer-based deep learning network for underwater acoustic target recognition,” IEEE Journal of Oceanic Engineering, vol. 47, no. 4, p. 1469–1479, 2022. [62] X. Luo, L. Chen, H. Zhou, and H. Cao, “A survey of underwater acous- tic target recognition methods based on machine learning,” Journal of Marine Science and Engineering, vol. 11, no. 2, p. 384, 2023. [63] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics and intelligent laboratory systems, vol. 2, no. 1-3, p. 37–52, 1987. [64] N. Krishnaraj, M. Elhoseny, M. Thenmozhi, M. M. Selim, and K. Shankar, “Deep learning model for real-time image compression in internet of underwater things (iout),” Journal of Real-Time Image Processing, vol. 17, no. 6, p. 2097–2111, 2020. [65] M. A. Alsheikh, S. Lin, D. Niyato, and H.-P. Tan, “Machine learning in wireless sensor networks: Algorithms, strategies, and applications,” IEEE Communications Surveys & Tutorials, vol. 16, no. 4, p. 1996– 2018, 2014. [66] D. P. Kumar, T. Amgoth, and C. S. R. Annavarapu, “Machine learn- ing algorithms for wireless sensor networks: A survey,” Information Fusion, vol. 49, p. 1–25, 2019. [67] W. Qiao, M. Khishe, and S. 
Ravakhah, “Underwater targets classifi- cation using local wavelet acoustic pattern and multi-layer perceptron neural network optimized by modified whale optimization algorithm,” Ocean Engineering, vol. 219, p. 108415, 2021. [68] S. Mittal, S. Srivastava, and J. P. Jayanth, “A survey of deep learning techniques for underwater image classification,” IEEE Transactions on Neural Networks and Learning Systems, p. 1–15, 2022. [69] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE Transactions on Information Theory, vol. 13, no. 1, p. 21–27, 1967. [70] B. Beckler, A. Pfau, M. Orescanin, S. Atchley, N. Villemez, J. E. Joseph, C. W. Miller, and T. Margolina, “Multilabel classification of heterogeneous underwater soundscapes with bayesian deep learning,” IEEE Journal of Oceanic Engineering, vol. 47, no. 4, p. 1143–1154, 2022. [71] L. C. Domingos, P. E. Santos, P. S. Skelton, R. S. Brinkworth, and K. Sammut, “A survey of underwater acoustic data classification meth- ods using deep learning for shoreline surveillance,” Sensors, vol. 22, no. 6, p. 2181, 2022. [72] S. Tong and D. Koller, “Support vector machine active learning with applications to text classification,” Journal of machine learning research, vol. 2, no. Nov, p. 45–66, 2001. [73] J. R. Quinlan, “Induction of decision trees,” Machine learning, vol. 1, no. 1, p. 81–106, 1986. [74] Authors, “A dynamic trust evaluation and update model using advance decision tree for underwater wireless sensor networks,” Scientific Reports, vol. 14, p. 72775, 2024. [75] Y. Zhou, B. Li, J. Wang, E. Rocco, and Q. Meng, “Discovering unknowns: Context-enhanced anomaly detection for curiosity-driven autonomous underwater exploration,” Pattern Recognition, vol. 131, p. 108860, 2022. [76] Y. Chen, W. Yu, X. Sun, L. Wan, Y. Tao, and X. Xu, “Environment- aware communication channel quality prediction for underwater acous- tic transmissions: A machine learning method,” Applied Acoustics, vol. 181, p. 108128, 2021. [77] S. C. 
Johnson, “Hierarchical clustering schemes,” Psychometrika, vol. 32, no. 3, p. 241–254, 1967. [78] K. G. Omeke, M. S. Mollel, M. Ozturk, S. Ansari, L. Zhang, Q. H. Abbasi, and M. A. Imran, “Dekcs: A dynamic clustering protocol to prolong underwater sensor networks,” IEEE Sensors Journal, vol. 21, no. 7, p. 9457–9464, 2021. [79] J. P. Ortega, M. Del, R. B. Rojas, and M. J. Somodevilla, “Research issues on k-means algorithm: An experimental trial using matlab,” in CEUR workshop proceedings: semantic web and new technologies, 2009, p. 83–96. [80] H. Harb, A. Makhoul, and R. Couturier, “An enhanced k-means and anova-based clustering approach for similarity aggregation in underwater wireless sensor networks,” IEEE Sensors Journal, vol. 15, no. 10, p. 5483–5493, 2015. [81] W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, “An application-specific protocol architecture for wireless microsensor net- works,” IEEE Transactions on Wireless Communications, vol. 1, no. 4, p. 660–670, Oct 2002. [82] Q. Bai and C. Jin, “A k-means and ant colony optimization-based routing in underwater sensor networks,” Mobile Information Systems, vol. 2022, 2022. [83] M. Wang, Y. Chen, X. Sun, F. Xiao, and X. Xu, “Node energy consumption balanced multi-hop transmission for underwater acoustic sensor networks based on clustering algorithm,” IEEE Access, vol. 8, p. 191 231–191 241, 2020. [84] J. Zhu, Y. Chen, X. Sun, J. Wu, Z. Liu, and X. Xu, “Ecrkq: Machine learning-based energy-efficient clustering and cooperative routing for mobile underwater acoustic sensor networks,” IEEE Access, vol. 9, p. 70 843–70 855, 2021. [85] Y. Sun, M. Zheng, X. Han, S. Li, and J. Yin, “Adaptive clustering routing protocol for underwater sensor networks,” Ad Hoc Networks, vol. 136, p. 102953, 2022. [86] J. C. Bezdek, Pattern recognition with fuzzy objective function algo- rithms. Springer Science & Business Media, 2013. [87] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. 
Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, p. 26–38, 2017. [88] X. Wang, S. Wang, X. Liang, D. Zhao, J. Huang, X. Xu, B. Dai, and Q. Miao, “Deep reinforcement learning: A survey,” IEEE Transactions on Neural Networks and Learning Systems, p. 1–15, 2022. [89] C. J. C. H. Watkins, “Learning from delayed rewards,” Ph.D. disserta- tion, King’s College, Cambridge, UK, 1989. [90] W. Chen, X. Qiu, T. Cai, H.-N. Dai, Z. Zheng, and Y. Zhang, “Deep reinforcement learning for internet of things: A comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 23, no. 3, p. 1659– 1692, 2021. [91] G. A. Rummery and M. Niranjan, On-line Q-learning using connec- tionist systems. Citeseer, 1994, vol. 37. [92] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., “Mastering the game of go with deep neural networks and tree search,” nature, vol. 529, no. 7587, p. 484–489, 2016. [93] L. Graesser and W. L. Keng, Foundations of deep reinforcement learning: theory and practice in Python. Addison-Wesley Professional, 2019. [94] R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, “Policy gradient methods for reinforcement learning with function approximation,” Advances in neural information processing systems, vol. 12, 1999. [95] S. Han, L. Li, X. Li, Z. Liu, L. Yan, and T. Zhang, “Joint relay selection and power allocation for time-varying energy harvesting-driven uasns: A stratified reinforcement learning approach,” IEEE Sensors Journal, vol. 22, no. 20, p. 20 063–20 072, 2022. [96] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Prox- imal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017. [97] T. T. Nguyen, N. D. Nguyen, and S. Nahavandi, “Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications,” IEEE transactions on cybernetics, vol. 
50, no. 9, p. 3826–3839, 2020. [98] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforce- ment learning,” arXiv preprint arXiv:1509.02971, 2015. [99] T. M. Moerland, J. Broekens, and C. M. Jonker, “Model-based rein- forcement learning: A survey,” arXiv preprint arXiv:2006.16712, 2020. [100] F.-M. Luo, T. Xu, H. Lai, X.-H. Chen, W. Zhang, and Y. Yu, “A survey on model-based reinforcement learning,” arXiv preprint arXiv:2206.09328, 2022. [101] R. S. Sutton, “Dyna, an integrated architecture for learning, planning, and reacting,” ACM Sigart Bulletin, vol. 2, no. 4, p. 160–163, 1991. [102] R. W. Coutinho, “Machine learning for self-adaptive internet of under- water things,” in Proceedings of the 10th ACM Symposium on Design and Analysis of Intelligent Vehicular Networks and Applications, 2020, p. 65–69. 68 [103] M. Moniruzzaman, S. M. S. Islam, M. Bennamoun, and P. Lavery, “Deep learning on underwater marine object detection: A survey,” in International Conference on Advanced Concepts for Intelligent Vision Systems. Springer, 2017, p. 150–160. [104] Y. Li, H. Lu, J. Li, X. Li, Y. Li, and S. Serikawa, “Underwater image de-scattering and classification by deep neural network,” Computers & Electrical Engineering, vol. 54, p. 68–77, 2016. [105] K. A. Skinner, J. Zhang, E. A. Olson, and M. Johnson-Roberson, “Uwstereonet: Unsupervised learning for depth estimation and color correction of underwater stereo imagery,” in 2019 International Con- ference on Robotics and Automation (ICRA). IEEE, 2019, p. 7947– 7954. [106] A. K. Cherian, E. Poovammal, N. S. Philip, K. Ramana, S. Singh, and I.-H. Ra, “Deep learning based filtering algorithm for noise removal in underwater images,” Water, vol. 13, no. 19, p. 2742, 2021. [107] S. J. Bertram, Y. Fan, D. Raffelt, and P. 
Michalak, “An applied machine learning approach to subsea asset inspection,” in Abu Dhabi International Petroleum Exhibition & Conference. OnePetro, 2018. [108] G. Huo, Z. Wu, and J. Li, “Underwater object classification in sidescan sonar images using deep transfer learning and semisynthetic training data,” IEEE access, vol. 8, p. 47 407–47 418, 2020. [109] P. Sarkar, S. De, and S. Gurung, “A survey on underwater object detection,” in Intelligence Enabled Research. Springer, 2022, p. 91– 104. [110] I. Kvasi ́ c, N. Mi ˇ skovi ́ c, and Z. Vuki ́ c, “Convolutional neural network architectures for sonar-based diver detection and tracking,” in OCEANS 2019 - Marseille, 2019, p. 1–6. [111] Z. Zhu, K. Lin, and J. Zhou, “Transfer learning in deep reinforcement learning: A survey,” arXiv preprint arXiv:2009.07888, 2020. [112] W. Huang, P. Wu, J. Lu et al., “STNet: Prediction of underwater sound speed profiles with an advanced semi-transformer neural network,” Journal of Marine Science and Engineering, vol. 13, no. 7, p. 1370, 2025. [113] P. M, S. P. R. M, Q.-V. Pham, K. Dev, P. K. Reddy Maddikunta, T. Reddy Gadekallu, and T. Huynh-The, “Fusion of Federated Learn- ing and Industrial Internet of Things: A Survey,” arXiv e-prints, p. arXiv:2101.00798, Jan. 2021. [114] J. Pei, W. Liu, L. Wang, C. Liu, A. K. Bashir, and Y. Wang, “Fed-IoUT: Opportunities and challenges of federated learning in the internet of underwater things,” IEEE Internet of Things Magazine, vol. 6, no. 1, p. 108–112, 2023. [115] G. Cirincione and D. Verma, “Federated machine learning for multi- domain operations at the tactical edge,” in Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, vol. 11006. SPIE, 2019, p. 29–48. [116] Z. Qin, J. Ye, J. Meng, B. Lu, and L. Wang, “Privacy-preserving blockchain-based federated learning for marine internet of things,” IEEE Transactions on Computational Social Systems, vol. 9, no. 1, p. 159–173, 2022. [117] Y. Gao, L. Liu, B. Hu, T. 
Lei, and H. Ma, “Federated region-learning for environment sensing in edge computing system,” IEEE Transactions on Network Science and Engineering, vol. 7, no. 4, p. 2192–2204, 2020. [118] H. Zhao, F. Ji, Q. Li, Q. Guan, S. Wang, and M. Wen, “Federated meta- learning enhanced acoustic radio cooperative framework for ocean of things,” IEEE Journal of Selected Topics in Signal Processing, vol. 16, no. 3, p. 474–486, 2022. [119] Authors, “Federated learning-based privacy-preserving internet of un- derwater things: a vision, architecture, computing, taxonomy, and future directions,” The Journal of Supercomputing, May 2025. [120] J. Yan, Y. Meng, X. Yang, X. Luo, and X. Guan, “Privacy-preserving localization for underwater sensor networks via deep reinforcement learning,” IEEE Transactions on Information Forensics and Security, vol. 16, p. 1880–1895, 2021. [121] A. Salman, A. Jalal, F. Shafait, A. Mian, M. Shortis, J. Seager, and E. Harvey, “Fish species classification in unconstrained underwater environments based on deep learning,” Limnology and Oceanography: Methods, vol. 14, no. 9, p. 570–585, 2016. [122] X. Liu et al., “An end-to-end underwater acoustic target recognition model based on one-dimensional convolution and transformer,” Journal of Marine Science and Engineering, vol. 12, no. 10, p. 1793, 2024. [123] J. Tang, E. Ma, Y. Qu, W. Gao, Y. Zhang, and L. Gan, “UAPT: An underwater acoustic target recognition method based on pre-trained transformer,” Multimedia Systems, vol. 31, no. 1, 2025. [124] Y. Iqbal et al., “An efficient transformer architecture with depthwise separable convolutions for high-accuracy underwater acoustic target recognition,” Scientific Reports, vol. 15, p. 32401, 2025. [125] X. Li et al., “An effective convolutional and transformer cooperation network for underwater acoustic target recognition,” Engineering Ap- plications of Artificial Intelligence, vol. 141, p. 109832, 2025. [126] Y. Wang, J. Xiao, X. Cheng, Q. Wei, and N. 
Tang, “Underwater acoustic signal classification based on a spatial-temporal fusion neural network,” Frontiers in Marine Science, vol. 11, p. 1331717, 2024. [127] Y. Wang, H. Zhang, W. Huang, M. Zhang, and Y. Gao, “DWSTr: A hybrid framework for ship-radiated noise recognition,” Frontiers in Marine Science, vol. 11, p. 1334057, 2024. [128] J. Guo, B. He, and Q. Sha, “Shallow-sea application of an intelligent fusion module for low-cost sensors in AUV,” Ocean Engineering, vol. 148, p. 386–400, 2018. [129] K. Xu et al., “Self-supervised learning-based underwater acoustical signal classification via mask modeling,” Journal of the Acoustical Society of America, vol. 154, no. 1, p. 5–15, 2023. [130] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” AI Open, vol. 1, p. 57–81, 2020. [131] X. Chen et al., “A secure routing protocol based on graph neural networks for underwater acoustic sensor networks,” in IEEE International Conference on Communications, 2024. [132] X. Wang et al., “Routing protocol for underwater wireless sensor networks based on a trust model and void-avoided algorithm,” Sensors, vol. 24, no. 23, p. 7614, 2024. [133] T. Luo, W. Chen, Y. Zhang, and M. Li, “Delay-tolerant networking for underwater sensor networks: A reinforcement learning approach,” Ad Hoc Networks, vol. 112, p. 102382, 2021, 75-85% contact prediction accuracy, 20-30% delivery ratio improvement. [134] L. Nkenyereye, L. Nkenyereye, and B. Ndibanje, “Internet of underwater things: A survey on simulation tools and 5G-based underwater networks,” Electronics, vol. 13, no. 3, p. 474, 2024. [135] R. F. Prudencio, M. R. Maximo, and E. L. Colombini, “A survey on offline reinforcement learning: Taxonomy, review, and open problems,” arXiv preprint arXiv:2203.01387, 2022. [136] P. V. Klaine, M. A. Imran, O. Onireti, and R. D.
Souza, “A survey of machine learning techniques applied to self-organizing cellular networks,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, p. 2392–2431, 2017. [137] J. Luo, Y. Chen, M. Wu, and Y. Yang, “A survey of routing protocols for underwater wireless sensor networks,” IEEE Communications Surveys & Tutorials, vol. 23, no. 1, p. 137–160, 2021. [138] N. Wang, Y. Wang, and M. J. Er, “Review on deep learning techniques for marine object recognition: Architectures and algorithms,” Control Engineering Practice, vol. 118, p. 104458, 2022. [139] X. Luo, L. Chen, H. Zhou, and H. Cao, “A survey of underwater acoustic target recognition methods based on machine learning,” Journal of Marine Science and Engineering, vol. 11, no. 2, p. 384, 2023. [140] S. Jiang, “On securing underwater acoustic networks: A survey,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, p. 729–752, 2018. [141] A. Saleh, M. Sheaves, and M. Rahimi Azghadi, “Computer vision and deep learning for fish classification in underwater habitats: A survey,” Fish and Fisheries, 2022. [142] Z. Yang, Z. Zhu, Y. Zhao, Y. Tian, C. Fan, R. Guo, W. Lu, J. Ge, B. Chen, Y. Zhang et al., “A comprehensive survey on underwater acoustic target positioning and tracking: Progress, challenges, and perspectives,” arXiv preprint arXiv:2506.14165, 2025. [143] L. Huang, Q. Zhang, W. Tan, Y. Wang, L. Zhang, C. He, and Z. Tian, “Adaptive modulation and coding in underwater acoustic communications: a machine learning perspective,” EURASIP Journal on Wireless Communications and Networking, vol. 2020, no. 1, p. 1–25, 2020. [144] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, vol. 34, no. 6, p. 26–38, 2017. [145] H.-P. Tan, R. Diamant, W. K. Seah, and M. Waldmeyer, “A survey of techniques and challenges in underwater localization,” Ocean Engineering, vol. 38, no. 14-15, p. 1663–1676, 2011. [146] T. Hu and Y.
Fei, “Q-learning based adaptive clustering and routing for underwater wireless sensor networks,” Wireless Networks, vol. 26, no. 7, p. 5029–5044, 2020. [147] Z. Qin, H. Ye, G. Y. Li, and B.-H. F. Juang, “Deep learning in physical layer communications,” IEEE Wireless Communications, vol. 26, no. 2, p. 93–99, 2019. [148] F. Maurelli, S. Krupiński, X. Xiang, and Y. Petillot, “AUV localisation: a review of passive and active techniques,” International Journal of Intelligent Robotics and Applications, vol. 6, no. 2, p. 246–269, 2022. [149] H. Li, Y. He, X. Cheng, H. Zhu, and L. Sun, “Security and privacy in localization for underwater sensor networks,” IEEE Communications Magazine, vol. 53, no. 11, p. 56–62, 2015. [150] J. Yan, X. Li, X. Yang, X. Luo, C. Hua, and X. Guan, “Integrated localization and tracking for AUV with model uncertainties via scalable sampling-based reinforcement learning approach,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, p. 1–16, 2021. [151] Y. Zhang, X. Li, and Y. Zakharov, “Deep learning-based channel estimation for underwater acoustic OFDM communications,” IEEE Journal of Oceanic Engineering, vol. 46, no. 4, p. 1214–1229, 2021. [152] M. Alamgir, M. N. Sultana, and K. Chang, “Link adaptation on an underwater communications network using machine learning algorithms: Boosted regression tree approach,” IEEE Access, vol. 8, p. 73957–73971, 2020. [153] X. Li, X. Hu, R. Zhang, and L. Yang, “Routing protocol design for underwater optical wireless sensor networks: A multiagent reinforcement learning approach,” IEEE Internet of Things Journal, vol. 7, no. 10, p. 9805–9818, 2020. [154] Y. Zhang, Z. Zhang, L. Chen, and X. Wang, “Reinforcement learning-based opportunistic routing protocol for underwater acoustic sensor networks,” IEEE Transactions on Vehicular Technology, vol. 70, no. 3, p. 2756–2770, 2021. [155] Z. Jin, Q. Zhao, and Y.
Su, “RCAR: A reinforcement-learning-based routing protocol for congestion-avoided underwater acoustic sensor networks,” IEEE Sensors Journal, vol. 19, no. 22, p. 10881–10891, 2019. [156] Y. Zhou, T. Wang, W. Chen, and L. Zhang, “Graph neural networks for network routing: A survey,” AI Open, vol. 1, p. 57–81, 2020, 70% control overhead reduction in multicast tree maintenance. [157] K. Geethu and A. Babu, “A hybrid ARQ scheme combining erasure codes and selective retransmissions for reliable data transfer in underwater acoustic sensor networks,” EURASIP Journal on Wireless Communications and Networking, vol. 2017, no. 1, p. 1–18, 2017. [158] J. Trubuil, A. Goalic, and N. Beuzelin, “An overview of channel coding for underwater acoustic communications,” in MILCOM 2012 - 2012 IEEE Military Communications Conference, 2012, p. 1–7. [159] R. Ahmed and M. Stojanovic, “Joint power and rate control for packet coding over fading channels,” IEEE Journal of Oceanic Engineering, vol. 42, no. 3, p. 697–710, 2017. [160] R. Lou, Z. Lv, S. Dang, T. Su, and X. Li, “Application of machine learning in ocean data,” Multimedia Systems, p. 1–10, 2021. [161] P. Bhopale, F. Kazi, and N. Singh, “Reinforcement learning based obstacle avoidance for autonomous underwater vehicle,” Journal of Marine Science and Application, vol. 18, no. 2, p. 228–238, 2019. [162] R. Cui, C. Yang, Y. Li, and S. Sharma, “Adaptive neural network control of AUVs with control input nonlinearities using reinforcement learning,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 47, no. 6, p. 1019–1029, 2017. [163] S. Fayaz, S. A. Parah, and G. Qureshi, “Underwater object detection: architectures and algorithms–a comprehensive review,” Multimedia Tools and Applications, p. 1–46, 2022. [164] V. Di Valerio, F. Lo Presti, C. Petrioli, L. Picari, D. Spaccini, and S.
Basagni, “CARMA: Channel-aware reinforcement learning-based multi-path adaptive routing for underwater wireless sensor networks,” IEEE Journal on Selected Areas in Communications, vol. 37, no. 11, p. 2634–2647, 2019. [165] M. A. Alsheikh, S. Lin, D. Niyato, and H.-P. Tan, “Machine learning in wireless sensor networks: Algorithms, strategies, and applications,” IEEE Communications Surveys & Tutorials, vol. 16, no. 4, p. 1996–2018, 2014. [166] K. G. Omeke, A. I. Abubakar, L. Zhang, Q. H. Abbasi, and M. A. Imran, “How reinforcement learning is helping to solve internet-of-underwater-things problems,” IEEE Internet of Things Magazine, vol. 5, no. 4, p. 24–29, 2022. [167] N. Li, J.-F. Martínez, J. M. Meneses Chaus, and M. Eckert, “A survey on underwater acoustic sensor network routing protocols,” Sensors, vol. 16, no. 3, p. 414, 2016. [168] A. Khan, M. Ali, W. Zhang, and H. Chen, “Advanced KNN-based cost-efficient algorithm for precision localization and energy optimization in dynamic underwater sensor networks,” Scientific Reports, vol. 15, p. 86266, 2025, 99.98% localization accuracy in water tank experiments. [169] Y. Wang, M. Liu, J. Yang, and G. Gui, “Modulation classification of underwater communication with deep learning network,” Computational Intelligence and Neuroscience, vol. 2019, p. 8039632, 2019, 94-98% accuracy at SNR ≥ 0 dB. [170] Y. Zhang, J. Zhu, H. Wang, X. Shen, B. Wang, and Y. Dong, “Deep reinforcement learning-based adaptive modulation for underwater acoustic communication with outdated channel state information,” Remote Sensing, vol. 14, no. 16, p. 3947, 2022, LSTM-DQN-AM achieves 22.95% throughput improvement over Q-learning. [171] S. H. Park, P. D. Mitchell, and D. Grace, “Reinforcement learning based MAC protocol (UW-ALOHA-Q) for underwater acoustic sensor networks,” IEEE Access, vol. 7, p. 165531–165542, 2019, 30% improvement over original ALOHA-Q. [172] L. Jin and D. D.
Huang, “A slotted CSMA based reinforcement learning approach for extending the lifetime of underwater acoustic wireless sensor networks,” Computer Communications, vol. 36, no. 9, p. 1094–1099, 2013. [173] R. Wang, A. Yadav, E. A. Makled, O. A. Dobre, R. Zhao, and P. K. Varshney, “Optimal power allocation for full-duplex underwater relay networks with energy harvesting: A reinforcement learning approach,” IEEE Wireless Communications Letters, vol. 9, no. 2, p. 223–227, 2020. [174] Z. Ye, X. Wang, S. Chen, and M. Li, “Deep reinforcement learning based resource allocation for underwater acoustic communication networks,” IEEE Transactions on Communications, vol. 67, no. 9, p. 6402–6415, 2019, 20-30% throughput improvement in hybrid MAC protocols. [175] W. Chen, X. Wang, Y. Liu, and L. Zhang, “GNN-IR: An intelligent routing method based on graph neural network for underwater acoustic sensor networks,” IEEE Internet of Things Journal, vol. 11, no. 14, p. 25337–25357, 2024. [176] T. Hu and Y. Fei, “QELAR: A machine-learning-based adaptive routing protocol for energy-efficient and lifetime-extended underwater sensor networks,” IEEE Transactions on Mobile Computing, vol. 9, no. 6, p. 796–809, 2010, 20% longer network lifetime than VBF. [177] Z. Jin, C. Li, W. Zhang, and C. Wang, “Energy-efficient nonuniform cluster-based routing protocol with Q-learning for UASNs,” Ad Hoc Networks, vol. 161, p. 103456, 2025, 23.5% network lifetime extension over LEACH, QELAR, QHUC. [178] C. Wang, Y. Li, M. Zhang, and W. Chen, “Deep reinforcement learning for congestion control in underwater acoustic networks,” IEEE Transactions on Network Science and Engineering, vol. 10, no. 5, p. 2876–2890, 2023, PPO achieves 91% packet loss reduction. [179] X. Liu, J. Wang, H. Chen, and W. Zhang, “YOLOv8 for real-time underwater object detection: Optimization and deployment on edge devices,” Ocean Engineering, vol. 285, p. 115421, 2023, 92% mAP, real-time processing on Jetson devices. [180] W.
Chen, M. Liu, Q. Wang, and L. Zhang, “Variational autoencoder for anomaly detection in underwater acoustic sensor networks,” IEEE Sensors Journal, vol. 22, no. 18, p. 17856–17868, 2022, 60-80% energy reduction through event-triggered sensing. [181] H. Li, Y. Xu, J. Wang, L. Wang, and H. Zhao, “Advances and applications of machine learning in underwater acoustics,” Intelligent Marine Technology and Systems, vol. 1, no. 1, p. 5, 2023. [182] Y. Zhang, Y. Zakharov, and J. Li, “Deep learning-based channel estimation and equalization for underwater acoustic communications,” Journal of the Acoustical Society of America, vol. 151, no. 2, p. 1342–1354, 2022. [183] W. Jiang, F. Tong, and Y. Chen, “Hybrid deep learning-based channel estimation for underwater acoustic OFDM communications,” IEEE Journal of Oceanic Engineering, vol. 47, no. 4, p. 1132–1145, 2022. [184] X. Cui, Z. Zhang, J. Li, B. Jiang, S. Li, and J. Liu, “Reinforcement learning-based adaptive modulation scheme over underwater acoustic OFDM communication channels,” Physical Communication, vol. 61, p. 102207, 2023, PPO-based adaptive modulation, up to 25% throughput improvement. [185] T. Sweta, S. Ruthrapriya, J. Sneka, G. Rohith et al., “Reinforcement learning-based automated modulation switching algorithm for an enhanced underwater acoustic communication,” Results in Engineering, vol. 23, p. 102791, 2024. [186] Y. Wang, C. Li, X. Zhang, and W. Liu, “Cooperative relay selection using deep reinforcement learning for underwater acoustic networks,” IEEE Transactions on Vehicular Technology, vol. 71, no. 10, p. 10856–10869, 2022, 25-40% energy reduction vs direct transmission. [187] M. Li, W. Chen, Q. Wang, and L. Zhang, “Adaptive duty cycling with deep Q-learning for energy-efficient underwater sensor networks,” IEEE Internet of Things Journal, vol. 8, no. 14, p. 11234–11248, 2021, 40-55% energy reduction vs fixed duty cycling. [188] S. Tomović and I.
Radusinović, “DR-ALOHA-Q: A Q-learning-based adaptive MAC protocol for underwater acoustic sensor networks,” Sensors, vol. 23, no. 9, p. 4474, 2023, 13-106% channel utilization gains (static), 23-126% (mobile). [189] X. Hou, J. Wang, Z. Fang, X. Zhang, S. Song, X. Zhang, and Y. Ren, “Machine-learning-aided mission-critical internet of underwater things,” IEEE Network, vol. 35, no. 4, p. 160–166, 2021. [190] N. Victor, R. C., M. Alazab, S. Bhattacharya, S. Magnusson, P. K. Reddy Maddikunta, K. Ramana, and T. Reddy Gadekallu, “Federated learning for IoUT: Concepts, applications, challenges and opportunities,” arXiv preprint arXiv:2207.13976, 2022. [191] G. Li, N. Li, X. Zhang, and Z. Zhou, “Energy-efficient depth-based opportunistic routing with Q-learning for underwater wireless sensor networks,” Sensors, vol. 20, no. 4, p. 1025, 2020, 15-25% PDR improvement over QELAR, DBR, VBF. [192] Z. A. Khan, O. A. Karim, S. Abbas, and N. Javaid, “Q-learning based energy-efficient and void avoidance routing protocol for underwater acoustic sensor networks,” Computer Networks, vol. 197, p. 108309, 2021, 11% PDR improvement, 25% better energy efficiency vs QELAR. [193] W. Li, H. Chen, and Y. Zhang, “An energy efficient hierarchical routing approach for UWSNs using biology inspired intelligent optimization,” Scientific Reports, vol. 15, p. 21336, 2025, 23.5% network lifetime extension over LEACH, DMaOWOA, GSHFA-HCP. [194] C.-J. Chun, J.-M. Kang, and I.-M. Kim, “Adaptive rate and energy harvesting interval control based on reinforcement learning for SWIPT,” IEEE Communications Letters, vol. 22, no. 12, p. 2571–2574, 2018. [195] R. A. Khalil, M. I. Babar, N. Saeed, and T. Masood, “Semantic communication for the internet of underwater things,” IEEE Network, vol. 38, no. 4, p. 156–163, 2024, 5-15× energy reduction through semantic compression. [196] C. Wang, L. Zhang, Y. Li, and W.
Chen, “Reinforcement learning-based mobile sink scheduling for energy-efficient underwater sensor networks,” Ad Hoc Networks, vol. 154, p. 103389, 2024, 35% network lifetime extension with AUV data mule. [197] P. Warden and D. Situnayake, “TinyML: Machine learning with TensorFlow Lite on Arduino and ultra-low-power microcontrollers,” O’Reilly Media, 2019. [198] P. P. Ray, “A review on TinyML: State-of-the-art and prospects,” Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 4, p. 1595–1623, 2022, comprehensive survey of TinyML techniques and applications. [199] J. Lee, M. Stanley, A. Spanias, and C. Tepedelenlioglu, “Integrating machine learning in embedded sensor systems for internet-of-things applications,” IEEE International Symposium on Circuits and Systems (ISCAS), p. 1–5, 2020. [200] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017, depthwise separable convolutions for efficient inference. [201] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, p. 4510–4520, inverted residual blocks for efficient architectures. [202] A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer, “A survey of quantization methods for efficient neural network inference,” Low-Power Computer Vision, p. 291–326, 2022, comprehensive survey of quantization techniques. [203] S. Han, J. Pool, J. Tran, and W. J. Dally, “Learning both weights and connections for efficient neural networks,” Advances in Neural Information Processing Systems, vol. 28, p. 1135–1143, 2015, foundational work on neural network pruning. [204] B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D.
Kalenichenko, “Quantization and training of neural networks for efficient integer-arithmetic-only inference,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, p. 2704–2713, 2018, foundational work on neural network quantization for embedded deployment. [205] G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” arXiv preprint arXiv:1503.02531, 2015, seminal paper on knowledge distillation. [206] M. Han, J. Duan, S. Khairy, and L. X. Cai, “Enabling sustainable underwater IoT networks with energy harvesting: A decentralized reinforcement learning approach,” IEEE Internet of Things Journal, vol. 7, no. 10, p. 9953–9964, 2020. [207] H. Cai, L. Zhu, and S. Han, “Once-for-all: Train one network and specialize it for efficient deployment,” in International Conference on Learning Representations, 2020, neural architecture search for deployment efficiency. [208] L. Delauney and C. Compère, “Biofouling protection for marine environmental sensors by local chlorination,” in Marine and Industrial Biofouling. Springer, 2009, p. 119–134, biofouling protection strategies for marine sensors. [209] H. Rashid, H. Habbouche, Y. Amirat, A. Mamoune, H. Titah-Benbouzid, and M. Benbouzid, “B-FLOWS: Biofouling focused learning and observation for wide-area surveillance in tidal stream turbines,” Journal of Marine Science and Engineering, vol. 12, no. 10, p. 1828, 2024, deep learning for biofouling detection. [210] A. L. Bowler, S. Sherrod, P. Sherrod, and N. Watson, “Predicting and monitoring biofouling progression on submerged surfaces using machine learning,” Applied Ocean Research, vol. 116, p. 102872, 2021, ML-based biofouling prediction. [211] J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska et al., “Overcoming catastrophic forgetting in neural networks,” Proceedings of the National Academy of Sciences, vol. 114, no. 13, p.
3521–3526, 2017, Elastic Weight Consolidation for continual learning. [212] R. K. Dwivedi, A. K. Rai, and R. Kumar, “A study on machine learning based anomaly detection approaches in wireless sensor network,” in 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2020, p. 194–199. [213] L. Rizzo, “Effective erasure codes for reliable computer communication protocols,” SIGCOMM Comput. Commun. Rev., vol. 27, no. 2, p. 24–36, Apr. 1997. [Online]. Available: https://doi.org/10.1145/263876.263881 [214] M. Stojanovic and J. Preisig, “Underwater acoustic communication channels: Propagation models and statistical characterization,” IEEE Communications Magazine, vol. 47, no. 1, p. 84–89, 2009. [215] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, p. 1345–1359, 2010. [216] O. Boulais, B. Woodward, B. Schlining, L. Lundsten, K. Barnard, K. C. Bell, and K. Katija, “FathomNet: An underwater image training database for ocean exploration and discovery,” arXiv preprint arXiv:2007.00114, 2020. [217] C. Li, C. Guo, W. Ren, R. Cong, J. Hou, S. Kwong, and D. Tao, “An underwater image enhancement benchmark dataset and beyond,” IEEE Transactions on Image Processing, vol. 29, p. 4376–4389, 2019. [218] K. Weiss, T. M. Khoshgoftaar, and D. Wang, “A survey of transfer learning,” Journal of Big Data, vol. 3, no. 1, p. 1–40, 2016. [219] M. J. Islam, C. Edge, Y. Xiao, P. Luo, M. Mehtaz, C. Morse, S. S. Enan, and J. Sattar, “Semantic segmentation of underwater imagery: Dataset and benchmark,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, p. 1769–1776. [220] Z. Xia, J. Du, C. Jiang, Z. Han, and Y. Ren, “Latency constrained energy-efficient underwater dynamic federated learning,” IEEE/ACM Transactions on Networking, vol. 33, p. 355–369, 2024, federated learning optimization for underwater networks. [221] G. Ferri, A.
Munafò, A. Tesei, P. Braca, F. Meyer, K. Pelekanakis, R. Petroccia, J. Alves, C. Strode, and K. LePage, “Cooperative robotic networks for underwater surveillance: an overview,” IET Radar, Sonar & Navigation, vol. 11, no. 12, p. 1740–1761, 2017. [222] F. Lei, F. Tang, and S. Li, “Underwater target detection algorithm based on improved YOLOv5,” Journal of Marine Science and Engineering, vol. 10, no. 3, p. 310, 2022. [223] C. Petrioli, R. Petroccia, and J. R. Potter, “The SUNSET framework for simulation, emulation and at-sea testing of underwater wireless sensor network protocols,” Ad Hoc Networks, vol. 34, p. 224–238, 2015. [224] R. Gupta and A. Singh, “Survey of AI-driven routing protocols in underwater acoustic networks for enhanced communication efficiency,” Ocean Engineering, vol. 312, p. 119445, 2024, comprehensive survey of AI in underwater routing. [225] M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, p. 686–707, 2019, foundational PINN paper. [226] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, vol. 30, 2017. [227] L. Chen, L. Zhang, X. Sun, J. Duan, L. Yin, X. Zheng, and J. Chen, “Research on intelligent predicting method of underwater acoustic field based on physics-informed neural network,” Frontiers in Marine Science, vol. 12, p. 1665305, 2025, PINN for underwater acoustic field prediction using elliptic wave equation. [228] M. Marques, L. Mendonça, A. Bizzi, L. Moreira, C. Oliveira, D. Oliveira, L. Fernandez, V. Balestro, J. Pereira, D. Yukimura, T. Novello, P. Petrov, and L. Nissenbaum, “Stable adaptive training for physics-informed neural networks in acoustic wave propagation,” JASA Express Letters, vol. 5, no.
11, p. 112401, 2025, adaptive domain sampling with absorbing BCs for underwater acoustics. [229] L. Du, Z. Wang, Z. Lv, L. Wang, and D. Han, “Research on underwater acoustic field prediction method based on physics-informed neural network,” Frontiers in Marine Science, vol. 10, p. 1302077, 2023, PINN for underwater acoustic field prediction. [230] J. Duan, H. Zhao, and J. Song, “Spatial domain decomposition-based physics-informed neural networks for practical acoustic propagation estimation under ocean dynamics,” Journal of the Acoustical Society of America, vol. 155, p. 3306–3321, 2024, SPINN for practical acoustic propagation with spatial decomposition. [231] Y. Gao, P. Xiao, and Z. Li, “Physics-informed neural networks for solving underwater two dimensional sound field,” in 2024 OES China Ocean Acoustics (COA). IEEE, 2024, p. 1–4, PINN for 2D underwater sound field with Helmholtz equation. [232] S. Yoon, Y. Park, P. Gerstoft, and W. Seong, “OceanPINN: Physics-informed neural network for ocean acoustic propagation,” Journal of the Acoustical Society of America, vol. 155, no. 3, p. 2037–2049, 2024, OceanPINN for spatially non-coherent data. [233] J. Tang and H. Niu, “Physics-informed neural network with pretraining optimization for ocean acoustic field prediction,” Journal of the Acoustical Society of America, 2025, PreT-OceanPINN with two-stage pretraining optimization. [234] L. Yang, X. Meng, and G. E. Karniadakis, “B-PINNs: Bayesian physics-informed neural networks for forward and inverse PDE problems with noisy data,” Journal of Computational Physics, vol. 425, p. 109913, 2021, Bayesian PINNs for uncertainty quantification. [235] W. Huang et al., “Fast broadband modeling using physics-informed neural network with modal equations,” Journal of the Acoustical Society of America, 2024, PINN with normal mode integration for broadband modeling. [236] X.
Chen et al., “Marine digital twin: A comprehensive review and development roadmap,” Ocean, 2025, comprehensive marine digital twin framework. [237] J. Feng, Y. Cui, X. Wang, and W. Liu, “Transformer-based underwater acoustic target recognition,” IEEE Journal of Oceanic Engineering, vol. 47, no. 4, p. 1189–1203, 2022, self-attention for acoustic classification. [238] K. Xu et al., “Self-supervised learning-based underwater acoustical signal classification via mask modeling,” Journal of the Acoustical Society of America, vol. 154, no. 1, p. 5–15, 2023, Swin Transformer with self-supervised learning for UATR. [239] K. Yang, B. Wang, Z. Fang, and B. Cai, “An end-to-end underwater acoustic target recognition model based on one-dimensional convolution and transformer,” Journal of Marine Science and Engineering, vol. 12, no. 10, p. 1793, 2024, 1DCTN combining 1D CNN with Transformers. [240] Y. Chen et al., “An effective convolutional and transformer cooperation network for underwater acoustic target recognition,” Engineering Applications of Artificial Intelligence, 2024, UACTC hybrid CNN-Swin Transformer architecture. [241] Y. Wang, J. Xiao, X. Cheng, Q. Wei, and N. Tang, “Underwater acoustic signal classification based on a spatial-temporal fusion neural network,” Frontiers in Marine Science, vol. 11, p. 1331717, 2024, Transformer and DWC fusion for modulation classification. [242] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations, 2021, Vision Transformer (ViT) architecture. [243] others, “STNet: Prediction of underwater sound speed profiles with an advanced semi-transformer neural network,” Ocean Engineering, 2024, semi-transformer for sound speed profile prediction. [244] Y. He, G.
Han et al., “A secure routing protocol based on graph neural networks for underwater acoustic sensor networks,” in Proc. IEEE International Conference on Communications (ICC). IEEE, 2024, p. 1–6, GBSR protocol with GNN-based trust prediction. [245] Y. Li et al., “Graph attention network-based AUV path planning with ocean current information,” Ocean Engineering, 2023, GAT for AUV route planning with environmental embedding. [246] Y. He, G. Han, A. Li, T. Taleb, C. Wang, and H. Yu, “A federated deep reinforcement learning-based trust model in underwater acoustic sensor networks,” IEEE Transactions on Mobile Computing, vol. 23, no. 5, p. 5150–5161, 2023, trust-aware federated learning for UASNs. [247] X. Zhang et al., “DBSCAN-based Byzantine attack detection for federated learning in underwater networks,” IEEE Internet of Things Journal, 2024, Byzantine robustness for underwater FL. [248] J. Yan, Y. Zheng, X. Yang, C. Chen, and X. Guan, “Privacy-preserving localization for underwater acoustic sensor networks: A differential privacy-based deep learning approach,” IEEE Transactions on Information Forensics and Security, vol. 20, p. 737, 2024, differential privacy for underwater localization. [249] C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation of deep networks,” in International Conference on Machine Learning. PMLR, 2017, p. 1126–1135, foundational MAML paper. [250] H. Zhao et al., “Federated meta learning enhanced acoustic radio cooperative framework for ocean of things underwater acoustic communications,” arXiv preprint arXiv:2105.13296, 2021, FML for DNN-based UWA receivers. [251] S. Guo, Y. Wang, N. Zhang, Z. Su, T. H. Luan, Z. Tian et al., “A survey on semantic communication networks: architecture, security, and privacy,” IEEE Communications Surveys & Tutorials, 2024. [252] M. Zetas, S. Spantideas, A. Giannopoulou, N. Nomikos, and P.
Trakadas, “Empowering 6G maritime communications with distributed intelligence and over-the-air model sharing,” Frontiers in Communications and Networks, vol. 4, p. 1280602, 2024, federated learning for maritime networks. [253] H. Liu, T. Qin, Z. Gao, T. Mao et al., “Near-space communications: The last piece of 6G space-air-ground-sea integrated network puzzle,” Space: Science & Technology, vol. 4, p. 0176, 2024, near-space communications for SAGSIN. [254] W. Saad, M. Bennis, and M. Chen, “A vision of 6G wireless systems: Applications, enabling technologies, and research challenges,” IEEE Network, vol. 34, no. 3, p. 134–142, 2020. [255] H. Guo, J. Li, J. Liu, N. Tian, and N. Kato, “A survey on space-air-ground-sea integrated network security in 6G,” IEEE Communications Surveys & Tutorials, vol. 24, no. 1, p. 53–87, 2022, SAGSIN security survey. [256] N.-N. Dao, N. Tu, T. Thanh, V. Bao, W. Na, and S. Cho, “Neglected infrastructures for 6G—underwater communications: How mature are they?” Journal of Network and Computer Applications, vol. 213, p. 103595, 2023, 6G underwater infrastructure maturity assessment. [257] X. Zhang et al., “From 6G to SeaX-G: Integrated 6G TN/NTN for AI-assisted maritime communications—architecture, enablers, and optimization problems,” Journal of Marine Science and Engineering, vol. 13, no. 6, p. 1103, 2025, SeaX-G architecture for maritime 6G. [258] Z. Qin, X. Tao, J. Lu, and G. Y. Li, “Semantic communications: An information theoretic view,” IEEE Wireless Communications, vol. 29, no. 4, p. 24–30, 2022, theoretical foundations of semantic communication. [259] Authors, “A survey on semantic communications: Technologies, solutions, applications and challenges,” Digital Communications and Networks, 2023. [260] J. Yan et al., “Digital twin-driven swarm of autonomous underwater vehicles for marine exploration,” Communications Engineering, 2025, DT-driven AUV swarm control with IRL. [261] X.
Wang et al., “Underwater digital twin sensor network-based maritime communication and monitoring using exponential hyperbolic crisp adaptive network-based fuzzy inference system,” Water, vol. 17, no. 9, p. 1324, 2025, UDT with EHC-ANFIS for maritime monitoring. [262] Mercator Ocean International, “The European digital twin ocean (EU DTO),” https://digitaltwinocean.mercator-ocean.eu/, 2024, EU DTO initiative and platform. [263] H. Luo et al., “Air/water cross-boundary communications: A comprehensive review,” IEEE Communications Surveys & Tutorials, 2024, air-water cross-boundary communication survey. [264] H. Kaushal and G. Kaddoum, “Underwater optical wireless communication,” IEEE Access, vol. 4, p. 1518–1547, 2016, UOWC fundamentals and challenges. [265] L. Yang, J. Xiang, S. Li et al., “Performance analysis of relay-aided satellite-underwater acoustic communication systems,” IEEE Transactions on Communications, vol. 72, no. 6, p. 3511–3525, 2024, satellite-underwater relay analysis. [266] X. Li et al., “A hierarchical underwater acoustic target recognition method based on transformer and transfer learning,” in Proc. 6th International Conference on Image, Video and Signal Processing (IVSP), 2024, HUATrans with transfer learning from ImageNet. [267] M. Irfan, J. Zheng, S. Ali, M. Iqbal, Z. Masood, and U. Z. A. Hamid, “DeepShip: An underwater acoustic benchmark dataset and a separable convolution based autoencoder for classification,” Expert Systems with Applications, vol. 183, p. 115270, 2021. [268] D. Santos-Domínguez, S. Torres-Guijarro, A. Cardenal-López, and A. Pena-Gimenez, “ShipsEar: An underwater vessel noise database,” Applied Acoustics, vol. 113, p. 64–69, 2016. [269] X. Kazmierczak et al., “Underwater communication technologies: A review,” Telecommunication Systems, vol. 88, no. 2, 2025, comprehensive underwater communication review including AI integration. [270] H. Niu, X. Li, Y. Zhang, and J.
Xu, “Advances and applications of machine learning in underwater acoustics,” Intelligent Marine Technol- ogy and Systems, vol. 1, no. 1, p. 8, 2023, comprehensive ML review covering source localization, target recognition, communication, and geoacoustic inversion. [271] S. Mittal, S. Srivastava, and J. P. Jayanth, “A survey of deep learning techniques for underwater image classification,” IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 7, p. 3636–3650, 2023. [272] M. J. Bianco, P. Gerstoft et al., “Machine learning in acoustics: A re- view and open-source repository,” npj Acoustics, 2025, comprehensive ML review with AcousticsML GitHub repository. [273] Various, “AquaSignal: An integrated framework for robust underwater acoustic analysis,” arXiv preprint arXiv:2505.14285, 2025, integrated ML framework for preprocessing, denoising, classification, and novelty detection. [274] others, “Predicting transmission loss in underwater acoustics using continual learning with range-dependent conditional convolutional neu- ral networks,” Journal of the Acoustical Society of America, 2024, continual learning for underwater acoustics. [275] P. Consul, I. Budhiraja, and D. Garg, “Deep reinforcement learning based reliable data transmission scheme for internet of underwater things in 5G and beyond networks,” Procedia Computer Science, vol. 235, p. 1752–1760, 2024. [276] L. Yang et al., “Toward intelligent underwater acoustic systems: Systematic insights into channel estimation and modulation methods,” Electronics, vol. 14, no. 15, p. 2953, 2025, systematic literature review of ML/DL for UWA communication 2020-2025. [277] X. Hou, J. Wang, Z. Fang, X. Zhang, S. Song, X. Zhang, and Y. Ren, “Machine-learning-aided mission-critical internet of underwa- ter things,” IEEE Network, vol. 35, no. 4, p. 160–166, 2021. [278] J. Hou, C. Yang, Q. Zou, J. Chen, and X. 
Nie, “Optimization of IoUT systems: A hierarchical federated transfer learning approach based on UAV computation offloading,” in Springer LNCS, 2025, hFTL for IoUT with UAV edge computing. [279] X. Xu et al., “Federated learning for Internet of Underwater Things based on lightweight distillation and data refinement,” IEEE Internet of Things Journal, 2025, lightweight FL addressing bandwidth and heterogeneity challenges. [280] W. Aman, S. Al-Kuwari, M. Muzzammil, M. M. U. Rahman, and A. Kumar, “Security of underwater and air-water wireless communica- tion: State-of-the-art, challenges and outlook,” Ad Hoc Networks, vol. 142, p. 103114, 2023. [281] L. C. Domingos, P. E. Santos, P. S. Skelton, R. S. Brinkworth, and K. Sammut, “A survey of underwater acoustic data classification meth- ods using deep learning for shoreline surveillance,” Sensors, vol. 22, no. 6, p. 2181, 2022. [282] N. Adam, M. Ali, F. Naeem, A. S. Ghazy, and G. Kaddoum, “State- of-the-art security schemes for the Internet of Underwater Things: A holistic survey,” IEEE Open Journal of the Communications Society, vol. 5, p. 6561, 2024. [283] S. B. Goyal, R. V. Ravi, C. Verma et al., “A lightweight cryptographic algorithm for underwater acoustic networks,” in Procedia Computer Science, vol. 215, 2022, p. 266–273. [284] M. S. Popli, R. P. Singh, N. K. Popli, and M. Mamun, “A federated learning framework for enhanced data security and cyber intrusion detection in distributed network of underwater drones,” IEEE Access, vol. 13, p. 12634, 2025. [285] A. Giannopoulos, P. Gkonis, P. Bithas, N. Nomikos, A. Kalafatelis, and P. Trakadas, “Federated learning for maritime environments: Use cases, experimental results, and open issues,” in IEEE Conference Proceedings, 2024. [286] J. Duan, H. Zhao, and J. Song, “Spatial domain decomposition-based physics-informed neural networks for practical acoustic propagation estimation under ocean dynamics,” Journal of the Acoustical Society of America, vol. 155, p. 
3306–3321, 2024. [287] Y. Gao, P. Xiao, and Z. Li, “Physics-informed neural networks for solving underwater two dimensional sound field,” in 2024 OES China Ocean Acoustics (COA). IEEE, 2024, p. 1–4. [288] K. Li and M. Chitre, “Data-aided underwater acoustic ray propagation modeling,” IEEE Journal of Oceanic Engineering, vol. 48, p. 1127– 1148, 2023. [289] UASP 2025 Conference, “A book of abstracts for the 2025 underwater acoustic signal processing workshop,” 2025, conference abstracts on PINN-based matched-field processing and localization. [290] M. Marques, L. Mendonc ̧a, A. Bizzi et al., “Stable adaptive training for physics-informed neural networks in acoustic wave propagation,” JASA Express Letters, vol. 5, no. 11, p. 112401, 2025. [291] Various, “Hankel-FNO: Fast underwater acoustic charting via physics- encoded Fourier neural operator,” 2025, fNO-based surrogate model for efficient acoustic charting. [292] M. Shaheen, M. S. Farooq, T. Umer, and T. A. Tran, “Revolutionizing Internet of Underwater Things with Federated Learning,” in Artificial Intelligence and Edge Computing for Sustainable Ocean Health, ser. The Springer Series in Applied Machine Learning. Springer, 2024. [293] X. Liyanage et al., “Underwater digital twin applications: A systematic literature review,” Digital Twin, 2025, systematic review of underwater DT applications. [294] N. Ciuccoli, L. Screpanti, and D. Scaradozzi, “Underwater simulators analysis for digital twinning,” IEEE Access, vol. 12, p. 34 306–34 324, 2024. [295] Y. Lin, P. Chuang, and J. Y. Huang, “Simultaneous depth and heading control for autonomous underwater vehicle docking maneuvers using deep reinforcement learning within a digital twin system,” Computers, Materials & Continua, vol. 84, no. 3, 2025. [296] N. Vedachalam, “Cognitive digital twins in strategic anti-submarine warfare: A scoping review,” ORF Special Report, no. 268, 2025, observer Research Foundation. [297] E. S. Ali, R. A. Saeed, I. K. 
Eltahir et al., “A systematic review on energy efficiency in the Internet of Underwater Things (IoUT): Recent approaches and research gaps,” Journal of Network and Computer Applications, vol. 213, p. 103594, 2023.
[298] International Maritime Organization, “Guidelines for the reduction of underwater noise from commercial shipping to address adverse impacts on marine life,” MEPC.1/Circ.906-Rev.1, 2014, IMO guidelines on underwater radiated noise.
[299] X. Liu et al., “Underwater drone-enabled wireless communication systems for smart marine communications: A study of enabling technologies, opportunities, and challenges,” Drones, vol. 9, no. 11, p. 784, 2025, underwater drone communication enabling technologies.
[300] X. Wang et al., “Space-air-ground-sea integrated network with federated learning,” Remote Sensing, vol. 16, no. 9, p. 1640, 2024, FL for SAGSIN integration.

APPENDIX
MATHEMATICAL DERIVATIONS

This appendix provides detailed mathematical derivations for key ML techniques discussed in Section I. Whilst these derivations are standard in the ML literature, we present them here for completeness and to aid readers seeking a deeper understanding of the mathematical foundations.

A. Gaussian Process Regression Posterior

For underwater field estimation using Gaussian Processes (GPs), we model the unknown function as a distribution over functions specified by mean $m(\mathbf{x})$ and covariance $k(\mathbf{x}, \mathbf{x}')$ functions:
\[
f(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big). \tag{218}
\]
Given observations $\mathbf{y}$ at locations $X$, the GP posterior at an unmeasured location $\mathbf{x}_*$ is
\[
p(f_* \mid X, \mathbf{y}, \mathbf{x}_*) = \mathcal{N}\big(\bar{f}_*, \operatorname{cov}(f_*)\big), \tag{219}
\]
where the predictive mean and covariance are given by
\[
\bar{f}_* = \mathbf{k}_*^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{y}, \tag{220}
\]
\[
\operatorname{cov}(f_*) = k_{**} - \mathbf{k}_*^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{k}_*. \tag{221}
\]
Here, $K$ is the covariance matrix with entries $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, $\mathbf{k}_*$ is the vector of covariances between the test point and the training points, with entries $(\mathbf{k}_*)_i = k(\mathbf{x}_*, \mathbf{x}_i)$, $k_{**} = k(\mathbf{x}_*, \mathbf{x}_*)$ is the prior variance at the test point, and $\sigma_n^2$ is the observation noise variance.
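The posterior equations (220) and (221) translate into a short NumPy routine. The sketch below assumes an RBF covariance and toy one-dimensional temperature samples; both the kernel choice and the sample values are illustrative, not drawn from any surveyed study:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(x, x') between two point sets."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X, y, X_star, noise_var=0.01, **kern):
    """Predictive mean (220) and per-point variance (221) at X_star."""
    K = rbf_kernel(X, X, **kern) + noise_var * np.eye(len(X))
    k_star = rbf_kernel(X, X_star, **kern)        # covariances to test points
    k_ss = rbf_kernel(X_star, X_star, **kern)     # prior covariance at test points
    alpha = np.linalg.solve(K, y)                 # (K + sigma_n^2 I)^{-1} y
    mean = k_star.T @ alpha
    cov = k_ss - k_star.T @ np.linalg.solve(K, k_star)
    return mean, np.diag(cov)

# Toy 1-D "temperature profile": variance is low near existing samples and
# high in unexplored regions -- the signal an AUV uses for adaptive sampling.
X = np.array([[0.0], [1.0], [3.0]])
y = np.array([14.2, 13.9, 11.5])
mean, var = gp_posterior(X, y, np.array([[1.0], [5.0]]))
```

The second query point (at 5.0, far from all samples) comes back with a much larger predictive variance than the first, which is exactly the quantity an adaptive sampling policy would maximise.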
The predictive mean (220) provides the best estimate of the function value, whilst the predictive variance (221) quantifies uncertainty. For underwater applications, this uncertainty is crucial for adaptive sampling strategies, where AUVs prioritise measurements in regions of high uncertainty.

B. Long Short-Term Memory (LSTM) Gate Equations

LSTM networks maintain information over extended time periods through three gate mechanisms that control information flow. Given input $x_t$ and previous hidden state $h_{t-1}$, the gates and cell state updates are:

Forget gate (determines what information to discard from the cell state):
\[
f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f), \tag{222}
\]
Input gate (determines what new information to store):
\[
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \tag{223}
\]
Candidate cell state (new information to potentially add):
\[
\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C), \tag{224}
\]
Cell state update (combine forget and input):
\[
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \tag{225}
\]
Output gate (determines what to output based on the cell state):
\[
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \tag{226}
\]
Hidden state (filtered cell state output):
\[
h_t = o_t \odot \tanh(C_t), \tag{227}
\]
where $\odot$ denotes element-wise (Hadamard) multiplication, $\sigma(x) = 1/(1 + e^{-x})$ is the sigmoid function, and $W_f, W_i, W_C, W_o$ and $b_f, b_i, b_C, b_o$ are learnt weight matrices and bias vectors, respectively.

For underwater channel prediction, $x_t$ typically contains environmental measurements (temperature profiles, wave heights, current velocities), and the LSTM learns to capture temporal dependencies ranging from short-term fluctuations (seconds to minutes) to long-term cycles (tidal periods of 12.4 hours or seasonal variations).

C. Support Vector Machine Optimisation Formulation

The Support Vector Machine (SVM) solves a constrained optimisation problem to find the hyperplane $w^{\top} x + b = 0$ that maximises the margin between classes. The margin is defined as $\gamma = 2/\|w\|$.
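The LSTM gate equations (222)–(227) map line-for-line onto a single NumPy cell step. The sketch below is illustrative only: the dictionary-of-matrices weight layout, the random initialisation, and the toy dimensions (three environmental features, hidden size four) are assumptions, not a standard API:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM step implementing equations (222)-(227).
    Each W[g] maps the concatenation [h_{t-1}, x_t] to a gate pre-activation."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate       (222)
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate        (223)
    C_tilde = np.tanh(W["C"] @ z + b["C"])   # candidate state   (224)
    C_t = f_t * C_prev + i_t * C_tilde       # cell state update (225)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate       (226)
    h_t = o_t * np.tanh(C_t)                 # hidden state      (227)
    return h_t, C_t

# Toy dimensions: 3 environmental features per timestep, hidden size 4.
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
W = {g: rng.normal(scale=0.1, size=(n_h, n_h + n_in)) for g in "fiCo"}
b = {g: np.zeros(n_h) for g in "fiCo"}
h, C = np.zeros(n_h), np.zeros(n_h)
for x_t in rng.normal(size=(10, n_in)):      # ten timesteps of measurements
    h, C = lstm_step(x_t, h, C, W, b)
```

Because the hidden state is an output-gated tanh of the cell state, every component of h stays strictly inside (−1, 1) regardless of how many timesteps are processed.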
Hard-margin SVM (for linearly separable data):
\[
\min_{w,\,b} \; \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i (w^{\top} x_i + b) \ge 1, \;\; \forall i, \tag{228}
\]
where $y_i \in \{-1, +1\}$ are class labels.

Soft-margin SVM (for non-separable data, used in practice):
\[
\min_{w,\,b,\,\xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i (w^{\top} x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; \forall i, \tag{229}
\]
where $\xi_i$ are slack variables that allow misclassification, and $C > 0$ is a regularisation parameter controlling the trade-off between margin maximisation and training error minimisation.

Kernel trick for nonlinear classification: the optimisation can be expressed in dual form, depending only on dot products $x_i^{\top} x_j$. These can be replaced with kernel functions $K(x_i, x_j)$ that implicitly compute dot products in high-dimensional feature spaces without explicitly constructing the feature vectors. Common kernels for underwater acoustic classification include the Gaussian RBF kernel
\[
K(x_i, x_j) = \exp\big(-\gamma \|x_i - x_j\|^2\big), \tag{230}
\]
and the polynomial kernel
\[
K(x_i, x_j) = (x_i^{\top} x_j + c)^d, \tag{231}
\]
where $\gamma$, $c$, and $d$ are hyperparameters chosen via cross-validation. For underwater modulation classification, the RBF kernel with appropriately tuned $\gamma$ enables SVMs to learn complex decision boundaries in spectral feature space, achieving robust classification even at low SNR conditions.
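In practice the soft-margin RBF-kernel SVM of (229)–(230) is rarely implemented from scratch. The following scikit-learn sketch selects C and γ by cross-validation, as recommended above; the two synthetic Gaussian clusters are only an illustrative stand-in for spectral features of two modulation classes:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for 8-dimensional spectral features of two modulation
# schemes (labels +1 / -1): two noisy Gaussian clusters, purely illustrative.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
               rng.normal(1.5, 1.0, (100, 8))])
y = np.array([1] * 100 + [-1] * 100)

# Soft-margin RBF-kernel SVM (229)-(230); C and gamma are chosen by
# 5-fold cross-validation over a small illustrative grid.
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1.0]},
    cv=5,
)
search.fit(X, y)
clf = search.best_estimator_
```

The grid here is deliberately coarse; in a deployment one would widen it (typically on a logarithmic scale) and validate on held-out recordings rather than training data.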
LIST OF ACRONYMS

AI: Artificial Intelligence
ALOHA: Additive Links On-line Hawaii Area
AMC: Adaptive Modulation and Coding
AMMO: Autonomous Mobile Marine Observatory
ANN: Artificial Neural Network
API: Application Programming Interface
AQM: Active Queue Management
ARQ: Automatic Repeat Request
ASIC: Application-Specific Integrated Circuit
AUC: Area Under the Curve
AUV: Autonomous Underwater Vehicle
BER: Bit Error Rate
BiLSTM: Bidirectional Long Short-Term Memory
BPSK: Binary Phase Shift Keying
cGAN: Conditional Generative Adversarial Network
CNN: Convolutional Neural Network
COBYLA: Constrained Optimisation BY Linear Approximation
ConvLSTM: Convolutional Long Short-Term Memory
CPU: Central Processing Unit
CRF: Conditional Random Field
CSMA: Carrier Sense Multiple Access
CSI: Channel State Information
CTD: Conductivity, Temperature, Depth
DARPA: Defense Advanced Research Projects Agency
DBSCAN: Density-Based Spatial Clustering of Applications with Noise
DDPG: Deep Deterministic Policy Gradient
DL: Deep Learning
DNA: Deoxyribonucleic Acid
DNN: Deep Neural Network
DQN: Deep Q-Network
DRL: Deep Reinforcement Learning
ECN: Explicit Congestion Notification
eDNA: Environmental DNA
ELF: Extremely Low Frequency
ELBO: Evidence Lower Bound
EWC: Elastic Weight Consolidation
FEC: Forward Error Correction
FFT: Fast Fourier Transform
FL: Federated Learning
FLOPS: Floating Point Operations Per Second
FPGA: Field-Programmable Gate Array
FSK: Frequency Shift Keying
GAN: Generative Adversarial Network
GAP: Global Average Pooling
GAT: Graph Attention Network
GCN: Graph Convolutional Network
GDOP: Geometric Dilution of Precision
GDPR: General Data Protection Regulation
GFLOPS: Giga Floating Point Operations Per Second
GNN: Graph Neural Network
GP: Gaussian Process
GPS: Global Positioning System
GPU: Graphics Processing Unit
GRU: Gated Recurrent Unit
HARQ: Hybrid Automatic Repeat Request
IEEE: Institute of Electrical and Electronics Engineers
IFFT: Inverse Fast Fourier Transform
IMO: International Maritime Organisation
IoT: Internet of Things
IoUT: Internet of Underwater Things
ITAR: International Traffic in Arms Regulations
ITU: International Telecommunication Union
k-NN: k-Nearest Neighbours
KL: Kullback-Leibler
LDA: Linear Discriminant Analysis
LIDAR: Light Detection and Ranging
LMS: Least Mean Squares
LS: Least Squares
LSTM: Long Short-Term Memory
MAC: Medium Access Control
MAML: Model-Agnostic Meta-Learning
mAP: mean Average Precision
MARL: Multi-Agent Reinforcement Learning
MARPOL: International Convention for the Prevention of Pollution from Ships
MBARI: Monterey Bay Aquarium Research Institute
MCS: Modulation and Coding Scheme
MFCC: Mel-Frequency Cepstral Coefficients
MFLOPS: Mega Floating Point Operations Per Second
MI: Magnetic Induction
ML: Machine Learning
MLP: Multi-Layer Perceptron
MMSE: Minimum Mean Square Error
MO-DQN: Multi-Objective Deep Q-Network
MSA: Multi-Head Self-Attention
MSE: Mean Squared Error
NAS: Neural Architecture Search
NEON: ARM Advanced SIMD Extension
NPU: Neural Processing Unit
NTU: Nephelometric Turbidity Units
OFDM: Orthogonal Frequency-Division Multiplexing
PCA: Principal Component Analysis
PDE: Partial Differential Equation
PDR: Packet Delivery Ratio
PINN: Physics-Informed Neural Network
POMDP: Partially Observable Markov Decision Process
PPO: Proximal Policy Optimisation
PSK: Phase Shift Keying
QAM: Quadrature Amplitude Modulation
QAOA: Quantum Approximate Optimisation Algorithm
QoS: Quality of Service
QPSK: Quadrature Phase Shift Keying
RAM: Random Access Memory
RBF: Radial Basis Function
ReLU: Rectified Linear Unit
RF: Radio Frequency
RL: Reinforcement Learning
RLS: Recursive Least Squares
RMSE: Root Mean Square Error
RNN: Recurrent Neural Network
ROI: Return on Investment / Region of Interest
ROM: Read-Only Memory
ROV: Remotely Operated Vehicle
RSSI: Received Signal Strength Indicator
RTT: Round-Trip Time
RTS/CTS: Request to Send/Clear to Send
SARSA: State-Action-Reward-State-Action
SIMD: Single Instruction, Multiple Data
SINR: Signal-to-Interference-plus-Noise Ratio
SLR: Sea Level Rise
SNN: Spiking Neural Network
SNR: Signal-to-Noise Ratio
SONAR: Sound Navigation and Ranging
SON: Self-Organising Network
SSIM: Structural Similarity Index Measure
STFT: Short-Time Fourier Transform
SVM: Support Vector Machine
TCP: Transmission Control Protocol
TD: Temporal Difference
TD3: Twin Delayed Deep Deterministic Policy Gradient
TDMA: Time Division Multiple Access
TOPS: Tera Operations Per Second
TPU: Tensor Processing Unit
TTL: Time To Live
TV: Total Variation
UAV: Unmanned Aerial Vehicle
UNCLOS: United Nations Convention on the Law of the Sea
UUV: Unmanned Underwater Vehicle
UWSN: Underwater Wireless Sensor Network
VAE: Variational Autoencoder
ViT: Vision Transformer
WCSS: Within-Cluster Sum of Squares
WSN: Wireless Sensor Network
WUSN: Wireless Underwater Sensor Network
YOLO: You Only Look Once
YOLOv8n: You Only Look Once version 8 nano

SUMMARY TABLES

This section provides quick-reference tables for practitioners implementing ML solutions in underwater communication systems. These tables synthesise key insights from the survey for rapid consultation during system design and deployment.

TABLE XXVIII
ML ALGORITHM SELECTION GUIDE FOR UNDERWATER APPLICATIONS

Application | Best ML Method | Key Advantages | Constraints | Data Requirements | Accuracy

Physical Layer
Localisation | CNN + k-NN | Sub-metre accuracy, robust to multipath | High memory for fingerprints | 1000+ fingerprints | 0.8–1.2 m
Channel Estimation | LSTM + PINN | Predictive capability, physics-consistent | Computational complexity | 100–1000 samples | MSE: 0.012
Modulation Classification | CNN | Robust at low SNR | Requires diverse training | 5000+ per class | 96% @ 0 dB
Adaptive Modulation | DQN | Handles outdated CSI | Large state space | 1000+ episodes | 20–45% gain

MAC Layer
Channel Access | Q-Learning | Simple implementation | Discrete actions only | 500+ iterations | 18–42% utilisation
Power Control | TD3 | Continuous control | Complex training | 5000+ episodes | 66% energy reduction
Resource Allocation | MO-DQN | Multi-objective optimisation | High complexity | 10000+ episodes | 0.91 fairness

Network Layer
Routing | GNN | Topology-aware | Graph structure needed | 100+ nodes | 94% PDR
Clustering | Deep Embedding | Adaptive clusters | Computational overhead | 500+ samples/node | 2.8× lifetime
Void Recovery | DQN | Handles 3D topology | Memory intensive | 1000+ episodes | 89% success

Transport Layer
Congestion Control | PPO | Stable learning | Complex implementation | 5000+ episodes | 91% loss reduction
Error Control | Neural FEC | Adaptive protection | Training complexity | 10000+ packets | 73% fewer retransmissions
Flow Control | SARSA | Online learning | Convergence time | 1000+ episodes | 77% buffer reduction

Application Layer
Object Detection | YOLOv8n | Real-time, efficient | Limited by visibility | 5000+ images | 92% mAP
Anomaly Detection | VAE | Unsupervised learning | Latent space design | 1000+ normal samples | 96% detection
Multi-modal Fusion | Cross-attention | Handles missing data | Complexity scales | 1000+ per modality | 96.5% accuracy
Path Planning | TD3 | Continuous control | Sim-to-real gap | 10000+ episodes | 31% shorter paths

TABLE XXIX
COMPUTATIONAL REQUIREMENTS AND PLATFORM RECOMMENDATIONS

Algorithm Class | Memory | FLOPS | Power (W) | Latency (ms) | Recommended Platform
k-NN | O(nd) | O(ndk) | 0.01–0.1 | 10–50 | ARM Cortex-M4
Decision Trees | O(nodes) | O(depth) | 0.01–0.05 | 1–10 | Any microcontroller
SVM | O(n_sv × d) | O(n_sv × d) | 0.05–0.2 | 5–20 | ARM Cortex-M7
Small CNN (<5 layers) | 100 KB–1 MB | 10–100M | 0.1–1 | 10–100 | ARM Cortex-A53
Medium CNN (5–20 layers) | 1–10 MB | 100M–1G | 1–5 | 50–500 | NVIDIA Jetson Nano
Large CNN (>20 layers) | 10–100 MB | 1–10G | 5–20 | 100–1000 | NVIDIA Jetson Xavier
LSTM/GRU | O(4h²) | O(4h²T) | 0.5–2 | 20–200 | ARM Cortex-A72
Transformer | O(n²d) | O(n²d) | 2–10 | 100–1000 | GPU required
Q-Learning | O(|S|×|A|) | O(1) | 0.001–0.01 | <1 | Any microcontroller
DQN | O(|θ|) | O(|θ|) | 0.5–2 | 10–100 | ARM Cortex-A53+
PPO/TD3 | O(2|θ|) | O(2|θ|) | 1–5 | 50–200 | Jetson Nano+
Federated Learning | +20% base | +10% base | +30% base | +50% base | Distributed system
Edge Learning | Base model | Base model | Base model | Base model | Local processor

TABLE XXX
ENERGY EFFICIENCY COMPARISON: ML VS TRADITIONAL METHODS

Operation | Traditional (J) | ML-Based (J) | Improvement | Battery Life Gain | Key Technique
Acoustic Transmission | 10 per packet | 0.34 per packet | 29× | Weeks → Years | Adaptive power, Q-learning
Channel Estimation | 0.5 per estimate | 0.08 per estimate | 6× | 3 → 18 months | CNN prediction
Route Discovery | 45 per route | 2.1 per route | 21× | Days → Months | GNN, caching
Object Detection | 8.2 per frame | 0.15 per frame | 55× | Hours → Days | YOLOv8n, pruning
Network Maintenance | 850 per day | 12 per day | 71× | 3 → 214 days | Predictive, federated
Data Compression | 2.0 per MB | 0.02 per MB | 100× | 10 → 1000 days | Autoencoder
Anomaly Detection | 1.5 continuous | 0.05 event-driven | 30× | Months → Years | VAE, edge processing
Multi-hop Routing | 5.6 per packet | 0.95 per packet | 6× | 2 → 12 months | Q-routing
Total Daily | 2800 | 180 | 15.56× | 77 days → 3.5 years | Holistic optimisation

TABLE XXXI
IMPLEMENTATION COMPLEXITY AND DEPLOYMENT READINESS

Technology | Complexity | TRL | Time to Deploy | Risk Level | Primary Challenges

Ready for Deployment (TRL 7–9)
k-NN Localisation | Low | 8 | 1–3 months | Low | Training data collection
Q-Learning MAC | Medium | 7 | 3–6 months | Low | Parameter tuning
CNN Channel Est. | Medium | 7 | 3–6 months | Medium | Model size, real-time
Decision Tree | Low | 9 | <1 month | Very Low | Limited capability

Pilot Testing (TRL 4–6)
DQN Routing | High | 6 | 6–12 months | Medium | Convergence, stability
YOLOv8n Detection | Medium | 6 | 6–9 months | Medium | Training data, visibility
Federated Learning | High | 5 | 12–18 months | High | Communication overhead
LSTM Prediction | Medium | 6 | 6–9 months | Medium | Long-term accuracy

Research Phase (TRL 1–3)
Transformer Nets | Very High | 3 | 18–24 months | High | Computational limits
PINNs | High | 4 | 12–18 months | Medium | Physics integration
Quantum ML | Very High | 2 | 24–36 months | Very High | Hardware availability
Neuromorphic | High | 3 | 18–24 months | High | Hardware maturity

TABLE XXXII
TRAINING DATA REQUIREMENTS AND COLLECTION STRATEGIES

Application | Min. Samples | Ideal Samples | Collection Method | Augmentation Strategy | Cost Estimate
Localisation | 500 | 5,000 | Grid survey | Noise injection, multipath | $50K–200K
Channel Estimation | 100 | 1,000 | Continuous recording | Doppler, time-varying | $20K–100K
Object Detection | 1,000 | 10,000 | ROV survey | Colour, turbidity, rotation | $200K–1M
Species Classification | 50/class | 500/class | Opportunistic + targeted | Pitch shift, time stretch | $100K–500K
Anomaly Detection | 1,000 normal | 10,000 normal | Long-term monitoring | Synthetic anomalies | $50K–200K
Protocol Learning | 100 hours | 1,000 hours | Passive recording | Noise, interference | $20K–50K
Current Prediction | 30 days | 365 days | Fixed sensors | Physical simulation | $100K–300K

TABLE XXXIII
CROSS-LAYER OPTIMISATION OPPORTUNITIES

Layer 1 | Layer 2 | Optimisation Method | Performance Gain | Key Insight
Physical | MAC | Joint channel-access learning | 35% efficiency | Channel predicts collision probability
Physical | Network | Channel-aware routing | 40% reliability | Route around poor channels
MAC | Network | Traffic-aware clustering | 45% energy | Cluster based on communication patterns
MAC | Transport | Queue-aware scheduling | 60% latency reduction | Prioritise based on transport needs
Network | Transport | Congestion-aware routing | 50% throughput | Route around congested nodes
Network | Application | Content-aware routing | 30% bandwidth | Different paths for different data types
All Layers | - | Holistic multi-task learning | 42% overall | Shared representations across tasks

TABLE XXXIV
ENVIRONMENTAL ADAPTATION STRATEGIES

Environmental Factor | Impact on ML | Adaptation Strategy | ML Technique | Success Rate
Biofouling | Sensor drift, degradation | Progressive calibration | Online learning, EWC | 85% maintained
Temperature Variation | Model accuracy drop | Multi-temperature training | Domain adaptation | 90% maintained
Pressure (Depth) | Component behaviour change | Depth-stratified models | Ensemble methods | 92% maintained
Turbidity | Optical degradation | Robust features | Attention mechanisms | 86% maintained
Seasonal Changes | Distribution shift | Continual learning | Progressive networks | 88% maintained
Node Mobility | Topology changes | Dynamic retraining | GNN, online RL | 91% maintained
Noise Variation | SNR fluctuation | Noise-robust training | Data augmentation | 94% maintained

TABLE XXXV
COST-BENEFIT ANALYSIS FOR ML IMPLEMENTATION

Investment Area | Initial Cost | Annual OpEx | Benefit/Year | ROI Period | 5-Year NPV
Data Collection | $1–5M | $100K | - | - | -$5.5M
Model Development | $0.5–2M | $200K | - | - | -$2.5M
Hardware Upgrade | $5–50K/node | $10K/node | - | - | -$100K/node
Training/Personnel | $200K | $100K | - | - | -$700K
Energy Savings | - | - | $50K/node | Immediate | $200K/node
Maintenance Reduction | - | - | $100K | Year 1 | $400K
Failure Prevention | - | - | $500K | Year 1 | $2M
Improved Efficiency | - | - | $200K | Year 2 | $600K
Net (100 nodes) | $10–15M | $1.5M | $5.8M | 2.5 years | $8.5M

TABLE XXXVI
QUICK DECISION MATRIX FOR ML ADOPTION

Scenario | Network Size | Duration | Budget | Use ML? | Recommended Approach
Short-term monitoring | <10 nodes | <1 month | <$100K | No | Traditional protocols
Coastal surveillance | 10–50 nodes | 3–12 months | $100K–1M | Partial | ML for critical functions
Long-term monitoring | 50–200 nodes | >1 year | $1–10M | Yes | Full ML stack
Ocean observatory | >200 nodes | Permanent | >$10M | Essential | Advanced ML + federation
Research deployment | Any | Variable | Limited | Yes | Transfer learning
Commercial aquaculture | 20–100 nodes | Continuous | $500K–5M | Yes | Proven ML solutions
Military operations | Variable | Variable | Classified | Yes | Custom ML + security
Emergency response | Variable | Days–weeks | Urgent | Partial | Pre-trained models