Paper deep dive
Learning from Radio using Variational Quantum RF Sensing
Ivana Nikoloska
Abstract
In modern wireless networks, radio channels serve a dual role. Whilst their primary function is to carry bits of information from a transmitter to a receiver, the intrinsic sensitivity of transmitted signals to the physical structure of the environment makes the channel a powerful source of knowledge about the world. In this paper, we consider an agent that learns about its environment using a quantum sensing probe, optimised using a quantum circuit, which interacts with the radio-frequency (RF) electromagnetic field. We use data obtained from a ray-tracer to train the quantum circuit and learning model and we provide extensive experiments under realistic conditions on a localisation task. We show that using quantum sensors to learn from radio signals can enable intelligent systems that require no channel measurements at deployment, remain sensitive to weak and obstructed RF signals, and can learn about the world despite operating with strictly less information than classical baselines.
Links
- Source: https://arxiv.org/abs/2603.10239v1
- Canonical: https://arxiv.org/abs/2603.10239v1
Intelligence
Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 93%
Last extracted: 3/13/2026, 1:08:36 AM
Summary
The paper introduces a framework for 'Variational Quantum RF Sensing,' where an agent uses a quantum sensing probe, optimized via a variational quantum circuit, to interact with radio-frequency (RF) electromagnetic fields. By training on data from a ray-tracer, the system learns to perform tasks like localization without requiring explicit channel measurements at deployment, leveraging the sensitivity of quantum states to environmental RF signatures.
Relation Signals (3)
Variational Quantum RF Sensing → utilizes → Quantum Sensing Probe
confidence 95% · In the variational quantum RF sensing framework, the agent prepares a N-qubit quantum sensing probe.
Variational Quantum RF Sensing → performs → Localisation
confidence 92% · We provide extensive experiments in realistic conditions using a localisation task.
Ray-tracer → trains → Variational Quantum RF Sensing
confidence 90% · We use data obtained from a ray-tracer to train the quantum circuit and learning model.
Cypher Suggestions (2)
Find all technologies used by the Variational Quantum RF Sensing framework. · confidence 90% · unvalidated
MATCH (f:Framework {name: 'Variational Quantum RF Sensing'})-[:UTILIZES]->(t:Technology) RETURN t.name
Identify tasks performed by the framework. · confidence 90% · unvalidated
MATCH (f:Framework {name: 'Variational Quantum RF Sensing'})-[:PERFORMS]->(task:Task) RETURN task.name
Full Text
Learning from Radio using Variational Quantum RF Sensing
Ivana Nikoloska
Signal Processing Systems Group, Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, 5612 AP, The Netherlands
i.nikoloska@tue.nl
Abstract
In modern wireless networks, radio channels serve a dual role. Whilst their primary function is to carry bits of information from a transmitter to a receiver, the intrinsic sensitivity of transmitted signals to the physical structure of the environment makes the channel a powerful source of knowledge about the world. In this paper, we consider an agent that learns about its environment using a quantum sensing probe, optimised using a quantum circuit, which interacts with the radio-frequency (RF) electromagnetic field. We use data obtained from a ray-tracer to train the quantum circuit and learning model and we provide extensive experiments under realistic conditions on a localisation task. We show that using quantum sensors to learn from radio signals can enable intelligent systems that require no channel measurements at deployment, remain sensitive to weak and obstructed RF signals, and can learn about the world despite operating with strictly less information than classical baselines.
1 Introduction
1.1 Context and Motivation
Over the past several decades, five generations of wireless systems have progressively transformed how people and machines are connected, enabling ever-greater capacity, coverage, and service diversity, ranging from analog voice in 1G, to broadband data and massive IoT connectivity in 5G [1]. As 6G is beginning to take shape, radio channels have become rich media that embed measurable signatures of the physical environment that can be used beyond their role in supporting reliable data transmission [2].
In particular, in wireless systems, the combined effects of propagation phenomena such as scattering, diffraction, reflection, and refraction resulting from objects in the environment give rise to multipath whereby multiple copies of the original signal travel along different paths. The multipath parameters that define the channel state information (CSI) effectively describe how the transmitted signal traversed the environment en route to the receiver [3, 4]. Specifically, the amplitude, phase, and path delay resulting from each propagation path capture distinct aspects of the propagation environment. The amplitude of a path encodes information about the attenuation suffered due to distance, shadowing by obstacles, and reflectivity of surfaces, thereby capturing variations in the geometry and material properties of the environment. Similarly, the time delay associated with each path corresponds to the geometric length of that path relative to the line-of-sight component, providing implicit clues about the spatial arrangement of reflectors and scatterers. The phase of a received component, influenced by accumulated propagation delay and any additional phase rotation from reflection, carries fine-grained information valuable for inferring relative positioning within the environment.
Figure 1: Learning from radio waves via quantum sensing: An agent uses a quantum sensing probe $|\psi_\lambda\rangle$ that interacts with the incident RF electromagnetic field $\xi$, modeled as a unitary transformation $U_{int}$ derived from the rotating wave approximation of the physical interaction. The agent then measures the perturbed state $|\psi_\lambda(\xi)\rangle$, producing measurements $\tilde{z}$, and learns to make predictions $\tilde{r}$ using a machine learning model $f_\gamma(\tilde{z})$. Both the quantum circuit parameters $\lambda$ and the neural network parameters $\gamma$ are jointly optimised during training.
Recent research has demonstrated that the CSI can be leveraged for environmental sensing and mapping, effectively re-purposing communication signals from carriers of bits to enablers of situational awareness. Techniques such as channel charting use the spatial and temporal structure embedded in the CSI to learn representations of the local radio geometry, enabling tasks such as user localisation or environmental mapping without external positioning references [5]. In indoor environments, fine-grained analysis of CSI has been used to extract human motion, showing that even subtle changes in the propagation environment reveal themselves as measurable variations in amplitude and phase across multipath components [6]. Moreover, deep learning methods applied to channel characteristics have been shown to identify environmental conditions — for example, differentiating between types of surrounding terrain or weather — by exploiting the sensitivity of CSI to environmental changes [7]. Because the channel encodes fine-grained spatial and temporal characteristics of the environment, accurately capturing this information requires a highly sensitive sensing apparatus. High sensing precision is essential to resolve small variations in the channel that carry valuable information about the propagation medium. Recently, a parallel trajectory of advances in quantum sensing highlights the ability of this technology to detect and measure minute signal variations [8]. Quantum sensors exploit uniquely quantum mechanical effects — such as superposition, entanglement, and coherence — to detect extremely weak signals with sensitivities that can surpass those of classical sensors [9]. These devices are inherently sensitive to electromagnetic fields and can be engineered to operate across broad frequency ranges, including the radio‑frequency (RF) bands relevant to wireless communication [10, 11]. 
Practically, capturing the benefits of quantum sensing is far from straightforward, since quantum sensors must operate on current hardware that is constrained by noise and physical size. Modern noisy intermediate-scale quantum (NISQ) devices, in particular, suffer from decoherence and sampling errors. Finding near-optimal quantum probe states for a specific sensing task involves searching over extremely high-dimensional spaces, making direct computational approaches largely infeasible. One approach to navigate the complexity of quantum optimisation in the NISQ era is to use variational methods [12], which have given rise to frameworks for variational quantum sensing (VQS) [13, 14]. A VQS scheme uses parameterised quantum circuits to prepare probe states that are adaptively optimised to suit the parameter estimation task. Prior works have demonstrated the potential of VQS, focusing on enhancing estimation precision [15, 16] or improving estimation reliability [17, 18].
1.2 Main Contributions
Motivated by these advances, in this paper we investigate agents that use quantum RF sensing probes to learn from the incident RF electromagnetic fields as shown in Fig. 1. For example, the agent can learn to predict its position, or headings, to navigate unfamiliar terrain. The main contributions are summarised as follows.
Variational quantum RF sensing: We first develop a framework for variational quantum RF sensing whereby a quantum sensing probe, optimised using a variational quantum circuit, interacts with the incident RF electromagnetic field. We derive a rotating wave approximation of the interaction from first principles, yielding an efficient training procedure suitable for NISQ hardware. Unlike existing VQS schemes, we are not attempting to solve the respective estimation problem directly. We are interested in enabling the agent to make predictions with the RF electromagnetic field serving as stimuli.
Learning from the RF electromagnetic field: The agent uses a machine learning model to process the resulting information, effectively integrating sensing and learning. To train the quantum circuit and learning models, we use data obtained from a ray-tracer, and we do not assume any additional structural knowledge, e.g., the positions of the transmitters, or types of objects/materials in the environment. Once deployed, the proposed scheme does not actually require access to any channel state measurements; rather, the state of the quantum sensor is altered by the incident RF electromagnetic field, which then results in relevant information for the learning model.
Experimental validation: Finally, we provide extensive experiments in realistic conditions using a localisation task in which the agent must determine whether it has reached specific target locations in the deployment environment. The results show that the use of variational quantum RF sensing can give rise to intelligent systems that can learn about the world directly from radio waves.
Organisation: The remainder of the paper is organised as follows. Sec. II provides some preliminaries on wireless communication channels. In Sec. III we derive the interaction between the sensor and the RF field and in Sec. IV we present the proposed scheme for learning from radio waves. In Sec. V we provide experiments testing the performance of the proposed scheme, including details on the setup, benchmarks, and results. Sec. VI offers a discussion and concludes the paper.
2 Wireless Communication Channels
In a wireless communication system, the fundamental objective is to reliably transfer bits of information between spatially separated endpoints without requiring a physical connection. To this end, a transmitter maps its message into bits of information $b \in \{0,1\}$ which are to be sent via the wireless channel, grouped into symbols $s_k$.
For example, Quadrature Amplitude Modulation (QAM) symbols are generated by mapping groups of $k$ bits to complex values as
$$s_k = |s_k|\,e^{i\theta_k}, \tag{1}$$
where $\theta_k$ denotes the phase [19]. The continuous-time baseband signal is then formed by pulse shaping as
$$x_{bb}(t) = \sum_k s[k]\,p(t-kT), \tag{2}$$
where $p(t)$ is a pulse shaping filter (e.g., raised-cosine) and $T$ is the symbol period. The baseband signal is then upconverted to a carrier frequency $f_c$ as
$$x(t) = \mathrm{Re}\left\{x_{bb}(t)\,e^{j2\pi f_c t}\right\}. \tag{3}$$
The choice of carrier frequency is largely driven by spectrum allocation, propagation physics, and the performance needs of the service. Different generations of wireless technologies operate in distinct ranges of the radio spectrum because of regulatory spectrum allocation and performance trade-offs [20]. The frequencies used by wireless communication systems are also referred to as RF.
Figure 2: In a realistic environment, many physical components affect the propagation of radio waves. Structures such as buildings, vehicles, and terrain features cause the transmitted signal to undergo reflection, diffraction, and scattering. These result in multiple copies of the original signal traveling along different paths, a phenomenon known as multipath propagation.
This real-valued signal then drives an antenna which radiates an electromagnetic wave into space. In particular, the antenna acts as a transducer that converts the high-frequency electrical signal at its terminals into electromagnetic radiation propagating into space. When the voltage $x(t)$ drives current through the antenna, this time-varying current produces time-varying charge and current distributions on the antenna structure, which in turn generate radiated RF electromagnetic waves (alternatively, radio waves). These radio waves occupy designated RF bands and propagate through the environment, carrying the encoded information.
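The transmit chain in (1) to (3) can be sketched numerically as follows. This is a minimal illustration and not the paper's code: the 4-QAM constellation, the rectangular pulse (standing in for a raised-cosine filter), the symbol period, and all function names are assumptions made for the example.

```python
import cmath
import math

# Hypothetical 4-QAM constellation: 2 bits per symbol, unit magnitude.
QPSK = {
    (0, 0): cmath.exp(1j * math.pi / 4),
    (0, 1): cmath.exp(3j * math.pi / 4),
    (1, 1): cmath.exp(5j * math.pi / 4),
    (1, 0): cmath.exp(7j * math.pi / 4),
}

def map_bits(bits):
    """Group bits in pairs and map each pair to a complex symbol s_k, as in (1)."""
    return [QPSK[pair] for pair in zip(bits[0::2], bits[1::2])]

def rect_pulse(t, T):
    """Rectangular pulse p(t); an assumed stand-in for a raised-cosine filter."""
    return 1.0 if 0.0 <= t < T else 0.0

def baseband(symbols, t, T):
    """x_bb(t) = sum_k s[k] p(t - kT), as in (2)."""
    return sum(s * rect_pulse(t - k * T, T) for k, s in enumerate(symbols))

def passband(symbols, t, T, fc):
    """x(t) = Re{ x_bb(t) exp(j 2 pi fc t) }, as in (3)."""
    return (baseband(symbols, t, T) * cmath.exp(2j * math.pi * fc * t)).real

symbols = map_bits([0, 0, 1, 1])
T, fc = 1e-6, 2.14e9                       # assumed symbol period; carrier from Sec. 5.1
x0 = passband(symbols, 0.0, T, fc)         # sample at the start of the first symbol
x1 = passband(symbols, 1.5 * T, T, fc)     # sample inside the second symbol
```

At $t=0$ the carrier term equals one, so `x0` is simply the real part of the first symbol.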
The propagation of radio waves is governed by the principles of electromagnetism and is influenced by both the physical medium and the surrounding environment [3, 4]. As a result, at the receiver, the signal is shaped by the propagation environment. In free space, radio waves attenuate according to the inverse-square law: as the distance between the transmitter and receiver increases, the signal power decreases rapidly due to spatial spreading. In realistic environments, however, many additional physical mechanisms affect the propagation of radio waves. Structures such as buildings, vehicles, and terrain features cause the transmitted signal to undergo reflection, diffraction, and scattering. As shown in Fig. 2, these result in multiple copies of the original signal traveling along different paths, a phenomenon known as multipath propagation. Each multipath component arrives at the receiver with a different delay, amplitude, and phase, leading to constructive and destructive interference that causes rapid fluctuations in the received signal, often described as fading. The resulting RF electromagnetic field at the receiver is given by [11]
$$\xi(t) = \sum_{k=1}^{K}\sum_{l=1}^{L} \mu_{eg}^{T}\,\epsilon_{kl}\,P_k\,\rho_{kl}\,|s_k|\,\cos\big(\omega(t-\tau_{kl})+\phi_{kl}+\theta_k\big), \tag{4}$$
where $k$ indexes transmitters and $l$ indexes propagation paths, $\mu_{eg}$ is the atomic transition dipole moment, $\epsilon_{kl}$ is the field polarization unit vector, $P_k$ is transmit power, $\tau_{kl}$ is the propagation delay, and $\rho_{kl}$ is the amplitude. Using (4), the intended receiver would then attempt to decode the transmitted symbols $s_k$ and retrieve the transmitted message. In this scenario, the delay of each path $\tau_{kl}$ reflects the geometric length of that path and the interactions it has encountered. Indirect paths impose larger delays and the resulting delay spread across all multipath components is itself a measure of the richness and complexity of the channel.
Similarly, the amplitude $\rho_{kl}$ of each path depends on the degree of attenuation experienced along that path, which arises from path length, absorption by materials, and the number and type of reflections encountered. Stronger amplitudes may indicate a relatively unobstructed path or surfaces with high reflectivity, whereas weaker amplitudes suggest longer propagation distances, losses due to surface absorption, or more complex scattering. Phase shifts arise directly from the path length, as $-\omega\tau_{kl}$, and from reflection and refraction in the environment, which introduce an additional phase rotation $\phi_{kl}$; as a result, the phase term of a multipath component contains fine-grained information about the interaction history of that component. These parameters form a signature of the propagation channel which reflects the presence, distribution, and properties of any objects and obstacles along the way. We show next how this can be used in order to learn tasks that require situational awareness.
3 Variational Quantum RF Sensing
We consider an agent that uses a quantum sensing probe interacting with the incident RF electromagnetic field. The agent then makes predictions based on the accrued information. The agent does not decode any messages and it does not have any structural information about the environment, including the presence of objects or obstacles, types of material, or the number or positions of transmitters, making a prediction using only the interaction with the RF field. In particular, the evolution over time of a quantum state $|\psi\rangle = \alpha|g\rangle + \beta|e\rangle$, where $|g\rangle$ and $|e\rangle$ denote the ground and excited states, is governed by Schrödinger’s equation as [11]
$$i\hbar\,\frac{\partial|\psi\rangle}{\partial t} = H|\psi\rangle, \tag{5}$$
where
$$H = H_0 + H_1(t). \tag{6}$$
In (6), the free Hamiltonian $H_0$ is given by
$$H_0 = \frac{\hbar\omega_0}{2}\,\sigma_z, \tag{7}$$
and the time-dependent term $H_1(t)$ is given by
$$H_1(t) = \xi(t)\,|e\rangle\langle g| + \xi^{*}(t)\,|g\rangle\langle e|, \tag{8}$$
with $\xi(t)$ defined in (4).
The evolution in (6) can be transformed using the so-called rotating frame [21, 22, 23]. In particular, we first define the rotating frame state
$$|\tilde{\psi}\rangle = U^{\dagger}(t)|\psi\rangle = e^{i\omega t\,\sigma_z/2}\,|\psi\rangle, \tag{9}$$
where $\sigma_z$ denotes the Pauli Z operator. To transform the Hamiltonian into the rotating frame, we consider the time evolution of a vector in the rotating frame given by [21]
$$\frac{\partial|\tilde{\psi}\rangle}{\partial t} = \big(\partial_t U^{\dagger}(t)\big)\,|\psi\rangle + U^{\dagger}(t)\,\partial_t|\psi\rangle \tag{10}$$
$$= \Big(\big(\partial_t U^{\dagger}(t)\big)\,U(t) - i\,U^{\dagger}(t)\,H\,U(t)\Big)\,|\tilde{\psi}\rangle. \tag{11}$$
Moreover, using
$$\Xi = \sum_{k=1}^{K}\sum_{l=1}^{L} \mu_{eg}^{T}\,\epsilon_{kl}\,P_k\,\rho_{kl}\,|s_k|\,e^{i(-\omega\tau_{kl}+\phi_{kl}+\theta_k)}, \tag{12}$$
we define the Rabi frequency and phase as
$$\Omega = |\Xi| \tag{13}$$
and
$$\Phi = \arg(\Xi), \tag{14}$$
respectively. With these, we can write
$$H_1(t) \propto \Omega\cos(\omega t+\Phi)\,\sigma_x, \tag{15}$$
where $\sigma_x$ denotes the Pauli X operator, and we can identify the so-called rotating wave Hamiltonian as
$$\begin{aligned}\tilde{H} &= i\,\big(\partial_t U^{\dagger}(t)\big)\,U(t) + U^{\dagger}(t)\,H\,U(t) \\ &= \hbar\,\frac{\omega_0-\omega}{2}\,\sigma_z + e^{i\omega t\,\sigma_z/2}\,\Omega\cos(\omega t+\Phi)\,\sigma_x\,e^{-i\omega t\,\sigma_z/2} \\ &= \hbar\,\frac{\omega_0-\omega}{2}\,\sigma_z + \Omega\cos(\omega t+\Phi)\begin{pmatrix}0 & e^{i\omega t}\\ e^{-i\omega t} & 0\end{pmatrix}.\end{aligned} \tag{16}$$
Expanding $\cos(\cdot)$ using Euler’s formula
$$\cos(\omega t+\Phi) = \frac{1}{2}\left(e^{(i\omega t+i\Phi)} + e^{-(i\omega t+i\Phi)}\right) \tag{17}$$
and plugging it into (16) results in
$$\tilde{H} = \hbar\,\frac{\omega_0-\omega}{2}\,\sigma_z + \frac{1}{2}\,\Omega\begin{pmatrix}0 & e^{-i\Phi}+e^{2i\omega t+i\Phi}\\ e^{-2i\omega t-i\Phi}+e^{i\Phi} & 0\end{pmatrix}. \tag{18}$$
The rapidly oscillating terms at $2\omega$ average out over timescales of interest and can be neglected. Thereby, for the rotating wave approximation we have
$$\tilde{H} \approx \hbar\,\frac{\omega_0-\omega}{2}\,\sigma_z + \frac{1}{2}\,\Omega\begin{pmatrix}0 & e^{-i\Phi}\\ e^{i\Phi} & 0\end{pmatrix}, \tag{19}$$
which can be expressed in the Pauli basis as
$$\tilde{H} \approx \frac{\hbar\Delta}{2}\,\sigma_z + \frac{\hbar\Omega}{2}\big(\cos(\Phi)\,\sigma_x + \sin(\Phi)\,\sigma_y\big), \tag{20}$$
where $\Delta = \omega_0-\omega$ and where $\sigma_y$ denotes the Pauli Y operator.
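The step from the multipath sum in (4) to the pair $(\Omega, \Phi)$ in (12) to (14) can be checked numerically: a sum of cosines at a common frequency collapses to a single cosine with amplitude $|\Xi|$ and phase $\arg(\Xi)$. The sketch below assumes that all real prefactors of (4) are lumped into one hypothetical coefficient per path; the path values are illustrative and not taken from the paper.

```python
import cmath
import math

def field(t, omega, paths):
    """Evaluate the sum of cosines in (4) at time t.

    Each path is (amplitude, delay, phi, theta); 'amplitude' is an assumed
    stand-in that lumps together all real prefactors of (4).
    """
    return sum(a * math.cos(omega * (t - tau) + phi + theta)
               for a, tau, phi, theta in paths)

def phasor(omega, paths):
    """Complex sum Xi of (12), built from the same path parameters."""
    return sum(a * cmath.exp(1j * (-omega * tau + phi + theta))
               for a, tau, phi, theta in paths)

omega = 2 * math.pi * 2.14e9
# Illustrative values only: one direct path and two weaker, delayed reflections.
paths = [
    (1.00, 100e-9, 0.0, 0.3),
    (0.40, 135e-9, math.pi, 0.3),
    (0.15, 210e-9, 1.1, 0.3),
]

xi_phasor = phasor(omega, paths)
rabi = abs(xi_phasor)             # Omega in (13)
phase = cmath.phase(xi_phasor)    # Phi in (14)

t = 3.7e-9
direct = field(t, omega, paths)                   # sum over paths, as in (4)
collapsed = rabi * math.cos(omega * t + phase)    # single cosine, as in (15)
```

Here `direct` and `collapsed` agree up to floating-point error, so the whole multipath structure reaches the sensor only through the two numbers `rabi` and `phase`.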
Moreover, when $\Delta = 0$, the Hamiltonian in (20) simplifies to
$$\tilde{H} \approx \frac{\hbar\Omega}{2}\big(\cos(\Phi)\,\sigma_x + \sin(\Phi)\,\sigma_y\big). \tag{21}$$
For $N$ particle systems, assuming that the field is uniform and couples independently, the total Hamiltonian becomes
$$\tilde{H} = \sum_{n=1}^{N}\tilde{H}^{(n)}, \tag{22}$$
with $\tilde{H}^{(n)}$ given by (21). With this, we can efficiently map the interaction to a NISQ device and train the agent as shown in Fig. 1. In particular, in the variational quantum RF sensing framework, the agent prepares an $N$-qubit quantum sensing probe $|\psi_\lambda\rangle$, using a quantum circuit $U_\lambda$ parameterised by vector $\lambda$ as
$$|\psi_\lambda\rangle = U_\lambda|\psi_0\rangle, \tag{23}$$
where $|\psi_0\rangle$ denotes an initial state. The probe state $|\psi_\lambda\rangle$ then interacts with the RF field, which is implemented as a unitary transformation
$$U_{int} = \bigotimes_{n=1}^{N} U^{(n)}, \tag{24}$$
where $U^{(n)}$ denotes the single-qubit unitary sequence acting on qubit $n$ corresponding to the evolution operator in (21). Specifically, as (21) describes a rotation about an axis in the $XY$-plane at angle $\Phi$ from the X-axis, it can be decomposed as a basis-changing $R_z(\Phi)$ rotation, followed by an $R_x(\Omega t)$ rotation, and the inverse $R_z(-\Phi)$, resulting in
$$U^{(n)} = R_z(\Phi)\,R_x(\Omega t)\,R_z(-\Phi). \tag{25}$$
The perturbed state of the quantum sensor after the interaction is then given by
$$|\psi_\lambda(\xi)\rangle = U_{int}|\psi_\lambda\rangle, \tag{26}$$
where $U_{int}|\psi_\lambda\rangle$ denotes the application of unitary $U_{int}$ in (24) to state $|\psi_\lambda\rangle$.
4 Learning from the RF Electromagnetic Field
To train the agent, we consider a supervised learning setting where we have access to a training dataset. We obtain the training dataset using a simulator, as commonly done in simulation-based inference [24]. In general, a simulator builds and maintains a high-fidelity virtual representation of a physical system. In wireless communication, a ray-tracer can track the propagation paths between each transmitter-receiver pair based on the geometry and material information [25, 26].
In this process, multiple propagation paths are explicitly modeled by considering various propagation effects, such as reflection, scattering, and diffraction. The resulting propagation paths are used to determine the training dataset. The use of a ray-tracer for training is a deliberate design choice rooted in the simulation-to-real transfer paradigm [24], whereby the policy developed in simulation is then deployed in the real world without requiring further adaptation. Whilst the ray-tracer does require geometric and material information about the environment during training, this information is only needed once, offline, prior to deployment. Once the variational parameters are learned, the deployed agent requires no channel measurements and no knowledge of the environment; it operates solely through its interaction with the incident RF electromagnetic field. This separation between a knowledge-intensive training phase and a knowledge-free deployment phase is a key design principle of the proposed scheme.
In particular, for $M$ agent locations in the deployment environment, we obtain simulated RF fields $\xi^m(t)$ for $m=1,\dots,M$, which we write as $\xi^m$. Together with the corresponding targets $r^m$, which depend on the task under consideration, we write the dataset as $\mathcal{D}=\{\xi^m, r^m\}_{m=1}^{M}$. We use the simulated fields to determine the interaction unitaries in (24), and implement the interaction with the RF electromagnetic field by extension. We note that the framework can also be extended to unsupervised learning, or reinforcement learning settings [27]. As seen in Fig. 1, to learn the task with a classical machine learning model, the perturbed state $|\psi_\lambda(\xi^m)\rangle$ in (26) is measured, which results in
$$\tilde{z}^m = \langle O\rangle_{\psi_\lambda(\xi^m)}. \tag{27}$$
The result of the measurement $\tilde{z}^m$ is fed into the classical machine learning model $f_\gamma(\tilde{z}^m)$ (i.e., a neural network) to obtain a prediction.
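For a single qubit, the chain from probe preparation through the interaction in (25) to the measurement in (27) can be traced with plain 2×2 matrices. The sketch below is a toy under stated assumptions, not the paper's 10-qubit circuit: the probe is a hypothetical one-parameter state $R_y(\lambda)|0\rangle$, the observable is Pauli Z, and the values of $\Omega t$ and $\Phi$ are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(angle):
    """Rz(angle) = exp(-i * angle * Z / 2)."""
    return [[cmath.exp(-0.5j * angle), 0], [0, cmath.exp(0.5j * angle)]]

def rx(angle):
    """Rx(angle) = exp(-i * angle * X / 2)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(angle):
    """Ry(angle) = exp(-i * angle * Y / 2); used here as a toy probe circuit."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, -s], [s, c]]

def interaction(rabi_angle, phase):
    """U^(n) = Rz(Phi) Rx(Omega t) Rz(-Phi), as in (25)."""
    return matmul(rz(phase), matmul(rx(rabi_angle), rz(-phase)))

def expect_z(lam, rabi_angle, phase):
    """<Z> after preparing Ry(lam)|0> and applying the interaction, as in (26)-(27)."""
    u = matmul(interaction(rabi_angle, phase), ry(lam))
    amp0, amp1 = u[0][0], u[1][0]   # first column is the image of |0>
    return abs(amp0) ** 2 - abs(amp1) ** 2

rabi_angle, phase = 0.8, 1.2        # illustrative Omega*t and Phi
z_plain = expect_z(0.0, rabi_angle, phase)            # probe |0>
z_tuned = expect_z(math.pi / 2, rabi_angle, phase)    # probe (|0> + |1>)/sqrt(2)
```

With the probe $|0\rangle$ the measurement equals $\cos(\Omega t)$ and carries no trace of $\Phi$, whereas the equal superposition yields $-\sin(\Phi)\sin(\Omega t)$. In this toy setting, the choice of probe parameter therefore decides which features of the field reach the learning model.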
The goal is to optimise the parameters of the quantum circuit $U_\lambda$ and the learning model $f_\gamma(\tilde{z}^m)$ and minimise a loss function $\mathcal{L}_{\lambda,\gamma}(\cdot)$ between the prediction and the true target. For example, the loss can be defined as the cross-entropy loss in the case of classification as
$$\mathcal{L}_{\lambda,\gamma}(\xi^m, r^m) = -\log\frac{e^{w^m_{\tilde{r}}}}{\sum_{\tilde{r}'} e^{w^m_{\tilde{r}'}}}, \tag{28}$$
where $w^m_{\tilde{r}}$ denotes the logit corresponding to hypothesis $\tilde{r}^m$ and the sum runs over all hypotheses. In the case of regression, the loss can be defined as the mean-squared error (MSE) loss whereby
$$\mathcal{L}_{\lambda,\gamma}(\xi^m, r^m) = \big(f_\gamma(\tilde{z}^m) - r^m\big)^2. \tag{29}$$
Formally, we aim to solve
$$\arg\min_{\lambda,\gamma}\ \frac{1}{M}\sum_{m=1}^{M}\mathcal{L}_{\lambda,\gamma}(\xi^m, r^m), \tag{30}$$
where the optimisation is done over the variational parameters $\lambda,\gamma$. This problem is solved via gradient descent whereby the parameters are updated as
$$\lambda \leftarrow \lambda - \eta\,\frac{\partial\mathcal{L}_{\lambda,\gamma}(\xi^m, r^m)}{\partial\lambda}, \tag{31}$$
and
$$\gamma \leftarrow \gamma - \eta\,\frac{\partial\mathcal{L}_{\lambda,\gamma}(\xi^m, r^m)}{\partial\gamma}, \tag{32}$$
where $\eta$ denotes a learning rate. The gradients in (31) can be obtained via parameter-shift rules on quantum hardware, or via standard backpropagation on a simulator [28, 29, 30]. The gradients in (32) can also be computed via standard backpropagation. Training is done in silico, using the ray-tracer data, and, assuming transferability, one could then use the same variational parameters when the RF field is unknown in a real environment.
5 Experiments
In this section, we provide experimental results to validate the proposed scheme. The code for the experiments will be available on GitHub.
5.1 Task
We consider the task of localisation, where an agent must determine whether it reached a predetermined target location [31, 32]. In particular, we consider the network deployment depicted in Fig. 3, in which two transmitters (blue circles) communicate in an urban environment with multiple buildings (grey rectangles). We assume that communication takes place at a frequency $f_c = 2.14$ GHz.
The transmitter has single-antenna equipment and the agent’s location $[x^m, y^m]$, where $x^m \in [-70, 70]$, $y^m \in [-30, 70]$, is generated uniformly at random within the deployment area. We assume unit transmit power. The ray-tracer takes as input the agent’s location $[x^m, y^m]$ and, for an unmodulated carrier, produces the sequence of complex path gains and propagation delays $\{\{\{a_{kl}^m, \tau_{kl}^m\}_{k=1}^{K}\}_{l=1}^{L}\}_{m=1}^{M}$, where
$$a_{kl}^m = \mu_{eg}^{T}\,\epsilon_{kl}\,P_k\,\rho_{kl}^m\,e^{i\phi_{kl}^m}. \tag{33}$$
We use the Sionna ray-tracer [25] and we generate the dataset under ground-truth material parameters. Whilst Sionna does not inherently support quantum receivers, using an isotropic ("iso") antenna at the agent location ensures that the ray tracer captures the full incoming field without any hardware-specific filtering.
Figure 3: We consider an urban scenario with two transmitters (blue circles) communicating in an urban environment with multiple buildings (grey rectangles). We assume that communication occurs at a frequency $f_c = 2.14$ GHz. The transmitter has single-antenna equipment and the agent’s location $[x, y]$ is generated uniformly at random within the deployment area. We consider two target locations: one which has a transmitter nearby and, as a result, strong line-of-sight path (orange rectangle), and the second which is hidden behind an obstacle (red rectangle).
Figure 4: Accrued training loss (left), testing loss (center), and prediction accuracy (right) over training epochs. We consider the target location shown in Fig. 3 (orange rectangle) which has a transmitter nearby and strong line-of-sight path. All results are averaged over three independent trials. Shaded regions represent the variance across trials, which reflects random initialisation and dropout during training.
Figure 5: Accrued training loss (left), testing loss (center), and prediction accuracy (right) over training epochs. We consider the target location shown in Fig. 3 (red rectangle) which is hidden behind obstacles. All results are averaged over three independent trials. Shaded regions represent the variance across trials, which reflects random initialisation and dropout during training.
The agent’s goal is to learn to predict whether it reached the target after the interaction with the RF field. We specifically chose two distinct target locations, shown in Fig. 3, whereby one target (orange rectangle) has a transmitter nearby and, as a result, strong line-of-sight path, and the second target (red rectangle) is hidden behind an obstacle. Each target region is a rectangle of size $10 \times 10$. Specifically, for the first target location the corresponding target labels are given by
$$r^m = \begin{cases} 0, & -30 \le x^m \le -20,\ 10 \le y^m \le 20 \\ 1, & \text{otherwise,} \end{cases}$$
for $m=1,\dots,M$, and for the second target we have
$$r^m = \begin{cases} 0, & -70 \le x^m \le -60,\ 10 \le y^m \le 20 \\ 1, & \text{otherwise,} \end{cases}$$
for $m=1,\dots,M$. We note that, if the agent could use the position of the transmitters, this problem could be solved with standard triangulation techniques. Here, however, we assume that the agent has no knowledge about the environment and only uses the interaction to make a prediction at deployment.
5.2 Architecture and Hyperparameters
We consider an $N=10$-qubit probe state $|\psi_\lambda\rangle$ prepared by a parameterised circuit $U_\lambda$ comprised of $5$ layers. Each layer consists of two-qubit $R_Z(\lambda)$ gates, followed by single-qubit $R_Z(\lambda)$ and $R_Y(\lambda)$ rotations. The parameters $\lambda$ are not shared between gates and layers. We use a compact learning model. In particular, we measure the expected values of Pauli Z observables and feed the results into a fully-connected neural network. The model is comprised of two hidden layers of sizes $128$ and $64$ and ReLU activations.
The input layer of size $N$ takes in the measurement outcomes of $|\psi_\lambda(\xi^m)\rangle$, and the output layer of size $2$ produces the logits for two classes. We use dropout with rate $0.2$. The variational parameters are learnt by minimising the cross-entropy loss via (31) and (32) with learning rate $\eta = 0.003$. We use $M = 2000$ data samples, with a $0.8$/$0.2$ train-test split and mini-batches of size $64$.
5.3 Benchmarks
As a benchmark, we consider the setting where all multipath parameters are known. This approach is inspired by existing works in the literature that use CSI measurements for localisation tasks, e.g., [5, 6, 33, 34]. We specifically choose this scheme since in this case the agent has access to both richer information (all individual multipath parameters and delay spread by extension), as well as a very powerful model architecture. The sequence of complex path gains and propagation delays $\{\{\{a_{kl}^m, \tau_{kl}^m\}_{k=1}^{K}\}_{l=1}^{L}\}_{m=1}^{M}$ is used to obtain the inputs to a long short-term memory (LSTM) network [35] that learns to predict whether the target location has been reached. Specifically, the LSTM model takes as the $m$-th input the phase sequence
$$\phi_{kl}^m = \arg\big(a_{kl}^m\,e^{-i\omega\tau_{kl}^m}\big) \tag{34}$$
for $k=1,\dots,K$ and $l=1,\dots,L$. The sequence is time-ordered with the phase $\phi_{kl}^m$ corresponding to the smallest delay $\tau_{kl}^m$ being the first entry, and the phase $\phi_{kl}^m$ corresponding to the largest delay $\tau_{kl}^m$ being the last. We choose a sequential model to encode an inductive bias into the choice of the architecture, aiming for the best possible performance [36]. In addition, LSTM networks naturally handle data samples of different length, as is the case here. The model is comprised of an LSTM layer with hidden size of $64$, followed by a fully connected layer of size $64$ and an output layer of size $2$, for a total of $17{,}282$ trainable parameters.
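The layer sizes quoted above can be turned into parameter counts with a few lines of arithmetic. The sketch below rests on assumptions the text does not state: the LSTM receives one scalar phase per time step, and it keeps two bias vectors per gate (the convention used by PyTorch's `nn.LSTM`). Under that reading, an LSTM followed directly by a 64-to-2 output layer gives exactly 17,282; a separate 64-to-64 dense layer would add a further 4,160. This is one interpretation consistent with the quoted total, not a statement of the author's implementation.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def lstm_params(n_in, hidden, two_biases=True):
    """Four gates, each with input weights, recurrent weights, and bias terms."""
    n_bias = 2 * hidden if two_biases else hidden
    return 4 * (n_in * hidden + hidden * hidden + n_bias)

# Classical head of the proposed scheme (Sec. 5.2): 10 -> 128 -> 64 -> 2.
mlp_total = (linear_params(10, 128)
             + linear_params(128, 64)
             + linear_params(64, 2))

# Benchmark (Sec. 5.3) under the assumptions named above.
lstm_total = lstm_params(n_in=1, hidden=64)
benchmark_total = lstm_total + linear_params(64, 2)

# Extra parameters a separate 64 -> 64 dense layer would contribute.
extra_dense = linear_params(64, 64)
```

The count for the proposed scheme covers only the neural network; the circuit parameters $\lambda$ come on top and depend on how many two-qubit gates each layer contains, which the text does not specify.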
We use the Adam optimiser with a learning rate of 0.0001. We use $M=2000$ data samples, with a 0.8–0.2 train-test split and mini-batches of size 64.

5.4 Results

We evaluate the performance of the proposed scheme and the benchmark by examining the accumulated training and testing loss, as well as the prediction accuracy, defined as the ratio of correctly classified inputs. The training loss is calculated according to (28), reflecting how well each method fits the data it was trained on. The testing loss and the prediction accuracy are evaluated on 400 previously unseen data samples, providing a measure of how well the learned model generalises to new inputs not encountered during training.

Figure 4 shows the evolution of the training and testing losses. The proposed scheme successfully learns the underlying task, as evidenced by the rapid decrease and eventual stabilisation of the training loss. In addition, it is consistent across trials and achieves high prediction accuracy after only a few training epochs, demonstrating that the model does not merely memorise the training data but generalises quickly to unseen samples. We compare the proposed scheme with a sequential model that learns from the complete sequence of multipath parameters. The benchmark converges more slowly and plateaus at a higher loss level. Overall, the results in Fig. 4 demonstrate that learning from electromagnetic fields with variational quantum sensors is both fast and accurate, generalising well to unseen data and outperforming the benchmark in convergence speed.

When the target location is hidden behind obstacles, as seen in Fig. 5, the agent can still learn to solve the task using the proposed approach. Overall, the training and testing losses are higher and the prediction accuracy is lower than for the first target location. However, the agent still achieves prediction accuracy comparable to that of the classical benchmark.
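For context on the relative sizes of the two models compared in this section, the parameter counts can be checked with a few lines of arithmetic. The sketch below rests on assumptions that the paper does not state: the LSTM follows the PyTorch convention of two bias vectors per gate, each time step carries a single scalar phase, and the fully connected stage after the LSTM is treated as one 64-to-2 map. Under this reading the count reproduces the stated total of 17,282. The helper names are hypothetical, and the count for the proposed scheme covers only its classical network, since the number of circuit parameters depends on the placement of the two-qubit gates, which is not specified here.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def lstm_params(n_in: int, hidden: int) -> int:
    """Single-layer LSTM: four gates, each with input weights, recurrent weights
    and two bias vectors (PyTorch convention, assumed here)."""
    return 4 * (hidden * n_in + hidden * hidden + 2 * hidden)


# Benchmark under the stated assumptions: scalar input, hidden size 64, 64 -> 2 output.
benchmark_total = lstm_params(1, 64) + linear_params(64, 2)

# Classical network of the proposed scheme: 10 -> 128 -> 64 -> 2 (circuit parameters excluded).
head_total = linear_params(10, 128) + linear_params(128, 64) + linear_params(64, 2)

print(benchmark_total)  # 17282
print(head_total)       # 9794
```

Under these assumptions the benchmark has about 1.76 times as many trainable parameters as the classical network of the proposed scheme.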
We note that the benchmark has access to full channel knowledge. The proposed scheme almost matches a fully-informed classical model with a suitable architecture and double the capacity, even in the harder setting, highlighting the sensitivity of quantum sensing to weaker RF electromagnetic fields. Unlike the benchmark, the proposed approach does not require any channel measurements once deployed and could carry out its task solely by interacting with the RF electromagnetic field.

6 Discussion and Conclusion

Quantum sensors could significantly expand what intelligent machines can perceive about their environment, enabling capabilities beyond the reach of classical techniques. In this paper, we investigated the use of variational quantum sensing to learn from the incident RF electromagnetic field in wireless communication networks. We provided extensive experiments under realistic conditions on a localisation task, and we showed that quantum RF sensing can enable machine intelligence that effectively utilises the radio channel as a powerful source of knowledge about the world.

Several directions remain for future work. Practical quantum systems are subject to noise and decoherence, and incorporating realistic noise models into the framework is an important next step. Alternative parameterised circuits, such as symmetry-preserving ansätze, could further improve performance whilst reducing circuit complexity [37]. Although we did not encounter barren plateau issues in our relatively shallow circuits [38], advanced optimisers or regularisation strategies to mitigate this phenomenon are also worth exploring, as are gradient-free training methods. Finally, exploring more suitable architectures for the learning model may also be useful.

References

[1] H. Yu, H. Lee, and H. Jeon, “What is 5G? Emerging 5G mobile services and network requirements,” Sustainability, vol. 9, no. 10, p. 1848, 2017.
[2] T. Wild, A. Grudnitsky, S. Mandelli, M. Henninger, J. Guan, and F.
Schaich, “6G integrated sensing and communication: From vision to realization,” in 2023 20th European Radar Conference (EuRAD). IEEE, 2023, pp. 355–358.
[3] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.
[4] P. Popovski, Wireless Connectivity: An Intuitive and Fundamental Guide. John Wiley & Sons, 2020.
[5] C. Studer, S. Medjkouh, E. Gonultaş, T. Goldstein, and O. Tirkkonen, “Channel charting: Locating users within the radio environment using channel state information,” IEEE Access, vol. 6, pp. 47682–47698, 2018.
[6] H. Zhou, Y. Zhang, and M. Temiz, “High-resolution indoor sensing using channel state information of WiFi networks,” Electronics, vol. 12, no. 18, p. 3931, 2023.
[7] S. Ribouh, R. Sadli, Y. Elhillali, A. Rivenq, and A. Hadid, “Vehicular environment identification based on channel state information and deep learning,” Sensors, vol. 22, no. 22, p. 9018, 2022.
[8] C. L. Degen, F. Reinhard, and P. Cappellaro, “Quantum sensing,” Reviews of Modern Physics, vol. 89, no. 3, p. 035002, 2017.
[9] C. W. Helstrom, “Quantum detection and estimation theory,” Journal of Statistical Physics, vol. 1, no. 2, pp. 231–252, 1969.
[10] M. T. Simons, A. B. Artusio-Glimpse, A. K. Robinson, N. Prajapati, and C. L. Holloway, “Rydberg atom-based sensors for radio-frequency electric field metrology, sensing, and communications,” Measurement: Sensors, vol. 18, p. 100273, 2021.
[11] M. Cui, Q. Zeng, and K. Huang, “Towards atomic MIMO receivers,” IEEE Journal on Selected Areas in Communications, vol. 43, no. 3, pp. 659–673, 2025.
[12] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio et al., “Variational quantum algorithms,” Nature Reviews Physics, vol. 3, no. 9, pp. 625–644, 2021.
[13] S. Nolan, A. Smerzi, and L. Pezzè, “A machine learning approach to Bayesian parameter estimation,” npj Quantum Information, vol. 7, no. 1, p. 169, 2021.
[14] T. Xiao, J. Fan, and G. Zeng, “Parameter estimation in quantum sensing based on deep reinforcement learning,” npj Quantum Information, vol. 8, no. 1, p. 2, 2022.
[15] J. J. Meyer, J. Borregaard, and J. Eisert, “A variational toolbox for quantum multi-parameter estimation,” npj Quantum Information, vol. 7, no. 1, p. 89, 2021.
[16] B. MacLellan, P. Roztocki, S. Czischek, and R. G. Melko, “End-to-end variational quantum sensing,” arXiv preprint arXiv:2403.02394, 2024.
[17] I. Nikoloska, H. Joudeh, R. van Sloun, and O. Simeone, “Dynamic estimation loss control in variational quantum sensing via online conformal inference,” arXiv preprint arXiv:2505.23389, 2025.
[18] I. Nikoloska, R. Van Sloun, and O. Simeone, “Adaptive Bayesian single-shot quantum sensing,” arXiv preprint arXiv:2507.16477, 2025.
[19] T. S. Rappaport, “Wireless communications–principles and practice, (the book end),” Microwave Journal, vol. 45, no. 12, pp. 128–129, 2002.
[20] A. Tikhomirov, E. Omelyanchuk, and A. Semenova, “Recommended 5G frequency bands evaluation,” in 2018 Systems of Signals Generating and Processing in the Field of on Board Communications. IEEE, 2018, pp. 1–5.
[21] C. Fleming, N. Cummings, C. Anastopoulos, and B.-L. Hu, “The rotating-wave approximation: consistency and applicability from an open quantum system analysis,” Journal of Physics A: Mathematical and Theoretical, vol. 43, no. 40, p. 405304, 2010.
[22] G. Agarwal, “Rotating-wave approximation and spontaneous emission,” Physical Review A, vol. 4, no. 5, p. 1778, 1971.
[23] D. Zeuch, F. Hassler, J. J. Slim, and D. P. DiVincenzo, “Exact rotating wave approximation,” Annals of Physics, vol. 423, p. 168327, 2020.
[24] K. Cranmer, J. Brehmer, and G. Louppe, “The frontier of simulation-based inference,” Proceedings of the National Academy of Sciences, vol. 117, no. 48, pp. 30055–30062, 2020.
[25] J. Hoydis, S. Cammerer, F. A. Aoudia, A. Vem, N. Binder, G. Marcus, and A. Keller, “Sionna: An open-source library for next-generation physical layer research,” arXiv preprint arXiv:2203.11854, 2022.
[26] J. Hoydis, F. A. Aoudia, S. Cammerer, M. Nimier-David, N. Binder, G. Marcus, and A. Keller, “Sionna RT: Differentiable ray tracing for radio propagation modeling,” in 2023 IEEE Globecom Workshops (GC Wkshps). IEEE, 2023, pp. 317–321.
[27] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, Cambridge, 2016, vol. 1, no. 2.
[28] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, “Quantum circuit learning,” Physical Review A, vol. 98, no. 3, p. 032309, 2018.
[29] M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, “Evaluating analytic gradients on quantum hardware,” Physical Review A, vol. 99, no. 3, p. 032331, 2019.
[30] I. Nikoloska, “Machine learning with quantum computers,” in Artificial Intelligence and Intelligent Matter: Nanoscience, Soft Matter, Philosophy. Springer, 2026, pp. 417–434.
[31] N. Akai, S. A. Rahok, K. Inoue, and K. Ozaki, “Development of magnetic navigation method based on distributed control system using magnetic and geometric landmarks,” ROBOMECH Journal, vol. 1, no. 1, p. 21, 2014.
[32] A. Ataka, H.-K. Lam, and K. Althoefer, “Magnetic-field-inspired navigation for robots in complex and unknown environments,” Frontiers in Robotics and AI, vol. 9, p. 834177, 2022.
[33] H. Yu, G.-L. Chen, G.-J. Yu, S.-H. Zhao, B. Yang, and J. Liu, “Indoor passive localisation based on reliable CSI extraction,” IET Communications, vol. 13, no. 11, pp. 1633–1642, 2019.
[34] S. De Bast, A. P. Guevara, and S. Pollin, “CSI-based positioning in massive MIMO systems using convolutional neural networks,” in 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring). IEEE, 2020, pp. 1–5.
[35] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[36] M. M. Bronstein, J. Bruna, T. Cohen, and P. Veličković, “Geometric deep learning: Grids, groups, graphs, geodesics, and gauges,” arXiv preprint arXiv:2104.13478, 2021.
[37] I. Nikoloska, O. Simeone, L. Banchi, and P. Veličković, “Time-warping invariant quantum recurrent neural networks via quantum-classical adaptive gating,” Machine Learning: Science and Technology, 2023.
[38] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, “Barren plateaus in quantum neural network training landscapes,” Nature Communications, vol. 9, no. 1, p. 4812, 2018.