SYSTEM AND METHOD FOR TRANSMITTING COVERT WIRELESS SIGNALS WITHIN AN OVERT WIRELESS SIGNAL TRANSMISSION

20230078254 · 2023-03-16

Abstract

A system and method for transmitting, from an encoder to a decoder, one or more covert wireless signals within an overt wireless signal. The encoder receives a bitstream and encodes the received bitstream into an encoded noise signal that replicates a noise signal of a predetermined hardware device. The encoded noise signal is then combined with a cover modulated signal to form at least one covert wireless signal that is distinct from and conceals the received bitstream. The covert wireless signal is transmitted within an overt wireless signal to a decoder that receives the covert wireless signal, removes the cover modulated signal from the received covert wireless signal to isolate the encoded noise signal, and then converts the isolated encoded noise signal into a decoded bitstream. The covert wireless signal can be a plurality of carrier signals, optionally established through orthogonal frequency-division multiplexing (OFDM) or quadrature amplitude modulation (QAM).

Claims

1. A system for transmitting one or more covert wireless signals within an overt wireless signal, comprising: an encoder configured to: receive a bitstream; encode the received bitstream to an encoded noise signal, the encoded noise signal replicating a noise signal of a predetermined hardware device; and combine a cover modulated signal with the encoded noise signal to form at least one covert wireless signal, the at least one covert wireless signal distinct from the received bitstream; transmit the at least one covert wireless signal within an overt wireless signal; and a decoder operably coupled to the encoder via the overt wireless signal, the decoder configured to: receive the at least one covert wireless signal; remove the cover modulated signal from the received at least one covert wireless signal to isolate the encoded noise signal; and convert the isolated encoded noise signal into a decoded bitstream.

2. The system of claim 1, further comprising a critic module operably coupled to the encoder, the critic module configured to: compare the encoded noise signal generated by the encoder and the noise signal of the predetermined hardware device; and determine statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device.

3. The system of claim 2, wherein the encoder is further configured to adjust characteristics of the encoded noise signal in response to the critic module determining that the statistical properties for the encoded noise signal differ from the statistical properties of the noise signal of the predetermined hardware device.

4. The system of claim 1, wherein each of the encoder and the decoder includes a multi-node neural network.

5. The system of claim 1, wherein the encoded noise signal is within a transmission bandwidth defined by at least one predetermined regulatory communication standard.

6. The system of claim 1, wherein: the encoder is further configured to transmit the bitstream to the decoder for a predetermined training session; and the decoder is further configured to: receive the bitstream from the encoder during the training session; and compare the received bitstream with the decoded bitstream to thereby determine a decoding accuracy.

7. The system of claim 6, wherein the decoder is further configured to transmit the decoding accuracy to the encoder.

8. The system of claim 1, wherein the at least one covert wireless signal is a plurality of orthogonal frequency-division multiplexing (OFDM) carrier signals or quadrature amplitude modulation (QAM) carrier signals.

9. The system of claim 1, wherein the at least one covert wireless signal is encrypted.

10. A method for transmitting one or more covert wireless signals within an overt wireless signal, comprising: receiving a bitstream at an encoder; encoding the received bitstream, at the encoder, into an encoded noise signal, the encoded noise signal replicating a noise signal of a predetermined hardware device; combining a cover modulated signal with the encoded noise signal to form at least one covert wireless signal, the at least one covert wireless signal distinct from the received bitstream; transmitting the at least one covert wireless signal from the encoder within an overt wireless signal; receiving the at least one covert wireless signal at a decoder; removing, at the decoder, the cover modulated signal from the received at least one covert wireless signal to isolate the encoded noise signal; and converting, at the decoder, the isolated encoded noise signal into a decoded bitstream.

11. The method of claim 10, further comprising: comparing, at a critic module, the encoded noise signal generated by the encoder and the noise signal of the predetermined hardware device; and determining statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device.

12. The method of claim 10, further comprising: adjusting, at the encoder, characteristics of the encoded noise signal in response to the critic module; and determining, at the encoder, that the statistical properties for the encoded noise signal differ from the statistical properties of the noise signal of the predetermined hardware device.

13. The method of claim 10, further comprising creating a multi-node neural network at each of the encoder and the decoder.

14. The method of claim 10, wherein encoding the received bitstream, at the encoder, into the encoded noise signal is encoding within a transmission bandwidth defined by at least one predetermined regulatory communication standard.

15. The method of claim 10, further comprising: transmitting the bitstream from the encoder to the decoder for a predetermined training session; receiving the bitstream at the decoder during the training session; and comparing, at the decoder, the received bitstream with the decoded bitstream to thereby determine a decoding accuracy.

16. The method of claim 10, further including, in the at least one covert wireless signal, a plurality of orthogonal frequency-division multiplexing (OFDM) carrier signals or a plurality of quadrature amplitude modulation (QAM) carrier signals.

17. The method of claim 10, further including encrypting the at least one covert wireless signal.

18. A system for transmitting one or more covert wireless signals within an overt wireless signal, comprising: an encoding means for: receiving a bitstream; encoding the received bitstream to an encoded noise signal, the encoded noise signal replicating a noise signal of a predetermined hardware device; and combining a cover modulated signal with the encoded noise signal to form at least one covert wireless signal, the at least one covert wireless signal distinct from the received bitstream; transmitting the at least one covert wireless signal within an overt wireless signal; and a decoding means operably coupled to the encoding means via the overt wireless signal, the decoding means for: receiving the at least one covert wireless signal; removing the cover modulated signal from the received at least one covert wireless signal to isolate the encoded noise signal; and converting the isolated encoded noise signal into a decoded bitstream.

19. The system of claim 18, further comprising a critic means, operably coupled to the encoding means, the critic means for: comparing the encoded noise signal generated by the encoding means and the noise signal of the predetermined hardware device; and determining statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device.

20. The system of claim 18, wherein: the encoding means further transmitting the bitstream to the decoding means for a predetermined training session; and the decoding means further: receiving the bitstream from the encoder during the training session; and comparing the received bitstream with the decoded bitstream to thereby determine a decoding accuracy.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] FIG. 1 is a pictorial diagram of a high-bandwidth covert side-channel between multiple radios using a common wireless network.

[0014] FIG. 2A shows a graph representing hardware noise.

[0015] FIG. 2B shows a schematic view of a system for hiding covert wireless signals in an overt wireless signal.

[0016] FIG. 3A shows a schematic view of a neural network architecture for an encoder of the system shown in FIG. 2B.

[0017] FIG. 3B shows a schematic view of a neural network architecture for a decoder of the system shown in FIG. 2B.

[0018] FIG. 3C shows a schematic view of a neural network architecture for a critic module of the system shown in FIG. 2B.

[0019] FIG. 4A shows a graph representing the loss function for QPSK at the training SNR.

[0020] FIG. 4B shows a graph representing the generated noise signal constellation for QPSK at the training SNR.

[0021] FIG. 4C shows a graph representing the transmitted covert signal constellation for QPSK at the training SNR.

[0022] FIG. 4D shows a graph representing the received covert signal constellation for QPSK at the training SNR.

[0023] FIG. 5A shows a graph representing the learned distribution of the system shown in FIG. 2B.

[0024] FIG. 5B shows a graph representing the confidence probability generated by the critic module for the system shown in FIG. 2B.

[0025] FIG. 6A shows a graph representing the bit error rate for the system shown in FIG. 2B.

[0026] FIG. 6B shows a graph representing the corresponding regions of operation.

DETAILED DESCRIPTION OF THE INVENTION

[0027] With reference to the figures, in which like numerals represent like elements throughout the several views, FIG. 1 is a pictorial diagram of a high-bandwidth covert side-channel between multiple radios, transmitter radio 10 and receiver radio 12, using a common wireless network 14. The method is covert because, in one embodiment, the devices, such as transmitter radio 10 and receiver radio 12 (or similar devices such as laptops or smartphones), can function as normal devices using standard over-the-air communication channels that transmit overt signals. Rather than exchanging overtly encrypted messages with each other, or through some centralized server or other device, the transmitter radio 10 and receiver radio 12 can appear to be conducting normal network communication (browsing web pages, sending mail, streaming multimedia) when, in fact, they are communicating undetected. An adversary will face great difficulty in discovering the side channel because the covert channel(s) are transmitted by normal mobile nodes.

[0028] In one embodiment, the technique uses a common physical-layer protocol to mask the communication, taking advantage of the hardware imperfections present in commodity hardware, the intrinsically noisy channel of wireless communication, and potentially receiver diversity. When embodied within software-defined radios, the system operates in the standard 2.4 GHz ISM band, but can also be easily extended to TV or other broadcast channel whitespaces. In one embodiment, the system 22 (FIG. 2B) uses an OFDM waveform. Most consumer electronic devices use OFDM waveforms for high-bandwidth networks (including DVB, DAB, WiFi, WiMAX and LTE), and there are benefits in “hiding” in such a ubiquitous waveform. For example, imperfections in off-the-shelf Network Interface Cards (NICs), coupled with an additive random wireless channel, cause the signal to degrade over time and distance. To mask a covert communication, one can “pre-distort” the signal to mimic the normal imperfection of the hardware and the Gaussian distortion arising from the channel. This distortion appears as noise to an unaware receiver, be it the Wi-Fi access point or an adversary. However, a receiver (such as receiver radio 12) that is aware of the presence of the signal and its encoding technique can decode the “noise” to reveal the hidden message in the covert signal.

[0029] FIG. 2A shows a graph representing hardware noise, and FIG. 2B shows a schematic view of a system 22 for hiding covert signals in wireless signals. The radio frontend and analog components in the transmitting radio 10 and the receiving radio 12 introduce impairments, including carrier frequency offset, sampling clock offset, phase noise, IQ imbalance, DC offset and power amplifier (PA) non-linearity. In practical scenarios, a combination of these impairments manifests as noise that spreads a modulated constellation point into a constellation cloud 10, as shown in FIG. 2A. Wireless standards specify an acceptable range of error for operation. Consequently, one can generate a covert wireless signal whose distribution is statistically identical to that of hardware noise, such that steganalysis of this covert signal cannot determine whether the noise originates in the transmitter frontend or carries an underlying covert communication.

[0030] Generative Adversarial Networks (GANs) are existing learning-based networks that can generate realistic images, videos, speech, and handwritten text by efficiently transforming input data from one domain to another desired domain. The present invention can leverage this property of GANs to transfer the domain of a secret message to that of hardware noise, which can be carried by any cover signal of choice. FIG. 2B shows a block diagram of the system 12 for adversarial learning, where the encoder 14 and decoder 18 networks are trained in the presence of a critic network (or module) 22, which acts like a steganalyzer to differentiate true hardware noise from the encoder-generated covert signal.

[0031] In the system 12, one or more covert wireless signals (such as QAM or OFDM digital signals) can be transmitted within an overt wireless signal (such as Wi-Fi, LTE, LoRa, or other standard communication, regulatory band signals). The encoder 14 receives a bitstream M and encodes the received bitstream into an encoded noise signal (C.sub.enc) that replicates a noise signal (N) of a predetermined hardware device (noise generator 20), such as a transmitting radio 10 or another transmitting device, such as a repeater, Wi-Fi router, etc. The encoded noise signal is then combined with a cover modulated signal to form at least one covert wireless signal (C.sub.mod) that is distinct from and conceals the received bitstream. The covert wireless signal is transmitted within an overt wireless signal over a channel 16 to a decoder 18 that receives the covert wireless signal (C.sub.mod), removes the cover modulated signal from the received covert wireless signal to isolate the encoded noise signal (C.sub.enc), and then converts the isolated encoded noise signal into a decoded bitstream (M). The decoder 18 can, but does not have to, receive the overt wireless signal and/or act upon it.

[0032] The system 12 can include a critic module 22 operably coupled to the encoder 14 that compares the encoded noise signal (C.sub.enc) generated by the encoder 14 and the noise signal (N) of the predetermined hardware device, and determines statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device. The predetermined hardware device can be any device that is known to introduce an amount of noise in a wireless signal transmission. The encoder 14 can be further configured to adjust characteristics of the encoded noise signal in response to the critic module 22 determining that the statistical properties for the encoded noise signal (C.sub.enc) differ from the statistical properties of the noise signal (N) of the predetermined hardware device.

[0033] In one embodiment, each of the encoder 14 and the decoder 18 includes a multi-node neural network. Other types of AI or other expert systems can be used for each of the encoder 14 and decoder 18. In such an embodiment, the encoder 14 can be further configured to transmit the original bitstream (M) to the decoder 18, or other device, for a predetermined training session such that the decoder 18 will receive the bitstream from the encoder 14 during the training session, and compare the received bitstream with the decoded bitstream to thereby determine a decoding accuracy. Thus, the network can be “trained” to ensure it is correctly recreating the covert bitstream. The decoder 18 can relay the decoding accuracy to the encoder 14 and thus train the system 12 to increase accuracy in receiving and revealing the contents of the one or more covert wireless signals.

[0034] As is further explained herein, the covert wireless signal can be a single signal within the overt channel, or can be a plurality of carrier signals, optionally established through orthogonal frequency-division multiplexing (OFDM) or quadrature amplitude modulation (QAM). The covert signal(s) can also themselves be encrypted depending on the embodiment, but encrypting and decrypting the covert signals adds overhead to the data transmission, both lowering data capacity and increasing the possibility of detection.

[0035] Several advantages of the present invention can be categorized as: 1) Cover-independent covert signal: the proposed method of generating a covert signal by domain transformation is independent of any properties of the cover signal, such as waveform or modulation order; 2) High capacity: as the covert signal is independent of the cover, one symbol of the cover signal can embed one symbol of the covert signal, so that, in the domain of complex representation of signals, it can achieve up to 100% embedding capacity; 3) Hardware noise as an input to the NN: a flexible neural network architecture can be used where the variation of hardware noise is chosen as an input parameter; 4) Steganalyzer in a training session: instead of performing steganalysis as a separate task, the steganalyzer can be integrated into the process of encoding, in the form of a critic module 22. The critic module 22 helps in differentiating true hardware noise from the covert signal generated by the encoder 14, thus providing important feedback to the encoder 14 for optimizing the encoding process. As the steganalysis is performed in the signal domain, and not on decoded data, it is resilient to signal anomaly detection techniques; and 5) Operational over a wide range of SNR: instead of modulating symbol-by-symbol as in a traditional communication system, the encoder 14 and decoder 18 are designed to operate on blocks of bits, which improves the performance of the covert link at different levels of induced hardware noise.

[0036] In one embodiment, system 12 can consist of three main nodes: an encoder 14, a decoder 18 and a critic module 22, as shown in FIG. 2B. The encoder 14 encodes a confidential message M and outputs a covert noise vector C.sub.enc that has the same statistical properties as the transmitter's hardware impairments. In other words, $C_{enc} \sim \mathcal{CN}(0, \sigma_{HW}^{2})$, where σ.sup.2.sub.HW equals the maximum constellation error E.sub.rms (i.e., σ.sup.2.sub.HW=E.sub.rms) for a given modulation order m. The encoder 14 modulates C.sub.enc using a cover signal to produce a complex modulated covert signal C.sub.mod, then transmits C.sub.mod over a broadcast channel.

[0037] Similar to most steganographic schemes, a distorted cover signal N.sub.mod consists of a modulated cover signal added to a noise signal N. The noise signal N is collected from a real transmitter to carry the statistical properties of the transmitter's hardware impairments. In other words, $N \sim \mathcal{CN}(0, \sigma_{HW}^{2})$. The main goal of the encoder is to encode the confidential information M into a complex covert signal C.sub.mod that looks statistically identical to a distorted cover signal N.sub.mod, such that any receiver can demodulate C.sub.mod as a standard modulated signal. However, an intended receiver with a decoder 18 neural network can decode it and extract the secret message M. Thus, there can exist an AWGN channel between the encoder 14 and decoder 18. If the encoder 14 transmits a complex modulated signal C.sub.mod, the decoder 18 receives Ċ.sub.mod, which is given by Ċ.sub.mod=C.sub.mod+W, where $W \sim \mathcal{CN}(0, \sigma_{ch}^{2})$ is the added noise vector due to the channel, and σ.sup.2.sub.ch depends on the SNR of the channel (SNR.sub.ch).
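The signal model of this paragraph can be sketched numerically. The following is a minimal NumPy illustration, not the inventors' implementation: the helper names (`complex_gaussian`, `qpsk_modulate`, `awgn_channel`), the bit-to-symbol mapping, the unit-power normalization, and the numeric values chosen for the hardware noise power and the channel SNR are all assumptions made for the example.

```python
import numpy as np

def complex_gaussian(n, variance, rng):
    """Draw n samples from CN(0, variance); the variance is split evenly over I and Q."""
    scale = np.sqrt(variance / 2.0)
    return rng.normal(0.0, scale, n) + 1j * rng.normal(0.0, scale, n)

def qpsk_modulate(bits):
    """Map bit pairs to unit-power QPSK symbols (mapping chosen for illustration)."""
    pairs = np.asarray(bits).reshape(-1, 2)
    return ((1 - 2 * pairs[:, 0]) + 1j * (1 - 2 * pairs[:, 1])) / np.sqrt(2)

def awgn_channel(signal, snr_db, rng):
    """Add W ~ CN(0, sigma_ch^2), assuming a signal of (approximately) unit power."""
    sigma2_ch = 10.0 ** (-snr_db / 10.0)
    return signal + complex_gaussian(signal.size, sigma2_ch, rng)

rng = np.random.default_rng(0)
cover_bits = rng.integers(0, 2, size=2 * 10000)
cover = qpsk_modulate(cover_bits)

sigma2_hw = 10.0 ** (-23 / 10.0)          # assumed hardware noise power (illustrative)
n_mod = cover + complex_gaussian(cover.size, sigma2_hw, rng)   # distorted cover N_mod
received = awgn_channel(n_mod, snr_db=17.0, rng=rng)            # signal after the channel
```

Plotting `n_mod` on the complex plane would show each QPSK point spread into a small cloud, and `received` shows the same clouds widened by the channel noise.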

[0038] The decoder 18 first demodulates the received complex modulated covert signal Ċ.sub.mod as a standard modulated cover signal, which is then subtracted from Ċ.sub.mod to reveal the encoded noise vector Ċ.sub.enc. Then, it decodes Ċ.sub.enc to recover the original message M. The critic module 22 (steganalyzer) is required to distinguish between C.sub.mod and N.sub.mod. It accepts C.sub.mod or N.sub.mod and calculates the confidence probability (P.sub.con) for each sequence. The critic module 22 measures the statistical properties of both C.sub.mod and N.sub.mod. C.sub.mod can be detected as an altered message if the two sequences C.sub.mod and N.sub.mod have different distributions. Thus, the encoder 14 has to modify C.sub.mod so that it looks statistically similar to N.sub.mod. If P.sub.con=0.5, the encoder 14 has performed well, since the critic module 22 cannot distinguish between C.sub.mod and N.sub.mod. At that point, C.sub.mod cannot be detected as an altered message, and the encoder 14 and decoder 18 have been trained to generate an undetectable covert signal.
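The first decoder step described above (demodulate as a standard cover signal, then subtract) can be illustrated as follows. This is a sketch under stated assumptions: a QPSK cover with unit-power symbols, hard-decision demodulation, and a noiseless channel for clarity. The function names are hypothetical and do not come from the source.

```python
import numpy as np

def qpsk_hard_decision(symbols):
    """Snap each received sample to the nearest unit-power QPSK constellation point."""
    return (np.sign(symbols.real) + 1j * np.sign(symbols.imag)) / np.sqrt(2)

def isolate_encoded_noise(received):
    """Re-create the cover from hard decisions and subtract it from the received signal."""
    cover_estimate = qpsk_hard_decision(received)
    return received - cover_estimate, cover_estimate

# Toy example: four cover symbols carrying a small encoded noise vector.
cover = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
c_enc = np.array([0.05 - 0.02j, -0.03 + 0.04j, 0.01 + 0.06j, -0.02 - 0.05j])
residual, cover_hat = isolate_encoded_noise(cover + c_enc)
```

The subtraction recovers the encoded noise exactly only while the noise is small enough that every hard decision is correct, which is one reason the encoded noise power is kept within the hardware error budget.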

[0039] The encoder 14, decoder 18 and critic module 22 can all be neural networks with parameters θ.sub.E, θ.sub.D, θ.sub.C respectively. The encoder 14 network is designed to accept M of length k in bits (i.e., M∈B.sup.k×1, where B={0, 1}) and to output covert noise C.sub.enc∈C.sup.(k/2)×1. For practical implementation, this is constrained to a variance less than or equal to σ.sup.2.sub.HW. Note that M and C.sub.enc carry the same number of real values, k, since each of the k/2 complex entries of C.sub.enc has a real and an imaginary part. The decoder 18 accepts the demodulated encoded noise vector Ċ.sub.enc∈C.sup.(k/2)×1 and outputs Ṁ∈R.sup.k×1. Ṁ is restricted to the range (0,1). At the end of a successful training process, Ṁ should converge to values in B. The critic network accepts either C.sub.mod or N.sub.mod∈C.sup.(k/2)×1 and outputs P.sub.con, which is restricted to the range (0,1).
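Because the decoder output lies in (0,1) rather than in {0, 1}, a final decision step is needed before bit errors can be counted. The sketch below assumes a simple threshold at 0.5, which the source does not specify; `to_bits` and `bit_error_rate` are hypothetical helper names.

```python
import numpy as np

def to_bits(m_soft, threshold=0.5):
    """Convert soft decoder outputs in (0, 1) to hard bits (threshold is an assumption)."""
    return (np.asarray(m_soft) >= threshold).astype(int)

def bit_error_rate(m_true, m_soft):
    """Fraction of positions where the thresholded output differs from the sent bits."""
    return float(np.mean(to_bits(m_soft) != np.asarray(m_true)))

m_true = np.array([0, 1, 1, 0, 1, 0, 0, 1])
m_soft = np.array([0.1, 0.8, 0.4, 0.2, 0.9, 0.6, 0.05, 0.7])
ber = bit_error_rate(m_true, m_soft)  # two of eight bits are wrong -> 0.25
```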

[0040] FIGS. 3A-3C show the neural network architecture for the three entities of the learning model. FIG. 3A shows a schematic view of a neural network architecture for an encoder 14 of the system shown in FIG. 2B. FIG. 3B shows a schematic view of a neural network architecture for a decoder 18 of the system shown in FIG. 2B. FIG. 3C shows a schematic view of a neural network architecture for a critic module 22 of the system shown in FIG. 2B. The encoder 14 network accepts a confidential message M, then outputs a complex noise vector. On the other side, the decoder 18 network accepts the complex demodulated signal Ċ.sub.enc and outputs the decoded confidential message Ṁ. The critic network (module 22) accepts either C.sub.mod or N.sub.mod, and outputs the confidence probability P.sub.con.

[0041] The encoder 14 network starts with a fully connected (FC) layer 24 without any activation function. The FC layer 24 performs an initial permutation of the input data and changes the domain of the input data from the bit domain to the real domain to increase the mapping space and avoid singularities. The rest of the network consists of multiple convolutional layers, which extract an optimal feature representation for M. Each convolutional layer is described as Conv(W, d.sub.in, d.sub.out, s), where W is the feature window size, d.sub.in is the input depth of the feature vector, d.sub.out is the depth of the output feature vector, and s is the stride. The last layer is a k-normalization layer that maintains the power constraint on C.sub.enc. Finally, a “real to complex” layer merges the real output vector into a complex noise vector, C.sub.enc∈C.sup.(k/2)×1.

[0042] The decoder 18 network starts with a “complex to real” layer that converts C.sub.mod to a real data vector, followed by an FC layer 24, which acts as a denoising layer to compensate for the noise introduced by the channel between the transmitter and the receiver. The rest of the network consists of multiple convolutional layers that decode the encoded feature representation and obtain Ṁ. The last layer has a sigmoid activation function to restrict the values of Ṁ to the range (0,1). After a successful training process, Ṁ should converge to the bit values. The critic network (module) 22 is similar to the decoder 18 network. However, it differs from the decoder 18 network in having an extra FC layer 26 followed by a sigmoid activation function to output P.sub.con.
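The “real to complex” and “complex to real” layers only repackage values; they have no trainable parameters. The source does not say how the real vector is split into real and imaginary parts, so the sketch below assumes interleaving (even indices become real parts, odd indices become imaginary parts). Both function names are illustrative.

```python
import numpy as np

def real_to_complex(x):
    """Merge a real vector of even length k into k/2 complex values (interleaving assumed)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("input length must be even")
    return x[0::2] + 1j * x[1::2]

def complex_to_real(z):
    """Inverse of real_to_complex: k/2 complex values back to k real values."""
    z = np.asarray(z, dtype=complex)
    out = np.empty(2 * z.size)
    out[0::2] = z.real
    out[1::2] = z.imag
    return out

k = 48
x = np.arange(k, dtype=float)
z = real_to_complex(x)            # shape (24,), first entry 0+1j
x_back = complex_to_real(z)       # round trip returns the original vector
```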

[0043] The k-normalization layer is designed to constrain the power level of C.sub.enc to mimic a given hardware impairment σ.sup.2.sub.HW. A generic design is provided for the k-normalization layer such that it accepts σ.sup.2.sub.HW as an input and can generate different levels of hardware noise as required by the system. Thus, the k-normalization layer is formulated as:

[00001] $y_{i} = \sqrt{\frac{k \sigma_{HW}^{2}}{2}} \times \frac{x_{i}}{\sum_{l = 1}^{k} x_{l}^{2}}$

where x.sub.i and y.sub.i are the elements of the input vector X and output vector Y respectively.
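A small numerical sketch of a power-normalization layer of this kind is given below. One assumption is made loudly: the formula as printed divides by the plain sum of squares, whereas this sketch divides by its square root (the Euclidean norm of the input), because that is the choice under which the k/2 complex output symbols have average power exactly σ.sup.2.sub.HW. The function name `k_normalize` and the interleaved real-to-complex packing are also assumptions.

```python
import numpy as np

def k_normalize(x, sigma2_hw):
    """Scale a real vector of length k so that its total energy equals k * sigma2_hw / 2.

    Assumption: the denominator is the Euclidean norm sqrt(sum(x**2)),
    not the plain sum of squares shown in the printed formula.
    """
    x = np.asarray(x, dtype=float)
    k = x.size
    norm = np.sqrt(np.sum(x ** 2))
    return np.sqrt(k * sigma2_hw / 2.0) * x / norm

rng = np.random.default_rng(1)
x = rng.normal(size=48)                 # k = 48 real values from the last conv layer
y = k_normalize(x, sigma2_hw=0.005)     # sigma2_hw passed in as a layer input
c_enc = y[0::2] + 1j * y[1::2]          # pack into k/2 = 24 complex symbols
avg_power = np.mean(np.abs(c_enc) ** 2) # equals sigma2_hw under the assumption above
```

Passing the hardware noise power as an argument, rather than fixing it, mirrors the stated design goal of generating different noise levels with the same layer.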

[0044] In this embodiment, the encoder 14 encodes a secret message M to produce a noise vector C.sub.enc and modulates it over a cover signal to produce a covert signal C.sub.mod. The main goal of the encoder 14 is to create a C.sub.mod that looks like a distorted modulated signal for a defined modulation order m. Moreover, C.sub.enc should have the same statistical properties as the hardware noise impairments of the transmitter (e.g., transmitting radio 10). The decoder 18 knows the encoding process, so it can recover the message. On the other hand, the critic network (module 22) measures the statistical properties of either C.sub.mod or N.sub.mod to determine whether the input signal has been altered. In a learning-based model, the encoder 14, decoder 18, and critic module 22 can all be configured as neural networks. The encoder 14 network is trained to encode a secret message M to generate covert signals C.sub.mod such that only the decoder can recover M, and the critic network (module 22) cannot do better than random guessing between C.sub.mod and N.sub.mod.

[0045] One can define E(θ.sub.E, M), D(θ.sub.D, C.sub.mod), C(θ.sub.C, C.sub.mod, N.sub.mod) as the mapping functions of the encoder 14, decoder 18 and critic module 22 respectively. Moreover, d(M, Ṁ) is defined as the L2 norm between M and Ṁ. Intuitively, the loss function of the decoder 18 can be formulated as:

[00002] $L_{D}(\theta_{E}, \theta_{D}, M) = E_{M}\{ d(M, \dot{M}) \} = E_{M}\{ d(M, D(\theta_{D}, E(\theta_{E}, M))) \}$

[0046] where E.sub.M{·} is the expected value over the data set of messages M. As shown, both the encoder 14 and the decoder 18 try to optimize their parameters to achieve communication reliability by minimizing the mean error between the original message M and the predicted message Ṁ. The critic network (module 22) has the same role as the discriminator in GANs [11]. Thus, the critic's loss function can be expressed as:

[00003] $L_{C}(\theta_{C}, \theta_{E}, M, N_{mod}) = E_{M}\{ -\log(C(\theta_{C}, N_{mod})) - \log(1 - C(\theta_{C}, C_{mod})) \} = E_{M}\{ -\log(C(\theta_{C}, N_{mod})) - \log(1 - C(\theta_{C}, E(\theta_{E}, M))) \}$

[0047] L.sub.C (θ.sub.C, θ.sub.E, M, N.sub.mod) represents the binary cross entropy loss between the distorted cover N.sub.mod and the covert C.sub.mod, which depends on θ.sub.C and θ.sub.E. In this model, one can achieve the most adversarial case for the critic network (module 22) by optimizing θ.sub.C using the above loss, which accepts the output of the encoder 14 network as an input. Thus, during the training process, one can freeze θ.sub.E while updating θ.sub.C to ensure that the critic network can make an informed decision when distinguishing between N.sub.mod and C.sub.mod. As mentioned in the above discussion, the encoder should generate a covert message C.sub.mod that has the same statistical properties as N.sub.mod; however, the loss function presented is not normally sufficient to accomplish this property. So, a joint objective function is defined between the encoder 14 and the decoder 18 so that they can defeat the critic network (module 22) by learning an optimal transmission scheme such that the critic reaches maximum uncertainty between N.sub.mod and C.sub.mod, and only the decoder 18 can recover the message. This loss function L.sub.E,D can be expressed as:

$L_{E,D}(\theta_{E}, \theta_{D}, \theta_{C}, M) = L_{D}(\theta_{E}, \theta_{D}, M) + L_{C}(\theta_{C}, \theta_{E}, M, N_{mod})$

[0048] Here, the first term maintains the communication reliability between the encoder and the decoder, while the second term guarantees that the generated covert C.sub.mod has the same statistical properties as the distorted cover signal N.sub.mod. Mirroring the critic update, both the encoder 14 and the decoder 18 update their parameters (i.e., θ.sub.E and θ.sub.D) based on L.sub.E,D (θ.sub.E, θ.sub.D, θ.sub.C, M) while the parameters of the critic module 22 are frozen.
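The two building blocks of these objectives, the reliability loss L.sub.D and the binary cross entropy loss L.sub.C, can be evaluated on toy values as below. This is a NumPy sketch rather than the trained TensorFlow model: the critic outputs are supplied directly as probabilities, the expectation is replaced by a batch mean, and the function names and the clipping constant `eps` are assumptions.

```python
import numpy as np

def decoder_loss(m, m_hat):
    """L_D: batch mean of the L2 distance between each message and its estimate."""
    return float(np.mean(np.linalg.norm(m - m_hat, axis=1)))

def critic_loss(p_cover, p_covert, eps=1e-12):
    """L_C: binary cross entropy with distorted cover labelled 1 and covert labelled 0.

    p_cover  : critic outputs C(theta_C, N_mod) for distorted cover samples
    p_covert : critic outputs C(theta_C, C_mod) for covert samples
    """
    p_cover = np.clip(p_cover, eps, 1.0 - eps)
    p_covert = np.clip(p_covert, eps, 1.0 - eps)
    return float(np.mean(-np.log(p_cover) - np.log(1.0 - p_covert)))

m = np.array([[0, 1, 1, 0], [1, 1, 0, 0]], dtype=float)
m_hat = np.array([[0.1, 0.9, 0.8, 0.2], [0.9, 0.7, 0.1, 0.0]])
l_d = decoder_loss(m, m_hat)

# A critic that cannot tell the two apart outputs 0.5 everywhere: loss = log 4.
l_c_undecided = critic_loss(np.full(2, 0.5), np.full(2, 0.5))
# A confident, correct critic achieves a much smaller loss.
l_c_confident = critic_loss(np.full(2, 0.99), np.full(2, 0.01))
```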

[0049] For the steganography system requirements, one can define I(X;Y) as the mutual information between X and Y. In addition, define D.sub.KL (P∥Q) as the KL divergence between P and Q distributions.

[0050] For a fixed cover distribution P.sub.N(n) and message distribution P.sub.M(m), a steganography system having encoding and decoding functions (ε, D) is perfectly secure if

$I(M; \dot{M}) > 0$, and $D_{KL}(P_{N}(n) \| P_{C_{enc}/M}(c_{enc}/m)) = 0$

[0051] The first condition ensures communication reliability between the transmitter (such as transmitter radio 10) and the receiver (such as receiver radio 12) (i.e., a useful steganography system), while the second guarantees that the critic function (module 22) cannot distinguish between the cover and covert messages. From the previous definition, I(M;Ṁ) is given by:

$I(M; \dot{M}) = H(M) - H(M / \dot{M})$

where H(·) is the binary entropy function. The first goal of the steganography system is maximizing I(M;Ṁ). The conditional entropy H(M/Ṁ) depends on the probability density function P(M/Ṁ), which is given by:

[00004] $P(M / \dot{M}) = \frac{P(\dot{M} / M) P(M)}{P(\dot{M})}$

[0052] Assuming the symbols of M are uniformly distributed, we can use the likelihood approximation (i.e., P(M/Ṁ)≃P(Ṁ/M)). Since Ṁ=D(ε(M)), P(Ṁ/M) can be assumed to be a normal distribution with mean M and maximum acceptable variance (error) e (i.e., $P(\dot{M}/M) \sim \mathcal{N}(M, e)$).
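Under this Gaussian model, the log-likelihood of a message set of L symbols can be written out explicitly, which shows why maximizing the likelihood amounts to minimizing the squared error. The derivation below additionally assumes that the symbols are independent, which the source does not state; note that e denotes the variance here, not Euler's number.

```latex
\begin{aligned}
\log \prod_{i=1}^{L} P(\dot{M}_i / M_i)
  &= \sum_{i=1}^{L} \log\!\left[ \frac{1}{\sqrt{2\pi e}}
     \exp\!\left( -\frac{(M_i - \dot{M}_i)^2}{2e} \right) \right] \\
  &= -\frac{L}{2}\log(2\pi e) \;-\; \frac{1}{2e} \sum_{i=1}^{L} (M_i - \dot{M}_i)^2 .
\end{aligned}
```

The first term does not depend on the network parameters, so the likelihood is largest exactly when the sum of squared differences between M.sub.i and Ṁ.sub.i is smallest.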

[0053] Consequently

[00005] $\max I(M; \dot{M}) \equiv \max P(\dot{M} / M) \equiv \min \frac{1}{L} \sum_{i = 1}^{L} (M_{i} - \dot{M}_{i})^{2}$

[0054] where L is the total number of symbols in the message set. Maximizing I(M;Ṁ) is equivalent to minimizing the mean square error between M and Ṁ. Accordingly, the learning model satisfies the secrecy condition (i.e., $D_{KL}(P_{N}(n) \| P_{C_{enc}/M}(c_{enc}/m)) = 0$).

[0055] As stated earlier, the encoder 14 and the critic network (module 22) act as the generator and the discriminator in a typical GAN architecture. Thus, an optimal critic network C* can be derived as:

[00006] $C^{*} = \frac{P_{N} (n)}{P_{N} (n) + P_{C_{enc} / M} (c_{enc} / m)}$

[0056] Moreover, the optimal encoder ε* can be obtained from:

$ε^{*} = \min \{ -\log 4 + 2\, JSD(P_{C_{enc}/M}(c_{enc}/m) \| P_{N}(n)) \}$

where JSD(P∥Q) is the Jensen-Shannon divergence between P and Q distributions. Thus, one obtains ε* if:

$JSD(P_{C_{enc}/M}(c_{enc}/m) \| P_{N}(n)) = 0$.

[0057] Consequently:

[00007] $JSD(P_{C_{enc}/M}(c_{enc}/m) \,\|\, P_{N}(n)) = 0 \equiv D_{KL}(P_{N}(n) \,\|\, P_{C_{enc}/M}(c_{enc}/m)) = 0 \equiv C^{*} = \frac{1}{2}$

[0058] Therefore, the steganography system is perfectly secure if the output of the critic network (module 22) equals ½, which means that the critic cannot distinguish between the cover and covert messages, i.e.:

$P_{N}(n) \simeq P_{C_{enc}/M}(c_{enc}/m)$

[0059] Experimental verification of the above was performed with the TensorFlow framework. The input length is k=48, and the maximum relative constellation error E.sub.rms takes values similar to those of the 64-point FFT in the OFDM PHY of the WiFi standard, such that the scheme can be used over an OFDM signal. Two training sets are constructed, one for the secrets M and one for the distorted cover messages N.sub.mod. Each training set consists of 20000 symbols, and each symbol is of size k. The cover signal is embodied as a modulated QPSK signal (i.e., m=2). The batch size is 8000. An optimizer with a learning rate of 0.001 is used to optimize the three networks included in the learning model. The number of training epochs is 8000. Within each epoch, the three networks are trained in two alternating steps: first, the parameters of the critic network (Module 22) are updated while the parameters of both the encoder 14 and the decoder 18 are frozen; then, the parameters of both the encoder 14 and the decoder 18 are updated jointly while the parameters of the critic network (module 22) are frozen. The channel's training signal-to-noise ratio (SNR.sub.t) equals 17 dB. For the testing phase, a testing set consisting of 1000 symbols for M and N.sub.mod was used, and a range for SNR.sub.ch was defined from 0 to 40 dB.
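
The alternating update schedule described above can be sketched with a deliberately small toy model. This is not the TensorFlow model of the experiment: it uses NumPy, a one-parameter stand-in "encoder" (a scale s applied to random input), a logistic-regression "critic" with parameters w and b, and no decoder. All names, the energy feature, and the hyperparameters are assumptions made only to show which parameters are frozen in each step.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def critic_step(params, noise, z, lr):
    """Update only the critic (w, b); the encoder scale s is frozen."""
    w, b, s = params["w"], params["b"], params["s"]
    f_real, f_fake = noise ** 2, (s * z) ** 2     # toy feature: sample energy
    p_real, p_fake = sigmoid(w * f_real + b), sigmoid(w * f_fake + b)
    # Gradients of -mean(log p_real) - mean(log(1 - p_fake)).
    grad_w = np.mean(-(1 - p_real) * f_real) + np.mean(p_fake * f_fake)
    grad_b = np.mean(-(1 - p_real)) + np.mean(p_fake)
    return {"w": w - lr * grad_w, "b": b - lr * grad_b, "s": s}

def encoder_step(params, z, lr):
    """Update only the encoder scale s; the critic (w, b) is frozen."""
    w, b, s = params["w"], params["b"], params["s"]
    p_fake = sigmoid(w * (s * z) ** 2 + b)
    # Gradient of -mean(log p_fake) with respect to s.
    grad_s = np.mean(-(1 - p_fake) * w * 2 * s * z ** 2)
    return {"w": w, "b": b, "s": s - lr * grad_s}

rng = np.random.default_rng(0)
params = {"w": 0.0, "b": 0.0, "s": 3.0}           # hypothetical starting point
for epoch in range(200):
    noise = rng.standard_normal(256)              # stand-in for hardware noise samples
    z = rng.standard_normal(256)                  # stand-in for encoder input
    params = critic_step(params, noise, z, lr=0.01)   # step 1: critic only
    params = encoder_step(params, z, lr=0.01)         # step 2: encoder only
```

In the system described here the second step would update the encoder 14 and the decoder 18 jointly, with a reliability term added to the loss; the toy keeps only the adversarial term to stay short.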

[0060] FIGS. 4A-4D show the results of a successful training process at SNR.sub.t with E.sub.rms equal to −23 dB. FIG. 4A shows a graph representing the loss functions for QPSK at the training SNR, for both the critic network (Module 22) and the encoder 14-decoder 18 pair. At the beginning of the training session, both the encoder 14 and the decoder 18 try to achieve only communication reliability (i.e., minimizing the error between M and M̂). Consequently, the critic (module 22) loss increases, which means that the critic module 22 can distinguish between C.sub.mod and N.sub.mod. After some time, both the encoder 14 and the decoder 18 succeed in finding a pattern that achieves both communication reliability and signal hiding capability and defeats the critic module 22. Consequently, the loss of the critic network (module 22) decreases until the critic network reaches maximum uncertainty, such that it cannot distinguish between C.sub.mod and N.sub.mod.

[0061] FIG. 4B shows a graph representing the generated noise signal constellation for QPSK at the training SNR. FIG. 4B shows that the generated encoded noise C.sub.enc constellation at the end of the training process takes the shape of a circular Gaussian distribution with zero mean and a variance below E.sub.rms. This result indicates that the encoder 14 succeeds in encoding M to have a distribution similar to that of the hardware impairments noise vector N.

[0062] FIG. 4C shows a graph representing the transmitted covert signal constellation for QPSK at the training SNR, and FIG. 4D shows a graph representing the received covert signal constellation for QPSK at the training SNR. Note that the constellation in FIG. 4C looks like a distorted QPSK signal, and the added encoded noise C.sub.enc has the same statistical properties as the hardware impairments (i.e., C.sub.enc˜𝒞𝒩(0,σ.sup.2.sub.ch)). The received covert signal constellation looks like a standard modulated QPSK signal with added noise, such that a receiver cannot infer that the signal has been altered and embeds a secret message M.
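
The notion of a QPSK constellation distorted within an E.sub.rms budget can be illustrated by measuring the RMS constellation error of a noisy signal. The following is a minimal sketch under stated assumptions: circular complex Gaussian noise stands in for the learned C.sub.enc, its power is set equal to E.sub.rms=−23 dB relative to a unit-power constellation, and the error is measured as error power over ideal constellation power. The function name and array shapes are hypothetical.

```python
import numpy as np

# Unit-average-power QPSK points.
QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)

def rms_constellation_error_db(received, ideal):
    """Error power relative to ideal constellation power, in dB."""
    err = np.mean(np.abs(received - ideal) ** 2)
    ref = np.mean(np.abs(ideal) ** 2)
    return float(10 * np.log10(err / ref))

rng = np.random.default_rng(0)
k, n_symbols, e_rms_db = 48, 20000, -23.0        # sizes and budget quoted in the text
ideal = QPSK[rng.integers(0, 4, size=(n_symbols, k))]

noise_var = 10 ** (e_rms_db / 10)                # total complex noise power (assumption)
noise = np.sqrt(noise_var / 2) * (
    rng.standard_normal(ideal.shape) + 1j * rng.standard_normal(ideal.shape)
)
distorted = ideal + noise                        # QPSK plus stand-in impairment noise

measured_db = rms_constellation_error_db(distorted, ideal)   # close to -23 dB
```

A covert signal whose added component stays at or below this measured level would, by this metric alone, look like an ordinary transmitter operating within its allowed constellation error.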

[0063] FIGS. 5A-5B show the learned C.sub.enc distribution, and the score for both the cover N.sub.mod and the covert C.sub.mod. FIG. 5A shows a graph representing the learned distribution of the system shown in FIG. 2B. FIG. 5A shows that C.sub.enc's distribution converges to a normal distribution with zero mean and variance less than E.sub.rms=−23 dB.

[0064] FIG. 5B shows a graph representing the confidence probability generated by the critic module for the system shown in FIG. 2B. FIG. 5B shows the confidence probability P.sub.con for both N.sub.mod and C.sub.mod during the training process. At the beginning of the training process, the critic network can distinguish between the distorted cover and the covert message, since they receive different values of P.sub.con. At the end of the training process, P.sub.con for both C.sub.mod and N.sub.mod equals 0.5, reaching the optimal secrecy condition. Hence, the encoder 14 succeeded in generating a covert signal C.sub.mod such that the critic network (module 22) cannot distinguish between the distorted cover and the covert message.

[0065] FIGS. 6A-6B show the bit error rate curves and the region of operation for different values of E.sub.rms. The SNR reported here is measured at the receiver, which aggregates the effect of E.sub.rms and SNR.sub.ch. FIG. 6A shows a graph representing the bit error rate (BER) for the system shown in FIG. 2B. FIG. 6A shows that as E.sub.rms increases, the information hiding capability increases, and the BER decreases. However, as E.sub.rms increases, it has a higher effect on received signal quality compared to the channel's distortion SNR.sub.ch. In other words, the communication channel between the encoder 14 and the decoder 18 requires a high SNR for the covert signal to be decoded without any error. This observation can be seen directly in FIG. 5B.
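
A bit error rate curve over a range of channel SNR values can be produced with a simple sweep. The following is a minimal sketch that does not reproduce FIG. 6A: an antipodal mapping with a threshold decoder over real additive white Gaussian noise stands in for the learned encoder 14 and decoder 18, and the function names, bit count, and SNR grid are assumptions.

```python
import numpy as np

def bit_error_rate(bits, decoded_bits):
    """Fraction of positions where the decoded bit differs from the sent bit."""
    bits, decoded_bits = np.asarray(bits), np.asarray(decoded_bits)
    return float(np.mean(bits != decoded_bits))

def simulate_ber(snr_db, n_bits=48000, seed=0):
    """Toy link: bits -> +/-1, real AWGN at the given SNR, threshold decoder."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, n_bits)
    tx = 2.0 * bits - 1.0                          # unit-power antipodal symbols
    noise_std = np.sqrt(10 ** (-snr_db / 10))      # noise power = 1 / SNR
    rx = tx + noise_std * rng.standard_normal(n_bits)
    return bit_error_rate(bits, (rx > 0).astype(int))

snr_grid_db = np.arange(0, 41, 5)                  # 0 to 40 dB, as in the test range
ber_curve = [simulate_ber(s) for s in snr_grid_db]
```

With the trained networks, the same sweep would be run by passing the test set through the encoder, the channel at each SNR.sub.ch value, and the decoder, then comparing the decoded bits with M.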

[0066] FIG. 6B shows a graph representing the corresponding practical region of operation for the system, in which the decoder can decode M correctly. Since each bit of M is mapped to one real value, and two such values are combined into one complex value of C.sub.enc, the effective number of covert data bits that can be transmitted per IEEE 802.11a/g OFDM symbol with 48 data subcarriers is 96. Thus, an effective throughput of 12 Mbps can be achieved for covert communication at a received SNR of >12 dB.
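
The count of 96 covert bits per OFDM symbol follows from pairing real values into complex samples, one per data subcarrier. The following is a minimal sketch of that bookkeeping only; the bit-to-real mapping shown is a placeholder and not the learned encoder, and the function names and scale factor are hypothetical.

```python
import numpy as np

N_DATA_SUBCARRIERS = 48   # data subcarriers per IEEE 802.11a/g OFDM symbol (from the text)

def pack_real_pairs(real_values):
    """Combine consecutive real values into complex samples (I + jQ)."""
    real_values = np.asarray(real_values, float)
    if real_values.size % 2:
        raise ValueError("need an even number of real values")
    pairs = real_values.reshape(-1, 2)
    return pairs[:, 0] + 1j * pairs[:, 1]

def unpack_real_pairs(samples):
    """Inverse of pack_real_pairs: interleave real and imaginary parts."""
    samples = np.asarray(samples)
    return np.column_stack([samples.real, samples.imag]).reshape(-1)

bits = np.random.default_rng(0).integers(0, 2, 2 * N_DATA_SUBCARRIERS)   # 96 bits
real_values = 0.05 * (2.0 * bits - 1.0)   # placeholder mapping, NOT the learned encoder
c_enc = pack_real_pairs(real_values)      # 48 complex values, one per data subcarrier
```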

[0067] The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of one or more aspects of the invention and the practical application, and to enable others of ordinary skill in the art to understand one or more aspects of the invention for various embodiments with various modifications as are suited to the particular use contemplated.