DEVICE AND METHOD FOR COMPRESSING A DATA STREAM
20190319642 ยท 2019-10-17
Inventors
Cpc classification
H03M7/30
ELECTRICITY
H03M7/55
ELECTRICITY
H03M13/156
ELECTRICITY
International classification
H03M13/15
ELECTRICITY
H03M7/30
ELECTRICITY
Abstract
We provide a method of compressing a data stream for transmission, including: generating a data sequence representing a received data stream, generating a plurality of data substreams, each comprising a portion of the data sequence, identifying a formal concept defining a dependency between a first one of the data substreams and one or more further ones of the data sub streams that are dependent on the first data substream, removing those dependent data sub streams from the plurality of data sub streams, and transmitting the remaining data sub streams, and a method of reconstructing a data stream at a receiver, including: receiving a received data sequence representing a received data stream, identifying that a substream has been removed from the data stream prior to transmission, identifying a formal concept definition for regenerating the removed substream based on an identified substream of the received data sequence, regenerating a data substream using the formal concept definition and the identified sub stream of the received data sequence, and adding the regenerated data substream to the received data sequence.
Claims
1. A method of compressing a data stream for transmission, including: generating a data sequence representing a received data stream, generating a plurality of data sub streams, each comprising a portion of the data sequence, identifying a formal concept defining a dependency between a first one of the data sub streams and one or more further ones of the data sub streams that are dependent on the first data sub stream, removing those dependent data sub streams from the plurality of data sub streams, transmitting the remaining data sub streams.
2. A method according to claim 1, wherein the step of identifying a formal concept includes identifying a plurality of formal concepts, each defining a dependency between a first one of the data sub streams and one or more of the further ones of the data sub streams that are dependent on the first data sub stream.
3. A method according to claim 1, further including the step of transmitting data representative of the formal concept for use in reconstructing the removed data substreams at the receiver.
4. A method according to claim 1 wherein the data stream is an EEG data stream, comprising data from an electroencephalogram.
5. A method according to claim 1 further including transforming the data stream using a Fast Fourier Transform or its inverse to convert the data stream from the time domain to the frequency domain.
6. A method according to claim 5 further including applying a threshold to the data stream such that values less than 6 are rounded to zero.
7. A method according to claim 6 further including a step of further compressing the data stream by removing occurrences of sequential consecutive zero values in the data stream and replacing those removed sequences of zeros with data representing the length of the sequence of consecutive zeros removed.
8. A system including a transmitter and a processor, wherein the processor is configured to: generate a data sequence representing a received data stream, generate a plurality of data substreams, each comprising a portion of the data sequence, identify a formal concept defining a dependency between a first one of the data substreams and one or more further ones of the data substreams that are dependent on the first data substream, and remove those dependent data sub streams from the plurality of data sub streams, wherein the transmitter is configured to transmit the remaining data sub streams.
9. A method of reconstructing a data stream at a receiver, including: receiving a received data sequence representing a received data stream, identifying that a substream has been removed from the data stream prior to transmission, identifying a formal concept definition for regenerating the removed substream based on an identified substream of the received data sequence, regenerating a data substream using the formal concept definition and the identified substream of the received data sequence, and adding the regenerated data substream to the received data sequence.
10. A method according to claim 9, wherein the step of identifying a formal concept includes identifying a plurality of formal concepts, each defining a dependency between a first one of the data sub streams and one or more of the further ones of the data sub streams that are dependent on the first data sub stream.
11. A method according to claim 9, wherein identifying the formal concept includes receiving data representative of the formal concept.
12. A method according to claim 9, wherein identifying the formal concept includes accessing data stored at the receiver representative of the formal concept.
13. A method according to claim 9, further including transforming the data stream using a Fast Fourier Transform or its inverse to convert the data stream from the frequency domain to the time domain.
14. A method according to claim 9, further including identifying, in the received data sequence, data representing the length of a sequence of zeros removed from the data stream prior to transmission, and inserting a sequence of zeros of the identified length in the received data sequence.
15. A system including a receiver and a processor, wherein the receiver is configured to receive a data stream, and the processor is configured to: generate a data sequence representing the data stream, identify that a substream has been removed from the data stream prior to transmission, identify a formal concept definition associated with regenerating the removed substream based on an identified substream of the received data sequence, regenerate a data substream using the formal concept definition and the identified substream of the received data sequence, and add the regenerated data substream to the received data sequence.
Description
[0044] We now describe features of embodiments of the invention, by way of example only, with reference to the accompanying drawings of which
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
[0054]
[0055]
[0056]
[0057]
[0058]
[0059]
[0060] With reference to the drawings we describe the methods and devices involved.
[0061] The devices of the present invention are configured to operate broadly as outlined in
[0062] While the techniques described are suitable for EEG data, it should be appreciated that the techniques may be applied to other data sources and data types.
Sampling
[0063] Let the original continuous-time electroencephalography (EEG) waveform s(t) have a duration of T seconds. The waveform is sampled at a constant interval of T.sub.s seconds to yield Ns discrete-time consecutive samples. The sampling frequency is then given as F.sub.s=1/T.sub.sHz.
[0064] Hence,
s.sub.n=s(t)(tnT.sub.s)=s(nT.sub.s)(1)
[0065] for n{0, 1, . . . , N.sub.s1}, where () is the Dirac delta function. Our adopted notations are set out in Table II.
TABLE-US-00001 TABLE II SUMMARY OF USED NOTATIONS. Notation Definition T EEG waveform duration N.sub.s Number of samples F.sub.s = N.sub.s/T Hz Sampling Frequency T.sub.s Inter-sample duration L Number of bits per sample M Number of symbols per sample K = L/M Number of bits per symbol [0066] Quantization: The continuous amplitude of each sampled signal is quantized using an L-bit analog-to-digital converter (ADC) to one of 2.sup.L levels, yielding the quantized signal {tilde over (x)}.sub.n time index n. Each signal {tilde over (x)}.sub.n holds a signed integer value in range {2.sup.L-1, . . . , 2.sup.L-11}. We can express the quantized signal in vector form as
x.sub.n={tilde over (x)}.sub.n+2.sup.L-1.(3) [0071] Then the symbols are forwarded to the modulation and IFFT modules. Details on am TBC and FFT vector reconstruction blocks are discussed in the following section.
[0072]
[0073] The physical layer's characteristics are leveraged to decompose quantized EEG samples into multiple streams of symbols, such that the dependency between different streams is reduced, and hence, compressibility is improved. In general terms, the signal is split (i.e. decomposed) into multiple streams. The streams are then analysed to discover dependencies and similarities between the stream content, and this information is then used to limit or reduce the data for transmission. In more detail, the steps involved are as follows.
[0074] In broad terms, the method involves generating a data sequence representing a received data stream. In other words, a stream of symbols is generated as a representation of the data stream that is received from a measuring device such as an EEG headset, for example, or from another data source. The method then involves generating a plurality of data substreams, each comprising a portion of the data sequence. These substreams can then be analysed to determine whether any formal concept exists, linking one or more of the substreams. One or more formal concepts are identifyied, each defining a dependency between a first one of the data substreams and one or more further ones of the data substreams that are dependent on the first data substream. To compress the data for transmission, the redundant dependent substreams can be removed, since they can be recreated at the receiver based on knowledge of the formal concept, and the substream to which they are linked by the formal concept.
[0075] To this end, once an applicable formal concept has been identified, the transmitter may include data representative of that formal concept. This may involve either describing the formal concept, or otherwise allowing the receiver to identify it (for example, by reference to a stored record of formal concepts).
[0076] At the receiver side, the receiver identifies that a substream has been removed from the data stream prior to transmission. This may occur based on identification of a symbol or token identifying the removal of a substream at a point in the data, or may identify a formal concept directly and/or provide a definition of that formal concept. This enables the receiver to identify a specific formal concept and a substream of the data to which the formal concept should be applied to recreate the removed substream. In response to identifying this information, the receiver may regenerate a data substream using the formal concept definition and the identified substream of the received data sequence, and add the regenerated data substream to the received data sequence. [0077] A. Data Decomposition [0078] We first decompose the EEG signal x.sub.n into multiple streams of symbols x.sup.m, for m{1, 2, . . . , M}. Let the binary encoded sequence of x.sub.n be denoted as b.sub.2.sub.
.sub.p being the Galois Field of order p. Hence, b is a sequence of L bits on the form
B. Knowledge Discovery
[0082] We leverage the symbol streams that are created, and the compression ratio can be further increased by discovering the correlation between different streams. In summary, using Formal Concept Analysis (FCA) for knowledge discovery, we select the minimal-representative streams so as to minimise the number of transmitted data streams without losing knowledge. [0083] We start by introducing the basic notions used to induce a binary relation between the generated streams. Let be the set of streams (i.e., objects).
the set of symbols' values (i.e., attributes), and I the binary relation on the universe
6=
A that defines which objects have which attributes. In order to transform our streams into formal context of (
, A, I), we consider the attributes .sub.v of each symbol s to be all the possible values it may take, depending on the employed modulation, for v{0, 1, . . . , 2.sup.K1}, and .sub.v{0, 1}. Thus, the vector of attributes A for each stream x.sup.m is defined as
[0085] The aim is to obtain the dependency between different streams through finding the minimal set of formal concepts covering our relation. (O,A) is a formal concept if A is the set of all attributes shared by the objects O, and in the same time O is the set of all objects that have all attributes in A.
[0086] We refer to the implications as the minimal set of rules, by which we can infer some attributes from others. We can derive formal concepts from our formal context using the derivation operators or difunctional decomposition. Difunctional decomposition enables obtaining the isolated points of a binary relation through calculating the Fringe Relation. This fringe relation is, by definition, a difunctional relation, and all its elements are isolated points. Thus, the formal concepts can be easily obtained by finding such isolated points, since if (a; b) is an isolated point, by definition it is included in one concept only.
[0087] Once the formal concepts are derived, implications can be identified, hence transmitting only the minimal-representative number of streams. For the sake of clarity, we describe the adopted procedure by referring to a toy example where a data length of 20 samples with QPSK modulation is considered.
Step 1: Generation of Formal Context.
[0088] Consider the generated streams of symbols. We consider each stream as an object with attributes corresponding to its symbols' values. As an example,
Step 2: Identifying Formal Concepts.
[0089] The generated binary relation are then decomposed into a set of concepts, using the algorithm presented by R. Khcherif, M. M. Gammoudi, and A. Jaoua, Using difunctional relations in information organization, Information Sciences 125, pp. 153-166, 2000, for example. However, in order to identify the dependency between different streams, we leverage a concept referred to as shadow concept: considering not only the attributes for which the relation I is equal to 1, but also the negation of the attributes, i.e., the attributes values for which the relation is equal to 0. In this case, both the attributes and the negation of the attributes form the identified concept.
Step 3: From Concepts to Implications.
[0090] Based on the identified concepts, we derive the implications that can be used to effectively eliminate the streams that can be retrieved at the receiver using their implications with other received streams. For instance, looking at
Step 4: Elimination.
[0091] For each obtained concept, we transmit only one stream and eliminate other streams that belong to the same concept. Then, the retrieval process is carried out at the receiver using the identified implications.
A. EEG Signal Characteristics
[0092] We first visualize and analyze the EEG signal in the time and frequency domains in order to understand its properties and obtain the best approach of processing and transmission. A normal continuous EEG signal in the time domain is shown in
[0093] Looking at the generated spectrum shown in
B. Threshold-Based Compression
[0094] Motivated by the EEG signal characteristics in the frequency domain, we update the OFDM transceiver architecture at the physical layer to support our compression scheme. Unlike the prior art compression techniques that are applied at the higher layers, we convey our compression scheme into the physical layer exploiting the existing OFDM transceiver's components in order to perform efficient compression without adding much complexity.
[0095] As mentioned, given the basic OFDM transceiver architecture in
[0096] So, in general terms, the compression method involves transforming the data stream using a Fast Fourier Transform or its inverse to convert the data stream from the time domain to the frequency domain. In the frequency domain, as discussed above, a large part of the data stream is likely to consist of low valuesapproaching zero. Therefore, it is possible to apply a threshold to the data stream such that values less than are rounded to zero, without losing a significant portion of the data content.
[0097] Subsequently, the stream may be further compressed by removing occurrences of sequential consecutive zero values in the data stream and replacing those removed sequences of zeros with data representing the length of the sequence of consecutive zeros removed.
[0098] At the receiver side, the FFT vector reconstruction block is responsible for adding zeros in the received vector at the positions of the ignored symbols before forwarding it to the FFT block. The latter will then demodulate the received symbols and reconstruct the EEG signal.
C. Error Correction
[0099] In order to quantify the achieved compression gain compared to the consequent signal distortion due to our compression scheme, we define the compression ratio as
[0102] Interestingly, using our EEG compression transceiver we can easily define some of the wrong reconstructed samples at the receiver side. As shown in
(i) identifying received samples with relatively large amplitude (samples with error), (ii) retransmitting the reconstructed samples with error.
[0103] Despite the achieved compression ratio using TBC, it has been found that it is of prominent importance to further analyse the effect of symbol mapping and modulation on EEG signal characteristics in order to enhance the compression ratio. As noted from
[0104] This is mainly due to the effect of symbol mapping and modulation, since representing each data sample with multiple symbols turns the generated symbols after IFFT to be less compressible, i.e., most of the generated symbols after IFFT will have large magnitudes and therefore cannot be neglected.
D. Higher-Order Modulation
[0105] To tackle the problem of symbol mapping effect on EEG sparsity and increase compression efficiency of our transceiver, we study the characteristics of generated symbols after Fourier transform with and without symbol mapping and modulation (see
[0106] However, as shown in
[0107] This masking is based on our prior knowledge about the EEG characteristics in the frequency domain. We define a window size W which is the percentage of compressible symbols relative to the total number of symbols. Using this masking, we define the less important symbols of x.sub.f to be passed by the TBC scheme, while isolating more important symbols from compression (see
[0108] For the hardware implementation complexity, we remark that the proposed threshold-based compression results in adding few numbers of real valued operations compared to multicarrier modulations techniques considered for 5G (e.g., filtered orthogonal frequency division multiplexing (fOFDM), filter bank multicarrier (FBMC), and cyclic convolution based FBMC). These adopted modulations techniques result in increasing the computational complexity compared to conventional OFDM.
E. Stream-Based Compression
[0109] Due to the quality of wireless channel, hardware design, or standards limitations, leveraging higher-order modulation may not be recommended in all cases. Thus, in order to make our transceiver adaptive for different channel conditions and modulation schemes, we propose a Stream-Based Compression (SBC) scheme. Leveraging the generated symbol streams in Section IV, the compression ratio can be further increased as follows. The independent streams of symbols are forwarded to the modulation and IFFT blocks, thus at TBC block, we can deal with each stream separately using different values of the threshold. This, as also shown in the simulation results section, yields a greater overall compression ratio.
[0110] For instance, using QPSK modulation and L=12 bits, we will generate 6 streams of symbols. The symbols in each stream will have different values before modulation (see
[0111] We note that discovering the dependency between different streams and selecting only the independent streams is performed before IFFT (i.e., it pertains to the higher layers of the transceiver architecture, while only the threshold-based compression is done after IFFT, i.e., in the physical layers of the transceiver. Thus, to summarize, the main steps of the SBC scheme are as follows (see
[0114] While at the receiver side, the inverse process is adopted through: (i) using FFT vector reconstruction, which is responsible for adding zeros in the received vector at the positions of the compressed symbols before forwarding it to the FFT, and (ii) leveraging obtained dependency between different streams to retrieve discarded streams from transmission.
Simulation Results
[0115] In order to derive simulation results, the system model shown in
TABLE-US-00002 TABLE III SIMULATION PARAMETERS Parameter T N.sub.s T.sub.s L M Value 23.6 sec 4096 0.0058 12 bits {2, 3, 4, 6}
[0116] First, the performance of the proposed TBC transceiver, described above, was assessed without performing the signal decomposition into different symbol streams.
[0117] However, when EC is applied, SER and PRD reduce significantly thanks to the retransmission of the erroneous samples. On the contrary, the actual or effective Cr decreases due to the higher retransmission overhead. Importantly, these results show that, using the well-known OFDM transceiver architecture with slight modifications, we can obtain about 25% compression ratio while keeping SER and distortion below 10%, which is acceptable by many applications.
[0118]
[0119] Next, we assess the performance of the proposed SBC scheme in Section IV, i.e., we also account for the benefits brought by the decomposition of the signal into streams of symbols and their processing. Interestingly, our SBC transceiver can support both lossless and lossy compression. As depicted in
[0120] The transceiver performance further improves if the SBC-KD scheme is used. Indeed, by applying knowledge discovery and transmitting only the minimal-representation streams, we can considerably reduce the amount of transferred data while still accurately reconstructing the signal at the receiver side. The results in
[0121] Finally, in
[0122] This achieved increase in compression ratio also reflects on the transmission energy consumption (see Table IV, below). Thus, a significant amount of energy consumption can be saved using the proposed compression scheme. Also, as energy consumption decreases with increasing compression ratio and distortion, our scheme can be adapted to maintain the best tradeoff between energy consumption and signal distortion, based on application requirements and energy availability.
TABLE-US-00003 TABLE IV TRANSMISSION ENERGY CONSUMPTION VS. COMPRESSION RATIO Transmission Energy (mJ) C.sub.r % 163.84 0 147.46 10 114.69 30 65.54 60 49.15 70 24.58 85
[0123] In general terms, the transmitter 10 of the present invention includes a sensing or acquisition deviceeither a device suitable to detect and record a signal, for example, or an input for receiving a detected or recorded signal. In embodiments this is an EEG acquisition device or an input for receiving an encoded EEG signal.
[0124] The transmitter 12 includes a processor and/or other hardware (such as memory and storage hardware) suitable for performing sampling of the data signal and a quantization step, in which the signal is converted to a stream of data with discrete values/magnitudes.
[0125] As shown in
[0126] The transmitter 12 than carries out data decomposition and thresholding steps, as described above, before transmission. Of course the transmitter includes suitable equipment for transmitting a radio frequency signal (or other wireless signal) such as those known generally in the art.
[0127] As shown in
[0128] The receiver 14 includes a suitable wireless signal receiving device preferably a threshold-based receiver as described. The receiver 14 according to embodiments of the invention, and as shown in
[0129] Finally, symbol demapping/demodulation takes place prior to outputting the received and processed data (encoded EEG data, for example).
[0130] Representative features are set out in the following clauses, which stand alone or may be combined, in any combination, with one or more features disclosed in the text and/or drawings of the specification.
[0131] When used in this specification and claims, the terms comprises and comprising and variations thereof mean that the specified features, steps or integers are included. The terms are not to be interpreted to exclude the presence of other features, steps or components.
[0132] The features disclosed in the foregoing description, or the following claims, or the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for attaining the disclosed result, as appropriate, may, separately, or in any combination of such features, be used for realising the invention in diverse forms thereof.
[0133] Although certain example embodiments of the invention have been described, the scope of the appended claims is not intended to be limited solely to these embodiments. The claims are to be construed literally, purposively, and/or to encompass equivalents.