Coded light
10075236 · 2018-09-11
Assignee
Inventors
- Constant Paul Marie Jozef BAGGEN (EINDHOVEN, NL)
- Ronald Rietman (Eindhoven, NL)
- Paul Henricus Johannes Maria VAN VOORTHUISEN (EINDHOVEN, NL)
CPC classification
H03M5/12 (ELECTRICITY)
Abstract
A coded light signal is embedded into visible light emitted from the light source, to be received by a rolling-shutter camera which captures frames (16) by exposing a plurality of lines (18, 24) of each frame in sequence, the camera having an exposure time with each line being exposed for the exposure time. The coded light signal is formatted according to a format whereby the coded light signal comprises at least one message and the message is repeated multiple times with a timing such that, when samples of the coded light signal are obtained from a substantially smaller number of lines (24) than exposed by the camera in each frame and the message is longer than this number of lines, a different part of the message is seen by the camera in each of a plurality of different ones of said frames.
Claims
1. A device adapted to provide a coded light signal to a rolling-shutter camera, comprising: a driver for controlling a light source based on a controller output to embed the coded light signal into visible light emitted from the light source for reception by the camera, the camera being configured to capture frames by exposing a plurality of lines of each frame in sequence, the camera having an exposure time being the time for which each line is exposed; and a controller configured to generate the controller output to generate the coded light signal according to a format whereby the coded light signal comprises at least one message and the message is repeated multiple times with a timing that is configured relative to the exposure time of the camera; wherein: the message is of a duration longer than an amount of time for capturing one frame, wherein a different part of the message is emitted during different captured frames; and the message comprises one or more packets comprising different data content, wherein each of the packets of the message is followed by an inter-packet idle period, and wherein the repetitions of the message are separated by an inter-message idle period different than the inter-packet idle period.
2. The device of claim 1, wherein the message is repeated such that the whole message will be emitted over a plurality of frames.
3. The device of claim 1, wherein said number of lines is less than or equal to 14% of the lines of each frame.
4. A system comprising the device of claim 1, the light source, and the camera.
5. The device of claim 1, wherein the inter-packet idle period is greater than or equal to the exposure time or a maximum anticipated value of the exposure time.
6. The device of claim 1, wherein the inter-message idle period is selected to obtain said timing whereby a different part of the message is emitted in each of a plurality of different ones of said frames.
7. The device of claim 1, wherein the exposure time is less than or equal to (1/30) s, less than or equal to (1/60) s, or less than or equal to (1/120) s.
8. The device of claim 1, wherein the at least one message is formed of at least three packets per message.
9. The device of claim 1, wherein each of the packets is of a length less than or equal to 17 bits long, less than or equal to 12 bits long, or less than or equal to 9 bits long.
10. The device of claim 9, wherein the packet length is 9 bits, consisting of a byte of content and a synchronization bit.
11. The device of claim 10, wherein the controller is configured to encode the coded light signal with a symbol rate of said symbols being 1 kHz, 2 kHz or 4 kHz.
12. The device of claim 1, wherein the controller is configured to encode the coded light signal according to a ternary Manchester modulation coding scheme whereby data bits of the signal are represented by being mapped to ternary Manchester symbols.
13. The device of claim 12, wherein the inter-message idle period has a duration of at least 4 of said symbols.
14. The device of claim 13, wherein each of the packets is 19 of said symbols long, the inter-packet idle period has a duration of 33 of said symbols, and the inter-message idle period has a duration of 5 of said symbols.
15. The device of claim 1, wherein the controller is configured to receive an indication of the exposure time from the camera, and to adapt the format of the message based on the exposure time.
16. The device of claim 15, wherein the controller is configured to perform said adaptation by selecting one or more parameters such that a different part of the message is emitted in each of a plurality of different ones of said frames, the one or more parameters comprising: the inter-packet idle period, inter-message idle period, number of packets per message, and/or symbol rate.
17. The device of claim 16, wherein the controller is configured to adapt the format by selecting between a plurality of different predetermined combinations of said parameters.
18. A method for providing a coded light signal to a rolling-shutter camera, comprising: controlling a light source to embed the coded light signal into visible light emitted from the light source for reception by the camera, the camera being configured to capture frames by exposing a plurality of lines of each frame in sequence, the camera having an exposure time being the time for which each line is exposed; and generating the coded light signal according to a format whereby the coded light signal comprises at least one message and the message is repeated multiple times with a timing that is configured relative to the exposure time of the camera; wherein: the message is of a duration longer than an amount of time for capturing one frame, wherein a different part of the message is emitted during different captured frames; and the message comprises one or more packets comprising different data content, wherein each of the packets of the message is followed by an inter-packet idle period, and wherein the repetitions of the message are separated by an inter-message idle period different than the inter-packet idle period.
19. A computer program product comprising code embodied on a computer-readable storage medium and configured so as, when executed on a device comprising a driver for controlling a light source based on a controller output to provide a coded light signal to a rolling-shutter camera, to perform operations of: controlling the light source based on the controller output to the driver to embed the coded light signal into visible light emitted from the light source, to be received by the camera, the camera being configured to capture frames by exposing a plurality of lines of each frame in sequence, the camera having an exposure time being the time for which each line is exposed; and generating the controller output for output to the driver to generate the coded light signal according to a format whereby the coded light signal comprises at least one message and the message is repeated multiple times with a timing that is configured relative to the exposure time of the camera; wherein: the message is of a duration longer than an amount of time for capturing one frame, wherein a different part of the message is emitted during different captured frames; and the message comprises one or more packets comprising different data content, wherein each of the packets of the message is followed by an inter-packet idle period, and wherein the repetitions of the message are separated by an inter-message idle period different than the inter-packet idle period.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) To assist understanding of the present disclosure and to show how embodiments may be put into effect, reference is made by way of example to the accompanying drawings.
DETAILED DESCRIPTION OF EMBODIMENTS
(25) The following relates to a coded light application, and provides a format for transmitting coded light, a decoder for receiving the coded light, and one particularly advantageous building block used in the decoder (which can also be used in applications other than coded light).
(26) The format and decoding techniques are aimed at providing a practical solution for coded light, defining a format that can work with existing rolling-shutter cameras as well as dedicated so-called region-of-interest (ROI) cameras alike. The disclosure provides a method of encoding and decoding, an encoder and decoder, a signal format, and software for encoding and decoding, that in embodiments allow such cheap rolling shutter cameras to receive coded light and to decode the messages contained therein.
(27) Cheap rolling shutter cameras effectively scan their images, so as the lines progress, so does time. This implies that the timestamp of the top line is much earlier than the timestamp of the bottom line. Now imagine that coded light is present in the image: the coded light will typically be visible only in a small section of the image.
(28) The lines that actually image the light are the lines that contain coded light. Each line is condensed into a single value and that single value corresponds to a bit of information or a symbol; that is, the bit or symbol transmitted at the moment in time that the line was scanned. Now for the rolling shutter camera to decode a message, one could make sure that the number of lines imaging the light per frame is high enough (i.e. so that the footprint of the light is big enough) and decode the message based on a single frame. However, as will be discussed in more detail shortly, that is not always possible.
(30) Referring to
(31) To capture a sample for the purpose of detecting coded light, some or all of the individual pixel samples of each given line 18 are combined into a respective combined sample 19 for that line (e.g. only the active pixels that usefully contribute to the coded light signal, to be discussed later with reference to
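The per-line condensation described above can be illustrated with a short sketch. This is not the patented method itself; the function name and the use of a simple mean over the active pixels are assumptions made for illustration only.

```python
import numpy as np

def condense_lines(frame, active_mask):
    """Condense each rolling-shutter line into one combined sample by
    averaging only the 'active' pixels, i.e. those illuminated by the
    coded light source. Lines with no active pixels yield NaN."""
    frame = np.asarray(frame, dtype=float)
    active = np.asarray(active_mask, dtype=bool)
    sums = np.where(active, frame, 0.0).sum(axis=1)   # per-line sum of active pixels
    counts = active.sum(axis=1)                       # per-line count of active pixels
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

# Two lines, each with two active pixels and one pixel outside the footprint:
print(condense_lines([[1, 3, 100], [2, 4, 100]],
                     [[True, True, False], [True, True, False]]))  # [2. 3.]
```

The resulting one-sample-per-line sequence is what the decoder treats as a time series, since each line was exposed at a slightly later moment than the one above it.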
(32) In the existing literature it is assumed that the source 10 covers all or almost all of every frame. However, this is often not the case. Moreover, the light being emitted is not necessarily synchronized with the capturing process, which can result in further problems.
(33) A particular problem in using a rolling shutter camera for coded light detection therefore arises, because the light source 10 serving as a coded light transmitter may in fact cover only a fraction of the lines 18 of each frame 16. Actually, only the lines 24 in
(34) Apart from the above there may alternatively or additionally be one or more other problems. In embodiments problems may comprise: firstly, a rolling shutter may result in short temporal-interrupted views of the coded light source; secondly, there may be a conflict of interest between automatic exposure control and coded light; thirdly, driver technology at present allows only low frequency signaling which may cause flicker; and/or fourthly, the filtering effect produced by the rolling shutter process may result in inter-symbol interference (ISI).
(35) Therefore existing techniques may be insufficiently flexible and/or prone to error or interference. The following embodiments aim to combine information from multiple video frames in a rolling shutter camera, such that messages longer than their footprint in a single video frame can be captured and decoded. In embodiments this involves:
(36) (i) use of a signal format whereby a message is cyclically repeated by the transmitter; and
(37) (ii) at the receiver, exploiting the knowledge of the repetition time of the message (Tm) and the knowledge of the frame duration (Tframe) for reconstructing a complete message from the partial snapshots obtained in each frame. To this end the disclosure provides a method to collect and reassemble the data collected from multiple frames.
(38) A message is cyclically repeated, and at the receiver the message is effectively re-assembled over time (for certain messages this can actually take 1 or 2 seconds, i.e. 30-60 frames). In order to enable this, the following describes a particular data format for encoding information in the light.
(39) Part of the decoding of the signal in turn is described using a method referred to herein as reassembly. To facilitate the decoding, the message duration and/or the Texp of the camera are tweaked in a manner that enables a cheap rolling shutter camera to detect a complete message fairly quickly.
(40) Once the message is re-assembled it will be equalized. The normal approach is to take the message and to effectively use a slicer to determine the exact timing of the signal and then equalize it. However, according to embodiments of the following, this can be achieved in a smart manner using a robust Wiener filter implementation that is rather efficient (preferably such that the entire decoding algorithm can be implemented on standard run-of-the-mill mobile phones).
(41) The robust Wiener filter takes into consideration the uncertainty of the channel and in this manner can reduce the inter-symbol interference (ISI). In the following embodiments this filter is used following re-assembly, but note that it may be used in other systems as well (not limited just to equalizing the effect of a rolling-shutter nor even just to coded light applications).
(42) Message Format
(43) The following describes a message format that allows for a reliable combination of the information of multiple video frames such that messages longer than the footprint, and even messages having a duration of many frames can be captured and decoded. Moreover, the signal format allows for asynchronous (Wiener-like) equalization to undo the ISI caused by the camera at the receiver. Further, the frequency content of the messages can be such that there is no visible flicker or stroboscopic effects, even for message lengths having a repetition frequency of, e.g., 18 Hz (very sensitive flicker frequency).
(44) An example of such a message format is shown in
(45) In embodiments, aside from the length (duration) of the message's actual data content (payload) 30, the message length Tm may be selected by including an inter-message idle period (IMIP) 34 between repeated instances of the same message. That way, even if the message content alone would result in each frame seeing more-or-less the same part of the message, the inter-message idle period can be used to break this behavior and instead achieve the rolling condition discussed above. In embodiments the inter-message idle period may be adapted given feedback of Texp (negotiated format), or may be predetermined to accommodate a range of possible values of Texp (universal format).
(46) As mentioned, the rolling condition is linked to the exposure time (i.e. line exposure time) Texp of the rolling-shutter camera. There is no one single solution to this; it is more a matter of avoiding combinations of Tm and Texp that do not meet the condition (discussed in more detail shortly). In the case of seeking a universal format, the inventors have discovered that sufficient solutions can be assured to be available as long as Texp≤33 ms, i.e. approximately (1/30) s.
(47) Another issue is inter-symbol interference (ISI), which is a result of the filtering effect of the exposure of each line (effectively a box filter applied in the time domain as each line is exposed). To mitigate this, in embodiments the message format is arranged such that each instance of the message comprises a plurality of individual packets 29 (e.g. at least three) and includes an inter-packet idle period (IPIP) 32 between each packet. In embodiments, the inter-packet idle period follows each packet, with the inter-message idle period (IMIP) 34 tagged on the end after the last packet (there could even be only one packet, with the IPIP 32 and potentially IMIP 34 following).
(48) Inter-symbol interference is then a function of packet length and inter-packet idle period. The more data symbols there are in a row, the more inter-symbol interference (ISI). Therefore it is desirable to keep the packet length small with good sized gaps in between. The idle gaps (no data, e.g. all zeros) between bursts of data helps to mitigate the inter-symbol interference, as does keeping the packet length short. Again these properties may be adapted in response to actual knowledge of a particular camera's exposure time Texp being fed back via a suitable back channel such as an RF channel between receiver 4 and transmitter 2 (negotiated format), or alternatively the timing may be formatted in a predetermined fashion to anticipate a range of possible exposure time values Texp of cameras the format is designed to accommodate (universal format). In embodiments, the inventors have discovered that a packet length no longer than 9 bits separated by an inter-packet idle period of at least Texp provides good performance in terms of mitigating ISI. By convenient coincidence, 9 bits also advantageously allows for one byte of data plus a synchronization bit. Nonetheless, in other embodiments a packet length of up to 12 bits, or even up to 17 bits may be tolerated.
(49) As well as achieving rolling, another potential issue is synchronization. The receiver has a template of the message format which it uses to synchronize with the received signal, e.g. it knows that after a gap of the IPIP+IMIP, to expect a synchronization bit, then a byte of data, then the IPIP, then another synchronization bit and byte of data, etc. By comparing this template with the received coded light signal, the receiver can synchronize with the signal. In embodiments, in order to assist synchronization, the inventors have found that the inter-message idle period should preferably be at least 4 symbols of the relevant modulation code, e.g. 4 ternary Manchester symbols.
(50) Given the above considerations, an exemplary message format comprises:
(51) (i) use of a signal format where a message is cyclically repeated (many times) by the transmitter, thus allowing a (temporal) recombination of footprints from consecutive video frames, each footprint containing a partial received message, for obtaining a complete received message; the message size may be chosen such that by cyclic repetition eventually the entire message can be recovered;
(ii) a message having relatively short packets (e.g. of 9 bits), separated by inter-packet idle periods for allowing an equalizer to reconstruct the original transmitted waveform in the presence of heavy ISI caused by an un-controllable camera exposure time setting; and
(iii) using a form of Ternary Manchester (TM) as a DC-free modulation code, leading to extra suppression of low frequency components, thus eliminating flicker at low symbol frequencies.
(52) Variations are also possible. For example, while the preferred modulation code is ternary Manchester (which may be abbreviated by the initials TM), other codes could alternatively be used (preferably DC-free or low DC content, with no visible flicker), e.g. conventional Manchester or non-return to zero (NRZ). The following also further describes various particularly advantageous choices for the format parameters (e.g. IMIP). In further embodiments, the IPIP may be tuned to the maximum exposure time. The TM-symbol length may also be tuned to exposure time when exposure time>IPIP. In yet further embodiments, guided descrambling may be used for medium length messages, and/or unscrambled short packets for short messages.
(53) Returning to
(54) As already discussed, a particular problem in using a rolling shutter camera for coded light detection arises because the light source serving as a coded light transmitter may cover only a fraction of the lines of each frame (see again
(55) Another issue is that current smartphones such as iPhones and iPads do not allow for control of the exposure time Texp and ISO by an app. Existing automatic built-in control algorithms often lead to long exposure times that, after camera detection, lead to heavy inter-symbol interference (ISI) between the digital symbols that are sequentially transmitted by the light source.
(56) Further, current LED driver technology only allows for cheap, energy-efficient solutions if the bandwidth (symbol rate) of the transmitted digital signal is very limited (say a symbol rate between 1 and 8 kHz). For such low frequencies, flicker and stroboscopic effects may become serious, unless special precautions are taken in the signal format for suppressing low frequencies. Having just a DC-free code does not always suffice.
(57) The present disclosure describes a signal format that allows for a reliable combination of the information of multiple video frames such that messages longer than the footprint, and even messages having a duration of many frames can be captured and decoded. Moreover, the signal format allows for asynchronous (Wiener-like) equalization to undo the ISI caused by the camera at the receiver. Finally, the frequency content of the messages can be such that there are no visible flicker or stroboscopic effects, even for message lengths having a repetition frequency of, e.g., 18 Hz (very sensitive flicker frequency).
(58) A snapshot of a typical coded-light signal at the transmitter is depicted in
(59) A message, in this example having a duration of 161 ms, consists of 3 packets, each packet comprising 9 TM-encoded bits. A message is cyclically repeated by the transmitter (3 repetitions are shown in
(60) Each packet of a message in this example is trailed by an inter-packet idle period of 33 TM-symbols (33 ms). At the end of each message, there is an (extra) inter-message idle period of 5 TM-symbols, resulting in a total idle period of 33+5=38 idle symbols between the third packet of the current message and the first packet of the next message.
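The example timing above can be verified with simple arithmetic. The following sketch assumes the stated 1 kHz TM-symbol rate (1 ms per TM symbol); the constant names are illustrative, not part of the disclosure.

```python
# Preferred-embodiment format parameters from the example above.
PACKET_SYMBOLS = 19       # 9 user bits -> 19 TM symbols
IPIP_SYMBOLS = 33         # inter-packet idle period
IMIP_SYMBOLS = 5          # extra inter-message idle period
PACKETS_PER_MESSAGE = 3

# Each packet is trailed by an IPIP; the IMIP is appended after the last IPIP.
message_symbols = PACKETS_PER_MESSAGE * (PACKET_SYMBOLS + IPIP_SYMBOLS) + IMIP_SYMBOLS
trailing_idle = IPIP_SYMBOLS + IMIP_SYMBOLS  # idle between last and next packet

print(message_symbols)  # 161 symbols = 161 ms at 1 kHz
print(trailing_idle)    # 38 idle symbols, as stated above
```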
(62) The reason for cyclically repeating a message is that, at each frame of a rolling shutter camera movie, only a small part of the transmitted message may be recoverable. The size of that part depends on the size of the light source in the images of the camera (footprint), and on the duration of the message. For instance, if the size of the light source is such that only 14% of the lines of a frame are covered by the light source, and if the duration of the message is in the order of 5 frames (assuming a recording speed of 30 frames/second), only about 3% of a message is potentially recoverable from a single movie frame.
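The ~3% figure follows directly from the two ratios; a minimal sketch (the function name is illustrative):

```python
def recoverable_fraction(footprint_fraction, message_duration_frames):
    # Fraction of a repeated message captured per frame: the footprint
    # covers 'footprint_fraction' of the frame's lines, while the
    # message spans 'message_duration_frames' frame durations.
    return footprint_fraction / message_duration_frames

print(recoverable_fraction(0.14, 5))  # ~0.028, i.e. about 3% per frame
```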
(63) If the message duration is carefully chosen with respect to the frame rate of the movie, consecutive frames of the movie reveal different parts of the repeated message such that eventually the whole message is recovered.
(65) The following considers the relationship shown in
(66) It may also be desired to minimize N, as a large N leads to large latencies. Also for a small footprint, one may desire a small N, e.g. N=30, i.e. about 1 second at 30 frames/second.
(67) Transmitter frequency deviations lead to Tm variations. Some deviations may lead to slow rolling or even absence of rolling. N has to remain reasonable for a certain range of message durations around a nominal value.
(68) Now consider what happens to covering a message with footprints if:
(69) relative footprint α=Tfootprint/Tf=0.4,
(70) 0<α<1 (in practice e.g. 0<α<0.88 due to hidden lines). If Tm is about Tf, the message barely rolls (each frame sees practically the same part of the message). But if Tm is about 1.5 times Tf, the message switches so that every other frame sees alternate parts of the message, but some parts are repeatedly missed.
(71) It turns out that, if α<1, one obtains non-rolling footprints if the message durations Tm are a multiple of the frame duration Tf. If α<0.5, one obtains switching footprints if Tm is a half-integer multiple of Tf (0.5, 1.5, 2.5, . . . ).
(72) In general, if 1/(n+1)≤α<1/n, where n is an integer, then one encounters non-rolling footprints if:
(73)
(74) It turns out that the rolling may already be insufficient if the above ratio is close to one of the non-rolling ratios.
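The avoidance of non-rolling ratios can be explored numerically. The following sketch flags Tm/Tf ratios close to a fraction m/n with a small denominator; the tolerance and maximum denominator are illustrative assumptions, not values from the disclosure.

```python
def near_non_rolling(Tm, Tf, max_n=4, tol=0.02):
    """Return True if Tm/Tf is close to some m/n with small n, i.e. a
    ratio at which footprints repeat instead of rolling through the
    message (n=1: integer multiples; n=2: half-integer multiples)."""
    ratio = Tm / Tf
    for n in range(1, max_n + 1):
        m = round(ratio * n)
        if m >= 1 and abs(ratio - m / n) < tol:
            return True
    return False

# The 161 ms example message rolls nicely at 30 frames/second:
print(near_non_rolling(0.161, 1 / 30))  # False
# A 100 ms message (exactly 3 x Tf) would not roll:
print(near_non_rolling(0.100, 1 / 30))  # True
```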
(75) The result is a complicated relationship as seen in
(76) Modulation Code
(77) The preferred modulation code for low bit rates is ternary Manchester (TM) because of the extra suppression of low frequency components that may lead to flicker. Low bit rates might be imperative because of two reasons: (i) the limited affordable complexity and minimum required efficiency for drivers of the LED light sources; and/or (ii) for obtaining a signaling speed that can be recovered for very long exposure times.
(78) Comparing NRZ, Manchester and ternary Manchester, note that NRZ (actually: no modulation code) has a very high DC content. The Manchester modulation code, well-known from magnetic recording, and also proposed for the IEEE Visible Light Communication (VLC) standard, is a so-called DC-free code, i.e., the spectral content at frequency zero equals 0. The ternary Manchester modulation code is a so-called DC²-free modulation code, implying that the spectral density around DC remains much smaller compared to a DC-free code like Manchester. At low frequencies, TM is therefore advantageous compared to Manchester. For flicker, frequencies up to 100 Hz are important.
(79) Since the signal format makes use of relatively short packets, interspersed with idle symbols, one can guarantee a message to be DC²-free by letting each packet be DC²-free. This is accomplished by modulating the user bits using the TM impulse response {−0.5, 1, −0.5}. Note that a packet of 9 user bits leads to a TM-encoded packet of 19 TM-symbols.
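A minimal sketch of this encoding follows. It assumes that bits map to ±1 impulses placed on a half-symbol grid before convolution with the TM pulse; the disclosure deliberately leaves the exact waveform shape open, so this is one illustrative realization.

```python
import numpy as np

def tm_encode(bits):
    """Ternary-Manchester-encode user bits. Each bit (0 -> -1, 1 -> +1)
    is placed as an impulse every two half-symbol positions, then
    convolved with the pulse {-0.5, 1, -0.5}, whose transfer function
    -0.5*(1 - z^-1)^2 has a double zero at DC (hence DC^2-free)."""
    impulses = np.zeros(2 * len(bits) - 1)
    impulses[::2] = [1.0 if b else -1.0 for b in bits]
    pulse = np.array([-0.5, 1.0, -0.5])
    return np.convolve(impulses, pulse)

packet = tm_encode([1, 0, 1, 1, 0, 0, 1, 0, 1])  # 9 user bits
print(len(packet))   # 19 TM symbols, matching the text
print(packet.sum())  # 0.0 -- the encoded packet is DC-free
```

Because every packet sums to zero regardless of its bit content, a message built from such packets and idle periods inherits the flicker-suppressing spectral property.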
(80) For larger bit rates, other modulation codes, maybe even multi-level DC-free modulation codes (e.g. quaternary Manchester), can also be envisioned, provided the spectral densities do not lead to visible flicker.
(81) The modulation codes to be used can be defined in a manner that allows for some freedom in the actual implementation of the driver, e.g. for drivers having an Amplitude Modulation (AM) implementation or for drivers having a Pulse Width Modulation (PWM) implementation. This implies that, in contrast to traditional modulation formats, the actual shape of the waveforms to be transmitted is not exactly defined for coded light.
(82) A preferred way of defining a modulation code for coded light would be to define the rules and acceptable values of the output of a full-T moving-average filter applied to a modulator output waveform at the optimum sampling points.
(83) Packet Length
(84) Turning to the question of packet length, the packet length is preferably chosen such that the worst case data pattern is still recoverable under worst case exposure times.
(85) An example is shown in
(86) The moving average filtering of Texp leads to inter-symbol interference (ISI) between the TM-symbols of the packet. Note the reduction of the amplitude of the received signal with respect to the incoming transmitted signal. Also note that in the last half of the packet the amplitude of the received signal has been reduced to zero. Finally, note that the received signal extends beyond the transmitted signal by Texp=8 ms because of the causal FIR-type filtering by Texp. It is the task of the signal processing in the receiver to reconstruct the transmitted signal from the received signal.
(88) If one desires that the transmitted signal is recoverable from the received signal, it is required that at least sufficient signal energy remains after filtering the transmitted signal with the ISI filter for all reasonable choices of Texp. For this to happen, the spectral representation of the transmitted signal has to be sufficiently spread across many frequencies (for all possible choices of the bit content of a packet). This turns out to be the case if the packet length is in the order of 9 bits.
(89) On the other hand, if one would make a packet (consisting of all ones) longer than 9 bits (say 17 bits), the spectral representation of such a long packet would still be concentrated around 500 Hz, but its spectral width would be correspondingly narrower than that of the original packet. It turns out that in that case too much signal energy is destroyed by the ISI filter.
(90) The inventors have found that, using TM modulation with fsymbol=1 kHz, for a packet length from say 9 to 12 bits, one can recover the transmitted signal sufficiently accurately for all Texp≤(1/30) s, provided the inter-packet idle period (IPIP) is at least Texp. Note that, if IPIP=(1/30) s, a fixed transmit signal format works for all Texp≤(1/30) s. This may be used to provide a universal signal format.
(91) If the packet length is between 12 and 17 bits long, it turns out that the minimum eye height of the eye pattern is determined by only a few detrimental bit patterns that have a poor spectral representation, which can be destroyed by the Texp moving average filter in such a manner that it is irrecoverable. If there are only a few such detrimental bit patterns, one can prevent them from occurring by so-called guided scrambling. However, it turns out that one requires in the order of 16 different scrambling patterns for applying a useful guided scrambling. Since the index of the scrambling pattern also has to be encoded in each packet, the number of useful bits would again be reduced to 8 or 9 per packet. So for very short repeated messages, the un-scrambled short packets may be deemed to be most useful. For longer messages, guided scrambling may be very useful.
(92) Messages Constructed from Multiple Packets
(93) For transmitting a useful amount of information from a light source to a camera receiver, messages are constructed which consist of a concatenation of p packets, where each packet has its own bit content. Between each two packets, there is at least an inter-packet idle period (IPIP) to prevent ISI crosstalk between different packets. At the end of a message, there is an extra inter-message idle period (IMIP). A message consisting of p packets is cyclically repeated.
(94) In a preferred embodiment, p=3, so effectively 3 bytes of information (24 bits) are transmitted per message.
(95) Inter-Packet Idle Period
(96) The purpose of the inter-packet idle period (IPIP) is to limit the ISI induced by the exposure time (Texp) of the camera to a single packet. In a preferred embodiment, the duration of the IPIP shall be equal to the maximum expected exposure time, Texp_max. This may provide a universal IPIP format, since it allows recovery of the messages for any Texp satisfying: Texp≤IPIP=Texp_max.
(97) The inventors have also found that messages are recoverable if Texp>IPIP, for carefully chosen TM-symbol rates, where the carefully chosen TM-symbol rates then depend on the actual Texp used by the camera. Formats exploiting the enhanced signaling speed for this case will belong to the negotiated signal formats, since the transmitting light source and the camera receiver should agree on the choice of transmit parameters such as TM-symbol rate, number of packets per message, IPIP and/or IMIP, to ensure that the actual coded light transmissions can be received. The choice of these parameters depends on the available camera settings of, e.g., Texp, frame rate, line rate and the footprint of the light source.
(98) Note, while embodiments herein are described in terms of an IPIP following each packet and an extra IMIP being tagged on the end of the last IPIP, in an alternative description or implementation an IPIP may be included only between adjacent packets of the same message, with the total idle period following the end of the last message being the IMIP.
(99) Inter-Message Idle Period
(100) The inter-message idle period (IMIP) is an idle period that is appended after the last IPIP which trails the last packet of a message. The IMIP may be measured in TM symbols.
(101) The IMIP Serves Two Goals:
(102) (i) to make sure that the total message duration is such that it satisfies nice rolling properties given the frame rate, i.e. such that footprints of consecutive frames reveal the complete message as fast as possible; and/or
(103) (ii) the second purpose of the IMIP is to provide an asymmetry in the pattern of packets and idle periods within the cyclic repetition of messages. This property can be used in the cyclic synchronization of a receiver.
(104) Synchronization Elements of Format
(105) For synchronization purposes, two elements of the signal format are significant.
(106) (i) The usage of the first bit of each 9-bit packet as a synchronization bit. In a preferred embodiment, the first bit of the first packet of a message shall be one, while the first bit of all remaining packets shall be zero.
(107) (ii) The usage of the inter-message idle period (IMIP). The presence of a non-zero IMIP breaks the regular temporal packet structure in a repeated message, because the total idle time after the last packet of a message is longer than the idle times between the other packets.
(108) In a preferred embodiment, the IMIP shall have a duration of at least 4 symbols.
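The framing described above (9-bit packets with a leading sync bit, an IPIP after every packet, and an extra IMIP after the last packet of each message) can be sketched as follows. The packet payloads, the symbol representation of an idle period ("_"), and the particular IPIP/IMIP lengths are illustrative assumptions, not the TM-symbol encoding itself.

```python
# Minimal sketch of the message framing: p packets of 9 bits each, the first
# bit of the first packet being the sync bit 1 (0 for all other packets),
# an IPIP after every packet and an extra IMIP ending each message.
# Idle symbols are represented as "_"; all lengths here are example values.

P = 3          # packets per message (preferred embodiment)
IPIP = 4       # inter-packet idle period, in symbols (example value)
IMIP = 5       # inter-message idle period, in symbols (>= 4 per the text)

def make_packet(payload_bits, is_first):
    # 9-bit packet: first bit is the sync bit (1 only for the first packet)
    assert len(payload_bits) == 8
    return [1 if is_first else 0] + list(payload_bits)

def make_message(payloads):
    symbols = []
    for i, bits in enumerate(payloads):
        symbols += make_packet(bits, is_first=(i == 0))
        symbols += ["_"] * IPIP          # idle after every packet
    symbols += ["_"] * IMIP              # extra idle after the last packet
    return symbols

payloads = [[0, 1, 0, 1, 0, 1, 0, 1]] * P
msg = make_message(payloads)
stream = msg * 4                         # cyclic repetition of the message
print(len(msg))                          # 3*(9+4) + 5 = 44 symbols
```

The asymmetry the IMIP introduces is visible in the stream: the idle run after the last packet of each message is IPIP+IMIP symbols long, while all other idle runs are IPIP symbols, which is what a receiver can exploit for cyclic synchronization.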
EXAMPLE PARAMETERS
(109) Given all of the above considerations, some example parameter choices are:
(110) fsymbol ≥ 1 kHz (to avoid visible flicker and stroboscopic effects),
(111) envisioned packet durations: around 52 ms (49 ms) for fsymbol = 1 kHz, around 26 ms (24.5 ms) for fsymbol = 2 kHz, around 13 ms (12.25 ms) for fsymbol = 4 kHz,
(112) message durations Tm are an integer multiple of packet durations, and/or
(113) interesting message durations: around 26, 52, 104 ms.
(114) For Instance:
(115) the exposure time is less than or equal to (1/30) s, the symbol rate is 1 kHz and the packet is 52 ms including inter-packet idle period;
(116) the exposure time is less than or equal to (1/60) s, the symbol rate is 2 kHz and the packet is 26 ms including inter-packet idle period; or
(117) the exposure time is less than or equal to (1/120) s, the symbol rate is 4 kHz and the packet is 13 ms including any inter-packet idle period.
(118) Other example parameter choices:
(119) 3-packets format (with CRC) having a duration of 158 ms @ 1 kHz symbol rate, with the 158 ms corresponding to a 3-byte message having an IPIP of 33 symbols and an IMIP of 2 symbols; or
(120) a packet length of 70 symbols 35 ms @ 2 kHz, with the 35 ms corresponding to a 3-byte message having an IPIP of 3 symbols and an IMIP of 4 symbols (e.g. this format can be used where T_exp is controlled to be less than ( 1/500)s).
(121) In a negotiated format case, the controller may be arranged to select from a list of multiple combinations of parameters, comprising any one or more of these combinations and/or other combinations. In a universal format, one particular combination is pre-chosen to satisfy as many cameras (or rather exposure times) as possible.
(122) Cyclic Redundancy Check (CRC)
(123) In a preferred embodiment, a message consists of several packets, where each packet contains 1 byte of information. In case a CRC is used, it is suggested that the last byte of each message is an 8-bit CRC. Because of the repeated decoding results delivered by a receiver decoding the cyclically repeated signal format, one can obtain potentially many realizations of the transmitted message, which allow the reliability of a received message to be enhanced by comparing the decoding results of consecutive decoded variants of the same message.
(124) In a preferred embodiment, the CRC is characterized by a pre-load and parity inversion. The pre-load can be application-specific, thus allowing a receiver to distinguish between messages from different applications in use in the same environment. Note that there is a trade-off between the number of different pre-loads in use and the effective error-detection capability of the CRC.
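As a concrete illustration of a CRC with a pre-load and parity (output) inversion, a minimal sketch follows. The polynomial (0x07) and the pre-load values are illustrative assumptions only; the text above does not specify which polynomial or pre-loads are used.

```python
# Hedged sketch of an 8-bit CRC with an application-specific pre-load and
# parity (output) inversion. Polynomial 0x07 and the pre-loads are assumptions.

def crc8(data, preload=0xFF, poly=0x07, invert=True):
    crc = preload                        # application-specific pre-load
    for byte in data:
        crc ^= byte
        for _ in range(8):               # MSB-first polynomial division
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc ^ 0xFF if invert else crc # parity inversion

# A 3-packet message: 2 data bytes followed by the CRC byte as the last byte.
data = [0x12, 0x34]
msg = data + [crc8(data)]

# Receiver side: recompute the CRC and compare with the received CRC byte.
ok = crc8(msg[:-1]) == msg[-1]
print(ok)                                # True for an uncorrupted message
```

Because the CRC register evolution is linear and invertible, two different pre-loads always yield different CRC values for the same data, which is what lets a receiver separate applications sharing the same environment.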
(125) Multiple Messages
(126) The inventors have found that one can transmit a concatenation of different messages m.sub.i, where each message m.sub.i is repeated N times, N being sufficiently large that a camera receiver can reliably reconstruct a complete message m.sub.i given the footprint of the transmitting light source. After N repetitions of the same message m.sub.i, the light source can transmit a completely different message m.sub.i+1 having the same signal parameters, by simply concatenating, say, N repetitions of message m.sub.i+1 right after m.sub.i. It turns out that a receiver is capable of recognizing a coherently reconstructed message by observing the CRC.
(127) Message Reassembly
(128) The following describes a process of reassembling or stitching of video frames for coded light message recovery by a camera. The receiver receives a signal formatted as described above and re-assembles the parts of the message into a complete message, which is then provided for further processing.
(129) In embodiments the reassembly process comprises the following.
(130) (i) For each of multiple frames, establish a sample per image line as described above (see again the samples 19 taken from lines 18).
(131) (ii) Collect all the (active) samples of a given frame into a time-sequence (each positioned at the respective time at which the sample from that line was located within the frame). This sequence forms a marginal signal or frame signal for each frame.
(132) (iii) Next extend the signals with zeros resulting in an extended marginal signal or extended frame signal, where the duration of each extended signal is n times the message duration (n being an integer) and where the duration is longer than the frame duration.
(iv) Next the active samples are time-aligned, i.e. shift the samples per line by Tframe to the right within the time frame or scale defined by the extended signal. This is done cyclically, i.e. in a wrap-around fashion wrapping around beyond the end of the extended frame signal length. This way the shifted position of the samples within the extended framework is such that it facilitates reassembly.
(v) Next the samples are collapsed (i.e. reassembled). In embodiments different reconstructions can be found by shifting one measurement further.
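The reassembly steps (i)-(v) above can be sketched as follows, under the simplifying assumptions of ideal sampling (no ISI, no noise) and sample-granularity timing; all names and parameter values are illustrative. The cyclic shift by Tframe per frame is implemented here as a reduction of each sample's global time modulo the message duration, which is equivalent to the wrap-around shifting on the extended signal described above.

```python
# Sketch of the reassembly: the transmitter repeats a message of Tm samples
# cyclically; each frame observes a short footprint of it, offset by Tframe
# per frame; samples are placed on a wrap-around canvas of one message period.

Tm = 158            # message duration, in samples (illustrative)
Tframe = 211        # frame duration, in samples (211 mod 158 = 53, coprime)
footprint = 30      # active samples observed per frame

message = [(i * 7 + 3) % 11 for i in range(Tm)]     # arbitrary test message

# (i)-(ii) marginal signal per frame: frame k observes `footprint` samples
# starting at global time k*Tframe of the cyclically repeated message.
def frame_marginal(k):
    start = k * Tframe
    return [(start + t, message[(start + t) % Tm]) for t in range(footprint)]

# (iii)-(v) place each sample on a wrap-around canvas of length Tm
# (one message period) by reducing its global time modulo Tm.
canvas = [None] * Tm
for k in range(75):                      # 75 frames suffice for full coverage here
    for t_global, value in frame_marginal(k):
        canvas[t_global % Tm] = value

print(canvas == message)                 # True once every position is covered
```

Because Tframe mod Tm is coprime with Tm in this example, the footprint "rolls" through the message and eventually covers every sample position, after which the collapsed canvas equals the transmitted message.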
(133) Once reconstructed the signal can be filtered to eliminate inter-symbol interference (ISI), e.g. using a Wiener filter.
(134) In embodiments, the ISI filter is robust enough to handle gaps in the reassembled data (this robustness being at least in part a result of the modulation code, message format and the Wiener filter). The process may also allow elegant handling of skipped frames.
(135) In further embodiments, as an additional feature, the process may also allow the receiver to correct for clock deviations relative to the timing of Tm or Tframe based on correlation of reconstructed signals.
(136) An example of the message reassembly process will be discussed in more detail shortly, but first some example details of the receiver front end are elaborated upon with reference to
(137) In embodiments, the camera-based, digitally-coded light receiver disclosed herein is very different from the class of well-known receivers of digital signals using radio or IR communication. Both the general structure of the coded light receiver, as well as the detailed algorithms for performing the sub-tasks within a coded light receiver, are quite distinct.
(138) The input of a camera-based coded light receiver consists of a movie taken in a known format. For instance, a well-known video format is 480p, a progressive scan format having frames taken at 29.97 frames per second (fps), where each frame consists of 480 lines and each line contains 640 pixels. The coded light receiver consists of the digital signal processing applied to this movie for obtaining the digital content of the modulated light source.
(139) The signal processing performed by the receiver may comprise 2D signal processing and 1D signal processing. The 2D signal processing may comprise:
(140) (i) selection of an appropriate color (R, G or B) or a linear combination of colors for extracting the coded light signal;
(141) (ii) image segmentation using a blob approach, efficiently identifying regions in the image containing coded light sources;
(142) (iii) identifying spatial filter active pixels within each blob;
(143) (iv) efficient motion compensation (independently for each source) using the marginals; and/or
(144) (v) computing the signal marginal by combining the active pixels per line (computing the samples 19 resulting from each line 18).
(145) The 1D signal processing may comprise:
(146) (i) using correlations within a frame for estimating the transmit clock (works best for footprint>>duration of message);
(146) (ii) assuming the use of the above-described signal format, where a message is cyclically repeated by the transmitter, and exploiting the knowledge of the repetition time of the message (Tm) and of the frame duration (Tframe) for reconstructing a complete message from the partial snapshots obtained in each frame (this is the reassembly process to be described in more detail shortly);
(iii) using correlations between successive reconstructed signals for estimating the transmit clock;
(iv) using robust Wiener filtering on a single period of the message for mitigating the ISI caused by Texp;
(v) applying robust Wiener interpolation if the reassembly procedure has left holes in the reconstruction;
(vi) finding global circular synchronization by processing using a sync template;
(vii) decoding the bits by making decisions on the optimum sampling points given by the global circular synchronization; and/or
(viii) checking CRC on consecutive reconstructed messages. If m out of n consecutive reconstructions have CRC=OK, accept the message.
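Step (viii) above, accepting a message when m out of n consecutive reconstructions pass the CRC check, can be sketched in a few lines; the thresholds m=3, n=5 are illustrative assumptions.

```python
# Sketch of step (viii): accept a message when at least m of any n
# consecutive reconstructions have CRC = OK. Thresholds are illustrative.

def accept(crc_ok_flags, m=3, n=5):
    """True if some window of n consecutive reconstructions has >= m CRC passes."""
    return any(sum(crc_ok_flags[i:i + n]) >= m
               for i in range(len(crc_ok_flags) - n + 1))

print(accept([True, False, True, True, False, False]))   # True
print(accept([False, False, True, False, False, False])) # False
```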
(148) For a particular message format and a given footprint, it may take for example 30 consecutive frames for reassembling a complete message. If one has a recording of 2 seconds (say, 60 frames), the receiver can generate 31 different realizations of the same message. In embodiments, by comparing these different decoding results it is possible to aid synchronization of the receiver clock with the received signal.
(149) Regarding color, it turns out that the selection of the appropriate color can be significant for recovering the coded light signal. For instance, the color green (G) is characterized by the highest pixel density in the camera, thus giving the highest spatial (and thus temporal) resolution of a coded light signal. This may be of importance if the coded light is using a high symbol frequency (wide bandwidth). On the other hand, it turns out that the color blue (B) is favorable if a light source has a high intensity and if Texp is rather long, since this color tends to lead to less clipping of the pixels.
(150) Referring to
(151) Regarding the contributing pixels within a blob: only those pixels that are modulated, i.e. that have sufficient intensity variations due to the modulated light source, contribute effectively to the signal. Other source pixels effectively only produce noise or other unwanted side effects. Typically, pixels that are clipped are also removed from further consideration (e.g. see
(152) The following describes an algorithm that operates on the samples that are obtained as the marginals in each frame (the samples 19 described above).
(153)
(154)
(155) Note also that, although Tframe equals about (1/30) s ≈ 33 ms, the marginal signal of a single frame has a duration of only about 26.5 ms due to the hidden lines 26. At the bottom of
(156) In
(157) (i) Define for each frame a stretch, i.e. a temporal region around (e.g. extending after) the active samples of
(ii) Compute num_periods=ceiling(Tframe/(m*Tm)), where ceiling means round up to the nearest integer.
(iii) Cyclically repeat each stretch num_periods times such that an extended marginal signal is obtained for each frame having a total duration of at least Tframe. Note that the extended marginal signal always has a duration that is larger than Tframe, and that it is an integer multiple of Tm.
(158) In the example, Tm=158 ms; Tframe=33.36 ms, so m=1 and num_periods=1, and each frame is extended by zeros to obtain a stretch of 158 ms (=1 period of the message). Note that the actual useful observation in each frame (stretch) is only a fraction of about 0.03 of a complete message, indicated by the bar 48 in
(159) Note that in embodiments it is not necessary to use two separate integers m and num_periods. The point is to determine a time period that is an integer multiple of the message length (duration) Tm, and which is longer than the frame length (duration) Tframe. This period defines a reference time scale or reference frame within which the signals obtained from the different frames can be aligned, as now discussed.
(160) The time-alignment of the observations originating from the different frames is performed using Tframe and the now-defined reference framework or scale determined as explained above. The extended marginal signal of each line is shifted Tframe to the right (in the positive time direction) with respect to the extended marginal signal of its previous frame. However, as the extended marginal signals were made a multiple of the message duration Tm, and because the transmitted message is repeated cyclically, one can replace the shift of each extended marginal signal by a cyclic (wrap around) shift, thus obtaining the results in
(161) That is, as mentioned, the extending discussed above provides a timing reference period, which defines a scale or framework within which to position the signals obtained from each frame. This reference period has a length that is an integer multiple of the message duration Tm. Furthermore, the scale or framework it defines wraps around, i.e. beyond the end of the timing reference period the scale or framework wraps back around to the beginning of the reference period. Hence, if shifting the signal from a given frame right by Tframe relative to its preceding frame causes a portion of that frame's signal to shift off the end or off the right hand side of the reference scale or frame (beyond the timing reference period, i.e. beyond the integer multiple of Tm that has been defined for this purpose), then that portion of the signal continues by reappearing at the beginning of the reference scale or frame (starting from time zero relative to the timing reference period).
(162) Note that in embodiments, it need not be necessary to extend the signals from each frame (the marginal signals) by adding zeros. This is just one way of implementing the idea of creating a wrap-around reference frame that is an integer multiple of the message duration Tm. An equivalent way to implement or consider this would be that this timing reference period (that is an integer multiple of Tm) defines a canvas on which to place the signals from each frame, and on which to shift them by their respective multiples of Tframe in a wrap-around manner.
(163) Note also that in all cyclically-shifted, extended marginal signals, the receiver keeps track of the locations of the active samples originating from the coded light source.
(164) Having results as in
(165) From the FSM being about 0.03, one can expect that it takes at least (0.03)^−1 ≈ 33 frames to recover a complete message. Typically, because of overlap, in embodiments the receiver may need about twice that many frames for complete recovery.
(166) From
(167)
(168) In further embodiments, the procedure described above also can deal with so-called skipped frames. The assumption is that a possibly skipped frame is detected by observing the frame recording times that are given by the camera. If a frame is skipped, the corresponding marginal signal will obtain no valid support in
(169) In yet further embodiments, by observing the correlations between different reconstructed signals (31 of them in
(170) The minimum number of frames required in order to get a complete reassembly is now discussed.
(171) Consider again what happens to covering a message with footprints if:
(172) relative footprint α = Tfootprint/Tf = 0.4
(173) 0 < α < 1 (in practice e.g. α < 0.88 due to hidden lines)
(174) If Tm is about Tframe, the alignment of the messages looks like
(175) If Tm is about 1.5 times Tframe, the alignment of the messages looks like
(176) It turns out that, if α < 1, one obtains non-rolling footprints if the message durations Tm are a multiple of the frame duration Tf. If α < 0.5, one obtains switching footprints if Tm is a half-integer multiple of Tf (0.5, 1.5, 2.5, . . . ).
(177) As discussed previously in relation to
(178)
(179) Note that the singularities for small m are wider than for larger m.
(180) For a non-rolling message duration Tm = T0, define m0, the smallest m such that m0·T0 = k0·Tframe, as the order of the non-rolling T0, with GCD(m0, k0) = 1.
(181) The numbers m0 and k0 determine the repeat pattern of footprints and messages in the neighborhood of T0: about k0 non-rolling footprints go into m0 messages.
(182) Consider a message of duration Tm ≈ T0 in the neighborhood of a non-rolling message duration T0: after 1 round of m0 messages, there are k0 disjoint equidistant footprints partly covering the message.
(183) The non-covered part is T0 − k0·α·Tframe, divided into k0 equal parts of size Tg, where
Tg = (T0 − k0·α·Tframe)/k0 = (T0 − m0·α·T0)/k0 = T0·(1 − m0·α)/k0
(184)
(185)
(186) After 1 round of m0 messages, there are k0 gaps each of duration Tg that have to be covered by the incremental shifts of the footprints in the next rounds.
(187) Considering the shift ΔT of the footprints from one round to the next:
ΔT = m0·|Tm − T0| [ms]
(188) one needs 1 + Tg/ΔT rounds to cover the complete message;
(189) 1 + Tg/ΔT rounds correspond to Nf = (1 + Tg/ΔT)·k0 frames.
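The frame-count estimate can be evaluated numerically. The relations Tg = T0·(1 − m0·α)/k0, ΔT = m0·|Tm − T0| and Nf = (1 + Tg/ΔT)·k0 are those given above; the parameter values chosen here (α, m0, k0, the 2 ms offset of Tm from T0) are illustrative assumptions.

```python
# Numeric sketch of the frame-count estimate: Tg = T0*(1 - m0*alpha)/k0,
# dT = m0*|Tm - T0|, Nf = (1 + Tg/dT)*k0. All parameter values illustrative.

alpha = 0.3                 # relative footprint Tfootprint/Tf
Tframe = 33.36              # frame duration [ms]
m0, k0 = 1, 5               # order of the non-rolling duration: m0*T0 = k0*Tframe
T0 = k0 * Tframe / m0       # non-rolling message duration [ms]
Tm = T0 + 2.0               # actual message duration near T0 [ms]

Tg = T0 * (1 - m0 * alpha) / k0          # gap size left after one round
dT = m0 * abs(Tm - T0)                   # footprint shift per round
Nf = (1 + Tg / dT) * k0                  # frames needed for full coverage

print(round(Nf, 1))
```

Note the hyperbolic behavior as Tm approaches T0: dT goes to zero and Nf diverges, which is exactly the singularity discussed above.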
(190)
(191) Note the hyperbolic behavior of Nf for Tm in the neighborhood of T0. Note also the effect of m0 and T0 on the width of a singularity.
(192) Robust Wiener Filtering
(193) The following describes another part of the decoder which, in embodiments, allows the above implementation to achieve considerably better performance and allows the device to be used with a much wider range of cameras.
(194) A robust Wiener filter is introduced, which can be used e.g. for equalizing a signal that is corrupted by a filter H(f) having unknown parameters, and by additive noise. The robust Wiener filter is a fixed filter that produces optimum results in an MSE sense, assuming that the probability distribution of the filter parameters is known.
(195) Wiener filter theory in itself is well-known in digital signal processing, and has been used extensively since the second world war. Wiener filters can, for instance, be used for estimation of a (linearly) distorted signal in the presence of noise. A Wiener filter (equalizer) then gives the best (mean square error, MSE) result.
(196) In classical (frequency-domain) Wiener filtering, e.g. de-convolution, one has two independent, stationary, zero mean random processes X and N.sub.0 as shown in
(197) In a typical application, X represents an input signal input to a filter H (numeral 54 in
(198) A typical application is the detection of coded light with a rolling shutter camera. In this case, the equivalent digital signal processing problem corresponds to the restoration of a digital signal that has been filtered by a temporal box function. See
(199) (
(200) The task is to find a linear filter G which provides a minimum mean square error estimate of X using only Y. To do this the Wiener filter G is preconfigured based on assumed knowledge of the filter H to be equalized (i.e. undone), as well as N.sub.0. It is configured analytically such that (in theory given knowledge of H and the spectrum of X and N), applying the Wiener filter G to Y (where Y is the input signal X plus the noise N) will result in an output signal X^ that minimizes the mean square error (MSE) with respect to the original input signal X.
(201) The classical Wiener filter formulation (in the frequency domain) is:
(202) G(f) = H*(f)·S(f) / (|H(f)|²·S(f) + N(f)),
where S(f) is the spectral density of the input signal X and N(f) is the spectral density of the noise term N.sub.0.
(203) As can be seen, the formulation of a Wiener filter comprises a representation of the filter to be equalized, in this case in the form of H* and |H|.sup.2 (=HH*). Traditionally in the classical Wiener filter, it is assumed that H(f), the filter to be equalized, and N(f), the noise spectral density, are exactly known. In the case of equalizing for the ISI filter created by a rolling shutter acquisition process, this implies exact knowledge of Texp. It is also assumed that the spectral densities S(f) and N.sub.0(f) of the processes X and N, respectively, are known.
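As a concrete illustration of the classical case with exactly known H, the following numpy sketch equalizes a box-filtered (exposure-averaged) binary signal with the frequency-domain Wiener filter given above. The box length, noise level and the assumption of flat spectral densities for the ±1 symbol stream are illustrative.

```python
# Sketch of classical frequency-domain Wiener deconvolution,
# G = H* S / (|H|^2 S + N), assuming the box filter H (here: a known
# moving average of length L, modeling Texp) and the densities S, N are known.
import numpy as np

rng = np.random.default_rng(0)
n, L = 1024, 16                      # signal length, box-filter length
x = rng.choice([-1.0, 1.0], n)       # random binary "coded light" signal
h = np.zeros(n); h[:L] = 1.0 / L     # box (exposure-time) ISI filter
noise = 0.01 * rng.standard_normal(n)
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h))) + noise

H = np.fft.fft(h)
S = 1.0                              # assumed flat signal spectral density
N = 0.01 ** 2                        # assumed noise spectral density
G = np.conj(H) * S / (np.abs(H) ** 2 * S + N)
x_hat = np.real(np.fft.ifft(G * np.fft.fft(y)))

mse_before = np.mean((y - x) ** 2)
mse_after = np.mean((x_hat - x) ** 2)
print(mse_after < mse_before)        # True: equalization reduces the MSE
```

Note that only the ratio N/S enters the filter, and that the denominator's N term keeps G bounded at the spectral nulls of the box filter, where a naive inverse 1/H would explode; the sensitivity of this filter to errors in H is exactly the issue the robust variant below addresses.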
(204) However, Wiener filters are in fact very sensitive to errors in the estimation of H(f). Some techniques have been developed in the past to deal with an unknown distortion, such as
(205) iterative (time-consuming) approaches, where one tries to vary the target response until one gets the best result; or
(206) mini-max approaches, where one tries to identify the worst case H(f) and optimizes the Wiener filter for this.
(207) A problem therefore in using classical Wiener filtering for equalization, is in applying this theory if the gain of the filter has to be large and the filter to be equalized is not known very accurately.
(208) E.g. for a signal bandwidth in the order of 1 kHz, with Texp in the range of 1/30 of a second, the ISI filter can introduce severe inter-symbol interference (ISI), as shown in
(209) In order to undo this ISI at the receiver side, it would be desirable to provide a powerful equalizer filter that is insensitive to inaccuracies in the definition of H(f).
(210) According to the present disclosure, this can be achieved by computing a fixed average Wiener filter, a Wiener-like filter that is robust under unknown variations of the ISI filter H(f). This robust Wiener filter produces a near-optimal output in terms of MSE, given a statistical distribution of the relevant parameters of H(f).
(211) In an application to coded light, this theory allows one to reconstruct a coded light signal where Texp of the camera is only known approximately, which can often be the case.
(212) The inventors have found a particularly efficient derivation of an optimal robust Wiener filter. In the following the problem is described in the frequency domain (so in terms of H(f), as introduced before). Note that in an application to coded light, the robust Wiener filter may be constructed in real time in a camera-based (smart phone) decoding algorithm, as Texp, and therefore H(f), is defined or changed during the actual read-out of a lamp.
(213) The robust Wiener filtering is based on noting that H(f) is not known exactly, but may in fact depend on at least one unknown quantity θ, i.e. a parameter of H whose value is not known and may in any given case lie within a range of values, e.g. between two limits θ− and θ+ (or more generally θ1 and θ2). That is, it is assumed that the filter H(f; θ) depends on a random parameter θ, independent of X and N.
(214) For a box function of width θ, i.e. a sinc in the frequency domain, one may write:
(215) H(f; θ) = e^(−jπfθ)·sin(πfθ)/(πfθ).
(216) In the case of an ISI filter created by the box, θ is Texp.
(217) The robust Wiener filter 56 is then created by taking the classical Wiener filter representation given above and, wherever a representation of the filter to be equalized appears, replacing it with a corresponding averaged representation that is averaged over the potential values of the unknown parameter θ (e.g. averaged between θ− and θ+, or more generally θ1 and θ2). That is, wherever a term based on H(f) appears, it is replaced with an equivalent term averaged with respect to θ.
(218) Starting from the classical formulation above, this gives:
(219) G(f) = E_θ[H*(f; θ)]·S(f) / (E_θ[|H(f; θ)|²]·S(f) + N(f)),
where E_θ is the average with respect to θ. See also
(220) A derivation of this is now explained in further detail. It is desired to find a fixed linear filter G that provides a linear minimum mean square error estimate
X̂(f) = G(f)·Y(f)
such that
e(f) = E_{X,N,θ}[(X(f) − X̂(f))²]
is minimal.
(221) Extending the classical derivation by taking also the ensemble average with respect to θ, one obtains:
(222) e(f) = S(f) − G(f)·E_θ[H(f; θ)]·S(f) − G*(f)·E_θ[H*(f; θ)]·S(f) + |G(f)|²·(E_θ[|H(f; θ)|²]·S(f) + N(f)),
(223) since X, N and θ are independent and
(224) E[X(f)·N.sub.0*(f)] = 0.
(225) The best G(f) is found by differentiating e with respect to G* and setting the result to 0:
(226) ∂e/∂G*(f) = −E_θ[H*(f; θ)]·S(f) + G(f)·(E_θ[|H(f; θ)|²]·S(f) + N(f)) = 0,
(227) from which one obtains:
G(f) = E_θ[H*(f; θ)]·S(f) / (E_θ[|H(f; θ)|²]·S(f) + N(f)).
(228) In a similar manner, one can incorporate a target response of a matched filter (MF):
(229)
(230) To apply this, there remains the computation of E_θ[H*] and E_θ[HH*]. Some examples are given below.
(231) A first approach is to use a Taylor series expansion of H and the moments of θ. In the coded light rolling shutter application, θ = Texp.
(232)
(233) A Taylor series expansion gives:
(234)
In the rolling shutter application:
(235)
Then:
(236)
(237) This approach works better for low frequencies, since H(f, θ) blows up with increasing frequency.
(238) A second approach is to use a more exact computation assuming a known distribution of θ. Example: θ is uniformly distributed between θ̂ − Δ and θ̂ + Δ, and
(239)
Then:
(240)
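The second (known-distribution) approach can be sketched numerically: the terms H* and |H|² in the classical formula are replaced by their averages over θ, here assumed uniform in [θ̂ − Δ, θ̂ + Δ]. The frequency grid, the nominal Texp, the uncertainty Δ and the spectral densities are all illustrative assumptions.

```python
# Sketch of the robust Wiener filter for the box (exposure) filter, averaging
# H* and |H|^2 over theta assumed uniform in [theta_hat - d, theta_hat + d].
import numpy as np

f = np.linspace(0.001, 2000.0, 512)      # frequency grid [Hz] (illustrative)
theta_hat, d = 1 / 30, 1 / 300           # nominal Texp [s] and its uncertainty
thetas = np.linspace(theta_hat - d, theta_hat + d, 201)

def H(f, theta):
    # Box filter of width theta: sinc magnitude with linear phase.
    return np.exp(-1j * np.pi * f * theta) * np.sinc(f * theta)

S, N = 1.0, 1e-3                         # assumed signal / noise densities

# Average the filter terms over the assumed uniform distribution of theta.
Hs = np.array([H(f, th) for th in thetas])         # shape (201, 512)
EH_conj = np.conj(Hs).mean(axis=0)                 # E_theta[H*]
EHH = (np.abs(Hs) ** 2).mean(axis=0)               # E_theta[|H|^2]

G_robust = EH_conj * S / (EHH * S + N)

# For comparison: the classical filter that assumes theta_hat is exact.
G_classic = np.conj(H(f, theta_hat)) * S / (np.abs(H(f, theta_hat)) ** 2 * S + N)

print(G_robust.shape == G_classic.shape)
```

Unlike G_classic, which is forced to zero at the spectral nulls of the nominal sinc, G_robust stays well-behaved there because the averaged |H|² does not vanish; this is the robustness to an inexactly known Texp described above.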
(241) Although the above has been described in terms of a certain modification to the classical frequency-domain Wiener formulation, there are other Wiener filter formulations (e.g. time-domain formulations, approximations of a Wiener filter, or formulations solved for a particular H), and the principle of replacing an assumed-to-be-known H, or function of H, with an averaged H, or averaged function of H, may also be applied in such formulations.
(242) Note also that the robust Wiener filter disclosed herein can be used to equalize filters other than a box (rectangular) filter, and/or in applications other than receiving coded light. Another example is a band-pass filter having a center frequency f.sub.0 which may not be exactly known. In this case the filter to be equalized is a function of frequency f and center frequency f.sub.0, H(f; f.sub.0), and the robust Wiener filter is determined from an averaged representation of H(f; f.sub.0) averaged with respect to f.sub.0. E.g.:
(243)
(244) Further, the idea of the robust Wiener filter can also be extended to a higher-dimensional θ, i.e. more than one parameter may be allowed to be uncertain. In this case the representation of the filter H to be equalized (e.g. H* and HH*) is averaged over each of the unknown quantities. For example, the parameters may be the center frequency and/or bandwidth of a band-pass filter.
(245) Further, the noise term N.sub.0 could alternatively or additionally represent the spectral density of an interfering signal. A generic term for noise and/or interference is disturbance.
(246) It will be appreciated that the above embodiments have been described only by way of example. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. A computer program may be stored and/or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.