Abstract
A wireless conference system includes an access point and a plurality of conference units, configured for bi-directional, TDMA based wireless communication of latency sensitive audio data packets. The access point and conference units include clocks actively synchronized, generating local audio clock signals used for processing the audio data packets and local synchronization clock signals used for the TDMA based wireless communication. The conference units and access point have a transceiver. The receiver includes a packet loss detection unit configured to detect loss of an audio data packet and including: means to determine an expected arrival time for an audio data packet from the local synchronization clock signal and from a predetermined expected transmission delay, and means to detect whether the audio data packet is lost. The receiver further includes a packet loss concealment unit configured to generate a replacement packet for a lost audio data packet.
Claims
1.-13. (canceled)
14. A wireless conference system adapted to enable a plurality of users to participate to a conference in a conference room, said wireless conference system comprising an access point and a plurality of conference units, wherein said access point and one or more of said conference units comprise a transceiver configured for bi-directional, time division multiple access based or TDMA based wireless communication of latency sensitive audio data packets between said one or more conference units and said access point, said transceiver comprising a transmitter and receiver; wherein said access point and said one or more conference units comprise respective clocks that are actively synchronized, a clock of said respective clocks being configured to generate a local audio clock signal used locally for processing said audio data packets and a local synchronization clock signal used for said TDMA based wireless communication; wherein said receiver comprises a packet loss detection unit configured to detect loss of an audio data packet transmitted from a conference unit to said access point or vice-versa, said packet loss detection unit comprising: means configured to determine an expected arrival time for said audio data packet from said local synchronization clock signal and a predetermined expected transmission delay, and means configured to detect that said audio data packet is lost if it has not arrived by said expected arrival time; and wherein said receiver comprises a packet loss concealment unit configured to generate a replacement packet for said audio data packet that is detected to be lost by said packet loss detection unit.
15. The wireless conference system according to claim 14, wherein said access point and said one or more conference units are configured to not acknowledge receipt of audio data packets.
16. The wireless conference system according to claim 14, wherein said access point and said one or more conference units are configured to not retransmit a lost audio data packet.
17. The wireless conference system according to claim 14, wherein said latency sensitive audio data packets have a round trip time latency limit of 25 milliseconds for wireless transfer from a conference unit to said access point, and wireless transfer from said access point to said conference unit.
18. The wireless conference system according to claim 14, wherein said latency sensitive audio data packets have a round trip time latency limit of 15 milliseconds for wireless transfer from a conference unit to said access point, and wireless transfer from said access point to said conference unit.
19. The wireless conference system according to claim 14, wherein said TDMA based wireless communication uses TDMA frames of 5 milliseconds.
20. The wireless conference system according to claim 14, wherein said transmitter is configured to listen for interfering traffic within an assigned timeslot within a TDMA frame before transmitting an audio data packet therein.
21. The wireless conference system according to claim 14, wherein said wireless communication uses Wi-Fi.
22. The wireless conference system according to claim 14, wherein said one or more conference units comprise clock synchronization units, configured to actively synchronize their respective clocks with a clock in said access point based on a timestamp inserted in beacon messages regularly broadcasted by said access point.
23. The wireless conference system according to claim 14, wherein said predetermined expected transmission delay is determined as a sum of a propagation delay, jitter, an interrupt handling delay, processing delay and clock synchronization inaccuracy.
24. The wireless conference system according to claim 23, wherein said jitter delay comprises a listen-before-talk jitter contribution.
25. The wireless conference system according to claim 14, wherein said predetermined expected transmission delay is set at a value between 1.5 milliseconds and 2 milliseconds.
26. A method for transfer of latency sensitive audio data packets between one or more conference units and an access point in a wireless conference system adapted to enable a plurality of users to participate to a conference in a conference room, said method for transfer comprising bi-directional, time division multiple access based or TDMA based wireless communication of said audio data packets, said method further comprising: actively synchronizing respective clocks in said one or more conference units and said access point, a clock of said respective clocks being configured to generate a local audio clock signal used locally for processing said audio data packets and a local synchronization clock signal used for said TDMA based wireless communication; detecting loss of an audio data packet transmitted from a conference unit to said access point or vice-versa, comprising: determining an expected arrival time for said audio data packet from said local synchronization clock signal and a predetermined expected transmission delay; and detecting that said audio data packet is lost if it has not arrived by said expected arrival time; and generating a replacement packet for said audio data packet that is detected to be lost through packet loss concealment.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0051] FIG. 1 schematically shows an embodiment of the wireless conference system 100 according to the invention;
[0052] FIG. 2 is a functional block scheme of an embodiment of a conference unit 200 that forms part of an embodiment of the wireless conference system 100 according to the invention;
[0053] FIG. 3 is a functional block scheme of an embodiment of an access point 300 that forms part of an embodiment of the wireless conference system 100 according to the invention;
[0054] FIG. 4 is a pseudo flow diagram illustrating an embodiment of the method for transfer of latency sensitive audio data packets according to the invention; and
[0055] FIG. 5 shows an example embodiment of a suitable computing system 500 for performing one or several steps in embodiments of the invention.
DETAILED DESCRIPTION OF EMBODIMENT(S)
[0056] FIG. 1 shows a wireless conference system comprising an access point AP or 101 and five conference units CU1-CU5 or 111-115. The conference units 111-115 capture audio signals from respective conference participants using a microphone, digitize the audio signal into audio data packets, and transmit the audio data packets to the access point 101. The access point 101 receives audio packets from one or several conference units, processes the audio packets and generates new audio packets for distribution to all conference units 111-115 such that the conference units 111-115 receive an audio stream corresponding to the audio signal generated by a single, selected conference participant (the selection may be made by a chairperson controlling the access point 101), the combined audio signals generated by plural, selected conference participants (again, the selection may be made by a chairperson controlling the access point 101), or a processed audio signal such as an interpreted or translated audio signal, an interpretation or translation in overlay of the original audio signal, etc. In the embodiment of FIG. 1, the conference units 111-115 communicate wirelessly and bi-directionally with the access point 101 using Wi-Fi. The selected Wi-Fi channel for the upstream direction, i.e. the direction from the conference units 111-115 to the access point 101, is shared using a soft time division multiple access (TDMA) protocol. The soft TDMA protocol requires the transmitter in a conference unit to first listen whether the channel is free before transmitting in the assigned timeslot. The TDMA frame in the conference system 100 of FIG. 1 is assumed to have a length of 5 milliseconds. The TDMA frame comprises 10 timeslots of 0.5 milliseconds. If the channel is occupied, the transmitter will wait until the channel is free. This listen-before-talk or LBT mechanism introduces jitter, i.e. uncertainty on the arrival time of a packet at the access point 101 within the assigned timeslot. In the downstream direction, i.e. the direction from the access point 101 to the conference units 111-115, the audio data packets are also transmitted in timeslots of a TDMA frame. Typically, the same TDMA frame is used for up- and downstream transmission, and the same Wi-Fi channel is used for up- and downstream transmission. The transmitter in the access point 101 also implements the LBT mechanism to avoid interference with other Wi-Fi networks using the same channel.
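The frame arithmetic and the listen-before-talk jitter described above can be illustrated with a minimal sketch. The frame length (5 milliseconds) and slot count (10) are taken from the description; the function names, the use of integer microseconds, and the `busy_until_us` parameter modelling when the channel becomes free are assumptions made purely for illustration and do not appear in the embodiment.

```python
FRAME_US = 5_000              # TDMA frame length from the description: 5 ms
SLOTS_PER_FRAME = 10          # 10 timeslots per frame
SLOT_US = FRAME_US // SLOTS_PER_FRAME  # 500 us per timeslot


def slot_start_us(frame_index: int, slot_index: int) -> int:
    """Nominal start time of a timeslot, in microseconds on the shared clock."""
    if not 0 <= slot_index < SLOTS_PER_FRAME:
        raise ValueError("slot_index out of range")
    return frame_index * FRAME_US + slot_index * SLOT_US


def lbt_transmit_time_us(slot_start: int, busy_until_us: int) -> int:
    """Listen-before-talk: send at the slot start, or wait until the channel is free.

    busy_until_us is a hypothetical stand-in for the result of channel sensing.
    """
    return max(slot_start, busy_until_us)


def lbt_jitter_us(slot_start: int, busy_until_us: int) -> int:
    """Extra delay introduced by waiting for an occupied channel."""
    return lbt_transmit_time_us(slot_start, busy_until_us) - slot_start
```

For example, a transmitter assigned slot 3 of frame 2 nominally starts at 11.5 milliseconds; if the channel stays occupied until 11.8 milliseconds, the packet leaves 300 microseconds late, which is the jitter the receiver must tolerate.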
[0057] FIG. 2 shows the functional building blocks of a conference unit CU or 200 relevant in view of the present invention. The conference unit 200 of FIG. 2 represents a possible implementation of each of the conference units 111-115 drawn in FIG. 1. The conference unit 200 comprises a transceiver 201, an antenna 202 for wireless transmission, a clock 203, a clock synchronization unit 204, an audio input connector or microphone connector 205, and an audio output connector or headphones connector 206. FIG. 2 further shows schematically the microphone 250 connected to microphone connector 205 to capture audio signals, for instance speech from a conference participant, and the headphones 260 connected to the audio output connector 206, or a built-in speaker, to enable the conference participant to follow the conference conversation. The transceiver 201 comprises a transmitter 210 and receiver 220. The transmitter 210 comprises a processor 211 that is configured to receive the audio signal captured by microphone 250, to digitize this audio signal, and to packetize the digitized audio signal into audio data packets each containing 5 milliseconds of captured speech. The processor 211 forwards the generated audio data packets to the antenna 202, i.e. the radio interface that transmits the audio data packets in upstream direction to the access point of the conference system using the Wi-Fi protocol with soft TDMA and LBT mechanism as explained above. In the downstream direction, the antenna 202 regularly receives Wi-Fi beacon messages from the access point it associates with. The beacon messages comprise a timestamp or time value that is used by the clock synchronization unit 204 to synchronize the clock 203 with a clock in the access point. 
The clock signal, locally generated by clock 203 in conference unit 200 but actively synchronized with a clock in the access point through the beacon messages, is used by an estimated time of arrival unit 223 that estimates the time of arrival of the next audio data packet. As clock 203 is actively synchronized with the clock used in the access point for transmission of audio data packets, the ETA unit 223 can accurately determine the time the next audio data packet is ready for transmission at the access point. The ETA unit 223 increases that time by a predetermined transmission delay, i.e. an acceptable time required for interrupt handling at the transmitter and receiver side, propagation through the air over the Wi-Fi channel, and possible jitter resulting, for instance, from the LBT mechanism implemented at the transmitter side. The transmission delay may further account for clock synchronization inaccuracies between clock 203 and the clock in the access point with which it is actively synchronized. In the example of FIG. 2, the transmission delay is predetermined to correspond to 1.5 milliseconds. The locally generated clock signal and predetermined transmission delay enable the ETA unit 223 to conservatively estimate the arrival time of the next audio data packet. A loss detection unit 224 that receives an interrupt from processor 221 upon arrival of an audio data packet verifies if an audio data packet is received by the next estimated time of arrival of an audio data packet. If an audio data packet is received in time, the processor 221 shall process the received audio data packet such that the audio samples contained therein can be output as part of the audio stream via audio connector 206. 
Each time no audio data packet is received by the estimated time of arrival of an audio data packet, the loss detection unit 224 triggers a packet concealment unit 225 that forms part of receiver 220 to generate a replacement packet and to supply the replacement packet to the processor 221 for insertion in the audio stream outputted via audio connector 206. The packet concealment unit 225 applies processing intensive concealment algorithms, typically using earlier received audio data packets to generate a replacement audio data packet that, when inserted in the received audio stream to replace a missing audio data packet, does not generate audible artefacts. Thanks to the active clock synchronization and arrival time estimation of audio data packets, lost packet concealment can be triggered early. This enables the conference unit 200 to meet latency requirements set for conference systems, typically ranging between 15 milliseconds and 25 milliseconds for the round-trip time of audio packets travelling back and forth between the conference unit 200 and access point.
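The arrival-time estimate and loss decision made by the ETA unit 223 and the loss detection unit 224 can be sketched as follows. The 1.5 millisecond total and the delay components (propagation, jitter, interrupt handling, processing, clock synchronization inaccuracy) come from the description and claims; the split of the 1.5 milliseconds over the individual components, and all class and function names, are hypothetical values chosen only so that the sum matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DelayBudget:
    """Components of the predetermined expected transmission delay (microseconds)."""
    propagation_us: int
    jitter_us: int
    interrupt_us: int
    processing_us: int
    sync_error_us: int

    def total_us(self) -> int:
        return (self.propagation_us + self.jitter_us + self.interrupt_us
                + self.processing_us + self.sync_error_us)


# Illustrative split only; the source gives just the 1.5 ms total.
EXAMPLE_BUDGET = DelayBudget(propagation_us=1, jitter_us=1_000, interrupt_us=200,
                             processing_us=250, sync_error_us=49)


def expected_arrival_us(scheduled_tx_us: int, budget: DelayBudget) -> int:
    """Expected arrival time: scheduled transmit time plus the delay budget."""
    return scheduled_tx_us + budget.total_us()


def is_lost(arrival_us: Optional[int], deadline_us: int) -> bool:
    """A packet counts as lost if it has not arrived by the expected arrival time."""
    return arrival_us is None or arrival_us > deadline_us
```

With this budget, a packet scheduled for transmission at 10 milliseconds has an expected arrival time of 11.5 milliseconds, so concealment could start at that instant rather than after a longer, generic timeout.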
[0058] FIG. 3 shows the functional building blocks of an access point AP or 300 relevant in view of the present invention. The access point 300 of FIG. 3 represents a possible implementation of the access point 101 drawn in FIG. 1. The access point 300 comprises a transceiver 301, an antenna 302 for wireless transmission, a clock 303, and an interface 304 for the chairperson or organization of a conference. The transceiver 301 comprises a transmitter 310 and receiver 320. The transmitter 310 comprises a processor 311 that is configured to generate audio data packets each containing 5 milliseconds of audio. The content of these audio data packets is controlled by the chairperson of the conference through interface 304. The content of these audio data packets may for example be audio received from a single conference unit, audio received from plural conference units, audio received from an interpreter or translator, etc. The processor 311 forwards the generated audio data packets to the antenna 302, i.e. the radio interface that transmits the audio data packets in downstream direction to the conference units of the conference system using the Wi-Fi protocol with soft TDMA and LBT mechanism as explained above. In the downstream direction, the antenna 302 also regularly transmits Wi-Fi beacon messages containing for instance an access point identifier to enable conference units to associate with the access point 300. These beacon messages also comprise a timestamp or time value from clock 303 enabling conference units to actively synchronize their clock with the clock 303 in the access point 300. The clock signal, locally generated by clock 303 in access point 300, is used by an estimated time of arrival unit 323 that estimates the time of arrival of the next audio data packet coming from a conference unit. 
As clock 303 is actively synchronized with the clock used in the conference unit for transmission of audio data packets, the ETA unit 323 can accurately determine the time the next audio data packet is ready for transmission at the conference unit. The ETA unit 323 increases that time by a predetermined transmission delay, i.e. an acceptable time required for interrupt handling at the transmitter and receiver side, propagation through the air over the Wi-Fi channel, and possible jitter resulting, for instance, from the LBT mechanism implemented at the transmitter side. The transmission delay may further account for clock synchronization inaccuracies between clock 303 and the clock in the conference unit actively synchronized therewith. In the example of FIG. 3, the transmission delay is predetermined to correspond to 1.5 milliseconds. The locally generated clock signal and predetermined transmission delay enable the ETA unit 323 to conservatively estimate the arrival time of the next audio data packet from a conference unit. A loss detection unit 324 that receives an interrupt from processor 321 upon arrival of an audio data packet verifies if an audio data packet is received by the next estimated time of arrival of an audio data packet. If an audio data packet is received in time, the processor 321 shall process the received audio data packet such that the audio samples contained therein can be output and used for downstream transmission. Each time no audio data packet is received by the estimated time of arrival of an audio data packet, the loss detection unit 324 triggers a packet concealment unit 325 that forms part of receiver 320 to generate a replacement packet and to supply the replacement packet to the processor 321 for insertion in the audio stream outputted. 
The packet concealment unit 325 applies processing intensive concealment algorithms, typically using earlier received audio data packets to generate a replacement audio data packet that, when inserted in the received audio stream to replace a missing audio data packet, does not generate audible artefacts. Thanks to the active clock synchronization and arrival time estimation of audio data packets, lost packet concealment can be triggered early. This enables the access point 300 to meet latency requirements set for conference systems, typically ranging between 15 milliseconds and 25 milliseconds for the round-trip time of audio packets travelling back and forth between a conference unit and access point 300.
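The beacon-based clock synchronization relied upon in FIG. 2 and FIG. 3 can be sketched in a simplified form. The description only states that a timestamp inserted in the beacon messages is used to synchronize the conference unit clock with the access point clock; the plain offset correction below, the class and method names, and the assumption that TDMA frames start at multiples of 5 milliseconds on the access point clock are illustrative assumptions. A practical implementation would presumably also compensate for beacon transit time and oscillator drift.

```python
FRAME_US = 5_000  # TDMA frame length from the description: 5 ms


class SyncedClock:
    """Hypothetical conference unit clock corrected by beacon timestamps."""

    def __init__(self) -> None:
        self.offset_us = 0  # access point time minus local time

    def on_beacon(self, beacon_timestamp_us: int, local_rx_us: int) -> None:
        """Update the offset from the timestamp carried in a beacon message."""
        self.offset_us = beacon_timestamp_us - local_rx_us

    def now_us(self, local_us: int) -> int:
        """Translate a local clock reading into access point time."""
        return local_us + self.offset_us


def next_frame_start_us(synced_now_us: int) -> int:
    """Start of the next TDMA frame, assuming frames align to multiples of 5 ms."""
    return (synced_now_us // FRAME_US + 1) * FRAME_US
```

Once the offset is known, both sides can derive the same frame boundaries from their own clocks, which is what allows the receiver to compute an expected arrival time without any per-packet signalling.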
[0059] FIG. 4 represents a pseudo flow diagram illustrating an embodiment of the method for transfer of latency sensitive audio data packets according to the invention. The pseudo flow diagram of FIG. 4 illustrates the steps performed at the receiver side. In a first step 401, an expected arrival time is determined for an audio data packet from a time signal received from an actively synchronized clock 411 and an expected transmission delay 412, predetermined and recorded for instance in a computer memory or register. In a second step 402, it is verified whether an audio data packet has been received by the expected arrival time determined in the first step 401. In case an audio data packet has been received by the expected arrival time, the received audio data packet is processed normally in step 403. In case no audio data packet has been received by the expected arrival time, lost packet concealment is activated in step 404 in order to generate a replacement audio packet for the lost audio data packet. The lost packet concealment of step 404 is activated as soon as the expected arrival time is reached. At last, the processed audio from the received audio data packet or the generated audio through concealment is streamed in step 405 to form a continuous audio stream without audible artefacts.
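Steps 401 to 405 of FIG. 4 can be sketched as an offline simulation over a list of timeslots. The concealment function below is a deliberately naive placeholder (it repeats the previous packet at half amplitude, or emits silence when no history exists); the description states only that concealment typically uses earlier received packets and does not specify an algorithm. All names, the tuple layout of `slots`, and the use of integer PCM samples are assumptions for illustration.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

Packet = List[int]  # PCM samples of one audio data packet (illustrative)
Slot = Tuple[int, Optional[int], Optional[Packet]]  # (scheduled_tx_us, arrival_us, packet)


def conceal(history: Sequence[Packet], size: int) -> Packet:
    """Step 404 placeholder: attenuated repeat of the last packet, else silence."""
    if history:
        return [sample // 2 for sample in history[-1]]
    return [0] * size


def receive_stream(slots: Iterable[Slot], delay_us: int, size: int) -> List[int]:
    """Run steps 401-405 over a sequence of slots and return the output samples."""
    history: List[Packet] = []
    stream: List[int] = []
    for scheduled_tx_us, arrival_us, packet in slots:
        deadline_us = scheduled_tx_us + delay_us                         # step 401
        on_time = arrival_us is not None and arrival_us <= deadline_us   # step 402
        if on_time and packet is not None:
            frame = list(packet)                                         # step 403
        else:
            frame = conceal(history, size)                               # step 404
        history.append(frame)
        stream.extend(frame)                                             # step 405
    return stream
```

In a real receiver the check of step 402 would be driven by a timer firing at the expected arrival time rather than by iterating over recorded arrivals; the simulation only shows how the decision and the resulting continuous stream fit together.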
[0060] FIG. 5 shows a suitable computing system 500 capable of performing one or several steps in embodiments of the method for transfer of latency sensitive audio data packets according to the invention. Computing system 500 may in general be formed as a suitable general-purpose computer and comprise a bus 510, a processor 502, a local memory 504, one or more optional input interfaces 514, one or more optional output interfaces 516, a communication interface 512, a storage element interface 506, and one or more storage elements 508. Bus 510 may comprise one or more conductors that permit communication among the components of the computing system 500. Processor 502 may include any type of conventional processor or microprocessor that interprets and executes programming instructions. Local memory 504 may include a random-access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 502 and/or a read only memory (ROM) or another type of static storage device that stores static information and instructions for use by processor 502. Input interface 514 may comprise one or more conventional mechanisms that permit an operator or user to input information to the computing system 500, such as a keyboard 520, a mouse 530, a pen, voice recognition and/or biometric mechanisms, a camera, etc. Output interface 516 may comprise one or more conventional mechanisms that output information to the operator or user, such as a display 540, etc. Communication interface 512 may comprise any transceiver-like mechanism such as for example one or more Ethernet interfaces that enables computing system 500 to communicate with other devices and/or systems, for example with other computing devices 581, 582, 583. The communication interface 512 of computing system 500 may be connected to such another computing system by means of a local area network (LAN) or a wide area network (WAN) such as for example the internet. 
Storage element interface 506 may comprise a storage interface such as for example a Serial Advanced Technology Attachment (SATA) interface or a Small Computer System Interface (SCSI) for connecting bus 510 to one or more storage elements 508, such as one or more local disks, for example SATA disk drives, and control the reading and writing of data to and/or from these storage elements 508. Although the storage element(s) 508 above is/are described as a local disk, in general any other suitable computer-readable media such as a removable magnetic disk, optical storage media such as a CD-ROM or DVD-ROM disk, solid state drives, flash memory cards, etc. could be used. Computing system 500 could thus correspond to the processing circuitry 211, 221 in the embodiment illustrated by FIG. 2, or the processing circuitry 311, 321 in the embodiment illustrated by FIG. 3. The processing circuitry used respectively in the transmitter and in the receiver of these embodiments may evidently form part of a single processor or computer as is indicated by the dashed rectangles in these drawings.
[0061] Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. In other words, it is contemplated to cover any and all modifications, variations or equivalents that fall within the scope of the basic underlying principles and whose essential attributes are claimed in this patent application. It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions. 
It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.