Method and system for measuring audio transmission delay
09755933 · 2017-09-05
Assignee
Inventors
Cpc classification
H04M7/006
ELECTRICITY
H04M7/0081
ELECTRICITY
International classification
H04M7/00
ELECTRICITY
Abstract
A method and a system for measuring an audio transmission delay are provided. Synchronization operation is performed on transmission of an original audio codebook to be tested between a transmitter and a receiver. A transmitter starts sending the original audio codebook to be tested to a receiver in response to sending start instruction information, and stops sending the original audio codebook to the receiver in response to sending end instruction information. The receiver starts capturing the original audio codebook from the transmitter in response to receiving start instruction information and stops capturing the original audio codebook from the transmitter in response to receiving end instruction information. The audio transmission delay is obtained based on a test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver.
Claims
1. A method for measuring an audio transmission delay, comprising: performing synchronization operation on transmission of an original audio codebook to be tested between a transmitter and a receiver, to determine sending start instruction information, sending end instruction information, receiving start instruction information and receiving end instruction information of the original audio codebook; starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information to obtain a test audio codebook; stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information; and obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver, wherein obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver comprises:
2. The method according to claim 1, wherein: the sending start instruction information comprises a sending start time, the sending end instruction information comprises a sending end time, the receiving start instruction information comprises a receiving start time and the receiving end instruction information comprises a receiving end time; starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information comprises starting, by the transmitter, sending the original audio codebook to the receiver at the sending start time; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information comprises stopping, by the transmitter, sending the original audio codebook to the receiver at the sending end time; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information comprises starting, by the receiver, capturing the original audio codebook from the transmitter at the receiving start time; and stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information comprises stopping, by the receiver, capturing the original audio codebook from the transmitter at the receiving end time.
3. The method according to claim 2, wherein: the sending start time is the same as the receiving start time, and the sending end time is the same as the receiving end time; or the sending start time is the same as the receiving start time, and a difference between the sending end time and the receiving end time is smaller than a first predetermined threshold; or a difference between the sending start time and the receiving start time is smaller than a second predetermined threshold, and the sending end time is the same as the receiving end time; or the difference between the sending start time and the receiving start time is smaller than a third predetermined threshold, and the difference between the sending end time and the receiving end time is smaller than a fourth predetermined threshold.
4. The method according to claim 1, wherein performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver comprises: performing information interaction between the transmitter and the receiver, so that an order in which the transmitter sends a plurality of original audio codebooks is the same as an order in which the receiver receives the plurality of original audio codebooks.
5. The method according to claim 2, wherein performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver comprises: performing synchronization operation on transmission of the original audio codebook between the transmitter and the receiver by a first GPS synchronization control unit arranged in the transmitter and a second GPS synchronization control unit arranged in the receiver, wherein each of the first GPS synchronization control unit and the second GPS synchronization control unit comprises a GPS device which comprises a GPS antenna and a GPS receiving module, wherein the GPS antenna is configured to transmit at least one of the sending start time, the sending end time, the receiving start time, and the receiving end time, and wherein the GPS receiving module is configured to receive at least one of the sending start time, the sending end time, the receiving start time and the receiving end time.
6. The method according to claim 1, wherein: the sending start instruction information comprises first instruction information for instructing the receiver to be prepared for receiving, the sending end instruction information comprises second instruction information for instructing an end of playing of the original audio codebook, the receiving start instruction information comprises third instruction information for instructing the receiver to start receiving, and the receiving end instruction information comprises capturing duration carried in the second instruction information; starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information comprises starting, by the transmitter, sending the original audio codebook to the receiver in response to receiving the first instruction information; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information comprises stopping, by the transmitter, sending the original audio codebook to the receiver in response to receiving the second instruction information; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information comprises starting, by the receiver, capturing the original audio codebook from the transmitter in response to receiving the third instruction information; and stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information comprises determining, by the receiver, whether duration for capturing the original audio codebook from the transmitter exceeds the capturing duration, and stopping capturing the original audio codebook from the transmitter in a case that the duration for capturing the original audio codebook from the transmitter exceeds the capturing duration.
7. A system for measuring an audio transmission delay, comprising: a transmitter; a receiver; one or more processors; and one or more memories storing program instructions, that when executed by the one or more processors, configure the system to perform the following operations: performing synchronization operation on transmission of an original audio codebook to be tested between the transmitter and the receiver, and determining sending start instruction information, sending end instruction information, receiving start instruction information, and receiving end instruction information of the original audio codebook; starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information to obtain a test audio codebook; stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information; and obtaining the audio transmission delay based on the test audio codebook and the original audio codebook pre-stored in the receiver, wherein obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver comprises:
8. The system according to claim 7, wherein: starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information comprises starting, by the transmitter, sending the original audio codebook to the receiver at a sending start time, wherein the sending start instruction information comprises the sending start time; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information comprises stopping, by the transmitter, sending the original audio codebook to the receiver at a sending end time, wherein the sending end instruction information comprises the sending end time; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information comprises starting, by the receiver, capturing the original audio codebook from the transmitter at a receiving start time, wherein the receiving start instruction information comprises the receiving start time; and stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information comprises stopping, by the receiver, capturing the original audio codebook from the transmitter at a receiving end time, wherein the receiving end instruction information comprises the receiving end time.
9. The system according to claim 8, wherein: the sending start time is the same as the receiving start time, and the sending end time is the same as the receiving end time; or the sending start time is the same as the receiving start time, and a difference between the sending end time and the receiving end time is smaller than a first predetermined threshold; or a difference between the sending start time and the receiving start time is smaller than a second predetermined threshold, and the sending end time is the same as the receiving end time; or the difference between the sending start time and the receiving start time is smaller than a third predetermined threshold, and the difference between the sending end time and the receiving end time is smaller than a fourth predetermined threshold.
10. The system according to claim 7, wherein performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver comprises: performing information interaction between the transmitter and the receiver, so that an order in which the transmitter sends a plurality of original audio codebooks is the same as an order in which the receiver receives the plurality of original audio codebooks.
11. The system according to claim 8, wherein performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver comprises: performing synchronization operation on transmission of the original audio codebook between the transmitter and the receiver by a first GPS synchronization control unit arranged in the transmitter and a second GPS synchronization control unit arranged in the receiver, wherein each of the first GPS synchronization control unit and the second GPS synchronization control unit comprises a GPS device which comprises a GPS antenna and a GPS receiving module, wherein the GPS antenna is configured to transmit at least one of the sending start time, the sending end time, the receiving start time, and the receiving end time, and wherein the GPS receiving module is configured to receive at least one of the sending start time, the sending end time, the receiving start time and the receiving end time.
12. The system according to claim 7, wherein: starting, by the transmitter, sending the original audio codebook to the receiver according to the sending start instruction information comprises starting, by the transmitter, sending the original audio codebook to the receiver in response to receiving first instruction information, wherein the first instruction information is used to instruct the receiver to be prepared for receiving; stopping, by the transmitter, sending the original audio codebook to the receiver according to the sending end instruction information comprises stopping, by the transmitter, sending the original audio codebook to the receiver in response to receiving second instruction information, wherein the second instruction information is used to instruct an end of playing of the original audio codebook; starting, by the receiver, capturing the original audio codebook from the transmitter according to the receiving start instruction information comprises starting, by the receiver, capturing the original audio codebook from the transmitter in response to receiving third instruction information, wherein the third instruction information is used to instruct the receiver to start receiving; and stopping, by the receiver, capturing the original audio codebook from the transmitter according to the receiving end instruction information comprises determining, by the receiver, whether duration for capturing the original audio codebook from the transmitter exceeds the capturing duration, and stopping capturing the original audio codebook from the transmitter in a case that the duration for capturing the original audio codebook from the transmitter exceeds the capturing duration.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The drawings here are provided for further understanding the present disclosure, and are a part of the application. The illustrative embodiments according to the present disclosure and descriptions thereof are intended to explain the invention, rather than be an inappropriate limit to the invention. In the drawings,
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
DETAILED DESCRIPTION
(15) Expressions and terms in the description of the embodiments according to the present disclosure are subject to the following explanations.
(16) Technical solutions according to embodiments of the invention are described clearly and completely hereinafter in conjunction with the drawings, so that those in the art can better understand the solutions of the invention. Apparently, the described embodiments are only a few rather than all of the embodiments of the invention. Any other embodiments obtained by those skilled in the art based on the embodiments in the present disclosure without any creative work fall in the scope of the invention.
(17) It should be noted that terms such as "first" and "second" in the specification, claims and the drawings above of the present disclosure are used to distinguish between similar objects, rather than to represent a specific order or a priority sequence. It is understood that such terms can be interchanged in appropriate cases, so that the described embodiments according to the present disclosure can be implemented in an order other than the order illustrated or described herein. Besides, the terms "include", "have" and any variant thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device which includes a series of steps or units is not limited to steps or units explicitly listed, but may further include other steps or units which are not explicitly listed or inherent to the process, method, system, product or device.
(18) First Embodiment
(19) A method for measuring an audio transmission delay is provided according to an embodiment of the disclosure. As shown in
(20) At S202, synchronization operation is performed on transmission of an original audio codebook to be tested between a transmitter and a receiver, to obtain sending start instruction information, sending end instruction information, receiving start instruction information and receiving end instruction information of the original audio codebook.
(21) Optionally, instruction information for controlling the start and the end of sending and receiving the original audio codebook is obtained by performing synchronization operation on transmission of the original audio codebook to be tested.
(22) Optionally, an apparatus for performing synchronization operation includes but is not limited to a GPS synchronization control apparatus and a synchronization control apparatus of a signaling control server, according to an embodiment of the disclosure.
(23) It should be noted that the synchronization operation above is performed to coordinate starting and ending of audio play at the transmitter and starting and ending of audio capture at the receiver, i.e., control the transmitter to start or stop playing the codebook and notify the receiver to start or stop capturing the audio.
(24) For example, as shown in
(25) At S204, the transmitter starts sending the original audio codebook to be tested to the receiver in response to the sending start instruction information, and stops sending the original audio codebook to the receiver in response to the sending end instruction information. The receiver starts capturing the original audio codebook from the transmitter in response to the receiving start instruction information and stops capturing the original audio codebook from the transmitter in response to the receiving end instruction information.
(26) For example, as shown in
(27) At S206, the audio transmission delay is obtained based on a test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver.
(28) For example, as shown in
(29) In the embodiment according to the present disclosure, the action of sending audio by the transmitter and the action of capturing audio by the receiver are exactly synchronous, so that the original audio codebook for calculating the delay and the captured test audio codebook undergoing the transmission delay are synchronous.
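As an illustration of how a delay could be derived from the pre-stored original audio codebook and the captured test audio codebook, the following is a minimal sketch that assumes a plain cross-correlation search over sample offsets. The comparison method, the function names and the sample values are assumptions made for illustration only; they are not taken from the disclosure.

```python
def estimate_delay_samples(original, captured):
    """Return the offset (in samples) at which `captured` best matches `original`.

    Assumption: an unnormalized cross-correlation is used as the similarity
    score. The disclosure does not state this; it is only an illustration.
    """
    if len(captured) < len(original):
        raise ValueError("captured signal must be at least as long as the original")
    best_lag, best_score = 0, float("-inf")
    max_lag = len(captured) - len(original)
    for lag in range(max_lag + 1):
        # Correlate the original against the captured window starting at `lag`.
        score = sum(o * captured[lag + i] for i, o in enumerate(original))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_delay_seconds(original, captured, sample_rate_hz):
    """Convert the best-matching sample offset into seconds."""
    return estimate_delay_samples(original, captured) / sample_rate_hz


# Hypothetical data: the captured signal is the original shifted by 3 samples.
original = [0.0, 1.0, -1.0, 0.5]
captured = [0.0] * 3 + original + [0.0] * 2
delay_s = estimate_delay_seconds(original, captured, sample_rate_hz=8000)
```

Because playing and capturing start together, the offset found inside the captured signal can be read directly as the one-way delay under this assumption.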
(30) Optionally, the sending start instruction information includes sending start time, the sending end instruction information includes sending end time, the receiving start instruction information includes receiving start time and the receiving end instruction information includes receiving end time.
(31) Optionally, the transmitter starting sending the original audio codebook to be tested to the receiver in response to the sending start instruction information includes: the transmitter starting sending the original audio codebook to the receiver at the sending start time. The sending start time may include but is not limited to the time when to start playing the audio.
(32) For example, as shown in
(33) Optionally, the transmitter stopping sending the original audio codebook to the receiver in response to the sending end instruction information includes: the transmitter stopping sending the original audio codebook to the receiver at the sending end time. The sending end time may include but is not limited to the time when to stop playing the audio.
(34) For example, as shown in
(35) Optionally, the receiver starting capturing the original audio codebook from the transmitter in response to the receiving start instruction information includes: the receiver starting capturing the original audio codebook from the transmitter at the receiving start time. The receiving start time may include but is not limited to the time when to start capturing the audio.
(36) For example, as shown in
(37) The receiver stopping capturing the original audio codebook from the transmitter in response to the receiving end instruction information includes: the receiver stopping capturing the original audio codebook from the transmitter at the receiving end time. The receiving end time may include but is not limited to the time when to stop capturing the audio.
(38) For example, as shown in
(39) In the embodiment according to the present disclosure, with the instructions of the start times and the end times of the transmitter and the receiver, the precise synchronization between the transmitter and the receiver is achieved and the accuracy of the delay calculation is improved.
(40) Optionally, in the embodiment, the synchronization between the transmitter and the receiver can be determined in any of the following four ways.
(41) A first optional determination way is that the sending start time is the same as the receiving start time and the sending end time is the same as the receiving end time.
(42) Optionally, the start time and the end time of the transmitter are the same as those of the receiver, respectively, thereby achieving synchronous operation on the audio codebook. For example, the sending start time is $T_1$, the receiving start time is also $T_1$, the sending end time is $T_2$, and the receiving end time is also $T_2$.
(43) A second optional determination way is that the sending start time is the same as the receiving start time and a difference between the sending end time and the receiving end time is smaller than a first predetermined threshold.
(44) Optionally, the start times of the transmitter and the receiver are the same, and the difference between end times of the transmitter and the receiver is smaller than the first predetermined threshold, thereby achieving synchronous operation on the audio codebook. For example, the sending start time is $T_1$, the receiving start time is also $T_1$, the sending end time is $T_2$, the receiving end time is $T_3$, and $T_3 - T_2 < A_1$, where $A_1$ is the first predetermined threshold. It can then be determined that the transmitter and the receiver are synchronized.
(45) A third optional determination way is that a difference between the sending start time and the receiving start time is smaller than a second predetermined threshold and the sending end time is the same as the receiving end time.
(46) Optionally, the difference between start times of the transmitter and the receiver is smaller than the second predetermined threshold and the end times of the transmitter and the receiver are the same, thereby achieving synchronous operation on the original audio codebook. For example, the sending start time is $T_1$, the receiving start time is $T_4$, the sending end time is $T_2$, the receiving end time is also $T_2$, and $T_4 - T_1 < A_2$, where $A_2$ is the second predetermined threshold. It can then be determined that the transmitter and the receiver are synchronized.
(47) A fourth optional determination way is that the difference between the sending start time and the receiving start time is smaller than a third predetermined threshold and the difference between the sending end time and the receiving end time is smaller than a fourth predetermined threshold.
(48) Optionally, the difference between start times of the transmitter and the receiver is smaller than the third predetermined threshold, and the difference between end times of the transmitter and the receiver is smaller than the fourth predetermined threshold, thereby achieving synchronous operation on the audio codebook. For example, the sending start time is $T_1$, the receiving start time is $T_5$, the sending end time is $T_2$, the receiving end time is $T_6$, $T_5 - T_1 < A_3$ and $T_6 - T_2 < A_4$, where $A_3$ and $A_4$ are the third and fourth predetermined thresholds, respectively. It can then be determined that the transmitter and the receiver are synchronized.
(49) In the embodiment according to the present disclosure, it can be determined that the transmitter and the receiver are synchronized in a case that times are the same. Meanwhile, it can also be determined that the transmitter and the receiver are synchronized in a case that the difference between respective two times is within an allowable range.
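The four determination ways above can be folded into a single check, sketched below. The function name and parameters are hypothetical. Taking the absolute value of each difference is an assumption; the text only says "difference". A threshold of 0 reduces the corresponding condition to "the times are the same".

```python
def is_synchronized(send_start, recv_start, send_end, recv_end,
                    start_threshold=0.0, end_threshold=0.0):
    """Return True if the start times and the end times are each considered aligned.

    A pair of times is aligned when the two values are equal, or when their
    difference is smaller than the given threshold. Using abs() on the
    difference is an assumption of this sketch.
    """
    start_ok = (send_start == recv_start
                or abs(recv_start - send_start) < start_threshold)
    end_ok = (send_end == recv_end
              or abs(recv_end - send_end) < end_threshold)
    return start_ok and end_ok


# First way: identical start times and identical end times.
same_times = is_synchronized(1.0, 1.0, 5.0, 5.0)
# Second way: identical start times, end times within a hypothetical threshold.
end_within = is_synchronized(1.0, 1.0, 5.0, 5.02, end_threshold=0.05)
# Start times differ by more than the threshold, so the check fails.
start_too_far = is_synchronized(1.0, 1.2, 5.0, 5.0, start_threshold=0.05)
```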
(50) Optionally, performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver further includes:
(51) at S402, performing information interaction between the transmitter and the receiver, so that an order in which the transmitter sends multiple original audio codebooks is the same as an order in which the receiver receives the multiple original audio codebooks.
(52) Optionally, the number of the original audio codebooks may be one or more. In a case that the number of the original audio codebooks is more than one, the order in which the transmitter sends the original audio codebooks is the same as the order in which the receiver receives the original audio codebooks.
(53) For example, as shown in
(54) Optionally, performing synchronization operation on transmission of the original audio codebook to be tested between the transmitter and the receiver includes: performing synchronization operation on transmission of the original audio codebook between the transmitter and the receiver by a first GPS synchronization control unit arranged in the transmitter and a second GPS synchronization control unit arranged in the receiver.
(55) Optionally, in the embodiment, each of the first GPS synchronization control unit and the second GPS synchronization control unit includes a GPS device which includes a GPS antenna and a GPS receiving module. The GPS antenna is configured to transmit at least one of the sending start time, the sending end time, the receiving start time and the receiving end time and the GPS receiving module is configured to receive at least one of the sending start time, the sending end time, the receiving start time and the receiving end time.
(56) For example, as shown in
(57) Further, the GPS device includes the antenna and the GPS receiving module, and the received signals are decoded and processed by its hardware circuit and processing software, to extract two kinds of signals from the received signals and output the two kinds of signals. One kind is pulse signals with an interval of 1 s, i.e., 1 pps, and the synchronization error between leading edges thereof and the international standard Greenwich mean time is no more than 1 μs. The other includes international standard year-month-day-hour-minute-second information corresponding to pulse leading edges. The first kind of signal is delivered through a callback of the GPS SDK, to notify the synchronization control unit to read GPS time information, and the second kind of signal is delivered through a callback of the GPS SDK, to provide the precise time for controlling when to start playing and capturing the corresponding audio.
(58)
(59) At S1, the local audio application terminal and the remote audio application terminal run a voice system under test, and initialize test information of each codebook, which includes a serial number of each codebook, duration of each codebook, an interval corresponding to each codebook and testing start time of each codebook.
(60) At S2, remote sending is performed. A test initiator sends a signal to a GPS synchronization control unit based on the serial number of the codebook and reads the time provided by GPS. In a case that the time provided by the GPS device reaches the testing start time corresponding to the audio codebook, the GPS synchronization control unit sends a command to a local testing App to start playing the audio codebook which will be sent out after being processed by the system under test.
(61) At S3, remote receiving is performed. After learning, by querying through a GPS SDK interface, that the time provided by the GPS device reaches the time for test, a GPS synchronization control unit sends a command to a testing App to turn on the remote terminal to capture the output of the audio system under test. The receiver captures the audio file at a sampling rate of an audio codebook file, which corresponds to the serial number of the audio codebook received from the transmitter and can be found in a local codebook index table, and records the audio file. The receiver continues capturing until a predetermined duration is reached, and then the receiver sends the captured test audio codebook and the original audio codebook to a delay measuring module.
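Steps S1 to S3 above can be sketched as a small time-gated trigger. The sketch is only illustrative: `CodebookTestInfo`, `wait_until_start` and `FakeGpsClock` are hypothetical names, and the GPS time source is replaced by an injectable clock because the GPS SDK interface is not specified in the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CodebookTestInfo:
    """Per-codebook test information initialized at S1."""
    serial_number: int
    duration_s: float
    interval_s: float
    start_time_s: float  # testing start time on the shared GPS time base


def wait_until_start(info: CodebookTestInfo,
                     read_gps_time: Callable[[], float],
                     sleep: Callable[[float], None],
                     poll_s: float = 0.001) -> float:
    """Poll the time source until the testing start time is reached.

    Returns the time that was read when the start condition became true.
    Both the transmitter and the receiver would run the same wait, then
    start playing or capturing respectively.
    """
    now = read_gps_time()
    while now < info.start_time_s:
        # Never sleep past the start time.
        sleep(min(poll_s, info.start_time_s - now))
        now = read_gps_time()
    return now


class FakeGpsClock:
    """Stand-in for a GPS time source so the sketch runs without hardware."""

    def __init__(self, start_s: float):
        self.now_s = start_s

    def read(self) -> float:
        return self.now_s

    def sleep(self, seconds: float) -> None:
        self.now_s += seconds


info = CodebookTestInfo(serial_number=1, duration_s=8.0,
                        interval_s=2.0, start_time_s=12.5)
clock = FakeGpsClock(start_s=10.0)
started_at = wait_until_start(info, clock.read, clock.sleep, poll_s=1.0)
```

Injecting the clock keeps the trigger logic identical on both terminals, so only the time source needs to be swapped for a real GPS reader.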
(62) In the embodiment according to the present disclosure, synchronization of sending and receiving over a long distance or a short distance is achieved based on GPS, and the problem that the accuracy of the delay is affected by the asymmetry of the paths is avoided by the one-way capture, which improves the accuracy of delay measurement.
(63) Optionally, in the method for measuring the audio transmission delay, the sending start instruction information includes first instruction information for instructing the receiver to be prepared for receiving, the sending end instruction information includes second instruction information for instructing an end of playing of the original audio codebook, the receiving start instruction information includes third instruction information for instructing the receiver to start receiving, and the receiving end instruction information includes the capturing duration carried in the second instruction information.
(64) Optionally, the instruction information may be called signaling information in the embodiment. The instruction information described above is transmitted via a signaling control server (SyncServer). Optionally, synchronization of sending and receiving over a short distance can be achieved based on the signaling control server.
(65) Optionally, the transmitter starting sending the original audio codebook to be tested to the receiver in response to the sending start instruction information includes: the transmitter starting sending the original audio codebook to the receiver when receiving the first instruction information.
(66) For example, as shown in
(67) Optionally, the transmitter stopping sending the original audio codebook to the receiver in response to the sending end instruction information includes: the transmitter stopping sending the original audio codebook to the receiver when receiving the second instruction information.
(68) For example, as shown in
(69) Optionally, the receiver starting capturing the original audio codebook from the transmitter in response to the receiving start instruction information includes: the receiver starting capturing the original audio codebook from the transmitter when receiving the third instruction information.
(70) For example, as shown in
(71) Optionally, the receiver stopping capturing the original audio codebook from the transmitter in response to the receiving end instruction information includes: the receiver determining whether duration for capturing the original audio codebook from the transmitter exceeds the capturing duration, and stopping capturing the original audio codebook from the transmitter in a case that the duration for capturing the original audio codebook from the transmitter exceeds the capturing duration.
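The stop condition described above can be sketched as a capture loop that ends once the captured duration exceeds the capturing duration. The function name, the frame-based reading and the integer millisecond units are assumptions of this sketch, chosen to keep the arithmetic exact.

```python
def capture_until_duration(read_frame, frame_ms, capturing_duration_ms):
    """Capture frames until the captured duration exceeds the capturing duration.

    `read_frame` is a hypothetical callable returning one audio frame that
    covers `frame_ms` milliseconds.
    """
    frames = []
    captured_ms = 0
    # Keep capturing while the captured duration has not yet exceeded the limit.
    while captured_ms <= capturing_duration_ms:
        frames.append(read_frame())
        captured_ms += frame_ms
    return frames, captured_ms


# Hypothetical source producing silent 20 ms frames.
frames, captured_ms = capture_until_duration(lambda: b"\x00" * 320,
                                             frame_ms=20,
                                             capturing_duration_ms=100)
```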
(72) For example, as shown in
(73) A specific flow of the instruction-controlled synchronization processing above of a synchronization control apparatus is further described in conjunction with
(74) At S1, the local audio application terminal and the remote audio application terminal run a voice system under test, start synchronization test control clients and successfully log in to the SyncServer. After they both successfully log in, the SyncServer creates a testing session. Two sides of the testing session are represented by side A and side B, respectively.
(75) At S2, any of the two sides (for example, side A) initiates an audio testing session request SyncRequest (the request carrying information of the serial number of the codebook), which is thereafter transferred to the other side (side B) of the testing session via a control end of the SyncServer.
(76) At S3, the other side (side B) initializes/turns on an audio resource capturing device after receiving the testing session request SyncRequest, creates header information such as a degraded codebook filename/an audio sampling rate and the number of sound tracks/the number of bits of a sample, according to the serial number of the codebook, to record an audio output signal of the system under test, and returns confirmation information Sync Ok of being prepared to the initiator (side A) of the testing session via the SyncServer.
(77) At S4, the initiator (side A) of the testing session sends a signaling (Ok Begin Play) for starting playing an audio codebook to the other side (side B) after receiving a signaling, transferred by the SyncServer, indicating that the opposite terminal is prepared, and immediately starts playing a reference codebook signal. The played reference codebook signal is input to and captured by the audio system under test, goes through all processes thereof (pre-processing, coding, packing, transmitting through a network, unpacking, decoding, post-processing and playing) and is captured by a test control client at the other side after being played and output by the other side.
(78) At S5, the other side (side B) immediately starts audio inner recording to capture the output of the audio system under test once receiving the signaling Ok Begin Play, and returns a signaling (Is Inner Recording) indicating that inner recording is being performed on the output of the audio system under test to the initiator (A).
(79) At S6, the initiator (side A) of the testing session sends a signaling Play Ended (carrying a duration of the test codebook) to the other side (side B) once finishing playing the reference audio codebook. After receiving the signaling, the other side determines whether the capturing duration is reached. In a case that the capturing duration is reached, the other side stops capturing the output signal of the audio system under test and outputs the recorded codebook signal.
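The exchange in S1 to S6 can be pictured as a small message-driven state machine on the capturing side (side B). The sketch below is an illustration under stated assumptions only: the class `ReceiverSession`, its state names and the `handle` and `record` methods are hypothetical, the SyncServer transport is left out, and capture is stopped as soon as the recorded time reaches the capturing duration. Only the signaling names (SyncRequest, Sync Ok, Ok Begin Play, Is Inner Recording, Play Ended) are taken from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReceiverSession:
    """Side B of a testing session (hypothetical helper, not from the source)."""
    state: str = "idle"
    codebook_serial: Optional[int] = None
    capture_duration: Optional[float] = None  # seconds, carried by Play Ended
    captured: float = 0.0                     # seconds recorded so far

    def handle(self, signal, payload=None):
        """Process one signaling message and return the reply, if any."""
        if signal == "SyncRequest" and self.state == "idle":
            # S3: prepare the capturing device for the requested codebook.
            self.codebook_serial = payload
            self.state = "prepared"
            return "Sync Ok"
        if signal == "Ok Begin Play" and self.state == "prepared":
            # S5: start inner recording immediately and confirm it.
            self.state = "recording"
            return "Is Inner Recording"
        if signal == "Play Ended" and self.state == "recording":
            # S6: learn the capturing duration, stop if it is already reached.
            self.capture_duration = payload
            self._maybe_stop()
            return None
        raise ValueError(f"unexpected {signal!r} in state {self.state!r}")

    def record(self, seconds):
        """Account for `seconds` of newly captured audio."""
        if self.state == "recording":
            self.captured += seconds
            self._maybe_stop()

    def _maybe_stop(self):
        if (self.capture_duration is not None
                and self.captured >= self.capture_duration):
            self.state = "done"
```

In this sketch side B keeps recording after Play Ended until the carried duration is reached, which mirrors the check described at S6.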
(80) In the embodiment according to the present disclosure, synchronization operation of the transmitter and the receiver is achieved through instruction-based synchronization control, and a one-way capturing method is used, avoiding the problem that the delay accuracy is affected by echoes and path asymmetry and improving the accuracy of delay measurement.
(81) Optionally, obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver includes:
$$R_{xy}(\tau)=\int_{t_s}^{t_e} x(t)\,y(t+\tau)\,dt \qquad (1)$$
where R.sub.xy(τ) is a value of a cross-correlation function between the original audio codebook and the test audio codebook corresponding thereto, t.sub.s is the time when the receiver starts capturing the original audio codebook from the transmitter, t.sub.e is the time when the receiver stops capturing the original audio codebook from the transmitter, t is time information corresponding to each sampling point, x(t) is an energy value corresponding to a sampling point at time t in the original audio codebook, τ is an offset of a sampling point in the test audio codebook used in convolution with x(t), and y(t+τ) is an energy value of a sampling point at time t+τ in the test audio codebook. The audio transmission delay is represented by the value of τ corresponding to a maximum value of the cross-correlation function.
(82) The maximum value of the cross-correlation function R.sub.xy(τ) between the original audio codebook and the obtained test audio codebook and the value of the subscript τ corresponding thereto are solved, and an estimated delay value can be obtained by dividing the value of τ by the sampling rate information of a corresponding reference audio codebook.
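As an illustration of formula (1) and the division by the sampling rate, the following minimal sketch evaluates a discrete form of the cross-correlation and picks the lag with the largest value. The function name `estimate_delay`, the optional `max_lag` parameter and the restriction to non-negative lags (the captured signal is assumed to lag the reference, never lead it) are assumptions made for the example.

```python
def estimate_delay(reference, test, sample_rate, max_lag=None):
    """Return (lag_in_samples, delay_in_seconds) maximising R_xy(lag).

    `reference` is the original codebook x, `test` the captured codebook y.
    """
    if max_lag is None:
        max_lag = len(test) - 1
    max_lag = min(max_lag, len(test) - 1)
    best_lag, best_value = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Discrete form of formula (1): sum over t of x(t) * y(t + lag).
        n = min(len(reference), len(test) - lag)
        value = sum(reference[t] * test[t + lag] for t in range(n))
        if value > best_value:
            best_lag, best_value = lag, value
    # Convert the sample offset to seconds with the codebook sampling rate.
    return best_lag, best_lag / sample_rate
```

The direct sum costs on the order of len(reference) × len(test) operations; for long codebooks an FFT-based correlation would normally be preferred, which is outside the scope of this sketch.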
(83) Optionally, the audio delay is obtained by solving the cross-correlation between audio signals, in the delay calculation of the embodiment, and the solved audio delay includes a crude audio overall delay Delay-crude and an internal audio delay Delay-internal. The crude overall delay Delay-crude is a delay value which is obtained based on a maximum overall cross-correlation between a reference codebook and an output audio codebook recorded by the synchronization control unit. The internal audio delay Delay-internal is obtained as follows: after the crude overall delay is solved, audio sub-segment division and alignment is performed on the audio signal in the codebook, and then the delay between each audio sub-segment in the reference codebook and a corresponding audio sub-segment in the output audio codebook recorded by the synchronization control unit is solved. The delay value finally solved is the crude audio overall delay Delay-crude plus the internal audio delay Delay-internal.
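The two-stage estimate can be sketched as follows. This is one possible reading, not the procedure of the embodiment: the description does not say how sub-segments are chosen or how the per-segment delays are combined, so the fixed `segment_len`, the small residual search window `search`, the skipping of silent sub-segments and the use of the mean residual as Delay-internal are all assumptions, as are the function names.

```python
def best_lag(x, y, lags):
    """Lag in `lags` maximising sum_t x[t] * y[t + lag]; out-of-range terms are skipped."""
    def corr(lag):
        return sum(x[t] * y[t + lag]
                   for t in range(len(x)) if 0 <= t + lag < len(y))
    return max(lags, key=corr)


def crude_plus_internal(reference, recorded, segment_len, search=2):
    """Return (delay_crude, delay_internal, total), all in samples."""
    # Stage 1: crude overall delay from the whole-signal cross-correlation.
    crude = best_lag(reference, recorded, range(len(recorded)))
    residuals = []
    for start in range(0, len(reference) - segment_len + 1, segment_len):
        segment = reference[start:start + segment_len]
        if not any(segment):
            continue  # a silent sub-segment carries no timing information
        # Stage 2: search a few samples around the crude alignment.
        expected = start + crude
        lags = range(expected - search, expected + search + 1)
        residuals.append(best_lag(segment, recorded, lags) - expected)
    internal = sum(residuals) / len(residuals) if residuals else 0.0
    return crude, internal, crude + internal
```

When the recorded signal is a pure shift of the reference, every residual is zero and the total equals the crude delay; a non-zero Delay-internal would indicate timing drift or jitter inside the recording.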
(84) Optionally, a normalized maximum cross-correlation coefficient ρ.sub.xy(τ) and a corresponding subscript time τ can be calculated after normalizing the cross-correlation function value above according to the formula:
(85)
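One common normalization, given here as an assumption rather than as the formula of the embodiment, divides the cross-correlation value by the geometric mean of the two signal energies over the capture interval:

```latex
\rho_{xy}(\tau)=\frac{R_{xy}(\tau)}{\sqrt{R_{xx}(0)\,R_{yy}(0)}},
\qquad
R_{xx}(0)=\int_{t_s}^{t_e} x(t)^2\,dt,
\qquad
R_{yy}(0)=\int_{t_s}^{t_e} y(t)^2\,dt
```

Under this assumed form the location of the maximum over τ is unchanged, while the peak value becomes independent of the playback and recording gain, so peaks from different codebooks can be compared.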
(86) As to delay estimation in a scenario of an audio playing codebook with a high sampling rate (44.1K, 48K, 96K and so on), data of one frame of codebook file may be easy to process. Thus, audio envelopes can be obtained from the codebook audio file at a small window of Tms, and a maximum cross-correlation value between the envelopes can be obtained, to obtain a corresponding delay value t, which includes the following steps.
(87) At S1, a window is applied to a voice/audio signal at Tms.
(88) Optionally, the applied window in the embodiment includes at least one of the following: a Hamming window, a Hann window, a triangle window, a Bartlett window and a Kaiser window.
(89) For example, in a case that the window function is a rectangular window, which is defined by the formula:
(90)
a kth frame of the voice signal to which a window is applied is expressed by the formula: Xk(n)=w(n)*x(k*N+n). An average of energy of the kth frame of the signal, Xk(n), is expressed by E(k):
(91)
(92) At S2, an envelope information value is obtained for every Tms frame. The envelope information is obtained by calculating a logarithm of a value obtained by normalizing a square root of the voice energy signal and represents a short-term voice energy change. The envelope of the kth frame of voice signal is expressed by Env(k):
(93)
(94) At S3, a maximum value of a cross-correlation function between envelopes of a played codebook signal and a recorded degraded signal of the system under test and a corresponding time τ are obtained. In measuring high-quality audio, x(t) or y(t) in the cross-correlation function or the cross-correlation coefficient can be replaced with the sequence value of the envelopes obtained by applying the window to the reference codebook and the test codebook, respectively, to obtain the position of a corresponding delayed sample, which can be converted into time with the sampling rate to obtain a delay value.
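Steps S1 to S3 can be illustrated with the sketch below. Several choices are assumptions made for the example: frames are non-overlapping and use the rectangular window, E(k) is taken as the mean squared amplitude of frame k, the square root of E(k) is normalized by its largest value over the file before the logarithm is taken, a small `eps` avoids the logarithm of zero, and the envelopes are compared with the correlation coefficient over fully overlapping positions. All function names are hypothetical.

```python
import math


def frame_energy(signal, frame_len):
    """E(k): mean squared amplitude of each rectangular-windowed frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def envelope(signal, frame_len, eps=1e-3):
    """Env(k): log of the frame RMS, normalised by the largest RMS (assumed)."""
    rms = [math.sqrt(e) for e in frame_energy(signal, frame_len)]
    peak = max(rms) or 1.0
    return [math.log10(r / peak + eps) for r in rms]


def pearson(a, b):
    """Correlation coefficient of two equally long sequences (0.0 if flat)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) *
                    sum((q - mb) ** 2 for q in b))
    return num / den if den else 0.0


def envelope_delay(reference, recorded, frame_len, sample_rate):
    """Delay in seconds from the best frame lag between the two envelopes.

    `recorded` must be at least as long as `reference`.
    """
    ex = envelope(reference, frame_len)
    ey = envelope(recorded, frame_len)
    lags = range(len(ey) - len(ex) + 1)
    best = max(lags, key=lambda lag: pearson(ex, ey[lag:lag + len(ex)]))
    # One envelope value covers frame_len samples.
    return best * frame_len / sample_rate
```

The resolution of this estimate is one frame, so it trades precision for a much shorter sequence to correlate.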
(95) Optionally, obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver further includes:
(96)
where TestValue(k) is a delay value corresponding to the maximum value of the cross-correlation function obtained by solving an ith original audio codebook and an ith test audio codebook corresponding thereto obtained through a kth measurement of the ith original audio codebook, the delay value is a time-domain value obtained by dividing a value of τ corresponding to the maximum value of the cross-correlation function obtained through the kth measurement by the sampling rate information adopted by the receiver in the kth measurement, the sampling rate information is a sampling rate in header information of the ith original audio codebook, Delay.sub.i is an average audio transmission delay of the ith original audio codebook and m is an integer greater than or equal to 1.
(97) Optionally, obtaining the audio transmission delay based on the test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver further includes obtaining an average overall delay of the audio system:
(98)
where Avg_Delay is the average audio transmission delay of n original audio codebooks and n is an integer greater than or equal to 1.
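The two averaging steps can be sketched as below, reading both as plain arithmetic means: Delay.sub.i as the mean of the m values TestValue(k) measured for the ith codebook, and Avg_Delay as the mean of Delay.sub.i over the n codebooks. That reading is an assumption based on the word "average" in the definitions; the function names are hypothetical.

```python
def codebook_delay(test_values):
    """Delay_i: mean of the m per-measurement delays TestValue(k), in seconds."""
    if not test_values:
        raise ValueError("at least one measurement is required (m >= 1)")
    return sum(test_values) / len(test_values)


def average_delay(per_codebook_values):
    """Avg_Delay: mean of Delay_i over the n codebooks, in seconds."""
    if not per_codebook_values:
        raise ValueError("at least one codebook is required (n >= 1)")
    delays = [codebook_delay(values) for values in per_codebook_values]
    return sum(delays) / len(delays)
```

For example, two codebooks measured as [0.10, 0.12] and [0.20, 0.22, 0.24] seconds give per-codebook delays of 0.11 s and 0.22 s and an overall average of 0.165 s.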
(99) In the embodiment according to the present disclosure, energy values of sampling points are calculated based on the cross-correlation function, thereby achieving accurate calculation of the audio transmission delay.
(100) It should be noted that the foregoing method embodiments are each described as a combination of a series of actions for ease of description. Those skilled in the art shall understand that the invention is not limited by the described order of the actions, since some steps may be performed in other orders or simultaneously according to the present disclosure. In addition, those skilled in the art shall also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily required by the invention.
(101) According to the descriptions of the embodiment above, those in the art can clearly understand that the methods according to the embodiments described above may be implemented through software in combination with a necessary universal hardware platform, or through hardware, and in many cases the former makes a better implementation. Based on such understanding, the essence or the part contributing to conventional technology of the technical solutions according to the present disclosure may be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disc), and includes instructions to enable a terminal device (such as, a cellphone, a computer, a server or a network device) to perform the method according to each of the embodiments in the present disclosure.
(102) Second Embodiment
(103) A system for performing the method above for measuring an audio transmission delay is further provided according to an embodiment of the disclosure. As shown in
(104) 1) A first synchronizing unit 902 arranged in the transmitter and a second synchronizing unit 903 arranged in the receiver are configured to perform synchronization operation on transmission of an original audio codebook to be tested between the transmitter and the receiver, to obtain sending start instruction information, sending end instruction information, receiving start instruction information and receiving end instruction information of the original audio codebook.
(105) Optionally, instruction information for controlling the start and the end of sending and receiving the original audio codebook is obtained by performing synchronization operation on transmission of the original audio codebook to be tested.
(106) Optionally, an apparatus for performing synchronization operation includes but is not limited to a GPS synchronization control apparatus and a synchronization control apparatus of a signaling control server, according to an embodiment of the disclosure.
(107) It should be noted that the synchronization operation above is performed to coordinate starting and ending of audio playing at the transmitter and starting and ending of audio capture at the receiver, i.e., control the transmitter to start or stop playing the codebook and notify the receiver to start or stop capturing the audio.
(108) For example, as shown in
(109) 2) A first responding unit 904 arranged in the transmitter is configured to start sending the original audio codebook to be tested to the receiver in response to the sending start instruction information.
(110) For example, as shown in
(111) 3) A second responding unit 906 arranged in the transmitter is configured to stop sending the original audio codebook to the receiver in response to the sending end instruction information.
(112) For example, as shown in
(113) 4) A third responding unit 908 arranged in the receiver is configured to start capturing the original audio codebook from the transmitter in response to the receiving start instruction information.
(114) For example, as shown in
(115) 5) A fourth responding unit 910 arranged in the receiver is configured to stop capturing the original audio codebook from the transmitter in response to the receiving end instruction information.
(116) For example, as shown in
(117) 6) A calculating unit 912 arranged in the receiver is configured to calculate an audio transmission delay based on a test audio codebook captured by the receiver and the original audio codebook pre-stored in the receiver.
(118) For example, as shown in
(119) In the embodiment according to the present disclosure, the action of sending audio by the transmitter and the action of capturing audio by the receiver are exactly synchronous, so that the original audio codebook for calculating the delay and the captured audio codebook undergoing the transmission delay are synchronous.
(120) Optionally, as shown in
(121) 1) The first responding unit 904 includes a first responding sub-module 1002, configured to start sending the original audio codebook to the receiver at sending start time. The sending start time is included in the sending start instruction information.
(122) Optionally, the sending start time may include but is not limited to the time at which to start playing the audio.
(123) For example, as shown in
(124) 2) The second responding unit 906 includes a second responding sub-module 1004, configured to stop sending the original audio codebook to the receiver at sending end time. The sending end time is included in the sending end instruction information.
(125) Optionally, the sending end time may include but is not limited to the time at which to stop playing the audio.
(126) For example, as shown in
(127) 3) The third responding unit 908 includes a third responding sub-module 1006, configured to start capturing the original audio codebook from the transmitter at a receiving start time. The receiving start time is included in the receiving start instruction information.
(128) Optionally, the receiving start time may include but is not limited to the time at which to start capturing the audio.
(129) For example, as shown in
(130) 4) The fourth responding unit 910 includes a fourth responding sub-module 1008, configured to stop capturing the original audio codebook from the transmitter at receiving end time. The receiving end time is included in the receiving end instruction information.
(131) Optionally, the receiving end time may include but is not limited to the time at which to stop capturing the audio.
(132) For example, as shown in
(133) In the embodiment according to the present disclosure, with the instructions of the start times and the end times of the transmitter and the receiver, the precise synchronization between the transmitter and the receiver is achieved and the accuracy of the delay calculation is improved.
(134) Optionally, as shown in
(135) Optionally, the start time and the end time of the transmitter are the same as those of the receiver, respectively, thereby achieving synchronous operation on the audio codebook. For example, the sending start time is T.sub.1, the receiving start time is also T.sub.1, the sending end time is T.sub.2, and the receiving end time is also T.sub.2.
(136) Optionally, the start times of the transmitter and the receiver are the same, and the difference between end times of the transmitter and the receiver is smaller than a first predetermined threshold, thereby achieving synchronous operation on the original audio codebook. For example, the sending start time is T.sub.1, the receiving start time is also T.sub.1, the sending end time is T.sub.2, the receiving end time is T.sub.3, and T.sub.3−T.sub.2<A.sub.1, where A.sub.1 is the first predetermined threshold. It can then be determined that the transmitter and the receiver are synchronized.
(137) Optionally, the difference between start times of the transmitter and the receiver is smaller than a second predetermined threshold and the end times of the transmitter and the receiver are the same, thereby achieving synchronous operation on the original audio codebook. For example, the sending start time is T.sub.1, the receiving start time is T.sub.4, the sending end time is T.sub.2, the receiving end time is also T.sub.2, and T.sub.4−T.sub.1<A.sub.2, where A.sub.2 is the second predetermined threshold. It can then be determined that the transmitter and the receiver are synchronized.
(138) Optionally, the difference between start times of the transmitter and the receiver is smaller than a third predetermined threshold, and the difference between end times of the transmitter and the receiver is smaller than a fourth predetermined threshold, thereby achieving synchronous operation on the audio codebook. For example, the sending start time is T.sub.1, the receiving start time is T.sub.5, the sending end time is T.sub.2, the receiving end time is T.sub.6, T.sub.5−T.sub.1<A.sub.3 and T.sub.6−T.sub.2<A.sub.4, where A.sub.3 is the third predetermined threshold and A.sub.4 is the fourth predetermined threshold. It can then be determined that the transmitter and the receiver are synchronized.
(139) In the embodiment according to the present disclosure, it can be determined that the transmitter and the receiver are synchronized in a case that the times are the same. Meanwhile, it can also be determined that the transmitter and the receiver are synchronized in a case that the difference between the respective two times is within an allowable range.
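The four cases above reduce to a single check with one threshold for the start times and one for the end times. The sketch below is illustrative: the function name and parameters are hypothetical, and the difference is compared by absolute value, which is an assumption since the description only writes the difference in one direction. A threshold of zero corresponds to the case in which the two times must be the same.

```python
def is_synchronized(send_start, recv_start, send_end, recv_end,
                    start_threshold=0.0, end_threshold=0.0):
    """True if start and end times match or differ by less than the thresholds."""
    def close(a, b, threshold):
        # Equal times always count; otherwise the gap must be below the threshold.
        return a == b or abs(a - b) < threshold
    return (close(send_start, recv_start, start_threshold)
            and close(send_end, recv_end, end_threshold))
```

For instance, with all times in seconds, `is_synchronized(10.0, 10.02, 20.0, 20.0, start_threshold=0.05)` accepts a 20 ms start offset, while the same call with a 200 ms offset is rejected.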
(140) Optionally, as shown in
(141) Optionally, the number of the original audio codebooks may be one or more. In a case that the number of the original audio codebooks is more than one, the order in which the transmitter sends the original audio codebooks is the same as the order in which the receiver receives the original audio codebooks.
(142) For example, as shown in
(143) Optionally, as shown in
(144) Optionally, in the embodiment, each of the first GPS synchronization control unit and the second GPS synchronization control unit includes a GPS device which includes a GPS antenna and a GPS receiving module. The GPS antenna is configured to transmit at least one of the sending start time, the sending end time, the receiving start time and the receiving end time and the GPS receiving module is configured to receive at least one of the sending start time, the sending end time, the receiving start time and the receiving end time.
(145) For example, as shown in
(146) Further, the GPS device includes the antenna and the GPS receiving module, and the received signals are decoded and processed by its hardware circuit and processing software, to extract two kinds of signals from the received signals and output the two kinds of signals. One kind is pulse signals with an interval of 1 s, i.e., 1 pps, and the synchronization error between leading edges thereof and the international standard Greenwich mean time is no more than 1 μs. The other includes international standard year-month-day-hour-minute-second information corresponding to pulse leading edges. The first kind of signals are called back by a GPS SDK development kit, to notify the synchronization control unit to read GPS time information, and the second kind of signals are called back by the GPS SDK development kit, to provide precise time for controlling whether to start playing and capturing corresponding audio.
(147)
(148) At S1, the local audio application terminal and the remote audio application terminal run a voice system under test, and initialize test information of each codebook, which includes a serial number of each codebook (a codebook here is a voice/audio file with audio header format information, where the header format information includes a sampling rate, the number of sound tracks and the number of bits of a sample, where the format of the voice/audio file may be a format with an audio header, such as wav, mp3, wma and so on), duration of each codebook, an interval corresponding to each codebook and testing start time of each codebook.
(149) At S2, remote sending is performed. A test initiator sends a signal to a GPS synchronization control unit based on the serial number of the codebook and reads the time provided by GPS. In a case that the time provided by the GPS device reaches the testing start time corresponding to the audio codebook, the GPS synchronization control unit sends a command to a local testing App to start playing the audio codebook which will be sent out after being processed by the system under test.
(150) At S3, remote receiving is performed. After learning, by querying through a GPS SDK interface, that the time provided by the GPS device reaches the time for test, a GPS synchronization control unit sends a command to a testing App to turn on the remote terminal to capture the output of the audio system under test. The receiver continues capturing until a predetermined duration is reached, and then the receiver sends the captured test audio codebook and the original audio codebook to a delay measuring module.
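The GPS-timed control in S1 to S3 amounts to both terminals holding the same schedule and acting when the shared clock reaches each entry. The sketch below shows that idea only: `CodebookTest`, `run_schedule` and the callbacks are hypothetical names, `read_gps_time` stands in for whatever time query the GPS SDK actually provides, and simple polling is assumed. The transmitter would pass callbacks that start and stop playing, the receiver callbacks that start and stop capturing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodebookTest:
    serial: int        # serial number of the codebook
    start_time: float  # testing start time on the shared GPS time base, seconds
    duration: float    # codebook (or capturing) duration, seconds


def run_schedule(tests, read_gps_time, on_start, on_stop, sleep, poll=0.01):
    """Start and stop each codebook test when the shared clock reaches its times."""
    for test in sorted(tests, key=lambda t: t.start_time):
        # Wait until the GPS time reaches the testing start time.
        while read_gps_time() < test.start_time:
            sleep(poll)
        on_start(test.serial)
        # Keep playing or capturing until the predetermined duration elapses.
        end_time = test.start_time + test.duration
        while read_gps_time() < end_time:
            sleep(poll)
        on_stop(test.serial)
```

Because the clock and the sleep function are passed in, the same loop can be driven by `time.time` and `time.sleep` in a quick experiment or by a simulated clock in a test; the polling interval bounds how late a start can be in this sketch.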
(151) In the embodiment according to the present disclosure, synchronization of sending and receiving for long-distance or short-distance is achieved based on GPS, and the problem that the accuracy of the delay is affected by the asymmetry of the upload/download paths is avoided by the one-way capture. And the one-way capture can avoid disturbance and impact, caused by echoes, on the calculation of the delay, which improves the accuracy of delay measurement.
(152) Optionally, as shown in
(153) 1) The first responding unit 904 includes a sending sub-module 1302, configured to start sending the original audio codebook to the receiver when receiving first instruction information, where the first instruction information is used to instruct the receiver to be prepared for receiving.
(154) Optionally, instruction information may be called signaling information in the embodiment, and the instruction information described above is transmitted via a signaling control server (SyncServer). Optionally, short-distance synchronization of sending and receiving can be achieved based on the signaling control server.
(155) For example, as shown in
(156) 2) The second responding unit 906 includes a terminating sub-module 1304, configured to stop sending the original audio codebook to the receiver when receiving second instruction information, where the second instruction information is used to instruct an end of playing of the original audio codebook.
(157) For example, as shown in
(158) 3) The third responding unit 908 includes a capturing sub-module 1306, configured to start capturing the original audio codebook from the transmitter when receiving third instruction information, where the third instruction information is used to instruct the receiver to start receiving.
(159) For example, as shown in
(160) 4) The fourth responding unit 910 includes a determining sub-module 1308, configured to determine whether duration for capturing the original audio codebook from the transmitter exceeds capturing duration and stop capturing the original audio codebook from the transmitter in a case that the duration for capturing the original audio codebook from the transmitter exceeds the capturing duration.
(161) For example, as shown in
(162) A specific flow of the instruction-controlled synchronization processing above of a synchronization control apparatus is further described in conjunction with
(163) At S1, the local audio application terminal and the remote audio application terminal run a voice system under test, start synchronization test control clients and successfully log in to the SyncServer. After they both successfully log in, the SyncServer creates a testing session. Two sides of the testing session are represented by side A and side B, respectively.
(164) At S2, any of the two sides (for example, side A) initiates an audio testing session request SyncRequest (the request carrying information of the serial number of the codebook), which is transferred to the other side (side B) of the testing session via a control end of the SyncServer.
(165) At S3, the other side (side B) initializes/turns on an audio resource capturing device after receiving the testing session request SyncRequest, creates header information such as a degraded codebook filename/an audio sampling rate and the number of sound tracks/the number of bits of a sample, according to the serial number of the codebook, to record an audio output signal of the system under test, and returns confirmation information Sync Ok of being prepared to the initiator (side A) of the testing session via the SyncServer.
(166) At S4, the initiator (side A) of the testing session sends a signaling (Ok Begin Play) for starting playing an audio codebook to the other side (side B) after receiving a signaling, transferred by the SyncServer, indicating that the opposite terminal is prepared, and immediately starts playing a reference codebook signal. The played reference codebook audio signal is input to and captured by the audio system under test, goes through all processes thereof (pre-processing, coding, packing, transmitting through a network, unpacking, decoding, post-processing and playing) and is captured by a test control client at the other side after being played and output by the other side.
(167) At S5, the other side (side B) immediately starts audio inner recording to capture the output of the audio system under test once receiving the signaling Ok Begin Play, and returns a signaling (Is Inner Recording) indicating that inner recording is being performed on the output of the audio system under test to the initiator (A).
(168) At S6, the initiator (side A) of the testing session sends a signaling Play Ended (carrying duration of the test codebook) to the other side (side B) once finishing playing the reference audio codebook. After receiving the signaling, the other side determines whether the capturing duration is reached. In a case that the capturing duration is reached, the other side stops capturing the output signal of the audio system under test and outputs the recorded codebook signal.
(169) In the embodiment according to the present disclosure, synchronization operation of the transmitter and the receiver is achieved through instruction-based synchronization control, and a one-way capturing method is used, avoiding the problem that the delay accuracy is affected by echoes and path asymmetry and improving the accuracy of delay measurement.
(170) Optionally, the calculating unit 912 includes a first calculating module, configured to calculate the audio transmission delay based on the following formula:
$$R_{xy}(\tau)=\int_{t_s}^{t_e} x(t)\,y(t+\tau)\,dt \qquad (1)$$
where R.sub.xy(τ) is a value of a cross-correlation function between the original audio codebook and the test audio codebook corresponding thereto, t.sub.s is the time when the receiver starts capturing the original audio codebook from the transmitter, t.sub.e is the time when the receiver stops capturing the original audio codebook from the transmitter, t is time information corresponding to each sampling point, x(t) is an energy value corresponding to a sampling point at time t in the original audio codebook, τ is an offset of a sampling point in the test audio codebook used in convolution with x(t), and y(t+τ) is an energy value of a sampling point at time t+τ in the test audio codebook. The audio transmission delay is represented by the value of τ corresponding to a maximum value of the cross-correlation function.
(171) The maximum value of the cross-correlation function R.sub.xy(τ) between the original audio codebook and the obtained test audio codebook and the value of the subscript τ corresponding thereto are solved, and an estimated delay value can be obtained by dividing the value of τ by the sampling rate information of a corresponding audio codebook.
(172) Optionally, the audio delay is obtained by solving the cross-correlation between audio signals, in the delay calculation of the embodiment, and the solved audio delay includes a crude audio overall delay Delay-crude and an internal audio delay Delay-internal. The crude overall delay Delay-crude is a delay value which is obtained based on a maximum overall cross-correlation between a reference codebook and an output audio codebook recorded by the synchronization control unit. The internal audio delay Delay-internal is obtained as follows: after the crude overall delay is solved, audio sub-segment division and alignment is performed on the audio signal in the codebook, and then the delay between each audio sub-segment in the reference codebook and a corresponding audio sub-segment in the output audio codebook recorded by the synchronization control unit is solved. The delay value finally solved is the crude audio overall delay Delay-crude plus the internal audio delay Delay-internal.
(173) Optionally, a normalized maximum cross-correlation coefficient ρ.sub.xy(τ) and a corresponding subscript time τ can be calculated after normalizing the cross-correlation function value above according to the formula:
(174)
(175) As to delay estimation in a scenario of an audio playing codebook with a high sampling rate (44.1K, 48K, 96K and so on), data of one frame of codebook file may be easy to process. Thus, audio envelopes can be obtained from the codebook audio file at a small window of Tms, and a maximum cross-correlation value between the envelopes can be obtained, to obtain a corresponding delay value t, which includes the following steps.
(176) At S1, a window is applied to a voice/audio signal at Tms.
(177) Optionally, the applied window in the embodiment includes at least one of the following: a Hamming window, a Hann window, a triangle window, a Bartlett window and a Kaiser window.
(178) For example, in a case that the window function is a rectangular window, which is defined by the formula:
(179)
a kth frame of voice signal to which a window is applied is expressed by the formula: Xk(n)=w(n)*x(k*N+n). An average of energy of the kth frame of signal, Xk(n), is expressed by E(k):
(180)
(181) At S2, an envelope information value is obtained for every Tms frame. The envelope information is obtained by calculating a logarithm of a value obtained by normalizing a square root of the voice energy signal and represents a short-term voice energy change. The envelope of the kth frame of voice signal is expressed by Env(k):
(182)
(183) At S3, a maximum value of a cross-correlation function between envelopes of a played codebook signal and a recorded degraded signal of the system under test and a corresponding time τ are obtained. In playing a high-quality codebook signal, x(t) or y(t) in the cross-correlation function or the cross-correlation coefficient can be replaced with the envelope value obtained by applying the window to the reference codebook and the test codebook, respectively, to obtain the position of a corresponding delayed sample, which can be converted into time with the sampling rate to obtain a delay value.
(184) Optionally, the calculating unit 912 includes a second calculating module, configured to calculate the audio transmission delay according to the following formula:
(185) $$\mathrm{Delay}_i=\frac{1}{m}\sum_{k=1}^{m}\mathrm{TestValue}(k)$$
where TestValue(k) is the delay value corresponding to the maximum value of the cross-correlation function computed between the ith original audio codebook and the corresponding ith test audio codebook obtained in the kth measurement of the ith original audio codebook; the delay value is a time-domain value obtained by dividing the offset corresponding to the maximum value of the cross-correlation function in the kth measurement by the sampling rate adopted by the receiver in the kth measurement; the sampling rate is the sampling rate in the header information of the ith original audio codebook; Delay_i is the average audio transmission delay of the ith original audio codebook; and m is an integer greater than or equal to 1.
(186) Optionally, the calculating unit 912 includes a third calculating module configured to calculate the audio transmission delay according to the following formula:
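(187) $$\mathrm{Avg\_Delay}=\frac{1}{n}\sum_{i=1}^{n}\mathrm{Delay}_i$$
where Avg_Delay is the average audio transmission delay of n original audio codebooks, Delay_i is the average audio transmission delay of the ith original audio codebook, and n is an integer greater than or equal to 1.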
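The two averaging stages described above can be sketched in Python as follows. The sketch assumes that both averages are plain arithmetic means, as the definitions of Delay_i and Avg_Delay indicate; the function names and the example lag values are hypothetical.

```python
def codebook_delay(lags_in_samples, sample_rate_hz):
    """Delay_i: mean of TestValue(k) over the m measurements of one codebook.

    TestValue(k) is the lag at the cross-correlation maximum of the kth
    measurement divided by the sampling rate, giving a delay in seconds.
    """
    test_values = [lag / sample_rate_hz for lag in lags_in_samples]
    return sum(test_values) / len(test_values)


def average_delay(codebook_delays):
    """Avg_Delay: mean of Delay_i over the n original audio codebooks."""
    return sum(codebook_delays) / len(codebook_delays)


# Hypothetical example: two codebooks at 48 kHz, two measurements each.
delays = [
    codebook_delay([480, 960], 48000),
    codebook_delay([960, 1440], 48000),
]
avg = average_delay(delays)
```

Averaging first per codebook and then across codebooks keeps each codebook's contribution equal even when the number of measurements m differs between codebooks.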
(188) In the embodiment according to the present disclosure, energy values of sampling points are calculated based on the cross-correlation function, thereby achieving accurate calculation of the audio transmission delay.
(189) Optionally, in the embodiment above, the system for measuring an audio transmission delay can be applied in short-distance communications.
(190) The serial numbers of the embodiments according to the present disclosure are merely used for a purpose of description, and do not represent merits of the embodiments.
(191) Descriptions of the embodiments according to the present disclosure emphasize different aspects, and for a part, which is not described in detail, of an embodiment, reference can be made to related descriptions in other embodiments.
(192) It should be understood that the client disclosed in the embodiments according to the present disclosure may be implemented in other ways. For example, the apparatus embodiments described above are illustrative only. For example, the division of the units is merely a logical function division and there may be other divisions in practical implementations. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces. The indirect couplings or communication connections between modules or units may be implemented electrically or in other forms.
(193) The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, which may be located in one position or distributed in multiple network units. Some or all of the units may be selected as needed to achieve the objectives of the solutions according to the embodiments.
(194) In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit. The integrated unit above may be implemented in the form of hardware, or in the form of software functional unit.
(195) When implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer readable storage medium. Based on such understanding, the essence of the technical solutions according to the present disclosure, or the part contributing to conventional technology, or a part or all of the technical solutions, may be implemented in the form of a computer software product. The computer software product is stored in a storage medium and includes instructions to enable a computer device (which may be, for example, a personal computer, a server, or a network device) to execute all or a part of the steps of the method according to each of the embodiments in the present disclosure. The storage medium may be any medium that can store program codes, such as a U-disk, a read-only memory (ROM), a random access memory (RAM), a mobile hard disk drive, a magnetic disk, or an optical disk and so on.
(196) The embodiments above are only some preferred embodiments of the invention. It should be noted that improvements and modifications made by those skilled in the art without deviating from the principle of the invention shall fall within the scope of the present disclosure.