Calibration method and system
09949050 · 2018-04-17
Assignee
Inventors
Cpc classification
G01B5/00
PHYSICS
G01S15/876
PHYSICS
G01S7/539
PHYSICS
H04S7/305
ELECTRICITY
G01S2015/465
PHYSICS
G01S15/42
PHYSICS
G01C15/00
PHYSICS
International classification
G01B5/00
PHYSICS
G01C15/00
PHYSICS
G01S7/539
PHYSICS
H04S7/00
ELECTRICITY
Abstract
The present invention concerns a method for calibrating an array of receivers (r.sub.j), each receiver being configured for receiving a signal transmitted by at least one transmitter (s.sub.i), and echoes of the transmitted signal as reflected by one or more reflective surfaces (w), said method comprising the following steps: sorting said echoes, by assigning each echo to a reflective surface or to a combination of reflective surfaces (w); and calibrating said array of receivers (r.sub.j) based on said sorting.
Claims
1. A method for determining the position of at least one microphone of an array of microphones, each microphone being configured to receive a signal transmitted by at least one loudspeaker, and echoes of the transmitted signal as reflected by one or more reflective surfaces, said method comprising the following steps: Receiving, in a processor, an impulse response measured at the array of microphones, wherein the impulse response contains the echoes of the transmitted signal, wherein each of the echoes corresponds substantially to a Dirac in the impulse response; Sorting, in the processor, said echoes of the impulse response, by assigning each echo to one reflective surface or to a combination of reflective surfaces; wherein the step of sorting said echoes by assigning each echo to one reflective surface or to a combination of reflective surfaces, comprises selecting a subset of echo assignments from a set of all possible echo assignments, determining if that selected subset of echo assignments is valid by checking if the times of arrival of all of the echoes in the selected subset satisfy a predefined system of equations; Determining, in the processor, the position of at least one of the microphones in the array based on the assignments of the echoes to the corresponding reflective surfaces.
2. The method according to claim 1, wherein said sorting comprises: selecting a set of echoes from the echoes as received by the microphone; assigning each echo of the set to one reflective surface or to a combination of reflective surfaces.
3. The method according to claim 2, wherein said determining the position of at least one of the microphones comprises: selecting a calibration module among a number of calibration modules; running said calibration module for all the echoes' assignments; calculating for each running a fit parameter; selecting an echoes' assignment on the basis of said fit parameter.
4. The method according to claim 3, further comprising: selecting the echoes' assignment which minimizes said fit parameter.
5. The method of claim 3 comprising: determining the location of the transmitter(s) based on the selected echoes' assignment.
6. The method of claim 3 comprising: determining the location and/or orientation of the reflective surface(s) based on the selected echoes' assignment.
7. The method of claim 6, the array of receivers comprising at least M.sub.min+1 receivers, wherein M.sub.min is the minimum number of microphones required by the calibration module, the method comprising: selecting a sub-array of microphones from said array of receivers, said sub-array comprising a number of microphones less than the number of receivers of said array of receivers; determining the position of said array of microphones based on the signals and echoes received only by said sub-array of receivers.
8. The method according to claim 7, wherein said selecting is based on the knowledge of which microphones of the array of microphones are close in space.
9. The method of claim 1, wherein the location and/or the orientation of the reflective surfaces is known.
10. The method of claim 1, said signal being an acoustic signal.
11. The method of claim 10, said reflective surface being a wall of a room.
12. A method according to claim 1, wherein the position of at least one of the receivers in the array is determined based on the times of arrival or time differences of arrival of the echoes and based on the assignments of the echoes to the corresponding reflective surfaces.
13. A system for determining the position of at least one receiver of an array of receivers, to find the absolute or the relative location of the receivers in an array of receivers, comprising: at least one transmitter for sending a signal; said array of receivers for receiving an impulse response containing the transmitted signal and the echoes of the transmitted signals as reflected by one or more reflective surfaces, wherein each of the echoes corresponds substantially to a Dirac in the impulse response; a processor for sorting said echoes of the impulse response by assigning each of said echoes to one reflective surface or to a combination of reflective surfaces, wherein the first computing module is configured to sort said echoes by assigning each echo to one reflective surface or to a combination of reflective surfaces, by selecting a subset of echo assignments from a set of all possible echo assignments, and determining if that selected echo assignment is valid by checking if the times of arrival of all of the echoes in the selected subset satisfy a predefined system of equations; the processor further configured for determining the position of at least one of the receivers in the array based on the assignments of the echoes to the corresponding reflective surfaces, wherein in a first embodiment, the receivers are microphones, the transmitter is a loudspeaker and the signal is an acoustic signal or in a second embodiment, the receivers and the transmitter are mobile devices and the signal is an RF signal or in a third embodiment, said transmitter is a light source, said array of receivers is an array of light sensitive devices such as photodiodes or cameras and said signal is a light signal.
14. The system of claim 13, said reflective surface being a wall of a room.
15. The system of claim 13, said reflective surface being a mirror.
16. A computer program product, comprising: a tangible non-transitory computer usable medium including computer usable program code for determining the position of at least one receiver of an array of receivers, said receivers of the array being configured for receiving a signal transmitted by a transmitter, and echoes of the transmitted signal as reflected by one or more reflective surfaces, the computer usable program code performs the following steps, when executed on a processor: Receiving, in the processor, an impulse response measured at the array of receivers, wherein the impulse response contains the echoes of the transmitted signal, wherein each of the echoes corresponds substantially to a Dirac in the impulse response; Sorting, in the processor, said echoes of the impulse response, by assigning each of said echoes to one reflective surface or to a combination of reflective surfaces, wherein the sorting said echoes by assigning each echo to one reflective surface or to a combination of reflective surfaces, comprises selecting a subset of echo assignments from a set of all possible echo assignments, determining if that selected echo assignment is valid by checking if the times of arrival of all of the echoes in the selected subset satisfy a predefined system of equations; Determining, in the processor, the position of at least one of the receivers in the array based on the assignments of the echoes to the corresponding reflective surfaces wherein in a first embodiment, the receivers are microphones, the transmitter is a loudspeaker and said signal is an acoustic signal or in a second embodiment, the receivers and the transmitter are mobile devices and the signal is an RF signal or in a third embodiment, said transmitter is a light source, said array of receivers is an array of light sensitive devices such as photodiodes or cameras and said signal is a light signal.
17. A method for determining the position of at least one receiver of an array of receivers, each receiver being configured for receiving a signal transmitted by at least one transmitter, and echoes of the transmitted signal as reflected by one or more reflective surfaces, said method comprising the following steps: Receiving, in a processor, an impulse response measured at the array of microphones, wherein the impulse response contains the echoes of the transmitted signal, wherein each of the echoes corresponds substantially to a Dirac in the impulse response; Sorting, in the processor, said echoes received by the array of receivers, by assigning each echo to one reflective surface or to a combination of reflective surfaces; wherein said sorting comprises: selecting a set of echoes from the echoes as received by the receiver; assigning each echo of the set to one reflective surface or to a combination of reflective surfaces; wherein the method further comprises the step of determining, in the computing module, the position of at least one of the receivers in the array based on the assignments of the echoes to the corresponding reflective surfaces, wherein the step of determining the position comprises: selecting a calibration module among a number of calibration modules; running said calibration module for all the echoes' assignments; calculating a fit parameter for each running of said calibration module; selecting an echoes' assignment on the basis of said fit parameter, wherein in a first embodiment, the receivers are microphones, the transmitter is a loudspeaker and said signal is an acoustic signal or in a second embodiment, the receivers and the transmitter are mobile devices and the signal is an RF signal or in a third embodiment, said transmitter is a light source, said array of receivers is an array of light sensitive devices such as photodiodes or cameras and said signal is a light signal.
18. A method according to claim 17 wherein the fit parameter is calculated as the discrepancy between the measured data, and the data that would have been generated by receivers at estimated positions.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The invention will be better understood with the aid of the description of an embodiment given by way of example and illustrated by the figures, in which:
(2)
(3)
(4)
(5)
DETAILED DESCRIPTION OF POSSIBLE EMBODIMENTS OF THE INVENTION
(6) The present invention will now be described in more detail in connection with its embodiment for calibrating an array of microphones (or, in general, of receivers).
(7) The present invention will now be described in more detail in connection with an audio signal. However, the present invention is also applicable to other kinds of signals, e.g. and in a non-limiting way a light signal, an RF signal, a UWB signal, an ultrasound signal, etc.
(8) The present invention will now be described in more detail in connection with a room. However, the present invention does not necessarily need to be applied in a closed room. The presence of one or more reflective surfaces is sufficient.
(9) The first and second order echoes concept is described in
(10)
(11) A first audio signal transmitted by the source s is reflected by the wall w2. The reflected signal or echo e1 is then received by the receiver r. Since there is a single reflection of the transmitted signal before its reception by the receiver r, the echo e1 is a first-order echo. A second audio signal transmitted by the source is reflected first by the wall w2 and then by the wall w1; the reflected signal or echo e2 is then received by the receiver r. Since there are two reflections of the transmitted signal before its reception by the receiver r, the echo e2 is a second-order echo.
(12) The time of arrival (TOA) is defined as the travel time from a source s to a receiver r. The audio signals e1 and e2 can have different times of arrival (TOAs).
(13) In the context of the present invention, the expression time of arrival or TOA indicates either the absolute propagation time of an echo between the transmitter and the receiver, or the difference between the time of arrival of an echo and the time of arrival of another (reference) echo. In the first case the transmitter and the receiver are synchronized; in the second case they are not.
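The two meanings of TOA can be illustrated with a minimal sketch. The function names, the example geometry and the propagation speed of 343 m/s are assumptions made for illustration only; they do not come from the described method.

```python
import numpy as np

def absolute_toas(sources, receiver, c=343.0):
    """Propagation time from each (real or virtual) source to the receiver.
    Usable when transmitter and receiver are synchronized."""
    sources = np.asarray(sources, dtype=float)
    receiver = np.asarray(receiver, dtype=float)
    return np.linalg.norm(sources - receiver, axis=1) / c

def relative_toas(arrival_times, reference=0):
    """Arrival times expressed relative to a reference echo.
    Usable when the emission time is unknown (no synchronization)."""
    arrival_times = np.asarray(arrival_times, dtype=float)
    return arrival_times - arrival_times[reference]

# Hypothetical geometry: a real source and one image source on the x axis.
sources = [[0.0, 0.0, 0.0], [-343.0, 0.0, 0.0]]
receiver = [343.0, 0.0, 0.0]

toa = absolute_toas(sources, receiver)
# An unknown emission time shifts every arrival equally and cancels in the difference.
unknown_emission_time = 0.25
tdoa = relative_toas(toa + unknown_emission_time)
```

The unknown emission time drops out of `tdoa`, which is why the unsynchronized case can still be handled with time differences.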
(14)
(15) {tilde over (s)}.sub.ij is the image of {tilde over (s)}.sub.i with respect to the wall (i+1). It is thus a second generation image source, generating a second-order echo.
(16) Unlike the real source s, the virtual sources {tilde over (s)}.sub.i or {tilde over (s)}.sub.ij are not real, tangible and concrete sources. In other words, they are abstract objects used for studying the signal reflections, according to the well-known image-source theory, used e.g. in optics.
(17) The method according to the invention uses the image source model. The idea in the image source model is that if there is a sound source on one side of the wall, then the sound field on the same side can be represented as a superposition of the original sound field and the one generated by a mirror image of the source with respect to the wall.
(18) In the case of
$$\tilde{s}_i = s + 2\,\langle p_i - s,\, n_i \rangle\, n_i$$
where $p_i$ is any point on the ith wall and $n_i$ is the unit normal of that wall.
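The mirroring of a source across a wall can be sketched as follows. This is a minimal illustration of the standard image-source reflection; the function name `image_source` and the example coordinates are hypothetical, and the wall is assumed to be an infinite plane given by a point on it and a normal vector.

```python
import numpy as np

def image_source(s, p, n):
    """Mirror the source s across the plane through point p with normal n."""
    s, p, n = (np.asarray(v, dtype=float) for v in (s, p, n))
    n = n / np.linalg.norm(n)  # normalise so the formula holds for any normal length
    return s + 2.0 * np.dot(p - s, n) * n

# Hypothetical example: wall x = 0 (normal along x), source at (1, 2, 1.5).
first_order = image_source([1.0, 2.0, 1.5], p=[0.0, 0.0, 0.0], n=[1.0, 0.0, 0.0])

# Mirroring the first image across a second wall (y = 0) gives a
# second generation image source, which models a second-order echo.
second_order = image_source(first_order, p=[0.0, 0.0, 0.0], n=[0.0, 1.0, 0.0])
```

Applying the same function repeatedly is one simple way to enumerate higher-generation image sources.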
(19) By observing the impulse response and doing appropriate computations it is possible to access the first-order echoes, but also higher-order echoes.
(20) In other words the sound propagation inside a space defined by one or more reflective surfaces, e.g. a room, can be modelled by the room impulse response (RIR). An RIR describes the acoustic channel between the source s and the receiver r inside the room. It depends on the shape of the room and locations of the loudspeaker s and the microphone r. Ideally, it is a train of Diracs, each corresponding to an echo:
(21) $$h(t) = \sum_i c_i\, \delta(t - t_i)$$
(22) where c.sub.i and t.sub.i are the amplitude and time of arrival of the ith echo.
(23) As described, the loudspeaker s does not need to be synchronized with the microphone r. Without synchronization, only the differences between the times of arrival of the echoes at the microphone r can be measured.
(24) If the loudspeaker s and the microphone r are synchronised, the time of arrival corresponds to the absolute propagation times of the signal between the loudspeaker s and the microphone r.
(25) The microphone r hears the convolution of the signal transmitted by the loudspeaker s with the RIR. By measuring the RIR it is possible to access the echo times t.sub.i. These echo times can be linked to the room geometry and the microphone location with the image source model. According to this model, it is possible to replace an echo from a wall by a virtual source behind the wall in a mirrored location of the original source.
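The relation between the RIR, the received signal and the echo times can be sketched numerically. This is an idealised illustration only: the sampling rate, echo times, amplitudes and the simple threshold used for peak picking are all assumptions, and a measured RIR would need a more robust peak detector.

```python
import numpy as np

def dirac_train_rir(echo_times, amplitudes, fs, length):
    """Sampled RIR made of one scaled unit sample per echo."""
    h = np.zeros(length)
    idx = np.round(np.asarray(echo_times) * fs).astype(int)
    np.add.at(h, idx, amplitudes)
    return h

fs = 8000                            # assumed sampling rate in Hz
echo_times = [0.005, 0.0125, 0.020]  # assumed echo times t_i in seconds
amplitudes = [1.0, 0.6, 0.35]        # assumed echo amplitudes c_i

h = dirac_train_rir(echo_times, amplitudes, fs, length=256)

# Idealised impulsive emission (e.g. a very short click).
x = np.zeros(64)
x[0] = 1.0

# The microphone signal is the convolution of the emitted signal with the RIR.
y = np.convolve(x, h)

# With an ideal impulse, the echo times can be read off as the peak positions.
recovered = np.flatnonzero(y > 0.1) / fs
```

With a non-impulsive emitted signal, the RIR would first have to be estimated (for example by deconvolution) before the peaks could be read off in this way.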
(26) The described image source model is used for calibrating an array of receivers, e.g. an array of microphones.
(27) The inventive method uses a known calibration module or method, e.g. the calibration method described in M. Pollefeys and D. Nister, "Direct computation of sound and microphone locations from time-difference-of-arrival data," in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2008, pp. 2445-2448, as a black-box building block.
(28) Any other known calibration module or method can be used as a building block of the present method.
(29) As discussed, known calibration modules or methods do not exploit reverberation at all.
(30)
(31) In the illustrated example, the sources $s_i$ produce some impulsive sound whose arrival times $\tau_i$ can be precisely measured by the receivers $r_j$. In general, the sources $s_i$ can produce any kind of signal that allows the impulse response between the sources and the microphones to be estimated.
(32) In one preferred embodiment the receivers r.sub.j are synchronized. However, as discussed, the synchronisation is not necessary for the working of the system.
(33) The location of the array of receivers r.sub.j is unknown, the location of the array of sources s.sub.i is unknown, and the location of one or more reflective surfaces (not illustrated) is unknown as well.
(34) The array of receivers $r_j$ then measures
$$\delta_{ij} = c\,\tau_i + \lVert s_i - r_j \rVert_2$$
(35) where
(36) $\{s_k\}_{k=1}^{K}$ denotes the source positions,
(37) $\{r_m\}_{m=1}^{M}$ denotes the microphone positions,
(38) $\tau_i$ is the offset time associated with the ith source.
(39) Thus it is possible to measure the matrix $\Delta = (\delta_{ij})$.
(40) A calibration module (black box), named Calibrate, that computes the unknown locations $\{r_m\}$, $\{s_k\}$ and offsets $\{\tau_k\}$ from $\{\delta_{ij}\}$ is then selected.
(41) It is then possible to write
$$(\hat{R}, \hat{S}, \hat{\tau}, \epsilon) = \mathrm{Calibrate}(\Delta)$$
(42) where $\epsilon \ge 0$ is a fit parameter, denoting some measure of fit. If $\hat{R}$, $\hat{S}$ and $\hat{\tau}$ perfectly generate $\Delta$, then $\epsilon = 0$. The symbol ^ over $R$, $S$ and $\tau$ indicates an estimate of $R$, $S$ and $\tau$ respectively.
(43) The measure of fit is computed as the discrepancy between the measured data, and the data that would have been generated by receivers at estimated positions:
(44)
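One possible way to compute such a discrepancy is sketched below. It assumes that each measurement is the source offset scaled by the propagation speed plus the source-to-receiver distance, and it assumes a sum of squared residuals as the measure of fit; both the function names and the choice of a squared error are illustrative assumptions, not the definitive form used by any particular calibration module.

```python
import numpy as np

def predicted_measurements(S, R, tau, c=343.0):
    """K x M matrix of c * offset_i + distance(source_i, receiver_j)."""
    S, R, tau = (np.asarray(v, dtype=float) for v in (S, R, tau))
    dist = np.linalg.norm(S[:, None, :] - R[None, :, :], axis=2)
    return c * tau[:, None] + dist

def fit_parameter(Delta, R_hat, S_hat, tau_hat, c=343.0):
    """Assumed measure of fit: sum of squared differences between the measured
    data and the data the estimated positions and offsets would generate."""
    residual = np.asarray(Delta, dtype=float) - predicted_measurements(S_hat, R_hat, tau_hat, c)
    return float(np.sum(residual ** 2))

# Hypothetical ground truth: 2 sources, 3 microphones.
S = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
R = np.array([[0.0, 4.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 1.0]])
tau = np.array([0.0, 0.01])

Delta = predicted_measurements(S, R, tau)

eps_exact = fit_parameter(Delta, R, S, tau)             # estimates equal to the truth
eps_perturbed = fit_parameter(Delta, R + 0.05, S, tau)  # shifted microphone estimates
```

Estimates that reproduce the measurements exactly give a fit of zero, while any mismatch increases it, which is the property the echo sorting relies on.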
(45) Associated with the component Calibrate is the minimal number of microphones M.sub.min and sources K.sub.min required for estimation. In other words, the choice of the module Calibrate determines the minimal number of microphones M.sub.min and sources K.sub.min necessary for performing the calibration.
(46) The minimal dimensions of the matrix $\Delta$ are then $K_{\min} \times M_{\min}$.
(47) The present invention is based on the observation that in a room (or more generally, in the presence of one or more reflective surfaces), the acoustic sources {s.sub.k} generate reflections equivalent to virtual sources.
(48) The present invention is based on exploitation of these additional sources, normally considered an annoyance.
(49)
(50) A challenge that appears in this setting is that it is not possible to address each virtual source {tilde over (s)}.sub.i individually, as they are not labelled. Moreover in the presence of multiple reflective surfaces, they are heard by the array of receivers in different orders.
(51) It is then necessary to perform an echo sorting. However in this case, the (relative) geometry of the array of the receivers is not known.
(52) The goal is to find the best fit among all possible echo assignments. This can be achieved by running the module Calibrate for all echo assignments, and taking as the correct assignment the one with the smallest fit parameter $\epsilon$.
(53) In one embodiment, only the echoes received by the array of receivers in a first temporal window are considered, regardless of their order.
(54) In another embodiment, a second temporal window will be used so as to reduce the number of combinations for echo sorting. That second window can depend on the largest dimension of the array of receivers.
(55) Performing the combinatorial search may be feasible for arrays with a small number of receivers, e.g. comprising fewer than 10 receivers. For large arrays of receivers, e.g. comprising more than 10 receivers (20, 100 or more), however, the number of combinations could become too large for the processing power of the computing module. If, for example, the computing module belongs to a mobile device, a fixed computer will probably be necessary instead.
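The growth of the search space can be illustrated with a rough count. The counting model below is an assumption made for illustration: every receiver is taken to hear the same number of echoes and to match them independently to the same number of (virtual) sources, which gives a factorial number of labelings per receiver.

```python
import math

def assignment_count(num_receivers, echoes_per_receiver):
    """Assumed count of joint labelings: (E!) ** M for M receivers with E echoes each."""
    return math.factorial(echoes_per_receiver) ** num_receivers

# Number of candidate assignments for arrays of increasing size, 4 echoes each.
for m in (2, 5, 10, 20):
    print(m, assignment_count(m, echoes_per_receiver=4))
```

Under this assumption the count grows exponentially with the number of receivers, which motivates restricting the search to sub-arrays or to a temporal window.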
(56) In this case, it is also possible to bootstrap the method by first running it for one or more sub-arrays of the array of receivers. In one preferred embodiment it is possible to have an idea about groups of receivers that are spatially close (e.g. this will be the case for large fixed arrays). Knowing which microphones are close in space is relevant, as proximity makes it more likely that the microphones will have picked up the same echoes, and in the same order. In spatially large arrays, it is not guaranteed that all the microphones will hear all the echoes.
(57) If any information about the array at all is available, different strategies and heuristics can be used to sample several sub-arrays from the whole array of receivers. It is possible to do it randomly, or by exploiting some conditions. For example, a necessary (but not sufficient) condition for the microphones to be close is that they receive the pulses at similar time instants.
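The timing condition can be turned into a simple sub-array sampling heuristic, sketched below. The function name, the tolerance and the example arrival times are hypothetical; since similar arrival times are only a necessary condition for proximity, the result is a candidate sub-array and not a guarantee that the microphones are close.

```python
import numpy as np

def candidate_subarray(first_arrivals, seed, tol):
    """Indices of microphones whose first pulse arrives within tol seconds
    of the first pulse at the seed microphone."""
    t = np.asarray(first_arrivals, dtype=float)
    return np.flatnonzero(np.abs(t - t[seed]) <= tol)

# Hypothetical first-pulse arrival times (seconds) at five microphones.
first_arrivals = [0.0100, 0.0103, 0.0250, 0.0098, 0.0400]

# Microphones that may be close to microphone 0.
sub = candidate_subarray(first_arrivals, seed=0, tol=0.001)
```

Several such candidates, built from different seed microphones, can then be passed to the echo sorting one at a time.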
(58) In one preferred embodiment the calibration can be performed one acoustic event at a time (e.g. a finger snap). This means that the echoes corresponding to different acoustic events will not overlap, as they will overlap only within one event. This is useful, as it is possible to know that the time offset will be the same for all virtual events corresponding to a single real event (and it will be equal to the offset of that event). Structured information like this can be exploited for modifying the Calibrate module.
(59) An example of an embodiment of the method according to the invention is given below:
(60) Algorithm 1: Basic indoor calibration
Input: Signals received by the microphones $\{y_m(t)\}_{m=1}^{M}$
Output:
    Microphone positions $r_m$
    Source positions $s_k$
    Unknown offsets $\tau_k$
Peak picking:
    For every $y_m(t)$ find the set of peaks (echoes), $T_m$
Initialization:
    $\epsilon_{best} \leftarrow +\infty$
For every echo assignment across $\{T_m\}$:
    Create the corresponding matrix $\Delta_i$
    $(R, S, \tau, \epsilon) = \mathrm{Calibrate}(\Delta_i)$
    If ($\epsilon < \epsilon_{best}$), then $(R_{best}, S_{best}, \tau_{best}, \epsilon_{best}) \leftarrow (R, S, \tau, \epsilon)$
End For
Once the correct echo assignment is learned, apply it to the whole array (in case subarrays were used).
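The search loop of Algorithm 1 can be sketched on a deliberately simplified toy problem. The `calibrate` function below is a hypothetical stand-in, not the black-box module referred to in the description: it assumes a 2D setting in which the real and image source positions are already known, the receiver is synchronized with the source, and only one microphone position is unknown. Only the structure of the loop (try every echo assignment, keep the one with the smallest fit parameter) is meant to carry over.

```python
import itertools
import numpy as np

def calibrate(anchors, dists):
    """Toy stand-in for Calibrate: least-squares position of one receiver from
    distances to known (real and image) sources, plus a squared-misfit epsilon."""
    a0, d0 = anchors[0], dists[0]
    # Subtracting the first range equation from the others gives a linear system.
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
         - dists[1:] ** 2 + d0 ** 2)
    r_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    eps = float(np.sum((np.linalg.norm(anchors - r_hat, axis=1) - dists) ** 2))
    return r_hat, eps

# Assumed geometry: real source, then its images across x = 0, y = 0 and x = 5.
anchors = np.array([[1.3, 1.9], [-1.3, 1.9], [1.3, -1.9], [8.7, 1.9]])
r_true = np.array([3.1, 0.8])

# The microphone only sees unlabelled echoes, ordered by arrival.
echoes = np.sort(np.linalg.norm(anchors - r_true, axis=1))

best_perm, r_best, eps_best = None, None, np.inf
for perm in itertools.permutations(range(len(echoes))):
    # perm[k] is the index of the echo attributed to source k.
    r_hat, eps = calibrate(anchors, echoes[list(perm)])
    if eps < eps_best:
        best_perm, r_best, eps_best = perm, r_hat, eps
```

In this toy case only the correct labelling is geometrically consistent, so its fit parameter is numerically zero and it wins the search; with noisy measurements the minimum would be small but non-zero.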
(61) As discussed, the method according to the invention makes it possible to reduce the number of sources required for calibration. For example, the calibration method described in M. Pollefeys and D. Nister, "Direct computation of sound and microphone locations from time-difference-of-arrival data," in IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2008, pp. 2445-2448, suggests using at least four sources for calibrating an array of 10 microphones. With the method according to the invention, the same calibration algorithm can be used with only one source, placed in a room comprising reflective walls, by exploiting the at least three virtual sources representing the echoes received by the array of microphones. In this case only a single emission time needs to be estimated, as it will be the same for all the image sources.
(62)
(63) Processor unit 304 serves to execute instructions for software that may be loaded into memory 306. Processor unit 304 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 304 may be implemented using one or more heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, the processor unit 304 may be a symmetric multi-processor system containing multiple processors of the same type.
(64) In some embodiments, the memory 306 shown in
(65) The communications unit 310 shown in
(66) The input/output unit 312 shown in
(67) Instructions for the operating system and applications or programs are located on the persistent storage 308. These instructions may be loaded into the memory 306 for execution by processor unit 304. The processes of the different embodiments may be performed by processor unit 304 using computer implemented instructions, which may be located in a memory, such as memory 306. These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 304. The program code in the different embodiments may be embodied on different physical or tangible computer readable media, such as memory 306 or persistent storage 308.
(68) Program code 316 is located in a functional form on the computer readable media 318 that is selectively removable and may be loaded onto or transferred to data processing system 300 for execution by processor unit 304. Program code 316 and computer readable media 318 form a computer program product 320 in these examples. In one example, the computer readable media 318 may be in a tangible form, such as, for example, an optical or magnetic disc that is inserted or placed into a drive or other device that is part of persistent storage 308 for transfer onto a storage device, such as a hard drive that is part of persistent storage 308. In a tangible form, the computer readable media 318 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory that is connected to data processing system 300. The tangible form of computer readable media 318 is also referred to as computer recordable storage media. In some instances, computer readable media 318 may not be removable.
(69) Alternatively, the program code 316 may be transferred to data processing system 300 from computer readable media 318 through a communications link to communications unit 310 and/or through a connection to input/output unit 312. The communications link and/or the connection may be physical or wireless in the illustrative examples. The computer readable media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.
(70) The different components illustrated for data processing system 300 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 300. Other components shown in
(71) Therefore, as explained at least in connection with
(72) In accordance with a further embodiment of the present invention, there is provided a computer data carrier storing presentation content created while employing the methods of the present invention.