SIGNAL PROCESSING APPARATUS, METHOD, AND SYSTEM
20220335923 · 2022-10-20
Assignee
Inventors
- Tingqiu Yuan (Shenzhen, CN)
- Libin Zhang (Beijing, CN)
- Huimin Zhang (Shenzhen, CN)
- Chang Liu (Shenzhen, CN)
CPC classification
G10K11/17881
PHYSICS
G10K2210/30231
PHYSICS
G10K2210/1081
PHYSICS
G10K11/17817
PHYSICS
International classification
Abstract
A signal processing apparatus is configured to preprocess a sound wave signal and output a processed audio signal through an electromagnetic wave. The signal processing apparatus includes: a receiving unit, configured to receive at least one sound wave signal; a conversion unit, configured to convert the at least one sound wave signal to at least one audio signal; a positioning unit, configured to determine position information related to the at least one sound wave signal; a processing unit, configured to determine a sending time point of at least one audio signal based on the position information and a first time point, where the first time point is a time point at which the receiving unit receives the at least one sound wave signal; and a sending unit, configured to send the at least one audio signal through the electromagnetic wave.
Claims
1. A signal processing method, wherein the signal processing method is applied to a signal processing apparatus, the signal processing apparatus preprocesses a sound wave signal and outputs a processed audio signal through an electromagnetic wave, and the signal processing method comprises: receiving at least one sound wave signal; converting the at least one sound wave signal to at least one audio signal; determining position information related to the at least one sound wave signal; determining a sending time point of the at least one audio signal based on the position information and a first time point, wherein the first time point is a time point at which the signal processing apparatus receives the at least one sound wave signal; and sending the at least one audio signal through the electromagnetic wave at the sending time point.
2. The method according to claim 1, further comprising: performing phase inversion processing on the at least one audio signal; and the sending the at least one audio signal through the electromagnetic wave comprises: sending, through the electromagnetic wave, the at least one audio signal on which phase inversion processing is performed.
3. The method according to claim 1, further comprising: determining a first distance and a second distance based on the position information, wherein the first distance is a distance between a sound source of the at least one sound wave signal and an electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and performing transfer adjustment on the at least one sound wave signal based on a difference between the first distance and the second distance, to determine a signal feature of the at least one audio signal, wherein the signal feature comprises an amplitude feature; and the sending the at least one audio signal through the electromagnetic wave comprises: sending the at least one audio signal to the electronic device at the sending time point through the electromagnetic wave.
4. The method according to claim 3, wherein the determining the sending time point of the at least one audio signal based on the position information and a first time point comprises: determining, based on a difference between a first duration and a second duration, a time point for sending the at least one audio signal, so that the at least one audio signal and the at least one sound wave signal arrive at the electronic device synchronously, wherein the first duration is a ratio of the difference between the first distance and the second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the audio signal.
5. The method according to claim 4, wherein the determining, based on the difference between the first duration and the second duration, a time point for sending the at least one audio signal comprises: when the first duration is greater than the second duration, determining, based on the difference between the first duration and the second duration, the time point for sending the at least one audio signal.
6. A signal processing method, wherein the signal processing method is applied to an electronic device, and comprises: receiving at least one sound wave signal; receiving at least one audio signal, a first time point, and first information through an electromagnetic wave from a signal processing apparatus, wherein the first time point is a time point at which the signal processing apparatus receives the at least one sound wave signal, and the first information is position information related to the at least one sound wave signal; and determining a playing time point of the at least one audio signal based on the first time point and the first information, wherein the audio signal is for performing noise reduction processing on the at least one sound wave signal.
7. The method according to claim 6, further comprising: performing phase inversion processing on the at least one audio signal.
8. The method according to claim 6, wherein the determining a playing time point of the at least one audio signal based on the first time point and the first information comprises: determining a first distance and a second distance based on the first information, wherein the first distance is a distance between a sound source of the at least one sound wave signal and the electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and determining the playing time point of the at least one audio signal based on a difference between a first duration and a second duration, so that the electronic device plays the audio signal when receiving the at least one sound wave signal, wherein the first duration is a ratio of a difference between the first distance and the second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the at least one audio signal is received.
9. The method according to claim 8, further comprising: performing transfer adjustment on the at least one audio signal based on the difference between the first distance and the second distance, to determine a signal feature of the at least one audio signal, wherein the signal feature comprises an amplitude feature.
10. The method according to claim 6, wherein the at least one audio signal comprises N audio signals, wherein N is a positive integer greater than 1; and the method further comprises: calculating an arithmetic average value of M signals for a same sound source, wherein M is a positive integer not greater than N.
11. A signal processing apparatus, comprising a microphone, a memory, a processor and a communication interface, wherein the microphone is configured to receive at least one sound wave signal; the memory is configured to store program instructions; the processor is coupled to the microphone and is configured to invoke the program instructions to: convert the at least one sound wave signal to at least one audio signal; determine position information related to the at least one sound wave signal; determine a sending time point of the at least one audio signal based on the position information and a first time point, wherein the first time point is a time point at which the microphone receives the at least one sound wave signal; and the communication interface is coupled to the processor and configured to send the at least one audio signal through an electromagnetic wave at the sending time point.
12. The signal processing apparatus according to claim 11, wherein the processor is further configured to invoke the program instructions to perform phase inversion processing on the at least one audio signal; and the communication interface is configured to send, through the electromagnetic wave, the at least one audio signal on which phase inversion processing is performed.
13. The signal processing apparatus according to claim 11, wherein the processor is further configured to invoke the program instructions to: determine a first distance and a second distance based on the position information, wherein the first distance is a distance between a sound source of the at least one sound wave signal and an electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and perform transfer adjustment on the at least one sound wave signal based on a difference between the first distance and the second distance, to determine a signal feature of the at least one audio signal, wherein the signal feature comprises an amplitude feature; and the communication interface is configured to send the at least one audio signal to the electronic device at the sending time point through the electromagnetic wave.
14. The signal processing apparatus according to claim 13, wherein the processor is further configured to invoke the program instructions to: determine, based on a difference between a first duration and a second duration, a time point for sending the at least one audio signal, so that the at least one audio signal and the at least one sound wave signal arrive at the electronic device synchronously, wherein the first duration is a ratio of the difference between the first distance and the second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the audio signal.
15. The signal processing apparatus according to claim 14, wherein the processor is further configured to invoke the program instructions to: when the first duration is greater than the second duration, determine, based on the difference between the first duration and the second duration, the time point for sending the at least one audio signal.
16. An electronic device, comprising a microphone, a memory, a processor and a communication interface, wherein the microphone is configured to receive at least one sound wave signal; the communication interface is coupled to the processor and configured to receive at least one audio signal, a first time point, and first information through an electromagnetic wave from a signal processing apparatus, wherein the first time point is a time point at which the signal processing apparatus receives the at least one sound wave signal, and the first information is position information related to the at least one sound wave signal; the memory is configured to store program instructions; and the processor is coupled to the microphone and is configured to invoke the program instructions to determine a playing time point of the at least one audio signal based on the first time point and the first information, wherein the audio signal is for performing noise reduction processing on the at least one sound wave signal.
17. The electronic device according to claim 16, wherein the processor is further configured to invoke the program instructions to perform phase inversion processing on the at least one audio signal.
18. The electronic device according to claim 16, wherein the processor is further configured to invoke the program instructions to: determine a first distance and a second distance based on the first information, wherein the first distance is a distance between a sound source of the at least one sound wave signal and the electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and determine the playing time point of the at least one audio signal based on a difference between a first duration and a second duration, so that the electronic device plays the audio signal when receiving the at least one sound wave signal, wherein the first duration is a ratio of a difference between the first distance and the second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the at least one audio signal is received.
19. The electronic device according to claim 18, wherein the processor is further configured to invoke the program instructions to: perform transfer adjustment on the at least one audio signal based on the difference between the first distance and the second distance, to determine a signal feature of the at least one audio signal, wherein the signal feature comprises an amplitude feature.
20. The electronic device according to claim 16, wherein the at least one audio signal comprises N audio signals, wherein N is a positive integer greater than 1; and the processor is further configured to invoke the program instructions to: calculate an arithmetic average value of M signals for a same sound source, wherein M is a positive integer not greater than N.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
[0156] The following describes embodiments of this disclosure with reference to the accompanying drawings. It is clear that the described embodiments are merely a part rather than all of embodiments of this disclosure. A person of ordinary skill in the art may learn that, with technology development and emergence of a new scenario, the technical solutions provided in embodiments of this disclosure are also applicable to a similar technical problem.
[0157] In the specification, claims, and accompanying drawings of this disclosure, terms “first”, “second”, and the like are intended to distinguish between similar objects but do not necessarily indicate a specific order or sequence. It should be understood that the data used in such a way are interchangeable in appropriate circumstances, so that embodiments described herein can be implemented in an order other than the content illustrated or described herein. Moreover, terms “include”, “comprise”, and any other variant thereof are intended to cover non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or modules is not necessarily limited to those expressly listed steps or modules, but may include other steps or modules not expressly listed or inherent to the process, the method, the product, or the device. Names or numbers of steps in this disclosure do not mean that the steps in the method procedure need to be performed in a time/logical sequence indicated by the names or numbers. An execution sequence of the steps in the procedure that have been named or numbered can be changed based on a technical objective to be achieved, provided that same or similar technical effects can be achieved. Division into the modules in this disclosure is logical division. In actual application, there may be another division manner. For example, a plurality of modules may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be implemented through some ports, and the indirect coupling or communication connection between modules may be in an electrical form or another similar form. This is not limited in this disclosure. 
In addition, modules or submodules described as separate components may be or may not be physically separated, or may be or may not be physical modules, or may be distributed on a plurality of circuit modules. Objectives of the solutions of this disclosure may be achieved by selecting some or all of the modules based on actual demands.
[0158] Current noise reduction technologies include active noise reduction and passive noise reduction. Active noise reduction superimposes a cancellation sound wave on an environmental noise to cancel out the impact of the environmental noise, where the cancellation sound wave has the same frequency and amplitude as the environmental noise, but a phase that differs from the phase of the environmental noise by 180°. In this way, a noise reduction effect is achieved. A passive noise reduction headset mainly surrounds the ears to form a closed space, or uses a sound insulation material such as a silicone earplug to block external noise. Because the passive noise reduction headset usually needs to block the ear canal or use a thick earmuff to achieve a noise reduction effect, both wearing comfort and the noise reduction effect are poor for users. However, an active noise reduction headset can overcome the disadvantage that the noise reduction effect of the passive noise reduction headset is not ideal. Therefore, the active noise reduction headset may become a standard configuration of a smartphone in the future, and is expected to play an important role in fields such as wireless connection, intelligent noise reduction, voice interaction, and biological monitoring.
[0159] Generally, active noise reduction (active noise cancellation, ANC) includes three types: feedforward noise reduction, feedback noise reduction, and integrated noise reduction. To better understand this solution, the following describes principles of the three types of active noise reduction. It should be noted that, in the conventional technology, there are mature technologies about how to implement a feedforward active noise reduction headset, a feedback active noise reduction headset, and an integrated active noise reduction headset. How to implement the three types of active noise reduction is not an inventive point of this disclosure.
[0162] In addition to the foregoing feedforward active noise reduction system and feedback noise reduction system, there is an integrated noise reduction system. As shown in
y(n) = wᵀ(n)·x(n)
e(n) = d(n) − y(n)
w(n+1) = w(n) + u·e(n)·x(n).
[0163] In the foregoing expressions, x(n) represents the input signal vector, w(n) represents a weight coefficient of an adaptive filter, the third formula is the filter coefficient update formula, and u represents a convergence factor (whose value may be chosen as needed). That is, the weight coefficient at the next time point is obtained by adding, to the weight coefficient at the current time point, an input proportional to the error. The purpose of the system is to obtain y(n) through continuous prediction based on e(n) and x(n), so as to minimize e(n).
[0164] In the foregoing three active noise reduction systems, because the sensor is disposed on the headset, the distance between the sensor and the ears is extremely small, and the headset needs to collect and process a noise within an extremely short period of time; for example, a headset on the market has only dozens of microseconds to sample, process, and play signals. Such a short period of time greatly limits the performance of an active noise reduction headset and lowers the upper limit of the headset's active noise reduction frequency. To resolve this problem, one manner is to remove the sensor (for example, a microphone) that is usually embedded in a headset and deploy it as an external sensor. For example, a headset user sits in an office and wears a noise reduction headset, and a microphone is installed at the door of the office to sense a noise in the corridor and transmit it to the headset at the speed of light. Because the speed of a wireless signal is far greater than the speed of sound, the headset has more time to process the signal and calculate a noise cancellation signal. This time advantage enables the headset to obtain information about a noise several milliseconds in advance, hundreds of times longer than the dozens of microseconds available to a conventional headset. In this way, noise reduction calculation is better performed. However, this solution also has a disadvantage: a microphone can be used to sense and eliminate only a single noise source. Consequently, this solution can be used only in an indoor environment dominated by a single noise source, and is not applicable to a scenario with a plurality of noise sources. For example, when there are a plurality of noise sources, the headset may receive, at different time points, noise signals collected by different microphones.
This solution does not provide, in the case of a plurality of noise sources, a method for the headset to perform processing based on noise reduction signals sent by a plurality of microphones to achieve a noise reduction effect. In addition, in this solution, it cannot be ensured that the noise reduction signal obtained by the headset from the noise collected by the microphone exactly cancels out the noise received by the headset. In other words, this solution provides only the idea of separating noise collection from noise reduction signal playing, but does not describe how to actually achieve a noise reduction effect after the sensor for collecting external noise is externally disposed. As a result, this solution cannot be put into practical application. To resolve this problem, this disclosure provides an audio signal processing method. The method is described in detail below.
[0165] Descriptions are first provided for a system architecture of this disclosure and a scenario to which this disclosure is applicable.
[0166] In this disclosure, any sound causing interference to a sound that a user wants to listen to is referred to as a noise. For example, when the user is using a headset, any sound causing interference to audio transmitted in the headset is a noise. For example, the noise may be a sound from an ambient environment. The technical solutions provided in this disclosure are applicable to a scenario in which there is one or more noise sources, and in particular, to a scenario in which there are a plurality of noise sources. In this disclosure, a noise source is sometimes referred to as a sound source or an acoustic source. When a difference between the noise source, the sound source, and the acoustic source is not emphasized, the noise source, the sound source, and the acoustic source have a same meaning. As shown in
[0167] The foregoing describes the system architecture and example applicable scenarios provided in this disclosure. The following describes the principle of how the signal processing apparatus and the electronic device cooperate with each other to implement noise reduction. In the technical solutions provided in this disclosure, transfer adjustment and time adjustment need to be performed on the audio signal collected by the signal processing apparatus. Transfer adjustment is performed so that the noise reduction signal played by the electronic device can have a signal feature that is the same as or similar to that of the audio signal of the noise collected by the electronic device. Time adjustment is performed so that the noise reduction signal played by the electronic device and the audio signal of the noise collected by the electronic device can cancel out each other. This enhances the noise reduction effect. Transfer adjustment may be performed by the signal processing apparatus or the electronic device, and the same is true of time adjustment. The following separately describes the cases in which transfer adjustment and time adjustment are performed by the signal processing apparatus or by the electronic device. In addition, transfer adjustment and time adjustment may be performed based on an actual path or an estimated path. When receiving signals sent by a plurality of signal processing apparatuses, the electronic device needs to process the received signals sent by the plurality of apparatuses. Moreover, the signal processing apparatus may recognize acoustic sources, to separate signals into a plurality of channels of audio signals based on the acoustic sources; this enables noise reduction processing to be performed more accurately.
In addition, considering that noise perception differs between the two ears due to their different positions, noise reduction processing may further be performed separately for each ear. This disclosure further describes these specific cases.
[0168]
[0169] As shown in
[0170] 501: A signal processing apparatus receives at least one first sound wave signal, and converts the at least one first sound wave signal to at least one audio signal.
[0171] The signal processing apparatus may receive the first sound wave signal by using a microphone device, or the signal processing apparatus may receive the first sound wave signal by using a microphone array. The microphone array is a system that includes a specific quantity of acoustic sensors (which are usually microphones) and that is configured to sample and process a spatial feature of a sound field. In other words, the microphone array includes a plurality of sensors distributed in space according to a particular topology structure. The microphone may convert a sound wave signal to an audio signal. In this disclosure, the signal processing apparatus converts the received first sound wave signal to the audio signal by using the microphone or the microphone array.
[0172] 502: The signal processing apparatus performs transfer adjustment on the at least one first sound wave signal based on first information.
[0173] The first information includes position information of an electronic device relative to the signal processing apparatus. The signal processing apparatus performs transfer adjustment on the first sound wave signal based on the first information.
[0174] In this embodiment of this disclosure, the position information of the electronic device relative to the signal processing apparatus may be obtained in a plurality of manners, and all methods for obtaining distances between devices in the conventional technology can be used in this embodiment of this disclosure. For example, the distance between the electronic device and the signal processing apparatus may be pre-specified, and in an actual application process the position of the electronic device relative to the signal processing apparatus is adjusted based on the pre-specified distance; the distance between the electronic device and the signal processing apparatus may be measured in advance; or a topology relationship between the electronic device and the signal processing apparatus may be obtained according to a positioning method, to obtain the position information of the electronic device relative to the signal processing apparatus. What this disclosure protects is how to use the position information of the electronic device relative to the signal processing apparatus; how to obtain the position information is not specifically limited in embodiments of this disclosure. The following uses a delay estimation positioning method as an example to describe how the signal processing apparatus performs transfer adjustment on the first sound wave signal based on the first information, so that a signal feature of a first audio signal is the same as or close to a signal feature of a second sound wave signal.
[0175] The time delay estimation positioning method is a sound source positioning method widely used in the industry. When the signal processing apparatus receives the first sound wave signal by using a vector microphone, or the signal processing apparatus receives the first sound wave signal by using a microphone array, the signal processing apparatus may position a sound source that sends the sound wave signal, or may position the electronic device. The following provides descriptions by using an example in which the signal processing apparatus receives the first sound wave signal by using a microphone array.
[0176] The electronic device may send a sound wave signal with a fixed frequency or fixed content at intervals, and the signal processing apparatus receives the sound wave signal by using a microphone array. As shown in
[0177] In this case, a distance from the electronic device to each microphone satisfies d_i = √((x − x_i)² + (y − y_i)² + (z − z_i)²).
[0178] Differences between distances from the electronic device to the microphones satisfy d_ij = d_i − d_j, where i = 0, 1, …, N and j = 0, 1, …, N.
[0179] A relative delay between different microphones satisfies d̂_ij = c·τ_ij, where i = 0, 1, …, N and j = 0, 1, …, N; c represents the speed of sound, and the speed of sound in this disclosure is the propagation speed of sound in the air; and τ_ij represents the delay between the i-th microphone and the j-th microphone.
[0180] In the foregoing expressions, the distance between microphones is known, and the speed of sound is also known. An approximate spatial position of the electronic device may be obtained by solving the foregoing expressions together. It is assumed that the spatial coordinates of the electronic device are determined through calculation as p2 = (x_d, y_d, z_d). The coordinates of the signal processing apparatus satisfy p1 = (0, 0, 0), that is, the signal processing apparatus is set as the origin of the spatial coordinate system. Because the coordinates of the electronic device are known, the distance d1 between the signal processing apparatus and the electronic device may be calculated as follows: d1 = √((0 − x_d)² + (0 − y_d)² + (0 − z_d)²).
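The distance calculation above can be sketched as follows. The device coordinates `p2` are hypothetical example values (the actual values come from solving the delay equations), and the speed of sound is taken as approximately 343 m/s in air:

```python
import math

C_SOUND = 343.0  # approximate propagation speed of sound in air, m/s

def distance(p, q):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def delay_to_range_difference(tau_ij):
    """d̂_ij = c * τ_ij: convert an inter-microphone delay to a range difference."""
    return C_SOUND * tau_ij

# Signal processing apparatus at the origin; hypothetical solved device position.
p1 = (0.0, 0.0, 0.0)
p2 = (1.2, 0.8, 0.3)  # (x_d, y_d, z_d), example values only
d1 = distance(p1, p2)
```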
[0181] In a specific implementation, the first information may be the distance d1 between the signal processing apparatus and the electronic device.
[0182] In a specific implementation, the first information may be spatial coordinates of the electronic device and the signal processing apparatus in a same spatial coordinate system.
[0183] When a sound wave signal propagates in the air, amplitude attenuation and phase shift occur. Both are related to the transfer distance of the sound wave, and the relationship between the transfer distance and the amplitude attenuation and phase shift belongs to the conventional technology. As an example, this disclosure provides a method for performing transfer adjustment based on a distance. Transfer adjustment in this disclosure includes amplitude adjustment or phase shift adjustment. Under an ideal propagation condition, the relationship between the signal received by a receive end and the signal sent by a transmit end is as follows:
x(t) = s(t) * h(t) = a·s(t − τ).
[0184] h(t) represents the impulse response of a linear time-invariant system, a represents the amplitude attenuation, and τ represents the transmission delay.
[0185] A frequency domain expression is as follows:
X(ω) = S(ω)H(ω) = S(ω)G(r, r₀, ω).
[0186] r₀ represents the spatial coordinate point of the transmit end, and G(r, r₀, ω) represents a Green's function. For free-space propagation, the Green's function takes the standard form G(r, r₀, ω) = e^(−jk|r − r₀|) / (4π|r − r₀|), where k = ω/c is the wave number.
[0187] In a specific implementation, r − r₀ represents the distance d1 between the signal processing apparatus and the electronic device. The signal X(ω) obtained after transmission over the distance d1 may be obtained by using the frequency-domain expression, and the time-domain signal x(n) may then be obtained by transforming X(ω) to the time domain. This process is the process in which the signal processing apparatus performs transfer adjustment on the first sound wave signal based on the first information. To be specific, the signal processing apparatus may learn, based on the value of d1, the signal received by the electronic device after the signal is transmitted over the distance d1, so that the signal processing apparatus may perform transfer adjustment on the first audio signal. It may be understood that the signal processing apparatus predicts, in advance, a signal feature of the audio signal corresponding to the signal that is sent by a sound source and received by the electronic device. The prediction may specifically include amplitude prediction and phase prediction. It should be noted that, in this embodiment of this disclosure, transfer adjustment is performed only based on an estimated path. Scenarios to which this embodiment of this disclosure is applicable include but are not limited to a scenario in which a topology node cannot obtain position information of the sound source, or in which the distance between the sound source and the signal processing apparatus is very short, for example, when the signal processing apparatus is deployed at the position of the sound source. In these scenarios, it may be considered that the distance d1 between the signal processing apparatus and the electronic device is the transmission path of the first audio signal, and the signal obtained after the audio signal corresponding to the first sound wave signal is transferred over the distance d1 is used by the electronic device to determine a noise reduction signal.
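As a rough sketch of the transfer adjustment step, the following applies the free-space amplitude attenuation 1/(4πd1) and a propagation delay rounded to whole samples. This is a simplified time-domain model under the ideal-propagation assumption above; the sampling rate and distance are example values, and a real implementation would work in the frequency domain with fractional delays:

```python
import math

def transfer_adjust(signal, d1, fs, c=343.0):
    """Predict the signal after propagating distance d1 (meters).

    Applies the free-space attenuation a = 1/(4*pi*d1) and a delay of
    d1/c seconds, rounded to an integer number of samples at rate fs.
    The output has the same length as the input (zero-padded at the front).
    """
    a = 1.0 / (4.0 * math.pi * d1)
    delay = min(int(round(d1 / c * fs)), len(signal))
    return [0.0] * delay + [a * s for s in signal[:len(signal) - delay]]
```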
[0188] It should be noted that, in some embodiments, before the signal processing apparatus performs transfer adjustment on the first sound wave signal, the signal processing apparatus may further perform phase inversion processing on the audio signal corresponding to the first sound wave signal, that is, the signal processing apparatus may perform phase inversion processing on the collected audio signal, so that a phase of the first audio signal is opposite to a phase of the collected audio signal. Phase inversion processing may be performed on the collected audio signal in different manners. For example, it is assumed that the audio signal collected by the signal processing apparatus is p1(n). In this case, the signal processing apparatus may directly perform phase inversion on a sampled and quantized audio signal p1(n), that is, invert a phase of a symbol at each sampling point to obtain a phase-inverted signal of p1(n). A complete active noise reduction system may be further deployed on the signal processing apparatus to obtain a phase-inverted signal y(n). The active noise reduction system may be the foregoing feedforward active noise reduction system, feedback active noise reduction system, or integrated active noise reduction system. How to obtain the phase-inverted signal based on the active noise reduction system belongs to the conventional technology, and has been described above. Details are not described herein again.
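The first manner of phase inversion described above, inverting the phase of the symbol at each sampling point of the sampled and quantized signal p1(n), amounts to negating every sample (minimal sketch with a hypothetical helper name):

```python
def phase_invert(samples):
    """Invert the phase of the sampled and quantized audio signal p1(n) by
    inverting the sign at each sampling point."""
    return [-s for s in samples]

p1 = [0.2, -0.7, 0.0, 0.5]
assert phase_invert(p1) == [-0.2, 0.7, 0.0, -0.5]
# A signal superimposed with its phase-inverted copy cancels out:
assert all(a + b == 0 for a, b in zip(p1, phase_invert(p1)))
```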
[0189] In a specific implementation, the method may further include 503: The signal processing apparatus determines at least one first time point.
[0190] The first time point is a time point at which the signal processing apparatus receives the at least one first sound wave signal.
[0191] Corresponding to step 503, the method may further include step 504: The signal processing apparatus determines a sending time point of at least one first audio signal based on a difference between first duration and second duration.
[0192] The first duration is determined by the signal processing apparatus based on the first information and the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the first audio signal. For example, it is assumed that the distance between the signal processing apparatus and the electronic device is d1, the time point at which the signal processing apparatus receives the first sound wave signal is T1, and a time point at which the signal processing apparatus determines that the electronic device receives the first audio signal is T2. In this case, the signal processing apparatus performs delay processing on the audio signal corresponding to the first sound wave signal. Delay duration may be Δt=d1/c−(T2−T1), where c represents the speed of sound.
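The delay computation above can be sketched as follows (hypothetical helper name and example values; a negative Δt corresponds to the case, discussed below, in which the first audio signal is discarded rather than delay-processed):

```python
def sending_delay(d1, t1, t2, c=343.0):
    """Delay duration Δt = d1/c − (T2 − T1). A negative Δt means the direct
    sound would reach the electronic device first, so the first audio signal
    is discarded instead of being delay-processed."""
    dt = d1 / c - (t2 - t1)
    return dt if dt >= 0 else None

# d1 = 34.3 m of acoustic travel (~0.1 s); the radio path took 5 ms:
dt = sending_delay(34.3, t1=0.0, t2=0.005)
assert dt is not None and abs(dt - 0.095) < 1e-9
assert sending_delay(34.3, t1=0.0, t2=0.2) is None  # audio arrives too late
```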
[0193] In order that when the first audio signal arrives at the electronic device, the electronic device can obtain the noise reduction signal by performing only a small amount of processing, the sending time point of the first audio signal may be adjusted. For example, after the electronic device receives the first audio signal, the electronic device can play the first audio signal after performing only phase inversion processing on the first audio signal. If the signal processing apparatus has performed phase inversion processing on the obtained first sound wave signal so that the phase of the first audio signal is opposite to a phase of the first sound wave signal, the electronic device may directly play the first audio signal after receiving the first audio signal, that is, the first audio signal may be superimposed with and cancel out a noise signal received by the electronic device.
[0194] It should be noted that, in some specific application scenarios, if the signal processing apparatus does not perform step 504, the signal processing apparatus may send the first time point to the electronic device, or send the first time point and the first information to the electronic device. The electronic device performs, based on the first time point and the first information, time adjustment on the audio signal received by the electronic device. How the electronic device adjusts the received audio signal based on the first time point and the first information is described in an embodiment corresponding to
[0195] In a specific implementation, if Δt is less than 0, it indicates that the electronic device first receives a sound wave signal sent by a sound source, and then receives the at least one first audio signal sent by the signal processing apparatus through an electromagnetic wave. In this case, the electronic device does not learn of a signal feature of a noise in advance. Because the noise reduction signal determined by the electronic device based on such a late-arriving first audio signal cannot achieve a good noise reduction effect, the signal processing apparatus directly discards the first audio signal, without performing delay processing on the first audio signal.
[0196] 505: The signal processing apparatus sends the at least one first audio signal to the electronic device through an electromagnetic wave.
[0197] The first audio signal is used by the electronic device to determine a noise reduction signal, the noise reduction signal is for performing noise reduction processing on a second sound wave signal received by the electronic device, and the second sound wave signal and the first sound wave signal are signals sent by a same sound source. In a specific implementation, the signal processing apparatus may compress the phase-inverted signal through G.711 coding. The coding delay needs to be less than or equal to 1 ms, and may be as low as 0.125 ms.
[0198] In a specific implementation, the signal processing apparatus sends the first audio signal in a wireless manner such as Wi-Fi or Bluetooth, to ensure that a signal carrying a noise feature arrives at the electronic device earlier than a direct signal. The signal carrying the noise feature is the first audio signal, and the direct signal is the second sound wave signal sent by the sound source.
[0199] In a specific implementation, if there are a plurality of signal processing apparatuses, each of the plurality of signal processing apparatuses sends a first audio signal to the electronic device. Assuming that there are N signal processing apparatuses, where N is a positive integer, the electronic device receives N first audio signals. In this scenario, the method may further include 506: The electronic device determines a noise reduction signal based on an arithmetic average value of the N first audio signals. The N first audio signals may be obtained by processing, by different signal processing apparatuses, sound wave signals sent by one sound source, or the N first audio signals may be obtained by processing, by different signal processing apparatuses, sound wave signals sent by sound sources in different positions. The electronic device may determine, based on information (for example, second information in an implementation of this application) that is related to a sound source position and that is sent by the different signal processing apparatuses, whether the first audio signals are for a same sound source. It is assumed that M of the first audio signals are for a first sound source, that is, M signal processing apparatuses each process a received sound wave signal sent by the first sound source to obtain a first audio signal, and send the first audio signal to the electronic device. In this case, the electronic device determines an arithmetic average value of the first audio signals sent by the M signal processing apparatuses. 
It should be noted that, if the M signal processing apparatuses can separate sound sources (an acoustic source separation technology is described below), the M signal processing apparatuses may send a plurality of first audio signals to the electronic device. Each of the plurality of first audio signals may be obtained by processing, by the signal processing apparatus, received sound wave signals sent by different sound sources. When receiving a plurality of first audio signals, the electronic device may calculate an arithmetic average value of first audio signals obtained through processing for a same sound source, and may finally obtain a plurality of arithmetic average values. Each of the plurality of arithmetic average values may be considered as a noise reduction signal, and the electronic device may directly play the noise reduction signal or play a noise reduction signal determined based on each of the plurality of arithmetic average values. If P first audio signals in the received N first audio signals are for different sound sources, and no other first audio signals in the N first audio signals are for a same sound source as the P first audio signals, the electronic device may directly play any one of the P first audio signals or play a noise reduction signal determined based on that first audio signal, where P is an integer. In other words, in a scenario with a plurality of sound sources, when the electronic device processes first audio signals, if the electronic device determines that the received first audio signals are signals obtained by processing signals for different sound sources by the signal processing apparatus, the electronic device determines only an arithmetic average value of first audio signals for a same sound source, without calculating an arithmetic average value of the plurality of received first audio signals.
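The per-source arithmetic averaging described above can be sketched as follows (hypothetical helper name and source identifiers; this sketch assumes each apparatus tags its first audio signal with a sound source identifier and that signals for a same source are time-aligned and of equal length):

```python
from collections import defaultdict

def per_source_noise_reduction(first_audio_signals):
    """first_audio_signals: (source_id, samples) pairs from the N signal
    processing apparatuses. Signals for a same sound source are arithmetically
    averaged; a source represented by a single signal is used as-is."""
    groups = defaultdict(list)
    for source_id, samples in first_audio_signals:
        groups[source_id].append(samples)
    return {
        sid: [sum(col) / len(sigs) for col in zip(*sigs)]
        for sid, sigs in groups.items()
    }

signals = [("s1", [1.0, 2.0]), ("s1", [3.0, 4.0]), ("s2", [5.0, 5.0])]
out = per_source_noise_reduction(signals)
assert out["s1"] == [2.0, 3.0]   # average of the two signals for source s1
assert out["s2"] == [5.0, 5.0]   # single signal for source s2, used directly
```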
[0200] In a specific implementation, the method may further include 507: The electronic device performs cross-correlation processing on the first audio signal and the second sound wave signal to determine a noise reduction signal.
[0201] A cross-correlation function represents a degree of correlation between two time sequences, that is, describes a degree of correlation between values of two signals at any two different time points. The two signals may be aligned in time by performing cross-correlation processing on the two signals. For example, a noise reduction effect may be further optimized by performing cross-correlation processing on the sound wave signal received by the electronic device and an electromagnetic wave signal received by the electronic device. It is assumed that the sound wave signal received by the electronic device is p2(n). In this case, the electronic device performs cross-correlation processing on p2(n) and the first audio signal received by the electronic device:
R(t) = Σ_n p2(n)·p1_c^t(n − t).

[0202] p1_c^t(n) represents the first audio signal. The electronic device records the Δt corresponding to a minimum value of R(t) in the foregoing formula, where Δt represents a delay value. p1_c^t(n) is delayed by the duration Δt to obtain a signal p1_{cΔt}^t(n), which is a phase-inverted signal aligned with p2(n) in time.
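A minimal sketch of this cross-correlation alignment (hypothetical helper name; it evaluates R(t) over non-negative lags and records the lag Δt at which R(t) is minimum, the minimum rather than the maximum being appropriate because the first audio signal is phase-inverted relative to p2(n)):

```python
def align_by_cross_correlation(p2, p1_inv):
    """Compute R(t) = sum_n p2(n) * p1_inv(n - t) over non-negative lags and
    return the lag with the minimum R(t); the minimum is used because p1_inv
    is phase-inverted relative to p2."""
    n = len(p2)
    def r(t):
        return sum(p2[i] * p1_inv[i - t] for i in range(t, n))
    return min(range(n), key=r)

p2 = [0.0, 0.0, 1.0, -1.0, 0.5]       # direct sound, delayed by 2 samples
p1_inv = [-1.0, 1.0, -0.5, 0.0, 0.0]  # phase-inverted copy, no delay
assert align_by_cross_correlation(p2, p1_inv) == 2
```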
[0203] In a specific implementation, the method may further include 508: The electronic device adjusts the first audio signal.
[0204] An error sensor may be deployed on the electronic device to collect an error signal e(n). A principle of calculating the required phase-inverted signal y(n) according to the FxLMS criterion has been described above. In the foregoing expression for calculating y(n) according to the FxLMS criterion, the reference signal x(n) is a collected external noise; in this embodiment, the first audio signal, which is an initial phase-inverted signal, is used as the reference signal x(n).
y(n) = wᵀ(n)x(n)
e(n) = d(n) + y(n)
w(n+1) = w(n) + μe(n)x(n).
[0205] Prediction is performed continuously based on e(n) and x(n) to obtain y(n), so as to minimize e(n).
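The continuous adaptation above can be sketched as follows (hypothetical helper; this is a plain LMS loop rather than a full FxLMS with secondary-path filtering, and the weight update is written in the gradient-descent direction w ← w − μe(n)x(n) so that e(n) = d(n) + y(n) is driven toward zero, since sign conventions for the update vary with how e(n) is defined):

```python
import math

def lms_anti_noise(d, x, mu=0.1, taps=4):
    """Minimal LMS sketch: y(n) = w^T(n)x(n) is the anti-noise,
    e(n) = d(n) + y(n) is the error-sensor signal, and the weights are
    updated so that e(n) is continuously driven toward zero."""
    w = [0.0] * taps
    errors = []
    for n in range(len(d)):
        xn = [x[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, xn))        # y(n) = w^T(n) x(n)
        e = d[n] + y                                     # e(n) = d(n) + y(n)
        w = [wk - mu * e * xk for wk, xk in zip(w, xn)]  # weight update
        errors.append(e)
    return errors

# A periodic noise used as both disturbance d(n) and reference x(n):
noise = [math.cos(0.3 * n) for n in range(2000)]
errs = lms_anti_noise(noise, noise)
# The residual error shrinks as the filter converges:
assert sum(abs(e) for e in errs[-100:]) < 0.01 * sum(abs(e) for e in errs[:100])
```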
[0206] In a specific implementation, alternatively, the first audio signal and a sound wave signal that is sent by a sound source and received by the electronic device may be superimposed, and the superimposed signal is used as the reference signal x(n).
[0207] 509: The electronic device plays the noise reduction signal.
[0208] After obtaining the final noise reduction signal, the electronic device may play the noise reduction signal by using a speaker of the electronic device, to implement an active noise reduction function. The noise reduction signal is for cancelling out the sound wave signal sent by the sound source and received by the electronic device.
[0209] It can be learned from the embodiment corresponding to
[0210] In the embodiment corresponding to
d2 = √((0 − x_s)² + (0 − y_s)² + (0 − z_s)²).
[0211] A distance d3 between the sound source and the electronic device may be further determined:
d3 = √((x_d − x_s)² + (y_d − y_s)² + (z_d − z_s)²).
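The two distances can be computed directly from the spatial coordinates (hypothetical coordinate values; the signal processing apparatus is taken as the origin, matching the d2 formula above):

```python
import math

def distance(p, q):
    """Euclidean distance between two points in a shared spatial coordinate system."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

source = (3.0, 4.0, 0.0)            # (x_s, y_s, z_s), hypothetical
device = (3.0, 0.0, 0.0)            # (x_d, y_d, z_d), hypothetical
origin = (0.0, 0.0, 0.0)            # signal processing apparatus at the origin

d2 = distance(origin, source)       # sound source to signal processing apparatus
d3 = distance(device, source)       # sound source to electronic device
assert d2 == 5.0 and d3 == 4.0
```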
[0212] According to the foregoing analysis, in some embodiments, to achieve a better noise reduction effect, the first sound wave signal may be processed based on the first information and the second information to obtain the first audio signal. The second information is position information of the sound source relative to the signal processing apparatus. The following describes this implementation.
[0213]
[0214] As shown in
[0215] 801: A signal processing apparatus receives at least one first sound wave signal, and converts the at least one sound wave signal to at least one audio signal.
[0216] Step 801 may be understood with reference to step 501 in the embodiment corresponding to
[0217] 802: The signal processing apparatus performs transfer adjustment on the at least one first sound wave signal based on first information and second information.
[0218] The first information may be understood with reference to descriptions about the first information in the embodiment corresponding to
[0219] In a specific implementation, the second information may be a distance d2 between the sound source and the signal processing apparatus.
[0220] In a specific implementation, the second information may be spatial coordinates of the sound source and the signal processing apparatus in a same spatial coordinate system.
[0221] As described in the embodiment corresponding to
x(t)=s(t)*h(t)=as(t−τ).
[0222] h(t) represents an impulse response of a linear time-invariant system, a represents amplitude attenuation, and τ represents a transmission delay.
[0223] A frequency domain expression is as follows:
X(ω) = S(ω)H(ω) = S(ω)G(r, r₀, ω).
[0224] r₀ represents a spatial coordinate point of the transmit end, and G(r, r₀, ω) represents a Green's function. For free-space propagation, the expression is as follows:

G(r, r₀, ω) = e^(−jω|r − r₀|/c) / (4π|r − r₀|)

where c represents the speed of sound.
[0225] In this embodiment of this disclosure, |r − r₀| is Δd, and Δd = d3 − d2. How to determine values of d3 and d2 has been described above. Details are not described herein again. A signal X(ω) obtained after transmission over the distance Δd may be obtained by using the frequency domain function, and then a time-domain signal x(n) may be obtained by transforming the signal X(ω) to a time domain. This process is a process in which the signal processing apparatus performs transfer adjustment on the first sound wave signal based on the first information and the second information.
[0226] In a specific implementation, the method may further include 803: The signal processing apparatus determines a first time point.
[0227] The first time point is a time point at which the signal processing apparatus receives the first sound wave signal.
[0228] Corresponding to step 803, the method may further include step 804: The signal processing apparatus determines a sending time point of at least one first audio signal based on a difference between third duration and second duration.
[0229] The third duration is a ratio of a difference between a first distance and a second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the first audio signal. For example, it is assumed that the distance between the signal processing apparatus and the sound source is d2, a distance between the sound source and the electronic device is d3, a time point at which the signal processing apparatus receives the first sound wave signal is T1, and the signal processing apparatus determines that a time point at which the electronic device receives the first audio signal is T2. In this case, the signal processing apparatus performs delay processing on the audio signal corresponding to the first sound wave signal. Delay duration may be Δt=(d3−d2)/c−(T2−T1), where c represents the speed of sound.
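A sketch of this sending-time computation (hypothetical helper name and example values; as with the earlier embodiment, a negative Δt corresponds to the case in which the first audio signal is discarded):

```python
def sending_delay_with_source(d2, d3, t1, t2, c=343.0):
    """Delay duration Δt = (d3 − d2)/c − (T2 − T1); a negative Δt means the
    first audio signal would arrive too late and is discarded instead."""
    dt = (d3 - d2) / c - (t2 - t1)
    return dt if dt >= 0 else None

# Source 1 m from the apparatus and 35.3 m from the device; radio delay 5 ms:
dt = sending_delay_with_source(d2=1.0, d3=35.3, t1=0.0, t2=0.005)
assert dt is not None and abs(dt - 0.095) < 1e-9
```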
[0230] In order that when the first audio signal arrives at the electronic device, the electronic device can obtain the noise reduction signal by performing only a small amount of processing, the sending time point of the first audio signal may be adjusted. For example, after the electronic device receives the first audio signal, the electronic device can play the first audio signal after performing only phase inversion processing on the first audio signal. If the signal processing apparatus has performed phase inversion processing on the obtained first sound wave signal so that a phase of the first audio signal is opposite to a phase of the first sound wave signal, the electronic device may directly play the first audio signal after receiving the first audio signal, that is, the first audio signal may be superimposed with and cancel out a noise signal received by the electronic device.
[0231] It should be noted that, in some specific application scenarios, if the signal processing apparatus does not perform step 804, the signal processing apparatus may send the first time point to the electronic device, or send the first time point, the first information, and the second information to the electronic device. The electronic device performs, based on the first time point, the first information, and the second information, time adjustment on the audio signal received by the electronic device. How the electronic device adjusts the received audio signal based on the first time point, the first information, and the second information is described in the embodiment corresponding to
[0232] It should be noted that, if the signal processing apparatus performs phase inversion processing on the audio signal corresponding to the first sound wave signal, that is, if the signal processing apparatus performs phase inversion processing on a collected audio signal so that the phase of the first audio signal is opposite to a phase of the collected audio signal, there may be different manners. For example, it is assumed that the audio signal collected by the signal processing apparatus is p1(n). In this case, the signal processing apparatus may directly perform phase inversion on a sampled and quantized audio signal p1(n), that is, invert a phase of a symbol at each sampling point, to obtain a phase-inverted signal of p1(n). A complete active noise reduction system may be further deployed on the signal processing apparatus to obtain a phase-inverted signal y(n). The active noise reduction system may be the foregoing feedforward active noise reduction system, feedback active noise reduction system, or integrated active noise reduction system. How to obtain the phase-inverted signal based on the active noise reduction system belongs to the conventional technology, and has been described above. Details are not described herein again.
[0233] In a specific implementation, if Δt is less than 0, it indicates that the electronic device first receives a sound wave signal sent by a sound source, and then receives the at least one first audio signal sent by the signal processing apparatus through an electromagnetic wave. In this case, the electronic device does not learn of a signal feature of a noise in advance. Because the noise reduction signal determined by the electronic device based on such a late-arriving first audio signal cannot achieve a good noise reduction effect, the signal processing apparatus directly discards the first audio signal, without performing delay processing on the first audio signal.
[0234] 805: The signal processing apparatus sends the at least one first audio signal to the electronic device through an electromagnetic wave.
[0235] Step 805 may be understood with reference to step 505 in the embodiment corresponding to
[0236] In a specific implementation, the method may further include 806: The electronic device determines a noise reduction signal based on an arithmetic average value of N first audio signals. Step 806 may be understood with reference to step 506 in the embodiment corresponding to
[0237] In a specific implementation, the method may further include 807: The electronic device performs cross-correlation processing on the first audio signal and the second sound wave signal to determine a noise reduction signal.
[0238] Step 807 may be understood with reference to step 507 in the embodiment corresponding to
[0239] In a specific implementation, the method may further include 808: The signal processing apparatus adjusts the first audio signal.
[0240] Step 808 may be understood with reference to step 508 in the embodiment corresponding to
[0241] 809: The electronic device plays the noise reduction signal.
[0242] Step 809 may be understood with reference to step 509 in the embodiment corresponding to
[0243] It can be learned from the embodiment corresponding to
[0244] In the embodiments corresponding to
[0245]
[0246] As shown in
[0247] 901: A signal processing apparatus receives at least one first sound wave signal, and converts the at least one sound wave signal to at least one audio signal.
[0248] Step 901 may be understood with reference to step 501 in the embodiment corresponding to
[0249] 902: An electronic device receives at least one second sound wave signal.
[0250] The electronic device may receive the second sound wave signal by using a microphone device, or the electronic device may receive the second sound wave signal by using a microphone array. In this application, the electronic device converts the received second sound wave signal to an audio signal by using the microphone or the microphone array.
[0251] 903: The signal processing apparatus performs digital processing on the at least one first sound wave signal to obtain at least one first audio signal.
[0252] 904: The signal processing apparatus determines a first time point, where the first time point is a time point at which the signal processing apparatus receives the at least one first sound wave signal.
[0253] 905: The electronic device receives, through an electromagnetic wave, the at least one first audio signal and the first time point that are sent by the signal processing apparatus.
[0254] The first audio signal is a signal obtained by performing digital processing on the received first sound wave signal by the signal processing apparatus, the first sound wave signal and the second sound wave signal are signals sent by a same sound source, and the first time point is a time point at which the signal processing apparatus receives the first sound wave signal.
[0255] 906: The electronic device processes the first audio signal based on first information and the first time point to obtain a noise reduction signal.
[0256] The first information includes position information of the electronic device relative to the signal processing apparatus. The noise reduction signal is for performing noise reduction processing on the second sound wave signal received by the electronic device.
[0257] In a specific implementation, the electronic device processes the first audio signal based on a difference between first duration and second duration, to determine a time point for playing the noise reduction signal. The first duration is determined by the electronic device based on the first information and a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the electronic device receives the first audio signal. For example, it is assumed that a distance between the signal processing apparatus and the electronic device is d1, a time point at which the signal processing apparatus receives the first sound wave signal is T1, and a time point at which the electronic device receives the first audio signal is T2. In this case, the electronic device performs delay processing on the first audio signal. Delay duration may be Δt=d1/c−(T2−T1), where c represents the speed of sound. In this implementation and the following implementations, the first information may be information prestored in the electronic device, or the first information may be sent by the signal processing apparatus to the electronic device. Specifically, the signal processing apparatus may send d1 to the electronic device; or the signal processing apparatus may send spatial coordinates, determined by the signal processing apparatus, of the signal processing apparatus and the electronic device in a same coordinate system. Alternatively, the first information may be obtained by the electronic device through measurement. For example, a vector audio collection manner may be configured on the electronic device to position the signal processing apparatus. The vector collection manner includes two methods: According to one method, a microphone array is deployed on the electronic device to perform vector collection. 
According to the other method, after another electronic device transmits scalar audio signals to the electronic device, the electronic device combines these audio signals and scalar audio signals collected by the electronic device into a virtual microphone array to perform vector collection. Obtaining distances between several devices according to a positioning method has been described in the embodiment corresponding to
[0258] In a specific implementation, the electronic device processes the first audio signal based on a difference between first duration and second duration, to determine a time point for playing the noise reduction signal. The first duration is determined by the electronic device based on the first information and a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the electronic device receives the first audio signal.
[0259] The electronic device performs transfer adjustment on the first audio signal based on the first information. As described above, when a sound wave signal is propagated in the air, amplitude attenuation and phase shift occur, and both are related to the transfer distance of the sound wave.
[0260] In a specific implementation, when the first duration is greater than the second duration, the electronic device processes the first audio signal based on the difference between the first duration and the second duration, to determine the time point for playing the noise reduction signal. Conversely, when the first duration is less than the second duration, it indicates that the electronic device first receives a sound wave signal sent by a sound source, and then receives the at least one first audio signal sent by the signal processing apparatus through the electromagnetic wave. In this case, the electronic device does not learn of a signal feature of a noise in advance. Because the noise reduction signal determined by the electronic device based on such a late-arriving first audio signal cannot achieve a good noise reduction effect, the electronic device directly discards the first audio signal, without performing delay processing on the first audio signal.
[0261] In a specific implementation, the electronic device determines a first distance and a second distance based on the first information and second information. The second information is position information of the sound source relative to the signal processing apparatus, the first distance is a distance between the sound source and the electronic device, and the second distance is a distance between the sound source and the signal processing apparatus.
[0262] The electronic device processes the first audio signal based on a difference between third duration and second duration, to determine a time point for playing the noise reduction signal. The third duration is a ratio of a difference between the first distance and the second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the electronic device receives the first audio signal. For example, it is assumed that the distance between the signal processing apparatus and the sound source is d2, the distance between the sound source and the electronic device is d3, a time point at which the signal processing apparatus receives the first sound wave signal is T1, and a time point at which the electronic device receives the first audio signal is T2. In this case, the electronic device performs delay processing on the first audio signal. Delay duration may be Δt=(d3−d2)/c−(T2−T1), where c represents the speed of sound. In this embodiment of this disclosure, the second information may be prestored in the electronic device, or the second information may be sent by the signal processing apparatus to the electronic device. Specifically, the second information may be the distance between the sound source and the signal processing apparatus; or the second information may be spatial coordinates, determined by the signal processing apparatus, of the sound source and the signal processing apparatus in a same spatial coordinate system. Alternatively, the second information may be obtained by the electronic device through measurement. For example, a vector audio collection manner may be configured on the electronic device to position the sound source. The vector collection manner includes two methods: According to one method, a microphone array is deployed on the electronic device to perform vector collection. 
According to the other method, after another electronic device transmits scalar audio signals to the electronic device, the electronic device combines these audio signals and scalar audio signals collected by the electronic device into a virtual microphone array to perform vector collection. Obtaining distances between several devices according to a positioning method has been described in the embodiment corresponding to
[0263] In a specific implementation, the electronic device processes the first audio signal based on a difference between third duration and second duration, to determine a time point for playing the noise reduction signal. The third duration is a ratio of a difference between a first distance and a second distance to a speed of sound, the second duration is a difference between a second time point and the first time point, the second time point is a time point at which the electronic device receives the first audio signal, the first distance is a distance between a sound source and the electronic device, and the second distance is a distance between the sound source and the signal processing apparatus.
[0264] The electronic device determines the first distance and the second distance based on the first information and second information. The first distance is the distance between the sound source and the electronic device, the second distance is the distance between the sound source and the signal processing apparatus, and the second information is position information of the sound source relative to the signal processing apparatus.
[0265] The electronic device performs transfer adjustment on the first audio signal based on the difference between the first distance and the second distance. As described above, when a sound wave signal is propagated in the air, amplitude attenuation and phase shift occur. Amplitude attenuation and phase shift are related to a transfer distance of a sound wave. In an ideal propagation condition, a relationship between a signal received by a receive end and a signal sent by a transmit end is as follows:
x(t)=s(t)*h(t)=a·s(t−τ).
[0266] h(t) represents an impulse response of a linear time-invariant system, a represents amplitude attenuation, and τ represents a transmission delay.
[0267] A frequency domain expression is as follows:
X(ω)=S(ω)H(ω)=S(ω)G(r,r_0,ω).
[0268] r_0 represents a spatial coordinate point of the transmit end, and G(r, r_0, ω) represents a Green's function. For free-space propagation, the Green's function may be expressed as follows:

G(r,r_0,ω)=e^(−jω|r−r_0|/c)/(4π|r−r_0|), where c represents the speed of sound.
[0269] In this embodiment of this disclosure, r−r_0 is Δd, and Δd=d3−d2. How to determine values of d3 and d2 has been described above. Details are not described herein again. A signal X(ω) obtained after transmission over Δd may be obtained by using the frequency domain function, and then a time-domain signal x(n) may be obtained by transforming the signal X(ω) to the time domain. This process is the process in which the electronic device performs transfer adjustment on the first audio signal based on the difference between the first distance and the second distance.
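As a rough illustration of this transfer adjustment, the following Python sketch applies a phase delay of Δd/c to the signal's spectrum together with an amplitude attenuation factor a; the function name and the simple scalar attenuation model are assumptions for illustration only:

```python
import numpy as np

def transfer_adjust(x, delta_d, fs, a=1.0, c=343.0):
    """Apply an extra propagation distance delta_d = d3 - d2 to signal x:
    multiply the spectrum X(omega) by a * exp(-1j * omega * delta_d / c),
    then transform back to the time domain."""
    n = len(x)
    X = np.fft.rfft(x)
    omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=1.0 / fs)
    G = a * np.exp(-1j * omega * delta_d / c)  # phase delay + attenuation
    return np.fft.irfft(X * G, n=n)
```

Delaying an impulse by a distance corresponding to an integer number of samples shifts its peak by exactly that many samples.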
[0270] In a specific implementation, when the third duration is greater than the second duration, the electronic device processes the first audio signal based on the difference between the third duration and the second duration, to determine the time point for playing the noise reduction signal. Conversely, when the third duration is less than the second duration, it indicates that the electronic device first receives the sound wave signal sent by the sound source, and only then receives the at least one first audio signal sent by the signal processing apparatus through the electromagnetic wave. In this case, the electronic device does not learn of a signal feature of the noise in advance, and a noise reduction signal determined based on the received first audio signal cannot achieve a good noise reduction effect. Therefore, the electronic device directly discards the first audio signal, without performing delay processing on the first audio signal.
[0271] In a specific implementation, the electronic device may further perform cross-correlation processing on the second sound wave signal and the first audio signal that is processed based on the first information and the first time point, to determine the noise reduction signal.
[0272] In a specific implementation, the electronic device determines the noise reduction signal based on an arithmetic average value of first audio signals. When N signal processing apparatuses each send a first audio signal to the electronic device, where N is a positive integer, the electronic device receives N first audio signals. The N first audio signals may be obtained by processing, by different signal processing apparatuses, sound wave signals sent by one sound source, or the N first audio signals may be obtained by processing, by different signal processing apparatuses, sound wave signals sent by sound sources in different positions. The electronic device may determine, based on information (for example, second information in an implementation of this disclosure) that is related to a sound source position and that is sent by the different signal processing apparatuses, whether the first audio signals are for the same sound source. It is assumed that the first audio signals are for the same sound source. For example, M first audio signals are for a first sound source. That is, M signal processing apparatuses each process a received sound wave signal sent by the first sound source, to obtain a first audio signal; and send the first audio signal to the electronic device. In this case, the electronic device determines an arithmetic average value of the first audio signals sent by the M signal processing apparatuses. It should be noted that, if the M signal processing apparatuses can separate sound sources (an acoustic source separation technology is described below), the M signal processing apparatuses may send a plurality of first audio signals to the electronic device. Each of the plurality of first audio signals may be obtained by processing, by the signal processing apparatus, received sound wave signals sent by different sound sources. 
When receiving a plurality of first audio signals, the electronic device may calculate an arithmetic average value of first audio signals obtained through processing for a same sound source, and may finally obtain a plurality of arithmetic average values. Each of the plurality of arithmetic average values may be considered as a noise reduction signal, and the electronic device may directly play the noise reduction signal.
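A minimal sketch of this per-source averaging, assuming each first audio signal arrives tagged with a source identifier (the dictionary layout is a hypothetical representation of the grouping, not an interface defined in this disclosure):

```python
import numpy as np

def noise_reduction_signals(signals_by_source):
    """signals_by_source maps a sound source id to the list of first audio
    signals (equal-length arrays) obtained for that source by the M signal
    processing apparatuses. Returns one arithmetic average per source; each
    average may be treated as a noise reduction signal."""
    return {src: np.mean(np.stack(sigs), axis=0)
            for src, sigs in signals_by_source.items()}
```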
[0273] In a specific implementation, if the signal processing apparatus does not perform phase inversion processing on the collected audio signal, the electronic device further needs to perform phase inversion processing on the first audio signal after receiving the first audio signal sent by the signal processing apparatus. For example, the electronic device may directly perform phase inversion on the first audio signal, that is, invert the sign of the value at each sampling point to obtain a phase-inverted signal of the first audio signal. Alternatively, a complete active noise reduction system may be deployed on the electronic device to obtain the phase-inverted signal. The active noise reduction system may be the foregoing feedforward active noise reduction system, feedback active noise reduction system, or integrated active noise reduction system. How to obtain the phase-inverted signal based on the active noise reduction system belongs to the conventional technology, and has been described above. Details are not described herein again.
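The direct phase inversion described here amounts to negating every sample; a one-line sketch:

```python
import numpy as np

def invert_phase(x):
    """Invert the sign of the value at each sampling point."""
    return -np.asarray(x, dtype=float)
```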
[0274] In a specific implementation, the electronic device may further adjust the first audio signal. This can be understood with reference to step 508 in the embodiment corresponding to
[0275] 907: The electronic device plays the noise reduction signal.
[0276] Step 907 may be understood with reference to step 509 in the embodiment corresponding to
[0277] It can be learned from the embodiment corresponding to
[0278] In the embodiments corresponding to
[0279] There may be more than one sound source in the embodiments corresponding to
[0280] All sound source extraction and separation technologies in the conventional technology can be used in embodiments of this disclosure. The following provides a sound source extraction and separation method.
[0281] N independent sound sources and M microphones are disposed. The M microphones may be deployed on a signal processing apparatus, or may be deployed on an electronic device. It is assumed that a sound source vector is as follows:
s(n)=[s_1(n), . . . , s_N(n)]^T,
[0282] an observation vector x(n) satisfies x(n)=[x_1(n), . . . , x_M(n)]^T, and a length of a hybrid filter is P. In this case, a convolutional mixing process may be expressed as follows:

x(n)=H(n)*s(n)=Σ_{p=0}^{P−1} H(p)s(n−p).
[0283] A hybrid network H(n) is an M×N matrix sequence and includes an impulse response of the hybrid filter. It is assumed that a length of a separation filter is L, and an estimated sound source vector y(n) satisfies y(n)=[y_1(n), . . . , y_N(n)]^T. In this case, an expression of y(n) is as follows:

y(n)=W(n)*x(n)=Σ_{l=0}^{L−1} W(l)x(n−l).
[0284] A separation network W(n) is an N×M matrix sequence and includes an impulse response of the separation filter, and “*” represents a matrix convolution operation.
[0285] The separation network W(n) may be obtained according to a frequency-domain blind source separation algorithm. Time-domain convolution is transformed to a frequency-domain product through an L-point short-time Fourier transform (short-time Fourier transform, STFT), that is,
X(m,f)=H(f)S(m,f)
Y(m,f)=W(f)X(m,f).
[0286] m is obtained by performing L-point down-sampling on a time index value n, X(m,f) and Y(m,f) are obtained by performing an STFT on x(n) and y(n) respectively, and H(f) and W(f) are Fourier transform forms of H(n) and W(n), where f∈[f_0, . . . , f_{L/2}], and f represents a frequency.
[0287] Y(m,f) obtained through blind source separation is inversely transformed back to a time domain, to obtain estimated sound source signals y_1(n), . . . , and y_N(n).
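The per-bin separation step Y(m,f)=W(f)X(m,f) can be sketched as follows; the separation matrices W(f) are assumed to have already been learned by a blind source separation algorithm, so here they are simply passed in:

```python
import numpy as np

def separate(X, W):
    """X: STFT observations, shape (frames, bins, M).
    W: separation matrices, shape (bins, N, M).
    Returns estimated sources Y, shape (frames, bins, N)."""
    # Apply each bin's N x M separation matrix to every frame of that bin.
    return np.einsum('fnm,tfm->tfn', W, X)
```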
[0288] In scenarios with a plurality of noise sources, the signal processing apparatus may separate collected audio signals into a plurality of channels of audio signals based on an acoustic source separation technology, and then process each channel of audio signal according to the embodiments corresponding to
[0289] In particular, in addition to scenarios in which there truly are a plurality of sound sources, the scenarios with the plurality of sound sources mentioned in this embodiment also include a scenario in which each sound source has a plurality of transmission paths (for example, when a sound wave is reflected by a wall in a room). In this scenario, the reflection paths may be considered as virtual sound sources: a virtual sound source has a direction different from that of the initial sound source, and its position is that of the corresponding reflection point. The reflection points are therefore treated as positions of virtual sound sources and processed as independent sound sources, so this is also a scenario with a plurality of sound sources. A recognition method for the sound sources may be the same as the algorithm in this embodiment.
[0290] In a specific implementation, when the electronic device is for dual-ear playback, for example, the electronic device is a noise reduction headset, considering that noise perception is different due to different positions of two ears, noise reduction processing may be further separately performed for the two ears in this application. The following describes this case.
[0291] Perception of a person for a spatial orientation of a sound: A spatial sound source is transferred to the two ears of the person over the air. Because the distances and orientations over which the sound wave travels to the two ears are different, the phase and the sound pressure of the sound wave heard by the left ear differ from those heard by the right ear. A person's perception of the spatial direction and distance of audio is formed based on this information.
[0292] A head-related transfer function (head-related transfer function, HRTF) describes a scattering effect of the head, pinnae, and the like on a sound wave, and an interaural time difference (interaural time difference, ITD) and an interaural level difference (interaural level difference, ILD) that result from the scattering effect, and reflects a process of transmitting the sound wave from a sound source to the two ears. A human auditory system compares the ITD with past auditory experience to precisely position the sound source. Virtual sound technology uses a signal processing method based on the HRTF to simulate and reproduce sound space information. In this way, a subjective space sense of a sound is reproduced for a listener.
[0293] That is, a binaural HRTF function essentially includes spatial orientation information, and HRTF functions for different spatial orientations are totally different. Common audio information of any single audio channel is convolved by using binaural HRTF functions of corresponding spatial positions separately to obtain audio information corresponding to the two ears. 3D audio can be experienced by playing the audio information by using a headset. Therefore, the HRTF function actually includes spatial information, and represents a function of transferring the sound wave from spatially different sound sources to the two ears.
[0294] The HRTF function is a frequency domain function. An expression of the HRTF function in time domain is referred to as a head-related impulse response (head-related impulse response, HRIR) or a binaural impulse response. The HRIR and the HRTF are a Fourier transform pair.
[0295]
[0296] As shown in
[0297] 1001: A first electronic device determines a noise reduction signal.
[0298] In a specific implementation, the first electronic device may determine the noise reduction signal with reference to the manner in which the electronic device determines the noise reduction signal in the embodiment corresponding to
[0299] In a specific implementation, the first electronic device may determine the noise reduction signal with reference to the manner in which the electronic device determines the noise reduction signal in the embodiment corresponding to
[0300] In a specific implementation, the first electronic device may determine the noise reduction signal with reference to the manner in which the electronic device determines the noise reduction signal in the embodiment corresponding to
[0301] 1002: The first electronic device determines spatial coordinates, corresponding to a case in which the first electronic device is the origin of coordinates, of a sound source relative to the first electronic device.
[0302] Based on coordinates S=(x_s, y_s, z_s) of the sound source sent by a signal processing apparatus and coordinates (x_d, y_d, z_d) of the first electronic device, the first electronic device calculates coordinates s′=(x′_s, y′_s, z′_s) of the sound source relative to the first electronic device, that is, the coordinates, corresponding to a case in which the first electronic device is used as the origin of the coordinates (0, 0, 0), of the sound source relative to the first electronic device. A method is as follows:
x_s′=x_s−x_d
y_s′=y_s−y_d
z_s′=z_s−z_d.
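The translation in step 1002 is a component-wise subtraction; a trivial sketch (the function name is an illustrative assumption):

```python
def to_device_frame(source, device):
    """Express the sound source coordinates relative to the first electronic
    device, taking the device as the origin of coordinates (0, 0, 0)."""
    xs, ys, zs = source
    xd, yd, zd = device
    return (xs - xd, ys - yd, zs - zd)
```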
[0303] In a specific implementation, when the first electronic device receives spatial coordinates of a plurality of sound sources sent by a plurality of topology nodes, the first electronic device may calculate an arithmetic average value of the received spatial coordinates of the plurality of sound sources, to obtain the coordinates
[0304] 1003: The first electronic device determines a first head-related transfer function (head-related transfer function, HRTF) based on the spatial coordinates of the sound source.
[0305] The first electronic device prestores a correspondence between an HRTF and the spatial coordinates of the sound source.
[0306] 1004: The first electronic device deconvolves the noise reduction signal based on the first HRTF, to obtain a phase-inverted signal of the noise reduction signal.
[0307] The first electronic device deconvolves the obtained noise reduction signal based on an HRTF corresponding to the first electronic device, to obtain a phase-inverted signal of the noise source. Because the HRTF function is a frequency domain function, actual convolution and deconvolution processing are both based on a time-domain corresponding head-related impulse response (head-related impulse response, HRIR). A method is as follows:
[0308] Based on the coordinates
[0309] A phase-inverted signal p3_r^A(n) of the first electronic device is deconvolved based on ha(n), to obtain a phase-inverted signal s_p3(n) of a noise signal.
[0310] 1005: The first electronic device sends the phase-inverted signal of the noise reduction signal and the spatial coordinates of the sound source to a second electronic device.
[0311] 1006: The second electronic device convolves the phase-inverted signal of the noise reduction signal with a second HRTF to determine a noise reduction signal of the second electronic device.
[0312] Based on the coordinates
[0313] The signal s_p3(n) is convolved with hb(n) to obtain a signal p3_r^B(n).
[0314] p3_r^B(n) is a phase-inverted signal on a side of the second electronic device, and the phase-inverted signal herein is the noise reduction signal on the side of the second electronic device.
[0315] If signals come from a plurality of topology nodes, the foregoing processing is performed for each of the topology nodes, and arithmetic average values of the processed signals are then obtained and added.
[0316] The second HRTF is determined by the second electronic device based on the spatial coordinates of the sound source, and the second electronic device prestores a correspondence between the HRTF and the spatial coordinates of the sound source.
[0317] In this disclosure, the first electronic device and the second electronic device may respectively represent left and right earphones of a headset.
[0318] In the embodiment corresponding to
[0319] In a specific implementation, the signal p3_r^A(n) of the first electronic device is transformed to a frequency domain, to obtain p3_r^A(ω).
[0320] Based on the coordinates
[0321] The frequency-domain signal p3_r^A(ω) of the first electronic device is divided by H_A(ω) to obtain a frequency domain form S_P3(ω) of the phase-inverted signal of the noise signal.
[0322] S_P3(ω) is multiplied by an HRTF function H_B(ω) of the second electronic device, and an obtained signal is transformed to a time domain, to obtain a noise reduction signal on the side of the second electronic device.
[0323] Based on the coordinates
[0324] The signal S_P3(ω) is multiplied by H_B(ω), to obtain a signal P3_r^B(ω).
[0325] The signal P3_r^B(ω) is inversely transformed to a time domain to obtain the signal p3_r^B(n).
[0326] p3_r^B(n) is the phase-inverted signal of the second electronic device, and the phase-inverted signal is the noise reduction signal on the side of the second electronic device.
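In the frequency domain, steps 1004 to 1006 reduce to one division and one multiplication per bin. A sketch follows, with arbitrary illustrative HRTF spectra (real measured HRTFs would be used in practice) and a small eps guard against division by zero, which is an added assumption:

```python
import numpy as np

def retarget(p3_a, H_a, H_b, eps=1e-12):
    """Divide the first device's phase-inverted signal by its HRTF H_A(omega),
    then multiply by the second device's HRTF H_B(omega), and return the
    time-domain noise reduction signal for the second device."""
    P3_A = np.fft.rfft(p3_a)
    S_P3 = P3_A / (H_a + eps)   # deconvolution by H_A: source-side signal
    P3_B = S_P3 * H_b           # convolution with H_B in the frequency domain
    return np.fft.irfft(P3_B, n=len(p3_a))
```

When H_A and H_B coincide, the signal passes through essentially unchanged, which gives a quick sanity check.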
[0327] In a specific implementation, if the first electronic device and the second electronic device respectively correspond to left and right earphones of a headset, because the first electronic device bears a larger calculation amount, the earphone with the higher battery level of the left and right earphones may be used as the first electronic device.
[0328] In a specific implementation, the second electronic device may adjust the signal p3_r^B(n). An adjustment method may be understood with reference to step 508 in the embodiment corresponding to
[0329] An embodiment of this disclosure further provides a voice enhancement method. The method may be used in combination with the foregoing embodiments corresponding to
[0330]
[0331] As shown in
[0332] 1101: A signal processing apparatus collects an audio signal.
[0333] The signal processing apparatus receives a third sound wave signal by using a microphone or a microphone array. The microphone or the microphone array may convert the received sound wave signal to an audio signal.
[0334] 1102: The signal processing apparatus extracts a signal of a non-voice part of the audio signal, and determines a noise spectrum.
[0335] Voice activity detection (voice activity detection, VAD) is performed on the audio signal to extract the signal of the non-voice part of the audio signal. It is assumed that the extracted signal of the non-voice part is x1_n(n). In this case, the signal processing apparatus performs a fast Fourier transform (fast Fourier transform, FFT) on x1_n(n) to obtain X1_N(ω), that is, the noise spectrum.
[0336] 1103: The signal processing apparatus sends the noise spectrum to an electronic device through an electromagnetic wave.
[0337] 1104: The electronic device receives a fourth sound wave signal.
[0338] 1105: The electronic device determines a voice enhancement signal of the fourth sound wave signal based on a difference between a spectrum obtained by performing the FFT on the fourth sound wave signal and the noise spectrum.
[0339] In a specific implementation, if the electronic device receives a plurality of noise spectrums sent by a plurality of signal processing apparatuses, the electronic device determines an arithmetic average value of the received plurality of noise spectrums to obtain a noise spectrum X3_N(ω).
[0340] In a specific implementation, the electronic device may determine the noise spectrum based on all obtained noise spectrums, including those calculated by the electronic device (that is, an electronic device 3 also calculates a local noise spectrum X3_N(ω) of the electronic device 3 in the manner in step 1102). In a specific implementation, different weights may be further set for noise spectrums determined by different devices (the signal processing apparatus and the electronic device), to obtain the noise spectrum X3_N(ω). For example, a weight for a noise spectrum calculated by the electronic device is greater, for example, 0.5, and a weight for a noise spectrum calculated by another device is 0.25. An expression may be as follows:
X3_N(ω)=0.5·X3_N(ω)+0.25·X1_N(ω)+0.25·X2_N(ω).
[0341] An FFT is performed on an audio signal collected by a topology node 3 to obtain X3(ω), and then the noise spectrum X3_N(ω) is subtracted from X3(ω) to obtain a signal spectrum Y3(ω) of a pure voice:
Y3(ω)=X3(ω)−X3_N(ω).
[0342] Then, an inverse fast Fourier transform (inverse fast Fourier transform, IFFT) is performed on Y3(ω) to obtain y3(n), that is, a voice enhanced signal.
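Steps 1102 and 1105 together form a basic spectral subtraction pipeline, sketched below; the magnitude-domain subtraction with half-wave rectification (np.maximum) is a common practical choice and an assumption beyond what the text states:

```python
import numpy as np

def enhance(noisy, noise_only):
    """Subtract the noise spectrum, estimated from a noise-only (non-voice)
    segment, from the noisy signal's spectrum and resynthesize the voice."""
    n = len(noisy)
    X = np.fft.rfft(noisy)
    noise_mag = np.abs(np.fft.rfft(noise_only, n=n))
    mag = np.maximum(np.abs(X) - noise_mag, 0.0)  # clamp negatives to zero
    return np.fft.irfft(mag * np.exp(1j * np.angle(X)), n=n)
```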
[0343] It can be learned from the embodiment corresponding to
[0344] The foregoing mainly describes the solutions provided in embodiments of this disclosure from a perspective of interaction between the electronic device and the signal processing apparatus. It may be understood that, to implement the foregoing functions, the electronic device and the signal processing apparatus include corresponding hardware structures and/or software modules for performing the functions. A person skilled in the art should be easily aware that, with reference to modules and algorithm steps in examples described in embodiments disclosed in this specification, this application can be implemented by hardware or a combination of hardware and computer software. Whether a function is performed by hardware or hardware driven by computer software depends on particular applications and design constraints of the technical solutions. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of this disclosure.
[0345] From a perspective of a hardware structure, the signal processing apparatus and the electronic device in
[0346] For example, the signal processing apparatus may be implemented by a device in
[0347] The communication interface 1201 may be any apparatus such as a transceiver, and is configured to communicate with another device or a communication network, for example, the Ethernet, a radio access network (radio access network, RAN), or a wireless local area network (wireless local area network, WLAN).
[0348] The processor 1202 includes but is not limited to one or more of a central processing unit (central processing unit, CPU), a network processor (network processor, NP), an application-specific integrated circuit (application-specific integrated circuit, ASIC), or a programmable logic device (programmable logic device, PLD). The PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), generic array logic (generic array logic, GAL), or any combination thereof. The processor 1202 is responsible for managing the communication line 1205 and for general processing; and may further provide various functions, including timing, peripheral interfacing, voltage regulation, power management, and another control function. The memory 1203 may be configured to store data used by the processor 1202 when the processor 1202 performs an operation.
[0349] The memory 1203 may be a read-only memory (read-only memory, ROM) or another type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or another type of dynamic storage device that can store information and instructions; or may be an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or another compact disc storage, an optical disc storage (including a compact disc, a laser disc, an optical disc, a digital versatile disc, a Blu-ray disc, or the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be configured to carry or store expected program code in a form of an instruction or a data structure and that can be accessed by a computer. However, this is not limited thereto. The memory may exist independently, and is connected to the processor 1202 through the communication line 1205. Alternatively, the memory 1203 may be integrated with the processor 1202. If the memory 1203 and the processor 1202 are mutually independent components, the memory 1203 is connected to the processor 1202. For example, the memory 1203 and the processor 1202 may communicate with each other through the communication line. The communication interface 1201 and the processor 1202 may communicate with each other through a communication line, and the communication interface 1201 may alternatively be connected to the processor 1202 directly.
[0350] The microphone 1204 should be understood in a broad sense, and the microphone 1204 should also be understood as including a microphone array. The microphone may also be referred to as a mic or a sound transducer. The microphone is an energy conversion device that converts a sound signal to an electrical signal. Types of microphones include but are not limited to capacitive microphones, crystal microphones, carbon microphones, and dynamic microphones.
[0351] The communication line 1205 may include any quantity of interconnected buses and bridges, and the communication line 1205 links together various circuits including one or more processors represented by the processor 1202 and a memory represented by the memory 1203. The communication line 1205 may further link various other circuits such as a peripheral device, a voltage stabilizer, and a power management circuit. These are well known in the art, and therefore are not further described in this disclosure.
[0352] In a specific implementation, the signal processing apparatus may include:
[0353] a microphone, configured to receive a first sound wave signal.
[0354] The signal processing apparatus may further include a memory, configured to store computer-readable instructions. The signal processing apparatus may further include a processor that is coupled to the memory and that is configured to execute the computer-readable instructions in the memory to perform the following operation:
[0355] processing the first sound wave signal based on first information to obtain a first audio signal, where the first information includes position information of an electronic device relative to the signal processing apparatus.
[0356] The signal processing apparatus may further include a communication interface that is coupled to the processor and that is configured to send the first audio signal to the electronic device through an electromagnetic wave. The first audio signal is used by the electronic device to determine a noise reduction signal, the noise reduction signal is for performing noise reduction processing on a second sound wave signal received by the electronic device, and the second sound wave signal and the first sound wave signal are in a same sound field.
[0357] In a specific implementation, the processor is specifically configured to perform transfer adjustment on the first sound wave signal based on the first information.
[0358] In a specific implementation, the processor is further configured to determine a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The communication interface is further configured to send the first time point and the first information to the electronic device. The first time point and the first information are used by the electronic device to determine, based on a speed of sound, to play the noise reduction signal.
[0359] In a specific implementation, the processor is further configured to: determine a first time point, where the first time point is a time point at which the signal processing apparatus receives the first sound wave signal; perform transfer adjustment on the first sound wave signal based on the first information; and determine, based on a difference between first duration and second duration, a time point for sending the first audio signal, where the first duration is determined by the signal processing apparatus based on the first information and the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the first audio signal.
[0360] In a specific implementation, the processor is specifically configured to: when the first duration is greater than the second duration, send the first audio signal to the electronic device.
[0361] In a specific implementation, the processor is specifically configured to perform transfer processing on the first sound wave signal based on the first information and second information. The second information is position information of a sound source relative to the signal processing apparatus.
[0362] In a specific implementation, the processor is further configured to: determine a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The communication interface is further configured to send the first time point, the first information, and the second information to the electronic device. The first time point, the first information, and the second information are used by the electronic device to determine, based on the speed of sound, to play the noise reduction signal.
[0363] In a specific implementation, the processor is further configured to: determine a first time point, where the first time point is a time point at which the signal processing apparatus receives the first sound wave signal; determine a first distance and a second distance based on the first information and second information, where the first distance is a distance between a sound source and the electronic device, the second distance is a distance between the sound source and the signal processing apparatus, and the second information is position information of the sound source relative to the signal processing apparatus; perform transfer adjustment on the first sound wave signal based on a difference between the first distance and the second distance; and process the first audio signal based on a difference between third duration and second duration, to determine a time point for sending the first audio signal, where the third duration is a ratio of the difference between the first distance and the second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the first audio signal.
[0364] In a specific implementation, the communication interface is specifically configured to: when the third duration is greater than the second duration, send, by the signal processing apparatus, the first audio signal to the electronic device through an electromagnetic wave.
[0365] In a specific implementation, the processor is further configured to determine a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The communication interface is further configured to send the first time point to the electronic device. The first time point is used by the electronic device to determine to play the noise reduction signal.
[0366] In a specific implementation, the processor is further configured to: obtain a first topological relationship between the signal processing apparatus and the electronic device, and determine the first information based on the first topological relationship, where the first information is a distance between the electronic device and the signal processing apparatus, or the first information is coordinates of the electronic device and the signal processing apparatus in a same coordinate system.
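When the first information takes the coordinate form described above, the distance between the electronic device and the signal processing apparatus follows from a Euclidean norm over the shared coordinate system. The coordinates and names below are hypothetical:

```python
import math

def euclidean_distance(a, b):
    """Distance between two points expressed in the same coordinate system."""
    return math.dist(a, b)  # available in Python 3.8+

# Hypothetical coordinates (metres) derived from the first topological
# relationship between the signal processing apparatus and the device.
apparatus = (0.0, 0.0, 0.0)
electronic_device = (1.2, 0.5, 0.0)
first_information = euclidean_distance(apparatus, electronic_device)
```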
[0367] In a specific implementation, the memory prestores the first information, and the first information is a distance between the electronic device and the signal processing apparatus.
[0368] In a specific implementation, the processor is further configured to: obtain a second topological relationship among the signal processing apparatus, a sound source, and the electronic device; and determine the second information based on the second topological relationship.
[0369] In a specific implementation, the memory prestores the second information.
[0370] In a specific implementation, the processor is further configured to determine a phase-inverted signal of the first sound wave signal.
[0371] The processor is specifically configured to process the phase-inverted signal of the first sound wave signal based on the first information.
[0372] In a specific implementation, the processor is further configured to: recognize the first sound wave signal, and determine that the first sound wave signal comes from N sound sources, where N is a positive integer greater than 1; divide the first sound wave signal into N signals based on the N sound sources; and process the N signals based on the first information to obtain N first audio signals.
[0373] In a specific implementation, the microphone is further configured to receive a third sound wave signal. The processor is further configured to: extract a signal of a non-voice part from the third sound wave signal; and determine a noise spectrum of the third sound wave signal based on the signal of the non-voice part. The communication interface is further configured to send the noise spectrum to the electronic device through an electromagnetic wave, so that the electronic device determines a voice enhancement signal of a fourth sound wave signal based on the noise spectrum and the fourth sound wave signal. The fourth sound wave signal and the third sound wave signal are in a same sound field.
[0374] In a specific implementation, the signal processing apparatus may include:
[0375] a microphone, configured to: receive at least one sound wave signal, and convert the at least one sound wave signal to at least one audio signal.
[0376] The signal processing apparatus may further include a memory, configured to store computer-readable instructions. The signal processing apparatus may further include a processor that is coupled to the memory and that is configured to execute the computer-readable instructions in the memory to perform the following operations:
[0377] determining position information related to the at least one sound wave signal; and
[0378] determining a sending time point of the at least one audio signal based on the position information and the first time point. The first time point is a time point at which the microphone receives the at least one sound wave signal.
[0379] The signal processing apparatus may further include a communication interface that is coupled to the processor and configured to send the at least one audio signal through the electromagnetic wave.
[0380] In a specific implementation, the processor is further configured to perform phase inversion processing on the at least one audio signal. The communication interface is specifically configured to send, through an electromagnetic wave, the at least one audio signal on which phase inversion processing is performed.
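Phase inversion as described above amounts to negating every sample, so that the inverted signal cancels the original sound wave when the two superimpose. A minimal sketch, with a synthetic test tone standing in for the audio signal:

```python
import numpy as np

def phase_invert(audio: np.ndarray) -> np.ndarray:
    """Phase inversion: negate each sample so that the result cancels
    the original signal when the two are superimposed."""
    return -audio

t = np.linspace(0.0, 1.0, 8, endpoint=False)
noise = np.sin(2.0 * np.pi * t)   # one cycle of a test tone
anti_noise = phase_invert(noise)
residual = noise + anti_noise     # ideally silence
```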
[0381] In a specific implementation, the processor is further configured to: determine a first distance and a second distance based on position information, where the first distance is a distance between a sound source of the at least one sound wave signal and the electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and perform transfer adjustment on the at least one sound wave signal based on a difference between the first distance and the second distance, to determine a signal feature of the at least one audio signal, where the signal feature includes an amplitude feature. The communication interface is specifically configured to send the at least one audio signal to the electronic device at the sending time point through the electromagnetic wave.
[0382] In a specific implementation, the processor is specifically configured to determine, based on a difference between first duration and second duration, a time point for sending the at least one audio signal, so that the at least one audio signal and the at least one sound wave signal arrive at the electronic device synchronously. The first duration is a ratio of the difference between the first distance and the second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point that is determined by the signal processing apparatus and at which the electronic device receives the audio signal.
[0383] In a specific implementation, the processor is specifically configured to: when the first duration is greater than the second duration, determine, based on the difference between the first duration and the second duration, the time point for sending the at least one audio signal.
[0384] In this embodiment of this disclosure, the communication interface may be considered as a signal receiving module, a signal sending module, or a wireless communication module of the signal processing apparatus; the processor having a processing function may be considered as an audio signal processing module/unit and a positioning module/unit of the signal processing apparatus; the memory may be considered as a storage module/unit of the signal processing apparatus; and the microphone may be considered as a sound collection module of the signal processing apparatus or another signal receiving module/unit. For example, as shown in
[0385] In a specific implementation, the sound collection module 1310 is configured to perform the sound wave signal receiving operation on the side of the signal processing apparatus in step 501 in
[0386] In a specific implementation, the sound collection module 1310 is configured to perform the sound wave signal receiving operation on the side of the signal processing apparatus in step 801 in
[0387] In a specific implementation, the sound collection module 1310 is configured to perform the sound wave signal receiving operation on the side of the signal processing apparatus in step 901 in
[0388] In a specific implementation, the sound collection module 1310 is configured to perform the sound wave signal receiving operation on the side of the signal processing apparatus in step 1101 in
[0389] In addition, the electronic device may be implemented by a device in
[0390] The communication interface 1401 may be any apparatus such as a transceiver, and is configured to communicate with another device or a communication network, for example, the Ethernet, a radio access network (radio access network, RAN), or a wireless local area network (wireless local area network, WLAN).
[0391] The processor 1402 includes but is not limited to one or more of a central processing unit (central processing unit, CPU), a network processor (network processor, NP), an application-specific integrated circuit (application-specific integrated circuit, ASIC), or a programmable logic device (programmable logic device, PLD). The PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), generic array logic (generic array logic, GAL), or any combination thereof. The processor 1402 is responsible for managing the communication line 1407 and for general processing, and may further provide various functions, including timing, peripheral interfacing, voltage regulation, power management, and other control functions. The memory 1403 may be configured to store data used by the processor 1402 when the processor 1402 performs an operation.
[0392] The memory 1403 may be a read-only memory (read-only memory, ROM) or another type of static storage device that can store static information and instructions, or a random access memory (random access memory, RAM) or another type of dynamic storage device that can store information and instructions; or may be an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or another compact disc storage, an optical disc storage (including a compact disc, a laser disc, an optical disc, a digital versatile disc, a Blu-ray disc, or the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be configured to carry or store expected program code in a form of an instruction or a data structure and that can be accessed by a computer. However, this is not limited thereto. The memory may exist independently and be connected to the processor 1402 through the communication line 1407, or the memory 1403 may be integrated with the processor 1402. If the memory 1403 and the processor 1402 are mutually independent components, they may communicate with each other through the communication line. The communication interface 1401 may likewise communicate with the processor 1402 through the communication line, or may be connected to the processor 1402 directly.
[0393] The microphone 1406 should be understood in a broad sense, and should also be understood as including a microphone array. The microphone may alternatively be referred to as a mic. The microphone is an energy conversion device that converts a sound signal to an electrical signal. Types of microphones include but are not limited to condenser microphones, crystal microphones, carbon microphones, and dynamic microphones.
[0394] The communication line 1407 may include any quantity of interconnected buses and bridges, and the communication line 1407 links together various circuits, including one or more processors represented by the processor 1402 and a memory represented by the memory 1403. The communication line 1407 may further link various other circuits, such as a peripheral device, a voltage stabilizer, and a power management circuit. These are well known in the art, and therefore are not further described in this disclosure.
[0395] In a specific implementation, the electronic device may include: a microphone, configured to receive at least one sound wave signal; a communication interface, configured to receive at least one audio signal, a first time point, and first information through an electromagnetic wave, where the at least one audio signal is at least one audio signal obtained by performing digital processing on a received sound wave signal by the signal processing apparatus, the first time point is a time point at which the signal processing apparatus receives the at least one sound wave signal, and the first information is position information related to the at least one sound wave signal; and a processor, configured to determine a playing time point of the at least one audio signal based on the first time point and the first information, where the audio signal is for performing noise reduction processing on the at least one sound wave signal.
[0396] In a specific implementation, the processor is further configured to perform phase inversion processing on the at least one audio signal.
[0397] In a specific implementation, the processor is specifically configured to: determine a first distance and a second distance based on the first information, where the first distance is a distance between a sound source of the at least one sound wave signal and the electronic device, and the second distance is a distance between the sound source of the at least one sound wave signal and the signal processing apparatus; and determine the playing time point of the at least one audio signal based on a difference between first duration and second duration, so that the electronic device plays the audio signal when receiving the at least one sound wave signal. The first duration is a ratio of a difference between the first distance and the second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the at least one audio signal is received.
[0398] In a specific implementation, the at least one audio signal includes N audio signals, where N is a positive integer greater than 1. The processor is further configured to calculate an arithmetic average value of M signals for a same sound source, where M is a positive integer not greater than N.
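The averaging step above can be illustrated as follows; the sample values, and the choice of which M signals share a sound source, are hypothetical:

```python
import numpy as np

# N = 3 audio signals; the first M = 2 are judged to come from the
# same sound source, so they are replaced by their arithmetic average.
signals = np.array([
    [0.2, 0.4, 0.1],   # same source, picked up twice
    [0.4, 0.2, 0.3],
    [0.9, 0.8, 0.7],   # a different source
])
same_source_average = signals[:2].mean(axis=0)
```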
[0399] In a specific implementation, the communication interface is further configured to receive a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The processor is specifically configured to process the first audio signal based on the first time point, to determine to play the noise reduction signal by using a speaker.
[0400] In a specific implementation, the processor is specifically configured to process the first audio signal based on a difference between first duration and second duration, to determine to play the noise reduction signal. The first duration is determined by a first electronic device based on a ratio of a third distance to the speed of sound, the second duration is a difference between a second time point and the first time point, the second time point is a time point at which the first electronic device receives the first audio signal, and the third distance is a distance between the first electronic device and the signal processing apparatus.
[0401] In a specific implementation, the processor is specifically configured to: when the first duration is greater than the second duration, process, by the first electronic device, the first audio signal based on the difference between the first duration and the second duration, to determine to play the noise reduction signal by using a speaker.
[0402] In a specific implementation, the processor is specifically configured to process the first audio signal based on a difference between third duration and second duration, to determine to play the noise reduction signal. The third duration is a ratio of a difference between a first distance and a second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, the second time point is a time point at which the first electronic device receives the first audio signal, the first distance is a distance between a sound source and the first electronic device, and the second distance is a distance between the sound source and the signal processing apparatus.
[0403] In a specific implementation, the processor is specifically configured to: when the third duration is greater than the second duration, process the first audio signal based on the difference between the third duration and the second duration, to determine to play the noise reduction signal by using a speaker.
[0404] In a specific implementation, the communication interface is further configured to receive first information sent by the signal processing apparatus. The processor is further configured to determine the third distance based on the first information.
[0405] In a specific implementation, the communication interface is further configured to receive first information and second information that are sent by the signal processing apparatus. The second information includes position information of a sound source relative to the signal processing apparatus. The processor is further configured to determine the first distance and the second distance based on the first information and the second information.
[0406] In a specific implementation, there are N first audio signals, where N is a positive integer greater than 1. The processor is specifically configured to: determine, based on the second information, that M first audio signals are signals obtained by processing sound wave signals for a same sound source by the signal processing apparatus, where the M first audio signals are any M signals in the N first audio signals and M is a positive integer; and determine the noise reduction signal based on P first audio signals and an arithmetic average value of the M first audio signals, where P is a positive integer, and the P first audio signals are the signals in the N first audio signals other than the M first audio signals.
[0407] In a specific implementation, the processor is specifically configured to perform cross-correlation processing on the first audio signal and the second sound wave signal, to determine the noise reduction signal.
[0408] In a specific implementation, the processor is specifically configured to determine the noise reduction signal based on the first audio signal and the second sound wave signal according to a least mean square error algorithm.
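A conventional least-mean-square adaptive canceller matching the description above treats the first audio signal as the reference and the second sound wave signal as the desired input, adapting filter weights so the residual after cancellation is minimized. This is a generic textbook sketch, not the claimed implementation; the tap count, step size, and synthetic tone are assumptions:

```python
import numpy as np

def lms_cancel(reference, desired, taps=4, mu=0.01):
    """Adapt FIR weights so the filtered reference tracks the desired
    signal; the returned error is the residual after cancellation."""
    w = np.zeros(taps)
    error = np.zeros(len(desired))
    for n in range(taps, len(desired)):
        x = reference[n - taps:n][::-1]   # most recent past sample first
        error[n] = desired[n] - w @ x     # residual at this step
        w += 2.0 * mu * error[n] * x      # LMS weight update
    return w, error

# The reference and desired inputs here are the same synthetic tone,
# standing in for the same noise picked up at two points.
reference = np.sin(0.3 * np.arange(2000))
weights, residual = lms_cancel(reference, reference)
```

For a sinusoidal input the filter can predict the current sample from past samples exactly, so the residual decays toward zero as the weights converge.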
[0409] In a specific implementation, the processor is further configured to: determine spatial coordinates of a sound source relative to the first electronic device, with the first electronic device as the origin of the coordinate system; determine a first head-related transfer function (HRTF) based on the spatial coordinates of the sound source, where the memory prestores a correspondence between the HRTF and the spatial coordinates of the sound source; and deconvolve the noise reduction signal based on the first HRTF, to obtain a phase-inverted signal of the noise reduction signal.
[0410] The communication interface is further configured to send the phase-inverted signal of the noise reduction signal and the spatial coordinates of the sound source to a second electronic device, so that the second electronic device convolves the phase-inverted signal of the noise reduction signal with a second HRTF, to determine a noise reduction signal of the second electronic device. The second HRTF is determined by the second electronic device based on the spatial coordinates of the sound source, and the second electronic device prestores a correspondence between the HRTF and the spatial coordinates of the sound source.
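The deconvolution and convolution steps in the two paragraphs above can be sketched with a toy finite-impulse-response HRTF. Frequency-domain division inverts the convolution only when the HRTF has no spectral zeros at the chosen FFT length; the 3-tap filter and the signal values below are hypothetical:

```python
import numpy as np

def render(signal, hrtf):
    """Convolve a signal with an HRTF to render it for one ear."""
    return np.convolve(signal, hrtf)

def unrender(rendered, hrtf, n):
    """Invert the convolution by frequency-domain division (valid only
    when the HRTF has no spectral zeros at this FFT length)."""
    size = len(rendered)
    est = np.fft.ifft(np.fft.fft(rendered, size) / np.fft.fft(hrtf, size))
    return np.real(est[:n])

hrtf_first = np.array([1.0, 0.4, 0.1])          # hypothetical 3-tap HRTF
noise_reduction = np.array([0.5, -0.2, 0.3, 0.1])
rendered = render(noise_reduction, hrtf_first)   # first-device rendering
recovered = unrender(rendered, hrtf_first, len(noise_reduction))
```

The second device would apply `render` again with its own HRTF, which is why the first device removes its HRTF before forwarding the signal.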
[0411] In a specific implementation, the first electronic device and the second electronic device are earphones. The earphones include a left earphone and a right earphone, and an earphone with a higher battery level in the left earphone and the right earphone is the first electronic device.
[0412] In a specific implementation, the electronic device includes: a microphone, configured to receive a second sound wave signal; and a communication interface, configured to receive a first audio signal sent by a signal processing apparatus, where the first audio signal is a signal obtained by performing digital processing on the received first sound wave signal by the signal processing apparatus, and the first sound wave signal and the second sound wave signal are in a same sound field. The electronic device may further include a memory, configured to store computer-readable instructions. The electronic device may further include a processor coupled to the memory and configured to execute the computer-readable instruction in the memory to perform the following operations: processing the first audio signal based on first information to obtain a noise reduction signal. The noise reduction signal is for performing noise reduction processing on the second sound wave signal received by the electronic device, and the first information includes position information of the first electronic device relative to the signal processing apparatus.
[0413] In a specific implementation, the communication interface is further configured to receive a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The processor is specifically configured to process the first audio signal based on a difference between first duration and second duration, to determine to play the noise reduction signal. The first duration is determined by the first electronic device based on the first information and the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the first electronic device receives the first audio signal.
[0414] In a specific implementation, the communication interface is further configured to receive a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The processor is specifically configured to: process the first audio signal based on a difference between first duration and second duration, to determine to play the noise reduction signal, where the first duration is determined by the first electronic device based on the first information and the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the first electronic device receives the first audio signal; and adjust the first audio signal based on the first information.
[0415] In a specific implementation, the processor is specifically configured to: when the first duration is greater than the second duration, process, by the first electronic device, the first audio signal based on the difference between the first duration and the second duration, to determine to play the noise reduction signal by using a speaker.
[0416] In a specific implementation, the communication interface is further configured to receive a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The processor is specifically configured to: determine a first distance and a second distance based on the first information and second information, where the second information is position information of a sound source relative to the signal processing apparatus, the first distance is a distance between the sound source and the first electronic device, and the second distance is a distance between the sound source and the signal processing apparatus; and process the first audio signal based on a difference between third duration and second duration, to determine to play the noise reduction signal by using a speaker, where the third duration is a ratio of a difference between the first distance and the second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, and the second time point is a time point at which the first electronic device receives the first audio signal.
[0417] In a specific implementation, the communication interface is further configured to receive a first time point. The first time point is a time point at which the signal processing apparatus receives the first sound wave signal. The processor is specifically configured to: process the first audio signal based on a difference between third duration and second duration, to determine to play the noise reduction signal, where the third duration is a ratio of a difference between a first distance and a second distance to the speed of sound, the second duration is a difference between a second time point and the first time point, the second time point is a time point at which the first electronic device receives the first audio signal, the first distance is a distance between a sound source and the first electronic device, and the second distance is a distance between the sound source and the signal processing apparatus; determine the first distance and the second distance based on the first information and second information, where the first distance is the distance between the sound source and the electronic device, the second distance is the distance between the sound source and the signal processing apparatus, and the second information is position information of the sound source relative to the signal processing apparatus; and perform transfer adjustment on the first audio signal based on the difference between the first distance and the second distance.
[0418] In a specific implementation, the processor is specifically configured to: when the third duration is greater than the second duration, process the first audio signal based on the difference between the third duration and the second duration, to determine to play the noise reduction signal by using a speaker.
[0419] In a specific implementation, the communication interface is further configured to receive the first information sent by the signal processing apparatus.
[0420] In a specific implementation, the communication interface is further configured to receive the second information sent by the signal processing apparatus.
[0421] In a specific implementation, there are N first audio signals, where N is a positive integer greater than 1. The processor is specifically configured to: determine, based on the second information, that M first audio signals are signals obtained by processing sound wave signals for a same sound source by the signal processing apparatus, where the M first audio signals are any M signals in the N first audio signals and M is a positive integer; and determine the noise reduction signal based on P first audio signals and an arithmetic average value of the M first audio signals, where P is a positive integer, and the P first audio signals are the signals in the N first audio signals other than the M first audio signals.
[0422] In a specific implementation, the processor is further configured to: determine spatial coordinates of a sound source relative to the first electronic device, with the first electronic device as the origin of the coordinate system; determine a first head-related transfer function (HRTF) based on the spatial coordinates of the sound source, where the first electronic device prestores a correspondence between the HRTF and the spatial coordinates of the sound source; and deconvolve the noise reduction signal based on the first HRTF, to obtain a phase-inverted signal of the noise reduction signal. The communication interface is further configured to send the phase-inverted signal of the noise reduction signal and the spatial coordinates of the sound source to a second electronic device, so that the second electronic device convolves the phase-inverted signal of the noise reduction signal with a second HRTF, to determine a noise reduction signal of the second electronic device. The second HRTF is determined by the second electronic device based on the spatial coordinates of the sound source, and the second electronic device prestores a correspondence between the HRTF and the spatial coordinates of the sound source.
[0423] In a specific implementation, the first electronic device and the second electronic device are earphones. The earphones include a left earphone and a right earphone, and an earphone with a higher battery level in the left earphone and the right earphone is the first electronic device.
[0424] In a specific implementation, the communication interface is further configured to receive a noise spectrum of a third sound wave signal sent by a signal processing apparatus. The noise spectrum of the third sound wave signal is determined by the signal processing apparatus based on a signal of a non-voice part of the received third sound wave signal. The microphone is further configured to receive a fourth sound wave signal, where the fourth sound wave signal and the third sound wave signal are in a same sound field. The processor is further configured to determine a voice enhancement signal of the fourth sound wave signal based on a difference between the fourth sound wave signal on which a fast Fourier transform FFT is performed and the noise spectrum.
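The voice enhancement step above is a form of spectral subtraction: transform the frame of the fourth sound wave signal with an FFT, subtract the received noise magnitude spectrum, and reconstruct the enhanced frame with the original phase. The frame length, bin count, and noise level below are assumptions for illustration:

```python
import numpy as np

def spectral_subtract(frame, noise_mag):
    """Subtract an estimated noise magnitude spectrum from one frame and
    reconstruct the time-domain enhanced signal with the original phase."""
    spectrum = np.fft.rfft(frame)
    mag, phase = np.abs(spectrum), np.angle(spectrum)
    enhanced = np.maximum(mag - noise_mag, 0.0)   # floor negative bins at zero
    return np.fft.irfft(enhanced * np.exp(1j * phase), n=len(frame))

# Hypothetical 8-sample frame of the fourth sound wave signal and a flat
# noise magnitude spectrum received from the signal processing apparatus.
frame = np.array([0.1, 0.5, -0.3, 0.2, 0.0, -0.4, 0.6, -0.1])
noise_spectrum = np.full(5, 0.05)   # len(frame) // 2 + 1 = 5 rfft bins
enhanced_frame = spectral_subtract(frame, noise_spectrum)
```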
[0425] In a specific implementation, there are M noise spectrums of the third sound wave signal, where M is a positive integer greater than 1. The processor is further configured to: determine that any N noise spectrums in the M noise spectrums are noise spectrums determined by the signal processing apparatus for sound wave signals for a same sound source, where N is a positive integer; and determine an arithmetic average value of the N noise spectrums.
[0426] In this embodiment of this disclosure, the communication interface may be considered as a wireless communication module, a signal receiving module, or a signal sending module of the signal processing apparatus; the processor having a processing function may be considered as a control module or a processing module of the signal processing apparatus; the memory may be considered as a storage module of the signal processing apparatus; and the microphone may be considered as a sound collection module or another signal receiving module of the signal processing apparatus. The speaker may be considered as a playing module of the signal processing apparatus. As shown in
[0427] In a specific implementation, the sound collection module 1510 is configured to perform the audio signal collection step on the side of electronic device in the embodiment corresponding to
[0428] In a specific implementation, the sound collection module 1510 is configured to perform the audio signal collection step on the side of electronic device in the embodiment corresponding to
[0429] In a specific implementation, the sound collection module 1510 is configured to perform the sound wave signal receiving operation on the side of the electronic device in step 902 in
[0430] In a specific implementation, when the electronic device is a first electronic device, the sound collection module 1510 is configured to perform the audio signal collection step on the side of the electronic device in the embodiment corresponding to
[0431] In a specific implementation, the sound collection module 1510 is configured to perform the sound wave signal receiving operation on the side of the electronic device in step 1104 in
[0432] All or a part of the foregoing embodiments may be implemented by using software, hardware, firmware, or any combination thereof. When software is used to implement the embodiments, all or a part of the embodiments may be implemented in the form of a computer program product.
[0433] The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or a part of the procedures or the functions according to embodiments of this disclosure are generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or may be transmitted from a computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by the computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk (solid state disk, SSD)), or the like.
[0434] A person of ordinary skill in the art may understand that all or a part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium. The storage medium may include a ROM, a RAM, a magnetic disk, an optical disc, or the like.
[0435] The audio signal processing method, the signal processing apparatus, the electronic device, the noise reduction system, and the storage medium provided in embodiments of this disclosure are described in detail above. This specification describes principles and implementations of this disclosure by using specific examples. The foregoing embodiments are merely used to help understand the methods and core ideas of this disclosure. In addition, a person of ordinary skill in the art can make variations and modifications to the specific implementations and application scopes according to the ideas of this disclosure. Therefore, this specification shall not be construed as a limitation to this disclosure.