Splitting a Voice Signal into Multiple Point Sources
20230143473 · 2023-05-11
Inventors
CPC classification
H04S1/002 (ELECTRICITY)
H04S2400/11 (ELECTRICITY)
H04S7/30 (ELECTRICITY)
International classification
Abstract
In a method for reproducing sound of a data object, a voice signal of a data object is split into a first sub-band signal and a second sub-band signal, and speaker driver signals are generated to produce sound of the object by a two-way speaker system in which the first sub-band signal drives a tweeter or high frequency driver and the second sub-band signal drives a woofer or low frequency driver. In another aspect, the first and second sub-band signals are spatialized as virtual sources that are in different locations. Other aspects are also described and claimed.
Claims
1. An audio system comprising a data processor configured to spatialize sound that is associated with a visual element that is being displayed on a display, the processor to: split an audio signal into a plurality of sub-band audio signals that include a first sub-band signal in a first sub-band, and a second sub-band signal in a second sub-band; and generate a plurality of speaker driver signals by processing the first and second sub-band audio signals so that the first sub-band signal is spatialized to emanate from a first location of the visual element, and the second sub-band signal is spatialized to emanate from a second location of the visual element that is different than the first location.
2. The system of claim 1 wherein to generate the speaker driver signals, the processor spatializes the first sub-band signal as a first virtual sound source that is at a first virtual location, and the second sub-band signal as a second virtual sound source that is at a second virtual location different than the first virtual location.
3. The system of claim 2 wherein the audio signal is a voice signal, and the visual element is an avatar.
4. The system of claim 3 wherein the first location in the avatar is in a head or a mouth, and the second location in the avatar is in a torso.
5. The system of claim 4 wherein the first sub-band is a high frequency band and the second sub-band is a low frequency band, wherein the high frequency band is above the low frequency band.
6. The system of claim 1 wherein the audio signal is a voice signal, and the visual element is an avatar associated with a data object in a simulated reality application.
7. The system of claim 6 wherein the first location in the avatar is in a head or a mouth, and the second location in the avatar is in a torso.
8. The system of claim 7 wherein the first sub-band is a high frequency band and the second sub-band is a low frequency band, wherein the high frequency band is above the low frequency band.
9. The system of claim 8 wherein the processor is configured to perform frequency-dependent directivity processing upon the first sub-band.
10. The system of claim 8 wherein the processor is configured to perform gain-dependent directivity processing upon the first sub-band.
11. The system of claim 1 wherein the processor is to: receive an acoustic characteristic of a virtual room in which the visual element is presented on a display, or of a real room in which a listener of the spatialized sound is located; and set one or more cut off frequencies of the plurality of sub-band audio signals based on the acoustic characteristic.
12. The system of claim 11 wherein the acoustic characteristic comprises a room size or room volume.
13. A method for reproducing sound of a data object, the method comprising: splitting a voice signal of a data object into a first sub-band signal in a first sub-band, and a second sub-band signal in a second sub-band; and generating a plurality of speaker driver signals to produce sound of the object by a two-way speaker system, by processing the first sub-band signal into a tweeter or high frequency driver signal for the two-way speaker system, and the second sub-band signal into a woofer or low frequency driver signal for the two-way speaker system.
14. The method of claim 13 wherein the data object is associated with a visual element in a simulated reality application program, the visual element being an avatar.
15. The method of claim 13 wherein the first sub-band is a high frequency band and the second sub-band is a low frequency band, wherein the high frequency band is above the low frequency band.
16. An article of manufacture comprising a machine-readable storage medium having stored therein instructions that configure a processor to: split a voice signal into a first sub-band signal in a first sub-band, and a second sub-band signal in a second sub-band; and generate a plurality of speaker driver signals to reproduce sound of the voice signal, in which sound of the first sub-band signal is produced by a first speaker driver and sound of the second sub-band signal is produced by a second speaker driver.
17. The article of manufacture of claim 16 wherein the first speaker driver is a tweeter and the second speaker driver is a woofer.
18. The article of manufacture of claim 17 wherein the voice signal is that of an avatar that is being displayed on a display.
19. The article of manufacture of claim 18 wherein the first sub-band is a high frequency band and the second sub-band is a low frequency band, wherein the high frequency band is above the low frequency band.
20. The article of manufacture of claim 18 further comprising instructions that configure the processor to: receive an acoustic characteristic of a virtual room in which the avatar is presented on the display, or a real room in which a listener of the reproduced sound is located; and set one or more cut off frequencies of the first sub-band and the second sub-band based on the acoustic characteristic.
21. The article of manufacture of claim 20 wherein the acoustic characteristic comprises room size or room volume.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] Several aspects of the disclosure here are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings in which like references indicate similar elements. It should be noted that references to “an” or “one” aspect in this disclosure are not necessarily to the same aspect, and they mean at least one. Also, in the interest of conciseness and reducing the total number of figures, a given figure may be used to illustrate the features of more than one aspect of the disclosure, and not all elements in the figure may be required for a given aspect.
DETAILED DESCRIPTION
[0013] Several aspects of the disclosure with reference to the appended drawings are now explained. Whenever the shapes, relative positions and other aspects of the parts described are not explicitly defined, the scope of the invention is not limited only to the parts shown, which are meant merely for the purpose of illustration. Also, while numerous details are set forth, it is understood that some aspects of the disclosure may be practiced without these details. In other instances, well-known circuits, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
[0014] One aspect of the disclosure is
[0015] The input audio signal (e.g., a monaural signal) is associated with or represents the sound of a data object which is represented by a visual element 2, such as in a simulated reality application program. The visual element 2 of the data object appears on a display 3 after having been rendered by a video engine (not shown.) The visual element 2 may be a graphical object area (e.g., drawn on a 2D display) or it may be a graphical object volume (e.g., drawn on a 3D display) of the data object. The data object may be for example a person and the visual element 2 is an avatar of the person, depicted in
[0016] The audio system renders a single input audio signal as two or more virtual sound sources or point sources, as follows. A splitter 4 splits the audio signal into two or more sub-band audio signals (components of the input audio signal), including a first sub-band (sub-band A) and a second sub-band (sub-band B.) The splitter may be implemented for example as a filter bank. The sub-band A may be in a higher frequency range of the human audible range than the sub-band B. As an example, the low frequency band (sub-band B) may lie within 50 Hz-200 Hz. In another example, the low frequency band lies within 100 Hz-300 Hz. The high frequency band may lie above those ranges.
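The splitter described above can be illustrated with a minimal two-band filter bank. This is a hypothetical sketch only: the disclosure does not specify a filter design, so a fourth-order Butterworth crossover at 200 Hz (the upper edge of the first example range) is assumed here, and the function name `split_bands` is an illustrative invention.

```python
# Hypothetical sketch of the splitter (filter bank): split a mono voice
# signal into a high band (sub-band A) and a low band (sub-band B).
# The Butterworth design and the 200 Hz cutoff are assumptions.
import numpy as np
from scipy.signal import butter, sosfilt

def split_bands(x, fs, fc=200.0, order=4):
    """Return (sub_band_a, sub_band_b): high band and low band of x."""
    sos_hi = butter(order, fc, btype="highpass", fs=fs, output="sos")
    sos_lo = butter(order, fc, btype="lowpass", fs=fs, output="sos")
    return sosfilt(sos_hi, x), sosfilt(sos_lo, x)

# Example: a signal with a 100 Hz component (goes to sub-band B) and a
# 1 kHz component (goes to sub-band A).
fs = 48_000
t = np.arange(fs) / fs
voice = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 1000 * t)
band_a, band_b = split_bands(voice, fs)
```

A real implementation might instead use a linear-phase FIR bank or a Linkwitz-Riley crossover so the recombined bands sum flat; the choice does not affect the splitting concept.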
[0017] The sub-band A is assigned to a first location in the visual element, which is within the area or volume of the visual element, while the second sub-band is assigned to a second location in the visual element that is spaced apart from the first location (but that is also within the area or volume of the visual element.) As seen in the figure, sub-band A is spatialized as a virtual sound source A or a point source that is located at the person's or avatar's head or mouth, while sub-band B is spatialized as a virtual sound source B located at the person's or avatar's torso. The system generates a set of multi-channel speaker driver signals (two or more speaker driver signals) that drive a listening device to produce the sound of the data object, by processing the two sub-band audio signals and their associated metadata that includes their respective virtual source locations, so that sound of the sub-band A emanates from a different location than sound of the sub-band B. Note here that the location of a virtual sound source may be equivalent to an azimuthal direction or angle, and an elevation direction or angle, for example as viewed from the virtual listening position.
[0018] In the example of
[0019]
[0020] Turning now to
[0021]
[0022] In another instance, rather than spatializing the sound of the data object, sound of the first sub-band signal is produced by a high frequency speaker driver, e.g., a tweeter, while sound of the second sub-band signal is produced by a low frequency speaker driver, e.g., a woofer, of a 2-way or multi-way speaker system. Those speaker drivers may be integrated into the same housing of a listening device such as a laptop computer, a tablet computer, or a head mounted device. In those instances, the listening device also has therein (either integrated or mounted) the display 3.
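The non-spatialized path above reduces to routing each sub-band to its own driver. A minimal hypothetical sketch, where the per-driver gains (invented here to represent matching driver sensitivities) are assumptions not stated in the disclosure:

```python
# Hypothetical sketch of the 2-way path: sub-band A feeds the tweeter
# driver signal and sub-band B feeds the woofer driver signal, each
# scaled by an assumed per-driver trim gain in dB.
import numpy as np

def drive_two_way(band_a, band_b, tweeter_gain_db=0.0, woofer_gain_db=0.0):
    tweeter = band_a * 10.0 ** (tweeter_gain_db / 20.0)
    woofer = band_b * 10.0 ** (woofer_gain_db / 20.0)
    return tweeter, woofer
```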
[0023] Another aspect of the disclosure here is to add an audio processing effect into the chain of signal processing being performed upon the sub-band A audio signal (e.g., a high-frequency band being rendered as emanating from the source, which in this case is the avatar's mouth), namely a frequency-dependent directivity, or a frequency-and-gain-dependent directivity. In
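A frequency-dependent directivity of the kind described can be sketched as a gain that is omnidirectional at low frequencies and increasingly directional (here, cardioid-like) at high frequencies. The crossover frequency `f_omni`, the one-octave blend ramp, and the cardioid pattern are all illustrative assumptions; the disclosure does not specify a particular directivity model.

```python
# Hypothetical sketch of frequency-dependent directivity for sub-band A:
# below an assumed 500 Hz the source radiates omnidirectionally; above
# it, the gain blends toward a cardioid pattern over one octave.
# angle_deg is the listener's angle off the source's facing direction.
import numpy as np

def directivity_gain(angle_deg, f_hz, f_omni=500.0):
    """Return a gain in [0, 1] for a source heard at angle_deg."""
    # blend factor: 0 = omnidirectional, 1 = fully cardioid
    blend = np.clip(np.log2(np.maximum(f_hz, 1.0) / f_omni), 0.0, 1.0)
    cardioid = 0.5 * (1.0 + np.cos(np.deg2rad(angle_deg)))
    return (1.0 - blend) + blend * cardioid
```

Under this model a listener directly behind the avatar's mouth hears the high band strongly attenuated while the low (torso) band is unaffected, which is the perceptual effect the directivity processing is meant to produce.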
[0024] While certain aspects have been described and shown in the accompanying drawings, it is to be understood that such are merely illustrative of and not restrictive on the broad invention, and that the invention is not limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those of ordinary skill in the art. The description is thus to be regarded as illustrative instead of limiting.