Software and Microphone Device
20230292072 · 2023-09-14
Inventors
CPC classification
H04S5/02
ELECTRICITY
H04R5/04
ELECTRICITY
H04S2420/11
ELECTRICITY
H04R5/027
ELECTRICITY
International classification
H04S5/02
ELECTRICITY
H04R5/04
ELECTRICITY
Abstract
A software of the present invention causes a processor to execute a process including converting an A format signal applicable to ambisonics to a B format signal; distinguishing a specific direction from a plurality of directions based on the B format signal; and generating and outputting an audio signal corresponding to the specific direction. Also disclosed is a microphone including the software.
Claims
1. A software causing a processor to execute a process comprising: converting an A format signal applicable to ambisonics to a B format signal; distinguishing a specific direction from a plurality of directions based on the B format signal; and generating and outputting an audio signal corresponding to the specific direction.
2. The software according to claim 1, wherein the software causes the processor to execute: a first process of converting the A format signal to the B format signal, the A format signal being converted to a digital signal in advance; a second process of generating a plurality of signals corresponding to the plurality of directions based on the B format signal; a third process of distinguishing the specific direction corresponding to a largest signal of the plurality of signals; and a fourth process of generating and outputting the audio signal corresponding to the specific direction based on the B format signal.
3. The software according to claim 2, wherein the software causes the processor to execute: in the second process, a process of calculating an envelope of each of the plurality of signals corresponding to the plurality of directions; and in the third process, a process of distinguishing the specific direction corresponding to a largest signal based on the envelope.
4. The software according to claim 2, wherein the software causes the processor to execute: in the first process, a process of memorizing the B format signal converted from the A format signal; and in the fourth process, a process of generating the audio signal corresponding to the specific direction based on the memorized B format signal.
5. The software according to claim 3, wherein the software causes the processor to execute: in the first process, a process of memorizing the B format signal converted from the A format signal; and in the fourth process, a process of generating the audio signal corresponding to the specific direction based on the memorized B format signal.
6. A microphone device with the software according to any one of claims 1 through 4 installed therein, the device comprising: a body of the microphone; four or more microphone elements provided facing sound pickup directions different from each other in the body and configured to output audio signals to be components of the A format signal; an amplifier configured to amplify the audio signals outputted from the four or more microphone elements; an A/D converter configured to convert each audio signal amplified by the amplifier to a digital signal; and the processor configured to process the audio signal converted to the digital signal by the A/D converter in accordance with the software.
7. A microphone device with the software according to claim 5 installed therein, the device comprising: a body of the microphone; four or more microphone elements provided facing sound pickup directions different from each other in the body and configured to output audio signals to be components of the A format signal; an amplifier configured to amplify the audio signals outputted from the four or more microphone elements; an A/D converter configured to convert each audio signal amplified by the amplifier to a digital signal; and the processor configured to process the audio signal converted to the digital signal by the A/D converter in accordance with the software.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0018]
[0019]
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
DESCRIPTION OF THE INVENTION
[0027] An embodiment of the software and the microphone device of the present invention is described below with reference to the drawings.
[0028] 1. Ambisonics
[0029] The software and the microphone device of the present invention use the technique of ambisonics. At first, with reference to
[0030] Ambisonics is a technique for recording and reproducing sound from the full 360° surrounding a point in space. It can provide spatial audio containing sound from the forward and backward, left and right, and upward and downward directions. With the spread of virtual reality (VR) technology in recent years, ambisonics has come to be used for the audio of 360° video.
[0031]
[0032] The first through fourth microphone elements 11 to 14 pick up sound in the four directions of FLU, FRD, BLD, and BRU. The signals of the sound in these four directions are called “A format signals.” An A format signal is not directly usable and is converted to a “B format signal” with a directivity as illustrated in
[0033] The A format signals are converted to the B format signals W, X, Y, and Z by formulae (1) through (4) below.
W=FLU+FRD+BLD+BRU (1)
X=FLU+FRD−BLD−BRU (2)
Y=FLU−FRD+BLD−BRU (3)
Z=FLU−FRD−BLD+BRU (4)
[0034] In the above formulae, W denotes a signal of sound in all directions, X denotes a signal of sound in the forward and backward directions, Y denotes a signal of sound in the left and right directions, Z denotes a signal of sound in the upward and downward directions, FLU denotes a signal of upper left front sound, FRD denotes a signal of lower right front sound, BLD denotes a signal of lower left back sound, and BRU denotes a signal of upper right back sound.
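Formulae (1) through (4) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the implementation of the embodiment; the function name `a_to_b` and the per-sample calling convention are assumptions made for the example.

```python
def a_to_b(flu, frd, bld, bru):
    """Convert one sample of the A format signal to the B format signal.

    The four arguments are the samples from the FLU, FRD, BLD, and BRU
    elements. The function name and per-sample interface are hypothetical.
    """
    w = flu + frd + bld + bru  # formula (1): all directions
    x = flu + frd - bld - bru  # formula (2): forward and backward
    y = flu - frd + bld - bru  # formula (3): left and right
    z = flu - frd - bld + bru  # formula (4): upward and downward
    return w, x, y, z


# Sound reaching only the lower left back element (BLD):
# X is negative (back), Y is positive (left), Z is negative (down).
print(a_to_b(0.0, 0.0, 1.0, 0.0))
```

When the four elements receive identical samples, X, Y, and Z cancel to zero and only W remains, which matches the description of W as the signal of sound in all directions.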
[0035] Synthesis of the B format signals W, X, Y, and Z produces a signal of omnidirectional sound including the forward and backward, left and right, and upward and downward directions. For example,
[0036] 2. Microphone Device
[0037] The microphone device with the software of the present embodiment installed therein is then described with reference to
[0038] A microphone device 1 of the present embodiment has an appearance illustrated in the six drawings of
[0039] The microphone device 1 includes the microphone 10 and a body 20. The microphone 10 is identical to that in
[0040] As illustrated in
[0041] The REMOTE terminal 215 is electrically connected to a wireless adapter, not shown, such as a Bluetooth® adapter. Via the wireless adapter, the microphone device 1 can communicate wirelessly with a smartphone, a tablet PC, a laptop PC, a desktop PC, and the like, not shown. Users can remotely operate the microphone device 1 using such a device. The microphone device 1 can also output an audio signal to, for example, headphones, not shown, via the wireless adapter.
[0042] As illustrated in
[0043] The REC LED 201B has functions identical to the REC LED 201A illustrated in
[0044] The display 202 displays various types of information on the microphone device 1. For example, while the microphone device 1 is recording, the display 202 displays information on the recording time, the signal level of the A or B format signal, and the degree of horizontality and the degree of verticality of the body 20. As another example, while the microphone device 1 is playing back, the display 202 displays information on the playback time, the degree of horizontality, the degree of verticality, and the rotation of the body 20.
[0045] The REC key 203 is operated to start recording. The STOP/HOME key 204 is operated to stop recording or playing back and cause the display 202 to display a home screen. The REW/Select key 205 is operated to rewind the playback position of a file and select an item to be displayed on the display 202.
[0046] The PLAY/PAUSE/ENTER key 206 is operated to start playing back, pause the recording or playing back, and determine the selected item. The FF/Select key 207 is operated to fast forward the playback position of a file and select an item to be displayed on the display 202. The MENU key 208 is operated to cause the display 202 to display a MENU screen. The Power/HOLD switch 209 is operated to turn on/off the power supply of the microphone device 1 and deactivate key operations.
[0047] As illustrated in
[0048] The USB terminal 212 is used to electrically connect the microphone device 1 to another device. For example, the microphone device 1 is electrically connected to a personal computer, not shown, via the USB terminal 212 to be used as, for example, a microphone for a conference system. The USB terminal 212 is connected to an AC adapter, not shown, to supply the AC power to the microphone device 1. The LINE OUT terminal 213 is used to output an audio signal to another device.
[0049] As illustrated in
[0050] As illustrated in
[0051]
[0052] The first through fourth microphone elements 11 to 14 each pick up sound from a different direction and output a signal. The four signals outputted from the first through fourth microphone elements 11 to 14 are collectively called a four-channel A format signal. The four-channel A format signal outputted from the first through fourth microphone elements 11 to 14 is indicated by FLU, FRD, BLD, and BRU in
[0053] The four-channel A format signal outputted from the first through fourth microphone elements 11 to 14 is inputted to the microphone gain 21. The microphone gain 21 amplifies the four-channel A format signal at a degree of amplification set by the MIC GAIN dial 211 illustrated in
[0054] The four-channel A format signal amplified by the microphone gain 21 is inputted to the A/D converter 22. The A/D converter 22 converts the A format signal as an analog signal to a digital signal. The four-channel A format signal converted to the digital signal is inputted to the processor 24.
[0055] 3. Process of Processor by Software
[0056] The processor 24 executes a process in accordance with the software of the present embodiment. The process of the processor 24 by the software of the present embodiment is summarized as follows: At first, the processor 24 converts an A format signal to a B format signal. Then, the processor 24 distinguishes a specific direction from a plurality of directions based on the B format signal. The processor 24 then generates and outputs an audio signal corresponding to the specific direction.
[0057] In the present embodiment, an example of using the microphone device 1 as a microphone for a conference system is described. In this case, the processor 24 distinguishes the direction of a speaker among the plurality of participants around the microphone device 1 and generates and outputs an audio signal corresponding to the direction of the speaker. In addition, every time the speaker changes, the processor 24 distinguishes the direction of a new speaker and generates and outputs an audio signal corresponding to the direction of the new speaker. Below is a description of the process of the processor 24 illustrated in
[0058] 3.1 Low-Cut Process
[0059] The processor 24 executes a low-cut process 240. That is, the processor 24 removes components at a preset frequency or less from the A format signal converted to the digital signal. Users can set the frequency (cut-off frequency) subjected to the low-cut process 240 by pressing the MENU key 208 illustrated in
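The description does not state which filter the low-cut process 240 uses. As a rough sketch only, the following assumes a first-order high-pass filter applied to one channel of the digitized A format signal; the function name `low_cut`, the filter type, and the parameters are all assumptions for illustration.

```python
import math


def low_cut(samples, cutoff_hz, sample_rate_hz):
    """Attenuate components below cutoff_hz in one channel.

    Assumption: a first-order (RC-style) high-pass filter. The source only
    says that components at or below a preset frequency are removed.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for s in samples:
        # Standard discrete RC high-pass recurrence.
        prev_out = alpha * (prev_out + s - prev_in)
        prev_in = s
        out.append(prev_out)
    return out
```

In such a sketch the same function would be applied independently to each of the four channels FLU, FRD, BLD, and BRU before the A/B format conversion. A constant (0 Hz) input decays toward zero, while rapidly alternating input passes almost unchanged.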
[0060] 3.2 A/B Format Conversion Process
[0061] The processor 24 executes an A/B format conversion process 241. That is, based on the formulae (1) through (4) above, the processor 24 converts the A format signal converted to the digital signal to a four-channel B format signal. The four-channel B format signal is indicated by W, X, Y, and Z in
[0062] As illustrated in
[0063] 3.3 Memorization/Reading Process
[0064] The processor 24 executes a memorization/reading process 242 of the B format signal. That is, the processor 24 memorizes the four-channel B format signal W, X, Y, and Z generated by the A/B format conversion process 241 in a storage medium, not shown, exemplified by a RAM. The processor 24 also reads the signals W, X, and Y of the B format signal memorized in the RAM to generate an audio signal corresponding to the specific direction in 360° horizontally.
[0065] 3.4 0-315 Sampling Process
[0066] The processor 24 executes a 0-315 sampling process 243. The “0-315” means 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°. As illustrated in
[0067] The 0-315 sampling process 243 illustrated in
[0068] In the 0-315 signal generation process 243A, the processor 24 generates a plurality of signals respectively corresponding to 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° by synthesizing the signals W, X, and Y of the B format signal.
[0069] Then, in the 0-315 envelope calculation process 243B, the processor 24 calculates Env 0, Env 45, Env 90, Env 135, Env 180, Env 225, Env 270, and Env 315, which are the envelopes of the respective plurality of signals.
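The 0-315 signal generation process 243A and the 0-315 envelope calculation process 243B might be sketched as follows. The description does not give the synthesis formula or the envelope method, so the cardioid-like weighting of W, X, and Y, the angle convention (0° at the front, increasing toward the left), the rectify-and-smooth envelope follower, and all names below are assumptions.

```python
import math

ANGLES = (0, 45, 90, 135, 180, 225, 270, 315)


def directional_sample(w, x, y, angle_deg):
    """Synthesize one sample aimed at angle_deg from W, X, and Y.

    Assumption: a cardioid-like virtual microphone with equal weight on
    the omnidirectional and figure-of-eight parts.
    """
    theta = math.radians(angle_deg)
    return 0.5 * (w + x * math.cos(theta) + y * math.sin(theta))


def directional_signals(w, x, y):
    """Return a dict mapping each angle to its synthesized signal."""
    return {
        angle: [directional_sample(wi, xi, yi, angle)
                for wi, xi, yi in zip(w, x, y)]
        for angle in ANGLES
    }


def envelope(signal, smoothing=0.99):
    """Assumed envelope follower: rectification plus one-pole smoothing."""
    env = 0.0
    out = []
    for s in signal:
        env = smoothing * env + (1.0 - smoothing) * abs(s)
        out.append(env)
    return out
```

With this weighting, a source for which W equals X and Y is zero gives the largest output at 0° and nearly none at 180°, so the eight envelopes differ most clearly along the direction of the source.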
[0070] 3.5 0-315 Sum/Average Calculation Process
As illustrated in
[0071] 3.6 Angle Distinguishing Process
[0072] The processor 24 executes an angle distinguishing process 245. That is, the processor 24 compares the average (Ave) of each of Env 0, Env 45, Env 90, Env 135, Env 180, Env 225, Env 270, and Env 315. Based on the results of the comparison, the processor 24 then distinguishes a specific angle of any one of 0°, 45°, 90°, 135°, 180°, 225°, 270°, or 315° corresponding to the signal with the largest envelope average (Ave).
[0073] The processor 24 distinguishes the specific angle at predetermined time intervals. For example, the processor 24 repeatedly executes the process of distinguishing the specific angle at 33-ms intervals, equivalent to one frame at a frame rate of 30 FPS. In this example, the processor 24 distinguishes the specific angle based on the envelope average (Ave) over each 33-ms interval.
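The per-interval sum and average and the choice of the largest average can be illustrated with a short sketch. The sampling rate of 48 kHz, the resulting frame length, and the names `frame_averages` and `loudest_angle` are assumptions; the description specifies only the 33-ms interval.

```python
SAMPLE_RATE = 48_000            # hypothetical sampling rate in Hz
FRAME_LEN = SAMPLE_RATE // 30   # samples in one frame at 30 FPS (about 33 ms)


def frame_averages(envelope, frame_len):
    """Return (Sum, Ave) for each complete frame of an envelope."""
    results = []
    for start in range(0, len(envelope) - frame_len + 1, frame_len):
        total = sum(envelope[start:start + frame_len])
        results.append((total, total / frame_len))
    return results


def loudest_angle(average_by_angle):
    """Return the angle whose envelope average (Ave) is largest."""
    return max(average_by_angle, key=average_by_angle.get)
```

For example, given averages of 0.1 at 0°, 0.6 at 90°, and 0.2 at 180° for one interval, `loudest_angle` returns 90.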
[0074] 3.7 Audio Signal Generation Process
[0075] The processor 24 executes an audio signal generation process 246. That is, the processor 24 generates an audio signal corresponding to the specific angle distinguished by the angle distinguishing process 245 described above. The audio signal corresponding to the specific angle is generated by synthesizing the signals W, X, and Y of the B format signal memorized in the RAM.
[0076] As illustrated in
[0077] For example, the processor 24 in the angle distinguishing process 245 distinguishes the specific angle at 33-ms intervals. In this case, the processor 24 in the audio signal generation process 246 generates the audio signal corresponding to the specific angle based on the signals W, X, and Y of the B format signal delayed by 33 ms, that is, the B format signal memorized in the RAM 33 ms earlier. This allows the speech of a new speaker to be sent to the conference system of the other party without its beginning being cut off. It should be noted that the audio signal outputted from the microphone device 1 is therefore delayed by 33 ms. However, a delay of 33 ms is not perceived as unnatural by the other party of the conference.
[0078] 3.8 Cross Fade Process
[0079] The processor 24 executes a cross fade process 247. The cross fade process 247 is executed when a first speaker changes to a second speaker.
[0080] For example, it is assumed that the first speaker speaks from a specific angle a (e.g., a=0°). The processor 24 distinguishes the specific angle a corresponding to the signal with the largest envelope average (Ave). The processor 24 then generates an audio signal corresponding to the specific angle a and outputs the signal from the microphone device 1.
[0081] Later, when the second speaker speaks from a specific angle b (e.g., b=90°), the processor 24 distinguishes the specific angle b corresponding to the signal with the largest envelope average (Ave). The processor 24 then generates an audio signal corresponding to the specific angle b and outputs the signal from the microphone device 1. At this point, the processor 24 executes the cross fade process 247.
[0082] In the cross fade process 247, the processor 24 gradually reduces the output level of the audio signal corresponding to the specific angle a. This causes the output of the audio signal corresponding to the specific angle a to be faded out. At the same time, the processor 24 gradually increases the output level of the audio signal corresponding to the specific angle b. This causes the output of the audio signal corresponding to the specific angle b to be faded in.
[0083] The cross fade process 247 reduces the noise produced when the output is switched between the two audio signals. An abrupt switch breaks the continuity of the signal waveform, and this discontinuity produces noise. Without the cross fade, the noise would be heard every time the speaker changes and would be unpleasant for the other party of the conference. The cross fade process 247 reduces this noise and allows the sound of the first speaker to be switched to the sound of the second speaker without sounding unnatural.
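A cross fade of this kind can be sketched as below. The description does not specify the fade curve or its duration, so the linear gain ramp and the function name `cross_fade` are assumptions for illustration.

```python
def cross_fade(old, new):
    """Fade out `old` while fading in `new` over their common length.

    Assumption: linear gain ramps. `old` is the audio signal for the
    specific angle a, and `new` is the audio signal for the specific angle b.
    """
    n = len(old)
    if n != len(new):
        raise ValueError("signals must have the same length")
    out = []
    for i in range(n):
        gain_in = i / (n - 1) if n > 1 else 1.0  # rises from 0 to 1
        gain_out = 1.0 - gain_in                 # falls from 1 to 0
        out.append(gain_out * old[i] + gain_in * new[i])
    return out


print(cross_fade([1.0] * 5, [0.0] * 5))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

Because the two gains always add up to 1, the output moves from one signal to the other without the step in the waveform that an abrupt switch would produce.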
[0084] 3.9 Process Flow of Processor
With reference to
[0085] At step S1, the processor 24 clears the sum (Sum) and average (Ave) of each of Env 0, Env 45, Env 90, Env 135, Env 180, Env 225, Env 270, and Env 315 memorized in the process of
[0086] It should be noted that Env 0 is the envelope of a signal sampled at 0° horizontally. Env 45 is the envelope of a signal sampled at 45° horizontally. Env 90 is the envelope of a signal sampled at 90° horizontally. Env 135 is the envelope of a signal sampled at 135° horizontally. Env 180 is the envelope of a signal sampled at 180° horizontally. Env 225 is the envelope of a signal sampled at 225° horizontally. Env 270 is the envelope of a signal sampled at 270° horizontally. Env 315 is the envelope of a signal sampled at 315° horizontally.
[0087] Going on to step S2, the processor 24 first calculates the sum (Sum) and the average (Ave) of Env 0. For example, the processor 24 calculates the sum (Sum) and the average (Ave) of Env 0 over 33 ms.
[0088] Going on to step S3, the processor 24 determines whether the average (Ave) of Env 0 is equal to or greater than a predefined threshold. If the average (Ave) of Env 0 is less than the threshold (NO), the processor 24 goes on to step S5. From this point forward, the process for the signal at 0° horizontally corresponding to Env 0 is not executed. In other words, if the envelope average (Ave) is less than the threshold, no audio signal is generated for the angle corresponding to this envelope.
[0089] Meanwhile, if the average (Ave) of Env 0 is the threshold or more at step S3 (YES), the processor 24 goes on to step S4 and distinguishes the angle “0°” corresponding to Env 0. The processor 24 then goes on to step S5.
[0090] At step S5, the processor 24 determines whether the process of steps S2 through S4 is completed for all angles of Env 0, Env 45, Env 90, Env 135, Env 180, Env 225, Env 270, and Env 315. If the process of steps S2 through S4 is not completed for all angles (NO), the processor 24 repeats the process of steps S2 through S4 until it is completed for all angles.
[0091] Meanwhile, if the process of steps S2 through S4 is completed for all angles at step S5 (YES), the processor 24 goes on to step S6. At step S6, the processor 24 distinguishes the largest envelope average (Ave) among the envelope averages (Ave) determined at step S3 to be equal to or greater than the threshold.
[0092] Going on to step S7, the processor 24 distinguishes the specific angle b (e.g., b=90°) corresponding to the largest envelope average (Ave). Going on to step S8, the processor 24 generates an audio signal corresponding to the specific angle b. The audio signal corresponding to the specific angle b is generated by synthesizing the signals W, X, and Y of the B format signal memorized in the RAM.
[0093] Going on to step S9, the processor 24 determines whether an audio signal corresponding to the specific angle “b” is currently outputted. The currently outputted audio signal has been generated by the process of
[0094] Meanwhile, if determining that the audio signal corresponding to the specific angle “b” is not currently outputted (NO) at step S9, the processor 24 goes on to step S10 and executes the cross fade process.
[0095] For example, it is assumed that an audio signal corresponding to the specific angle a (e.g., a=0°) is currently outputted by the process of
[0096] The processor 24 then executes the process of step S11 and finishes the process illustrated in
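The decisions in steps S1 through S10 can be summarized in a sketch. It is an illustration under stated assumptions, not the flow of the embodiment itself: the names `select_angle` and `next_action` are hypothetical, and the behavior when no angle reaches the threshold (returning `None` and keeping the current output) is an assumption, since the description does not cover that case.

```python
ANGLES = (0, 45, 90, 135, 180, 225, 270, 315)


def select_angle(env_frames, threshold):
    """Steps S1 to S7 for one interval.

    env_frames maps each angle to the envelope samples of that interval.
    Returns the angle with the largest average at or above the threshold,
    or None if no angle qualifies (assumed behavior).
    """
    candidates = {}                          # S1: start from cleared values
    for angle in ANGLES:                     # S5: repeat for every angle
        frame = env_frames[angle]
        ave = sum(frame) / len(frame)        # S2: Sum and Ave
        if ave >= threshold:                 # S3: compare with the threshold
            candidates[angle] = ave          # S4: keep this angle
    if not candidates:
        return None
    return max(candidates, key=candidates.get)  # S6 and S7


def next_action(current_angle, new_angle):
    """Step S9: decide whether the cross fade of step S10 is needed."""
    if new_angle is None or new_angle == current_angle:
        return "keep"
    return "cross_fade"
```

For instance, if the output currently corresponds to 0° and `select_angle` returns 90 for the latest interval, `next_action(0, 90)` yields `"cross_fade"`, which corresponds to fading out the signal for angle a while fading in the signal for angle b.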
[0097] 4. Action and Effects
[0098] The microphone device 1 with the software of the present embodiment described above installed therein allows selective output of the sound produced from the specific direction in the space where the first through fourth microphone elements 11 to 14 are installed. That is, the processor 24 executing the process in accordance with the software of the present embodiment distinguishes the specific direction from which the loudest sound is produced in the space where the first through fourth microphone elements 11 to 14 are installed and generates and outputs an audio signal corresponding to the specific direction. Audio signals corresponding to directions other than the specific direction are not outputted. Such a process of the software in the present embodiment may be considered to reproduce, by digital signal processing, the human behavior of pointing a microphone in the direction from which the loudest sound is produced.
[0099] In addition, the processor 24 generates and outputs only an audio signal produced from the specific direction and thus the echoing sound picked up by the microphone 10 omnidirectionally in the room and various types of noise produced inside and outside the room are greatly reduced.
[0100] Still in addition, the processor 24 removes the components at the cut-off frequency or less from the A format signal by the low-cut process 240. This causes the audio signal generated by the processor 24 to have reduced noise in the low frequency band, such as wind noise of an air conditioner and pop noise of a speaker.
[0101] 5. Others
The software and the microphone device of the present invention are not limited to the embodiment described above. For example, first order ambisonics, which generates a four-channel B format signal, is employed in the embodiment described above, but the order of ambisonics is not limited to this. Higher order ambisonics of the second order or higher is also applicable to the software and the microphone device of the present invention.
[0102] In addition, the use of the software and the microphone device is exemplified by the microphone for a conference system in the embodiment described above while the use is not limited to this. For example, the use of the software and the microphone device of the present invention may be a microphone simultaneously used with a monitoring camera. In this case, it is possible to direct the monitoring camera in a specific direction distinguished by the microphone device.
[0103] Still in addition, the software and the microphone device of the present invention are not limited to the configuration for signal processing of sound horizontally through 360° based on the signals W, X, and Y of the B format signal. The software and the microphone device of the present invention are capable of signal processing for omnidirectional sound including the forward and backward, left and right, and upward and downward directions based on all of the signals W, X, Y, and Z of the B format signal.
[0104] In addition, the sound horizontally through 360° is subjected to signal processing at intervals of 45° in the embodiment described above, but the interval is not limited to this. The software and the microphone device of the present invention are capable of signal processing for sound horizontally through 360° at intervals other than 45°.
DESCRIPTION OF REFERENCE NUMERALS
[0105] 1 Microphone Device
[0106] 10 Microphone
[0107] 11 First Microphone Element
[0108] 12 Second Microphone Element
[0109] 13 Third Microphone Element
[0110] 14 Fourth Microphone Element
[0111] 15 Protector
[0112] 20 Body
[0113] 201A, 201B REC LED
[0114] 202 Display (Visual Display Device)
[0115] 203 REC Key
[0116] 204 STOP/HOME Key
[0117] 205 REW/Select Key
[0118] 206 PLAY/PAUSE/ENTER Key
[0119] 207 FF/Select Key
[0120] 208 MENU Key
[0121] 209 Power/HOLD Switch
[0122] 210 VOLUME Key
[0123] 211 MIC GAIN Dial
[0124] 212 USB Terminal
[0125] 213 LINE OUT Terminal
[0126] 214 Threaded Hole
[0127] 215 REMOTE Terminal
[0128] 216 PHONE OUT Terminal
[0129] 217 Bottom Cover
[0130] 21 Microphone Gain
[0131] 22 A/D Converter
[0132] 24 Processor
[0133] 240 Low-Cut Process
[0134] 241 A/B Format Conversion Process
[0135] 242 Memorization/Reading Process
[0136] 243 0-315 Sampling Process
[0137] 243A 0-315 Signal Generation Process
[0138] 243B 0-315 Envelope Calculation Process
[0139] 244 0-315 Sum/Average Calculation Process
[0140] 245 Angle Distinguishing Process
[0141] 246 Audio Signal Generation Process
[0142] 247 Cross Fade Process