RECORDING MEETING AUDIO VIA MULTIPLE INDIVIDUAL SMARTPHONES
20220377458 · 2022-11-24
CPC classification
H04R2499/11 (ELECTRICITY)
H04R2430/01 (ELECTRICITY)
Abstract
A method of providing audio information from a meeting includes receiving a first audio stream from a first input audio device and a second audio stream from a second input audio device during the meeting, identifying a first audio fragment from the first audio stream, and identifying a second audio fragment from the second audio stream. The method also includes compiling the audio fragments from the first and second audio streams into an audio file that includes at least the first audio fragment and the second audio fragment. The method further includes providing the audio file to one or more recipients. The audio file identifies the first audio fragment as corresponding to a first participant of the meeting and the second audio fragment as corresponding to a second participant of the meeting.
Claims
1. A method of recording audio information from a meeting, the method comprising: executing a meeting management application, including establishing a plurality of connections with a plurality of audio input devices configured to record audio data; receiving a plurality of audio streams via the plurality of connections during a meeting, the plurality of audio streams including a first audio stream; associating the first audio stream with a first participant; identifying a first audio fragment from the first audio stream; transcribing the first audio fragment to first textual content; and compiling the plurality of audio streams into a storyboard of the meeting, the storyboard including at least the first textual content of the first audio fragment to be displayed in association with the first participant.
2. The method of claim 1, further comprising: extracting a voice profile from a first audio input device associated with the first participant, wherein the first audio fragment is transcribed based on the voice profile.
3. The method of claim 1, further comprising: distributing the storyboard of the meeting to a subset of the plurality of audio input devices.
4. The method of claim 1, wherein the storyboard of the meeting is organized in a chronological order of audio fragments.
5. The method of claim 1, wherein the storyboard of the meeting further includes a plurality of audio fragments.
6. The method of claim 1, wherein the storyboard of the meeting further includes at least one of voice annotations and pre-recorded introductions of each active participant.
7. The method of claim 1, further comprising: receiving one or more user inputs to edit the storyboard; and in response to the one or more user inputs to edit the storyboard, performing an action including one or more of: emphasizing the first audio fragment associated with the first participant; deemphasizing the first audio fragment associated with the first participant; grouping a plurality of audio fragments in the storyboard by topic; grouping audio fragments associated with the first participant in the storyboard; and adding new audio fragments to the storyboard.
8. The method of claim 1, further comprising: receiving an additional user voice input annotating the first audio fragment, wherein the first audio fragment is automatically identified from the first audio stream in response to the additional user voice input.
9. The method of claim 1, further comprising: receiving an additional user input annotating the first audio fragment, wherein the first audio fragment is automatically identified from the first audio stream in response to the additional user input.
10. An electronic device, comprising: one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for: executing a meeting management application, including establishing a plurality of connections with a plurality of audio input devices configured to record audio data; receiving a plurality of audio streams via the plurality of connections during a meeting, the plurality of audio streams including a first audio stream; associating the first audio stream with a first participant; identifying a first audio fragment from the first audio stream; transcribing the first audio fragment to first textual content; and compiling the plurality of audio streams into a storyboard of the meeting, the storyboard including at least the first textual content of the first audio fragment to be displayed in association with the first participant.
11. The electronic device of claim 10, wherein the storyboard of the meeting is organized in a chronological order.
12. The electronic device of claim 10, wherein the storyboard of the meeting further includes at least one of voice annotations and pre-recorded introductions of each active participant.
13. The electronic device of claim 10, wherein the first audio stream is recorded by a first audio input device, and a visual signal is provided on the first audio input device to request feedback from the first participant associated with the first audio input device.
14. The electronic device of claim 13, wherein the first participant responds to the visual signal provided on the first audio input device to confirm whether the first participant is currently speaking.
15. A non-transitory computer-readable medium storing one or more programs configured for execution by a system, the one or more programs including instructions for: executing a meeting management application, including establishing a plurality of connections with a plurality of audio input devices configured to record audio data; receiving a plurality of audio streams via the plurality of connections during a meeting, the plurality of audio streams including a first audio stream; associating the first audio stream with a first participant; identifying a first audio fragment from the first audio stream; transcribing the first audio fragment to first textual content; and compiling the plurality of audio streams into a storyboard of the meeting, the storyboard including at least the first textual content of the first audio fragment to be displayed in association with the first participant.
16. The non-transitory computer-readable medium of claim 15, the one or more programs further comprising instructions for: extracting a voice profile from a first audio input device associated with the first participant, wherein the first audio fragment is transcribed based on the voice profile.
17. The non-transitory computer-readable medium of claim 15, the one or more programs further comprising instructions for: distributing the storyboard of the meeting to a subset of the plurality of audio input devices.
18. The non-transitory computer-readable medium of claim 15, the plurality of audio streams including a second audio stream, the one or more programs further comprising instructions for: identifying from the second audio stream a second audio fragment, wherein the storyboard includes the second audio fragment; and providing the storyboard to one or more recipients, wherein the storyboard identifies the transcribed first audio fragment as corresponding to the first participant and the second audio fragment as corresponding to a second participant.
19. The non-transitory computer-readable medium of claim 18, wherein providing the storyboard to the one or more recipients includes replaying the second audio fragment to the one or more recipients.
20. The non-transitory computer-readable medium of claim 18, the one or more programs further comprising instructions for: maintaining the first audio fragment in a first audio channel associated with the first participant; and maintaining the second audio fragment in a second audio channel associated with the second participant, the first and second audio channels being separate from each other.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] Embodiments of the system described herein will now be explained in more detail in accordance with the figures of the drawings, which are briefly described as follows.
[0028]
[0029]
[0030]
[0031]
DETAILED DESCRIPTION OF VARIOUS EMBODIMENTS
[0032] The system described herein provides a mechanism for recording meeting audio on the individual smartphones of meeting participants, automatically identifying speakers, handling double-talk episodes, compiling a meeting storyline from fragments recorded in speaker channels (including post-meeting voice annotations), and optionally converting certain portions of the recording from voice to text.
[0033]
[0034] If all conditions and checks for a current speaker are satisfied, the participant 110 is marked as an active speaker and the smartphone 140 of the participant 110 is marked as a principal recording device and becomes a designated one of the smartphones 140, 150, 160 recording a voice stream of the participant 110, as schematically shown on the screen of the smartphone 140. A channel of the participant 110 is activated (or created if the participant 110 speaks for the first time in the meeting) and a fragment of an audio recording of the participant 110 is added to the channel after a pause or speaker change, as explained elsewhere herein.
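The channel handling described in this paragraph can be illustrated with a minimal sketch. The sketch is not the disclosed implementation; the names Fragment, SpeakerChannels, and add_fragment, and the identifier strings used with them, are hypothetical and chosen only for illustration. It shows a channel being created the first time a participant speaks and a completed fragment being appended to that channel afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Fragment:
    """One uninterrupted stretch of speech by a single participant (hypothetical type)."""
    participant_id: str
    start: float      # seconds from the start of the meeting
    end: float        # seconds from the start of the meeting
    device_id: str    # principal recording device for this fragment


@dataclass
class SpeakerChannels:
    """Per-participant channels, each holding that participant's fragments in order."""
    channels: dict[str, list[Fragment]] = field(default_factory=dict)

    def add_fragment(self, fragment: Fragment) -> None:
        # setdefault creates the channel the first time a participant speaks
        # and reuses the existing channel on every later call.
        self.channels.setdefault(fragment.participant_id, []).append(fragment)


channels = SpeakerChannels()
# Called after a pause or a speaker change closes the current fragment.
channels.add_fragment(Fragment("participant-110", 0.0, 4.2, "smartphone-140"))
channels.add_fragment(Fragment("participant-120", 4.5, 9.0, "smartphone-150"))
channels.add_fragment(Fragment("participant-110", 9.3, 12.0, "smartphone-140"))
```

Keeping one list per participant keeps the speaker channels separate from each other, so a later compilation step can merge them in whatever order is required.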
[0035]
[0036]
[0037] A storyline 350 of a meeting may be compiled from the original audio fragments of meeting participants recorded during the meeting, combined with voice annotations and other components, such as pre-recorded introductions of each speaker, and organized chronologically, by topic, or otherwise. For example, the storyline 350 may be organized in a chronological order of speaker fragments, with each voice annotation added immediately after the fragment it annotates. Such storylines may be distributed as key meeting materials shortly after the end of the meeting.
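The chronological organization in the example above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the Clip type, its fields, and the function compile_storyline are hypothetical, and each annotation is assumed to carry the identifier of the fragment it annotates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clip:
    """A speaker fragment or a voice annotation (hypothetical type)."""
    clip_id: str
    speaker: str
    start: float                     # seconds from the start of the meeting
    annotates: Optional[str] = None  # clip_id of the annotated fragment, if any


def compile_storyline(fragments, annotations):
    """Order fragments chronologically and place each annotation
    immediately after the fragment it annotates."""
    notes_by_target = {}
    for note in annotations:
        notes_by_target.setdefault(note.annotates, []).append(note)

    storyline = []
    for fragment in sorted(fragments, key=lambda clip: clip.start):
        storyline.append(fragment)
        # Annotations may be recorded after the meeting, so they are placed
        # by target fragment rather than by their own recording time.
        storyline.extend(notes_by_target.get(fragment.clip_id, []))
    return storyline


fragments = [Clip("f2", "B", 10.0), Clip("f1", "A", 0.0)]
annotations = [Clip("n1", "C", 3600.0, annotates="f1")]
storyline = compile_storyline(fragments, annotations)
```

Grouping by topic or by participant, as mentioned above, would replace the sort key with a different grouping rule while the annotation placement stays the same.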
[0038] Some of the recorded audio fragments may be converted to text using voice-to-text technologies. In
[0039] Referring to
[0040] Processing begins at a step 410, where the system establishes connections between smartphones of participants and/or with a local or cloud service run by the system. The system may also ensure that software for the system is running on the smartphone of each participant and that a recording mode on each smartphone is enabled. After the step 410, processing proceeds to a step 415, where a meeting participant speaks. After the step 415, processing proceeds to a step 420, where the system measures the average volume of the audio signal over short periods of time and the delay of the audio signal on each smartphone, as explained elsewhere herein (see in particular
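The measurements of the step 420 can be illustrated with a minimal sketch. The sketch rests on assumptions that the text does not state: average volume is taken as root-mean-square (RMS) amplitude over non-overlapping windows, delay is estimated by brute-force cross-correlation between two devices, and the loudest device is treated as the most likely principal device. The function names windowed_rms, estimate_delay, and loudest_device are hypothetical, and the actual criteria used by the system may differ.

```python
import math


def windowed_rms(samples, window):
    """Average volume (RMS) of each consecutive non-overlapping window."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]


def estimate_delay(reference, other, max_lag):
    """Lag in samples, within +/- max_lag, at which `other` best matches
    `reference`; a positive lag means `other` receives the sound later."""
    def score(lag):
        return sum(
            reference[i] * other[i + lag]
            for i in range(len(reference))
            if 0 <= i + lag < len(other)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


def loudest_device(streams, window):
    """Assumed criterion: the device with the highest mean windowed RMS."""
    def mean_rms(samples):
        levels = windowed_rms(samples, window)
        return sum(levels) / len(levels) if levels else 0.0
    return max(streams, key=lambda device: mean_rms(streams[device]))


# Toy signals: the far phone hears the same sound two samples later, at half volume.
near = [0.0, 0.0, 1.0, -1.0, 1.0, -1.0, 0.0, 0.0]
far = [0.0, 0.0, 0.0, 0.0, 0.5, -0.5, 0.5, -0.5]
streams = {"smartphone-140": near, "smartphone-150": far}

delay = estimate_delay(near, far, max_lag=3)
candidate = loudest_device(streams, window=4)
```

The brute-force correlation keeps the sketch free of dependencies; for real audio at typical sample rates an FFT-based correlation would be the more practical choice.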
[0041] After the step 425, processing proceeds to a step 430, where a candidate for the current speaker is detected according to specific criteria, as explained elsewhere herein. After the step 430, processing proceeds to a step 435, where the system runs an additional speaker identification check, as explained in conjunction with
[0042] After the step 460, processing proceeds back to the test step 455. If it was determined at the test step 455 that any of the current speakers stopped talking, processing proceeds to a step 465, where a recorded speaker fragment from principal smartphones is added to the corresponding speaker channels, as explained elsewhere herein (see
[0043] After the step 480, processing proceeds to a step 485, where certain fragments may be optionally transcribed to text, as explained elsewhere herein, in particular, in conjunction with
[0044] Various embodiments discussed herein may be combined with each other in appropriate combinations in connection with the system described herein. Additionally, in some instances, the order of steps in the flowcharts, flow diagrams and/or described flow processing may be modified, where appropriate. Similarly, elements and areas of the screen described in screen layouts may vary from the illustrations presented herein. Further, various aspects of the system described herein may be implemented using software, hardware, a combination of software and hardware and/or other computer-implemented modules or devices having the described features and performing the described functions. Smartphones functioning as audio recording devices may include software that is pre-loaded with the device, installed from an app store, installed from a desktop (after possibly being pre-loaded thereon), installed from media such as a CD, DVD, etc., and/or downloaded from a Web site. Such smartphones may use operating system(s) selected from the group consisting of iOS, Android OS, Windows Phone OS, BlackBerry OS and mobile versions of Linux OS.
[0045] Software implementations of the system described herein may include executable code that is stored in a computer readable medium and executed by one or more processors. The computer readable medium may be non-transitory and include a computer hard drive, ROM, RAM, flash memory, portable computer storage media such as a CD-ROM, a DVD-ROM, a flash drive, an SD card and/or other drive with, for example, a universal serial bus (USB) interface, and/or any other appropriate tangible or non-transitory computer readable medium or computer memory on which executable code may be stored and executed by a processor. The software may be bundled (pre-loaded), installed from an app store or downloaded from a location of a network operator. The system described herein may be used in connection with any appropriate operating system.
[0046] Other embodiments of the invention will be apparent to those skilled in the art from a consideration of the specification or practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.