Patent classifications
H04M3/40
Voice Filtering Other Speakers From Calls And Audio Messages
A method includes receiving a first instance of raw audio data corresponding to a voice-based command and receiving a second instance of the raw audio data corresponding to an utterance of audible contents for an audio-based communication spoken by a user. When a voice filtering recognition routine determines to activate voice filtering for at least the voice of the user, the method also includes obtaining a respective speaker embedding of the user and processing, using the respective speaker embedding, the second instance of the raw audio data to generate enhanced audio data for the audio-based communication that isolates the utterance of the audible contents spoken by the user and excludes at least a portion of the one or more additional sounds that are not spoken by the user. The method also includes executing.
SYSTEM AND METHOD FOR CONTENT FOCUSED CONVERSATION
Systems and methods are provided to enable a communication to be monitored and, if found to possess emotionally charged content, reduce or eliminate the emotionally charged content while allowing the communication to otherwise be presented to the recipient. For example, agents in a contact center may encounter a customer who utilizes emotionally charged content which may be expressed as words or phrases (e.g., insults, slurs, profanity, etc.) or by intonations (e.g., yelling, talking through their teeth, etc.). By omitting such content through volume level balancing, tonal balancing, and/or redaction or substitution, the recipient may focus on the content of the message without the distraction of the emotionally charged content of a communication, whether the communication is audio, video, or text.
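The redaction and volume level balancing described above can be illustrated with a minimal sketch. The term list, the placeholder string, and both function names are hypothetical and not taken from the abstract; a real system would presumably rely on a much richer classifier for words and intonation.

```python
import re

# Hypothetical term list, for illustration only.
CHARGED_TERMS = {"idiot", "stupid"}


def redact_charged_terms(message: str, placeholder: str = "[removed]") -> str:
    """Replace whole-word matches of charged terms, ignoring case."""
    if not CHARGED_TERMS:
        return message
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in sorted(CHARGED_TERMS)) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(placeholder, message)


def balance_volume(samples: list[float], ceiling: float) -> list[float]:
    """Scale samples down so the peak magnitude does not exceed `ceiling`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= ceiling:
        return list(samples)
    gain = ceiling / peak
    return [s * gain for s in samples]
```

For example, `redact_charged_terms("You idiot, my order is late")` leaves the substance of the complaint intact while dropping the insult.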
METHOD AND APPARATUS FOR SOUND ENHANCEMENT
A method and apparatus for sound enhancement are provided. The method comprises: obtaining sound signals and converting them into digital signals; decomposing the digital signals to obtain a plurality of intrinsic mode functions (IMFs) or pseudo-IMFs; selectively amplifying the amplitudes of the IMFs or pseudo-IMFs; and reconstituting the selectively amplified IMFs or pseudo-IMFs to obtain reconstituted signals, which are then converted into analog signals. The invention is based on the Hilbert-Huang transform. It allows sound to be amplified selectively, so that only the high-frequency consonants are amplified and the vowels are not, which effectively improves the clarity of the enhanced sound. This overcomes a problem of current sound enhancement methods, which make the sound louder without increasing its clarity.
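The amplify-and-reconstitute step can be sketched as a weighted sum of components. This sketch assumes the decomposition has already been performed elsewhere and that the components are ordered from highest to lowest frequency, as is usual for empirical mode decomposition; the function name and gain values are hypothetical, and the decomposition itself is not shown.

```python
def reconstitute(components: list[list[float]], gains: list[float]) -> list[float]:
    """Sum the components sample by sample, each scaled by its own gain.

    `components` stands in for the IMFs or pseudo-IMFs of one signal; a gain
    above 1.0 amplifies that component and a gain of 1.0 leaves it unchanged.
    """
    if len(components) != len(gains):
        raise ValueError("one gain is required per component")
    if not components:
        return []
    length = len(components[0])
    if any(len(c) != length for c in components):
        raise ValueError("all components must have the same length")
    return [
        sum(g * c[i] for g, c in zip(gains, components))
        for i in range(length)
    ]


# Boost only the first (assumed high-frequency) component.
enhanced = reconstitute([[1.0, -1.0], [0.5, 0.5]], gains=[2.0, 1.0])
```

With all gains set to 1.0 the function simply returns the sum of the components, which reproduces the original signal when the decomposition is complete.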
Systems and methods for in-vehicle voice calls
Embodiments are disclosed for providing voice calls to users of a motor vehicle. As an example, a method comprises: responsive to a voice call, routing the voice call to at least one phone zone of a plurality of phone zones based on at least one of a user input and a source of the voice call, the plurality of phone zones included in a cabin of a motor vehicle. In this way, sonic interference with a voice call may be reduced, while a main system audio may continue to play for unselected phone zones.
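A minimal sketch of the routing decision follows. The zone names, the source-to-zone table, and the fallback to the driver zone are all assumptions made for illustration; the abstract only states that routing depends on a user input and/or the source of the call.

```python
from typing import Optional

# Hypothetical cabin layout and phone-to-zone defaults.
ZONES = ("driver", "front_passenger", "rear_left", "rear_right")
SOURCE_TO_ZONE = {"phone_a": "driver", "phone_b": "rear_left"}


def route_call(source: str, user_choice: Optional[str] = None) -> str:
    """Pick a phone zone: an explicit user choice wins over the call source."""
    if user_choice is not None:
        if user_choice not in ZONES:
            raise ValueError(f"unknown zone: {user_choice}")
        return user_choice
    # Assumed fallback: unknown sources go to the driver zone.
    return SOURCE_TO_ZONE.get(source, "driver")


def zone_audio(call_zone: str) -> dict[str, str]:
    """Play the call in the selected zone and main audio everywhere else."""
    return {z: ("call" if z == call_zone else "main_audio") for z in ZONES}
```

Calling `zone_audio(route_call("phone_b"))` would place the call in the rear-left zone while the other three zones keep the main system audio.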
TECHNIQUES FOR USING COMPUTER VISION TO ALTER OPERATION OF SPEAKER(S) AND/OR MICROPHONE(S) OF DEVICE
In one aspect, a first device includes at least one processor and storage accessible to the at least one processor. The storage includes instructions that may be executable by the processor to receive input from a camera and identify a second device based on the input from the camera. The second device may include at least one speaker and at least one microphone. The instructions may also be executable to identify a current location of the second device within an environment based on the input from the camera and to identify a current location of an object within the environment that is different from the second device. The instructions may then be executable to provide a command to alter operation of the at least one speaker and/or the at least one microphone based on the current location of the second device and the current location of the object.
Communication volume level change detection
Systems and methods for detecting volume level changes in communications are described herein. In some embodiments, a system comprises a computer system. The computer system comprises at least one processor and a memory coupled to the at least one processor. The memory stores program instructions that are executable by the at least one processor to cause the computer system to perform tasks. The tasks include recording a communication that comprises audio, and analyzing the audio of the communication. The analysis of the audio is operable to detect a change in a volume level of the audio that indicates an occurrence of a potential event of interest. The tasks also include creating and storing an information record corresponding to the communication in a second database. The information record includes an indication of the detected change in the volume level.
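One plausible way to detect such a change is to compare the RMS level of consecutive windows of audio. The sketch below is an assumption about how the analysis might work, not the method claimed; the window size, the 6 dB threshold, and the function names are hypothetical.

```python
import math


def window_level_db(window: list[float]) -> float:
    """Return the RMS level of one window in decibels relative to full scale."""
    rms = math.sqrt(sum(s * s for s in window) / len(window))
    return 20.0 * math.log10(max(rms, 1e-9))  # floor avoids log10(0)


def detect_volume_changes(
    samples: list[float], window_size: int, threshold_db: float = 6.0
) -> list[int]:
    """Return the indices of windows whose level jumps from the previous one."""
    levels = [
        window_level_db(samples[start:start + window_size])
        for start in range(0, len(samples) - window_size + 1, window_size)
    ]
    return [
        i for i in range(1, len(levels))
        if abs(levels[i] - levels[i - 1]) >= threshold_db
    ]


# Two quiet windows followed by a loud one.
changes = detect_volume_changes([0.1] * 8 + [0.8] * 4, window_size=4)
```

The returned window indices could then be written into the information record for the communication, for example as timestamps derived from the window size and sample rate.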
Conference system volume control
A method, system, and computer program product include detecting a volume level for audio input of a first user in a multi-user conference call, and automatically adjusting a volume level for a second user receiving audio output of the first user based on at least one of preferences of the second user, historic data between the first and the second user, and geographic characteristics of the audio input of the first user.
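As a rough illustration of the preference-based case only, the adjustment can be expressed as a clamped gain. The function name, the decibel units, and the 12 dB limit are assumptions; the historic and geographic factors named in the abstract are not modeled here.

```python
def listener_gain_db(
    speaker_level_db: float,
    preferred_level_db: float,
    max_gain_db: float = 12.0,
) -> float:
    """Return the gain that moves the speaker's level toward the listener's
    preferred level, limited to +/- max_gain_db."""
    gain = preferred_level_db - speaker_level_db
    return max(-max_gain_db, min(max_gain_db, gain))
```

A quiet speaker at -30 dB heard by a listener who prefers -20 dB would get a 10 dB boost, while a very loud speaker would be attenuated by at most the configured limit.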
Sound detection and alert system for a workspace
A workspace assembly includes at least a first sound sensor located in a first facility space, at least one communication device located within the first space, and a processor in communication with the at least a first sound sensor and the communication device. The processor is adapted to compare the volume of sound emanating from within the first space to a threshold level and to generate a signal via the communication device when the volume of sound emanating from within the first space exceeds the threshold level. The processor also periodically automatically adjusts the threshold level.
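The compare-and-alert loop with a periodically adjusted threshold might look like the sketch below. The abstract does not say how the threshold is adjusted, so the rule used here (recent average level plus a fixed margin) is purely an assumption, as are the class name and parameters.

```python
class NoiseMonitor:
    """Flags readings above a threshold and re-derives the threshold
    every `adjust_every` readings (assumed adjustment rule)."""

    def __init__(self, threshold_db: float, margin_db: float = 10.0,
                 adjust_every: int = 4) -> None:
        self.threshold_db = threshold_db
        self.margin_db = margin_db
        self.adjust_every = adjust_every
        self._recent: list[float] = []

    def observe(self, level_db: float) -> bool:
        """Return True when the reading exceeds the current threshold."""
        alert = level_db > self.threshold_db
        self._recent.append(level_db)
        if len(self._recent) == self.adjust_every:
            # Assumed rule: average of the recent readings plus a margin.
            average = sum(self._recent) / len(self._recent)
            self.threshold_db = average + self.margin_db
            self._recent.clear()
        return alert
```

In this sketch a `True` return value stands in for the signal generated via the communication device.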
System and method for dynamic optical microphone
A dynamic optical microphone system may include an acoustic microphone that receives an audio signal and a laser microphone that transmits a laser beam and receives optical feedback from a human struck by the laser beam. The system may include a depth sensor that determines a distance to the human and a camera that tracks human faces. A processor may be communicatively coupled to the acoustic microphone, laser microphone, depth sensor, camera, and a memory storing computer executable instructions. The processor may determine a direction to a human, direct the laser beam at a voice box of the human, determine a distance to the human using the depth sensor, adjust an intensity of the laser beam based on the distance, receive optical feedback and isolate a voice signal through the optical feedback from background noise in the audio signal.