Patent classifications
G10L2021/065
REMOTE AUTOMATED SPEECH TO TEXT INCLUDING EDITING IN REAL-TIME ("RASTER") SYSTEMS AND METHODS FOR USING THE SAME
Remote automated speech to text with editing in real-time systems, and methods for using the same, are described herein. Communications between two or more endpoints are established, and audio and/or video data is transmitted therebetween. Text data representing the audio data, for example, may be generated and provided to the endpoint that originated the audio data. That endpoint may then edit the text data for clarity and correctness, and the edited text data may then be provided to the recipient endpoint(s).
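The flow this abstract describes can be sketched in a few lines: the originating endpoint's audio is transcribed, the draft transcript goes back to the speaker for correction, and only the edited text reaches the recipients. All function names here are illustrative, not from the patent.

```python
# Toy sketch of the RASTER flow: transcribe, return draft to the
# speaker for editing, then deliver the edited text to recipients.

def transcribe(audio):
    """Stub automatic speech-to-text (deliberately imperfect)."""
    return "the quick brown focks"

def speaker_edit(draft):
    """The originating endpoint corrects the draft for correctness."""
    return draft.replace("focks", "fox")

def raster_call(audio, recipients):
    draft = transcribe(audio)      # draft shown to the speaker first
    edited = speaker_edit(draft)   # speaker fixes recognition errors
    for deliver in recipients:
        deliver(edited)            # recipients see only the edited text
    return edited

inbox = []
print(raster_call(b"...", [inbox.append]))  # prints "the quick brown fox"
```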
MULTIMODAL SPEECH RECOGNITION FOR REAL-TIME VIDEO AUDIO-BASED DISPLAY INDICIA APPLICATION
Aspects relate to computer-implemented methods, systems, and processes to automatically generate audio-based display indicia of media content. These include: receiving, by a processor, a plurality of media content categories each including at least one feature; receiving a plurality of categorized speech recognition algorithms, each speech recognition algorithm being associated with a respective one or more of the plurality of media content categories; determining a media content category of a current media content based on at least one feature of the current media content; selecting one speech recognition algorithm from the plurality of categorized speech recognition algorithms based on the determined media content category; and applying the selected speech recognition algorithm to the current media content.
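The selection step above amounts to a dispatch table keyed by content category. A minimal sketch, assuming toy features, categories, and recognizer stubs (none of these names come from the patent):

```python
# Hypothetical category-based recognizer selection: classify the media
# content from its features, then apply the ASR algorithm registered
# for that category.

def classify_content(features):
    """Map content features to a media category (toy heuristic)."""
    if "crowd_noise" in features:
        return "sports"
    if "music_bed" in features:
        return "entertainment"
    return "news"

# Registry of categorized speech recognition algorithms (stubs).
RECOGNIZERS = {
    "sports": lambda audio: f"sports-model transcript of {audio!r}",
    "entertainment": lambda audio: f"entertainment-model transcript of {audio!r}",
    "news": lambda audio: f"news-model transcript of {audio!r}",
}

def transcribe_by_category(audio, features):
    category = classify_content(features)
    recognizer = RECOGNIZERS[category]   # select one algorithm
    return recognizer(audio)             # apply it to the content

print(transcribe_by_category("clip.wav", {"crowd_noise"}))
```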
SPEECH RECOGNITION CANDIDATE SELECTION BASED ON NON-ACOUSTIC INPUT
A method includes the following steps. A speech input is received. At least two speech recognition candidates are generated from the speech input. A scene related to the speech input is observed using one or more non-acoustic sensors. The observed scene is segmented into one or more regions. One or more properties for the one or more regions are computed. One of the speech recognition candidates is selected based on the one or more computed properties of the one or more regions.
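The final selection step can be sketched as re-ranking the candidates against properties computed from the segmented scene. The scoring rule and data shapes below are assumptions for illustration, not the patent's method:

```python
# Illustrative sketch: two ASR candidates are re-ranked using region
# properties (here, object labels) observed by a non-acoustic sensor.

def select_candidate(candidates, region_properties):
    """Pick the candidate whose words best match observed region labels."""
    labels = {p["label"] for p in region_properties}
    def score(candidate):
        words = set(candidate.lower().split())
        return len(words & labels)
    return max(candidates, key=score)

# A camera sees a ball and a dog; the acoustically ambiguous pair is
# resolved in favor of the visually grounded candidate.
regions = [{"label": "ball", "area": 120}, {"label": "dog", "area": 80}]
cands = ["throw the ball", "through the hall"]
print(select_candidate(cands, regions))  # prints "throw the ball"
```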
Method and system for controlling which voices reach the user's ears, via an earphone system controlled by a smartphone with a voice recognition control system
A method and system for controlling which voices reach the user's ears, via an earphone system controlled by a smartphone with a voice recognition control system, wherein a voice recognition (VR) system distinguishes between environmental voices and between individual human participants, optionally during a telephone call. These distinctions are enabled by dedicated voice recognition software and a smartphone device. The dedicated software presents the user with a visual display on the smartphone screen, including a voice recognition (VR) interface that enables the user to control the voices by approving some and blocking others; only the voices approved by the user are passed to the earphone connected to the smartphone. Optionally, some functions of the present system and software reside in the earphone device itself.
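The approval flow the abstract describes is essentially a filter keyed on speaker identity. A loose sketch, with all identifiers assumed for illustration:

```python
# Only voice segments whose speaker the user has approved on the
# smartphone screen are forwarded to the earphone.

def pass_to_earphone(voice_segments, approved_speakers):
    """Forward only segments from speakers the user approved."""
    return [seg for seg in voice_segments
            if seg["speaker"] in approved_speakers]

segments = [
    {"speaker": "alice", "audio": b"\x01"},
    {"speaker": "street_noise", "audio": b"\x02"},
    {"speaker": "bob", "audio": b"\x03"},
]
# The user approved alice and bob; street noise is blocked.
print(pass_to_earphone(segments, {"alice", "bob"}))
```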
Systems and methods for supporting hearing impaired users
A method for providing speech recognition to a user on a mobile device is provided, the method comprising: 1) receiving, by a processor, audio data; 2) processing the audio data, by a speech recognition engine, to determine one or more items of corresponding text, wherein the processing comprises querying a local language model and a local acoustic model; and 3) displaying the one or more items of corresponding text on a screen of the mobile device.
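The three claimed steps can be sketched as an on-device pipeline: the acoustic model proposes scored word hypotheses, the local language model rescores them, and the winner is displayed. The models here are stand-in stubs; nothing in this sketch comes from the patent itself.

```python
# Minimal on-device ASR pipeline: acoustic model + local language
# model, with the result sent to the display.

def acoustic_model(audio_frames):
    """Stub local acoustic model: word hypotheses with acoustic scores."""
    return [("here", 0.6), ("hear", 0.4)]

def language_model(word):
    """Stub local language model: prior probability of the word."""
    return {"hear": 0.7, "here": 0.3}.get(word, 0.0)

def recognize(audio_frames):
    hypotheses = acoustic_model(audio_frames)
    # Combine acoustic and language scores; keep the best hypothesis.
    best = max(hypotheses, key=lambda h: h[1] * language_model(h[0]))
    return best[0]

def display(text):
    print(text)  # stands in for drawing on the mobile screen

display(recognize([b"\x00\x01"]))  # prints "hear" (LM outweighs AM here)
```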
Performing artificial intelligence sign language translation services in a video relay service environment
Video relay services, communication systems, non-transitory machine-readable storage media, and methods are disclosed herein. A video relay service may include at least one server configured to receive a video stream including sign language content from a video communication device during a real-time communication session. The server may also be configured to automatically translate the sign language content into a verbal language translation during the real-time communication session without assistance of a human sign language interpreter. Further, the server may be configured to transmit the verbal language translation during the real-time communication session.
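The server-side flow in this abstract reduces to: receive the video stream during a live session, machine-translate the sign language content, and transmit the verbal translation back out with no human interpreter in the loop. The translator below is a stand-in stub:

```python
# Sketch of the relay flow: translate sign language content from a
# video stream and transmit the verbal translation in real time.

def translate_sign_language(frames):
    """Stub for the automatic sign-language-to-verbal translator."""
    return "hello, how are you?"

def relay_session(video_frames, transmit):
    translation = translate_sign_language(video_frames)
    transmit(translation)  # delivered during the real-time session
    return translation

sent = []
relay_session(["frame0", "frame1"], sent.append)
print(sent)  # prints ['hello, how are you?']
```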
Noise reduction audio reproducing device and noise reduction audio reproducing method
A noise reduction audio reproducing method includes the steps of: generating, from an audio signal of collected noise, a noise-cancellation audio signal that cancels the noise when the two are acoustically combined; reproducing the noise-cancellation audio signal acoustically so that it combines with the noise; emphasizing, within the collected audio, the audio component to be listened to; synthesizing the emphasized audio signal with the noise-cancellation audio signal and supplying the synthesized signal to an electro-acoustic converting unit; and controlling the supply so that the emphasized audio signal is provided to the synthesizing unit only during sections indicated by a control signal.
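Numerically, the method amounts to phase-inverting the collected noise, boosting the wanted component, and summing the two before the electro-acoustic converter. A rough sketch; the gains, signals, and gating are illustrative, not from the patent:

```python
# Toy sample-domain sketch of the noise reduction reproducing method.

def anti_noise(noise):
    """Phase-inverted copy of the collected noise (cancellation signal)."""
    return [-x for x in noise]

def emphasize(audio, gain=2.0):
    """Boost the audio component the listener wants to hear."""
    return [gain * x for x in audio]

def synthesize(noise, wanted, section_enabled=True):
    """Combine the cancellation signal with the emphasized audio.

    `section_enabled` plays the role of the control signal: the
    emphasized audio is supplied only during the indicated section.
    """
    cancel = anti_noise(noise)
    voice = emphasize(wanted) if section_enabled else [0.0] * len(wanted)
    # Output fed to the electro-acoustic converting unit; acoustically
    # adding `cancel` to the ambient noise leaves only `voice`.
    return [c + v for c, v in zip(cancel, voice)]

ambient = [0.5, -0.2, 0.1]
speech = [0.1, 0.3, -0.1]
print(synthesize(ambient, speech))
```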
METHOD AND SYSTEM FOR GENERATION OF CUSTOMISED SENSORY STIMULUS
A tinnitus treatment system comprising a sound processing unit, a haptic stimulus unit and an audio delivery unit. The sound processing unit comprises a processor input for receiving an audio signal; and a digital signal processor operable to analyse said audio signal and generate a plurality of actuation signals therefrom which are representative of said audio signal. The digital signal processor is further operable to spectrally modify said audio signal in accordance with a predetermined modification profile to generate a modified audio signal. The haptic stimulus unit comprises an array of stimulators each of which can be independently actuated to apply a tactile stimulus to a subject; and a stimulus unit input for receiving the plurality of actuation signals generated by said digital signal processor and directing individual actuation signals to individual stimulators. The audio delivery unit comprises an audio delivery unit input for receiving the modified audio signal generated by said digital signal processor.
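The abstract describes two signal paths from one processor: per-band actuation signals for the stimulator array, and a spectrally modified audio signal shaped by a predetermined profile. A hedged sketch; the band split, profile, and array size are assumptions for illustration:

```python
# Toy spectral processing for the two outputs: actuation signals
# (one per stimulator, from band energies) and a modified audio
# spectrum (per-bin gain profile).

def band_energies(spectrum, n_bands):
    """Split a magnitude spectrum into n_bands and sum each band."""
    size = len(spectrum) // n_bands
    return [sum(spectrum[i * size:(i + 1) * size]) for i in range(n_bands)]

def actuation_signals(spectrum, n_stimulators=4):
    """One drive level per stimulator, proportional to band energy."""
    return band_energies(spectrum, n_stimulators)

def modify_audio(spectrum, profile):
    """Apply the predetermined per-bin modification profile."""
    return [s * g for s, g in zip(spectrum, profile)]

spec = [1.0, 0.5, 0.2, 0.1, 0.0, 0.4, 0.3, 0.2]
print(actuation_signals(spec))                     # drives the stimulator array
print(modify_audio(spec, [1.0] * 4 + [0.5] * 4))   # e.g. attenuate high bands
```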