G10L21/007

METHOD, APPARATUS, AND NON-TRANSITORY COMPUTER READABLE MEDIUM FOR PROCESSING AUDIO OF VIRTUAL MEETING ROOM
20230169982 · 2023-06-01

A method for processing audio generated in a virtual meeting room (VMR) includes: setting a quantity of mesh vertexes according to seats in the VMR; obtaining first voiceprint information of a presenter, the first voiceprint information comprising a frequency, an amplitude, and a phase difference of an audio signal; adjusting the frequency or amplitude of the first voiceprint information according to the quantity of the mesh vertexes, and obtaining second voiceprint information; and determining a seat of the presenter in the VMR according to the second voiceprint information. An apparatus and a non-transitory computer readable medium for processing audio in this manner are also disclosed.

EVALUATING SCREEN CONTENT FOR ACCESSIBILITY

In one example, a method for evaluating screen content for accessibility with a screen reader device is disclosed. The method provides a baseline document including a script of expected screen content that conforms to accessibility requirements. The method may generate an audio file based on screen content elements. In some implementations, the method uses a machine learning model to transcribe the audio file into an output transcription file. The method may determine whether the output transcription file matches the baseline document and generate a corresponding output report.
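
The comparison step can be illustrated with a minimal sketch. It assumes the audio has already been transcribed to text, and it swaps in a plain word-level diff from Python's standard `difflib` module for whatever matching the disclosure actually uses. The function names, the report fields, and the 0.9 pass threshold are all hypothetical.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def evaluate_transcription(baseline: str, transcription: str, threshold: float = 0.9) -> dict:
    """Compare a transcription against the baseline script and build a report.

    The 0.9 threshold is an illustrative assumption, not a value from the source.
    """
    expected = normalize(baseline)
    actual = normalize(transcription)
    matcher = difflib.SequenceMatcher(a=expected, b=actual, autojunk=False)
    # Collect every word span where the transcription departs from the baseline.
    differences = [
        {"op": tag, "expected": " ".join(expected[i1:i2]), "heard": " ".join(actual[j1:j2])}
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    score = matcher.ratio()
    return {"score": score, "passed": score >= threshold, "differences": differences}


report = evaluate_transcription(
    "Submit button. Press Enter to send the form.",
    "submit button press enter to send the farm",
)
```

Here one of eight words differs, so the similarity score falls below the assumed threshold and the report lists the single mismatch ("form" expected, "farm" heard).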

Facilitating creation and playback of user-recorded audio
11238854 · 2022-02-01

Methods, apparatus, and computer readable media are described that relate to recording, organizing, and making audio files available for consumption by voice-activated products. In various implementations, in response to receiving an input from a first user indicating that the first user intends to record audio content, audio content may be captured and stored. Input may be received from the first user indicating at least one identifier for the audio content. The stored audio content may be associated with the at least one identifier. A voice input may be received from a subsequent user. In response to determining that the voice input has particular characteristics, speech recognition may be biased, in respect of the voice input, towards recognition of the at least one identifier. In response to recognizing, based on the biased speech recognition, presence of the at least one identifier in the voice input, the stored audio content may be played.
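
One way to picture the biasing step is as a re-scoring of a recognizer's N-best list. The sketch below is an assumption-laden illustration, not the patented method: the `Hypothesis` type, the additive `bias` bonus, the "starts with a trigger word" test standing in for the "particular characteristics", and the file path are all hypothetical. Identifiers are assumed to be stored in lowercase.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float  # recognizer confidence; higher is better


def rescore(hypotheses, identifiers, bias=0.2):
    """Boost hypotheses that contain a stored identifier, then return the best one."""
    def biased(h):
        has_identifier = any(ident in h.text.lower() for ident in identifiers)
        return h.score + (bias if has_identifier else 0.0)
    return max(hypotheses, key=biased)


def handle_voice_input(hypotheses, library, trigger="play"):
    """Return the stored audio to play, or None if no identifier is recognized."""
    # Assumed "particular characteristic": the top hypothesis starts with a trigger word.
    top = max(hypotheses, key=lambda h: h.score)
    if not top.text.lower().startswith(trigger):
        return None
    best = rescore(hypotheses, library.keys())
    for identifier, audio_path in library.items():
        if identifier in best.text.lower():
            return audio_path
    return None


# Hypothetical library mapping a user-supplied identifier to stored audio.
library = {"grandma's bedtime story": "recordings/story_001.wav"}
n_best = [
    Hypothesis("play grand mass bedtime story", 0.55),
    Hypothesis("play grandma's bedtime story", 0.45),
]
result = handle_voice_input(n_best, library)
```

Without the bias the misrecognized first hypothesis would win. With it, the hypothesis containing the stored identifier is preferred and the matching recording is selected.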

METHOD AND APPARATUS FOR AUDIO PROCESSING, AND STORAGE MEDIUM
20220270627 · 2022-08-25

The present disclosure relates to a method and an apparatus for audio processing, and a storage medium. The method includes: obtaining an audio mixing feature of a target object, the audio mixing feature including at least a voiceprint feature and a pitch feature of the target object; and determining, according to the audio mixing feature, target audio in a mixed audio that matches the target object.
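
As a rough illustration of combining a voiceprint feature with a pitch feature, the sketch below scores already-separated candidate sources against a target. This is a deliberately simplified stand-in: the abstract does not say how matching is performed, and the feature vectors, the `pitch_weight` of 0.3, and the octave-based pitch similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pitch_similarity(f0_a, f0_b):
    """Map the pitch gap in octaves to (0, 1]; 1.0 means identical pitch."""
    return 1.0 / (1.0 + abs(math.log2(f0_a / f0_b)))


def match_target(target, candidates, pitch_weight=0.3):
    """Return the name of the candidate whose features best match the target."""
    def score(c):
        voice = cosine_similarity(target["voiceprint"], c["voiceprint"])
        pitch = pitch_similarity(target["pitch_hz"], c["pitch_hz"])
        # Weighted blend of the two features; the weighting is an assumption.
        return (1.0 - pitch_weight) * voice + pitch_weight * pitch
    return max(candidates, key=score)["name"]


# Toy feature values chosen only to make the example concrete.
target = {"voiceprint": [0.9, 0.1, 0.4], "pitch_hz": 210.0}
candidates = [
    {"name": "source_a", "voiceprint": [0.1, 0.9, 0.2], "pitch_hz": 110.0},
    {"name": "source_b", "voiceprint": [0.8, 0.2, 0.5], "pitch_hz": 200.0},
]
best = match_target(target, candidates)
```

Using both features helps when two speakers have similar voiceprints but different pitch ranges, or the reverse. A real system would more likely condition a separation model on these features than compare fixed vectors.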

Speech processing device, teleconferencing device, speech processing system, and speech processing method
11398220 · 2022-07-26

A speech processing method executes at least one of first speech processing and second speech processing. The first speech processing identifies a language based on speech, performs signal processing according to the identified language, and transmits the processed speech to a far-end side. The second speech processing identifies a language based on speech, receives the speech from the far-end side, and performs signal processing on the received speech according to the identified language.
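
The two processing paths differ only in where the language-dependent processing runs: before transmission, or after reception. A minimal sketch of that split follows. It assumes language identification has already produced a label, and it uses a pre-emphasis filter plus gain as a placeholder for the unspecified signal processing. The `PROFILES` table and every value in it are hypothetical.

```python
# Hypothetical per-language settings; the values are placeholders, not from the source.
PROFILES = {
    "en": {"gain": 1.0, "pre_emphasis": 0.95},
    "ja": {"gain": 1.2, "pre_emphasis": 0.90},
}
DEFAULT_PROFILE = {"gain": 1.0, "pre_emphasis": 0.0}


def process(samples, language):
    """Apply a pre-emphasis filter y[n] = x[n] - a*x[n-1], then a gain, chosen by language."""
    profile = PROFILES.get(language, DEFAULT_PROFILE)
    a, gain = profile["pre_emphasis"], profile["gain"]
    out, previous = [], 0.0
    for x in samples:
        out.append(gain * (x - a * previous))
        previous = x
    return out


def first_speech_processing(samples, language, send):
    """Near end: process according to the identified language, then transmit."""
    send(process(samples, language))


def second_speech_processing(received, language):
    """Far-end speech arrives unprocessed: process it locally after reception."""
    return process(received, language)


sent = []  # stands in for the network link to the far-end side
first_speech_processing([1.0, 1.0, 0.0], "en", sent.append)
```

An unrecognized language label falls back to `DEFAULT_PROFILE`, which leaves the samples unchanged.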
