Patent classifications
G10L19/173
Voice signal processing method, related apparatus, and system
The present disclosure describes a voice signal processing method that includes: receiving a first voice coded signal from a first terminal; performing voice decoding processing on the first voice coded signal to obtain a voice decoding parameter and a first voice decoded signal; performing, by using the voice decoding parameter, virtual bandwidth extension processing to obtain a bandwidth extension voice decoded signal corresponding to the first voice decoded signal; after combining the first voice decoded signal and the bandwidth extension voice decoded signal, performing voice coding processing to obtain a second voice coded signal; and sending the second voice coded signal to a second terminal that establishes a call connection to the first terminal, where a maximum frequency bandwidth supported by the first terminal is less than a maximum frequency bandwidth supported by the second terminal. Thus, the service quality of terminals that have asymmetric maximum frequency bandwidth support capabilities can be improved.
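The order of the steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name relay_voice, the codec callables decode, extend and encode, the sample-wise addition used for "combining", and the pass-through when the first terminal is not the narrower one are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Params = Dict[str, float]


def relay_voice(
    coded_in: bytes,
    decode: Callable[[bytes], Tuple[Params, Signal]],
    extend: Callable[[Params, Signal], Signal],
    encode: Callable[[Signal], bytes],
    first_max_hz: int,
    second_max_hz: int,
) -> bytes:
    """Hypothetical relay step between two terminals with unequal bandwidth."""
    # Assumption: when the first terminal is not the narrower one, the
    # sketch simply forwards the coded signal unchanged.
    if first_max_hz >= second_max_hz:
        return coded_in

    # Decoding yields both a decoding parameter and the decoded signal.
    params, decoded = decode(coded_in)

    # Virtual bandwidth extension is driven by the decoding parameter.
    extension = extend(params, decoded)

    # Assumption: "combining" is modelled as sample-wise addition of two
    # signals that already share one sample rate.
    combined = [a + b for a, b in zip(decoded, extension)]

    # The combined signal is re-encoded for the second terminal.
    return encode(combined)
```

The codec functions are passed in as arguments because the abstract does not name any particular codec or extension algorithm.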
Audio metadata providing apparatus and method, and multichannel audio data playback apparatus and method to support dynamic format conversion
An audio metadata providing apparatus and method and a multichannel audio data playback apparatus and method to support a dynamic format conversion are provided. Dynamic format conversion information may include information about a plurality of format conversion schemes that are used to convert a first format set by an author of multichannel audio data into a second format that is based on a playback environment of the multichannel audio data and that are each set for corresponding playback periods of the multichannel audio data. The audio metadata providing apparatus may provide audio metadata including the dynamic format conversion information. The multichannel audio data playback apparatus may identify the dynamic format conversion information from the audio metadata, may convert the first format of the multichannel audio data into the second format based on the identified dynamic format conversion information, and may play back the multichannel audio data in the second format.
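One way to picture dynamic format conversion information is as a list of playback periods, each carrying its own conversion scheme. The Python sketch below assumes that representation; the class ConversionPeriod, the function scheme_at, the use of seconds, and the scheme names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ConversionPeriod:
    """One hypothetical entry of the dynamic format conversion information."""
    start_s: float  # start of the playback period, in seconds
    end_s: float    # end of the playback period (exclusive), in seconds
    scheme: str     # name of the conversion scheme set for this period


def scheme_at(periods: List[ConversionPeriod], t: float) -> Optional[str]:
    """Return the conversion scheme whose playback period contains time t."""
    for period in periods:
        if period.start_s <= t < period.end_s:
            return period.scheme
    return None  # no scheme was set for this playback time


# Illustrative metadata: two periods with different (made-up) schemes.
metadata = [
    ConversionPeriod(0.0, 30.0, "downmix_a"),
    ConversionPeriod(30.0, 95.5, "downmix_b"),
]
```

A playback apparatus built along these lines would call scheme_at with the current playback position and apply the returned scheme when converting from the first format to the second.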
System and a method for selecting a ring back tone to be provided to a caller
The present subject matter relates to a method and a system for selecting a ring back tone to be provided to a caller. The method comprises: selecting, by a subscriber, an editable audio file with a predetermined duration; encrypting the selected editable audio file on a storage device; communicating, by a communication interface, a start time of the selected editable audio file to a server, wherein the server transcodes and smooths the selected portion of the editable audio file; and transferring, by the server, the transcoded audio file to the mobile operator network as a ring back tone.
SYSTEM, CONTROL METHOD, AND CONTROL TERMINAL
Provided is a system including: a first reproduction device including a first signal processor and being configured to reproduce audio data; and a second reproduction device configured to reproduce the audio data, wherein the first reproduction device is configured to: receive the audio data from a content server in response to a request for reproduction of the audio data by the second reproduction device; cause the first signal processor to execute predetermined signal processing on the received audio data; and transfer the audio data subjected to the predetermined signal processing to the second reproduction device.
Audio codec extension
An apparatus comprising means configured to: receive a primary track comprising at least one audio signal; receive at least one secondary track, each of the at least one secondary track comprising at least one audio signal, wherein the at least one secondary track is based on the primary track; and decode and render the primary track and the at least one secondary track using spatial audio decoding.
Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding
An apparatus for generating a description of a combined audio scene includes: an input interface for receiving a first description of a first scene in a first format and a second description of a second scene in a second format, wherein the second format is different from the first format; a format converter for converting the first description into a common format and for converting the second description into the common format, when the second format is different from the common format; and a format combiner for combining the first description in the common format and the second description in the common format to obtain the combined audio scene.
System and method for external communications to/from radios in a radio network
Embodiments of the invention provide a system and method that enables multiple networks, themselves based on different underlying technologies, to be combined into a larger network that provides seamless communications between a plurality of communication networks. For example, embodiments of the invention enable users in a radio network (e.g., push-to-talk radio systems) to broadcast messages to users in a non-radio network (e.g., users on personal computers or mobile phones) with the radio message translated into text for transmission to the non-radio network. Similarly, text messages originating from users in a non-radio network (e.g., users on personal computers or mobile phones) may be communicated to users in radio networks with the text messages translated into audio for broadcast on the radio network.
SOUND FILE SOUND QUALITY IDENTIFICATION METHOD AND APPARATUS
A sound file sound quality identification method is provided. The method includes converting a format of a to-be-identified sound file into a preset reference audio format; performing framing on the sound file to obtain a plurality of frames; and performing Fourier transformation processing on the to-be-identified sound file to obtain a spectrum of each frame. The method also includes performing model matching according to the spectrum of each frame of the to-be-identified sound file to obtain a preliminary classification result of the to-be-identified sound file; determining an energy change point of the to-be-identified sound file according to the spectrum of each frame; and determining a sound quality of the to-be-identified sound file according to the preliminary classification result of the to-be-identified sound file and the energy change point of the to-be-identified sound file.
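The framing, per-frame spectrum and energy change point steps can be sketched in Python using only the standard library. The abstract does not define the energy change point, so this sketch assumes it is the highest frequency bin whose mean power stays above a fraction of the peak. That definition, the function names, the non-overlapping frames and the naive DFT are all illustrative choices, not the patented method, and the model matching step is left out.

```python
import cmath
import math
from typing import List


def split_frames(samples: List[float], frame_len: int) -> List[List[float]]:
    """Cut the signal into non-overlapping frames, dropping any short tail."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def power_spectrum(frame: List[float]) -> List[float]:
    """Naive DFT power for bins 0..N/2 (fine for a sketch, slow in practice)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        powers.append(abs(acc) ** 2)
    return powers


def energy_change_bin(spectra: List[List[float]], ratio: float = 1e-3) -> int:
    """Highest bin whose mean power is at least `ratio` times the peak mean."""
    bins = len(spectra[0])
    mean = [sum(s[k] for s in spectra) / len(spectra) for k in range(bins)]
    peak = max(mean)
    return max(k for k, p in enumerate(mean) if p >= ratio * peak)


# Illustrative input: a pure tone that falls exactly on bin 4 of a 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * i / 32) for i in range(128)]
frames = split_frames(tone, 32)
spectra = [power_spectrum(f) for f in frames]
change_bin = energy_change_bin(spectra)
```

For the pure tone above, all energy sits in bin 4, so that bin is reported as the change point. A file whose spectrum stops well below the limit of its container format would show a correspondingly low bin.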
LAYERED INTERMEDIATE COMPRESSION FOR HIGHER ORDER AMBISONIC AUDIO DATA
In general, techniques are described for performing layered intermediate compression for higher order ambisonic (HOA) audio data. A device comprising a memory and a processor may be configured to perform the techniques. The memory may store HOA coefficients of the HOA audio data. The processor may decompose the HOA coefficients into a predominant sound component and a corresponding spatial component. The spatial component may be representative of the directions, shape, and width of the predominant sound component, and defined in the spherical harmonic domain. The processor may specify, in a bitstream conforming to an intermediate compression format, a subset of the HOA coefficients that represent an ambient component. The processor may also specify, in the bitstream and irrespective of a determination of a minimum number of ambient channels and a number of elements to specify in the bitstream for the spatial component, all elements of the spatial component.
Methods and apparatus to perform audio watermarking and watermark detection and extraction
Methods and apparatus to perform audio watermarking and watermark detection and extraction are described herein. According to an example method, an identifier is encoded in media content when a different identifier has been previously encoded. According to another example method, messages decoded from media content are validated to provide improved decoding accuracy. In another example method, decoded symbols are stored in memory and synchronization symbols are located to detect a message encoded in media content.
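The last example method, locating synchronization symbols in a buffer of stored decoded symbols, can be sketched as a simple pattern search. The function find_message, the integer symbol alphabet, the fixed message length and the sample values are assumptions for illustration; the abstract does not specify the symbol format or the search strategy.

```python
from typing import List, Optional, Sequence


def find_message(
    symbols: Sequence[int], sync: Sequence[int], length: int
) -> Optional[List[int]]:
    """Return the `length` symbols that follow the first full sync pattern."""
    n = len(sync)
    # Only consider positions that leave room for the sync pattern
    # and a complete message after it.
    for start in range(len(symbols) - n - length + 1):
        if list(symbols[start:start + n]) == list(sync):
            begin = start + n
            return list(symbols[begin:begin + length])
    return None  # no sync pattern followed by a complete message


# Illustrative buffer of stored decoded symbols with a made-up sync pattern.
stored = [3, 1, 7, 7, 2, 5, 9, 4, 0]
message = find_message(stored, sync=[7, 7, 2], length=3)
```

Here the sync pattern starts at index 2, so the three symbols after it are returned as the message. A buffer that ends before a full message has arrived yields None, which lets a caller wait for more symbols.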