
SYSTEMS AND METHODS FOR REDUCING TRANSCODING RESOURCE ALLOCATION DURING CALL SETUP TO MULTIPLE TERMINATIONS

In some implementations, an application server may receive, from a calling party user equipment, a call for a called party associated with multiple user equipment. The application server may provide to the multiple user equipment, and based on the call, a request for transcoding information associated with the multiple user equipment. The application server may assign a transcoding resource for handling the call, wherein the transcoding resource is provided in a network. The application server may receive, based on the request, the transcoding information from a particular user equipment of the multiple user equipment. The application server may provide the transcoding information to the transcoding resource, wherein the transcoding information causes the transcoding resource to establish and transcode the call between the calling party user equipment and the particular user equipment.
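The flow above can be sketched in a few lines. This is a minimal, hypothetical model: the class names (`AppServer`, `TranscodingResource`, `UserEquipment`), the codec string, and the first-responder selection rule are all illustrative assumptions, not details from the disclosure.

```python
class UserEquipment:
    """Hypothetical called-party UE; codec of None means it cannot answer."""
    def __init__(self, name, codec=None):
        self.name, self.codec = name, codec

    def request_transcoding_info(self):
        # A UE that can take the call reports its transcoding information.
        return {"codec": self.codec} if self.codec else None


class TranscodingResource:
    """Network-side transcoding resource, assigned once per call."""
    def establish(self, caller, callee, info):
        # Use the callee-reported transcoding info to bridge the call.
        return (caller, callee.name, info["codec"])


class AppServer:
    def setup_call(self, caller, callee_devices):
        # Assign one network transcoding resource for the call up front,
        # rather than one per candidate termination.
        resource = TranscodingResource()
        # Request transcoding information from every called-party UE;
        # the particular UE that responds is the one that takes the call.
        for ue in callee_devices:
            info = ue.request_transcoding_info()
            if info:
                # Forward its transcoding info so the single resource can
                # establish and transcode the call.
                return resource.establish(caller, ue, info)
        return None
```

The design point is that only one transcoding resource is reserved regardless of how many terminations are alerted, which is the resource reduction the title refers to.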

Method and system for implementing split and parallelized encoding or transcoding of audio and video content

Novel tools and techniques are provided for implementing split and parallelized encoding or transcoding of audio and video. In various embodiments, a computing system might split an audio-video file that is received from a content source into a single video file and a single audio file. The computing system might encode or transcode the single audio file. Concurrently, the computing system might split the single video file into a plurality of video segments. A plurality of parallel video encoders/transcoders might concurrently encode or transcode the plurality of video segments, each video encoder/transcoder encoding or transcoding one video segment of the plurality of video segments. Subsequently, the computing system might assemble the plurality of encoded or transcoded video segments with the encoded or transcoded audio file to produce an encoded or transcoded audio-video file, which may be output to one or more display devices, one or more audio playback devices, or the like.
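The pipeline shape can be illustrated with a thread pool: the audio track is encoded while the video segments are transcoded in parallel, then the results are reassembled. Here "segments" are plain strings and `transcode` is a stand-in for a real encoder invocation (e.g. an ffmpeg call), which is an assumption for the sake of a runnable sketch.

```python
from concurrent.futures import ThreadPoolExecutor

def transcode(segment):
    # Stand-in for a real per-segment encoder/transcoder invocation.
    return segment.upper()

def transcode_av(audio, video_segments):
    with ThreadPoolExecutor() as pool:
        # Encode the single audio file concurrently with the video work.
        audio_future = pool.submit(transcode, audio)
        # Each parallel worker transcodes exactly one video segment;
        # pool.map preserves segment order for reassembly.
        encoded_video = list(pool.map(transcode, video_segments))
        encoded_audio = audio_future.result()
    # Assemble the encoded segments with the encoded audio track.
    return encoded_audio, "".join(encoded_video)
```

Because each segment is independent, wall-clock time approaches the duration of the longest segment plus assembly, rather than the sum of all segments.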

APPARATUS, METHOD AND COMPUTER PROGRAM FOR ENCODING, DECODING, SCENE PROCESSING AND OTHER PROCEDURES RELATED TO DIRAC BASED SPATIAL AUDIO CODING

An apparatus for generating a description of a combined audio scene, includes: an input interface for receiving a first description of a first scene in a first format and a second description of a second scene in a second format, wherein the second format is different from the first format; a format converter for converting the first description into a common format and for converting the second description into the common format, when the second format is different from the common format; and a format combiner for combining the first description in the common format and the second description in the common format to obtain the combined audio scene.
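The converter/combiner chain can be sketched as follows. Treating "B-format" as the common format and tagging payload strings as a stand-in for actual signal conversion are both illustrative assumptions; the abstract does not specify the common format.

```python
COMMON = "b-format"  # assumed common format for this sketch

def convert(description):
    # Format converter: pass descriptions already in the common format
    # through unchanged; otherwise convert (here: tag) the payload.
    fmt, payload = description
    if fmt == COMMON:
        return payload
    return f"{payload}@{COMMON}"

def combine_scenes(first, second):
    # Format combiner: both descriptions are in the common format by the
    # time they are combined into one audio scene.
    return convert(first) + "+" + convert(second)
```

The key structural point survives even in this toy form: combination happens only after both descriptions are normalized, so the combiner never needs to understand the input formats.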

DYNAMIC DECODER CONFIGURATION FOR LIVE TRANSCODING
20220150514 · 2022-05-12

A method and system for managing transcoding of data in a stream that includes identifying an input source change for the stream with a new input source type, and adding a decoder for the new input source type, the decoder configured to output for a respective encoder in a transcoder pipeline.
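A minimal sketch of that decoder swap, assuming a lookup table of decoder factories keyed by input source type; the codec names and the string-based "frames" are placeholders, not part of the disclosure.

```python
# Hypothetical decoder registry: one decoder factory per input source type.
DECODERS = {"h264": lambda b: f"h264:{b}", "av1": lambda b: f"av1:{b}"}

class TranscoderPipeline:
    def __init__(self, source_type):
        self.source_type = source_type
        self.decoder = DECODERS[source_type]

    def feed(self, source_type, data):
        if source_type != self.source_type:
            # Input source change identified: add a decoder for the new
            # type, configured to output to the same downstream encoder.
            self.source_type = source_type
            self.decoder = DECODERS[source_type]
        return self.encode(self.decoder(data))

    def encode(self, frame):
        # Stand-in for the respective encoder in the transcoder pipeline.
        return frame.upper()
```

Only the decoder stage changes on a source switch; the encoder side of the pipeline keeps running, which is what makes the reconfiguration viable for live streams.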

TRANSFORM AMBISONIC COEFFICIENTS USING AN ADAPTIVE NETWORK FOR PRESERVING SPATIAL DIRECTION

A device includes a memory configured to store untransformed ambisonic coefficients at different time segments. The device includes one or more processors configured to obtain the untransformed ambisonic coefficients at the different time segments, where the untransformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments. The one or more processors are configured to apply one adaptive network, based on a constraint that includes preservation of a spatial direction of one or more audio sources in the soundfield at the different time segments, to the untransformed ambisonic coefficients at the different time segments to generate transformed ambisonic coefficients at the different time segments, wherein the transformed ambisonic coefficients at the different time segments represent a soundfield at the different time segments that has been modified based on the constraint. The one or more processors are also configured to apply an additional adaptive network.

Loudness control methods and devices

Audio data in a first format may be processed to produce audio data in a second format, which may be a reduced or simplified version of the first format. A loudness correction process may produce loudness-corrected audio data in the second format. A first power of the audio data in the second format and a second power of the loudness-corrected audio data in the second format may be determined. A second-format loudness correction factor for the audio data in the second format may be based, at least in part, on a power ratio between the first power and the second power. A first-format loudness correction factor for the audio data in the first format may be based, at least in part, on the power ratio and a power relationship between the audio data in the first format and the audio data in the second format.
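The power-ratio computation can be made concrete with a small numeric sketch. Plain lists stand in for the audio data, and the direction of the ratio (corrected power over uncorrected power) and the square-root mapping from power to an amplitude correction factor are assumptions for illustration; the abstract specifies only that the factors are based on the ratio and the inter-format power relationship.

```python
import math

def power(samples):
    # Mean-square power of a block of samples.
    return sum(s * s for s in samples) / len(samples)

def correction_factors(second_fmt, corrected_second_fmt, power_relationship):
    # Power ratio between the loudness-corrected and uncorrected
    # second-format audio data.
    ratio = power(corrected_second_fmt) / power(second_fmt)
    # Second-format factor derives from the ratio alone; the first-format
    # factor also folds in the first/second-format power relationship.
    second_factor = math.sqrt(ratio)
    first_factor = math.sqrt(ratio * power_relationship)
    return first_factor, second_factor
```

The appeal of this structure is that the expensive loudness correction runs only on the reduced second-format data, and the first-format correction is derived from it via the power relationship.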

METHODS, APPARATUS AND ARTICLES OF MANUFACTURE TO IDENTIFY SOURCES OF NETWORK STREAMING SERVICES
20230252999 · 2023-08-10

Methods, apparatus and articles of manufacture to identify sources of network streaming services are disclosed. An example apparatus includes a coding format identifier to identify, from a received first audio signal representing a decompressed second audio signal, an audio compression configuration used to compress a third audio signal to form the second audio signal, and a source identifier to identify a source of the second audio signal based on the identified audio compression configuration.
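The two-stage identification can be sketched as a coding-format identifier feeding a source lookup. The signature detection here is a trivial stand-in (a real identifier would detect compression artifacts left in the decoded signal), and the source table is entirely hypothetical.

```python
# Hypothetical mapping from (codec, sample rate) configurations to the
# streaming services known to use them.
SOURCES = {("aac", 48000): "service-a", ("mp3", 44100): "service-b"}

def identify_coding_config(decoded_signal):
    # Stand-in coding format identifier: a real one would infer the
    # compression configuration from artifacts in the decompressed audio
    # (e.g. block boundaries or band-replication traces), since the
    # received signal is already decoded.
    return decoded_signal["codec"], decoded_signal["rate"]

def identify_source(decoded_signal):
    # Source identifier: map the detected configuration to a service.
    config = identify_coding_config(decoded_signal)
    return SOURCES.get(config, "unknown")
```

The notable constraint is that identification works on the decompressed signal, so the compression configuration must be inferred rather than read from a bitstream header.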

Methods, encoder and decoder for linear predictive encoding and decoding of sound signals upon transition between frames having different sampling rates
11721349 · 2023-08-08

Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of a LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.
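The four steps map directly onto a short numerical sketch: sample the power spectrum of the all-pole synthesis filter 1/A(z), truncate it to the new Nyquist range when down-converting, inverse transform to autocorrelations, and run Levinson-Durbin. The grid size and the simple bin-truncation spectrum modification are simplifying assumptions, not the codec's exact procedure.

```python
import numpy as np

def levinson(r, order):
    # Levinson-Durbin recursion: autocorrelations -> LP coefficients a[0..order].
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        k = -np.dot(a[:i], r[i:0:-1]) / err
        a[:i + 1] += k * a[i::-1].copy()  # copy() avoids the overlapping-slice alias
        err *= 1.0 - k * k
    return a

def convert_lp(a_s1, s2_over_s1, order, n_half=256):
    # 1. Power spectrum of the LP synthesis filter 1/A(z) at rate S1,
    #    sampled on the rfft grid of a length-2*(n_half-1) real signal.
    w = np.linspace(0.0, np.pi, n_half)
    response = np.exp(-1j * np.outer(w, np.arange(len(a_s1)))) @ a_s1  # A(e^{jw})
    power = 1.0 / np.abs(response) ** 2
    # 2. Modify the power spectrum to rate S2: when down-converting, keep
    #    only the bins below the new Nyquist frequency (a simplification).
    keep = int(round(n_half * min(s2_over_s1, 1.0)))
    # 3. Inverse transform the modified power spectrum to obtain the
    #    autocorrelations of the synthesis filter at rate S2.
    r = np.fft.irfft(power[:keep])[:order + 1]
    # 4. Recompute the LP filter parameters at S2 from the autocorrelations.
    return levinson(r, order)
```

With `s2_over_s1 = 1.0` the round trip should approximately recover the input filter, which makes a convenient sanity check: the first-order filter A(z) = 1 - 0.5 z^-1 comes back with a coefficient near -0.5.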

Sound quality detection method and device for homologous audio and storage medium

Provided is a sound quality detection method, including: acquiring a plurality of audio files to be detected, wherein the plurality of audio files are homologous audio files; acquiring at least one audio feature of each of the plurality of audio files by performing feature extraction on each audio file, and generating a correspondence list between the at least one audio feature of each of the plurality of audio files and an audio file identifier; and determining, using a sound quality detection model, a sound quality score of each of the plurality of audio files based on the correspondence list between the at least one audio feature of each of the plurality of audio files and the audio file identifier, wherein the sound quality detection model is configured to detect sound quality of homologous audio files.
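The scoring flow above can be sketched end to end. The chosen features (bandwidth, bitrate) and the weighted-sum "model" are illustrative stand-ins for the trained sound quality detection model; only the pipeline shape (extract features per file, build the feature/identifier correspondence list, score per identifier) follows the abstract.

```python
def extract_features(audio):
    # Stand-in feature extraction for one homologous audio file.
    return {"bandwidth": audio["bandwidth"], "bitrate": audio["bitrate"]}

def score(features):
    # Trivial stand-in for the sound quality detection model: a weighted
    # sum of features normalized to assumed full-quality reference values.
    return 0.5 * features["bandwidth"] / 20000 + 0.5 * features["bitrate"] / 320

def detect(files):
    # Build the correspondence list between each file's features and its
    # identifier, then score every file from that list.
    correspondence = [(f["id"], extract_features(f)) for f in files]
    return {fid: round(score(feat), 3) for fid, feat in correspondence}
```

Because the inputs are homologous (same underlying content), relative scores are meaningful even with crude features: the file that kept more bandwidth and bitrate from the common source ranks higher.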