
HARMONY-AWARE HUMAN MOTION SYNTHESIS WITH MUSIC
20230005201 · 2023-01-05 ·

A method and device for harmony-aware audio-driven motion synthesis are provided. The method includes determining a plurality of testing meter units according to an input audio, each testing meter unit corresponding to an input audio sequence of the input audio, obtaining an auditory input corresponding to each testing meter unit, obtaining an initial pose of each testing meter unit as a visual input based on a visual motion sequence synthesized for a previous testing meter unit, and automatically generating a harmony-aware motion sequence corresponding to the input audio using a generator of a generative adversarial network (GAN) model. The GAN model is trained by incorporating a hybrid loss function. The hybrid loss function includes a multi-space pose loss, a harmony loss, and a GAN loss. The harmony loss is determined according to beat consistencies of audio-visual beat pairs.
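
The hybrid loss described above could be combined as a weighted sum of its three terms. A minimal sketch follows; the function names, weights, and the beat-tolerance rule for "beat consistency" are illustrative assumptions, not values taken from the disclosure:

```python
import numpy as np

def multi_space_pose_loss(pred_pose, true_pose):
    # L2 distance between predicted and ground-truth poses.
    # (The "multi-space" variant would also compare derived spaces
    # such as joint velocities; shown here for positions only.)
    return float(np.mean((pred_pose - true_pose) ** 2))

def harmony_loss(audio_beats, visual_beats, tolerance=0.1):
    # Penalize audio beats lacking a nearby visual (motion) beat:
    # an audio-visual beat pair counts as consistent when the two
    # beat times fall within `tolerance` seconds of each other.
    penalties = []
    for t_a in audio_beats:
        nearest = min(abs(t_a - t_v) for t_v in visual_beats)
        penalties.append(max(0.0, nearest - tolerance))
    return float(np.mean(penalties))

def hybrid_loss(pred_pose, true_pose, audio_beats, visual_beats,
                gan_loss, w_pose=1.0, w_harmony=0.5, w_gan=0.1):
    # Weighted sum of the three terms named in the abstract;
    # the weights are hypothetical tuning parameters.
    return (w_pose * multi_space_pose_loss(pred_pose, true_pose)
            + w_harmony * harmony_loss(audio_beats, visual_beats)
            + w_gan * gan_loss)
```

In this sketch the harmony term goes to zero when every audio beat is matched by a motion beat within the tolerance window, which is one simple way to encode the "beat consistency" the abstract refers to.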

Audiovisual content rendering with display animation suggestive of geolocation at which content was previously rendered

Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.
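
In its simplest form, the pitch correction described above snaps a detected fundamental frequency to the nearest equal-tempered semitone. The sketch below assumes 12-tone equal temperament and a 440 Hz reference; production implementations additionally smooth transitions and may target a song-specific scale:

```python
import math

def correct_pitch_hz(f0_hz, a4_hz=440.0):
    # Snap a detected fundamental frequency (Hz) to the nearest
    # 12-TET semitone relative to the A4 reference. `a4_hz` is an
    # assumed default, not a value from the source.
    if f0_hz <= 0:
        return f0_hz  # unvoiced frame: leave untouched
    semitones = 12.0 * math.log2(f0_hz / a4_hz)
    return a4_hz * 2.0 ** (round(semitones) / 12.0)
```

For example, a slightly sharp 445 Hz frame would be pulled back to 440 Hz, while unvoiced (zero-frequency) frames pass through unchanged.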

DEEP LEARNING SYSTEM FOR DETERMINING AUDIO RECOMMENDATIONS BASED ON VIDEO CONTENT
20220414381 · 2022-12-29 ·

Embodiments are disclosed for performing audio signal processing effects using parameters determined by a deep learning model. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an unprocessed audio sequence and a request to perform an audio signal processing effect on the unprocessed audio sequence. The one or more embodiments further include analyzing, by a deep encoder, the unprocessed audio sequence to determine parameters for processing it. The one or more embodiments further include sending the unprocessed audio sequence and the parameters to one or more audio signal processing effects plugins, which perform the requested audio signal processing effect using the parameters, and outputting a processed audio sequence.
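
The encoder-to-plugin handoff described above can be pictured as a small pipeline: predicted parameters are passed to each effect plugin in turn. The callable plugin interface below is an assumption for illustration, not the disclosed API:

```python
def apply_effect_chain(audio, params, plugins):
    # Feed encoder-predicted parameters to each effect plugin in
    # sequence. `plugins` is assumed to be a list of callables with
    # signature (audio, params) -> processed_audio.
    processed = audio
    for plugin in plugins:
        processed = plugin(processed, params)
    return processed
```

A trivial gain plugin, for instance, would multiply each sample by a `gain` parameter supplied by the encoder.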

LYRIC VIDEO DISPLAY METHOD AND DEVICE, ELECTRONIC APPARATUS AND COMPUTER-READABLE MEDIUM
20220394325 · 2022-12-08 ·

The present disclosure provides a lyric video display method and device, an electronic apparatus, and a computer-readable medium. The method includes: playing, based on a lyric video display operation of a user, multimedia data and music data to be displayed, the multimedia data including image data, and the music data including audio data and lyrics; determining a target time point, determining a target object in a target image of the image data corresponding to the target time point, and determining target lyrics in the lyrics corresponding to the target time point; and displaying the target lyrics within a preset range of a position of the target object in the target image, and adjusting display effects of the target lyrics based on depth information of the target object, while playing a part of the audio data corresponding to the target lyrics.
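
One concrete way to adjust a display effect from depth information, as described above, is to scale the lyric text inversely with the target object's depth so nearer objects get larger captions. The perspective-style 1/depth rule and parameter names below are assumed choices, not taken from the disclosure:

```python
def lyric_font_size(depth_m, base_size=32.0, ref_depth=1.0):
    # Scale the lyric font inversely with the target object's depth
    # (meters), clamping the denominator to avoid division by zero.
    # `base_size` and `ref_depth` are illustrative defaults.
    return base_size * ref_depth / max(depth_m, 1e-6)
```

With these defaults, an object at 1 m yields a 32-point caption and one at 2 m yields 16 points.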

Animation effect attachment based on audio characteristics
11521341 · 2022-12-06 ·

Systems and methods for rendering a video effect to a display are described. More specifically, video data and audio data are obtained. The video data is analyzed to determine one or more attachment points of a target object that appears in the video data. The audio data is analyzed to determine audio characteristics. A video effect associated with an animation to be added to the one or more attachment points is determined based on the audio characteristics. A rendered video is generated by applying the video effect to the video data.
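
Selecting a video effect from audio characteristics, as described above, could be as simple as a rule table keyed on tempo and energy. The preset names and thresholds in this sketch are illustrative assumptions, not values from the patent:

```python
def choose_effect(tempo_bpm, energy):
    # Pick an animation preset for the attachment points from coarse
    # audio characteristics (tempo in BPM, energy normalized 0..1).
    # Preset names and cutoffs are hypothetical.
    if tempo_bpm >= 140 and energy > 0.6:
        return "strobe_burst"
    if tempo_bpm >= 100:
        return "pulse"
    return "slow_glow"
```

The chosen preset would then be rendered as an animation anchored to the attachment points detected in the video data.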

AUTOMATIC DISPLAY MODULATION BASED ON AUDIO ARTIFACT COMPOSITION
20220375429 · 2022-11-24 ·

The disclosed technology provides solutions for enhancing a user's experience of content playback, such as a user viewing multimedia content (e.g., a music video) on a mobile device. In some aspects, a process of the disclosed technology can include steps for receiving a mean energy curve associated with a sound file and dynamically modulating a brightness level of displayed content based on audio properties of the mean energy curve, whereby the average brightness experienced over playback of the displayed content is equal to a default brightness of the display.
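
The constraint above (time-averaged brightness equals the default) is satisfied by offsetting brightness by the energy curve's deviation from its own mean. A minimal sketch, where `depth` is an assumed modulation-strength parameter:

```python
import numpy as np

def modulate_brightness(energy_curve, default_brightness=0.5, depth=0.3):
    # Offset brightness by each sample's deviation from the mean
    # energy, so the pre-clip average brightness equals the display
    # default. `depth` scales how strongly brightness follows audio;
    # with modest depth the 0..1 clip rarely engages.
    e = np.asarray(energy_curve, dtype=float)
    b = default_brightness + depth * (e - e.mean())
    return np.clip(b, 0.0, 1.0)
```

Because the mean of `e - e.mean()` is zero, the averaging property holds exactly whenever no sample is clipped.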

ARTIFICIAL INTELLIGENCE MODELS FOR COMPOSING AUDIO SCORES
20220366881 · 2022-11-17 ·

A method for training one or more AI models for generating audio scores accompanying visual datasets includes obtaining training data comprising a plurality of audiovisual datasets and analyzing each of the plurality of audiovisual datasets to extract multiple visual features, textual features, and audio features. The method also includes correlating the multiple visual features and textual features with the multiple audio features via a machine learning network. Based on the correlations between the visual features, textual features, and audio features, one or more AI models are trained for composing one or more audio scores for accompanying a given dataset.

FLAME THROWER PIPE ORGAN
20220358901 · 2022-11-10 ·

A micro flame effects unit (MFEU) includes a solenoid valve, an ignition coil, and a pair of electrodes. The ignition coil is to receive a voltage to create a spark between end portions of the electrodes. The solenoid valve is to receive a voltage to release gas, the spark and the gas to create a flame effect. An example flame thrower pipe organ (FTPO) includes a set of cylindrical members with a micro flame effects unit (MFEU) located within each cylindrical member. The FTPO further includes a set of keys that provide a sound when a key of the set of keys is played. A control device causes an MFEU to release a flame effect concurrent with the sound of a corresponding key being played. The FTPO may be automated with a self-playing system.

SYSTEM AND METHOD FOR GENERATING HARMONIOUS COLOR SETS FROM MUSICAL INTERVAL DATA
20230096679 · 2023-03-30 ·

Systems and methods are disclosed for generating color sets based on the musical concepts of pitch intervals and harmony. Color sets are derived via a music-to-hue process which analyzes musical pitch data associated with musical input to determine pitch intervals included in the music. Pitch interval angles associated with the pitch intervals are applied to a tuned hue index to identify hue notes, ordered within the index, which are separated by a hue interval angle similar to the pitch interval angle associated with the analyzed pitch data. The systems and methods provide for the creation of color sets which are analogous to musical chords in that they include multiple hue notes selected based on hue interval angles derived from the musical interval angles associated with the received musical input.
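
A natural instance of the music-to-hue mapping described above treats the 12-semitone octave as a full 360° turn of the hue wheel, i.e. 30° of hue per semitone. That mapping constant is an illustrative assumption; the patent's "tuned hue index" may space hue notes differently:

```python
def chord_to_hues(root_hue_deg, semitone_intervals):
    # Map pitch intervals (in semitones above the root) to hue notes
    # on the color wheel, assuming 12 semitones span 360° of hue.
    deg_per_semitone = 360.0 / 12.0
    return [(root_hue_deg + deg_per_semitone * s) % 360.0
            for s in semitone_intervals]
```

Under this assumption a major triad (intervals 0, 4, 7) rooted at hue 0° yields the "color chord" {0°, 120°, 210°}, and an octave maps back to the root hue.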

Audio-visual effects system for augmentation of captured performance based on content thereof

Visual effects schedules are applied to audiovisual performances with differing visual effects applied in correspondence with differing elements of musical structure. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects schedules are mood-denominated and may be selected by a performer as a component of his or her visual expression or determined from an audiovisual performance using machine learning techniques.
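
A mood-denominated effects schedule, as described above, can be modeled as a lookup from (mood, segment label) to an effect. The mood names, segment labels, and effect names below are all illustrative assumptions:

```python
def schedule_effects(segments, mood):
    # Assign a visual effect to each musical segment according to a
    # mood-denominated schedule. `segments` is assumed to be a list
    # of (start, end, label) tuples from audio segmentation.
    schedules = {
        "energetic": {"verse": "light_pulse", "chorus": "strobe",
                      "bridge": "color_sweep"},
        "mellow": {"verse": "soft_fade", "chorus": "bloom",
                   "bridge": "slow_pan"},
    }
    table = schedules[mood]
    return [(start, end, table.get(label, "none"))
            for start, end, label in segments]
```

The performer-selected mood picks a table, and segmentation of the audio tracks supplies the labeled spans that the table maps to concrete effects.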