Patent classifications
G10L21/12
AUDIO WAVEFORM DISPLAY USING MAPPING FUNCTION
The described technology is generally directed towards providing a visible waveform representation of an audio signal, by processing the audio signal with a polynomial (e.g., cubic) mapping function. Coefficients of the polynomial mapping function are predetermined based on constraints (e.g., slope information and desired range of the resultant curve), and whether the plotted audio waveform corresponds to sound field quantities or power quantities. Once the visible representation of the reshaped audio waveform is displayed, audio and/or video editing operations can be performed, e.g., by time-aligning other audio or video with the reshaped audio waveform, and/or modifying the reshaped audio waveform to change the underlying audio data.
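As a rough illustration of the cubic mapping idea, the sketch below derives coefficients from slope and range constraints and applies the result to normalized samples. The specific constraints (the curve passes through (0, 0) and (1, 1) with chosen end slopes), the function names, and the slope values 3.0 and 0.0 are assumptions made for this example and are not taken from the abstract.

```python
def cubic_coefficients(slope_at_zero, slope_at_one):
    """Coefficients of f(m) = a*m^3 + b*m^2 + c*m + d with f(0) = 0, f(1) = 1,
    f'(0) = slope_at_zero and f'(1) = slope_at_one (assumed constraints)."""
    a = slope_at_zero + slope_at_one - 2.0
    b = 3.0 - 2.0 * slope_at_zero - slope_at_one
    c = slope_at_zero
    d = 0.0
    return a, b, c, d


def reshape_sample(x, coeffs):
    """Map one sample in [-1, 1] through the cubic, preserving its sign."""
    a, b, c, d = coeffs
    m = min(abs(x), 1.0)                   # clamp the magnitude to the mapped range
    y = ((a * m + b) * m + c) * m + d      # Horner evaluation of the cubic
    return y if x >= 0 else -y


# Coefficients are computed once, ahead of time, as the abstract suggests.
# A steep slope near zero makes quiet passages more visible in the plot.
COEFFS = cubic_coefficients(3.0, 0.0)


def reshape_waveform(samples, coeffs=COEFFS):
    """Return display values for a sequence of normalized samples."""
    return [reshape_sample(s, coeffs) for s in samples]
```

With these assumed slopes the mapping reduces to 1 - (1 - m)^3, so a sample at half scale is drawn at 0.875 of full height while full-scale samples are unchanged.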
DISPLAYING ENHANCEMENT ITEMS ASSOCIATED WITH AN AUDIO RECORDING
Systems, methods, and software are disclosed herein for displaying visual representations of entities associated with an audio recording. A waveform associated with an audio recording is presented in a user interface to an application. A set of entities associated with the audio recording is then identified. Responsive to identifying the set of entities, a set of enhancement items associated with the set of entities is presented. In response to a selection of a given one of the enhancement items, a visual representation of an associated one of the entities is presented in the user interface to the application.
CALL RECORDING SYSTEM AND METHOD OF REPRODUCING RECORDED CALL
Sections of a call that are unnecessary for reproduction are identified without wasting time and labor. The call recording system includes a call information entry unit to enter operation information of call terminals, which is acquired by a call control unit, into a terminal operation information table; a recorded information entry unit to enter recorded information of the call, which is acquired by a call recording unit, into a recorded information table; and a call information reproduction unit to recognize sections unnecessary for reproduction of the recorded information based on the operation information of the call terminals, and to display, on a display section, a reproduction screen that shows the recognized sections.
METHOD AND APPARATUS FOR SOUND EVENT DETECTION ROBUST TO FREQUENCY CHANGE
Disclosed is a sound event detecting method including receiving an audio signal, transforming the audio signal into a two-dimensional (2D) signal, extracting a feature map by training a convolutional neural network (CNN) using the 2D signal, pooling the feature map based on a frequency, and determining whether a sound event occurs with respect to each of at least one time interval based on a result of the pooling.
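The pooling and decision steps can be sketched as follows. This is only an illustration: the CNN is not implemented, a hand-built array stands in for its feature map, and the linear scoring layer, its weights, and the 0.5 threshold are assumptions for the example rather than details from the abstract.

```python
import numpy as np


def pool_over_frequency(feature_map):
    """Max-pool a (channels, freq_bins, time_frames) map along the frequency axis."""
    return feature_map.max(axis=1)              # shape: (channels, time_frames)


def detect_events(feature_map, weights, bias, threshold=0.5):
    """Return one boolean per time frame: True where an event is detected."""
    pooled = pool_over_frequency(feature_map)
    logits = weights @ pooled + bias            # hypothetical linear scoring layer
    probabilities = 1.0 / (1.0 + np.exp(-logits))
    return probabilities >= threshold


# Stand-in feature maps: the same activation placed at two different frequency bins.
low = np.zeros((2, 4, 3))
low[0, 1, 2] = 5.0
high = np.zeros((2, 4, 3))
high[0, 3, 2] = 5.0

weights = np.array([1.0, 1.0])
bias = -2.0
print(detect_events(low, weights, bias))
print(detect_events(high, weights, bias))
```

Because the maximum is taken over the whole frequency axis, shifting the activation to another frequency bin leaves the per-frame decision unchanged, which is the robustness property the title refers to.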
Method and system for detecting anomalous sound
A system and method for detecting anomalous sound are disclosed. The method includes receiving a spectrogram of an audio signal with elements defined by values in a time-frequency domain of the spectrogram. Each of the values corresponds to an element of the spectrogram that is identified by a coordinate in the time-frequency domain. The time-frequency domain of the spectrogram is partitioned into a context region and a target region. The context region and the target region are processed by a neural network using an attentive neural process to recover values of the spectrogram for elements with coordinates in the target region. The recovered values of the elements of the target region are compared with values of elements of the partitioned target region. An anomaly score is determined based on the comparison. The anomaly score is used for performing a control action.
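The partition, recover, and compare steps can be sketched as below. The attentive neural process is not implemented here: a trivial per-frequency mean of the context region stands in for the neural network, and the column-wise partition and the mean-squared-error score are likewise assumptions chosen for the example.

```python
import numpy as np


def target_mask(spec, target_frames):
    """Mark the given time frames of a (freq_bins, time_frames) spectrogram as the target region."""
    mask = np.zeros(spec.shape, dtype=bool)
    mask[:, target_frames] = True
    return mask


def recover_target(spec, mask):
    """Stand-in for the neural model: predict each element from the mean of the
    context values in the same frequency bin."""
    context = np.where(mask, np.nan, spec)          # hide the target region
    per_bin_mean = np.nanmean(context, axis=1, keepdims=True)
    return np.broadcast_to(per_bin_mean, spec.shape)


def anomaly_score(spec, mask):
    """Mean squared difference between recovered and actual target values."""
    recovered = recover_target(spec, mask)
    diff = recovered[mask] - spec[mask]
    return float(np.mean(diff ** 2))


normal = np.ones((4, 6))
mask = target_mask(normal, slice(2, 4))
spiky = normal.copy()
spiky[1, 3] = 9.0                                   # unexpected energy inside the target region

print(anomaly_score(normal, mask))
print(anomaly_score(spiky, mask))
```

A control action could then be triggered when the score exceeds a threshold chosen for the application; the abstract does not say how that threshold is set.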
ELECTRONIC DEVICE AND METHOD
According to one embodiment, an electronic device records an audio signal, determines a plurality of user-specific utterance features within the audio signal, the plurality of user-specific utterance features including a first set of user-specific utterance features associated with a registered user and a second set of user-specific utterance features associated with an unregistered user, and displays an identifier of the registered user differently than an identifier of the unregistered user.
Embedded plug-in presentation and control of time-based media documents
A software plug-in module that interfaces to a media editing host application generates and embeds information about a media composition being edited directly within portions of the user interface generated by the host application. The information may include a custom representation of media data of a time-based element of the media composition that replaces, augments, or overlays a timeline representation of the element generated by the host application. Media editing functionality provided by the plug-in may be accessed by an operator based on viewing or interacting with the custom representation. Results of analysis of the media composition by the plug-in may be displayed within the host-generated timeline and used by an operator as a basis for performing edit operations with standard host tools or with plug-in generated tools. Plug-ins may embed their interfaces within user interfaces of host digital audio workstations, non-linear video editing systems, and music notation applications.