H04N9/8715

Information processing apparatus, information processing method, and non-transitory computer readable medium
10984836 · 2021-04-20 · ·

An information processing apparatus includes a receiving unit that receives, during or after play of a video, a predetermined operation with respect to the video; an associating unit that associates the received operation with the play location in the video where the received operation was generated; and a setting unit that sets, in response to the received operation, an importance degree of the play location associated with the received operation.
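The receive, associate, and set steps in the abstract above can be sketched as follows. This is a minimal illustration only: the operation names, the weights in `OPERATION_WEIGHTS`, and the rule of accumulating importance per play location are all assumptions, since the abstract does not say how the importance degree is computed.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical importance increments per operation type (not from the abstract).
OPERATION_WEIGHTS: Dict[str, int] = {"pause": 1, "rewind": 2, "bookmark": 3}


@dataclass
class ImportanceTracker:
    # Maps a play location (seconds into the video) to its importance degree.
    importance: Dict[float, int] = field(default_factory=dict)

    def receive(self, operation: str, play_location: float) -> int:
        """Associate an operation with its play location and update importance."""
        weight = OPERATION_WEIGHTS.get(operation, 0)  # unknown operations add nothing
        updated = self.importance.get(play_location, 0) + weight
        self.importance[play_location] = updated
        return updated


tracker = ImportanceTracker()
tracker.receive("pause", 12.0)
tracker.receive("bookmark", 12.0)
tracker.receive("rewind", 47.5)
```

Keying the table by play location keeps the association and the importance degree in one structure; a real apparatus would probably bucket nearby locations together, which this sketch does not attempt.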

Method and apparatus for presenting media information, storage medium, and electronic apparatus

The present disclosure describes embodiments of a method, a device, and a non-transitory computer readable storage medium for presenting media information. The method includes displaying, by a device, an interaction interface. The device includes a memory storing instructions and a processor in communication with the memory. The method includes obtaining, by the device, an image set through the interaction interface, the image set comprising at least one image. The method includes obtaining, by the device, target media based on the image set through the interaction interface, the target media comprising a first audio generated according to an image feature of the image set. The method includes presenting, by the device, the target media.

Synchronously playing method and device of media file, and storage medium

The disclosure relates to a method and device for synchronously playing a media file, and to a storage medium. The method includes: creating, through a player embedded in a webpage, a media source object corresponding to a playing window in the webpage; adding different tracks of a fragmented media file to the same source buffer object in the media source object; transmitting a virtual address that takes the media source object as a data source to a media element of the webpage; and calling the media element to parse the media source object associated with the virtual address, read the tracks in the source buffer object of the associated media source object, and decode and play the tracks.

Automated generation and presentation of textual descriptions of video content

Systems, methods, and computer-readable media are disclosed for automated generation of textual descriptions of video content. Example methods may include determining, by one or more computer processors coupled to memory, a first segment of video content, the first segment including a first set of frames and first audio content, determining, using a first neural network, a first action that occurs in the first set of frames, and determining a first sound present in the first audio content. Some methods may include generating a vector representing the first action and the first sound, and generating, using a second neural network and the vector, a first textual description of the first segment, where the first textual description includes words that describe events of the first segment.

IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND RECORDING MEDIUM
20230412902 · 2023-12-21 · ·

An image processing device (3000) comprises an input unit (3020) and a presentation unit (3040). The input unit (3020) accepts an input of an operation for movement, on a captured image captured by a camera, of a first image which is superimposed on the captured image on the basis of a predetermined camera parameter indicating the position and attitude of the camera and which indicates a target object having a predetermined shape and a predetermined size set in a real space. The presentation unit (3040) presents the first image indicating the target object in a manner of view corresponding to a position on the captured image after the movement on the basis of the camera parameter.
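To illustrate why the target object's appearance depends on where it is placed in the captured image, here is a small pinhole-projection sketch. It is a simplification under stated assumptions: points are already expressed in the camera frame (so the position and attitude parameters from the abstract are taken as applied), the y axis points down, and `focal_px`, `cx`, and `cy` are hypothetical intrinsics that the abstract does not mention.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]


def project(point: Point3D, focal_px: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point (x, y, z) to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (cx + focal_px * x / z, cy + focal_px * y / z)


def apparent_height_px(foot: Point3D, height_m: float,
                       focal_px: float, cx: float, cy: float) -> float:
    """Pixel height of an upright object of known real height standing at `foot`."""
    x, y, z = foot
    _, v_foot = project((x, y, z), focal_px, cx, cy)
    _, v_head = project((x, y - height_m, z), focal_px, cx, cy)  # y points down
    return v_foot - v_head


# The same 2 m tall object drawn at two depths (5 m and 10 m from the camera).
near = apparent_height_px((0.0, 1.5, 5.0), 2.0, focal_px=1000.0, cx=640.0, cy=360.0)
far = apparent_height_px((0.0, 1.5, 10.0), 2.0, focal_px=1000.0, cx=640.0, cy=360.0)
```

Doubling the depth halves the projected height, which is the kind of position-dependent view the presentation unit has to reproduce after the object is moved.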

IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND RECORDING MEDIUM
20230412903 · 2023-12-21 · ·

An image processing device (3000) comprises an input unit (3020) and a presentation unit (3040). The input unit (3020) accepts an input of an operation for movement, on a captured image captured by a camera, of a first image which is superimposed on the captured image on the basis of a predetermined camera parameter indicating the position and attitude of the camera and which indicates a target object having a predetermined shape and a predetermined size set in a real space. The presentation unit (3040) presents the first image indicating the target object in a manner of view corresponding to a position on the captured image after the movement on the basis of the camera parameter.

Transmitter, transmission method, receiver, and reception method
10965927 · 2021-03-30 · ·

An association with a system timing at the time of transmission is secured without changing a display timing in text information of a subtitle, and a reception side displays the subtitle at an appropriate timing. A packet in which a document of the text information of the subtitle having display timing information is included in a payload is generated and transmitted in synchronization with a sample period. A header of the packet includes a time stamp on a first time axis indicating a start time of the corresponding sample period. The payload of the packet further includes reference time information of a second time axis regarding the display timing associated with the start time of the corresponding sample period.
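On the reception side, the two time axes described above can be reconciled with a single offset. The sketch below is an assumption-laden illustration: the `SubtitlePacket` layout, its field names, and the use of seconds as the unit are hypothetical, and only the idea of pairing a header time stamp with a reference time on the subtitle's own axis comes from the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (begin, end, text) on the subtitle's own time axis


@dataclass
class SubtitlePacket:
    header_timestamp: float  # first time axis: start of the sample period
    reference_time: float    # second time axis: value at that same instant
    cues: List[Cue]          # display timings are left unchanged in the document


def to_system_time(packet: SubtitlePacket, display_time: float) -> float:
    """Map a display time on the second axis onto the first (system) axis."""
    return packet.header_timestamp + (display_time - packet.reference_time)


def schedule(packet: SubtitlePacket) -> List[Cue]:
    """Return the cues with begin/end expressed on the system time axis."""
    return [(to_system_time(packet, begin), to_system_time(packet, end), text)
            for begin, end, text in packet.cues]


packet = SubtitlePacket(header_timestamp=1000.0, reference_time=60.0,
                        cues=[(61.5, 63.0, "Hello")])
scheduled = schedule(packet)
```

Because the offset is carried in every packet, the subtitle document's own timings never need rewriting, which matches the stated goal of not changing the display timing in the text information.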

Video practice systems and methods
10950140 · 2021-03-16 · ·

A system and method may provide video content for training a user in an athletic motion or action. For example, video content may be provided with diminishing visibility to allow the user to visualize and imagine the action presented in the video content. In another example, a portion of a video content may be faded out, not displayed, or obscured to allow for visualization and imagination of the portion. In another example, video content may be presented in a manner that retains a user's interest despite repeated viewings.
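One way to picture "diminishing visibility" is an opacity value that drops with each repeated viewing. The linear schedule below is purely an assumption for illustration; the abstract does not specify how or how quickly visibility is reduced, and `opacity_for_repetition` is a hypothetical helper.

```python
def opacity_for_repetition(repetition: int, total_repetitions: int) -> float:
    """Linearly reduce opacity from 1.0 (first viewing) to 0.0 (last viewing)."""
    if total_repetitions < 2:
        return 1.0  # a single viewing is shown fully visible
    # Clamp the index so out-of-range repetitions stay within the schedule.
    rep = min(max(repetition, 0), total_repetitions - 1)
    return 1.0 - rep / (total_repetitions - 1)


# Opacity for five consecutive viewings of the same clip.
fade_schedule = [opacity_for_repetition(i, 5) for i in range(5)]
```

On the final viewing the clip is fully hidden, leaving the user to visualize the motion unaided.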

Apparatus and method to display event information detected from video data

Upon capture of video data for a match of a sport at a first time, an apparatus performs detection of event information from the captured video data during a first time-period starting from the first time, where the event information includes information identifying an occurrence timing of an event that occurs in the match of the sport, an event type of the event, and an occurrence position of the event. The apparatus reproduces the video data, on a display screen, with a delay by a second time-period obtained by adding a third time-period longer than or equal to a predetermined time-period to the first time-period, and, upon detection of the event information, continues displaying the event type and the occurrence position of the event, for the predetermined time-period, from a timing that is the predetermined time-period before the occurrence timing of the event within the reproduced video data.
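The timing relationship above can be checked with simple arithmetic. The sketch assumes all periods are in seconds and that the detection result for an event is available no later than one first time-period after the event is captured; that reading, the function name, and the parameter names are assumptions rather than details from the abstract.

```python
from typing import Tuple


def overlay_window(event_time: float, detection_period: float,
                   margin_period: float, display_period: float
                   ) -> Tuple[float, float, float]:
    """Return (playback_delay, overlay_start, overlay_end) in wall-clock time.

    detection_period: first time-period (time needed to detect the event)
    margin_period:    third time-period, must be >= display_period
    display_period:   predetermined time-period the overlay stays visible
    """
    if margin_period < display_period:
        raise ValueError("third time-period must be >= predetermined time-period")
    playback_delay = detection_period + margin_period  # second time-period
    detection_done = event_time + detection_period     # result available by now
    overlay_end = event_time + playback_delay          # event appears in playback
    overlay_start = overlay_end - display_period       # begin overlay ahead of it
    if overlay_start < detection_done:                 # cannot happen given the check above
        raise RuntimeError("overlay would start before detection finished")
    return playback_delay, overlay_start, overlay_end


# Event captured at t=100 s, 5 s detection, 3 s margin, 2 s overlay.
result = overlay_window(100.0, 5.0, 3.0, 2.0)
```

Requiring the third time-period to be at least as long as the predetermined time-period is what guarantees the event type and position can be shown before the event itself appears in the delayed playback.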

Editing text in video captions

This disclosure describes techniques that include modifying text associated with a sequence of images or a video sequence to thereby generate new text and overlaying the new text as captions in the video sequence. In one example, this disclosure describes a method that includes receiving a sequence of images associated with a scene occurring over a time period; receiving audio data of speech uttered during the time period; transcribing into text the audio data of the speech, wherein the text includes a sequence of original words; associating a timestamp with each of the original words during the time period; generating, responsive to input, a sequence of new words; and generating a new sequence of images by overlaying each of the new words on one or more of the images.
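A rough sketch of the timestamp and overlay steps follows. It assumes transcription has already produced timed words, and it spreads the new words evenly over the span of the original words; that even spacing is an assumption, since the abstract does not say how new words are timed. `TimedWord`, `retime_new_words`, and `caption_for_frame` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TimedWord:
    text: str
    start: float  # seconds from the start of the time period
    end: float


def retime_new_words(original: List[TimedWord], new_words: List[str]) -> List[TimedWord]:
    """Spread the new words evenly across the time span of the original words."""
    if not original or not new_words:
        return []
    span_start, span_end = original[0].start, original[-1].end
    step = (span_end - span_start) / len(new_words)
    return [TimedWord(word, span_start + i * step, span_start + (i + 1) * step)
            for i, word in enumerate(new_words)]


def caption_for_frame(words: List[TimedWord], frame_time: float) -> Optional[str]:
    """Pick the word to overlay on the image captured at frame_time, if any."""
    for word in words:
        if word.start <= frame_time < word.end:
            return word.text
    return None


original = [TimedWord("hello", 0.0, 1.0), TimedWord("world", 1.0, 2.0)]
edited = retime_new_words(original, ["good", "bye", "now", "friend"])
overlay = caption_for_frame(edited, 0.75)
```

Each frame time selects at most one word, so a word whose interval covers several frames is overlaid on each of them, in line with "one or more of the images".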