Patent classifications
H04N19/25
METHOD AND APPARATUS FOR MEDIA SCENE DESCRIPTION
Systems, methods, and devices for managing media storage and delivery, including obtaining, by a media access function (MAF), a Graphics Language Transmission Format (glTF) file corresponding to a scene; obtaining from the glTF file a uniform resource locator (URL) parameter indicating a binary data blob; determining that the binary data blob has a Concise Binary Object Representation (CBOR) format; converting the binary data blob into an object having a JavaScript Object Notation (JSON) format using a CBOR parser function implemented by the MAF; and obtaining media content corresponding to the scene based on the object.
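The CBOR-to-JSON conversion the abstract describes can be illustrated with a minimal sketch. This is not the patent's MAF implementation; it is a toy decoder covering only a few CBOR major types (integers, text strings, arrays, maps), with all function names being illustrative.

```python
import json

def _read_uint(data, pos, info):
    """Decode the CBOR 'additional information' field into an integer."""
    if info < 24:
        return info, pos
    width = {24: 1, 25: 2, 26: 4, 27: 8}[info]
    return int.from_bytes(data[pos:pos + width], "big"), pos + width

def decode_cbor(data, pos=0):
    """Decode one CBOR item starting at pos; returns (value, next_pos).

    Handles unsigned/negative ints, text strings, arrays, and maps —
    enough for small structured blobs (a sketch, not a full codec).
    """
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    arg, pos = _read_uint(data, pos + 1, info)
    if major == 0:                       # unsigned integer
        return arg, pos
    if major == 1:                       # negative integer
        return -1 - arg, pos
    if major == 3:                       # UTF-8 text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:                       # array of `arg` items
        items = []
        for _ in range(arg):
            item, pos = decode_cbor(data, pos)
            items.append(item)
        return items, pos
    if major == 5:                       # map of `arg` key/value pairs
        obj = {}
        for _ in range(arg):
            key, pos = decode_cbor(data, pos)
            val, pos = decode_cbor(data, pos)
            obj[key] = val
        return obj, pos
    raise ValueError(f"unsupported CBOR major type {major}")

def cbor_blob_to_json(blob):
    """The MAF-style step: CBOR binary blob -> JSON text."""
    value, _ = decode_cbor(blob)
    return json.dumps(value)
```

For example, the CBOR bytes `b'\xa1\x65scene\x00'` encode the one-entry map `{"scene": 0}` and convert to the corresponding JSON string.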
Image processing apparatus and image processing method and storage medium
An image processing apparatus converts a multivalued image to a binary image, determines a character region and a non-character region in the binary image, extracts areas in units of a character from the character region, determines whether each of the areas is a character region or a non-character region, and extracts, for each of the areas determined to be the character region, a first representative color from the multivalued image included in the area, and generates a binary image corresponding to the first representative color. The apparatus decides a second representative color closest to a color of a target pixel by comparing the first representative color with the color of the target pixel, and changes the binary image of the area corresponding to the first representative color to a binary image of the decided second representative color.
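The pipeline above — binarize, extract a first representative color per character area, then snap a target pixel to the nearest representative color — can be sketched as follows. Thresholding and mean-RGB extraction are simplifications of the patent's scheme, and all names are illustrative.

```python
def binarize(gray, threshold=128):
    """Convert a multivalued (grayscale) image to a binary image:
    1 marks dark (character) pixels, 0 marks background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def representative_color(rgb, binary):
    """First representative color: mean RGB over the character pixels
    of one extracted area (a simplification of the patent's method)."""
    sums, count = [0, 0, 0], 0
    for y, row in enumerate(binary):
        for x, bit in enumerate(row):
            if bit:
                for c in range(3):
                    sums[c] += rgb[y][x][c]
                count += 1
    return tuple(s // count for s in sums) if count else (0, 0, 0)

def closest_color(target, palette):
    """Second representative color: the candidate nearest the target
    pixel's color (squared Euclidean distance in RGB)."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, target)))
```

A 2x2 image with two dark pixels of colors (0,0,0) and (20,20,20) yields the first representative color (10,10,10); a target pixel near that color then selects it as the second representative color.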
Temporal Alignment of MPEG and GLTF Media
An apparatus includes at least one processor; and at least one memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide an animation timing extension; wherein the animation timing extension links a graphics library transmission format animation to timed metadata and a metadata track of the timed metadata; wherein the metadata track of the timed metadata is listed with an object associated with moving picture media; and align at least one timeline of the moving picture media with at least one timeline of the graphics library transmission format animation; wherein a sample of the metadata track is used to manipulate an animation event.
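The timeline alignment described above can be sketched as a mapping anchored at a timed-metadata sample. The field names below are illustrative, not the MPEG/glTF extension syntax, and the sketch assumes a 1:1 playback rate between the two timelines.

```python
from dataclasses import dataclass

@dataclass
class AnimationTimingSample:
    """One timed-metadata sample (illustrative fields)."""
    media_time: float      # seconds on the moving-picture timeline
    animation_time: float  # seconds on the glTF animation timeline
    event: str             # e.g. "play", "pause"

def align(media_t, sample):
    """Map a moving-picture timestamp onto the glTF animation timeline,
    anchored at the most recent metadata sample."""
    return sample.animation_time + (media_t - sample.media_time)

def apply_sample(player_state, sample):
    """Use a metadata-track sample to manipulate an animation event."""
    player_state["animation"] = sample.event
    player_state["anim_time"] = sample.animation_time
    return player_state
```

With a sample anchoring media time 10.0 s to animation time 2.0 s, media time 12.5 s maps to animation time 4.5 s.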
Video compression using cognitive semantics object analysis
A method, and associated computer system and computer program product, for video compression that includes receiving a video file including a plurality of frames, identifying at least one image feature in each of the plurality of frames, determining a semantic state change of the image feature for each successive frame after a first of the plurality of frames, and storing the first of the plurality of frames and the semantic state change of the image feature for each successive frame after the first of the plurality of frames.
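The idea — store the first frame in full, then only the semantic state changes of identified features — can be sketched as a toy model. Here a frame's features are represented as a dict of feature name to semantic state, and `extract_features` stands in for the cognitive analysis step; both are assumptions for illustration.

```python
def compress(frames, extract_features):
    """Keep the first frame; for each later frame record only the
    features whose semantic state changed since the previous frame."""
    first = frames[0]
    prev = extract_features(first)
    deltas = []
    for frame in frames[1:]:
        cur = extract_features(frame)
        # Record only features whose semantic state differs.
        deltas.append({k: v for k, v in cur.items() if prev.get(k) != v})
        prev = cur
    return first, deltas
```

For a sequence where a door opens and then a light turns on, the stored deltas are just `{"door": "open"}` and `{"light": "on"}` rather than full frames.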
CONTEXT-BASED BINARY ARITHMETIC ENCODING AND DECODING
An encoding method is disclosed. At least one context is first determined for encoding a syntax element associated with a block of a picture responsive to a current quantization parameter associated with the block. Second, the syntax element is context-based entropy encoded with the at least one determined context.
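The two steps above — select a context from the block's quantization parameter, then entropy-code the syntax element's bins with that context's adaptive probability — can be sketched as follows. The QP buckets and the counts-based probability model are illustrative (real CABAC uses finite-state probability tables and a binary arithmetic coder); the function reports the ideal code length in bits as a stand-in for the coder output.

```python
import math

class ContextModel:
    """Per-context adaptive probability that a bin equals 1."""
    def __init__(self):
        self.ones, self.total = 1, 2   # Laplace-smoothed counts

    def p_one(self):
        return self.ones / self.total

    def update(self, bit):
        self.ones += bit
        self.total += 1

def context_index(qp):
    """Step 1: determine a context responsive to the current QP
    (the bucket boundaries here are illustrative)."""
    if qp < 22:
        return 0
    if qp < 32:
        return 1
    return 2

def encode_bins(bins, qp, contexts):
    """Step 2: context-based entropy coding sketch; returns the ideal
    code length in bits for the bin string under the chosen context."""
    ctx = contexts[context_index(qp)]
    bits = 0.0
    for b in bins:
        p = ctx.p_one() if b else 1 - ctx.p_one()
        bits += -math.log2(p)   # information content of this bin
        ctx.update(b)           # adapt the context's statistics
    return bits
```

As the context adapts toward a skewed bin distribution, the per-bin cost drops below one bit, which is the point of context modeling.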
3D machine-vision system
One embodiment can provide a machine-vision system. The machine-vision system can include a structured-light projector, a first camera positioned on a first side of the structured-light projector, and a second camera positioned on a second side of the structured-light projector. The first and second cameras are configured to capture images under illumination of the structured-light projector. The structured-light projector can include a laser-based light source.
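With two cameras flanking the projector, depth recovery reduces to classic two-view triangulation. A minimal sketch of the depth-from-disparity relation (the constants below are illustrative, not values from the patent):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic stereo triangulation: depth = f * B / d, where f is the
    focal length in pixels, B the baseline between the two camera
    centers in meters, and d the matched-feature disparity in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For a 1000 px focal length, a 0.1 m baseline, and a 50 px disparity, the point lies 2.0 m from the rig; the structured-light pattern serves to make the correspondence (and hence the disparity) easy to find on textureless surfaces.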
LAYERED SCENE DECOMPOSITION CODEC SYSTEM AND METHODS
A system and methods are provided for a CODEC driving a real-time light field display for multi-dimensional video streaming, interactive gaming, and other light field display applications, applying a layered scene decomposition strategy. Multi-dimensional scene data is divided into a plurality of data layers whose depths increase as the distance between a given layer and the plane of the display increases. Data layers are sampled using a plenoptic sampling scheme and rendered using hybrid rendering, such as perspective and oblique rendering, to encode light fields corresponding to each data layer. The resulting compressed, layered core representation of the multi-dimensional scene data is produced at predictable rates, then reconstructed and merged at the light field display in real time by applying view synthesis protocols, including edge-adaptive interpolation, to reconstruct pixel arrays in stages (e.g., columns then rows) from reference elemental images.
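The layering step — partitioning scene data into depth layers that grow thicker with distance from the display plane — can be sketched as follows. The geometric growth factor and the point representation are assumptions for illustration, not the patent's sampling scheme.

```python
def layer_bounds(num_layers, near, far, growth=2.0):
    """Depth-layer boundaries whose thickness increases geometrically
    with distance from the display plane (growth factor illustrative)."""
    weights = [growth ** i for i in range(num_layers)]
    total = sum(weights)
    bounds, z = [near], near
    for w in weights:
        z += (far - near) * w / total
        bounds.append(z)
    return bounds

def decompose(points, bounds):
    """Assign each (x, y, z) scene point to its depth layer."""
    layers = [[] for _ in range(len(bounds) - 1)]
    for p in points:
        z = p[2]
        for i in range(len(bounds) - 1):
            if bounds[i] <= z < bounds[i + 1] or (i == len(bounds) - 2 and z == bounds[-1]):
                layers[i].append(p)
                break
    return layers
```

With three layers over depths 0 to 7, the boundaries fall at 0, 1, 3, and 7: near layers stay thin (where angular sampling demands are highest), far layers grow thick, which is what makes the per-layer plenoptic sampling rates predictable.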