Patent classifications
G06V20/35
METHOD FOR TRAINING STUDENT NETWORK AND METHOD FOR RECOGNIZING IMAGE
Disclosed are a method for training a Student Network and a method for recognizing an image. The method includes: acquiring first prediction feature information of a sample image on a first granularity and second prediction feature information of the sample image on a second granularity by inputting the sample image into a Student Network; acquiring first feature information of the sample image on the first granularity and second feature information of the sample image on the second granularity by inputting the sample image into a Teacher Network; and acquiring a target Student Network.
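The abstract does not specify how the student's predictions are matched against the teacher's features; a minimal sketch of the per-granularity matching step, assuming an MSE feature-matching loss, plain Python lists as feature vectors, and equal weights for the two granularities (all assumptions, not taken from the patent):

```python
def mse(a, b):
    # Mean squared error between two equal-length feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_fine, student_coarse,
                      teacher_fine, teacher_coarse,
                      w_fine=0.5, w_coarse=0.5):
    # Combine the feature-matching losses on both granularities;
    # the weights w_fine/w_coarse are illustrative, not from the patent.
    return (w_fine * mse(student_fine, teacher_fine)
            + w_coarse * mse(student_coarse, teacher_coarse))
```

Minimizing such a combined loss over the student's parameters would yield the "target Student Network" the abstract refers to.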
Method and device for carrying out eye gaze mapping
The invention relates to a device and a method for performing an eye gaze mapping (M), in which at least one point of vision (B) and/or a viewing direction of at least one person (10) in relation to at least one scene recording (S) of a scene (12) viewed by the at least one person (10) is mapped onto a reference (R). At least a part of an algorithm (A1, A2, A3) for performing the eye gaze mapping (M) is thereby selected from multiple predetermined algorithms (A1, A2, A3) as a function of at least one parameter (P), and the eye gaze mapping (M) is performed on the basis of the at least one part of the algorithm (A1, A2, A3).
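The abstract describes choosing (part of) a mapping algorithm as a function of a parameter (P), without naming the parameter or the selection rule; a hypothetical threshold-based dispatch, where the parameter, thresholds, and algorithm labels are all illustrative:

```python
def select_algorithm(param, algorithms, thresholds):
    # Choose an algorithm (A1, A2, A3, ...) as a function of a scalar
    # parameter, e.g. a scene-motion score; thresholds are assumed sorted.
    for limit, algo in zip(thresholds, algorithms):
        if param <= limit:
            return algo
    return algorithms[-1]  # fall back to the last algorithm
```

The gaze mapping itself would then be performed using whichever algorithm (or part) this selection returns.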
Systems and methods for dynamic image category determination
Disclosed are systems and methods for dynamically determining categories for images. A computer-implemented method may include training a neural network to receive an input image and determine one or more image categories associated with the input image; obtaining a set of images associated with a user; determining, using the trained neural network, one or more image categories associated with each image included in the obtained set of images; determining one or more dominant image categories associated with the user based on the determined image categories for the obtained set of images; and determining an image editing user interface for the user based on the determined one or more dominant image categories.
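The dominant-category step can be read as frequency counting over the per-image predictions; a minimal sketch, assuming a simple count-based notion of "dominant" and a hypothetical mapping from categories to UI presets (the category names and presets are illustrative, not from the patent):

```python
from collections import Counter

def dominant_categories(per_image_categories, top_n=2):
    # per_image_categories: one list of predicted categories per image
    # in the user's set; "dominant" is assumed to mean most frequent.
    counts = Counter(cat for cats in per_image_categories for cat in cats)
    return [cat for cat, _ in counts.most_common(top_n)]

def editing_ui_for(dominant):
    # Hypothetical mapping from dominant categories to editing-UI presets.
    presets = {"portrait": "face-retouch", "landscape": "horizon-tools"}
    return [presets[c] for c in dominant if c in presets]
```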
Video generation method and apparatus, electronic device, and computer readable medium
Disclosed are a video generation method and apparatus, an electronic device, and a computer readable medium. A specific embodiment of the method comprises: obtaining video footage and audio footage, the video footage comprising picture footage; determining music points of the audio footage, the music points being used for dividing the audio footage into a plurality of audio clips; using the video footage to generate a video clip for each audio clip to obtain a plurality of video clips, each video clip having the same duration as its corresponding audio clip; splicing the plurality of video clips according to the times at which their corresponding audio clips appear in the audio footage; and adding the audio footage as the audio track to obtain a composite video.
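The structure of the splicing step can be sketched from the abstract alone: music points partition the audio into clips, and each audio clip is paired with a video clip of matching duration. How source footage is assigned to clips is unspecified; the round-robin reuse below is an assumption:

```python
def music_points_to_clips(music_points, total_duration):
    # Music points divide the audio footage into consecutive audio clips,
    # returned as (start, end) pairs in seconds.
    bounds = [0.0] + sorted(music_points) + [total_duration]
    return list(zip(bounds, bounds[1:]))

def plan_video_clips(video_segments, audio_clips):
    # Pair each audio clip with a source segment and a matching duration,
    # in the order the audio clips appear in the footage.
    # Round-robin reuse of segments is assumed, not from the patent.
    plan = []
    for i, (start, end) in enumerate(audio_clips):
        src = video_segments[i % len(video_segments)]
        plan.append({"source": src, "start": start, "duration": end - start})
    return plan
```

Concatenating the planned clips in order and muxing in the original audio would yield the composite video.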
Wearable Multimedia Device and Cloud Computing Platform with Application Ecosystem
Systems, methods, devices and non-transitory, computer-readable storage mediums are disclosed for a wearable multimedia device and cloud computing platform with an application ecosystem for processing multimedia data captured by the wearable multimedia device. In an embodiment, a method comprises: receiving, by one or more processors of a cloud computing platform, context data from a wearable multimedia device, the wearable multimedia device including at least one data capture device for capturing the context data; creating a data processing pipeline with one or more applications based on one or more characteristics of the context data and a user request; processing the context data through the data processing pipeline; and sending output of the data processing pipeline to the wearable multimedia device or other device for presentation of the output.
Apparatus, method and storage medium
An apparatus generates a plurality of albums including a first album and a second album. The apparatus includes a page generation unit to generate common pages to be arranged in both albums, and individual pages including a first individual page to be arranged in the first album but not in the second album and a second individual page to be arranged in the second album but not in the first. The first album is album data in which a first object is set as the main object, and an image including the first object is arranged in the first individual page. The second album is album data in which a second object is set as the main object, and an image including the second object is arranged in the second individual page. A determination unit determines an arrangement order of the common pages and the individual pages.
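The page-assembly logic described above can be sketched as a filter plus an ordering rule; the dictionary page representation and the common-pages-first ordering are assumptions for illustration, since the patent leaves both open:

```python
def build_album(common_pages, individual_pages, main_object):
    # An album's individual pages contain only images that include
    # the album's main object.
    own = [p for p in individual_pages if main_object in p["objects"]]
    # Assumed arrangement order: common pages first, then individual pages;
    # in the patent, a determination unit decides this order.
    return common_pages + own
```

Calling this twice with different main objects yields two albums sharing the common pages but differing in their individual pages.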
Use of on-screen content identifiers in automated tool control systems
An inventory control system comprises an object storage device, a display device, and one or more processors. The object storage device includes a plurality of compartments, in which each compartment has a plurality of storage locations for storing objects. The display device is configured to display information about the object storage device. The one or more processors are configured to establish a description database of objects configured for storage in the inventory control system. The one or more processors retrieve object keywords corresponding to objects stored in the plurality of storage locations of one of the plurality of compartments. The one or more processors also generate a text block based on the retrieved object keywords. On the display device, the one or more processors display a representation of the plurality of compartments of the object storage device with the text block applied to the one of the plurality of compartments.
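The text-block generation step reduces to joining retrieved object keywords into displayable text; a minimal sketch, where the separator and the length cap are assumptions rather than details from the patent:

```python
def generate_text_block(keywords, max_len=40):
    # Join the object keywords retrieved for one compartment into a
    # single text block, truncating with an ellipsis if it is too long
    # for the on-screen compartment representation.
    text = ", ".join(keywords)
    if len(text) <= max_len:
        return text
    return text[:max_len - 1].rstrip() + "…"
```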
Method and system for producing story video
A method and a system for producing a story video are provided. A method for producing a story video, according to one embodiment, can produce a specific story video by determining a story theme suitable for the collected videos and selecting and arranging an appropriate video for each frame of a template associated with the theme.
Providing a response in a session
The present disclosure provides method and apparatus for providing a response to a user in a session. At least one message associated with a first object may be received in the session, the session being between the user and an electronic conversational agent. An image representation of the first object may be obtained. Emotion information of the first object may be determined based at least on the image representation. A response may be generated based at least on the at least one message and the emotion information. The response may be provided to the user.
Robotic interactions for observable signs of intent
Described herein are assistant robots that anticipate the needs of one or more people (or animals). The assistant robots may combine recognition of a current activity, knowledge of a person's routines, and contextual information, and can accordingly provide or offer to provide appropriate robotic assistance. The assistant robots can learn users' habits or be provided with knowledge regarding the humans in their environment, and develop a schedule and contextual understanding of those persons' behavior and needs. The assistant robots may interact, understand, and communicate with people before, during, or after providing assistance. A robot can combine gesture, clothing, emotional aspect, time, pose recognition, action recognition, and other observational data to understand people's medical condition, current activity, and future intended activities and intents.