Patent classifications
G06V40/10
APPARATUS AND METHOD FOR CLASSIFYING CLOTHING ATTRIBUTES BASED ON DEEP LEARNING
Disclosed herein are an apparatus and method for classifying clothing attributes based on deep learning. The apparatus includes memory for storing at least one program and a processor for executing the program, wherein the program includes a first classification unit for outputting a first classification result for one or more attributes of clothing worn by a person included in an input image, a mask generation unit for outputting a mask tensor in which multiple mask layers respectively corresponding to principal part regions obtained by segmenting a body of the person included in the input image are stacked, a second classification unit for outputting a second classification result for the one or more attributes of the clothing by applying the mask tensor, and a final classification unit for determining and outputting a final classification result for the input image based on the first classification result and the second classification result.
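The two-branch architecture described above can be sketched as follows. This is an illustrative stand-in, not the patent's implementation: the classifier bodies are placeholders, and the fusion rule (averaging the two branches' scores) is an assumption.

```python
import numpy as np

# Hypothetical sketch of the two-branch attribute classifier: a first
# classifier sees the whole image, a second sees the image modulated by a
# stacked body-part mask tensor, and a final step fuses both results.
# All function bodies and the averaging fusion rule are assumptions.

def first_classifier(image: np.ndarray) -> np.ndarray:
    # Placeholder: global attribute scores from the full image.
    return image.mean(axis=(0, 1))               # shape: (n_attributes,)

def make_mask_tensor(part_masks: list[np.ndarray]) -> np.ndarray:
    # Stack one mask layer per principal body-part region.
    return np.stack(part_masks, axis=0)          # shape: (n_parts, H, W)

def second_classifier(image: np.ndarray, masks: np.ndarray) -> np.ndarray:
    # Placeholder: attribute scores from part-masked views, averaged.
    masked = image[None, ...] * masks[..., None]   # (n_parts, H, W, C)
    return masked.mean(axis=(1, 2)).mean(axis=0)   # (n_attributes,)

def final_classifier(s1: np.ndarray, s2: np.ndarray) -> np.ndarray:
    # Assumed fusion rule: average the two branches' scores.
    return (s1 + s2) / 2.0
```

In a real system the two classifiers would be trained deep networks; the sketch only shows how the mask tensor and the two classification results flow into a final decision.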
SYNTHESIZED SPEECH AUDIO DATA GENERATED ON BEHALF OF HUMAN PARTICIPANT IN CONVERSATION
Generating synthesized speech audio data on behalf of a given user in a conversation. The synthesized speech audio data includes synthesized speech that incorporates textual segment(s). The textual segment(s) can include recognized text that results from processing spoken input, of the given user, using a speech recognition model and/or can include a selection of a rendered suggestion that conveys the textual segment(s). Some implementations dynamically determine one or more prosodic properties for use in speech synthesis of the textual segment, and generate the synthesized speech with the one or more determined prosodic properties. The prosodic properties can be determined based on the textual segment(s) used in speech synthesis, textual segment(s) corresponding to recent spoken input of additional participant(s), attribute(s) of relationship(s) between the given user and additional participant(s) in the conversation, and/or feature(s) of a current location for the conversation.
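The dynamic prosody determination described above can be illustrated with a toy rule table. The property names (`rate`, `pitch`, `volume`), the context keys, and every rule here are assumptions for illustration, not taken from the disclosure.

```python
# Hypothetical sketch of choosing prosodic properties for speech
# synthesis from conversation context. All keys and rules are assumed.

def choose_prosody(textual_segment: str, context: dict) -> dict:
    prosody = {"rate": 1.0, "pitch": 0.0, "volume": 1.0}
    # Assumed rule: quieter speech in a quiet location.
    if context.get("location") == "quiet":
        prosody["volume"] = 0.6
    # Assumed rule: warmer pitch for close relationships.
    if context.get("relationship") == "family":
        prosody["pitch"] = 0.2
    # Questions often carry rising intonation.
    if textual_segment.strip().endswith("?"):
        prosody["pitch"] += 0.3
    return prosody
```

The determined properties would then condition a speech synthesis model when rendering the textual segment(s).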
INFORMATION PROCESSING DEVICE, PROGRAM, AND METHOD
An information processing device that includes a control unit configured to track an object across images input in time series, based on tracking results obtained by performing tracking in units of tracking regions, each corresponding to a specific part of the object.
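Part-based tracking of this kind can be sketched as follows. This is an assumed illustration, not the patent's method: each part is tracked independently per frame, missing parts are carried forward, and the object position is derived from the part-level results.

```python
# Hypothetical sketch: track an object via per-part tracking regions
# (e.g. head, torso) across time-series frames, deriving the object's
# position from the part-level results. The carry-forward rule for
# undetected parts is an assumption.

def track_parts(frames: list[dict]) -> list[dict]:
    """Each frame maps part name -> (x, y) detection; returns the object
    centroid per frame computed from the latest position of each part."""
    results = []
    last_seen: dict = {}
    for frame in frames:
        last_seen.update(frame)            # carry forward missing parts
        xs = [p[0] for p in last_seen.values()]
        ys = [p[1] for p in last_seen.values()]
        results.append({"centroid": (sum(xs) / len(xs), sum(ys) / len(ys))})
    return results
```

Tracking per part lets the object survive partial occlusion: in the sketch, a frame where only the torso is detected still yields an object position.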
APPARATUS AND METHOD FOR IDENTIFYING CONDITION OF ANIMAL OBJECT BASED ON IMAGE
An image-based animal object condition identification apparatus includes: a communication module that receives an image of an object; a memory that stores a program configured to extract animal condition information from the received image; and a processor that executes the program. The program extracts continuous animal detection information for each object by inputting the received image into an animal detection model trained on learning data composed of animal images, and determines predetermined animal condition information for each class of each animal object by inputting the continuous animal detection information of each object into an animal condition identification model.
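The two-stage pipeline described above can be sketched with trivial stand-ins for the two models. The grouping of per-frame detections into continuous tracks and the motion-based condition rule are assumptions for illustration.

```python
# Hypothetical sketch mirroring the abstract's pipeline: a detection
# model yields continuous per-object detection info across frames, and a
# condition model maps that sequence to a condition label. Both "models"
# here are trivial stand-ins, not trained networks.

def detect(frames: list[list[dict]]) -> dict:
    """Group per-frame detections into continuous tracks keyed by object id."""
    tracks: dict = {}
    for frame in frames:
        for det in frame:
            tracks.setdefault(det["id"], []).append(det["bbox"])
    return tracks

def identify_condition(tracks: dict) -> dict:
    """Assumed rule: little positional movement across frames -> 'resting'."""
    conditions = {}
    for obj_id, boxes in tracks.items():
        motion = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                     for a, b in zip(boxes, boxes[1:]))
        conditions[obj_id] = "resting" if motion < 5 else "active"
    return conditions
```

In the actual apparatus both stages are learned models; the sketch only shows the data flow from continuous detection information to per-object condition output.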
AIRCRAFT DOOR CAMERA SYSTEM FOR DOCKING ALIGNMENT MONITORING
A camera with a field of view toward an external environment of an aircraft is disposed within an aircraft door such that a ground surface is within the field of view of the camera during taxiing of the aircraft. A display device is disposed within an interior of the aircraft. A processor is operatively coupled to the camera and to the display device. The processor analyzes image data captured by the camera for docking guidance by identifying, within the captured image data, a region on the ground surface corresponding to an alignment fiducial that indicates a parking location for the aircraft; determining, based on that region, a relative location of the aircraft with respect to the alignment fiducial; and outputting an indication of the relative location of the aircraft with respect to the alignment fiducial.
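The guidance step can be illustrated with a simple pixel-offset heuristic. The threshold, the left/right convention, and the one-dimensional simplification are all assumptions, not the patent's method.

```python
# Hypothetical sketch of the docking-guidance step: locate the alignment
# fiducial in the camera image, compare its position with the image
# centerline, and emit a coarse indication. The pixel tolerance and
# steering convention are assumptions for illustration.

def docking_indication(fiducial_region: tuple, image_width: int,
                       tolerance_px: int = 10) -> str:
    x_min, x_max = fiducial_region        # fiducial's horizontal extent (px)
    fiducial_center = (x_min + x_max) / 2
    offset = fiducial_center - image_width / 2
    if abs(offset) <= tolerance_px:
        return "centered"
    return "steer left" if offset < 0 else "steer right"
```

A real system would also calibrate pixel offsets to ground distances; the sketch only shows how the fiducial region maps to a relative-location indication on the display.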
METHODS AND SYSTEMS FOR DETECTING A TAILGATING EVENT AT AN ACCESS POINT CONTROLLED BY AN ACCESS CONTROL SYSTEM
Apparatus and methods for controlling access to a restricted area by an access control device include obtaining a first image using a first sensor mounted at a first location. A second image is obtained using a second sensor mounted at a location different from that of the first sensor. The second image is processed, using the second sensor, to obtain information regarding objects detected in the second image. The information regarding the detected objects is sent from the second sensor to the first sensor. The first sensor compares the received information with a number of objects detected using the first image. A tailgating event is identified in response to determining that the number of objects detected using the first image does not match the information regarding the number of objects detected using the second image. A tailgating notification indicating the tailgating event is output by the first sensor.
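The core cross-sensor check reduces to a count comparison, sketched below. The function name and data shapes are illustrative assumptions.

```python
# Minimal sketch (assumed) of the cross-sensor check: the first sensor's
# object count is compared against the object information reported by
# the second sensor; a mismatch flags a tailgating event.

def detect_tailgating(first_sensor_count: int,
                      second_sensor_objects: list[dict]) -> bool:
    # e.g. the access point expects one person, but the second sensor's
    # image shows two people passing through.
    return first_sensor_count != len(second_sensor_objects)
```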
PROJECTION ON A VEHICLE WINDOW
A system includes a camera aimed externally to a vehicle, a window of the vehicle, a projector positioned to project on the window, and a computer communicatively coupled to the camera and the projector. The computer is programmed to, upon receiving data from the camera indicating a first person outside the vehicle, instruct the projector to project an image on the window depicting a second person inside the vehicle.
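The trigger logic in this system can be sketched as a small controller. The class and method names, and the use of a stub projector, are illustrative assumptions.

```python
# Hypothetical sketch of the trigger: when camera data indicates a
# person outside the vehicle, the computer instructs the projector to
# render an image of an occupant on the window. Names are illustrative.

class WindowProjectionController:
    def __init__(self, projector):
        self.projector = projector

    def on_camera_data(self, data: dict) -> None:
        # Project only when a person is detected outside the vehicle.
        if data.get("person_outside"):
            self.projector.project("occupant_image.png")

class FakeProjector:
    """Stub standing in for the window projector hardware."""
    def __init__(self):
        self.projected = []

    def project(self, image: str) -> None:
        self.projected.append(image)
```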
Detecting interactions with non-discretized items and associating interactions with actors using digital images
Commercial interactions with non-discretized items, such as liquids in carafes or other dispensers, are detected and associated with actors using images captured by one or more digital cameras that include the carafes or dispensers within their fields of view. The images are processed to detect body parts of actors and other aspects therein, and not only to determine that a commercial interaction has occurred but also to identify the actor who performed it. Based on information or data determined from such images, movements of body parts associated with raising, lowering, or rotating one or more carafes or other dispensers may be detected, and a commercial interaction involving such carafes or dispensers may be detected and associated with a specific actor accordingly.
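The association step can be illustrated with a nearest-hand heuristic: the actor whose detected hand is closest to the carafe at the time of the interaction is credited with it. Distance-based association is an assumption for illustration, not the disclosed method.

```python
import math

# Hypothetical sketch of associating a detected dispenser interaction
# with an actor: among hands detected in the frame, the actor whose
# hand is closest to the carafe is credited with the interaction.

def associate_interaction(carafe_pos: tuple,
                          actor_hands: dict) -> str:
    """actor_hands maps actor id -> (x, y) hand position."""
    return min(actor_hands,
               key=lambda a: math.dist(actor_hands[a], carafe_pos))
```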