Patent classifications
G06V20/35
Machine learning model training method and device, and expression image classification method and device
This application relates to a machine learning model training method and apparatus, and to an expression image classification method and apparatus. The machine learning model training method includes: obtaining a machine learning model that includes a model parameter and that has been trained on a general-purpose image training set; determining a sample of a special-purpose image and a corresponding classification label; inputting the sample of the special-purpose image into the machine learning model to obtain an intermediate classification result; and adjusting the model parameter of the machine learning model according to a difference between the intermediate classification result and the classification label, continuing training, and ending the training when a training stop condition is met. The solutions provided in this application improve the training efficiency of the machine learning model.
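As an illustration of the fine-tuning flow this abstract describes (start from a general-purpose pretrained model, then adjust its parameters on special-purpose samples), a minimal sketch in PyTorch might look like the following; the seven-class expression label space and the hyperparameters are assumptions, not details from the patent.

```python
# Minimal fine-tuning sketch: start from a model pretrained on a general-purpose
# image set, then adjust its parameters on special-purpose (e.g., expression) samples.
import torch
import torch.nn as nn
from torchvision import models

NUM_EXPRESSION_CLASSES = 7  # assumed size of the special-purpose label space

# Pretrained general-purpose model; replace the classifier head for the new labels.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_EXPRESSION_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, labels):
    """Adjust the model parameters from the difference between the
    intermediate classification result and the classification label."""
    optimizer.zero_grad()
    intermediate = model(images)            # intermediate classification result
    loss = criterion(intermediate, labels)  # difference vs. the classification label
    loss.backward()
    optimizer.step()
    return loss.item()

# Training would loop over the special-purpose samples and end once a
# training stop condition (e.g., loss or accuracy target) is met.
```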
INFORMATION PROCESSING SYSTEM, CONTROL METHOD THEREOF, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
An information processing system is provided. The information processing system comprises: an analysis device configured to analyze the emotion of a first person and to quantify an emotion level of the first person; and an identification device configured to identify a second person serving as a communication target of the first person. The information processing system further comprises a notification device configured to notify the second person when the emotion level exceeds a set threshold.
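The core decision in this abstract is a threshold check on the quantified emotion level. A hypothetical sketch of that check follows; the 0.0-1.0 scale, the threshold value, and the notify callback are illustrative assumptions.

```python
# Hypothetical sketch: if the quantified emotion level of the first person
# exceeds a set threshold, the identified second person is notified.
EMOTION_THRESHOLD = 0.8  # assumed scale of 0.0-1.0

def maybe_notify(emotion_level: float, second_person_id: str, notify) -> bool:
    """Notify the communication target when the level exceeds the threshold."""
    if emotion_level > EMOTION_THRESHOLD:
        notify(second_person_id, f"Emotion level {emotion_level:.2f} exceeded threshold")
        return True
    return False
```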
System and method for providing dominant scene classification by semantic segmentation
A method for computing a dominant class of a scene includes: receiving an input image of a scene; generating a segmentation map of the input image, the segmentation map being labeled with corresponding classes from among a plurality of classes; computing a plurality of area ratios based on the segmentation map, each of the area ratios corresponding to a different class of the plurality of classes of the segmentation map; and outputting a detected dominant class of the scene based on a plurality of ranked labels based on the area ratios.
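The area-ratio ranking step described here reduces to counting labeled pixels per class. A small illustrative sketch, assuming the segmentation map is a 2-D array of integer class labels, could be:

```python
# Illustrative sketch: count pixels per class in the segmentation map, convert
# to area ratios, rank the labels, and return the top label as the dominant class.
import numpy as np

def dominant_class(segmentation_map: np.ndarray) -> tuple[int, dict[int, float]]:
    """segmentation_map: 2-D array with one class label per pixel."""
    labels, counts = np.unique(segmentation_map, return_counts=True)
    area_ratios = {int(lbl): cnt / segmentation_map.size for lbl, cnt in zip(labels, counts)}
    ranked = sorted(area_ratios, key=area_ratios.get, reverse=True)
    return ranked[0], area_ratios  # detected dominant class and per-class ratios
```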
Learning Iconic Scenes and Places with Privacy
Devices, methods, and non-transitory program storage devices (NPSDs) are disclosed herein to provide for the privacy-respectful learning of iconic scenes and places, wherein the learning is based on information received from one or more client devices in response to one or more collection criteria specified as part of one or more collection operations launched by a server device. In some embodiments, differential privacy techniques (such as the submission of predetermined amounts of noise-injecting, e.g., randomly generated, data in conjunction with actual data) are employed by the client devices, such that any insights learned by the server device relate only to “hot spots,” “themes,” or other scenes, objects, and/or topics that are highly popular and captured in the digital assets (DAs) of many users. This ensures that the server device has no way to learn or glean insights related to particular users of the individual client devices participating in the collection operations.
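One way to picture the client-side noise injection mentioned above is a randomized-submission scheme: each real observation is mixed with a predetermined number of randomly generated entries before being sent. The sketch below is only an illustration of that idea, not the patented protocol; the scene vocabulary and the noise count are assumptions.

```python
# Hedged sketch: alongside the true scene label derived from a digital asset, the
# client submits a fixed amount of randomly generated "noise" labels, so the server
# can only learn scenes that are popular across many users.
import random

SCENE_VOCABULARY = ["beach", "mountain", "city skyline", "forest"]  # assumed label space
NOISE_SUBMISSIONS_PER_REAL = 3  # assumed, predetermined amount of noise data

def client_submission(true_scene_label: str) -> list[str]:
    """Return the batch a client would send: the real label mixed with random ones."""
    noise = [random.choice(SCENE_VOCABULARY) for _ in range(NOISE_SUBMISSIONS_PER_REAL)]
    batch = noise + [true_scene_label]
    random.shuffle(batch)  # the server cannot tell which entry was the actual data
    return batch
```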
Selective image compression of an image stored on a device based on user preferences
A computer-implemented method according to one embodiment includes analyzing an image stored on a device. In response to determining that a storage consumption of the device is greater than a first predetermined threshold, content of the image that is both non-focused and not of interest is selectively compressed. In response to determining that the storage consumption, subsequent to selectively compressing the content that is both non-focused and not of interest, is greater than a second predetermined threshold, content of the image that is focused but not of interest is selectively compressed. In response to determining that the storage consumption, subsequent to selectively compressing the content that is focused but not of interest, is still greater than the second predetermined threshold, content of the image that is both focused and of interest is selectively compressed.
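The staged policy in this abstract can be summarized as three successive checks against storage thresholds. The following sketch assumes hypothetical threshold values, a `storage_used()` callback, and per-region focus/interest flags purely for illustration.

```python
# Sketch of the staged compression policy: compress progressively more valuable
# image content only while storage consumption stays above the thresholds.
FIRST_THRESHOLD = 0.80   # assumed fraction of device storage used
SECOND_THRESHOLD = 0.90  # assumed

def selectively_compress(image_regions, storage_used, compress):
    """image_regions: list of (region, focused: bool, of_interest: bool)."""
    if storage_used() > FIRST_THRESHOLD:
        # Stage 1: content that is both non-focused and not of interest.
        for region, focused, of_interest in image_regions:
            if not focused and not of_interest:
                compress(region)
    if storage_used() > SECOND_THRESHOLD:
        # Stage 2: content that is focused but not of interest.
        for region, focused, of_interest in image_regions:
            if focused and not of_interest:
                compress(region)
    if storage_used() > SECOND_THRESHOLD:
        # Stage 3: content that is both focused and of interest.
        for region, focused, of_interest in image_regions:
            if focused and of_interest:
                compress(region)
```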
Stereophonic apparatus for blind and visually-impaired people
A method and a wearable system that includes distance sensors, cameras, and headsets, all of which gather data about a blind or visually impaired person's surroundings and are connected to a portable personal communication device. The device is configured to use scenario-based algorithms and AI to process the data and transmit sound instructions to the blind or visually impaired person, enabling him/her to independently navigate and deal with his/her environment through identification of objects and reading of local texts.
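A stereophonic instruction of the kind described can be imagined as mapping an object's bearing and distance to left/right headphone gains. The sketch below is a generic stereo-panning illustration under assumed ranges, not the patented algorithm.

```python
# Hedged sketch: pan a sound cue toward a detected object's direction and scale
# loudness with its proximity, so the wearer can localize it by ear.
import math

def stereo_cue(bearing_deg: float, distance_m: float, max_range_m: float = 5.0):
    """bearing_deg: -90 (left) .. +90 (right); returns (left_gain, right_gain)."""
    pan = math.sin(math.radians(max(-90.0, min(90.0, bearing_deg))))  # -1 .. 1
    loudness = max(0.0, 1.0 - distance_m / max_range_m)               # closer = louder
    left_gain = loudness * (1.0 - pan) / 2.0
    right_gain = loudness * (1.0 + pan) / 2.0
    return left_gain, right_gain

# Example: an obstacle 1 m away, 45 degrees to the right, is rendered louder in the right ear.
```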
MAPPING PHYSICAL LOCATIONS TO FIT VIRTUALIZED AR AND VR ENVIRONMENTS
Systems, methods, and computer programming products for generating, rendering and/or displaying a computer-generated virtual environment as augmented reality and/or virtual reality. The physical boundaries containing the active area where the virtual environments are rendered and displayed are established. Based on the constraints and characteristics of the physical boundaries, virtual environments are mapped using assets from real, historical and/or fictitious locations. The assets can be dynamically re-sized and distanced to fit the constraints of the physical space. Based on historical levels of interactivity with the selected environments, the virtual assets can be sorted and tagged as points of interest or filler assets, then mapped to the virtual environment using GAN technology and other machine learning techniques to re-create unique versions of the selected environments. Virtual thresholds can be introduced to segment the virtual environment into multiple portions and reduce the number of assets that need to be displayed simultaneously.
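The re-sizing of assets to fit the physical boundaries amounts to deriving a scale factor from the active area's dimensions. A simple illustrative sketch, with assumed meter-based dimensions, is:

```python
# Hypothetical sketch: compute a uniform scale factor so a virtual asset's bounding
# box does not exceed the measured active area inside the physical boundaries.
def fit_scale(asset_size, active_area_size):
    """asset_size, active_area_size: (width, depth, height) in meters."""
    ratios = [room / dim for dim, room in zip(asset_size, active_area_size)]
    return min(1.0, *ratios)  # shrink only; never enlarge past the original size

# Example: a 6 m wide facade placed in a 4 m wide living room is scaled to 2/3 size.
scale = fit_scale((6.0, 2.0, 3.0), (4.0, 3.0, 2.5))
```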
Information recommendation method, computer device, and storage medium
Information recommendation methods are provided. Image information corresponding to an image is obtained by processing circuitry. The image is associated with a user identifier. A user tag set corresponding to the user identifier and the image information is generated. A feature vector corresponding to user tags in the user tag set and the image information is formed. The feature vector is processed according to a trained information recommendation model, to obtain a recommendation parameter of to-be-recommended information. A recommendation of the to-be-recommended information is provided to a terminal corresponding to the user identifier according to the recommendation parameter.
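The pipeline described here (user tag set plus image information, turned into a feature vector and scored by a trained model) can be sketched roughly as below; the one-hot encoding, the `model.predict` interface, and the cutoff are assumptions made for illustration.

```python
# Minimal sketch of the recommendation flow: build a feature vector from the user's
# tag set and the image information, score candidates with a trained model, and
# recommend items whose recommendation parameter clears a cutoff.
import numpy as np

def build_feature_vector(user_tags: set, image_info: dict, vocabulary: list) -> np.ndarray:
    """One-hot style vector over a shared vocabulary of tags and image attributes."""
    tokens = user_tags | set(image_info.get("labels", []))
    return np.array([1.0 if tok in tokens else 0.0 for tok in vocabulary])

def recommend(candidates, feature_vector, model, cutoff=0.5):
    """model.predict is assumed to return a recommendation parameter per candidate."""
    scored = [(item, model.predict(feature_vector, item)) for item in candidates]
    return [item for item, param in scored if param >= cutoff]
```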
Discriminative caption generation
A discriminative captioning system generates captions for digital images that can be used to tell two digital images apart. The discriminative captioning system includes a machine learning system that is trained by a discriminative captioning training system that includes a retrieval machine learning system. For training, a digital image is input to the caption generation machine learning system, which generates a caption for the digital image. The digital image and the generated caption, as well as a set of additional images, are input to the retrieval machine learning system. The retrieval machine learning system generates a discriminability loss that indicates how well the retrieval machine learning system is able to use the caption to discriminate between the digital image and each image in the set of additional digital images. This discriminability loss is used to train the caption generation machine learning system.
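A discriminability loss of the kind described rewards captions that retrieve their own image rather than the additional images. One common way to express such a retrieval objective, shown purely as an assumed sketch over embedding similarities, is:

```python
# Hedged sketch of a discriminability loss: given an image-text similarity model,
# a caption should score highest against its own image among the distractors.
import torch
import torch.nn.functional as F

def discriminability_loss(caption_embedding: torch.Tensor,
                          target_image_embedding: torch.Tensor,
                          distractor_embeddings: torch.Tensor) -> torch.Tensor:
    """caption_embedding: (d,); target_image_embedding: (d,); distractors: (n, d)."""
    images = torch.cat([target_image_embedding.unsqueeze(0), distractor_embeddings], dim=0)
    scores = F.cosine_similarity(caption_embedding.unsqueeze(0), images, dim=-1)
    # Cross-entropy over similarities: low loss means the caption discriminates
    # the target image (index 0) from the additional images.
    return F.cross_entropy(scores.unsqueeze(0), torch.zeros(1, dtype=torch.long))
```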
IMAGE RANKING SYSTEM
Systems and methods are provided for generating a base visual score for each candidate image of a plurality of images received by a computing system, based on the scene type of each image. For each candidate image, the computing system multiplies the base visual score by a feature importance weight to generate a first visual score, adds respective scene type bonus points to the first visual score to generate a second visual score, and adds diversity scoring points to the second visual score to generate a final visual score for each candidate image. The computing system ranks the candidate images based on the final visual scores and provides a specified number of the top-ranked candidate images to be displayed on a display of a computing device.
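The scoring arithmetic in this abstract composes a multiplication and two additions before ranking. A short illustrative sketch, in which the weights, bonuses, and candidate fields are assumed names, is:

```python
# Illustrative scoring sketch: base score x feature-importance weight, plus scene-type
# bonus points, plus diversity points, then rank by the final score and keep the top N.
def final_visual_score(base_score: float,
                       feature_importance_weight: float,
                       scene_type_bonus: float,
                       diversity_points: float) -> float:
    first_score = base_score * feature_importance_weight
    second_score = first_score + scene_type_bonus
    return second_score + diversity_points

def rank_candidates(candidates, top_n=5):
    """candidates: list of dicts with the four scoring inputs plus an 'image' key."""
    scored = [(c["image"], final_visual_score(c["base"], c["weight"], c["bonus"], c["diversity"]))
              for c in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```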