G06V20/41

Systems And Methods For Improved Video Understanding

A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
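The abstract does not say how the video tokens are formed. One common way to obtain tokens that carry spatiotemporal information is to cut the frame stack into small non-overlapping blocks ("tubelets") and flatten each block into one token. The sketch below assumes a grayscale video stored as nested lists indexed [frame][row][column] and dimensions that divide evenly by the block size. The function name `extract_tubelet_tokens` and the block sizes are illustrative assumptions, not part of the abstract. The transformer encoder itself is not shown.

```python
def extract_tubelet_tokens(video, t_size, h_size, w_size):
    """Split a [T][H][W] video into non-overlapping tubelets, one flat token each."""
    num_frames = len(video)
    height = len(video[0])
    width = len(video[0][0])
    if num_frames % t_size or height % h_size or width % w_size:
        raise ValueError("video dimensions must be divisible by the tubelet size")

    tokens = []
    # Walk the video block by block in time, then rows, then columns.
    for t0 in range(0, num_frames, t_size):
        for y0 in range(0, height, h_size):
            for x0 in range(0, width, w_size):
                # Flatten one t_size x h_size x w_size block into a single token.
                token = [
                    video[t][y][x]
                    for t in range(t0, t0 + t_size)
                    for y in range(y0, y0 + h_size)
                    for x in range(x0, x0 + w_size)
                ]
                tokens.append(token)
    return tokens


# Hypothetical 4-frame, 4x4 video whose pixel value encodes its position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)] for t in range(4)]
tokens = extract_tubelet_tokens(video, t_size=2, h_size=2, w_size=2)
```

Each token here mixes pixels from two consecutive frames, so a model consuming the tokens sees both spatial and temporal context in every input element.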

COMPUTER VISION-BASED SURGICAL WORKFLOW RECOGNITION SYSTEM USING NATURAL LANGUAGE PROCESSING TECHNIQUES
20230017202 · 2023-01-19

Systems, methods, and instrumentalities are disclosed for computer vision-based surgical workflow recognition using natural language processing (NLP) techniques. Surgical video of surgical procedures may be processed and analyzed, for example, to achieve workflow recognition. Surgical phases may be determined based on the surgical video and segmented to generate an annotated video representation. The annotated video representation of the surgical video may provide information associated with the surgical procedure. For example, the annotated video representation may provide information on surgical phases, surgical events, surgical tool usage, and/or the like.
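As a rough illustration of turning per-frame phase determinations into a segmented, annotated representation, the sketch below groups consecutive frames that share a phase label into time-stamped segments. It assumes a phase label is already available for every frame. The function name `segment_phases`, the label strings, and the frame rate are hypothetical and do not come from the abstract, which does not describe the recognition model or the annotation format.

```python
from itertools import groupby


def segment_phases(frame_labels, fps):
    """Group consecutive identical per-frame labels into (label, start_s, end_s) segments."""
    segments = []
    index = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        # Convert frame indices to seconds using the assumed frame rate.
        segments.append((label, index / fps, (index + length) / fps))
        index += length
    return segments


# Hypothetical per-frame phase predictions at 2 frames per second.
labels = ["access", "access", "dissection", "dissection",
          "dissection", "dissection", "closure", "closure"]
annotated = segment_phases(labels, fps=2)
```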

TECHNIQUES FOR DETECTION/NOTIFICATION OF PACKAGE DELIVERY AND PICKUP

Systems, computer-readable media, methods, and approaches described herein may identify delivery and/or pickup of packages. For example, packages may be identified within the areas captured by images and/or video. Based on the identification of the packages, it may be determined whether the package was delivered or picked up. A notification may be initiated that indicates that a package has been delivered and/or picked up.
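One simple way to decide between delivery and pickup, assuming an upstream detector already reports which packages are visible in the monitored area at two points in time, is to compare the two sets. The sketch below is a minimal illustration under that assumption. The package identifiers and the function names `classify_package_events` and `build_notifications` are hypothetical, and the abstract does not state that the determination is made this way.

```python
def classify_package_events(before_ids, after_ids):
    """Compare packages seen earlier and later in the monitored area."""
    before, after = set(before_ids), set(after_ids)
    return {
        "delivered": sorted(after - before),   # newly visible packages
        "picked_up": sorted(before - after),   # packages no longer visible
    }


def build_notifications(events):
    """Turn classified events into human-readable notification messages."""
    messages = [f"Package {pid} has been delivered" for pid in events["delivered"]]
    messages += [f"Package {pid} has been picked up" for pid in events["picked_up"]]
    return messages


# Hypothetical detector output for an earlier and a later frame.
events = classify_package_events(["a", "b"], ["b", "c"])
notifications = build_notifications(events)
```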

Machine learning-based multi-view video conferencing from single view video data
11706385 · 2023-07-18

Machine learning-based multi-view video conferencing from single view video data, including: identifying, in video data, a plurality of objects; and generating a user interface comprising a plurality of first user interface elements each comprising a portion of the video data corresponding to one or more of the plurality of objects.

GRATITUDE DELIVERY FOR MANUFACTURED PRODUCTS
20230019317 · 2023-01-19

Systems, methods, and other embodiments described herein relate to an improved approach to providing gratitude between consumers and workers. In one embodiment, a method includes acquiring, from a camera within a manufacturing facility, source video of different stages of manufacturing a product. The method includes identifying, from the source video, segments associated with the product. The method includes generating a combined video from the segments. The method includes providing the combined video to a consumer associated with the product.

MEDIA SHARING AND COMMUNICATION SYSTEM
20230016221 · 2023-01-19

A media sharing and communication system, including: a recording mechanism that records a desired portion of media upon activation by a first individual user; a first user transmitter/receiver that transmits the portion of media and a message generated by the first individual user regarding the portion of media to a second individual user; a confirmation mechanism that confirms that the second individual user is authorized to view the portion of media; a notification mechanism that notifies the first individual user if the second individual user is not authorized to receive the portion of media; a second user transmitter/receiver that receives the portion of media and voice message upon authorization of the second individual user; a search mechanism; a video recording mechanism; an online betting module; and an online food ordering module.

OBTAINING CAMERA DEVICE IMAGE DATA REPRESENTING AN EVENT

Methods, computer program products, and systems are presented and can include, for instance: determining a location of interest; and obtaining image data from one or more camera devices about the location of interest.

Image processing method
11557185 · 2023-01-17

An image processing method is provided. The method includes acquiring a video. The method includes using an object detection engine to detect a person in the video. The object detection engine is integrated with an image signal processing pipeline. The method includes transmitting the video over a network. The method includes determining that the detected person has moved less than a pre-set distance. The method includes, responsive to the determining, pausing transmission of the video. An embedded image processor including an object detection engine is also provided.
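The pause decision can be sketched as a distance check on the detected person's position from frame to frame. The example below assumes the object detection engine yields one (x, y) centroid per frame and that movement is measured as Euclidean distance between consecutive centroids. Both are assumptions, since the abstract does not define how the moved distance is computed. The function name `transmission_states` and the threshold value are hypothetical.

```python
import math


def transmission_states(centroids, min_distance):
    """Return one flag per frame after the first: True = transmit, False = paused."""
    states = []
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        moved = math.hypot(x1 - x0, y1 - y0)
        # Pause transmission when the person moved less than the pre-set distance.
        states.append(moved >= min_distance)
    return states


# Hypothetical centroids of the detected person in four consecutive frames.
states = transmission_states([(0, 0), (3, 4), (3, 5), (10, 5)], min_distance=2)
```

Here the person moves 5, 1, and 7 units between frames, so transmission is paused only for the second interval.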

VIDEO PROCESSING FOR ENABLING SPORTS HIGHLIGHTS GENERATION
20230222797 · 2023-07-13

One or more highlights of a video stream may be identified. The highlights may be segments of a video stream, such as a broadcast of a sporting event, that are of particular interest to one or more users. According to one method, at least a portion of the video stream may be stored. The portion of the video stream may be compared with templates of a template database to identify the one or more highlights. Each highlight may be a subset of the video stream that is deemed likely to match the one or more templates. The highlights, an identifier that identifies each of the highlights within the video stream, and/or metadata pertaining particularly to the one or more highlights may be stored to facilitate playback of the highlights for the users.
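The comparison against a template database could take many forms. As a minimal sketch, the example below assumes each stored segment and each template has already been reduced to a numeric feature vector, scores them with cosine similarity, and keeps segments whose best score reaches a threshold. The feature vectors, template names, threshold, and the function name `find_highlights` are all illustrative assumptions. The abstract does not specify the comparison measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_highlights(segments, templates, threshold):
    """Return segments whose best-matching template scores at or above threshold."""
    highlights = []
    for segment_id, features in segments.items():
        best_name, best_score = None, 0.0
        for name, template in templates.items():
            score = cosine_similarity(features, template)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold:
            # The segment identifier locates the highlight within the stream.
            highlights.append({"segment": segment_id, "template": best_name, "score": best_score})
    return highlights


# Hypothetical template database and stored stream segments.
templates = {"goal": [1.0, 0.0, 0.0], "foul": [0.0, 1.0, 0.0]}
segments = {"00:10-00:20": [0.9, 0.1, 0.0], "00:20-00:30": [0.3, 0.3, 0.9]}
highlights = find_highlights(segments, templates, threshold=0.8)
```

The returned records pair each highlight with an identifier and matching metadata, which mirrors the kind of information the abstract says may be stored for later playback.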

Method for estimating operation of work vehicle, system, method for producing trained classification model, training data, and method for producing training data
11556739 · 2023-01-17

A method is performed by a computer. The method includes obtaining motion data indicating a motion change of a work vehicle, and determining an operation classification of the work vehicle from the motion data by performing image classification using a trained classification model. The motion data is generated from a plurality of images indicating the work vehicle in operation in time series.
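The abstract does not say how the motion data is derived from the time-series images. As a stand-in for illustration only, the sketch below condenses a sequence of grayscale frames into a single motion image by averaging the absolute pixel differences between consecutive frames. Such an image could then be passed to an image classifier, which is not shown. The function name `motion_image` and the frame values are hypothetical.

```python
def motion_image(frames):
    """Average absolute difference between consecutive [H][W] frames."""
    if len(frames) < 2:
        raise ValueError("at least two frames are needed to measure motion")
    height, width = len(frames[0]), len(frames[0][0])
    accumulated = [[0.0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                accumulated[y][x] += abs(curr[y][x] - prev[y][x])
    pairs = len(frames) - 1
    # Normalize by the number of frame pairs so the scale is sequence-independent.
    return [[value / pairs for value in row] for row in accumulated]


# Hypothetical 2x2 frames of a work vehicle captured in time series.
frames = [
    [[0, 0], [0, 0]],
    [[10, 0], [0, 0]],
    [[10, 20], [0, 0]],
]
motion = motion_image(frames)
```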