Patent classifications
G06V20/46
Method and apparatus for detecting abnormal objects in video
Disclosed are a method and an apparatus for detecting abnormal objects in a video. The method reconstructs restored batches by applying each input batch, to which an inpainting pattern has been applied, to a trained auto-encoder model, and combines the reconstructed batches into a restored frame to obtain a spatial-domain reconstruction error. It also applies a plurality of successive frames to a trained LSTM auto-encoder model to extract and restore temporal feature points, producing time-domain restored frames and a time-domain reconstruction error. The spatial-domain and time-domain reconstruction errors are then fused to estimate the area where an abnormal object is positioned.
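The error-fusion step described above can be sketched as follows. This is a minimal illustration, not the patented method: the per-pixel absolute error, the weighted-sum fusion rule (`alpha`), and the fixed `threshold` are all assumptions for the example.

```python
import numpy as np

def fuse_reconstruction_errors(frame, spatial_recon, temporal_recon,
                               alpha=0.5, threshold=0.3):
    """Fuse spatial and temporal reconstruction errors to localize anomalies.

    All arrays are H x W grayscale frames in [0, 1]. `alpha` weights the
    spatial error against the temporal error (hypothetical fusion rule).
    """
    spatial_err = np.abs(frame - spatial_recon)    # spatial-domain error
    temporal_err = np.abs(frame - temporal_recon)  # time-domain error
    fused = alpha * spatial_err + (1 - alpha) * temporal_err
    return fused > threshold  # boolean mask of suspected abnormal pixels

frame = np.zeros((4, 4))
frame[1:3, 1:3] = 1.0             # an "object" the models fail to reconstruct
spatial_recon = np.zeros((4, 4))  # stand-in auto-encoder output
temporal_recon = np.zeros((4, 4)) # stand-in LSTM auto-encoder output
mask = fuse_reconstruction_errors(frame, spatial_recon, temporal_recon)
```

The region where the mask is true marks where both models failed to reconstruct the input, i.e. the suspected abnormal object.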
Fatigue state detection method and apparatus, medium, and electronic device
Disclosed are a fatigue state detection method and apparatus, a medium and a device. The method includes: obtaining image blocks containing an organ area of a target object from a plurality of video frames collected by a camera apparatus disposed in a mobile device, to obtain an image-block sequence based on the organ area; determining a fatigue state type of the target object based on the image-block sequence of the organ area; sending the image-block sequence to a cloud server if the fatigue state type meets a first preset type, causing the cloud server to detect a fatigue level of the target object based on the image-block sequence; and receiving fatigue level information about the target object returned by the cloud server. The present disclosure may improve accuracy of fatigue state detection, thereby helping to improve driving safety of the mobile device.
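The local-classify-then-offload control flow can be sketched as below. The eye-openness threshold, the `"suspected_fatigue"` preset type, and the `cloud_grade` callable are all hypothetical stand-ins for the on-device classifier and the cloud server interface.

```python
from typing import Callable, List

def classify_fatigue_type(eye_openness: List[float],
                          closed_thresh: float = 0.2) -> str:
    """Crude local fatigue classifier over an eye-openness sequence (assumed
    proxy for the image-block-sequence analysis in the abstract)."""
    closed = sum(1 for r in eye_openness if r < closed_thresh)
    return "suspected_fatigue" if closed / len(eye_openness) > 0.5 else "normal"

def detect(eye_openness: List[float],
           cloud_grade: Callable[[List[float]], str]) -> str:
    """Offload to the cloud grader only when the local type meets the preset."""
    fatigue_type = classify_fatigue_type(eye_openness)
    if fatigue_type == "suspected_fatigue":   # the "first preset type" (assumed)
        return cloud_grade(eye_openness)      # cloud returns a fatigue level
    return "level_0"                          # no offload needed

level = detect([0.1, 0.05, 0.3, 0.1], cloud_grade=lambda seq: "level_2")
```

The design point is that only sequences that already look suspicious are uploaded, keeping cheap screening on-device and expensive grading in the cloud.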
Detection apparatus, detection method, and computer program product
A detection apparatus includes one or more processors. The processors set at least one time-period candidate. The processors input the feature acquired from a plurality of time-series images, together with the time-period candidate, to a first model; the first model outputs at least one first likelihood, indicating a likelihood of occurrence of at least one action previously determined as a detection target, and correction information for acquiring at least one correction time period that results from correcting the time-period candidate. The processors acquire the first likelihood and the correction information output from the first model. Based on the correction time period acquired from the correction information and on the first likelihood, the processors detect the action included in the time-series images, together with the start time and finish time of the time period in which the action occurs.
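The candidate-refinement loop can be sketched as follows. The model interface (a callable returning a likelihood plus start/end offsets) and the likelihood threshold are assumptions for illustration; the patent's "first model" would be a learned network.

```python
def detect_actions(candidates, model, likelihood_thresh=0.5):
    """Refine time-period candidates with a model's correction offsets and
    keep those whose action likelihood clears a threshold (sketch)."""
    detections = []
    for start, end in candidates:
        likelihood, (d_start, d_end) = model(start, end)
        corrected = (start + d_start, end + d_end)  # corrected time period
        if likelihood >= likelihood_thresh:
            detections.append((corrected, likelihood))
    return detections

# Toy "first model": fixed likelihoods and offsets per candidate (assumed).
outputs = {(0, 10): (0.9, (1.0, -2.0)), (20, 30): (0.2, (0.0, 0.0))}
dets = detect_actions([(0, 10), (20, 30)], lambda s, e: outputs[(s, e)])
```

Each surviving detection carries the corrected start and finish times, matching the abstract's final step.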
Target character video clip playing method, system and apparatus, and storage medium
Provided are a target character video clip playing method, system and apparatus, and a storage medium. The method comprises: performing target character recognition on an entire video using image recognition technology, locating a plurality of video clips containing target characters, and obtaining a first playing time period set corresponding to the video clips; obtaining, according to the audio clips corresponding to each character marked within the entire video, a second playing time period set corresponding to the audio clips of the various characters; merging the time periods included in the playing time period sets to obtain a summed playing time period set for the target characters; and playing the video of the target characters according to the chronological ordering of the playing timelines within the summed playing time period set.
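The merging step is the classic union of overlapping intervals. A minimal sketch, with hypothetical example periods standing in for the recognition results:

```python
def merge_periods(periods):
    """Merge overlapping [start, end] playing periods into a union set,
    sorted chronologically."""
    merged = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous period: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(p) for p in merged]

video_periods = [(0, 5), (12, 20)]   # clips where the character is seen
audio_periods = [(4, 8), (25, 30)]   # clips where the character is heard
playlist = merge_periods(video_periods + audio_periods)
```

The sorted, merged result is exactly the order in which the target-character clips would be played back.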
VIDEO PROCESSING METHOD, VIDEO SEARCHING METHOD, TERMINAL DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
A video processing method, comprising: editing a video to be edited according to a scenario to obtain a target video (S100); acquiring feature parameters of the target video (S200); generating a keyword for the target video according to the feature parameters (S300); and associatively storing the keyword and the target video (S400).
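Steps S300 and S400 amount to building an inverted index from generated keywords to videos, which is what makes the companion searching method possible. A minimal sketch; the `"key:value"` keyword scheme and the feature-parameter dictionary are illustrative assumptions.

```python
def store_with_keywords(index, video_id, feature_params):
    """Generate keywords from a video's feature parameters (S300) and store
    the keyword -> video association (S400). Keyword format is assumed."""
    keywords = [f"{k}:{v}" for k, v in feature_params.items()]
    for kw in keywords:
        index.setdefault(kw, []).append(video_id)
    return keywords

def search(index, keyword):
    """Companion searching method: look up videos by keyword."""
    return index.get(keyword, [])

index = {}
store_with_keywords(index, "clip_001", {"scene": "beach", "mood": "calm"})
```

Because keywords and videos are stored associatively, a later query only needs a dictionary lookup rather than re-analyzing any video.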
VIDEO INFORMATION PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
This application provides a video information processing method performed by an electronic device. The method includes: determining a video image frame set for each of a first video and a second video; determining a static stitching region corresponding to the image frames in each video image frame set; cropping the image frames in each video image frame set according to the static stitching region, and determining an image feature vector for each video from the corresponding cropping result using a video information processing model; and determining a similarity between the first video and the second video based on the image feature vector corresponding to the first video and the image feature vector corresponding to the second video.
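The crop-then-compare pipeline can be sketched as below. Mean-pooling as the "video information processing model" and cosine similarity as the comparison are stand-in assumptions; the patent would use a learned model.

```python
import numpy as np

def crop_static_region(frames, region):
    """Crop every frame to the static stitching region (top, bottom, left,
    right), removing borders that would distort the comparison."""
    t, b, l, r = region
    return [f[t:b, l:r] for f in frames]

def cosine_similarity(a, b):
    """Similarity between two feature vectors in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

frames = [np.ones((8, 8)), np.ones((8, 8))]
cropped = crop_static_region(frames, (2, 6, 2, 6))
# Stand-in for the model's per-video feature vector: mean-pool then flatten.
feat = np.stack(cropped).mean(axis=0).ravel()
sim = cosine_similarity(feat, feat)
```

Cropping to the static stitching region first means watermark bars or letterboxing common to both videos do not inflate the similarity score.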
IMAGE PROCESSING METHOD AND APPARATUS, AND COMPUTER-READABLE STORAGE MEDIUM
An image processing method is provided. For each frame of a video stream, a pixel digital frame mask in the frame is obtained. The pixel digital frame mask includes a plurality of preset pixel position sets. At least two target preset pixel position sets, which together encode the frame sequence number of the frame, are determined from the plurality of preset pixel position sets based on the values of the pixels they include. The frame sequence number corresponding to the frame is then determined according to the positions of the at least two target preset pixel position sets in the pixel digital frame mask. Finally, video fluency of the video stream is determined based on the frame sequence numbers.
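One way such a mask could work is to treat each preset pixel position as one bit of the frame sequence number, then measure fluency as the fraction of expected frames that actually arrived. The bit encoding and the fluency ratio below are illustrative assumptions, not the patent's exact scheme.

```python
def decode_frame_number(frame, bit_positions, on_value=255):
    """Read a frame sequence number from preset pixel positions, treating
    each position as one bit, most-significant first (assumed encoding)."""
    number = 0
    for (y, x) in bit_positions:
        number = (number << 1) | (1 if frame[y][x] >= on_value else 0)
    return number

def fluency(frame_numbers):
    """Fraction of expected frames actually received between first and last
    decoded sequence numbers (assumed fluency metric)."""
    expected = frame_numbers[-1] - frame_numbers[0] + 1
    return len(frame_numbers) / expected

bits = [(0, 0), (0, 1), (0, 2)]  # three preset pixel positions, MSB first
# Toy 1x3 frames whose pixels encode numbers 1, 2, 4 (frame 3 was dropped).
frames = [[[0, 0, 255]], [[0, 255, 0]], [[255, 0, 0]]]
numbers = [decode_frame_number(f, bits) for f in frames]
score = fluency(numbers)
```

A score below 1.0 signals dropped frames, i.e. reduced video fluency.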
VEHICLE-SEARCHING GUIDANCE METHOD AND APPARATUS, TERMINAL DEVICE, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
A vehicle-searching guidance method includes: displaying a vehicle-searching guidance interface in response to a first trigger operation on a first control, the vehicle-searching guidance interface being used for providing guidance information for vehicle searching; transmitting a multimedia content obtaining request to an electronic device in response to a second trigger operation on a second control on the vehicle-searching guidance interface; receiving multimedia guidance content returned by the electronic device according to the multimedia content obtaining request, the multimedia guidance content including environmental information of a parking space in which a vehicle is parked; and presenting the multimedia guidance content for a user to search for the vehicle.
BEHAVIOR RECOGNITION METHOD AND SYSTEM, ELECTRONIC DEVICE AND COMPUTER-READABLE STORAGE MEDIUM
A behavior recognition method and system, including: dividing video data into a plurality of video clips, performing frame extraction processing on each video clip to obtain frame images, and performing optical flow extraction on the frame images to obtain optical flow images; performing feature extraction on the frame images and the optical flow images to obtain feature maps of the frame images and the optical flow images; performing spatio-temporal convolution processing on the feature maps of the frame images and the optical flow images, and determining a spatial prediction result and a temporal prediction result; fusing the spatial prediction results of all the video clips to obtain a spatial fusion result, and fusing the temporal prediction results of all the video clips to obtain a temporal fusion result; and performing two-stream fusion on the spatial fusion result and the temporal fusion result to obtain a behavior recognition result.
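The two fusion stages at the end of the method can be sketched as below. Averaging across clips and a weighted sum across streams are common choices, assumed here for illustration.

```python
import numpy as np

def two_stream_fusion(spatial_preds, temporal_preds, spatial_weight=0.5):
    """Fuse per-clip spatial and temporal predictions into one behavior label.

    Each input is a list of per-clip class-score vectors. Clips are fused by
    averaging; the two streams are then fused by a weighted sum (assumed)."""
    spatial = np.mean(spatial_preds, axis=0)    # spatial fusion result
    temporal = np.mean(temporal_preds, axis=0)  # temporal fusion result
    fused = spatial_weight * spatial + (1 - spatial_weight) * temporal
    return int(np.argmax(fused)), fused         # behavior recognition result

# Two clips, two behavior classes (toy scores).
spatial_preds = [[0.7, 0.3], [0.6, 0.4]]
temporal_preds = [[0.2, 0.8], [0.1, 0.9]]
label, scores = two_stream_fusion(spatial_preds, temporal_preds)
```

Here the optical-flow (temporal) stream outvotes the appearance (spatial) stream, so class 1 is the recognized behavior.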
ACTIVITY RECOGNITION IN DARK VIDEO BASED ON BOTH AUDIO AND VIDEO CONTENT
Videos captured in low light conditions can be processed in order to identify an activity being performed in the video. The processing may use both the video and audio streams to identify the activity. The video portion is processed to generate a darkness-aware feature, which may be used to modulate the features generated from the audio and video streams. The audio features may be used to generate a video attention feature, and the video features may be used to generate an audio attention feature; these attention features may also be used in modulating the audio and video features. The modulated audio and video features may then be used to predict an activity occurring in the video.
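The intuition behind darkness-aware modulation is that the darker the video, the less its visual features should be trusted relative to the audio. A minimal sketch; the linear gating rule below is purely illustrative, not the patent's formulation, which uses learned attention features.

```python
import numpy as np

def darkness_aware_fusion(audio_feat, video_feat, frame_brightness):
    """Weight the two modality features by a darkness score before fusing.

    `frame_brightness` is in [0, 1]; both features are 1-D vectors. The
    scalar gates below stand in for the learned attention modulation."""
    darkness = 1.0 - frame_brightness  # darkness-aware feature (scalar here)
    video_attn = 1.0 - darkness        # trust video less in the dark
    audio_attn = 0.5 + 0.5 * darkness  # trust audio more in the dark
    return audio_attn * audio_feat + video_attn * video_feat

fused = darkness_aware_fusion(np.array([1.0, 0.0]),  # toy audio feature
                              np.array([0.0, 1.0]),  # toy video feature
                              frame_brightness=0.2)  # a dark frame
```

With a dark frame, the fused feature is dominated by the audio component, which is what lets an activity classifier still work when the picture is nearly black.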