SYSTEM AND METHOD FOR AUTOMATIC DETECTION OF VISUAL EVENTS IN TRANSPORTATION ENVIRONMENTS
20230040565 · 2023-02-09
Inventors
- Ilan Naslavsky (Newton, MA, US)
- Aditya Gupte (Cambridge, MA, US)
- Moran Cohen (Jerusalem, IL)
- David J. Michael (Waban, MA, US)
CPC classification
G06V10/774
PHYSICS
G06V10/765
PHYSICS
G06V10/25
PHYSICS
G06V20/52
PHYSICS
International classification
G06V10/94
PHYSICS
G06V10/25
PHYSICS
G06V10/774
PHYSICS
Abstract
This invention provides a system and method that uses a hybrid model for transportation-based (e.g. maritime) visual event detection. In operation, video data is reduced by detecting change and transmitting images to the deep learning model only when changes are detected, or alternatively, based upon a timer that samples at selected intervals. Relatively straightforward deep learning models, which operate on sparse individual frames, are used instead of complex deep learning models that operate on multiple frames/videos. This approach reduces the need for specialized models. Independent, rule-based classifiers convert the output of the deep learning model into visual events, which, in turn, allows highly specialized events to be constructed. For example, multiple detections can be combined into higher-level single events, and thus the existence of maintenance procedures, cargo activities, and/or inspection rounds can be derived from combining multiple events or multiple detections.
Claims
1. A system for detecting visual events in a transportation environment having one or more locations of interest in which the events occur comprising: a plurality of cameras arranged to image each of a plurality of activities relevant to the transportation environment, in which each camera of the plurality of cameras respectively acquires images of a location of interest and transmits image data thereof to a processor; and a visual detector associated with the processor arranged to include, (a) at least one visual change detector that identifies changes between at least two of the images, (b) at least one pre-trained visual deep learning model that operates on the images and generates a deep learning inference output, and (c) at least one rule-based classifier that produces events or alerts from the deep learning inference output run on images trained with at least one rule.
2. The system as set forth in claim 1 wherein the processor comprises one or more CPUs or one or more GPUs.
3. The system as set forth in claim 1 wherein the visual change detector includes at least one of a data filter, an optical flow processor or image differencer block, and a sum or max threshold operator.
4. The system as set forth in claim 3 wherein the optical flow processor or image differencer operates at multiple scales.
5. The system as set forth in claim 3 wherein the visual detector operates on image sequences with, or free of, activity.
6. The system as set forth in claim 5 wherein the visual change detector is adapted to operate based upon a change in brightness, a threshold of object size, a threshold of texture or a threshold of object velocity.
7. The system as set forth in claim 1 wherein the deep learning model comprises at least one of a single stage detector, YOLO, SSD, a multistage detector, RCNN, FasterRCNN, a segmentation network, MaskRCNN, a segmentation network from an open source library, and a segmentation network from at least one of OpenCV and Detectron2.
8. The system as set forth in claim 1 wherein the deep learning model is adapted to operate based upon at least one of a deep learning and machine learning framework, Caffe, Tensorflow, Pytorch, and Keras.
9. The system as set forth in claim 1 wherein the rule-based classifier operates based upon at least one of an event sequence description, a person or object predicate, a person or object location, a person or object pose, and a labelled image region.
10. The system as set forth in claim 1 wherein the rule-based classifier operates on stored temporal and spatial partial events to generate a complete visual event.
11. The system as set forth in claim 1 wherein the visual change detector includes a clock output.
12. The system as set forth in claim 1 wherein the visual change detector includes an external trigger and a timer that provides image frames to downstream detection processes based upon a predetermined time interval.
13. The system as set forth in claim 1 wherein the rule-based classifier receives regions of interest of the scene as an input.
14. The system as set forth in claim 1 wherein the rule-based classifier is based upon a detected and localized person, a region of interest and an event sequence.
15. The system as set forth in claim 14 wherein, at least one of, (a) the person is a crew on a vessel, (b) the region of interest is a location on the vessel, and (c) the event sequence, is an operation related to the vessel.
16. The system as set forth in claim 15 wherein the cameras and the visual detector are arranged to determine a present time and location of the vessel and compare the time and location to detection of lighting so as to alert of improper lighting at the time and location.
17. A method for detecting visual events in a transportation environment having one or more locations of interest in which the events occur comprising the steps of: imaging each of a plurality of activities relevant to the transportation environment with a plurality of cameras, wherein each camera of the plurality of cameras respectively acquires images of a location of interest and transmits image data thereof to a processor; and providing a visual detector associated with the processor that performs the steps of, (a) identifying changes between at least two of the images, (b) operating at least one pre-trained visual deep learning model on the images to generate at least one deep learning inference output, and (c) producing, with at least one rule-based classifier, events or alerts from the deep learning inference output run on images trained with at least one rule.
18. The method as set forth in claim 17 wherein the visual change detector includes at least one of a data filter, an optical flow processor or image differencer block, and a sum or max threshold operator.
19. The method as set forth in claim 17 wherein the rule-based classifier operates based upon at least one of an event sequence description, a person or object predicate, a person or object location, a person or object pose, and a labelled image region.
20. The method as set forth in claim 17 wherein the rule-based classifier operates on stored temporal and spatial partial events to generate a complete visual event.
21. The method as set forth in claim 17 wherein the rule-based classifier is based upon a detected and localized person, a region of interest and an event sequence, and at least one of, (a) the person is a crew on a vessel, (b) the region of interest is a location on the vessel, and (c) the event sequence, is an operation related to the vessel.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The invention description below refers to the accompanying drawings.
DETAILED DESCRIPTION
I. System Overview
[0027] Reference is made to the above-incorporated U.S. patent application Ser. No. 17/175,364, which, by way of background, depicts the system of
[0029] Note that data used herein can include both direct feeds from appropriate sensors and also data feeds from other data sources that can aggregate various information, telemetry, etc. For example, location and/or directional information can be obtained from navigation systems (GPS etc.) or other systems (e.g. via APIs) through associated data processing devices (e.g. computers) that are networked with a server 130 for the system. Similarly, crew members can input information via an appropriate user interface. The interface can request specific inputs—for example logging into or out of a shift, providing health information, etc.—or the interface can search for information that is otherwise input by crew during their normal operations—for example, determining when a crew member is entering data in the normal course of shipboard operations to ensure proper procedures are being attended to in a timely manner.
[0030] The shipboard location 110 can further include a local image/other data recorder 120. The recorder can be a standalone unit, or part of a broader computer server arrangement 130 with appropriate processor(s), data storage and network interfaces. The server 130 can perform generalized shipboard operations, or be dedicated to operations of the system and method herein, with appropriate software. The server 130 communicates with a work station or other computing device 132 that can include an appropriate display (e.g. a touchscreen) 134 and other components that provide a graphical user interface (GUI). The GUI provides a user on board the vessel with a local dashboard for viewing and controlling manipulation of event data generated by the sensors 118 as described further below. Note that display and manipulation of data can include, but is not limited to, enrichment of the displayed data (e.g. images, video, etc.) with labels, comments, flags, highlights, and the like.
[0031] The information handled and/or displayed by the interface can include a workflow provided between one or more users or vessels. Such a workflow would be a business process where information is transferred from user to user (at shore or at sea interacting with the application over the GUI) for action according to the business procedures/rules/policies. This workflow automation is commonly referred to as “robotic process automation.”
[0032] The processes 150 that run the dashboard and other data-handling operations in the system and method can be performed in whole or in part with the onboard server 130, and/or using a remote computing (server) platform 140 that is part of a land-based, or other generally fixed, location with sufficient computing/bandwidth resources (a base location 142). The processes 150 can generally include a computation process 152 that converts sensor data into meaningful events. This can include machine vision algorithms and similar procedures. A data-handling process 154 can be used to derive events and associated status based upon the events—for example movements of the crew and equipment, cargo handling, etc. An information process 156 can be used to drive dashboards for one or more vessels and provide both status and manipulation of data for a user on the ship and at the base location.
[0033] Data communication between the ship (or other remote location) 110 and the base 142 occurs over one or more reduced-bandwidth wireless channels, which can be facilitated by a satellite uplink/downlink 160, or another transmission modality, for example long-wavelength, over-air transmission. Moreover, other forms of wireless communication can be employed such as mesh networks and/or underwater communication (for example long-range, sound-based communication and/or VLF). Note that when the ship is located near a land-based high-bandwidth channel, or physically connected by-wire while at port, the system and method herein can be adapted to utilize that high-bandwidth channel to send all previously unsent low-priority events, alerts, and/or image-based information.
[0034] The (shore) base server environment 140 communicates via an appropriate, secure and/or encrypted link (e.g. a LAN or WAN (Internet)) 162 with a user workstation 170 that can comprise a computing device with an appropriate GUI arrangement, which defines a user dashboard 172 allowing for monitoring and manipulation of one or more vessels in a fleet over which the user is responsible and manages.
[0035] Referring further to
[0036] Referring again to
[0037] Note that, in various embodiments, the bandwidth of the communications link between vessel and base can be limited by external systems, such as QoS (quality of service) settings on routers/links, or by the internal system (edge server 130)—for example to limit usage to (e.g.) 15% of total available communication bandwidth. This limitation in bandwidth can be based on a variety of factors, including, but not limited to, the time of day and/or a communications satellite usage cost schedule. An appropriate instruction set can be programmed into the server using conventional or custom control processes. The specific settings for such bandwidth control can also be directed by the user via the GUI.
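As a hedged sketch (not part of the disclosure), a cap such as 15% of the available link bandwidth could be enforced with a token-bucket limiter; the class name, rates, and burst window below are illustrative assumptions:

```python
import time

class TokenBucket:
    """Caps transmission at a fraction of the total link bandwidth.
    Illustrative sketch; parameters are assumptions, not from the patent."""

    def __init__(self, link_bps, fraction=0.15, burst_s=1.0):
        self.rate = link_bps * fraction       # permitted bits per second
        self.capacity = self.rate * burst_s   # burst allowance in bits
        self.tokens = self.capacity
        self.last = time.monotonic()

    def try_send(self, n_bits):
        # Refill tokens for elapsed time, capped at the burst capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if n_bits <= self.tokens:
            self.tokens -= n_bits
            return True
        return False   # caller should defer or queue the transmission

# e.g. a 1 Mb/s link limited to 15% usage
bucket = TokenBucket(1_000_000, fraction=0.15, burst_s=1.0)
print(bucket.try_send(150_000))  # → True (within the burst allowance)
print(bucket.try_send(150_000))  # → False (bucket drained)
```

The same object could be consulted by the shipboard queue before each event transmission, so that low-priority traffic never exceeds the configured fraction.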
II. Visual Detectors
[0038] As shown in
[0039] As shown in
[0040] (a) A person is present at their station at the expected time and reports the station, start time, end time, and elapsed time;
[0041] (b) A person has entered a location at the expected time and reports the location, start time, end time, and elapsed time;
[0042] (c) A person moved through a location at the expected time and reports the location, start time, end time, and elapsed time;
[0043] (d) A person is performing an expected activity at the expected location at the expected time and reports the location, start time, end time, and elapsed time—the activity can include (e.g.) watching, monitoring, installing, hose-connecting or disconnecting, crane operating, tying with ropes;
[0044] (e) a person is running, slipping, tripping, falling, lying down, using or not using handrails at a location at the expected time and reports the location, start time, end time, and elapsed time;
[0045] (f) A person is wearing or not wearing protective equipment when performing an expected activity at the expected location at the expected time and reports the location, start time, end time, and elapsed time; protective equipment can include (e.g.) a hard-hat, left or right glove, left or right shoe/boot, ear protection, safety goggles, life-jacket, gas mask, welding mask, or other protection;
[0046] (g) A door is open or closed at a location at the expected time and reports the location, start time, end time, and elapsed time;
[0047] (h) An object is present at a location at the expected time and reports the location, start time, end time and elapsed time—the object can include (e.g.) a gangway, hose, tool, rope, crane, boiler, pump, connector, solid, liquid, small boat and/or other unknown item;
[0048] (i) That normal operating activities are being performed using at least one of engines, cylinders, hose, tool, rope, crane, boiler, and/or pump; and
[0049] (j) That required maintenance activities are being performed on engines, cylinders, boilers, cranes, steering mechanisms, HVAC, electrical, pipes/plumbing, and/or other systems.
[0050] Note that the above-recited listing of examples (a-j) are only some of a wide range of possible interactions that can form the basis of detectors according to illustrative embodiments herein. Those of skill should understand that other detectable events involving person-to-person, person-to-equipment or equipment-to-equipment interaction are expressly contemplated.
[0051] In operation, an expected event visual detector takes as input the detection result of one or more vision systems aboard the vessel. The result could be a detection, no detection, or an anomaly at the time of the expected event according to the plan. Multiple events or multiple detections can be combined into a higher-level single event. For example, maintenance procedures, cargo activities, or inspection rounds may result from combining multiple events or multiple detections. Note that each visual event is associated with a particular (or several) vision system camera(s) 118, 180, 182 at a particular time and the particular image or video sequence at a known location within the vessel. The associated video can optionally be sent or not sent with each event or alarm. When the video is sent with the event or alarm, it may be useful for later validation of the event or alarm. Notably, the discrete images and/or short-time video frame sequences represent a small fraction of the video stream, and consequently represent a substantial reduction in the bandwidth required for transmission over the reduced-bandwidth link in comparison to the entire video sequence. Moreover, in addition to compacting the video by reducing it to a few images or a short-time sequence, the system can reduce the images in size, either by cropping the images down to significant or meaningful image locations required by the detector, or by reducing the resolution, for example from the equivalent of high-definition (HD) resolution to standard-definition (SD) resolution, or below standard resolution.
[0052] In addition to reducing bandwidth by identifying events via the vision system and cropping such images where appropriate, the number of image frames in a sequence can be reduced by increasing the interval of time between frames. Moreover, bandwidth can be even further reduced by applying the procedures above and then subjecting (all on the shipboard server side) the event-centric, cropped, spaced-apart image data to commercially available or customized lossy or lossless image compression techniques. Such techniques can include, but are not limited to, discrete cosine transform (DCT), run-length encoding (RLE), predictive coding, and/or Lempel-Ziv-Welch (LZW).
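The compaction chain described above (crop to a region of interest, decimate, then compress) can be sketched in pure Python, with the stdlib zlib codec standing in for the DCT/RLE/LZW techniques named in the text; the function names, frame representation, and sizes are illustrative assumptions:

```python
import zlib

def crop(frame, x, y, w, h):
    """frame: list of rows, each a list of 8-bit pixel values."""
    return [row[x:x + w] for row in frame[y:y + h]]

def downscale2x(frame):
    """Naive 2x decimation: keep every other pixel in each dimension."""
    return [row[::2] for row in frame[::2]]

def compress(frame):
    """Losslessly pack the reduced frame (zlib as a stand-in codec)."""
    raw = bytes(p for row in frame for p in row)
    return zlib.compress(raw)

# a flat 100x100 8-bit "image"; only a 40x40 region matters to the detector
frame = [[10] * 100 for _ in range(100)]
roi = downscale2x(crop(frame, 20, 20, 40, 40))   # 40x40 crop -> 20x20
payload = compress(roi)
print(len(payload), "bytes for", len(roi) * len(roi[0]), "pixels")
```

In a deployment the crop rectangle would come from the detector's region of interest and the compressed payload would enter the priority queue described below in the text.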
[0053] The images or video sequences NOT associated with visual events may be stored for some period of time on board the vessel.
[0054] The shipboard server establishes a priority of transmission for the processed visual events that is based upon settings provided from a user, typically operating the on-shore (base) dashboard. The shipboard server buffers these events in a queue in storage that can be ordered based upon the priority. Priority can be set based on a variety of factors—for example personnel safety and/or ship safety can have first priority and maintenance can have last priority, generally mapping to the urgency of such matters. By way of example, all events in the queue with highest priority are sent first. They are followed by events with lower priority. If a new event arrives shipboard with higher priority, then that new higher priority event will be sent ahead of lower priority events. It is contemplated that the lowest priority events can be dropped if higher priority events take all available bandwidth. The shipboard server receives acknowledgements from the base server on shore and confirms that events have been received and acknowledged on shore before marking the shipboard events as having been sent. Multiple events may be transmitted prior to receipt (or lack of receipt) of acknowledgement. Lack of acknowledgement potentially stalls the queue or requires retransmission of an event prior to transmitting all next events in the priority queue on the server. The shore-based server interface can configure or select the visual event detectors over the communications link. In addition to visual events, the system can transmit non-visual events like a fire alarm signal or smoke alarm signal.
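The priority-queue-with-acknowledgement behavior described above can be sketched as follows; the category-to-priority mapping and the class interface are assumptions for illustration only:

```python
import heapq
import itertools

# Assumed mapping: safety first, maintenance last, per the text's example.
PRIORITY = {"personnel_safety": 0, "ship_safety": 0, "cargo": 1, "maintenance": 2}

class EventQueue:
    """Buffers shipboard events; highest priority transmits first, and an
    event is only marked as sent once the shore server acknowledges it."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # stable FIFO order within a priority
        self.awaiting_ack = {}

    def push(self, event_id, category):
        heapq.heappush(self._heap,
                       (PRIORITY[category], next(self._seq), event_id))

    def next_to_send(self):
        # Pop the highest-priority event and hold it pending acknowledgement.
        if self._heap:
            prio, seq, event_id = heapq.heappop(self._heap)
            self.awaiting_ack[event_id] = (prio, seq)
            return event_id
        return None

    def ack(self, event_id):
        self.awaiting_ack.pop(event_id, None)   # safe to mark as sent

q = EventQueue()
q.push("e1", "maintenance")
q.push("e2", "personnel_safety")
print(q.next_to_send())  # → e2 (safety outranks maintenance)
```

A real implementation would also re-queue unacknowledged events for retransmission and drop the lowest-priority entries when higher-priority traffic consumes all available bandwidth, as the paragraph above contemplates.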
[0055] Note that a single visual event detector may operate continuously, and receive input from a single video camera typically running at 15, 30 or 60 frames per second. A typical deployment may involve several or dozens of visual event detectors running on the input from several or dozens of video cameras. By way of example of such operation, ten (10) channels of raw video data generate 5 Mb/s per HD video channel, or 50 Mb/s in aggregate, which represents a substantial volume of input and renders the use of bandwidth reduction, as described above, highly desirable.
III. System Operation
[0056] With reference to
IV. Hybrid Event Detector
[0058] Reference is made to
[0059] The temporal change detector 210 processes full-framerate raw video input or an appropriate image sequence 212 from at least one camera and produces as output sampled video of scenes with activity and moving objects. However, broadly stated, a signal of interest can be detected by the presence of an object as well as by its motion (i.e. free of object movement). Other triggers for possible detection can be derived from (e.g.) other types of sensors, a timer, and/or interleaved input from multiple cameras.
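The change-gating idea in this section (forward frames to the deep learning model only when inter-frame change exceeds a threshold) can be sketched with simple frame differencing and a sum/count threshold; the thresholds and list-of-rows frame representation below are illustrative assumptions:

```python
def frame_delta(prev, curr):
    """Absolute per-pixel difference between two grayscale frames
    (frames are lists of rows of 8-bit values)."""
    return [[abs(a - b) for a, b in zip(pr, cr)]
            for pr, cr in zip(prev, curr)]

def has_activity(prev, curr, pixel_thresh=25, count_thresh=50):
    """Sum-threshold gate: report change when enough pixels moved."""
    delta = frame_delta(prev, curr)
    changed = sum(1 for row in delta for d in row if d > pixel_thresh)
    return changed >= count_thresh

# static scene: the frame is NOT forwarded to the deep learning model
still = [[12] * 64 for _ in range(64)]
print(has_activity(still, still))          # → False

# an object appears in a 10x10 region: the frame IS forwarded
moved = [row[:] for row in still]
for r in range(10):
    for c in range(10):
        moved[r][c] = 200
print(has_activity(still, moved))          # → True
```

A production detector would use optical flow or multi-scale differencing as the text describes; this sketch only shows the gating principle that reduces the frame rate seen by the deep learning model.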
[0060] The depicted, exemplary detector 210 in
[0061] As shown further in
[0062] It is noted that additional modules can be provided optionally to the general flow of
[0063] With reference to
[0064] Reference is further made to
[0065] With reference to
[0067] In operation, the classifier receives output from the deep-learning vision detectors reporting what has been detected (such as a person, an object such as a tool or motor, or a boat/vessel), where in the particular image that detection took place, and possibly pose information (how the person is standing or the object is positioned in space). It either converts that output directly into an alert or, more typically, combines it, using mathematical logic, with additional information such as the expected detection/location/pose and the duration of the output to form the specific alert.
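A minimal sketch of such a rule-based classifier follows, assuming a hypothetical detection record and a person-in-region rule; the names, fields, and thresholds are illustrative and not from the disclosure:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "person", as reported by the deep learning model
    location: tuple    # (x, y) centroid of the detection in the image
    duration_s: float  # how long the detection has persisted

def point_in_polygon(pt, poly):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def presence_at_station(det, station_poly, min_duration_s=1.0):
    """Rule: a person inside the station polygon for long enough."""
    return (det.label == "person"
            and det.duration_s >= min_duration_s
            and point_in_polygon(det.location, station_poly))

station = [(0, 0), (10, 0), (10, 10), (0, 10)]
d = Detection("person", (5, 5), duration_s=3.0)
print(presence_at_station(d, station))  # → True
```

The same pattern generalizes to the other rules in the text: swap the spatial predicate (line crossing, pose test) and the expected duration to obtain a different alert.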
[0068] It is recognized that a deep learning model typically occupies significant memory resources (on the order of several gigabytes). In order to run at video frame rate for a single camera, multiple commercially available GPUs, e.g., eight (8) NVidia GPUs, may be required. However, the system and method described herein, and run on the processing arrangement 150 and associated computing platform(s) 130 and 142 of
V. Operational Examples
[0069] By way of non-limiting example, the above-described system and method can operate in a variety of instances.
[0070] A. Event Examples
[0071] 1. An example of a crew behavior visual event is that crew members are performing expected activities on the bridge of the vessel such as navigation at the expected time and the event also includes a reported location, start time, end time and elapsed time.
[0072] 2. An example of a crew safety visual event is an alert that the crew members are wearing hard-hats when required to do so by their assigned activity.
[0073] 3. An example of a ship maintenance visual event is an alert that engine oil is being added to the engine at an appropriate time.
[0074] 4. An example of a ship environment visual event is an alert that another vessel is in the vicinity.
[0075] 5. An example of an active cargo monitoring visual event is an alert that the crew members have performed an inspection round on the cargo.
[0076] B. Further Examples of Maritime Visual Events Reported by the System
[0077] 1. A person is present at their station at the expected time and reports the station, start time, end time, and elapsed time.
[0078] 2. A person has entered a location at the expected time and reports the location, start time, end time, and elapsed time.
[0079] 3. A person moved through a location at the expected time and reports the location, start time, end time, and elapsed time.
[0080] 4. A person is performing an expected activity at the expected location at the expected time and reports the location, start time, end time, and elapsed time. The activity could be watching, monitoring, installing, hose connecting or disconnecting, crane operating, tying with ropes.
[0081] 5. A person is running, slipping, tripping, falling, lying down, using or not using handrails at a location at the expected time and reports the location, start time, end time, and elapsed time.
[0082] 6. A person is wearing or not wearing protective equipment when performing an expected activity at the expected location at the expected time and reports the location, start time, end time, and elapsed time. Protective equipment could be a hard-hat, left or right glove, left or right shoe/boot, ear protection, safety goggles, life-jacket, gas mask, welding mask, or other protection.
[0083] 7. A door is open or closed at a location at the expected time and reports the location, start time, end time, and/or elapsed time.
[0084] 8. An object is present at a location at the expected time and reports the location, start time, end time and elapsed time. The object could be a gangway, hose, tool, rope, crane, boiler, pump, connector, solid, liquid, small boat or unknown.
[0085] 9. Normal operating activities are being performed using engines, cylinders, hose, tool, rope, crane, boiler, and/or pump.
[0086] 10. Maintenance activities are being performed on engines, cylinders, boilers, cranes, steering mechanisms, HVAC, electrical, pipes/plumbing, and/or other systems.
[0087] C. Operational Example
[0089] Referring again to the functional blocks for data acquisition 141, detection pipeline 145 and data transport 151 shown in
[0090] In
[0092] With reference to
[0093] To aid in understanding the types of detector building blocks available, the following are non-limiting examples of various Activities, Events, Sequences and GUI Activities that can be employed in the system and method herein.
[0094] The following Table lists exemplary activities in a maritime environment.
TABLE-US-00001
# | activity_id | scene-field_of_view | metric
1 | d3d_scene_round_steering_room | steering_room | start_time, duration, end_time
2 | d3d_scene_round_main_engine | main_engine | start_time, duration, end_time
3 | d3d_scene_round_deck_port_view | deck_port_view | start_time, duration, end_time
4 | d3d_scene_round_deck_stb_view | deck_stb_view | start_time, duration, end_time
5 | d3d_scene_round_generator | generator | start_time, duration, end_time
6 | d3d_equipment_inspection_bridge_port_view | bridge_port_view | start_time, duration, end_time
[0095] The following tables list series of exemplary events and their characteristics/characterization in the system and method herein operating in a maritime environment.
TABLE-US-00002
# | d2d_id | summary
1 | d2d_presence_in_polygon | person in polygon
2 | d2d_presence_at_station | person standing at station
3 | d2d_interaction_equipment | person interacting with equipment
4 | d2d_inspection_equipment | person inspecting equipment
5 | d2d_compartment_access | person entering compartment
6 | d2d_compartment_access_segment | person entering compartment (segmentation based)
TABLE-US-00003 (event descriptions; "in" inputs shown inline)
1. Iterating over a list of bounding boxes, gets a d1d_person bounding box for a detected person and a polygon; returns TRUE if the bounding box (person) is within the polygon, FALSE otherwise.
2. Iterating over a list of keypoint vectors, gets a d1d_keypoint keypoint vector for a detected person and a polygon-station; returns TRUE if the detected person is pose-standing and their feet keypoints are within polygon-station, FALSE otherwise.
3. Iterating over a list of keypoint vectors, gets a d1d_keypoint keypoint vector for a detected person and a polygon-equipment; returns TRUE if the detected person is pose-interacting and their hand keypoints are within polygon-equipment, FALSE otherwise.
4. Iterating over a list of keypoint vectors, gets a d1d_keypoint keypoint vector for a detected person and a polygon-equipment; returns TRUE if the detected person is pose-facing and their feet keypoints are within polygon-station, FALSE otherwise.
5. Iterating over a list of bounding boxes, gets a d1d_person bounding box for a detected person, a line, and a previous state of the stream; returns TRUE if the detected person has crossed the line in the access direction, FALSE otherwise.
6. d1d_segment_person (remaining description missing or illegible when filed).
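Row 5's line-crossing predicate (a detection moving across an access line between consecutive frames, given previous state) could be sketched as below; the geometry helper and the names are illustrative assumptions:

```python
def side_of_line(pt, a, b):
    """Sign of the cross product: which side of line a->b the point is on."""
    return (b[0] - a[0]) * (pt[1] - a[1]) - (b[1] - a[1]) * (pt[0] - a[0])

def crossed_line(prev_pt, curr_pt, a, b):
    """TRUE when the tracked point moved from one side of the access
    line to the other between consecutive frames."""
    s0 = side_of_line(prev_pt, a, b)
    s1 = side_of_line(curr_pt, a, b)
    return s0 * s1 < 0

door = ((0, 0), (0, 10))                     # a vertical access line
print(crossed_line((-2, 5), (3, 5), *door))  # → True (crossed the line)
print(crossed_line((3, 5), (4, 5), *door))   # → False (stayed on one side)
```

Checking the sign of the crossing against the access direction, as the table's description requires, would only add a comparison of s0's sign to the expected entry side.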
TABLE-US-00004
# | out | metric | in (1_out) | function
1 | bounding_box_list, stream_id | timestamp, people_count | bounding_box_list, stream_id | in_polygon:polygon-general -> boolean_list
2 | keypoint_vector_list, stream_id | timestamp, people_count | keypoint_vector_list, stream_id | at_polygon:polygon-station; estimate_pose:pose-standing -> boolean_list
3 | keypoint_vector_list, stream_id | timestamp, people_count | keypoint_vector_list, stream_id | in_polygon:polygon-equipment; estimate_pose:pose-interacting -> boolean_list
4 | | timestamp, people_count | keypoint_vector_list, stream_id |
5 | bounding_box, stream_id | timestamp, people_count | bounding_box_list, stream_id | line_crossed:line-access -> boolean_list
6 | contour_list, stream_id | timestamp, people_count | contour_list, stream_id | line_crossed:line-access -> boolean_list
7 | bounding_box, stream_id | timestamp, people_count | bounding_box_list, stream_id | line_crossed:line-exit -> boolean_list
8 | bounding_box_list, stream_id | timestamp, people_count | bounding_box_list, stream_id | in_scene:scene-field_of_view -> boolean_list
9 | bounding_box_list, stream_id | timestamp, people_count | bounding_box_list, stream_id | line_crossed:line-access -> boolean_list
10 | bounding_box_list, stream_id | timestamp, people_count | bounding_box_list, stream_id | line_crossed:line-access -> boolean_list
11 | keypoint_vector_list, stream_id | timestamp, people_count | keypoint_vector_list, stream_id | estimate_pose:pose-facing -> boolean_list
12 | keypoint_vector_list, stream_id | timestamp, people_count | keypoint_vector_list, stream_id | estimate_pose:pose-standing -> boolean_list
13 | keypoint_vector_list, stream_id | timestamp, people_count | keypoint_vector_list, stream_id | estimate_pose:pose-interacting -> boolean_list
TABLE-US-00005
# | function_operator | function_parameter | function_out
1 | in_polygon | polygon-general | boolean_list
2 | at_polygon; estimate_pose | polygon-station; pose-standing | boolean_list
3 | in_polygon; estimate_pose | polygon-equipment; pose-interacting | boolean_list
4 | | |
5 | line_crossed | line-access | boolean_list
6 | line_crossed | line-access | boolean_list
7 | line_crossed | line-exit | boolean_list
8 | in_scene | scene-field_of_view | boolean_list
9 | line_crossed | line-access | boolean_list
10 | line_crossed | line-access | boolean_list
11 | estimate_pose | pose-facing | boolean_list
12 | estimate_pose | pose-standing | boolean_list
13 | estimate_pose | pose-interacting | boolean_list
TABLE-US-00006
# | used by: UI ACTIVITIES
1 | d3d_scene_round_middle
2 | d3d_equipment_inspection_s, d3d_equipment_maintenance, d3d_equipment_operation_s, d3d_equipment_maintenance, d3d_equipment_operation_e, d3d_equipment_inspection_e, d3d_station_presence_start, d3d_station_presence_middl, d3d_station_presence_end
3 | d3d_equipment_maintenance, d3d_equipment_operation_m
4 | d3d_equipment_inspection_n
5 | d3d_scene_round_start, d2d_compartment_access_b
6 |
(Several identifiers above are truncated; the filing indicates data missing or illegible when filed.)
[0096] The following table lists a series of exemplary sequences of events and activities relevant to a marine environment.
TABLE-US-00007
# | sequence_step | summary | EVENT
d3d_id: d3d_equipment_inspection
1 | d3d_equipment_inspection_start | equipment inspection | d2d_presence_at_station
2 | d3d_equipment_inspection_middle | equipment inspection | d2d_inspection_equipment
3 | d3d_equipment_inspection_end | equipment inspection | d2d_presence_at_station
d3d_id: d3d_equipment_maintenance
4 | d3d_equipment_maintenance_start | equipment maintenance | d2d_presence_at_station
5 | d3d_equipment_maintenance_middle | equipment maintenance | d2d_interaction_equipment
6 | d3d_equipment_maintenance_end | equipment maintenance | d2d_presence_at_station
d3d_id: d3d_equipment_operation
7 | d3d_equipment_operation_start | equipment operation | d2d_presence_at_station
8 | d3d_equipment_operation_middle | equipment operation | d2d_interaction_equipment
9 | d3d_equipment_operation_end | equipment operation | d2d_presence_at_station
d3d_id: d3d_scene_round
10 | d3d_scene_round_start | single-scene round | d2c_compartment_access
11 | d3d_scene_round_middle | single-scene round | d2d_presence_in_polygon
12 | d3d_scene_round_end | single-scene round | d2d_compartment_exit
d3d_id: d3d_station_presence
13 | d3d_station_presence_start | person at station | d2d_presence_at_station
14 | d3d_station_presence_middle | person at station | d2d_presence_at_station
15 | d3d_station_presence_end | person at station | d2d_presence_at_station
TABLE-US-00008
# | function_parameter | # of repetitions | order | condition | value
1 | polygon-station; pose-standing | single | start | TRUE_duration | >1 sec
2 | | single | middle | TRUE_duration | >10 sec
3 | polygon-station; pose-standing | single | end | FALSE_duration | >5 sec
4 | polygon-station; pose-standing | single | start | TRUE_duration | >5 sec
5 | polygon-equipment; pose-interacting | single | middle | TRUE_duration | >10 sec
6 | polygon-station; pose-standing | single | end | FALSE_duration | >5 sec
7 | polygon-station; pose-standing | single | start | TRUE_duration | >5 sec
8 | polygon-equipment; pose-interacting | multiple | middle | TRUE_duration | >10 sec
9 | polygon-station; pose-standing | single | end | FALSE_duration | >5 sec
10 | line-access | single | start | boolean_detection | TRUE
11 | polygon-general | multiple | middle | TRUE_duration | >1 sec
12 | line-exit | single | end | boolean_detection | TRUE
13 | polygon-station; pose-standing | single | start | TRUE_duration | >1 sec
14 | polygon-station; pose-standing | multiple | middle | TRUE_duration | >5 sec
15 | polygon-station; pose-standing | single | end | FALSE_duration | >5 sec
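The start/middle/end sequences with minimum durations shown in the sequence and parameter tables can be matched with a simple in-order scan over stored partial events; this sketch simplifies the end condition (which the parameter table expresses as an absence duration) into a plain minimum-duration match, and the durations below are taken from the inspection rows above:

```python
# Each step: (order, event_id, minimum duration in seconds).
EQUIPMENT_INSPECTION = [
    ("start",  "d2d_presence_at_station",   1.0),
    ("middle", "d2d_inspection_equipment", 10.0),
    ("end",    "d2d_presence_at_station",   5.0),
]

def match_sequence(steps, observed):
    """observed: chronological (event_id, duration_s) partial events.
    Returns True when the steps occur in order with sufficient duration,
    yielding a complete visual event from the stored partial events."""
    i = 0
    for event_id, duration in observed:
        _order, expected, min_dur = steps[i]
        if event_id == expected and duration >= min_dur:
            i += 1
            if i == len(steps):
                return True
    return False

observed = [("d2d_presence_at_station", 2.0),
            ("d2d_inspection_equipment", 12.0),
            ("d2d_presence_at_station", 6.0)]
print(match_sequence(EQUIPMENT_INSPECTION, observed))  # → True
```

Steps marked "multiple" in the tables would allow a step to repeat before advancing; that refinement is omitted here for brevity.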
TABLE-US-00009
#   metric      description                                                           ACTIVITIES3
1   start_time  Person standing by equipment and facing it, observing it              d3d_equipment_inspection bridge_por
2   duration    Person standing by equipment and facing it, observing it              d3d_equipment_inspection bridge_por
3   end_time    Person standing by equipment and facing it, observing it              d3d_equipment_inspection bridge_por
4   start_time  Person interacting with equipment in a way that implies maintenance   . . .
5   duration    Person interacting with equipment in a way that implies maintenance   . . .
6   end_time    Person interacting with equipment in a way that implies maintenance   . . .
7   start_time  Person interacting with equipment
8   duration    Person interacting with equipment
9   end_time    Person interacting with equipment
10  start_time  Person traversing a path within a scene, beginning and finishing at . . .
                d3d_scene_round steering_room
                d3d_scene_round main_engine
                d3d_scene_round deck_port_view
                d3d_scene_round deck_stb_view
                d3d_scene_round generator
11  duration    Person traversing a path within a scene, beginning and finishing at . . .
                d3d_scene_round steering_room
                d3d_scene_round main_engine
                d3d_scene_round deck_port_view
                d3d_scene_round deck_stb_view
                d3d_scene_round generator
12  end_time    Person traversing a path within a scene, beginning and finishing at . . .
                d3d_scene_round steering_room
                d3d_scene_round main_engine
                d3d_scene_round deck_port_view
                d3d_scene_round deck_stb_view
                d3d_scene_round generator
13  start_time  person staying for significant time at station
14  duration    person staying for significant time at station
15  end_time    person staying for significant time at station
[0097] The following table lists a series of exemplary user interface (UI) activities relevant to a marine environment.
TABLE-US-00010
#   ui_activity_id                                  scene              metric (from 2 EVENTS)
1   d2d_presence_in_compartment_bridge_port_view    bridge_port_view   timestamp, people_count
2   d2d_compartment_access_bridge_port_view         bridge_port_view   timestamp, people_count
3   d2d_compartment_exit_bridge_port_view           bridge_port_view   timestamp, people_count
[0098] It should be clear that the foregoing tables represent examples of operations relevant to a particular marine environment, and that they can be varied for use in different environments.
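[0098.1] The three-step (start/middle/end) sequences in the foregoing tables can be illustrated as a rule-based classifier that promotes lower-level (d2d) detections into a higher-level (d3d) event. The following is a minimal sketch, not the patent's implementation: the class names, the event-record fields, and the specific durations are simplified placeholders chosen only to mirror the spirit of TABLE-US-00007 and TABLE-US-00008.

```python
from dataclasses import dataclass

@dataclass
class Step:
    d2d_id: str          # required lower-level detection, e.g. "d2d_presence_at_station"
    min_duration: float  # seconds the condition must hold (cf. "TRUE_duration >N sec")

@dataclass
class D2DEvent:
    d2d_id: str
    duration: float      # seconds the detection persisted

# Example sequence: equipment inspection = presence / inspection / presence,
# with illustrative minimum durations per step.
EQUIPMENT_INSPECTION = [
    Step("d2d_presence_at_station", 1.0),    # start
    Step("d2d_inspection_equipment", 10.0),  # middle
    Step("d2d_presence_at_station", 5.0),    # end
]

def matches(step: Step, event: D2DEvent) -> bool:
    # A d2d event satisfies a step when its id matches and its
    # duration exceeds the step's threshold.
    return event.d2d_id == step.d2d_id and event.duration > step.min_duration

def classify(sequence, d2d_stream):
    """Return True when the ordered steps of a d3d sequence are each
    satisfied, in order, by events in the d2d stream."""
    i = 0
    for ev in d2d_stream:
        if i < len(sequence) and matches(sequence[i], ev):
            i += 1
    return i == len(sequence)
```

Under this sketch, a stream containing a sufficiently long presence, inspection, and closing presence yields the d3d event; a stream whose middle step is too short does not. A production classifier would additionally carry the region (polygon/line) and pose parameters of TABLE-US-00008.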
[0099] A further optional detection process contemplated herein relates to detection of a hazard in which the bridge lights are on while the vessel is underway at night. By way of analogy, a similar hazard occurs in a vehicle at night if the lights in the car cabin are on: the driver loses night vision accommodation and is unable to see well outside in the dark. At night, seafarers on the bridge who are supposed to be on watch or lookout cannot do their jobs properly if the bridge lights are on. For this reason, muted red lights and/or instrument panel lights are often substituted for full illumination of the bridge space and its surroundings at night.
[0100] The detection software employs, as decision-making inputs, a satellite-based Automatic Identification System (AIS) signal indicating that the vessel is underway and not in port or at anchor, and also determines the vessel's GPS coordinates. In this manner, the detection process can estimate when it is nighttime, as defined by sunrise/sunset, or more precisely, as 30 minutes after dusk and 30 minutes before dawn, which depends upon the exact location of the ship on the globe. It also employs the UTC time from the ship's clock to determine whether it is nighttime at that GPS location. All of these inputs are used by the process to determine whether it is nighttime (within a predetermined fixed or variable threshold of dusk/dawn) at the current time/latitude/longitude. This determination is then compared to an indicia of whether the lights are on or off. In an example, two vision algorithms/processes are employed to estimate whether the lights are on: (1) using the camera as an absolute light meter, measuring the light entering the camera on the bridge against a threshold; and (2) using the camera as a relative light-measurement device, examining the distribution of gray values in the histogram of the scene for bright regions corresponding to an illuminated object or light fixture. The detection is reported as positive (e.g. bridge lighting is on at night while underway) if all of these tests pass simultaneously. If so, an alert is issued.
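[0100.1] The decision logic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the brightness thresholds are hypothetical, the dusk/dawn times are assumed to be pre-computed from the ship's GPS position, and the bridge image is represented as a plain 2-D array of gray values rather than a camera frame.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds -- the patent does not specify numeric values.
ABS_BRIGHTNESS_THRESHOLD = 60.0  # mean gray level above which the bridge reads "lit"
BRIGHT_PIXEL_FRACTION = 0.02     # fraction of near-saturated pixels implying a lit fixture
TWILIGHT_MARGIN_MIN = 30         # minutes after dusk / before dawn

def is_night(utc_now, dusk_utc, dawn_utc, margin_min=TWILIGHT_MARGIN_MIN):
    """Nighttime is defined as 30 minutes after dusk through 30 minutes
    before dawn; dusk/dawn are assumed pre-computed for the ship's
    GPS latitude/longitude."""
    start = dusk_utc + timedelta(minutes=margin_min)
    end = dawn_utc - timedelta(minutes=margin_min)
    return start <= utc_now <= end

def lights_on(gray_image):
    """Two independent vision tests on a grayscale bridge image:
    (1) absolute light meter -- mean brightness over a threshold;
    (2) relative measurement -- histogram tail of near-saturated pixels
        that would correspond to an illuminated object or light fixture."""
    pixels = [p for row in gray_image for p in row]
    mean_level = sum(pixels) / len(pixels)
    absolute_test = mean_level > ABS_BRIGHTNESS_THRESHOLD
    bright = sum(1 for p in pixels if p >= 230)  # near-saturated gray values
    relative_test = (bright / len(pixels)) > BRIGHT_PIXEL_FRACTION
    return absolute_test and relative_test

def bridge_light_alert(ais_underway, utc_now, dusk_utc, dawn_utc, gray_image):
    """Report positive only when all tests pass simultaneously:
    underway per AIS, nighttime at this position, and lights on."""
    return (ais_underway
            and is_night(utc_now, dusk_utc, dawn_utc)
            and lights_on(gray_image))
```

Because the conjunction requires every test to pass, a dark frame, a daytime timestamp, or an at-anchor AIS status each independently suppresses the alert.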
VI. Conclusion
[0101] It should be clear that the above-described system and method, which provides hybrid detection of events using both deep learning and code-based algorithms, affords an efficient and effective mechanism for identifying visual events in a remote environment, such as a ship, where computing and communication bandwidth resources can be limited. The events can be highly variable and involve both personnel and equipment, as well as cameras and other appropriate sensors.
[0102] The foregoing has been a detailed description of illustrative embodiments of the invention. Various modifications and additions can be made without departing from the spirit and scope of this invention. Features of each of the various embodiments described above may be combined with features of other described embodiments as appropriate in order to provide a multiplicity of feature combinations in associated new embodiments. Furthermore, while the foregoing describes a number of separate embodiments of the apparatus and method of the present invention, what has been described herein is merely illustrative of the application of the principles of the present invention. For example, as used herein, the terms “process” and/or “processor” should be taken broadly to include a variety of electronic hardware and/or software-based functions and components (and can alternatively be termed functional “modules” or “elements”). Moreover, a depicted process or processor can be combined with other processes and/or processors or divided into various sub-processes or processors. Such sub-processes and/or sub-processors can be variously combined according to embodiments herein. Likewise, it is expressly contemplated that any function, process and/or processor herein can be implemented using electronic hardware, software consisting of a non-transitory computer-readable medium of program instructions, or a combination of hardware and software. Additionally, as used herein various directional and dispositional terms such as “vertical”, “horizontal”, “up”, “down”, “bottom”, “top”, “side”, “front”, “rear”, “left”, “right”, and the like, are used only as relative conventions and not as absolute directions/dispositions with respect to a fixed coordinate space, such as the acting direction of gravity. 
Additionally, where the term “substantially” or “approximately” is employed with respect to a given measurement, value or characteristic, it refers to a quantity that is within a normal operating range to achieve desired results, but that includes some variability due to inherent inaccuracy and error within the allowed tolerances of the system (e.g. 1-5 percent). Accordingly, this description is meant to be taken only by way of example, and not to otherwise limit the scope of this invention.