Augmented Cognition Methods And Apparatus For Contemporaneous Feedback In Psychomotor Learning

Abstract

A method of creating a scalable dynamic jointed skeleton (DJS) model for enhancing psychomotor learning using augmented cognition methods realized by an artificial intelligence (AI) engine or image processor. The method involves extracting a DJS model from either live motion images or video files of an athlete, teacher, or expert to create a scalable reference model for use in training, whereby the AI engine extracts physical attributes of the subject including arm length, leg length, and torso length, as well as capturing successive movements of a motor skill such as swinging a golf club including position, stance, club position, swing velocity and acceleration, twisting, and more.

Claims

1. A method to teach psychomotor skills to a live athlete or student comprising a camera, an image processor, a dynamic jointed skeleton reference model and a display device visible by the live athlete during practice, whereby; the live athlete's movements are captured by a camera in real time as a succession of video frames and filtered to remove superfluous detail; the image processor analyzes the live athlete's relevant physical attributes from the captured video frame images then scales the dimensions of the dynamic jointed skeleton to best match the live athlete's body dimensions; the scaled dynamic jointed skeletal model generates images of a jointed skeleton as a motion sequence; the generated images of the jointed skeleton model are dynamically overlaid onto the live athlete's image to create a composite video image containing both live and generated image content; and where the composite image is delivered to a video display unit for the live athlete to observe thereby delivering a real-time visual comparison of the athlete's position and movements to that of the skeleton.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0044] FIG. 1: Diagram of learning retention pyramid.

[0045] FIG. 2: Graph of Ebbinghaus forgetting curve.

[0046] FIG. 3: Graph of Ebbinghaus forgetting curve with repeated reviews.

[0047] FIG. 4: Graph of various learning and forgetting curves.

[0048] FIG. 5: Examples of visual display devices.

[0049] FIG. 6: Comparison of different height golfers.

[0050] FIG. 7: Pixelation resulting from image scaling.

[0051] FIG. 8: Poor angle reference image capture.

[0052] FIG. 9: Comparison of two similar sized golfers.

[0053] FIG. 10: Comparison of various body types in height and proportion.

[0054] FIG. 11: Illustration of a stepwise golf swing.

[0055] FIG. 12: Graph of rotational speed of a golf swing.

[0056] FIG. 13: Graph of rotational speeds of various golf swings.

[0057] FIG. 14: A-B side-by-side video image overlays comprising raw images and edge detection.

[0058] FIG. 15: A-B ghost image video overlay comprising raw images and edge detection.

[0059] FIG. 16: Diagram of modified learning retention pyramid including augmented cognition.

[0060] FIG. 17: Flow chart of image capture sequence and model extraction.

[0061] FIG. 18: Filtering used to remove superfluous content from video images.

[0062] FIG. 19: DJS model extraction of a human.

[0063] FIG. 20: Alternative DJS models useful for human motion modeling.

[0064] FIG. 21: Demonstration of a simple jointed skeleton model emulating walking.

[0065] FIG. 22A: Conversion of a video capture of a golf swing into a dynamic jointed skeleton sequence.

[0066] FIG. 22B: Graph of golf club velocity for a smooth continuous-motion swing.

[0067] FIG. 22C: Graph of golf club velocity for a discontinuous-motion swing with a variable delay between backswing and downswing.

[0068] FIG. 23: DJS model parameter extraction to generate compact dynamic model.

[0069] FIG. 24: Curve fitting of various mathematical models to measured data.

[0070] FIG. 25: File generation comprising reference video with extracted DJS model, course map, and file data.

[0071] FIG. 26: AI based model visualization UX including rotation and frame interpolation.

[0072] FIG. 27: Rotation of jointed skeletal model.

[0073] FIG. 28: Simulation of DJS model including terrain data.

[0074] FIG. 29: DJS model derived trajectory animation.

[0075] FIG. 30: Block diagram of image DJS overlay and augmented cognition system and process.

[0076] FIG. 31: Block diagram of data acquisition and scoring combining live video and launch sensor data.

[0077] FIG. 32: Ball drop position and scoring based on DJS model derived trajectory simulation.

[0078] FIG. 33: Combined video and ultrasonic launch sensor data capture.

[0079] FIG. 34: Torque data extraction from video and ultrasonic launch sensor data.

[0080] FIG. 35: MEMS sensor position, velocity, and torque data extraction.

[0081] FIG. 36: Golf ball drive position and velocity MEMS sensor data.

[0082] FIG. 37A: Ball sensor data acquisition.

[0083] FIG. 37B: Club sensor data acquisition.

[0084] FIG. 38A: Graphical components of realtime DJS overlays (front view).

[0085] FIG. 38B: Graphical components of realtime DJS overlays (rear view).

[0086] FIG. 38C: Scaling graphical components of DJS model to different sizes using proportional and graphical edge methods.

[0087] FIG. 39: Image capture and DJS screen display on golf course.

[0088] FIG. 40: AI-based separation of high-resolution video into DJS model, low-resolution video, and silhouette video.

[0089] FIG. 41: Overlay comparison of silhouette video to expert DJS model.

[0090] FIG. 42: DJS overlay image projected onto heads-up display glasses.

GLOSSARY

[0091] AR Glasses: Wearable Augmented Reality (AR) devices that are worn like regular glasses and merge virtual information with physical information in a user's view field. AR Glasses, also known as smart glasses, are usually worn like traditional glasses or are mounted on regular glasses.

[0092] Artificial Intelligence (AI): A branch of computer science dealing with the simulation of intelligent behavior in computers, or alternatively the capability of a machine to imitate intelligent human behavior. The Turing Test is one measurement of the successful realization of AI.

[0093] Augmented Cognition: A form of human-systems interaction in which a tight coupling between user and computer is achieved via physiological and neurophysiological sensing of a user's cognitive state or through audio-visual sensing and feedback.

[0094] Augmented Reality (AR): A technology that superimposes a computer-generated image on a user's view of the real world, thus providing a composite view.

[0095] Cognition: The mental action or process of acquiring knowledge and understanding through thought, experience, and the senses. Cognition may be achieved biologically in the brain or may be emulated through Artificial Intelligence.

[0096] Contemporaneous Feedback: Information feedback comprising electrical, visual, auditory, or other sensory mechanisms occurring in real time, i.e. with minimum delay, from the action or event being monitored or measured.

[0097] Heads-up display (HUD): A transparent or miniaturized display technology that does not require users to shift their gaze from where they are naturally looking. A HUD should not obstruct the user's view. Some, but not all, AR Glasses may be considered as HUDs.

[0098] Image Capture: The process of obtaining a digital image from a vision sensor, such as a camera or a camera phone. Usually this entails a hardware interface known as a frame grabber capturing a succession of video frames, converting the image's analog values (gray scale) to digital, and transferring the files into computer memory or transmitting them across a communication network. The conversion process is often accompanied by image compression.

[0099] Image Overlay: A type of process or technology combining multiple images into a common graphical representation displayed on a video screen, or via VR headset or AR glasses. A dynamic image overlay performs image overlay on a frame-by-frame basis for rapid or real time playback.

[0100] Kinesthesia: In biology, a sense mediated by receptors located in muscles, tendons, and joints and stimulated by bodily movements and tensions, or in robotics the application of sensory data to control the movement of mechanical appendages or prosthesis. Also known as kinesthesis or kinesthetic control. Kinesthesia based learning is also referred to as Psychomotor Learning.

[0101] Latency: In computer and communication networks, the amount of time delay before a transfer of data begins following an instruction for its transfer. In discontinuous or sporadic processes, Latency may be considered as start-up delay.

[0102] Learning: The acquisition of knowledge or skills through experience, study, teaching, training, and practice.

[0103] Machine Learning: The application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.

[0104] Model Parameters: Variables used to match a mathematical model to measured data and to predict behavior, stimulus-response patterns, and Kinesthesia.

[0105] Muscle Memory: The learning and repeated reinforcement of psychomotor skills where an athlete or student is able to consistently repeat a movement or skill without being consciously aware of their learned actions. Through repetitious practice and psychomotor learning, gymnasts, drummers, golfers, and baseball pitchers and batters exhibit muscle memory.

[0106] Propagation Delay: In computer and communication networks, the amount of time it takes for a signal to travel from its source or sender to a receiver or recipient. It can be computed as the ratio between the link length and the propagation speed over the specific medium. Propagation Delay may be considered as transport time for a data packet across a communication link or through a network and does not generally include Latency.

[0107] Psychomotor Control: The closed loop control of muscles and movement where afferent nerves detect skeletomuscular movement, position, or force and, via nerve transduction through the peripheral nervous system (PNS) and central nervous system (CNS), inform the brain of muscle action, and where the brain cognitively responds to the stimulus by sending instructions to corresponding efferent nerves on the same muscle tissue to adjust movement.

[0108] Psychomotor Learning: The process of learning involved in developing motor skills such as movement, coordination, manipulation, dexterity, grace, strength, and speed used in athletic activity, or needed in the operation of tools or instruments.

[0109] Turing Test: A method of inquiry in artificial intelligence (AI) for determining whether or not a computer is capable of thinking like a human being. Turing proposed that a computer is said to possess Artificial Intelligence if it can mimic human responses under specific conditions.

[0110] Virtual Reality (VR): The computer-generated simulation of a three-dimensional image or environment that can be interacted with in a seemingly real or physical way by a person using special electronic equipment, such as a helmet with a screen inside or gloves fitted with sensors. Also referred to as “artificial reality”.

[0111] VR Headset: A head-worn apparatus that completely covers the eyes for an immersive 3D experience. VR headsets are also referred to as virtual reality glasses or goggles.

DESCRIPTION OF THE INVENTION

[0112] Given the innumerable problems in producing video image overlays that match the size and proportions of a student or trainee to a reference or expert's movement or timing, the application of video images, scaled or unscaled, is not applicable or useful for psychomotor learning. Moreover, such video content lacks the contrast or camera angle for a trainee to clearly observe the movements of the instructor, reference, or coach's actions. Recorded videos, in fact, contain superfluous images such as trees, landscapes, crowds, weather and other artifacts that only obscure the important content and impede the use of image enhancement technology.

[0113] As described herein we propose an inventive method and apparatus to achieve contemporaneous feedback for psychomotor learning through the application of dynamic jointed skeleton (DJS) motion modeling and mirroring enhanced by AI-hosted augmented cognition technology, methods used to adapt the training procedures to the user's learning. Referring to FIG. 16, as shown by the modified retention pyramid 190, the enablement of concurrent psychomotor learning accelerates the acquisition of new skills while improving learning retention. Enabled by artificial intelligence and machine learning to improve training process efficiencies, contemporaneous feedback for psychomotor learning offers the potential of improving retention beyond 85% to 90%, as illustrated by the topmost pyramid piece 191. In this sense, as the student learns from the system, the system adapts, learning the student's behaviors and adapting its training process thereto. For example, if a golfer spends a longer duration at the top of their swing, i.e. with the club above their head, than other golfers, the system will recognize this behavior and not commence the downswing demonstration so quickly. In this manner the golfer doesn't feel rushed or uncomfortable with the instructive training images needed for psychomotor learning.

[0114] The method of contemporaneous feedback for psychomotor learning through augmented cognition involves two fundamental steps. In the first step, referred to herein as “image capture and model extraction,” reference content, generally a video of an expert or coach, is converted into a behavioral model and stored in a model library for later or possibly contemporaneous use. During this process, a reference video of an expert or coach is converted by an artificial intelligence (AI) engine into a dynamic jointed skeleton (DJS) model, a physical and behavioral model capable of producing a sequence of images that describe the essential elements of the instructor's actions and motions. In the absence of sufficient information, the AI engine extracts a model to the best of its ability given the quality of its input, generally video content. With access to a library of prior model extractions, the AI engine adapts its model extraction algorithms using machine learning (ML) to improve the efficiency and accuracy of the model over time.

[0115] The addition of physics-based models and equipment specification libraries further improves the intelligence of the AI used in the extraction process. The resulting model represents a kinesthetic description of an expert or instructor's actions scalable to match the size and proportions of a student. In one such model, described here as a “dynamic jointed skeleton” or DJS, the model parameters are converted to graph elements of varying length edges and vertices that define the allowed motions of one edge to another. The model parameters comprise numeric variables used to match the dynamic jointed skeleton's mathematical model to measured data. Once calibrated to maximize model accuracy, the DJS model can be used to visually depict complex movement and to predict kinesthetic behavior and stimulus-response patterns. While the disclosure relies on the use of a DJS model, the disclosed method may be adapted to other forms of dynamic motion models, for example holographic 3D models, as they become available.

[0116] In the second step, referred to here as “image DJS overlay and augmented cognition,” a live or processed image of an athlete is displayed in a visualization device superimposed with an interactive image of the dynamic jointed skeleton or other applicable image renderings (such as a hologram), whereby the trainee can mimic the actions of the reference model, the dynamic model being scaled to the exact proportions of the trainee's body. The dynamic model includes event triggers and employs synchronization methods, adapting the model's movement to synchronize to the trainee's actions, incrementally adjusting the model to the expert's actions until the trainee and the model are both executing the same actions in accordance with the trainer or expert's actions used to create the reference model. Since the image DJS overlay is dynamic, i.e. involving movement of both the reference DJS model and the trainee, the AI visualization system adapts its instruction methods to better instruct the trainee in a step-by-step process.

[0117] Machine learning of an AI system based on the bidirectional data flow of an AI-based instructor teaching a student and the student's actions affecting the way the AI instructor provides instruction is referred to herein as “augmented cognition.” Moreover, since the image DJS overlay occurs in real time, i.e. comprising “contemporaneous feedback” to the trainee, the learning curve is accelerated and the subsequent forgetting curve depth and duration is minimized even in the extended absence of a live coach. Using the disclosed methods adapting augmented cognition to contemporaneous feedback through visual based image DJS overlays, psychomotor learning is thereby accelerated. Other forms of feedback to the trainee may also be employed including tactile, haptic, audible or other methods.

[0118] In its advanced form, evolution of the AI engine may ultimately lead to the synthesis of an AI reference model that outperforms experts in the field used to educate the ML system during AI pattern imprinting. Later, these same behavioral models may be used to direct the actions of robots, for example, leading to a robotic golfer or tennis player with expert skills.

[0119] Image Capture Sequence and Model Extraction—The first step in the disclosed method and apparatus for augmented cognition for psychomotor learning involves the extraction of a behavioral model. As shown in the flow chart of FIG. 17, the process starts with a video of an expert, trainer, or coach as a reference video 200. The video content may be prerecorded or comprise a live video stream from a camera recording a demonstration, teaching session, instruction, or from a live competitive event. A digital filter process 201 then optionally modifies the image to improve contrast and enhance the images on a frame-by-frame basis by removing or diminishing the presence of extraneous background features. For example, digital filtering shown in FIG. 18 is able to completely remove background content 208 from an image of golfer 207 leaving a blank background 209.

[0120] The output of this filtering process 201 is then fed into an AI processor operation 202 to extract a DJS model file 203, a dynamic jointed skeleton that captures the key features of the reference image, specifically, the expert or coach and any associated equipment involved in the motion. As exemplified in FIG. 19, artificial intelligence is able to identify the shape of a human body 210 and identify human body parts including the head 211a, shoulder joint 211b, elbow 211c, hand 211d, hip joint 211e, knee 211f, ankle 211g, and toes 211h. Once extracted, the resulting DJS model 220 connects the identified joints with connectors representing inflexible components, such as neck bone 212a; scapula or shoulder blade 212b; upper-arm bone or humerus 212c; the forearm bone 212d; the spine and rib cage 212e; the hip bone or pelvis 212f; the femur or upper leg bone 212g; the lower leg bone 212h comprising the tibia and the fibula; and various bones collectively comprising the foot 212i. To distinguish it from stick or skeleton models lacking joints, we refer to the skeletal model 220 as a ‘jointed’ skeleton to highlight that joints between bones are explicitly identified.
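As an illustration of the joint-and-connector structure described above, the following is a minimal sketch of how a jointed skeleton might be held in memory as a graph. The class name `DJSGraph`, its methods, and the example coordinates are hypothetical and are not taken from the disclosure; only the idea of joints connected by rigid bones comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class DJSGraph:
    """Jointed skeleton as a graph: joints are vertices, bones are edges.

    Hypothetical structure for illustration only.
    """
    joints: dict = field(default_factory=dict)  # joint name -> (x, y) position
    bones: list = field(default_factory=list)   # (joint_a, joint_b) pairs

    def add_joint(self, name, x, y):
        self.joints[name] = (x, y)

    def add_bone(self, a, b):
        # A bone can only connect joints that have already been identified.
        if a not in self.joints or b not in self.joints:
            raise KeyError("both joints must exist before connecting them")
        self.bones.append((a, b))

    def bone_length(self, a, b):
        # Euclidean distance between the two joints of a bone.
        (xa, ya), (xb, yb) = self.joints[a], self.joints[b]
        return ((xb - xa) ** 2 + (yb - ya) ** 2) ** 0.5


def build_arm():
    """Build a two-bone arm (shoulder, elbow, hand) with made-up coordinates."""
    g = DJSGraph()
    g.add_joint("shoulder", 0.0, 0.0)
    g.add_joint("elbow", 3.0, 4.0)
    g.add_joint("hand", 3.0, 7.0)
    g.add_bone("shoulder", "elbow")  # upper arm
    g.add_bone("elbow", "hand")      # forearm
    return g
```

Measuring each bone of such a graph in a reference frame would give the kind of per-segment dimensions that are parameterized in the paragraphs that follow.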

[0121] During the parameter extraction process these physical attributes are parameterized, i.e. the dimensions of the body parts are converted into numerical variables defining the reference athlete's body shape. The dimension of each parameter value is then measured and a file created for the unscaled model accurately matching the reference image. These parameters may, for example, include without limitation:

[0122] The variable x.sub.sb, describing the width of the shoulder blade 212b,

[0123] The variable x.sub.ua, describing the length of the upper arm 212c,

[0124] The variable x.sub.fa, describing the length of the lower arm, also known as the forearm 212d,

[0125] The variable x.sub.t, describing the length of the body's trunk or torso as measured from the shoulder blade 212b to the waist 212f,

[0126] The variable x.sub.ul, describing the length of the upper leg 212g, also referred to by its attached muscles, the quadriceps,

[0127] The variable x.sub.ll, describing the length of the lower leg 212h, also referred to by its attached muscles, the calves.

[0128] These variables are used to define the relative size of an athlete's body parts and their overall size. By parameterizing the DJS model as described, a scalable model is created: a model that can be adjusted to match the size and shape of any athlete. Combining the scalable model with classical physics, the method is capable of predicting the impact of a change in an athlete's physical attributes from the original reference model. For example, if an athlete's legs are shorter, the model can be used to predict changes in force and launch angle of a golf ball during tee-off, and adjust the swing accordingly to produce the same result as the expert despite the fact that the golfer is taller or shorter than the expert used to create the reference model.
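The geometric part of this scaling can be sketched as follows. This is a simplified illustration that assumes each segment is scaled independently by the ratio of the student's measurement to the reference measurement; the names `BodyParams`, `scale_factors`, and `scale_bone_vector` are hypothetical, and the physics-based compensation of the swing is not modeled here.

```python
from dataclasses import dataclass, fields


@dataclass
class BodyParams:
    """Segment dimensions in metres (hypothetical field names)."""
    x_sb: float  # shoulder blade width
    x_ua: float  # upper arm length
    x_fa: float  # forearm length
    x_t: float   # torso length
    x_ul: float  # upper leg length
    x_ll: float  # lower leg length


def scale_factors(reference, student):
    """Per-segment ratio of student dimension to reference dimension."""
    return {
        f.name: getattr(student, f.name) / getattr(reference, f.name)
        for f in fields(reference)
    }


def scale_bone_vector(vector, factor):
    """Scale a bone's (dx, dy) vector, keeping its direction unchanged."""
    dx, dy = vector
    return (dx * factor, dy * factor)
```

For example, a reference forearm drawn as the vector (0.0, 0.30) and a student whose forearm is 0.9 times as long would be redrawn with `scale_bone_vector((0.0, 0.30), 0.9)`, preserving the pose while changing the size.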

[0129] The model therefore is not simply adjusted for an athlete's size but must also be adjusted in accordance with physics to achieve the desired performance, compensating for any size changes. In essence the question is not simply “how would Tiger Woods swing the club if he were my height?” but more importantly “how would Tiger Woods have to adjust his swing to produce the same result if he were my height?” Only by simplifying an expert's movements, i.e. their dynamics, into a dynamic jointed skeleton model, can force, club acceleration and ball velocity be modeled in a succinct and rapid manner using a minimal number of calculations. A static model cannot predict force.

[0130] In mathematical vernacular, the joints of a DJS model are referred to as ‘vertices’ and the connecting bones are defined as ‘edges.’ As in any physical system subject to Newtonian mechanics, i.e., classical physics, the relative movement of edges at a vertex is subject to physical laws of motion in response to force or torque. As such, physics can be used to govern the dynamic movement of the model in time, hence the acronym DJS for ‘dynamic’ jointed skeleton. Given that the DJS is governed by physics, an extracted model can be analyzed for linear and angular position, velocity, and acceleration by analyzing the time movement of the graph edges with respect to the vertices and other edges. To extract forces in an analysis, we must employ Newton's 2.sup.nd Law [https://en.wikipedia.org/wiki/Newton%27s_laws_of_motion], which states the linear vector equation F=ma for linear motion, where m is mass, a is an acceleration vector, and F is a vector force. Alternatively, for angular or rotational movement like swinging a golf club or a baseball bat, it is convenient to use the rotational version of the 2.sup.nd law, τ=Iα, where τ is a torque vector, I is the moment of inertia, and α is an angular acceleration vector [https://brilliant.org/wiki/rotational-form-of-newtons-second-law/]. Given the description of body mass for the athlete derived by knowing his weight, and the mass of material and density composition of the equipment specified in an equipment specification library 206 shown in FIG. 17, AI operation 204 employs artificial intelligence to extract vector force model parameters 205 for further analysis. The relevant force model parameters depend on the action being performed. For example, in a golf tee-off, an extracted force analysis involves the force with which the ball is hit, that is, the force with which the club strikes the ball. Through the use of physics such information can be used to compare one athlete's performance to another or to evaluate how the ball will travel on a given course. Since the mass of the golf club affects momentum transfer and ball launch velocity, the precise weight characteristics can be downloaded from an equipment specification library 206 in order to improve the absolute accuracy of the vector force model parameters 205 used in compact dynamic DJS model 225.
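To make the rotational relation τ=Iα concrete, the sketch below estimates torque from two successive angular-velocity samples. It rests on a strong simplifying assumption that is not part of the disclosure: the club is treated as a uniform thin rod pivoting about one end, for which I = mL²/3. The function names and the numeric values are illustrative only.

```python
def rod_moment_of_inertia(mass_kg, length_m):
    """Moment of inertia of a uniform thin rod about one end: I = m * L^2 / 3.

    Assumption for illustration; a real club has a non-uniform mass distribution.
    """
    return mass_kg * length_m ** 2 / 3.0


def angular_acceleration(omega_prev, omega_next, dt):
    """Finite-difference estimate of angular acceleration (rad/s^2)
    from angular velocities in two successive frames dt seconds apart."""
    return (omega_next - omega_prev) / dt


def torque(inertia, alpha):
    """Rotational form of Newton's second law: tau = I * alpha (N*m)."""
    return inertia * alpha


# Example with made-up numbers: a 0.3 kg, 1.0 m rod whose angular velocity
# rises from 0 to 10 rad/s over 0.5 s.
I_club = rod_moment_of_inertia(0.3, 1.0)
alpha = angular_acceleration(0.0, 10.0, 0.5)
tau = torque(I_club, alpha)
```

In practice the mass and length would come from measured or library data rather than constants, and the angular velocities from the frame-by-frame motion of the corresponding graph edge.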

[0131] In general, all motion occurs at explicitly identified joints. In the DJS depiction shown in FIG. 19 two exceptions to the rule that motion only occurs at a joint should be mentioned. First of all, the unidentified virtual neck-shoulder joint 213 between neck bone 212a and shoulder blade 212b does allow a limited degree of rotational movement. Variables may be used to specify any body dimension including the length of the athlete's forearm x.sub.fa, the length of their upper arm x.sub.ua, the width of their shoulder blade x.sub.sb, the length of their torso x.sub.t, the length of their upper leg x.sub.ul, and the length of their lower leg x.sub.ll. A precise physical description is not needed to predict movement. For example, the spine-ribcage 212e is depicted graphically as a triangle meeting the hipbone 212f at a single unidentified virtual hip-spine joint 214, whereby a limited range of rotation is allowed. An alternative representation is shown in FIG. 20 where the neck-shoulder joint 213 and hip-spine joint 214 are explicitly illustrated in DJS model 221. A more accurate DJS model 222 explicitly separates spine and rib cage 212e into two components: an upper thoracic spine and rib cage 212m, and a lower spine or lumbar 212n.

[0132] The addition of more vertices complicates the DJS model, slowing simulation and real-time animation. As such, care should be taken not to add any vertices unless they are needed to properly model a movement. For example, modeling the foot may or may not improve model accuracy. Overly complex models make timely calculations difficult and do not necessarily improve accuracy, as they require more variables to be used in the parameter extraction and model creation process.

[0133] Once extracted, a DJS file can be used to imitate the motion of any person as a kinematic model able to generate a video file of the motion or action such as the DJS model for walking depicted in FIG. 21. Although the methodology of a moving stick model was first realized by Walt Disney in 1929 [https://www.youtube.com/watch?v=oyrGwRWKtJg], the ability for intelligence to automatically analyze a picture and extract a multi-jointed skeleton model did not occur until the development of robotic vision in the 1990's, using a digital thinning algorithm referred to as “skeletonization” [http://homepages.inf.ed.ac.uk/rbf/HIPR2/skeleton.htm]. The process of AI-based skeletonization, using artificial intelligence to identify components of a human, animal, or machine and extract a skeleton, is recent and still ongoing as a subject of deep learning research [https://www.youtube.com/watch?v=3ZhQKmSbNug].

[0134] The extraction of dynamic jointed skeleton (DJS) models for psychomotor learning disclosed herein is however unique, as it requires the extraction to capture physical characteristics that affect precision movement for a specific result according to the laws of physics and to preserve these subtle differences in the model. For example, creating a simple model of a person swinging a golf club is no different than animation, but modeling an athlete's action to predict performance requires physics-based models. Animation, by contrast, need not follow the laws of physics. For a kinematic model for psychomotor learning to be useful, however, it must be physically accurate.

[0135] Capturing the precise movements of a tennis pro athlete, a master golf pro, or a world-class neurosurgeon requires a high resolution extraction of precise movements, stored with any associated equipment specifications involved in the action. For example, the length and weight of a golf club or of a tennis racket affects which DJS model is needed to precisely predict the desired motion. The shape of a scalpel could be the difference between a successful surgery and inadvertently severing a nerve.

[0136] And although a library of good DJS models is a key element in quality psychomotor training, it alone is not enough. It is also important to partition movement by separating intervals of smooth movement and interruptions into discrete time segments identified by start and stop “triggers”.

[0137] Motion capture of a golf swing is shown in successive images of FIG. 22A including start A, backswing B, and top of swing C corresponding to images 98a, 98b, and 98c respectively. The conclusion of the backswing in image 98c occurs at a time referred to as t=0.sup.−, just an instantaneous moment before t=0 where the club's velocity is zero, i.e. v=0.

[0138] As a separate movement from the backswing, the downswing commences at an instantaneous moment called t=0.sup.+ after the completion of the backswing at t=0, also represented by image 98c. Following top of swing C, the downswing progresses through downswing D into drive E when the club strikes the ball, to follow-through F and ultimately to finish G, a sequence represented by images 98c, 98d, 98e, 98f, and 98g respectively. The equivalent dynamic jointed skeletons include shoulder 250, left arm 251a, left leg 252a, right arm 251b, and right leg 252b along with club 253. As shown, the video sequence 98a to 98g corresponds to skeletal models 240a through 240g respectively.

[0139] The golf club velocity corresponding to these positions is shown in FIG. 22B, where velocity 241a represents the commencement of the backswing 242, represented by negative velocities (v<0) and by negative time (t<0, i.e. times before t=0), corresponding to position 240a; velocity 241b corresponds to position 240b; and velocity 241c corresponds to position 240c at time t=0, just after the completion of the backswing 242 at time t=0.sup.− and just before the beginning of the downswing 243 at time t=0.sup.+.

[0140] At t=0, club velocity (in calculus, the first time derivative of position) changes polarity from negative to positive, and club acceleration (in calculus, the second time derivative of position or the first derivative of velocity) changes from negative (deceleration at the top of the backswing) to positive (accelerating at the commencement of the downswing). As such, either velocity or acceleration data can be extracted from successive video frames and used to identify the instant the backswing ends or the downswing commences. Downswing 243 is thereby graphically represented by positive values (v>0) including peak velocity 241e corresponding to position 240e and finishing at velocity 241g when the swing follow-through is complete. As such, backswing 242 and downswing 243 can be modeled as two smooth actions separated by a polarity reversal in direction and acceleration. This polarity transition can be used as a “trigger” beneficial in controlling model playback for the purpose of synchronization.
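The polarity-based trigger described above can be sketched as a zero-crossing search on frame-to-frame velocity. This is a minimal illustration under stated assumptions: club position is reduced to a single signed coordinate per frame (for example an angle), frames are evenly spaced, and the data are noise-free; real video data would likely need smoothing first. The function names are hypothetical.

```python
def finite_difference_velocity(positions, dt):
    """Forward-difference velocity between successive frames dt seconds apart."""
    return [(b - a) / dt for a, b in zip(positions, positions[1:])]


def find_top_of_swing(velocities):
    """Return the index of the first velocity sample at which the sign changes
    from negative (backswing) to zero or positive, or None if it never does."""
    for i in range(1, len(velocities)):
        if velocities[i - 1] < 0 and velocities[i] >= 0:
            return i
    return None


# Example with made-up club angles: backswing to -4, then downswing.
angles = [0, -1, -3, -4, -3, 0, 5]
velocities = finite_difference_velocity(angles, 1.0)
trigger_index = find_top_of_swing(velocities)
```

The returned index marks the first inter-frame interval in which the club is no longer moving backward, which is the point a playback system could treat as the end of the backswing.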

[0141] One example of the need for a triggered DJS model is to accommodate discontinuous movement. For example, some golfers stop for a moment at the top of their backswing before commencing their downswing, rather than immediately commencing the downswing as one continuous motion. This case is represented in the graph shown in FIG. 22C where downswing 243 doesn't instantly follow backswing 242, but instead is delayed by delay interval 244 of a duration Δt.sub.d. The delay varies dramatically with athletes, ranging from 100 ms (almost instantaneously) to up to 10 seconds.

[0142] An athlete who feels comfortable waiting five seconds at the top of their swing cannot comfortably learn psychomotor skills from watching a video of an athlete who holds the club for less than a second at the top of their swing, because they will feel rushed trying to catch up with the video. By partitioning the DJS model into discrete pieces of continuous movement defined by event triggers, delays and motion interruption can be matched to the student's needs. Consistent with FIG. 22B, in FIG. 22C time t=0 represents the end of the backswing 242. Therefore the backswing commences (when velocity 241a is zero) at the time t=−t.sub.bs corresponding to trigger 1 and concludes at time t=0. Downswing 243 however does not commence until trigger 2 after a variable delay Δt.sub.d depicted by interval 244. Starting at velocity 245 equal to zero, downswing 243 therefore does not commence until time t=Δt.sub.d. Lasting a duration Δt.sub.ds, downswing 243 does not conclude until the time t=Δt.sub.d+Δt.sub.ds when velocity 241g reaches zero. The integration of trigger 1 and trigger 2 into the DJS model therefore allows model playback to be broken into two pieces, backswing 242 and downswing 243, separated by a variable delay 244 specified by detecting a condition, e.g. a student commencing their downswing, and commencing playback by activating trigger 2. In this manner, the student won't feel rushed or pressured into trying to match the reference video of another athlete.
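A trigger-driven playback of this kind might be sketched as a small state holder in which the hold at the top of the swing lasts until trigger 2 is activated. The class `TriggeredPlayback` and its method names are hypothetical, the segment durations are taken as given, and no validation is done on the order in which triggers arrive; it is an illustration of the idea rather than the disclosed implementation.

```python
class TriggeredPlayback:
    """Two-segment model playback separated by a variable hold.

    t_bs:  backswing duration in seconds
    dt_ds: downswing duration in seconds
    """

    def __init__(self, t_bs, dt_ds):
        self.t_bs = t_bs
        self.dt_ds = dt_ds
        self.t1 = None  # time trigger 1 fired (backswing starts)
        self.t2 = None  # time trigger 2 fired (downswing starts)

    def trigger_1(self, now):
        self.t1 = now

    def trigger_2(self, now):
        self.t2 = now

    def phase(self, now):
        """Return (segment name, fraction of that segment completed)."""
        if self.t1 is None or now < self.t1:
            return ("idle", 0.0)
        elapsed = now - self.t1
        if elapsed < self.t_bs:
            return ("backswing", elapsed / self.t_bs)
        # Backswing finished: hold at the top until trigger 2 fires.
        if self.t2 is None or now < self.t2:
            return ("hold", 1.0)
        elapsed = now - self.t2
        if elapsed < self.dt_ds:
            return ("downswing", elapsed / self.dt_ds)
        return ("finished", 1.0)
```

Because the hold has no fixed length, the same model serves a student who pauses for a fraction of a second and one who pauses for several seconds; the delay is simply whatever time passes before `trigger_2` is called.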

[0143] The same principle of trigger based discontinuous playback can be adapted to learning other psychomotor skills such as surgery, where an incision is made in two strokes rather than in one continuous movement.

[0144] As shown in the exemplary video frames and extracted skeletal models of FIG. 23, a compact dynamic jointed skeletal model 225 minimizes the error in predicting all the respective movements in captured motion sequence 255, shown in FIG. 24. Mathematical models to describe the actual measured data of curve 255 include linear model A shown by curve 256a, exponential model B shown by curve 256b, polynomial model C shown by curve 256c, and higher order polynomial D shown by curve 256d. Constants in the mathematical model are adjusted to minimize the overall error, thereby maximizing the accuracy of the curve fit to the actual data represented by curve 255. The adjustment of curve fitting parameters to minimize errors is referred to as parameter extraction. Parameter extraction is an imperfect process in which the accuracy of a curve fit over a limited range may be increased by sacrificing accuracy over the full interval, or vice versa. By using an artificial intelligence engine to interpret a series of graphical images, however, errors can be minimized over repeated events or video sources, allowing the system to better “learn” what it is looking at.
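By way of non-limiting illustration, parameter extraction for the linear and polynomial models may be sketched as an ordinary least-squares fit. The disclosure does not specify the fitting method; least squares via the normal equations is an assumption made here, and the function names and the synthetic sample data are hypothetical.

```python
from typing import List, Sequence


def fit_polynomial(t: Sequence[float], y: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*t + ... + c[degree]*t**degree."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(t**(i+j)), b[i] = sum(y * t**i).
    a = [[sum(ti ** (i + j) for ti in t) for j in range(n)] for i in range(n)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def rms_error(t: Sequence[float], y: Sequence[float], coeffs: Sequence[float]) -> float:
    """Root-mean-square residual of the fitted model against the data."""
    residuals = [yi - sum(c * ti ** k for k, c in enumerate(coeffs))
                 for ti, yi in zip(t, y)]
    return (sum(r * r for r in residuals) / len(residuals)) ** 0.5


# Synthetic samples standing in for measured curve 255 (hypothetical data).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
samples = [2.0 + 3.0 * ti + 4.0 * ti * ti for ti in times]

model_a = fit_polynomial(times, samples, degree=1)   # linear model
model_c = fit_polynomial(times, samples, degree=2)   # polynomial model
error_a = rms_error(times, samples, model_a)
error_c = rms_error(times, samples, model_c)
```

Comparing `error_a` with `error_c` mirrors the comparison of candidate curves against the measured data; restricting `times` to a sub-interval before fitting illustrates the trade-off between local and full-interval accuracy noted above.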

[0145] As shown in FIG. 25, once a compact DJS model 225 is extracted, its reference source video 200, i.e. source data file 260, is combined in file generation process 263 with GPS 3D terrain course map 262 and file source data information 261 to produce file 264 for cloud storage. Source data information 261 may include the name of an athlete, the specific golf course, the name of the event, the date of the event, as well as the athlete's score for each hole (measured against the hole's par), and more. Terrain information 262 may also be included and may optionally include wind information by time (although this data is difficult to extract from pre-recorded files where weather information is unavailable).

[0146] As depicted in FIG. 26, playback of stored files comprises user 268 selecting a specific expert file from cloud file storage 264 via user interface 265. The download comprises compact model 225 and course map 262. UI/UX control 265 instructs DJS model visualizer 266 to process the data and model using artificial intelligence (AI) engine 267. AI processing includes 3D rotation, terrain, animation, parametrics, and calculation of performance evaluation, i.e. scoring. The generated video from the DJS model 225 may include sampled frames 240c, 240d, 240e, and 240f, and interpolated frames such as 240z generated from the DJS model. The model can be “played” like a movie by executing an evaluation of movement on a frame-by-frame basis over time. The model can be synchronized to a live video trigger or run autonomously.
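By way of non-limiting illustration, an interpolated frame lying between two sampled frames may be generated by blending joint positions. Linear interpolation of two-dimensional joint coordinates is an assumption made here for brevity and does not preserve segment lengths exactly; interpolating joint angles would. The names `Pose` and `interpolate_pose` and the coordinate values are hypothetical.

```python
from typing import Dict, Tuple

Pose = Dict[str, Tuple[float, float]]  # joint name -> (x, y) image coordinates


def interpolate_pose(pose_a: Pose, pose_b: Pose,
                     t_a: float, t_b: float, t: float) -> Pose:
    """Blend two sampled poses linearly at time t, with t_a <= t <= t_b."""
    if t_a == t_b or not t_a <= t <= t_b:
        raise ValueError("t must lie within a non-empty interval [t_a, t_b]")
    w = (t - t_a) / (t_b - t_a)  # 0 at pose_a, 1 at pose_b
    result: Pose = {}
    for joint, (ax, ay) in pose_a.items():
        bx, by = pose_b[joint]
        result[joint] = (ax + w * (bx - ax), ay + w * (by - ay))
    return result


# Two sampled frames and one frame generated halfway between them.
frame_c = {"wrist": (0.0, 1.0), "club_head": (0.0, 2.0)}
frame_d = {"wrist": (1.0, 2.0), "club_head": (2.0, 2.0)}
frame_z = interpolate_pose(frame_c, frame_d, t_a=0.0, t_b=0.5, t=0.25)
```

Evaluating `interpolate_pose` at successive display times is one way the model could be “played” frame by frame at a rate independent of the sampling rate of the source video.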

[0147] Another feature of DJS model 225 with AI engine 267, shown in FIG. 27, is 3D rotation of image 240d to produce side view 240x or rear view 240y. Based on physical models, the rotation can be performed even though only a single camera is used to capture the video image. In this manner, the DJS model can always be rotated to match any available video source or even compared against multiple video sources.
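By way of non-limiting illustration, rotating a skeletal model to a different viewpoint may be sketched as a rotation of its joints about the vertical axis followed by projection onto the image plane. The sketch assumes that three-dimensional joint coordinates have already been estimated (the step the disclosure attributes to physical models), uses an orthographic projection for simplicity, and uses hypothetical names and coordinate values.

```python
import math
from typing import Dict, Tuple

Joint3D = Tuple[float, float, float]  # (x right, y up, z toward the camera)


def rotate_about_vertical(joints: Dict[str, Joint3D],
                          degrees: float) -> Dict[str, Joint3D]:
    """Rotate every joint about the vertical (y) axis by the given angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return {name: (c * x + s * z, y, -s * x + c * z)
            for name, (x, y, z) in joints.items()}


def project_to_image(joints: Dict[str, Joint3D]) -> Dict[str, Tuple[float, float]]:
    """Orthographic projection: discard depth and keep (x, y)."""
    return {name: (x, y) for name, (x, y, _z) in joints.items()}


front = {"left_shoulder": (0.2, 1.4, 0.0), "club_head": (0.5, 0.0, 0.3)}
rear = rotate_about_vertical(front, 180.0)   # rear view
side = rotate_about_vertical(front, 90.0)    # side view
rear_image = project_to_image(rear)
```

A 180° rotation mirrors the horizontal coordinate of every joint while leaving heights unchanged, which is why a rear view can be drawn from a model captured from the front.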

[0148] In FIG. 28, AI engine 267 is able to combine DJS model 240g with terrain 268. By combining a golf course terrain with a DJS model, a student can analyze how an expert played a particular hole. Beneficially, a student can play the hole themselves, comparing their performance against the expert.

[0149] In FIG. 29, DJS model 240f is combined with drive animation 271 to display ball trajectory 269. The analysis of any video can categorize 270 the trajectory result 272 parametrically as a slice, hook, fade, draw, push, pull, or a perfect “pure” stroke. Scoring may also be assigned to the drive and used in evaluating competitive performance.

[0150] Image overlay and Augmented Cognition: The process and apparatus of augmented cognition for psychomotor learning using a kinematic DJS model with contemporaneous feedback via A-B image DJS overlays is illustrated in FIG. 30. As shown, golfer 301 on golf course 300 wishes to learn, for example, how a pro played the same course. By placing smartphone 302 on a tripod to monitor the golfer's swing, a live video file 304 is processed using intelligent image processing within AI engine 310, preferably implemented within smartphone 302, although a separate or dedicated AI processor engine may also be used. Operations within AI engine 310 occur live, i.e. in real time, involving a complex and inventive sequence of operations as follows:

[0151] AI-engine 310 downloads DJS model file 203 of a selected expert for training purposes.

[0152] AI-engine 310 receives live video streaming file 304 of golfer 301 as a continuous input.

[0153] On the fly, i.e. continuously, AI-engine 310 removes superfluous background content of golf course 300 from live video streaming file 304.

[0154] Optionally, launch monitor 316 measures parametric data from the golf tee-off and provides its measurements to AI-engine 310.

[0155] AI-engine 310 identifies the image of golfer 301 in the video stream using artificial intelligence-based pattern recognition.

[0156] A sample of the video images from video streaming file 304 is analyzed to extract the height and the body proportions of golfer 301, including the golf club, lengths of upper and lower legs and arms, torso length, etc.

[0157] A DJS model 203 selected from the model library loaded into AI-engine 310 is adjusted to match the height and body proportions of golfer 301.

[0158] A set of vector force model parameters 205 (including any equipment related specifications) is loaded into AI-engine 310, and the DJS model is adjusted for the proper acceleration needed to calculate the same force and ball trajectory as the original reference library expert's performance.

[0159] In real time, AI-engine 310 outputs overlay 311 comprising the live image of golfer 301 and the scaled version of DJS model 203 at the same size and body proportions as golfer 301 but with motions matching the performance of the expert in the DJS reference library.

[0160] The DJS overlay 311 is wirelessly transmitted to visualization device 313 worn by golfer 301, allowing the athlete to compare their live motions to the DJS skeleton image overlaid upon their own live video image.

[0161] The image overlay 311 of the DJS model is synchronized to the motions of the golfer, triggered by the golfer's motions including, for example, the commencement of the backswing and again the commencement of the downswing. DJS model playback pauses until the golfer's next swing. AI-engine 310 dynamically changes its instruction images so that they gradually match the expert captured in the reference model more closely.

[0162] AI-engine 310 also outputs feedback analytics 314, which may be a report summarizing the golfer's performance or may include real time analytical data such as club angles, swing planes, etc. displayed as part of image overlay 311.

[0163] As shown in FIG. 31, AI-engine 310 optionally scores feedback analytics from launch sensor 316 data and video streaming file 304, where the calculated score 315 may be used to measure the golfer's performance, including comparing the golfer's swing to the swing of an expert. The measured data may also be used to measure the golfer's tee-off performance 322 against some evaluation criteria (e.g. angle, speed, calculated drive distance, etc.). The use of evaluation criteria provides a quantifiable measurement of an athlete's performance, including the following features:

[0164] Scoring 315 may be uploaded to data cloud 330 for comparing golfers 331a and 331b for competition, tournaments, gamification, rewards, tokenization, and gambling.

[0165] Combined with map details shown in FIG. 32, the launch analytics may also be used to calculate ball trajectory 332 across a course, which may be parametrically scored 315 for the ball's final destination 334, including the distance to the hole 336, landing on or off the green 335a or in the rough 335b or 335c, landing in a water or sand trap 335d, etc.

Performance evaluation can be used on a real golf course. Alternatively, the evaluation method can be applied to a golf simulator, where the athlete practices by hitting the ball into a net but the system evaluates the tee-off performance as if the athlete were on a real golf course. In this manner, a golfer can practice by following the actions of a professional or expert depicted by the DJS model but evaluate their performance against the course or against other golfers using the simulator.
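By way of non-limiting illustration, a calculated drive distance and the remaining distance to the hole may be estimated from launch speed and launch angle. The sketch below uses the textbook drag-free projectile range formula, which is an assumption made here and not the method of the disclosure; it ignores aerodynamic drag, lift from spin, wind, and terrain elevation, so a practical system would substitute an aerodynamic model. The function names and the numerical values are hypothetical.

```python
import math
from typing import Tuple

G = 9.80665  # standard gravitational acceleration in m/s^2


def vacuum_carry(speed_mps: float, launch_deg: float) -> float:
    """Drag-free carry distance on level ground: v^2 * sin(2*theta) / g."""
    return speed_mps ** 2 * math.sin(2.0 * math.radians(launch_deg)) / G


def distance_to_hole(landing_xy: Tuple[float, float],
                     hole_xy: Tuple[float, float]) -> float:
    """Straight-line distance from the landing point to the hole."""
    return math.hypot(hole_xy[0] - landing_xy[0], hole_xy[1] - landing_xy[1])


# Hypothetical launch data: 50 m/s at 45 degrees, aimed along the x axis.
carry = vacuum_carry(50.0, 45.0)
remaining = distance_to_hole((carry, 0.0), (300.0, 10.0))
```

The same two quantities could be computed whether the launch data come from a real tee-off or from a simulator session in which the ball is hit into a net.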

[0166] Although a smartphone or video camera combined with artificial intelligence can be used to evaluate an athlete's performance, other sensors may also be used in combination with the disclosed psychomotor learning system. For example, launch sensor 340 shown in FIG. 33 combines video image 342 with differential ultrasound or LIDAR (laser light-based radar) signals 341a and 341b to measure a golfer's swing more precisely than video alone can. Processed by AI-engine 310, the launch sensor 340 data can be used to precisely detect hand-angle 344 and club position 343, shoulder position 346, arm position 345, and waist angle 347. By analyzing a sequence of frames over time, positional data can be used to calculate swing speed and torque, including effective and applied arm torque 351a and 351b, effective and applied wrist torque 352a and 352b, and shoulder torque 350 as depicted in FIG. 34.
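By way of non-limiting illustration, swing speed and torque may be estimated from a sequence of frames by finite differences. The sketch assumes that the club shaft direction has already been extracted for each frame, treats the club as a rigid body rotating about the grip, and uses a hypothetical moment of inertia, club length, and frame rate; none of these values or function names come from the disclosure.

```python
import math
from typing import List, Sequence, Tuple


def angular_velocities(shafts: Sequence[Tuple[float, float]],
                       fps: float) -> List[float]:
    """Angular velocity (rad/s) between consecutive shaft direction vectors."""
    omegas = []
    for (x0, y0), (x1, y1) in zip(shafts, shafts[1:]):
        # Signed angle between consecutive directions; atan2 of the cross
        # and dot products avoids wrap-around at +/-180 degrees.
        dtheta = math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        omegas.append(dtheta * fps)
    return omegas


def torques(omegas: Sequence[float], fps: float, inertia: float) -> List[float]:
    """Torque (N*m) as inertia times the finite-difference angular acceleration."""
    return [inertia * (w1 - w0) * fps for w0, w1 in zip(omegas, omegas[1:])]


FPS = 240.0          # hypothetical capture rate
CLUB_LENGTH = 1.1    # metres, hypothetical
INERTIA = 0.5        # kg*m^2 about the grip, hypothetical

# Shaft direction advancing 0.1 rad per frame (synthetic data).
shafts = [(math.cos(0.1 * k), math.sin(0.1 * k)) for k in range(4)]
omegas = angular_velocities(shafts, FPS)
head_speed = abs(omegas[-1]) * CLUB_LENGTH   # club head speed in m/s
applied = torques(omegas, FPS, INERTIA)      # near zero at constant rotation
```

Because differentiation amplifies measurement noise, estimates of this kind improve with higher frame rates and with the more precise positional data that the launch sensor is described as providing.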

[0167] Torque, acceleration, and positional data optically measured by camera, while conveniently monitored, lack accuracy. Additional accuracy can be gained by including microelectromechanical systems (MEMS) sensors in balls, clubs, and other equipment. As shown in FIG. 35, MEMS sensor 360 is able to measure a number of parameters 361 including position, velocity, acceleration, and torque versus time. The sensor relays its data to a receiver via a low power RF link 362 and internal antenna 363 using low power Bluetooth, low power WiFi (such as 802.11ah), or telephonically using 4G/LTE or low bandwidth power-saving modes of the 5G communication protocol. In FIG. 36, down-range flight trajectory 269 of golf ball 364 can be detected using MEMS sensor data 360a including signal delay, air pressure, etc., and relayed back to a receiver using RF communication. Since golf ball 364 has a dimpled but otherwise uniform surface (spherical symmetry), torque information is not important in predicting a ball's trajectory. As such, sensor 360a need only detect relative position and acceleration and not rotational velocities.

[0168] As shown in FIG. 37A, baseball 365, however, is not spherically symmetrical because of its stitching. Baseball sensor 360b must measure torque in order to predict ball trajectory. This information is difficult to measure using a camera. As shown in FIG. 37B, sensors 360c and 360d in golf club 366 and sensors 360e and 360f in baseball bat 367 must detect torque to precisely describe a ball's trajectory.

[0169] As described, the AI-based system exhibits augmented cognition whereby the behavior of the golfer is trained to match the expert's performance while the AI-engine learns how best to gradually improve the golfer's performance. In the described system, the golfer can compare their actions to an expert reference using a real-time DJS overlay. As shown in FIG. 38A, a front side view combines DJS reference model 370a with live filtered video 372a, e.g. a silhouette image, to produce real-time DJS overlay 373a.

[0170] The process of scaling the DJS model to the live athlete or student allows the unscaled original DJS model 370a, having a height x.sub.h(ref) to be scaled in size to fit the height x.sub.h(live) of the live athlete image 372a. The resulting composite image, i.e. overlay 373a, thereby comprises a representative image of the live athlete 372a at full size and a scaled version of the DJS reference model 370c both consistent with the height x.sub.h(live) of the live athlete image 372a. Using artificial intelligence this scaling can be performed once at the onset of the live session or can be performed dynamically and repetitively to gradually improve the accuracy and fit of the model during each practice session.

[0171] Even without a rear-view camera, the AI system can also calculate and display the rearview image of the golfer in real time as depicted in FIG. 38B, combining a 180° rotation of DJS reference image 370b with a rotated image 372b, to produce real-time DJS overlay 373b. As such a DJS model can easily be rotated to match any image perspective of a live video feed, where video images cannot.

[0172] Aside from its advantage in image rotation, an AI-based graphics processor can execute scaling of a DJS model to match a live image or video feed of a student athlete in several ways. As shown in FIG. 38C, the unscaled DJS model 370a can be scaled proportionally or piecemeal using graphical edges, i.e. the segments between skeletal vertices. In proportional scaling, all the DJS model elements including the arms, legs, and torso are scaled by the same proportionality factor α, whereby the height x.sub.w(ref) of the model's waist from the ground is scaled to a value αx.sub.w(ref) and the total height is scaled from a value x.sub.h(ref) to αx.sub.h(ref). The golf club length x.sub.c(ref) may also be scaled proportionally or alternatively be scaled to match the actual length of the club as specified by the manufacturer's data sheet.

[0173] In a graphical edge scaled DJS model, every edge in the model is scaled separately to match the video frame of the live athlete, whereby the height x.sub.w(ref) of the model's waist from the ground is scaled to a value δx.sub.w(ref) and the total height is scaled from a value x.sub.h(ref) to βx.sub.h(ref) including separate scaling factors for the upper and lower legs, the torso, and the upper and lower arms. Even the golf club can be scaled separately from x.sub.c(ref) to γx.sub.c(ref).
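By way of non-limiting illustration, the proportional and graphical edge scaling methods may be sketched as follows. Edge lengths are given in metres; the edge names, the measured values, and the rule of falling back to the proportional factor for any edge that could not be measured in the live video are hypothetical choices made for this sketch.

```python
from typing import Dict


def scale_proportional(ref_edges: Dict[str, float],
                       ref_height: float, live_height: float) -> Dict[str, float]:
    """Scale every edge by the single factor alpha = live height / reference height."""
    alpha = live_height / ref_height
    return {edge: alpha * length for edge, length in ref_edges.items()}


def scale_per_edge(ref_edges: Dict[str, float],
                   live_edges: Dict[str, float],
                   fallback: float) -> Dict[str, float]:
    """Scale each edge by its own factor; unmeasured edges use the fallback factor."""
    scaled = {}
    for edge, length in ref_edges.items():
        factor = live_edges[edge] / length if edge in live_edges else fallback
        scaled[edge] = factor * length
    return scaled


# Reference model edge lengths (hypothetical values).
ref = {"torso": 0.60, "upper_leg": 0.45, "lower_leg": 0.45, "club": 1.10}

# Proportional scaling from a 1.80 m reference to a 1.62 m live athlete.
uniform = scale_proportional(ref, ref_height=1.80, live_height=1.62)

# Edge scaling: torso measured from video, club taken from a data sheet,
# legs not measured and therefore scaled by the proportional factor.
piecewise = scale_per_edge(ref, {"torso": 0.57, "club": 1.14},
                           fallback=1.62 / 1.80)
```

Proportional scaling needs only one measurement (height), whereas edge scaling fits athletes whose limb proportions differ from those of the reference expert at the cost of measuring each segment.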

[0174] Although filtered video images 372a and 372b are conveniently displayed as a silhouette, shadow, or glow, the filtered image can also comprise an outline, a low-resolution video, or an animated depiction of the golfer. Key advantages of this approach compared to any training aid available today include the following:

[0175] The DJS reference model is scaled in size to the golfer or athletic trainee. Reference videos cannot easily be scaled, especially when the source data comes from video archives, with some videos dating back several decades.

[0176] The DJS reference model can be rotated to match the camera angle of the live image of the golfer or athletic trainee.

[0177] The DJS model is overlaid atop the live image of the golfer or athletic trainee so the athlete doesn't have to compare two side-by-side images, which requires the eyes to pan back and forth between the two images, thereby distracting the athlete.

[0178] The DJS model skeleton eliminates unnecessary detail of the reference image of the expert athlete (such as hair, hats, clothes, etc.), which can clutter the overlay and obscure details of movement.

[0179] The filtered video representation eliminates unnecessary detail of the live athlete (such as hair, hats, clothes, etc.), which can clutter the video and obscure details of movement.

[0180] In one embodiment shown in FIG. 39, display of the overlay image 373b can be conveniently realized using a standing video display 402. The live video of golfer 403 used to create the overlay can be captured using a frontside camera 400 but may optionally include a rearview camera.

[0181] As shown in FIG. 40, to overcome the bandwidth limitations of real-time communication, high-resolution video 410 from camera 400 can be recorded and stored locally in non-volatile memory 411 while AI-engine 413 filters the data stream to produce (i) low-resolution video, (ii) silhouette video 416, or both for limited-bandwidth transmission 419 between RF-amplifier 417 and RF receiver 421 via antennas 418 and 420 respectively. AI-engine 413 can also create a real-time DJS overlay video 373b based on DJS model 414. Display 402 can therefore be used to display any combination of the video sources.

[0182] In particular, the DJS overlay of a live golfer's silhouette 416 and a skeletal model 430 of a reference or expert shown in FIG. 41 allows an athlete to immediately see what they are doing wrong and how their motion differs from that of a master athlete, e.g. the difference 433 between the student's club position 431 and the expert's club position 432. The value of contemporaneous feedback is that the student or trainee can improve their swing or stroke with each attempt. Unlike a looping video, the expert only swings when the student swings their club. The movement is natural, where the expert's swing pauses at the top at t=0.sup.− and doesn't resume until it is triggered by the beginning of the student's downswing at t=0.sup.+. At the beginning of training, the AI-engine accommodates the student's slow pace, learning what is comfortable to the student. Gradually, however, the AI-engine incrementally accelerates the pace until the student matches the expert's timing and performance and the skill is learned.
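By way of non-limiting illustration, the incremental acceleration of playback pace may be sketched as a simple update rule. The disclosure does not specify how the pace is adjusted; the rule below, in which the pace rises by a fixed step after each attempt the student keeps up with and is capped at the expert's pace, is a hypothetical example, as are the function name and the step size.

```python
from typing import List


def next_pace(pace: float, kept_up: bool, step: float = 0.25) -> float:
    """Return the playback pace for the next attempt.

    pace is a fraction of the expert's speed (1.0 means expert timing).
    The pace rises by one step only after an attempt the student kept up with.
    """
    if not 0.0 < pace <= 1.0:
        raise ValueError("pace must be in the interval (0, 1]")
    return min(1.0, pace + step) if kept_up else pace


# A session starting at half the expert's pace (hypothetical outcomes).
pace = 0.5
history: List[float] = []
for kept_up in [True, False, True, True]:
    pace = next_pace(pace, kept_up)
    history.append(pace)

# Playback duration of a segment stretches inversely with the pace.
expert_downswing_s = 0.3                      # hypothetical duration
student_downswing_s = expert_downswing_s / history[0]
```

In such a scheme the determination of whether the student kept up would come from comparing the live motion against the overlay, for example from the positional difference between the student's club and the expert's club.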

[0183] As shown in FIG. 42, the DJS overlay image 444 can be displayed 443 in a heads-up display 440 using a projector 441 or other heads up or holographic display techniques.

[0184] The benefit of contemporaneous feedback in psychomotor learning using augmented cognition is applicable to a wide range of activities including sports such as diving, skating, skiing, golf, tennis, basketball, hockey, weight lifting, archery, and baseball, as well as precision professional skills such as automotive repair, surgery, and sign language, and defense related activities such as marksmanship, martial arts, etc.
