Multimodal Dialog-Based Remote Patient Monitoring of Motor Function
20230137366 · 2023-05-04
Inventors
- Oliver Roesler (Weyhe, DE)
- William BURKE (Portland, OR, US)
- Hardik Kothare (Burlingame, CA, US)
- Jackson LISCOMBE (Mill River, MA, US)
- Michael Neumann (Waiblingen, DE)
- Andrew Cornish (Gore, NZ)
- Doug HABBERSTAD (Savanna, GA, US)
- David Pautler (San Francisco, CA, US)
- David Suendermann-Oeft (San Francisco, CA, US)
- Vikram Ramanarayanan (San Francisco, CA, US)
CPC classification
G16H20/30
PHYSICS
A61B5/0077
HUMAN NECESSITIES
A61B5/7246
HUMAN NECESSITIES
G16H50/30
PHYSICS
A61B5/4082
HUMAN NECESSITIES
G16H50/70
PHYSICS
A61B5/4803
HUMAN NECESSITIES
A61B5/1121
HUMAN NECESSITIES
A61B5/6898
HUMAN NECESSITIES
International classification
A61B5/11
HUMAN NECESSITIES
G16H50/30
PHYSICS
Abstract
A system and method for remote monitoring of patient motor functions includes a computing device that uses captured image data depicting a patient's body part and, based on movement information, detects whether a condition may exist that is affecting motor functions. The body part can be a hand that is tracked as the user performs a tapping exercise. The body part can also include the patient's face, both during speech and without speech.
Claims
1. A method for remote patient monitoring of motor functions, comprising: scanning, by a computing device, a body part of a user; mapping, by the computing device, at least two points on the body part of a user; detecting, by the computing device, a relative movement between the at least two points; comparing, by the computing device, the detected relative movement against at least one metric; and determining, by the computing device, the existence of a condition based on the comparison of the detected relative movement against the at least one metric.
2. The method of claim 1, wherein: the body part comprises a hand of the user; and the at least two points comprise a point on a thumb on the hand and a point on a finger on the hand.
3. The method of claim 2, wherein the relative movement comprises a movement of the point on the thumb relative to the point on the finger of the hand during the execution of a tapping exercise.
4. The method of claim 3, wherein detecting the relative movement further comprises detecting at least one of a relative movement distance and a tapping time during the execution of the tapping exercise.
5. The method of claim 1, wherein the scanning by a computing device further comprises: capturing, by an image sensor, image data that includes an image of the body part; and recognizing, by the computing device using image recognition, the body part as a usable body part.
6. The method of claim 5, wherein the image data comprises video data, and wherein the image sensor is located remotely from a care provider.
7. The method of claim 1, wherein the detected relative movement comprises a repeated relative movement and the at least one metric comprises a decrease in a speed of movement or of a range of movement during the repeated relative movement.
8. The method of claim 7, wherein the condition comprises a disorder of motor function.
9. The method of claim 1, wherein the body part of a user comprises a user's face and the at least two points on the body part comprise a point on a lip of the user and a point on the jaw of the user.
10. The method of claim 1, further comprising: recording, by an audio capture device, speech audio from the user; comparing, by the computing device, at least one speech characteristic against a speech metric; determining, by the computing device, the existence of the condition based on the comparison of the at least one speech characteristic against the speech metric and the comparison of the detected relative movement against the at least one metric.
11. The method of claim 10, wherein the at least one speech characteristic comprises at least one of a speaking rate or a speaking duration.
12. The method of claim 1, wherein the step of determining the existence of a condition further comprises: gathering diagnostic data; and delivering the diagnostic data to a care provider.
13. The method of claim 12, wherein the diagnostic data comprises a measured change in the detected relative movement across multiple repetitions.
14. The method of claim 12, wherein the diagnostic data comprises at least one of a measured amplitude and speed across a tapping exercise.
15. The method of claim 1, wherein the scanning is performed via at least one of a video camera, an infrared sensor, and a wearable sensor.
Description
BRIEF DESCRIPTION OF THE DRAWING
DETAILED DESCRIPTION
[0026] Throughout the following discussion, numerous references will be made regarding servers, services, interfaces, engines, modules, clients, peers, portals, platforms, or other systems formed from computing devices. It should be appreciated that the use of such terms is deemed to represent one or more computing devices having at least one processor (e.g., ASIC, FPGA, DSP, x86, ARM, ColdFire, GPU, multi-core processors, etc.) programmed to execute software instructions stored on a tangible, non-transitory computer-readable medium (e.g., hard drive, solid state drive, RAM, flash, ROM, etc.). For example, a server can include one or more computers operating as a web server, database server, or other type of computer server in a manner to fulfill described roles, responsibilities, or functions. One should further appreciate that the disclosed computer-based algorithms, processes, methods, or other types of instruction sets can be embodied as a computer program product comprising a tangible, non-transitory computer-readable medium storing the instructions that cause a processor to execute the disclosed steps. The various servers, systems, databases, or interfaces can exchange data using standardized protocols or algorithms, possibly based on HTTP, HTTPS, AES, public-private key exchanges, web service APIs, known financial transaction protocols, or other electronic information exchanging methods. Data exchanges can be conducted over the Internet, a LAN, a WAN, a VPN, or other type of packet-switched network.
[0027] The following discussion provides many example embodiments of the inventive subject matter. Although each embodiment represents a single combination of inventive elements, the inventive subject matter is considered to include all possible combinations of the disclosed elements. Thus, if one embodiment comprises elements A, B, and C, and a second embodiment comprises elements B and D, then the inventive subject matter is also considered to include other remaining combinations of A, B, C, or D, even if not explicitly disclosed.
[0029] The system 100 includes a computing device 110 that is communicatively coupled with a sensor 120. In the embodiments shown herein, sensor 120 is considered to be an image sensor (e.g., a camera) that is capable of capturing images of a body part 130 (in this example, a user's hand). However, it is contemplated that other sensors 120 could be used. For example, for embodiments that involve a patient speaking as a part of the evaluation, the sensor 120 can include a microphone.
[0030] The computing device 110 can include the camera 120 or can be separate from the camera 120. Suitable computing devices can include smartphones, tablets, desktop computers, laptop computers, gaming consoles, etc.
[0031] In embodiments, the computing device 110 is local to the user. In these embodiments, the computing device 110 can obtain information and execution code from a remote server in order to carry out the processes of the inventive subject matter locally.
[0032] In embodiments, the camera 120 can be local to the user (such as a standalone camera with data-exchange capability, or a camera within a local computing device) and computing device 110 can be remote from the user (e.g., a remote server) and connected to the device of camera 120 via a data exchange network such as the internet. In these embodiments some or all of the processes associated with the computing device 110 can be performed remotely. For example, in some embodiments, the remote computing device 110 performs all of the processes and the local computing device with camera 120 is only used for image capture and other user interactions. In a variation of these embodiments, some of the processes can be carried out locally by a local computing device while others are carried out remotely by a remote computing device, thus distributing the computing load.
[0033] The image data captured by the camera 120 is preferably video image data, though a series of still images can also be used.
[0034] It should be noted that in the embodiments discussed herein, the patient can be (and often is) remotely located from any health care provider. For example, the patient may be at home, geographically distant from their health care provider.
[0036] Prior to the start of the processes discussed herein, the computing device 110 can be programmed to detect the presence of necessary hardware (e.g., a camera 120, microphone, etc.) and can conduct tests of the camera, microphone, speaker, etc. that will be used by the patient. The tests of these devices can include tests to determine that the devices are providing sensor data of a sufficient quality (e.g., proper microphone sensitivity, sufficient video resolution and frame rate, etc.) for the tests discussed here. A patient can access the functions of the system via a weblink, login portal, or other known methods of accessing networked or distributed computer systems.
[0037] At step 210, a camera 120 captures image data of a part of a patient's body that is to be used for the test. In this example, the image data depicts the patient's hand 130.
[0038] By applying image recognition techniques, the computing device 110 recognizes the body part in the image data. In embodiments, the computing device 110 can provide instructions if it detects that the body part is not fully within the image or the image is otherwise unusable (e.g., glare, lack of focus, etc.). Thus, for example, if the hand is too close to the camera 120 such that the relevant portions of the hand are not fully visible, the computing device 110 displays instructions to the user to move the hand away.
[0039] At step 220, the computing device maps points on the body part 130 based on the captured image. The points include two active points that are to be used to determine a relative movement between the points for the purposes of the patient test.
[0041] In embodiments, the computing device maps the points by recognizing the body part (e.g., the hand of the user) via image recognition and assigning landmark points to the recognized body part.
[0042] In other embodiments, the points are physically marked on the patient's body part (such as with a marker). These physical marks then appear in the image data of the body part and are detected by the computing device at step 220.
[0043] In embodiments, software such as MediaPipe Hands can be used for hand and hand landmark (i.e., the points 310, 320) detection.
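Where landmark-detection software such as MediaPipe Hands is used, the relative distance between the two active points can be derived directly from the landmark coordinates. The sketch below is illustrative, not taken from the source: each landmark is assumed to be a plain (x, y) tuple in normalized image coordinates, and indices 4 and 8 correspond to MediaPipe's thumb-tip and index-fingertip landmarks.

```python
import math

# MediaPipe Hands reports 21 landmarks per hand; index 4 is the thumb tip
# and index 8 is the index fingertip. Here each landmark is modeled as a
# plain (x, y) tuple in normalized image coordinates.
THUMB_TIP, INDEX_TIP = 4, 8

def tap_aperture(landmarks):
    """Euclidean distance between the two active points (thumb and index tips)."""
    (x1, y1), (x2, y2) = landmarks[THUMB_TIP], landmarks[INDEX_TIP]
    return math.hypot(x2 - x1, y2 - y1)

# Hypothetical frame: 21 landmarks, mostly placeholders.
frame = [(0.0, 0.0)] * 21
frame[THUMB_TIP] = (0.40, 0.60)
frame[INDEX_TIP] = (0.43, 0.64)
print(round(tap_aperture(frame), 3))  # 0.05
```

Computing this per frame yields the distance trace used by the tracking and pause-detection steps discussed below.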
[0044] At step 230, the computing device 110 tracks the movement of the active points 320 as the user performs an exercise.
[0045] The exercises, such as the tapping exercise, can be known exercises used for the diagnosis of conditions. For example, the tapping exercises can be those of section 3.4 of the Movement Disorder Society-Unified Parkinson's Disease Rating Scale (“MDS-UPDRS”). In this case, the exercise is a tapping exercise that requires the user to tap their thumb and pointer finger together.
[0046] In embodiments of the inventive subject matter, the computing device 110 can display prompts or instructions that show a user how to perform the exercise.
[0047] The tracking of the movements can include timing each of the individual taps during the exercise and measuring the range of motion of each tap.
[0048] The computing device 110 can also monitor for interruptions or pauses during the exercise by determining the change in position on a frame-by-frame basis and detecting a pause by determining that the position of the fingers has not changed for a certain number of frames after being in motion.
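The frame-by-frame pause check can be sketched as follows. This is a minimal illustrative implementation; the stillness threshold, minimum frame count, and function name are assumptions rather than details from the source. It operates on a per-frame series of tracked distances (e.g., thumb-index apertures).

```python
def detect_pauses(distances, eps=1e-3, min_frames=5):
    """Return (start, end) frame spans where the frame-to-frame change in
    the tracked distance stays below eps for at least min_frames frames,
    after motion has first been observed."""
    pauses = []
    run_start = None   # first frame of the current still run, if any
    moving = False     # becomes True once motion has been seen
    for i in range(1, len(distances)):
        still = abs(distances[i] - distances[i - 1]) < eps
        if still and moving:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                pauses.append((run_start, i))
            run_start = None
            if not still:
                moving = True
    # close out a still run that lasts to the end of the recording
    if run_start is not None and len(distances) - run_start >= min_frames:
        pauses.append((run_start, len(distances)))
    return pauses

# Hypothetical distance trace: motion, a five-frame hold, then motion again.
trace = [0.0, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.3]
print(detect_pauses(trace))  # [(3, 8)]
```

Requiring prior motion avoids flagging the idle frames before the patient begins the exercise as a pause.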
[0049] In embodiments of the inventive subject matter, the computing device 110 tracks the movement to determine that the patient is performing the exercise correctly. If the computing device 110 determines that the patient is not performing the exercise correctly, it can provide feedback by way of instructions to assist the patient in correcting the way they are performing the exercise. To do so, the computing device 110 can compare the patient's captured movement against templates of tracked movements to determine a similarity within a percentage threshold. For example, based on the tracking of the active points 320 as well as other points 310 on the hand, the computing device 110 may determine that the user is not fully extending their finger and thumb during the exercise. In another example, tracking the points 320, 310 can allow the computing device 110 to determine that the patient is using the wrong finger to tap with the thumb.
[0050] The instructions provided by the computing device 110 can include textual instructions displayed on a screen, audio instructions, and/or a video or animation that illustrates the correct way of performing the exercise.
[0051] At step 240, the computing device 110 compares the tracked movement of the active points 320 against one or more baseline metrics. The baseline metrics are metrics that can correspond to the movements of a person having normal or unaffected motor functions.
[0052] The baseline metrics can include speed metrics (i.e., that set a baseline for how long a single repetition of the finger tapping exercise should take across its full range of motion), a range of motion metric (i.e., that sets a baseline regarding the range of motion between the point at which the thumb and finger touch and the point of the motion when they are farthest apart), and consistency metrics (i.e., that set a baseline regarding the consistency of the speed and/or range of motion of each tap during the entire tapping exercise).
[0053] In embodiments, the baseline metrics can be set based on historical data from the user such that a baseline for that particular user can be set. In other embodiments, the baseline metrics can be set based on the first tap or first set of taps performed during the exercise (i.e., reflecting a “rested” condition on the part of the user) and the subsequent taps compared against these baseline taps.
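The "first taps as baseline" variant can be realized as sketched below. The function name and the choice of three baseline taps are illustrative assumptions; the idea is simply to express each later tap relative to the mean of the initial "rested" taps.

```python
def change_vs_baseline(values, n_baseline=3):
    """Relative change of each subsequent tap measurement (e.g., amplitude
    or speed) versus the mean of the first n_baseline 'rested' taps.
    Negative values indicate a decline from the baseline."""
    base = sum(values[:n_baseline]) / n_baseline
    return [(v - base) / base for v in values[n_baseline:]]

# Hypothetical per-tap amplitudes: later taps shrink versus the first three.
amps = [1.0, 1.0, 1.0, 0.8, 0.5]
print([round(c, 2) for c in change_vs_baseline(amps)])  # [-0.2, -0.5]
```

A sustained negative trend in this series is the kind of amplitude decrement the comparison at step 240 looks for.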
[0054] Thus, for finger tapping exercises that ask the user to tap as quickly as they can, the tracked motion is compared against speed metrics; for finger tapping exercises that ask the user to make the tapping movement as wide as possible, the tracked motion is compared against range of motion metrics, etc.
[0055] The consistency metrics can capture, for example, a slowing of the pace of the finger taps from one tap to the next across the exercise.
[0056] In embodiments where the system 100 implements the tests from section 3.4 of the MDS-UPDRS, the computing device 110 tracks the regularity and smoothness of the rhythm during the tapping exercise (e.g., interruptions or hesitations), the slowing of the pace during the exercise and a change in the amplitude (the range of motion between the fully opened hand and fingers touching, and back) of the movements after the start. The metrics used in these embodiments can include the number of interruptions, the amount of slowing of the pace, or the decrease in amplitude after a certain number of repetitions. To do so, the computing device 110 determines a maximum distance, a maximum velocity and a maximum acceleration across all of the cycles (taps) during the exercise, a difference between the average velocity and acceleration during the first and second half of the exercise, a “jitter” (a cycle-to-cycle variation of the time period), and a “shimmer” (a cycle-to-cycle variation of the amplitude).
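The "jitter" and "shimmer" measures described above (cycle-to-cycle variation of tap period and amplitude) can be sketched with one shared helper. Expressing the variation as the mean absolute difference between consecutive cycles, relative to the overall mean, is one common convention; it is assumed here rather than taken from the source.

```python
from statistics import mean

def cycle_variability(values):
    """Mean absolute difference between consecutive cycles, relative to the
    overall mean. Applied to tap periods this is a 'jitter'-style measure;
    applied to tap amplitudes, a 'shimmer'-style measure."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)

# Hypothetical tapping exercise: periods (seconds) lengthen while
# amplitudes (normalized distance) shrink over the repetitions.
periods = [0.50, 0.52, 0.55, 0.60, 0.66]
amplitudes = [0.30, 0.29, 0.27, 0.24, 0.20]
jitter = cycle_variability(periods)      # ~0.071
shimmer = cycle_variability(amplitudes)  # ~0.096
```

The same per-cycle series also support the other quantities named above, such as maximum velocity per cycle or the first-half versus second-half velocity difference.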
[0057] At step 250, the computing device determines the existence of a condition based on the comparison of the tracked movement against the one or more metrics of step 240.
[0058] The comparison at step 240 can be against more than one metric such that the combination of results is used to determine the patient's condition.
[0059] In embodiments that use the MDS-UPDRS, the comparison of the performance of the patient during the test against the metrics is scored. The score can then be used as a part of an assessment to determine whether the patient has a condition (or likely has a condition) and the severity of the condition.
[0060] In embodiments of the inventive subject matter, the computing device 110 is also programmed to execute a virtual dialog agent that engages with a patient to elicit certain speech and facial behaviors. The virtual dialog agent provides instructions to the user such that the patient responds in manner that enables the computing device to detect and analyze the speech in accordance with section 3.1 of the MDS-UPDRS.
[0061] The condition detected can include motor function disorders, neurological disorders, Parkinson's disease, or other conditions.
[0062] In embodiments, step 250 can also include gathering and delivering diagnostic data to a health care provider. In these embodiments, the computing device 110 gathers data associated with the performance of the test by the patient. For example, for a finger tapping exercise, the diagnostic data could include measured amplitude and/or speed during the exercise. The diagnostic data could also or instead include measured changes in the detected movement of the patient's body part during the test, such as a decrease in the amplitude or a slow-down in the speed as the test progresses.
[0063] Once the computing device 110 has gathered the data from the test(s), it can transmit the data to the computing device(s) of one or more health care providers. As mentioned herein, these providers may be geographically remote from the patient.
[0065] To capture speech, the computing device 110 (via the virtual dialog agent) asks the patient questions and then, via the camera 120 (which includes a microphone), captures the response at step 410. The questions are typically open-ended to elicit spoken responses that extend beyond “yes” or “no” responses.
[0066] The computing device 110 then evaluates the speech according to one or more metrics at step 420. The metrics used in the evaluation of the speech include volume, modulation (prosody), and clarity. Clarity metrics can include detecting slurring, palilalia (repetition of syllables) and tachyphemia (rapid speech, running syllables together).
[0067] To evaluate the speech, the computing device 110 employs speech recognition software. For example, the speech recognition software can transcribe what it “understands” and then compare that against known words and phrases to determine the level of correct or accurate understanding. The speech recognition software can also detect the repetition of syllables and the speech speed and cadence.
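As a sketch of this transcript-based evaluation, the snippet below compares a recognizer's output against the expected prompt and flags immediate word repetitions. The function names and the use of Python's difflib sequence similarity are illustrative assumptions, not details from the source.

```python
from difflib import SequenceMatcher

def clarity_score(expected, transcribed):
    """Rough intelligibility proxy: word-sequence similarity (0.0-1.0)
    between the prompt the patient read and the recognizer's transcript."""
    return SequenceMatcher(None, expected.lower().split(),
                           transcribed.lower().split()).ratio()

def immediate_repetitions(transcribed):
    """Count back-to-back repeated words, a crude palilalia cue."""
    tokens = transcribed.lower().split()
    return sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)

print(round(clarity_score("the rainbow is a division of white light",
                          "the rainbow is division of white light"), 2))  # 0.93
print(immediate_repetitions("the the rainbow is a division"))  # 1
```

Speaking rate and cadence would come from the recognizer's word timestamps rather than the bare transcript, so they are not modeled here.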
[0068] At step 430, the computing device 110 scores the evaluated speech against a scoring metric. One such scoring metric is in section 3.1 of the MDS-UPDRS, which assigns point values based on the patient's modulation, diction, volume and ease of understanding.
[0069] Based on the scoring of the speech, the computing device 110 then determines a possible condition at step 440.
[0070] It is contemplated that the movement-tracking and speech-analysis processes discussed herein can be performed together.
[0071] A technique for the collection and use of speech in the determination of conditions that could be applied to the methods and systems discussed herein is discussed in Applicant's own provisional application 63/273,829 titled “On the robust automatic computation of speaking and articulation duration in ALS patients versus healthy controls”, incorporated by reference in its entirety.
[0072] In embodiments of the inventive subject matter, the computing device 110 is also programmed to analyze image data of the patient's face in accordance with section 3.2 of the MDS-UPDRS.
[0073] To do so, the computing device 110 receives image data (video image data or a series of still images) from the camera 120 that is capturing the patient's face at step 510.
[0074] At step 520, the computing device 110 can map points to the user's face (similar to the mapping of step 220 with the hand).
[0075] In other embodiments, facial recognition software can be used that can detect facial features and their movements.
[0076] At step 530, the computing device 110 tracks the movement of the various facial features during the exercise. The exercise can be performed both with talking and without talking.
[0077] Based on the detected movements of the various facial features, the computing device 110 analyzes the movements to determine metrics at step 540. The metrics can include eye-blink frequency, masked facies (loss of facial expression), smiling, and parting of lips. Thus, the computing device 110 determines the number of blinks/blink frequency, changes in facial expression, smiling, parting of lips, etc. based on the movement of the facial features.
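The blink-frequency metric, for example, can be derived from a per-frame eye-openness signal produced by the facial landmark tracking. The following sketch, with an assumed openness threshold and function names, counts open-to-closed transitions as blinks and converts the count to a rate.

```python
def count_blinks(eye_openness, closed_thresh=0.2):
    """Count blinks as open-to-closed transitions in a per-frame
    eye-openness signal (e.g., a normalized eyelid gap)."""
    blinks, was_open = 0, False
    for v in eye_openness:
        if v >= closed_thresh:
            was_open = True
        elif was_open:
            blinks += 1
            was_open = False
    return blinks

def blinks_per_minute(eye_openness, fps):
    """Convert the blink count to a rate using the camera frame rate."""
    return count_blinks(eye_openness) * 60.0 * fps / len(eye_openness)

# Hypothetical 8-frame trace with two brief eye closures.
trace = [0.5, 0.5, 0.1, 0.1, 0.5, 0.5, 0.1, 0.5]
print(count_blinks(trace))  # 2
```

The other facial metrics (lip parting, smiling, loss of expression) would follow the same pattern of thresholding a landmark-derived distance or deformation signal over time.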
[0078] At step 550, the metrics are scored by the computing device 110. An example of how the scoring can occur is included in MDS-UPDRS section 3.2.
[0079] Based on the scoring, the computing device 110 determines a condition or possible condition at step 560.
[0080] The facial analysis processes can be performed together with the movement and speech processes discussed herein.
[0081] In embodiments of the inventive subject matter, the computing device 110 is programmed to detect correlations between the tracked movements and speech across the different tests discussed herein.
[0082] In the embodiments shown herein, a camera is used as the primary sensor for the purposes of the processes discussed herein. However, other sensors can be used instead of or in addition to a camera. Other sensors that could be used include an infrared sensor, a stand-alone microphone (for tests that use speech), wearable sensors (e.g., gloves with sensors at the fingertips capable of detecting movement and speed), and other sensors.
[0083] As used herein, and unless the context dictates otherwise, the term “coupled to” is intended to include both direct coupling (in which two elements that are coupled to each other contact each other) and indirect coupling (in which at least one additional element is located between the two elements). Therefore, the terms “coupled to” and “coupled with” are used synonymously.
[0084] It should be apparent to those skilled in the art that many more modifications besides those already described are possible without departing from the inventive concepts herein. The inventive subject matter, therefore, is not to be restricted except in the spirit of the appended claims. Moreover, in interpreting both the specification and the claims, all terms should be interpreted in the broadest possible manner consistent with the context. In particular, the terms “comprises” and “comprising” should be interpreted as referring to elements, components, or steps in a non-exclusive manner, indicating that the referenced elements, components, or steps may be present, or utilized, or combined with other elements, components, or steps that are not expressly referenced. Where the specification or claims refer to at least one of something selected from the group consisting of A, B, C . . . and N, the text should be interpreted as requiring only one element from the group, not A plus N, or B plus N, etc.