Media content tracking of users' gazing at screens
11227307 · 2022-01-18
Assignee
Inventors
Cpc classification
H04N21/44204
ELECTRICITY
A61B5/0077
HUMAN NECESSITIES
A61B2576/00
HUMAN NECESSITIES
G16H50/20
PHYSICS
A61B5/7264
HUMAN NECESSITIES
A61B2503/12
HUMAN NECESSITIES
H04N21/41415
ELECTRICITY
H04N21/44218
ELECTRICITY
H04N21/4532
ELECTRICITY
H04N21/441
ELECTRICITY
International classification
G09G5/00
PHYSICS
H04N21/414
ELECTRICITY
G06F3/14
PHYSICS
A61B5/16
HUMAN NECESSITIES
A61B5/11
HUMAN NECESSITIES
H04N21/442
ELECTRICITY
G06F3/03
PHYSICS
H04N21/441
ELECTRICITY
H04N21/45
ELECTRICITY
Abstract
A method includes receiving a user identifier and instructing display systems to display media content based on the user identifier. Each display system has a corresponding screen. The method also includes receiving image data from an imaging system configured to have a field of view arranged to capture images of a user. The method further includes determining gaze characteristics of the user including a gaze target of the user. The method further includes determining whether the gaze target corresponds to one of the screens. When the gaze target corresponds to one of the screens, the method includes determining a time period of gaze engagement with the corresponding screen. The method also includes storing at least one of the gaze characteristics and the media content or an identifier of the media content displayed on the screen corresponding to the gaze target.
Claims
1. A method comprising: receiving, at data processing hardware, user identifiers associated with a plurality of users located in a display environment, each user identifier comprising a uniform resource locater (URL) indicating a respective genre of media content relating to the associated user; instructing, by the data processing hardware, each of a plurality of display systems within the display environment to concurrently display genres of media content based on the URLs associated with the plurality of users, each of the display systems having a corresponding screen concurrently viewable by the plurality of users located in the display environment; receiving, at the data processing hardware, image data from an imaging system configured to have a field of view arranged to capture images of the plurality of users while the plurality of users view the corresponding screens of the plurality of display systems; determining, by the data processing hardware, collective group gaze characteristics for the plurality of users based on the image data, the collective group gaze characteristics comprising gaze targets for each of the plurality of users; for each user of the plurality of users, determining, by the data processing hardware, (i) a time period of gaze engagement with a corresponding screen based on the respective gaze target of the respective user and (ii) the respective genre of media content being displayed on the corresponding screen during the time period of gaze engagement; and for at least one genre of media content associated with one of the plurality of users, generating, by the data processing hardware, a collective gaze engagement metric indicating a collective time period of gaze engagement based on an aggregate of the time periods of gaze engagement by the plurality of users with the at least one genre of media content.
2. The method of claim 1, wherein the collective gaze engagement metric comprises a ratio of the collective time period of gaze engagement to a total time of all of the plurality of users when the at least one genre of media content was being displayed within the display environment.
3. The method of claim 1, further comprising storing, by the data processing hardware, in memory hardware, the respective time period of gaze engagement of the respective user and the URL of the media content displayed on the screen corresponding to the gaze target.
4. The method of claim 1, further comprising storing, by the data processing hardware, in memory hardware, the collective gaze engagement metric, the at least one genre of media content associated with the one of the plurality of users, and the user identifiers of the plurality of users corresponding to the collective gaze engagement metric.
5. The method of claim 1, wherein instructing each of the plurality of display systems within the display environment to concurrently display genres of media content comprises instructing each display system to display a respective genre of media content for an interval of time, and wherein at least two display systems display different genres of media content at the interval of time.
6. The method of claim 1, further comprising identifying, by the data processing hardware, any genres of media content having received at least one of: a threshold time number of gazes by the plurality of users; or a threshold time period of gaze engagement by one or more of the plurality of users.
7. The method of claim 1, further comprising identifying, by the data processing hardware, the plurality of users in the display environment based on the image data.
8. The method of claim 7, further comprising: identifying, by the data processing hardware, facial features of the plurality of users based on the image data; and determining, by the data processing hardware, the user identifiers based on the facial features of the plurality of users.
9. The method of claim 1, wherein the imaging system comprises at least one of: a camera; a three-dimension volumetric point cloud imaging sensor; stereo cameras; a light detection and ranging (LIDAR) system; or a laser detection and ranging (LADAR) system.
10. The method of claim 1, wherein receiving the user identifiers comprises receiving a near-field measurement from an electro-magnetic near-field scanner.
11. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: receiving user identifiers associated with a plurality of users located in a display environment, each user identifier comprising a uniform resource locater (URL) indicating a respective genre of media content relating to the associated user; instructing each of a plurality of display systems within the display environment to concurrently display genres of media content based on the URLs associated with the plurality of users, each of the display systems having a corresponding screen concurrently viewable by the plurality of users located in the display environment; receiving image data from an imaging system configured to have a field of view arranged to capture images of the plurality of users while the plurality of users view the corresponding screens of the plurality of display systems; determining collective group gaze characteristics for the plurality of users based on the image data, the collective group gaze characteristics comprising gaze targets for each of the plurality of users; for each user of the plurality of users, determining (i) a time period of gaze engagement with a corresponding screen based on the respective gaze target of the respective user and (ii) the respective genre of media content being displayed on the corresponding screen during the time period of gaze engagement; and for at least one genre of media content associated with one of the plurality of users, generating a collective gaze engagement metric indicating a collective time period of gaze engagement based on an aggregate of the time periods of gaze engagement by the plurality of users with the at least one genre of media content.
12. The system of claim 11, wherein the collective gaze engagement metric comprises a ratio of the collective time period of gaze engagement to a total time of all of the plurality of users when the at least one genre of media content was being displayed within the display environment.
13. The system of claim 11, wherein the operations further comprise storing, in memory hardware, the respective time period of gaze engagement of the respective user and the URL of the media content displayed on the screen corresponding to the gaze target.
14. The system of claim 11, wherein the operations further comprise storing, in memory hardware, the collective gaze engagement metric, the at least one genre of media content associated with the one of the plurality of users, and the user identifiers of the plurality of users corresponding to the collective gaze engagement metric.
15. The system of claim 11, wherein instructing each of the plurality of display systems within the display environment to concurrently display genres of media content comprises instructing each display system to display a respective genre of media content for an interval of time, and wherein at least two display systems display different genres of media content at the interval of time.
16. The system of claim 11, further comprising identifying any genres of media content having received at least one of: a threshold time number of gazes by the plurality of users; or a threshold time period of gaze engagement by one or more of the plurality of users.
17. The system of claim 11, wherein the operations further comprise identifying the plurality of users in the display environment based on the image data.
18. The system of claim 17, wherein the operations further comprise: identifying facial features of the plurality of users based on the image data; and determining the user identifiers based on the facial features of the plurality of users.
19. The system of claim 11, wherein the imaging system comprises at least one of: a camera; a three-dimension volumetric point cloud imaging sensor; stereo cameras; a light detection and ranging (LIDAR) system; or a laser detection and ranging (LADAR) system.
20. The system of claim 11, wherein receiving the user identifiers comprises receiving a near-field measurement from an electro-magnetic near-field scanner.
Description
DESCRIPTION OF DRAWINGS
(12) Like reference symbols in the various drawings indicate like elements.
DETAILED DESCRIPTION
(13) As companies invest money and time into goods and services, they may use tools to determine how to attract consumer attention to those goods and services. Companies have therefore traditionally studied consumer habits and consumer behaviors with focus groups and surveys as a means of consumer research to receive consumer feedback and opinions. These traditional methods, however, often suffer from inherent biases such as poor question design or researcher bias. Consumers may also skew their responses to place themselves in a favorable public light. With these traditional means, consumer research struggles to capture organic consumer habits and consumer behaviors. A media content tracking environment enables a company to conduct consumer research related to media content while reducing traditional biases. In the media content tracking environment, a consumer or a user participates in a viewing session over a period of time. During the viewing session, the media content tracking environment feeds the user media content while observing, collecting, and storing image data regarding the user's interactions with the media content.
(15) Based on the user identifier 12, the processing system 110 is configured to display media content 20 to the user 10 by the display systems 120. Each display system 120, 120a-n of the display systems 120 has a corresponding screen 122 depicting media content 20. Some examples of the display systems 120 include televisions, monitors, and projector-and-screen combinations.
(16) Each user 10, 10a-n has gaze characteristics that the imaging system 300 identifies to determine whether the user 10 has an interest in the depicted media content 20. The gaze characteristics include a gaze target G.sub.T that corresponds to a subject of focus (i.e., a center of interest) within a field of view F.sub.V of the user 10. For example, referring to
(17) With continued reference to
(18) Additionally or alternatively, when the media content tracking environment 100 has more than one user 10 (e.g., first, second, and third users 10a-c), the processing system 110 may determine gaze characteristics of a group 11 of more than one user 10. The gaze characteristics of the group 11 may be collective group gaze characteristics or gaze characteristics of a single user 10 with reference to the group of more than one user 10. For example, the processing system 110 determines collective group gaze characteristics similar to the gaze characteristics of the user 10, such as a group collective time period T.sub.GE of gaze engagement (i.e., a summation of the time periods of gaze engagement of all users with reference to a genre 26 of media content 20 or a particular display screen 122). In some implementations, the processing system 110 determines a concentration C.sub.E of collective gaze engagement. The concentration C.sub.E of collective gaze engagement is a ratio of the collective time period T.sub.GE of gaze engagement to a total time (e.g., the total time of a user 10 or the total time of all users 10a-n) within the media content tracking environment 100. The ratio may be with reference to a particular display screen 122, a particular genre 26 of media content 20, a particular user 10 (C.sub.Euser), or the group 11 of more than one user 10 (C.sub.Egroup). Examples of the ratio are shown below in equations 1 and 2.
(19) C.sub.Euser=T.sub.GE,user/T.sub.total,user (1) and C.sub.Egroup=ΣT.sub.GE/ΣT.sub.total (2), where T.sub.total is the total time within the media content tracking environment 100.
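The concentration ratios described in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration, not part of the disclosure: the record layout, function names, and field names are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class GazeRecord:
    user_id: str          # user identifier 12
    genre: str            # genre 26 of the displayed media content 20
    engagement_s: float   # time period T_GE of gaze engagement, in seconds
    present_s: float      # total time the user spent in the environment, in seconds

def concentration_per_user(records, user_id, genre):
    """Sketch of equation (1): one user's engagement time with a genre
    divided by that user's total time in the environment."""
    rs = [r for r in records if r.user_id == user_id and r.genre == genre]
    total = sum(r.present_s for r in rs)
    return sum(r.engagement_s for r in rs) / total if total else 0.0

def concentration_group(records, genre):
    """Sketch of equation (2): summed engagement of all users with a genre
    divided by their summed total time in the environment."""
    rs = [r for r in records if r.genre == genre]
    total = sum(r.present_s for r in rs)
    return sum(r.engagement_s for r in rs) / total if total else 0.0
```

For instance, a user who watched a genre for 30 of the 60 seconds it was displayed would have a per-user concentration of 0.5.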
(20) In some implementations, the processing system 110 stores the gaze characteristics as gaze characteristic data in the memory hardware 114. In some examples, the processing system 110 stores all generated gaze characteristics within the memory hardware 114. In other examples, an entity, such as an end-user, a processing system programmer, or a media content tracking environment provider, provides parameters that function as thresholds so that only gaze characteristic data that qualifies according to the provided thresholds is stored. The entity may consider gaze characteristic data stored according to thresholds more meaningful to review or to evaluate than all generated gaze characteristics. For example, thresholds permit the entity to efficiently and effectively evaluate media content 20 provided within the media content tracking environment 100. A media content provider may use the media content tracking environment 100 to evaluate whether one type of media content 20 engages users 10 more effectively than another type of media content 20. With thresholds, the entity can easily identify a level of gaze engagement that interests the entity. For example, the level of gaze engagement may be set according to thresholds such that the entity receives gaze characteristics corresponding to a level of gaze engagement greater than the thresholds. The processing system 110 may include default thresholds or receive thresholds from an entity. Some example thresholds that the processing system 110 may receive and/or identify include a threshold number of gazes by at least one user 10, a threshold collective time period T.sub.GE of gaze engagement by the at least one user 10, a threshold concentration C.sub.E of collective gaze engagement by the at least one user 10, a threshold display time (i.e., a length of time provided media content 20 is displayed), or a threshold number of users 10.
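The threshold-gated storage described above might be sketched as follows. The threshold names and record shape are illustrative assumptions; the patent only enumerates the kinds of thresholds (gaze count, engagement time, concentration, display time, user count), not a data format.

```python
def passes_thresholds(stats, thresholds):
    """Return True only if a gaze-characteristics record meets every
    threshold the entity provided. Unprovided thresholds are not applied."""
    checks = {
        "min_gaze_count": lambda s, v: s.get("gaze_count", 0) >= v,
        "min_engagement_s": lambda s, v: s.get("engagement_s", 0.0) >= v,
        "min_concentration": lambda s, v: s.get("concentration", 0.0) >= v,
        "min_display_s": lambda s, v: s.get("display_s", 0.0) >= v,
        "min_users": lambda s, v: s.get("users", 0) >= v,
    }
    return all(checks[name](stats, value) for name, value in thresholds.items())

def store_qualifying(all_stats, thresholds):
    """Keep only the records whose gaze engagement exceeds the thresholds,
    mimicking storage of 'qualifying' gaze characteristic data."""
    return [s for s in all_stats if passes_thresholds(s, thresholds)]
```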
(21) Additionally or alternatively, the processing system 110 may store all gaze characteristic data, or only the gaze characteristic data 115 that satisfies thresholds, in a gaze characteristic database. The gaze characteristic database may be located within the memory hardware 114, on the network 116, or on the server 118. The gaze characteristic database may be configured such that an entity may filter the gaze characteristic data 115 according to filtering thresholds. For example, the filtering thresholds are values defined by the entity to remove or to hide gaze characteristic data so that the entity may review and evaluate a subset of the stored gaze characteristic data rather than all of it.
(24) Referring further to
(25) Additionally or alternatively, an imaging system 300 performs facial recognition 12, 12.sub.3 as the user identifier 12. The imaging system 300 may be the same imaging system 300 used to determine the gaze target G.sub.T of the user 10 or a dedicated imaging system 300, 300a for facial recognition 12, 12.sub.3. The imaging system 300 performs facial recognition 12, 12.sub.3 based on facial features 14 of the user 10. To perform facial recognition 12, 12.sub.3, the imaging system 300 captures at least one facial recognition image 310, 310a, generates corresponding image data 312, and communicates the image data 312 to the processing system 110. The processing system 110 is configured to identify and determine the user identifier 12 from the image data 312 based on facial features 14 of the user 10 captured by the at least one facial recognition image 310, 310a. In some examples, the processing system 110 communicates with a facial recognition database that compares image data 312 from the facial recognition database to image data 312 communicated to the processing system 110 from the imaging system 300. Generally, image data 312 for facial recognition 12, 12.sub.3 corresponds to several nodal points related to facial features 14 of a user 10, such as peaks and valleys around a mouth, a nose, eyes, a chin, a jawline, a hairline, etc. The processing system 110 may include facial recognition software to perform facial recognition 12, 12.sub.3.
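The disclosure does not specify a particular facial recognition algorithm, only that nodal-point features 14 are compared against a database. A minimal nearest-neighbor sketch under that assumption, with a hypothetical fixed-length feature vector standing in for the nodal points, might look like this:

```python
import math

def nearest_identity(probe, database, max_distance=0.6):
    """Match a probe vector of facial nodal-point features (hypothetical
    representation) against enrolled vectors, returning the closest user
    identifier, or None if no enrolled face is close enough.

    `database` maps user identifiers to enrolled feature vectors of the
    same length as `probe`; `max_distance` is an assumed match cutoff.
    """
    best_id, best_d = None, float("inf")
    for user_id, enrolled in database.items():
        d = math.dist(probe, enrolled)  # Euclidean distance between vectors
        if d < best_d:
            best_id, best_d = user_id, d
    return best_id if best_d <= max_distance else None
```

A production system would derive the feature vectors with trained facial-recognition software, as the paragraph above notes; only the comparison step is sketched here.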
(26) In some examples, the identification system 200 automatically launches media content 20. For example, the user identifier 12 (e.g., 12, 12.sub.1-3) of the user 10 includes user information corresponding to a genre 26 of media content 20 related to the user 10. When the identification system 200 identifies the user identifier 12 (e.g., by identification card 12, 12.sub.1,2 or facial recognition 12, 12.sub.3), the identification system 200 communicates uniform resource locators (URLs) within the user information to the processing system 110 such that the processing system 110 instructs the display system 120 to display a genre 26 of media content 20 related to the user 10 based on the URLs within the user identifier 12.
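The automatic launch described above amounts to mapping the genre URLs carried in the user information onto the available display systems. The following sketch assumes a simple round-robin assignment and dictionary shapes that are not specified in the disclosure:

```python
def launch_media(user_info, display_systems):
    """On identification of a user, produce per-screen display instructions
    from the genre URLs in the user information. `user_info` is assumed to
    be a dict with a 'genre_urls' list; `display_systems` a list of screen
    identifiers. The round-robin assignment is an illustrative choice."""
    instructions = {}
    for i, url in enumerate(user_info.get("genre_urls", [])):
        screen = display_systems[i % len(display_systems)]  # spread genres over screens
        instructions[screen] = url
    return instructions
```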
(29) Additionally or alternatively, the media content tracking environment 100 includes a calibration process. During the calibration process, a user 10 follows a sequence of gaze targets G.sub.T displayed on different screens 122 with the display systems 120 of the media content tracking environment 100. With the sequence preprogrammed, the processing system 110 stores image data from the calibration corresponding to each gaze target G.sub.T within the sequence to associate with image data generated after calibration when the user 10 receives non-calibration media content 20. From the association, the processing system 110 may more accurately determine gaze characteristics of the user 10.
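The calibration step above pairs each preprogrammed gaze target in the sequence with the image data captured while the user followed it. A minimal sketch of that bookkeeping, with an assumed data layout, could be:

```python
def build_calibration_map(sequence, samples):
    """Associate each calibration target (a screen identifier from the
    preprogrammed sequence) with the image data captured while the user
    gazed at it. One sample per target is assumed; a screen revisited in
    the sequence accumulates multiple samples."""
    if len(sequence) != len(samples):
        raise ValueError("one image sample expected per calibration target")
    calibration = {}
    for screen_id, image_data in zip(sequence, samples):
        calibration.setdefault(screen_id, []).append(image_data)
    return calibration
```

The resulting map is what later, non-calibration image data would be compared against to sharpen gaze-characteristic estimates.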
(37) At operation 608, the method 600 includes determining gaze characteristics of the user 10 based on the image data, the gaze characteristics including a gaze target G.sub.T of the user 10. In some examples, the method 600 also includes determining a gaze direction G.sub.D of the user 10. The method 600 further includes, at operation 610, determining whether the gaze target G.sub.T corresponds to one of the screens 122. When the gaze target corresponds to one of the screens 122, the method 600 proceeds to operation 612. Otherwise, when the gaze target does not correspond to one of the screens 122, the method 600 may end operations. At operation 612, the method 600 includes determining a time period of gaze engagement with the corresponding screen 122 based on the gaze characteristics of the user 10. At operation 612, the method 600 further includes storing at least one of the gaze characteristics of the user 10 or the time period of gaze engagement with the corresponding screen 122, and the media content 20 or an identifier 22 of the media content 20 displayed on the screen 122 corresponding to the gaze target G.sub.T. Additionally or alternatively, the method 600 may further include identifying genres 26 of media content 20 receiving gaze engagement by the users 10 based on the associations of the time periods of gaze engagement of the users 10 with the corresponding media content 20. Optionally, the method 600 may include storing the identified genres 26 of media content 20.
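The flow of operations 608 through 612 can be sketched as a per-frame loop: test whether the gaze target falls on a screen, and if so, accumulate engagement time and record the content identifier being shown. Screen bounding boxes, the sample format, and a fixed frame duration are all assumptions for illustration:

```python
def hit_test(point, screens):
    """Operation 610 sketch: return the id of the screen whose bounding box
    (x0, y0, x1, y1) contains the gaze target point, or None."""
    x, y = point
    for screen_id, (x0, y0, x1, y1) in screens.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return screen_id
    return None

def track_gaze(samples, screens, frame_s=1.0):
    """Operations 608-612 sketch. Each sample is (gaze_xy, content_by_screen),
    where content_by_screen maps screen ids to the content identifier 22
    currently displayed. Accumulates per-screen engagement time and the set
    of content identifiers viewed."""
    engagement = {}
    for gaze_xy, content_by_screen in samples:
        screen = hit_test(gaze_xy, screens)
        if screen is None:
            continue  # gaze target off-screen: nothing to store
        entry = engagement.setdefault(screen, {"seconds": 0.0, "content_ids": set()})
        entry["seconds"] += frame_s  # time period of gaze engagement
        entry["content_ids"].add(content_by_screen[screen])
    return engagement
```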
(39) The computing device 700 includes a processor 710, memory 720, a storage device 730, a high-speed interface/controller 740 connecting to the memory 720 and high-speed expansion ports 750, and a low-speed interface/controller 760 connecting to a low-speed bus 770 and the storage device 730. Each of the components 710, 720, 730, 740, 750, and 760 is interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 710 can process instructions for execution within the computing device 700, including instructions stored in the memory 720 or on the storage device 730, to display graphical information for a graphical user interface (GUI) on an external input/output device, such as the display 780 coupled to the high-speed interface 740. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Also, multiple computing devices 700 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
(40) The memory 720 stores information non-transitorily within the computing device 700. The memory 720 may be a computer-readable medium, a volatile memory unit(s), or non-volatile memory unit(s). The non-transitory memory 720 may be physical devices used to store programs (e.g., sequences of instructions) or data (e.g., program state information) on a temporary or permanent basis for use by the computing device 700. Examples of non-volatile memory include, but are not limited to, flash memory and read-only memory (ROM)/programmable read-only memory (PROM)/erasable programmable read-only memory (EPROM)/electronically erasable programmable read-only memory (EEPROM) (e.g., typically used for firmware, such as boot programs). Examples of volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), phase change memory (PCM) as well as disks or tapes.
(41) The storage device 730 is capable of providing mass storage for the computing device 700. In some implementations, the storage device 730 is a computer-readable medium. In various different implementations, the storage device 730 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. In additional implementations, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer- or machine-readable medium, such as the memory 720, the storage device 730, or memory on processor 710.
(42) The high speed controller 740 manages bandwidth-intensive operations for the computing device 700, while the low speed controller 760 manages lower bandwidth-intensive operations. Such allocation of duties is exemplary only. In some implementations, the high-speed controller 740 is coupled to the memory 720, the display 780 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 750, which may accept various expansion cards (not shown). In some implementations, the low-speed controller 760 is coupled to the storage device 730 and a low-speed expansion port 790. The low-speed expansion port 790, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
(43) The computing device 700 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 700a or multiple times in a group of such servers 700a, as a laptop computer 700b, or as part of a rack server system 700c.
(45) At operation 810, for each user 10 of the plurality of users 10, the method 800 includes determining a time period of gaze engagement T.sub.GE with a corresponding screen 122 based on the respective gaze target G.sub.T of the respective user 10 and the respective genre 26 of media content 20 being displayed on the corresponding screen 122 during the time period T.sub.GE of gaze engagement. At operation 812, for at least one genre 26 of media content 20 associated with one of the plurality of users 10, the method 800 includes generating a collective gaze engagement metric C.sub.E. The collective gaze engagement metric C.sub.E indicates a collective time period of gaze engagement based on an aggregate of the time periods of gaze engagement T.sub.GE by the plurality of users 10 with the at least one genre 26 of media content 20.
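The aggregation in operation 812 reduces to summing each user's engagement time with a genre. A one-function sketch, assuming an input shape of `{user_id: {genre: seconds}}` that the disclosure does not specify:

```python
def collective_gaze_metric(per_user_engagement, genre):
    """Operation 812 sketch: aggregate every user's time period of gaze
    engagement T_GE with a genre into the collective engagement time for
    that genre. Users who never gazed at the genre contribute zero."""
    return sum(genres.get(genre, 0.0) for genres in per_user_engagement.values())
```

This collective time could then be divided by the users' total time in the environment to yield the concentration ratio C.sub.E described earlier.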
(52) Various implementations of the systems and techniques described herein can be realized in digital electronic and/or optical circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
(53) These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, non-transitory computer readable medium, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor.
(54) The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit). Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
(55) To provide for interaction with a user, one or more aspects of the disclosure can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube), LCD (liquid crystal display) monitor, or touch screen for displaying information to the user and optionally a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.
(56) A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Accordingly, other implementations are within the scope of the following claims.