Method and device for processing a multimedia stream to verify access rights
11039223 · 2021-06-15
Assignee
Inventors
Cpc classification
G06F21/105
PHYSICS
H04N21/8193
ELECTRICITY
International classification
Abstract
The invention relates to a method for processing multimedia streams, in particular to verify access rights to a content of said multimedia stream, the multimedia stream being provided by a server device connected to a communication network, and downloadable by a client device via data formatted in a markup language, by using software for browsing and displaying data formatted in a markup language, cooperating with software for reading multimedia streams, the multimedia stream comprising at least one video stream able to be displayed in the form of pixels in a display zone (42, 44, 46, 48) of a display screen of the client device. The method includes determining a display zone (42), allocated by said browsing and display software to the software for reading multimedia streams in order to display a video stream extracted from said multimedia stream, and recording at least one digital image formed by values of the pixels to be displayed in the display zone (42) determined at a given time.
Claims
1. A method for processing multimedia streams, in particular for verifying access rights to content of a multimedia stream, the multimedia stream being provided by a server device connected to a communication network, and downloadable by a client device via data formatted in a markup language, by using software for browsing and displaying data formatted in a markup language, cooperating with software for reading multimedia streams, the multimedia stream comprising at least two video streams comprising a main video stream and at least one additional video stream, each video stream having an associated display zone of a display screen of the client device, the method comprising: determining, for each video stream extracted from the multimedia stream, the associated display zone, allocated by said browsing and display software to the software for reading multimedia streams; identifying a main display zone, among the determined display zones, as being the display zone associated with the main video stream, comprising: computing a score associated with each determined display zone; and selecting, as the main display zone, the display zone whose associated score is the highest score or the lowest score; and recording at least one digital image formed by values of pixels to be displayed in the display zone determined at a given time.
2. The method according to claim 1, wherein said determining implements an analysis of commands exchanged between the browsing and display software and the software for reading multimedia streams, based on a programming interface provided by the browsing and display software.
3. The method according to claim 2, further comprising intercepting a command to create or initialize an executable instance of the software for reading multimedia streams.
4. The method according to claim 2, further comprising determining a display mode from among a first mode and a second mode by analyzing a command to allocate a display zone sent by the browsing and display software and the software for reading multimedia streams.
5. The method according to claim 1, wherein said computing a score depends on at least one operation from among a group consisting of: calculating a surface area occupied by each display zone, calculating a number of display zones at least partially superimposed on each display zone, and calculating a ratio between at least a first and second dimension of each display zone.
6. The method according to claim 1, wherein the recorded pixel values form a digital image, the method further comprising analyzing recorded digital images to determine the digital images belonging to the main video stream and the digital images belonging to an additional video stream.
7. A device for processing multimedia streams, in particular for verifying access rights to a content of a multimedia stream, the processing device comprising: a central processing unit; and a data storage unit, the processing device comprising or being connected to a display screen, the multimedia stream being provided by a server device connected to a communication network, and downloadable by the processing device via data formatted in a markup language, while using software for browsing and displaying data formatted in a markup language, cooperating with software for reading multimedia streams, the multimedia stream comprising at least two video streams comprising a main video stream and at least one additional video stream, each video stream having an associated display zone of said display screen; and a module operable for: determining, for each video stream extracted from the multimedia stream, the associated display zone, allocated by the browsing and display software to the software for reading multimedia streams; identifying a main display zone, among the determined display zones, as being the display zone associated with the main video stream, comprising: computing a score associated with each determined display zone; and selecting, as the main display zone, the display zone whose associated score is the highest score or the lowest score; and recording at least one digital image formed by values of pixels to be displayed in the display zone determined at a given time.
8. A non-transitory computer readable medium storing instructions, which, when executed by a processor of an electronic device, cause the processor to implement a method for processing a multimedia stream according to claim 1.
9. The method according to claim 1, wherein said recording is performed at a predetermined temporal frequency, making it possible to record a plurality of digital images of the video stream.
10. The method according to claim 4, wherein, when the display mode is a first display mode, the method further comprises intercepting a command to update a display zone, and wherein said recording is performed after said interception.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Other features and advantages of the invention will emerge from the description thereof provided below, by way of non-limiting illustration, with reference to the appended figures, in which:
(2)
(3)
(4)
(5)
(6)
DETAILED DESCRIPTION OF EMBODIMENTS
(7)
(8) In this system 1, a server device 2 is schematically shown connected to a communication network 4, and also comprising a storage system 6, which can be distributed. The storage system 6 comprises multimedia content 8a, 8b, etc., which in this example is previously recorded.
(9) For example, in the context of a “video on demand” (VOD) server, the multimedia content is video content comprising images and sound, such as movies, documentaries or television series, encoded in an appropriate encoding format.
(10) Alternatively, the server 2 receives multimedia content in the form of multimedia streams from a content broadcaster 10, for example via a communication network other than the communication network 4, such as a satellite communication network. These received multimedia streams correspond, for example, to a real-time broadcast of an artistic or sporting event, typically a show or sports match.
(11) The system 1 also comprises client devices 16, 18, 20, these client devices also being connected to the communication network 4.
(12) The client device 20 is outlined in
(13) In one embodiment, the client device 20 is a programmable device, for example a computer, comprising a communication unit 22 with the communication network 4, able to send and receive data using an appropriate communication protocol, for example the IP protocol (Internet Protocol).
(14) The client device 20 also includes a central processing unit 24, including one or several processors, able to execute computer program instructions when the device 20 is powered on. The device 20 also includes an information storage unit 26, for example registers, able to store data and executable code instructions, in particular the code instructions of programs that carry out the method according to the invention. The various functional blocks of the device 20 described above are connected via a communication bus 28.
(15) The programmable device 20 comprises or is connected to a display screen 30. Optionally, the programmable device 20 comprises an interface 31 for interacting with the user, for example keyboard, mouse or any other pointing means. In one embodiment, the display screen 30 is of the touch-sensitive type and also forms an interaction interface 31 with a user.
(16) The central processing unit 24 includes software modules, in particular a software module 32, which implements a data communication protocol, for example the HyperText Transfer Protocol (HTTP), to obtain data and/or instructions provided by servers 2 implementing the same communication protocol, and provides a rendering on the screen 30 of viewable data extracted from the obtained data and/or instructions.
(17) The viewable data comprise text, still images, videos.
(18) The software module 32 is a network browsing and display module, commonly referred to as a web browser, or simply browser, the main function of which is to view information available on the World Wide Web.
(19) The browser uses a web address or URL (Uniform Resource Locator) indicating the location of a page, commonly called a webpage, which is a set of resources containing data and/or instructions on a server implementing the HTTP protocol, and downloads the targeted page.
(20) Such a webpage comprises data formatted in a markup language, for example HTML (Hypertext Markup Language), this language providing the text to be displayed as well as the general structure of the formatting: titles and paragraphs, lists, tables. The formatting can be refined by using cascading style sheets (CSS): margins, alignments, spacings, colors, borders, etc.
(21) Traditionally, a browser is able to communicate with one or several software programs for reading multimedia streams (software module 34). Such software is also called “player”. More generally, a browser is able to execute compatible extension software to provide additional functionalities.
(22) The communication between the browser and any extension software is done by application programming interfaces (API), defining communication functions between the browser and any extension software, making it possible to perform functionalities, send information and send parameter values.
(23) For example, the Mozilla Firefox® browser uses an interface called NPAPI (Netscape Plugin Application Programming Interface).
(24) In general, all browsers have an associated programming interface intended to allow outside software to interface with the browser and provide additional functionalities. In one alternative embodiment, the software for reading multimedia streams is integrated into the browser 32, for example in the case of HTML5 browsers. In this case, internal APIs are used, the operation being similar to that described above.
(25) Furthermore, a software module 36 is added, comprising code instructions to carry out a method for processing multimedia streams according to the invention, embodiments of which will be described in detail below.
(26) The browser 32 is able to display content intended to be viewed on a display screen 30.
(27) To perform such a display, a browser 32 uses either an internal graphic composition engine 38 or an external graphic composition engine (not shown), which is a composition engine of the operating system implemented in the programmable device 20.
(28) Each content is associated with an element to be displayed (still image or video stream), with which the graphic composition engine associates a reserved display zone on the display screen 30.
(29) Each element is then displayed in the form of a set of pixels of the reserved display zone, each pixel being a display unit on a screen and having an associated value encoded over several bits. The values of the pixels to be displayed at a given time are digital image data forming an image to be displayed.
(30) When the element to be displayed is a video stream, the values of the pixels of the display zone are refreshed at a temporal frequency determined by the encoding format of the video stream.
(31) A display zone has a planar geometric shape, for example rectangular, and is characterized by parameters characterizing the planar geometric shape, its position on the display screen and its depth Z, making it possible to define a plane with an associated depth and to define a hierarchy of depth between the display zones.
(32) For example, a rectangular display zone is defined by a position parameter, length L, width I along the axes X and Y of an associated spatial coordinate system and depth Z parameters. A spatial coordinate system (X, Y, Z) is illustrated in
(33) Thus, in the example of
(34) A second zone 42, containing the element to be displayed that one wishes to view, which contains the main video stream, is displayed with an associated depth Z_1 > Z_0.
(35) However, several additional display zones 44, 46, 48, with smaller sizes, are displayed above, for example comprising contents of the type: advertising video stream, logos or informational text.
(36) In the illustrated example, the display zones 44, 46, 48 have associated depths Z_2, Z_3, Z_4, which may be equal, but all greater than Z_1.
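By way of illustration only, the parameters characterizing rectangular display zones and their depth hierarchy can be sketched as follows. The `DisplayZone` type, the top-left-corner convention and all pixel values are assumptions introduced for this sketch; only the reference numerals and the ordering of the depths come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayZone:
    ref: int     # reference numeral of the zone in the description
    x: int       # position of the top-left corner along X (assumed convention)
    y: int       # position of the top-left corner along Y (assumed convention)
    length: int  # extent along X, in pixels
    width: int   # extent along Y, in pixels
    z: int       # depth; a greater value is displayed above a smaller one


# Hypothetical layout: main zone 42 at depth 1, overlays 44, 46, 48 above it.
zones = [
    DisplayZone(42, 100, 80, 640, 360, z=1),
    DisplayZone(44, 120, 100, 200, 100, z=2),
    DisplayZone(46, 600, 90, 100, 50, z=2),
    DisplayZone(48, 300, 380, 180, 80, z=3),
]

# Depth hierarchy, from the topmost zone down to the deepest one.
top_to_bottom = sorted(zones, key=lambda zone: zone.z, reverse=True)
print([zone.ref for zone in top_to_bottom])
```

In this hypothetical layout, zone 42 comes last in the hierarchy because every additional zone has a greater depth.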
(37) Traditionally, a user wishing to view the main video stream displayed in the display zone 42 acts on interaction zones 44a, 46a, 48a associated with the display zones to close the display zones or windows 44, 46, 48.
(38) Alternatively, a display duration is associated with each display zone 44, 46, 48 in the definition file of the page to be displayed.
(39) The method for processing multimedia streams according to the invention aims to identify the main display zone corresponding to the main video stream to be displayed, and to successively record images of the main video stream from values of the display pixels in the identified display zone.
(40) Thus, advantageously, the method according to the invention does not require knowledge of the operating mode of the software for reading video streams or the encoding format of the received video streams.
(41)
(42) The method comprises a first step 60 for obtaining data formatted according to a markup language, for example in the form of a webpage downloaded from a server, including display instructions executable by the browser to display one or several multimedia streams, in particular containing text, images and one or several video streams to be displayed. This step for obtaining data is carried out by the browser 32.
(43) Step 60 is followed by a step 62 for triggering the execution by the browser 32 of the webpage obtained for performing the display.
(44) This step 62 is followed by a step 64 for determining a display zone allocated for displaying a main video stream contained in the downloaded multimedia stream(s), implemented by the software module 36, and using the analysis of the commands exchanged between the browser and the software for reading multimedia streams.
(45) It will be noted that the case of a single software program for reading multimedia streams is considered here, but it is understood that the operation is similar if several different software programs for reading multimedia streams are implemented.
(46) Step 64 comprises a sub-step 66 for intercepting, via the software module 36, a command to create or initialize an executable instance of software for reading multimedia streams, based on the application programming interface (API) provided by the browser.
(47) For example, when the API is NPAPI, an NP_Initialize() command is executed to initialize an executable instance of the software for reading multimedia streams as an extension of the browser. Next, instances of this software for reading multimedia streams are initialized via NPP_New() commands.
(48) More generally, the functions bearing the “NPP_” prefix indicate a command sent from the browser to the software for reading multimedia streams, and more generally to add-on software, while functions bearing the “NPN_” prefix indicate a command sent by add-on software to the browser.
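The prefix convention described above can be illustrated by a minimal sketch that classifies an intercepted command by its direction. The function name `command_direction` and the returned labels are assumptions made for this illustration and are not part of NPAPI.

```python
def command_direction(name: str) -> str:
    """Classify an NPAPI function name by its prefix (illustrative labels)."""
    if name.startswith("NPP_"):
        return "browser->plugin"   # command sent by the browser to add-on software
    if name.startswith("NPN_"):
        return "plugin->browser"   # command sent by add-on software to the browser
    return "other"                 # e.g. NP_Initialize, which carries neither prefix


for name in ("NPP_New", "NPP_SetWindow", "NPN_InvalidateRect", "NP_Initialize"):
    print(name, command_direction(name))
```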
(49) After a command relative to the creation or initialization of an executable instance of the software for reading video streams is intercepted, a sub-step 68 is carried out for determining parameters identifying each display zone allocated to display a video stream, that is, each zone in which the software for reading multimedia streams will provide video content in the form of pixel values to be displayed.
(50) Various embodiments of the determination 68 of the parameters for identifying each allocated display zone are considered, as explained in detail below, depending on the display mode used.
(51) If applicable, if several display zones are identified to display various video streams, like in the example illustrated in
(52) Step 64 for determining a display zone allocated for displaying the main video stream is followed by a step 72 for recording values of the pixels displayed in the main display zone, associated with the main video stream.
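As a simplified illustration of the recording step, the pixel values of a rectangular display zone can be copied out of a full-screen pixel array. The `framebuffer` representation (a list of rows of pixel values) and the function name `record_zone` are assumptions made for this sketch; an actual implementation would read the pixels from the graphic composition engine or window system.

```python
def record_zone(framebuffer, x, y, length, width):
    """Return a copy of the pixel values inside a rectangular display zone.

    (x, y) is the assumed top-left corner; length runs along X, width along Y.
    """
    return [row[x:x + length] for row in framebuffer[y:y + width]]


# Toy 6x4 screen whose pixel value encodes its own row and column.
framebuffer = [[10 * r + c for c in range(6)] for r in range(4)]

image = record_zone(framebuffer, x=2, y=1, length=3, width=2)
print(image)
```

Each call yields one digital image; calling it repeatedly over time yields the series of images that is processed later.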
(53) According to a first embodiment, the recording is done at a recording temporal frequency that may be predetermined.
(54) Alternatively, the recording is done as a function of commands to update the main display zone, intercepted by a prior step for intercepting commands.
(55) Thus, a series of image data is recorded, and subsequently processed during a processing step 74. The processing for example consists of analyzing the recorded image data to extract marking information therefrom, using indelible and imperceptible marking or watermarking methods, this marking information making it possible to obtain information relative to the right to access the contents of the corresponding video stream.
(56) Alternatively, the processing 74 consists of encoding using a selected video encoding format.
(57) According to one alternative, the processing step 74 also analyzes the recorded image data in order to eliminate a portion thereof, so as to keep only the image data belonging to the main video stream and not the image data belonging, for example, to transitional screens or advertising videos embedded in the received multimedia stream comprising the main video stream.
(58) Methods known by those skilled in the art can be used to that end.
(59) For example, successive images are compared and still images are detected and eliminated.
(60) Furthermore, black or more generally uniform images are also eliminated, these images in all likelihood corresponding to transitional screens.
(61) Lastly, a high scene change frequency indicates advertising content; successive images with a high scene change rate are therefore eliminated.
(62) Conversely, a low detected change frequency is considered characteristic of a sporting event or broadcast channel; the images are therefore kept.
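A minimal sketch of the first two elimination criteria (still images and uniform images) is given below. It assumes that each recorded image is a list of rows of pixel values and that a still image is one strictly identical to the image recorded just before it; the function names are hypothetical, and the scene-change criterion is not covered by this sketch.

```python
def is_uniform(frame):
    """True if every pixel of the frame has the same value (e.g. a black screen)."""
    first = frame[0][0]
    return all(pixel == first for row in frame for pixel in row)


def keep_frames(frames):
    """Drop uniform frames and frames identical to the previously recorded one."""
    kept = []
    previous = None
    for frame in frames:
        still = previous is not None and frame == previous
        if not still and not is_uniform(frame):
            kept.append(frame)
        previous = frame
    return kept


black = [[0, 0], [0, 0]]
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# The black frame and the repeated copy of `a` are eliminated.
print(keep_frames([black, a, a, b]))
```

In practice, a tolerance on the pixel differences would probably be preferable to strict equality, since decoded video rarely yields bit-identical frames.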
(63) The implementation of step 68 for determining identification parameters of each display zone allocated for displaying a video stream depends on the implemented display mode.
(64) When the NPAPI API is used, two display modes are distinguished: a first display mode, called “windowless”, in which the software for reading multimedia streams displays the decoded video stream directly via the browser, and a second display mode, called “windowed”, in which the browser allocates a display window to the software for reading multimedia streams for the display.
(65) The “NPP_SetWindow()” function is used to allocate a display zone, but the parameters used depend on the display mode.
(66)
(67) During a first step 80, it is determined whether the display mode used is the first mode (“windowless” mode) or the second mode (“windowed” mode), by detecting the presence of a predetermined parameter in the HTML page.
(68) If the display mode is the first display mode, a step 82 for intercepting the “NPP_SetWindow()” command is carried out, this command using, as parameter, an object defining the display zone, called “drawable”.
(69) “NPN_InvalidateRect()” or “NPN_InvalidateRegion()” commands, sent by the software for reading multimedia streams to the browser, are intercepted in step 84, these commands indicating the need to update the “drawable” display zone.
(70) Next, during step 86, an “NPP_HandleEvent()” command is intercepted, this command comprising, in a parameter, an identification reference of the updated “drawable” display zone.
(71) Step 86 is followed by a step 88 for obtaining the identification reference of the display zone.
(72) Advantageously, when this first display mode is implemented, the interception of the “NPP_HandleEvent()” command makes it possible at the same time to determine the update of the display zone.
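The sequence of interceptions in the first display mode can be sketched as a small state machine operating on a log of intercepted commands. The log format (pairs of a command name and a drawable reference) and the function name are assumptions made for this illustration; they do not reproduce the actual NPAPI signatures, and the interception itself would be implemented at the level of the programming interface.

```python
def windowless_capture_points(commands):
    """Return the log indices at which the display zone should be recorded.

    A recording is triggered by an NPP_HandleEvent on the drawable set by
    NPP_SetWindow, provided an invalidation was intercepted beforehand.
    """
    drawable = None
    dirty = False
    points = []
    for index, (name, arg) in enumerate(commands):
        if name == "NPP_SetWindow":
            drawable = arg                      # step 82: drawable allocated
        elif name in ("NPN_InvalidateRect", "NPN_InvalidateRegion"):
            dirty = True                        # step 84: update requested
        elif name == "NPP_HandleEvent" and dirty and arg == drawable:
            points.append(index)                # steps 86-88: zone updated
            dirty = False
    return points


# Hypothetical interception log.
log = [
    ("NPP_SetWindow", "drawable-1"),
    ("NPP_HandleEvent", "drawable-1"),   # no pending invalidation: ignored
    ("NPN_InvalidateRect", None),
    ("NPP_HandleEvent", "drawable-1"),   # update: record here
    ("NPN_InvalidateRegion", None),
    ("NPP_HandleEvent", "drawable-1"),   # update: record here
]
print(windowless_capture_points(log))
```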
(73) If the display mode is the second display mode, step 90 intercepts the “NPP_SetWindow()” command and is followed by step 92 for obtaining a window identifier value of the X Window system allocated to the display. This identifier value is sent to the software for reading video streams, which can then draw into the identified window directly without informing the browser.
(74) Nevertheless, the obtained window identifier value makes it possible to determine the allocated display zone.
(75) It then suffices to record the values of the pixels displayed in this identified display zone, with a predetermined time frequency.
(76) Alternatively, the window system is modified to intercept update notices, making it possible, as for the first display mode, to determine the update of the display zone and to record values of the displayed pixels only following an update of the display.
(77) As explained above, in most cases, several display zones are determined, in which case a step for determining the main display zone corresponding to the main video stream is carried out.
(78)
(79) For this determination, at least one simple heuristic is carried out, making it possible to calculate a final score associated with each determined display zone, and to select, as main display zone, the zone obtaining the best final score.
(80) In the embodiment of
(81) Next, several heuristics are carried out making it possible to associate, with each of the N display zones, a score according to each of these heuristics.
(82) During a step 102, the surface area of each of the zones S_1 to S_N is determined, and decreasing scores are associated as a function of the occupied surface area, the maximum surface area zone having the best score according to this surface area heuristic.
(83) Returning to the example of
(84) The zone 42 has the largest surface area, followed by zones 44, 48 and 46.
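The surface area heuristic can be sketched as a simple ranking. The zone dimensions below are hypothetical values chosen so that the resulting order matches the one stated above; the function name is likewise an assumption.

```python
# Hypothetical (length, width) in pixels for each display zone.
zones = {42: (640, 360), 44: (200, 100), 46: (100, 50), 48: (180, 80)}


def area_ranking(zones):
    """Return zone references ordered from largest to smallest surface area."""
    return sorted(zones, key=lambda ref: zones[ref][0] * zones[ref][1], reverse=True)


print(area_ranking(zones))
```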
(85) During a step 104, the length/width ratio of the rectangle corresponding to each display zone is determined and compared to a predetermined value, for example 4/3 or 16/9, which are the ratios most used for the display of main streams.
(86) The display zones are then ranked as a function of the distance between the obtained ratio and the predetermined value. The zone with the ratio closest to the predetermined value receives the best score according to this format heuristic.
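A possible sketch of this format heuristic follows. Taking the distance to the nearest of the two example ratios is an assumption of this sketch (the description compares to a single predetermined value), and the zone dimensions are hypothetical.

```python
# Hypothetical (length, width) in pixels for each display zone.
zones = {42: (640, 360), 44: (200, 100), 46: (100, 50), 48: (180, 80)}


def format_distance(length, width, targets=(4 / 3, 16 / 9)):
    """Distance between the zone's length/width ratio and the nearest target ratio."""
    ratio = length / width
    return min(abs(ratio - target) for target in targets)


# Zones ordered from the closest to the farthest from a usual video format.
ranking = sorted(zones, key=lambda ref: format_distance(*zones[ref]))
print(ranking[0])
```

With these values, zone 42 has exactly a 16/9 ratio and therefore receives the best score under this heuristic.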
(87) During a step 106, one determines, based on the depth value Z_i associated with each display zone S_i, the number of zones superimposed on each of the display zones. By hypothesis, the display zones corresponding to advertising content are positioned above the main display zone.
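The superposition heuristic can be sketched as counting, for each zone, the zones of greater depth that at least partially overlap it. The `DisplayZone` type, the coordinates and the rule that the zone with the most zones above it receives the best score are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayZone:
    ref: int
    x: int       # top-left corner along X (assumed convention)
    y: int       # top-left corner along Y (assumed convention)
    length: int  # extent along X
    width: int   # extent along Y
    z: int       # depth; greater means displayed above


def overlaps(a, b):
    """True if the two rectangles share at least part of their surface."""
    return (a.x < b.x + b.length and b.x < a.x + a.length
            and a.y < b.y + b.width and b.y < a.y + a.width)


def superimposed_count(zone, zones):
    """Number of zones displayed above `zone` that at least partially cover it."""
    return sum(1 for other in zones if other.z > zone.z and overlaps(zone, other))


# Hypothetical layout: overlays 44, 46, 48 lie above the main zone 42.
zones = [
    DisplayZone(42, 100, 80, 640, 360, z=1),
    DisplayZone(44, 120, 100, 200, 100, z=2),
    DisplayZone(46, 600, 90, 100, 50, z=2),
    DisplayZone(48, 300, 380, 180, 80, z=3),
]
counts = {zone.ref: superimposed_count(zone, zones) for zone in zones}
print(counts)
```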
(88) In the case of
(89) During a step 108, one determines the centering relative to the display screen of each of the zones, for example by the distance between the center of each display zone and the center of the total display surface area of the screen. For this centering heuristic as well, the display zone 42 obtains the best score, since it is centered.
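The centering heuristic can be sketched as a Euclidean distance between centers. The screen size, the zone coordinates and the function name are hypothetical values introduced for this illustration.

```python
import math

SCREEN = (840, 520)  # hypothetical total display surface, in pixels

# Hypothetical (x, y, length, width) for each display zone, top-left origin.
zones = {
    42: (100, 80, 640, 360),
    44: (120, 100, 200, 100),
    46: (600, 90, 100, 50),
    48: (300, 380, 180, 80),
}


def center_distance(zone, screen=SCREEN):
    """Distance between the center of a zone and the center of the screen."""
    x, y, length, width = zone
    zone_cx, zone_cy = x + length / 2, y + width / 2
    return math.hypot(zone_cx - screen[0] / 2, zone_cy - screen[1] / 2)


# Zones ordered from the most centered to the least centered.
ranking = sorted(zones, key=lambda ref: center_distance(zones[ref]))
print(ranking)
```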
(90) Lastly, the scores for each display zone relative to the various heuristics are combined into one final score, and during step 110, the display zone obtaining the best final score is selected as the main display zone corresponding to the display of the main video stream.
(91) In one embodiment, the combination consists of calculating a final score as a weighted sum of the scores obtained for each implemented heuristic, and the maximum final score, or minimum final score, is selected as the best.
(92) Alternatively, the selected display zone is the zone having obtained the highest number of best scores according to the various heuristics used. In other words, the final score of a zone is the number of best scores obtained by this zone across the heuristics used, and the maximum final score is selected as being the best.
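A minimal sketch of such a weighted combination is given below, under the convention that a higher score is better. The per-heuristic scores, the weights and the names used are hypothetical; the description does not specify particular values.

```python
def final_scores(scores_by_heuristic, weights):
    """Weighted sum, per zone, of the scores obtained under each heuristic."""
    refs = next(iter(scores_by_heuristic.values())).keys()
    return {
        ref: sum(weights[name] * scores[ref]
                 for name, scores in scores_by_heuristic.items())
        for ref in refs
    }


# Hypothetical rank-based scores (3 = best) for zones 42, 44, 46 and 48.
scores_by_heuristic = {
    "area":      {42: 3, 44: 2, 46: 0, 48: 1},
    "format":    {42: 3, 44: 1, 46: 1, 48: 0},
    "overlay":   {42: 3, 44: 0, 46: 0, 48: 0},
    "centering": {42: 3, 44: 1, 46: 0, 48: 2},
}
weights = {"area": 2.0, "format": 1.0, "overlay": 1.0, "centering": 1.0}

totals = final_scores(scores_by_heuristic, weights)
main_zone = max(totals, key=totals.get)  # maximum final score taken as best
print(totals, main_zone)
```

If the scores were distances (lower is better), the minimum final score would be selected instead.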
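This alternative can be sketched as a count of first places. The scores below are hypothetical (higher is better), with one heuristic deliberately won by a zone other than zone 42 to show that the count still designates zone 42.

```python
from collections import Counter


def best_score_counts(scores_by_heuristic):
    """Count, for each zone, the heuristics under which it has the best score."""
    counts = Counter()
    for scores in scores_by_heuristic.values():
        top = max(scores.values())
        for ref, score in scores.items():
            if score == top:
                counts[ref] += 1
    return counts


# Hypothetical scores for zones 42, 44, 46 and 48.
scores_by_heuristic = {
    "area":      {42: 3, 44: 2, 46: 0, 48: 1},
    "format":    {42: 2, 44: 1, 46: 1, 48: 3},
    "overlay":   {42: 3, 44: 0, 46: 0, 48: 0},
    "centering": {42: 3, 44: 1, 46: 0, 48: 2},
}

counts = best_score_counts(scores_by_heuristic)
main_zone = counts.most_common(1)[0][0]
print(dict(counts), main_zone)
```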
(93) According to one embodiment, only one or only a portion of the heuristics described above is implemented to determine the main display zone.
(94) Alternatively, other methods for determining a main display zone are implemented, combining the heuristics described above and content analysis methods making it possible for example to detect advertising videos and consequently to eliminate the determined display zone associated with such a video.
(95) For example, it is possible to select the two or three display zones having obtained the best final scores and next to analyze the content thereof over a certain time period going from a single image sample to several tens of images to discriminate the best zone.
(96) According to one sub-optimal embodiment, the values of the pixels displayed in all of the determined display zones are first recorded, then an analysis of the contents is done to determine the main video stream. This embodiment is sub-optimal because it requires more memory and computing resources to determine and extract a main video stream from the downloaded multimedia stream(s).
(97) The proposed content analysis techniques can include the detection of a black screen or of an invariant element in the image, such as a logo identifying a TV channel, or a measurement of the frequency of shot changes; shot changes that are close together in time characterize advertising content.