Video providing system and program
20220295135 · 2022-09-15
Inventors
CPC classification
H04N21/8541
ELECTRICITY
H04N21/4126
ELECTRICITY
H04N21/2668
ELECTRICITY
International classification
H04N21/475
ELECTRICITY
Abstract
An object of the present invention is to provide a video providing system and a program that allow a viewer to actively edit digital content. A video providing system 100 that provides video content to a viewer includes: a component 104, 105, 106 for receiving a guidance trigger prompting the viewer to participate in the video content at a device of the viewer; a component 103, 1602, 1603 for accepting an option content command or option content that is different from mainstream content corresponding to a viewer trigger sent from the device as a response to the guidance trigger through a network; and a component 101, 102, 1605 for reproducing or displaying the option content on a display device specified by the command.
Claims
1. A video providing system for providing digital content to a viewer, the system comprising: a component for receiving a guidance trigger prompting the viewer to participate in video content at a device of the viewer; a component for accepting an option content command or option content that is different from mainstream content corresponding to a viewer trigger sent from the device as a response to the guidance trigger through a network; and a component for reproducing or displaying the option content on a display device.
2. The video providing system as claimed in claim 1, wherein the guidance trigger is provided to the device through a function of the device including voice, vibration, email and SNS.
3. The video providing system as claimed in claim 1, wherein reproduction of the option content is performed through a media medium and display of the option content displays the option content acquired through the network on the display device, a video screen, or an object.
4. The video providing system as claimed in claim 1, wherein reproduction of the option content is performed by video streaming.
5. The video providing system as claimed in claim 1, wherein display of the option content is performed by the video screen or projection mapping.
6. The video providing system as claimed in claim 1, comprising a collaboration server for communicating between a plurality of viewers.
7. The video providing system as claimed in claim 6, wherein the collaboration server performs collaboration by voice communication using SNS.
8. The video providing system as claimed in claim 1, wherein the option content is determined by voting of a large number of the viewers.
9. An executable program for making an information processing device function as a video providing system that provides digital content to a viewer, the information processing device being made to function as: a component for receiving a guidance trigger prompting the viewer to participate in video content at a device of the viewer; a component for accepting an option content command or option content that is different from mainstream content corresponding to a viewer trigger sent from the device as a response to the guidance trigger through a network; and a component for reproducing or displaying the option content on a display device.
10. The program as claimed in claim 9, wherein the guidance trigger is provided to the device through a function of the device including voice, vibration, email and SNS.
11. The program as claimed in claim 9, wherein reproduction of the option content is performed through a media medium and display of the option content displays the option content acquired through the network on the display device, a video screen, or an object.
12. The program as claimed in claim 9, wherein reproduction of the option content is performed by video streaming.
13. The program as claimed in claim 9, wherein display of the option content is performed by the video screen or projection mapping.
14. The program as claimed in claim 9, comprising a collaboration server for communicating between a plurality of viewers.
15. The program as claimed in claim 14, wherein the collaboration server performs collaboration by voice communication using SNS.
16. The program as claimed in claim 9, wherein the option content is determined by voting of a large number of the viewers.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
EXPLANATION OF REFERENCE NUMERAL
[0044] 100: video providing system
[0045] 101: display device
[0046] 102: speaker
[0047] 103: content reproducing device
[0048] 104: smart speaker
[0049] 105: tablet terminal
[0050] 106: smartphone
[0051] 110: network
[0052] 120: collaboration server
[0053] 130: streaming server
MODE FOR CARRYING OUT THE INVENTION
First Embodiment
[0054] The present invention will be described below with reference to embodiments, but the present invention is not limited to the embodiments described below.
[0055] The content reproducing device 103 is not particularly limited as long as it can be connected to a network 110 and can send video signals and audio signals to the display device 101 and the speakers 102. For example, an information processing device such as a DVD player, a Blu-ray (registered trademark) player, an XBOX (registered trademark), a PlayStation (registered trademark), or the like, or a personal computer can be used as the content reproducing device 103. Note that the content reproducing device 103 preferably implements a program (which may be an application or firmware) capable of interpreting information sent from the network 110 and editing the reproduction sequence of the content.
[0056] Further, a streaming server 130 and a collaboration server 120 are connected to the video providing system 100 through the network 110. The streaming server 130 streams digital content and sends it to the content reproducing device 103 and provides the video via the display device 101 and the speaker 102. Further, the collaboration server 120 provides a function of receiving a user action sent from a smart speaker 104, a tablet terminal 105, or a smartphone 106 accessible by the user, determining the content of the user action, and enabling editing of content to be reproduced later. Note that the device used by the viewer also includes a controller such as Amazon Fire Stick (registered trademark), for example. The display device 101 also includes a projector.
[0057] Note that the network 110 is described below as including one or both of voice calls and data communication using a public telephone network, in addition to communication using a wired or wireless TCP/IP protocol including Gigabit Ethernet, 4G, and 5G. Further, the smart speaker 104, the tablet terminal 105, and the smartphone 106 may be capable of making a voice call using a public telephone line, in addition to a voice call via the Internet using a so-called SNS such as Facetime (registered trademark), LINE (registered trademark), Facebook (registered trademark), or Twitter (registered trademark).
[0059] Note that a smartphone or a dedicated portable control device can be used as the device in the present embodiment. When the device is a smartphone, a smartphone application makes it function as: a component for receiving a guidance trigger prompting the viewer to participate in video content at the viewer's device; a component for accepting an option content command or option content that is different from mainstream content corresponding to a viewer trigger sent from the device as a response to the guidance trigger through a network; and a component for reproducing or displaying the option content on a display device. It is also conceivable that a dedicated control device such as a PSP downloads or installs a program for the dedicated control device to provide the same function.
[0060] The viewer participation information includes the viewer trigger that the viewer can configure as keywords, commands, or the like for modifying the content, and in the case of a voice call, the viewer participation information includes voice call information of the viewer. In addition, operations such as tapping and shaking on a touch screen and the like may also be used as the viewer trigger.
[0061] The action processing server unit 124 includes a so-called IVR function, a voice analysis function, and an action analysis function. When the viewer participation information is voice information, the action processing server unit 124 sends the voice information to a participation information analysis unit 125 as the participation information. Further, in a specific embodiment, the voice information of a received voice call may be sent to the collaboration server 120 as it is, output from the speaker 104, and superimposed on the audio of the decoded digital content so that, to the viewers present in the space, the voice call sounds as if the participant who sent it had been part of the content from the beginning. Furthermore, the action processing server unit 124 detects position information, acceleration information, tapping, swiping, and the like transmitted from the tablet terminal 105 and the smartphone 106, and enables editing of the content based on the detected viewer trigger. Note that the voice processing function can be configured as a cloud server; a service including AI, such as Google Assistant (trademark) or IBM Speech to Text (registered trademark), can be used as the cloud service for performing such voice processing, but the system is not limited to a specific cloud service.
[0062] A viewer management unit 123 has a function to collect information such as a user ID of the viewer and, as necessary, a password, a terminal form, and a participation mode sent through the network 110 in advance and register the information in a user database (not illustrated). In addition, the web server unit 122 and the action processing server unit 124 each have a function of causing the participation information analysis unit 125 to perform processing corresponding to the participation mode of the viewer when receiving the participation information.
[0063] Further, the collaboration server 120 includes the participation information analysis unit 125, which analyzes the viewer participation information sent from the viewer, and a trigger extraction unit 126. The participation information analysis unit 125 determines whether the participation information transmitted from the viewer is voice information or a command from the application or the like, and decodes the participation information in accordance with the participation form of the viewer; the trigger extraction unit 126 then determines whether or not the viewer participation information includes a preset viewer trigger.
[0064] When the viewer trigger is not included, the collaboration server 120 issues no particular command to modify the content. When the collaboration server 120 determines that the viewer participation information includes the preset viewer trigger, it sends a content command including the viewer trigger to the content reproducing device 103 or the streaming server 130 through the network 110. On receiving the command, the content reproducing device 103 or the streaming server 130 switches the decoding order or the streaming order of the digital content, enabling the viewer to participate in the video and audio.
[0065] Further, the collaboration server 120 manages a response log database 128. The response log database 128 registers the history of viewer participation not only for the current screening but also for the same video or event performed in the past, associating it with the user information, user attributes, viewing time, viewing area, and the like. Examples of the state of viewer participation include scene selection, action information type, command type from the smartphone application, and the like, which the collaboration server 120 accumulates as a response log.
[0066] In aspects of the present invention, the collaboration server 120 may analyze the response log and learn the content such as scenes and videos that many participants sympathize with in the digital content to provide effective content creation. Further, response information accumulated in the response log database 128 may be used as big data for subsequent content creation.
[0068] Hereinafter, the function will be described from the processing unit on the upstream side that has received the participation information from the viewer. The interface unit 103a receives the content command sent from the collaboration server 120 corresponding to an action added to the scene. The interface unit 103a sends the received content command to the content sequencer 103c. The content sequencer 103c analyzes the content command to select a scene ID associated with the viewer trigger included in the content command and causes the scene ID designated for the content reproducing device 103 to be loaded from the media medium 103e into the buffer 103d. Note that the scene means a time-series video provided with a certain meaning or attribute in the mainstream content, which is composed of a plurality of scenes, a plurality of GOPs (Group of Picture), and the like.
[0069] The content reproducing device 103 sends the buffered data of the scene ID to the decoder 103b, outputs video information of the scene ID associated with the viewer trigger to an output buffer 103f as the decoding is completed, and sequentially enables reproducing of the selected scene on the display device 101 and the speaker 102. By using the processes described above, the scene of the mainstream content can be reproduced in real time without interruption.
[0070] Note that the association between the viewer trigger and the scene ID can be performed by assigning the scene ID to the content corresponding to the viewer trigger, for example, in response to specific keywords such as “go up”, “go down”, “go right”, “go left”, “go forward”, “return”, and the like. In addition, the scene ID to be selected in accordance with an operation such as the position information, the acceleration information, tapping, swiping, or the like of the tablet terminal 105 and the smartphone 106 can be associated, and the viewer trigger and the content associated with the viewer trigger are not particularly limited as long as the realistic sensation of participation in the digital content can be improved.
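As a minimal sketch of the association described in paragraph [0070], a lookup table can map both spoken keywords and device gestures to scene IDs. The concrete IDs, the gesture strings, and the fallback value are invented for illustration:

```python
# Hypothetical viewer-trigger-to-scene-ID association table.
# Keywords and gestures may map to the same option scene.
TRIGGER_TO_SCENE_ID = {
    "go up": "scene-101",
    "go down": "scene-102",
    "go right": "scene-103",
    "go left": "scene-104",
    "go forward": "scene-105",
    "return": "scene-106",
    "swipe_left": "scene-104",  # gesture sharing a scene with a keyword
    "tap": "scene-107",
}

def scene_id_for(viewer_trigger: str, default: str = "mainstream-next") -> str:
    """Resolve a viewer trigger to the scene ID to load next; fall back
    to the mainstream sequence when the trigger is not registered."""
    return TRIGGER_TO_SCENE_ID.get(viewer_trigger, default)
```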
[0072] Here, it is assumed that the streaming server 130 is already streaming a specific digital content in response to a request from the viewer. During streaming of the digital content, the streaming server 130 receives a content designation from the collaboration server 120 along with the viewer trigger. When the interface unit 131 determines that the received information includes a content command, it sends the content command to the stream sequencer 133. The stream sequencer 133 analyzes the viewer trigger included in the content command, selects the scene ID associated with the viewer trigger, and buffers the digital content specified by the scene ID in a buffer 134.
[0073] The streaming server 130 sends the buffered digital content to a transmitter 132 as the video stream to be delivered next and sends it to the content reproducing device 103 through the network 110. Note that, when streaming from the streaming server 130 is performed, the content reproducing device 103 provides the stream directly from the interface unit 103a to the decoder 103b for decoding, and then displays the video image on the display device 101 through the output buffer 103f. Note that the content reproducing device 103 can include a plurality of decoders according to the types and attributes of the content to be reproduced. Examples of preferable encoding methods for streaming purposes include MP2, MP3, MP4, H264, and MOV, but the system is not limited to a specific format.
[0075] Further, video content 502 is an embodiment in which the sequence of the mainstream content 500 is edited through the participation of the viewer. When the video providing system 100 receives the viewer participation information at a scene A, the collaboration server 120 analyzes the viewer trigger and selects the digital content of the scene ID associated with the viewer trigger as the option content to be reproduced as the next scene. The initially prepared mainstream content 501 is thus edited in accordance with the actions of the viewer.
[0076] Then, when another viewer trigger is received at a scene B, the next option content is selected in response to that viewer trigger to provide the video. Here, the option content means the digital content that replaces a mainstream scene in response to the viewer trigger. A scene C then receives further viewer triggers that modify the scene sequence, and a scene D likewise modifies the scene sequence in response to viewer triggers; this continues until the end of the video.
[0077] Note that the viewers who send the viewer triggers in the scenes A to D may be the same viewer or different viewers. Note that if no viewer trigger is received at all, option content providing a reaction, such as the phone going unanswered, is inserted, after which the mainstream content 501 is provided.
[0079] The collaboration server 120 analyzes the viewer triggers included in the viewer participation information using the information shown in
[0081] For example, a phone call from an actor, a question on the screen from the actor, a message transmission, an SNS transmission, a vibration, or the like can be used as the guidance trigger, and a plurality of option contents 701a associated with each of the guidance triggers are recorded in association with the scene ID. For example, when the guidance trigger sends a voice call such as "Which way do you want to go?" or "What should I do with this guy?" to the viewer's smartphone 106, the viewer responds with "I think it's better to go to the left" or "I don't want to let it go without doing anything"; the actor then moves to the left or runs away, and the story unfolds depending on the context of the storyline, such as serious, comical, or action.
[0082] Further, as shown by the hatching 704, a guidance trigger can also be arranged in the option content, making it possible to switch the video stream from one option content to another. In another embodiment, a guardian may instruct the system to select a safe scene by voice call or the like, for example when the guardian does not want a young viewer such as a child to watch a particular scene.
[0083] Further, a similar guidance trigger is added to the mainstream contents 702 and 703 that follow thereafter, and each of the option contents 702a and 703a is associated with the actions of the viewer to enable the viewer to edit the video content.
[0085] Note that the guidance trigger shown in
[0087] In a case where there is no viewer trigger (no), it is determined in a step S905 whether or not a time-out has occurred, and in a case where there is no time-out (no), the processing branches to the step S902 to check again whether there is a viewer trigger. On the other hand, when the time-out expires in the step S905 (yes), the collaboration server 120 determines that the guidance trigger was ineffective because the viewer is asleep, has stepped away, or is simply unaware of it; the processing then branches to a step S906, and the video continues to be provided in the sequence of the mainstream content until the timing of the next guidance trigger comes.
[0088] On the other hand, in a case where there is the viewer trigger in the step S902 (yes), the option content corresponding to the media attribute of the viewer trigger and the content of the viewer trigger is selected in a step S903, and the collaboration server 120 sends the content command to the content reproducing device 103 or the streaming server 130 in a step S904.
[0089] In the step S906, the content reproducing device 103 or the streaming server 130 selects the option content to be played next and starts preparation for decoding or transmission. Thereafter, in a step S907, the content reproducing device 103 reproduces the option content. The processing then returns to the step S902 and waits for a subsequent viewer trigger to be received.
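The flow of steps S902 to S907 can be sketched as a poll-with-timeout loop: wait for a viewer trigger until the time-out, then either select the corresponding option content or fall back to the mainstream sequence. The polling interface (`get_trigger`) and the timing values are assumptions, not part of the disclosure:

```python
import time

def wait_for_viewer_trigger(get_trigger, timeout_s: float, poll_s: float = 0.01):
    """Step S902/S905: return the viewer trigger if one arrives before
    the time-out, otherwise None (the time-out branch of step S905)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        trigger = get_trigger()
        if trigger is not None:
            return trigger
        time.sleep(poll_s)
    return None

def select_next_content(trigger, option_table: dict) -> str:
    """Steps S903/S906: select option content when a trigger arrived,
    or continue the mainstream sequence when the guidance trigger
    went unanswered."""
    if trigger is None:
        return "mainstream"
    return option_table.get(trigger, "mainstream")
```

In a real deployment the trigger source would be the collaboration server's participation-information pipeline rather than a polled callable; the loop structure, however, matches the branch-and-retry flow of the flowchart.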
[0090] By using the processing described above, the viewer can be guided through the progress of the video, the video can be rendered to the viewers present in the space as if the viewer were appearing in the movie alongside the actor, and that sensation can be shared by the viewers.
[0097] Further, the network device 1505 connects the content reproducing device 103, at the transport layer and physical layer levels, to a wireless network such as 4G or 5G or to a network such as the Internet, to establish a session with the user terminal.
[0098] An I/O bus bridge 1506 is further connected to the system bus 1510. A storage device 1507 such as a hard disk is connected to the downstream side of the I/O bus bridge 1506 via an I/O bus 1509 such as PCI, using IDE, ATA, ATAPI, serial ATA, SCSI, USB, or the like. Further, an input device 1508, such as a keyboard or a pointing device such as a mouse, is connected to the I/O bus 1509 via a bus such as USB and receives inputs and commands from an operator such as a system administrator.
[0099] More specifically, examples of the CPU 1501 used by the content reproducing device 103 include PENTIUM (registered trademark) to PENTIUM IV (registered trademark), PENTIUM (registered trademark) compatible CPU, CORE2DUO (registered trademark), COREi3 to i7 (registered trademark), POWER PC (registered trademark), XEON (registered trademark), and the like.
[0100] Examples of an operating system (OS) to be used include MacOS (registered trademark), Windows (registered trademark), UNIX (registered trademark), LINUX (registered trademark), CHROME (registered trademark), ANDROID (registered trademark), and other suitable OSs. Further, the content reproducing device 103 stores and executes application programs running on the OSs described above and written in programming languages such as C, C++, Visual C++, VisualBasic, Java (registered trademark), JavaScript (registered trademark), Perl, and Ruby.
[0101] Further, although functional configurations of the collaboration server 120 and the streaming server 130 used in the present embodiment differ depending on a provision function, the same hardware configuration can be adopted.
[0102] Note that the program of the present embodiment is a so-called "application" and can be executed by downloading it to the viewer device, such as the smart speaker 104, the tablet terminal 105, or the smartphone 106. Furthermore, a content viewing device 104 can also be implemented by using a program that is downloaded through the network and executed by a just-in-time compiler or the like without prior compilation or installation.
[0103] The basic elements of the device used by the viewer of the present embodiment are not significantly different from the configuration of the content reproducing device 103 shown in
[0104] In addition, examples of the OS executed by the device used by the viewer include Android (registered trademark), iOS (registered trademark), Bada (registered trademark), BlackBerry OS (registered trademark), Firefox (registered trademark), Symbian OS (registered trademark), BREW (registered trademark), WindowsMobile (registered trademark), and WindowsPhone (registered trademark), but are not limited thereto.
Second Embodiment
[0105] A second embodiment will be described below. The second embodiment is a video providing system that edits and provides content in accordance with the behavior of viewers or an audience in theaters, live performances, and the like.
[0106] Hereinafter, the present invention will be described with reference to embodiments, but the present invention is not limited to the embodiments described later.
[0107] A speaker 1604 is installed in a vicinity of the screen 1601, and sends sound synchronized with the video projected from the projectors 1602 and 1603 into the space. Note that the speakers 1604 shown in
[0108] Further, although the embodiment shown in
[0109] The video providing system 1600 is further configured to include a content server 1605 and a collaboration server 1606. The content server 1605 has a function of controlling the content to be projected by the projectors 1602 and 1603 and the decoding sequence of the content. The collaboration server 1606 also includes the functions of a web server and a voice processing (IVR) server. The collaboration server 1606 processes the user information, the viewer participation information, the voice information, and the like sent by the viewers sharing the video in the space from a mobile terminal 1608, such as a mobile phone, a smartphone, or a tablet terminal, through a network or a public telephone network 1607. Note that, in addition to communication using a TCP/IP protocol over gigabit Ethernet (registered trademark), the network 1607 is described below as including data communication using a wireless communication protocol such as 4G or 5G, and/or voice communication using the public telephone network. In addition, any application capable of so-called SNS data communication, such as Facetime (registered trademark), LINE (registered trademark), Facebook (registered trademark), or Twitter (registered trademark), can be used as the application for communication.
[0110] The collaboration server 1606 has a function of modifying a video decoding order of the content server 1605 in response to a response from the viewer, causing the speaker 1604 to generate additional audio information, and the like. Note that although
[0113] The voice processing server unit 1704 includes a so-called IVR function. When the viewer participation information is voice information, the voice processing server unit 1704 sends the voice information to a participation information analysis unit 1705 as the participation information. Also, in a specific embodiment, the voice information of the received voice call may be sent to the content server 1605 as it is, output from the speaker 1604, and superimposed on the audio of the decoded digital content so that, to the viewers present in the space, the voice call sounds as if the participant who sent it had been part of the content from the beginning.
[0114] A viewer management unit 1703 has a function to collect information such as a user ID of the viewer and, as necessary, a password, a terminal form, and a participation mode sent through the network 1607 in advance and register the information in a user database (not illustrated). In addition, the web server unit 1702 and the voice processing server unit 1704 each have a function of causing the participation information analysis unit 1705 to perform processing corresponding to the participation mode of the viewer when receiving the participation information.
[0115] Further, the collaboration server 1606 is configured to include the participation information analysis unit 1705, which analyzes the viewer participation information sent from the viewer, and a trigger extraction unit 1706. The participation information analysis unit 1705 determines whether the participation information transmitted from the viewer is voice information or a command from the application or the like, and decodes the participation information in accordance with the participation form of the viewer; the trigger extraction unit 1706 then determines, according to the mode of viewer participation, whether or not the viewer participation information includes a preset viewer trigger.
[0116] When the viewer trigger is not included, no particular command is issued to the content server 1605. When it is determined that the preset viewer trigger is included in the viewer participation information, a command is sent to the content server 1605 to enable the viewer to participate in the video and audio: by switching the decoding order of the digital content decoded by the content server 1605; by separating the video and audio parts of the digital content and decoding only the video part so that the audio can be replaced with other audio information; by performing projection mapping; or by superimposing it on other audio information. Note that the voice processing function can be configured as a cloud server; a service including AI, such as Google Assistant (trademark) or IBM Speech to Text (registered trademark), can be used as the cloud service for performing such voice processing, but the system is not limited to a specific cloud service.
[0117] Further, the collaboration server 1606 manages a response log database 1707. The response log database 1707 registers the history of viewer participation not only for the current screening but also for the same video or event performed in the past, associating it with the user information, user attributes, screening time, screening area, and the like. Examples of the state of viewer participation include scene selection, voice information type, command type from the smartphone application, and the like, which the collaboration server 1606 accumulates as a response log.
[0118] In the second embodiment, the collaboration server 1606 may analyze the response log, learn which scenes or videos many participants sympathized with during the screening or event, and cause the content server 1605 to display such content. Further, the response information accumulated in the response log database 1707 may be used as big data for subsequent content creation.
[0120] Hereinafter, the function will be described from the processing unit on the upstream side that has received the participation information from the viewer. The trigger buffer 1805 has a function of buffering the viewer trigger included in the participation information. Note that the scene means a time-series video provided with a certain meaning or attribute in the mainstream content, which is composed of a plurality of scenes, a plurality of GOPs (Group of Picture), and the like.
[0121] The content server 1605 includes a function to load in advance, from the options database 1809, the content for providing viewer-participation video, to be decoded as the next scene in response to the viewer trigger in the viewer participation information. The content server 1605 loads the mainstream content for providing the video from a content database 1808 and stores it in a content buffer 1804. The mainstream content stored in the content buffer 1804 is sent to the decoder 1802 in response to a command from the content sequencer 1803 to enable projection from the projector 1602.
[0122] Further, the content server 1605 determines the viewer trigger in the trigger buffer 1805, causes the content buffer 1804 to load the content for providing the subsequent video so as to offer an option to the viewer, and modifies, according to the viewer trigger, a reproduction order table in which the scene order for the content sequencer 1803 to load is registered. Further, as necessary, the content buffer 1804 separates the video portion and the audio portion of the scene being loaded and modifies the content of the scene so that only the video portion or only the audio portion is reproduced.
[0123] To provide this function, the content server 1605 uses a lookup table or the like to determine an identification value specifying the option content that corresponds to the content of the viewer trigger and the corresponding guidance trigger. The option content specified by the determined identification value is then loaded into the content buffer.
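One way to realize such a lookup table is a mapping from the (guidance trigger, viewer trigger) pair to an option content identifier. The table entries and function names below are hypothetical examples for illustration only.

```python
# Hypothetical lookup table: (guidance trigger id, viewer trigger content)
# -> identification value of the option content.
OPTION_LOOKUP = {
    ("g01", "cheer"):  "opt_cheer_video",
    ("g01", "vote_a"): "opt_branch_a",
    ("g02", "clap"):   "opt_applause_audio",
}

def resolve_option_id(guidance_id, viewer_trigger, default=None):
    """Determine the identification value specifying the option content."""
    return OPTION_LOOKUP.get((guidance_id, viewer_trigger), default)

def load_option(content_buffer, options_db, guidance_id, viewer_trigger):
    """Load the option content specified by the identification value into the buffer."""
    option_id = resolve_option_id(guidance_id, viewer_trigger)
    if option_id is not None:
        content_buffer.append(options_db[option_id])
    return option_id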
[0124] The content sequencer 1803 refers to the reproduction order table and supplies the scenes or the content to the decoder 1802 in ascending order of the reproduction order to start decoding. The decoder 1802 decodes the sent scene sequence using a decoding scheme appropriate for H.264, MPEG-4, and other formats such as high-definition, 4K, 8K, and 3D, and supplies a video image to the projector 1602 via an appropriate video driver such as VGA, SVGA, or XGA.
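The sequencer's role can be sketched as iterating a reproduction order table in ascending index order, with a second helper that models the table modification performed in response to a viewer trigger. Names are hypothetical; `decode` stands in for handing a scene to the decoder 1802.

```python
def decode_in_order(reproduction_order_table, decode):
    """Supply scenes to the decoder in ascending order of reproduction index.

    `reproduction_order_table` maps a reproduction index to a scene;
    `decode` is a stand-in for handing the scene to the decoder.
    """
    for index in sorted(reproduction_order_table):
        decode(reproduction_order_table[index])

def change_sequence(reproduction_order_table, index, option_scene):
    """Modify the reproduction order table in response to a viewer trigger."""
    reproduction_order_table[index] = option_scene
```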
[0125] Further, the content server 1605 responds to the viewer trigger. When the viewer trigger commands projection mapping, for example, the content server 1605 loads the content for projection mapping in synchronization with the reproduction of the corresponding scene in the reproduction order table and sends the content for projection mapping to the decoder 1806 to enable synchronized projection mapping from the projector 1603.
[0127] The collaboration server 1606 first receives a registration of the user information or the like from the viewer and registers the user information and the like in the user database shown in
[0128] The viewer is guided by the trigger information and sends the participation information to the collaboration server 1606 from a GUI element such as a button or scroll bar of the smartphone application. In another embodiment, the viewer receives a call through the IVR function of the collaboration server 1606 and replies by voice call, whereby the participation information from the viewer can be sent.
[0129] The collaboration server 1606 receives the participation information from the viewer and performs the above-described processing to enable viewer participation in the form of audio, video, both audio and video, projection mapping, and the like.
[0131] Further, a single trigger point or multiple trigger points can be configured for a scene, and the option content to be called can be changed depending on the time position at which the viewer responds, or on whether the viewer responds at all. Furthermore, in the second embodiment, it is possible to configure whether both the video and audio portions are decoded, only the video portion is decoded, or only the audio portion is decoded. In this embodiment, the modes of viewer participation can be further diversified in accordance with the participation information of the viewers, such as providing completely different videos, or providing completely different audio information with the same video.
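The trigger-point configuration above can be sketched as a per-scene table of time positions, each paired with the decode mode selected when the viewer responds at or before that position. The table contents and function below are hypothetical illustrations, assuming response times measured in seconds from the start of the scene.

```python
# Hypothetical per-scene configuration: (trigger point in seconds, decode mode
# selected when the viewer responds at or before that point).
SCENE_TRIGGER_POINTS = {
    "scene_A": [(5.0, "both"), (20.0, "video_only"), (40.0, "audio_only")],
}

def select_decode_mode(scene_id, response_time):
    """Pick the decode mode based on when (or whether) the viewer responded."""
    if response_time is None:          # no response: decode the scene as authored
        return "both"
    for point, mode in SCENE_TRIGGER_POINTS.get(scene_id, []):
        if response_time <= point:
            return mode
    return "both"                      # response after the last trigger point
```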
[0133] In a case where there is no viewer trigger (no), it is determined in a step S2105 whether or not a time-out has occurred, and in a case where there is no time-out (no), the processing branches to the step S2102 to check again whether there is a viewer trigger. On the other hand, when the time-out expires in the step S2105 (yes), it is determined that the guidance trigger was ineffective because the viewer is, for example, sleeping, away in the bathroom, or not aware of the trigger at all; the processing then branches to a step S2106, and the video continues to be provided in the sequence of the mainstream content until the timing of the next guidance trigger comes.
[0134] On the other hand, in a case where there is a viewer trigger in the step S2102 (yes), the option content corresponding to the media attribute and the content of the viewer trigger is searched for in the option database 1809 in a step S2103 and loaded into the content buffer 1804, and in a step S2104, the content server 1605 modifies the reproduction order in the reproduction order table to set the reproduction sequence. Thereafter, in the step S2106, the content sequencer 1803 loads the scene to be reproduced next and sends it to the decoder 1802 to be decoded in the specified order, and a video signal is sent to the projector in a step S2107 to enable video reproduction. Note that, at this time, by superimposing audio information or video information corresponding to the viewer participation information on the projected video, the participation of a single viewer can be shared by all the viewers present in the space.
[0135] In a step S2108, the viewer participation information is searched for another viewer trigger such as a keyword or command, and in a step S2109, it is determined whether or not there is information requesting a scene change as the viewer trigger.
[0136] When there is such information (yes), the processing branches to the step S2104 to change the sequence and enable viewer participation. On the other hand, in a case where there is no such information in the step S2109 (no), the processing branches to the step S2106 to continue decoding the video without changing the scene.
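The control flow of steps S2102 through S2106 can be summarized as a single polling loop. The function below is a sketch under the assumption that the trigger check, time-out check, option loading, order modification, and decoding are injected as callables; none of these names appear in the disclosure.

```python
def handle_guidance_window(get_trigger, timed_out, load_option,
                           modify_order, decode_next):
    """Sketch of one guidance window (steps S2102-S2106).

    Returns "option" if a viewer trigger arrived before the time-out and the
    reproduction order was modified, or "mainstream" if the time-out expired.
    """
    while True:
        trigger = get_trigger()            # S2102: is there a viewer trigger?
        if trigger is not None:
            load_option(trigger)           # S2103: search and load option content
            modify_order(trigger)          # S2104: modify the reproduction order
            decode_next()                  # S2106: decode the next scene
            return "option"
        if timed_out():                    # S2105: has the time-out expired?
            decode_next()                  # S2106: continue the mainstream sequence
            return "mainstream"
```

Steps S2108 and S2109 would then re-enter this loop (branching back to S2104) whenever a further scene-change trigger is found in the participation information.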
[0137] By using the processing described above, the viewer can be guided through the progress of the video, the video can be rendered to the viewers present in the space as if the participating viewer were appearing in a movie or the like, and the sensation thereof can be shared by all the viewers.
[0140] Further, video content 2302 is an embodiment in which the sequence of the mainstream content 2300 is replaced through viewer participation. Upon receiving viewer participation information at scene A, the content server 1605 modifies the next scene and thereby modifies the mainstream content. When other viewer participation information is received in scene B, the next scene is changed to provide the video. Similarly, in scene C, other viewer participation information is received to modify the sequence of scenes, and in scene D the sequence of scenes is modified corresponding to the viewer participation information; this continues until the end of the video.
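The sequence replacement in this embodiment can be sketched as substituting the scene that follows each participation point. The helper below is a hypothetical illustration operating on scene identifiers rather than actual video data.

```python
def apply_participation(sequence, replacements):
    """Replace the scene following each participation point (sketch).

    `sequence` is the mainstream scene order; `replacements` maps the scene
    at which participation occurred to the option scene that replaces the
    scene immediately after it.
    """
    result = list(sequence)
    for scene, option in replacements.items():
        if scene in result:
            idx = result.index(scene)
            if idx + 1 < len(result):
                result[idx + 1] = option   # swap the next scene for the option
    return result
```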
[0141] Note that the viewers who can send the viewer participation information at scenes A to D may be the same or different. The option information may be voice information or video information such as projection mapping. In yet other embodiments, projection mapping may be used to display digital content, and the mainstream content may be projection-mapped onto the option content and displayed, or vice versa.
[0142] The above functions of the present embodiment can be realized by a device-executable program written in an object-oriented programming language such as C++, Java (registered trademark), Java (registered trademark) Beans, Java (registered trademark) Applet, JavaScript (registered trademark), Perl, Ruby, or Python, for example a program referred to as an application, and the program can be downloaded through the network or recorded and distributed on a device-readable recording medium. Further, the elements common to the first embodiment and the second embodiment may be shared, or a specific application may include a plurality of types of collaboration servers.
[0143] As described above, according to the present invention, it is possible to provide a video providing system and a program configured to control progress of content in a manner of involving a viewer who views a video reflected on a display device.
[0144] The present invention has been described above with reference to embodiments, but the present invention is not limited to the embodiments shown in the drawings and can be modified within the scope that can be conceived by a person skilled in the art, such as other embodiments, additions, changes, and deletions, and all aspects are within the scope of the present invention as long as the effects of the present invention are achieved.