Selective playback of audio at normal speed during trick play operations
11558674 · 2023-01-17
Cpc classification
H04N21/84
ELECTRICITY
H04N21/8456
ELECTRICITY
H04N5/783
ELECTRICITY
H04N21/47217
ELECTRICITY
H04N21/6587
ELECTRICITY
H04N21/8106
ELECTRICITY
H04N21/4394
ELECTRICITY
International classification
H04N21/6587
ELECTRICITY
Abstract
Systems and methods are described herein for selective playback of portions of audio at normal speed during a fast-forward operation. Upon receiving a command to perform a fast-forward operation, a current playback position is identified, as well as a plurality of portions of audio of the content item that will be subject to the fast-forward operation. A subset of the audio portions that will be subject to the fast-forward operation are selected. The fast-forward operation is initiated, and video of the content item is played back at the increased speed while the selected portions of audio of the content item are played back at normal speed.
Claims
1. A method comprising: receiving a command to play operation on a media content item, wherein the play operation comprises playing a media content item at a first speed that is greater than a normal speed; calculating a length of a time window, wherein the length is calculated based on: (a) a duration of a scene of the media content item that is being played when the command is received, and (b) a magnitude of the first speed; analyzing a segment of audio of the media content item in the time window, wherein the time window starts at the point when the command is received and has a duration that matches the calculated length of the time window, to identify a sub-portion of the segment of audio in the time window; initializing the play operation; and in response to initializing the play operation, playing the media content item at the first speed while playing the identified sub-portion of the segment of the audio at a second speed that is slower than the first speed.
2. The method of claim 1, wherein analyzing a segment of audio of the media content item in the time window further comprises accessing a plurality of metadata of the content item, wherein the metadata comprises descriptive information about each respective portion of a plurality of portions of the segment of audio.
3. The method of claim 2, wherein the plurality of metadata further comprises an importance level for each respective portion of the plurality of portions of the segment of audio.
4. The method of claim 3, further comprising: determining, for each respective portion of the plurality of portions of the segment of audio, whether the importance level of each respective portion of audio exceeds a threshold importance level; and selecting a portion of the plurality of portions of the segment of audio that has an importance level that exceeds the threshold importance level.
5. The method of claim 1, further comprising: accessing a plurality of metadata of the content item, wherein the metadata comprises a plurality of descriptors and a plurality of significance factors; accessing user preference data comprising a plurality of preference factors corresponding to at least one of the plurality of descriptors; determining a user preference for each respective portion of the plurality of portions of the segment of audio by comparing each of the plurality of descriptors of the respective portion of the plurality of portions of the segment of audio with the corresponding preference factor; calculating an importance level for the respective portion of the plurality of portions of the segment of audio based on the significance factor and the preference factor; and selecting the respective portion of the plurality of portions of the segment of audio that exceeds the threshold importance level.
6. The method of claim 5, wherein calculating an importance level for the respective portion of the plurality of portions of the segment of audio based on the significance factor and the preference factor comprises: determining an absolute importance level based on the significance factor; determining a weighting factor corresponding to the preference factor; and applying the weighting factor to the absolute importance level to determine a relative importance level.
7. The method of claim 1, further comprising: separating video and audio of the content item to create a video stream comprising the video and an audio stream comprising the audio; wherein initiating the play operation comprises increasing the playback speed of the video stream; and playing the identified sub-portion of the segment of the audio stream at a second speed that is slower than the first speed.
8. The method of claim 7, wherein playing the identified sub-portion of the segment of the audio stream at a second speed that is slower than the first speed comprises: advancing an audio playback position of the audio stream to a position in the audio stream corresponding to the identified sub-portion of audio; and playing back the identified sub-portion of audio.
9. The method of claim 1, wherein: initializing the play operation comprises moving a starting position of the time window at a speed corresponding to the first speed.
10. The method of claim 1, further comprising buffering the identified segment of the audio.
11. A system comprising: input/output circuitry configured to: receive a command to play operation on a media content item, wherein the play operation comprises playing a media content item at a first speed that is greater than a normal speed; and processing circuitry configured to: calculate a length of a time window, wherein the length is calculated based on: (a) a duration of a scene of the media content item that is being played when the command is received, and (b) a magnitude of the first speed; analyze a segment of audio of the media content item in the time window, wherein the time window starts at the point when the command is received and has a duration that matches the calculated length of the time window, to identify a sub-portion of the segment of audio in the time window; initialize the play operation; and in response to initializing the play operation, play the media content item at the first speed while playing the identified sub-portion of the segment of the audio at a second speed that is slower than the first speed.
12. The system of claim 11, wherein the processing circuitry configured to analyze a segment of audio of the media content item in the time window is further configured to access a plurality of metadata of the content item, wherein the metadata comprises descriptive information about each respective portion of a plurality of portions of the segment of audio.
13. The system of claim 12, wherein the plurality of metadata accessed by the processing circuitry further comprises an importance level for each respective portion of the plurality of portions of the segment of audio.
14. The system of claim 13, wherein the processing circuitry is further configured to: determine, for each respective portion of the plurality of portions of the segment of audio, whether the importance level of each respective portion of audio exceeds a threshold importance level; and select a portion of the plurality of portions of the segment of audio that has an importance level that exceeds the threshold importance level.
15. The system of claim 11, wherein the processing circuitry is further configured to: access a plurality of metadata of the content item, wherein the metadata comprises a plurality of descriptors and a plurality of significance factors; access user preference data comprising a plurality of preference factors corresponding to at least one of the plurality of descriptors; determine a user preference for each respective portion of the plurality of portions of the segment of audio by comparing each of the plurality of descriptors of the respective portion of the plurality of portions of the segment of audio with the corresponding preference factor; calculate an importance level for the respective portion of the plurality of portions of the segment of audio based on the significance factor and the preference factor; and select the respective portion of the plurality of portions of the segment of audio that exceeds the threshold importance level.
16. The system of claim 15, wherein the processing circuitry configured to calculate an importance level for the respective portion of the plurality of portions of the segment of audio based on the significance factor and the preference factor is further configured to: determine an absolute importance level based on the significance factor; determine a weighting factor corresponding to the preference factor; and apply the weighting factor to the absolute importance level to determine a relative importance level.
17. The system of claim 11, wherein the processing circuitry is further configured to: separate video and audio of the content item to create a video stream comprising the video and an audio stream comprising the audio; initiate the play operation by increasing the playback speed of the video stream; and play the identified sub-portion of the segment of the audio stream at a second speed that is slower than the first speed.
18. The system of claim 17, wherein the processing circuitry configured to play the identified sub-portion of the segment of the audio stream at a second speed that is slower than the first speed is further configured to: advance an audio playback position of the audio stream to a position in the audio stream corresponding to the identified sub-portion of audio; and play back the identified sub-portion of audio.
19. The system of claim 11, wherein the processing circuitry is further configured to initialize the play operation by moving a starting position of the time window at a speed corresponding to the first speed.
20. The system of claim 11, wherein the processing circuitry is further configured to buffer the identified segment of the audio.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The above and other objects and advantages of the disclosure will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
DETAILED DESCRIPTION
(13)
(14) Media device 100 may buffer audio data of the portions of audio such as portion 106 as the fast-forward operation proceeds, playing back the portion at normal speed from the buffered audio data. Alternatively, media device 100 may separate and individually control playback of the video and audio of content item 102, generating separate video and audio streams. Media device 100 increases the playback speed of the video stream and advances the playback position of the audio stream to a position corresponding to portion 106, which is played back at normal speed. Upon conclusion of portion 106, media device 100 advances the playback position of the audio stream to the next identified portion of audio to be played back.
(15) To identify portions of audio that will be subject to the fast-forward operation, media device 100 may initialize a moving window having a starting point at the current playback point and a length corresponding to the minimum duration of the fast-forward operation. The minimum duration can be determined based on the increased playback speed to be used in the fast-forward operation and the average amount of time a user is expected to want to execute the fast-forward command. For example, if the fast-forward operation increases the playback speed to 2× speed and the user is expected to fast-forward for ten seconds, media device 100 may initialize the moving window with a length of twenty seconds.
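The window-length arithmetic in the example above can be sketched as follows. This is a minimal illustration, not the device's actual implementation; the function name `window_length_seconds` and its parameters are hypothetical.

```python
def window_length_seconds(speed: float, expected_ff_seconds: float) -> float:
    """Length of the moving window, measured in seconds of content.

    speed: fast-forward multiplier (2.0 means 2x speed).
    expected_ff_seconds: wall-clock time the user is expected to fast-forward.
    Both parameter names are assumptions made for this sketch.
    """
    if speed <= 1.0:
        raise ValueError("fast-forward speed must be greater than normal speed")
    if expected_ff_seconds < 0:
        raise ValueError("expected duration must be non-negative")
    # Each wall-clock second of fast-forwarding covers `speed` seconds of content.
    return speed * expected_ff_seconds


# The example from the text: 2x speed for ten seconds gives a twenty-second window.
print(window_length_seconds(2.0, 10.0))
```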
(16)
(17)
(18)
(19) Control circuitry 500 receives 502, using input circuitry 504, a command to perform a fast-forward operation. Input circuitry 504 may include a microphone and voice processing circuitry for receiving voice commands, infrared receiving circuitry for receiving commands from a remote control device, a touchscreen interface for receiving user interactions with graphical user interface elements, or any combination thereof or any other suitable input circuitry for receiving any other suitable user input. In response to the command, input circuitry 504 generates a query for metadata of the content item (e.g., metadata 300 or metadata 400) and transmits 506 the query to transceiver circuitry 508 to be transmitted 510 to content metadata database 512. The query may be an SQL “SELECT” command, or any other suitable query format. Transceiver circuitry 508 may be a network connection such as an Ethernet port, WiFi module, or any other data connection suitable for communicating with a remote server. Transceiver circuitry 508 receives 514 from content metadata database 512, in response to the query, metadata describing a plurality of portions of audio of the content item (e.g., metadata 300 or metadata 400). In some embodiments input circuitry 504 also generates a second query for user preference data. Transceiver circuitry 508 transmits 516 the second query to user profile database 518 and receives 520 from user profile database 518, in response to the query, user preference data (e.g., user preference data 402).
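As an illustration of the kind of SQL "SELECT" query mentioned above, the sketch below uses Python's built-in sqlite3 module against an in-memory database. The table name `audio_portions`, its columns, and the sample rows are assumptions made for this example; the disclosure does not specify a schema.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audio_portions ("
    " content_id TEXT, portion_id TEXT, start_time REAL, importance INTEGER)"
)
conn.executemany(
    "INSERT INTO audio_portions VALUES (?, ?, ?, ?)",
    [
        ("movie-1", "p1", 12.0, 2),
        ("movie-1", "p2", 30.5, 5),
        ("movie-2", "p1", 4.0, 3),
    ],
)


def fetch_portion_metadata(connection, content_id):
    """Return (portion_id, start_time, importance) rows for one content item."""
    cursor = connection.execute(
        "SELECT portion_id, start_time, importance"
        " FROM audio_portions WHERE content_id = ? ORDER BY start_time",
        (content_id,),
    )
    return cursor.fetchall()


rows = fetch_portion_metadata(conn, "movie-1")
print(rows)
```

A parameterized query is used here so that the content identifier is never spliced into the SQL string directly.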
(20) Transceiver circuitry 508 transfers 522 the metadata to comparison circuitry 524. Comparison circuitry 524 identifies a number of portions of audio that will be subject to the fast-forward operation and analyzes their respective importance levels to select a subset of portions of audio that are to be played back at normal speed during the fast-forward operation. Comparison circuitry 524 may also receive, or have access to, the current playback position and the length of moving window 216. Once the subset of portions of audio has been selected, comparison circuitry 524 transfers 526 the identifiers corresponding to the subset of portions to output circuitry 528. Output circuitry 528 increases the speed of video output 530 and, using the identifiers of the subset of portions of audio, outputs 532 audio of each portion of the subset of portions. Output circuitry 528 may time the output of each portion of audio to correspond with the time at which the corresponding video is played back at the increased speed, or may simply play each portion sequentially.
(21)
(22) At 602, control circuitry 500 receives, using input circuitry 504, a command to perform a fast-forward operation. The command may be received from a remote control or other user input device, or may be a voice command. At 604, control circuitry 500 identifies a current playback position of the content item. For example, control circuitry 500 accesses a timestamp of a frame of video content currently being displayed.
(23) At 606, control circuitry 500 identifies a plurality of portions of audio of the content item following the current playback position that will be subject to the fast-forward operation. For example, control circuitry 500 accesses, using transceiver circuitry 508, metadata of the content item describing portions of audio of the content item and their respective starting times. Control circuitry 500 determines, based on the starting time of each portion of audio and the current playback position, which portions of the plurality of portions of audio will be subject to the fast-forward operation. At 608, control circuitry 500 accesses metadata of the content item comprising an importance level of each portion of audio that will be subject to the fast-forward operation.
(24) At 610, control circuitry 500 initializes a counter variable N, setting its value to zero, and a variable T_P representing the total number of portions of audio subject to the fast-forward operation, setting its value to the number of portions of audio. At 612, control circuitry 500 determines whether the importance level of the Nth portion of audio exceeds a threshold importance level. For example, portions of audio may be rated on a scale of importance from one to five. Control circuitry 500 may establish a threshold importance level of four, meaning that any portion having an importance level of four or higher should be played back at normal speed. If the importance level of the Nth portion of audio exceeds the threshold importance level, then, at 614, control circuitry 500 adds the Nth portion of audio, or an identifier thereof, to a subset of portions of audio.
(25) After adding the Nth portion of audio to the subset of portions of audio, or if the importance level of the Nth portion of audio does not exceed the threshold importance level, at 616, control circuitry 500 determines whether N is equal to T_P. If not, then, at 618, control circuitry 500 increments the value of N by one, and processing returns to step 612. If N equals T_P, meaning that all portions of audio have been analyzed, then, at 620, control circuitry 500 initiates the fast-forward operation.
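The selection loop of steps 610 through 618 can be sketched as below. The `AudioPortion` type and the `select_subset` function are hypothetical names introduced for illustration. The text says the importance level "exceeds" the threshold but also gives the example of "four or higher" for a threshold of four; this sketch assumes the inclusive reading (`>=`) to match that example. It also visits each portion exactly once with a `for` loop rather than reproducing the counter N and total T_P literally.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AudioPortion:
    """Hypothetical record for one portion of audio."""
    portion_id: str
    importance: int  # e.g. rated from 1 (low) to 5 (high)


def select_subset(portions: Iterable[AudioPortion], threshold: int) -> List[str]:
    """Return identifiers of portions to be played back at normal speed."""
    subset = []
    for portion in portions:
        # Assumption: a portion at the threshold is included ("four or higher").
        if portion.importance >= threshold:
            subset.append(portion.portion_id)
    return subset


portions = [
    AudioPortion("p1", 3),
    AudioPortion("p2", 4),
    AudioPortion("p3", 5),
    AudioPortion("p4", 1),
]
print(select_subset(portions, threshold=4))
```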
(26) At 622, control circuitry 500, using output circuitry 528, plays back video of the content item at an increased speed. At 624, control circuitry 500, using output circuitry 528, plays back the subset of portions of audio at normal speed. Control circuitry 500 may determine when video corresponding to each portion of the subset of portions of audio is displayed and play back the corresponding portion of audio at that time. Alternatively, control circuitry 500 may play each portion of the subset of portions sequentially beginning at the time at which the fast-forward operation is initiated.
(27) In cases where the time required to play back the subset of portions of audio at normal speed exceeds the duration of the fast-forward operation, control circuitry 500 may slow the speed at which the fast-forward operation is performed. For example, control circuitry 500 may reduce the speed from 2× to 1.5× in order to provide additional time to play back the subset of portions of audio at normal speed before the end of the fast-forward operation. Alternatively or additionally, control circuitry 500 may reduce the number of portions in the subset of portions of audio. For example, control circuitry 500 may raise the threshold level of importance or may disregard user preference data which resulted in the inclusion of additional portions of audio.
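One way the speed reduction could be computed is sketched below. The disclosure does not give a formula, so this is an assumption: the fast-forward operation is taken to cover a known span of content, its wall-clock duration is that span divided by the speed, and the speed is lowered just enough for the selected audio to fit. The function name `fit_speed` and its parameters are hypothetical.

```python
def fit_speed(content_span_seconds: float, speed: float, audio_seconds: float) -> float:
    """Return a fast-forward speed, no greater than `speed`, at which
    `audio_seconds` of normal-speed audio fits within the operation.

    content_span_seconds: seconds of content covered by the fast-forward.
    speed: requested fast-forward multiplier.
    audio_seconds: total normal-speed duration of the selected portions.
    """
    if speed <= 0 or content_span_seconds <= 0:
        raise ValueError("speed and content span must be positive")
    if audio_seconds <= 0:
        return speed
    wall_clock_seconds = content_span_seconds / speed
    if audio_seconds <= wall_clock_seconds:
        return speed  # the audio already fits; no change needed
    # Slow down so the wall-clock duration equals the audio duration,
    # but never drop below normal speed (1.0).
    return max(1.0, content_span_seconds / audio_seconds)


# 60 s of content at 2x lasts 30 s; 40 s of selected audio needs 1.5x.
print(fit_speed(60.0, 2.0, 40.0))
```

If the result falls back to 1.0 and the audio still does not fit, the alternative described above (reducing the number of selected portions, for example by raising the threshold) would apply.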
(28) The actions or descriptions of
(29)
(30) At 702, control circuitry 500 accesses metadata of the content item comprising a significance factor and a plurality of descriptors for each portion of audio of the content item. For example, control circuitry 500, using transceiver circuitry 508, transmits a query to a database and receives the metadata in response to the query. The significance factor may represent a significance of the portion of audio to the overall plot of the content item. At 704, control circuitry 500 accesses user preference data comprising a plurality of preference factors corresponding to at least one of the plurality of descriptors. For example, control circuitry 500, using transceiver circuitry 508, transmits a query to a user profile database and receives the user preference data in response to the query. The plurality of descriptors may include an identifier of the character who spoke each portion of audio, and the user preference data may include preference factors for a plurality of characters.
(31) At 706, control circuitry 500 initializes a counter variable N, setting its value to zero, and a variable T_P representing the total number of portions of audio, setting its value to the number of portions of audio. At 708, control circuitry 500, using comparison circuitry 524, compares the plurality of descriptors for the Nth portion of audio with the corresponding preference factor. For example, a character descriptor of the portion of audio corresponding to the dialogue “These aren't the droids you're looking for” may indicate Obi-Wan Kenobi as the speaker of the dialogue. User preference data may indicate a high preference factor for the character Obi-Wan Kenobi. At 710, control circuitry 500 calculates an importance level of the Nth portion of audio based on the significance factor and the preference factor. This may be accomplished using methods described below in connection with
(32) At 712, control circuitry 500 determines whether the importance level of the Nth portion of audio exceeds a threshold importance level. This may be accomplished using methods described above in connection with
(33) The actions or descriptions of
(34)
(35) At 802, control circuitry 500 determines an absolute importance level based on the significance factor. For example, the significance factor may indicate the significance of the portion of audio within the context of the entire content item, in which case the significance factor is equal to the absolute importance level. Alternatively, the significance factor may indicate the significance of the portion of audio only in relation to other portions of audio in the same scene or subset of portions of audio. In this case, the overall importance of the scene or subset of portions of audio influences the absolute importance of the portion of audio.
(36) At 804, control circuitry 500 determines a weighting factor corresponding to the preference factor. For example, control circuitry 500 may convert an integer preference factor into a percent value by which the absolute importance level is to be multiplied. The preference factor may be an integer from one to five. If the preference factor is three or lower, indicating low preference, control circuitry 500 converts the preference factor into a percentage value that is less than one. If the preference factor is higher than three, indicating higher preference, control circuitry 500 converts the preference factor into a percentage value that is higher than one. At 806, control circuitry 500 applies the weighting factor to the absolute importance level by, for example, multiplying the importance level by the percentage value, to determine a relative importance level of the portion of audio.
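Steps 802 through 806 can be sketched as follows. The specific multipliers in `PREFERENCE_WEIGHTS` are invented for illustration; the text only requires that preference factors of three or lower map to a value below one and factors above three map to a value above one. The sketch also assumes the simple case in which the significance factor is itself the absolute importance level.

```python
# Hypothetical mapping from an integer preference factor (1..5) to a multiplier.
# Only the ordering relative to 1.0 comes from the text; the values are assumed.
PREFERENCE_WEIGHTS = {1: 0.25, 2: 0.5, 3: 0.75, 4: 1.25, 5: 1.5}


def relative_importance(significance_factor: float, preference_factor: int) -> float:
    """Weight an absolute importance level by a user preference factor."""
    if preference_factor not in PREFERENCE_WEIGHTS:
        raise ValueError("preference factor must be an integer from 1 to 5")
    # Step 802 (assumed case): the significance factor is the absolute importance.
    absolute_importance = significance_factor
    # Step 804: convert the preference factor into a weighting factor.
    weight = PREFERENCE_WEIGHTS[preference_factor]
    # Step 806: apply the weighting factor by multiplication.
    return absolute_importance * weight


print(relative_importance(4, 5))  # high preference raises the importance
print(relative_importance(4, 2))  # low preference lowers it
```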
(37) The actions or descriptions of
(38)
(39) At 902, control circuitry 500 separates video and audio data of the content item to create a video stream comprising the video and an audio stream comprising the audio. For example, control circuitry 500 may apply a filter to the content item which isolates packets containing video data from packets containing audio data. Alternatively, the content item may be in a format such as MPEG-2, which inherently contains separate audio and video streams which control circuitry 500 can process separately.
(40) At 904, control circuitry 500 increases playback speed of the video stream. Control circuitry 500, having separated the video and audio into individual streams, can control the playback of each stream individually.
(41) At 906, control circuitry 500 initializes a counter variable N, setting its value to zero, and a variable T_P representing the total number of portions of audio in the subset of portions of audio, setting its value to the number of portions of audio in the subset of portions of audio. At 908, control circuitry 500 advances an audio playback position of the audio stream to a position in the audio stream corresponding to the Nth portion of audio. For example, metadata of the Nth portion of audio indicates a start time. Control circuitry 500 advances the playback position of the audio stream to the indicated start time. At 910, control circuitry 500 plays back the Nth portion of audio at normal speed. At 912, control circuitry 500 determines whether N is equal to T_P. If not, then, at 914, control circuitry 500 increments the value of N by one, and processing returns to step 908. If N is equal to T_P, meaning that all portions of audio of the subset of portions of audio have been played back, then the process is complete.
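The seek-then-play loop of steps 906 through 914 can be sketched as a function that produces an ordered plan of audio-stream actions. The `SelectedPortion` type, the action names "seek" and "play_until", and the presence of a duration field in the metadata are assumptions for this sketch; an actual device would drive a decoder rather than build a list.

```python
from typing import List, NamedTuple, Tuple


class SelectedPortion(NamedTuple):
    """Hypothetical metadata for one selected portion of audio."""
    portion_id: str
    start_time: float  # seconds of content time
    duration: float    # seconds at normal speed


def audio_playback_plan(subset: List[SelectedPortion]) -> List[Tuple[str, float]]:
    """Return (action, content_time) pairs for the audio stream."""
    plan = []
    for portion in subset:
        # Step 908: advance the audio playback position to the portion's start.
        plan.append(("seek", portion.start_time))
        # Step 910: play the portion at normal speed until its end.
        plan.append(("play_until", portion.start_time + portion.duration))
    return plan


subset = [SelectedPortion("p2", 30.0, 4.0), SelectedPortion("p5", 71.5, 2.5)]
for action in audio_playback_plan(subset):
    print(action)
```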
(42) The actions or descriptions of
(43)
(44) At 1002, control circuitry 500 receives a command to perform a fast-forward operation, the command comprising an indication of the increased speed. For example, the command may indicate that the fast-forward operation should advance through the content item at 1.5×, 2×, 3×, or 4× speed. At 1004, control circuitry 500 calculates a minimum duration of the fast-forward operation based on the indication of the increased speed. For example, if 2× speed is indicated, control circuitry 500 determines that, for every second that the fast-forward operation is being performed, two seconds of content are being played back. In some embodiments, control circuitry 500 may use the average length of a fast-forward operation or the length of the current scene to determine for how long the fast-forward operation will be performed and multiply that time by the increased speed to calculate the duration of the fast-forward operation in terms of content length.
(45) At 1006, control circuitry 500 initializes a moving window having a starting position corresponding to the current playback position and a fixed length corresponding to the minimum duration. At 1008, control circuitry 500 advances the starting position of the window at a speed corresponding to the increased speed to identify additional portions of audio that will be subject to the fast-forward operation. For example, if the fast-forward operation is performed at 2× speed, the starting point of the moving window is advanced by two seconds of content every second.
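The moving window of steps 1006 and 1008 can be sketched as a small data structure. The class name `MovingWindow` and its fields are hypothetical; the sketch only shows a fixed-length window whose start advances in content time at the fast-forward speed.

```python
from dataclasses import dataclass


@dataclass
class MovingWindow:
    """Fixed-length window over content time (all values in seconds)."""
    start: float   # begins at the current playback position
    length: float  # the minimum duration, in content seconds
    speed: float   # fast-forward multiplier

    @property
    def end(self) -> float:
        return self.start + self.length

    def advance(self, wall_clock_seconds: float) -> None:
        # At 2x speed the start moves two content seconds per wall-clock second.
        self.start += self.speed * wall_clock_seconds


window = MovingWindow(start=100.0, length=20.0, speed=2.0)
window.advance(1.0)
print(window.start, window.end)
```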
(46) The actions or descriptions of
(47)
(48) At 1102, control circuitry 500 accesses metadata of the content item, the metadata comprising a start time of each portion of audio. At 1104, control circuitry 500 initializes a counter variable N, setting its value to zero, and a variable T_P representing the total number of portions of audio, setting its value to the total number of portions of audio. At 1106, control circuitry 500 determines whether the start time of the Nth portion of audio is between the current start time and current end time of the moving window. If so, then, at 1108, control circuitry 500 identifies the Nth portion of audio as a portion of audio that will be subject to the fast-forward operation. After making the identification, or if the start time of the Nth portion of audio is not between the current start time and current end time of the moving window, at 1110, control circuitry 500 determines whether N is equal to T_P. If not, then, at 1112, control circuitry 500 increments the value of N by one, and processing returns to step 1106. If N is equal to T_P, meaning that all portions of audio have been analyzed, then the process is complete.
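The window membership test of steps 1102 through 1112 can be sketched as below. The function name `portions_in_window` and the dictionary of start times are hypothetical. The text says "between" without stating whether the window boundaries are included; this sketch assumes they are.

```python
from typing import Dict, List


def portions_in_window(
    start_times: Dict[str, float], window_start: float, window_end: float
) -> List[str]:
    """Return identifiers of portions whose start time falls in the window."""
    identified = []
    for portion_id, start_time in start_times.items():
        # Assumption: both window boundaries are inclusive.
        if window_start <= start_time <= window_end:
            identified.append(portion_id)
    return identified


start_times = {"p1": 5.0, "p2": 12.0, "p3": 29.0, "p4": 45.0}
print(portions_in_window(start_times, window_start=10.0, window_end=30.0))
```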
(49) The actions or descriptions of
(50) The processes described above are intended to be illustrative and not limiting. One skilled in the art would appreciate that the steps of the processes discussed herein may be omitted, modified, combined, and/or rearranged, and any additional steps may be performed without departing from the scope of the invention. More generally, the above disclosure is meant to be exemplary and not limiting. Only the claims that follow are meant to set bounds as to what the present invention includes. Furthermore, it should be noted that the features and limitations described in any one embodiment may be applied to any other embodiment herein, and flowcharts or examples relating to one embodiment may be combined with any other embodiment in a suitable manner, done in different orders, or done in parallel. In addition, the systems and methods described herein may be performed in real time. It should also be noted that the systems and/or methods described above may be applied to, or used in accordance with, other systems and/or methods.