Scalable real-time videoconferencing over WebRTC
20170359187 · 2017-12-14
Inventors
Cpc classification
H04N7/147
ELECTRICITY
H04L65/65
ELECTRICITY
H04L12/1827
ELECTRICITY
International classification
Abstract
A WebRTC-compliant media server avoids sharing the SSRCs of passive participants (namely, the video viewers who do not send video) by intercepting feedback packets (issued from the viewers) on the server side, modifying those packets, and then transmitting the modified packets back to the sender such that, when the sender receives these feedback packets, the sender treats the packets as if they were sent by a known SSRC. Preferably, the known SSRC is one that is associated with a single SSRC (e.g., a dummy or surrogate SSRC, or a technical SSRC, in either event that was previously shared with the video sender). The sender knows how to handle these packets and can then send the desired answer to the viewer(s) to keep the conference stable and operational even as the number of participants grows and exceeds the SSRC peer limitations.
Claims
1. Apparatus, comprising: a processor; computer memory holding computer program instructions executed by the processor during a videoconference established among a peer sender, and a plurality of peer viewers that are not senders, the computer program instructions operative to: intercept WebRTC source identifier feedback packets issued from the plurality of peer viewers; and in lieu of forwarding the intercepted WebRTC source identifier feedback packets to the peer sender, sending the peer sender feedback packets that appear to originate from a peer already known to the peer sender.
2. The apparatus as described in claim 1 wherein the WebRTC source identifier is a Real-time Transport Protocol (RTP) synchronization source identifier (SSRC).
3. The apparatus as described in claim 1 wherein the plurality of peer viewers exceeds twenty (20) peer viewers.
4. The apparatus as described in claim 1 wherein the plurality of peer viewers exceeds hundreds of peer viewers.
5. The apparatus as described in claim 1 wherein the peer already known to the peer sender has associated therewith a single WebRTC source identifier.
6. A method of videoconferencing among a peer sender, and a plurality of peer viewers that are not senders, comprising: as a videoconference initiated by the peer sender is on-going, intercepting WebRTC source identifier feedback packets issued from the plurality of peer viewers; and in lieu of forwarding the intercepted WebRTC source identifier feedback packets to the peer sender, sending the peer sender feedback packets that appear to originate from a surrogate peer already known to the peer sender.
7. The method as described in claim 6 wherein the WebRTC source identifier is a Real-time Transport Protocol (RTP) synchronization source identifier (SSRC).
8. The method as described in claim 6 wherein the plurality of peer viewers exceeds twenty (20) peer viewers.
9. The method as described in claim 6 wherein the plurality of peer viewers exceeds hundreds of peer viewers.
10. The method as described in claim 6 wherein the surrogate peer already known to the peer sender has associated therewith a single WebRTC source identifier.
11. The method as described in claim 6 further including providing the peer sender information identifying the surrogate peer in advance of the videoconference initiation.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
[0011]
[0012]
DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
[0013]
[0014] Generalizing, one or more functions of such a technology platform may be implemented in a cloud-based architecture. As is well-known, cloud computing is a model of service delivery for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services) that can be rapidly provisioned and released with minimal management effort or interaction with a provider of the service. Available service models that may be leveraged in whole or in part include: Software as a Service (SaaS) (the provider's applications running on cloud infrastructure); Platform as a Service (PaaS) (the customer deploys applications that may be created using provider tools onto the cloud infrastructure); Infrastructure as a Service (IaaS) (the customer provisions its own processing, storage, networks and other computing resources and can deploy and run operating systems and applications).
[0015] The platform may comprise co-located hardware and software resources, or resources that are physically, logically, virtually and/or geographically distinct. Communication networks used to communicate to and from the platform services may be packet-based, non-packet based, and secure or non-secure, or some combination thereof.
[0016] More generally, the techniques described herein are provided using a set of one or more computing-related entities (systems, machines, processes, programs, libraries, functions, or the like) that together facilitate or provide the functionality described above. In a typical implementation, a representative machine on which the software executes comprises commodity hardware, an operating system, an application runtime environment, and a set of applications or processes and associated data, that provide the functionality of a given system or subsystem. As described, the functionality may be implemented in a standalone machine, or across a distributed set of machines.
[0017] The architecture in
[0018] The infrastructure provides for an unlimited number of meetings, and each meeting may include up to a large number (e.g., 250) of participants.
[0019]
[0020] Thus, in one embodiment, the first participating end user (sender 200) accesses the service via a desktop or laptop computer. A representative machine is a data processing system that includes a communications fabric that provides communications between a processor unit, memory, persistent storage, a communications unit, an input/output (I/O) unit, and a display. A typical data processing system includes a web browser or the like that is WebRTC-compliant. Thus, a sender peer camera transfers a video in real-time to the display screens of the viewer peers via direct (peer-to-peer or “P2P”) connections. As noted, the stream delivery (video encoding and decoding, etc.) conforms to the WebRTC standard.
[0021] The Web Real-Time Communication (WebRTC) framework provides the protocol building blocks to support direct, interactive, real-time communication using audio, video, collaboration, etc., between two peers' web browsers. WebRTC uses the Real-time Transport Protocol (RTP) (RFC 3550) as its media transport protocol. RTP provides a framework for delivery of audio and video teleconferencing data and other real-time media applications. According to WebRTC, sharing the SSRCs of video viewers (namely, the peers that only receive video) with the sender is necessary. (An SSRC is an identifier for an RTP synchronization source.) In particular, the feedback that viewers issue has to be handled properly by the sender peer (e.g., before the sender sends a new keyframe) so that the viewer is able to decode the received media stream continuously (i.e., identify the received data and associate the received video with a particular (high level) user). That said, to solve the above-identified performance problem, a computing entity (e.g., a server, or server group) that is managing the videoconference and its delivery recognizes SSRCs of those participants who are only viewing the videoconference (as opposed to sending video), and then processes those SSRCs in a unique way.
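For illustration only, the following minimal sketch shows where the SSRC sits in an RTP packet. It assumes the fixed 12-byte RTP header of RFC 3550, in which the SSRC occupies bytes 8 through 11 in network byte order. The function name rtp_ssrc is hypothetical and is not part of the disclosure.

```python
import struct


def rtp_ssrc(packet: bytes) -> int:
    """Return the SSRC carried in the fixed RTP header (RFC 3550).

    Hypothetical helper for illustration; assumes an unencrypted RTP packet.
    """
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    if packet[0] >> 6 != 2:
        raise ValueError("unsupported RTP version")
    # Bytes 0-1: flags and payload type, 2-3: sequence number,
    # 4-7: timestamp, 8-11: SSRC (big-endian unsigned 32-bit).
    return struct.unpack_from("!I", packet, 8)[0]
```

A server that tracks which peers are senders and which are viewers could use such a lookup to decide which SSRCs must be shared and which can be hidden behind a surrogate.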
[0022] In particular, and according to this disclosure, the server avoids sharing the SSRCs of the video viewers by intercepting these feedback packets on the server side, modifying those packets, and then transmitting the modified packets back to the sender such that, when the sender receives these feedback packets, the sender treats the packets as if they were sent by a known SSRC. Preferably, the known SSRC is one that is associated with a single SSRC (e.g., a dummy SSRC, or a technical SSRC, in either event that was previously shared with the video sender). The sender knows how to handle this packet and can then send the desired answer to the viewer to keep the conference stable and operational even as the number of participants grows and exceeds the SSRC peer limitations.
[0023] Thus, according to the disclosure, the approach no longer identifies video viewer peers in a WebRTC-based videoconference with their own actual SSRCs, but rather minimizes the number of required SSRCs in a video meeting by capturing the feedback packets (from those viewer peers) and faking video packets back to the sender during network transfer. This approach may be implemented using a conventional video streaming server that is augmented (e.g., via a plug-in) to enable the functionality. By faking the packets between the two peers on the server side, the peer that receives the packets handles them as if they were sent by another known peer.
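The rewriting step described above may be sketched as follows. This is a hedged illustration and not the disclosed implementation. It assumes that the feedback packets are RTCP feedback messages in the common format of RFC 4585 (payload type 205 for transport-layer feedback, 206 for payload-specific feedback such as a picture loss indication), in which bytes 4 through 7 hold the SSRC of the packet sender, and that the server has already decrypted the packet. The name rewrite_feedback_sender and the parameter surrogate_ssrc are hypothetical.

```python
import struct

RTPFB = 205  # transport-layer feedback (RFC 4585)
PSFB = 206   # payload-specific feedback, e.g. PLI (RFC 4585)


def rewrite_feedback_sender(compound: bytes, surrogate_ssrc: int) -> bytes:
    """Replace the packet-sender SSRC of every RTCP feedback packet.

    Hypothetical sketch: walks a (possibly compound) RTCP datagram and
    overwrites the sender SSRC with the surrogate known to the video sender.
    All other fields, including the media-source SSRC, are left unchanged.
    """
    out = bytearray(compound)
    offset = 0
    while offset + 4 <= len(out):
        if out[offset] >> 6 != 2:
            raise ValueError("not an RTCP packet (version != 2)")
        packet_type = out[offset + 1]
        # The length field counts 32-bit words minus one.
        size = 4 * (struct.unpack_from("!H", out, offset + 2)[0] + 1)
        if offset + size > len(out):
            raise ValueError("truncated RTCP packet")
        if packet_type in (RTPFB, PSFB) and size >= 12:
            struct.pack_into("!I", out, offset + 4, surrogate_ssrc)
        offset += size
    return bytes(out)
```

Restricting the rewrite to payload types 205 and 206 is a choice made for this sketch; a real deployment would need to decide which other RTCP packet types, if any, it also handles.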
[0024] When multiple senders are present in the conference, the approach as described preferably is used for each of them.
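One possible way to apply the approach per sender is sketched below. It assumes, purely for illustration, that the server keeps a table mapping each sender's media SSRC to the surrogate SSRC previously shared with that sender, and that the feedback packet names the targeted stream in its media-source field (bytes 8 through 11 of an RFC 4585 feedback message, as in a picture loss indication). The function name surrogate_for and the table layout are hypothetical.

```python
import struct
from typing import Dict, Optional


def surrogate_for(feedback: bytes, surrogates: Dict[int, int]) -> Optional[int]:
    """Pick the surrogate SSRC for the sender that a feedback packet targets.

    Hypothetical sketch: `surrogates` maps a sender's media SSRC to the
    surrogate SSRC shared with that sender. Returns None when the packet is
    too short or targets a stream that is not in the table.
    """
    if len(feedback) < 12:
        return None
    media_ssrc = struct.unpack_from("!I", feedback, 8)[0]
    return surrogates.get(media_ssrc)
```

Keying the table by media SSRC keeps the lookup independent of how many viewers are connected, since only the senders appear in it.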
[0025] The approach provides significant advantages. It enables scaling of the videoconference to well beyond a limited number of participants. Even as the conference scales up to a high participant count, the conference is stable and latency remains very low (e.g., under 2 seconds). The approach can be implemented whenever the use cases have one or only a few video senders, irrespective of the number of video viewers. Thus, the technique may be used to provide webinars, town-hall meetings, online classroom trainings, and the like.
[0026] While the above describes a particular order of operations performed by certain embodiments of the invention, it should be understood that such order is exemplary, as alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, or the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.
[0027] While the disclosed subject matter has been described in the context of a method or process, the subject disclosure also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including an optical disk, a CD-ROM, and a magneto-optical disk, a read-only memory (ROM), a random access memory (RAM), a magnetic or optical card, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
[0028] While given components of the system have been described separately, one of ordinary skill will appreciate that some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like.
[0029] The described commercial products, systems and services are provided for illustrative purposes only and are not intended to limit the scope of this disclosure.
[0030] The techniques herein provide for improvements to a technology or technical field, namely, on-demand remote access environments, as well as improvements to various technologies such as videoconferencing over a wide area network, and the like, all as described.