How to do peer-to-peer video call?

Flash Player 10 and later can send media streams peer-to-peer instead of routing them through a media server. The VideoIO application exposes this feature in its API: you just set the src property to the correct URL to enable peer-to-peer video calls.

Instead of an rtmp URL, which enables client-server media streams, the application uses an rtmfp URL. Our previous tutorials discussed how to use RTMP along with a media server to enable video calls where all media streams go through the media server. The newer protocol, RTMFP, enables secure end-to-end media streams among different instances of Flash Player without going through a media server. This makes low-latency audio and video communication possible over the web. However, it still needs a rendezvous server, which the Flash Player instances use to negotiate media and transport attributes and to create the end-to-end media streams. Please see RTMFP vs SIP for how the peer-to-peer mode of Flash Player actually carries media end-to-end.

Fortunately, Adobe also hosts a free service for developers to facilitate such peer-to-peer rendezvous. In particular, developers can sign up for a Stratus developer key and then build applications that use the service. Once you sign up, you get a developer key and an rtmfp URL for the service. We have signed up for our VideoIO application and received the following URL.

rtmfp://stratus.rtmfp.net/d1e1e5b3f17e90eb35d244fd-c711881365d9/

A peer-to-peer call has both similarities to and differences from the client-server video call described in How to do two-party video call?. Instead of setting the src URL to your media server, we set it to the Stratus URL above. The publish and play URL parameters are used in a similar way. However, peer-to-peer mode also requires that a secure identifier of the publishing end be available to the playing end. Flash Player defines a nearID attribute for this purpose, which we expose in our API. Please see How to use the VideoIO API? for details.

Suppose Alice and Bob want to do a peer-to-peer video call. Each end has two instances of the VideoIO application -- one for capturing the live camera view and microphone audio as the local stream, and the other for playing the audio and video of the remote stream. When Alice connects to the Stratus service, her local VideoIO instance gets a nearID, which looks like a random string. Alice can then publish a stream by name, say alice. When Bob wants to listen to Alice's stream, he also connects to the Stratus service and uses the play stream name, alice, in his remote VideoIO instance. Additionally, Bob sets the farID parameter to the nearID property of Alice's local VideoIO. Thus the listening side, Bob, needs to know both the stream name and the nearID of the publishing side, Alice's local VideoIO. Similarly, Bob gives his local nearID property and his publish stream name, bob, to Alice, so that Alice can use them as the farID and play parameters of her remote VideoIO in play mode.

For the illustrations in this section we use our own Stratus developer key to connect to the service. In practice you should obtain and use your own developer key. Suppose Alice wants to call Bob. As mentioned before, each end has two instances of the VideoIO application, one in local/publish mode and one in remote/play mode. First Alice picks her stream name, alice, and sets the src property of her local VideoIO to "rtmfp://stratus.rtmfp.net/d1..d9/?publish=alice". The developer key part of the URL is shortened for brevity, but in practice you will need to use the full developer key. Once Alice's local VideoIO is connected to the service, it gets the nearID property. You can capture this by defining an onPropertyChange JavaScript function in your embedding HTML page.

<script>
function onPropertyChange(event) {
    if (event.property == "nearID") {
        if (event.objectID == "video1") {
            // ... event.newValue is the "nearID" of
            // local VideoIO named "video1"
        }
    }
}
</script>

The nearID is typically a long random string that identifies this connection. Once Alice knows the publish stream name and the nearID of her local VideoIO, she sends this information, along with the service URL, to Bob. For example, she can send a single URL such as "rtmfp://stratus.rtmfp.net/d1..d9/?publish=alice&nearID=a3..ac". The nearID parameter value is shortened for brevity, but in practice you will need to use the full value. When Bob receives the URL, he replaces publish with play and nearID with farID, and sets the src property of his remote VideoIO application to "rtmfp://stratus.rtmfp.net/d1..d9/?play=alice&farID=a3..ac". The stream in the reverse direction is set up the same way. Bob picks his own stream name, bob, connects his local VideoIO application by setting its src property to "rtmfp://stratus.rtmfp.net/d1..d9/?publish=bob", obtains the nearID of the local VideoIO, and, to accept the call, sends Alice his URL as "rtmfp://stratus.rtmfp.net/d1..d9/?publish=bob&nearID=b3..bc". Alice replaces publish and nearID as before, and sets the src of her remote VideoIO to "rtmfp://stratus.rtmfp.net/d1..d9/?play=bob&farID=b3..bc". Now Alice and Bob are exchanging low-latency peer-to-peer media streams without going through a media server. To terminate the call, set src to null or "" on both VideoIO instances.
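The URL rewriting described above can be sketched as two small helper functions. This is only an illustration: makeInviteUrl and inviteToPlayUrl are hypothetical names, not part of the VideoIO API, and the sketch assumes that the stream name and nearID contain only URL-safe characters.

```javascript
// Hypothetical helpers; they are not part of the VideoIO API.

// Build the URL that the publishing side sends to the other end once its
// local VideoIO has reported a nearID.
function makeInviteUrl(serviceUrl, streamName, nearID) {
    return serviceUrl + "?publish=" + streamName + "&nearID=" + nearID;
}

// Convert a received invite URL into the src value for the remote (play)
// VideoIO: publish becomes play, and nearID becomes farID.
function inviteToPlayUrl(inviteUrl) {
    var parts = inviteUrl.split("?");
    var query = (parts[1] || "").split("&").map(function (pair) {
        var kv = pair.split("=");
        if (kv[0] === "publish") {
            kv[0] = "play";
        } else if (kv[0] === "nearID") {
            kv[0] = "farID";
        }
        return kv.join("=");
    });
    return parts[0] + "?" + query.join("&");
}
```

Bob would assign the result of inviteToPlayUrl to the src property of his remote VideoIO. Renaming the parameter keys, rather than doing a plain string replace on the whole URL, avoids touching a stream name that happens to contain the word "publish".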

The remaining challenge is to convey the URL containing the stream name and nearID to the other end of the call. In practice, people use a variety of methods, ranging from a CGI web service to JavaScript or a Jabber service, to exchange such information. Whatever mechanism you choose becomes the call initiation service of your system. In this tutorial we assume that the information is exchanged out-of-band by the individuals, e.g., over instant messaging or email.

stream, ID        Alice                  Bob
publish/local     alice, nearID=id1      bob, nearID=id2
play/remote       bob, farID=id2         alice, farID=id1
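The table can also be read as a small function that derives the four src values from each side's stream name and nearID. This is a sketch only: planCall and the shape of its arguments are hypothetical, and the service URL in the usage note is a placeholder rather than a real developer key.

```javascript
// Hypothetical helper; not part of the VideoIO API.
// alice and bob are objects of the form { stream: "...", nearID: "..." }.
function planCall(serviceUrl, alice, bob) {
    // src for a local (publish) VideoIO
    function publishSrc(party) {
        return serviceUrl + "?publish=" + party.stream;
    }
    // src for a remote (play) VideoIO that plays the given party's stream
    function playSrc(party) {
        return serviceUrl + "?play=" + party.stream + "&farID=" + party.nearID;
    }
    return {
        alice: { local: publishSrc(alice), remote: playSrc(bob) },
        bob: { local: publishSrc(bob), remote: playSrc(alice) }
    };
}
```

For example, planCall("rtmfp://example.invalid/key/", { stream: "alice", nearID: "id1" }, { stream: "bob", nearID: "id2" }) gives Bob's remote VideoIO the play parameter alice and the farID id1, matching the last row of the table.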

The following user interface embeds two VideoIO instances. The left one is for local video and the right one is for remote video. To try the demonstration, open this page in two browser instances, either on two different machines or on the same machine. In the first browser, keep the default publish stream name "alice" and play stream name "bob". In the second, swap the stream names so that publish is "bob" and play is "alice". First, click the set button of the left VideoIO in the first browser to publish the local stream. This populates the nearID property. Copy that value from the text input box and paste it into the farID text input box of the right VideoIO in the second browser. Then click the set button of that right VideoIO to play the stream. Similarly, in the second browser, click the set button of the left VideoIO to publish its stream. This populates the nearID property, which you copy and paste into the farID text input box of the right VideoIO in the first browser before clicking its set button. You now have a two-party peer-to-peer video call between the two browser instances.

[Interactive demonstration: left VideoIO with ?publish= and nearID= fields; right VideoIO with ?play= and farID= fields]

The source code for this page is similar to that of the earlier examples. The main difference is that it uses nested object and embed tags instead of a single object tag, so that ExternalInterface can invoke the onPropertyChange event handler from the VideoIO application in the JavaScript of the HTML page. Please see How to embed VideoIO in your web page? for more details. You can also right-click and select "View Page Source" or the equivalent menu option to see the source code.

Earlier, for the client-server video call, you could set the src properties in any order as long as you decided on the stream names beforehand. For example, you could set the play src before the corresponding publish src was set on the other end, and it still worked. Now, however, the nearID property must be transferred to the other end along with the stream name. Since the nearID property is generated dynamically every time you connect to the Stratus service, the order in which you set the src properties is constrained. You MUST first set the src property of the local/publish VideoIO and get its nearID; only then can you set the other end's src, with its play and farID parameters, to play that stream.

Information: If you reset src of the publish VideoIO, and then set src again to the same value, it still generates a new nearID, which needs to be sent again to all the playing streams.
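One way to respect this ordering is to send the invite URL only from inside the property-change handler, and to send it again whenever a new nearID appears. The sketch below assumes the local VideoIO has the object ID "video1", as in the earlier example. createNearIDTracker and sendInvite are hypothetical names, and skipping empty values is a defensive guess, since the value reported while disconnected is not specified here.

```javascript
// Hypothetical helper; not part of the VideoIO API.
// Returns a handler with the same signature as onPropertyChange. It calls
// sendInvite(url) each time the local VideoIO reports a new, non-empty nearID.
function createNearIDTracker(serviceUrl, streamName, sendInvite) {
    var current = null;
    return function (event) {
        // Only react to nearID changes of the local (publish) instance.
        if (event.objectID !== "video1" || event.property !== "nearID") {
            return;
        }
        // Ignore empty values (assumed to mean "not connected") and repeats.
        if (!event.newValue || event.newValue === current) {
            return;
        }
        current = event.newValue;
        sendInvite(serviceUrl + "?publish=" + streamName + "&nearID=" + current);
    };
}
```

In the embedding page you could assign the returned function to the global onPropertyChange and let sendInvite hand the URL to whatever call initiation mechanism you use. Because every new nearID triggers a fresh invite, resetting src on the publish side automatically produces an updated URL for the playing side.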

Summary

In summary, you can use the src property with URL parameter containing publish for your stream and play for remote stream. The recording and playing properties control the current state of whether your video is transmitted or whether remote video is played.

This tutorial is just the beginning of how to do peer-to-peer video calls. The peer-to-peer mode can easily be extended to one-to-many video broadcast as well as multi-party conference by combining what you learned in this tutorial with How to do one-to-many video broadcast? and How to do multi-party video conference?. However, we do not recommend using peer-to-peer mode for multi-party or broadcast use cases in VideoIO, because it consumes a large amount of uplink bandwidth on users' computers -- proportional to the number of other participants in a multi-party conference, or to the number of listeners for a broadcasting user. In the future, we will extend VideoIO to include the peer-to-peer groups available in Flash Player 10.1 or later.
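The bandwidth concern is simple arithmetic: without peer-to-peer groups, a publisher sends a separate copy of its stream to every receiver. The sketch below uses hypothetical function names, and any bitrate you pass in is your own estimate, not a value measured from VideoIO.

```javascript
// Uplink needed by one publisher that sends a separate peer-to-peer copy
// of its stream to every receiver.
function uplinkKbps(perStreamKbps, receivers) {
    return perStreamKbps * receivers;
}

// In a full-mesh conference each participant sends to everyone else.
function conferenceUplinkKbps(perStreamKbps, participants) {
    return uplinkKbps(perStreamKbps, participants - 1);
}
```

For instance, assuming an illustrative 300 kbps per stream, a five-party conference needs 4 x 300 = 1200 kbps of uplink from each participant, and a broadcast to 20 listeners needs 6000 kbps from the broadcaster.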

The VideoIO API is so simple that just setting the src property allows you to accomplish several use cases. There are several other properties that affect behavior of a call, e.g., you can control the camera quality or sound mute. Additionally, you can build other advanced user interface controls in JavaScript similar to how VideoIO's control panel shows in Flash.

The local and remote instances display video differently. The publish mode displays the live camera view. In VideoIO the live camera view is always flipped horizontally so that it appears as if you are looking in a mirror, whereas the actual media stream sent out for recording or re-distribution appears as if the camera is looking at you. We feel that this gives the most natural behavior for live video chat or message recording.

Finally, you must use nested object and embed tags or SWFobject.js instead of a single object tag for enabling peer-to-peer video call because you need to dynamically listen for change in the nearID property using onPropertyChange event handler, and convey it to the other end. The event handler works well only with nested tags.