$ mkdir flask-twilio-video
$ cd flask-twilio-video
$ mkdir static
$ mkdir templates
TWILIO_ACCOUNT_SID=<your-twilio-account-sid>

TWILIO_ACCOUNT_SID=<your-twilio-account-sid>
TWILIO_API_KEY_SID=<your-twilio-api-key-sid>
TWILIO_API_KEY_SECRET=<your-twilio-api-key-secret>
$ python -m venv venv
$ source venv/bin/activate
(venv) $ pip install twilio flask python-dotenv

$ python -m venv venv
$ venv\Scripts\activate
(venv) $ pip install twilio flask python-dotenv
The last command uses pip, the Python package installer, to install the three Python packages that we are going to use in this project: twilio, flask and python-dotenv. The versions of these packages and their dependencies are:

certifi==2020.4.5.1
chardet==3.0.4
click==7.1.1
Flask==1.1.2
idna==2.9
itsdangerous==1.1.0
Jinja2==2.11.2
MarkupSafe==1.1.1
PyJWT==1.7.1
python-dotenv==0.12.0
pytz==2019.3
requests==2.23.0
six==1.14.0
twilio==6.38.1
urllib3==1.25.8
Werkzeug==1.0.1
from flask import Flask, render_template

app = Flask(__name__)


@app.route('/')
def index():
    return render_template('index.html')
The app variable is called the “application instance”. Its purpose is to provide the support functions we need to implement our web server. We use the app.route decorator to define a mapping between URLs and Python functions. In this particular example, when a client requests the root URL for our server, Flask will run our index() function and expect it to provide the response. The implementation of our index() function renders an index.html file that we are yet to write. This file is going to contain the HTML definition of the main and only web page of our video chat application.

(venv) $ FLASK_ENV=development flask run

(venv) $ set FLASK_ENV=development
(venv) $ flask run

 * Environment: development
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 274-913-316
<!doctype html>
<html>
  <head>
    <link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='styles.css') }}">
  </head>
  <body>
    <h1>Flask & Twilio Video Conference</h1>
    <form>
      Name: <input type="text" id="username">
      <button id="join_leave">Join call</button>
    </form>
    <p id="count"></p>
    <div id="container" class="container">
      <div id="local" class="participant"><div></div><div>Me</div></div>
      <!-- more participants will be added dynamically here -->
    </div>
    <script src="//media.twiliocdn.com/sdk/js/video/releases/2.3.0/twilio-video.min.js"></script>
    <script src="{{ url_for('static', filename='app.js') }}"></script>
  </body>
</html>
The <head> section of this file references a styles.css file. We are using the url_for() function from Flask to generate the correct URL for it. This is nice, because all we need to do is put the file in the static directory and let Flask generate the URL. If you were wondering what the difference is between a template file and a static file, this is exactly it: template files can have placeholders that are filled in dynamically when the render_template() function you’ve seen above runs.

The <body> section of the page defines the following elements:

- An <h1> title
- A <form> element with name field and submit button
- A <p> element where we’ll show connection status and participant count
- A container <div> with one participant identified with the name local, where we’ll show our own video feed. More participants will be added dynamically as they join the video call

Each participant <div> contains an empty <div> where the video will be displayed and a second <div> where we’ll display the name.

.container {
    margin-top: 20px;
    width: 100%;
    display: flex;
    flex-wrap: wrap;
}
.participant {
    margin-bottom: 5px;
    margin-right: 5px;
}
.participant div {
    text-align: center;
}
.participant div:first-child {
    width: 240px;
    height: 180px;
    background-color: #ccc;
    border: 1px solid black;
}
.participant video {
    width: 100%;
    height: 100%;
}
The .container definition applies to the container <div> element, which is structured as a flexbox so that participants are automatically added to the right and wrapped to the next line as needed according to the size of the browser window.

The .participant div:first-child definition applies to the first child element of any <div> elements that have the participant class. Here we are constraining the size of the video to 240x180 pixels. We also have a darker background and a black border, just so that we can see a placeholder for the video window. The background is also going to be useful when the dimensions of the video do not exactly match our aspect ratio. Feel free to adjust these options to your liking.

function addLocalVideo() {
    Twilio.Video.createLocalVideoTrack().then(track => {
        var video = document.getElementById('local').firstChild;
        video.appendChild(track.attach());
    });
};

addLocalVideo();
The addLocalVideo() function uses the Twilio Programmable Video JavaScript library to create a local video track. The createLocalVideoTrack() function from the library is asynchronous and returns a promise, so we use the then() method to add some logic in a callback function after the video track is created.

In the callback we attach the video track to the first <div> child of the local element. In case this is confusing, let’s review the structure of the local participant from the index.html file:

<div id="local" class="participant"><div></div><div>Me</div></div>

The local element has two <div> elements as children. The first is empty, and this is the element to which we are attaching the video. The second <div> is for the label that appears below the video.

import os
from dotenv import load_dotenv
from flask import Flask, render_template, request, abort
from twilio.jwt.access_token import AccessToken
from twilio.jwt.access_token.grants import VideoGrant

load_dotenv()
twilio_account_sid = os.environ.get('TWILIO_ACCOUNT_SID')
twilio_api_key_sid = os.environ.get('TWILIO_API_KEY_SID')
twilio_api_key_secret = os.environ.get('TWILIO_API_KEY_SECRET')

app = Flask(__name__)


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/login', methods=['POST'])
def login():
    username = request.get_json(force=True).get('username')
    if not username:
        abort(401)

    token = AccessToken(twilio_account_sid, twilio_api_key_sid,
                        twilio_api_key_secret, identity=username)
    token.add_grant(VideoGrant(room='My Room'))

    # with the package versions listed earlier, to_jwt() returns bytes
    return {'token': token.to_jwt().decode()}
We use the load_dotenv() function from the python-dotenv package to import the Twilio secrets, and then we assign them to variables for convenience.

The token is generated with the AccessToken helper class from the Twilio Python Helper library. We attach a video grant for a video room called “My Room”. A more complex application can work with more than one video room and decide which room or rooms this user can enter.

The response is a JSON payload in the following format:

{
    "token": "the-token-goes-here"
}
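Seen from the browser, the exchange with the /login route is a JSON body going out and a JSON body coming back. The sketch below shows both ends without any network traffic, so it runs under Node. The helper names buildLoginRequest() and parseLoginResponse() are hypothetical and are not part of the application; the sample response simply reuses the placeholder payload shown above.

```javascript
// Hypothetical helper: builds the options object that would be passed
// as the second argument of fetch('/login', ...).
function buildLoginRequest(username) {
    return {
        method: 'POST',
        body: JSON.stringify({username: username}),
    };
}

// Hypothetical helper: extracts the token from the text of a /login response.
function parseLoginResponse(text) {
    const data = JSON.parse(text);
    if (typeof data.token !== 'string')
        throw new Error('response does not contain a token');
    return data.token;
}

const request = buildLoginRequest('susan');
console.log(request.body);
console.log(parseLoginResponse('{"token": "the-token-goes-here"}'));
```

Because the Flask route reads the body with get_json(force=True), the request does not need to declare a JSON content type for the username to be parsed.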
In the front end we need to handle the button’s click event. The changes to static/app.js are shown below.

var connected = false;
const usernameInput = document.getElementById('username');
const button = document.getElementById('join_leave');
const container = document.getElementById('container');
const count = document.getElementById('count');
var room;

function addLocalVideo() { /* no changes in this function */ };

function connectButtonHandler(event) {
    event.preventDefault();
    if (!connected) {
        var username = usernameInput.value;
        if (!username) {
            alert('Enter your name before connecting');
            return;
        }
        button.disabled = true;
        button.innerHTML = 'Connecting...';
        connect(username).then(() => {
            button.innerHTML = 'Leave call';
            button.disabled = false;
        }).catch(() => {
            alert('Connection failed. Is the backend running?');
            button.innerHTML = 'Join call';
            button.disabled = false;
        });
    }
    else {
        disconnect();
        button.innerHTML = 'Join call';
        connected = false;
    }
};

addLocalVideo();
button.addEventListener('click', connectButtonHandler);
The connected boolean tracks the state of the connection, mainly to help decide if a button click needs to connect or disconnect. The room variable will hold the video chat room object once we have it.

We attach the connectButtonHandler() function to the click event on the form button. The function is somewhat long, but it mostly deals with validating that the user entered a name and updating how the button looks as the state of the connection changes. If you filter out the form management you can see that the actual connection and disconnection are handled by two functions, connect() and disconnect(), that we are going to write in the following sections.

var connected = false;
const usernameInput = document.getElementById('username');
const button = document.getElementById('join_leave');
const container = document.getElementById('container');
const count = document.getElementById('count');
var room;

function addLocalVideo() { /* no changes in this function */ };

function connectButtonHandler(event) { /* no changes in this function */ };

function connect(username) {
    var promise = new Promise((resolve, reject) => {
        // get a token from the back end
        fetch('/login', {
            method: 'POST',
            body: JSON.stringify({'username': username})
        }).then(res => res.json()).then(data => {
            // join video call
            Twilio.Video.connect(data.token).then(_room => {
                room = _room;
                room.participants.forEach(participantConnected);
                room.on('participantConnected', participantConnected);
                room.on('participantDisconnected', participantDisconnected);
                connected = true;
                updateParticipantCount();
                resolve();
            }).catch(() => {
                reject();
            });
        }).catch(() => {
            reject();
        });
    });
    return promise;
};
The connect() function returns a promise, which the caller can use to attach actions to be performed once the connection is established, or to handle errors. Internally, the promise outcome is controlled via the resolve() and reject() functions that are passed as arguments into the executor function given to the Promise() constructor. You can see calls to these functions sprinkled throughout the connection logic. A call to resolve() will trigger the caller’s success callback, while a call to reject() will do the same for the error callback.

The connection logic uses the fetch() function to send a request to the /login route in the Flask application that we created above. This function returns a promise as well, so we use the then(...).catch(...) handlers to provide success and failure callbacks.

If the call fails, we call reject() to fail our own promise. If the call succeeds, we decode the JSON payload into the data variable and then call the connect() function from the twilio-video library, passing our newly acquired token.

If that connection succeeds, we need to update the container part of the page to reflect that.

The success callback receives a _room argument, which represents the video room. Since this is a useful variable, we assign _room to the global variable room, so that the rest of the application can access this room when needed.

The room.participants map contains the people already in the call. For each of these we have to add a <div> section that shows the video and the name. This is all encapsulated in the participantConnected() function, so we invoke it for each participant. We also want any future participants to be handled in the same way, so we set up a handler for the participantConnected event pointing to the same function. The participantDisconnected event is also important, as we’d want to remove any participants that leave the call, so we set up a handler for this event as well.

At this point we are connected, so we record that in the connected boolean variable. The final action we take is to update the <p> element that shows the connection status to show the participant count. This is done in a separate function because we’ll need to do this in several places. The function updates the text of the element based on the size of the room.participants map. Add the implementation of this function to static/app.js.

function updateParticipantCount() {
    if (!connected)
        count.innerHTML = 'Disconnected.';
    else
        count.innerHTML = (room.participants.size + 1) + ' participants online.';
};
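The way connect() settles its own promise through resolve() and reject() can be tried in isolation. The sketch below is not part of the application: fakeFetchToken() and fakeJoinRoom() are hypothetical stand-ins for the fetch('/login') request and the Twilio.Video.connect() call, so that the pattern runs under Node without a browser or a Twilio account.

```javascript
// Hypothetical stand-in for fetch('/login'): resolves with a token payload,
// or rejects when no username is given.
function fakeFetchToken(username) {
    return username
        ? Promise.resolve({token: 'token-for-' + username})
        : Promise.reject(new Error('missing username'));
}

// Hypothetical stand-in for Twilio.Video.connect(): resolves with a
// room-like object.
function fakeJoinRoom(token) {
    return Promise.resolve({name: 'My Room', token: token, participants: new Map()});
}

// Same shape as connect(): the outer promise settles only when we call
// resolve() or reject() ourselves.
function connectSketch(username) {
    return new Promise((resolve, reject) => {
        fakeFetchToken(username).then(data => {
            fakeJoinRoom(data.token).then(room => {
                resolve(room);
            }).catch(() => {
                reject(new Error('could not join the room'));
            });
        }).catch(() => {
            reject(new Error('could not get a token'));
        });
    });
}

connectSketch('susan').then(room => console.log('joined', room.name));
connectSketch('').catch(err => console.log('failed:', err.message));
```

The caller only sees one promise, regardless of which inner step failed, which is what lets connectButtonHandler() use a single then(...).catch(...) pair.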
Note that the room.participants map includes every participant except ourselves, so the total number of people in a call is always one more than the size of the map.

Next we need to write the participantConnected handler. This function needs to create a new <div> inside the container element, following the same structure we used for the local element that shows our own video stream.

Below is the participantConnected() function along with the participantDisconnected() counterpart and a few auxiliary functions, all of which also go in static/app.js.

function participantConnected(participant) {
    var participant_div = document.createElement('div');
    participant_div.setAttribute('id', participant.sid);
    participant_div.setAttribute('class', 'participant');

    var tracks_div = document.createElement('div');
    participant_div.appendChild(tracks_div);

    var label_div = document.createElement('div');
    label_div.innerHTML = participant.identity;
    participant_div.appendChild(label_div);

    container.appendChild(participant_div);

    participant.tracks.forEach(publication => {
        if (publication.isSubscribed)
            trackSubscribed(tracks_div, publication.track);
    });
    participant.on('trackSubscribed', track => trackSubscribed(tracks_div, track));
    participant.on('trackUnsubscribed', trackUnsubscribed);

    updateParticipantCount();
};

function participantDisconnected(participant) {
    document.getElementById(participant.sid).remove();
    updateParticipantCount();
};

function trackSubscribed(div, track) {
    div.appendChild(track.attach());
};

function trackUnsubscribed(track) {
    track.detach().forEach(element => element.remove());
};
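The counting rule behind updateParticipantCount(), which these handlers call, can be checked without joining a call. This sketch assumes that room.participants behaves like a JavaScript Map keyed by participant SID, which is consistent with the .size and .forEach() calls used in this tutorial. The fakeRoom object, its SIDs and the participantCountText() helper are made up for illustration.

```javascript
// Hypothetical stand-in for a connected room with two remote participants.
const fakeRoom = {
    participants: new Map([
        ['PA-remote-1', {identity: 'alice'}],
        ['PA-remote-2', {identity: 'bob'}],
    ]),
};

// Mirrors updateParticipantCount(), but returns the text instead of
// writing it to the page.
function participantCountText(isConnected, room) {
    if (!isConnected)
        return 'Disconnected.';
    // the local participant is not in the map, so add one for ourselves
    return (room.participants.size + 1) + ' participants online.';
}

console.log(participantCountText(true, fakeRoom));
console.log(participantCountText(false, fakeRoom));
```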
The participantConnected() callback receives a Participant object from the twilio-video library. The two important properties of this object are participant.sid and participant.identity, which are a unique session identifier and name respectively. The identity attribute comes directly from the token we generated. Recall that we passed identity=username in our Python token generation function.

The HTML structure that we need to create for each participant is the following:

<div id="{{ participant.sid }}" class="participant">
    <div></div>  <!-- the video and audio tracks will be attached to this div -->
    <div>{{ participant.identity }}</div>
</div>
In the participantConnected() function you can see that we create a participant_div, to which we add a tracks_div and a label_div as children. We finally add the participant_div as a child of container, which is the top-level <div> element where we have all the participants of the call.

Next we attach the participant’s tracks to the tracks_div element we just created. We run a loop through all the tracks the participant publishes, and following the basic usage shown in the library’s documentation we attach those to which we are subscribed. The actual track attachment is handled in a trackSubscribed() auxiliary function that is defined right below.

We also set up handlers for the trackSubscribed and trackUnsubscribed events, which use the attach() and detach() methods of the track object to add and remove the HTML elements that carry the feeds.

The counterpart of the connect() function is disconnect(), which has to restore the state of the page to how it was before connecting. This is a lot simpler, as it mostly involves removing all the children of the container element except the first one, which is our local video stream.

function disconnect() {
    room.disconnect();
    while (container.lastChild.id != 'local')
        container.removeChild(container.lastChild);
    button.innerHTML = 'Join call';
    connected = false;
    updateParticipantCount();
};
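The clean-up loop in disconnect() can be exercised without a browser. The sketch below uses a FakeElement class, a hypothetical stand-in that implements only the lastChild, appendChild() and removeChild() members the loop relies on; it is not a real DOM and the participant ids are made up.

```javascript
// Minimal stand-in for a DOM element, just enough for the loop below.
class FakeElement {
    constructor(id) {
        this.id = id;
        this.children = [];
    }
    get lastChild() {
        return this.children[this.children.length - 1];
    }
    appendChild(child) {
        this.children.push(child);
        return child;
    }
    removeChild(child) {
        this.children.splice(this.children.indexOf(child), 1);
        return child;
    }
}

// A container holding the local participant plus two remote ones.
const fakeContainer = new FakeElement('container');
fakeContainer.appendChild(new FakeElement('local'));
fakeContainer.appendChild(new FakeElement('PA-remote-1'));
fakeContainer.appendChild(new FakeElement('PA-remote-2'));

// Same loop as in disconnect(): strip children from the end until the
// local participant is the last one left.
while (fakeContainer.lastChild.id != 'local')
    fakeContainer.removeChild(fakeContainer.lastChild);

console.log(fakeContainer.children.map(child => child.id));
```

The loop depends on the local element existing inside the container. If it were missing, lastChild would eventually be undefined and the id lookup would throw.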
The loop removes the children of the container element one by one until it reaches the <div> with the id local, which is the one that we created statically in the index.html page. We also use the opportunity to update our connected global variable, change the text of the connect button and refresh the <p> element to show a “Disconnected” message.

(venv) $ FLASK_ENV=development flask run

(venv) $ set FLASK_ENV=development
(venv) $ flask run
$ ngrok http 5000
Look at the Forwarding lines to see the public URL that ngrok assigned to your server. Use the one that starts with https://, since many browsers do not allow unencrypted sites to access the camera and the microphone. In this example, the public URL is https://bbf1b72b.ngrok.io. Yours is going to be similar, but the first component of the domain is going to be different every time you start ngrok.