How to Integrate Active Speaker Indication in JavaScript Video Call App?

Introduction

Integrating Active Speaker Indication in a JavaScript video chat app enhances user experience by highlighting the participant currently speaking. This functionality visually distinguishes the active speaker, improving communication flow in group calls. Through JavaScript, real-time audio analysis detects sound levels and determines the speaker. Visual cues, such as highlighting their video feed or displaying a speaker icon, notify participants of the active speaker.

Benefits of Integrating Active Speaker Indication:

Improved Communication Flow: Participants can easily identify the active speaker, leading to smoother conversations and reduced interruptions.
Enhanced User Experience: Active speaker indication adds a layer of interactivity, making the video chat app more engaging and user-friendly.
Increased Engagement: Visual cues encourage active participation and attentiveness among users, fostering more meaningful interactions.
Reduced Confusion: With clear visual indicators, users can avoid confusion about who's speaking, leading to more efficient communication.

Use Cases of Integrating Active Speaker Indication:

Remote Work: In remote work scenarios, active speaker indication ensures smooth communication during team meetings, allowing members to follow discussions more easily.
Online Education: In virtual classrooms, teachers can use active speaker indication to monitor student participation and facilitate discussions effectively.
Customer Support: Active speaker indication in customer support video calls helps agents to know when customers are speaking, improving response times and service quality.

This tutorial guides you through integrating this valuable feature into your JavaScript video call application using VideoSDK. We'll cover the steps required to leverage VideoSDK's capabilities and implement visual cues that highlight the active speaker within your app's interface.

Getting Started with VideoSDK

To take advantage of Active Speaker Indication functionality, we must use the capabilities that the VideoSDK offers. Before diving into the implementation steps, ensure you complete the necessary prerequisites.

Create a VideoSDK Account

Go to your VideoSDK dashboard and sign up if you don't have an account. This account gives you access to the required Video SDK token, which acts as an authentication key that allows your application to interact with VideoSDK functionality.

Generate your Auth Token

Visit your VideoSDK dashboard and navigate to the "API Key" section to generate your auth token. This token is crucial in authorizing your application to use VideoSDK features. consider referring to the provided tutorial for a more visual understanding of the account creation and token generation process.

Prerequisites

Before proceeding, ensure that your development environment meets the following requirements:

VideoSDK Developer Account (if you do not have one, follow VideoSDK Dashboard)
Have Node and NPM installed on your device.

⬇️ Install VideoSDK

Import VideoSDK using the <script> tag or install it using the following npm command. Make sure you are in your app directory before you run this command.

<html>
  <head>
    <!--.....-->
  </head>
  <body>
    <!--.....-->
    <script src="https://sdk.videosdk.live/js-sdk/0.0.83/videosdk.js"></script>
  </body>
</html>

npm

npm install @videosdk.live/js-sdk

Yarn

yarn add @videosdk.live/js-sdk

Structure of the project

Your project structure should look like this.

  root
   ├── index.html
   ├── config.js
   ├── index.js

You will be working on the following files:

index.html: Responsible for creating a basic UI.
config.js: Responsible for storing the token.
index.js: Responsible for rendering the meeting view and the join meeting functionality.

Essential Steps to Implement Video Call Functionality

Once you've successfully installed VideoSDK in your project, you'll have access to a range of functionalities for building your video call application. Active Speaker Indication is one such feature that leverages VideoSDK's capabilities. It leverages VideoSDK's capabilities to identify the user with the strongest audio signal (the one speaking)

Step 1: Design the user interface (UI)

Create an HTML file containing the screens, join-screen and grid-screen.

<!DOCTYPE html>
<html>
  <head> </head>

  <body>
    <div id="join-screen">
      <!-- Create new Meeting Button -->
      <button id="createMeetingBtn">New Meeting</button>
      OR
      <!-- Join existing Meeting -->
      <input type="text" id="meetingIdTxt" placeholder="Enter Meeting id" />
      <button id="joinBtn">Join Meeting</button>
    </div>

    <!-- for Managing meeting status -->
    <div id="textDiv"></div>

    <div id="grid-screen" style="display: none">
      <!-- To Display MeetingId -->
      <h3 id="meetingIdHeading"></h3>

      <!-- Controllers -->
      <button id="leaveBtn">Leave</button>
      <button id="toggleMicBtn">Toggle Mic</button>
      <button id="toggleWebCamBtn">Toggle WebCam</button>

      <!-- render Video -->
      <div class="row" id="videoContainer"></div>
    </div>

    <!-- Add VideoSDK script -->
    <script src="https://sdk.videosdk.live/js-sdk/0.0.83/videosdk.js"></script>
    <script src="config.js"></script>
    <script src="index.js"></script>
  </body>
</html>

Step 2: Implement Join Screen

Configure the token in the config.js file, which you can obtain from the VideoSDK Dashbord.

// Auth token will be used to generate a meeting and connect to it
TOKEN = "Your_Token_Here";

Next, retrieve all the elements from the DOM and declare the following variables in the index.js file. Then, add an event listener to the join and create meeting buttons.

// Getting Elements from DOM
const joinButton = document.getElementById("joinBtn");
const leaveButton = document.getElementById("leaveBtn");
const toggleMicButton = document.getElementById("toggleMicBtn");
const toggleWebCamButton = document.getElementById("toggleWebCamBtn");
const createButton = document.getElementById("createMeetingBtn");
const videoContainer = document.getElementById("videoContainer");
const textDiv = document.getElementById("textDiv");

// Declare Variables
let meeting = null;
let meetingId = "";
let isMicOn = false;
let isWebCamOn = false;

function initializeMeeting() {}

function createLocalParticipant() {}

function createVideoElement() {}

function createAudioElement() {}

function setTrack() {}

// Join Meeting Button Event Listener
joinButton.addEventListener("click", async () => {
  document.getElementById("join-screen").style.display = "none";
  textDiv.textContent = "Joining the meeting...";

  roomId = document.getElementById("meetingIdTxt").value;
  meetingId = roomId;

  initializeMeeting();
});

// Create Meeting Button Event Listener
createButton.addEventListener("click", async () => {
  document.getElementById("join-screen").style.display = "none";
  textDiv.textContent = "Please wait, we are joining the meeting";

  // API call to create meeting
  const url = `https://api.videosdk.live/v2/rooms`;
  const options = {
    method: "POST",
    headers: { Authorization: TOKEN, "Content-Type": "application/json" },
  };

  const { roomId } = await fetch(url, options)
    .then((response) => response.json())
    .catch((error) => alert("error", error));
  meetingId = roomId;

  initializeMeeting();
});

Step 3: Initialize Meeting

Following that, initialize the meeting using the initMeeting() function and proceed to join the meeting.

// Initialize meeting
function initializeMeeting() {
  window.VideoSDK.config(TOKEN);

  meeting = window.VideoSDK.initMeeting({
    meetingId: meetingId, // required
    name: "Thomas Edison", // required
    micEnabled: true, // optional, default: true
    webcamEnabled: true, // optional, default: true
  });

  meeting.join();

  // Creating local participant
  createLocalParticipant();

  // Setting local participant stream
  meeting.localParticipant.on("stream-enabled", (stream) => {
    setTrack(stream, null, meeting.localParticipant, true);
  });

  // meeting joined event
  meeting.on("meeting-joined", () => {
    textDiv.style.display = "none";
    document.getElementById("grid-screen").style.display = "block";
    document.getElementById(
      "meetingIdHeading"
    ).textContent = `Meeting Id: ${meetingId}`;
  });

  // meeting left event
  meeting.on("meeting-left", () => {
    videoContainer.innerHTML = "";
  });

  // Remote participants Event
  // participant joined
  meeting.on("participant-joined", (participant) => {
    //  ...
  });

  // participant left
  meeting.on("participant-left", (participant) => {
    //  ...
  });
}

Step 4: Create the Media Elements

In this step, Create a function to generate audio and video elements for displaying both local and remote participants. Set the corresponding media track based on whether it's a video or audio stream.

// creating video element
function createVideoElement(pId, name) {
  let videoFrame = document.createElement("div");
  videoFrame.setAttribute("id", `f-${pId}`);
  videoFrame.style.width = "300px";
    

  //create video
  let videoElement = document.createElement("video");
  videoElement.classList.add("video-frame");
  videoElement.setAttribute("id", `v-${pId}`);
  videoElement.setAttribute("playsinline", true);
  videoElement.setAttribute("width", "300");
  videoFrame.appendChild(videoElement);

  let displayName = document.createElement("div");
  displayName.innerHTML = `Name : ${name}`;

  videoFrame.appendChild(displayName);
  return videoFrame;
}

// creating audio element
function createAudioElement(pId) {
  let audioElement = document.createElement("audio");
  audioElement.setAttribute("autoPlay", "false");
  audioElement.setAttribute("playsInline", "true");
  audioElement.setAttribute("controls", "false");
  audioElement.setAttribute("id", `a-${pId}`);
  audioElement.style.display = "none";
  return audioElement;
}

// creating local participant
function createLocalParticipant() {
  let localParticipant = createVideoElement(
    meeting.localParticipant.id,
    meeting.localParticipant.displayName
  );
  videoContainer.appendChild(localParticipant);
}

// setting media track
function setTrack(stream, audioElement, participant, isLocal) {
  if (stream.kind == "video") {
    isWebCamOn = true;
    const mediaStream = new MediaStream();
    mediaStream.addTrack(stream.track);
    let videoElm = document.getElementById(`v-${participant.id}`);
    videoElm.srcObject = mediaStream;
    videoElm
      .play()
      .catch((error) =>
        console.error("videoElem.current.play() failed", error)
      );
  }
  if (stream.kind == "audio") {
    if (isLocal) {
      isMicOn = true;
    } else {
      const mediaStream = new MediaStream();
      mediaStream.addTrack(stream.track);
      audioElement.srcObject = mediaStream;
      audioElement
        .play()
        .catch((error) => console.error("audioElem.play() failed", error));
    }
  }
}

Step 5: Handle participant events

Thereafter, implement the events related to the participants and the stream.

The following are the events to be executed in this step:

participant-joined: When a remote participant joins, this event will trigger. In the event callback, create video and audio elements previously defined for rendering their video and audio streams.
participant-left: When a remote participant leaves, this event will trigger. In the event callback, remove the corresponding video and audio elements.
stream-enabled: This event manages the media track of a specific participant by associating it with the appropriate video or audio element.

// Initialize meeting
function initializeMeeting() {
  // ...

  // participant joined
  meeting.on("participant-joined", (participant) => {
    let videoElement = createVideoElement(
      participant.id,
      participant.displayName
    );
    let audioElement = createAudioElement(participant.id);
    // stream-enabled
    participant.on("stream-enabled", (stream) => {
      setTrack(stream, audioElement, participant, false);
    });
    videoContainer.appendChild(videoElement);
    videoContainer.appendChild(audioElement);
  });

  // participants left
  meeting.on("participant-left", (participant) => {
    let vElement = document.getElementById(`f-${participant.id}`);
    vElement.remove(vElement);

    let aElement = document.getElementById(`a-${participant.id}`);
    aElement.remove(aElement);
  });
}

Step 6: Implement Controls

Next, implement the meeting controls such as toggleMic, toggleWebcam, and leave the meeting.

// leave Meeting Button Event Listener
leaveButton.addEventListener("click", async () => {
  meeting?.leave();
  document.getElementById("grid-screen").style.display = "none";
  document.getElementById("join-screen").style.display = "block";
});

// Toggle Mic Button Event Listener
toggleMicButton.addEventListener("click", async () => {
  if (isMicOn) {
    // Disable Mic in Meeting
    meeting?.muteMic();
  } else {
    // Enable Mic in Meeting
    meeting?.unmuteMic();
  }
  isMicOn = !isMicOn;
});

// Toggle Web Cam Button Event Listener
toggleWebCamButton.addEventListener("click", async () => {
  if (isWebCamOn) {
    // Disable Webcam in Meeting
    meeting?.disableWebcam();

    let vElement = document.getElementById(`f-${meeting.localParticipant.id}`);
    vElement.style.display = "none";
  } else {
    // Enable Webcam in Meeting
    meeting?.enableWebcam();

    let vElement = document.getElementById(`f-${meeting.localParticipant.id}`);
    vElement.style.display = "inline";
  }
  isWebCamOn = !isWebCamOn;
});

You can check out the complete.

Integrate Active Speaker Indication

The Active Speaker Indication feature allows you to identify the participant who is currently the active speaker in a meeting. This feature proves especially valuable in larger conferences or webinars, where numerous participants can make it challenging to identify the active speaker.

Whenever any participant speaks in a meeting, the onSpeakerChanged event will trigger, providing the participant ID of the active speaker.

For example, the meeting is running with Alice and Bob. Whenever any of them speaks, onSpeakerChanged the event will trigger and return the speaker's participantId.

let meeting;

// Initialize Meeting
meeting = VideoSDK.initMeeting({
  // ...
});

meeting.on("speaker-changed", (activeSpeakerId) => {
    console.log("Active Speaker", activeSpeakerId);
    if (activeSpeakerId != null) {

      // To check if there was any previous active participant
      if (previousActiveSpeaker) {
        var previousDivElement = document.getElementById(`f-${previousActiveSpeaker}`);
        // To check if the previous active participant video element is still present or not
        if(previousDivElement){
          previousDivElement.style.webkitBoxShadow = '';
          previousDivElement.style.mozBoxShadow = '';
          previousDivElement.style.boxShadow = '';
        }
      }

      // Apply box shadow to the current active speaker
      var currentDivElement = document.getElementById(`f-${activeSpeakerId}`);
      // To check if the active participant video element is still present or not
      if(currentDivElement){
        currentDivElement.style.webkitBoxShadow = '0 0 20px blue';
        currentDivElement.style.mozBoxShadow = '0 0 20px blue';
        currentDivElement.style.boxShadow = '0 0 20px blue';
      }

      // Update the previous active speaker ID
      previousActiveSpeaker = activeSpeakerId;
    }
  })
}

✨ Want to Add More Features to JavaScript Video Calling App?

If you found this guide helpful and want to explore more features for your JavaScript video-calling app, check out these additional resources:

Image Capture: Link
Screen Share Feature: Link
Picture-in-Picture (PiP) Mode: Link
RTMP Livestream: Link

Conclusion

You have successfully integrated active speaker indication. From this, you'll significantly improve your video call app, and users will appreciate the clarity and reduce confusion during conversations, leading to a more engaging and productive video conferencing experience.

If you are new here and want to build an interactive JavaScript app with free resources, you can Sign up with VideoSDK and get ? 10000 free minutes every month. This will help your new video-calling app go to the next level without any costs associated with initial usage, allowing you to focus on building and scaling your application effectively.