Object Detection for Face Focus Detection in Video Calls

Developed an object detection system using Landing.ai to determine if a person is focused on the camera during video calls, such as on Zoom and other platforms.

Project Overview

Objective: Build an object detection model on Landing.ai that determines whether a person is focused on the camera during video calls on Zoom and other platforms. The model's output is meant to give users feedback on their camera focus, with the aim of improving engagement and communication effectiveness.

Methodology:

  1. Dataset Preparation:
    • Collected images of people facing the camera and labeled them as "Facing Camera".
    • Collected images of people facing away from the camera and labeled them as "Facing Away".
    • Included images with varying angles and lighting conditions to enhance model robustness.
    • Split the dataset into training and testing sets for model evaluation.
  2. Model Training:
    • Used the Landing.ai platform to train the model.
    • Selected the RTMDet architecture with 9 million parameters for fast training and inference.
    • Trained for 100 epochs on Landing.ai’s cloud GPU resources.
    • Kept the default hyperparameters provided by the Landing.ai platform.
  3. Model Evaluation:
    • Tested the trained model using a laptop camera, capturing images at various angles.
    • The model predicted the correct label for almost all of these test images.
    • Observed 100% accuracy on both the training and validation datasets; given the small dataset size, this may indicate overfitting.
  4. Future Improvements:
    • Expand the dataset to include more diverse images for better generalization.
    • Perform extensive hyperparameter tuning to optimize model performance.
    • Integrate the system into video conferencing applications to provide real-time feedback on camera focus.
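The dataset split in step 1 can be sketched in a few lines. This is a minimal illustration, not the procedure Landing.ai applies: the `stratified_split` helper, the 20% test fraction, and the file names are all assumptions made for the example. The two label names come from the project.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (path, label) pairs into train/test lists, keeping label balance."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        # Hold out at least one image per label so every class is tested.
        n_test = max(1, round(len(paths) * test_fraction))
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test


# Hypothetical file names; the real dataset layout is not described here.
samples = [(f"facing_camera_{i:02d}.jpg", "Facing Camera") for i in range(10)]
samples += [(f"facing_away_{i:02d}.jpg", "Facing Away") for i in range(10)]

train_set, test_set = stratified_split(samples, test_fraction=0.2)
print(len(train_set), len(test_set))
```

Splitting per label keeps both classes present in the test set, which matters when the dataset is small.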

Model Architecture Details from Landing.ai:

Conclusion: The developed system successfully determines if a person is focused on the camera, demonstrating the potential to improve video call experiences. With further dataset expansion and model tuning, the system can be integrated into video conferencing tools for enhanced user engagement and communication.
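If the model were integrated into a video conferencing tool, its per-frame predictions would probably need smoothing so that one misclassified frame does not trigger feedback. The sketch below shows one possible approach, a majority vote over the most recent frames. The `FocusSmoother` class, the window size of 5, and the sample label sequence are assumptions for illustration and are not part of the project.

```python
from collections import deque


class FocusSmoother:
    """Majority vote over the most recent per-frame labels."""

    def __init__(self, window=5):
        self.recent = deque(maxlen=window)  # old labels drop off automatically

    def update(self, label):
        self.recent.append(label)
        facing = sum(1 for item in self.recent if item == "Facing Camera")
        # A strict majority is required; ties are reported as "Facing Away".
        if facing * 2 > len(self.recent):
            return "Facing Camera"
        return "Facing Away"


# Made-up per-frame predictions: one isolated miss, then a real look-away.
frame_labels = (
    ["Facing Camera"] * 4
    + ["Facing Away"]
    + ["Facing Camera"] * 2
    + ["Facing Away"] * 5
)

smoother = FocusSmoother(window=5)
smoothed = [smoother.update(label) for label in frame_labels]
print(smoothed)
```

In this sequence the isolated "Facing Away" frame is absorbed by the vote, while the sustained look-away at the end changes the smoothed label after two frames.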

Project Media

Image showing the overall Face Focus model overview.
Image showing some of the faces in the dataset.
Image showing the two labels used in the model.
Image showing the evaluation of the model.
Image showing the model's prediction when facing the camera.
Image showing the model's prediction when facing away from the camera.
Image showing the QR code that can be scanned to try the model on a phone or other device.

High-Level Overview

Objective: To create an object detection system that identifies whether a person is focused on the camera during video calls using Landing.ai.

Approach:

Outcome: The system reached 100% accuracy on both the training and validation datasets, though the small dataset size means this result may reflect overfitting. Future work includes expanding the dataset and fine-tuning the model for broader application.
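A single accuracy figure can hide which label the model gets wrong, so it may help to look at per-class results on held-out images as well. The sketch below assumes one true label and one predicted label per image, which is a simplification of object detection output. The `evaluate` helper and the label lists are made up for illustration and are not the project's results.

```python
from collections import Counter


def evaluate(true_labels, predicted_labels):
    """Return overall accuracy, per-class recall, and confusion counts."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    # Keys are (true label, predicted label) pairs.
    confusion = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (t, p), n in confusion.items() if t == p)
    accuracy = correct / len(true_labels)

    totals = Counter(true_labels)
    recall = {label: confusion[(label, label)] / totals[label] for label in totals}
    return accuracy, recall, confusion


# Illustrative labels only: one "Facing Away" image is misclassified.
truth = ["Facing Camera"] * 5 + ["Facing Away"] * 5
preds = ["Facing Camera"] * 5 + ["Facing Away"] * 4 + ["Facing Camera"]

accuracy, recall, confusion = evaluate(truth, preds)
print(f"accuracy: {accuracy:.2f}")
for label, value in recall.items():
    print(f"recall for {label}: {value:.2f}")
```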
