VisionLog

Layer 3: Recognition

Face recognition using InsightFace ArcFace embeddings

The Recognition Layer identifies people from face images. It uses InsightFace with the buffalo_l model pack, which bundles SCRFD for face detection and ArcFace for face recognition.

Overview

  • Purpose: Identify persons from face images
  • Technology: InsightFace (buffalo_l model)
  • Detection: SCRFD
  • Recognition: ArcFace embeddings
  • Output: Person ID with confidence score

InsightFace

InsightFace is an open-source face analysis library that provides pretrained face detection and recognition models.

Model Selection

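Model     | Accuracy | Speed   | Size    | Use Case
----------|----------|---------|---------|-------------------------
buffalo_l | Highest  | Medium  | ~400 MB | Production (recommended)
buffalo_m | High     | Fast    | ~200 MB | Balanced
buffalo_s | Good     | Fastest | ~100 MB | Real-time/edge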

Recognition Pipeline

Step 1: Face Detection

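SCRFD detects faces and their landmarks. Each detected face carries a bounding box, a detection score, and landmarks. When detection and recognition run together through InsightFace, the ArcFace embedding from Step 3 is attached to the same face object.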

Step 2: Quality Filtering

Filter low-quality detections based on:

  • Minimum face size (50 pixels)
  • Detection confidence (0.5 threshold)
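The two filters above can be sketched as a small predicate. This is a minimal illustration, not the system's actual code: the `Detection` class and function names are hypothetical, and it assumes that "face size" means the shorter side of the bounding box.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_FACE_SIZE = 50   # pixels, default stated on this page
MIN_DET_SCORE = 0.5  # detection confidence, default stated on this page


@dataclass
class Detection:
    """Hypothetical container for one detected face."""
    bbox: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    det_score: float


def passes_quality(det: Detection,
                   min_size: float = MIN_FACE_SIZE,
                   min_score: float = MIN_DET_SCORE) -> bool:
    """Return True if the face is large and confident enough to recognize."""
    x1, y1, x2, y2 = det.bbox
    # Assumption: "face size" is the shorter side of the bounding box.
    shorter_side = min(x2 - x1, y2 - y1)
    return shorter_side >= min_size and det.det_score >= min_score


def filter_detections(dets: List[Detection]) -> List[Detection]:
    """Keep only the detections that pass both quality checks."""
    return [d for d in dets if passes_quality(d)]
```

Filtering before matching avoids spending recognition time on faces that are too small or too uncertain to produce a reliable embedding.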

Step 3: Embedding Extraction

Each face is converted to a 512-dimensional vector using ArcFace.
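As a rough sketch of how detection and embedding extraction might be read from InsightFace in one pass: the helper below assumes `app` is an already prepared `insightface.app.FaceAnalysis` instance, whose `get()` method returns face objects exposing `bbox`, `det_score`, `kps`, and `embedding`. The function name and the returned dictionary layout are illustrative assumptions.

```python
from typing import Any, Dict, List

EMBEDDING_DIM = 512  # ArcFace embedding length stated on this page


def detect_and_embed(app: Any, image: Any) -> List[Dict[str, Any]]:
    """Run detection and embedding extraction on one image.

    Assumption: `app` is a prepared insightface.app.FaceAnalysis instance
    and `image` is a BGR array such as the one returned by cv2.imread().
    """
    results = []
    for face in app.get(image):
        embedding = face.embedding
        if embedding is None or len(embedding) != EMBEDDING_DIM:
            continue  # skip faces without a usable embedding
        results.append({
            "bbox": [float(v) for v in face.bbox],  # x1, y1, x2, y2
            "det_score": float(face.det_score),
            "landmarks": face.kps,
            "embedding": embedding,
        })
    return results
```

Passing `app` in as a parameter keeps model loading separate from per-frame work, so the model can be loaded once at startup.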

Step 4: Similarity Matching

Compare the query embedding against all enrolled embeddings using cosine similarity.
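A minimal sketch of this comparison in plain Python, assuming the enrolled embeddings are held in a dictionary keyed by person name. The function names are hypothetical, and a production system would more likely use vectorized NumPy operations.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def best_match(query: Sequence[float],
               enrolled: Dict[str, Sequence[float]]) -> Tuple[Optional[str], float]:
    """Return the enrolled name with the highest similarity to the query."""
    best_name, best_score = None, -1.0
    for name, embedding in enrolled.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

If every stored embedding is already L2-normalized, the cosine similarity reduces to a plain dot product, which is cheaper to compute across a large database.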

Step 5: Threshold Decision

Accept the match if the best similarity score meets or exceeds the threshold (0.45 by default); otherwise mark the face as unknown.
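The decision rule could look like the following sketch. The function name and the "Unknown" label constant are illustrative assumptions; the 0.45 default is the one listed on this page.

```python
from typing import Optional, Tuple

SIMILARITY_THRESHOLD = 0.45  # default stated on this page
UNKNOWN_LABEL = "Unknown"


def decide(candidate: Optional[str],
           similarity: float,
           threshold: float = SIMILARITY_THRESHOLD) -> Tuple[str, float]:
    """Accept the best candidate only if its score reaches the threshold."""
    if candidate is not None and similarity >= threshold:
        return candidate, similarity
    # Below threshold, or nobody enrolled: report the face as unknown.
    return UNKNOWN_LABEL, similarity
```

Returning the score alongside the label lets callers log near-misses, which helps when tuning the threshold for a particular deployment.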

Configuration

Recognition Settings

Setting              | Description              | Default
---------------------|--------------------------|----------
Model Name           | InsightFace model        | buffalo_l
Detection Size       | Input size for detection | 640x640
GPU Support          | Enable CUDA              | Off
Similarity Threshold | Match acceptance         | 0.45
Min Face Size        | Minimum face pixels      | 50
Quality Threshold    | Detection confidence     | 0.5
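One way to carry these settings in code is a small immutable dataclass whose defaults mirror the values listed above. The class and field names are hypothetical, not taken from the project, and the validation ranges are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RecognitionSettings:
    """Hypothetical settings object; defaults mirror this page."""
    model_name: str = "buffalo_l"
    det_size: Tuple[int, int] = (640, 640)
    use_gpu: bool = False
    similarity_threshold: float = 0.45
    min_face_size: int = 50
    quality_threshold: float = 0.5

    def __post_init__(self) -> None:
        # Cosine similarity lies in [-1, 1]; detection confidence in [0, 1].
        if not -1.0 <= self.similarity_threshold <= 1.0:
            raise ValueError("similarity_threshold must be in [-1, 1]")
        if not 0.0 <= self.quality_threshold <= 1.0:
            raise ValueError("quality_threshold must be in [0, 1]")
        if self.min_face_size <= 0:
            raise ValueError("min_face_size must be positive")
```

For example, `RecognitionSettings(similarity_threshold=0.55)` would give a stricter configuration while leaving every other default untouched.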

Threshold Guidelines

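Threshold | Security  | False Positives | False Negatives
----------|-----------|-----------------|----------------
0.35      | Low       | Higher          | Lower
0.45      | Medium    | Balanced        | Balanced
0.55      | High      | Lower           | Higher
0.65      | Very High | Minimal         | More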

GPU Acceleration

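Enable CUDA for faster processing. GPU support is off by default. InsightFace runs its models through ONNX Runtime, so CUDA inference requires the GPU build of ONNX Runtime (onnxruntime-gpu) and a compatible NVIDIA driver.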
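A hedged sketch of how the GPU switch might be wired to InsightFace: `FaceAnalysis` accepts a `providers` list that it forwards to ONNX Runtime, and `prepare()` takes a `ctx_id` where a negative value selects the CPU. The helper names below are hypothetical, and the import is deferred so the snippet can be loaded on machines without InsightFace installed.

```python
from typing import List, Tuple


def select_providers(use_gpu: bool) -> List[str]:
    """ONNX Runtime execution providers, in priority order."""
    if use_gpu:
        # CPU stays in the list as a fallback if CUDA cannot be initialized.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def load_face_analysis(use_gpu: bool = False,
                       model_name: str = "buffalo_l",
                       det_size: Tuple[int, int] = (640, 640)):
    """Load and prepare the model pack once, e.g. at application startup."""
    # Deferred import: requires the insightface package (and onnxruntime-gpu
    # for CUDA) to be installed when this function is actually called.
    from insightface.app import FaceAnalysis

    app = FaceAnalysis(name=model_name, providers=select_providers(use_gpu))
    # ctx_id=0 selects the first GPU; a negative ctx_id forces CPU.
    app.prepare(ctx_id=0 if use_gpu else -1, det_size=det_size)
    return app
```

Calling `load_face_analysis(use_gpu=True)` once and reusing the returned object avoids paying the model load cost on every frame.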

Performance Comparison

Hardware       | Frames/sec | Notes
---------------|------------|---------------------------
CPU (i7)       | ~5-10 FPS  | Adequate for small scale
GPU (RTX 3060) | ~30-50 FPS | Recommended for real-time
GPU (RTX 4090) | ~100+ FPS  | High-throughput scenarios

Embedding Storage

Embeddings are stored in the face database for quick lookup during recognition. The database maps person names to their 512-dimensional embedding vectors.
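The page does not specify the storage format, so the following is only an illustrative sketch that assumes a JSON file holding a name-to-vector mapping. The function names and the file layout are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List, Union

EMBEDDING_DIM = 512  # embedding length stated on this page


def save_database(db: Dict[str, List[float]], path: Union[str, Path]) -> None:
    """Write a name -> embedding mapping to disk as JSON."""
    # Validate everything first so a bad entry never produces a partial file.
    for name, embedding in db.items():
        if len(embedding) != EMBEDDING_DIM:
            raise ValueError(f"{name}: expected {EMBEDDING_DIM} values, "
                             f"got {len(embedding)}")
    Path(path).write_text(json.dumps(db))


def load_database(path: Union[str, Path]) -> Dict[str, List[float]]:
    """Read the mapping back; intended to be called once at startup."""
    return json.loads(Path(path).read_text())
```

JSON keeps the example dependency-free; a binary format or a vector index would be a more natural fit once the number of enrolled people grows.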

Best Practices

Enrollment

  • Use 3-5 images per person
  • Include varied angles and lighting
  • Average multiple embeddings for stability
  • Normalize the averaged embedding
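The averaging and normalization steps in this list can be sketched in plain Python as follows. The function names are hypothetical, and real code would typically use NumPy arrays instead of lists.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def enroll(embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average several embeddings of one person, then normalize the mean."""
    if not embeddings:
        raise ValueError("at least one embedding is required")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    # Element-wise mean across the enrollment images.
    mean = [sum(column) / len(embeddings) for column in zip(*embeddings)]
    return l2_normalize(mean)
```

Normalizing the averaged vector matters because the mean of several unit vectors is generally shorter than unit length.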

Recognition

  • Filter faces by minimum size
  • Check detection confidence
  • Use appropriate threshold for use case
  • Handle "Unknown" cases gracefully

Performance

  • Use GPU for real-time applications
  • Skip frames for video processing
  • Batch process when possible
  • Pre-load models at startup
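Frame skipping and batching can be expressed with two small generators. This is a generic sketch with hypothetical helper names; it works on any iterable of frames and does not depend on a particular video library.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def every_nth(frames: Iterable[T], n: int) -> Iterator[T]:
    """Yield frames 0, n, 2n, ... and drop the rest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return islice(frames, 0, None, n)


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group items into lists of up to `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

For instance, `batched(every_nth(frames, 3), 8)` would process every third frame in groups of eight, trading some temporal resolution for throughput.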
