Image flagging
The image flagging system automatically identifies inappropriate or problematic content in product images to help maintain Open Food Facts' image quality standards.
How it works
Image flagging uses multiple detection methods to identify content that may not be appropriate for a food database:
- Face Detection – Uses Google Cloud Vision's Face Detection API to identify images containing human faces.
- Label Annotation – Scans for labels indicating the presence of humans, pets, electronics, or other non-food items.
- Safe Search – Uses Google Cloud Vision's Safe Search API to detect adult content or violence.
- Text Detection – Analyzes OCR text for keywords related to beauty products or other inappropriate content.
When flagged content is detected, an `image_flag` prediction is generated with details about the issue and the associated confidence level. These predictions trigger notifications to moderation services, where humans can review potentially problematic images.
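For illustration, here is a minimal sketch of the shape such a prediction might take, assuming a plain-dict representation. The nested fields mirror the prediction data described in the sections below; the exact production schema may differ.

```python
# A sketch of an image_flag prediction, assuming a dict-based representation.
# The nested "data" fields mirror the prediction data documented below.
example_prediction = {
    "type": "image_flag",
    "data": {
        "type": "face_annotation",  # which detection method fired
        "label": "face",            # what was detected
        "likelihood": 0.92,         # confidence reported by the detector
    },
}
```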
Detection Methods
Face Detection
The system processes `faceAnnotations` from Google Cloud Vision to detect human faces. If multiple faces are detected, the one with the highest confidence score is used. Only faces with a detection confidence ≥ 0.6 are flagged to minimize false positives.
Prediction data includes:
- `type`: "face_annotation"
- `label`: "face"
- `likelihood`: detection confidence score
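As a concrete illustration, the following sketch shows how this check could be implemented against the JSON form of a Cloud Vision response. `flag_faces` and its dict-based input are assumptions made for this example, not the system's actual interface; `detectionConfidence` is the field name used in Cloud Vision's JSON responses.

```python
from typing import Optional

# Minimum detection confidence, per the threshold documented above.
FACE_CONFIDENCE_THRESHOLD = 0.6

def flag_faces(face_annotations: list) -> Optional[dict]:
    """Return prediction data if a face is detected with enough confidence.

    `face_annotations` is assumed to hold the `faceAnnotations` list of a
    Google Cloud Vision JSON response, where each entry carries a
    `detectionConfidence` float.
    """
    if not face_annotations:
        return None
    # With multiple faces, keep the one with the highest confidence score.
    best = max(face_annotations, key=lambda f: f.get("detectionConfidence", 0.0))
    confidence = best.get("detectionConfidence", 0.0)
    if confidence < FACE_CONFIDENCE_THRESHOLD:
        return None  # below 0.6: skipped to minimize false positives
    return {"type": "face_annotation", "label": "face", "likelihood": confidence}

# Example: a response with two faces flags the more confident one.
print(flag_faces([{"detectionConfidence": 0.45}, {"detectionConfidence": 0.87}]))
```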
Label Annotation Detection
The system flags images containing specific labels from Google Cloud Vision with confidence scores ≥ 0.6. Only the first matching label is flagged per image.
Human-related labels:
- Face, Head, Selfie, Hair, Forehead, Chin, Cheek
- Arm, Tooth, Human Leg, Ankle, Eyebrow, Ear, Neck, Jaw, Nose
- Facial Expression, Glasses, Eyewear
- Child, Baby, Human
Other flagged labels:
- Pets: Dog, Cat
- Technology: Computer, Laptop, Refrigerator
- Clothing: Jeans, Shoe
The prediction data includes:
- `type`: "label_annotation"
- `label`: the detected label (lowercase)
- `likelihood`: label confidence score
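A sketch of the label check, under the same assumptions as the face example: `label_annotations` stands for the `labelAnnotations` list of a Cloud Vision JSON response (each entry has `description` and `score` fields), and `flag_labels` is a hypothetical name.

```python
from typing import Optional

# Labels reproduced from the lists above, lowercased for comparison.
HUMAN_LABELS = {
    "face", "head", "selfie", "hair", "forehead", "chin", "cheek",
    "arm", "tooth", "human leg", "ankle", "eyebrow", "ear", "neck",
    "jaw", "nose", "facial expression", "glasses", "eyewear",
    "child", "baby", "human",
}
OTHER_LABELS = {
    "dog", "cat",                          # pets
    "computer", "laptop", "refrigerator",  # technology
    "jeans", "shoe",                       # clothing
}
FLAGGED_LABELS = HUMAN_LABELS | OTHER_LABELS
LABEL_SCORE_THRESHOLD = 0.6

def flag_labels(label_annotations: list) -> Optional[dict]:
    """Return prediction data for the first flagged label with score >= 0.6."""
    for annotation in label_annotations:
        label = annotation.get("description", "").lower()
        score = annotation.get("score", 0.0)
        if label in FLAGGED_LABELS and score >= LABEL_SCORE_THRESHOLD:
            # Only the first matching label is flagged per image.
            return {"type": "label_annotation", "label": label, "likelihood": score}
    return None
```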
Safe Search Detection
Results from Google Cloud Vision's Safe Search API are used to flag the following categories, but only when their likelihood is marked "VERY_LIKELY":
- Adult content – Sexually explicit material
- Violence – Graphic or violent imagery
The prediction data includes:
- `type`: "safe_search_annotation"
- `label`: "adult" or "violence"
- `likelihood`: likelihood level name
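A sketch of the Safe Search check, assuming the `safeSearchAnnotation` object of a Cloud Vision JSON response, whose `adult` and `violence` fields hold likelihood level names such as "UNLIKELY" or "VERY_LIKELY". The function name is a placeholder for this example.

```python
from typing import Optional

def flag_safe_search(safe_search_annotation: dict) -> Optional[dict]:
    """Flag adult content or violence, but only at the "VERY_LIKELY" level."""
    for category in ("adult", "violence"):
        likelihood = safe_search_annotation.get(category)
        if likelihood == "VERY_LIKELY":
            return {
                "type": "safe_search_annotation",
                "label": category,
                "likelihood": likelihood,  # the likelihood level name
            }
    return None

# Example: "LIKELY" adult content is not flagged; "VERY_LIKELY" violence is.
print(flag_safe_search({"adult": "LIKELY", "violence": "VERY_LIKELY"}))
```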
Text-based Detection
The system scans OCR-extracted text for keywords from predefined keyword files. Only the first matching keyword is flagged per image.
- Beauty products – Cosmetic-related terms from beauty keyword file
- Miscellaneous – Other inappropriate content keywords from miscellaneous keyword file
The prediction data includes:
- `type`: "text"
- `label`: "beauty" or "miscellaneous"
- `text`: the matched text phrase
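A sketch of the keyword scan. The keyword sets below are made-up examples standing in for the predefined beauty and miscellaneous keyword files, and the matching is simplified to case-insensitive substring search.

```python
from typing import Optional

# Made-up example keywords standing in for the predefined keyword files;
# the real beauty and miscellaneous lists ship with the system.
BEAUTY_KEYWORDS = {"shampoo", "mascara", "lipstick"}
MISCELLANEOUS_KEYWORDS = {"cigarette", "lighter"}

def flag_text(ocr_text: str) -> Optional[dict]:
    """Return prediction data for the first keyword found in the OCR text."""
    lowered = ocr_text.lower()
    for label, keywords in (
        ("beauty", BEAUTY_KEYWORDS),
        ("miscellaneous", MISCELLANEOUS_KEYWORDS),
    ):
        for keyword in keywords:
            if keyword in lowered:
                # Only the first matching keyword is flagged per image.
                return {"type": "text", "label": label, "text": keyword}
    return None

# Example: OCR text mentioning shampoo is flagged as a beauty product.
print(flag_text("Nourishing SHAMPOO for dry hair, 250 ml"))
```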