Computer Vision · Compliance · Cloud AI · FinTech

AI Image Compliance Engine

Context-aware logo verification for branded-apparel compliance workflows

WeBuildTech · October 20, 2025

At a Glance

Client type: Enterprise insights / digital payments brand
Core outcome: Automated shirt-logo compliance verification
Key technologies: Google Cloud Vision, OpenCV, MediaPipe
Output: Annotated images, JSON reports, Excel summaries
Deployment: Local batch CLI and desktop GUI

The Challenge

The business problem looked simple at first glance: verify whether a brand mark is visible in a submitted image. In practice, however, the operational requirement was more specific and more demanding. A logo could appear on a shirt, on a separate paper card, in a hand-held placard, on a background object, or as partially readable text. A false positive in any of those cases would weaken the credibility of the compliance process.

  • A visible logo is not the same as a compliant logo.
  • The system had to reason about where the logo appears relative to the person and the garment.
  • The output had to be explainable enough for audit, review, and downstream reporting.

Client situation: A branded apparel or field-operations use case where image-based logo verification had to be accurate enough for real compliance review.

Core gap: A generic "logo present in image" answer was not enough; the business needed to know whether the logo was actually on the shirt.

Design Objectives

  • Detect the target brand mark reliably across real-world user-provided images.
  • Validate whether the mark is inside the shirt region rather than elsewhere in the frame.
  • Reduce false positives caused by hand-held paper cards, placards, or visually similar props.
  • Generate operator-friendly outputs that can be reviewed, stored, and exported in batch workflows.
  • Keep the solution modular enough to extend into APIs, dashboards, or additional compliance classes.

Solution Overview

WeBuildTech approached the problem as a context-aware verification task rather than a standard logo-detection task. The engine first standardises the input image, then uses cloud vision to detect people, faces, logos, and text. From there, it infers the most relevant shirt region, checks whether the logo falls inside that region, and applies a second layer of guards to reject non-garment artefacts such as rigid rectangular paper cards or hand-overlapping signs.
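The shirt-region inference step can be illustrated with a simple geometric heuristic: estimate a torso rectangle below a detected face box. The scale factors below are illustrative assumptions for the sketch, not the engine's actual calibration, which also draws on person and pose cues.

```python
def infer_shirt_region(face_box, image_w, image_h):
    """Estimate a torso/shirt rectangle below a face bounding box.

    face_box: (x, y, w, h) in pixels, top-left origin.
    Returns (x, y, w, h) clamped to the image bounds.
    """
    fx, fy, fw, fh = face_box
    # Assume the torso starts roughly one face-height below the chin and
    # spans about three face-widths wide by three face-heights tall.
    cx = fx + fw / 2
    top = fy + fh * 1.2
    width = fw * 3.0
    height = fh * 3.0
    x = max(0, cx - width / 2)
    y = min(top, image_h)
    w = min(width, image_w - x)
    h = min(height, image_h - y)
    return (x, y, w, h)

# A face at (400, 100) sized 100x120 in a 1280x720 frame yields a shirt
# region starting below the face and clamped to the image.
region = infer_shirt_region((400, 100, 100, 120), 1280, 720)
```

Any logo box can then be judged against this region rather than against the whole frame.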

What the system does

Image standardisation
Normalises files into a processing-ready JPEG format, improving downstream consistency across local files, URLs, and cloud URIs.
Full-image detection
Uses Google Cloud Vision to identify persons, faces, logos, and text from the original image.
Shirt-region inference
Builds a torso or shirt region from person, face, and pose cues so the logo can be judged in context.
ROI re-scan
If the global hit is uncertain, rescans the shirt region with orientation-aware checks to strengthen the result.
OCR fallback
Looks for text-like variants of the brand name when logo detection alone is not sufficient.
Explainable reporting
Returns a decision, reason, evidence boxes, annotated image, JSON record, and Excel-ready summary row.
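The explainable-reporting output described above can be sketched as a structured record: a decision, a human-readable reason, and the evidence boxes that support it. The field names here are illustrative, not the engine's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ComplianceResult:
    """One reviewable decision record for a single input image."""
    image: str
    compliant: bool
    reason: str
    evidence_boxes: list = field(default_factory=list)  # (label, x, y, w, h)

    def to_json(self) -> str:
        # The same dict feeds the JSON report and an Excel summary row.
        return json.dumps(asdict(self), indent=2)

result = ComplianceResult(
    image="submission_001.jpg",
    compliant=True,
    reason="logo detected inside inferred shirt region",
    evidence_boxes=[("logo", 512, 340, 90, 60), ("shirt", 400, 280, 320, 400)],
)
record = json.loads(result.to_json())
```

Attaching the evidence boxes is what lets a reviewer trace each decision back to the pixels that produced it.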

Why This Engine Is Different

Most off-the-shelf image classifiers answer a broad question such as whether the brand exists somewhere in the scene. This project required a narrower and more operationally useful answer: whether the brand is on the person's shirt. That distinction is what makes the solution meaningful for compliance workflows.

False-positive control

  • Hand overlap checks help identify signs or cards being held in front of the torso.
  • Paper-like texture and rectangular-shape cues reduce the chance of mistaking printed placards for garment branding.
  • Colour and edge-transition guards compare the suspected logo patch against the surrounding shirt surface.
  • The system short-circuits when it reaches a clear answer, balancing quality with API-call efficiency.
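Two of the guards above can be sketched in isolation: a containment check (does the logo box actually sit inside the shirt box?) and a crude rigid-card cue based on how much of its surround the detection fills and its aspect ratio. The thresholds are illustrative assumptions, not the engine's tuned values.

```python
def containment_ratio(inner, outer):
    """Fraction of `inner`'s area lying inside `outer`; boxes are (x, y, w, h)."""
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    x1, y1 = max(ix, ox), max(iy, oy)
    x2, y2 = min(ix + iw, ox + ow), min(iy + ih, oy + oh)
    overlap = max(0, x2 - x1) * max(0, y2 - y1)
    return overlap / (iw * ih) if iw * ih > 0 else 0.0

def looks_like_held_card(logo_box, surrounding_box):
    """Flag detections that resemble a rigid hand-held card: nearly as large
    as their surround, with stiff card-like proportions (thresholds illustrative)."""
    lw, lh = logo_box[2], logo_box[3]
    sw, sh = surrounding_box[2], surrounding_box[3]
    fills_surround = (lw * lh) / (sw * sh) > 0.5
    card_aspect = 1.2 < lw / lh < 1.8
    return fills_surround and card_aspect

def logo_on_shirt(logo_box, shirt_box, min_containment=0.8):
    """Accept only logos well inside the shirt region that don't look like props."""
    if containment_ratio(logo_box, shirt_box) < min_containment:
        return False
    return not looks_like_held_card(logo_box, shirt_box)
```

A small chest logo inside the shirt region passes, while a card-sized detection filling most of the torso is rejected even though it is geometrically "on the shirt".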

Engineering Footprint

main.py
Orchestration layer — runs the pipeline end to end, manages I/O, and writes reports and annotated outputs.
detector.py
Decision engine — contains the core compliance logic, shirt-region inference, ROI re-scan logic, OCR fallback, and final decisioning.
vision_client.py
Vision wrappers — encapsulates cloud vision calls and makes the inference layer easier to manage and extend.
image_cleaner.py
Input preparation — standardises image quality, format, and size before analysis.
visualize.py
Explainability layer — draws evidence overlays so reviewers can see why the system made a decision.
gui.py
Operator experience — adds a desktop review interface on top of the backend for practical business use.
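As a sketch of how the orchestration layer might be invoked, a minimal batch CLI could look like the following. The flags and defaults are illustrative assumptions, not main.py's actual interface.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI for running the compliance pipeline over a batch."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Batch shirt-logo compliance verification.",
    )
    parser.add_argument("inputs", nargs="+",
                        help="image files, folders, URLs, or cloud storage URIs")
    parser.add_argument("--out", default="reports",
                        help="directory for annotated images and JSON reports")
    parser.add_argument("--excel", action="store_true",
                        help="also write an Excel summary of the batch")
    return parser

# Example invocation: process a folder and emit the Excel summary.
args = build_parser().parse_args(["batch/", "--excel"])
```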

Operating Model and Deployment Path

  • Current operating shape: local batch CLI and desktop GUI suitable for controlled review workflows.
  • Accepted input modes: local files, folders, web URLs, and cloud storage URIs.
  • Current outputs: annotated images, JSON reports, and Excel summaries.
  • Scalable next step: package the same engine behind an API and connect it to cloud storage, dashboards, and review queues.
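Supporting the four input modes listed above implies a small dispatch step that classifies each raw source string before fetching. A minimal sketch, with illustrative mode names:

```python
from pathlib import Path
from urllib.parse import urlparse

def classify_input(source: str) -> str:
    """Map a raw input string to one of: 'gcs_uri', 'url', 'folder', 'file'."""
    scheme = urlparse(source).scheme
    if scheme == "gs":
        return "gcs_uri"       # cloud storage URI, e.g. gs://bucket/image.jpg
    if scheme in ("http", "https"):
        return "url"           # fetched over the web before processing
    path = Path(source)
    if path.is_dir():
        return "folder"        # expanded into its contained image files
    return "file"              # a single local image
```

Each mode then feeds the same standardisation step, so the rest of the pipeline never needs to know where an image came from.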

Business Value Delivered

  • Turns a manual visual check into an automated first-pass compliance decision.
  • Produces more defensible outputs by attaching evidence rather than returning a black-box yes or no.
  • Improves scalability for branding or field-audit workflows that process many images in batch.

Want something similar built?

Let's talk about your problem and how we can design a solution around it.

Book Discussion