Fintech Firm Scales ID Fraud Detection with Generative AI
- •Sun Finance improves identity extraction accuracy from 79.7% to 90.8% using AWS Bedrock.
- •New AI pipeline reduces manual document review by processing applications in under 5 seconds.
- •Hybrid architecture combines traditional OCR with LLM structuring to bypass PII safety filters.
In a striking demonstration of how generative AI is reshaping operational workflows, Sun Finance—a European fintech—has successfully overhauled its identity verification (IDV) pipeline. The company, which processes millions of loan evaluations monthly, previously relied on a legacy system that forced 60% of all applications into manual review. This bottleneck was driven by the limitations of traditional optical character recognition (OCR) tools, which frequently struggled with language variations, complex document layouts, and, ironically, the same privacy safeguards that protect sensitive data.
The solution highlights a sophisticated 'separation of concerns' strategy. Rather than forcing an all-in-one AI model to perform every task—which often triggers built-in safety refusals when encountering personally identifiable information (PII)—the team architected a multi-tier pipeline. They utilized Amazon Textract as the primary workhorse for raw text extraction, which operates without the restrictive privacy hurdles inherent in large language models.
Once the text is securely extracted, the system hands it over to Anthropic’s Claude Sonnet 4, running via Amazon Bedrock, to structure the data into standardized formats. This combination effectively circumvents the model's safety triggers while leveraging its powerful reasoning capabilities to interpret nuances that simple OCR systems miss. The shift was transformative: accuracy jumped by over 11 percentage points, and per-document costs plummeted by 91%, turning a previously prohibitive operational expense into a scalable advantage.
Beyond mere text extraction, the firm implemented a serverless fraud detection mechanism using vector similarity search. By converting selfie backgrounds into numerical vector representations, the system can identify patterns across multiple applications, effectively spotting fraud rings that manual reviewers would likely miss. This deployment underscores a growing trend in enterprise AI: success often depends less on a single 'god-mode' model and more on orchestrating specialized tools into a resilient, serverless pipeline.
This approach offers a blueprint for industries plagued by high manual overhead. By integrating specialized OCR for reliable data capture and LLMs for intelligent, context-aware structuring, companies can finally achieve the high-throughput automation needed for rapid, global scaling.