About This Session
Large Language Models (LLMs) have transformed how businesses automate complex workflows.
At Block, Inc., we've integrated LLMs deeply into our operational fabric, automating critical risk operations tasks with significant business impact.
However, deploying LLMs to production is only the beginning: continuously evaluating their effectiveness and maintaining visibility into their performance present significant challenges.
This talk provides a deep dive into practical frameworks and methodologies for evaluating, monitoring, and improving LLM-based applications at scale.
We'll explore:
Robust prompt engineering: Techniques for designing, testing, and iterating on prompts to maximize impact.
Evaluation frameworks: Leveraging LLMs themselves as "judges" to measure the quality and effectiveness of applications (a minimal sketch follows this list).
Continuous performance monitoring: Strategies to track LLM effectiveness over time, identify performance drift, and proactively address degradation (a second sketch follows this list).
Observability and user impact: Using telemetry data, session replay, click tracking, and A/B testing to measure real-world usage and value.
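To make the "LLM-as-judge" idea concrete, here is a minimal sketch of the pattern. It assumes the `openai` Python client and an illustrative judge model (`gpt-4o-mini`); the rubric, the `judge` helper, and the tiny evaluation set are hypothetical, not Block's actual framework.

```python
# Minimal LLM-as-judge sketch. The judge model, rubric, and client
# usage below are illustrative assumptions, not Block's actual stack.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = """\
You are grading the output of an LLM application.

Task given to the application:
{task}

Application output:
{output}

Rate the output from 1 (unusable) to 5 (excellent) for correctness
and completeness. Reply with the number only."""

def judge(task: str, output: str, model: str = "gpt-4o-mini") -> int:
    """Score one application output with an LLM judge."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic grading reduces score variance
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(task=task, output=output)}],
    )
    return int(response.choices[0].message.content.strip())

# Example: average judge scores over a small evaluation set.
eval_set = [
    ("Summarize this dispute in one sentence.",
     "Customer disputes a $40 charge from an unrecognized merchant."),
]
scores = [judge(task, output) for task, output in eval_set]
print(f"mean score: {sum(scores) / len(scores):.2f}")
```

Pinning temperature to 0 and asking for a bare number keeps scores comparable across runs; a production framework would also validate the reply and average over multiple samples.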
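And a toy illustration of the drift-monitoring idea: compare a recent window of judge scores against a baseline window and flag degradation when the mean drops. The window sizes, threshold, and `detect_drift` helper are illustrative assumptions, not a production alerting system.

```python
# Toy drift check over a stream of judge scores. Thresholds and
# window sizes are illustrative assumptions.
from statistics import mean

def detect_drift(scores: list[float],
                 baseline_window: int = 50,
                 recent_window: int = 20,
                 max_drop: float = 0.3) -> bool:
    """Return True if the recent mean score fell more than
    `max_drop` below the baseline mean."""
    if len(scores) < baseline_window + recent_window:
        return False  # not enough history to compare yet
    baseline = mean(scores[:baseline_window])
    recent = mean(scores[-recent_window:])
    return baseline - recent > max_drop

# Example: scores drop from ~4.5 to ~3.9, which trips the alert.
history = [4.5] * 50 + [3.9] * 20
print(detect_drift(history))  # True
```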
Speaker
Suraj Jayakumar
AI Scientist - Block, Inc.
An experienced machine learning practitioner in the fintech industry who has led critical modeling initiatives at Venmo and Cash App.
Currently driving AI-powered automation at Cash App, focused on scaling and optimizing Risk Operations through advanced LLM-based solutions.