Links - Morgan Stanley uses AI evals to shape the future of financial services

Sharing this post on Morgan Stanley and AI from the OpenAI blog.

“This technology makes you as smart as the smartest person in the organization. Each client is different, and AI helps us cater to each client’s unique needs.” - Jeff McMillan, Head of Firmwide AI at Morgan Stanley

Evals

To evaluate GPT-4’s performance against their experts, Morgan Stanley ran summarization evals to test how effectively the model condensed vast amounts of intellectual capital and process-driven content into concise summaries. Advisors and prompt engineers graded AI responses for accuracy and coherence, allowing the team to refine prompts and improve output quality.
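The post doesn't say how those grades were recorded or compared, so here is only a rough sketch of what that kind of human grading loop could look like in Python. Everything in it is my assumption, not Morgan Stanley's method: the 1 to 5 scale, the `Grade` fields, the prompt version labels, and the made-up scores.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Grade:
    prompt_version: str  # hypothetical label for the prompt being tested
    grader: str          # who graded the summary (advisor or prompt engineer)
    accuracy: int        # 1 (poor) to 5 (excellent); the scale is an assumption
    coherence: int       # 1 (poor) to 5 (excellent); the scale is an assumption


def summarize_grades(grades):
    """Return mean accuracy, mean coherence, and grade count per prompt version."""
    by_version = {}
    for g in grades:
        by_version.setdefault(g.prompt_version, []).append(g)
    return {
        version: {
            "accuracy": mean(g.accuracy for g in group),
            "coherence": mean(g.coherence for g in group),
            "n": len(group),
        }
        for version, group in by_version.items()
    }


def best_version(summary):
    """Pick the prompt version with the highest combined mean score."""
    return max(
        summary,
        key=lambda v: summary[v]["accuracy"] + summary[v]["coherence"],
    )


# Made-up grades, for illustration only.
grades = [
    Grade("v1", "advisor_a", 3, 4),
    Grade("v1", "engineer_b", 4, 4),
    Grade("v2", "advisor_a", 5, 4),
    Grade("v2", "engineer_b", 4, 5),
]
summary = summarize_grades(grades)
print(best_version(summary))
```

Comparing averages per prompt version is the simplest way I can think of to turn grader feedback into a decision about which prompt to keep. A real framework would likely track much more than this.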

The biggest takeaway

The eval framework wasn’t static; it evolved as the team learned.

This should be expected with generative AI: your eval framework should not be static, and your project is never done.

Expanding the corpus

This is massive and extremely well done.

“We went from being able to answer 7,000 questions to a place where we can now effectively answer any question from a corpus of 100,000 documents,” says David Wu, Head of Firmwide AI Product & Architecture Strategy at Morgan Stanley.

Adoption

Over 98% adoption in wealth management.

!! !! !!

Their strong eval framework has also unlocked a flywheel for future solutions and services.

I am currently reading The Value Flywheel Effect so this really resonated with me.

In my opinion, they’ve tackled the two hardest obstacles: they’ve created a very successful project AND gotten widespread adoption. When stakeholders are bought in like this, releasing additional projects should have a much lower barrier to entry.


About Matt Busche

Software Engineer and Wheel of Fortune Expert. If this article helped you, please consider buying me a book.

Des Moines, IA https://www.mrbusche.com