The case for synthetic data in the Financial Services Industry

By Dave Luttrell, Principal AI Consultant

In financial services, organisations face a constant flow of complex risks every day. Being able to anticipate and tackle these risks is key to staying safe and keeping operations running smoothly. A big part of that comes down to the data they hold and, more importantly, the decisions they make based on it.

By 2026, 75% of businesses will use generative AI to create synthetic customer data, up from less than 5% in 2023 - Gartner Predicts, 2024

SO, WHAT IS SYNTHETIC DATA?

Synthetic data is data that’s created artificially instead of coming directly from real-world events. Think of it as “fake” data made to fill gaps or add to the data an organisation already has, helping create a more complete or representative picture.

DRIVERS FOR SYNTHETIC DATA ADOPTION

When it comes to using data to make decisions, especially in finance, there are a number of drivers for using synthetic data:

Concentration (or Bias)

Models and attention tend to concentrate on areas that are already well known, which skews decisions towards familiar patterns.

Lack of data (or unknowns)

In some areas, banks may have only a limited number of real-world examples of the events they care about, or they may be trying to quantify known unknowns.

Privacy

Many scenarios require privacy-enhancing technology, while still needing data that is representative of the real thing.

HOW DOES THE PROCESS OF MAKING SYNTHETIC DATA WORK?

Creating synthetic data can be as simple as using basic models, or as advanced as deploying deep neural networks. Various techniques, like Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and even large language models, can be used to generate this type of data.

A key takeaway across all these techniques is that synthetic data needs a solid foundation. It requires a clear and evolving understanding of what “normal” looks like in order to guide and shape the data that’s generated.
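To make the "basic models" end of that spectrum concrete, here is a minimal sketch, not a production method. It assumes two numeric columns (hypothetically, a transaction amount and a fee), learns what "normal" looks like as a pair of means, spreads and a correlation, and then samples new rows from that learned shape. The function names and the placeholder "real" data are illustrative assumptions; VAEs, GANs and language models follow the same fit-then-sample idea with far richer models.

```python
import math
import random
import statistics

def fit_bivariate_normal(rows):
    """Learn 'normal': means, standard deviations and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)

def sample_synthetic(params, n_rows, seed=0):
    """Draw new rows that follow the learned means, spreads and correlation."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mix z1 into the second column so it stays correlated with the first.
        y = my + sy * (rho * z1 + math.sqrt(max(0.0, 1 - rho ** 2)) * z2)
        rows.append((x, y))
    return rows

# Placeholder standing in for real data (hypothetical: amount, fee).
_rng = random.Random(42)
real = []
for _ in range(5000):
    amount = _rng.gauss(50, 10)
    real.append((amount, 0.1 * amount + _rng.gauss(0, 1)))

params = fit_bivariate_normal(real)
synthetic = sample_synthetic(params, n_rows=5000)
```

None of the synthetic rows is a copy of a real row, yet refitting the model on `synthetic` should recover roughly the same means and correlation. A simple Gaussian like this will miss skew, outliers and categorical fields, which is where the more advanced techniques come in.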

WHAT ARE SOME NOVEL WAYS OF USING SYNTHETIC DATA?

  1. Fraud & Anti-Money Laundering (AML) – You can simulate suspicious transactions that might raise concerns about money laundering or fraud. These scenarios help improve fraud detection models and catch problems earlier.
  2. Customer Simulation – You can create synthetic customer interactions across different touchpoints in their journey with a business. This can be useful for testing things like customer service, fraud detection, marketing strategies, or risk management.
  3. Risk & Regulation – Synthetic data can help you generate new risk scenarios that might not be captured by existing models. This way, you can identify new behaviors and test how well current controls hold up in unexpected situations.
  4. Markets & Equity – Use synthetic data to simulate market and equity scenarios. This helps finance teams test out different strategies without using real-world data, which can be sensitive or difficult to access.
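As an illustration of the first use case, the sketch below mixes ordinary deposits with a synthetic "structuring" pattern, where an account makes repeated deposits just under a reporting threshold. Everything here is an assumption for illustration: the threshold value, the amount distributions, the account naming and the field names are hypothetical and would need to be replaced with rules agreed with your financial crime team.

```python
import random

REPORTING_THRESHOLD = 10_000  # hypothetical value; real thresholds vary by jurisdiction

def normal_transactions(n, rng):
    """Everyday deposits with log-normally distributed amounts, labelled 0."""
    return [
        {"account": f"N{i % 50}",
         "amount": round(rng.lognormvariate(5, 1), 2),
         "suspicious": 0}
        for i in range(n)
    ]

def structuring_transactions(n_accounts, deposits_per_account, rng):
    """Synthetic suspicious pattern: repeated deposits just under the threshold, labelled 1."""
    rows = []
    for a in range(n_accounts):
        for _ in range(deposits_per_account):
            amount = round(rng.uniform(0.90, 0.99) * REPORTING_THRESHOLD, 2)
            rows.append({"account": f"S{a}", "amount": amount, "suspicious": 1})
    return rows

rng = random.Random(7)
dataset = normal_transactions(1000, rng) + structuring_transactions(5, 4, rng)
rng.shuffle(dataset)  # interleave so the labelled cases are not grouped at the end
```

Because every injected row carries a `suspicious` label, the combined dataset can be used to check whether a detection model or rule actually flags the pattern, without waiting for enough real cases to accumulate.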

SOME THINGS TO KEEP IN MIND WITH SYNTHETIC DATA

Despite its advantages, there are key items to be aware of when enabling synthetic data within your organisation. Enabling synthetic data safely and securely will help you leverage its value while mitigating key risks.

  1. Laws & Regulations – Generating synthetic data comes with rules, especially around data sovereignty and localisation. Make sure you know the legal side before diving in.
  2. Data Quality – Synthetic data is meant to supplement real data, not replace it. Where the facts are already known, rely on the actual data.
  3. Data Deployment – Ideally, synthetic data should stay separate from production data, unless it’s being used for experiments or testing purposes.
  4. Bias – Put controls in place to avoid unintentional bias being introduced through synthetic data, and monitor this continually in line with ethical guidelines and principles.
  5. Validation & Reinforcement – Keep an eye on the data you generate to make sure it’s fair and makes sense for the specific area you’re working in. This includes ensuring privacy and proper de-identification measures.
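One simple way to check that generated data "makes sense" is to compare the distribution of a synthetic column against its real counterpart. The sketch below computes the two-sample Kolmogorov-Smirnov statistic, which is the largest gap between the two empirical cumulative distributions. The sample data is a placeholder, and any pass/fail threshold you apply to the result is a judgment call for your own domain, not something prescribed here. This checks fidelity of a single column only; fairness and privacy need their own tests.

```python
import bisect
import random

def ks_statistic(a, b):
    """Largest gap between two empirical CDFs (0 = identical, 1 = no overlap)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for v in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

# Placeholder data: one faithful synthetic column and one that has drifted.
rng = random.Random(1)
real = [rng.gauss(50, 10) for _ in range(2000)]
faithful = [rng.gauss(50, 10) for _ in range(2000)]
drifted = [rng.gauss(70, 10) for _ in range(2000)]

faithful_gap = ks_statistic(real, faithful)  # small when the shapes match
drifted_gap = ks_statistic(real, drifted)    # large when the generator has drifted
```

Running a check like this on each regeneration gives an early warning when the generator stops reflecting what "normal" looks like.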

WHAT IMPACT DO WE THINK SYNTHETIC DATA WILL HAVE?

Synthetic data has the ability to transform how we use data in a safe and secure way. Here are some examples of how we think value will be derived from its use.

  1. Faster time to market – Datasets and scenario-based modelling that were previously unavailable to organisations will enable innovation and help them identify and act on opportunities they could not see before.
  2. Enhanced Privacy & Security – Generating synthetic data that does not expose PII will enable entities to operate and innovate with confidence.
  3. Trustworthy AI – Robust generation and diverse data will enable organisations to develop safe AI systems that adhere to ethical and responsible standards.

LEARN MORE

To learn more about synthetic data, please reach out to InfoCentric today! We’ve been serving the finance industry with comprehensive data and analytics services and solutions since 2009.