Symbolic Regression: Towards Interpretability and Automated Scientific Discovery

[Slides]

Parshin Shojaee
Virginia Tech

Nour Makke
Qatar Center for Artificial Intelligence

Sanjay Chawla
Qatar Center for Artificial Intelligence

Chandan Reddy
Virginia Tech

Taxonomy

[Figure: taxonomy visualization]

About this Tutorial

This tutorial provides a comprehensive exploration of symbolic regression, an emerging area of AI focused on discovering interpretable mathematical expressions from data. As AI systems become increasingly integrated into critical domains, the ability to uncover transparent mathematical relationships is essential for advancing scientific understanding and developing trustworthy AI systems.

Recent advances in AI, particularly in deep learning and large language models (LLMs), have opened new paradigms in symbolic regression, enabling more sophisticated approaches to equation discovery and interpretation. These developments raise fundamental questions about how to harness AI techniques to advance scientific understanding while maintaining interpretability.

Our tutorial is guided by the central question: “How can we leverage AI to discover meaningful mathematical expressions that advance scientific understanding while ensuring interpretability and trustworthiness?” We will explore this question through four themes:

Foundations and Evolution: How has symbolic regression evolved from traditional search-based methods to modern AI-driven approaches? What are the key principles and challenges in discovering interpretable mathematical expressions?
Modern Approaches: How do different paradigms, from evolutionary algorithms to transformer models and LLMs, contribute to equation discovery? How can we effectively combine these approaches?
Evaluation and Benchmarking: What constitutes meaningful evaluation in symbolic regression? How do we design benchmarks that truly capture the ability to discover interpretable mathematical relationships?
Impact: How can symbolic regression advance interpretable modeling and scientific discovery across different domains? What are the practical implications?
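To make "discovering interpretable mathematical expressions from data" concrete, here is a minimal sketch of search-based symbolic regression. It is a toy exhaustive search over a tiny expression grammar, not any method covered in the tutorial; the operator sets `UNARY` and `BINARY`, the helper names, and the synthetic target are all assumptions made for illustration.

```python
import itertools
import math

# Hypothetical building blocks for this sketch (not taken from the tutorial).
UNARY = {"sin": math.sin, "cos": math.cos, "square": lambda v: v * v}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def candidates():
    """Yield (text, function, size) for every expression in a tiny grammar."""
    leaves = [("x", lambda x: x, 1), ("1", lambda x: 1.0, 1)]
    level1 = list(leaves)
    # Apply each unary operator to each leaf.
    for name, op in UNARY.items():
        for text, fn, size in leaves:
            level1.append(
                (f"{name}({text})", lambda x, op=op, fn=fn: op(fn(x)), size + 1)
            )
    yield from level1
    # Combine every pair of level-1 expressions with each binary operator.
    for (sym, op), (ta, fa, sa), (tb, fb, sb) in itertools.product(
        BINARY.items(), level1, level1
    ):
        yield (
            f"({ta} {sym} {tb})",
            lambda x, op=op, fa=fa, fb=fb: op(fa(x), fb(x)),
            sa + sb + 1,
        )


def mse(fn, xs, ys):
    """Mean squared error of a candidate function on the data."""
    return sum((fn(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def search(xs, ys):
    """Return the candidate with the lowest error, preferring smaller expressions on ties."""
    return min(candidates(), key=lambda c: (round(mse(c[1], xs, ys), 12), c[2]))


# Synthetic data from an assumed ground-truth law y = x^2 + sin(x).
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x + math.sin(x) for x in xs]

best_text, best_fn, best_size = search(xs, ys)
print(best_text, mse(best_fn, xs, ys))
```

The output is a readable formula rather than a set of opaque weights, which is what makes the result interpretable. Exhaustive enumeration like this grows combinatorially with expression size, which is one reason practical systems rely on evolutionary search or learned models to guide the exploration.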

The tutorial is designed for researchers and practitioners in machine learning, AI, and scientific domains who seek to understand and contribute to the advancement of interpretable modeling. While familiarity with basic machine learning concepts is helpful, no prior experience with symbolic regression is required. Through this tutorial, attendees will gain both theoretical understanding and practical insights into symbolic regression, positioning them to contribute to this evolving field and its applications across science and industry.

Reading List

Introduction

Methods

Search SR Methods

Learning SR Methods

Learning + Search SR Methods