We’ll kick off with an overview by Aryeh Englander and follow with a focused presentation by Foresight Fellow Dan Elton on his paper “Self-explaining AI as an alternative to interpretable AI”. Robert Kirk will join them both for a panel and Q&A.
Introduction to AI Safety and Assured Autonomy
Aryeh L. Englander, Johns Hopkins University Applied Physics Laboratory
As AI has become increasingly powerful, researchers have turned to examining how to ensure the safety of AI-enabled systems for use in safety-critical applications. The AI Safety research community started out focusing on longer-term risks from very advanced AI, but more recently has expanded to include nearer-term applications as well. Meanwhile, the older field of Assured Autonomy has also started to turn its attention to AI-enabled systems. Most recently, the two fields have begun to talk to each other and to build a shared landscape and set of research agendas. This talk introduces some of the relevant technical safety challenges and gives a bit of background on the field.
Pitfalls with explainability techniques and self-explaining AI as a possible remedy
Daniel Elton, Ph.D., National Institutes of Health Clinical Center
There has been much research recently on techniques for explaining deep neural networks (DNNs) in the hope of improving trustworthiness and understanding what weaknesses they may have. However, the techniques for doing so differ wildly, and many have been shown to have major issues. Here I will tie the intrinsic difficulty of explaining DNNs to the phenomenon of double descent. Double descent implies that DNNs use “direct fitting”, which means they cannot extrapolate and function in ways that are hard to condense into a few sentences. I will then introduce the concept of self-explaining AI, which I explore in my paper “Self-explaining AI as an alternative to interpretable AI”. Additionally, to improve the safety and robustness of AI systems, I argue that all AI systems should include uncertainty quantification and “applicability domain analysis” (also called “change point detection” or “detecting outlier exposure”), which can provide a warning light to the user if the input data lie outside the distribution the system was trained on.
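To give a feel for the “warning light” idea, here is a minimal sketch (not the method from the paper): fit simple statistics on training-set features and flag inputs whose Mahalanobis distance from the training distribution exceeds a calibrated threshold. The synthetic features, the 99th-percentile threshold, and the function names are illustrative assumptions.

```python
# Minimal sketch of an out-of-distribution "warning light", assuming we have
# access to feature vectors for the training data (e.g. penultimate-layer
# activations). Not the speaker's implementation; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for training-set features.
train_features = rng.normal(loc=0.0, scale=1.0, size=(1000, 8))

mean = train_features.mean(axis=0)
cov = np.cov(train_features, rowvar=False)
cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

def mahalanobis(x: np.ndarray) -> float:
    """Distance of a single feature vector from the training distribution."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate a threshold on the training data itself (here, the 99th percentile).
train_dists = np.array([mahalanobis(x) for x in train_features])
threshold = np.percentile(train_dists, 99)

def warning_light(x: np.ndarray) -> bool:
    """True if the input looks out-of-distribution, so predictions may be unreliable."""
    return mahalanobis(x) > threshold

in_dist = rng.normal(0.0, 1.0, size=8)    # resembles the training data
out_dist = rng.normal(5.0, 1.0, size=8)   # shifted far from the training data
print(warning_light(in_dist))   # usually False
print(warning_light(out_dist))  # True
```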
… we can’t wait to see you all there!
—
Foresight’s weekly salons:
Our community meets weekly on Thursdays at 11 AM PT / 8 PM CEST to explore cutting-edge topics & under-covered science.
We leave ample time for discussion and socializing so you can meet the brilliant speakers and fellow participants in breakout rooms.
Feel like leading an inspiring salon or taking us on a deep dive with your presentation? Nothing would excite us more: you can apply to be a speaker using this form.
Check out the video recordings of salons we have hosted so far here.