We’ll kick off with an overview by Aryeh Englander and follow with a focused presentation by Foresight Fellow Dan Elton on his paper “Self-explaining AI as an alternative to interpretable AI”. Robert Kirk will join them both for a panel and Q&A.
Introduction to AI Safety and Assured Autonomy
Aryeh L. Englander, Johns Hopkins University Applied Physics Laboratory
As AI has become increasingly powerful, researchers have turned to examining how to ensure the safety of AI-enabled systems for use in safety-critical applications. The AI Safety research community started out focusing on longer-term risks from very advanced AI, but more recently has expanded to include nearer-term applications as well. Meanwhile, the older field of Assured Autonomy has also started to turn its attention to AI-enabled systems. Most recently, the two fields have begun to talk to each other and to build a shared landscape and set of research agendas. This talk introduces some of the relevant technical safety challenges and gives a bit of background on the field.
Pitfalls with explainability techniques and self-explaining AI as a possible remedy
Daniel Elton, Ph.D., National Institutes of Health Clinical Center
There has been much research recently on techniques for explaining deep neural networks (DNNs) in the hope of improving trustworthiness and understanding what weaknesses they may have. However, the techniques for doing so differ wildly, and many have been shown to have major issues. Here I will tie the intrinsic difficulty of explaining DNNs to the phenomenon of double descent. Double descent implies that DNNs use “direct fitting”, which means they cannot extrapolate and function in ways that are hard to condense into a few sentences. I will then introduce the concept of self-explaining AI, which I explore in my paper “Self-explaining AI as an alternative to interpretable AI”. Additionally, to improve the safety and robustness of AI systems, I argue that all AI systems should include uncertainty quantification and “applicability domain analysis” (also called “change point detection” or “detecting outlier exposure”), which can provide a warning light to the user if the input data lie outside the distribution the system was trained on.
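To give a feel for the “warning light” idea, here is a minimal sketch (not the method from the paper): fit simple statistics on training-set features and flag inputs whose Mahalanobis distance from the training distribution exceeds a calibrated threshold. The synthetic features, the 99th-percentile threshold, and the function names are illustrative assumptions.

```python
# Minimal sketch of an out-of-distribution "warning light", assuming we have
# access to feature vectors for the training data (e.g. penultimate-layer
# activations). Not the speaker's implementation; purely illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for training-set features.
train_features = rng.normal(loc=0.0, scale=1.0, size=(1000, 8))

mean = train_features.mean(axis=0)
cov = np.cov(train_features, rowvar=False)
cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

def mahalanobis(x: np.ndarray) -> float:
    """Distance of a single feature vector from the training distribution."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate a threshold on the training data itself (here, the 99th percentile).
train_dists = np.array([mahalanobis(x) for x in train_features])
threshold = np.percentile(train_dists, 99)

def warning_light(x: np.ndarray) -> bool:
    """True if the input looks out-of-distribution, so predictions may be unreliable."""
    return mahalanobis(x) > threshold

in_dist = rng.normal(0.0, 1.0, size=8)    # resembles the training data
out_dist = rng.normal(5.0, 1.0, size=8)   # shifted far from the training data
print(warning_light(in_dist))   # usually False
print(warning_light(out_dist))  # True
```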
… we can’t wait to see you all there!
—
Foresight’s weekly salons:
Our community meets weekly on Thursdays at 11 AM PT / 8 PM CEST to explore cutting-edge topics & under-covered science.
We leave ample time for discussion and socializing so you can meet the brilliant speakers and fellow participants in breakout rooms.
Feel like leading an inspiring salon or taking us on a deep dive with your presentation? Nothing would excite us more: you can apply to be a speaker using this form.
Check out the video recordings of salons we have hosted so far here.