
An idea that I have been converging on in discussions recently spawned by ChatGPT is that, for the purposes of risk assessment and impact, we should think about modern AI/ML systems as highly skilled people.

What I mean here is not that these systems are alive or sentient in any way. I mean that, if you took the black-box component of your process, or workflow, or pipeline (some AI/ML model) and put a person there instead, how would your trust in the system change? Because we’re getting close to the point where your answer should be “not at all”.

Some people are talking about “value alignment”, which is part of this but focusses more on whether a superpowerful AI would do things we think are right or wrong. But what I’m trying to clarify with people is that it’s simultaneously not as bad as that yet, but also much worse than that already.

This is because modern AI/ML systems sometimes have success rates comparable to, or superior to, humans trained on the same tasks. But they also have just as high, or usually much higher, failure rates. They can lie, they can make invalid decisions, they can make mistakes in ways we can’t even imagine. That’s because their values and the ways they reason are not the same as our own.

That’s fine as long as we know what we’re dealing with. Using it for ranking advertisements, search results, or movies to watch is all fine. But should you trust it? Trust it any more than you would trust a human being, or a highly trained animal, with something safety-critical like driving or medical-anything? The correct answer, if you haven’t been paying attention, is NO.

The problem isn’t entirely that these systems fail more often than humans; they often fail much less, as in autonomous driving under good conditions. The problem is that we don’t expect engineered, expensive machines to fail at all.

But these ones do!

Some of these systems have reached a level of performance that only humans can match, but in doing so they have become fallible, just like us. So the problem isn’t that the AI/ML models are untrustworthy; it’s that we should not trust them any more than we trust a random person, or any intelligent animal, that we don’t know. (In fact, we should often trust them much less than that, but people want to trust expensive machines.) It will mostly be fine, but it could be arbitrarily bad sometimes. Once you know that, plan accordingly for how you want to integrate it into your INSERT_EXCITING_STARTUP_PLAN_HERE.


This blog post spawned from a longer response to this tweet from Gary Marcus and others: