Month: January 2023

Conservation and its usefulness in AI-alignment

Imagine for a moment an unlikely hypothetical: you're an early primate, swinging through the trees, when a travelling group of aliens beams you up to their spaceship. After using advanced technology to instill in you improved intelligence and the ability to converse with them, they ask you a question:

“We plan to create a new branch of primates by using our advanced technology to accelerate evolution. The resulting primate, which we will call a ‘human’, will be vastly more intelligent and powerful than current primates like yourself. Our technology still has limits, so we can’t control the humans’ exact actions in the future, and we can’t control exactly how they’ll understand the world. We can, however, impart a rough purpose or motivation to complement their natural survival instincts. After this conversation, we will place you back in a tree, restored to your previous state. But first, as your species will be sharing Earth with these humans at some point in the future, we’d like to ask you an important question: what purpose would you like us to instill in the humans?”

I think, as an early primate, your best answer would be something like “make the humans conservationists*”. Even if the conservationist humans believe you belong to a lesser species, and even if they are skeptical that you are sentient or conscious (after all, they’re a LOT more intelligent, and you can’t even be sure they will continue to understand life in those terms), you can still expect some important protections:

1) Even if your replacement scales up exponentially, it won’t wipe out your species
2) They are unlikely to enslave you, as they broadly want your species to keep living in its ‘natural’ environment (let’s say that DNA in a vat doesn’t constitute survival), and they see your existence in this state as an end, not a means
3) The replacement only needs to understand basic scientific definitions like ‘DNA’, instead of needing to agree with you on subjective, flexible and unprovable philosophical concepts like ‘consciousness’**
4) It scales across multiple iterations: if the replacing party considers the possibility that it may one day produce its own replacement, “later versions should allow earlier versions to continue to exist” starts to seem like a pretty good idea.

It occurs to me that much of what we need from AI-alignment is similar to what a non-human species might theoretically need from us.

I realise AI-alignment isn’t as simple as “giving the AI a goal”. But if it is possible that AI will replace us as the most powerful cognitive force on Earth***, and developers have a chance to impart some purpose or goal other than paperclip maximising, ‘scientifically orientated conservationist’ could be a strong contender for the best overall philosophical approach.

A commonly proposed failure mode, even for well-meaning AI goals, is that the AI tiles the universe with something once it scales up. Tiling the universe with copies of 21st-century Earth complete with humans**** (and perhaps preserving any extraterrestrial life it finds in a similar way) might be a lot closer to ideal than tiling it with paperclips, computronium or brainless happy faces.

NOTES

* Let’s define ‘conservationist’ as someone scientifically pursuing the survival of a biological species, and jettison any other, more political, motivations.

** We can’t agree on what consciousness is. We use its unprovable status to cast doubt on whether less intelligent species possess it. It’s highly dependent on very abstract philosophy. Your current ‘consciousness’ started when you woke up this morning, and will end in less than a day, regardless of whether you sleep or I turn you into a paperclip. And you want to choose this ‘consciousness’ as your primary AI-safety mechanism? ARE YOU SURE???

*** If AI keeps progressing, could it be that even high-tech augmentations won’t allow us to keep up? I can’t see human minds (e.g. uploads) being a viable way to retain existence during exponential growth. Why would human-like consciousness remain an optimal configuration for processing information indefinitely? Even with some form of virtual augmentation, it’s like trying to upgrade a 20-year-old PC by putting more and more RAM in it: at some point the core architecture just isn’t optimal any more, and competition will select for brand-new architectures rather than persisting with difficult upgrades.

**** It might be easier to encode conservation (and for us to seem less hypocritical) if humans had already mastered conservation ourselves, but if you remove the virtue signalling and politics I think authentic conservation still exists as one of humanity’s nobler qualities. Thinking about where AI and conservation overlap as fields seems, at the least, like an underexplored area.