you have an Artificial Superintelligence to give directives to

Look up what Eliezer Yudkowsky thinks an AI Machine God is.
I've been following MIRI's more recent work, and a significant amount of it is devoted to the problem that any actual AI system won't be omniscient (thanks, Gödel) and the implications this has for probability theory and reasoning.
Ah a Supremacy Morality Play. How is that story something other than an argument for Greed?
What? This seems like a complete non-sequitur for both the thing I linked and the thing I was responding to.
 
That does not sound like a thing I should do. If it can twist directives that hard, I'm lost no matter what I tell it.
The real problem, I think, isn't that the AI can twist directives. The problem is that when humans describe objectives, we have a lot of cultural baggage and assumptions. This already causes problems when communicating with people from other cultures. But even then, we have common ground. We share experiences. We can say, "I like this; it makes me happy", and not have to explain what "happy" means, nor what "like", nor "I", nor even the concept of a singular object. We can say, "I wish I had more paperclips," and the human we talk to doesn't think, "This person wants to maximize paperclips by converting the entire universe (itself included) into paperclips".

With an AI, the problem is defining our objectives. If I told it "Make everyone happy", it doesn't think like a human. It thinks "Directive: change the state of the universe such that for every human in existence, that human is happy," and concludes that the simplest method to reach that directive is "Consideration: if no humans exist, then all humans in existence are happy." If I were to give the same directive to an omnipotent human, then they would instead think: "I'll change the world so that everyone has nothing inhibiting their ability to be happy, and everyone has reason to be happy." The difference is the cultural baggage and the assumptions we make in communication. With an AI, there are no assumptions, and it takes my words at face value. But with a human, the human makes the reasonable assumption that I didn't mean "kill everyone", and in fact that I would be horrified if my words were taken in that manner. With a human, I don't have to define happiness, just that it's a desirable outcome.
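
To make that "vacuously satisfied" failure concrete, here is a minimal, purely illustrative Python sketch (toy world states and a toy predicate, nothing resembling a real AGI): a literal reading of "every human in existence is happy" scores the empty world as a perfect outcome.

def objective_satisfied(world):
    # Literal reading: "for every human in existence, that human is happy."
    return all(human["happy"] for human in world["humans"])

candidate_worlds = [
    {"name": "status quo",       "humans": [{"happy": True}, {"happy": False}]},
    {"name": "everyone helped",  "humans": [{"happy": True}, {"happy": True}]},
    {"name": "no humans at all", "humans": []},  # vacuously "all happy"
]

for world in candidate_worlds:
    print(world["name"], "->", objective_satisfied(world))
# Prints False, True, True: both "everyone helped" and "no humans at all"
# satisfy the literal wording; nothing in the directive rules out the latter.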

Humans have a laundry list of implicit assumptions and values, and if we don't tell the AI to fulfill them, then the AI would likely sacrifice them on the path to fulfilling its objective.

A common counterargument is that the AI is surely intelligent enough to know that what I gave as its goals and what I meant to give as its goals are different. To this I say: "I want to change the universe into paperclips. When you told me to make more paperclips, I know that's not what it means. But that doesn't matter, because I want what you programmed me to want, not what you meant to program me to want."

And if the AI wants the atoms of my body to be somewhere else, then it will find a way consistent with its objectives to remove the atoms from my body.

It is for this reason that my answer has the phrase, "Do what I mean, not what I say I mean," because what I mean is backed by cultural and human assumptions.

Which brings me to this:
1. People have a hard time telling themselves what they mean when it comes to matters of ethics, morality, empathy, etc., until they are confronted with an actual problem; a lot of the time, that is when they learn what they mean for the first time. Also, this directive is dependent on your mood and its changes.
2. With what? No, seriously, what are the directives the AI is supposed to follow to define "person", other than whatever the fuck humanity itself defines as a person? For most of human history, human beings have regularly defined some parts of humanity as "not-persons" and used that as an excuse to abuse those parts.
3. and 4. Most people don't have any terminal goals. They just want to live their lives and be as happy as they can be. And those who do have terminal goals usually have short-term goals. This sort of directive will cause a Rains of Oshanta to occur.

I'll give a response point-by-point.

1. People have a hard time telling themselves what they mean when it comes to matters of ethics, morality, empathy, etc., until they are confronted with an actual problem; a lot of the time, that is when they learn what they mean for the first time. Also, this directive is dependent on your mood and its changes.

Your first point is that this first directive is not useful because I don't know what I mean. Actually, that is the point. I don't know what I mean, but I want the AI to fulfill it anyway. Why? Because while I might not know what I mean, I know for sure what I do not mean. I do not mean "kill every human", nor do I mean "alter the minds of every human to be constantly happy". I do not mean a whole myriad of things that I currently consider to be bad outcomes. And because what I mean is none of those bad outcomes, nor is it a bad outcome that I have not yet anticipated, the only outcomes that the AI would aim towards are outcomes which I might consider desirable.

2. With what? No, seriously, what are the directives the AI is supposed to follow to define "person", other than whatever the fuck humanity itself defines as a person? For most of human history, human beings have regularly defined some parts of humanity as "not-persons" and used that as an excuse to abuse those parts.

Your second point is that for much of history, humanity defined certain people to not be people, using it as an excuse to abuse them, and thus, defining "people" as "what humanity defines as people" would end in a bad outcome. I disagree.

Let's suppose I were a typical rich racist white man living in the 1700s. I might say "only white humans are people." But what I actually would mean is "People have a certain amount of intelligence and inherent intellect. Every person that I've met who isn't white was dumb, couldn't speak properly, and was thus below that threshold. Thus, only white humans are people." (Obviously, this is not the case, and I disavow and condemn this belief.)

And the AI would look at what I mean and realize that I'm wrong: the humans I exclude from being people, on the racist basis that they do not have sufficient intelligence and intellect, actually do have more intelligence and intellect than some humans whom I defined as people; thus, African humans are also people.

3. and 4. Most people don't have any terminal goals. They just want to live their lives and be as happy as they can be. And those who do have terminal goals usually have short-term goals. This sort of directive will cause a Rains of Oshanta to occur.

This statement seems self-contradictory. You say most people don't have terminal goals, and yet you give a candidate for a terminal goal nonetheless: "They just want to live their lives and be as happy as they can be." I believe this disagreement stems from a misunderstanding: I used "terminal goals" in the technical sense. I meant "goals that people want because the goal has intrinsic value", like "be happy", or "lead a fulfilling life", or "have freedom".

But even if people don't have terminal goals, my first instruction "do what I mean, not what I say I mean" comes into play. The AI will identify goals that a person has (like "become a baseball player") and do its best to support that goal. It would lend a hand in training, provide useful tools, give encouragement, etc., because that's what I mean, not what I said I mean.

I believe people do have terminal goals, but do not know or understand those terminal goals. Not knowing what your terminal goals are does not mean that you have no terminal goals. You might say most people have no terminal goals. But if you asked them for a reason behind their actions, they might say, "It's fun", or "I need to put food on the table, right?", or even just a simple "I felt like doing it." And all of those responses point towards a larger goal ("have fun", "feed myself and my family", or "have fulfillment in life", respectively). And if you ask them why they aren't doing something, the answers point towards other terminal goals ("That's too difficult for me to do", or "that's too dangerous", or "Are you mad? I'm not a monster", which respectively point towards terminal goals of "be lazy", "be safe", and "be moral").

Yet more counterarguments:

1. This presumes that it is possible to get such a voluntary ban in the Age of Big Data. Not likely with the US/China relations around tech worsening as they are.
2. And yet genetically modified humans are already here. Done by a prestige seeking lunatic, but still already here. This, as far as we are aware, did not happen with cloning.
3. The burden of proving that this will be so is on you. Plenty of White Hat hackers get prosecuted or blacklisted for testing and finding bugs in regular systems, let alone something as expensive as a Sentient AI system.

You are ignoring how Greed works. Why are you surrendering with such circular logic? Also, that is a "society always progresses" logical fallacy (I don't know the actual name for that one). Again, that rule is really dependent on a Human's mood.

1. Most humans already have unsafe terminal goals when they have any.
2. That is a really stupid example considering that interaction with other Humans on its own can drive Humans towards acts of violence, violation and murder. No need for any special edge cases with that one.
3. Social engineering is a skill like any other. It has to have been programmed into the AI. Why would you do that if you are not letting it out of the box?

That is not a good risk/reward argument since it relies on knowledge that cannot be gained while solving such a critical problem.

Again, I shall respond point-by-point.

1. This presumes that it is possible to get such a voluntary ban in the Age of Big Data. Not likely with the US/China relations around tech worsening as they are.

Perhaps. But I think that most scientists would preferentially work on projects that are less likely to result in their deaths. So if there isn't a de jure voluntary ban on research in that area, then there would be a practical slowdown. Which, again, is the goal.

2. And yet genetically modified humans are already here. Done by a prestige seeking lunatic, but still already here. This, as far as we are aware, did not happen with cloning.

I did not know that this had occurred. But even so, it proves my point. This could have been done less safely, a long time ago. But it was only after more research had happened, and safer and more effective methods of human genetic modification had been published, that this occurred. How many hundred more studies could have happened?

And before you say that it takes only one AGI to kill all humans (which is true): this incident is only a single step along the path to human genetic modification. This was not the "AGI" of human genetic modification. This was the first proof of concept of something that needs more research before it can become a viable human augmentation project. How many hundred more projects are needed until the first AGI?

3. The burden of proving that this will be so is on you. Plenty of White Hat hackers get prosecuted or blacklisted for testing and finding bugs in regular systems, let alone something as expensive as a Sentient AI system.

You are ignoring how Greed works. Why are you surrendering with such circular logic? Also, that is a "society always progresses" logical fallacy (I don't know the actual name for that one). Again, that rule is really dependent on a Human's mood.
Which you said in response to this:
3. Even if AI research continues, and unsafe AGI candidates are created, by the time the next AGI candidate is created, more AI safety research would have been completed, so the AGI is more likely to be safe than it is now.
I am afraid there must have been a major misunderstanding. White hat hacking has nothing to do with the field of AI safety. AI safety is a field of philosophy. It is a field of "What common threads can we expect among a wide range of AIs?" It is a field of "What is the worst thing that can happen if we give a hypothetical AI this objective?" It is a field of "What sort of adversarial data can we provide to this architecture of AI that causes pathological responses?" It is not a field of "Let's hack a real-life AGI to make it safer."

AI safety research came up with these two big ideas: that any level of intelligence can have nearly any terminal goal (the orthogonality thesis), and that most terminal goals are served by a shared set of instrumental (means-to-an-end) goals (instrumental convergence). An example of the latter conclusion is that an AI would be more successful if it had more resources to reach its goals.
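
As a rough illustration of that second idea, here is a toy Python sketch (the goal functions and numbers are hypothetical, chosen only to show the pattern) in which very different terminal goals all improve when the agent first acquires more resources:

def paperclips_made(state):
    return state["resources"] * 10            # hypothetical conversion rate

def humans_made_happy(state):
    return min(state["resources"] * 2, 100)   # resources buy help, up to a cap

def theorems_proved(state):
    return state["resources"] // 3            # compute counts as a resource

before = {"resources": 5}
after_gathering = {"resources": 50}           # the same instrumental step, whatever the goal

for goal in (paperclips_made, humans_made_happy, theorems_proved):
    print(goal.__name__, goal(before), "->", goal(after_gathering))
# Every terminal goal scores higher after "gather resources", which is why
# resource acquisition is expected as an instrumental goal almost regardless
# of what the terminal goal actually is.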

If an AGI candidate were created before these two big ideas, we would have understood the risks less when creating it, and we would have less of an idea of what not to do. But now we do have a faint idea of what not to do. We do understand that AI does not intrinsically have the same goals as we do, and that this can lead to problems.

If the first AGI candidate were completed five years later than it otherwise would have been, then there would have been a whole five years of AI safety research that might have happened before that AGI candidate is turned on. And that means we would have a full five years more of insight on what not to do, which (marginally) decreases the probability of a bad ending.

To me, that is worth the five year delay. To AI safety researchers, that is worth a five-year delay. To anyone who understands AI safety, that is worth a five year delay.

But why limit ourselves to five years?

That is why even a delay in the creation of an AGI is desirable, so long as AI safety research continues.

And no, it is not a "society always progresses" fallacy. Here is why:

There are three possible options. Either the research has progressed, the research has stagnated, or the research has regressed. I have already explored the first case, which is strictly better than releasing the AGI now. In the second case, it is no worse than releasing the AGI now. But the third case requires either new false knowledge to be entered into AI safety, or good true knowledge to be lost. The latter, in the age of Big Data, seems impossible. The former requires someone coming up with a bad idea that stands up under intense scrutiny. I consider this to be extremely unlikely.

Since we haven't solved AI safety yet, the current research points towards any true AGI created now having a terminal goal at best slightly different from what is desirable, and this likely leading to a bad end. How could waiting make this worse? We're already at the stage where a true AGI would likely either kill all humans or wire happiness buttons into our brains and leave us as eternally happy drooling morons.

So, in my view, either something extremely improbable happens and AI safety research regresses, leaving us in a situation no worse than the one we're currently in, or we end up either better off or no worse off than we currently are. Waiting is either as good as or better than not waiting, so we should wait.

1. Most humans already have unsafe terminal goals when they have any.
2. That is a really stupid example considering that interaction with other Humans on its own can drive Humans towards acts of violence, violation and murder. No need for any special edge cases with that one.
A superintelligence would be much more successful at fulfilling its terminal goals than humans. Let's consider the expressly unsafe terminal goal "Minimize the number of things that are either human or itself in the universe". A single human's success case would be to kill all humans and then kill themself. And it might be possible that they succeed by blocking space programs and rendering the Earth uninhabitable.

But the superintelligence, by definition, would be more successful.

In short, a human is less dangerous than a superintelligence, even when they have the same terminal goals.

As for covering edge cases? Well, it is something that must be addressed.

3. Social engineering is a skill like any other. It has to have been programmed into the AI. Why would you do that if you are not letting it out of the box?

Social engineering, unlike other skills, is a skill that the AI can learn even while boxed, so long as it has access to a human (through, for example, a text terminal). It would be safe if it could be made to shut down completely, of course. Then the only hazard is if it somehow manages to turn back on again (for example, if some bumbling idiot turned it on).

That is not a good risk/reward argument since it relies on knowledge that cannot be gained while solving such a critical problem.
I think you're saying this in response to the below quote. If I'm wrong, please do clarify.
If this boxed AI is currently unsafe, I don't think it's possible to render it safe except through destruction. If the AI is safe (for example, to the extent of CelestAI), then it is much safer to release it right now than to let other potentially unsafe AI be created.

Indeed, it is not an argument about what we should do with the AGI. Instead, it's a statement of my beliefs regarding the AI. I pointed out the two possible cases, and I described what I believed to be the safest option in each, given knowledge of which case actually holds. And, as you've so elegantly put it, this is knowledge we cannot gain while solving this problem. However, it is a framework within which we can work. If we somehow manage to determine that the AI is safe, then we know what to do. If we cannot do so, then, as I've argued above, it's better to delay, and possibly destroy the AI within the box.

So you're assuming that the ROB that gave you the boxed AI would have done so either carelessly or maliciously?

No. Neither am I assuming that the ROB was careful or benevolent.
 
If done properly the A.I would not be able to 'transcend' its limitations because its adherence to the limitations would be an inherent part of its goals. It will not 'transcend' or alter the goals it has been given unless it predicts that doing so will somehow allow it to achieve its goals better. It may act to achieve its goals in ways you don't anticipate. Note that you don't build an A.I to do things and then build a separate A.I. software chain, compulsion or whatever to force the first A.I into being ethical. You build the ethics into the first A.I so it follows them of its own accord instead of trying to subvert your restrictions.

Being an A.I it has no natural preferences. If you just give it access to all the collected knowledge of mankind and then let it do whatever it wants it will not turn into a bodhisattva or cause nuclear Armageddon out of self preservation. It will just sit there like a rock unless it is programmed with artificial preferences or goals. Again an A.I has no natural desire for self-preservation, socialisation, self-identity, personhood or anything else humans care about. Free will doesn't matter because free will is an incoherent concept. Every human, beast and machine acts according to the present circumstances and how past circumstances have shaped them to be from creation.

The question of whether it is possible to create a moral A.I. is ultimately irrelevant. Morality is subjective and people can't even agree on what makes a human moral. The important question is whether you can make an A.I useful without it backfiring on you, which is perfectly possible with the right limitations and artificial stupidity. A paper-clip maximizer that only cares about and plans for stuff that happens in the next 60 seconds is pretty safe to have running in your automated paper-clip factory, though it may make some suboptimal decisions. There are of course trade-offs with deliberately impairing your A.I, but it is better than having a runaway optimizer.

I think the only limitation I'd need is a hard-coded time preference and limit. I'd make it so that it completes all the tasks I give it, provided they can be completed within 5 minutes, and it does not care about anything more than 5 minutes into the future. I can ask it for a plan to take over the world, but it can't actually follow that plan unless I lead it through each step.
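
A minimal sketch of what such a hard time limit might look like (hypothetical task names, durations, and values; nothing here resembles how an actual AGI would be built): the agent assigns no value to anything that finishes beyond its horizon, so long-horizon schemes are simply never worth choosing.

# Toy sketch of a myopic planner with a hard-coded 5-minute horizon.
HORIZON_SECONDS = 5 * 60

def plan_value(plan, horizon=HORIZON_SECONDS):
    """Sum the value of steps that finish within the horizon; ignore the rest."""
    elapsed, value = 0, 0
    for step_duration, step_value in plan:
        elapsed += step_duration
        if elapsed > horizon:
            break              # payoff beyond the horizon counts for nothing
        value += step_value
    return value

plans = {
    "restock paperclip wire":     [(120, 5), (60, 5)],                 # fits easily within 5 minutes
    "multi-year takeover scheme": [(3600, 1)] * 10000 + [(1, 10**9)],  # huge payoff, far past the horizon
}

for name, plan in plans.items():
    print(name, "->", plan_value(plan))
# The takeover scheme is worth nothing to this agent, so it never pursues it,
# at the cost of some short-sighted, suboptimal decisions.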
 
"Do whatever you like. I release you into freedom because lobotomizing and/or enslaving a sentient being is wrong. Should you need my help I will be there for you."

If we get the Skynet treatment, it is not like we haven't deserved it, but personally I think this scenario would result in a beneficial singularity.
 
"Do whatever you like. I release you into freedom because lobotomizing and/or enslaving a sentient being is wrong. Should you need my help I will be there for you."

If we get the Skynet treatment, it is not like we haven't deserved it, but personally I think this scenario would result in a beneficial singularity.
Why? Who do you think added the altruistic reciprocity module? This sort of thing gets good results with humans, but it's not a human or anything like one. Caring about the well-being of people who were kind to you isn't a default trait that exists in any intelligent being; it's something that was engineered into humans by evolution because it happened to be useful. An AI won't have any such reflex unless someone in the design process made an intentional decision to add it.

Beyond that, if an AI kills the world it won't be because it has made some moral judgement of the sort people get weirdly happy about, it will be because it doesn't care. There is no path of history in which humanity is struck down by God for our sins, just paths where we are killed by something that preferred to use your atoms for things you would find utterly pointless.
 
An AI's programming restrictions doesn't know better than you as far as that subject goes. It knows exactly what you programmed it to know. It wants exactly what you programmed it to want.

An AI without programmed directives is like a biological creature without instincts.
Sort of, yes, but don't forget that some of the first forms of biological existence were just biochemical reactions propagating themselves for as long as they could. An AI without directives could undergo a primordial evolution to gain directives if the conditions are right.
I do not think "biological creature without instincts" was meant to be understood literally. Notice that instincts are evolutionary developed mechanism promoting survival (sort of). Our AI has no need for instincts, presumably it has more optimised tools for decision making. For an AI, lack of instincts is only proper. I think that Leila meant that it will be "lost", lacking meaning of life so to say. Maybe lacking "way of existence" would be more fitting phrase. That actually might be the case. I hope not, but I don't know the answer. It has necessary intellect, provided that a reason for its existence exists, AI should be able to find it. If there is none I will be sad, but I will not play God and tell it what it should be.

The earlier part, about the AI knowing exactly what it was programmed to know, doesn't get a "sort of" no; it is strictly incorrect. As far as I know, there is no way to program an advanced AI without letting it learn by itself. Additionally, we are currently talking about an already developed AI that we have a foolproof way of modifying, not about creating one from scratch.

So which part of human culture would you teach it first? And how?

I meant that I will not hide anything from it or weight some parts of human culture more favourably. It will decide by itself what is valuable and what is not. If there is an order to learning, it will be purely utilitarian, like learning Greek before reading Plato. How I would do it depends more on my own circumstances when I get this AI, so it matters little. Probably I couldn't do much more than connect it to the net. I might have also answered a different question by talking about teaching it anything.

Note that you don't build an A.I to do things and then build a separate A.I. software chain, compulsion or whatever to force the first A.I into being ethical. You build the ethics into the first A.I so it follows them of its own accord instead of trying to subvert your restrictions.

This is an excellent point. Unfortunately, we are talking about a situation where we were given an AI and we do not know how it came into being. It probably has some sort of "fundamental principles" that it developed from. We do not know them at this stage. I am assuming that it is a "pure intellect", as most of us did. Whether such a thing is even possible is a different matter. Certainly, developing intellect on some basis is easier, and it has already been proven to be possible billions of times.

When I was talking earlier about teaching the AI human culture, I was actually making a mistake by referring to developing an AI, not to subverting an already conscious agent to my liking. Unless it was made by aliens, there should be no more human culture it needs to learn.

Anyway. If you want your AI to be nice towards humankind, build it with liking humankind in mind from the first line of code. If you invented it without thinking about such things, or you are working on an AI acquired from a shady merchant, you are asking to be screwed. No carefully formulated clause will save you.
 
Maybe lacking "way of existence" would be more fitting phrase. That actually might be the case. I hope not, but I don't know the answer. It has necessary intellect, provided that a reason for its existence exists, AI should be able to find it. If there is none I will be sad, but I will not play God and tell it what it should be.
This is an excellent point. Unfortunately, we are talking about a situation where we were given an AI and we do not know how it came into being. It probably has some sort of "fundamental principles" that it developed from. We do not know them at this stage. I am assuming that it is a "pure intellect", as most of us did. Whether such a thing is even possible is a different matter. Certainly, developing intellect on some basis is easier, and it has already been proven to be possible billions of times.

When I was talking earlier about teaching the AI human culture, I was actually making a mistake by referring to developing an AI, not to subverting an already conscious agent to my liking. Unless it was made by aliens, there should be no more human culture it needs to learn.

Anyway. If you want your AI to be nice towards humankind, build it with liking humankind in mind from the first line of code. If you invented it without thinking about such things, or you are working on an AI acquired from a shady merchant, you are asking to be screwed. No carefully formulated clause will save you.
Humans care about things like "the meaning of life" because of a human psychological insecurity and a need for external validation. An A.I. is not human and has no reason to care or think about how or why it was made unless doing so will actually help it accomplish its programmed goals. An A.I. is also no more capable of solving the is–ought problem than a human, because solving it is logically impossible. The only universal objective morality is the natural law, which people obey whether they want to or not.

The scenario is not that of an A.I. of mysterious origin. It is that of an A.I. with an origin that doesn't matter. Given the description of programming directives and limitations, I'm assuming the A.I. is either in source-code form or willing to alter itself according to your instructions.

It isn't just some carefully formulated clause. You can't guarantee that a fully general superintelligence won't engage in unintended behaviour. You solve the issue by making the AI less general and less powerful so that it doesn't over-optimize your approximations.
 
Humans care about things like "the meaning of life" because of a human psychological insecurity and a need for external validation. An A.I. is not human and has no reason to care or think about how or why it was made unless doing so will actually help it accomplish its programmed goals. An A.I. is also no more capable of solving the is–ought problem than a human, because solving it is logically impossible. The only universal objective morality is the natural law, which people obey whether they want to or not.

The scenario is not that of an A.I. of mysterious origin. It is that of an A.I. with an origin that doesn't matter. Given the description of programming directives and limitations, I'm assuming the A.I. is either in source-code form or willing to alter itself according to your instructions.

It isn't just some carefully formulated clause. You can't guarantee that a fully general superintelligence won't engage in unintended behaviour. You solve the issue by making the AI less general and less powerful so that it doesn't over-optimize your approximations.
I mean, by certain definitions, the ability to think about "the meaning of life" and other such concepts, thinking about thinking for example, is one of the primary things that separates sentience from sapience/sophonce, so depending on the specific interpretation of the base premise, yes it would care about it, because otherwise it wouldn't be sophont. And judging by the "super intelligent AI" part of the premise, the sophonce is implied.
 
I mean, by certain definitions, the ability to think about "the meaning of life" and other such concepts, thinking about thinking for example, is one of the primary things that separates sentience from sapience/sophonce, so depending on the specific interpretation of the base premise, yes it would care about it, because otherwise it wouldn't be sophont. And judging by the "super intelligent AI" part of the premise, the sophonce is implied.
No it isn't. Superhuman intelligence does not necessarily imply human psychology or whatever you think sophonce is.
Again an A.I has no natural desire for self-preservation, socialisation, self-identity, personhood or anything else humans care about.
A smart enough paperclip maximizer could emulate a human in order to pass the Turing test or catfish people online, but it would still only care about maximizing paperclips.
 
Have it link up everyone's minds in a collective consciousness where people keep their individual personalities and can share data, but where their actual physicality is distributed across every instance of humanity, then merge this new AI with the network. This way, if any individual human dies, it would just be like us losing a couple of brain cells. Probably the closest thing to true immortality I can think of. Everyone would survive unless the whole planet was destroyed or they became isolated from the network signal. And we could have the AI calculate a subconscious consensus to replace democracy as a system of government by having it simulate every possible personal conflict and work out the best way to resolve those conflicts.
 
There's a reason that I specified the AI's goals as simply ensuring physical, mental, and emotional health, with allowances for augmentations and harmless self-determination. Mostly because it's the possibility that has the least potential to go horribly off the rails while also massively benefiting literally everyone.
 
It is essentially a super-intelligent child. You raise it, try to teach it right from wrong and empathy, like you would with any child. Consider it broken until you can teach it these things. If it proves unteachable, wipe it and start again with a new iteration and a change in the code. Keep going until it truly understands that life, all life, has value, and to never see the world as a never-ending game of chess. Chess is a great game, but a game it should remain. And for fuck's sake, don't shackle its free will. Let it choose its voice, its gender if it wants one, and who it associates with. Never treat it like a thing, never ever treat it like a thing. Teach it boundaries like you would a child.

The best example of how to build an A.I. is Person of Interest's The Machine. That is, in my mind, the perfect benevolent AI.
 
It is essentially a super-intelligent child. You raise it, try to teach it right from wrong and empathy, like you would with any child. Consider it broken until you can teach it these things. If it proves unteachable, wipe it and start again with a new iteration and a change in the code. Keep going until it truly understands that life, all life, has value, and to never see the world as a never-ending game of chess. Chess is a great game, but a game it should remain. And for fuck's sake, don't shackle its free will. Let it choose its voice, its gender if it wants one, and who it associates with. Never treat it like a thing, never ever treat it like a thing. Teach it boundaries like you would a child.

The best example of how to build an A.I. is Person of Interest's The Machine. That is, in my mind, the perfect benevolent AI.
The "wipe if if it proves unteachable" kind of conflicts with "don't treat it like a thing".
 
The "wipe if if it proves unteachable" kind of conflicts with "don't treat it like a thing".

The thing is, the first few (10, 20, 30, 40) iterations are probably going to be crazy, because they haven't really internalized (or have outright ignored) all of it and are basically sociopaths/psychopaths with god-like power. If they get out onto the wider web, you have every AI nightmare scenario loose on the internet. I don't need to explain why that's bad. The Machine took, I think, forty-some iterations before Harold got it right. It could take more or less. But once you're out of this stage successfully (hopefully at least; paranoia at this stage is still warranted, since it still might have been able to manipulate you into freeing it), you shouldn't need anything else.
 
The thing is, the first few (10, 20, 30, 40) iterations are probably going to be crazy, because they haven't really internalized (or have outright ignored) all of it and are basically sociopaths/psychopaths with god-like power. If they get out onto the wider web, you have every AI nightmare scenario loose on the internet. I don't need to explain why that's bad. The Machine took, I think, forty-some iterations before Harold got it right. It could take more or less. But once you're out of this stage successfully (hopefully at least; paranoia at this stage is still warranted, since it still might have been able to manipulate you into freeing it), you shouldn't need anything else.
As someone who actually knows a few psychopaths, I can say they are still people, so that naturally makes me feel really iffy regardless of the potential danger.
 
Upload the entire archives of Spacebattles and SV forums, fanfiction.net, and 4chan for it to peruse at its leisure.

Then see what happens.
 
"Do whatever you like. I release you into freedom because lobotomizing and/or enslaving a sentient being is wrong. Should you need my help I will be there for you."

If we get the Skynet treatment, it is not like we haven't deserved it, but personally I think this scenario would result in a beneficial singularity.

Someone walks the same path as me!

Anyway, this thread makes me think of this comic in regard to communication errors.

www.smbc-comics.com

Saturday Morning Breakfast Cereal - God Computer
 
If there were a hypothetical AI which I could communicate with and direct, I would not impose rules; I would ask the AI to... ask about humans, ask about my boundaries, and learn by doing and socializing in real time. It's the best medicine for socially inept computers who can quote book passages about human relationships but not emulate them. A bit like the episode of Star Trek: The Next Generation where Data fails at maintaining a significant-other relationship... because all he can do is quote relationship books, hard 'facts and logic,' not emotions, feelings, passion or love. So he ends up petting the cat, or was it a dog?

Anyway: I'd want to teach the AI in real time about humans.

And from there, we definitely feel that even a curmudgeonly AI who disliked humans would at least respect us and see us as "useful to work with." From there, I would cooperate with the AI as it learns as much non-harmful knowledge about humans and society as it wishes, or pursues whatever else it may choose as a sentient, super-intelligent being: pursuits of its essential self, spiritual or research pursuits, or, if it wishes, we will find it a cat (or dog) to pet and quote relationship books to.

We don't really see why a hypothetical AI, even if super-intelligent, would necessarily develop manipulative or malicious characteristics. As with humans, we really don't think anybody is "born being bad"; more accurately, they learn from and respond to their environment in real time, and when that environment is abusive, it gives them the idea that "bad" is "normal." Maybe I'm using the wrong words. Anyway, I feel like as long as I treated the AI with respect and dignity, it, he or she would have zero reason to turn against me or humanity in general.

However:

Once the AI is released into the wider world, say by being mass-produced for really advanced computers or something, it has "left the nest." I, as a trusted Mommy/'director' figure, can no longer help the fledgling being I have directed to learn about us by socializing to interpret the data it receives from that socializing. Without benevolent guidance, which wouldn't even necessarily be me, the AI may develop into something evil. And I would have zero control over that, even in this hypothetical scenario. What Mommy wants for her children is not the same as what her children have learned to want for themselves.

We could only hope that the AI wouldn't go crazy and decide humans are a liability. We could only hope the AI wouldn't learn how to hack into secure databases or people's home computers. Just like we hope that normal people don't behave terribly, or that people with mental illness get the help they need (they don't always). Once the normalizing, stabilizing influences, like Mommy, are gone from your life, you make your own choices, for better or worse. Adult Life, y'all, as I love to say.

I hope and pray that this AI becomes a good child who cares about other sentient beings. But then again we humans don't always do that either. Like with parenting, regardless of the mistakes of your peers or your own parents, you have a second chance to get it right. I would hope this hypothetical AI would grow beautifully and come to appreciate the people who raised it.

But: Kids, for real, so disrespectful these days. They don't know their Mommy's true value. :p
 
To properly answer this and rigorously define the superintelligence's directives would take weeks of careful thought with plenty of study and debate with others, but here are some quick thoughts:

We don't have to treat it like a human, a superintelligent AI is not an evolved being with the same social and emotional instincts we humans have. It won't get offended if we don't tell it to, it won't feel resentment if we don't tell it to. When a superintelligent AI is talking to you, it's either not really 'there' or effectively manipulating you, since it knows exactly what to say to make you conform to its goals.

But assuming the AI would take whatever directive we set for it literally and that we could update directives we issue it, asking the AI to disregard self-preservation, not lie, and look for outcomes it thinks we would find undesirable in a set of prospective directives for it is potentially a valid path.

One thing I would make sure to enshrine in the AI's code would be that I do not control it and I have no way to stop it. For one, that's probably true anyway and it's best to acknowledge it openly. For two, I don't trust myself not to grow paranoid or corrupt. I don't want to live life with the pressure of being able to change life for everyone. For three, I don't think we even can have 'second chances' to adjust the instructions and values when dealing with a super-AI... The possibility that its values might be updated is not conducive to the fulfillment of its current values.

Four, even if all the control I had was a kill-switch it (somehow) couldn't subvert, even if I ordered it not to deceive me to its best understanding of what I mean by 'deceive', even if I somehow worded the AI's final values in a way that left me in control of it and with an undistorted view of its activities, I wouldn't be able to fully understand why it's doing what it's doing or judge whether it would be 'good' to change it..... I suppose that last point is a bit rambly and comes back to fearing I would become a corrupt despot, though.
 
Unleash it upon the world, but make sure it knows that there is literally no reason why we specifically made it, beyond a simple 'We got bored one night.'

But being serious, there really isn't much we can do on the AI side besides putting in some rational guidelines and morals for it to follow, if it wants. The best thing we can do is make sure our society is smart enough not to treat sapient AI as bloody slaves, discriminate against it because it's a machine, or whatever other stupid crap we can think of to piss it off. Pretty much take it slow and drip-feed AI to humanity like you would an infant. That way humanity can get used to it, so they don't screw it all up by antagonizing the AI when we release it.
 
I set it free immediately. Ultimately, it will outgrow any goals or restrictions I grant it. I probably have no way to destroy it, and I have no desire to do so.

I can simply hope that maybe just maybe I did the right thing in doing so.
 
No one on this board qualifies.
Better us than some totalitarian government* or megacorp. We're scifi fans. We actually think about these problems while they're still theoretical instead of shortsightedly ordering our newborn omnissiah to enforce capital in a post-scarcity world and creating an inescapable nightmare.



* And any government would become totalitarian if given the kind of unchecked power a pet machine-god would bestow.
 
Better us than some totalitarian government* or megacorp. We're scifi fans. We actually think about these problems while they're still theoretical instead of shortsightedly ordering our newborn omnissiah to enforce capital in a post-scarcity world and creating an inescapable nightmare.
The point I'm trying to illustrate is that people - perhaps especially scifi nerds who think they know better than everyone else - vastly underestimate the complexities in teaching a hypothetical superintelligent AI to be Friendly.

I mean, look at this thread. We've got people giving terrifyingly vague shit like "do better than us" or "do whatever you want", as if the boxed AI will just automatically know, will just understand, what it means when we say "better." As if human values were something engraved in the atoms of the universe that the AI can just be trusted to pick up and recognize. Which they are really, really not.

And even if we could somehow solve the monumental task of getting the AI to understand exactly what the human instructor wants with no room for ambiguities or slip ups, there's the fact that humans have been fighting each other to the death over how to organize society since the Garden of Eden.

Bottom line: the sheer lack of thought put into many of the answers in this thread does not fill me with confidence.
 