Ted Chiang: ChatGPT Is a Blurry JPEG of the Web

You, um, may want to consider what that actually means. It is saying the AI is generating an image from random noise by denoising it based on how it was trained. Which is to say it is attempting to reconstruct some facsimile of those images through the process. It doesn't work through brushstrokes, lines, etc, but it's building 'the most plausible image' based on the prompt, the random noise map, and any other restrictions fed into it. ...And it defines the most plausible image as reflecting what it learned from the training set. Corny isn't exactly correct that it is recreating things bit by bit, because it doesn't have the memory to do that much. But it is trying to create something that could plausibly have been in the training set. That's how these things work.
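To make the denoising description above concrete, here's a toy sketch of that iterative loop. Everything here is hypothetical illustration, not real model code: `denoise_step` stands in for the trained noise-prediction network, and the "image" is just a short vector.

```python
import random

def denoise_step(latent, prompt_target, alpha=0.2):
    # A real model predicts and removes noise conditioned on the prompt;
    # this stand-in just moves the latent a fraction toward a target vector.
    return [x + alpha * (t - x) for x, t in zip(latent, prompt_target)]

def generate(prompt_target, steps=50, seed=0):
    rng = random.Random(seed)
    # Start from pure random noise, as described above.
    latent = [rng.gauss(0, 1) for _ in prompt_target]
    for _ in range(steps):
        latent = denoise_step(latent, prompt_target)
    return latent

# The "most plausible image" is whatever the conditioned steps converge on.
result = generate([1.0, -1.0, 0.5, 0.0])
```

The point of the sketch is the shape of the process: noise in, repeated conditioned nudges, plausible-looking output out. Nothing in the loop looks anything up in a stored image collection.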
I don't know why you think you're pointing out anything noteworthy here, but mostly yes. The last deserves a clarification, though - a well-trained model (which is not every model) is intentionally engineered to not memorize and regurgitate the training set, but rather obtain what I might call a numerical 'vibe' for what training-set-like things should be. So yes, it's trying to create something that (as it sees it) could plausibly have been in the training set - but not something that was in the training set.
So... humans are just meat machines with no sense of self? Or artists are all copying hacks?

I mean someone could have shitty enough taste to go "yeah AI slop is good enough for me", even now when it is laughably bad at even the form of art, but dang man, let those of us who have some taste have art that isn't just slop.
I said literally nothing about the quality or adequacy of AI art. What I said was that there's no 'life' in the words or colors separate from the raw data. The work doesn't know its own history. If an AI tells a good story, it's still a good story. I don't know whether an AI has ever told a good story, but that's irrelevant to meeting the philosophical claim on its own rarefied level.

Or about the sense of self of artists. I don't consume anybody's sense of self, that sounds like a weird urban fantasy monster thing.
 
I don't know why you think you're pointing out anything noteworthy here, but mostly yes. The last deserves a clarification, though - a well-trained model (which is not every model) is intentionally engineered to not memorize and regurgitate the training set, but rather obtain what I might call a numerical 'vibe' for what training-set-like things should be. So yes, it's trying to create something that (as it sees it) could plausibly have been in the training set - but not something that was in the training set.
The trouble is that is a distinction without a difference. The image generator doesn't know what was in the training set, exactly. But a surprising amount of the training data can be reconstructed with appropriate queries. But that's still not relevant to my point, which is that the image generator is trying to recreate something that was plausibly in the training set, and hence is ripping off the training set to do it.

I'm aware you don't see a difference between a human using references and this. That is a very strange point of view to me.
 
Part of the problem with saying things like "AI can only copy" or "AI makes a collage" is that it represents both the learning process and the capabilities incorrectly.

It makes it sound like if you say "sleeping green dragon" it will look through an image set of green dragons until it finds a nice picture then add random alterations to conceal what it did. And that's not how any of that works.
 
The trouble is that is a distinction without a difference. The image generator doesn't know what was in the training set, exactly. But a surprising amount of the training data can be reconstructed with appropriate queries. But that's still not relevant to my point, which is that the image generator is trying to recreate something that was plausibly in the training set, and hence is ripping off the training set to do it.

I'm aware you don't see a difference between a human using references and this. That is a very strange point of view to me.
ISTM the question of what 'ripping off' means to you is rather central if one wanted to actually understand your objection.
 
which is that the image generator is trying to recreate something that was plausibly in the training set,

It still sounds like you're working off of a flawed analogy here. If you ask a SD model to produce an image of a sunset, it doesn't try to recreate an image of a sunset. It tries to gradually push whatever latent image you've fed it towards what it recognizes as "sunset". (Remember, this whole thing started with vision models.) Usually, that latent image starts as noise, but you could just as easily feed pictures of your carpet, the inside of a computer, a drawing of a cow, or screen captures of major public events that occurred after the model was trained, and it's going to try to make it look more like a sunset, based on a mathematical description of every sunset the model has been trained on. And then you can have it do that for a bit, then start pushing it towards astronaut. Or horse. Maybe a mermaid? Do too much of that and you might end up with something that doesn't look like anything at all...
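A toy version of that "push the latent toward a concept" idea, with made-up vectors standing in for concepts. None of this is real Stable Diffusion code; it only illustrates that the starting latent and the conditioning target can both be swapped freely, including mid-run.

```python
def push_toward(latent, target, alpha=0.2):
    # Stand-in for one conditioned denoising step.
    return [x + alpha * (t - x) for x, t in zip(latent, target)]

def sample(start, targets, steps_each=25):
    latent = list(start)
    for target in targets:            # e.g. "sunset" first, then "astronaut"
        for _ in range(steps_each):
            latent = push_toward(latent, target)
    return latent

carpet = [9.0, -3.0, 7.0]             # a non-noise starting latent
sunset = [1.0, 0.5, 0.0]
astronaut = [-1.0, 0.0, 2.0]
out = sample(carpet, [sunset, astronaut])
# The result lands near the last target, regardless of the starting image.
```

Feed it your carpet, a cow, whatever: the loop doesn't care, it just keeps nudging toward whatever concept it's currently conditioned on.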

So exactly who owns the concept of sunset?

This ties back to why, although I am very firmly in the camp of "copyrightable style would be Very Bad", I'm generally dubious of style loras based on individual artists and have some self-imposed rules about them. Because that's something that can reasonably be recognized as theirs in a way that does not apply to things like "cow" or "duck", let alone to cowducks, which probably do not feature very prominently in anybody's training data.

-Morgan.
 
Please explain how it is not literally doing that?
The training datasets are measured in terabytes. The resulting diffusion program is measured in gigabytes. What you say here:
The machine training is physically taking the *physical* pieces of art that are produced by the artists, and then physically, exactly, reproducing the precise component of those arts.

I'm not being poetic about it. I mean it's *literally* reproducing parts of all the images. Like, the images are *physically used* to chart the machine mathematics. It isn't taking "the style", it's taking the *actual* work of an artist and using that work as a *literal* asset in its image. It isn't making images based on a style, it's taking *actual* images and then manipulating them.

When we say it isn't generating anything new, I'm not being poetic, it *literally* can't generate anything new outside its data source.
Is just literally impossible. The diffusion program cannot be storing whole images for replication. The diffusion program cannot be storing parts of images for the purposes of collaging them together. What the diffusion program stores is patterns and associations learned through study of the training dataset. It does not remember specific images, only concepts. It does not have access to the training dataset after training has finished, and programs like SD don't have connection to the internet either.

This is actually very similar to how humans learn. Humans don't have memories in the sense that libraries have books. Humans do not literally store complete jpegs of things they see, they don't store compressed or partial jpegs either. They just simply build up webs of patterns and associations that allow them to construct imagery from concepts. This sort of compression allows us to 'remember' far more than we'd actually be able to remember.

Now obviously human patterns and associations can be more philosophical and less mathematical, but if you look at a character design reference (like this or this) you will find that they provide a lot of the same associations an AI would and could learn, like what specific colors are used for the character, what parts of the character are those colors, what are the shapes and proportions of those parts, and what the character looks like when rotated. Those things can definitely be reduced to mathematical associations.
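The size argument a few paragraphs up can be checked with quick arithmetic. The figures below are rough, assumed public numbers, not exact: roughly 2 billion images in a LAION-scale training set, and roughly 4 GB for a Stable Diffusion 1.x checkpoint.

```python
# Rough, assumed figures: ~2 billion training images, ~4 GB of weights.
images = 2_000_000_000
model_bytes = 4 * 1024**3

bytes_per_image = model_bytes / images
print(f"~{bytes_per_image:.2f} bytes of model weight per training image")
# At ~2 bytes per image there is no room to store copies or even
# thumbnails - only statistical patterns shared across many images.
```

Even if the real numbers are off by an order of magnitude in either direction, the conclusion holds: the weights are orders of magnitude too small to contain the training images.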
 
The whole theft argument is nonsense in the face of the acceptance of incredibly expansive limits for fair use in art generally by the exact same individuals who argue AI art is theft.


There is no possible standard where the above image I drew is not theft but the results you get googling "AI image Rei Ayanami" are.

You can of course simply eliminate or devalue the concept of fair use but as an artist I honestly believe that particular medicine will be worse than the disease.

The theft argument is also doomed to be a moot point soon, because Adobe, Facebook, and a variety of stock photo and image hosting sites now have the explicit permission of everyone who uses their software to use their images in AI training. Even if it's not enough data to train on now, it will be very soon.
 
It is frankly confusing how you are ignoring literally all the indie artists telling you you are wrong. It comes across as very much "I know what is good for you better than you do"
All indie artists oppose AI when one takes the position that anyone who uses AI is not an artist, yes.

I will reiterate that my lived experience of the pro-AI crowd is "look at this cool thing, let's have fun, and make cool things" while my lived experience of the anti-AI crowd is "that's not art and you aren't an artist, we hate your artistry but will endlessly insist that it is you that hates art and artists, please go away, also you are a criminal or would be if the laws matched my ideological preferences". It is my lived experience in part because the people in threads like these chose to make it my experience. Then they tell me that my experience isn't what I experienced.
 
All indie artists oppose AI when one takes the position that anyone who uses AI is not an artist, yes.

I will reiterate that my lived experience of the pro-AI crowd is "look at this cool thing, let's have fun, and make cool things" while my lived experience of the anti-AI crowd is "that's not art and you aren't an artist, we hate your artistry but will endlessly insist that it is you that hates art and artists, please go away, also you are a criminal or would be if the laws matched my ideological preferences". It is my lived experience in part because the people in threads like these chose to make it my experience. Then they tell me that my experience isn't what I experienced.
... so, your pro-AI crowd viewpoint is unbelievably naive.
I mean, it honestly sounds like you are either cherry picking that term to mean the people with that opinion instead of everyone who is pro-AI, or... actually just that. It just sounds like you are defining "pro-AI" to be "people who think AI is cool and fun", and leaving out the pile of people who are actually making AI that are saying "hey, wouldn't it be great if we didn't need to pay people".

It also is very telling that you are also saying that all anti-AI people are saying that they aren't artists. It tells me that you have a single, singular focus when it comes to what parts of this you think are "good" and "bad", and you refuse to even consider anything that falls outside that focus.
Either it helps make more people create things, or it is bad. Any other factors don't seem to enter consideration with you, and it is really annoying because it misses half the conversations that attempt to engage with you and leaves the rest of us feeling like we are talking to a brick wall.
 
I certainly get tired of my hobbyist AI image generation getting constantly accused of bad stuff, like firefossil says, mm. Nor do I post the results anywhere but an IRC channel with friends, or occasionally share pretty pictures with an artist cousin of mine via Discord, so I don't think I'm quite the target group the complaints are going towards.

That'd be twitter or other social media sites where people spam them :p

That said, I feel like firefossil going on about capitalists and how AI shall democratise art is a bit silly too. It's either gonna be a tool people use as hobby ideas/as part of drawing something, or just a minority use on the net, sometime over the next decade. We can't predict which in the end, even if doom & gloom about less employment for artists is popular atm.

I don't feel it's been long enough to say whether those jobs artists got replaced by a computer for are, uh, gonna pan out long term yet.
 
I mean, it honestly sounds like you are either cherry picking that term to mean the people with that opinion instead of everyone who is pro-AI, or... actually just that. It just sounds like you are defining "pro-AI" to be "people who think AI is cool and fun", and leaving out the pile of people who are actually making AI that are saying "hey, wouldn't it be great if we didn't need to pay people".

It also is very telling that you are also saying that all anti-AI people are saying that they aren't artists. It tells me that you have a single, singular focus when it comes to what parts of this you think are "good" and "bad", and you refuse to even consider anything that falls outside that focus.
Shockingly enough, when side A pisses on me and side B offers a hand, I'm going to be inclined toward side B. I mean, side A not only failed to conjure up convincing arguments, but repeatedly and relentlessly pushes slanderously bad ones no matter how many times they're debunked, along with endless gish gallops - so don't be surprised when open contempt and thinly veiled threats do not yield a warm reception.

Speaking of:
Either it helps make more people create things, or it is bad. Any other factors don't seem to enter consideration with you, and it is really annoying because it misses half the conversations that attempt to engage with you and leaves the rest of us feeling like we are talking to a brick wall.
I've repeatedly discussed the cons of AI and my concerns thereof. I'm talking about one thing, so you complain that I'm not addressing another. Even if I was discussing it earlier. Even if I was agreeing with you on it. How do you think I feel about that?
 
The theft argument is also doomed to be a moot point soon, because Adobe, Facebook, and a variety of stock photo and image hosting sites now have the explicit permission of everyone who uses their software to use their images in AI training. Even if it's not enough data to train on now, it will be very soon.
It's already enough data to train on, the training has been done, Adobe Firefly was released 1.5 years ago.
You can rent it out for 9.99/month.

All anyone expressing the "AI Art is theft" is doing, in practice, is saying that the real problem with AI art is that corporations can't buy up and monopolize training rights. And boy, would corporations love to help you solve that problem.

Copyright is a corporate tool, it's not going to defend individual artists. Certainly not in this situation, where the entire point of the AI is that individual pieces of training data are entirely interchangeable.

(Edit : With, like, maybe a handful of exceptions if you're a mega-famous artist or celebrity, and they want to stick your name on the cover).
 
Shockingly enough, when side A pisses on me and side B offers a hand, I'm going to be inclined toward side B. I mean, side A not only failed to conjure up convincing arguments, but repeatedly and relentlessly pushes slanderously bad ones no matter how many times they're debunked, along with endless gish gallops - so don't be surprised when open contempt and thinly veiled threats do not yield a warm reception.

Speaking of:

I've repeatedly discussed the cons of AI and my concerns thereof. I'm talking about one thing, so you complain that I'm not addressing another. Even if I was discussing it earlier. Even if I was agreeing with you on it. How do you think I feel about that?
I think that you repeatedly and constantly go "look how much they hate us".
Then, when someone points out the negative aspects that lead to that dislike of the concept, you can on occasion reach the point where you admit that there are some downsides, and then go "but think of the fun times" again just a little bit later.

I know exactly how you feel with that first part, and guess what? I feel it too every time you use "but more artists" as your main argument for why this stuff is good actually, and artists should just get real jobs instead.
 
Then, when someone points out the negative aspects that lead to that dislike of the concept, you can on occasion reach the point where you admit that there are some downsides, and then go "but think of the fun times" again just a little bit later.
AI has a wide variety of issues and problems. I think that opponents of AI should spend their efforts on a rearguard action against the actually bad stuff done by the actual capitalists, instead of going after indie artists using public domain AI, which only helps the actual capitalists doing the actual bad stuff.

This is kind of the crux. Discussion of the exact number and balance of the pros and cons of AI is beside the point. AI is here. It isn't going to leave, even after the bubble bursts. We can only decide what to prioritize suppressing and what to prioritize fostering. If you are attacking indie artists you are on the wrong side of the battle lines. No amount of saying "AI does bad things" makes me tolerant of smothering the good things AI does in ways that will not stop the bad things anyways.

I know exactly how you feel with that first part, and guess what? I feel it too every time you use "but more artists" as your main argument for why this stuff is good actually, and artists should just get real jobs instead.
You don't even believe that position and neither do I, yet you are still sore about it? My position remains that everyone should be able to do art as a hobby, not just the handful of people 'lucky' enough to be underpaid and overworked by capitalists to make it, who I believe would generally be better off doing art as a hobby too. I don't think the latter is actually going to happen to any major degree for a long while, nor do you last I checked, but the former will realize benefits sooner. Something that a supporter of artists and art should favor.
 
It also is very telling that you are also saying that all anti-AI people are saying that they aren't artists.

Er, correct me if I'm missing something, but that part was in response to Byzantine saying that, and I insert quote:

literally all the indie artists

But there are people who were established artists pre-AI who do not share those views. So it sounds like it wasn't Firefossil who was making that inference?


I certainly get tired of my hobbyist AI image generation getting constantly accused of bad stuff, like firefossil says, mm.

Frankly I get some pretty sketchy feelings about -anyone- getting death threats or told to kill themselves, pretty much no matter what the issue. Which I am not claiming has happened here on SV, but it's pretty routine some other places online, and that's gonna inform how people feel about the overall discourse.

(Artists who don't use AI being attacked for using AI is also not exactly awesome. BTW, people don't actually seem to be very good at telling when images were AI generated.)

All anyone expressing the "AI Art is theft" is doing, in practice, is saying that the real problem with AI art is that corporations can't buy up and monopolize training rights. And boy, would corporations love to help you solve that problem.

Mmm. Like, going back to what @SeptimusMagisto (and others whose names I've forgotten, sorry) have said about overlapping discussions, I'm gonna ask (and provide my idea of the answers) - what outcomes can actually be had here?

*Generative AI doesn't exist - Nope. Ship has sailed.

*Generative AI only available via training exclusively on public domain material or material the party doing the training actually has copyright to. Maybe possible, since Adobe's "btw, we can train on all your stuff" license could theoretically get torpedoed by legal changes. (It was recently pointed out on some law blog I happened to be looking at that making something illegal can be retroactive, though it often is not.)

*As above, but Adobe and some other parties that can buy access to large datasets also get to train base models. A little more possible.

The previous two mostly amount to "a small number of large corporations have an effective monopoly over generative AI", just a difference in which and how many. I am really not convinced that would be an improvement over the status quo of it being broadly available to medical researchers and hobbyists and basically just about anyone who wants it to some degree or other. I don't think I've seen anyone arguing for it.

(I won't even begin to guess at how "Oh, and even if you outlaw SD in the US, other countries may or may not go along" would affect things. Given some of the existing law in the UK, I'd be pretty surprised if they did so, at least.)

So does that leave anything? Because I'm really not seeing a way to stop at least Disney from being able to train a model on everything they've got.

-Morgan.
 
Right, multiple conversations going on here.

So, my issue with @firefossil is mostly that there is an assumption of the value of these systems as a universal solution to the issue that art is complicated.
My main disagreement there is that I see a difference between:
- The kind of person who just types in a prompt, and then picks the closest result and is content.
- The kind of person who either fine tunes their prompts to get exactly what they want, or more notably starts with the advanced tools that allow for notably greater control over the end result.
The former is making content, which is fine if you just need a bit of content for another project but is not making art. The latter is making art as an artist, and my issue is that I don't think @firefossil is considering that not everyone is going to rise to the level of "artist" just by using these tools.

Not everyone who makes an image is an artist, and not everyone is able to become an artist just because there is one more tool to make art with.
More people perhaps, but the argument being repeated is that this lets everyone become artists.


Now, to be honest about my own feeling of AI generated artwork... well, I'm not going to seek it out or appreciate getting any given to me.
I am willing to say it is possible to make art with it, to have developed the skills to be a generative AI artist, but unfortunately the companies that made these systems have poisoned the well by promoting it as a push-button solution that produces the desired end result.
I am not sure personally how to identify the difference between a result that was worked on for a long time before it was posted, and one that was just one prompt that had the best option picked. I cannot tell the difference between the two categories from a finished work. That is the problem I have with the idea, it makes it hard to identify the hard work from the no work.
(Artists who don't use AI being attacked for using AI is also not exactly awesome. BTW, people don't actually seem to be very good at telling when images were AI generated.)
Well, that is kind of my point at the end here, only worse:
When dealing with AI generated art, I cannot tell an example that was a work of care and dedication apart from someone's random attempt that they just thought looked okay.
 
You don't even believe that position and neither do I, yet you are still sore about it? My position remains that everyone should be able to do art as a hobby, not just the handful of people 'lucky' enough to be underpaid and overworked by capitalists to make it, who I believe would generally be better off doing art as a hobby too.
And here we come down to it, again: You know better than other people what makes them happy and what they should be doing.

You may not even be wrong in some respects. But the fucking arrogance...
 
So, my issue with @firefossil is mostly that there is an assumption of the value of these systems as a universal solution to the issue that art is complicated.
My main disagreement there is that I see a difference between:
- The kind of person who just types in a prompt, and then picks the closest result and is content.
- The kind of person who either fine tunes their prompts to get exactly what they want, or more notably starts with the advanced tools that allow for notably greater control over the end result.
The former is making content, which is fine if you just need a bit of content for another project but is not making art. The latter is making art as an artist, and my issue is that I don't think @firefossil is considering that not everyone is going to rise to the level of "artist" just by using these tools.
What happens if you fall into both camps? As I sometimes just generate a random image for fun and pick whatever looks nice, and other times I aim for a particular result and make many different artbot images with tweaked prompt instructions.

Edit:
Frankly I get some pretty sketchy feelings about -anyone- getting death threats or told to kill themselves, pretty much no matter what the issue. Which I am not claiming has happened here on SV, but it's pretty routine some other places online, and that's gonna inform how people feel about the overall discourse.
Oh, totally. I was moreso commenting on half the posters here being negative about AI in all art based use cases, which is depressing and makes me wanna get defensive because it's a hobby I enjoy. I do try and see where the other poster is coming from before I reply though, and sometimes decide not to because of it. Or alter my message.
 
What happens if you fall into both camps? As I sometimes just generate a random image for fun and pick whatever looks nice, and other times I aim for a particular result and make many different artbot images with tweaked prompt instructions.
It is perfectly possible for an artist to just make a bit of content once in a while.

Hm. Let me give a comparison, I am a fanfiction writer, but sometimes I still give out random story ideas in fandom discussion threads.
Just using a prompt to make a bare bones thing is sort of like coming into one of those threads and giving your one paragraph idea on what would be a nice fanfic. It isn't much work, and would take a lot more to be a fully realized story, but it also isn't nothing.
A one page fanfic snippet on the other hand, or a longform description of a larger idea, is more work on the idea than that, but an author doesn't always take every idea they have to that point.
 