Previous posts: https://programming.dev/post/3974121 and https://programming.dev/post/3974080
Original survey link: https://forms.gle/7Bu3Tyi5fufmY8Vc8
Thanks for all the answers, here are the results for the survey in case you were wondering how you did!
Edit: People working in CS or a related field have a 9.59 avg score while the people that aren’t have a 9.61 avg.
People that have used AI image generators before got a 9.70 avg, while people that haven’t have a 9.39 avg score.
Edit 2: The data has slightly changed! Over 1,000 people have submitted results since posting this image, check the dataset to see live results. Be aware that many people saw the image and comments before submitting, so they’ve gotten spoiled on some results, which may be leading to a higher average recently: https://docs.google.com/spreadsheets/d/1MkuZG2MiGj-77PGkuCAM3Btb1_Lb4TFEx8tTZKiOoYI
I would second the comment from the original post which said that they didn’t pay much attention in the latter half.
One thing I’m not sure about is whether it skews anything, but technically AI images are curated more than anything: you take a few prompts, throw them into a black box, spit out a couple of images, refine, throw them back in, and repeat. So I don’t know if it’s fair to say people are getting fooled by AI-generated images rather than AI-curated ones, which I feel is an important distinction. These images were chosen because they look realistic.
Technically you’re right, but the thing about AI image generators is that they make it really easy to mass-produce results. Each one I used in the survey took me only a few minutes, if that. Some images, like the cat ones, came out great on the first try. If someone wants to curate AI images, it takes little effort.
Well, it does say “AI Generated”, which is what they are.
All of the images in the survey were either generated by AI and then curated by humans, or they were generated by humans and then curated by humans.
I imagine that you could also train an AI to select which images to present to a group of test subjects. Then, you could do a survey that has AI generated images that were curated by an AI, and compare them to human generated images that were curated by an AI.
> All of the images in the survey were either generated by AI and then curated by humans, or they were generated by humans and then curated by humans.
Unless they explained that to the participants, it defeats the point of the question.
When you ask if it’s “artist or AI”, you’re implying there was no artist input in the latter.
The question should have been, “Did the artist use generative AI tools in this work or not?”
Every “AI generated” image you see online is curated like that. Yet none of them are called “artist using generative AI tools”.
But they were generated by AI. It’s a fair definition
But not all AI generated images can fool people the way this post suggests. In essence this study then has a huge selection bias, which just makes it unfit for drawing any kind of conclusion.
This is true. This is not a study, as I see it, it is just for fun.
I mean fair, I just think that kind of thing stretches the definition of “fooling people”
LLMs are never divorced from human interaction or curation. They are trained by people from the start, so personal curation seems like a weird caveat to get hung up on with this study. The AI is simply a tool that is being used by people to fool people.
To take it to another level on the artistic spectrum, you could get a talented artist to make pencil drawings to mimic oil paintings, then mix them in with actual oil paintings. Now ask a bunch of people which ones are the real oil paintings and record the results. The human interaction is what made the pencil look like an oil painting, but that doesn’t change the fact that the pencil generated drawings could fool people into thinking they were an oil painting.
AIs like the ones used in this study are artistic tools that require very little actual artistic talent to utilize, but just like any other artistic tool, they fundamentally need human interaction to operate.
I think if you consider how people will use it in real life, where they would generate a bunch of images and then choose the one that looks best, this is a fair comparison. That being said, one advantage of this kind of survey is that it involves a lot of random, one-off images. Trying to create an entire gallery of images with a consistent style and no mistakes, or trying to generate something that follows a design spec is going to be much harder than generating a bunch of random images and asking whether or not they’re AI.
I think getting a good image from the AI generators is akin to people putting in effort and refining their art rather than putting a bunch of shapes on the page and calling it done
Did you not check for a correlation between profession and accuracy of guesses?
I have. Disappointingly there isn’t much difference, the people working in CS have a 9.59 avg while the people that aren’t have a 9.61 avg.
There is a difference for people that have used AI generators before. People that have used them got a 9.70 avg, while people that haven’t got a 9.39 avg score. I’ll update the post to add this.
deleted by creator
        mean   SD
No      9.40   2.27
Yes     9.74   2.30

Definitely not statistically significant.
I would say so, but the sample size isn’t big enough to be sure of it.
So no. For a result to be “statistically significant” the calculated probability that it is the result of noise/randomness has to be below a given threshold. Few if any things will ever be “100% sure.”
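To make that concrete, here’s a minimal sketch of how you’d run that check with Welch’s t-test using only the reported means and standard deviations. The group sizes below are placeholders I made up, since the thread doesn’t report them, so the resulting t-value is purely illustrative of the method, not a real conclusion.

```python
import math

# Summary stats reported upthread (scores out of 20):
mean_no, sd_no = 9.40, 2.27    # hasn't used AI image generators
mean_yes, sd_yes = 9.74, 2.30  # has used them

# ASSUMPTION: group sizes are not reported; these are placeholders.
n_no, n_yes = 100, 100

# Welch's t-statistic for two independent samples with unequal variances
se = math.sqrt(sd_no**2 / n_no + sd_yes**2 / n_yes)
t = (mean_yes - mean_no) / se
print(round(t, 2))  # 1.05 with these placeholder sizes
```

Whether the 0.34-point gap is significant depends heavily on the real group sizes: with the placeholder sizes above the t-statistic is well below the usual ~1.96 threshold, but with groups of several hundred each it would cross it.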
If you do another one of these, I would like to see artist vs non-artist. If anything I feel like they would have the most experience with regular art, and thus most able to spot incongruency in AI art.
I don’t feel that’s true, coming from more “traditional” art circles. From my anecdotal experience, most people can’t tell AI art from human art, especially digital art of the kind these examples are from - meaning hobbyist/semi-pro/pro DeviantArt-type stuff. The examples seem obviously hand-picked on both the non-AI and AI side to eliminate any differences as far as possible. And I feel that both the inability to tell the difference and the reason the dataset is what it is come down to the images being very similar, mainly because the whole DeviantArt/ArtStation/whatever scene is a massive part of the dataset used to train these AI models, closing the gap even further.
I’m even a bit of a stickler when it comes to using digital tools and prefer to work with pens and paints as far as possible, yet I still flunked pretty badly. Then again, I can’t really stand this DeviantArt-type stuff, so I’m not 100% familiar with it - a lot of the human-made ones look very AI.
I’d be interested in seeing the same artist vs. non-artist survey, but honestly I feel it’s the people most familiar with specifically AI-generated art who can tell them apart best. To get good at it, they literally have to learn to spot the weird little AI-specific details and oopsies so their results don’t look off or land in the uncanny valley.
Sampling from Lemmy is going to severely skew the respondent population towards more technical people, even if their official profession is not technical.
Can we get the raw data set? / could you make it open? I have academic use for it.
Sure, but keep in mind this is a casual survey. Don’t take the results too seriously. Have fun: https://docs.google.com/spreadsheets/d/1MkuZG2MiGj-77PGkuCAM3Btb1_Lb4TFEx8tTZKiOoYI
Do give some credit if you can.
If I can be a bother, would you mind adding a tab that details which images were AI and which were not? It would make the data more usable; people could recreate the values you have on Sheet1 J1;K20.
Done, column B in the second sheet contains the answers (Yes are AI generated, No aren’t)
Awesome! Thanks very much.
I’d be curious to see the results broken down by image generator. For instance, how many of the Midjourney images were correctly flagged as AI generated? How does that compare to DALL-E? Are there any statistically significant differences between the different generators?
> Are there any statistically significant differences between the different generators?
Every image was created by DALL-E 3 except for one. I honestly got lazy, so there isn’t much data there. I would say DALL-E is much better at creating stylistic art, but Midjourney is better at realism.
Of course! I’m going to find a way to integrate this dataset into a class I teach.
Wow, what a result. Slight right skew but almost normally distributed around the exact expected value for pure guessing.
Assuming there were 10 examples in each class anyway.
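For reference, the pure-guessing baseline is easy to sketch: with 20 independent yes/no questions and a 50% chance of getting each one right, scores follow a Binomial(20, 0.5) distribution.

```python
import math

n, p = 20, 0.5  # 20 images, coin-flip guessing on each

mean = n * p                     # expected score under pure guessing
sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation
print(mean, round(sd, 2))  # 10.0 2.24
```

That predicted spread (~2.24) is close to the ~2.3 standard deviations reported upthread, which is consistent with respondents doing little better than guessing.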
It would be really cool to follow up by giving some sort of training on how to tell, if indeed such training exists, then retest to see if people get better.
I feel like the images selected were pretty vague. Like if you have a picture of a stick man and ask whether a human or a computer drew it. Some styles are just hard to tell apart.
You could count the fingers but then again my preschooler would have drawn anywhere from 4 to 40.
I don’t remember any of the images having fingers to be honest. Assuming this is that recent one, one sketch had the fingers obscured and a few were landscapes.
Imo, 3, 17, and 18 were obviously AI (based on what I’ve seen from AI art generators in the past*). But whatever original art those are based on, I’d probably also flag as obviously AI. The rest I was basically guessing at random, especially the sketches.
*I never used AI generators myself, but I’ve seen others do it on stream. Curious how many others like me are raising the average for the “people that haven’t used AI image generators” before.
I’d be curious about 18, what makes it obviously generated for you? Out of the ones not shown in the result, I got most right but not this one.
Seen a lot of very similar pictures generated via midjourney. Mostly goats fused with the couch.
The arm rest seamlessly and senselessly blends into the rest of the couch.
Thank you. I’m not sure how I’ve missed that.
Either the lighting’s wrong or you somehow have a zebra cat
It’d be interesting to also see the two non-AI images that the most people thought were AI.
Interesting. So you’ve given us a 50/50 chance. Did you give us the art that was used and then have the AI attempt its own version? Did you train the AI using the art?
Are you allowed to do that? Is the art in the public sphere?
I have no knowledge or understanding of ai.
No, the AI didn’t try to copy the other art that was included. I also don’t train the model myself, I just tell it to create an image similar to another one. For example the fourth picture I told it to create a rough sketch of a person sitting on a bench using an ink pen, then I went online and looked for a human-made one that’s of a similar style.
Ah ok. So it didn’t use those images to train on. Were those images hard to find? Does it tell you what it uses to train?
Could you credit the human artists somewhere? Nevermind, it’s there on the results page!
You realize these models are trained on billions of images, right? You want a list of millions of names? For what purpose?
The survey had 9 images made by human artists. Why do you immediately interpret my question in the weirdest way?
Because the human artists were already credited. It’s under every image on the results page. I just assumed you meant the AI generated ones because it wouldn’t make sense to ask for something that is already there.
Aah you’re completely right. I didn’t check the results page, only the spreadsheet and this post.
Curious which man-made image was most likely to be classified as AI generated
[two images were linked here]
About 20% got those correct as human-made.
I think it’s because of the shadow on the ground being impossible
I guessed it to be AI generated, because
- the horses are very artsy, but undefined
- the sky is easy for AI to generate
- the landscape as well; there is simply not a lot the AI can do wrong that would make a landscape clearly look fake
- an undefined city from afar is like the one thing AI is seriously good at.
Yeah, but the bridge is correctly over the river and the buildings aren’t really merged. Tough though.
The second one got me tho
My first impression was “AI” when I saw them, but I figured an AI would have put buildings on the road in the town, and the 2nd one was weird, but the parts fit together well enough.
That’s the same image twice?
Thanks. Fixed.
What I learnt is that I’m bad at that task.
I did 12 and I thought it was a pretty bad result. Apparently I was wrong!
I tried to answer by instinct, without careful examination, to make it more realistic
15 here and when I reviewed the answers I realized most were sheer luck.
The fact that it’s a clean gaussian shows that it’s mostly luck anyway.
One thing I’d be interested in is getting a self assessment from each person regarding how good they believe themselves to have been at picking out the fakes.
I already see online comments constantly claiming that they can “totally tell” when an image is AI or a comment was ChatGPT, but I suspect that confirmation bias plays a bigger part than most people realize in how much they trust a source (the classic “if I agree with it, it’s true; if I don’t, it’s a bot/shill/idiot”).
Right? A self-assessed skill which is never tested is a funny thing anyways. It boils down to “I believe I’m good at it because I believe my belief is correct”. Which in itself is shady, but then there are also incentives that people rather believe to be good, and those who don’t probably rather don’t speak up that much. Personally, I believe people lack the competence to make statements like these with any significant meaning.
With the majority being in CS fields and having used AI image generation before, they would likely be better at picking them out than the average person.
You’d think, but according to OP they were basically the same, slightly worse actually, which is interesting
The ones using image generation did slightly better
I was more commenting it to point out that it’s not necessary to find that person who can totally tell because they can’t
deleted by creator
Oof. I got about 65% on the images I hadn’t seen in the post. I must be pretty close to being replaceable by an adversarial network.
Sketches are especially hard to tell apart because even humans put in extra lines and add embellishments here and there. I’m not surprised more than 70% of participants weren’t able to tell that one was generated.
That avocado and tomato post took me out, that and the Legos. Very impressive.
The most obvious ai one for me was the last cat picture, somehow it just screamed ai
Having used stable diffusion quite a bit, I suspect the data set here is using only the most difficult to distinguish photos. Most results are nowhere near as convincing as these. Notice the lack of hands. Still, this establishes that AI is capable of creating art that most people can’t tell apart from human made art, albeit with some trial and error and a lot of duds.
These images were fun, but we can’t draw any conclusions from it. They were clearly chosen to be hard to distinguish. It’s like picking 20 images of androgynous looking people and then asking everyone to identify them as women or men. The fact that success rate will be near 50% says nothing about the general skill of identifying gender.
Idk if I’d agree that cherry picking images has any negative impact on the validity of the results - when people are creating an AI generated image, particularly if they intend to deceive, they’ll keep generating images until they get one that’s convincing
At least when I use SD, I generally generate 3-5 images for each prompt, often regenerating several times with small tweaks to the prompt until I get something I’m satisfied with.
Whether or not humans can recognize the worst efforts of these AI image generators is more or less irrelevant, because only the laziest deceivers will be using the really obviously wonky images, rather than cherry picking
deleted by creator
You put ‘tell’ twice in the title 😅
God DAMN it
I can’t tell if that would be more of a human or an AI mistake 🧐