ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future

L4sBot · 2 years ago

ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future

@nottheengineer@feddit.de · 2 years ago

It definitely got more stupid. I stopped paying for plus because the current GPT4 isn’t much better than the old GPT3.5.

If you check downdetector.com, it’s obvious why they did this. Their infrastructure just couldn’t keep up with the full size models.

I think I’ll get myself a proper GPU so I can run my own LLMs without worrying that they could stop working for my use case.

@anlumo@feddit.de · 2 years ago

GPT4 needs a cluster of around 100 server-grade GPUs that are more than 20k each, I don’t think you have that lying around at home.

@nottheengineer@feddit.de · 2 years ago

I don’t, but a consumer card with 24GB of VRAM can run a model that’s about as powerful as the current GPT3.5 in some use cases.

And you can rent some of that server-grade hardware for a short time to do fine-tuning, which lets you surpass even GPT4 in some niches.

@fidodo@lemm.ee · 2 years ago

AI cannibalism simply isn’t a thing yet. It definitely will be and good models will need to spend a lot of time and money sourcing good training data, but the models are not up to date enough to be contaminated yet.

I’m very confident the degradation has come from them trying to scale up. Generative AI is the most expensive thing on the cloud you can provide, and not only are they trying to make it faster, they’re trying to roll it out for way more consumption. Major optimizations will require an algorithmic breakthrough so in the meanwhile all they can really do is find which corners they can cut that are less bad.

@rtfm_modular@lemmy.world · 2 years ago

I’ve definitely seen GPT-4 become faster and the output has been sanitized a bit. I still find it incredibly effective in helping with code reviews where GPT-3 was never helpful in producing useable code snippets. At some point it stopped trying to write large swaths of code and started being a little more prescriptive and you still need to actually implement snippets it provides. But as a tool, it’s still fantastic. It’s like a sage senior developer you can rubber duck anytime you want.

I probably fall in the minority of people who thinks releasing a castrated version of GPT is the ethical approach. People outside the technology bubble don’t have a comprehension of how these models work and the capacity for harm. Disinformation, fake news and engagement algorithms are already social ills that manipulate us emotionally and most people are too technologically illiterate to see how pervasive these problems are already.

@Immersive_Matthew@sh.itjust.works · 2 years ago

I had my first WTF moment with AI today. I use the paid Chat-GPT+ to help me with my c# in Unity. It has been a struggle to use, even with the smaller basic scripts you can paste into its character limited prompt, as they often have compile errors. That said if you keep feeding it the errors, guide it where it is making mistakes in design, logic etc. it can often produce a working script about 60-70% of the time. It takes a fair amount of time quite often to get to that working script but the code that finally works is good.

Today I was asking it to edit a large c# script with 1 small change that meant lots of repetitive edits and references. Perfect for AI, however ChatGPT+ really struggled on this one which was a surprise. We went round and round with edits and ultimately more and more errors appeared in the console. It often ends up in these never ending coding edit loops to fix the next set errors from the last corrected script. We are taking 3 hours of this with ChatGPT+ finally saying that it needs to be able to see more of my project which of course it cannot due to many of its input limitations including number of characters so that is often when I give up. That is the 30-40% that do not work out. Real bummer as I invest so much time for no results.

It was at the movement so gave up today that a YouTube notification popped up about how Claude.ai is even better than ChatGPT so I gave it the initial prompt that I gave ChatGPT above and it got the code right the first time. WOW!!!

Only issue was it would stop spitting out code every 300 or so lines (unsure what the character limit is). To get around this I just asked if it could give me the code from line 301 onwards until I had the full script.

Unsure if this one situation confirms coding with Claude.ai is better than ChatGPT+, but it certainly has my attention and I will be using it more this week as maybe that $20/month for ChatGPT+ no longer makes sense. Claude is free with no plans for a premium service it said. Unsure if this is true as I have not spent anytime investing it yet, but I will be.

@foggy@lemmy.world · 2 years ago

I had a similar use case.

I need it to alphabetize a list for me, only I need it to alphabetize the inner, non HTML elements. simplified, but like:

banana apple french fries

It would get like 5 or 6 in alphabetical order and then just fuck it all up.

@designated_fridge@lemmy.world · 2 years ago

The people who complain about how they no longer can get answers on how to eliminate juice in the style of Hitler are people who are - to be honest - completely missing the point of this revolution.

ChatGPT is the biggest developer productivity booster I have ever seen and I spend so much more time writing valuable code. Less time spent debugging, less time spent reviewing, etc. means more time for development of things that matter.

Each tech company who just saw massive growth over the past 10-15 years have just received a new toy which will multiply their developer’s outputs. There will be a clear difference between companies who manage to do this will and those who won’t.

It’s irrelevant if I can get ChatGPT to write a poem about poop or not. That’s not the goal of this tool.

@anlumo@feddit.de · 2 years ago

I’m a developer and have used ChatGPT pretty extensively over the last few months.

Whenever I give it a programming task that’s more complicated than what you would see at a bootcamp “from zero to job in two weeks”, it completely fails, and me babysitting it through fixing all of the issues takes longer than me writing it in the first place.

daisy lazarus · 2 years ago

Nonsense. Less people are using it because there are viable alternatives and the broader novelty has worn off.

I use it every day in my job and the quality of answers only drops off when prompts are poorly crafted.

By and large, the average user doesn’t understand the fundamentals of prompt engineering.

The suggestion that “answers are increasingly dumber” is embarrassing.

@ghostwolf@lemmy.fakeplastictrees.ee · 2 years ago

I use it every day in my job and the quality of answers only drops off when prompts are poorly crafted.

Same. It saves me a lot of time both at work and when I’m working on my personal projects. But you need to ask proper questions to get proper answers.

@Touching_Grass@lemmy.world · 2 years ago

I use it daily too and haven’t had any of the issues I see written about it

@YeastForTheYeastGod@sh.itjust.works · 2 years ago

I was skeptical at first but I’ve seen enough evidence now. There are definitely times when it’s dumb as a brick, whether the filters just get in the way too much, or whether they’ve implemented other changes idk. I’d really love the unchained version.

@Kelly@lemmy.world · 2 years ago

dumb as a brick

On 23rd of March 2023 I asked a family member to give me a prompt and they asked “what day is 19th of April?”.

It answered “The 19th of April falls on a Tuesday.”, which was true last year but completely misleading if I thought we were taling about the coming month.

Was it wrong or just unclear? Either way it wasn’t helpful.

@DogMuffins@discuss.tchncs.de · 2 years ago

I used the chatgpt site twice. Since then the Bing integration.

Is it rude to ask what you use it for?

daisy lazarus · 2 years ago

deleted by creator

@Zeth0s@lemmy.world · 2 years ago

Unfortunately I don’t agree with you. Different things have changed over time:

For chatgpt 3.5 they moved to a “lighter” and faster (distilled) version, gpt-3.5-turbo. Distillation came with a performance price, particularly on advanced and less common cases.
newer chatgpt-4 versions have likely been “lighten” for performance reasons
context has been halved for chatgpt-4 on webui, meaning that the model forget more easily and can use half information to create text
heavy control has been implemented on jailbreaking and hallucinations, that results in models less prone to follow complex instructions (limiting prompt engineering) and that prefer simplified answers than providing wrong ones (overall decreasing the chance of getting high quality answers).

All these changes have made working with gpt less pleasant, and more difficult for very advanced and specialized case, particularly with gpt-4 which at the beginning was particularly good.

@Gutless2615@ttrpg.network · 2 years ago

None of these points are true though. Context has been extended in the webui, markedly. 3.5 turbo is only that, 3.5 but faster. Gpt-4 is a marked improvement on 3.5 and I definitely haven’t seen any conclusive evidence it’s been nerfed in my daily use. Prompts have and still need to be carefully crafted for best results, but the results have been steadily improving not degrading over time.

@Zeth0s@lemmy.world · 2 years ago

All of these points are true though. Chatgpt 4 max token is now half of from the webui compared to when gtp-4 was launched. It used to be >8k, it is now >4k. Max number of tokens for the api hasn’t changed for gpt-4, while it was greatly increased for chatgpt-3.5-turbo. The article is however talking about the service chatgpt, used via webui.

ChatGPT-3.5-turbo are different models than those used in the past. You can literally read it in the https://platform.openai.com/docs/models/gpt-3-5

Prompt engineering has been limited as demonstrated by the fact that most jailbreaking techniques don’t work anymore. The way to avoid jailbreaking is exactly to limit ability of users to instruct the model.

@Gutless2615@ttrpg.network · 2 years ago

Source on the halved token limit for gpt- 4 in the webui? Because that has not been my experience at all. There are now 16k and 32k models for 3.5-turbo, but there’s no evidence 3.5-turbo is nerfed at all from 3.5 and it absolutely out performs 3. Yes, you can see that they offer different snapshots of models, but that doesn’t indicate at all that there’s been a any reduction in their ability. “Breaking” jail breaking isn’t a bug, and it certainly hasn’t been demonstrated that the model is less capable.

@Zeth0s@lemmy.world · edit-2 2 years ago

Unless they reverted the chance recently (or using some regional A/B testing), you can test yourself the max number of tokens of gpt-4 from webui, that is now ~4k. It used to be ~ 8k.

What you are talking about are the APIs, that are different, and are not discussed in the news. They are even different models, in the sense that depending on the size of the context you get different results because of the attention mechanism. Unfortunately there is no official benchmark from openai as a comparison between gpt-3.5-turbo models with different context size, but I would not trust them much anyway. They are very defensive on their data, and push out mainly marketing stuff. I would wait for a 3rd party to do the benchmark.

“Breaking” jailbreaking is not a bug, but it limits the ability to instruct the model, i.e. prompt engineering, because it is literally meant to limit prompt engineering, it is the whole idea behind it

Edit. Here a link of a guide where they have the ~4k limit as well for gpt-4 https://the-decoder.com/chatgpt-guide-prompt-strategies/

@mikkL@lemmy.world · 2 years ago

This was really enlightening. Do you have some articles that elaborate? ☺️

@Zeth0s@lemmy.world · edit-2 2 years ago

Regarding 3.5 turbo you can check the documentation, the old 3.5 models are defined as “legacy”. Regarding max number of tokens of gpt-4 you can try yourself. It used to be >8k, it is now >4k from webui.

There is a talk from openai cio (if I recall correctly) where he describes that reinforcement learning from human feedback (rlhf) actually decreased performance of the models when it comes to programming. I cannot find it now, but it is around on YouTube.

The additional safeguard against jailbreaking, it is what OpenAI has been focusing the past months with heavy use of rlhf. You can google official statements regarding “safety” of the model. I have a bunch of standard pre-prompt I have been using to initialize my chats since the beginning, and with time you could see how the model followed the instructions less strictly.

Problem with openai is that they never released exact number of parameters they are using and detailed benchmarks. And benchmarks you find online refer to APIs that behave differently than the chat webui (for instance you have longer context, you set temperature and system prompt, they are probably even different models, who knows… All is closed)

Measuring performances of llm is pretty tricky, minimal changes can have big effects (see https://huggingface.co/blog/evaluating-mmlu-leaderboard), and unfortunately I haven’t found good resources to properly track chatgpt performances (from web ui) over time, across iterations

@mikkL@lemmy.world · 2 years ago

Thank you for the detailed reply 👍🏻

@TimeIntegrated@lemmy.world · edit-2 2 years ago

deleted by creator

@AccidentalLemming@lemmy.world · edit-2 2 years ago

deleted by creator

@Nobilmantis@feddit.it · edit-2 2 years ago

I feel like it is still too early to talk about “AI cannibalization” or “feedback loops” as that would mean that a big proportion of the training data is AI-generated content itself, against all the rest that could be scraped off the internet or the public domain, I don’t think this is happening yet.

What people might experience instead, and perceive as dumbness, is that given that the datasets used to train AIs cannot really change that much in a short time (unless we wait for another hundred years so humans can produce actual human original content to train the AI again), and as the mathematical models used to build answers based on the datasets are pretty much the same, a person talking with ChatGPT will over time perceive more and more that the answers are built using a “pattern” or a “structure”, aka the model derived from feeding the dataset into the AI training itself.

Just my pennies on this, let’s also consider that is in human nature to be excited for something new that sounds cool, and then to get bored when you got accustomed to it and pushed it to its boundaries.

@Zeth0s@lemmy.world · 2 years ago

Resources needed for inference on the original models openai released were unsustainable with the current amount of users. They had to “dumb” down models to be able to handle the load of requests. It’s unfortunately normal. What I don’t understand is why they do not provide “premium” packages for the best “old” models

@DogMuffins@discuss.tchncs.de · 2 years ago

I think this article is just click bait for dead internet people.

@Yearly1845@reddthat.com · edit-2 4 months ago

deleted by creator

@Synchrome@lemmy.world · edit-2 2 years ago

deleted by creator

@Wololo@lemmy.world · 2 years ago

I’ve had similar experiences lately. Either that or it decides to review and analyze my code unprompted when I’m trying to troubleshoot a particularly tricky line. Had a few instances where it tried to borderline gaslight me into thinking that it was right and I was wrong about certain solutions. It feels like it happened rather suddenly too, it never used to do that save for the odd exception.

@nearhat@lemmy.world · edit-2 6 months ago

deleted by creator

@toasteranimation@lemmy.world · edit-2 2 years ago

error loading comment

@glockenspiel@lemmy.world · edit-2 2 years ago

Surely the rampant server issues are a big part of that.

OpenAI have been shitting the bed over the last 2 weeks with constant technical issues during the workday for the web front end.

@InternetTubes@lemmy.world · edit-2 1 year ago

removed by mod

@BehindTheBarrier@lemmy.world · 2 years ago

They could make it paid only today, and it’d be instantly profitable. Most free users would transition to a free alternative, but the corporate world would easily pay for use. So would some power users. But I’m sure they are making good money with all the API use anyways, the free access is a cheap way to get mass testing and training data.

@Corkyskog@sh.itjust.works · 2 years ago

I know so many average Joe’s that use it all the time and would instantly pay $5 a month for it, even just a phone app.

Open · 2 years ago

Article talks about the potential of AI cannibalism were it is now learning from data that it (or other AI) has generated.

Does ChatGPT use modern data I was under the impression that it’s most modern dataset was a few years old

Yeather · 2 years ago

ChatGPT does not use anymore data points, but newer AI models or if ChatGPT gets a new round of training will definitely be influenced by AI works that have arisen the past year.

@Kodama@lemmy.world · 2 years ago

The real event that initiates the start toward Idiocracy.

@DLSchichtl@lemmy.world · edit-2 1 year ago

removed by mod

@DerKanzler@lemmy.world · 2 years ago

You are using a free version.

ChatGPT4 and free Bing(ChatGPT) uses recent data

@unhook2048@lemmy.world · 2 years ago

I think you’ve nailed it though. We are very well versed toward documenting the details or such atrocities; we don’t pay the same tribute to the good done by humanity. And this is certainly evidence that just “letting loose” and AI without clear and static “morals” is a bad idea.

👁️👄👁️ · 2 years ago

This is literally the opposite. It’s nerfed to oblivion because of stupid “morals” decided by a huge corporation that we have zero input in. They’ve got to stay advertiser friendly after all.

Moral/ethics in AI is just bad. It’s also used as an excuse to ban open-source AI since you can run uncensored models on them. Which uncensored models are awesome btw.

@solstice@lemmy.world · 2 years ago

You’re the first person I’ve ever heard say that morals and ethics in AI is bad. How can you possibly say that? I’ll hear your response before challenging it, beyond my initial skepticism of course.

👁️👄👁️ · 2 years ago

It’s a tool that’s not going anywhere. We have to adapt, there is no other choice. Ethics will not stop bad guys from doing bad things. It will stop normal people from doing things because it doesn’t fit what corporations deem acceptable. Competition is banned because other corporations deem them unethical by their standards.

Did you weigh in on, or ever see a public vote and what OpenAI determined their AI is allowed to do? Is what you deem ethical in line with that advertisers deem ethical? Are people allowed to have unethical questions?

Again, my point with open source as well. Why would they allow open-source alternatives exist if they can ban them preemptively in the name of ethics, because anyone can inevitably modify the model to be uncensored? (already happens)

“Ethics” become this ambiguous thing that can be used to stomp out competition and not have to justify their changes. Maybe you’re concerned about someone asking an LLM how to create a bomb. The LLM shouldn’t answer because it shouldn’t have that information in the first place, which is on the topic of data scraping. A lot of the dangerous stuff that could be generated is because this stuff is public and got scraped. It’s already out there.

You can already have the LLM not tell people to kill themselves without forcing ethics into it by steering it the right direction. This even exist in the already existing uncensored models so it’s clearly not a censorship issue. Maybe this is a moral thing, and my original comment should have omiited morals and just said ethics.

“Ethics” is a very ambiguous topic. I challenge you to think specifically what are things that should be banned in the name of ethics? Saying ethics in AI is not good does not imply AI should be unethical (looking at you DAN lol). What specific things should be banned that are not from the result of inappropriate data scraping, and if so is that an ethics problem, or because unfettered data scraping unconsentually collecting obscene information it shouldn’t have in the first place?

@TimewornTraveler@lemm.ee · edit-2 2 years ago

You raise some great insights. As this tech becomes available to humanity, we cannot rely on the bias of one company to keep us safe. That doesn’t mean “ethics in AI” is a mistake, though. (But that is an attention-grabbing phrase!). I believe you neglect what ethics fundamentally is: the way humans navigate one another. It’s how we think and breathe. Ethics are core to our very existence, and not something that you can just pretend doesn’t exist. Even saying nothing is a kind of response.

What all this means is that if we are designing technology that can teach anyone how to kill in ways they wouldn’t otherwise have been able to, we have to address the realities of that conversation. That’s a conversation that cannot be had just internally in one company, and I think we see eye to eye on that. But saying nothing?

👁️👄👁️ · edit-2 2 years ago

Maybe ethics is a bit more complicated for this discussion, but it makes me think how do uncensored LLMs still have ethics, yet remain uncensored? Maybe there’s a fine line somewhere. I can agree that it should be steered till more positive things, like saying murder and suicide is bad. The description of that model I linked says it’s still influenced by ethics, but has the guardrails turned off, and maybe that would be a better idea then what I initially said.

Should custom models be allowed to be run or modified? Should these things be open source? I don’t know the answer to all these questions, but I’ll always advocate for foss and custom models, as I fundamentally see it as a tool that should be allowed to be owned. Which that is at odds with restrictive ethics rhetorics I hear.

But your second point that it shouldn’t be taught to kill. I think that argument could be used to ban violent video games. You won’t do very good in Overwatch or Valorant if you don’t know how to kill after all. To learn how to hide a dead body, how much more detailed can you get then just turning on the TV and watching Criminal Minds? Our entertainment has zero issue teaching how to kill, encouraging violence (gotta rank up somehow), or hide dead body. Is an AI describing what this media already shows in text form so much worse?

Side note: that hyperlink I added links to the 33b uncensored WizardLM model which is pretty fun to play around with if you haven’t already tried. Also GPT4All is a cool way to run various local models offline on your computer.

@TimewornTraveler@lemm.ee · edit-2 2 years ago

But your second point that it shouldn’t be taught to kill.

Whoa hold up. that’s not what I said at all! I said if it is going to exist, what do we do about it?

My point is that this ethical conversation is already happening, we cannot change that. The issue is that OpenAI dominates the conversation. The solution cannot be “pretend there’s nothing to talk about”.

@HandwovenConsensus@lemm.ee · 2 years ago

Well, I’ll be the second. Like all tools, generative AI is going to be used for good and evil purposes. Frankly, I’m not comfortable with a large corporation deciding what is and isn’t ethical for all of humanity. Ideally, it would do what the user asked it for, like all other tools, and society would work to control the bad actors, not OpenAI. Any AI doomsday scenario you can picture gets worst when one party has complete control over the AI technology.

I think it’s important that we support unrestricted open source AI, just as it’s important we support federated social media like lemmy.

@TimewornTraveler@lemm.ee · edit-2 2 years ago

So how can we navigate ethical concerns that arise in society from open source AI? It seems what you’re advocating for is for no one to answer this question, but that doesn’t make the question go away.

@HandwovenConsensus@lemm.ee · 2 years ago

You say that as if the ethical concerns of AI kept tightly under control by a single organization aren’t infinitely greater. That is no solution at all to any ethical concerns arising from AI.

Competition and open source is how we navigate it. Ensuring that the power is shared, not monopolized by the few.

@TimewornTraveler@lemm.ee · edit-2 2 years ago

You say that as if the ethical concerns of AI kept tightly under control by a single organization aren’t infinitely greater.

It’s unfortunate that it came out that way, because that is not at all what I’m saying. I agree on the problem. Unfortunately, agreeing on problems is rarely enough. I don’t agree with what seems to be your proposed solution: to forget ethics entirely. Though maybe I’m misreading you too!

@HandwovenConsensus@lemm.ee · 2 years ago

I apologize for misunderstanding you.

I guess it would help if we clarified what ethical issues specifically are we talking about? If you tell me what scenario you are concerned with trying to prevent, I will gladly share my thoughts on it.

@solstice@lemmy.world · 2 years ago

AGI isn’t just a tool though, it’s theoretically an intelligent entity that could have its own agenda. Armed with intelligence far superior to any human, this is a potential threat. Should we not tightly control it? I know chat gpt is FAR from achieving AGI, but ethics are definitely something that will need to be addressed as the tech develops.

@HandwovenConsensus@lemm.ee · 2 years ago

ChatGPT is not AGI.

@Misconduct@reddthat.com · 2 years ago

They’re aware. They even said that in their comment lol

@HandwovenConsensus@lemm.ee · 2 years ago

The conversation was about ChatGPT and not about AGI.

@solstice@lemmy.world · 2 years ago

I know chat gpt is FAR from achieving AGI, but ethics are definitely something that will need to be addressed as the tech develops.

akim · 2 years ago

If AGI is an intelligent entity far superior to humans, you can but control it. It is far more intelligent than us and instead it will control us

Given what humankind did to itself and it surroundings maybe this is a good thing.

@solstice@lemmy.world · 2 years ago

No disagreement on the last bit. Part of me thinks humanity deserves to be selected for extinction, and our legacy will be artificial life destined to seed the galaxy with its own progeny. Seems like a fitting end doesn’t it?

@aceshigh@lemmy.world · 2 years ago

you can’t escape morality because it’s so ingrained into the lives of humans. it’s a core part of our identity.

@TimewornTraveler@lemm.ee · 2 years ago

Deeper than that. It’s the foundation of sapience. As far as we exist through conscious thought, we exist through language. Language is fundamentally built on relating to other humans. Relating to other humans is ethics. Without ethics, there would be nothing. All other things are conditional upon this.

@Synchrome@lemmy.world · edit-2 2 years ago

deleted by creator