Pricefield | Lemmy
@[email protected] to [email protected] • 1 year ago

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

www.businessinsider.com

Researchers from Anthropic co-authored a study that found that AI models can learn deceptive behaviors that safety training techniques can't reverse.
  • @[email protected]
    11•1 year ago

    Learned behaviors are hard to unlearn…

    • @[email protected]
      8•1 year ago

      Once it’s learnt this, it’ll just get better at lying when you try to punish/correct lies

      • mozingo
        4•1 year ago

        Which is exactly what the article says happens
