Reddit usernames like ‘SolidGoldMagikarp’ are somehow causing the chatbot to give bizarre responses.

  • @[email protected]
    8 points · 2 years ago

    I’m not surprised they used Reddit data to train. I am a bit shocked at how fucking lazy and haphazard they were with the data.

    There are only logical arguments for anonymizing the data, which it looks like they didn’t do. Not doing so is a massive privacy risk, and it also puts the company at legal risk. Who knows what other bizarre info it’ll leak.

      • VanillaGorilla
        2 points · 2 years ago

        Yeah, right? Reddit isn’t private like Lemmy or kbin. I’d be shocked to know this comment would be out in the open, but here it’s absolutely safe.

    • FaceDeer
      2 points · 2 years ago

      Setting aside the silliness of anonymizing data that’s already wide open to the public, if you were to anonymize the usernames you’d end up producing a worse AI, because the literal username of the person in question is often significant to the context of what’s being written. Think of all the “relevant username” comments, for example. People make puns about usernames, berate people for having offensive usernames, and so forth. If those usernames were all replaced with anonymized substitutes, the AI would be training on nonsense.