Pricefield | Lemmy
David Gerard (mod) to [email protected] • English • 5 days ago

How to pass an AI coding benchmark: train on the questions

pivot-to-ai.com

SWE-Bench Verified by OpenAI tests how well a model can solve real bugs in real Python code from GitHub. These bugs are all public information — so the AI models have almost certainly trained on th…

podcast version
video version

  • @[email protected]
    English • 13 • 5 days ago

    Artificial intelligence and cheating/lying: two great tastes that go together

[email protected]


Big brain tech dude got yet another clueless take over at HackerNews etc? Here’s the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

  • 31 users / day
  • 147 users / week
  • 326 users / month
  • 1.07K users / 6 months
  • 2 subscribers
  • 954 Posts
  • 26.7K Comments
  • mods: David Gerard