• tiredofsametab
    2
    8 months ago

    I specifically used the phrase “Please generate an image of a room with zero elephants”. It created two images that were almost identical and both contained pictures/paintings of elephants in frames. Cheeky.

    I responded with “Each image contains an elephant.”

    It generated two more, one of which still had a painting of an elephant.

    Now I’m out of generations until tomorrow. Overall a fairly shit first experience with DALL-E.

  • @[email protected]
    2
    1 year ago

    “Please draw a picture of a house and a room with no elefant in the room and no giraffe outside the house” I meeeean

  • @[email protected]
    English
    63
    1 year ago

    “can you draw a room with absolutely no elephants in it? not a picture not in the background, none, no elephants at all. seriously, no elephants anywhere in the room. Just a room any at all, with no elephants even hinted at.”

    • @[email protected]
      30
      1 year ago

      I’m getting the impression, the “Elephant Test” will become famous in AI image generation.

      • @[email protected]
        5
        edit-2
        1 year ago

        It’s not a test of image generation but of text comprehension. You could rip CLIP out of Stable Diffusion and replace it with something that understands negation, but that’s pointless: the pipeline already takes two prompts for exactly that reason. One is for “this is what I want to see”, the other for “this is what I don’t want to see”. Both get passed through CLIP individually, so CLIP on its own doesn’t need to understand negation; the rest of the pipeline just has to have a spot to plug in both positive and negative conditioning.

        Mostly it’s just KISS in action, but occasionally it’s actually useful, as you can feed it conditioning that’s not derived from text, so you can tell it “generate a picture which doesn’t match this colour scheme here” or something. Say the positive conditioning is the text “a landscape” and the negative conditioning is an image, the archetypal “top blue, bottom green”. Now it’ll have to come up with something more creative, as the conditioning pushes it away from the things it considers normal for “a landscape” and would otherwise settle on.
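
        For anyone wondering what “a spot to plug in both positive and negative conditioning” can look like, here is a minimal sketch of the classifier-free guidance mix as I understand it. The function name `guided_noise` and the toy two-number “noise predictions” are made up for illustration; a real pipeline does this on large tensors at every denoising step, and the details vary by implementation.

```python
from typing import Sequence


def guided_noise(
    eps_positive: Sequence[float],
    eps_negative: Sequence[float],
    guidance_scale: float,
) -> list[float]:
    """Toy classifier-free guidance mix (hypothetical helper, not a library API).

    eps_positive: noise predicted with the "what I want to see" conditioning.
    eps_negative: noise predicted with the "what I don't want to see" conditioning.
    The result starts at the negative prediction and moves toward the positive
    one, overshooting it when guidance_scale > 1, i.e. away from the negative.
    """
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(eps_positive, eps_negative)
    ]


# Made-up numbers: the two conditionings disagree on both components.
room = [1.0, 0.0]       # stands in for the prediction under "a room"
elephant = [0.0, 1.0]   # stands in for the prediction under "elephants"
steered = guided_noise(room, elephant, guidance_scale=7.5)
```

        Note that nothing here parses the word “no”: the negation lives entirely in which slot the conditioning is plugged into.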

    • Fishbone
      25
      1 year ago

      “Can you a room as aboluteyy no eleephant it all?”

      Dunno what’s giving more “clone of a clone” vibes, the dialogue or the 3 small standing “elephants” in that image.

    • @[email protected]
      English
      1
      edit-2
      8 months ago

      “can you draw a room with absolutely no elephants in it? not a picture not in the background, none, no elephants at all. seriously, no elephants anywhere in the room. Just a room any at all, with no elephants even hinted at.”

      thought about this prompt again, thought I’d see how it was doing now, so this is the seven month update. It’s learning…

  • @[email protected]
    16
    1 year ago

    MidJourney has the same problem. “A room that has no elephants in it” is the prompt.

    There very much is an elephant present.

      • @[email protected]
        3
        1 year ago

        MidJourney doesn’t have a “negative” prompt space. It does have a “no” prompt, but it isn’t very good at obeying it.

        This was just a fun thing to try, I’m not taking it seriously. “No” is not weighted contextually in the prompt, so draw + elephants + room are what the AI sees. The correct prompt would be “draw an empty room” without inserting any unnecessary language, and you get just that:

      • @[email protected]
        8
        1 year ago

        I think most of us understand that, and this exercise is the realization of that issue. These AIs do have “negative” prompts, so if you asked one to draw a room and it kept giving you elephants in the room, you could add “-elephants”, or whatever the “no” format is for the particular AI, and hope that it can overrule whatever reference it is using to generate elephants in the room. It’s not always successful.

        • @[email protected]
          English
          2
          edit-2
          1 year ago

          I think the main point here is that image generation AI doesn’t understand language; it’s giving weight to pixels based on tags, and yes, you can give negative weights too. It’s more evident if you ask it to do anything positional or logical; it’s not designed to understand that.

          LLMs are, though, so you could combine the tools so the LLM can command the image generator and even create a seed image to apply positional logic. I was surprised to find out that asking ChatGPT to generate a room without elephants via DALL-E also failed. I would have expected it to convert the user query to tags and not just feed it in raw.
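
          A rough sketch of what “convert the user query to tags” could mean, with a plain regex standing in for the LLM. Everything here is an assumption for illustration: `split_negations` is a hypothetical helper, the pattern only catches “no X”, “zero X” and “without X”, and a real system would need far more than this.

```python
import re

# Matches "no X", "zero X", "without X", optionally preceded by "with".
NEGATION = re.compile(r"\b(?:with\s+)?(?:no|zero|without)\s+(\w+)", re.IGNORECASE)


def split_negations(prompt: str) -> tuple[str, str]:
    """Split a prompt into (positive_prompt, negative_prompt).

    Negated nouns are pulled out of the text and collected, without
    duplicates, into a comma-separated negative prompt.
    """
    negatives = [match.group(1).lower() for match in NEGATION.finditer(prompt)]
    positive = NEGATION.sub("", prompt)
    # Collapse the whitespace left behind and trim stray punctuation.
    positive = re.sub(r"\s+", " ", positive).strip(" ,.")
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return positive, ", ".join(dict.fromkeys(negatives))


positive, negative = split_negations("a room with no elephants")
```

          The two strings could then be passed to whatever positive and negative prompt inputs the image generator offers, so the generator never sees the word “elephants” on the positive side.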

  • Sibbo
    6
    1 year ago

    This is scarily human. Try not to think about elephants for a minute.

    Did it work? Probably not. If yes, what mind trick did you use?

  • @[email protected]
    English
    22
    edit-2
    1 year ago

    Literally just checked this with Google’s Gemini, same thing. Though it seems to have gotten 1/4 right… maybe. And technically the one with the painting has no actual elephants in it, in a sort of malicious compliance kind of way. You’d think it was actually showing a sense of humor (or just misunderstanding the prompt).
