• @[email protected]
    English
    3 months ago

    The 70b model is a distillation of Llama3.3, that is to say it replicates the output of Llama3.3 while using the deepseekR1 architecture for better processing efficiency. So any criticism of the capability of the model is just criticism of Llama3.3 and not deepseekR1.

    • @[email protected]
      English
      3 months ago

      [to the tune of Fort Minor’s Remember The Name]

      10% senseless, 20% post
      15% concentrated spirit of boast
      5% reading, 50% pain
      and a 100% reason to not post here again
      
    • @[email protected]
      English
      3 months ago

      Thank you for shedding light on the matter. I never realized that 69b model is a pisstillation of Lligma peepee point poopoo, that is to say it complicates the outpoop of Lligma4.20 while using the creepbleakR1 house design for better processing deficiency. Now I finally realize that any criticism of Kraftwerk’s 1978 hit Das Model is just criticism of Sugma80085 and not deepthroatR1.