ygolo
For those wondering why US companies didn't build a model similar to R1: OpenAI did.
It was called o1.
They built it...who knows how long ago. OpenAI isn't that open anymore. There was a lot of hype and fear about it. They released a "preview" and then, months later, the full version.
If you follow the marketing of these models, you should know that the fear is a core part of the hype.
It's how they justify closing their research. Even after release, users weren't allowed to see its full chain of thought.
DeepSeek R1 seems like such a breakthrough because they open-sourced it and published their methods as well (with Hugging Face checking whether they work).
If the methods bear out:
1) The need for more copyrighted data is obviated by its rule-based reinforcement learning and "self-play".
2) They publish the full chain of thought, and more people find it "cute" than scary. It's also semi-obvious that they are doing some form of self-play in the chain of thought. So even if it weren't open source, researchers would have figured out something similar.
Yann LeCun nailed the interpretation.
Meta's chief AI scientist says DeepSeek's success shows that 'open source models are surpassing proprietary ones' (www.businessinsider.com)
Meta's chief AI scientist, Yann LeCun, said DeepSeek's success with R1 said more about the value of open-source than Chinese competition.

Open science is why the West won previous technological competitions.
This time, it seems, China and the Global South are on team Open Science, while the West is on the side of secrecy, oppression, and denial of access and services.
Edit: The rumblings about DeepSeek circumventing Terms of Service are getting louder. Knowledge distillation and synthetic data generation are some of the most common use cases for startups using OpenAI. If Microsoft and OpenAI go after DeepSeek for going past some form of rate limit (it is unclear what the allegations are), I feel like many people will just switch those use cases to Gemini, Anthropic, or even other open-source models.
In principle, if they trained on data under fair use, then using their tool to make use of that same data seems like fair use too.
Also, regarding the suppression of the Global South, I am thinking of reading this book:

Beyond the Valley: How Innovators around the World are … (www.goodreads.com)
How to repair the disconnect between designers and user…