Isn't DeepSeek's code already available on Github https://github.com/deepseek-ai, along with their weights. What's missing is the data that DeepSeek used to train their model(s). I can download DeepSeek from github and run it on my own server, separated from the Internet (after download.) What am I missing?
Not the code. Just the weights. They did not release the code for training the model. They just described the recipe in their paper. There is a project by Hugging Face, open-r1, which is trying to replicate the code for R1.
Thanks. Great survey of the landscape. I would add Tim Lee's theory that the NVDA stock price drop might have been due to leaked news of the tariffs on Taiwan that were announced a few hours later on Monday, rather than to R1 launch a week earlier. As you say, R1 seems likely to be good for the cloud providers +nvda.
Isn't DeepSeek's code already available on Github https://github.com/deepseek-ai, along with their weights. What's missing is the data that DeepSeek used to train their model(s). I can download DeepSeek from github and run it on my own server, separated from the Internet (after download.) What am I missing?
Not the code. Just the weights. They did not release the code for training the model. They just described the recipe in their paper. There is a project by Hugging Face, open-r1, which is trying to replicate the code for R1.
https://github.com/huggingface/open-r1
But have they released the code for Inferencing? Can you comment on usage notes in https://generativeai.pub/how-to-install-and-use-deepseek-r-1-in-your-local-pc-b77bc20f7566 ?
With the weights being available, you can run it with any inferencing engine as long as you convert it to the proper format.
Thanks. Great survey of the landscape. I would add Tim Lee's theory that the NVDA stock price drop might have been due to leaked news of the tariffs on Taiwan that were announced a few hours later on Monday, rather than to R1 launch a week earlier. As you say, R1 seems likely to be good for the cloud providers +nvda.
This is a very early round. We are far from AGI, and will take a lot more resources to get there. It is not clear DeepSeek can keep up.
Agreed. A lot more to go. And many more questions to be answered about DeepSeek itself.