7 Comments
Tom Austin, Sr.

Isn't DeepSeek's code already available on GitHub (https://github.com/deepseek-ai), along with their weights? What's missing is the data that DeepSeek used to train their model(s). I can download DeepSeek from GitHub and run it on my own server, separated from the Internet (after the download). What am I missing?

Ben Dickson

Not the code, just the weights. They did not release the code for training the model; they only described the recipe in their paper. There is a Hugging Face project, open-r1, which is trying to replicate the training code for R1.

https://github.com/huggingface/open-r1

Tom Austin, Sr.

But have they released the code for inference? Can you comment on the usage notes in https://generativeai.pub/how-to-install-and-use-deepseek-r-1-in-your-local-pc-b77bc20f7566 ?

Ben Dickson

Since the weights are available, you can run the model with any inference engine, as long as you convert the weights to the format that engine expects.

Jim

Thanks. Great survey of the landscape. I would add Tim Lee's theory that the NVDA stock price drop might have been due to leaked news of the tariffs on Taiwan, which were announced a few hours later on Monday, rather than to the R1 launch a week earlier. As you say, R1 seems likely to be good for the cloud providers and NVDA.

Andy X Andersen

This is a very early round. We are far from AGI, and it will take a lot more resources to get there. It is not clear DeepSeek can keep up.

Ben Dickson

Agreed. There is a lot more to go, and many more questions to be answered about DeepSeek itself.
