I wonder what his first clue was.
After seeing that the public was willing to call DeepSeek “open source” for releasing 800 lines of Python, an opaque model, and a PDF vaguely describing (or just praising) the proprietary training framework… Yeah, I imagine he feels like he missed an opportunity.
At least we have HuggingFace
It’s been a few days and a simple search reveals it’s already been reproduced by many different bodies using the “vague” PDF. What’s this disservice for?
TBH the paper is a bit light on the details, at least compared to the standards of top ML conferences. A lot of DeepSeek’s innovations on the engineering front aren’t super well documented in their papers (at least not well enough that I could confidently reproduce them).