
Byspectra
Founded Date July 21, 1930
Sectors Sales
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
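The names in this list follow a visible pattern: a fixed `DeepSeek-R1-Distill-` prefix, the base model family (Qwen or Llama), and a parameter count in billions. As a minimal sketch, assuming that pattern holds for every listed model, the Python below splits each name into those parts and groups the sizes by base family. The helper names (`DistilledModel`, `parse_model_name`, `group_by_family`) are hypothetical and introduced only for illustration; they are not part of any DeepSeek tooling.

```python
import re
from typing import Dict, Iterable, List, NamedTuple


class DistilledModel(NamedTuple):
    """Hypothetical record for one distilled model name."""
    name: str
    base_family: str       # e.g. "Qwen" or "Llama"
    params_billion: float  # parameter count in billions, taken from the name


# Assumed naming pattern, inferred from the list above:
# DeepSeek-R1-Distill-<family>-<size>B
_NAME_RE = re.compile(
    r"^DeepSeek-R1-Distill-(?P<family>[A-Za-z]+)-(?P<size>\d+(?:\.\d+)?)B$"
)

DISTILLED_MODELS = [
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Llama-8B",
    "DeepSeek-R1-Distill-Qwen-14B",
    "DeepSeek-R1-Distill-Qwen-32B",
    "DeepSeek-R1-Distill-Llama-70B",
]


def parse_model_name(name: str) -> DistilledModel:
    """Split a model name into base family and size; reject anything else."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized distilled model name: {name!r}")
    return DistilledModel(name, match["family"], float(match["size"]))


def group_by_family(names: Iterable[str]) -> Dict[str, List[float]]:
    """Map each base family to its model sizes, smallest first."""
    groups: Dict[str, List[float]] = {}
    models = sorted((parse_model_name(n) for n in names),
                    key=lambda m: m.params_billion)
    for model in models:
        groups.setdefault(model.base_family, []).append(model.params_billion)
    return groups


if __name__ == "__main__":
    print(group_by_family(DISTILLED_MODELS))
    # {'Qwen': [1.5, 7.0, 14.0, 32.0], 'Llama': [8.0, 70.0]}
```

Grouping this way makes it easy to see that four of the six distilled models are built on Qwen bases and two on Llama bases.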
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.