With the rising popularity of DeepSeek, a recent report by Bernstein said that the Chinese AI app looks fantastic but is not a miracle, and that it was not built for $5 million.
The report said that the claim that DeepSeek, which is comparable to OpenAI’s ChatGPT, was built at a cost of $5 million is false.
“We believe that DeepSeek DID NOT ‘build OpenAI for $5M’; the models look fantastic, but we don’t think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown,” ANI reported, citing the Bernstein report.
“The models they built are fantastic, but they aren’t miracles either,” said Bernstein analyst Stacy Rasgon, who follows the semiconductor industry and was one of several stock analysts describing Wall Street’s reaction as overblown, the Associated Press reported.
DeepSeek has developed two main families of AI models, ‘DeepSeek-V3’ and ‘DeepSeek R1’.
The V3 model is a large language model that uses a mixture-of-experts (MoE) architecture. This architecture combines several smaller models that work together, delivering high performance while using fewer resources than other large models. In total, the V3 model has 671 billion parameters, of which about 37 billion are active at any one time.
It also includes innovative techniques such as Multi-Head Latent Attention (MHLA), which reduces memory usage, and mixed-precision training using FP8 computation for efficiency.
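To illustrate why only a fraction of the parameters is active at a time, here is a minimal Python sketch of top-k expert routing, the general idea behind mixture-of-experts layers. It is not DeepSeek’s implementation: the function names, the four toy experts, the gate scores, and the choice of k=2 are all assumptions made for illustration. Only the 671 billion and 37 billion figures come from the report.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalise their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    weights = route_top_k(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 1.5]  # made-up router scores for one token

chosen = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)

# Share of V3's parameters that is active, using the figures in the report.
active_fraction = 37e9 / 671e9

print("Experts used:", sorted(chosen))
print(f"Active share of parameters: {active_fraction:.1%}")
```

In this toy setup only two of the four experts run for the token, which is the mechanism that lets a model with a very large total parameter count use far less compute per token.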
For the V3 model, DeepSeek used a cluster of 2,048 NVIDIA H800 GPUs for nearly two months, amounting to about 2.7 million GPU hours for pre-training and 2.8 million GPU hours including post-training.
According to estimates, this training would cost roughly $5 million at a rental rate of $2 per GPU hour. The report notes that this figure does not account for other costs incurred in developing the model.
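The arithmetic behind that estimate can be checked in a few lines of Python. This is a rough sketch using only the figures quoted above; the variable and function names are made up for illustration, and the $2 hourly rate is the report’s assumed rental price, not a measured cost.

```python
# Figures as quoted in the article; treat them as reported estimates.
NUM_GPUS = 2_048
TOTAL_GPU_HOURS = 2.8e6   # pre-training plus post-training
RENTAL_RATE_USD = 2.0     # assumed rental price per GPU hour

def training_cost(gpu_hours, rate_per_hour):
    """Rental cost of the run in US dollars."""
    return gpu_hours * rate_per_hour

def wall_clock_days(gpu_hours, num_gpus):
    """Days the cluster would run if every GPU were busy the whole time."""
    return gpu_hours / num_gpus / 24

cost = training_cost(TOTAL_GPU_HOURS, RENTAL_RATE_USD)
days = wall_clock_days(TOTAL_GPU_HOURS, NUM_GPUS)

print(f"Estimated rental cost: ${cost / 1e6:.1f} million")
print(f"Implied wall-clock time: {days:.0f} days")
```

With these inputs the cost works out to about $5.6 million and the run to about 57 days, which is consistent with the "nearly two months" cited above. As the report stresses, the figure covers GPU rental for this one training run only.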
DeepSeek R1, which mainly competes with OpenAI’s models, is built on the V3 foundation but uses Reinforcement Learning (RL) and other techniques to improve reasoning capabilities.
The resources required for the R1 model were very substantial and were not accounted for by the company, the report said.
However, the report acknowledged that DeepSeek’s models are impressive, while saying that the panic and the exaggerated claims about building an OpenAI competitor for $5 million are incorrect.