Nvidia acknowledges a groundbreaking AI discovery by DeepSeek, despite suffering a significant loss in market value. The Chinese startup’s advanced A.I. model, R1, has sent shockwaves through Silicon Valley, causing major A.I. stocks to plummet.
Nvidia has praised DeepSeek's recent achievement in developing an advanced A.I. model called R1, despite suffering a loss of roughly $500 billion in market cap on the news.
The Chinese startup claimed that it had built the highly capable model at a fraction of the cost incurred by its American rivals, such as OpenAI's GPT and Google's Gemini.
This announcement sent shockwaves through Silicon Valley, causing major A.I. stocks to plummet.
Despite the negative financial impact, Nvidia acknowledged DeepSeek’s breakthrough in a statement to Observer. ‘DeepSeek is an excellent A.I. advancement and a perfect example of test-time scaling,’ said a company spokesperson. The spokesperson also noted that DeepSeek’s work demonstrates how new models can be created using test-time scaling, leveraging widely available models and compute that is fully export control compliant.
Test-time scaling is an inference technique that adjusts how much computation an A.I. model spends on a query based on the complexity of the task, allocating more compute to harder problems as they arrive. Because the extra effort happens at inference rather than during training, the approach offers a more efficient and cost-effective way to deploy capable models, and it can improve accuracy on difficult edge cases.
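The core idea can be sketched in a few lines of Python. The toy "model," "verifier," and difficulty scores below are invented for illustration and are not DeepSeek's or Nvidia's actual method; the point is only that harder tasks receive a larger inference budget:

```python
import random

def solve_once(task, rng):
    # Toy "model": a noisy guess whose error grows with task difficulty.
    return task["answer"] + rng.gauss(0, task["difficulty"])

def solve_with_test_time_scaling(task, rng, base_samples=1, max_samples=32):
    # Allocate more inference-time samples to harder tasks, then keep
    # the candidate a cheap "verifier" ranks best (here: closest to
    # the mean of all candidates).
    n = min(max_samples, base_samples * (1 + int(task["difficulty"] * 4)))
    candidates = [solve_once(task, rng) for _ in range(n)]
    estimate = sum(candidates) / len(candidates)
    best = min(candidates, key=lambda c: abs(c - estimate))
    return best, n

rng = random.Random(0)
easy = {"answer": 10.0, "difficulty": 0.1}
hard = {"answer": 10.0, "difficulty": 2.0}
_, n_easy = solve_with_test_time_scaling(easy, rng)
_, n_hard = solve_with_test_time_scaling(hard, rng)
# The hard task is given several times more compute than the easy one.
```

Production systems use far more sophisticated sampling and verification, but the budget-per-query structure is the same.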
According to its research paper, DeepSeek's R1 model was trained on 2,048 Nvidia H800 chips for a total cost of under $6 million. However, some experts have questioned the validity of this claim. "DeepSeek may have more Nvidia chips than it's letting on," said Alexandr Wang, founder and CEO of Scale AI.
Wang co-founded Scale AI in 2016. The company supplies data labeling and model evaluation services used to train and test machine learning systems, and counts major A.I. labs among its customers.
DeepSeek developed R1 as a ‘distilled A.I.’ model—a smaller model trained to replicate the behavior of larger A.I. systems. Distilled models consume less computational power and memory, making them an efficient solution for resource-constrained devices such as smartphones.
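The idea behind distillation can be shown with a minimal sketch. The one-parameter "teacher" and "student" below are invented for illustration and bear no resemblance to R1's actual training setup; what they demonstrate is that a student fit to a teacher's soft output probabilities, rather than to raw labels, ends up mimicking the teacher's behavior:

```python
import math
import random

def teacher_prob(x):
    # Stand-in for a large "teacher" model: the probability it
    # assigns to class 1 for input x (true weights: 2.0 and -1.0).
    return 1 / (1 + math.exp(-(2.0 * x - 1.0)))

def train_student(data, lr=1.0, epochs=3000):
    # Student: a tiny logistic model sigmoid(w*x + b), trained with
    # full-batch gradient descent to match the teacher's soft outputs.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x in data:
            p_teacher = teacher_prob(x)
            p_student = 1 / (1 + math.exp(-(w * x + b)))
            # Gradient of cross-entropy w.r.t. the student's logit.
            g = (p_student - p_teacher) / len(data)
            gw += g * x
            gb += g
        w -= lr * gw
        b -= lr * gb
    return w, b

rng = random.Random(0)
inputs = [rng.uniform(-2.0, 2.0) for _ in range(50)]
w, b = train_student(inputs)
# The student recovers weights close to the teacher's (2.0, -1.0)
# without ever seeing hard labels, only the teacher's probabilities.
```

Real distillation trains a smaller neural network on a larger one's output distributions, but the loss structure, matching soft targets instead of ground truth, is the same.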
Microsoft and OpenAI are currently investigating whether DeepSeek improperly obtained OpenAI's proprietary data through its GPT APIs, using distillation techniques to build R1. "While scaling and optimization are valuable, there are limits to how small an A.I. model or training process can be while still effectively simulating thinking and learning," said Itamar Friedman, the former director of machine vision at Alibaba.
In 2019, Microsoft invested $1 billion in OpenAI, the artificial intelligence research organization behind GPT. The partnership was aimed at accelerating the development of cutting-edge A.I., with Microsoft's Azure cloud platform providing OpenAI the scalable infrastructure it needs to train and deploy its models.
DeepSeek was founded in 2023 as a spinoff from the hedge fund High-Flyer, led by Liang Wenfeng. The company’s breakthrough has sent shockwaves through the A.I. industry, raising questions about the future of A.I. development and deployment.