- February 3, 2025
- Blog, Market Commentary
DeepSeek or DeepFake? The Moore’s Law Moment for Large Language Models
It’s been an amazing week to be a technology investor. We got news out of China, from a company called DeepSeek that has actually been around for a while, that they have broken the code in the open, creating a model that supposedly replicates the performance of some of the closed generative AI systems, or what are called large language models, using open source code and an open model. This is huge. It’s a huge breakthrough, and people are calling this the Sputnik moment.
The question is, is this a DeepSeek or a deep fake? It’s an interesting debate. There are a lot of people in the US who have been caught off guard, saying, “Oh my gosh, how can the Chinese come up with a model using inferior infrastructure?” If you think about what happened with some of the restrictions that were put on chip exports to China, the US government disallowed the highest-performing chips from Nvidia to be exported to China.
You have this group called DeepSeek that used a lower-performing chip from Nvidia to create these models, because they didn’t have access to the higher-performing ones, to see if they could create something that would replicate the higher-performing models. The data came out, they made the code and the models available, and people started downloading. Lo and behold, the models were very competitive with what we see in the closed models developed by OpenAI, which, as we know, powers applications like ChatGPT.
The question is, is this the Sputnik moment? I would call this the Moore’s Law moment for generative AI. If you remember, Gordon Moore was a co-founder of Intel. He wrote a paper back in 1965 predicting that the number of transistors on a chip would double every couple of years, and that this would drive incredible cost reductions and gains in computing power, accelerating technology adoption over the years.
It’s incredible if you think about Moore’s Law today. If you do the math on this doubling of transistors, it is actually continuing, and people have calculated that it will continue even until 2034. This idea that we are packing more and more computing power into chips has really come true. My sense is that this breakthrough by DeepSeek, going from closed models to open models and creating a more efficient yet very powerful model using open source code, is the Moore’s Law moment, which is going to really accelerate technology adoption.
Just a quick note about how DeepSeek works. It’s fascinating. If you look at it, it uses something that’s called a ‘mixture of experts’ approach. This is really like having a team of specialized experts with only the most relevant ones called upon for each task.
DeepSeek uses about 671 billion parameters in its model, but the model only activates the subset of parameters needed for each request. Think of it this way: you walk into a library wanting to find a book on a topic. The model finds only the relevant book, not all the books on the topic. It doesn’t have to read every single word or line in the library; it goes straight to the book you need. That’s really the efficiency that’s happening here.
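To put a number on that sparse activation, here is a quick back-of-the-envelope calculation. The 671 billion total is from above; the roughly 37 billion active parameters per request is the figure commonly reported for DeepSeek-V3, used here as an assumption rather than something stated in this piece:

```python
# Back-of-the-envelope: how sparse is DeepSeek's activation?
# 671B total parameters is stated above; ~37B active per request is the
# commonly reported figure for DeepSeek-V3 (an assumption, not from this text).
total_params = 671e9
active_params = 37e9

fraction_active = active_params / total_params
print(f"Active per request: {fraction_active:.1%}")  # roughly 5.5%
```

In other words, under these assumed numbers, only about one parameter in eighteen does any work on a given request, which is where the efficiency gain comes from.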
The ‘mixture of experts’ approach allows DeepSeek to optimize both performance and resource usage, dynamically adapting to different types of queries. This is in contrast to what ChatGPT does right now with the traditional transformer model, which is like having every expert working on every task. That’s much less efficient than going straight to the expert you need for your query.
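To make the routing idea concrete, here is a toy Python sketch of mixture-of-experts dispatch. Everything in it, the expert names and the keyword-based router, is hypothetical illustration only; a real model like DeepSeek uses learned neural gating over neural sub-networks, not keyword matching.

```python
# Toy mixture-of-experts dispatch. Each "expert" is just a function here;
# in a real model, experts are sub-networks and the router is learned.
EXPERTS = {
    "math":   lambda q: f"[math expert] handling: {q}",
    "code":   lambda q: f"[code expert] handling: {q}",
    "prose":  lambda q: f"[prose expert] handling: {q}",
    "trivia": lambda q: f"[trivia expert] handling: {q}",
}

def router_scores(query):
    """Score each expert for the query (keyword match stands in for a learned gate)."""
    keywords = {
        "math":   ["sum", "integral", "solve", "equation"],
        "code":   ["python", "function", "bug", "compile"],
        "prose":  ["essay", "summarize", "rewrite", "story"],
        "trivia": ["capital", "year", "who", "where"],
    }
    q = query.lower()
    return {name: sum(word in q for word in words)
            for name, words in keywords.items()}

def mixture_of_experts(query, top_k=1):
    """Run only the top_k scoring experts; the rest stay idle."""
    scores = router_scores(query)
    chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return [EXPERTS[name](query) for name in chosen]

print(mixture_of_experts("solve this equation for x"))
```

With `top_k=1`, only one expert runs per query; raising `top_k` trades compute for broader coverage, which is essentially the knob real mixture-of-experts models tune.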
If you look at both the DeepSeek model and let’s say ChatGPT, what are the pros and cons? The pros for DeepSeek are that it’s open source, it’s cost-effective, and it’s efficient for coding and technical tasks. It has faster response times for structured queries, which is because it’s not looking at every single detail. ChatGPT and the closed models will probably give you a better answer, but it takes a lot longer. And because it has to read every single data point, it’s also going to be a lot more expensive.
If you think about simpler tasks, the types of things that are going to be embedded in application software on the services side, think of the traditional call center. That is going to be replaced by a much better large language model that will be your client services person when you call. That’s going to be a computer, and it’s probably going to be much better than a traditional call center.
More pros for DeepSeek: superior performance in mathematical computations. ChatGPT or the closed models may be more user-friendly in terms of the interface, but DeepSeek already has superior performance in mathematical computations, which is really big, and lower resource requirements. That’s the efficiency in resources: everything from computing power to electricity. That’s why you saw a sell-off in a lot of the electricity companies, which just seems bizarre.
The closed models still have more consistent performance across various tasks, so if you need a model that is more broad-based, the closed models are there. The cons for DeepSeek are potential biases in politically sensitive topics, and that it requires more verification for complex responses. The cons of ChatGPT or the closed models are that they’re subscription-based and require premium access.
Okay, so clearly, you need to verify the data that comes out because it’s an open-source model, so you have to be very careful that the output is verified. It also has a less intuitive interface for casual users. You can go and Google DeepSeek and see it; it’s very much for the developer. It’s not something for the casual user like ChatGPT, which anybody can download and use today. If you talk to your children who are in their 20s, probably everyone is using some form of ChatGPT or one of its competitors.
The other thing is that DeepSeek has a stricter content moderation policy, and ChatGPT has similar restrictions. ChatGPT can be slower for technical computations, but again, there are some restrictions there. In terms of who’s the winner here, DeepSeek or ChatGPT, it’s very interesting. For model architecture, both have strengths. DeepSeek uses that mixture-of-experts approach, which is very, very good, and ChatGPT uses a much more broad-based approach. If you want more nuanced and more precise results, you’re still going to use the higher-end closed models. But again, for simpler tasks, you’re probably going to use DeepSeek.
Data visualizations with ChatGPT are much more developed; it’s been out longer, and they’ve worked on the interface. In technical performance, DeepSeek is actually ahead, with superiority in mathematics and coding tasks. ChatGPT has strong general performance across tasks, but again, DeepSeek is better at mathematics at the moment. In terms of user experience, ChatGPT is much more user-friendly. Looking at cost efficiency, that’s the big news item here: if you compare a query through Google search and a query through ChatGPT, the cost is roughly 10 times higher to use a large language model on a closed model like ChatGPT versus a Google search.
This is very important to companies that have cost restrictions. The very large companies are deploying a lot of CapEx to deal with this, but if you think about smaller companies, they’re going to be looking very much at these open models that are much cheaper to deploy.
Now, data privacy. Without a doubt, closed models are hands down the winner there. That would be the ChatGPTs. If you’re worried about data privacy, you’re not going to use an open model. There’s always going to be a coexistence of the two.
Next, customization. You’re going to get a lot more customization out of an open model, so the nod there goes to DeepSeek. Response speed, at the moment, of course, also goes to DeepSeek. The founders of OpenAI – that’s Sam Altman and a lot of his colleagues – have been falling over themselves in the last couple of days to say, “Oh, we’re coming out with new things and much more powerful models that will certainly compete very nicely versus DeepSeek.” There’s a real arms race here, if you want to call it that, when it comes to computing power.
When it comes to collaboration features, ChatGPT and the OpenAI applications have been out longer. There’s strong integration and sharing features that are there already. I’m sure on the open model side that will also be the case. We’re seeing some announcements even from some of the big companies. We’ve had earnings from some of the large companies like Microsoft and Meta and so forth.
Amazon just announced yesterday that it’s now allowing developers to use DeepSeek models in its cloud. Amazon is not only an eCommerce company; it is also a major cloud provider through AWS, and it’s allowing downloaded DeepSeek models to be integrated into its cloud. We’re seeing that with Chinese companies as well. Companies like Alibaba, which is the Amazon and PayPal and everything combined in China, have made similar announcements. They’re now building their own open source models based on DeepSeek’s. Models are learning from other open source models, creating a domino effect among developers, which is very exciting.
The key takeaway here is that these open source models are going to accelerate the adoption of generative AI while vastly reducing cost. This is very exciting for us as technology investors. And what does this mean for positioning in terms of investments? It is really the move away from the infrastructure phase of this technology. If we think about the last five years, three or four years ago it was all about cloud computing. Then all of a sudden, we had this massive breakthrough at the end of 2022 with ChatGPT, and that was OpenAI’s development.
Now, we’re at this point where we’re looking at democratization from a developer point of view with these open source models. What does that mean in our portfolio? We’re fading our exposure to the infrastructure side of that. There’s still going to be huge amounts of demand for chips. Of course, that includes the Nvidias and the Broadcoms and things like that. But now, the shift is going to the attention in the market and for investors, it’s going to be on the software side.
If we look at application software companies, start with the incumbents, like Microsoft, which partnered early with OpenAI as its cloud provider; OpenAI runs on Azure. But now you see the synergies, if you want to call them that, at application software companies. Attention now should be paid to names like Salesforce or ServiceNow. These are the companies that have had to pay a lot of money, meaning CapEx, to deploy all these new technologies.
The exciting thing for these companies is that the cost of deploying some of these more basic application software services and unique applications is coming down. For us as investors, it’s very exciting to see this happen, and certainly, it’s going to drive down the cost structure of deployment. Profit margins for software companies were in the spotlight: would the CapEx needed to develop these new features crimp their margins? This definitely provides relief on that front.
That’s it for our assessment of DeepSeek. We certainly do not think this is a deep fake, but this is the Moore’s Law moment for technology and we find that very exciting. Feel free to contact us if you have any questions!