On the importance of Open Source AI

A large man smoking a cigar (as depicted in the 1925 cartoon 'The Subsidised Mine Owner - Poor Beggar!'), is partially obscured by a pixelated white-and-grey cloud. In the background, part of the Wheel of Fortune can also be seen. The goddess who usually turns the wheel is hidden by the man in the foreground. Four characters (perhaps depicting the same person at different points in time) sit on the wheel. The character with a crown on top is holding two globes/trophy and has treasure on his lap. To his right, one character extends his hand towards him, while another looks on from the left. The fourth character is already falling. Two black, dotted, squares outline the large man and the wheel. To the left of the wheel, in the background, there is a pixelated yellow-orange sun. To the right, are some sparkling, pixelated yellow stars.

Below is a formatted transcript of Yann LeCun's keynote speech from a couple of days ago at the UN Open Source Week 2026.

He's saying some important things here, not least that fears around AI are overblown, probably to ensure that regulation destroys legitimate Open Source AI alternatives. I like his analogy of the printing press and use of the phrase "medieval obscurantism".

There's a summary and some commentary here, while LeCun is putting his money and effort where his mouth is with Tapestry, which the GitHub repo describes as an effort "to give every nation and participant frontier AI they can call their own — uniting a global consortium to train a shared frontier model from which partners build and own sovereign models aligned to their national, socio-cultural, and industrial needs."

Here's the transcript:

Thank you very much for the introduction. Thank you for attending this session. I think this is a very timely event, and I'm truly honored to be here and to tell you about open source.

So first of all, I came to this building 18 months ago and argued for AI open source at the UN Security Council. The message I'm going to deliver today is very similar to the one I delivered 18 months ago, which didn't go very far back then. I'm hoping since then many things have happened which I think motivate more countries around the world to have AI sovereignty, but also to subscribe and support an open source approach to it.

AI is quickly becoming a platform. To some extent, actually, it has already become a platform that a lot of us rely on. When I'm talking about AI, I'm not necessarily talking about specialized applications of AI, but the AI that is built around large language models.

What is happening is that increasingly, AI is mediating all of our interaction with the digital world, with information more generally. Now, if all of our information is being mediated by AI systems, and by the way, increasingly, this is going to become more prevalent because in addition to smartphones, we're going to be running around with smart devices like the smart glasses I'm wearing at the moment. I can take a picture of you. I hope you smiled. Increasingly, we're going to just ask questions to our AI assistant, which will be with us at all times. It will be personalized AI assistants that hear what we hear. Eventually, they will see what we see and they'll become our best digital friend. They will really become a staff member, if you want. All of us will be acting like a manager being constantly followed by a staff of helpers, but those helpers will be digital.

Now, it's already happening to a large extent that whenever we have a question, we ask an AI assistant. We don't go to a primary source. We certainly don't go to libraries anymore, sadly, and it's a very drastic change of behaviour which may lead to drastic changes in society.

If the information, our information diet, is entirely mediated by AI systems, and those AI systems are proprietary systems produced by a handful of companies, otherwise [it] goes to the US and China. It's very dangerous for culture, linguistic diversity, diversity in centres of interest, in value systems, in political opinions and biases, and for democracy, for human rights.

We cannot afford that all of the information is funneled through systems that are absolutely necessarily biased. There is no such thing as an unbiased AI system.

So what should we do? If we want sovereignty, if we want to preserve cultural diversity, linguistic diversity, if we want people to have access to a wide diversity of AI systems, in my mind, the only way to get to that point is open source AI platforms, or open AI platforms, I should say, open AI platforms. Because most countries around the world cannot necessarily afford, or maybe don't have the resources or the talents, to actually build their own LLM.

A number of countries can do this. The LLMs they produce are good, they're not great. They're not at the top, but they're good. The talents are there. Some countries [have] significant resources in compute, but there is a way that an open source effort that would be collaborative around the world could actually surpass in performance the proprietary systems. Because in the end, people will just use the best system that's around.

Here is how it would work. Each country, each region, each academic institution, whatever it is, would digitize their own cultural material and will contribute to training a global AI system that would constitute a repository of all human knowledge. But they would not have to communicate the data.

They could contribute to training a global model by exchanging parameter vectors, which are how AI systems can distill the data down to knowledge.
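[An aside from me, not part of the talk: the idea of training a shared model by exchanging parameter vectors rather than data can be illustrated with federated averaging. This is only a toy sketch under my own assumptions. The talk does not say which protocol Project Tapestry uses, and the function names, the tiny linear model, and the learning rate below are all hypothetical choices made for illustration. Each partner fits the model on its private data and sends back only its updated weights, which a coordinator averages.

```python
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave the partner


def local_update(weights: Vector, data: Dataset,
                 lr: float = 0.05, steps: int = 10) -> Vector:
    """Run a few gradient-descent steps for y ~ w*x + b on private data.

    Only the resulting parameter vector [w, b] is returned to the caller.
    """
    w, b = weights
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Vector], sizes: List[int]) -> Vector:
    """Average the partners' parameter vectors, weighted by dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(dim)]


def train(partners: List[Dataset], rounds: int = 200) -> Vector:
    """Hypothetical coordinator loop: broadcast, update locally, average."""
    global_weights = [0.0, 0.0]
    sizes = [len(data) for data in partners]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in partners]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Two partners hold different samples of the same relationship y = 2x + 1.
partners = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]

if __name__ == "__main__":
    w, b = train(partners)
    print(f"w = {w:.3f}, b = {b:.3f}")
```

The coordinator only ever sees `[w, b]` vectors, never the `(x, y)` pairs. A real system would be training a model with billions of parameters and would have to deal with bandwidth, stragglers, and privacy leakage through the weights themselves, none of which this sketch attempts.]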

I was trying to popularize this idea. I've been trying to popularize this idea for almost three years now, and I talked about it at the UN Security Council, trying to convince also the leadership at Meta to play an important role in there, but that didn't quite fly.

After I left Meta, I started, with some colleagues at the AI Alliance – which is a nonprofit that is promoting open source AI – we decided to start a project called Project Tapestry. You can Google "Project Tapestry", and it's a confederation of partners that can contribute to training a global AI model, while preserving sovereignty on the data and only exchanging parameter vectors as often as possible so that, collaboratively, each region in the world, each academic group, each private company possibly contributes to training a big model while retaining sovereignty on the data, using their own computing infrastructure if they have some.

At the end, we'll get a system that speaks all the languages in the world, understands all the value systems – at least at the basic level – and all the cultural biases, political philosophies, et cetera, and centres of interest.

Now, once we have such an open platform, anybody can take this open platform and fine-tune it for their own purpose, whether it's a commercial enterprise or a government or an academic group or a non‑profit, to serve a particular population. That way, people will have access to a wide diversity of AI assistants. We need such a high diversity of AI assistants for the same reason we need a high diversity of the press.

Now, there are issues with this. One issue is to get everybody to work together. The Tapestry project is very much bottom up. It's people who have expertise in training LLMs and other AI models who decide to collaborate. There's a GitHub repository, you can just sign up. There's no heavy infrastructure or authorisations to get. The means and the resources are provided by the participants themselves, so that can be completely self‑organised.

But of course, there needs to be political support for it. If governments tell their academics, their companies, and give them an incentive to participate in this project, of course, it will go faster.

At present, Project Tapestry has participants. There was an inaugurating workshop that took place early May in Paris. We had participants from several countries in the EU, Switzerland, UK, United Arab Emirates, India, Kazakhstan, Vietnam, Japan, Korea, various groups from academia and industry, groups like IBM, NVIDIA, AMD, Intel – all the hardware suppliers, basically the main ones. There is a groundswell of interest for this project and we see new partners signing up every day.

So I see the history of AI platforms as following the history of the software and hardware platforms of the Internet. In the late 1990s, if you wanted to start an Internet service of some kind, a website, you would have to buy proprietary hardware from Sun Microsystems, Dell, Hewlett Packard and other companies, and then use their proprietary operating system software on top of it.

All of these were completely wiped out in the early 2000s when people started using commodity hardware with an open source software stack. The same thing is going to happen.

A similar thing happened also for the software stack of the mobile communication network. Your cell phone very likely actually runs an open source operating system, unless you have an iPhone. It talks to a cell phone tower that runs an open source software stack based on Linux.

So there is a push by the market. It's not just government decision. The market wants open source platforms because it's cheaper, it's more secure. It's easier to port, to run locally, if you need to, to preserve privacy and everything. There's tons and tons of advantages, which is why we're having this meeting today – I mean this week, I should say.

So, you know, I think that is the direction of history. It's inevitable. Governments should embrace it and accelerate its progress.

Now there is another discourse around AI which is essentially opposed to this, and it's a discourse that essentially claims that AI technology is intrinsically dangerous and should be regulated – its access should be regulated – because bad people will do bad things with it, either with cybersecurity or getting a recipe for a bio‑weapon or something like this.

I think those dangers are very widely overstated. I don't think those dangers are nearly as bad as some people have claimed, some people in the industry and academia have claimed.

I think the alternative, where if you believe AI is intrinsically dangerous and you should regulate its access and therefore open source AI should be banned, I think is extremely dangerous for democracy and human culture in general, as I pointed out earlier.

But those arguments sometimes are justified by security arguments, which I think are contrary. This is a big debate, and I disagree with some of my friends on this issue. But I think to some extent limiting access to AI technology because of security reasons is akin to, in the 15th century, limiting the use of the printing press because, of course, we can control what information will be disseminated through printing.

I think this is akin to medieval obscurantism. I think it's very, very dangerous to limit access to this technology which essentially provides access to all of human knowledge to a wide population.

I'm really happy that this event this week is taking place.

My last point is that there is a feeling in many countries that are neither the US nor China that they've lost the race towards AI and that it's now in the hands of American industry and Chinese industry and there is no way to catch up given the size of the investments that are necessary.

I don't think that's the case. There is a race which perhaps many countries have lost or are losing to build LLM‑based AI systems. Those systems require enormous amounts of computing power. To run them, they require enormous amounts of memory. The reason they require a lot of memory and compute power is because they need to be very large, because they basically are accumulating declarative knowledge.

Those systems are not very smart. They're very good at storing declarative knowledge and regurgitating it at the right time to questions.

They're only smart in two areas which are very special: coding and mathematics. There they can invent new things a little bit. But those are very specific domains where the substrate reasoning is actually linked to the language, first of all, and second of all, if a system figures out a new program or a new theorem, it can be automatically verified whether it's correct or not. Those are very specific domains.

The self‑improvement of AI systems does not apply to any other domain, essentially, or [only] in very limited ways.

So what's going to happen is that we're going to have another revolution in AI. My new company is based on this hypothesis that we're going to have a new revolution, particularly [with] AI systems that can deal with the real world. LLMs are really good at dealing with language and sequences of discrete symbols, not so good at handling the real world, which is why we have systems that can pass the bar exam, solve mathematical problems and write code, but we don't have domestic robots.

AI is still completely insufficient to deal with the real world. That's going to be the next revolution. It's up for grabs. And we're not going to have human‑level intelligence over the next two or three years. This is going to take much longer.

But there are opportunities for international collaborations, both for LLM‑based AI systems and for this next generation of AI systems.

Thank you very much.


Source: Original article · Are.na block · TechFreedom channel · Image: Marcin Wilkowski

