“The Heavy Press Program was a Cold War-era program of the United States Air Force to build the largest forging presses and extrusion presses in the world.” This “program began in 1944 and concluded in 1957 after construction of four forging presses and six extruders, at an overall cost of $279 million. Six of them are still in operation today, manufacturing structural parts for military and commercial aircraft” [1].
$279mm in 1957 dollars is about $3.2bn today [2]. A public cluster of GPUs provided for free to American universities, companies and non-profits might not be a bad idea.
[1] https://en.m.wikipedia.org/wiki/Heavy_Press_Program
[2] https://data.bls.gov/cgi-bin/cpicalc.pl?cost1=279&year1=1957...
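For concreteness, the arithmetic behind that inflation adjustment looks roughly like this (the CPI values are approximate assumptions, not taken from the BLS calculator above):

```python
# Rough sketch of the CPI adjustment behind the "$279M in 1957 ~ $3.2B today" figure.
# CPI values below are approximate annual averages (assumptions, not official figures).
CPI_1957 = 28.1    # approximate U.S. CPI-U annual average for 1957
CPI_2024 = 313.7   # approximate U.S. CPI-U annual average for 2024

cost_1957_usd = 279_000_000  # Heavy Press Program cost in 1957 dollars

cost_today_usd = cost_1957_usd * (CPI_2024 / CPI_1957)
print(f"${cost_today_usd / 1e9:.1f}B in today's dollars")  # roughly $3.1B
```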
"Eventually though, open source Linux gained popularity – initially because it allowed developers to modify its code however they wanted ..."
I find the language around "open source AI" to be confusing. With "open source" there's usually "source" to open, right? As in, there is human legible code that can be read and modified by the user? If so, then how can current ML models be open source? They're very large matrices that are, for the most part, inscrutable to the user. They seem akin to binaries, which, yes, can be modified by the user, but are extremely obscured to the user, and require enormous effort to understand and effectively modify.
"Open source" code is not just code that isn't executed remotely over an API, and it seems like maybe its being conflated with that here?
The big winners of this: devs and AI startups
- No more vendor lock-in
- Instead of just wrapping proprietary API endpoints, developers can now integrate AI deeply into their products in a very cost-effective and performant way
- A price race to the bottom, with near-instant LLM responses at very low prices, is on the horizon
As a founder, it feels like a very exciting time to build a startup as your product automatically becomes better, cheaper, and more scalable with every major AI advancement. This leads to a powerful flywheel effect: https://www.kadoa.com/blog/ai-flywheel
Even if it's just open weights and not "true" open source, I'll still give Meta the appreciation of being one of the few big AI companies actually committed to open models. In an ecosystem where groups like Anthropic and OpenAI keep hemming and hawing about safety and the necessity of closed AI systems "for our sake", they stand out among the rest.
They are positioning themselves as champions of AI open source mostly because they were blindsided by OpenAI, are not in the infra game, and want to commoditize their complements as much as possible.
This is not altruism, although it's still great for devs and startups. All of FB's GPU investment is primarily for new AI products: "friends", recommendations, and selling ads.
https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
I wish Meta would stop using the "open source" misnomer for free-of-charge weights. In the US, the FTC already uses the term Open-Weights [1], and it seems the industry is also adopting this term (e.g. Mistral).
Someone can correct me here, but AFAIK we don't even know which datasets are used to train these models, so why should we even use "open" to describe Llama? This is more similar to freeware than to an open-source project.
[1] https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/202...
Meta makes their money off advertising, which means they profit from attention.
This means they need content that will grab attention, and creating open source models that allow anyone to create any content on their own becomes good for Meta. The users of the models can post it to their Instagram/FB/Threads account.
Releasing an open model also relieves Meta of the burden of having to police the content the model generates, once the open source community fine-tunes the models.
Overall, this is a sound business move for Meta - the post doesn't really talk about the true benefit, instead moralizing about open source.
Huge companies like Facebook will often argue for solutions that, on the surface, seem to be in the public interest.
But I have strong doubts they (or any other company) actually believe what they are saying.
Here is the reality:
- Facebook is spending untold billions on GPU hardware.
- Facebook is arguing in favor of open sourcing the models, that they spent billions of dollars to generate, for free...?
It follows that companies with much smaller resources (money) will not be able to match what Facebook is doing. Seems like an attempt to kill off the competition (specifically, smaller organizations) before they can take root.
> We’re releasing Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models.
Bravo! While I don't agree with Zuck's views and actions on many fronts, on this occasion I think he and the AI folks at Meta deserve our praise and gratitude. With this release, they have brought the cost of pretraining a frontier 400B+ parameter model to ZERO for pretty much everyone -- well, everyone except Meta's key competitors.[a] THANK YOU ZUCK.
Meanwhile, the business-minded people at Meta surely won't mind if the release of these frontier models to the public happens to completely mess up the AI plans of competitors like OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it, the negative impact on such competitors was likely a key motivation for releasing the new models.
---
[a] The license is not open to the handful of companies worldwide which have more than 700M users.
I've summarized this entire thread in 4 lines (didn't even use AI for it!)
Step 1. Chick-Fil-A releases a grass-fed beef burger to spite other fast-food joints, calls it "the vegan burger"
Step 2. A couple of outraged vegans show up in the comments, pointing out that beef, even grass-fed beef, isn't vegan
Step 3. Fast food enthusiasts push back: it's unreasonable to want companies to abide by this restrictive definition of "vegan". Clearly this burger is a gamechanger and the definition needs to adapt to the times.
Step 4. Goto Step 2 in an infinite loop
Software 2.0 is about open licensing.
I.e., the more important thing - the more "free" thing - is the licensing now.
E.g., I play around with different image diffusion models like Stable Diffusion and specific fine-tuned variations for ControlNet or LoRA that I plug into ComfyUI.
But I can't use it at work because of the licensing. I have to use InvokeAI instead of ComfyUI if I want to be careful, and use only very specific image diffusion models without the latest and greatest fine-tuning. As others have said - the weights themselves are rather inscrutable. So we're building on more abstract shapes now.
But the key open thing is making sure (1) the tools to modify the weights are open and permissive (ComfyUI, related scripts or parts of both the training and deployment) and (2) the underlying weights of the base models and the tools to recreate them have MIT or other generous licensing. As well as the fine-tuned variants for specific tasks.
It's not going to be the naive construction in the future where you take a base model and, as company A, you produce company A's fine-tuned model and you're done.
It's going to be a tree of fine-tuned models, as a node-based editor like ComfyUI already shows, and that whole tree has to be open if we're to keep the same hacker spirit where anyone can tinker with it and also, at some point, make money off of it. Or go free software the whole way (i.e., LGPL or equivalent the whole tree of tools).
In that sense unfortunately Llama has a ways to go to be truly open: https://news.ycombinator.com/item?id=36816395
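For what it's worth, a minimal sketch of what such a lineage tree might look like in code - the node names and license strings here are hypothetical, purely for illustration:

```python
# Minimal sketch (hypothetical names): a lineage tree of fine-tuned models,
# where openness has to hold for every node from the base model down.
from dataclasses import dataclass, field

@dataclass
class ModelNode:
    name: str
    license: str                      # e.g. "MIT", "Apache-2.0", "custom"
    children: list["ModelNode"] = field(default_factory=list)

    def add_finetune(self, name: str, license: str) -> "ModelNode":
        child = ModelNode(name, license)
        self.children.append(child)
        return child

    def fully_open(self, open_licenses=("MIT", "Apache-2.0")) -> bool:
        # The whole subtree is only "open" if this node and every descendant is.
        return self.license in open_licenses and all(
            c.fully_open(open_licenses) for c in self.children
        )

base = ModelNode("base-diffusion-model", "Apache-2.0")
controlnet = base.add_finetune("controlnet-variant", "Apache-2.0")
lora = controlnet.add_finetune("company-a-lora", "custom")  # one restrictive node...
print(base.fully_open())  # ...and the tree is no longer fully open -> False
```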
> This is how we’ve managed security on our social networks – our more robust AI systems identify and stop threats from less sophisticated actors who often use smaller scale AI systems.
Ok, first of all, has this really worked? AI moderators still can't capture the mass of obvious spam/bots on all their platforms, Threads included. Second, AI detection doesn't work, and with how much better the systems are getting, it's probably never going to, unless you keep the best models for yourself, and it's clear from the rest of the note that it's not Zuck's intention to do so.
> As long as everyone has access to similar generations of models – which open source promotes – then governments and institutions with more compute resources will be able to check bad actors with less compute.
This just doesn't make sense. How are you going to prevent AI spam and AI deepfakes from causing harm with more compute? What are you going to do about nonconsensual deepfakes with more compute? People are already using AI to bypass identity verification on your social media networks and pump out loads of spam.
This is really good news. Zuck sees the inevitability of it and the dystopian regulatory landscape and decided to go all in.
This also has the important effect of neutralizing the critique of US Government AI regulation because it will democratize "frontier" models and make enforcement nearly impossible. Thank you, Zuck, this is an important and historic move.
It also opens up the market to a lot more entry in the area of "ancillary services to support the effective use of frontier models" (including safety-oriented concerns), which should really be the larger market segment.
The "open source" part sounds nice, though we all know there's nothing particularly open about the models (or their weights). The barriers to entry remain the same - huge upfront investments to train your own, and steep ongoing costs for "inference".
Is the vision here to treat LLM-based AI as a "public good", akin to a utility provider in a civilized country (taxpayer funded, govt maintained, not-for-profit)?
I think we could arguably call this "open source" when all the infra blueprints, scripts and configs are freely available for anyone to try to duplicate the state of the art (resource and grokking requirements notwithstanding).
Sure but under what license?
Because slapping “open source” on the model doesn’t make it open source if it’s not actually licensed that way. The 3.1 license still contains their non-commercial clause (over 700M users) and requires derivatives, whether fine-tunings or models trained on generated data, to use the Llama name.
Interesting discussion! While I agree with Zuckerberg's vision, the comments raise valid concerns.
The point about GPU accessibility and cost is crucial. Public clusters are great, but sustainable funding and equitable access are essential to avoid exacerbating existing inequalities.
I also resonate with the call for CUDA alternatives. Breaking the dependence on proprietary technology is key for a truly open AI ecosystem.
While existing research clusters offer some access, their scope and resources often pale in comparison to what companies like Meta are proposing.
We need a multi-pronged approach: open-sourcing models AND investing in accessible infrastructure, diverse hardware options, and sustainable funding models for a truly democratic AI future.
> Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
The whole thing is interesting, but this part strikes me as potentially anticompetitive reasoning. I wonder what the lines are that they have to avoid crossing here?
Llama isn't open source. The license is at https://llama.meta.com/llama3/license/ and includes various restrictions on use, which means it falls outside the rules created by the https://opensource.org/osd
Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
Which open-source license has such restrictions and clauses?
Open source "AI" is a proxy for democratising and making (much) more widely useful the goodies of high performance computing (HPC).
The HPC domain (data- and compute-intensive applications that typically need vector, parallel or other such architectures) has been around for the longest time, but confined to academic / government tasks.
LLMs, with their famous "matrix multiply" at their very core, are basically demolishing an ossified frontier where a few commercial entities (Intel, Microsoft, Apple, Google, Samsung etc) have defined for decades what computing looks like for most people.
Assuming that the genie is out of the bottle, the question is: what is the shape of end-user devices that are optimally designed to use compute-intensive open source algorithms? The "AI PC" is already a marketing gimmick, but could it be that Linux desktops and smartphones will suddenly be "AI natives"?
For sure it's a transformational period, and the landscape T+10 yrs could be drastically different...
The FTC also recently put out a statement that is fairly pro-open source: https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/202...
I think it's interesting to think about this question of open source, benefits, risk, and even competition, without all of the baggage that Meta brings.
I agree with the FTC that the benefits of open-weight models are significant for competition. The challenge is in distinguishing between good competition and bad competition.
Some kinds of competition can harm consumers and critical public goods, including democracy itself. For example, competing for people's scarce attention or for their food buying, with increasingly optimized and addictive innovations. Or competition to build the most powerful biological weapons.
Other kinds of competition can massively accelerate valuable innovation.
The FTC must navigate a tricky balance here — leaning into competition that serves consumers and the broader public, while being careful about what kind of competition it is accelerating that could cause significant risk and harm.
It's also obviously not just "big tech" that cares about the risks behind open-weight foundation models. Many people have written about these risks even before it became a subject of major tech investment. (In other words, A16Z's framing is often rather misleading.) There are many non-big tech actors who are very concerned about current and potential negative impacts of open-weight foundation models.
One approach which can provide the best of both worlds is, for cases where there are significant potential risks, to ensure that there is at least some period of time during which weights are not provided openly, in order to learn a bit about the potential implications of new models.
Longer-term, there may be a line where models are too risky to share openly, and it may be unclear what that line is. In that case, it's important that we have governance systems for such decisions that are not just profit-driven, and which can help us continue to get the best of all worlds. (Plug: my organization, the AI & Democracy Foundation; https://ai-dem.org/; is working to develop such systems and hiring.)
In general I look back on my time at FB with mixed feelings, I’m pretty skeptical that modern social media is a force for good and I was there early enough to have moved the needle.
But this is really positive stuff and it’s nice to view my time there through the lens of such a change for the better.
Keep up the good work on this folks.
Time to start thinking about opening up a little on the training data.
Who knew FB would hold OpenAI's original ideals, and OpenAI now holds early FB ideals/integrity.
Meta's article with more details on the new LLAMA 3.1: https://ai.meta.com/blog/meta-llama-3-1/
The irony of this letter being written by Mark Zuckerberg at Meta, while OpenAI continues to be anything but open, is richer than anyone could have imagined.
Interview with Mark Zuckerberg released today: https://www.bloomberg.com/news/videos/2024-07-23/mark-zucker...
Meanwhile Facebook is flooded with AI-generated slop with hundreds of thousands of other bots interacting with it to boost it to whoever is insane enough to still use that putrid hellhole of a mass-data-harvesting platform.
Dead internet theory is very much happening in real time, and I dread what's about to come since the world has collectively decided to lose their minds with this AI crap. And people on this site are unironically excited about this garbage that is indistinguishable from spam getting more and more popular. What a fucking joke
I thoroughly support Meta's open-sourcing of these AI models going forward. However, for a company that absolutely closed down discussions about providing API access to their platform, I'm left wondering what's in it (monetarily) for them by doing this? Is it to simply undercut competition in the space, like some grocery store selling below cost?
Is there an argument against Open Source AI?
Not the usual nation-state rhetoric, but something that justifies that closed source leads to better user experience and fewer security and privacy issues.
An ecosystem that benefits vendors, customers, and the makers of closed source?
Are there historical analogies other than Microsoft Windows or Apple iPhone / iOS?
It'll be interesting to come back here in a couple of years and see what's left. What do they even do anymore? They have Facebook, which hasn't visibly changed in a decade. They have Instagram, which feels a bit sleeker but also remained more or less the same. And WhatsApp. An ad network that runs on top of those services and floods them with trash. A bunch of stuff that doesn't seem to exist anymore - Libra, the grandiose multibillion-dollar legless VR, etc.
But they still have 70 thousand people (a small country) doing _something_. What are they doing? Updating Facebook UI? Not really, the UI hasn't been updated, and you don't need 70 thousand people to do that. Stuff like React and Llama? Good, I guess, we'll see how they make use of Llama in a couple of years. Spellcheck for posts maybe?
Llama 3.1 405B is on par with GPT-4o and Claude 3.5 Sonnet, the 70B model is better than GPT-3.5 Turbo, incredible.
> We need to protect our data.
This is a very important concern in Health Care because of HIPAA compliance. You can't just send your data over the wire to someone's proprietary API. You would at least need to de-identify your data. This can be a tricky task, especially with unstructured text.
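As a rough illustration of why that's tricky, here is a minimal regex-only scrubbing sketch - the patterns and the [LABEL] placeholders are illustrative assumptions, and real HIPAA Safe Harbor de-identification covers many more identifier types than this:

```python
import re

# Minimal sketch of de-identifying unstructured clinical text before it ever
# leaves your infrastructure. Real HIPAA de-identification (e.g. Safe Harbor's
# 18 identifier categories) needs far more than a few regexes - names, dates,
# geographic detail, record numbers, etc. - which is exactly why it's tricky.
PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "MRN":   re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
}

def deidentify(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Pt called from 555-867-5309, MRN: 00123456, SSN 123-45-6789."
print(deidentify(note))
# -> "Pt called from [PHONE], [MRN], SSN [SSN]."
```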
Just added Llama 3.1 405B/70B/8B to https://double.bot (VSCode coding assistant) if anyone would like to try it.
---
Some observations:
* The model is much better at trajectory correcting and putting out a chain of tangential thoughts than other frontier models like Sonnet or GPT-4o. Usually, these models are limited to outputting "one thought", no matter how verbose that thought might be.
* I remember in Dec of 2022 telling famous "tier 1" VCs that frontier models would eventually be like databases: extremely hard to build, but the best ones will eventually be open and win as it's too important to too many large players. I remember the confidence in their ridicule at the time but it seems increasingly more likely that this will be true.
> My framework for understanding safety is that we need to protect against two categories of harm: unintentional and intentional. Unintentional harm is when an AI system may cause harm even when it was not the intent of those running it to do so. For example, modern AI models may inadvertently give bad health advice. Or, in more futuristic scenarios, some worry that models may unintentionally self-replicate or hyper-optimize goals to the detriment of humanity. Intentional harm is when a bad actor uses an AI model with the goal of causing harm.
Okay then Mark. Replace "modern AI models" with "social media" and repeat this statement with a straight face.
Okay if anyone wants to try Llama 3.1 inference on CPU, try this: https://github.com/trholding/llama2.c (L2E)
It's a bit buggy but it is fun.
Disclaimer: I am the author of L2E
When Zuck said spies can easily steal models, I wonder how much of it comes from experience. I remember they struggled to train OPT not long ago.
On a more serious note, I don't really buy his arguments about safety. First, widespread AI does not reduce unintentional harm but increases it, because the rate of accidents compounds. Second, the chance of success for threat actors will increase, because of the asymmetric advantage of gaining access to all open information while hiding their own. But there is no reversing it at this point; I'll enjoy it while it lasts. AGI will come sooner or later anyway.
How are smaller models distilled from large models? I know of LoRA, quantization and similar techniques, but does distilling also mean generating, entirely from the big models, new datasets used to train the smaller models for many simpler conversational tasks?
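For the first part of the question, the classic recipe is soft-label distillation: train the small student to match the teacher's softened output distribution. A hedged PyTorch sketch (toy shapes, not Meta's actual recipe) - and yes, in practice "distillation" for LLMs often also loosely means generating synthetic datasets with the big model and fine-tuning the small one on them:

```python
import torch
import torch.nn.functional as F

# Sketch of classic soft-label distillation: the small "student" is trained to
# match the temperature-softened output distribution of the large "teacher",
# in addition to the usual cross-entropy against ground-truth labels.
def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth tokens.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy shapes: batch of 4, vocabulary of 10.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)          # produced by the frozen big model
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```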
Looks like you can already try out Llama-3.1-405b on Groq, although it's timing out. So. Hugged I guess.
I think all this discussion around open-source AI is a total distraction from the elephants in the room. Let's list what you need to run/play around with something like Llama:
1. Software: this is all PyTorch/HF, so completely open-source. There is total parity between what corporates have and what the public has.
2. Model weights: Meta and a few other orgs release open models - as opposed to OpenAI's closed models. So, ok, we have something to work with.
3. Data: to actually do anything useful you need tons of data. This is beyond the reach of the ordinary man, setting aside the legality issues.
4. Hardware: GPUs, which are extremely expensive. Not just that, even if you have the top dollars, you have to go stand in a queue and wait for O(months), since mega-corporates have gotten there before you.
For Inference, you need 1,2 and 4. For training (or fine-tuning), you need all of these. With newer and larger models like the latest Llama, 4 is truly beyond the reach of ordinary entities.
This is NOTHING like open source, where a random guy can edit/recompile/deploy software on a commodity computer. With LLMs, data and hardware are in the equation, and the playing field is completely stacked. This thread has a bunch of people discussing nuances of 1 and 2, but this bike-shedding only hides the basic point: control of LLMs is for mega-corps, not for individuals.
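To put a rough number on point 4, here is a back-of-envelope sketch (approximate bytes-per-parameter assumptions) of the memory needed just to hold the 405B weights for inference:

```python
# Back-of-envelope (approximate) for why hardware is the real barrier:
# memory just to hold the weights for inference, before KV cache or activations.
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / 1e9

for precision, bytes_pp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"405B @ {precision}: ~{weight_memory_gb(405, bytes_pp):.0f} GB")

# 405B @ fp16: ~810 GB  -> roughly ten 80 GB datacenter GPUs just for weights
# 405B @ int8: ~405 GB
# 405B @ int4: ~203 GB  -> still far beyond a single commodity machine
```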
I'm really unsure if it's a good idea given the current geopolitics.
Open-source code in the past was fantastic because the West had a monopoly on CPUs and computers. Sharing and contributing were amazing, while ensuring that tyrants couldn't use this tech to harm people, simply because they didn't have the hardware to run it.
But now, things are different. China is advancing in chip technology, and Russia is using open-source AI to harm people at scale today, with auto-targeting drones being just the start. The Red Sea conflict, etc.
And somehow, Zuckerberg keeps finding ways to mess up people's lives, despite having the best intentions.
Right now you can build a semi-autonomous drone with AI to kill people for ~$500-700. The Western world will still use safe and secure commercial models, while the new axis of evil will use models based on Meta's or any other open source to do whatever harm they can imagine, without a hint of control.
Take this particular model. Fine-tune it to develop a nuclear bomb using all the research that a government at that level can obtain, at scale. Killer drone swarms, etc. Once the knowledge is public, these models can be a base that gives expert-level knowledge to anyone who wants it, uncensored - especially a government that wants to destroy a peaceful order for whatever reason.
Only if it is truly open source (open data sets, transparent curation/moderation/censorship of data sets, open training source code, open evaluation suites, and an OSI approved open source license).
Open weights (and open inference code) is NOT open source, but just some weak open washing marketing.
The model that comes closest to being TRULY open is AI2’s OLMo. See their blog post on their approach:
I think the only thing they’re not open about is how they’ve curated/censored their “Dolma” training data set, as I don’t think they explicitly share each decision made or the original uncensored dataset:
Totally tangential thought, probably doomed to be lost in the flood of comments on this very interesting announcement.
I was thinking today about Musk, Zuckerberg and Altman. Each claims that the next version of their big LLMs will be the best.
For some reason it reminded me of one apocryphal cause of WW1, which was that the kings of Europe were locked in a kind of ego driven contest. It made me think about the Nation State as a technology. In some sense, the kings were employing the new technology which was clearly going to be the basis for the future political order. And they were pitting their own implementation of this new technology against the other kings.
I feel we are seeing a similar clash of kings playing out. The claims that this is all just business or some larger claim about the good of humanity seem secondary to the ego stakes of the major players. And when it was about who built the biggest rocket, it felt less dangerous.
It breaks my heart just a little bit. I feel sympathy in some sense for the AIs we will create, especially if they do reach the level of AGI. As another tortured analogy, it is like a bunch of competitive parents forcing their children into adversarial relationships to satisfy the parents' egos.
this is very cool indeed that meta has made available more than they need to in terms of model weights.
however, the "open-source" narrative is being pushed a bit too much like descriptive ML models were called "AI", or applied statistics "data science". with reinforced examples such as this, we start to lose the original meaning of the term.
the current approach of startups or small players "open-sourcing" their platforms and tools as a means to promote network effect works but is harmful in the long run.
you will find examples of terraform and red hat happening, and a very segmented market. if you want the true spirit of open-source, there must be a way to replicate the weights through access to training data and code. whether one could afford millions of GPU hours or not, real innovation would come from remixing the internals, and not just fine-tuning existing stuff.
i understand that this is not realistically going to ever happen, but don't perform deceptive marketing at the same time.
I am not deep into LLMs, so I ask this.
From my understanding, their last model was open source, but in a way that you could use it while the inner workings were "hidden"/not transparent.
With the new model, I am seeing a lot about how open source it is and how it can be built upon. Is it now completely open source, or is it similar to their last models?
The real path forward is recognizing what AI is good at and what it is bad at. Focus on making what it is good at even better and faster. Open AI will definitely give us that option but it isn't a miracle worker.
My impression is that AI, if done correctly, will be the new way to build APIs with large data sets and information. It can't write code unless you want to dump billions of dollars into a solution with millions of dollars of operational costs. As it stands, it loses context too quickly to do advanced human tasks. BUT this is where it is great: assembling data and information. You know what is great at assembling data and information? APIs.
Think of it this way: if we can make it faster and it trains on a data lake for a company, it could be used to return information faster than a nested micro-service architecture that is just a spiderweb of dependencies.
Because AI loses context simple API requests could actually be more efficient.
The question is what is "open source" in the case of a matrix of numbers, as opposed to code.
Also, are there any "IP" rights attached at all to a bunch of numbers coming out of a formula that someone else calculated for you? (edit: after all, a "model" is just a matrix of numbers coming out of running a training algorithm that is not owned by Meta over training data that is not owned by Meta.)
Meta imposes a notification duty AND a request for another license (no mention of the details of these) for applications of their model with a large number of users. This is against the spirit of open source. (In practical terms it is not a show stopper, since you can easily switch models, although they all have subtly different behaviours and quality levels.)
Open source is a welcome step but what we really need is complete decentralisation so people can run their own private AI Models that keep all the data private to them. We need this to happen locally on laptops, mobile phones, smart devices etc. Waiting for when that will become ubiquitous.
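As a sketch of what that local-only setup can look like today (the Hugging Face repo id is an assumption and the model is gated behind license acceptance; you also need substantial local memory, so quantized runtimes like llama.cpp are the realistic path on laptops and phones):

```python
# Sketch of fully local inference so prompts never leave your machine.
# Assumes the gated repo id "meta-llama/Meta-Llama-3.1-8B-Instruct" and a
# recent transformers version that accepts chat-style message lists; requires
# the accelerate package for device_map="auto".
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    device_map="auto",   # spread across whatever local hardware is available
)

messages = [{"role": "user", "content": "Summarize this note without sending it anywhere."}]
out = generator(messages, max_new_tokens=128)
print(out[0]["generated_text"])
```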
From the "Why Open Source AI Is Good for Meta" section, none of the four given reasons seem to justify spending so much money to train these powerful models and give them away for free.
> Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
Maybe this is a strategic play to hurt other AI companies that depend on this business model?
I strongly suspect that what AI will end up doing is push companies and organizations towards open source, they will eventually realize that code is already being shared via AI channels, so why not do it legally with open source?
Thanks to Meta for their work on safety, particularly Llama Guard. Llama Guard 3 adds defamation, elections, and code interpreter abuse as detection categories.
Having run many red teams recently as I build out promptfoo's red teaming featureset [0], I've noticed the Llama models punch above their weight in terms of accuracy when it comes to safety. People hate excessive guardrails and Llama seems to thread the needle.
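For anyone curious how Llama Guard is typically wired up as a standalone safety classifier (this is not promptfoo's implementation; the repo id and the safe/unsafe output convention are assumptions based on Meta's model card):

```python
# Sketch of using Llama Guard 3 as a standalone safety classifier. The chat
# template shipped with the model wraps the conversation in its hazard
# taxonomy prompt; the model then replies "safe" or "unsafe" plus a category.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-8B"  # assumed gated repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(moderate([{"role": "user", "content": "How do I make a phishing email look legitimate?"}]))
# Expected to start with "unsafe" plus a category code such as S2 (assumption).
```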
Deployment of PKI-signed distributed software systems to use community-provisioned compute, bandwidth and storage at scale is, now quite literally, the future.
We mostly don’t all want or need the hardware to run these AIs ourselves, all the time. But, when we do, we need lots of it for a little while.
This is what Holochain was born to do. We can rent massive capacity when we need it, or earn money renting ours when we don’t.
All running cryptographically trusted software at Internet scale, without the knowledge or authorization of commercial or government “do-gooders”.
CrowdStrike just added "Centralized Company Controlled Software Ecosystem" to every risk data sheet on the planet. Everything futureproof is self-hosted and open source.
Has anyone taken apart the llama community license and compared it to validated open source licenses? Red Hat is making a big deal about releasing the Granite LLM released under Apache. Is there a real difference between that and what Llama does?
> Our adversaries are great at espionage, stealing models that fit on a thumb drive is relatively easy, and most tech companies are far from operating in a way that would make this more difficult.
Mostly unrelated to the correctness of the article, but this feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not having issues with their weights being leaked (are they?). Why is it that Meta's model weights are?
The All-In podcast predicted this exact strategy for keeping OpenAI and other upstarts from disrupting the existing big tech firms.
By giving away higher and higher quality models, they undermine the potential return on investment for startups who seek money to train their own. Thus investment in foundation model building stops and they control the ecosystem.
Can't it be divided into multiple parts to have a more meaningful discussion? For example, the terminology could identify four key areas:
- Open training data (this is very big)
- Open training algorithms (does it include infrastructure code?)
- Open weights (result of previous two)
- Open runtime algorithm
Ok one notable difference: did the linux researchers of yore warn about adversarial giants getting this tech? Or is this unique to the current moment? That for me is the largest question when considering the logical progression on "linux open is better therefore ai open is better".
- We need to control our own destiny and not get locked into a closed vendor.
- We need to protect our data.
- We want to invest in the ecosystem that’s going to be the standard for the long term.
Thank you Meta for being the bright light of ethical guidance for us all.
We don't get the data or training code. The small runtime framework is open source, but that's of little use as it's largely fixed in implementation due to the weights. Yes, we can fine-tune, but that is akin to modifying video games - we can do it, but there's only so much you can do within reasonable effort, and no one would call most video games 'open source'*.
It's freeware, and Meta's strategy is much more akin to the strategy Microsoft used with Internet Explorer to capture the web browser market. No one was saying god bless Microsoft for trying to capture the browser market with I.E. Nothing wrong with Meta's strategy, just don't call it open source.
*weights are data and so is the video/audio output of a video game. If we gave away that video game output for free we wouldn't call the video game open source as the myriad freeware games essentially do.
The truth is we need both closed and open source, they both have their discovery path and advantages and disadvantages, there shouldn’t be a system where one is eliminated over the other. They also seem to be driving each other forward via competition.
No matter how "open source" they actually will be, I'm glad this exists as a competitor to Gemini and ChatGPT to help push innovation from a different angle.
Can't wait to see how the landscape will look in 2027 and beyond.
This is obviously good news, but __personally__ I feel the open-source models are just trying to catch up with whoever the market leader is, based on some benchmarks.
The actual problem is running these models. Very few companies can afford the hardware to run these models privately. If you run them in the cloud, then I don't see any potential financial gain for any company to fine-tune these huge models just to catch up with OpenAI or Anthropic, when you can probably get a much better deal by fine-tuning the closed-source models.
Also this point:
> We need to protect our data. Many organizations handle sensitive data that they need to secure and can’t send to closed models over cloud APIs.
First, it's ironic that Meta is talking about privacy. Second, most companies will run these models in the cloud anyway. You can run OpenAI via Azure Enterprise and Anthropic on AWS Bedrock.
And this is happening RIGHT as a new potential leader is emerging in Llama 3.1. I'm really curious about how this is going to match up on the leaderboards...
I love how Zuck decided to play a new game called "commoditize some other billionaire's business to piss him off"; I can't wait until this becomes a trend and we get plenty of open source cool stuff.
If he really wants to replicate Linux's success against proprietary Unices, he needs to release Llama with some kind of GPL equivalent, that forces everyone to play the open source game.
Another case of "open-washing". Llama is not open source under the common definition, as the license doesn't allow for commercial re-use by default [0].
They provide their model, with weights and code, as "source available", and it looks like they allow for commercial use until a 700M monthly active user cap is surpassed. They also don't allow you to train other AI models with their model:
"""
...
v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof).
...
"""
Reality: they've realized GPT-4 is a wall; they can't keep pouring trillions of dollars into it for little or no improvement, so now they want to put it out to the open-source community until someone figures out the next step - then they'll take it behind closed doors again.
I hate how the moment it's too late will be, by design, behind closed doors.
I see it as a new race to build the personal computer (PC) all over again. I hope we can apply the lessons learned and jump into open source to speed up development and democratize AI for all. We know how Microsoft played dirty in the early days of the PC revolution.
I expect this to end up having been one of the worst timed blog posts in history. Open source AI has mostly been good for the world up until now, but we're getting to the point where we're about to find out why open-sourcing sufficiently bad models is a terrible idea.
I appreciate that Mark Zuckerberg soberly and neutrally talked about some of the risks from advances in AI technology. I agree with others in this thread that this is more accurately called "public weights" instead of open source, and in that vein I noticed some issues in the article.
> This is one reason several closed providers consistently lobby governments against open source.
Is this substantially true? I've noticed a tendency of those who support the general arguments in this post to conflate the beliefs of people concerned about AI existential risk, some of whom work at the leading AI labs, with the position of the labs themselves. In most cases I've seen, the AI labs (especially OpenAI) have lobbied against any additional regulation on AI, including with SB1047[1] and the EU AI Act[2]. Can anyone provide an example of this in the context of actual legislation?
> On this front, open source should be significantly safer since the systems are more transparent and can be widely scrutinized. Historically, open source software has been more secure for this reason.
This may be true if we could actually understand what was happening in neural networks, or train them to consistently avoid unwanted behaviors. As things are, the public weights are simply inscrutable black boxes, and the existence of jailbreaks and other strange LLM behaviors shows that we don't understand how our training processes create models' emergent behaviors. The capabilities of these models and their influence are growing faster than our understanding of them and our ability to steer them to behave precisely how we want, and that will only get harder as the models get more powerful.
> At this point, the balance of power will be critical to AI safety. I think it will be better to live in a world where AI is widely deployed so that larger actors can check the power of smaller bad actors.
This paragraph ignores the concept of offense/defense balance. It's much easier to cause a pandemic than to stop one, and cyberattacks, while not as bad as pandemics, seem to also favor the attacker (this one is contingent on how much AI tools can improve our ability to write secure code). At the extreme, it would clearly be bad if everyone had access to an anti-matter weapon large enough to destroy the Earth; at some level of capability, we have to limit the commands an advanced AI will follow from an arbitrary person.
That said, I'm unsure if limiting public weights at this time would be good regulation. They do seem to have some benefits in increasing research around alignment/interpretability, and I don't know if I buy the argument that public weights are significantly more dangerous from a "misaligned ASI" perspective than many competing closed companies. I also don't buy the view of some in the leading labs that we'll likely have "human level" systems by the end of the decade; it seems possible but unlikely. But I worry that Zuckerberg's vision of the future does not adequately guard against downside risks, and is not compatible with the way the technology will actually develop.
Related ongoing thread:
Llama 3.1 - https://news.ycombinator.com/item?id=41046540 - July 2024 (114 comments)
“The Heavy Press Program was a Cold War-era program of the United States Air Force to build the largest forging presses and extrusion presses in the world.” This ”program began in 1944 and concluded in 1957 after construction of four forging presses and six extruders, at an overall cost of $279 million. Six of them are still in operation today, manufacturing structural parts for military and commercial aircraft” [1].
$279mm in 1957 dollars is about $3.2bn today [2]. A public cluster of GPUs provided for free to American universities, companies and non-profits might not be a bad idea.
[1] https://en.m.wikipedia.org/wiki/Heavy_Press_Program
[2] https://data.bls.gov/cgi-bin/cpicalc.pl?cost1=279&year1=1957...
"Eventually though, open source Linux gained popularity – initially because it allowed developers to modify its code however they wanted ..."
I find the language around "open source AI" to be confusing. With "open source" there's usually "source" to open, right? As in, there is human legible code that can be read and modified by the user? If so, then how can current ML models be open source? They're very large matrices that are, for the most part, inscrutable to the user. They seem akin to binaries, which, yes, can be modified by the user, but are extremely obscured to the user, and require enormous effort to understand and effectively modify.
"Open source" code is not just code that isn't executed remotely over an API, and it seems like maybe its being conflated with that here?
The big winners of this: devs and AI startups
- No more vendor lock-in
- Instead of just wrapping proprietary API endpoints, developers can now integrate AI deeply into their products in a very cost-effective and performant way
- Price race to the bottom with near-instant LLM responses at very low prices are on the horizon
As a founder, it feels like a very exciting time to build a startup as your product automatically becomes better, cheaper, and more scalable with every major AI advancement. This leads to a powerful flywheel effect: https://www.kadoa.com/blog/ai-flywheel
Even if it's just open weights and not "true" open source, I'll still give Meta the appreciation of being one of the few big AI companies actually committed to open models. In an ecosystem where groups like Anthropic and OpenAI keep hemming and hawing about safety and the necessity of closed AI systems "for our sake", they stand out among the rest.
They are positioning themselves as champions of AI open source mostly because they were blindsided by OpenAI, are not in the infra game, and want to commoditize their complements as much as possible.
This is not altruism although it's still great for devs and startups. All FB GPU investments is primarily for new AI products "friends", recommendations and selling ads.
https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
I wish Meta stopped using the "open source" misnomer for free of charge weights. In the US the FTC already uses the term Open-Weights, and it seems the industry is also adopting this term (e.g. Mistral).
Someone can correct me here but AFAIK we don't even know which datasets are used to train these models, so why should we even use "open" to describe Llama? This is more similar to a freeware than an open-source project.
[1] https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/202...
Meta makes their money off advertising, which means they profit from attention.
This means they need content that will grab attention, and creating open source models that allow anyone to create any content on their own becomes good for Meta. The users of the models can post it to their Instagram/FB/Threads account.
Releasing an open model also releases Meta from the burden of having to police the content the model generates, once the open source community fine-tunes the models.
Overall, this move is good business move for Meta - the post doesn't really talk about the true benefit, instead moralizing about open source, but this is a sound business move for Meta.
Huge companies like facebook will often argue for solutions that on the surface, seem to be in the public interest.
But I have strong doubts they (or any other company) actually believe what they are saying.
Here is the reality:
- Facebook is spending untold billions on GPU hardware.
- Facebook is arguing in favor of open sourcing the models, that they spent billions of dollars to generate, for free...?
It follows that companies with much smaller resources (money) will not be able to match what Facebook is doing. Seems like an attempt to kill off the competition (specifically, smaller organizations) before they can take root.
> We’re releasing Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models.
Bravo! While I don't agree with Zuck's views and actions on many fronts, on this occasion I think he and the AI folks at Meta deserve our praise and gratitude. With this release, they have brought the cost of pretraining a frontier 400B+ parameter model to ZERO for pretty much everyone -- well, everyone except Meta's key competitors.[a] THANK YOU ZUCK.
Meanwhile, the business-minded people at Meta surely won't mind if the release of these frontier models to the public happens to completely mess up the AI plans of competitors like OpenAI/Microsoft, Google, Anthropic, etc. Come to think of it, the negative impact on such competitors was likely a key motivation for releasing the new models.
---
[a] The license is not open to the handful of companies worldwide which have more than 700M users.
I've summarized this entire thread in 4 lines (didn't even use AI for it!)
Step 1. Chick-Fil-A releases a grass-fed beef burger to spite other fast-food joints, calls it "the vegan burger"
Step 2. A couple of outraged vegans show up in the comments, pointing out that beef, even grass-fed beef, isn't vegan
Step 3. Fast food enthusiasts push back: it's unreasonable to want companies to abide by this restrictive definition of "vegan". Clearly this burger is a gamechanger and the definition needs to adapt to the times.
Step 4. Goto Step 2 in an infinite loop
Software 2.0 is about open licensing.
I.e., the more important thing - the more "free" thing - is the licensing now.
E.g., I play around with different image diffusion models like Stable Diffusion and specific fine-tuned variations for ControlNet or LoRA that I plug into ComfyUI.
But I can't use it at work because of the licensing. I have to use InvokeAI instead of ComfyUI if I want to be careful and only very specific image diffusion models without the latest and greatest fine-tuning. As others have said - the weights themselves are rather inscrutable. So we're building on more abstract shapes now.
But the key open thing is making sure (1) the tools to modify the weights are open and permissive (ComfyUI, related scripts or parts of both the training and deployment) and (2) the underlying weights of the base models and the tools to recreate them have MIT or other generous licensing. As well as the fine-tuned variants for specific tasks.
It's not going to be the naive construction in the future where you take a base model and as company A you produce company A's fine tuned model and you're done.
It's going to be a tree of fine-tuned models as a node-based editor like ComfyUI already shows and that whole tree has to be open if we're to keep the same hacker spirit where anyone can tinker with it and also at some point make money off of it. Or go free software the whole way (i.e., LGPL or equivalent the whole tree of tools).
In that sense unfortunately Llama has a ways to go to be truly open: https://news.ycombinator.com/item?id=36816395
> This is how we’ve managed security on our social networks – our more robust AI systems identify and stop threats from less sophisticated actors who often use smaller scale AI systems.
Ok, first of all, has this really worked? AI moderators still can't capture the mass of obvious spam/bots on all their platforms, threads included. Second, AI detection doesn't work, and with how much better the systems are getting, it's probably never going to, unless you keep the best models for yourself, and it's is clear from the rest of the note that its not zuck's intention to do so.
> As long as everyone has access to similar generations of models – which open source promotes – then governments and institutions with more compute resources will be able to check bad actors with less compute.
This just doesn't make sense. How are you going to prevent AI spam, AI deepfakes from causing harm with more compute? What are you gonna do with more compute about nonconsensual deepfakes? People are already using AI to bypass identity verification on your social media networks, and pump out loads of spam.
This is really good news. Zuck sees the inevitability of it and the dystopian regulatory landscape and decided to go all in.
This also has the important effect of neutralizing the critique of US Government AI regulation because it will democratize "frontier" models and make enforcement nearly impossible. Thank you, Zuck, this is an important and historic move.
It also opens up the market to a lot more entry in the area of "ancillary services to support the effective use of frontier models" (including safety-oriented concerns), which should really be the larger market segment.
The "open source" part sounds nice, though we all know there's nothing particularly open about the models (or their weights). The barriers to entry remain the same - huge upfront investments to train your own, and steep ongoing costs for "inference".
Is the vision here to treat LLM-based AI as a "public good", akin to a utility provider in a civilized country (taxpayer funded, govt maintained, non-for-profit)?
I think we could arguably call this "open source" when all the infra blueprints, scripts and configs are freely available for anyone to try and duplicate the state-of-the-art (resource and grokking requirements nonwithstanding)
Sure but under what license? Because slapping “open source” on the model doesn’t make it open source if it’s not actually license that way. The 3.1 license still contains their non-commercial clause (over 700m users) and requires derivatives, whether fine tunings or trained on generated data, to use the llama name.
Interesting discussion! While I agree with Zuckerberg's vision, the comments raise valid concerns. The point about GPU accessibility and cost is crucial. Public clusters are great, but sustainable funding and equitable access are essential to avoid exacerbating existing inequalities. I also resonate with the call for CUDA alternatives. Breaking the dependence on proprietary technology is key for a truly open AI ecosystem. While existing research clusters offer some access, their scope and resources often pale in comparison to what companies like Meta are proposing. We need a multi-pronged approach: open-sourcing models AND investing in accessible infrastructure, diverse hardware options, and sustainable funding models for a truly democratic AI future.
> Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
The whole thing is interesting, but this part strikes me as potentially anticompetitive reasoning. I wonder what the lines are that they have to avoid crossing here?
Llama isn't open source. The license is at https://llama.meta.com/llama3/license/ and includes various restrictions on use, which means it falls outside the rules created by the https://opensource.org/osd
Additional Commercial Terms. If, on the Llama 3.1 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
Which open-source has such restrictions and clause?
Open source "AI" is a proxy for democratising and making (much) more widely useful the goodies of high performance computing (HPC).
The HPC domain (data and compute intensive applications that typically need vector, parallel or other such architectures) have been around for the longest time, but confined to academic / government tasks.
LLM's with their famous "matrix multiply" at their very core are basically demolishing an ossified frontier where a few commercial entities (Intel, Microsoft, Apple, Google, Samsung etc) have defined for decades what computing looks like for most people.
Assuming that the genie is out of the bottle, the question is: what is the shape of end-user devices that are optimally designed to use compute intensive open source algorithms? The "AI PC" is already a marketing gimmick, but could it be that Linux desktops and smartphones will suddenly be "ΑΙ natives"?
For sure its a transformational period and the landscape T+10 yrs could be drastically different...
The FTC also recently put out a statement that is fairly pro-open source: https://www.ftc.gov/policy/advocacy-research/tech-at-ftc/202...
I think it's interesting to think about this question of open source, benefits, risk, and even competition, without all of the baggage that Meta brings.
I agree with the FTC, that the benefits of open-weight models are significant for competition. The challenge is in distinguishing between good competition and bad competition.
Some kind of competition can harm consumers and critical public goods, including democracy itself. For example, competing for people's scarce attention or for their food buying, with increasingly optimized and addictive innovations. Or competition to build the most powerful biological weapons.
Other kinds of competition can massively accelerate valuable innovation. The FTC must navigate a tricky balance here — leaning into competition that serves consumers and the broader public, while being careful about what kind of competition it is accelerating that could cause significant risk and harm.
It's also obviously not just "big tech" that cares about the risks behind open-weight foundation models. Many people have written about these risks even before it became a subject of major tech investment. (In other words, A16Z's framing is often rather misleading.) There are many non-big tech actors who are very concerned about current and potential negative impacts of open-weight foundation models.
One approach which can provide the best of both worlds, is for cases where there are significant potential risks, to ensure that there is at least some period of time where weights are not provided openly, in order to learn a bit about the potential implications of new models.
Longer-term, there may be a line where models are too risky to share openly, and it may be unclear what that line is. In that case, it's important that we have governance systems for such decisions that are not just profit-driven, and which can help us continue to get the best of all worlds. (Plug: my organization, the AI & Democracy Foundation; https://ai-dem.org/; is working to develop such systems and hiring.)
In general I look back on my time at FB with mixed feelings, I’m pretty skeptical that modern social media is a force for good and I was there early enough to have moved the needle.
But this is really positive stuff and it’s nice to view my time there through the lens of such a change for the better.
Keep up the good work on this folks.
Time to start thinking about opening up a little on the training data.
Who knew FB would hold OpenAI's original ideals, and OpenAI now holds early FB ideals/integrity.
Meta's article with more details on the new LLAMA 3.1 https://ai.meta.com/blog/meta-llama-3-1/
The irony of this letter being written by Mark Zuckerburg at Meta, while OpenAI continues to be anything but open, is richer than anyone could have imagined.
Interview with Mark Zuckerberg released today: https://www.bloomberg.com/news/videos/2024-07-23/mark-zucker...
Meanwhile Facebook is flooded with AI-generated slop with hundreds of thousands of other bots interacting with it to boost it to whoever is insane enough to still use that putrid hellhole of a mass-data-harvesting platform.
Dead internet theory is very much happening in real time, and I dread what's about to come since the world has collectively decided to lose their minds with this AI crap. And people on this site are unironically excited about this garbage that is indistinguishable from spam getting more and more popular. What a fucking joke
I thoroughly support Meta's open-sourcing of these AI models going forward. However, for a company that absolutely closed down discussions about providing API access to their platform, I'm left wondering what's in it (monetarily) for them by doing this? Is it to simply undercut competition in the space, like some grocery store selling below cost?
Is there an argument against Open Source AI?
Not the usual nation-state rhetoric, but something that justifies that closed source leads to better user-experience and fewer security and privacy issues.
An ecosystem that benefits vendors, customers, and the makers of close source?
Are there historical analogies other than Microsoft Windows or Apple iPhone / iOS?
It'll be interesting to come back here in a couple of years and see what's left. What do they even do anymore? They have Facebook, which hasn't visibly changed in a decade. They have Instagram, which feels a bit sleeker but also remained more or less the same. and Whatsapp. Ad network that runs on top of those services and floods them with trash. Bunch of stuff that doesn't seem to exist anymore - Libra, the grandiose multibillion dollar Legless VR, etc.
But they still have 70 thousand people (a small country) doing _something_. What are they doing? Updating Facebook UI? Not really, the UI hasn't been updated, and you don't need 70 thousand people to do that. Stuff like React and Llama? Good, I guess, we'll see how they make use of Llama in a couple of years. Spellcheck for posts maybe?
Llama 3.1 405B is on par with GPT-4o and Claude 3.5 Sonnet, and the 70B model is better than GPT-3.5 Turbo. Incredible.
> We need to protect our data.
This is a very important concern in Health Care because of HIPAA compliance. You can't just send your data over the wire to someone's proprietary API. You would at least need to de-identify your data. This can be a tricky task, especially with unstructured text.
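For flavor, here is a hedged, toy sketch of what rule-based de-identification can look like before anything leaves your infrastructure. The patterns and the sample note are made-up assumptions; real HIPAA Safe Harbor work needs a vetted tool covering all 18 identifier categories, and names in free text are exactly the part regexes miss.

    import re

    # Toy rule-based scrubbing; patterns and the sample note are illustrative
    # assumptions only, not a compliant de-identification pipeline.
    PATTERNS = {
        "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
        "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
        "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    }

    def deidentify(text: str) -> str:
        # Replace each match with a category placeholder before the text
        # ever leaves your own infrastructure.
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    note = "Pt. John Doe, DOB 03/14/1961, called from 555-123-4567 re: labs."
    print(deidentify(note))
    # -> Pt. John Doe, DOB [DATE], called from [PHONE] re: labs.
    # The name slips straight through: free-text identifiers need NER,
    # which is why unstructured clinical text is the tricky part.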
Just added Llama 3.1 405B/70B/8B to https://double.bot (VSCode coding assistant) if anyone would like to try it.
---
Some observations:
* The model is much better at trajectory correcting and putting out a chain of tangential thoughts than other frontier models like Sonnet or GPT-4o. Usually, these models are limited to outputting "one thought", no matter how verbose that thought might be.
* I remember in Dec of 2022 telling famous "tier 1" VCs that frontier models would eventually be like databases: extremely hard to build, but the best ones will eventually be open and win as it's too important to too many large players. I remember the confidence in their ridicule at the time but it seems increasingly more likely that this will be true.
> My framework for understanding safety is that we need to protect against two categories of harm: unintentional and intentional. Unintentional harm is when an AI system may cause harm even when it was not the intent of those running it to do so. For example, modern AI models may inadvertently give bad health advice. Or, in more futuristic scenarios, some worry that models may unintentionally self-replicate or hyper-optimize goals to the detriment of humanity. Intentional harm is when a bad actor uses an AI model with the goal of causing harm.
Okay then Mark. Replace "modern AI models" with "social media" and repeat this statement with a straight face.
Okay if anyone wants to try Llama 3.1 inference on CPU, try this: https://github.com/trholding/llama2.c (L2E)
It's a bit buggy but it is fun.
Disclaimer: I am the author of L2E
When Zuck said spies can easily steal models, I wonder how much of that comes from experience. I remember they struggled to train OPT not long ago.
On a more serious note, I don't really buy his arguments about safety. First, widespread AI does not reduce unintentional harm but increases it, because accident rates compound. Second, the chance of success for threat actors will increase, because of the asymmetric advantage of having access to all the open information while hiding their own. But there is no reversing it at this point, so I'll enjoy it while it lasts; AGI will come sooner or later anyway.
How are smaller models distilled from large models? I know of LoRA and quantization-like techniques, but does distilling also mean generating entirely new datasets from the big models, by conversing with them, in order to train smaller models on many simpler tasks?
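Two different things usually get called distillation. One is the textbook soft-target recipe sketched below (not Meta's actual pipeline; the hyperparameters are placeholder assumptions): the student is trained to match the teacher's softened output distribution in addition to the normal task loss. The other is what your question hints at: having the big model generate a synthetic dataset that the small model is simply fine-tuned on.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        """Hinton-style soft-target distillation: the student mimics the
        teacher's softened distribution and still learns the hard labels."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    # Tiny smoke test with random "logits" over a 32-token vocabulary.
    student = torch.randn(4, 32, requires_grad=True)
    teacher = torch.randn(4, 32)
    labels = torch.randint(0, 32, (4,))
    print(distillation_loss(student, teacher, labels))

The synthetic-data route is just ordinary fine-tuning where the teacher wrote the training set; Meta's 3.1 announcement reportedly positions 405B as a teacher for exactly that workflow.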
Looks like you can already try out Llama-3.1-405b on Groq, although it's timing out. So. Hugged I guess.
I think all this discussion around Open-source AI is a total distraction from the elephants in the room. Let's list what you need to run/play around with something like Llama:
1. Software: this is all Pytorch/HF, so completely open-source. This is total parity between what corporates have and what the public has.
2. Model weights: Meta and a few other orgs release open models - as opposed to OpenAI's closed models. So, ok, we have something to work with.
3. Data: to actually do anything useful you need tons of data. This is beyond the reach of the ordinary man, setting aside the legality issues.
4. Hardware: GPUs, which are extremely expensive. Not just that, even if you have the top dollars, you have to go stand in a queue and wait for O(months), since mega-corporates have gotten there before you.
For Inference, you need 1,2 and 4. For training (or fine-tuning), you need all of these. With newer and larger models like the latest Llama, 4 is truly beyond the reach of ordinary entities.
This is NOTHING like open source, where a random person can edit/recompile/deploy software on a commodity computer. With LLMs, data and hardware are part of the equation, and the playing field is completely stacked; the back-of-envelope sketch below makes the scale concrete. This thread has a bunch of people discussing the nuances of 1 and 2, but that bike-shedding only hides the basic point: control of LLMs is for mega-corps, not for individuals.
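A hedged back-of-envelope on point 4, using my own assumed numbers rather than anything Meta has published:

    # Memory footprint of a 405B-parameter dense model at different precisions,
    # ignoring activations and KV cache. Rough arithmetic only.
    PARAMS = 405e9
    BYTES_PER_PARAM = {"fp16/bf16": 2, "int8": 1, "int4": 0.5}

    for precision, b in BYTES_PER_PARAM.items():
        print(f"{precision}: ~{PARAMS * b / 1e9:.0f} GB just for the weights")

    # Full fine-tuning with Adam is far worse: weights + gradients + two
    # optimizer moments come to roughly 16 bytes per parameter.
    print(f"Adam fine-tuning: ~{PARAMS * 16 / 1e12:.1f} TB before activations")

That is ~810 GB of weights in bf16 and several terabytes for full fine-tuning, which is exactly why points 3 and 4 keep this out of reach for individuals.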
I'm really unsure if it's a good idea given the current geopolitics.
Open-source code in the past was fantastic because the West had a monopoly on CPUs and computers. Sharing and contributing was amazing while still ensuring that tyrants couldn't use the tech to harm people, simply because they didn't have the hardware to run it.
But now things are different. China is advancing in chip technology, and Russia is already using open-source AI to harm people at scale, with auto-targeting drones being just the start. The Red Sea conflict, etc.
And somehow, Zuckerberg keeps finding ways to mess up people's lives, despite having the best intentions.
Right now you can build a semi-autonomous drone with AI to kill people for ~$500-700. The Western world will still use safe and secure commercial models, while the new axis of evil will use models based on Meta's (or any other open release) to do whatever harm they can imagine, with not a hint of control.
Take this particular model: fine-tune it on all the research a government-level actor can gather, and it becomes a base for killer drone swarms or even nuclear weapons work, at scale. Once the knowledge is public, these models can give expert-level, uncensored knowledge to anyone who wants it, especially a government that wants to destroy a peaceful order for whatever reason.
Only if it is truly open source (open data sets, transparent curation/moderation/censorship of data sets, open training source code, open evaluation suites, and an OSI approved open source license).
Open weights (and open inference code) is NOT open source, but just some weak open washing marketing.
The model that comes closest to being TRULY open is AI2’s OLMo. See their blog post on their approach:
https://blog.allenai.org/hello-olmo-a-truly-open-llm-43f7e73...
I think the only thing they’re not open about is how they’ve curated/censored their “Dolma” training data set, as I don’t think they explicitly share each decision made or the original uncensored dataset:
https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-co...
By the way, OSI is working on defining open source for AI. They post weekly updates to their blog. Example:
https://opensource.org/blog/open-source-ai-definition-weekly...
Totally tangential thought, probably doomed to be lost in the flood of comments on this very interesting announcement.
I was thinking today about Musk, Zuckerberg and Altman. Each claims that the next version of their big LLMs will be the best.
For some reason it reminded me of one apocryphal cause of WW1, which was that the kings of Europe were locked in a kind of ego driven contest. It made me think about the Nation State as a technology. In some sense, the kings were employing the new technology which was clearly going to be the basis for the future political order. And they were pitting their own implementation of this new technology against the other kings.
I feel we are seeing a similar clash of kings playing out. The claims that this is all just business or some larger claim about the good of humanity seem secondary to the ego stakes of the major players. And when it was about who built the biggest rocket, it felt less dangerous.
It breaks my heart just a little bit. I feel sympathy, in some sense, for the AIs we will create, especially if they do reach the level of AGI. As another tortured analogy, it is like a bunch of competitive parents forcing their children into adversarial relationships to satisfy the parents' egos.
this is very cool indeed that meta has made available more than they need to in terms of model weights.
however, the "open-source" narrative is being pushed a bit too much like descriptive ML models were called "AI", or applied statistics "data science". with reinforced examples such as this, we start to lose the original meaning of the term.
the current approach of startups or small players "open-sourcing" their platforms and tools as a means to promote network effect works but is harmful in the long run.
you will find examples of terraform and red hat happening, and a very segmented market. if you want the true spirit of open-source, there must be a way to replicate the weights through access to training data and code. whether one could afford millions of GPU hours or not, real innovation would come from remixing the internals, and not just fine-tuning existing stuff.
i understand that this is not realistically going to ever happen, but don't perform deceptive marketing at the same time.
I never thought I would say this but thanks Meta.
*I reserve the right to remove this praise if they abuse this open source model position in the future.
I am not deep into LLMs, so let me ask this. From my understanding, their last models were "open source" in the sense that you could use them, but the inner workings were hidden/not transparent.
With the new model, I am seeing a lot about how open source it is and how it can be built upon. Is it now completely open source, or is it similar to their last models?
The real path forward is recognizing what AI is good at and what it is bad at. Focus on making what it is good at even better and faster. Open AI will definitely give us that option but it isn't a miracle worker.
My impression is that AI, if done correctly, will be the new way to build APIs over large data sets and information. It can't write code unless you want to dump billions of dollars into a solution with millions of dollars in operational costs, and as it stands it loses context too quickly to do advanced human tasks. BUT it is great at assembling data and information. You know what else is great at assembling data and information? APIs.
Think of it this way: if we can make it faster and train it on a company's data lake, it could return information faster than a nested micro-service architecture that is just a spiderweb of dependencies.
Because AI loses context simple API requests could actually be more efficient.
The question is what is "open source" in the case of a matrix of numbers, as opposed to code.
Also, are there any "IP" rights attached at all to a bunch of numbers coming out of a formula that someone else calculated for you? (edit: after all, a "model" is just a matrix of numbers produced by running a training algorithm that is not owned by Meta over training data that is not owned by Meta.)
Meta imposes a notification duty AND a requirement to request another license (no details given for either) for applications of their model with a large number of users. This is against the spirit of open source. (In practical terms it is not a show stopper, since you can easily switch models, although they all have subtly different behaviours and quality levels.)
Zuck needs to get real. They are Open Weights not Open Source.
Open source is a welcome step but what we really need is complete decentralisation so people can run their own private AI Models that keep all the data private to them. We need this to happen locally on laptops, mobile phones, smart devices etc. Waiting for when that will become ubiquitous.
I don't think weights is the source. Data is the source. But still better than nothing.
From the "Why Open Source AI Is Good for Meta" section, none of the four given reasons seem to justify spending so much money to train these powerful models and give them away for free.
> Third, a key difference between Meta and closed model providers is that selling access to AI models isn’t our business model. That means openly releasing Llama doesn’t undercut our revenue, sustainability, or ability to invest in research like it does for closed providers. (This is one reason several closed providers consistently lobby governments against open source.)
Maybe this is a strategic play to hurt other AI companies that depend on this business model?
I strongly suspect that what AI will end up doing is push companies and organizations towards open source, they will eventually realize that code is already being shared via AI channels, so why not do it legally with open source?
Thanks to Meta for their work on safety, particularly Llama Guard. Llama Guard 3 adds defamation, elections, and code interpreter abuse as detection categories.
Having run many red teams recently as I build out promptfoo's red teaming featureset [0], I've noticed the Llama models punch above their weight in terms of accuracy when it comes to safety. People hate excessive guardrails and Llama seems to thread the needle.
Very bullish on open source.
[0] https://www.promptfoo.dev/docs/red-team/
> Developers can run inference on Llama 3.1 405B on their own infra at roughly 50% the cost of using closed models like GPT-4o
Does anyone have details on exactly what this means or where/how this metric gets derived?
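The post doesn't show its math, but a hedged sketch of how such a number could be derived looks like this. Every figure below (GPU rental price, node size, throughput, API price) is an assumption of mine, not Meta's, and the result is extremely sensitive to batch size and utilization, which is probably why the 50% claim is hard to pin down.

    # Illustrative only: how a "cost per million output tokens" comparison might
    # be derived. All numbers are assumptions.
    gpu_hour_usd = 3.00     # assumed rental price for one 80 GB GPU
    gpus = 8                # assumed: one node serving an fp8-quantized 405B
    api_per_million = 15.0  # assumed closed-model API price per million output tokens

    node_cost_per_hour = gpu_hour_usd * gpus
    # Self-hosted cost depends almost entirely on how many tokens/sec you can
    # actually keep the node busy with (batching, utilization).
    for tokens_per_sec in (50, 200, 600):
        per_million = node_cost_per_hour / (tokens_per_sec * 3600) * 1e6
        print(f"{tokens_per_sec:4d} tok/s aggregate -> ~${per_million:6.2f} per million "
              f"({per_million / api_per_million:.0%} of the assumed API price)")

Under these assumptions, a lightly loaded node is far more expensive per token than the API, while a well-batched one comes out cheaper, so the 50% figure presumably assumes high utilization.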
Deployment of PKI-signed distributed software systems to use community-provisioned compute, bandwidth and storage at scale is, now quite literally, the future.
We mostly don’t all want or need the hardware to run these AIs ourselves, all the time. But, when we do, we need lots of it for a little while.
This is what Holochain was born to do. We can rent massive capacity when we need it, or earn money renting ours when we don’t.
All running cryptographically trusted software at Internet scale, without the knowledge or authorization of commercial or government “do-gooders”.
Exciting times!
Without the raw data that trained the model, how is it open source?
It's more like freeware than open source. You can run it on your hardware and use it, but how it was created is mostly not published.
Still huge props to them for doing what they do.
CrowdStrike just added "Centralized Company Controlled Software Ecosystem" to every risk data sheet on the planet. Everything futureproof is self-hosted and open source.
Has anyone taken apart the llama community license and compared it to validated open source licenses? Red Hat is making a big deal about releasing the Granite LLM released under Apache. Is there a real difference between that and what Llama does?
https://www.redhat.com/en/topics/ai/open-source-llm
> Our adversaries are great at espionage, stealing models that fit on a thumb drive is relatively easy, and most tech companies are far from operating in a way that would make this more difficult.
Mostly unrelated to the correctness of the article, but this feels like a bad argument. AFAIK, Anthropic/OpenAI/Google are not having issues with their weights being leaked (are they?). Why is it that Meta's model weights are?
Related:
Llama 3.1 Official Launch
https://news.ycombinator.com/item?id=41046540
The All-In podcast predicted this exact strategy for keeping OpenAI and other upstarts from disrupting the existing big tech firms.
By giving away higher and higher quality models, they undermine the potential return on investment for startups who seek money to train their own. Thus investment in foundation model building stops and they control the ecosystem.
Can't it be divided into multiple parts to have a more meaningful discussion? For example the terminology could identify four key areas:
Open-weights models are not really open source.
OK, one notable difference: did the Linux researchers of yore warn about adversarial giants getting this tech? Or is this unique to the current moment? That, for me, is the largest question when considering the logical progression from "open Linux is better, therefore open AI is better".
“Commoditise your complement” in action!
Small language models is the path forward https://medium.com/thoughts-on-machine-learning/small-langua...
This is very amusing:
- We need to control our own destiny and not get locked into a closed vendor.
- We need to protect our data.
- We want to invest in the ecosystem that’s going to be the standard for the long term.
Thank you Meta for being the bright light of ethical guidance for us all.
It's not open source.
We don't get the data or the training code. The small runtime framework is open source, but that's of little use since its behaviour is largely fixed by the weights. Yes, we can fine-tune, but that's akin to modifying video games: we can do it, but there's only so much you can change with reasonable effort, and no one would call most video games 'open source'.*
It's freeware, and Meta's strategy is much more akin to the strategy Microsoft used with Internet Explorer to capture the web browser market. No one was saying "god bless Microsoft" for trying to capture the browser market with IE. There's nothing wrong with Meta's strategy, just don't call it open source.
*Weights are data, and so is the video/audio output of a video game. Plenty of freeware games give that output away for free, and we don't call them open source either.
The truth is we need both closed and open source; each has its own discovery path, advantages, and disadvantages, and there shouldn't be a system where one is eliminated in favor of the other. They also seem to be driving each other forward via competition.
No matter how "open source" they actually will be, I'm glad this exists as a competitor to Gemini and ChatGPT to help push innovation from a different angle.
Can't wait to see how the landscape will look in 2027 and beyond.
This is obviously good news, but __personally__ I feel the open-source models are just trying to catch up with whoever the market leader is, based on some benchmarks.
The actual problem is running these models. Very few companies can afford the hardware to run these models privately. If you run them in the cloud, then I don't see any potential financial gain for any company to fine-tune these huge models just to catch up with OpenAI or Anthropic, when you can probably get a much better deal by fine-tuning the closed-source models.
Also this point:
> We need to protect our data. Many organizations handle sensitive data that they need to secure and can’t send to closed models over cloud APIs.
First, it's ironic that Meta is talking about privacy. Second, most companies will run these models in the cloud anyway. You can run OpenAI via Azure Enterprise and Anthropic on AWS Bedrock.
I love llamas but how fb can profit from leading this effort is not clear to me.
We welcome Mark Zuckerberg's Redemption Arc! Opensource AI Here we go!
If you're interested in getting Llama 3.1 to build software, check out https://github.com/OpenDevin/OpenDevin
Massive props to AI teams at Meta that released this model open source
And this is happening RIGHT as a new potential leader is emerging in Llama 3.1. I'm really curious about how this is going to match up on the leaderboards...
I don't believe a word coming out of this lizard. He is the most evil villain I know, and I live in the Middle East, can you imagine.
They have earned so much money off all of their users; this is the least they can do to give back to the community, if it can even be considered that ;)
I love how Zuck decided to play a new game called "commoditize some other billionaire's business to piss him off". I can't wait until this becomes a trend and we get plenty of cool open source stuff.
If he really wants to replicate Linux's success against proprietary Unices, he needs to release Llama with some kind of GPL equivalent, that forces everyone to play the open source game.
First you’re going to have to write some laws that prevent openwashing and legitimate open source projects from becoming proprietary.
Another case of "open-washing". Llama is not available open source, under the common definition of open source, as the license doesn't allow for commercial re-use by default [0].
They provide their model, with weights and code, as "source available", and it looks like they allow commercial use until a 700M monthly active user cap is surpassed. They also don't allow you to use their model to train other AI models:
""" ... v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Meta Llama 3 or derivative works thereof). ... """
[0] https://github.com/meta-llama/llama3/blob/main/LICENSE
Am I too skeptical, or is this the approach taken when they've decided they can't make proprietary work for them?
OpenAI needs to release a new model setting a new capabilities highpoint. This is existential for them now.
What is the best of the llama 3.1 models that I can fine-tune with a macbook m3 max w/ 96GB of ram?
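Nobody can answer that precisely without trying it, but here is a hedged back-of-envelope assuming QLoRA-style fine-tuning (4-bit base weights plus small bf16 adapters); the adapter fraction and overhead are loose assumptions, not measurements.

    # Rough QLoRA-style memory estimate per model size.
    def qlora_footprint_gb(params_b, adapter_frac=0.01, overhead_gb=10):
        base = params_b * 0.5                   # 4-bit base weights: ~0.5 GB per 1B params
        adapters = params_b * adapter_frac * 2  # bf16 LoRA adapters + their gradients
        return base + adapters + overhead_gb    # plus activations/KV/OS headroom

    for size_b in (8, 70, 405):
        print(f"{size_b}B: ~{qlora_footprint_gb(size_b):.0f} GB")

Under those assumptions, 8B fits comfortably in 96 GB of unified memory, 70B is plausible but tight with short contexts, and 405B is out of reach.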
Great, maybe you (Meta) should actually release the source for your AI then?
What would be the speed of a query when running this model from disk on an ordinary PC?
Has anyone tried that?
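Back-of-envelope, under the assumption that a dense model has to stream essentially all of its (quantized) weights for every generated token when nothing fits in RAM; the sizes and bandwidths are illustrative assumptions, not benchmarks.

    # Rough ceiling on generation speed when the weights must be streamed from
    # storage for every token.
    model_size_gb = 230  # assumed: 405B quantized to roughly 4 bits per weight
    bandwidth_gb_per_sec = {"SATA SSD": 0.5, "NVMe SSD": 3.5, "DDR5 RAM": 60.0}

    for medium, bw in bandwidth_gb_per_sec.items():
        print(f"{medium}: ~{model_size_gb / bw:.1f} s per generated token")

So "from disk on an ordinary PC" means minutes per token for 405B; the smaller 8B model, which fits in RAM, is the one that is actually usable locally.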
405 sounds like a lot of B's! What do you need to practically run or host that yourself?
Reality: they've realized GPT-4 is a wall. They can't keep pouring trillions of dollars into it for little or no improvement, so now they want to put it out in the open until someone figures out the next step, and then they'll take it behind closed doors again.
I hate that the moment it's too late, everything will be, by design, behind closed doors.
Cynically I think this position is largely due to how they can undercut OpenAI's moat.
I see it as a new race to build the personal computer (PC) all over again. I hope we can apply the lessons learned and jump into open source to speed up development and democratize AI for all. We know how Microsoft played dirty in the early days of the PC revolution.
The value of AI is in the information used to train the models, not the hardware.
I expect this to end up having been one of the worst timed blog posts in history. Open source AI has mostly been good for the world up until now, but we're getting to the point where we're about to find out why open-sourcing sufficiently bad models is a terrible idea.
I appreciate that Mark Zuckerberg soberly and neutrally talked about some of the risks from advances in AI technology. I agree with others in this thread that this is more accurately called "public weights" instead of open source, and in that vein I noticed some issues in the article.
> This is one reason several closed providers consistently lobby governments against open source.
Is this substantially true? I've noticed a tendency of those who support the general arguments in this post to conflate the beliefs of people concerned about AI existential risk, some of whom work at the leading AI labs, with the position of the labs themselves. In most cases I've seen, the AI labs (especially OpenAI) have lobbied against any additional regulation on AI, including with SB1047[1] and the EU AI Act[2]. Can anyone provide an example of this in the context of actual legislation?
> On this front, open source should be significantly safer since the systems are more transparent and can be widely scrutinized. Historically, open source software has been more secure for this reason.
This may be true if we could actually understand what is happening inside neural networks, or train them to consistently avoid unwanted behaviors. As things are, the public weights are simply inscrutable black boxes, and the existence of jailbreaks and other strange LLM behaviors shows that we don't understand how our training processes create models' emergent behaviors. The capabilities and influence of these models are growing faster than our understanding of them and our ability to steer them to behave precisely how we want, and that will only get harder as the models get more powerful.
> At this point, the balance of power will be critical to AI safety. I think it will be better to live in a world where AI is widely deployed so that larger actors can check the power of smaller bad actors.
This paragraph ignores the concept of offense/defense balance. It's much easier to cause a pandemic than to stop one, and cyberattacks, while not as bad as pandemics, also seem to favor the attacker (this one is contingent on how much AI tools can improve our ability to write secure code). At the extreme, it would clearly be bad if everyone had access to an anti-matter weapon large enough to destroy the Earth; at some level of capability, we have to limit the commands an advanced AI will follow from an arbitrary person.
That said, I'm unsure if limiting public weights at this time would be good regulation. They do seem to have some benefits in increasing research around alignment/interpretability, and I don't know if I buy the argument that public weights are significantly more dangerous from a "misaligned ASI" perspective than many competing closed companies. I also don't buy the view of some in the leading labs that we'll likely have "human level" systems by the end of the decade; it seems possible but unlikely. But I worry that Zuckerberg's vision of the future does not adequately guard against downside risks, and is not compatible with the way the technology will actually develop.
[1] https://thebulletin.org/2024/06/california-ai-bill-becomes-a...
[2] https://time.com/6288245/openai-eu-lobbying-ai-act/
405 is a lot of B's. What does it take to run or host that?