> I think over the past 18 months, that problem has pretty much been solved – meaning when you talk to a chatbot, a frontier model-based chatbot, you can basically trust the answer
Yes, I'm sure if we ask a question about The Party to your (Baidu) model, we can trust the answer.
It takes so much money to be a player in this space; the ante went from ~$100k (GOOG, FB) to like $4B, or 100k H100s. That's how I arrive at the statement that 99% don't have the cash.
There are a lot of people making AI companies outside of LLM model development.
There are, but then the question really becomes "what is the moat"? I.e. lots of these companies are essentially just providing wrappers around the best models coupled with some type of RAG approach.
FWIW, I believe there is a defendable moat for the players that have really good UI and are really focused on end-user solutions. E.g. I pay for Cursor.sh because I believe it is an easy net win for my productivity. But I do really wonder if these "AI application" companies can support their lofty valuations. I feel like most of them will have limited pricing power because if they try to price too high it's easy for someone to say "OK, we'll just go to a competitor, or even pull it in house."
Can’t believe people still treat this kind of messaging about AI as expert opinion and not as advertising in its purest form. I guess those are the same people who see something profoundly meaningful in machine generated strings.
Getting beyond the title, which I definitely agree with (https://news.ycombinator.com/item?id=41896346), there was this nugget about hallucinations:
> I think over the past 18 months, that problem has pretty much been solved – meaning when you talk to a chatbot, a frontier model-based chatbot, you can basically trust the answer
Can't decide if he actually believes this, or he's just spewing his own hype. While I definitely agree the best models have reduced hallucinations, going from, say, 3% hallucinations to .7% hallucinations doesn't really improve the situation much for me, because I still need to double check and verify the answers. Plus, I've found that models tend to hallucinate in these "tricky" situations where I'm most likely to want to ask AI in the first place.
For example, my taxes were more of a clusterfuck than usual this year, and so I was asking ChatGPT to clarify something for me, which was whether the "ordinary dividends" number reported on your 1040 and 1099s is a superset of "qualified dividends" (that is, whether the qualified dividends number is included in the ordinary dividends number), or if they were independent values. The correct answer is that the ordinary dividends number (3b on the 1040) does include qualified dividends (the 3a number), but ChatGPT originally gave me the wrong answer. Only when I dug further and asked ChatGPT to clarify did I get the typical "My mistake, you're right, it is a superset!" response from ChatGPT.
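The superset relationship the model got wrong is simple enough to sanity-check with a couple of lines. The dollar figures below are made up purely for illustration; only the 3a/3b relationship is from the comment above:

```python
# Hypothetical 1099-DIV figures. The key fact: Form 1040 line 3b
# (ordinary dividends) includes line 3a (qualified dividends).
ordinary_dividends = 1000.00   # line 3b: ALL dividends, qualified or not
qualified_dividends = 750.00   # line 3a: the subset taxed at capital-gains rates

# Because 3b is a superset of 3a, qualified can never exceed ordinary,
# and the non-qualified portion is simply the difference.
assert qualified_dividends <= ordinary_dividends
non_qualified = ordinary_dividends - qualified_dividends
print(non_qualified)  # 250.0
```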
Anybody who says that LLM output doesn't need to be verified is either willfully bullshitting, or they're just not asking questions beyond the basics.
I'm wondering if we are now in the Low Background Steel transition, as far as Internet content goes. I already see material on the open Internet that has very obviously been generated by AI. As the next round of Common Crawl or whatever is slurped in for training, AI ends up eating its own output. Does the quality of the material degrade at that point? Maybe we'll end up searching for an Internet that existed before AI started rewriting it.
https://en.wikipedia.org/wiki/Low-background_steel?wprov=sfl...
Yes - this insight has been repeatedly discovered by many, but remains a great analogy.
There are two main types of data for training intelligences (natural or artificial):
1. Self play
2. Data left behind by other intelligences' self play.
The Internet is 2, but is generated by the self play of humans - who were themselves trained on the self play of previous humans.
That's how civilization is bootstrapped.
Once you have bootstrapped sufficiently smart AI, they can possibly bootstrap themselves further on their own self play, instead of continuing to rely on human self play data.
Nature has at least one more main type of training data, and it's much bigger than self play for natural intelligences:
Testing against the universe.
Self play on self-play data ignores the whole of Empiricism.
90% was the startup fail rate even before AI.
After around 7 years of company life, roughly 90% have shut down.
That doesn't mean they won't make money in the prior years and then still shut down by the 7th year, because markets change or competitors beat you.
My bet is on unverified AI going the way of the dodo. I’m sorta sick of hallucinated nonsense. I also don’t see a business use case for a lot of “cool” ai products.
I don't know if you're familiar with how LLMs work, but after diving a bit into that space, I'm a "realistic enthusiast" after a lifetime as a deep skeptic. For me it's like the beginning of the internet (yes...old). It's clear there's something profoundly different and potentially enormously useful going on. Exactly how it will end up being used and who will make money from it is unclear. We know from experience that those questions ended up having answers different than we anticipated with the internet and I'd expect the same to happen with AI.
The parent post is very orthogonal to yours. That isn't saying that LLMs won't be useful -- it's saying that unverified LLMs are of limited value.
This rings true. There's low economic value in performing activities where it doesn't matter whether the output is true or accurate.
Unverified LLMs can generate unlimited output that cannot be trusted; output that can approximate truth at times.
I suppose the long term question is whether the approximation is sufficient for value-generating purposes. It clearly is sufficient in cases where it outperforms the status quo (example: summarizing customer feedback).
This is also the case for unverified and verified humans, and one thing I can say with absolute confidence is that the majority of internet content has always, and will always be unverified opinion and ideology. As is the majority of human minds. The most valuable thing to come out of AI is the forced introspection about ourselves. It's "us" times N. For better or worse.
I don't know, remember the controversy over the wildly inaccurate historical AI images? That could serve some satirical products.
Weren't the majority of those induced by politically motivated manipulation rather than being a base characteristic of the technology?
In my optimistic moments, I think that.
In my realistic moments I think that we will rewrite medical law, banking law, and privacy law and so on to accommodate the shitty AI children of billionaires. Misdiagnosed by "AI"? Too bad. Lose a job because of AI? Too bad. Sentenced to death by an AI-powered criminal investigation? Too bad.
But YC just funded hundreds of AI companies. This would mean the world’s smartest investors are wrong. This is, frankly, impossible.
What if they aren’t the world’s smartest investors
Case in point https://x.com/paulg/status/1845936708488970606
Even if these younger vintage batches fail completely, the existing portfolio will benefit from AI being hyped.
I'll have to assume you don't understand how tech investing works. Give a dollar to 100 companies and you've spent $100. 99% fail, but one ends up returning $1,000 and you've 10x'ed.
Why would it be 10x in this example? Isn't it 1000x? Or 900x if you remove the initial investment?
Do 1/100 companies really experience that kind of windfall?
In this example, you invested $100 to get a $1,000 return. $99 of those investments failed, but the one that succeeded made up for the losses. You still have to factor in the losses, otherwise if you could have just picked the winner from the get-go, you would have done that and invested just $1, not $100.
You spent $100 and came out with $1,000. That’s 10x (you’ve returned 10 times the initial investment).