Google is taking a different approach this time, moving quickly. While NotebookLM is indeed a remarkable tool for personal productivity and learning, it also opens the door for spammers to mass-produce content that isn't meant for human consumption.
Amidst all the praise for this project, I’d like to offer a different perspective. I hope the NotebookLM team sees this and recognizes the seriousness of the spam issue, which will only grow if left unaddressed. If you know someone on the team, please bring this to their attention: could you provide a tool or some plain-English guidelines to help detect audio generated by NotebookLM? Is there a watermark or any other identifiable marker that can be used?
Where do you get the "low-quality" part from? My experience with NotebookLM is that it creates much higher-quality, more informative, more fact-based, and more concise podcasts than 99% of the stuff I listen to. I've mostly switched entirely over to NotebookLM for my podcast listening. It generally offers a far higher quality experience from my perspective.
Maybe you have the problem backwards: we accidentally end up listening to non-NotebookLM podcasts?
Yes. For example, I fed it a public tender and the associated regulations in Norwegian, and it was able to answer questions about the parts I was interested in correctly and succinctly. I have also fed it research papers that I normally would not have the patience or knowledge to go through on my own.
In terms of actual usefulness, it’s one of the AI tools that most impressed me.
The main issue is, of course, privacy.
I have tried to reproduce something similar using AnythingLLM and the low tier Llama models, but of course the experience is much worse, both in terms of results, response times and UI. If someone knows of a better local setup, I’m all ears.
I would consider a Workspace subscription if I could actually trust Google to make good on the commitment of not reading your stuff, which I’m finding hard to do…
Personally, I hate even the idea of an AI made podcast, because to me podcasts are personal and emotional. They're about the individual humans who make them. They're not just a source of "information".
I'm glad there are different kinds of podcasts for different people now.
I've always absolutely hated the focus on the individual humans and their personalities behind the podcast, and wished they'd be a better source of well-structured "information".
I never listened to a podcast I didn't get frustrated with, even at 2x speed. These NotebookLM podcasts have been exactly what I've always wished podcasts were.
It's an interesting assumption that, by virtue of being AI generated, it's considered bad/fake. 20 years ago, people hated how Photoshop changed the photo design industry; NotebookLM is knocking on the door now.
I'm excited by AI, but I've also tried using this specific one to generate a podcast based on one of my own blog posts and will only try again due to this product announcement rather than because I think the state of the art is already "there".
On the plus side, the speech is almost perfect; so good, that I sincerely hope the voices themselves are never fully under user control.
With regards to the actual summary of the content I gave them, I would say they are grade B: only mostly correct, they're still inventing things I didn't say and missing things I did say.
That's not to say humans don't make mistakes, and I still consider this objectively impressive; that it can reach even this level was sci-fi when I was a kid. But why waste time on a grade-B podcast when the AAA tier costs you, as a consumer, a 30-second advert?
There's an ethical/moral-luck dilemma at the heart of this.
If an AAA-tier podcast on the subject you want to listen to exists (and you know about it), then that's probably a better (and obvious) choice for your listening time.
However, if you want to listen to someone discuss or explain something and you don't know of an AAA-tier podcast, it's possible that a generated podcast is better than nothing.
On the other hand, it's also possible that the generated podcast will miss or hallucinate a key detail, and herein lies the dilemma. Is it better to listen to something that might get something wrong, or not to listen and perhaps someday to learn about the subject through some other form that is less likely to include mistakes?
Interesting, are there any podcasts in particular that you recommend? Everything I’ve heard from it just seems like the most banal, cookie cutter stereotype of a podcast with nothing but extremely surface level summarization of a given article, peppered with random cliches and fake sounding reactions “Wow! ok, so let’s hear more about that. I’m intrigued!” “OK, let’s dive deep.” Etc.
There’s a normal, human-generated podcast called Deep Dive: AI (https://deepdive.opensource.org/). There’s also a confusingly similarly named podcast, Deep Dive AI, that appears to have only one episode and is NotebookLM generated. Which one are you referring to?
Edit: If I'm understanding it correctly after some googling, supposedly the "name" of all podcasts generated by NotebookLM is "Deep Dive"? That's just confusing.
It's trained on too many shallow podcasts. Go compare any NotebookLM podcast with an episode of Hardcore History. The latter goes into much more depth (even when you account for it being much longer).
I think OP is presenting a different problem: while this tool gives the possibility of creating good quality podcasts, it also enables spammers to quickly generate tons of garbage episodes just to profit from the advertisement they put in it.
Goody, let’s just drive out all the human creators who actually interview real experts and go in depth on a subject, with AI-generated voices and summaries.
If AI ends up destroying humanity, it isn’t going to be through weapons and death robots, but just by entertaining and placating us all to death.
This doesn't strike me as much of a problem as it appears for you. What are the biggest issues you foresee?
I'm an avid podcast listener, but I already ignore 99.9% of podcasts out there. I'm not concerned that this is going to become 99.99%.
If these AI generated podcasts are all bad, I will just continue to ignore them. If some turn out to be good, it seems like a win to me.
If you're worried about an existential "what happens to the world if all media is machine generated", I guess I'm willing to hop on the ride and see what we find out.
99.9? There are roughly 3 million podcasts out there right now. I listen regularly to about 10 over a year (in any given week maybe 3-4). I'm therefore ignoring 2,999,990, or 99.9997%, of podcasts. I definitely agree with you that this isn't a problem.
(Also, ironically, one of the 10 podcasts I listen to regularly is the Deep Dive on AI. A NotebookLM production!)
It could poison the well - make it hard for people to find new good podcasts, and reduce discovery and revenue. Also they could fragment our society even more, disconnect people from people. Doesn't seem worth the risk.
If people want to listen to AI generated podcasts, they can just make them themselves. They don't need them published on a platform alongside human-made podcasts. If I were Apple, who ultimately controls the curation of podcasts, I'd prevent them. After all, Apple Intelligence will soon do as good a job of making your custom podcast, if that's what you want.
How is the 99.99% of podcasts that currently aren't worth listening to not already poisoning the well? If the current ranking algorithms work, I don't see why they can't work with more podcasts, AI or not.
Assuming Google retain the current two voices for the audio overviews, it will rapidly become obvious to most people where the podcast came from.
I've seen "creators" on YouTube running NotebookLM-generated audio through (e.g.) ElevenLabs to change the voices, but this invariably degrades the quality.
Counterpoint: Most podcasts were utterly worthless before AI too. The world will do fine losing a few mattress ad vehicles.
Like other data, provenance suddenly matters a lot. From my POV, that's good. Not all data sources are created equal, and this is putting it into stark enough relief it might actually change the landscape. (In case it isn't obvious, I strongly believe most of the Internet was garbage well before LLMs. We just called it "SEO". Still garbage)
I generally agree, but when AI generated content is actively trying to avoid being labelled as “AI generated” it kinda gets depressing. Because in the end, it will just make the entire industry “seem” worthless, akin to AI generated pictures.
I’d rather let the end user know if it was made by humans or not, and let the marker decide. If people love listening to such content, let it be. But hiding how it was made, feels a bit disingenuous.
It sounds more like we should ban email, and all email providers should consider the problem of email spam, which traditional mail didn't have because no one could afford that many envelopes and stamps.
Or like we should go back to carts, because cars are noisy, and not only that but they might collide with pedestrians, and not only that, they might even collide with each other.
Instead of containing the tools and curtailing the progress (email and cars) we should probably try to contain and curtail abusers. Very hard to do, I know but the right thing to do.
Paper mail does have an issue with spam too. Though I guess the differences in price and scale turn a quantitative issue into a qualitative one, which I guess was your point?
You're assuming that everybody shares your opinion of cars being progress and/or progress being good - you're assuming too much.
Nearly every Google Image search result has AI images now. Personally, I’m starting to attribute this to Image Search having been neutered and downgraded a ton over the past few years anyway, so worse content surfaces in general.
But consider it from Google’s perspective, and this is why I think they won’t care: serving snippets and caches of articles had rights holders attacking them, serving thumbnails of images had rights holders attacking them, and serving tiny bits of songs in the background of videos had rights holders attacking them.
Serving AI doesn’t. I don’t think the current management at Google will care if Google shows fake baby peacocks, as long as it can serve them without being bothered by rights holders, the same way a Gemini summary can launder article information.
> it also opens the door for spammers to mass-produce content that isn't meant for human consumption.
What's new? Every novel class of genAI product has brought a tidal wave of slop, spam and/or scams to the medium it generates. If anyone working on a product like this doesn't anticipate it being used to mass produce vapid white-noise "content" on an industrial scale then they haven't been paying attention.
What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it. Hopefully, they are already working on tools or mechanisms to address the problem—ideally before their colleagues at YouTube and Google Search come asking for help to fight NotebookLM-generated spam :)
> What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it.
What is the impact? Have any of them attracted an audience of any meaningful size?
If a month from now there are 1.3 million generated podcasts, what do you anticipate the fallout to be?
The impact of cheap spam, as in the other cases, is to push people towards platforms.
You can even see it in the case of e-mail spam, where while technically not platforms, it helped Gmail and Hotmail to become so big that they can basically impose their own rules on the protocol.
Wouldn’t the value be lower if podcasts end up the way product review blogs have? Endless spam that causes people to append “Reddit” to their searches in hopes of finding something human generated.
Exactly this. It's obvious that generative AI content is bad for search and indexing, and I wish more people would learn to extrapolate from the forms of the problem that already exist, that the quality of the generated content is not going to solve the "please, god, find me something a real person actually said about this real thing" problem.
I just don't understand how people can pretend this isn't happening just because they find each new twist of a technology fascinating.
> It's obvious that generative AI content is bad for search and indexing
It's not obvious at all. You see a problem, I see opportunities. The person you responded to mentioned Reddit. Well, the glut of garbage on the internet makes Reddit more valuable.
The "please, god, find me something a real person actually said about this real thing" isn't actually a problem very many people have. Despite all the complaining here, most people are happy with the results they get when they search Google or Bing or Facebook or Reddit.
So like every technology, there are positives and negatives. It isn't clear to me where this all nets out but I'm leaning slightly positive.
And if you don't find this technology fascinating, then I'm not sure what to say.
>Well, the glut of garbage on the internet makes Reddit more valuable.
If given the choice between a dime and a fiver, is the fiver somehow worth more than if I'd offered you the choice between a fiver and a tenner instead?
> Is there a watermark or any other identifiable marker that can be used?
The problem with this is it's not feasible long-term, or even medium-term - as soon as a watermarking system is implemented, a watermark-removal system will be created.
Who cares? This is a problem for podcast hosters like Spotify, but not for listeners. Listeners can just follow their usual podcasts and never see 99% of the stuff that is out there.
The comments' default remedy is tribal: "The only moral content is my content." We sort of used to live in that world under the studio and TV networks system. Most consumers would say, it was not so bad, maybe better even.
Of course, the commenter never says this, living in the world today, where the writing he likes would never be published by the New York Times like it is on Twitter, the TV he likes would never be offered for free like it is on YouTube, and the music he likes would never be offered for pennies on Spotify. Some meaningful creators will lose from every remedy you could think of where Google "something somethings" AI. Maybe the root problem is generalizing.
I created a “podcast episode” (???) of my personal blog (not trying to get traffic to it. It’s more of a journal) using NotebookLM. It sounded just as bland and overproduced as a “professional” podcast by NPR like “Planet Money” and “The Indicator”.
Whether that is saying how high quality NotebookLM is or how low quality NPRs podcast are is an exercise for the reader.
The only reason “Stuff you should know” is better is because of the random off topic discussions they go into and that’s not a complaint about SYSK.
Podcasts - episodic radio shows hosted on Apple Music and Spotify - haven't been around for very long. Not long enough to have kids being tutored in making podcasts and then becoming adults with that sentimental hobby, like with playing violin or oil painting. If you believe that the "Human Authenticity Badge" is meaningful for podcasts, it's complicated: traditions play the biggest role in the outrage you are trying to spin, not an appeal to slop and spam; of course, there is already a ton of low-quality podcasts, music, and art made by real people for no nefarious purpose whatsoever. Like with many of these posts, which are really common on HN, there isn't a sensible remedy suggested besides pointing fingers at some giant corporation and asking it to do something impossible.
If you care a lot about podcast quality, go and make your own podcast service with better discovery. Once you realize the antagonist was collaborative filtering, made possible by non-negative matrix factorization dating from the year 2000, and not AI, you will at least have learned something from the comment, instead of just feeling better. And then, how do you propose to curate by hand, and why would someone choose your curation over the New Yorker's? And maybe those very purists, trying to make everything sentimental, accusing everyone of slop and spam - well, why do so many creators thrive and ignore the New Yorker's opinion about them entirely? Perhaps curation is not only not scalable, but also wrong. Difficult questions for listeners and podcast authors alike.
Well which one is it? Are the podcasts low quality or not? If they are, what the hell are you worried about? To be worried about, idk, disinformation from podcasts of all things is absolutely silliness. Won't someone think of the... podcast audiences? Fuckin what dude?
I was using this yesterday. I dumped all postmortems for an aspect of our infrastructure into a notebook and could then ask it to pull out common themes. It was remarkably effective. I also generated one of these "audio overviews" (aka podcasts) and it was great.
There was a vast improvement in quality from giving it a prompt when generating the overview. The generic un-prompted overview was for entirely the wrong audience, in our case users of our infrastructure rather than the developers. When instructing it to generate an overview for the SRE team and what they should focus on it was far better.
Was it useful for our in-depth analysis, no. Would I listen to one based on the last 100 postmortems for a new team I joined, absolutely. As an overview it was ideal, pulling out common themes from a lot of data and getting some of the vibe right too.
I am late to Google's AI party, but... my personal impression (which might be wrong) is that Google's breadth and depth of AI tools, ranging from NotebookLM to AI Studio, is heavily underrated. Too good, as far as I have tried.
Google, of course, is the birthplace of "Attention Is All You Need".
More so, they're good at attracting talent. Note that the "Attention Is All You Need" paper came out in 2017. Noam Shazeer was so hamstrung at Google that he left to start Character.AI. There he was reportedly close to a foundational model that would've crushed benchmarks, until Google literally bought him back for $2.7B [1]
My product https://reasonote.com allows you to generate podcasts as well, and it's had this feature for a few weeks.
Improvements over NotebookLM:
(1) You can start with just a subject, and you don't need a full document to begin (though you can do that too![1])
(2) The podcast generates much faster
(3) The podcasts are interactive -- you can ask the hosts to change direction mid podcast, and they will do so.
(4) (Coming soon) You'll be able to make a Spotify-style Queue of Podcast topics, which you can add to as you encounter new ideas.
The primary tradeoff is that the voices / personalities are somewhat less engaging than NotebookLM at this time, though this will be dramatically improved over the coming months.
This is all in addition to the core value proposition, which is roughly "AI Generated Duolingo for Any Subject".
It's early days, but I'd love for you all to check it out and give me feedback :)
[1] Documents are currently heavily length-limited but this will be improved shortly
Have you tried doing what they do, where you generate the script, then run it through again and ask it to add the pauses and personality to the script?
That's coming soon, yes! I'm learning about how to emotionally prompt the different voice APIs right now. Eleven labs has some interesting writing on the subject in their prompting guide.
I'm also playing with other voice models -- built an awesome "voice actor simulator" with OpenAI Realtime Voice -- but it's expensive. Considering asking users to pass in an OpenAI API key for advanced voice? Or maybe just passing the per-token cost along to the user in their subscription.
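The two-pass approach described above can be sketched in a few lines. This is a minimal illustration, not anyone's actual implementation: `call_llm` is a stand-in for whatever chat-completion API you use (OpenAI, Gemini, etc.), and the prompt wording is my own.

```python
def build_draft_prompt(source_text: str) -> str:
    """Pass 1: turn source material into a plain dialogue script."""
    return (
        "Write a two-host podcast script covering the key points of the "
        "following material. Plain dialogue only, no stage directions:\n\n"
        + source_text
    )

def build_polish_prompt(draft_script: str) -> str:
    """Pass 2: re-run the draft to layer in pauses and personality."""
    return (
        "Rewrite this podcast script. Keep the content, but add natural "
        'pauses (e.g. <break time="0.3s" />), interjections, and distinct '
        "host personalities:\n\n" + draft_script
    )

def generate_podcast_script(source_text: str, call_llm) -> str:
    # Two separate LLM calls: content first, delivery second.
    draft = call_llm(build_draft_prompt(source_text))
    return call_llm(build_polish_prompt(draft))
```

Splitting content generation from delivery polish means the second pass can target a specific TTS engine's pause/emphasis markup without risking the factual content drifting.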
Nice, I've only scratched the surface of NotebookLM, mainly by dumping in lots of component reference material (datasheets, reference guides, application notes, etc.). The text querying works great, but the audio overview wasn't very useful when it stuck to the high level of the content. With some ability to steer the topic, it might be quite useful!
AI tooling has now made it too easy to find things.
On a web forum I am admin on, a user opened a DM a week ago titled "Google Notebook LM", someone else had shared a generated podcast thing that summarised the view of the forum on a particular subject, and it called out the usernames of someone who had strong opinions.
In response, another user ran with this and asked for a podcast to be generated summarising everything that was said by the user, their political views and all their hot takes.
Erm... uh-oh.
The use of real identity, the use of the same username across multiple sites, now makes it trivial for things like "take this Github username, find what sites the same username exists on, make a narrative of everything they've ever said, find the best and worst of what they've ever said"... which is terrifying.
I've said to the user the same old line we always repeat, "anything placed on the internet is effectively public forever", but only now are the consequences of this really being seen.
The forums I run allow username changes, encourage anonymity as much as possible, but we're at a point where multiple online identities, one for every site, interest, employer, etc... is probably the best way to go.
I notice on HN that there are many accounts that seem to register just to comment on particular stories and nothing more, and the comments are constructive and well thought out, and now I wonder whether some are just ahead of the curve on this — obscuring the totality of their identity from future employers, or anyone else who might use their words against them.
It feels like our lightweight choices in the past will start to have significant consequences in the present or future, and it's only a failure of imagination that is delaying a change in user behaviour.
The ability to do that exists, and was always going to get easier. We used to all use pseudonyms, but that fell out of fashion somewhat; and even then, over time it's inevitable you'll say one or a few things that can deanonymize you. This was always going to happen, and we can only hope it will change the public perception of privacy, which to this point has often been indifference, or even annoyance when one brings it up.
> I notice on HN that there are many accounts that seem to register just to comment on particular stories and nothing more, and the comments are constructive and well thought out, and now I wonder whether some are just ahead of the curve on this — obscuring the totality of their identity from future employers, or anyone else who might use their words against them.
Throwaways are very common here for that purpose! On my end, I'm becoming more interested in how to safeguard users (anonymize them) and also how to make it easy to _generate_ throwaways without opening the door to spam (e.g. generate from a valid account, but then detach it). HN likely gets around this by being niche; I think the somewhat unattractive site design helps there.
Yeah the light cone of online activity seems to only grow with little diminishing, which seems unnatural and counter to the type of environment we evolved for. GDPR and the right to be forgotten seemed funny in my youth, now I see it as wisdom ahead of its time.
Surprised this was not there from the beginning. It can result in much better output. My problem with the default prompt is that it often is just two equally "knowledgeable hosts" kind of just bouncing information back and forth. With being able to customize the prompt you can create a kind of "explainer" and "listener" dynamic among the hosts that really helps the overall flow of the episode.
Something like this:
The two podcast hosts have very different levels of knowledge on the topic. The first host is the expert on the topic and explains the subject and the details to the second host. The second host has very little existing knowledge about the subject but will react to the information and ask follow up questions.
I think the default prompt attempts something like that -- one host does tend to take a more questioning role, and I've even had a few pods introduce one of them as a guest expert. But it seems like it sometimes loses track of who's who in that dynamic.
In a sea of similar tools, Google seems to have struck on something semi-viral with NotebookLM. Output can be mediocre, but with the bar for many podcasts being set at "read pages from Wikipedia", that's not bad at all for zero effort.
The 100 baseline on that graph is the highest attention the term has received, and it correlated with a launch and has since decreased.
Google never has problems getting the first few million users for consumer-launched tools. They have problems getting the first few million in net profit almost 100% of the time, and shut the product down a few years later.
But I do agree this is a good play for Google, it plays to their strengths.
Definitely hear what you are saying but I personally think it is for the best that they are instantly recognizable as NotebookLM podcasters. Especially as this makes the rounds on the internet. If you could manipulate the voices it would just make it more challenging to detect if a "Podcast" is using this tool.
Now that NotebookLM has gone from "small experiment" to "moderate viral success", I expect all kinds of roadmaps are being drawn up to use it to hook users into the broader Google AI ecosystem (e.g. automatically adding images and illustrations via Imagen 3, etc.).
Will be interesting to see what new ideas NotebookLM leads to. I feel this is how custom GPTs should have been launched. OpenAI is on the backfoot here.
Not an improvement for me. I've been instructing NotebookLM for weeks now already by including the instructions into the sources. That way I have version control on my prompts and can easily drag into the sources upload. This requires finding my instructions and copying and pasting, there's also a 500 character limit which is very small, I have over 2000 characters for my standard prompts.
I think it's an easy affordance for those users who are just interested in the basic functionality of the product.
However like you I cottoned on early that one could put a "Podcast Production Notes.txt" in each one of my Notebooks that allowed me to really put some horsepower behind the generated audio :D
NotebookLM is great to get an overview of a publication.
I created a short podcast focusing on HCI publications using NotebookLM
https://www.deep-hci.org/
Just posted some ISWC, MobileHCI and UbiComp papers, UIST is up next.
I sometimes feel like I'm crazy when I read the comments here. I absolutely cannot listen to these things. They sound like mixture of satire and late-night TV home shopping to me, all this campy hyperbole, the hyping up of even the most mundane things... Also, content-wise, stuff is dumbed down to the point where I can maybe see some value in this as entertainment, but this is not a learning tool, just like you won't become an astrophysicist by watching PBS space time (don't get me wrong, I love space time, but purely as entertainment).
That could actually be a top quality podcast - well moderated content from thoughtful people, many subject matter experts, with mostly thoughtful discussion.. (I read hn for the comments..) sounds good to me.
It’s a work in progress, and I literally just added the first foundation for note taking next to the RAG chat last night (not working yet), but it’s getting there.
The plan is to build the functionality out and then migrate the codebase to an API backend so that people can make their own front-ends for their own needs.
Also, it doesn’t have podcast creation or TTS, yet. They are planned, but it can transcribe and ingest a podcast for RAG chat currently.
Is there an open source tool that copies NotebookLM yet, or did anyone dig a bit into how the prompting is done to generate output in this dialogue format?
<Person1> "Welcome to Podcastfy! <break time="0.5s" /> Today, we're summarizing an interesting text about [topic from input text]. Let's dive in!"</Person1>
And then it tells the LLM not to do anything at all like that example:
Avoid using statements such as "Today, we're summarizing a fascinating conversation about ...". ]
[PauseInsertion: Avoid using breaks (<break> tag) but if included they should not go over 0.1 seconds]
Is this an intentional technique in prompt construction to avoid the LLM over indexing on the example or something?
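Whatever the original authors intended, the pattern is easy to reproduce: show the model one concrete example so it picks up the format (speaker tags, break tags), then list explicit rules steering it away from that example's register. A rough sketch follows; the names and prompt wording here are my own, not Podcastfy's actual code.

```python
# One concrete example anchors the *format* (speaker tags, break tags)...
FORMAT_EXAMPLE = (
    '<Person1>"Welcome to the show! <break time="0.5s" /> '
    "Today, we're summarizing an interesting text. Let's dive in!\"</Person1>"
)

# ...while explicit rules steer the *register* away from that example.
RULES = [
    'Avoid openers such as "Today, we\'re summarizing ...".',
    "Avoid <break> tags; if one is needed, keep it under 0.1 seconds.",
]

def build_dialogue_prompt(source_text: str) -> str:
    """Assemble a dialogue-generation prompt with a negative example."""
    rules = "\n".join(f"- {r}" for r in RULES)
    return (
        "Write a two-speaker dialogue about the text below, using the tag "
        f"format shown in this example:\n{FORMAT_EXAMPLE}\n\n"
        f"However, follow these rules:\n{rules}\n\n"
        f"Text:\n{source_text}"
    )
```

The example is doing double duty: without it the model may not emit parseable `<Person1>` tags at all, and the negative rules exist precisely because a lone example tends to be imitated wholesale.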
I realize now that this is actually a clever way to collect training data. If it were any company other than Google, I'd be like, Awesome toy. With them, I am uneasy.
This works pretty well. I tried it with this guidance prompt:
You are both pelicans who work as data
journalist at a pelican news service.
Discuss this from the perspective of
pelican data journalists, being sure
to inject as many pelican related
anecdotes as possible
You ever find yourself wading through
mountains of data trying to pluck out
the juicy bits? It's like hunting for
a single shrimp in a whole kelp forest,
am I right?
And:
The future of data journalism is
looking brighter than a school of
silversides reflecting the morning sun.
Until next time, keep those wings
spread, those eyes sharp, and those
minds open. There's a whole ocean
of data out there just waiting to be
explored.
It’s amazing how it doesn’t occur to OpenAI and others that the “safety guardrails” really dilute the output. And it’s a display of conservatism: bizarre to think that in America, an AI couldn’t swear.
Until today, where Google allowed the AI to behave as the lord intended.
Why do you think it doesn't occur to them? I'm genuinely curious.
It seems to me these alignment questions are a conscious trade-off between giving users what they ask for and brand safety knowing that every spicy output will immediately find its way to twitter/reddit/etc.
NotebookLM is contributing to fake podcasts across the internet, with over 1,300 and counting:
https://github.com/ListenNotes/ai-generated-fake-podcasts/bl...
Just recently, a Hacker News post highlighted how nearly all Google image results for "baby peacock" are AI-generated: https://news.ycombinator.com/item?id=41767648
It won't be long before we see a similar trend with low-quality, AI-generated fake podcasts flooding the internet.
A coworker fed some EU trade regulation page and its official FAQ to NotebookLM, and I was quite impressed with the results.
It was factually accurate, and presented the topic in a manner that was easy to digest and kept it interesting.
I didn't plan to but ended up listening to the whole thing, and I normally don't enjoy the podcast format.
For someone new to the topic, it'd be a pretty great intro compared to reading the official pages.
Yes. For example, I fed it a public tender and associated regulations in Norwegian, and it was able to answer questions about the parts I was interested in correctly and succinctly. I have also fed it research papers that I would normally not have the patience or knowledge to go through on my own.
In terms of actual usefulness, it’s one of the AI tools that most impressed me.
The main issue is, of course, privacy.
I have tried to reproduce something similar using AnythingLLM and the low-tier Llama models, but of course the experience is much worse, in terms of results, response times, and UI. If someone knows of a better local setup, I’m all ears.
I would consider a Workspace subscription if I could actually trust Google to make good on the commitment of not reading your stuff, which I’m finding hard to do…
Personally, I hate even the idea of an AI made podcast, because to me podcasts are personal and emotional. They're about the individual humans who make them. They're not just a source of "information".
I'm glad there are different kinds of podcasts for different people now.
I've always absolutely hated the focus on the individual humans and their personalities behind the podcast, and wished they'd be a better source of well-structured "information".
I never listened to a podcast I didn't get frustrated with, even at 2x speed. These NotebookLM podcasts have been exactly what I've always wished podcasts were.
Have you listened to any audio overviews in NotebookLM? They can be surprisingly good.
I don't know... I think it's pretty cool. In fact, I just found a podcast talking about this very thread (about a minute in, your comment even gets discussed ;-) https://drive.google.com/file/d/130s6OzcfsZam8V-6S5ugKmc0M7O...
I may have to re-task my paperclip making AI on generating podcasts...
It's an interesting assumption that, by virtue of being AI-generated, it's considered bad/fake. 20 years ago, people hated how Photoshop changed the photo design industry; NotebookLM is knocking on the door now.
I'm excited by AI, but I've also tried using this specific one to generate a podcast based on one of my own blog posts and will only try again due to this product announcement rather than because I think the state of the art is already "there".
On the plus side, the speech is almost perfect; so good, that I sincerely hope the voices themselves are never fully under user control.
With regards to the actual summary of the content I gave them, I would say they are grade B: only mostly correct, they're still inventing things I didn't say and missing things I did say.
That's not to say humans don't make mistakes; I still consider this objectively impressive (that it can reach even this level was sci-fi when I was a kid). But why waste time on a grade-B podcast when the AAA tier costs you, as a consumer, a 30-second advert?
There's an ethical/moral-luck dilemma at the heart of this.
If a AAA-tier podcast on the subject you want to listen to a podcast about exists (and you know about it), then that's probably a better (and obvious) choice for your listening time.
However, if you want to listen to someone discuss or explain something and you don't know about a AAA-tier podcast, it's possible that a generated podcast is better than nothing.
On the other hand, it's also possible that the generated podcast will miss or hallucinate a key detail, and herein lies the dilemma. Is it better to listen to something that might get something wrong, or not to listen and perhaps someday to learn about the subject through some other form that is less likely to include mistakes?
Thanks to Photoshop and the like, many people have a false idea of what a normal body looks like.
If AI texts have a similar effect then AI = bad is quite correct.
Interesting, are there any podcasts in particular that you recommend? Everything I’ve heard from it just seems like the most banal, cookie cutter stereotype of a podcast with nothing but extremely surface level summarization of a given article, peppered with random cliches and fake sounding reactions “Wow! ok, so let’s hear more about that. I’m intrigued!” “OK, let’s dive deep.” Etc.
DeepDive AI - I'm addicted to it.
That does not appear to be an AI generated podcast.
There’s a normal, human generated podcast called Deep Dive: AI (https://deepdive.opensource.org/). There’s also a confusingly similar named podcast Deep Dive AI that appears to only have one episode and is NotebookLM generated. Which one are you referring to?
Edit: If I'm understanding it correctly after some googling, supposedly the "name" of all podcasts generated by NotebookLM is "Deep Dive"? That's just confusing.
Okay, I will bite.
It's trained on too many shallow podcasts. Go compare any NotebookLM podcast with an episode of Hardcore History. The latter goes into much more depth (even when you account for it being much longer).
I think OP is presenting a different problem: while this tool gives the possibility of creating good quality podcasts, it also enables spammers to quickly generate tons of garbage episodes just to profit from the advertisement they put in it.
That’s the same point being discussed: who’s to say this podcast is more garbage than the average podcast listened to by users?
No, it's not the same: being able to generate podcasts quickly means that we could get a lot more garbage content. The point here is about quantity.
That has not been my experience. The podcasts are vapid and full of cliches.
Sounds like they learned the source material well
I uploaded 3 graduate-level quantum mechanics textbooks and it generated a New Scientist style podcast for middle schoolers.
Goody, let’s just drive out all the human creators who actually interview real experts and go in depth on a subject, with AI-generated voices and summaries.
If AI ends up destroying humanity, it isn’t going to be through weapons and death robots, but just by entertaining and placating us all to death.
This doesn't strike me as much of a problem as it appears for you. What are the biggest issues you foresee?
I'm an avid podcast listener, but I already ignore 99.9% of podcasts out there. I'm not concerned that this is going to become 99.99%.
If these AI generated podcasts are all bad, I will just continue to ignore them. If some turn out to be good, it seems like a win to me.
If you're worried about an existential "what happens to the world if all media is machine generated", I guess I'm willing to hop on the ride and see what we find out.
99.9? There are roughly 3 million podcasts out there right now. I listen regularly to about 10 over a year (in any given week, maybe 3-4). I'm therefore ignoring 2,999,990, or 99.9997%, of podcasts. I definitely agree with you that this isn't a problem.
(Also - ironically, one of the podcasts out of those 10 that I listen to regularly is the Deep Dive on AI. A NotebookLM production!)
It could poison the well - make it hard for people to find new good podcasts, and reduce discovery and revenue. Also they could fragment our society even more, disconnect people from people. Doesn't seem worth the risk.
If people want to listen to AI-generated podcasts, they can just make them themselves. They don't need to be published on a platform alongside human-made podcasts. If I were Apple, who ultimately controls curation of podcasts, I'd prevent them. After all, Apple Intelligence will soon do as good a job of making your custom podcast if that's what you want.
How are the 99.99% of podcasts that currently are not worth listening to not already poisoning the well? If the current ranking algorithms work, I don't see why they can't work with more podcasts, AI or not.
Maybe the current ones aren't trying to poison the well.
As an exception, consider Infowars. Now imagine someone 10 times smarter, maybe even with no monetary goals.
Imagine if reality wasn't actually real! That would be nuts.
Assuming Google retain the current two voices for the audio overviews, it will rapidly become obvious to most people where the podcast came from. I've seen "creators" on YouTube running NotebookLM-generated audio through (e.g.) ElevenLabs to change the voices, but this invariably degrades the quality.
This is like saying: “Text based LLMs should do more to stop people from publishing the results of what they produce”
NotebookLM seems wonderful for digesting various content in an alternative way. It’s not a “fake podcast” either.
Nobody is saying that the audio output should or should not be published somewhere. That’s a user decision for both publishing and subscribing.
Indexes and discovery on the internet are where you should advocate policing, instead of nitpicking a useful tool.
Counterpoint: Most podcasts were utterly worthless before AI too. The world will do fine losing a few mattress ad vehicles.
Like other data, provenance suddenly matters a lot. From my POV, that's good. Not all data sources are created equal, and this is putting it into stark enough relief it might actually change the landscape. (In case it isn't obvious, I strongly believe most of the Internet was garbage well before LLMs. We just called it "SEO". Still garbage)
> Counterpoint: Most podcasts were utterly worthless before AI too.
Yet more "but humans also".
I generally agree, but when AI generated content is actively trying to avoid being labelled as “AI generated” it kinda gets depressing. Because in the end, it will just make the entire industry “seem” worthless, akin to AI generated pictures.
I’d rather let the end user know whether it was made by humans or not, and let the market decide. If people love listening to such content, let it be. But hiding how it was made feels a bit disingenuous.
It sounds more like saying we should ban email, and all email providers should consider the problem of email spam, which traditional mail didn't have because no one could afford that many envelopes and stamps.
Or like saying we should go back to carts because cars are noisy, and not only that but might collide with pedestrians, and not only that, might even collide with each other.
Instead of containing the tools and curtailing the progress (email and cars), we should probably try to contain and curtail abusers. Very hard to do, I know, but the right thing to do.
Paper mail does have an issue with spam too. Though I guess the differences in price and scale make a quantitative issue into a qualitative one, which I guess was your point?
You're assuming that everybody shares your opinion of cars being progress and/or progress being good - you're assuming too much.
Nearly every Google Image search result has AI images now. Personally, I'm starting to attribute this to Image Search just having been neutered and downgraded a ton over the past few years anyway, so worse content surfaces in general.
But consider it from Google's perspective, and this is why I think they won't care: serving snippets and caches of articles had rights holders attacking them, serving thumbnails of images had rights holders attacking them, serving tiny bits of songs in the background of videos had rights holders attacking them.
Serving AI doesn't. I don't think the current management at Google will care if Google shows fake baby peacocks as long as it can serve them without being bothered by rights holders, the same way a Gemini summary can launder article information.
> it also opens the door for spammers to mass-produce content that isn't meant for human consumption.
What's new? Every novel class of genAI product has brought a tidal wave of slop, spam and/or scams to the medium it generates. If anyone working on a product like this doesn't anticipate it being used to mass produce vapid white-noise "content" on an industrial scale then they haven't been paying attention.
This is definitely not a new issue.
What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it. Hopefully, they are already working on tools or mechanisms to address the problem—ideally before their colleagues at YouTube and Google Search come asking for help to fight NotebookLM-generated spam :)
It's certainly easier for the creators of genAI to build detection tools than for outsiders to do so. AI audio detection is a hard problem - https://www.npr.org/2024/04/05/1241446778/deepfake-audio-det...
> What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it.
What is the impact? Have any of them attracted an audience of any meaningful size? If a month from now there are 1.3 million generated podcasts, what do you anticipate the fallout to be?
The impact of cheap spam, as in the other cases, is to push people towards platforms.
You can even see it in the case of e-mail spam where, while e-mail providers are technically not platforms, spam helped Gmail and Hotmail become so big that they can basically impose their own rules on the protocol.
> If a month from now there are 1.3 million generated podcasts, what do you anticipate the fallout to be?
Is this a rhetorical question? Because the answer for podcast indexing and search services is surely pretty obvious.
Why is it a problem? There's even more material for those services now, and for their customers the value these services can provide is even higher.
Wouldn’t the value be lower if podcasts end up the way product review blogs have? Endless spam that causes people to append “Reddit” to their searches in hopes of finding something human generated.
Exactly this. It's obvious that generative AI content is bad for search and indexing, and I wish more people would learn to extrapolate from the forms of the problem that already exist, that the quality of the generated content is not going to solve the "please, god, find me something a real person actually said about this real thing" problem.
I just don't understand how people can pretend this isn't happening just because they find each new twist of a technology fascinating.
> It's obvious that generative AI content is bad for search and indexing
It's not obvious at all. You see a problem, I see opportunities. The person you responded to mentioned Reddit. Well, the glut of garbage on the internet makes Reddit more valuable.
The "please, god, find me something a real person actually said about this real thing" isn't actually a problem very many people have. Despite all the complaining here, most people are happy with the results they get when they search Google or Bing or Facebook or Reddit.
So like every technology, there are positives and negatives. It isn't clear to me where this all nets out but I'm leaning slightly positive.
And if you don't find this technology fascinating, then I'm not sure what to say.
>Well, the glut of garbage on the internet makes Reddit more valuable.
If given the choice between a dime and a fiver, is the fiver somehow worth more than if I'd offered you the choice between a fiver and a tenner instead?
> Is there a watermark or any other identifiable marker that can be used?
The problem with this is it's not feasible long-term, or even medium-term - as soon as a watermarking system is implemented, a watermark-removal system will be created.
(Happy to be proven wrong)
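To illustrate why (a toy sketch, not any real watermarking scheme): even the simplest embedding, hiding bits in the least-significant bits of PCM samples, doesn't survive ordinary lossy re-encoding, and more robust schemes face the same arms race.

```python
def embed_lsb(samples, bits):
    # Hide one watermark bit in the LSB of each 16-bit sample (toy scheme).
    return [(s & ~1) | b for s, b in zip(samples, bits)]

def extract_lsb(samples, n):
    # Read back the first n LSBs.
    return [s & 1 for s in samples[:n]]

def reencode(samples):
    # Simulate lossy re-encoding by quantizing to multiples of 4.
    return [(s // 4) * 4 for s in samples]

watermark = [1, 0, 1, 1, 0, 1, 0, 0]
audio = list(range(1000, 1008))        # stand-in for real PCM data
marked = embed_lsb(audio, watermark)

assert extract_lsb(marked, 8) == watermark            # survives a clean copy
assert extract_lsb(reencode(marked), 8) != watermark  # gone after re-encoding
```

Production systems embed marks far more robustly than this, but the same detector-versus-remover dynamic applies.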
Who cares? This is a problem for podcast hosters like Spotify, but not for listeners. Listeners can just follow their usual podcasts and never see 99% of the stuff that is out there.
> a Hacker News post highlighted how nearly all Google image results for "baby peacock" are AI-generated
Unfortunately Kagi image search is polluted with the same crap. I'd started to trust it recently but not so sure now.
[Edit] This is true even when you specifically use the 'exclude AI' filter
So what do you propose Google do to prevent this from happening?
The comments' default remedy is tribal: "The only moral content is my content." We sort of used to live in that world under the studio and TV networks system. Most consumers would say, it was not so bad, maybe better even.
Of course, the commenter never says this, living in the world today, where the writing he likes would never be published by the New York Times like it is on Twitter, the TV he likes would never be offered for free like it is on YouTube, and the music he likes would never have been offered for pennies on Spotify. Some meaningful creators will lose from every remedy you could think of, where Google "something somethings" AI. Maybe the root problem is generalizing.
I created a “podcast episode” (???) of my personal blog (not trying to get traffic to it. It’s more of a journal) using NotebookLM. It sounded just as bland and overproduced as a “professional” podcast by NPR like “Planet Money” and “The Indicator”.
Whether that is saying how high quality NotebookLM is or how low quality NPRs podcast are is an exercise for the reader.
The only reason “Stuff you should know” is better is because of the random off topic discussions they go into and that’s not a complaint about SYSK.
Only 1,300? I imagined it would be so many more.
It’s definitely more than that.
The 1,300+ shows are just the ones recently removed from Listen Notes.
Give it a few days, and I’m sure the number will double, quadruple, and continue to grow. :(
I cannot wait for them to feed this content back into the machine.
Podcasts - episodic radio shows hosted on Apple Music and Spotify - haven't been around for very long. Not long enough for kids to be tutored in making podcasts and then become adults with that sentimental hobby, like playing violin or oil painting. If you believe that the "Human Authenticity Badge" is meaningful for podcasts, it's complicated: traditions play the biggest role in the outrage you are trying to spin, not an appeal to slop and spam; of course, there is already a ton of low-quality podcasts, music, and art made by real people for no nefarious purpose whatsoever. Like many of these posts, which are really common on HN, there isn't a sensible remedy suggested besides pointing fingers at some giant corporation and asking them to do something impossible.
If you care a lot about podcast quality, go and make your own podcast service with better discovery. Once you realize the antagonist was collaborative filtering, made possible by non-negative matrix factorization dating from the year 2000, and not AI, you will at least have learned something from the comment, instead of just feeling better. And then, how do you propose to curate by hand, and why would someone choose your curation over the New Yorker's? And maybe those very purists, trying to make everything sentimental, accusing everyone of slop and spam - well, why do so many creators thrive and ignore the New Yorker's opinion about them entirely? Perhaps curation is not only not scalable, but also wrong. Difficult questions for listeners and podcast authors alike.
Geez, I hope there aren't people like you working at Google
Well, which one is it? Are the podcasts low quality or not? If they are, what the hell are you worried about? To be worried about, idk, disinformation from podcasts of all things is absolute silliness. Won't someone think of the... podcast audiences? Fuckin what dude?
I was using this yesterday. I dumped all postmortems for an aspect of our infrastructure into a notebook and could then ask it to pull out common themes. It was remarkably effective. I also generated one of these "audio overviews" (aka podcasts) and it was great.
There was a vast improvement in quality from giving it a prompt when generating the overview. The generic un-prompted overview was for entirely the wrong audience, in our case users of our infrastructure rather than the developers. When instructing it to generate an overview for the SRE team and what they should focus on it was far better.
Was it useful for our in-depth analysis? No. Would I listen to one based on the last 100 postmortems for a new team I joined? Absolutely. As an overview it was ideal, pulling out common themes from a lot of data and getting some of the vibe right too.
I am late to Google's AI party, but... My personal impression (might be wrong) is that the breadth and depth of Google's AI tools is heavily underrated, ranging from NotebookLM to AI Studio. Too good, as far as I have tried.
Google, of course, is the birthplace of "Attention Is All You Need."
More so, they're good at attracting talent. Note the "Attention Is All You Need" paper came out in 2017. Noam Shazeer was so hamstrung under Google that he left to start Character AI. There he was reportedly close to a foundational model that would've crushed benchmarks. Until Google literally bought him back for $2.7B [1]
[1] https://www.wsj.com/tech/ai/noam-shazeer-google-ai-deal-d360...
My product https://reasonote.com allows you to generate podcasts as well, and it's had this feature for a few weeks.
Improvements over NotebookLM:
(1) You can start with just a subject, and you don't need a full document to begin (though you can do that too![1])
(2) The podcast generates much faster
(3) The podcasts are interactive -- you can ask the hosts to change direction mid podcast, and they will do so.
(4) (Coming soon) You'll be able to make a Spotify-style Queue of Podcast topics, which you can add to as you encounter new ideas.
The primary tradeoff is that the voices / personalities are somewhat less engaging than NotebookLM at this time, though this will be dramatically improved over the coming months.
This is all in addition to the core value proposition, which is roughly "AI Generated Duolingo for Any Subject".
It's early days, but I'd love for you all to check it out and give me feedback :)
[1] Documents are currently heavily length-limited but this will be improved shortly
Have you tried doing what they do, where you generate the script, then run it through again and ask it to add the pauses and personality to the script?
That's coming soon, yes! I'm learning how to emotionally prompt the different voice APIs right now. ElevenLabs has some interesting writing on the subject in their prompting guide.
I'm also playing with other voice models -- built an awesome "voice actor simulator" with OpenAI Realtime Voice -- but it's expensive. Considering asking users to pass in an OpenAI API key for advanced voice? Or maybe just passing the per-token cost along to the user in their subscription.
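The two-pass idea discussed above (generate the script, then re-run it to add pauses and personality) can be wired as a simple pipeline. A minimal sketch, with a stub standing in for the real model call; none of these names come from NotebookLM or any actual API:

```python
def two_pass_script(llm, source_text):
    # Pass 1: get the content right.
    draft = llm(
        "Write a two-host podcast script covering the key points of:\n"
        + source_text
    )
    # Pass 2: add delivery -- pauses, interjections, distinct personalities.
    return llm(
        "Rewrite this script with natural pauses and distinct host "
        "personalities. Keep the facts unchanged:\n" + draft
    )

# Stub LLM for demonstration; swap in a wrapper around your model of choice.
echo = lambda prompt: "[model output for: " + prompt[:30] + "...]"
print(two_pass_script(echo, "A short history of the transistor"))
```

Separating "what to say" from "how to say it" also lets you cache pass 1 and regenerate only the delivery when experimenting with voices.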
Sounds awesome. However I couldn't find any pricing on the website.
Thanks for the feedback! Pricing is evolving, but my current plan is here:
https://www.reasonote.com/app/upgrade
Nice, I've only scratched the surface of NotebookLM, mainly dumping in lots of component reference material (datasheets, reference guides, application notes, etc). The text querying works great, but the audio overview wasn't very useful when it stuck to the high level of the content. With some ability to steer the topic, it might be quite useful!
Google Illuminate recently also introduced a customization feature. I use this customization with it:
audience=technical, duration=long, tone=professional & engaging
AI tooling has now made it too easy to find things.
On a web forum I am admin on, a user opened a DM a week ago titled "Google Notebook LM", someone else had shared a generated podcast thing that summarised the view of the forum on a particular subject, and it called out the usernames of someone who had strong opinions.
In response, another user ran with this and asked for a podcast to be generated summarising everything that was said by the user, their political views and all their hot takes.
Erm... uh-oh.
The use of real identity, the use of the same username across multiple sites, now makes it trivial for things like "take this Github username, find what sites the same username exists on, make a narrative of everything they've ever said, find the best and worst of what they've ever said"... which is terrifying.
I've said to the user the same old line we always repeat, "anything placed on the internet is effectively public forever", but only now are the consequences of this really being seen.
The forums I run allow username changes, encourage anonymity as much as possible, but we're at a point where multiple online identities, one for every site, interest, employer, etc... is probably the best way to go.
I notice on HN that there are many accounts that seem to register just to comment on particular stories and nothing more, and the comments are constructive and well thought out, and now I wonder whether some are just ahead of the curve on this — obscuring the totality of their identity from future employers, or anyone else who might use their words against them.
It feels like our lightweight choices in the past will start to have significant consequences in the present or future, and it's only a failure of imagination that is delaying a change in user behaviour.
The ability to do that exists, and was always going to get easier. We used to all use pseudonyms, but that fell out of fashion somewhat; and even then, over time it's inevitable you'll say one or a few things that can deanonymize you. This was always going to happen, and we can only hope it will change the public perception of privacy, which to this point has often been indifference or even annoyance when one brings it up.
> I notice on HN that there are many accounts that seem to register just to comment on particular stories and nothing more, and the comments are constructive and well thought out, and now I wonder whether some are just ahead of the curve on this — obscuring the totality of their identity from future employers, or anyone else who might use their words against them.
Throwaways are very common here for that purpose! On my end I'm becoming more interested in how to safeguard users -- anonymize them -- and also how to make it easy to _generate_ throwaways without opening the door to spam (e.g. generate from a valid account, but then detach it). HN likely gets around this by being niche; I think the somewhat unattractive site design helps there.
And that's just what amateurs can do.
Yeah the light cone of online activity seems to only grow with little diminishing, which seems unnatural and counter to the type of environment we evolved for. GDPR and the right to be forgotten seemed funny in my youth, now I see it as wisdom ahead of its time.
This is awesome! I have actually been using NotebookLM to create daily digests of HN and publish them to YouTube: https://www.youtube.com/@HackerCasts
I'm still getting the tooling right so that the videos get made on a better and more consistent schedule.
Surprised this was not there from the beginning. It can result in much better output. My problem with the default prompt is that it's often just two equally "knowledgeable" hosts bouncing information back and forth. By customizing the prompt, you can create an "explainer" and "listener" dynamic between the hosts that really helps the overall flow of the episode.
Something like this:
The two podcast hosts have very different levels of knowledge on the topic. The first host is the expert on the topic and explains the subject and the details to the second host. The second host has very little existing knowledge about the subject but will react to the information and ask follow up questions.
I think the default prompt attempts something like that -- one host does tend to take a more questioning role, and I've even had a few pods introduce one of them as a guest expert. But it seems like it sometimes loses track of who's who in that dynamic.
Here's an open source version that generates Podcasts:
https://github.com/souzatharsis/podcastfy
Developer's twitter: @souzatharsis
In a sea of similar tools, Google seems to have struck on something semi-viral with NotebookLM. Output can be mediocre, but with the bar for many podcasts being set at "read pages from Wikipedia", that's not bad at all for zero effort.
https://trends.google.com/trends/explore?geo=US&q=NotebookLM...
The 100 baseline on that graph is the highest attention the term has received, and it correlated with a launch and has since decreased.
Google never has problems with the first few million users for consumer-launched tools. They have problems with the first few million in net profit almost 100% of the time, and shut the tool down a few years later.
But I do agree this is a good play for Google, it plays to their strengths.
> The 100 baseline on that graph is the highest attention the term has received
Good point. I couldn't come up with a well-enough known competing tool to compare against.
I really wish it had more voices, notebooklm-guy and notebooklm-girl get tiring
Definitely hear what you are saying but I personally think it is for the best that they are instantly recognizable as NotebookLM podcasters. Especially as this makes the rounds on the internet. If you could manipulate the voices it would just make it more challenging to detect if a "Podcast" is using this tool.
Now that NotebookLM has gone from "small experiment" to "moderate viral success", I expect all kinds of roadmaps are being drawn up to use it to hook users into the broader Google AI ecosystem (e.g. automatically adding images and illustrations via Imagen 3).
I am very very bullish on NotebookLM. There is an awesome notebooklm list here now:
https://github.com/etewiah/awesome-notebooklm
Will be interesting to see what new ideas NotebookLM leads to. I feel this is how custom GPTs should have been launched. OpenAI is on the backfoot here.
Not an improvement for me. I've been instructing NotebookLM for weeks already by including the instructions in the sources. That way I have version control on my prompts and can easily drag them into the sources upload. This requires finding my instructions and copying and pasting. There's also a 500-character limit, which is very small; I have over 2,000 characters in my standard prompts.
I think it's an easy affordance for those users who are just interested in the basic functionality of the product.
However like you I cottoned on early that one could put a "Podcast Production Notes.txt" in each one of my Notebooks that allowed me to really put some horsepower behind the generated audio :D
The length went from 13mins to 37mins (!) with small prompts.
NotebookLM is great to get an overview of a publication. I created a short podcast focusing on HCI publications using NotebookLM https://www.deep-hci.org/
Just posted some ISWC, MobileHCI and UbiComp papers, UIST is up next.
> With over 80,000 organizations already using NotebookLM
Really. "Using"? (as in an email from an org owned domain logged in to notebooklm page?..)
Really wish there was an API so I can upload my content and connect it to my website to make it interactive for my potential clients.
I’ve recently started using NotebookLM and I wish either it was from any other company besides Google or that Google would charge for it.
Google has the attention span and product focus of a crack addled flea. I’m afraid the entire project will be killed.
NotebookLM is a great product. I just started using it this week to ingest artifacts for a new project and get an overview.
I sometimes feel like I'm crazy when I read the comments here. I absolutely cannot listen to these things. They sound like a mixture of satire and late-night TV home shopping to me, all this campy hyperbole, the hyping up of even the most mundane things... Also, content-wise, stuff is dumbed down to the point where I can maybe see some value in this as entertainment, but this is not a learning tool, just like you won't become an astrophysicist by watching PBS Space Time (don't get me wrong, I love Space Time, but purely as entertainment).
I want the HN comment section as a podcast
Your wish is my command ;)
https://news.gipety.com/
Made one for these comments https://notebooklm.google.com/notebook/e3b9d8c5-6243-4ae6-ab... (although I haven't checked the quality since I'm in a public space as of writing)
That could actually be a top quality podcast - well moderated content from thoughtful people, many subject matter experts, with mostly thoughtful discussion.. (I read hn for the comments..) sounds good to me.
It's a shame that folks look at this and think it's awesome but then have the dawning question of "When will Google kill it?"
People building on top of this will likely want to know what the open-source / non-doomed version will be!
Hey, I’m working on exactly that: https://github.com/rmusser01/tldw
It’s a work in progress; I literally just added the first foundation for note-taking next to the RAG chat last night (not working yet), but it’s getting there. The plan is to build out the functionality and then migrate the codebase to an API backend so that people can make their own front-ends for their own needs. It doesn’t have podcast creation or TTS yet; they are planned, but it can currently transcribe and ingest a podcast for RAG chat.
I hope Google never kills it. It is a useful tool. But then whatever Google killed was useful too.
Is there an open-source tool that replicates NotebookLM yet, or has anyone dug a bit into how the prompting is done to generate output in this dialogue format?
Check out this prompt: https://github.com/souzatharsis/podcastfy/blob/6ad5734c3ffb5...
Huh, that gives the LLM an example
And then tells the LLM not to do anything at all like that example. Is this an intentional technique in prompt construction, to avoid the LLM over-indexing on the example or something?
Looks good. I wonder if F5 could replace 11labs?
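The pattern being discussed above can be sketched roughly like this: show the model a sample transcript purely to convey format, then explicitly instruct it not to reuse the sample's content. Everything here is illustrative (the speaker tags, function name, and wording are assumptions, not podcastfy's actual code):

```python
# Hypothetical sketch of the "example, then don't imitate it" prompting
# pattern: the sample transcript teaches structure (two speakers, tags),
# while the guard instruction keeps the model from copying its content.

EXAMPLE_TRANSCRIPT = """\
<Person1>Welcome back! Today we're digging into this paper.</Person1>
<Person2>Right, and the headline result is pretty surprising...</Person2>
"""

def build_podcast_prompt(source_text: str) -> str:
    """Combine source material, a format-only example, and an explicit
    instruction not to imitate the example's content."""
    return (
        "Rewrite the following material as a two-host podcast dialogue.\n\n"
        f"Here is an example of the FORMAT to follow:\n{EXAMPLE_TRANSCRIPT}\n"
        "Important: use the example only for structure and speaker tags. "
        "Do not copy its topics, phrases, or jokes.\n\n"
        f"Source material:\n{source_text}"
    )

prompt = build_podcast_prompt("A short article about pelicans.")
```

The intuition is that few-shot examples pull generations toward their surface content as well as their structure; the negation tries to keep only the structural signal.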
I need different voices, people think the guy is me.
One day too late. ^-^
What did I miss yesterday?
I used it to create an audio version of my "game theory and agent reasoning" series of posts.
... ...
tl;dr, not much.
I realize now that this is actually a clever way to collect training data. If it were any company other than Google, I'd think, "Awesome toy." With them, I am uneasy.
The terms explicitly say that uploaded content is not used for training the model.
What they say isn't necessarily what they end up doing, like many corporations.
https://en.wikipedia.org/wiki/Privacy_concerns_with_Google
This works pretty well. I tried it with this guidance prompt:
Against this article: https://simonwillison.net/2024/Oct/17/video-scraping/
You can listen to the resulting 7m40s MP4 here: https://simonwillison.net/2024/Oct/17/notebooklm-pelicans/
Example snippets:
And:
I haven't had a face to go with these voices, and now they are canonically two pelicans. Thank you for putting this in my head.
What did you mean by pelicans? I Googled it but didn't understand. Is it a term for a type of data expertise?
Pretty sure they just meant one of these: https://en.m.wikipedia.org/wiki/Pelican
For someone from Turkey, "Pelican Journalists" indeed has a political meaning.
https://en.m.wikipedia.org/wiki/Pelikan_(organization)
No way! That's fascinating, thanks.
Simon likes to test LLMs' creativity by asking them questions about pelicans
I did! See also: https://simonwillison.net/search/?q=pelican
Do you think it's possible to make the conversation adversarial or is that against the guardrails?
I'd be very surprised if they had that completely airtight, I bet we'll see some examples of people breaking that soon.
Hah, just found this on Reddit: https://www.reddit.com/r/notebooklm/comments/1g64iyi/holy_sh... - many f-bombs.
That feels/sounds so natural.
It’s amazing that it doesn’t occur to OpenAI and others that the “safety guardrails” really dilute the output. And it’s a display of conservatism. Bizarre to think that in America, AI couldn’t swear.
Until today, where Google allowed the AI to behave as the lord intended.
Why do you think it doesn't occur to them? I'm genuinely curious.
It seems to me these alignment questions are a conscious trade-off between giving users what they ask for and brand safety, knowing that every spicy output will immediately find its way to Twitter/Reddit/etc.
Is there any way to use NotebookLM audio generation with an API?