Its a bit old, but I just learned it via the retro-dodo article here: https://retrododo.com/google-is-killing-retro-dodo/
ChatGPT4: tl;dr The universe is bigger than we thought.
ChatGPT5: fuck spez
It’s* a bit old
Is it just me or is 60 million a ridiculously small price for that whole dataset?
To be fair it’s a pretty terrible dataset. The AI is just going to say “this” to every question you ask
This.
and “and my axe”
I’m personally curious whether Reddit actually has any ability to protect that database. I don’t remember Reddit TOS, but usually those things give them license to use and copy the data, maybe even to sell it, but not actually the copyright on it. So if someone made a Reddit scraper and copied the comments, wouldn’t only the actual commenter be able to sue?
$60M may be reflecting that, in that it’s more a convenience fee to shield Google against individual Redditors going after them than something that Reddit itself could actually sue over.
Perhaps, but not worth buying if you can’t make a profit or keep it from your competition.
60M is for almost 20 years of data, but once it’s ingested, Google will only want new content. Next year, it’ll be more like 3M if the dataset isn’t poisoned by bots or the AI fad hasn’t collapsed. Reddit will struggle with finances again and users will suffer. At least that’s my prediction.
“the AI fad”
LOL. Do you realize that makes you sound like Boomers talking about the internet in the late ’90s and early ’00s?
It currently looks very much like a bubble. After the dot com bubble, the internet didn’t go away, but most companies died off and all the stupid monetisation went bankrupt.
We may be seeing something similar
I wonder if Google’s unlimited legal budget plays a role. Not a lawyer, so probably way off here…
But, for example, reddit’s success in part depends on Google ingesting their data — reddit shows up in Google searches all the time, which can only happen if Google uses reddit’s content. So reddit telling Google “you can’t use our content” doesn’t work, and they need to say something like, “you can use our content for search results but you can’t consume it as training data.”
This is a pretty straightforward statement/request/demand, but one could imagine Google lawyers maliciously complying and throwing their hands up dramatically, claiming “well we use some amount of AI in our search results, so if we can’t use your content for AI training then we can’t risk using it for search results.” Which would, I imagine, really, really hurt reddit (no Google results would be catastrophic I suspect).
So, perhaps the “low” 60M figure is just Google using their leverage.
Or not. As a random person on the Internet, I can say I’m probably not contributing anything meaningful here…
Considering it’s all full of Nazis and bots, and if you get to filter all of them out you’re left with reposts and low quality memes followed by comments that represent the hostile side of each of us… I’d say anything over $5 is a good deal for spez.
Now, I hope Google uses this data exclusively for detecting inappropriate answers. Can you imagine it giving answers based on the endless threads of “I’m not your mate, bro; I’m not your bro, dude…”?
is there a way to mass delete my old content? the service i used in the past doesn’t seem to have worked. i recently got a reply on a 6-year-old post from someone saying they got there via Google.
My understanding is that the mass delete you did probably worked, but Reddit rolled back your deletions. I heard it happened to a lot of mass deleters after the Lemmy exodus.
Can we still mass edit our previous comments with random stuff, a little bit at a time to avoid detection? Poison the data, yada yada.
Can’t wait to see an AI chatbot in my Google searches that behaves like a typical redditor.
Everything you google is just going to direct you to a “Let Me Google That For You” link
– Hey Google/reddit, what does xxxxxx mean?
– Wtf, why are people so lazy? Google it yourself, it’s only 5 seconds!
– But, but, you are Google, are you not?
– Buahaha, haha!
I deleted my comment history after the API exodus. I’m sure they could dig it up if they wanted but at least they’ll have to click like 3 more buttons if they want to train AI on my nonsense.
Before:
SELECT * FROM `comments` WHERE is_deleted=0;
After:
SELECT * FROM `comments`;
Oh no, my thousands of identical messages!
You sir are a scholar and a gentleman.
I also choose this man’s wife.
This
And my axe