As far as we know, Google is not giving up any data. The crawler still must store a copy of the text for the index. The only certainty we have is that Google is no longer sharing it.
Here’s the heart of the not-so-obvious problem:
Websites treat the Google crawler like a first-class citizen. Paywalls give Google unpaid junk-free access. Then Google search results direct people to a website that treats humans differently (worse). So Google users are led to sites they cannot access. The heart of the problem is access inequality. Google effectively serves to refer people to sites that are not publicly accessible.
I do not want to see search results I cannot access. Google cache was the equalizer that neutralized that problem. Now that problem is back in our face.
From the article:
“was meant for helping people access pages when way back, you often couldn’t depend on a page loading. These days, things have greatly improved. So, it was decided to retire it.” (emphasis added)
Bullshit! The web gets increasingly enshitified and content is less accessible every day.
For now, you can still build your own cache links even without the button, just by going to “https://webcache.googleusercontent.com/search?q=cache:” plus a website URL, or by typing “cache:” plus a URL into Google Search.
You can also use 12ft.io.
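A minimal sketch of building those links by hand, for anyone who wants to script it. The function names are made up for illustration, the `https://` scheme on the 12ft.io prefix is an assumption (the 12ft.io FAQ only says to prepend `12ft.io/`), and nothing here checks whether either service still answers for a given page.

```python
# Prefix quoted in the article for hand-built Google cache links.
GOOGLE_CACHE_PREFIX = "https://webcache.googleusercontent.com/search?q=cache:"

# Assumption: the FAQ says to prepend "12ft.io/"; the https scheme is a guess.
TWELVE_FT_PREFIX = "https://12ft.io/"


def google_cache_link(url: str) -> str:
    """Return the cache link: the prefix plus the page URL, as the article describes."""
    return GOOGLE_CACHE_PREFIX + url


def google_cache_query(url: str) -> str:
    """Return the text to type into Google Search instead of using the link."""
    return "cache:" + url


def twelve_ft_link(url: str) -> str:
    """Return the page URL with the 12ft.io prefix prepended."""
    return TWELVE_FT_PREFIX + url


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    print(google_cache_link(page))
    print(google_cache_query(page))
    print(twelve_ft_link(page))
```

This is plain string concatenation with no URL encoding, since that is all the article describes. Whether the links resolve depends on Google and 12ft.io keeping those endpoints alive.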
Cached links were great if the website was down or quickly changed, but they also gave some insight over the years about how the “Google Bot” web crawler views the web. … A lot of Google Bot details are shrouded in secrecy to hide from SEO spammers, but you could learn a lot by investigating what cached pages look like.
Okay, so there’s a more plausible theory about the real reason for this move. Google may be trying to increase the secrecy of how its crawler functions.
The pages aren’t necessarily rendered like how you would expect.
More importantly, they don’t render the way authors expect. And that’s a fucking good thing! It’s how caching helps give us some escape from enshittification. From the 12ft.io faq:
“Prepend 12ft.io/ to the URL webpage, and we’ll try our best to remove the popups, ads, and other visual distractions.”
It also circumvents #paywalls. No doubt there must be legal pressure on Google from angry website owners who want to force their content to come with garbage.
The death of cached sites will mean the Internet Archive has a larger burden of archiving and tracking changes on the world’s webpages.
The possibly good news is that Google’s role shrinks a bit. Any Google shrinkage is a good outcome overall. But there is a concerning relationship between archive.org and Cloudflare. I depend heavily on archive.org largely because Cloudflare has broken ~25% of the web. The day #InternetArchive becomes Cloudflared itself, we’re fucked.
We need several non-profits to archive the web in parallel redundancy with archive.org.
Bingo. When I read that part of the article, I felt insulted. People see the web getting increasingly enshitified and less accessible. The increased need for cached pages has justified the existence of 12ft.io.
~40% of my web access is now dependent on archive.org and 12ft.io.
So yes, Google is obviously bullshitting. Clearly there is a real reason for nixing cached pages and Google is concealing that reason.
This is probably an attempt to save money on storage costs.
That’s in fact what the article claims as Google’s reason. But that seems irrational. Google still needs to index websites for the search engine. So the storage is still needed since the data collection is still needed. The only difference (AFAICT) is that Google is simply not sharing that data. Also, there are bigger pots of money in play than piddly storage costs.
You were given plenty of references. You can verify it yourself if you want to get a clue – or continue to spread misinfo to the contrary. You are doing a disservice to your users and the fedi by maintaining patronage to the privacy-abusing corp.
If you truly don’t understand the problems with Cloudflare, why not embrace transparency and inform people who visit your site that CF is used and that CF sees all their traffic despite the padlock? If you are proud of this, why conceal it?
Not exactly. !showerthoughts@lemmy.world was a poor choice, as is:
!showerthoughts@zerobytes.monster ← Cloudflare
!showerthoughts@sh.itjust.works ← Cloudflare
!showerthoughts@lemmy.ca ← Cloudflare
!showerthoughts@lemm.ee ← Cloudflare
!hotshowerthoughts@x69.org ← Cloudflare, and possibly irrelevant
!showerthoughts@lemmy.ml ← not CF, but copious political baggage, abusive moderation & centralized by disproportionate size
They’re all shit & the OP’s own account is limited to creating a new community on #lemmyWorld. !showerthoughts@lemmy.ml would be the lesser of evils but the best move would be to create an acct on a digital rights-respecting instance that allows community creations and then create a showerthoughts community there.
(EDIT) !showerThoughts@fedia.io should address these issues.
Normal users don’t have these issues.
That’s not true. Cloudflare marginalizes both normal users and street-wise users. In particular:
There are likely more oppressed groups beyond that because there is no transparency with Cloudflare.
It’s an abuse of the fediverse and antithetical to #decentralization to use Cloudflare. And ironically your comment comes in response to broken functionality manifesting from links to exclusive venues appearing in an openly public forum.
“Petty” for not supporting the elitist exclusivity that you support? Cloudflare blocks impoverished communities whose ISPs use CGNAT because they cannot afford an IPv4 for everyone. Shame on CF pushers and shame on you for supporting marginalization by giant corps while backing privacy abuse.
And cf also allows you to block and report child porn
That’s been tried. When someone reported CP to Cloudflare, CF demanded the identity of the whistleblower then doxxed them to the offending CF user, who then published the whistleblower’s identity so their users could retaliate. When the CEO (Matthew Prince) was confronted about this, his reply was that the whistleblowers “should have used fake names”. Then this company you support had the nerve to claim to have a privacy pledge: “[A]ny personal information you provide to us is just that: personal and private.”
Also cf is about the only way to make federation affordable and safe. (emphasis mine)
Forcing children to reveal their residential IP addresses to the fedi whereby any interested person (read: child predators) can derive their approximate location – do you really think that’s a good idea for safety?
What are you even thinking? It most certainly is not safe to expose 20%+ of everyone’s traffic to a single corporation.
#digitalExclusion
Shame this is posted on a centralized Cloudflare instance, which causes problems for people using Tor, VPNs, CGNAT, etc:
Isn’t this different because there are specifically truth-in-advertising laws? Not even a natural person is immune to truth-in-advertising laws. So it seems like Tesla is making a desperate move.
In addition to its first amendment argument, Tesla also said that the California DMV is violating its rights to have a jury trial, under the US Constitution’s 7th Amendment and Article I, Section 16 of California’s Constitution, both of which cover rights to trial by a jury.
Yikes. What does a jury of Tesla’s peers look like? Representatives from 12 other giant corporations?
I’ve been saying for years that Invidious needs to support comments. Glad there’s finally a free world option.
I’m not keen on browser extensions though. Is there a manual way? Is it a matter of searching a particular Lemmy instance for the video ID?
Ungoogled Chromium indeed reproduces the issue. But so does the public library’s browser, which likely was Firefox in Windows. So I guess it might be hasty to conclude that it’s browser specific, particularly when other videos on the same instance behave differently in the same browser.
It’s like saying “you’re a bad company. . .but damn do I like your product and will consume it anyway!” it doesn’t make much sense, logically or morally.
Sony is a dispensable broker/manager who no one likely assigns credit to for a work. I didn’t even know who Sony pimped – just had to look it up. The Karate Kid, Spider-man, Pink Floyd… Do you really think that when someone experiences those works, they walk away saying “what a great job Sony did”?
I don’t praise Sony for the quality of the works they market any more than I would credit a movie theater for a great movie that I experience. Roger Waters will create his works whether Sony is involved or not.
You also seem to be implying they have good metrics on black market activity and useful feedback from that. This is likely insignificant compared to rating platforms like Netflix and the copious metrics Netflix collects.
Can you explain further why grabbing an unlicensed work helps Sony? Are you assuming the consumer would recommend the work to others who then go buy it legitimately?
If it becomes a trend to shoplift Sony headphones, the merchant takes a hit and has to decide whether to spend more money on security, or to simply quit selling Sony headphones due to reduced profitability. I don’t see how that helps Sony. I don’t shoplift myself but if I did I would target brands I most object to.
That is how I got around it in the past. For some reason that was not an option where I needed it (perhaps the browser I was using was locked down in some way). In any case, I’m wondering why the variation in behavior. Is this a bug in Invidious?
Why would a browser handle it incorrectly for one video on one invidious instance, but not for most other videos and other instances?
Note that I’ve seen this broken behavior both in my own Chromium installation and in Firefox in Windows at a public library.
My question is what is forcing me to create an email address.
Does the law force me to create an email address (knowing that it would then be unavoidably used to facilitate the sender sharing whatever they want about me with Microsoft)?
It’s important to note that if your email address falls in the hands of a gov or org, they will use it without encryption. They will share willy nilly anything they want with Microsoft (their email provider) in the loop. And if you make a GDPR art.17 request to have your email address erased from their records after they abuse it, they ignore those requests and continue using your email address. So it’s best not to give them an email address to begin with.
(edit) govs and orgs seem to always put my full name in the e-mail headers, sometimes even including my middle name. And they usually greet me by surname. This ensures that Microsoft trivially knows exactly who to associate the content with. IMO it infringes on the data minimisation principle.