Google confirmed indexing issue affecting a large number of sites

Ryuzaki


Source: https://searchengineland.com/google...ssue-affecting-a-large-number-of-sites-386526

"The issue started sometime before 7 am ET today." [...] "There’s an ongoing issue with indexing in Google Search that’s affecting a large number of sites. Sites may experience delayed indexing. We’re working on identifying the root cause. Next update will be within 12 hours."

They're saying this is some brand new issue, but... D O U B T. There's been something going on with indexing for a while now, and my opinion is that it's because they're tinkering with it and introducing bugs.

This is total speculation, but consider the number of people who have been complaining about indexation, how drastically people's indexation dropped during the May Core Update, and how I've watched them drop tons of pages from big sites over the past 4 months (I think that's the time frame)...

I said it elsewhere, but I think they're gearing up to stop indexing AI content, and also to create some kind of quality threshold where they won't index "content that doesn't add value" or content from unproven, new, low-trust, low-brand sites... Again, I'm speculating, but something is definitely going on and we're only hearing part of the story.

What's the point of indexing trillions of pages if you're only going to expose the top 10% in the SERPs anyways? Why spend the resources indexing and ranking them if they'll never get an outbound click from Google? Especially since the web is growing exponentially in size and everyone and their mom wants to create programmatic and AI content sites that offer nothing to users.

That's my thinking and I'm sticking to it. But if we take the story at face value, there's some new "ongoing" bug that only started this morning (contradictory language to some degree).
 
That's a good guess. They already do filter content from being indexed if it doesn't meet their quality guidelines. If they can tell AI content apart from "quality" content, they'd filter that too.
 
I mean, theoretically don't the other 9 pages serve as kind of a test gauntlet to properly vet a given page before it has the chance to see page 1 and volumes of clicks? If they only expose the top 10%, and if they use their current preferred method of links and content volume to choose, then how in the hell is anything new ever going to birth into existence?

I'm not arguing that isn't what they intend to do; indeed, it would follow larger economic trends toward complete monopolization and abuse of said monopoly power. But I just don't see how they serve this up in a valuable, educational manner to prospective new web publishers, whom they have at least historically pretended to care about.

Like, is Mueller going to drop the nice-guy office-hours persona, start going by "M-dog", and completely reinvent himself into a slick suit-wearing C-suite stooge, educating the 10% with enough domain authority to be "in the club" on how to further maintain their positions?

Yes, Bon Appétit, you want more traffic, you say? Continue publishing 4,000-word guides on muffin baking, and make sure to bury your ingredient list at the bottom of the page, below a biography of your great-grandma, approximately 5 AdSense spaces, 2 affiliate offers, and one email capture. Any questions? Huh? Competition? ROFL!

-Google 2022-[the apocalypse]
 
Yeah, I’m being hasty with my words. Let’s say that they surface 10% of good pages. I don’t mean that they’ll just not index the other 90%. Like you said, they need a measuring stick and frame of reference. So say they cut their indexation by half: 10% gets traffic, 40% feeds long tails and AdSense and is the “measuring stick”, and 50% is automated nonsense.

I’d guess that well over 50% of the net is automated garbage. And of the remaining amount, most of it may never get search traffic but still needs to be indexed for the reasons described above.

I’m sure there are trillions of pages that will simply never be seen by a human, have zero value to humans, and have zero value to robots. They don’t even contribute to the algorithm as examples of what not to do (and the models are already trained anyway, which is why single pages rarely get a honeymoon period any more).

It’s going to be a lot cheaper to crawl and dismiss a page (tagging it in an index of URLs) than to enter it into an index of page content and then process the on-page, the off-page, the rendering, and all the domain-wide metric flows. They’re simply not going to do that for quintillions of trash pages (and the problem will be compoundingly worse with every passing year). We already see them dismiss entire sites based on domain-wide metrics and behavior. That’s one of the earlier ways to skip out on processing nonsense.
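To picture what that two-tier split might look like, here's a rough sketch in Python. Everything in it is hypothetical — the `url_index`/`content_index` names, the domain blocklist, the quality threshold, the toy scoring function — it's just the "cheap dismissal first" idea from the paragraph above, not anything Google has published.

```python
# Hypothetical sketch of a two-tier crawl/index decision.
# Nothing here reflects Google's actual pipeline; names and thresholds
# are made up purely to illustrate "dismiss cheaply before processing".

from dataclasses import dataclass


@dataclass
class Page:
    url: str
    domain: str
    html: str


# Tier 1: a lightweight index that only records URLs and a verdict.
url_index: dict[str, str] = {}        # url -> "dismissed" | "accepted"

# Tier 2: the expensive content index (parsing, rendering, link graph, etc.).
content_index: dict[str, dict] = {}   # url -> processed document

# Assumed domain-level blocklist built from domain-wide metrics/behavior.
DISMISSED_DOMAINS = {"spamfarm.example", "autogen.example"}


def cheap_quality_score(page: Page) -> float:
    """Stand-in for a cheap pre-filter (length, boilerplate ratio, etc.)."""
    return min(len(page.html) / 10_000, 1.0)  # toy heuristic, not a real signal


def handle_crawled_page(page: Page, threshold: float = 0.2) -> None:
    # Cheapest check: dismiss the whole domain without touching the page.
    if page.domain in DISMISSED_DOMAINS:
        url_index[page.url] = "dismissed"
        return

    # Next cheapest: tag the URL and stop if it misses a low quality bar.
    if cheap_quality_score(page) < threshold:
        url_index[page.url] = "dismissed"
        return

    # Only now pay for full processing into the content index.
    url_index[page.url] = "accepted"
    content_index[page.url] = {
        "title": page.html[:60],   # placeholder for real parsing/rendering
        "domain": page.domain,
    }
```

The only point of the sketch is the ordering: the domain check and the URL tag are orders of magnitude cheaper than the full content pipeline, so the trash never gets that far.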

I’m strictly talking about dismissing spam and having a low but real quality threshold for new domains. One that is easy for anyone who’s not a lazy spammer to surmount.
 