Indexing issues anyone?

Hi, it's been a while since I've been around here.

I've seen it on several projects and heard the same from several SEOs out there: for the last few months, lots of websites have been having trouble getting new content indexed by Google.

Big sites with thousands of pages as well as small sites with a few hundred. They all have relatively high DR in their niche and follow all the common recommendations, like ...
- pinging their new articles via GSC
- no technical issues (included in XML sitemaps & HTML sitemaps, well internally linked, meta tags are fine, no blocks in robots.txt, Google can crawl them, and so on)
- posting them on social media to "generate" social signals
... but pages still haven't been indexed by Google for weeks or months.

Do you guys experience similar issues? Has anyone found a solution?
I'm wondering if Google is having problems again and, as usual, isn't talking about them.
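For context, the "pinging" mentioned above is nothing fancier than GSC submissions plus Google's sitemap ping endpoint. A rough sketch of the latter (the sitemap URL is a placeholder; a 200 only means the ping was received, not that anything will be crawled or indexed):

```python
# Minimal sitemap ping sketch - swap in your own sitemap URL.
import urllib.parse
import urllib.request

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder

ping = "https://www.google.com/ping?sitemap=" + urllib.parse.quote(SITEMAP_URL, safe="")
with urllib.request.urlopen(ping) as resp:
    # 200 means Google received the ping, nothing more.
    print(resp.status)
```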
 
Yes, I have similar issues. I have one article that hasn't been indexed since July, and I've tried to force it via GSC a couple of times. Publishing on social media with hashtags didn't help either.
 
relatively high DR
This doesn't really tell us anything, due to the word "relatively". The reason I mention it is because...

99% of the time this question has been asked on here (and it comes up frequently), the answer to the asker is always the same, and the asker's response is always the same.

It's always an issue of a site not having enough backlinks / PageRank / authority to warrant indexing at any fast rate. Low-authority sites can submit sitemaps, but the URLs within those sitemaps may only be crawled once every 6 months, etc.

As the internet explodes in size, Google has to choose what to crawl and what to index. The best way to do that is to determine what people are "voting" for in terms of quality, and that's still done with backlinks - or really a question of sitewide PageRank scores, in my opinion. Google has to allocate its crawl budget somehow, and the best way to do that is to not waste too much time on low-authority stuff.

In the rare chance that that's not actually the issue, a slowdown in crawling and indexing on Google's part is usually indicative of a big ass update on the horizon.
 
I think indexing also has a lot to do with content, intent and competition these days.

If you're basically the same as everyone else - the same content, the same sources, etc. - then you'll probably get indexed, but not in a rush.

If it's a keyword that is lacking content, you will get indexed fast.

If you're doing something unique and fresh, you'll be indexed the same day.

I feel like Google has a much better grasp of how similar content is now.
 
It's been 23 years since Google launched; billions if not trillions of dollars going through their hands has probably resulted in them knowing a thing or two about quality content.

I hate to say this, but the complaints are always from low-authority sites, and I'm not using Ahrefs' DR as the measurement. I'm using Google sending traffic they trust as the measuring stick.

If Google sends you 1,000 to 10,000+ visitors a day, your pages get indexed within hours. SERPWoo's new pages all get indexed within 24 hours. Same with BuSo.

So it's a chicken and egg scenario.
 
Thanks in general, and sorry for not giving the right context here ...

Site #1 - >60 DR - Ecommerce (Shopify) - fashion
80k organic branded + 40k organic non-branded visits per month (just started with SEO)
Inventory of a few hundred URLs & no technical issues
Issue: blog posts are topically relevant, well linked, and crawled, but haven't been indexed for several months

Site #2 - >90 DR - SaaS
>2M organic branded + >5M organic non-branded visits per month
Big website; still a lot to do to improve domain quality & reduce noise (less valuable URLs)
Issue: blog posts are topically relevant, well linked, and crawled, but haven't been indexed for several weeks

I could go on with 4-5 more similar domains from the US and EU.

Unfortunately, this can't be summarised as "low authority sites". These are brands that are well known in their niches and create good, highly relevant content pieces. They have all been in business for ages and have a lot of branded traffic, so trust shouldn't be the issue either.

I know there could be tons of stumbling blocks explaining why a piece of content isn't indexed. I dug deep but haven't found any on-page roadblocks. What makes me really suspicious is that these indexing issues exist in many niches, from large websites to small ones, from self-developed systems to established CMS / shop systems.

That's why I was wondering if anyone else has experienced the same ...
 
Are these sites JavaScript sites, where Google may have to delay indexing because it has to go through rendering first?
 
Yes, both sites rely on JS.
A page load on both sites could be summarized as ...
  • 40-50% of requests are JS
  • 70-80% of downloaded bytes are JS
  • 60-70% of main-thread processing is spent on scripting
On one site the main navigation & internal links work without JS; on the other they don't - which is clearly not ideal. In general, the content / article itself is available without JS on both sites.

I have also already looked into the partial indexing topic raised by the Onely guys, but I can't draw a conclusion for these articles.
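One quick way to sanity-check the "content is available without JS" claim is to test whether a known article phrase appears in the raw, un-rendered HTML - i.e., what a crawler sees before executing any JS. A rough sketch (the URL and phrase are placeholders):

```python
# Check whether key article text survives without JS rendering.
import requests

ARTICLE_URL = "https://example.com/blog/some-post"     # placeholder
KNOWN_PHRASE = "a sentence copied from the live page"  # placeholder

html = requests.get(ARTICLE_URL, timeout=30).text
if KNOWN_PHRASE in html:
    print("Found in raw HTML - the content doesn't depend on JS rendering.")
else:
    print("Missing from raw HTML - Google must render JS to see it.")
```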
 
Concentrating on the blog post issues, is the blog linked to within the header menu and the footer menu?

Are there also external backlinks to the blog posts?

And is there more than one image within each blog post (not just a header image with walls of text)?
 
If Google sends you 1,000 to 10,000+ visitors a day, your pages get indexed within hours. SERPWoo's new pages all get indexed within 24 hours. Same with BuSo.
I think it also depends on how often you publish the posts. I have a site that gets 100k visitors a month, yet it is taking several days to weeks to get new posts indexed.

However, if I force-index through Search Console, it gets indexed within a day.
 
There's a guy on YouTube who clearly explains the "crawled - currently not indexed" and "discovered - currently not indexed" issues. He says you have to wait until Google responds from their side. It can even take two months, but you have to wait. Eventually, Google starts indexing your blog posts one by one, or in bulk, when the time is right.

No force indexing ---- No social signals ---- No Inspect URL ---- Just Patience

So I waited for my blog posts to be indexed in Google. I scheduled blog posts for 2 months (around 60 articles, each around 500+ words). All 60 of those blog posts showed up in GSC as "Discovered - currently not indexed". After a long wait, Google responded very well and started indexing my blog posts one by one. The GSC screenshot below shows the indexing progress as a graph.


[GSC screenshot: indexing progress graph]
 
Speaking of which, I wrote an article based on keyword research I did in Google Search Console and published it yesterday; today it's ranking in the 15th spot.

Which shows me that it's probably about whether Google considers your content sufficiently unique to bother indexing.

If it's in SERPs that lack results, indexing will be fast.
 
@khajam, John Mueller made some statements on Twitter yesterday regarding how sites that are "on the edge of quality" end up being "on the edge of indexing". Here's the context.

Which is what I was getting at above. One of the ways they understand quality is through backlinks. That's not an issue for you.

So I'm wondering if something is going on on-page or on the technical SEO side of things. JavaScript rendering is something I try to stay away from beyond maybe the mobile menu, and even that can be done with CSS.

I'd look at your page resource waterfalls and record a timeline of the page being rendered on an uncached page load, then watch it in slow motion to see if anything obvious is going on. I'm sure you can render it as Googlebot too somehow and see it exactly how they see it as it happens (not just the final product). If you want the final product, you can look at the cache in the SERPs for a bunch of your pages and see if anything catastrophic is going on there.
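As a crude first pass before firing up DevTools, you could also diff what the server returns to a Googlebot user agent versus a normal browser one. A sketch (the URL is a placeholder; note this doesn't execute JS, and hosts that verify Googlebot via reverse DNS may not treat a spoofed UA like the real bot):

```python
# Compare the HTML served to a browser UA vs. a Googlebot UA.
# Large size differences or missing sections hint at UA-based serving issues.
import requests

URL = "https://example.com/blog/some-post"  # placeholder

UAS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, ua in UAS.items():
    r = requests.get(URL, headers={"User-Agent": ua}, timeout=30)
    print(f"{name}: status={r.status_code}, bytes={len(r.content)}")
```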
 
Just to throw it out there, have you verified that the articles are not being indexed?

If you look at Google Search Console reports, they can often say articles are not indexed - even though they are. If you use the site: search command and search for your page in Google, it may appear in the search results.

Of course, that is not always the case and perhaps your articles really are not being indexed. So just mentioning this in case it’s something you haven’t tried already.
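If you have API access to the property, the Search Console URL Inspection API is another way to verify this at scale. A sketch, assuming a service account that has been added as a user on the GSC property (the key file and URLs are placeholders):

```python
# Check a URL's index status via the URL Inspection API.
# Requires: pip install google-auth requests
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder key file
    scopes=SCOPES,
)
session = AuthorizedSession(creds)

resp = session.post(
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect",
    json={
        "inspectionUrl": "https://example.com/blog/some-post",  # placeholder
        "siteUrl": "https://example.com/",                      # placeholder
    },
)
status = resp.json()["inspectionResult"]["indexStatusResult"]
# coverageState is the human-readable state, e.g.
# "Submitted and indexed" or "Discovered - currently not indexed".
print(status["verdict"], "-", status["coverageState"])
```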
 
Honestly, the only solution I found was the Indexing API. My journey was slowed down by multiple months due to non-indexing. I got demotivated and only pushed a couple of posts per month, and I tried literally everything: consistent posting, social media shares, backlinks, etc.

A friend told me about the Indexing API, and now all posts get indexed within 24 hours, magically :D
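For reference, the publish call against Google's Indexing API is tiny. A rough sketch (the key file and URL are placeholders) - and keep in mind the API is officially meant only for job-posting and livestream pages, so pushing normal articles through it is off-label, at your own risk:

```python
# Submit a URL to Google's Indexing API (officially JobPosting /
# BroadcastEvent pages only - normal articles are off-label use).
# Requires: pip install google-auth requests
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession

SCOPES = ["https://www.googleapis.com/auth/indexing"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder key file
    scopes=SCOPES,
)
session = AuthorizedSession(creds)

resp = session.post(
    "https://indexing.googleapis.com/v3/urlNotifications:publish",
    json={
        "url": "https://example.com/blog/some-post",  # placeholder
        "type": "URL_UPDATED",
    },
)
print(resp.status_code, resp.json())
```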
 
Concentrating on the blog post issues, is the blog linked to within the header menu and the footer menu?

Are there also external backlinks to the blog posts?

And is there more than one image within each blog post (not just a header image with walls of text)?

1.) The blog overview page is linked from the header menu and footer, and due to the indexing issue we also link the latest 4 articles directly from the homepage.

2.) Not really - there is definitely potential for expansion here.

3.) Yes, several optimized images are included in each article.

@khajam, John Mueller made some statements on Twitter yesterday regarding how sites that are "on the edge of quality" end up being "on the edge of indexing". Here's the context.

Which is what I was getting at above. One of the ways they understand quality is through backlinks. That's not an issue for you.

So I'm wondering if something is going on on-page or on the technical SEO side of things. JavaScript rendering is something I try to stay away from beyond maybe the mobile menu, and even that can be done with CSS.

I'd look at your page resource waterfalls and record a timeline of the page being rendered on an uncached page load, then watch it in slow motion to see if anything obvious is going on. I'm sure you can render it as Googlebot too somehow and see it exactly how they see it as it happens (not just the final product). If you want the final product, you can look at the cache in the SERPs for a bunch of your pages and see if anything catastrophic is going on there.
In the case of the fashion shop (Shopify, with a self-developed theme built by someone else), the <a href> links are even in the (non-rendered) source code; the page renders visually with JS disabled, but the links & main menu aren't clickable - which is strange, maybe some weird browser behaviour, but it should work.

The SaaS pages work completely without JS; here and there some widgets are missing, but no crucial parts of the main content.

I'll take a closer look at rendering - let's see if anything pops up there.

Just to throw it out there, have you verified that the articles are not being indexed?

If you look at Google Search Console reports, they can often say articles are not indexed - even though they are. If you use the site: search command and search for your page in Google, it may appear in the search results.

Of course, that is not always the case and perhaps your articles really are not being indexed. So just mentioning this in case it’s something you haven’t tried already.

Yep, I've tested it with site: search, and since you can't always rely on that, I've also searched for paragraphs and the main keywords from the articles.

Honestly, the only solution I found was the Indexing API. My journey was slowed down by multiple months due to non-indexing. I got demotivated and only pushed a couple of posts per month, and I tried literally everything: consistent posting, social media shares, backlinks, etc.

A friend told me about the Indexing API, and now all posts get indexed within 24 hours, magically :D
Are you talking about "exploiting" Google's own Indexing API by pushing normal articles through it (since officially only job postings & livestreams are supported atm) ... or are you talking about indexnow.org, which is supported by Bing & Yandex, and which Google is sincerely "looking into"?
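For comparison, an IndexNow submission looks roughly like this - you host your key as a plain-text file at keyLocation first, and the host, key, and URLs below are placeholders. As of this thread, it reaches Bing & Yandex, not Google:

```python
# Submit URLs via the IndexNow protocol (Bing, Yandex, etc. - not Google).
import requests

payload = {
    "host": "example.com",                                       # placeholder
    "key": "your-indexnow-key",                                  # placeholder
    "keyLocation": "https://example.com/your-indexnow-key.txt",  # placeholder
    "urlList": ["https://example.com/blog/some-post"],
}

resp = requests.post("https://api.indexnow.org/indexnow", json=payload, timeout=30)
# 200 / 202 means the submission was accepted.
print(resp.status_code)
```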


Again, thanks to all of you - I really appreciate the high quality of the answers here!
 
Once again: I publish a piece of content for a keyword that lacks content, and it's indexed within hours.

Meanwhile, a post I published on an E-A-T topic has sat unindexed for a month.

So yeah, this seems to me to be about Google prioritizing the top 10-20 pieces of content and not really bothering to speed up indexing for "me too" content, unless you make it unique.
 
I've always been quite patient with Google's indexing speed - partly because I can't even begin to fathom the amount of processing required to discover and accurately rank a new URL in the ocean that is the usable web, and partly because I work mostly with low-to-medium authority affiliate sites, and everything seems to take longer when you don't gOt dA jUiCe of the big publishers.

Recently, however, I've been running into what seems like an increasing volume of "discovered - currently not indexed" GSC statuses for freshly published content.

Niches and site authority levels (number and quality of RDs) vary widely; the issue doesn't seem confined to one "tier" or another. Requesting indexing doesn't help. The URLs are well optimized, interlinked, navigable, and in XML sitemaps.

Interestingly, this seems to be impacting both higher-competition terms and super-long-tail, low-competition terms as well... which seems to contradict Mueller's claim that being on the edge of indexing comes down to whether the content is "worthwhile". These long-tail terms don't have a lot of publishers going after them, so I can't see why they'd be excluded based on quality.

Also, doing some spot-checking across 170+ pieces dripped out consistently going back to October, the frequency with which things are being excluded seems to have increased in December.

Oddly, I also found 3 instances where GSC reported the URL as indexed, but site: operator and quoted content-snippet searches yielded nothing.

Overall, I do think people expect a bit too much from Google when it comes to the actual speed of indexation, especially given the amount of low-quality garbage many affiliate marketers seem intent on continually pushing out.

That said, I haven't seen an indexation rate this low across domains of varying authority for long-tail, low-competition keywords in a long time, which makes me wonder if Googs may actually be struggling with some issues here..
 
@harrytwatter Yes, there's definitely a problem here, not just a conscious decision.

I hear the same from SEO types: random clients can't get indexed, small or big.
 
This feels like either a large amount of water being blocked in a pipe that is eventually going to explode (in which case I'd expect a Google statement), OR this is the new normal: they don't feel beholden to index shit, adding another layer of obfuscation for SEOs to mire in. I'm still hoping it's the former.
 
This feels like either a large amount of water being blocked in a pipe that is eventually going to explode (in which case I'd expect a Google statement), OR this is the new normal: they don't feel beholden to index shit, adding another layer of obfuscation for SEOs to mire in. I'm still hoping it's the former.

Yeah, but this is nearing critical levels imo.

They're opening up a flank for Apple to finally give it a go.

If this really is a technical issue, then it might be the tip of the iceberg of increasing incompetence.
 
Indexing has been a serious problem for some time now, and seems to be getting worse.

I've put a lot of thought into this. From what I can see, Google seems to have applied a score threshold to pages / sites. If your page is below this score - no indexing.

The lower the score, the slower the indexing. The higher the score, the faster the indexing. This was always the case: sites with high PageRank scores get crawled more frequently, and new URLs get indexed quickly.

But something has changed.

Instead of slow indexing for low page scores, there seems to be no indexing below a certain score. I suppose as the internet grows, Google will find it a huge resource drain to index everything, so they need to cut a chunk of the internet out of their index.

Gary Illyes mentions the "quality threshold" for indexing here:


John Mueller describes the indexing issue as "teetering on the edge of indexing" - which I interpret as the edge of the indexing score threshold that Gary mentions. He also says you need to "convince" Google to index your page, and with Google being a counting machine, the "convincing" is done with numbers - a score.


Of course, I'm just speculating, but thought I would share my thoughts since I've been experiencing this issue too.
 
That makes sense, but it doesn't seem to add up fully, as the quality scores Google applies are also based on interaction, right?

Now maybe they have so much confidence in their AI that they apply "estimated interaction scores" to new content?
 