Last night, CCarter linked me to a new Search Engine Journal post entitled "Google's New Algorithm Creates Original Articles From Your Content." I thought it'd make an interesting thought experiment for us to chew on.
The basic idea: Google already takes your content for featured answers and meta descriptions and displays it on their own site, most of the time ignoring your directives. Now they've been working on a new algorithm to do the same thing for entire articles.
The longer version is that the algorithm can read and understand your content, find the most important sentences or paragraphs, and save them as what they call "extractive summaries." It's like the Reddit bots that say "here's the best summary we could make of the link, reducing the text by 90%."
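To make "extractive summary" concrete, here's a rough sketch of the idea in Python. This is not Google's method (the research uses neural models); it's the old-school word-frequency approach, and the stopword list, the function names, and the `keep` parameter are all my own assumptions for illustration. It scores each sentence by how common its words are in the article and keeps the top few, untouched and in their original order.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real linguistic resource).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "was",
             "it", "with", "i", "had", "for", "on", "that", "this"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, keep=2):
    """Return the `keep` highest-scoring sentences, in original order.

    A sentence's score is the average article-wide frequency of its words,
    so sentences about the article's recurring topics rank highest.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indexes by score, keep the best, then restore page order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:keep]
    return [sentences[i] for i in sorted(ranked)]
```

Note that nothing gets rewritten here. The sentences come out exactly as you wrote them, which is the "extractive" part.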
The difference is that they do this across a handful of articles. So now they have the most pertinent content from, say, five articles, and then they apply what they're calling "abstractive summaries," which SEJ calls a form of paraphrasing. But let's be real, it's grammar and syntax spinning to fill in the gaps between the extractive summary sentences.
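Here's a minimal sketch of what pooling across articles might look like before the paraphrasing step. Everything in it is an assumption on my part: the function names, the Jaccard word-overlap check for dropping near-duplicate sentences, and the 0.6 threshold are only illustrative, and the abstractive step itself (a neural model in the research) isn't attempted.

```python
import re


def word_set(sentence):
    """Set of lowercase words in a sentence."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a, b):
    """Word overlap between two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pool_summaries(summaries, threshold=0.6):
    """Merge per-article extracted sentences, skipping near-duplicates.

    `summaries` is a list with one list of sentences per article.
    Returns (article_index, sentence) pairs so each sentence can be
    traced back to the article it was lifted from.
    """
    pooled = []
    seen = []  # word sets of the sentences kept so far
    for article_index, sentences in enumerate(summaries):
        for sentence in sentences:
            words = word_set(sentence)
            # Skip a sentence that mostly repeats one we already kept.
            if any(jaccard(words, prev) >= threshold for prev in seen):
                continue
            seen.append(words)
            pooled.append((article_index, sentence))
    return pooled
```

What comes out is a pile of the strongest non-repeating sentences from several sites, which is the raw material the abstractive step would then reword.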
Admittedly, I haven't read all of this yet, but here's a link to late-2017 research on the topic from a Hong Kong group: Faithful to the Original: Fact Aware Neural Abstractive Summarization. And another, from Google themselves, called Generating Wikipedia by Summarizing Long Sequences.
The question becomes "What will this be used for, or is it purely for research at this point?" My guess is they'd like to stop sending you any traffic from the Hummingbird-related Featured Snippets, where if you ask a question you'll often get an answer right in the Google SERPs. If it rolls out into the SERPs at all, I'd guess that's where it would be. Another guess is that it's going to be used for the home pods and Siri-style voice assistant stuff, with the internet of things coming along.
As always, I'd like to point out the hypocrisy of Google not wanting you to use spintax or fill up a website with nothing but copy & paste syndicated articles, yet here they go doing it themselves. Also, they'll be scraping the living hell out of all of our servers like they do, but we're not supposed to scrape their results. ¯\_(ツ)_/¯ I guess this is why "Don't Be Evil" got removed from their company motto.