The Core

Why We Are Here => Traffic => Topic started by: Rumbas on October 27, 2015, 02:59:39 PM

Title: RankBrain
Post by: Rumbas on October 27, 2015, 02:59:39 PM
Google Turning Its Lucrative Web Search Over to AI Machines

http://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines

For the past few months, a "very large fraction" of the millions of queries a second that people type into the company's search engine have been interpreted by an artificial intelligence system, nicknamed RankBrain, said Greg Corrado, a senior research scientist with the company, outlining for the first time the emerging role of AI in search.

What's new here? Hasn't it always been machines?
Title: Re: RankBrain
Post by: Rumbas on October 27, 2015, 03:13:46 PM
http://www.thesempost.com/rankbrain-everything-we-know-about-googles-ai-algorithm/

http://www.thesempost.com/googles-rankbrain-9-industry-experts-weigh-in/
Title: Re: RankBrain
Post by: JasonD on October 27, 2015, 03:14:50 PM
It's arguably a big step towards a self-learning algo.

Delivering what users want to see rank. For me the takeaway is user engagement, high quality sites as a prerequisite and lots of links.
Title: Re: RankBrain
Post by: JasonD on October 27, 2015, 03:20:50 PM
However..... I think it's no more than a query modifier at the moment.

EG

If I search for "What is the largest cat statue that ever appeared on photo sharing sites", it will get converted internally to

site:flickr.com big cat pics
Title: Re: RankBrain
Post by: Mackin USA on October 27, 2015, 03:27:54 PM
NOT HERE
Title: Re: RankBrain
Post by: JasonD on October 27, 2015, 03:38:37 PM
Sorry, I should have said that the post above is how I believe it will be interpreted.

I've not actually searched for the phrase :)
Title: Re: RankBrain
Post by: ergophobe on October 27, 2015, 04:06:16 PM
There's a bit of grousing in... ahem... another forum about how Google is "dumbing down" search and equating specific terms with general terms and yadda yadda yadda.

Same reaction as when they announced the Bacon Calculator and everyone was saying it was a stupid waste of resources. Back then I said it was a big deal assuming Google was honest about their method. I was essentially shouted down. Fast forward a few years and the Knowledge Graph has dramatically changed search.

I think we're seeing something similar. Yes, it's still an algo, but it's self-learning (or, actually, periodically retrainable) and it is drawing on a massive and growing linguistic database.

I always try to put it in this perspective: There is no doubt in my mind that in 25 years when we're at the old folks home telling kids about putting in keywords and optimizing for keywords, they are going to be incredulous. It will be like telling them that people used to enter short pulses and long pulses and translate those into letters on the other end and it wasn't to keep things secret, it was actually the most efficient long-distance communication.

Anything that makes a step toward natural language processing and does so on a path that is sustainable and scalable is a BIG deal and will ultimately change your world. The question is always the "sustainable and scalable" part.

The key point, I think, from the first SEM Post article linked above is:

Quote: If we can convert a sentence into a vector that captures the meaning of the sentence, then Google can do much better searches. They can search based on what is being said in a document.

In other words, keywords as such will continue to see their value fall. Basically, you can think of a progression where first Google tackled
- obvious misspellings
- then stemming (like, liked, likes), which still isn't great: like and liking will get you different results
- then synonyms
- then co-occurrence (simple relationships between words)
- to vectors (complex relationships between words)
- to trying to capture higher-order meaning from these complex relationships
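To make the jump from co-occurrence to vectors a bit more concrete, here is a toy sketch. It is emphatically not how Google or RankBrain does it; the corpus, the function names and the sentence-level counting are all assumptions made up for illustration. Each word is represented by a count of the words it shares a sentence with, and two words are compared by the cosine of the angle between those count vectors.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented purely for illustration.
CORPUS = [
    "cat sits on the mat",
    "kitten sits on the rug",
    "stock prices fall on monday",
]


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words that share a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            # Every other word in the same sentence counts as context.
            context = words[:i] + words[i + 1:]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


VECTORS = cooccurrence_vectors(CORPUS)
cat_vs_kitten = cosine(VECTORS["cat"], VECTORS["kitten"])
cat_vs_stock = cosine(VECTORS["cat"], VECTORS["stock"])
```

In this toy corpus "cat" and "kitten" never appear in the same sentence, so a plain co-occurrence check says nothing about them, yet their vectors come out closer to each other than "cat" and "stock" do because they keep the same company.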

I think links already have lost some value. That will accelerate, but that's a way off because

Quote: To understand at human levels, we are probably going to need human level resources, and we've got trillions of connection and the biggest neural net we run so far have at most a few billion connections, so we are a few orders of magnitude off still. But I'm sure the hardware people will help us out.

So for the time being, the link graph is still the best way to under

And orders of magnitude are coming - we're poised to see storage speed and density increase by three orders of magnitude in the next decade.  
- http://www.geek.com/chips/new-intel-storage-is-1000-times-faster-than-your-ssd-1629656/

I don't know if we'll see similar advances in processing power. If engineers can somehow push Moore's Law for another 15 years, that's three orders of magnitude there as well.
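A quick back-of-envelope check of that figure, as a sketch only. The doubling periods are assumptions: Moore's Law is commonly quoted with either an 18-month or a 24-month doubling time, and the function name is made up for this example.

```python
from math import log2


def years_to_grow(factor, doubling_period_years):
    """Years needed to grow by `factor` if capacity doubles every period."""
    return log2(factor) * doubling_period_years


# Three orders of magnitude means a factor of 1000, i.e. about 10 doublings.
years_18mo = years_to_grow(1000, 1.5)  # assumes an 18-month doubling period
years_24mo = years_to_grow(1000, 2.0)  # assumes a 24-month doubling period
```

With an 18-month doubling period the answer comes out just under 15 years; with a 24-month period it is closer to 20.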

Then another decade to refine the software that can use that technology...
Title: Re: RankBrain
Post by: ergophobe on October 27, 2015, 04:07:15 PM
Just read the second article.

See Roger Montti's comments

http://www.thesempost.com/googles-rankbrain-9-industry-experts-weigh-in/
Title: Re: RankBrain
Post by: littleman on October 27, 2015, 04:32:36 PM
>equating specific terms with general terms

As a user I have found this irritating as well.  When I am searching for specific strings I do not like G second guessing my intent and giving me results that it thinks are related -- I've been seeing this even when quotation marks are used.
Title: Re: RankBrain
Post by: JasonD on October 27, 2015, 04:35:38 PM
> As a user I have found this irritating as well.

Agreed but what percentage are we as users who dislike, compared to the larger majority who gain from it?

I think their financial results show that it works, even if only to send more clicks to the Ads.
Title: Re: RankBrain
Post by: ergophobe on October 27, 2015, 05:01:13 PM
Quote from: littleman on October 27, 2015, 04:32:36 PM
>equating specific terms with general terms

As a user I have found this irritating as well.  When I am searching for specific strings I do not like G second guessing my intent and giving me results that it thinks are related -- I've been seeing this even when quotation marks are used.

As a power user often doing obscure searches, this is annoying. My point was, however, that this is a baby step and the end of the journey is ultimately to understand your query more completely and therefore give you better results than you will ever get with keyword searches.

I think we may now be in "the dip" where the keywords aren't as targetable as before, but the new tech isn't refined enough to give better results. But in the long term this approach, I'm convinced, will give much better results.
Title: Re: RankBrain
Post by: BoL on October 27, 2015, 05:03:48 PM
>equating specific terms with general terms

> As a user I have found this irritating as well.

Me also. It certainly needs some refinement, most definitely so for technical queries like programming. There's an overemphasis on matching synonyms when sometimes the ~synonyms (and their stems) are not so relevant to the query.
Title: Re: RankBrain
Post by: littleman on October 27, 2015, 05:14:28 PM
Ergophobe, JasonD
Yeah, you are probably right, but it would be so much more useful if G gave the option of advanced queries.  I bet that there is already an unpublished override out there that the G engineers use.

I am sure they cranked the numbers and are making more money this way.
Title: Re: RankBrain
Post by: Rumbas on October 28, 2015, 08:39:40 AM
>I am sure they cranked the numbers and are making more money this way.

Bingo. I agree.

I also agree that on the very specific long tail queries, it could be a problem, but if it works for the vast majority.. However too much press around not being able to actually dig into really long tail, would eventually give Google a problem?
Title: Re: RankBrain
Post by: ergophobe on October 28, 2015, 03:44:09 PM
Quote: However too much press around not being able to actually dig into really long tail

Frankly, most of the press around Panda and Penguin was web marketers griping about dropping rankings, but for the rank and file Google users, it has made Google much, much more useful.

As a scholar trying to find information via Google, around 2008-2009 the spam was so bad Google was all but unusable for the long-tail queries on obscure topics that I was doing. As an example, I was doing a LOT of searches on obscure place names. There were spammers who had scanned gazetteers and created pages that were mostly random collections of keyphrases that people commonly associate with places -- "Books about Smalltown" "History of Smalltown"

That's now gone. In my world as a scholar, Panda, Penguin and associated updates *returned* the long tail to Google.

So for a tiny, tiny number of us who use linkdomain: + "word" -site: blah blah blah this new update may make it harder to target super specific stuff where we know what we're looking for.

My gut feeling is that for the vast majority of users who have no idea that you can put quotes around a phrase, this will improve long-tail results.
Title: Re: RankBrain
Post by: Adam C on October 28, 2015, 06:01:33 PM
Quote from: ergophobe on October 27, 2015, 04:07:15 PM
Just read the second article.

See Roger Montti's comments

http://www.thesempost.com/googles-rankbrain-9-industry-experts-weigh-in/

Thanks for the prompt.  The most insightful of the lot.