Trillions of Questions, No Easy Answers

Posted on 2022-05-21 Edited on 2023-09-03 In CS , NLP

Trillions of Questions, No Easy Answers: A (home) movie about how Google Search works

[TYPING SOUND]

[MUSIC PLAYING]

NARRATOR: This is Louis XIV, also known as Louis the Great, Louis the Grand Monarch and Louis the Sun King.
Famous for supposedly declaring “L’État, c’est moi.”
But in 1685, even the self-declared direct representative of God on Earth had questions he could not answer on his own–questions about the ruling Qing dynasty in China.
How big is it?
How many people live in the capital?
What can they teach us about music? Culture? Astronomy?
So in the spring of that year, Louis sent members of France’s order of royal mathematicians on a voyage that would span three continents and three oceans.
Their task–gather information that would satisfy the King’s curiosity.
It was a journey with numerous hardships and countless setbacks.
But five years, four months, and two days later, Louis’s answers finally arrived.
In the grandest of human traditions, he had become curious, asked a question, and learned a new piece of information, just like billions of people who had come before him and billions who have come since.
People who had access to cave walls, clay tablets, oracles, scrolls, books, the printing press, libraries, semaphore towers, telegraphs, the radio, the television, Betamax tape, and the short-lived French national internet system called Minitel.

[ELECTRONIC VOICE SPEAKING FRENCH]

Which brings us to today.
Fishermen looking up when tomorrow’s tide comes in.
Careful cooks wondering when anchovies expire.
Travelers trying to figure out how to say Chapstick in Turkish.
Friends settling a bet about which team won the ‘92 NBA Eastern Conference Finals.
Job seekers looking to make a move.
And a fourth grader looking at facts about the Qing dynasty for a history paper that’s due tomorrow.
Billions of King Louis asking trillions of questions in hundreds of languages, expecting someone to give them an answer in under one second.
Now, who would sign up for a challenge like that?

BEN GOMES: Interesting setup.
CREW: Yeah.

[MUSIC PLAYING]

NARRATOR: This is Ben Gomes.
BEN GOMES: Well, the correct pronunciation is “Gah-mez.”
NARRATOR: This is Ben Gomes.
BEN GOMES: But I say “Gohms.”
It’s a Portuguese name.
NARRATOR: This is Ben Gomes.
He knows a few things about search.
Uh, that search.
Anyway, he’s kind of a big deal, even though he’d try to convince you otherwise.
Ben worked on Search for more than 20 years.
But that’s not where his story started.
BEN GOMES: So I was born in Dar Es Salaam in Tanzania.
But at a very early age, my parents moved back to India to Bengaluru.
And there was a few books at home from my elder siblings.
And that’s the information I had access to, including I remember one torn encyclopedia that I think my grandfather had given my mom.
So it was really out of date.
In 5th grade, I got two presents–a bike, which my parents thought I’d be very excited about, and a much better encyclopedia.
And I was actually much more excited about the encyclopedia–this is where geeks come from–than the bicycle.
And my parents didn’t know what to do with this.

[MUSIC PLAYING]

When I look back at how we found information, it was so dramatically different from today.
When my mother was growing up, where there was not even access to a good library, you would have just accepted the fact that you didn’t have the information, and that’s the way it was going to be.
When I was growing up, for some kinds of information, there was a decent library.
But you still had to take this bus.
It took about an hour.
You had to look things up in a card catalog.
That took time.
Now today, we measure in fractions of a second the time it takes for you to get information.
I think that reduction in friction is absolutely dramatic, because it can enable people around the world to have equal access to information.
It’s not just that people in some places who have access to the best libraries.
Everybody should have access to the highest quality information.
So that combination of a deep technical problem and I think a fundamental human need to understand the world around us, to know more about the world around us, is the heart of Google Search, and what keeps me coming to work still so excited 20 years later.

So in the early days, I wondered whether the company had the infrastructure to be a real company.
Because when I had come for my interview actually, there was not even a sign indicating that this was Google.
So I was not sure I’d come to the right place.
But halfway up the staircase, there was a small neon sign that said Google.
So that’s when I knew.

[MUSIC PLAYING]

And it generally felt completely chaotic.
And Jeff was there.
Jeff is also brilliant.
JEFF DEAN: Yeah, we were a very small company.
We were maybe about 25 people.
We were all kind of wedged into this second floor area in downtown Palo Alto.
I was in an office with Urs Holzle.
BEN GOMES: Urs was in charge of all of engineering.
And at the time, I don’t think I knew how to pronounce his name.
But he put the three of us named Ben in one office, just so people would walk by and say, hey, Ben.
URS HOLZLE: Yes, we had the Ben Pen.
I think it was pure coincidence, actually.
My first reaction to Google was, I have no idea what Search is, so it’s probably not for me.
But then I was intrigued by the problem.
It was clear that there was some real value there.
Because without really good ranking, all that growth of the web would be wasted if nobody could actually find the things that were there.
BEN GOMES: So one of the core aspects of Search is, how do we rank results and how do we find the most relevant information.
So a lot of people work on that.
You’ll get really good stuff on this from Pandu, actually.
CREW: Pandu?
OK.

[MUSIC PLAYING]

NARRATOR: This is Pandu Nayak–
PANDU NAYAK: Hi, I’m Pandu.
NARRATOR: –head of Search Ranking.
His personal motto–
PANDU NAYAK: No query left behind.
NARRATOR: Before working at Google, Pandu worked at an artificial intelligence lab at NASA.
PANDU NAYAK: Yeah, we built an autonomous system that provided high-level control to a spacecraft called Deep Space 1, really the most exciting thing that has ever happened in my life–in my professional life, I guess.
NARRATOR: After doing that, he wanted a new challenge.
PANDU NAYAK: I oversee the Ranking team.
So ranking is important because if we simply return the million pages that match your search query, that’s not particularly helpful.
And so we need to rank the pages that you might find useful.
Hopefully, these are at the top of the results.
We’re really trying to bring information to the world at large and make it useful so people can improve their day-to-day lives.
And I feel really lucky to have the opportunity to work on this mission.

[MUSIC PLAYING]

NARRATOR: Let’s go back a bit.
Summer, 1999, room 300 and something in the Gates building at Stanford.
And these two guys, Larry and Sergey, who were about to announce something so big it merited matching polo shirts.
LARRY PAGE: OK, maybe we should get started.
So what is our mission?
So how is Google different?
Basically, we want to organize the world’s information and make it universally accessible and useful.
NARRATOR: 20 years later, bigger stage, same deal.

[CHEERING]

SPEAKER 2: And today, our mission feels as relevant as ever.

[MUSIC PLAYING]

NARRATOR: So what does this actually mean?
Here are a few takes.
CATHY EDWARDS: I think if we weigh up the various parts of the mission, to me the most important piece is organizing.
There are hundreds of billions of web pages that are out there.
Our job is to filter through that and to really give you what you are looking for at that moment in time.
NICK FOX: And then the next part is the world’s information.
So information means really anything.
It started out for Google with web pages, but it’s so much more than that.
DAVID BESBRIS: Whether it’s physical books that we need to scan or maps that we build of every place on Earth, that’s information, too.
And it’s not web pages.
It’s the kind of stuff that we organize today.

TULSEE DOSHI: And then I think that word universal is important, because universal means for everyone.
NICK FOX: Whether it’s someone that can’t see, whether it’s someone that can’t hear, people that speak different languages, really make it accessible to as broad a set of people as possible.
DAVID BESBRIS: We might be goofy people who come to work in T-shirts and desperately need haircuts and things like that.
We may not look super serious, but we know how much people rely on this.
We take that mission really, really seriously.
SPEAKER 3: 1.0.
NARRATOR: So it sounds like the mission is pretty important to these folks.
But here’s another important question.
CREW: So how would you explain how Search works?
BEN GOMES: Right.
Yeah, so how does Search work?
TULSEE DOSHI: How Search works?

[MUSIC PLAYING]

PANDU NAYAK: How Search works, in a nutshell.

NARRATOR: This is server rack 3349b.
It lives here in Ballybane, Ireland, along with cows, a golf course, and Kavanagh’s Auto Accident Repair Center.
This is one of the places where Search happens.
Search is a big piece of software that takes the words you type in here and looks for them here, on the worldwide web.
It can do that because first it downloads a copy of the entire web, scans it, and makes a list of all the words and lists of all the pages each word appears on.
It’s like the index of a book, except 10 trillion times longer.
Lasagna appears on 59 million of those pages.
When you search for lasagna, the software puts these pages in order with what it hopes are the most useful at the top and less useful at the bottom.
Most people searching for lasagna want a recipe for lasagna.
COOK: Look at how delicious that looks.
NARRATOR: Some people want nutrition facts for lasagna.
And a few people want to learn about the life and research of Louis C. Lasagna, MD.
They call him the father of modern pharmacology.
The software living on server rack 3349b helps rank those pages, depending on where you live, whether the page was updated recently,

[OVERLAPPING]

how many other pages link to that page, how many times the word lasagna appears on the page, is lasagna in the title, is lasagna bolded, are there pictures of the lasagna?
It does all this in less than one second, billions of times a day every day, mostly for things that are tougher to figure out than lasagna.
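
[EDITOR'S SKETCH] What the narrator describes here (a list of words, each pointing to the pages it appears on, followed by an ordering step) is usually called an inverted index plus a scoring function. The Python below is a minimal sketch of that idea, not Google's implementation: the pages, the function names `build_index` and `rank`, and the three weights are all made up for illustration, and it handles single-word queries only.

```python
from collections import defaultdict

# Toy "web": URL -> (title, body text, inbound link count). All values are invented.
PAGES = {
    "recipes.example/lasagna": ("Classic Lasagna Recipe", "lasagna with layers of pasta and lasagna sauce", 120),
    "health.example/lasagna": ("Lasagna Nutrition Facts", "calories in one serving of lasagna", 40),
    "bio.example/louis-lasagna": ("Louis C. Lasagna, MD", "life and research of a pharmacology pioneer", 15),
    "garden.example/tomatoes": ("Growing Tomatoes", "how to grow tomatoes at home", 60),
}

def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return [w.strip(".,!?").lower() for w in text.split()]

def build_index(pages):
    """Map each word to the set of URLs whose title or body contains it."""
    index = defaultdict(set)
    for url, (title, body, _links) in pages.items():
        for word in tokenize(title) + tokenize(body):
            index[word].add(url)
    return index

def rank(query, pages, index):
    """Order the pages that contain a single-word query by a hand-picked score."""
    word = query.lower()
    scored = []
    for url in index.get(word, set()):
        title, body, links = pages[url]
        score = (
            1.0 * tokenize(body).count(word)      # how often the word appears in the body
            + 3.0 * (word in tokenize(title))     # bonus if the word is in the title
            + 0.01 * links                        # small boost per inbound link
        )
        scored.append((score, url))
    return [url for _score, url in sorted(scored, reverse=True)]

index = build_index(PAGES)
print(rank("lasagna", PAGES, index))
```

With these invented numbers the recipe page comes first, the nutrition page second, and the biography third; the tomato page never appears because the index has no entry linking it to the word.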

BEN GOMES: So behind the scenes of Google Search, there are many kinds of engineers and many different teams that come together to bring you the Search experience you see, teams around the world, in many other countries–Zurich, London, India, Japan, so on.
You have teams that are working on the interface by which we present this information, teams working on the evaluation processes, processes that make sure that the changes that are happening are good changes.
And then there are teams of engineers who work on ranking.
They might examine the kinds of queries where we are not doing well today, and think about, what are the kinds of techniques we could use to enable us to do better in the future?
NARRATOR: Like the team that’s about to enter this meeting.
ELIZABETH TUCKER: Anything we need to know?
CREW: Don’t look at the lens of the camera.
ELIZABETH TUCKER: OK.
All right, let’s do it.
NARRATOR: Despite their lack of on-camera experience, they’re working on what could be the biggest change to Search in over a decade.
SUNDEEP TIRUMALAREDDY: Things are getting exposed.
SPEAKER 4: Part of that is building–
NARRATOR: But we’ll get back to them later.
SPEAKER 4: But we will actually see some–[DOOR SLAMS]

BEN GOMES: So Search is a pretty complex product.
It’s a big effort to actually make these things work, to take all of these different pieces of the system, using a lot of mathematics, and then trying to bring them together into something more real, into something that can actually be turned into an algorithm.
NARRATOR: All right, so behind the scenes, people at Google are working on algorithms.

[MUSIC PLAYING]

Let’s dig into that for a minute.
At its most basic, an algorithm is just a set of mathematical instructions that a computer follows, kind of like a recipe.
Just like there are different recipes for different dishes, there are different algorithms for different jobs.
Some make elevators go up and down.
Some predict subway delays.
Some help cars park themselves.
The Google Search algorithms exist to return high-quality information based on a user’s query, stuff like all of the text, pictures, videos, and ideas that people have taken the time to put on the open web, stuff they want other people to find and read and watch and look at and learn from.

PRESENTER: Hey, guys!
LUCAS: Lucas here.
ROOFER: In today’s video, I want to show you–
TEACHER: –how to simplify a rational expression.
NARRATOR: We’re talking about the angle of the Leaning Tower of Pisa, how to hit a 7-10 split, whatever this thing is.
This is the information that Google tries to organize and make universally accessible and useful, because this is the kind of information that people are out there looking for.
But you know what they’re not looking for?

VOICE: Act now!
VOICE: We will be with you shortly.
VOICE: A whiter, brighter smile!
VOICE: Hey!

NARRATOR: Spam.
Not the delicious kind, the bad kind.
CATHY EDWARDS: Yeah, so let me just talk about spam for a minute, because spam is one of the biggest problems that we face.
NARRATOR: This is Cathy Edwards, head of User Trust for Search, which basically means she deals with a lot of crap so the rest of us never have to.
CATHY EDWARDS: Broadly, spam is what we consider a low-quality page that is artificially boosted in our results.
NARRATOR: She’s talking about pages that use AI-generated nonsense text, hidden keywords, and hijacked URLs to trick their way into people’s Search results, pages like fastcashonline.org, topicalarticles.info, the kind of websites that, when you end up on them, you hit the Back button as quickly as possible, because they’re [BLEEP].
Because they’re spam.
DAVID BESBRIS: There’s a wide variety of motivations why people do this.
Sometimes it’s commercial interests.
CATHY EDWARDS: Spam, where they’re trying to sell things that are a little bit dubious, right?
Or sometimes it can just be to capture more of the user’s clicks.
And that’s not right.
That site is not getting those links organically.
It dilutes the value of that signal.
It makes it even harder for us and it makes it harder for users to find great information.
DAVID BESBRIS: It’s a very, very hard problem, because people on the other side are very motivated to succeed.
And they’re smart, too.
And they have resources, and they’re working on it also.
We solve one part of it, and they adapt and they do something else.
CATHY EDWARDS: And that’s the reason that we keep Google’s Search algorithm a very closely guarded secret, recipe-for-Coke-level guarded secret.
DAVID BESBRIS: Because if we talk about our Search signals too much, then people will manipulate them.
And that breaks Search entirely.
Fighting spam is a cat and mouse game.
It’s not something that I think will ever be solvable.
CATHY EDWARDS: As an example, 40% of pages that we crawled in the last year in Europe were spam pages.
This is a war that we’re fighting, basically.
NARRATOR: So yeah, people at Google hate spam, which is one of the reasons they’re always making changes to Search, to keep spam out of your results and to keep high-quality information in.

[MUSIC PLAYING]

BEN GOMES: OK, so you’ve got the Search engine and it’s working.
And by all accounts, it’s working better than any other search engine has worked before.
And every day, you see millions of queries.
And clearly users are happy.
But as an engineer, you ask yourself, how can I make this better?
You see many ways in which we are still failing.
And you see a ton of opportunity for us to make it even better.
And over a period of time, the developments we’ve made in the Search Engine have had a dramatic impact on how well it actually works for users.
PANDU NAYAK: No, no, I don’t think we had that particular problem.
Even though we’ve launched a whole series of changes over the years that have, I think, meaningfully and materially improved the Search result sets, I’m here to tell you that Search is far from a solved problem.
In fact–
BEN GOMES: There’s actually no end in sight, in terms of when this will actually be solved.
Because the world keeps evolving.
We’re coming up with new devices.
We’re coming up with new ways of interacting with information.
We’re coming up with new information sources, like videos and so on, that are adding in new opportunities, as well as new challenges.
CATHY EDWARDS: The content on the web has changed.
Users have changed what they’re searching for and how they search.
For example, 15% of the queries–
PANDU NAYAK: 15% of the queries–
BEN GOMES: 15% of queries we see every day–
CATHY EDWARDS: –we have never seen before.
That’s just going to keep happening, and we’re going to need to constantly evolve to keep up.
It’s a little bit like the Red Queen says to Alice in “Alice in Wonderland,” you need to run as fast as you can to stay where you are.

SPEAKER: We’re going to add some friction.
We don’t actually think we have good results.
The idea is to add friction for the worst of the worst results to start with.
CATHY EDWARDS: We change the Search algorithm, on average, six times per day.
It’s actually really frequent.
However, to get to those six launches per day, roughly a couple of thousand launches in a year, we’re doing 200,000 to 300,000 experiments.
So the vast majority of changes that we think about making, that we might try, actually fail.
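
[EDITOR'S SKETCH] The figures quoted here can be checked with a few lines of arithmetic. The only inputs are the numbers from the interview (six launches a day, 200,000 to 300,000 experiments a year); the function name `launch_rate` is just a label for this sketch.

```python
LAUNCHES_PER_DAY = 6
LAUNCHES_PER_YEAR = LAUNCHES_PER_DAY * 365  # 2190, i.e. "a couple of thousand"

def launch_rate(experiments_per_year):
    """Fraction of experiments that end up as launches."""
    return LAUNCHES_PER_YEAR / experiments_per_year

for experiments in (200_000, 300_000):
    print(f"{experiments:,} experiments: {launch_rate(experiments):.1%} reach launch")
```

On those numbers, only around one experiment in a hundred makes it to launch, which is what "the vast majority ... actually fail" amounts to.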
PANDU NAYAK: Imagine you have a smart engineer on the team, and they come to you and say, I’ve got this great idea on how to improve Search.
And you talk to the engineer, and they come back a little while later and say, OK, I’ve got the change, can I launch it?
And you’re like, no, you can’t launch it.
You’ve got to prove that this is actually good.
NARRATOR: Proof comes from data.
Data comes from experiments, side-by-side tests where results from the current version of Google Search are compared to the proposed version.
If the proposed version gives better quality results–AKA, links to better quality websites–then it gets closer to being put into production, which is a fancy way of saying, actually in use by people around the world.
Which brings up a question.
Who decides what makes a better quality website?
RAMI BANNA: Those people that we asked the question of, which is better, A or B, are known as Search Quality Raters.
NICK FOX: The people at Google aren’t deciding what’s a good result from a bad result. The people at Google aren’t determining what results to show for any given query.
But rather, the Raters are basically teaching our computers what’s good and what’s bad.
Is this a high-quality result?
Is this a low-quality result?
CATHY EDWARDS: And they are trained on what are called our Rater Guidelines.
NARRATOR: The Search Quality Evaluator Guidelines are a 168-page document establishing what makes a good Search result good.
We’re talking about websites exhibiting expertise, authoritativeness, and trustworthiness.
These words are given clear, detailed definitions so the thousands of independent evaluators keeping an eye on Search know what they’re looking for.
Want your website to show up higher in Search?
Read the guidelines–seriously.
They’re publicly available, and the more people that read them, the better the web could be, for everybody.
All right, let’s get back to Ben.
BEN GOMES: Making changes to Search is a bit of a balancing act.
There are many different things you’re trying to balance together–quality, freshness, relevance, but we also have to balance the performance.
Some ideas may be really good, but they may result in Search that takes a lot longer.
So we have to be careful that we are not making Search slower in the process of giving you slightly better results.
SUNDEEP TIRUMALAREDDY: In some ways, the key innovation worked.
BEN GOMES: And what about latency?
Does this introduce new latency or–?
SPEAKER: The distilled model’s pretty quick [INAUDIBLE]

SUNDEEP TIRUMALAREDDY: Yeah, I think 10 milliseconds or so.
BEN GOMES: It seems like a reasonable trade-off for this level of win.
JEFF DEAN: From– when we first started really, we were focused on how can we make Search run very fast so we respond more quickly with better results to more people every day, every week.
INTERVIEWER: I did a search a couple days ago, a complicated thing, three-hundredths of a second.
I mean, it seems inconceivable you can do all that that quickly.
RAMI BANNA: We are about finding the world’s information and bringing it to your fingertips the second you ask it–in fact, less than 0.5 seconds.
BEN GOMES: It seems incredibly difficult, and yet that’s an area that works reliably 24 hours a day, 365 days a year, around the world.
But how are you going to look up an index that goes to the moon and back several times in a fraction of a second?
NARRATOR: I don’t know, Ben.
Maybe we should ask the expert.
This guy you saw earlier, this is Urs Holzle.
He manages the technical infrastructure at Google.
This is Urs Holzle in 1999, when he also managed the technical infrastructure at Google.
URS HOLZLE: My first business card actually said Search Engine Mechanic because my job was fixing things that were broken.
And the problem was hard, because really everything was broken, and it was just about fixing the thing that’s most broken.
To people sometimes, the internet seems kind of like it’s nowhere.
I’m using my phone, and then here’s wireless, and I don’t really see anything.
But when it comes to a Search engine, when it comes to a data center, these are really physical, big machines, so to speak.
A data center actually is conceptually very simple.
It’s a building with lots and lots of servers.
And that’s really it.
So in Dublin, we have one of the data center campuses.
It’s actually one of the smaller ones.
PETRA: I think it’s the smallest data center we have.
JAMES: We’re considered the baby data center of the fleet.
We’re–
DANIEL: –quite small.
KEVIN: Quite the snowflake.
PHILLIP: Actually, this is quite big.
For any other company, this is bewildering.
This is just not a thing.
NARRATOR: This is Phillip, Kevin, James, Daniel, Petra, and the crew we hired to film them.
And this is where they and all their coworkers work, the Google Data Center in Dublin, Ireland.
PHILLIP: The scale of what we do here can be kind of crazy.
PETRA: [INAUDIBLE] searches a day goes through those machines.
That’s why they’re very loud and they produce lots of heat.
That means they’re constantly working, constantly answering your queries.
URS HOLZLE: And so how do we really store the web, so to speak?
The way to think about it is, we take the internet, download it, index it, and chop it into small pieces.

And then each server has a small piece.
All of the servers for that data center work together to each search their little part of the internet.
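
[EDITOR'S SKETCH] "Chop it into small pieces" and "each search their little part" describe what is commonly called a sharded index with a fan-out query. The Python below is a rough sketch under stated assumptions: four shards, pages assigned to shards by a CRC32 hash of the URL, and threads standing in for separate servers. None of these choices come from the film.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32

NUM_SHARDS = 4  # hypothetical; stands in for "lots and lots of servers"

def shard_for(url):
    """Pick a shard for a page using a stable hash of its URL."""
    return crc32(url.encode("utf-8")) % NUM_SHARDS

def build_shards(pages):
    """pages: URL -> text. Each shard keeps its own word -> set-of-URLs index."""
    shards = [dict() for _ in range(NUM_SHARDS)]
    for url, text in pages.items():
        local_index = shards[shard_for(url)]
        for word in set(text.lower().split()):
            local_index.setdefault(word, set()).add(url)
    return shards

def search_shard(shard, word):
    """One 'server' looking only at its own piece of the index."""
    return shard.get(word, set())

def search(shards, word):
    """Send the query to every shard in parallel, then merge the partial results."""
    word = word.lower()
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(search_shard, shards, [word] * len(shards))
        return sorted(set().union(*partials))
```

Because every page lives on exactly one shard, merging the partial answers gives the same result as searching one big index, while each shard only has to look through a fraction of the data.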
RAMI BANNA: And it literally takes millions of servers and hard drives to be able to support the world’s websites.
URS HOLZLE: So each of these data centers has a complete copy of the web.
RAMI BANNA: So if you’re in France or if you’re in South Africa, you’re not sending a query that goes through the wires, underwater cables, and comes to Mountain View, asks that question, and we send it back.
That’s just not possible.
That’s never going to work as a solution that’s fast.
URS HOLZLE: How it actually works is, if you go into Google and you type in a search, then we direct your query to the data center that is closest.
And so that’s actually the reason why we have data centers everywhere, because we want to be close to the users that we’re serving.
RAMI BANNA: Because that’s the only way to get you the most accurate response as fast as possible.
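
[EDITOR'S SKETCH] The film only says the query goes to the data center "that is closest". One simple reading of "closest" is great-circle distance, sketched below in Python. This is an assumption: real routing would presumably also consider network paths and load. Apart from Dublin, which the film mentions, the site names and all coordinates are placeholders.

```python
from math import asin, cos, radians, sin, sqrt

# (latitude, longitude) in degrees. Only Dublin is named in the film;
# the other two entries are hypothetical placeholders.
DATA_CENTERS = {
    "dublin": (53.35, -6.26),
    "site-us-west": (37.39, -122.08),
    "site-asia": (1.35, 103.82),
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def closest_data_center(user_location):
    """Return the name of the site with the smallest distance to the user."""
    return min(DATA_CENTERS, key=lambda name: haversine_km(user_location, DATA_CENTERS[name]))

print(closest_data_center((48.86, 2.35)))  # a user in Paris
```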

CREW: So there’s a lot of expensive equipment here, huh?
KEVIN: Yeah.
CREW: How does that all get paid for?
KEVIN: I have absolutely no idea.
I guess it’s from advertising.
JAMES: Ads keeps the lights on and probably puts gas in my car at the end of the day.
CREW: All right, yeah, I think we might have to talk about ads a bit here.
Any last thoughts before we cut?
JAMES: Keep it sweet.
NARRATOR: All right, ads.
Why are there ads?
Two reasons–one, ads keep Search universally accessible, no paywalls, no subscriptions, no “you’ve used your last credit, want to buy a 50-pack,” just search that’s free for everyone.
And two, ads help people who want to buy a thing find people who sell that thing.
Like Bart here–
BART: Hi.
NARRATOR: –and his employees.
ALL: Hi!
NARRATOR: At Carr Hardware in Pittsfield, Massachusetts.
BART: Yep, we sell 38,000 items.
NARRATOR: Like weed whackers, tack hammers, wrenches, and M10 metric castle nuts.
MARIE: I think the only thing we don’t sell is milk and bread.
NARRATOR: Bart buys ads on Google that only get shown when someone near their town–
BART: Pittsfield!
NARRATOR: –searches, for instance, “lawn mower dealers near me.”
And Google only gets paid if the person doing the search, maybe your neighbor or your brother-in-law, clicks on Bart’s ad, which is always labeled “Ad.”
It helps people find mowers to buy, and it helps Bart and the store get business.
BART: Have a nice day.
NARRATOR: And it helps pay for all the stuff that keeps Search and Maps and Docs working and free.
That’s why there are ads.

PANDU NAYAK: Since I’ve been at Google and worked on Search for the last 14 years, I have to say that no one, absolutely no one, comes to me and says, you know, I did this search and the results were great.
Nobody says this.
They only call to complain that they did something and it didn’t work.
NARRATOR: And the name of the man who’s been collecting Google’s dumbest Search mistakes for the last 14 years?

[CHEERING]

Senior Software Engineer Eric Lehman.
CREW: Eric L, take 1, mark it.
ERIC LEHMAN: Over the years, I’ve been gathering some of my favorite bloopers.
I’ll walk you through some of those.
So how far from the coast is Cambridge, Massachusetts?
It’s actually a little over 3,000 miles from the West Coast.

How many calories in 330 tons of butter?
So this caused an overflow error, and we said about minus 2 billion.
Mm-hmm.
What color is green?
That’s a tough one.
Blue?
Sure.
For the search “meat nutrition facts,” we brought up all kinds of detailed information.
I think it’s quite good.
The query’s a little ambiguous because it didn’t say what kind of meat.
And so the system chose roasted muskrat.

[LAUGHS] Yeah.
Avogadro’s number is a sort of important constant in chemistry.
It’s also, apparently, the name of a restaurant.
And so we’ve given a lot of chemistry students their phone number.
Is that what you were shooting for?
CREW: Yes, yes, that’s perfect.
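
[EDITOR'S SKETCH] The butter blooper above is a nice illustration of integer overflow. The film gives only the symptom ("about minus 2 billion"); the Python below shows one way such a number could arise, assuming metric tons, roughly 717 kcal per 100 g of butter, and a 32-bit signed integer holding the result. All three are assumptions, not details from the film.

```python
def to_int32(n):
    """Wrap an integer the way a 32-bit signed (two's complement) value would."""
    n &= 0xFFFFFFFF
    return n - 0x1_0000_0000 if n >= 0x8000_0000 else n

KCAL_PER_KG_BUTTER = 7170  # assumption: about 717 kcal per 100 g
KG_PER_TON = 1000          # assumption: metric tons

true_kcal = 330 * KG_PER_TON * KCAL_PER_KG_BUTTER
wrapped_kcal = to_int32(true_kcal)

print(true_kcal)     # 2366100000
print(wrapped_kcal)  # -1928867296
```

The true answer is a little over the largest value a 32-bit signed integer can hold (2,147,483,647), so it wraps around to a large negative number, consistent with "about minus 2 billion".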
NARRATOR: Since you started watching, people have done over 100 million searches, enough results to fill 27 libraries, but none as cool as this one.
This is the Weston Library on Oxford’s campus.
Two buildings down, you’ll find the office of Dr. John-Paul Ghobrial, a professor of Early Modern History.
He specializes in the history of information and archives.
Suffice to say, he’s an expert on this stuff.
JOHN-PAUL GHOBRIAL: It used to be, before, say, the 16th or 17th century, that if you were reading a manuscript copied by someone, perhaps someone you knew, perhaps someone who you didn’t know but they were recommended to you by someone else, you could have a certain trust that the text you are reading was stable, was authoritative, was right.
Printing changes all of this.
Sure, printed word can flow everywhere.
But that worried lots of people.
Because for example, if we don’t know who printed it, well then, what should we think about this information?
If there’s an error in the printed word, then everyone will get it wrong.
So we look now actually at the print revolution, which we used to think about almost in a celebratory way, and we think now that actually the anxieties that people had about print in many ways paralleled the anxieties that people have today about fake news, about origins of information.

NICK FOX: Google Search is an index on what exists.
And so if that content is out there, sometimes we can surface it.
That can present results that are accurate when it comes to the content of the web out there, but not accurate in terms of what the truth actually is.
But that can result in some, what I would consider to be reprehensible or really offensive results.

[MUSIC PLAYING]

BEN GOMES: A few years ago, people were pointing out that, for some queries, like, “did the Holocaust happen,” we were giving people results that had the words and were on the topic, but were from low-quality sites.
And we viewed this as a pretty profound failure.
PANDU NAYAK: This is clearly bad because this is clearly a case of misinformation, because the Holocaust did actually occur.
And so then we wanted to understand why it is that this was happening.
BEN GOMES: So we take a very algorithmic approach.
We did not go in and say, oh, for this query, we’ve got to change the results.
PANDU NAYAK: The fundamental reason for that is, every problem that is reported to us like this is usually the tip of the iceberg.
And it’s usually just a representation of a whole class of problems, in this case problems of misinformation.
And just solving the specific problem that was reported to us does not solve the large iceberg of problems that were not reported to us.
FEDE LEBRON: Part of the reason why we were all in Search is because we want to give good results to users.
We want to make their lives better by giving them good information.
This was contrary to everything that we wanted as employees in Search, in a very egregious sense.
It wasn’t just a misspelling or something like that.
MEG AYCINENA LIPPOW: Every query is going to have some notion of relevance and each one’s going to have some notion of quality.
And we’re constantly trying to trade off which set of results balances those to the best.
SPEAKER: That’s a good question.
MEG AYCINENA LIPPOW: But if you type in the query, “did the Holocaust happen,” higher quality web pages may not really bother to explicitly say that the Holocaust did happen.
They’re talking about the Holocaust and taking for granted the fact that we, as informed citizens, are aware that the Holocaust happened, because we learned about it in school and so on.
And so the only kinds of websites that are actually going to have the combination of terms that seem to closely match a query like that might be ones which in fact say, no, the Holocaust didn’t actually happen, it’s all a big hoax.
Those results are not the high-quality results.
They tend to be lower quality even though they’re more relevant.
And so what was happening on the “did the Holocaust happen” type of queries is that the relevance signals were overpowering the quality signals to a degree that was resulting in low-quality results for users.
PANDU NAYAK: We have long recognized that there’s a certain class of queries, like medical queries, like finance queries, in all of these cases, authoritative sources are incredibly important.
And so we emphasize expertise over relevance in those cases.
So we try to get you results from authoritative sources in a more significant way.
MEG AYCINENA LIPPOW: And by authoritative, we mean that it comes from trustworthy sources, that the sources themselves are reputable, that they are upfront about who they are, where the information has come from, that they themselves are citing sources.
PANDU NAYAK: And so the change we have made in the case of misinformation is to change the ranking function to emphasize authority a lot more, and this has made all the difference.
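
[EDITOR'S SKETCH] The trade-off described in this segment, relevance signals overpowering quality signals until the ranking function is changed to weight authority more, can be illustrated with a two-term weighted score. The Python below is a toy model only: the `Result` fields, the two example sites, their scores, and the weights are invented, and a real ranking function would combine far more signals.

```python
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    relevance: float  # 0..1, how closely the page's words match the query
    authority: float  # 0..1, how trustworthy the source is judged to be

def rank(results, authority_weight):
    """Order results by a weighted blend of relevance and authority."""
    def score(r):
        return (1 - authority_weight) * r.relevance + authority_weight * r.authority
    return [r.url for r in sorted(results, key=score, reverse=True)]

# Invented scores: the denial page matches the query wording closely,
# while the museum page is far more authoritative.
RESULTS = [
    Result("denial-forum.example", relevance=0.95, authority=0.05),
    Result("history-museum.example", relevance=0.70, authority=0.95),
]

print(rank(RESULTS, authority_weight=0.1))  # relevance dominates
print(rank(RESULTS, authority_weight=0.6))  # authority emphasized
```

With the low authority weight the closely matching but low-quality page wins; raising the weight flips the order without touching any individual query, which mirrors the "algorithmic approach" the speakers describe.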
SPEAKER: Actually, not these.
NARRATOR: Misinformation is one of the challenges that comes with helping people find what they’re looking for.
But it’s not the only one.
Launched in 2010, the Autocomplete feature has saved millions of hours in people’s time by guessing what they’re searching for before they finish typing.
But when those guesses have been wrong, it’s led to some pretty disturbing predictions.

REESE PECOT: A few years back, we started hearing from people that sometimes folks were typing things into Autocomplete and they would be shocked by some of the predictions that they were getting.

Autocomplete was designed to help people complete their searches faster.
Instead, we were actually returning them information that they weren’t searching for.
When we provide you with something that’s shocking, that’s not relevant, we’ve really at that point not stood up to our core principles.
PANDU NAYAK: I think I and all the members of the team felt a deep personal responsibility to try and develop the systems to minimize these kinds of occurrences as much as possible.

First, we developed a set of policies that say what kind of predictions that we would not want to offer to users.
REESE PECOT: Things like violent content, sexually explicit content, hate speech.
But we also publish those policies.
That way people can see where we stand, and then that gives us some accountability.
PANDU NAYAK: With these Autocomplete algorithms, we try not to surface predictions that violate the policies.
Now, these algorithms are very good at what they do, but they’re not perfect.
And every so often, we’ll get some predictions that in fact violate them.
REESE PECOT: So you can report if you’ve seen a prediction that violates those policies.
And every day we get flags from our users out there to tell us where we might be seeing problems in the product.
PANDU NAYAK: We use those reports to improve our algorithms to try and see whether we can address the whole class of problems that the report might be just pointing towards.
But one thing that I would like to emphasize is that this in no way prevents users from searching for whatever it is that they want.
They’re absolutely free to do that.
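The policy step described above can be sketched as a filter over candidate predictions. Everything here is hypothetical: the blocked-term set, the candidate list, and the function names are placeholders for illustration, and a real system would rely on far more than a word list.

```python
# Hypothetical placeholder for terms that a published policy would exclude.
BLOCKED_TERMS = {"badword"}


def violates_policy(prediction: str, blocked_terms: set[str]) -> bool:
    """True if any word of the prediction is in the blocked set."""
    return any(word in blocked_terms for word in prediction.lower().split())


def autocomplete(prefix: str, candidates: list[str], blocked_terms: set[str]) -> list[str]:
    """Offer only candidates that extend the prefix and pass the policy check."""
    prefix = prefix.lower()
    return [
        c for c in candidates
        if c.lower().startswith(prefix) and not violates_policy(c, blocked_terms)
    ]


def search(query: str) -> str:
    """Search is not filtered by the autocomplete policy: any typed query runs."""
    return f"results for: {query}"


candidates = [
    "how to bake bread",
    "how to badword someone",
    "how to bathe a dog",
]

predictions = autocomplete("how to ba", candidates, BLOCKED_TERMS)

# The filtered prediction is never suggested, but the user can still type it in full.
manual = search("how to badword someone")
```

The separation between `autocomplete` and `search` mirrors the point made in the film: the policy limits what is predicted, not what can be searched.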
NARRATOR: Think about it this way.
Search is like a door that leads to the web.
With Autocomplete, it’s the kind of door that senses you walking towards it and opens for you.
But if you’re typing a query that violates its policies, the automatic part stops.
The content of the web is still behind the door, but you won’t see any results until you complete the query yourself.

NICK FOX: Search isn’t perfect.
We do make mistakes.
We make more mistakes than we would like to make.
But we need to learn from them.
We need to get better.
And we need to continue to improve to avoid those cases in the future.
Each time that something happens where we become aware of a bad result, we use that as learning.
We use all that feedback to continue to improve it and make sure that Google one day from now, five days from now, 10 days from now, 10 years from now, is continuing to get better.

BEN GOMES: Many people tend to think that Search is really easy.
You type in a few words, you get a few documents, and the process feels very easy.
And in many ways, that’s what we want to achieve.
We want Search to be very easy for people.
But behind that is an extremely hard technical problem of actually understanding what people mean when they type in a query, not just matching words, but actually understanding language much better over time so that we can match the thing you asked to the concept that you were really looking for in the documents, and we can bring these two things together.
It’s an absolutely fascinating problem to work on, because it lies at the frontiers of what computers and computer science can do and our understanding of basic aspects of how we wish to interact with computers as human beings.

NARRATOR: As long as there have been machines, humans have tried to get those machines to do more.
Of course, for most of history, the machines couldn’t speak human.
So humans had to come up with new ways to tell machines what to do.
Joseph Jacquard used cards with holes punched in them to tell his loom, put the thread here and here and here.
It made weaving complex patterns easier.
Punch cards were a big idea.
They’re how early computers took instruction, did math, solved equations.
NARRATOR 2: Holes punched in the card represent data to be placed in the computer.
NARRATOR: Then computers got screens and keyboards.
But you still couldn’t talk to it like you’d talk to a human.
You had to write it in code.
C colon, slash caret smartdrv dot exe.
Once Search came along, things got a little easier.
You just put in the words you were looking for and Google came back with websites.
But you were still writing in code–”ice cream shop 27705,” when really you meant, “where can I get some ice cream around here?”
BEN GOMES: As we understand language better, you should be able to ask a question in a much more natural way.

[CHEERING]

NARRATOR: What time is tonight’s match on?
Who do I call for a tow truck around here?
Does anyone make a nail polish that’s safe for dogs?
BEN GOMES: So rather than you having to craft keyword-ese that the search engine can understand, we want to be able to understand what you had in mind in the most natural way you can express it so that we can satisfy that information need with information that we have available.
NARRATOR: We call this problem natural language processing.
BEN GOMES: So where are we in the space of solving this problem?

[MUSIC PLAYING]

I think we’ve come a long ways, but the journey’s so long, it’s very hard to see where it ends, right?
I mean, we began to work on this problem 19 years ago with a system that I worked on called Spelling Correction.
We got beyond that to understanding synonyms and how words are related to each other.
But to go deeper, we needed a different approach.
Google has been doing research in something called machine learning for almost a decade.
And Geoff Hinton was at the forefront of that.

[APPLAUSE]

HOST: Please welcome Geoffrey Hinton, the engineering fellow at Google.
NEWSCASTER: When Geoffrey Hinton began work in the 1970s, people said artificial intelligence was the stuff of science fiction.
Today, he is revolutionizing how we live.
BEN GOMES: Geoff Hinton combined forces with Jeff Dean at some point, and we began to see these huge breakthroughs in machine learning.
JEFF DEAN: If you look at the last, say, 8 or 10 years, machine learning has gone from a small part of overall computer science research to something that is now affecting many, many fields of endeavor.
BEN GOMES: And we realized this could pay off in a big way in helping us do search better.
INTERVIEWER: What kind of impact do you hope deep learning has on our future?
GEOFFREY HINTON: I hope that it allows Google to read documents and understand what they say, and so return much better search results to you.

[CHEERING]

NARRATOR: A few years later, a new development in natural language processing was announced.
They called it–
JEFF DEAN: Bi-directional Encoder Representations from Transformers–it’s a bit of a mouthful, so we just call it BERT.
Research like this gets us closer to technology that can truly understand language.
NARRATOR: So BERT’s a big deal for Search.
At least it could be, which brings us back to this team from earlier.

It’s going to be up to them–Elizabeth, Jingcao, Sundeep, Eric, and a few other folks, to figure out how to get BERT working in Search.
They named their project DeepRank after the deep learning methods used by BERT and the ranking aspect of Search.
And also because it sounds cool.
SPEAKER: It’s cool.

[MUSIC PLAYING]

ELIZABETH TUCKER: So I think we’re finally getting going here.
One of the things that we can do today is talk through some of the new evals.
When I first joined the project, I got really, really excited thinking, this system is doing something pretty special that most of our other systems in Search probably can’t do.
JINGCAO HU: We are still at the very early stage of building such system which truly understands human beings.
But this project is very unique in the sense that this is the first time for Search we have a signal which understands the relationship between different terms.
SUNDEEP TIRUMALAREDDY: That’s why we are very excited about DeepRank because we are hoping that this could help us make Google Search more intuitive to use and make it feel like Google Search actually understands our users.
ERIC LEHMAN: –is the most ambiguous wording.
So people use language every day.
We don’t even really think about how we put sentences together.
It’s just a tremendously subtle thing.
Some slight changes of wording can change the meaning of what we’re saying.
And it’s very hard to write a computer program that captures all of that subtlety.
So it’s actually sort of interesting.
Early on in information retrieval, which is the science behind Search, people would tend to just give up on these things.
So like a lot of little connector words, they’d simply ignore them.
They call them stop words.
They’d just throw them out.
I think we’ve learned over time that those words often have an important role in communicating what we’re trying to say, communicating an idea.
And so through machine learning systems like DeepRank, we hope to pick up on these subtleties of language that humans get so naturally but are so difficult to program.
So hopefully people will be able to phrase Search queries in a more natural way for humans and not suffer from this problem that machines don’t get the subtleties.
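A small sketch shows why throwing out stop words can lose meaning. The stop-word list and the two queries below are made up for illustration and are not taken from any real retrieval system.

```python
# A tiny, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"to", "from", "for", "the", "a", "can", "you"}


def keyword_set(query: str) -> frozenset[str]:
    """Lowercase the query, split on whitespace, and drop stop words."""
    return frozenset(w for w in query.lower().split() if w not in STOP_WORDS)


q1 = "flights to boston from denver"
q2 = "flights from boston to denver"

# The two queries ask for opposite trips, yet once the connector words
# are discarded they reduce to the same bag of keywords.
same_keywords = keyword_set(q1) == keyword_set(q2)
```

Here `same_keywords` is `True`: a matcher that ignores "to" and "from" cannot tell the two trips apart, which is the kind of subtlety the speakers hope a learned model can pick up.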
NARRATOR: Eric makes it all sound pretty straightforward.
But actually getting BERT to play nicely with Search, it’s not going to be easy.

[MUSIC PLAYING]

SPEAKER: These all look like the queries where we would expect to see wins from DeepRank, like the longer natural language.
ELIZABETH TUCKER: I would have guessed that–
NARRATOR: The team starts by testing their theories.
Months go by.
Progress is slow.
PANDU NAYAK: And it’s not trying to make a distinction in that rank, so I’m just not that thrilled with this part of it.
With change that is so positive and so powerful, there is a tendency to feel like, oh, we should just get it out there as soon as possible.
And so you have to temper that with some pragmatism.
If this is where your IS win is coming from, that’s not so thrilling.
Let’s put it that way.
NARRATOR: For each result that gets better, others are getting worse.
SPEAKER: Single term queries are also way more negative.
When we don’t know what we’re doing, we’re doing great.
NARRATOR: Each failure requires a new test.
Each test requires rewriting big chunks of code.
They don’t have all the time in the world.
Even just experimenting with a system based on BERT takes thousands of servers, crunching quadrillions of numbers.
ERIC LEHMAN: So DeepRank needs an enormous amount of computing power.
Google has tremendous resources.
But even by Google’s standards, this is a lot.
We have enough TPUs to launch DeepRank, but barely.
NARRATOR: If they don’t show progress soon, the resources will go to some other team with a more promising idea.
PANDU NAYAK: It’ll all hinge on getting a strong quality rank.
Let’s put it that way.
ELIZABETH TUCKER: We can get–
PANDU NAYAK: If we don’t get that, then we’re not getting the resources.
NARRATOR: Time is running out.

ELIZABETH TUCKER: I would say, in general, on many of the examples I see when we have optionalization on both sides, this is actually someplace where DeepRank typically does better.
But if once we mix in the localness–
So we have these high-level measurements that we do to say whether something is good or not.
Because if something’s not good for people searching on Google, we are not going to launch it, period, no matter how great the technology is.
So this was the week where we saw some really nice experimental results.
And that was so reassuring.
I would like us to go through some wins.
So one of my favorites is, what temperature should you preheat your oven to when cooking fish?
I was kind of fascinated with this one.
ERIC LEHMAN: It is a tough query.
Holy cow.
That’s really, really nice.
NARRATOR: Here’s what they’re so excited about.
Without DeepRank, the Google search algorithms were surfacing some good information about cooking fish, but they were also getting confused, showcasing a recipe for baking cookies.
When DeepRank was tested on this query, it understood that the result was about cookies, reducing the prominence of the incorrect recipe, and instead elevating useful, relevant information about cooking fish.
These are the kinds of wins the team will need to see more of if they want their project to launch and start improving search results for billions of people around the world.

ELIZABETH TUCKER: However, before we can launch, we need to get launch approval.
It’s a formal process where any change to Search gets a lot of scrutiny.
Hi, guys.
So I’m feeling a little pressure to like–I don’t know.
TULSEE DOSHI: Yeah, Launch Committee.

[LAUGHS]

So Launch Committee is essentially the final review before you actually choose to launch a project.
ERIC LEHMAN: I mean, I feel like that we’ve seen that pattern.
TULSEE DOSHI: So when you go to Launch Committee, you’re essentially saying, hey, we have a project that we’ve built. We have all this data that we think shows that it’s a good thing.
And now we’re getting approval to actually put it into production.

[MUSIC PLAYING]

ERIC LEHMAN: There’s always a little bit of anxiety, because the outcomes of these meetings are really important to people.
People have put a lot of work into them.
And to have a change rejected is pretty dispiriting.
JINGCAO HU: Before the meeting, I always feel like there are things that I forgot to catch.
So I was going over the launch report again trying to see if there was anything I’m missing.
There are lots of stress, but also hope.
Like OK, no matter what, we will have some reasonable feedback from the launch discussion.
It may be over, or it may be approved and then we can launch it.
Regardless, it’s a big milestone.
NARRATOR: Jingcao has every right to be nervous.
Around here, Launch Committee is known for killing experiments.
Because despite their best intention, despite the months of work that went into them, most experiments never make it out of the building.
CATHY EDWARDS: If you talk to the average engineer, they will have their share of war stories of moments that have been incredibly frustrating for them.
But the flip side of that is, there’s not many products that are more impactful than Google Search.
So when you can ship something that’s really great, it’s really an amazing feeling.

ELIZABETH TUCKER: All right, are we ready?
So we are here to get launch approval for DeepRank.
DAVID BESBRIS: Launch committee is the meeting where we all get together, look at the metrics and argue with each other.

[INTERPOSING VOICES]

PANDU NAYAK: That’s not what this is saying.
This is saying, when site diversity increases, the–
DAVID BESBRIS: Generally speaking, the engineers don’t present their own work.
SPEAKER: So let’s take a look at the logic parametrics.
DAVID BESBRIS: They’re there often for context and to answer questions.
But your work is presented by an analyst, because we want the analyst to be an impartial third party.
Because it can be a little tough.
ELIZABETH TUCKER: There is a slight issue in the way the metrics are calculated.
PANDU NAYAK: It’s important to realize that most of the changes we make in Search are not ones that are 100% good.
There are always wins and losses.
BEN GOMES: There is only one thing that is [INAUDIBLE] positive but is not out of the noise.
PANDU NAYAK: Actually, the one that I think is particularly worth looking at is the long tail asset, right?
BEN GOMES: Yeah, let’s look at that.
PANDU NAYAK: So one of the things that the Launch Committee is doing is to weigh these wins and losses.
BEN GOMES: Wow.
ELIZABETH TUCKER: It’s pretty clear from the wins and losses there are some interesting relationship understandings going on in here.
However–
PANDU NAYAK: DeepRank illustrates some really nice wins we get from understanding language and the nuance of language.
SPEAKER: This is my favorite win.
Can you get medicine for someone pharmacy.
It’s a very beautiful natural language one.
You see–
SPEAKER: Yeah, it’s an important question, right?
Can you pick up medicine for somebody else?
SPEAKER: This is wonderful.
SPEAKER: And DeepRank brings up this very relevant, very specific result.
SPEAKER: You can imagine why this happened.
Because before, all those words like “for” and maybe “you” and “get,” they’re all stop words, largely ignored.
And now, because of BERT, it actually understands that those are very important to [INAUDIBLE].
SPEAKER: Yeah, yeah, but for someone is a really hard concept to get in IR.

PANDU NAYAK: We saw some wins that were really, really beautiful in various ways.
BEN GOMES: So point two is, I think, one of the biggest changes we have seen in a long time.
Because you’re getting to more semantics and all over here when you’re ranking.
PANDU NAYAK: When that’s all you have.
You don’t have other signals, right?
And so this is where it can excel.
BEN GOMES: All right, this seems like a great launch.
Really excited about this.
DAVID BESBRIS: When it’s all done, the coordinator of the Launch Meeting just changes a field in a spreadsheet, changes it from blank to Yes.
It’s a very momentous occasion.
SPEAKER: Approved– we’ll mark this as Search [? Leads ?] Flagged, I’m guessing?

[LAUGHTER]

ERIC LEHMAN: This was a very positive launch meeting.
The decision is to launch DeepRank.
ELIZABETH TUCKER: I thought I wasn’t feeling nervous.
But when the moment came, it felt so good to get that approval.
JINGCAO HU: [LAUGHS]

[SIGHS]

ELIZABETH TUCKER: Thanks, guys.
SPEAKER: Awesome.
SPEAKER: Pretty darn cool.
ERIC LEHMAN: Yeah, so after a launch, you might imagine there’s some great big celebration.
More typically, people stand around the meeting room a little awkwardly for a few minutes, and say, hey, good job.
And then they nervously shuffle back to their desks and try to catch up on life.
And probably that’ll happen here.
Maybe we’ll do something a little bit more in this case.
It was a pretty remarkable project.
ELIZABETH TUCKER: Congratulations.
NARRATOR: In the moment, this approval feels big.
It feels significant.
But in the grand scheme of things, it’s just another step forward, an improvement, just like all the others that came before it, that helps make Search a little bit more useful than it was yesterday.
ELIZABETH TUCKER: We will work on that.

[LAUGHTER]

I think there was a promise there of something.

[MUSIC PLAYING]

PANDU NAYAK: Solving the Search problem is not easy, that’s for sure.
We’ve been at it for 20 years, and I think there’s still a lot to be done.
CATHY EDWARDS: Humans have more access to information than at any other time in history.
And I really feel like it’s our job to make sure that they’re connecting with the highest quality, the most authoritative, the most relevant information for them, and that they’re really able to access the information that makes a difference in their lives.

[NON-ENGLISH SPEECH]

PANDU NAYAK: This is sort of a core value, and we feel deeply responsible to our users to make this happen.
URS HOLZLE: What is Google in 20 years?
It’s very hard to predict the future.
I would never have predicted 20 years ago how Google looks today.
The mission will still be there, making information accessible to people.
And I think the thirst will still be there, that people really want to find the things that they’re looking for.
BEN GOMES: Information really releases things that are in people’s potential.
It enables them to make decisions that they couldn’t make before.
It enables them to know about things that they couldn’t know about before, to know about things in the world, to know about the people around them.
And I hope it also improves their understanding of the world around them as they do that.

[LAUGHTER]

And I believe that our role in Search is to actually help serve that curiosity in people, to help them find that information that they are looking for, that takes them on the next step of their journey of curiosity.

NARRATOR: All kinds of people on all kinds of journeys, curious about the thing holding them back, curious about the thing pushing them forward, people searching for themselves and their families, just like people always have and always will.

BEN GOMES: And while that curiosity lives on in us, I think our job here in Search is never done.