Saturday, May 22, 2010

It's Time for Better Search Engines (Who said: "..put my boot heel on the throat of BP")

Is there a search engine that will let me ask?

Who said: "..put my boot heel on the throat of BP"

OK, Rand Paul said it. Or let's say the answer looks like a sort of summary, "Rand Paul said it", AND a list of pointers to articles quoting Paul as saying it, and maybe a few quite different entries, such as Rand Paul saying he didn't say it. So what if I could say "show me the most atypical entries first"? That sounds like a very generally useful follow-up question when you get 2 million hits and, as far as you can tell, they all say more or less the same thing. Could a computer program do a reasonable approximation of what a human (with a year to wade through the 2 million hits) could do? My guess is yes; it wouldn't even be a big stretch.
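
To make "most atypical first" a bit more concrete, here is a minimal sketch of one way it might be approximated: treat each hit as a bag of words, build an "average" hit from all of them, and list first the hits that look least like that average. The function names and the sample snippets are made up for illustration, and a real engine would surely need something smarter than raw word counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_atypical_first(snippets):
    """Order snippets so the ones least like the overall 'average' come first."""
    vectors = [Counter(tokenize(s)) for s in snippets]
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)  # sum of all word counts stands in for the average
    scored = [(cosine(vector, centroid), s) for vector, s in zip(vectors, snippets)]
    scored.sort(key=lambda pair: pair[0])  # lowest similarity first
    return [s for _, s in scored]


# Invented snippets standing in for search hits (not real search results).
sample_hits = [
    "Rand Paul said the administration wants to put its boot heel on the throat of BP",
    "Rand Paul criticized talk of a boot heel on the throat of BP",
    "In an interview Rand Paul said boot heel on the throat of BP sounds un-American",
    "Twitter user attributes the phrase directly to Obama",
]

if __name__ == "__main__":
    for hit in most_atypical_first(sample_hits):
        print(hit)
```

In this toy set the tweet-style entry shares the fewest words with the others, so it sorts to the top, which is roughly the behavior I'm asking for.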

I've been skimming so many web pages that I feel like I've seen something somewhere quoting someone in Obama's cabinet actually using a phrase like: "..put my boot heel on the throat of BP". Can I confirm that? Or be very comfortable in saying it didn't happen (or hear who the Cabinet member was, and see if he/she gets fired the next day)? Well, I can find someone directly attributing the phrase to Obama: "I'll put my boot heel on the throat of BP." Barry Obamma

An important question, and perhaps it represents one of those big stories the news media misses: How many people today, next week, next month, next November literally believe or will believe Obama did say that? Are there any pollsters asking that sort of question? My guess, it could easily be something on the order of as many people as think Saddam Hussein was directly behind 9/11 (at least some pollsters paid attention to that).

Relying on existing search engines and their limited abilities, how close could I come to answering this sort of question?
Well, if somebody said it before Rand Paul, and Paul picked it up a couple of days later, wouldn't there be some references to this on the web from before there was any association between the phrase and Rand Paul?

Consider this Google search: "put my boot heel on the throat of BP" -paul

The quotes ("") mean I don't want just any combination of the words "put", "my", "boot".... but want that exact phrase. The "-paul" means nothing containing the word "Paul". So I get 4 hits, all of them, from context, clearly from the Rand Paul interview, except for the twitterer directly attributing it to "Barry Obamma".
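
For anyone who wants to generate this kind of query from a program, here is a small sketch that builds the query string from an exact phrase plus a list of things to exclude, and then turns it into a search URL. The helper names `build_query` and `search_url` are my own inventions; the only assumption about Google is the ordinary `https://www.google.com/search?q=...` address you see in the browser.

```python
from urllib.parse import urlencode


def build_query(phrase, exclude=()):
    """Build a query: an exact phrase in quotes, then each exclusion with a leading minus."""
    parts = ['"%s"' % phrase]
    for term in exclude:
        if " " in term:
            # A multi-word exclusion has to be quoted to be treated as a phrase.
            parts.append('-"%s"' % term)
        else:
            parts.append("-" + term)
    return " ".join(parts)


def search_url(query):
    """URL-encode the query into a browser-style search address."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query("put my boot heel on the throat of BP", ["paul"])
    print(query)
    print(search_url(query))
```

Passing a multi-word exclusion such as "What I don't like from" produces a quoted, minus-prefixed phrase, so the same helper covers narrower searches too.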

OK, but what if the quoted secretary was named Paul ____?_____ ?
I already tried -"rand paul", but that still picked up too many pages in which Rand Paul was just referred to as Paul. Some other approach? OK, when Paul was putting words in the President's mouth, he started with "What I don't like from the president's administration..."

HOW ABOUT: "put my boot heel on the throat of BP" -paul -"What I don't like from"

That cuts the hit count down quite a bit. There are a couple in which "don't" came out as "dont" or "donit", or the quote was cut down so that the whole phrase "What I don't like from" didn't appear. Finally, we are left with the twitterer quoting "Barry Obamma", which I'm inclined to discount.

Suppose I could say "Who said it first"? Computer logic to approximate that could rely on the fact that every internet page in Google's (or another search engine's) vast database will have a date and time of posting. In fact, can't I just tell Google "display in order of posting", which would make the question much easier to answer? NO, apparently not; at least I don't see how. I could probably put a front end on Google, accessing it via its more computer-friendly interface (or API), and voila, a new and useful search engine.
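
Here is a rough sketch of what the core of such a front end might do once it has the hits in hand. It assumes each hit already comes with a posting date, which is exactly the part I don't know how to get out of Google; the `Hit` record, the example.com addresses, and the dates below are all invented placeholders, not real search results.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Hit:
    """One search result; the posting date is assumed to be available."""
    url: str
    posted: datetime
    snippet: str


def in_posting_order(hits):
    """Return the hits oldest first ("display in order of posting")."""
    return sorted(hits, key=lambda hit: hit.posted)


def first_mention(hits, phrase):
    """Return the earliest hit whose snippet contains the phrase, or None."""
    wanted = phrase.lower()
    matching = [hit for hit in hits if wanted in hit.snippet.lower()]
    if not matching:
        return None
    return min(matching, key=lambda hit: hit.posted)


# Invented placeholder data (hypothetical URLs and dates).
sample_hits = [
    Hit("http://example.com/c", datetime(2010, 5, 22, 9, 0), "boot heel on the throat of BP, again"),
    Hit("http://example.com/a", datetime(2010, 5, 20, 14, 30), "put my boot heel on the throat of BP"),
    Hit("http://example.com/b", datetime(2010, 5, 21, 8, 15), "an unrelated page about oil prices"),
]

if __name__ == "__main__":
    earliest = first_mention(sample_hits, "boot heel on the throat of BP")
    print(earliest.url if earliest else "no match")
```

The earliest matching page is only as trustworthy as its posting date, so this answers "who posted it first that the engine knows about", not necessarily "who said it first".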

