Categories
Main

Firefox Extensions and Greasemonkey


If you use Firefox, and we all hope you use Firefox, then you should know about Extensions. Firefox Extensions let you add new functionality to Firefox: there are extensions for blocking ads, showing the weather, or checking your Gmail account for new mail. There’s even a Google Toolbar. Web developers will love Web Developer, the Swiss-Army-knife-of-an-extension from Chris Pederick.

One of the coolest extensions, with awesome potential, is Greasemonkey. Greasemonkey doesn’t do anything by itself, but when loaded with Greasemonkey scripts, it lets you change how websites look and behave in your browser. If you don’t like the colors of a certain website, you can change them. Or you can add a Delete button to Gmail. There’s even a Greasemonkey script that will show you what books are available at your public library when you’re browsing Amazon.com. Greasemonkey is Burger King for the Internet: have it your way.

Here’s how I used Greasemonkey this week:

My home page is My Yahoo, which I like because I can customize it with weather, stocks, a Foxtrot comic, movie show times, and TV guide listings. However, the TV guide listings always look so busy — so many channels, so much on TV — it’s not well suited for quick glances.

So I set up a Greasemonkey script that highlights my favorite TV shows. If Seinfeld is on, I can easily see it because it’s highlighted in yellow. I love it.
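For the curious, here is a rough sketch of the idea behind that highlighting. A real Greasemonkey script is written in JavaScript and edits the live page; this is only the core matching step, written in Python against a plain string. The function name, the sample listing, and the exact yellow styling are my own assumptions for illustration, not the script I actually run.

```python
import html
import re

# Hypothetical list of favorite shows; the post only names Seinfeld.
FAVORITES = ["Seinfeld"]


def highlight_favorites(listing_text, favorites=FAVORITES):
    """Return HTML for listing_text with favorite titles wrapped in a yellow span."""
    if not favorites:
        return html.escape(listing_text)
    # One case-insensitive pattern that matches any favorite title.
    pattern = re.compile(
        "|".join(re.escape(title) for title in favorites), re.IGNORECASE
    )
    pieces = []
    last_end = 0
    for match in pattern.finditer(listing_text):
        # Text before the match is passed through (escaped) unchanged.
        pieces.append(html.escape(listing_text[last_end:match.start()]))
        # The matched title gets the yellow background.
        pieces.append(
            '<span style="background-color: yellow">'
            + html.escape(match.group(0))
            + "</span>"
        )
        last_end = match.end()
    pieces.append(html.escape(listing_text[last_end:]))
    return "".join(pieces)


# Made-up sample listing, not real My Yahoo markup.
print(highlight_favorites("7:00 Seinfeld | 7:30 Friends"))
```

Adding another show is just a matter of adding its title to the list.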

[Screenshot: TV listings on My Yahoo, before Greasemonkey]

[Screenshot: TV listings on My Yahoo with Greasemonkey]


Diving Deep

The internet is a big place. Search engines like Google and Yahoo are the best tools we have for knowing what’s out there, but even they don’t capture everything.

A little background: Think of a search engine as an automated browser that clicks on every link it can find and saves every page it can find. Together all the saved pages make an “index”. Google claims to have an index of 8 billion pages, meaning it has saved 8 billion pages from the internet. And it re-saves them every week or two. When you search Google for “iPod earphones”, Google looks in its own index for those terms, then lets you know where the original content was. (So with Google, or any search engine, you’re not really searching the internet but searching a “copy” of the internet. For Google that’s an 8-billion-page copy, but still just a subset of the entire internet.)
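To make the “searching a copy” idea concrete, here is a toy sketch of an index in Python. It assumes the crawling has already happened and the saved pages sit in a dictionary; the URLs and page text are made up, and real search engines are of course vastly more sophisticated than this.

```python
import re
from collections import defaultdict

# Made-up stand-ins for pages a crawler has already saved (URL -> page text).
SAVED_PAGES = {
    "http://example.com/ipod": "Replacement earphones for your iPod",
    "http://example.com/radio": "Earphones for a pocket radio",
    "http://example.com/library": "Search the library catalog for books",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose saved copy contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs whose saved copy contains every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


index = build_index(SAVED_PAGES)
print(search(index, "iPod earphones"))  # ['http://example.com/ipod']
```

Notice that the search never touches the live pages. It only consults the index, so anything the crawler never saved simply cannot be found.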

While Google claims to index 8 billion pages, Yahoo claims 20 billion pages. Recent news pieces have asked whether Yahoo’s larger index makes it a better search engine, and they found that Google gives more relevant results slightly more often, despite having fewer pages in its index. The challenge for both companies is to add more pages to their indexes without making their results less relevant. No search engine comes even close to finding everything on the internet.

For instance, take your local library website. At pac.provo.lib.ut.us you can search the Provo library for thousands of books. But type “site:pac.provo.lib.ut.us” into Google (that’s how you see what pages Google has indexed for that “site”) and you won’t find any books — just a couple hundred garbage pages. That means that while Google can help you find the library, it can’t help you find library books (maybe you already noticed).

Another example is the LDS Church’s “Gospel Library”: at library.lds.org you can browse or search hundreds of volumes of Church magazines and books, but when you type “site:library.lds.org” into Google, you get just 39 hits. And those 39 aren’t the least bit useful.

Tons of data is inaccessible to search engines because it’s found on sites like these: real estate listings on MLS websites, legal proceedings on court websites, and job listings on some company websites.

A startup company called Glenbrook Networks is hoping to change this. It is developing a search engine to dive into the “deep web”. I look forward to when Glenbrook or Google will help us find information from these previously unavailable sources. It will mean billions more pages of relevant information available to the world.

In the meantime, websites like the LDS Gospel Library can use “rewrite engines” (for example, Apache’s mod_rewrite) to make themselves more accessible to search engines.
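As a rough illustration of what a rewrite engine does, the Python sketch below maps a plain, link-friendly address onto the dynamic address a site’s search script might expect. The path layout, the view.php script, and the parameter names are all hypothetical, not taken from any real site, and an actual setup would express the same rule in the web server’s own configuration (for Apache, a mod_rewrite rule) rather than in Python.

```python
import re

# Hypothetical rule: a static-looking path is mapped to a dynamic script URL.
# Neither the path layout nor view.php comes from any real site.
REWRITE_RULES = [
    (
        re.compile(r"^/magazines/([a-z]+)/(\d{4})/(\d{2})$"),
        r"/view.php?magazine=\1&year=\2&month=\3",
    ),
]


def rewrite(path, rules=REWRITE_RULES):
    """Return the internal URL for a path, or the path unchanged if no rule matches."""
    for pattern, target in rules:
        if pattern.match(path):
            # Fill the captured pieces of the path into the target template.
            return pattern.sub(target, path)
    return path


print(rewrite("/magazines/digest/1995/05"))
# /view.php?magazine=digest&year=1995&month=05
```

With a rule like this in place, every article can have an ordinary-looking address that other pages link to, which gives a link-following crawler something to click on instead of a search form it cannot fill in.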