The Google Enterprise Search XML API and Ruby on Rails
October 25th, 2007 by Jeremy Thomas
I’m a Java and C# guy. I’ve grown to be very comfortable with the two languages and associated frameworks (J2EE, ASP.NET), and can write programs that at least compile in each without having to search through the internet for examples. Hence my hesitation to learn Ruby - the language required for me to indulge in Ruby on Rails. But everybody’s doing it, and I don’t want to be left behind.
I decided to build a Ruby on Rails-based enterprise search application leveraging the Google Search Appliance (GSA) XML API, a REST-style interface. My aim was to show that A) Ruby on Rails is an agile framework that leads to quick application development and B) the Search API is extensible, in that it can be leveraged from virtually any programming framework.
Here’s what I did:
I created a controller class called SearchController that manages the API invocation. This is done in an action handler method called on_search. This method invokes the GSA API, then converts the XML response into a REXML::Document object (REXML ships with Ruby’s standard library rather than with Rails itself).

url_escape is a nifty little method I found here that converts a search term like, say, “technology integration” into “technology%20integration”. If you don’t do this, the unescaped space ends up in the request URL and things will break. params[:query] contains the search term the user enters into the search box on my search.rhtml page. Lastly, the @search_results object is stored in the session and is a container for a collection of search results plus some global information about the search (like result_count and search_time). This object is bound back to my search.rhtml page to display the search results.
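The request side described above can be sketched in plain Ruby as follows. This is a minimal illustration, not the original listing: the host gsa.example.com, the helper names build_search_url and fetch_search_response, and the default_collection / default_frontend values are assumptions, and url_escape is implemented here with ERB::Util.url_encode rather than the method linked above.

```ruby
require 'erb'
require 'net/http'
require 'uri'
require 'rexml/document'

# Hypothetical appliance host; replace with the address of your own GSA.
GSA_HOST = 'gsa.example.com'

# Percent-encode the search term so that spaces become %20.
def url_escape(term)
  ERB::Util.url_encode(term)
end

# Build the search URL. The collection (site) and front end (client)
# values are assumed defaults; output=xml_no_dtd asks for raw XML.
def build_search_url(query, host = GSA_HOST)
  "http://#{host}/search?q=#{url_escape(query)}" \
    '&site=default_collection&client=default_frontend&output=xml_no_dtd'
end

# Fetch the XML response and wrap it in a REXML::Document.
# No error handling here, in keeping with the post.
def fetch_search_response(query)
  xml = Net::HTTP.get(URI.parse(build_search_url(query)))
  REXML::Document.new(xml)
end
```

Inside a Rails action such as on_search, the call would look something like fetch_search_response(params[:query]), with the parsed results then assigned to @search_results.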
Next, I had to pull relevant information out of the XML response and add it to user-friendly objects that are then used to display the results on the view:

This method leverages ‘rexml/document’ to parse XML, e.g. search_response.elements["GSP/RES/M"].text. “GSP” is the root node in the GSA XML API response document, and in this example I’m pulling out the result count, appropriately called “M”, inside the “RES” element under the document root. Parsing continues where I iterate through each search result, pull out the relevant information (snippet, url, title), add it to a SearchResult object, then add said object to the results collection in the SearchResults object (which, again, is stored in the session).
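The parsing step described above can be sketched like this. It is an illustration under assumptions rather than the original method: the GSP/RES/M path comes from the post, while the TM (search time), R (result), T (title), U (url) and S (snippet) element names are taken from the GSA XML output format and should be checked against your appliance’s version. The helper text_at and the sample document are hypothetical.

```ruby
require 'rexml/document'

# Plain value objects for the view; the field names follow the post.
SearchResult  = Struct.new(:title, :url, :snippet)
SearchResults = Struct.new(:result_count, :search_time, :results)

# Return the text of the element at path under node, or nil if it is absent.
def text_at(node, path)
  element = node.elements[path]
  element && element.text
end

# Convert a GSA response document into a SearchResults object.
def parse_search_response(search_response)
  parsed = SearchResults.new(
    text_at(search_response, 'GSP/RES/M').to_i, # a missing count becomes 0
    text_at(search_response, 'GSP/TM').to_f,    # a missing time becomes 0.0
    []
  )
  # Each R element is one search result.
  search_response.elements.each('GSP/RES/R') do |r|
    parsed.results << SearchResult.new(text_at(r, 'T'), text_at(r, 'U'), text_at(r, 'S'))
  end
  parsed
end

# Made-up sample response, used only to exercise the parser.
SAMPLE_XML = <<~XML
  <GSP VER="3.2">
    <TM>0.05</TM>
    <RES SN="1" EN="1">
      <M>1</M>
      <R N="1"><U>http://intranet.example.com/a</U><T>Title A</T><S>Snippet A</S></R>
    </RES>
  </GSP>
XML

results = parse_search_response(REXML::Document.new(SAMPLE_XML))
```

Using Struct keeps the result objects small and easy to bind to a view; a missing element simply yields nil (or zero for the counters) instead of raising.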
That’s it. In two quick and easy methods I can invoke the GSA XML API and parse its response into usable objects that I bind to my view. There is a lot that I’m not doing (e.g. good error handling), but I’ve at least proven what I wanted to. I’ve done this before in Java, and it took me less time to do it in Ruby even though I was learning the framework as I went.
I love Ruby on Rails.
A Google Approach to Signals
September 25th, 2007 by Jeremy Thomas
I was searching for a Google Reader notifier the other day when I stumbled across Google Alerts (I ended up using this Firefox extension for the notifier). I read the FAQ and could instantly see the value something like this could add to the “Signals” aspect of the Enterprise 2.0 SLATES meme. Here’s how it works:
- Enter a search term for a topic that you’d like to be notified of (e.g. “Enterprise 2.0”).
- Select how often you’d like to be notified.
- Google will then send you an email with content items pertaining to your topic (videos, blogs, news articles etc.).
Very simple. Very powerful.
Within the enterprise, search is a very under-exploited capability. Why not take advantage of enterprise search and augment the signals capability to do exactly what Google Alerts does - contextual notification? Instead of creating an RSS subscription to the tag “marketing” or to a knowledge worker’s marketing blog, why not let the search engine do the work and determine what is “marketing”? In this way the user does not depend on other users’ tagging ability or on new blog posts on the topic. And the powerful algorithms already trusted to deliver relevant search results will be the same used to keep the knowledge worker up to date on a topic in near real-time.
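To make the idea concrete, here is a rough sketch of a search-driven alert, assuming nothing about how Google Alerts itself is built. The SearchAlert class and its new_items method are hypothetical names; the search argument stands in for whatever call queries the enterprise search engine and returns result URLs.

```ruby
require 'set'

# Hypothetical saved-search alert: remembers which URLs it has already
# reported for a query and returns only the unseen ones on each run.
class SearchAlert
  attr_reader :query

  # search is any callable that takes a query string and returns an
  # array of result URLs (for example, a wrapper around the search API).
  def initialize(query, search)
    @query = query
    @search = search
    @seen = Set.new
  end

  # Run the saved query and return the URLs not reported before.
  def new_items
    fresh = @search.call(@query).uniq.reject { |url| @seen.include?(url) }
    @seen.merge(fresh)
    fresh
  end
end

# Stand-in for the search engine: a list that grows as content is indexed.
index = ['http://wiki.example.com/marketing-plan']
alert = SearchAlert.new('marketing', ->(_query) { index.dup })

first_run = alert.new_items                 # everything is new on the first run
index << 'http://blog.example.com/campaign' # new content gets indexed
second_run = alert.new_items                # only the newly indexed item
```

A scheduler would call new_items at the frequency the user selected and email whatever comes back; the relevance decision stays with the search engine rather than with tags or feed subscriptions.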
Wiki Federation
July 27th, 2007 by Jeremy Thomas
Within my organization we’re working hard to socialize the benefits of socially-oriented collaboration tools and have made great progress with our initiative. But an interesting dilemma has surfaced, and I’ve read about this happening elsewhere too (but I can’t remember where - otherwise I’d link to it). The dilemma revolves around whether an enterprise should focus its energy on promoting a single instance of a collaboration tool (e.g. a wiki), or if it should instead embrace wiki federation. The inherent benefit of having everybody use a single instance is, of course, that all collaboration occurs in one spot. This makes it easier to find content and people since there’s only one place to look. From an IT perspective this approach also makes sense as it consolidates governance of the tool and makes it more manageable.
But the “single instance” approach might be more of a utopian ideal. We often talk about having a bottom up, non-sanctioned approach to Enterprise 2.0 adoption. Bottom up often entails disparate groups creating their own collaboration environments for specific needs, and the result is wiki proliferation. And I’m not sure there’s much corporate IT can do to keep this from happening.
So, pragmatically speaking, it makes sense for the enterprise to embrace wiki federation. This can be accomplished through Enterprise Search. Slap a Google Search Appliance or FAST instance inside the firewall, point it to the federated wikis, and we have discoverability across all collaboration tools. This negates the impetus behind moving the enterprise toward a single collaboration tool instance. Of course the challenge here is to keep the search index up to date with all of the new wikis that pop up. But that’s why we have IT guys.
Search Engine Land
July 11th, 2007 by Jeremy Thomas
I’ve become fairly interested in search technology over the past year or so. Today I was watching a video about how Robert Scoble picks the blogs he wants to read from his feed reader when he mentioned Search Engine Land. I’ve added it to my feed aggregator and have found it to have some really interesting and industry-relevant posts.
Anyway, it’s worth a read.
Is IT Really Clueless?
June 25th, 2007 by Jeremy Thomas
Update: Paula and Jevon make a good point that Search on its own is not Enterprise 2.0.
I find it interesting reading about IT being clueless when it comes to Enterprise 2.0 (like Paula Thornton’s recent post over at the FASTForward blog). I do a lot of work in the Enterprise Search space (Search being the first “S” in SLATES), and more often than not we are approached by IT departments looking for a Search solution. They understand the difficulty knowledge workers face in finding enterprise content. I’ve worked closely with several IT departments to integrate Search on their intranets - a task that is very security intensive as Search mustn’t expose knowledge workers to content they don’t have access to, and this means close involvement with IT.
I’ve written about this before as have others, but Enterprise Search needs to be the focal point of any Enterprise 2.0 ecosystem. Companies should invest in Search first - they must enable discovery - before collaboration can happen. I applaud the IT departments I’ve dealt with in taking this first step.
So no, I don’t think IT is really clueless.