BAPI Presentation
December 18th, 2009 by Jeremy Thomas
The Business of APIs Conference went well. Mashery put on a great conference, and over 200 people attended. There was an ensemble of impressive speakers, including Michele Azar of Best Buy, Marc Frons of The New York Times and Fred Wilson of Union Square Ventures. And then there was me. Check out my presentation below:
Speaking at the Business of APIs Conference
November 12th, 2009 by Jeremy Thomas
I’m happy to announce that I’m one of the featured speakers at the Business of APIs Conference in NYC on 16 November. I’ve been leading the charge to open our data at Active.com, and we’ve started a slow rollout of our API. I’ll be talking about the journey we’ve taken to get to where we are today with our API. We’ve still got a long way to go.
If you’re in NYC on Monday and are interested in APIs, come by and check it out!
Dear Consumer
June 13th, 2009 by Jeremy Thomas
(cross-posted from the active.com Product Development blog)
Data. Data. Data.
December 29th, 2008 by Jeremy Thomas
Something I learned while working with the Information Management group at BearingPoint down in Australia continues to resonate for me at my “Web 2.0-ish” job in San Diego, CA. Data integrity is king, but it is bloody hard to maintain. Consider a data warehouse, where information gathered from an organization's systems is stored, often for reporting purposes. Data warehouses can be used to answer the question “how many customers do I have?”, or more specifically, “how many residential customers do I have?”. Seems simple enough.
But data, dare I say “truth”, is federated. And each member of the federation has its own vernacular.
For example, the residential loan processing system might call a customer a “customer”, while the commercial loan processing system calls a customer a “client”. At the core these are the same entities, with “residential” or “commercial” being a modifier (as an adjective is to a noun). So a data warehousing solution would apply its central vernacular to these entities, allowing the question “how many customers do I have?” to be answered even though the answer is informed by two sources of truth.
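As a rough sketch of that idea, the snippet below maps two source vocabularies onto one central term. The record layout, the names and the `CANONICAL_TERM` mapping are all made up for illustration; a real warehouse would do this in its load process rather than in a few lines of Python.

```python
from collections import Counter

# Hypothetical records from two source systems, each using its own term.
residential_loans = [
    {"entity": "customer", "name": "A. Nguyen"},
    {"entity": "customer", "name": "B. Ortiz"},
]
commercial_loans = [
    {"entity": "client", "name": "Acme Pty Ltd"},
]

# Assumed mapping from each source's vernacular to the warehouse's central term.
CANONICAL_TERM = {"customer": "customer", "client": "customer"}


def to_warehouse(records, segment):
    """Rename each record's entity to the central term and keep the segment as a modifier."""
    return [
        {"entity": CANONICAL_TERM[r["entity"]], "segment": segment, "name": r["name"]}
        for r in records
    ]


warehouse = (
    to_warehouse(residential_loans, "residential")
    + to_warehouse(commercial_loans, "commercial")
)

# "How many customers do I have?"
total_customers = sum(1 for row in warehouse if row["entity"] == "customer")

# "How many residential customers do I have?"
customers_by_segment = Counter(
    row["segment"] for row in warehouse if row["entity"] == "customer"
)
```

Both questions are answered from one table, even though the rows came from two systems that never agreed on a word for "customer".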
Data transformation and categorization work moderately well when an organization has control over its data sources (and has, therefore, a limited number of vernaculars). But consider the La Jolla, CA, page on Yelp, http://www.yelp.com/la-jolla-ca, which claims that La Jolla has 1028 restaurants worth reviewing. Most of this data is user-submitted. And how does a user classify Starbucks? “Food”? “Restaurants”? And what about subcategories? “Coffee and Tea”? “Desserts”? Some users might choose to use some of these categories, while others might use all. And it’s consistency that lies at the heart of the issue of maintaining data integrity. A user should have access to all restaurants when browsing by “Restaurants”.
If information is consistently categorized, even incorrectly, we can get accurate answers to our queries. But if it’s inconsistently categorized our answers will not be comprehensive.
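Here is a toy illustration of that point. The listings, the category names assigned to them and the `browse` helper are invented for the example and have nothing to do with how Yelp actually stores its data.

```python
def browse(listings, category):
    """Return the sorted names of listings tagged with the given category."""
    return sorted(item["name"] for item in listings if category in item["categories"])


# Consistent (even if debatable): every coffee shop is filed under "Restaurants".
consistent = [
    {"name": "Cafe One", "categories": {"Restaurants"}},
    {"name": "Cafe Two", "categories": {"Restaurants"}},
    {"name": "Cafe Three", "categories": {"Restaurants"}},
]

# Inconsistent: each user picked whichever categories seemed right to them.
inconsistent = [
    {"name": "Cafe One", "categories": {"Restaurants", "Coffee and Tea"}},
    {"name": "Cafe Two", "categories": {"Food"}},
    {"name": "Cafe Three", "categories": {"Coffee and Tea"}},
]

complete_results = browse(consistent, "Restaurants")
partial_results = browse(inconsistent, "Restaurants")
```

With the consistent data, browsing by “Restaurants” returns all three listings. With the inconsistent data the same query returns only one of them, and the other two are not wrong so much as invisible.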
So how, then, do websites like yelp.com deliver meaningful, consistently categorized results when they’re reliant on crowdsourcing? Are there really only 1028 review-worthy restaurants in La Jolla? And what of those restaurants that are mistakenly subcategorized as “Turkish” when they’re actually “Lebanese”?
Manual labor is the answer.
I suspect sites like Yelp.com leverage services like Mechanical Turk to comb through the thousands of user-submitted records and apply a more uniform categorization scheme. And this is why data integrity is bloody hard to maintain: there is so much manual labor involved. I question the sustainability of such a model, especially as a site grows and gathers more data.
But what I can say is that it is more important for data to be correctly categorized than it is for it to be mostly correctly categorized. If users on Yelp search for “Automotive” assets and are shown beauty salons, they will leave. Data integrity is king.
Made It
December 19th, 2007 by Jeremy Thomas
I safely arrived in Colorado last night. It’s been a year since I was last in the US. I immediately noticed how festive the US is around the holidays. Carols were playing in the LA airport and people said “Merry Christmas” to me. This doesn’t really happen in Australia, where if they said it they’d prefer “Happy Christmas” instead. It’s also pretty cold in Denver (at least compared with how hot it’s been in Melbourne these past few weeks). And I had my first Chipotle burrito since last December for dinner tonight.
What I’ve noticed immediately is that most blog and Twitter updates now happen throughout the day. In Australia it was like reading the newspaper: by the time I got to work, most of the blog posts and Twitter updates from my US counterparts had already been posted, so I’d spend 20 minutes or so catching up on the “news”. Now my reader constantly has updates. I’m not sure which way is better, actually.
Anyway, I’m hoping to visit more conferences now that I’m in North America and meet some of my blogging friends in person. I might look into heading to the FASTForward conference this year to get me going.
It’s good to be back.