We’re always looking for productive, self-motivated developers, so we’ve put together this coding challenge to give you a sense of the type of problems we work on at Sortable. Impress us! Use whatever language you’d like. We’re looking for elegant solutions with great precision and recall.

The goal is to use text similarity to match product listings from a 3rd party retailer, e.g. “Nikon D90 12.3MP Digital SLR Camera (Body Only)”, against a set of known products, e.g. “Nikon D90”.
Read the instructions, download the data files below, code up your solution, post it to your GitHub account and impress us!
At Sortable we do a lot of data integration. A simplified example: product specifications might come from one data source, product pricing from another, and product reviews from a third. A major challenge in working with all this data is determining when two pieces of information from disparate data sources are actually talking about the same product. In academic circles this problem is sometimes called Record Linkage, Entity Resolution, Reference Reconciliation, or a host of other fancy terms. We describe this problem very technically as “matching.”
We’ll provide you with a set of products and a set of price listings matching some of those products. The task is to match each listing to the correct product. Precision is critical. We much prefer missed matches (lower recall) over incorrect matches, so try hard to avoid false positives. A single price listing may match at most one product.
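One way to read the precision-first requirement is sketched below. It is a minimal illustration, not the expected solution: it swaps a full similarity score for a conservative token rule, and the helper names (`normalize`, `contains_model`, `match_listing`) and the sample values in the usage note are all made up for this example. A listing is matched only when the manufacturer agrees and exactly one product's model appears in the title; anything ambiguous is left unmatched.

```python
import re

def normalize(text):
    """Lowercase and turn every run of non-alphanumerics into a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def contains_model(title_tokens, model):
    """True if the model appears as 1 to 3 consecutive title tokens.

    Joining tokens lets a model such as 'DSC-W310' match a title
    written as 'DSC W310'.
    """
    target = normalize(model).replace(" ", "")
    if not target:
        return False
    for start in range(len(title_tokens)):
        for width in (1, 2, 3):
            if "".join(title_tokens[start:start + width]) == target:
                return True
    return False

def match_listing(listing, products):
    """Return the product_name of the single confident match, else None."""
    maker_tokens = set(normalize(listing.get("manufacturer", "")).split())
    title_tokens = normalize(listing["title"]).split()
    candidates = []
    for product in products:
        product_maker = set(normalize(product["manufacturer"]).split())
        # Skip products whose manufacturer is missing or disagrees.
        if not product_maker or not product_maker <= maker_tokens:
            continue
        if contains_model(title_tokens, product["model"]):
            candidates.append(product["product_name"])
    # Precision first: an ambiguous or empty candidate list yields no match.
    return candidates[0] if len(candidates) == 1 else None
```

Called with a listing titled “Nikon D90 12.3MP Digital SLR Camera (Body Only)” and hypothetical products whose models are “D90” and “D900”, this returns the D90 product; a title mentioning both models returns `None`, trading recall for precision.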
Be careful not to tie your logic too tightly to the input data. We will run your solution against both the listings provided in the challenge, and a different set of listings that you don’t get to see ahead of time. No giant if statements tailored exactly to the test data, please!
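Because the solution is also run against unseen listings, it can help to measure precision and recall on a small hand-labelled sample that the matching logic was not tuned on. The sketch below assumes such a sample exists as two dictionaries keyed by an arbitrary listing id; the function name and data layout are illustrative and not part of the challenge.

```python
def precision_recall(predicted, expected):
    """Compute (precision, recall) for listing-to-product assignments.

    Both arguments map a listing id to a product_name, with None
    (or a missing key) meaning "no match".
    """
    predicted_pairs = {(k, v) for k, v in predicted.items() if v is not None}
    expected_pairs = {(k, v) for k, v in expected.items() if v is not None}
    true_positives = len(predicted_pairs & expected_pairs)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 1.0
    recall = true_positives / len(expected_pairs) if expected_pairs else 1.0
    return precision, recall
```

A wrong match lowers precision, while a missed match only lowers recall, so under the stated preference the first number is the one to protect.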
The inputs and outputs for the challenge are all text files. Each file has one JSON object per line. The schemas below describe what those objects look like: a Product first, then a Listing.
{
"product_name": String // A unique id for the product
"manufacturer": String
"family": String // optional grouping of products
"model": String
"announced-date": String // ISO-8601 formatted date string, e.g. 2011-04-28T19:00:00.000-05:00
}
{
"title": String // description of product for sale
"manufacturer": String // who manufactures the product for sale
"currency": String // currency code, e.g. USD, CAD, GBP, etc.
"price": String // price, e.g. 19.99, 100.00
}
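Since each input file holds one JSON object per line, loading it comes down to parsing each non-empty line separately. The following is a minimal sketch using Python's standard `json` module; the function name and the sample line are invented for illustration. Because `price` arrives as a string, the example converts it with `Decimal` rather than a float.

```python
import json
from decimal import Decimal

def parse_json_lines(lines):
    """Parse an iterable of text lines, one JSON object per line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            records.append(json.loads(line))
    return records

# Made-up sample line following the Listing schema above.
sample = [
    '{"title": "Nikon D90 12.3MP Digital SLR Camera (Body Only)", '
    '"manufacturer": "Nikon", "currency": "USD", "price": "100.00"}',
    "",
]
listings = parse_json_lines(sample)
price = Decimal(listings[0]["price"])
```

An open file object can be passed directly as `lines`, since iterating over a text file yields one line at a time.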
A file full of Result objects is what your solution will be generating. A Result simply associates a Product with a list of matching Listing objects.
{
"product_name": String
"listings": Array[Listing]
}
Download: challenge_data_20110429.tar.gz (370 KB)
Contains two files:
Text file with one Product object per line.
Text file with one Listing object per line.
The output your solution creates should be a text file with one Result object per line.
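Producing that output can be sketched as grouping matched listings by product and writing one Result per line. This is one possible shape under assumptions: `build_results` and `write_results` are hypothetical helper names, and the matches are assumed to arrive as `(product_name, listing)` pairs from whatever matching logic is used.

```python
import json
from collections import defaultdict

def build_results(matches):
    """Group (product_name, listing) pairs into Result dictionaries."""
    grouped = defaultdict(list)
    for product_name, listing in matches:
        grouped[product_name].append(listing)
    return [
        {"product_name": name, "listings": items}
        for name, items in grouped.items()
    ]

def write_results(results, out):
    """Write one JSON-encoded Result object per line to a text stream."""
    for result in results:
        # ensure_ascii=False keeps non-ASCII titles readable in the output.
        out.write(json.dumps(result, ensure_ascii=False) + "\n")
```

Each listing appears under at most one product as long as the matcher returns at most one product per listing, which keeps the output consistent with the rule that a listing may match at most one product.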

We’re in Waterloo, in the Bauer Building, just south of uptown Waterloo.
Just downstairs in our building there are some great places to eat, including Vincenzo’s (gourmet sandwich bar, hot plate counter, sushi and more), the Bauer Kitchen (a great restaurant), and the Bauer Bakery (quick food to go). Uptown Waterloo is just down the street, with tons of great bars, restaurants and stores, and the bus line stops right outside our new building!
Our mission at Sortable is to make it easy for people to decide which product or service to use, for example their next camera, phone, or TV purchase, or their next meal, movie, or trip. Sortable’s focus is making these decisions easy for ordinary people by handling all the data analysis and surfacing the best options in simple, beautiful interfaces. Sortable is a 10+ person, engineering-driven startup. Our fast-growing websites are used by millions of visitors each month, and we need to expand!

Sortable perks include the candy mountain, lunches, drinks, an arcade, a full array of video games, health benefits, and vacation time.