Blog from December 2011

There are 5 blog entries from December 2011

Exploring Ancestry Dot Com
December 31st, 2011 | View Post

The cover to my grandfather's 1939 DeWitt Clinton high school yearbook
This is definitely NOT an endorsement for the website, mainly because they have traditionally spammed the hell out of people for business. However, my dad recently signed up for their system and started building all sorts of artifacts pertinent to my grandparents. Suffice it to say, it has some pretty phenomenal digitized records available. While the hand-written census tract data is cool enough (it's all been scanned), they even had pages from my grandfather's yearbook at DeWitt Clinton High School and other similar documents. I don't know exactly where they get some of their data from, and I'm guessing that one of the benefits to ancestral research is that the population was significantly smaller a century ago. That might not seem significant, but it makes fuzzy data matching much easier to do. Incidentally, they got his date of death wrong despite the fact that such records are readily available in this day and age. Again, I suspect it's due to having TOO MUCH data vs. the limited record-keeping we did a century ago.

Their overall UI/UX is lacking quite a bit for my personal taste (to non-developers, this just means it's not especially convenient to navigate), but the information and records available for digital consumption are excellent. I will definitely be using their site to gather additional information for my own family genealogy work. To anybody especially interested in this type of research, I would at least recommend that you check them out on a trial basis - if for no other reason than to grab the documents and images they've collected for you.
New Website Progression Updates
December 29th, 2011 | View Post
It's been a couple of months since I finally decided that it was time for a full overhaul of this website, and there have already been some pretty significant changes made to the architecture of the site. I was joking with DaveG just a couple of days ago about how significantly my CMS architecture skills have improved since I first developed the concept back in 2005, and even since formally releasing it in 2007.

The codebase is now significantly smaller, and it puts much more strain on the SQL server by handing it larger, more complex operations (as opposed to running simple queries and then parsing the data in code). Additionally, almost all of the original JavaScript has been replaced with jQuery or removed entirely. While I'm thrilled to have done this, it's not entirely an original design flaw. When I first began coding the system, the jQuery library was not yet available and developers were essentially limited to the Prototype / Scriptaculous packages. To this day I contend that those packages are hugely bloated and should not be used for web development projects, given the complexity they add to project maintenance. This was especially the case in 2007 when I first released openFace, given that processors were that much slower and front-end JavaScript weighs a browser down. Fortunately, jQuery provides an excellent balance of functionality and programmability while still remaining pretty light (provided one avoids their various UI packages).
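As a rough illustration of that trade-off, here is a minimal sketch using Python's built-in sqlite3 module. The `photos` table, its columns, and the sample rows are assumptions made up for the example; they are not the actual openFace schema. It counts photos per year twice: once with a simple query followed by parsing in code, and once by letting the database do the grouping.

```python
import sqlite3
from collections import Counter

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, year INTEGER, caption TEXT)")
conn.executemany(
    "INSERT INTO photos (year, caption) VALUES (?, ?)",
    [(2005, "beach"), (2005, "hike"), (2007, "launch party"), (2011, "yearbook scan")],
)

def counts_in_code(conn):
    """Simple query, then tally the result set in application code."""
    rows = conn.execute("SELECT year FROM photos").fetchall()
    return dict(Counter(year for (year,) in rows))

def counts_in_sql(conn):
    """One more complex query; the database does the grouping."""
    rows = conn.execute(
        "SELECT year, COUNT(*) FROM photos GROUP BY year ORDER BY year"
    ).fetchall()
    return dict(rows)
```

Both functions return the same mapping of year to photo count. The second one needs less application code, at the price of more work for the database.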

In parallel with this development (which is really just in my spare time), I have been hiring people to take on the arduous tasks of scanning photos, digitizing movies, digitizing old audio recordings, etc. As of this writing I already have 11,254 new photos that will be added to the website once it is fully launched, though I'm guessing it will be closer to 17,000 new photos once I'm finished!
Wall Socket Sex
December 25th, 2011 | View Post
This is what I spent my post-Christmas dinner with the family working on. I saw a hand-drawn version sort of like it on Reddit and thought I could run with the concept and practice some Photoshop.

The red Fedora was indeed a tribute to my OS.

The Mechanical Turk Project
December 19th, 2011 | View Post
In early 2008 I came across an interesting idea that Amazon had started working on called Mechanical Turk. The idea was pretty simple: rather than writing software that was capable of artificially intelligent tasks, it was instead plausible simply to pay people very small sums of money to perform very small tasks. This is what the Turk program did.

The Amazon system allowed developers like myself to create small programs that would integrate with their Turk API. In my case, I wrote a very simple program that would display a random photo from my collection and request that the user describe the photo in the text-box I provided to them.

All of this was spelled out within my user space, but provided that the person described the photo in at least 10 words, they would be paid a sum of $0.01 (1 penny). I suspect that this type of thing is mostly aimed at people in English-speaking emerging markets around the world (India, for example), but it proved to be invaluable to me. On the initial three-month run, I wound up having over 8,000 photos described.
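The acceptance rule described above is simple enough to sketch in a few lines of Python. This is not the Mechanical Turk API, and the function and constant names are hypothetical; it only shows the 10-word minimum and the one-cent reward, with amounts kept in whole cents to avoid floating-point rounding.

```python
MIN_WORDS = 10       # minimum description length stated in the post
REWARD_CENTS = 1     # $0.01 per accepted description

def is_acceptable(description: str) -> bool:
    """Return True when the description has at least MIN_WORDS words."""
    return len(description.split()) >= MIN_WORDS

def total_payout_cents(descriptions) -> int:
    """Sum the reward for every description that meets the rule."""
    return sum(REWARD_CENTS for d in descriptions if is_acceptable(d))
```

At one cent each, 8,000 accepted descriptions come to 8,000 cents, or $80.00 in worker rewards before any fees.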

Update March 3rd, 2012: I've done some additional exploring of this system, and in the few years since I first started experimenting with it, Amazon has made huge strides. In 2008 their developer API was pretty cumbersome to figure out, but they've really managed to solve this by simplifying the submission process. The downside is that it requires a few extra steps on the developer's end, but the upside is that it's now simple to understand how it all works and to begin a campaign. The dataset importation used to be automated, but now one just uploads a CSV file of the data. Again, it's an extra step, but it's so simple to do that I can't really complain.
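For anyone curious what that extra step might look like, here is a minimal sketch that builds such a CSV with Python's standard csv module. The `image_url` column name and the example.com URLs are assumptions for illustration; the post doesn't say which columns Amazon expects, so check their documentation before uploading anything.

```python
import csv
import io

def build_batch_csv(photo_urls):
    """Return CSV text with a header row and one row per photo."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image_url"])   # hypothetical column name
    for url in photo_urls:
        writer.writerow([url])
    return buffer.getvalue()

# Placeholder URLs, not real photos.
csv_text = build_batch_csv([
    "https://example.com/photos/0001.jpg",
    "https://example.com/photos/0002.jpg",
])
```

Writing `csv_text` to a file gives something that can be uploaded by hand, one row per photo to be described.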

All that being said, I've recently launched another several thousand photos and had 100% of them described within just a few days. For the most part, the entries submitted were very good. I will likely do a full-scale launch within the next few weeks and have Amazon Turk workers detail ALL of my photo collection. This is a pretty mammoth amount of data processing that people will be doing for me, and I'm excited to think how it will change my search criteria.

Update October 28th, 2015 (small political rant within): After many, many years of wonderful successes with the Mechanical Turk project, I decided that it was time to once again overhaul some of my older code. I spent several days rewriting a bunch of the internal mechanics of openFace (the software I designed that powers this entire website). Given the importance of this piece to me, I focused heavily on the Mechanical Turk processing. From a code point of view everything went great. As it happens though, there is now a community of Mechanical Turk "watchdogs" (they have no affiliation with Amazon by the way). This was an interesting experience for me.

The new scripts I wrote (to complement the detailed classes I've already written) were probably a few thousand photos in when a small group of people (Californians, as it happened) started sending me angry emails. In a nutshell, they told me that the price of each photo should be more like $0.30 to $0.40. Keep in mind it takes about 15 seconds to look at a picture and type a 10-word sentence about it. Always one to engage in conflict, I tried reasoning with them. I explained that this was just a personal project and that I couldn't possibly spend that kind of money on it. At $0.01 to $0.02 per photo it costs me about $40.00 for 1,000 photos (the fees are over 100% at such a low rate). Still, when I process 8,000 photos it can add up. So while the fees would be a lot less at $0.30 per photo, 1,000 photos would wind up costing me nearly $400.00. It then follows that 8,000 photos would cost me about $3,200. I explained to this group that if it *did* cost that much, I simply wouldn't be able to use the service and then nobody would get my money. It's a hobby of mine to catalog my life and the time I share with those people important to me; it's not a business. It's all at a financial loss.
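A quick sketch of that arithmetic, using only the figures quoted above: the all-in cost per 1,000 photos at each rate, the 8,000-photo batch size, and the 15-second estimate taken at face value. The function names are hypothetical, and nothing here models Amazon's actual fee schedule.

```python
PHOTOS = 8_000
SECONDS_PER_PHOTO = 15   # the post's estimate

# All-in cost per 1,000 photos (reward plus fees), in dollars, as quoted in the post.
COST_PER_1000 = {"current rate": 40, "requested rate": 400}

def batch_cost(cost_per_1000: int, photos: int = PHOTOS) -> int:
    """Scale the quoted per-1,000 cost to a whole batch, in dollars."""
    return cost_per_1000 * photos // 1000

def hourly_reward_cents(reward_cents: int) -> int:
    """Worker earnings per hour, in cents, at one photo every 15 seconds."""
    return reward_cents * 3600 // SECONDS_PER_PHOTO

for label, cost in COST_PER_1000.items():
    print(f"{label}: ${batch_cost(cost):,} for {PHOTOS:,} photos")
```

On those assumptions the batch costs $320 at the current rate versus $3,200 at the requested one. At a steady 15 seconds per photo, a worker would earn 240 cents ($2.40) an hour at $0.01 per photo and 7,200 cents ($72.00) an hour at $0.30.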

They weren't having it. They were angry and not shy about letting me know. But then I took a step back from their complaints and I realized that like most marketplaces, the people who are doing the work are extremely delighted to have a steady source of income coming in. Many of those people might live on just a few dollars per month and so the opportunity to make even $20.00 from me is huge to them. For some of them it can entirely change their quality of life and open new doors. Of course their lives are nothing like my own. I'm sad for that and would do most anything to change that. I'd sure as shit rather help a group of people from emerging countries like Brazil, India, China, etc. than have a bunch of people from California complain that I'm not doing enough to help THEM (while they sit in an air-conditioned Starbucks sipping lattes and yelling at me from behind their MacBook Pro).

So like so many things in life, once you tune out the background noise, the Mechanical Turk project continues to be pretty amazing. I'm delighted to give money to these people all around the world; they do a wonderful job for me.
Terms and Conditions
December 13th, 2011 | View Post
We don't really have any terms and conditions.