Open government developer Waldo Jaquith had a problem: he wanted transcripts for videos of the Virginia legislature but had neither the resources to fund their creation nor the time to transcribe the sessions himself.
When he talked to Matt Cutts at the Newsfoo unconference last December, Google’s lead for Web spam suggested to Jaquith that he make use of YouTube’s ability to automagically create machine-generated transcripts of video.
Last week, Jaquith posted a $500 bounty for a speech transcription program, funded by the 95 backers of a Kickstarter campaign to liberate Virginia’s legislative video.
That’s when something interesting happened, as Jaquith blogged today: Aaron Williamson, a lawyer for the Software Freedom Law Center, created a Python script to fix the problem.
Jaquith intends to use the code in the Richmond Sunlight project — and because it’s open source, anyone else can press it into service as a means to generate transcripts of video.
The quality of YouTube’s machine-generated transcriptions is, to be fair, mixed, although it is improving. That said, a machine-generated transcript is better than none at all.
Williamson told Jaquith that he’ll donate the Kickstarter cash to charity.