Transcribing the EWDs

Call for Volunteers

The EWDs in their current PDF-bitmap format are a priceless resource, but they would be even more valuable if they could be searched for significant words and phrases, to answer such questions as

  • what did Dijkstra write about <topic>?
  • where did Dijkstra write <pithy epigram>?

Furthermore, as bitmaps the EWDs are inaccessible to visitors who are visually impaired.

Therefore we've started a project to transcribe the EWDs to text files. If you feel like contributing to this effort, we invite you to transcribe as many EWDs as your inclination and available time permit.

Output format

As you can see by inspecting the transcriptions that have been completed so far, our aim is not to replace the EWDs, but only to provide them with searchable companions. The transcriptions contain only enough HTML markup to provide convenient links to the original PDFs (if adding the markup is not convenient for you, feel free to send transcriptions in plain text, and we'll add the markup).
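For volunteers who would like to add the markup themselves, the following is a minimal sketch in Python of how a plain-text transcription might be wrapped in just enough HTML to link back to the original PDF. It is an illustration only, not the project's actual tooling: the function name `wrap_transcription`, the identifier `EWD0000`, the file name `ewd0000.pdf`, and the exact page layout are all hypothetical, and it assumes that paragraphs in the plain text are separated by blank lines.

```python
import re
from html import escape


def wrap_transcription(text: str, ewd_id: str, pdf_url: str) -> str:
    """Wrap a plain-text transcription in minimal HTML with a link to its PDF.

    All names and the page layout are illustrative assumptions.
    """
    # Assumption: paragraphs in the plain text are separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Escape <, >, & and quotes so the text cannot be mistaken for markup.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    link = f'<p><a href="{escape(pdf_url)}">{escape(ewd_id)} (original PDF)</a></p>'
    return (
        "<html>\n"
        f"<head><title>{escape(ewd_id)}</title></head>\n"
        "<body>\n"
        f"{link}\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical identifier and file name, for illustration only.
    sample = "First paragraph.\n\nSecond paragraph."
    print(wrap_transcription(sample, "EWD0000", "ewd0000.pdf"))
```

Escaping the text matters here because the EWDs contain characters such as `<` and `&` that would otherwise be read as markup.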

The one exception to this simplicity is the EWDs' formulas. For purposes of searching, the formulas don't matter much, since visitors aren't likely to search for formulas. For visitors who are visually impaired, however, the formulas must eventually be represented in a format that caters to audio software. At the moment the best long-term bet looks like MathML, but support for MathML, both for writing it and for rendering it (either visually or aurally), is still somewhat spotty. For now, therefore, we'll tend to concentrate on the EWDs that are less formula-intensive. When we do encounter formulas, we'll transcribe them as ordinary text, planning to come back and do them properly once we know what "properly" means.

Logistics

Since all of the EWDs to be transcribed are already available on the web, you can simply email your transcriptions to , and I'll install them in the web site. To avoid collisions, let me know when you're about to start on a transcription, and I'll add an "in-progress" entry to the index.

As noted above, it makes sense to steer clear of the most formula-intensive EWDs until we know more about how formulas should be represented. Also best avoided are EWDs for which OCR software may be effective: those that Dijkstra originally typed, and those that were subsequently published and consequently typeset (the published ones have copyright notices on their cover pages).

To summarize, it's best to choose EWDs that are handwritten and that lack copyright notices. Of course, if you have a favorite EWD such that transcribing it would be pure pleasure, by all means go right ahead!

Suggestions

Comments and suggestions on any aspect of this project are always welcome.

--


Last revision: Tue, 10 Jun 2003
