Distributed Proof-reading at Project Madurai

"Distributed Proof-reading" is a web-based method (adopted from Project Gutenberg) for preparing etexts of Tamil literary works for Project Madurai. By breaking a work into individual pages, several volunteers (based in different parts of the world) can work on the same book at the same time. This significantly speeds up the keying-in and proof-reading parts of the etext creation process.

Scanned image files of individual pages of the printed version of a Tamil work are stored on the web-server. Promads (Project Madurai volunteers) simply open one of these image files in a split-screen window, where the equivalent Tamil text can be keyed in directly. The frames displaying the image and the Tamil text can be arranged horizontally or vertically.

For Project Madurai, the equivalent etext is keyed in Tamil script using Unicode encoding. Several freeware text editors (e.g. Murasu Anjal, ekalappai) are available for Windows, Macintosh and Unix platforms that allow Tamil text to be keyed directly into the browser display window.
Click here to see a screen shot of the split screen display of scanned image and equivalent Text
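
Since the keyed-in text is expected to be Tamil script in Unicode, a volunteer could run a quick check on a page before submitting it. The following is only a minimal sketch and is not part of the DPPM site: the function names are hypothetical, and the rule that only Tamil characters, whitespace, ASCII digits and ASCII punctuation are acceptable is an assumption made for illustration. It relies on the fact that the Unicode Tamil block spans U+0B80 to U+0BFF.

```python
import string
from typing import List

# Unicode Tamil block: U+0B80 .. U+0BFF (upper bound of range() is exclusive).
TAMIL_BLOCK = range(0x0B80, 0x0C00)

# Assumption for this sketch: ASCII digits and punctuation are acceptable
# alongside Tamil characters; ASCII letters are not.
ALLOWED_ASCII = set(string.punctuation + string.digits)


def is_tamil_char(ch: str) -> bool:
    """Return True if the character lies in the Unicode Tamil block."""
    return ord(ch) in TAMIL_BLOCK


def unexpected_chars(text: str) -> List[str]:
    """List the characters that are neither Tamil, whitespace,
    ASCII digits nor ASCII punctuation, in order of appearance."""
    return [
        ch
        for ch in text
        if not (is_tamil_char(ch) or ch.isspace() or ch in ALLOWED_ASCII)
    ]
```

An empty result suggests the page is plausibly Unicode Tamil. A non-empty result may point to stray Latin text or to text keyed in with a non-Unicode font encoding, and is worth a second look.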

When a proofer elects to proofread a page for a particular work, the text and image file are displayed in the same way. This allows the text file to be easily reviewed and compared to the image file, thus assisting the proofreading of the text file. The edited text file is then submitted back to the site via the same webpage that it was edited on.

Once all pages for a particular book have been processed, the Project Manager joins the pieces, formats them into a PM etext and submits it to the PM archive. The advantages of DPPM (Distributed Proof-reading at Project Madurai) are: i) image files are available online 24 hours a day and are accessible to anyone, anywhere (there is no need to have a printed copy of the book yourself to participate in etext preparation); ii) Promads deal with only one page at a time (hence the time commitment is smaller); iii) several Promads can work on the same work at the same time, sharing the load. For comparison, at Project Gutenberg, Distributed Proofreaders posted their 5,000th book (out of a total of 14,000 in the collection), with about 300 new books being finished each month.

For Project Madurai, the Distributed Proof-reading web-server can be accessed using the following URL:

  • DP-PM server at Virginia Tech, USA


    What is the entire process for creating an etext via Distributed Proof-reading?


    Here is a list of the steps involved in the preparation of an etext.

  • Locate a target Tamil literary work for etext preparation. Target works must either be in the public domain (free of all copyrights) or be works whose author (or legal heirs) is willing to give Project Madurai permission to reproduce the work electronically and to distribute it free on the Net. For the types of works that have been covered so far, please consult the webpage Subjectwise listing of Project Madurai works.

    Prior to starting etext preparation, send the details to the Project Leader, Dr. K. Kalyanasundaram (kalyan AT geocities.com), and get his approval. This is mainly to ensure that copyright restrictions, if any, are respected and that other PM volunteers are not concurrently preparing the same etext.

  • You need to have access to an authentic version (printed copy) of the literary work for which you want to prepare an etext. Here again, please consult the Project Leader, providing him with the details of the source work you are planning to use.
    For Distributed Proof-reading, a scanned image file of each page of the printed version needs to be produced. The scan resolution needs to be at least 150-200 dpi, preferably 300 dpi. If you have access to image cleaning software (such as Adobe Photoshop), try to use it to increase the contrast.

  • When the scanned image files are ready for upload, contact the Project Manager for DPPM (Dr. Kumar Mallikarjunan) or the Project Leader (Dr. K. Kalyanasundaram) for guidelines on uploading the images and text by FTP to the DPPM web-server.

  • Promads (PM volunteers) visit the DPPM web-site and, after registration/log-in, view the image files of individual pages and key in the corresponding Tamil text in Unicode format.

  • For pages that have already been keyed in, Promads call up the same pages on the DPPM web-server and proof-read/correct the equivalent Tamil text.

  • When the image files of all pages corresponding to a single Tamil work have been keyed in and proof-read at least once, the Project Manager assembles all the Tamil text and compiles a single etext file. He then forwards it to the Project Leader for preparation of the HTML and PDF versions and release of the work to the general public.
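
The page-by-page flow in the steps above (scan, key in, proof-read, assemble) can be pictured as a small state model. This is only an illustrative sketch under stated assumptions, not the actual DPPM server software: the names Status, Page, key_in, proof_read and assemble_etext are all hypothetical, and joining pages with a newline is an assumed convention.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    SCANNED = 1   # image uploaded, no text yet
    KEYED = 2     # Tamil text keyed in by a volunteer
    PROOFED = 3   # text proof-read at least once


@dataclass
class Page:
    number: int
    text: str = ""
    status: Status = Status.SCANNED


def key_in(page: Page, text: str) -> None:
    """Record the text a volunteer keyed in for this page."""
    page.text = text
    page.status = Status.KEYED


def proof_read(page: Page, corrected_text: str) -> None:
    """Record the corrected text; only keyed-in pages can be proof-read."""
    if page.status is Status.SCANNED:
        raise ValueError(f"page {page.number} has not been keyed in yet")
    page.text = corrected_text
    page.status = Status.PROOFED


def assemble_etext(pages: List[Page]) -> str:
    """Join all pages in page order, refusing if any page is not proof-read."""
    pending = sorted(p.number for p in pages if p.status is not Status.PROOFED)
    if pending:
        raise ValueError(f"pages not yet proof-read: {pending}")
    ordered = sorted(pages, key=lambda p: p.number)
    return "\n".join(p.text for p in ordered)
```

Because each page carries its own status, different volunteers can key in and proof-read different pages independently, and assembly is simply blocked until every page has been proof-read at least once.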