Archæology

The assorted finds of Artefact Publishing

Modern English to Old English wordlist

I am currently studying Old English, which is a great deal of fun, but I am short of composition exercises. There is the Englisc Composition mailing list, but it offers nothing in the way of exercises. I have consequently decided to create some of my own, both for myself and for future students of the course I am taking. Naturally, this immediately led me to look into the best way of marking up a lexicon (rather than doing actual work). The Text Encoding Initiative guidelines have pretty much what I need.

After all, why should I simply write: “immediately: ardliċe, sōna” when I can write instead:

<entry>
  <form>
    <orth>immediately</orth>
  </form>
  <sense>
    <trans>
      <tr>ardliċe</tr>
      <tr>sōna</tr>
      <gramGrp>
        <pos>&adv;</pos>
      </gramGrp>
    </trans>
  </sense>
</entry>

A normal person would probably answer that the former is much less work, but that’s because there’s much less information in it. Once I have created something like the latter, I can easily generate input for a DICT dictionary server, as well as HTML, PDF and plain text versions, all with easily varied amounts of grammatical information and formatting.
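As a rough illustration of that kind of generation, here is a minimal Python sketch that turns the entry above back into the one-line plain text form, with the part of speech switched on or off. It is only a sketch under assumptions: the function name `entry_to_line` is made up for this example, and the `&adv;` entity is declared inline as expanding to "adv.", which is a guess at what the real entity set would define.

```python
import xml.etree.ElementTree as ET

# The entry from above. The DOCTYPE declares &adv; locally so the
# fragment parses on its own; the expansion "adv." is an assumption.
ENTRY_XML = """<!DOCTYPE entry [<!ENTITY adv "adv.">]>
<entry>
  <form>
    <orth>immediately</orth>
  </form>
  <sense>
    <trans>
      <tr>ardliċe</tr>
      <tr>sōna</tr>
      <gramGrp>
        <pos>&adv;</pos>
      </gramGrp>
    </trans>
  </sense>
</entry>"""


def entry_to_line(entry, with_grammar=True):
    """Render one <entry> element as 'headword: translation, translation (pos)'."""
    headword = entry.findtext("form/orth")
    parts = []
    for trans in entry.iterfind("sense/trans"):
        # Join all translations in this group with commas.
        words = ", ".join(tr.text for tr in trans.iterfind("tr"))
        pos = trans.findtext("gramGrp/pos")
        # Append the part of speech only when requested and present.
        if with_grammar and pos:
            words = f"{words} ({pos})"
        parts.append(words)
    return f"{headword}: {'; '.join(parts)}"


entry = ET.fromstring(ENTRY_XML)
print(entry_to_line(entry))                      # with part of speech
print(entry_to_line(entry, with_grammar=False))  # bare word list
```

The same walk over the tree could just as well emit HTML or a DICT source file; only the final formatting step would change.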

I suspect I may be somewhat prone to over-engineering, at least in this particular case, since I’m unlikely to end up with more than 500 words in this lexicon. However, it’s fun and allows me to ponder fruitlessly on the relative merits of ċ and ċ on my display.

Posted by jamie on August 10, 2003 22:02+12:00
