An Icelandic Primer
Finished 7-12-02
Download
Text
txt | PG txt | PG txt (zip) | HTML | PG HTML | PG HTML (zip) | PDF | PG PDF | PG PDF (zip) | TeX (zip) | PG TeX (zip) | PHP source
Page images
PDF (6.5 megs) | PostScript (gzip) (4 megs) | DjVu (1 meg)
Note: This etext is in Unicode, which means it will look like a lot of gibberish unless you view it with a Unicode-capable editor/viewer. You'll also need a Unicode font that supports these characters:
| Character | Description |
| ¯ | macron (code x00AF) |
| ´ | acute accent (code x00B4) |
| ̄ [E.g. œ̄] | combining macron (code x0304) |
| ̈ [E.g. ǫ̈] | combining diaeresis (code x0308) |
| þ | small thorn (code x00FE) |
| Þ | capital thorn (code x00DE) |
| ð | small eth (code x00F0) |
| Ð | capital eth (code x00D0) |
| æ | small ae (code x00E6) |
| ǣ | small ae with macron (code x01E3) |
| œ | small oe (code x0153) |
| ā | small a with macron (code x0101) |
| ē | small e with macron (code x0113) |
| ī | small i with macron (code x012B) |
| ō | small o with macron (code x014D) |
| ū | small u with macron (code x016B) |
| ȳ | small y with macron (code x0233) |
| Ā | capital a with macron (code x0100) |
| Ē | capital e with macron (code x0112) |
| Ī | capital i with macron (code x012A) |
| Ō | capital o with macron (code x014C) |
| Ū | capital u with macron (code x016A) |
| ę | small e with ogonek (code x0119) |
| ǫ | small o with ogonek (code x01EB) |
| ø | small o with stroke (code x00F8) |
| ö | small o with diaeresis (code x00F6) |
| é | small e with acute accent (code x00E9) |
| § | section sign (code x00A7) |
If there are characters missing or the above table looks like a big mess, you can look at this image for reference. (Note that the combining diaeresis and macron don't show up correctly in the image. They should appear over the characters they follow.)
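The codes in the table can also be checked against a Unicode database rather than by eye. The following is a minimal Python sketch, not part of the original project; the names `CODE_POINTS`, `describe`, and `code_points` are illustrative. It looks up the official Unicode names for a few of the listed codes and shows that a combining mark is stored after the letter it modifies.

```python
import unicodedata

# A few codes copied from the table above (hex values as listed there).
CODE_POINTS = [0x00FE, 0x00DE, 0x00F0, 0x00D0, 0x01E3, 0x01EB, 0x0304, 0x0308]


def describe(cp):
    """Return 'U+XXXX  NAME' for a code point, using Python's Unicode database."""
    return f"U+{cp:04X}  {unicodedata.name(chr(cp))}"


def code_points(text):
    """List the code points of a string in storage order."""
    return [f"U+{ord(ch):04X}" for ch in text]


for cp in CODE_POINTS:
    print(describe(cp))

# A combining mark comes after its base letter in the text stream.
oe_macron = "\u0153" + "\u0304"           # small oe + combining macron
o_ogonek_diaeresis = "\u01EB" + "\u0308"  # o with ogonek + combining diaeresis
print(code_points(oe_macron))
print(code_points(o_ogonek_diaeresis))
```

Each of these sequences is two code points long, which is why a viewer without proper combining-mark support may draw the macron or diaeresis beside the letter instead of over it.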
The Making of the Etext
I found the image files for this book at Sean Crist's site (which has many other excellent resources, I might add) and downloaded them all. I then uploaded them one at a time to DocMorph to OCR them. The English text came out fairly well, but the Old Icelandic parts were unrecognizable. :)
Because there are many characters in the primer which aren't in the standard character set, and because I wanted to stay as close to the original as possible, I decided to encode the text in Unicode (UTF-8). I renamed all of the text files from *.txt to *.ice and put the following in my .vimrc (I'm using Vim 6.1 -- a screenshot is here):
au BufNewFile,BufRead *.ice set textwidth=65
au BufNewFile,BufRead *.ice set encoding=utf-8
au BufNewFile,BufRead *.ice set ff=unix
au BufNewFile,BufRead *.ice set expandtab
au BufNewFile,BufRead *.ice set scrolloff=4
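These settings only affect text as it is edited in Vim. To confirm afterwards that a file still follows the same conventions, something like the following Python sketch could be used. It is an assumption-laden illustration, not a tool from the original project: `check_ice_text` and `MAX_WIDTH` are hypothetical names, and line width is counted in code points, so a letter followed by a combining mark counts as two.

```python
MAX_WIDTH = 65  # mirrors textwidth=65 in the settings above


def check_ice_text(data):
    """Return a list of problems found in the raw bytes of one text file."""
    try:
        text = data.decode("utf-8")  # the files are meant to be UTF-8
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]

    problems = []
    if "\r" in text:  # ff=unix means bare newlines only
        problems.append("contains carriage returns (not Unix line endings)")
    if "\t" in text:  # expandtab means spaces instead of tabs
        problems.append("contains tab characters")
    for number, line in enumerate(text.split("\n"), start=1):
        if len(line) > MAX_WIDTH:
            problems.append(f"line {number} is {len(line)} characters wide")
    return problems


# Hypothetical sample input: a short line of UTF-8 text with a Unix newline.
sample = "\u00fe \u00f0 \u01eb\n".encode("utf-8")
print(check_ice_text(sample))  # → []
```

In practice the bytes would come from reading each *.ice file in binary mode. The check deliberately reports problems instead of fixing them, so the text is never altered without a manual comparison against the page images.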
For a while I used the "Ctrl-V U" sequence to enter the Unicode characters ("Ctrl-V U 00F0" to put in an eth, for example), but then I found that Vim's digraph support for Unicode was much nicer. (Instead of using "Ctrl-V U 00F0", for example, I could just type "Ctrl-K d -".) That saved a lot of time.
After cleaning the text files up and comparing them against the images, I collated them into one big text file and ran a proofing check, making sure all the paradigm tables were the same. Then I converted the etext to HTML, using PHP to automate parts (the paradigm tables and the glossary, mainly). Finally, I proofed the entire etext against the images. After I finished the HTML version I converted the etext to Omega (a Unicode-capable extension of TeX that also works with LaTeX) for output to a nice PDF version.
I also converted the images into PDF and DjVu. The DjVu version is much nicer than the PDF, and is much smaller as well. I found that exporting the DjVu file to PostScript gave a much nicer-looking PostScript file, so I've put that on here as well (and the gzipped file is even smaller than the PDF :)).

