Sending a new language to the
Caver's Multi-Lingual Dictionary
The Caver's Dictionary is always looking for more languages to add. If you would like to add a
whole new language, please contact the Chairman before starting work on it. The dictionary terms are stored in a database, and we load the file of new words into it as described below. You can also add new words to an existing language, or make corrections; in these cases, just email the details to the Chairman at "dictionary@uis-speleo.org".
First, some definitions we use in the Dictionary:
- Concept
- A group of terms with the same meaning, regardless of the language used.
In the Dictionary, each concept is identified by a number, so the identifier is language-independent.
This is the number which appears on the first line of each group of terms.
- Term
- A word or phrase, in any language, which is used for a given concept.
- Term Sequence Number
- The number specifying the sequence position of
the term when there is more than one term for a concept. Each sequence number must be
unique (no duplicates) among the terms of one language within a concept.
For example, Concept 1 on the web page contains, among others, two terms in English, namely
"littoral cave" and "sea cave", and one term in Spanish, namely
"cueva marina (f)". The Term Sequence Number for "littoral cave" and
"cueva marina (f)" is 1, and the Term Sequence Number for "sea cave" is 2.
The method we use to add a new language is:
- You prepare the words in a suitable file which shows which existing concepts they belong
to, and email it to us. New concepts can also be included.
- We check the file for format, and if necessary, reformat it to suit the database. We also
check we have correctly interpreted any accented characters.
- We may send the new terms for peer review until everyone is happy.
- We load the final file into the database.
- We generate the new web pages with your new language included.
It would be a big help, and would speed up the inclusion of your
new language in the web pages, if you could supply the file to us in the following format:
- a plain-text file, in comma-separated-value (.csv) format. The words could be prepared
in a spreadsheet, database, or table in a word-processing document, and then exported to
a csv text file.
- only one term per line.
- each line with three comma-separated fields with no leading or trailing spaces in the fields:
[concept number],[term sequence number],["term"]
(omit the square brackets)
For example, if you were supplying the English language terms for the first three concepts in the Dictionary, the file would look like this:
1,1,"littoral cave"
1,2,"sea cave"
2,1,"clay"
3,1,"pit(US)"
3,2,"pitch(GB)"
3,3,"pot"
3,4,"shaft"
- if your file includes accented characters:
- tell us which code page or character set you have used.
- send us an image file of your computer screen displaying some sample terms which show each accented character correctly. This image lets us detect any character translations that may occur when the text file moves between your computer and software and ours, so we can ensure that we have interpreted your accented characters correctly. (To make an image file of your screen on Windows, display the sample terms and press <Alt-PrintScreen>. This copies an image of the active window into your paste buffer. Then open a document which can accept images, or your graphics program, and paste the image into it. Save the image as a JPEG file and send it to us.)
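If your terms are already held in a program or script, a file in the preferred format can be produced along these lines. This is only a sketch: the function names `to_csv_text` and `save_terms` are hypothetical, and UTF-8 is an assumed choice of character set, not one the Dictionary prescribes. Whatever encoding you use, tell us which one it is.

```python
import csv
import io

# (concept number, term sequence number, term), from the English example above.
rows = [
    (1, 1, "littoral cave"),
    (1, 2, "sea cave"),
    (2, 1, "clay"),
]


def to_csv_text(rows):
    """Format the rows as one term per line: number,number,"term"."""
    buffer = io.StringIO()
    # QUOTE_NONNUMERIC leaves the two numbers bare and quotes the term.
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for concept, sequence, term in rows:
        # Strip leading and trailing spaces, as the format requires.
        writer.writerow([concept, sequence, term.strip()])
    return buffer.getvalue()


def save_terms(path, rows, encoding="utf-8"):
    """Write the file with an explicit encoding (UTF-8 is an assumption here)."""
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(to_csv_text(rows))


csv_text = to_csv_text(rows)
```

Calling `save_terms("english_terms.csv", rows)` would write the file; the file name is only an example.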
If you cannot provide a text file in the preferred format above, we could accept another format, although the extra work involved for us could delay the appearance of your new language on the web pages. Any other format must include the following information:
- Each term or group of terms must be identified with the Dictionary concept number to
which it belongs.
- If you supply more than one term for a concept, the word(s) of each term must be
clearly distinguished from the words of the other terms in that concept. Commas alone
between terms are not enough, as some terms may contain commas within them. Unless each
term is on a separate line, each should at least be enclosed in "double quotes".
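The reason commas alone are not enough can be seen in a short sketch. The line below is a made-up example (the concept number 12 and both terms are placeholders, not Dictionary entries) that puts all the terms for one concept on a single line.

```python
import csv

# Hypothetical line: a concept number followed by the quoted terms for it.
line = '12,"first term","second term, with a comma"'

# Splitting on every comma cuts the second term in two.
naive = line.split(",")

# A CSV reader respects the double quotes and keeps each term whole.
parsed = next(csv.reader([line]))

concept_number = int(parsed[0])
terms = parsed[1:]
```

Here `naive` has four pieces, while `terms` holds the two intended terms, the second with its comma intact.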
On request, we could provide you with a file from the database containing the existing terms in a language of your choice, so that you can relate your terms to their corresponding concept numbers.
This page: http://www.uisic.uis-speleo.org/lexsend.html
Initial version: 2007-09-03
Site: P. Matthews