Fidonet - Training GoldED to Play UTF-8.





How-To for using UTF-8 encoding in GoldED
By Michiel van der Vlist, 2:280/5555


This is a useful how-to written by Michiel van der Vlist (Fidonet node 2:280/5555, RC Netherlands, http://www.vlist.eu ) on configuring GoldED to use UTF-8 encoding.



                  Training Golded to play UTF-8
               By Michiel van der Vlist, 2:280/5555


Golded has limited support for reading and writing messages in UTF-8.
There are two ways to do it. One is to use an external editor. The
other is to use translation tables. As so often, both methods have
pros and cons.


Using an external editor.
=========================


Golded has an option to use an external editor. Unfortunately it
can only be invoked when writing a message. So just to read a message
in UTF-8 one has to "answer" it and then quit the editor without
saving the reply.

Make a file gold8.cfg that is a copy of your golded.cfg with the
following statements (replacing any existing lines with the same
keywords):


EDITOR <myutfeditor> @file
EDITORFILE GOLDED.MSG
XLATPATH d:\fido\golded\
XLATIMPORT UTF-8
XLATLOCALSET UTF-8
XLATEXPORT UTF-8
XLATCHARSET UTF-8 UTF-8 UTF_UTF.CHS

where <myutfeditor> is the UTF-8 capable editor of your choice. I use
the Windows version of Vi. An alternative for Windows is Notepad. For
Linux there is nano. Almost any UTF-8 editor will do and there is
plenty of choice.
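
The copy-and-replace step for gold8.cfg can also be scripted. What
follows is only a sketch in Python, not part of Golded. It assumes
that a keyword is the first word on a line and that keywords are not
case sensitive. The function name derive_config is made up for the
occasion, and <myutfeditor> is left as a placeholder on purpose.

```python
from pathlib import Path

# The statements from this how-to. Replace <myutfeditor> by hand.
OVERRIDES = [
    "EDITOR <myutfeditor> @file",
    "EDITORFILE GOLDED.MSG",
    "XLATPATH d:\\fido\\golded\\",
    "XLATIMPORT UTF-8",
    "XLATLOCALSET UTF-8",
    "XLATEXPORT UTF-8",
    "XLATCHARSET UTF-8 UTF-8 UTF_UTF.CHS",
]


def derive_config(base_text: str, overrides: list[str]) -> str:
    """Return base_text with every line for an overridden keyword replaced."""
    # Assumption: the keyword is the first word, compared case-insensitively.
    keywords = {line.split()[0].upper() for line in overrides}
    kept = []
    for line in base_text.splitlines():
        words = line.split()
        # Drop existing lines that start with a keyword we are about to set.
        if words and words[0].upper() in keywords:
            continue
        kept.append(line)
    return "\n".join(kept + overrides) + "\n"


if __name__ == "__main__":
    base = Path("golded.cfg")
    if base.exists():
        # latin-1 maps every byte to one character, so nothing gets mangled.
        text = base.read_text(encoding="latin-1")
        Path("gold8.cfg").write_text(
            derive_config(text, OVERRIDES), encoding="latin-1"
        )
```

Note that this drops every XLATCHARSET line from the copy, which is
what "replacing any existing lines with the same keywords" amounts to.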

UTF_UTF.CHS is a dummy translation table whose main purpose is to get
the '4' into the "CHRS: UTF-8 4" kludge line of the message.
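
To check whether a message really carries that kludge, something like
the Python sketch below could be used. It assumes the usual Fidonet
conventions: a kludge line starts with a SOH byte (0x01) and lines
end with a CR. The function name find_chrs and the sample message are
made up for illustration.

```python
def find_chrs(body: bytes):
    """Return (charset, level) from the first CHRS kludge, or None if absent."""
    prefix = b"\x01CHRS:"
    for line in body.split(b"\r"):
        if line.startswith(prefix):
            fields = line[len(prefix):].split()
            if not fields:
                continue
            name = fields[0].decode("ascii", "replace")
            # The level is the number after the name, the '4' for UTF-8.
            level = None
            if len(fields) > 1 and fields[1].isdigit():
                level = int(fields[1])
            return name, level
    return None


# Made-up sample message body, for illustration only.
sample = b"\x01MSGID: 2:280/5555 12345678\r\x01CHRS: UTF-8 4\rHello\r"
print(find_chrs(sample))
```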

You may have this file already. If not, get it from my system by
file request or download http://www.vlist.eu/downloads/utf8_850.chs

Start golded with "golded -cgold8.cfg" or make a .bat file to that
effect.

When answering or creating a message select "External Editor" and
there you go...


Using translation tables.
=========================

Golded uses 8-bit translation tables to convert one character set
into another. Oddly enough Golded ignores the current code page
setting and always uses the code page that was in effect when the
system booted. At least that is how it works in Windows. Linux or
OS/2 may be different. So if the system boots up in CP850, all you
will ever see on the Golded screen are the characters in the CP850
set. The character encoding scheme in the messages can be different,
but the glyphs are limited to those in the CP850 set.
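
To get a feel for what that limit means, the Python sketch below
lists the characters of a text that have no place in CP850. It uses
Python's own cp850 codec as a stand-in for the code page of the
system. It says nothing about what Golded does internally. The
function name not_in_cp850 is made up.

```python
def not_in_cp850(text: str) -> list[str]:
    """Return the characters of text that cannot be encoded in CP850."""
    missing = []
    for ch in text:
        try:
            ch.encode("cp850")
        except UnicodeEncodeError:
            # No CP850 byte for this character, so no glyph on screen.
            missing.append(ch)
    return missing


# Accented Western European letters fit, the euro sign does not.
print(not_in_cp850("naïve café €5"))
```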


Add this to your golded.cfg:

XLATCHARSET CP850 UTF-8 850_UTF8.CHS
XLATCHARSET UTF-8 CP850 UTF8_850.CHS

GROUP UTF8
        xlatimport utf-8
        member utf-8
        xlatexport utf-8
        origin UTF-8 enthousiast
ENDGROUP


If you do not already have 850_UTF8.CHS and UTF8_850.CHS you can get
them from my system by file request or http://www.vlist.eu/downloads/

Oddly enough the Golded translation mechanism allows for translating
one byte into one or more bytes but not the other way around. As a
result all characters in CP850 can be correctly translated into
UTF-8 but the translation from UTF-8 into CP850 is very, very limited.
Actually it only works for characters in the range U+00C0 - U+00FF.

This covers the accents and umlauts but not much more.
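
The asymmetry can be illustrated with a small Python model. This is
not Golded's code and not the .CHS file format, just a sketch of the
idea. Going out, every CP850 byte maps to one or more UTF-8 bytes.
Coming in, only the two-byte sequences that start with 0xC3 (which
is exactly U+00C0 - U+00FF) are recognised. Showing a '?' for every
other non-ASCII byte is my assumption for the sake of the example.
Both function names are made up.

```python
def build_export_table(codepage: str = "cp850") -> dict[int, bytes]:
    """Map each high byte of the code page to its UTF-8 byte sequence."""
    table = {}
    for byte in range(0x80, 0x100):
        char = bytes([byte]).decode(codepage)
        table[byte] = char.encode("utf-8")  # one byte in, 2 or 3 bytes out
    return table


def import_limited(data: bytes, codepage: str = "cp850") -> bytes:
    """Translate UTF-8 to the code page, handling only ASCII and 0xC3 pairs."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(b)  # plain ASCII passes through
        elif b == 0xC3 and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            # 0xC3 0x80..0xBF encodes U+00C0..U+00FF
            code_point = 0xC0 + (data[i + 1] - 0x80)
            out += chr(code_point).encode(codepage, errors="replace")
            i += 1
        else:
            out += b"?"  # assumption: anything else is lost
        i += 1
    return bytes(out)


table = build_export_table()
print(table[0x82])                        # CP850 e-acute as UTF-8
print(import_limited("é".encode("utf-8")))
print(import_limited("€".encode("utf-8")))
```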

If your system's native character set is not CP850 you will need
translation tables to and from that character set.


The pros and cons of both methods.
==================================


External editor                       Translation tables

Pros

 Support for any Unicode character    Easy installation.
 that the external editor and the     Editing and formatting text
 underlying OS support.               works as one is used to.

Cons

 External editors normally only       The set of usable characters is
 support entering text "lines"        limited to those in the code page
 separated by a line separator.       in effect at system startup
 Fidonet text consists of             (CP850 for Western Europe).
 "paragraphs" separated by a CR.      Incoming non-ASCII is limited to
 The two methods do not mix well.     U+00C0 - U+00FF.
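
The mismatch between "lines" and "paragraphs" in the table above can
be shown with a Python sketch that turns editor output into Fidonet
style text. It assumes that a blank line in the editor marks the end
of a paragraph and that the lines in between belong together. That
is a guess at what the writer meant, so things like quoted text or
tables would need more care. The function name lines_to_paragraphs
is made up.

```python
def lines_to_paragraphs(text: str) -> str:
    """Join editor lines into paragraphs, each terminated by a CR."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "".join(p + "\r" for p in paragraphs)


print(repr(lines_to_paragraphs("one\ntwo\n\nthree\n")))
```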



                  __
                 /  \
                /|oo \
               (_|  /_)
                _`@/_ \    _
               |     | \   \\
               | (*) |  \   ))
  ______       |__U__| /  \//
 / FIDO \       _//|| _\   /
(________)     (_/(_|(____/
(c) John Madil
