Remaining small localization issue

Yongwei Wu

Aug 25, 2007, 4:05:46 AM
to vim...@googlegroups.com
Hi Bram and Vim gurus,

I am glad to see that in the last few years most of the localization
problems in Vim have disappeared. I am able to switch languages and
encodings swiftly with the following simple _vimrc (tested with the
latest GVim on Windows XP):

--------------------------- begin _vimrc ---------------------------
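if $LANG !~ '\.' || $LANG =~? '\.UTF-8$'
  " No codeset in $LANG, or the codeset is UTF-8: use UTF-8 internally.
  set encoding=utf-8
else
  " Otherwise use the codeset that follows the dot in $LANG (e.g. GBK).
  let &encoding=matchstr($LANG, '\.\zs.*')
  let &fileencodings='ucs-bom,utf-8,' . &encoding
endif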

source $VIMRUNTIME/vimrc_example.vim
--------------------------- end _vimrc ---------------------------

I have verified that menus, tool tips, welcome screen, and prompt
messages are correct with the following values of LANG:

* zh_CN.UTF-8
* zh_TW.UTF-8
* zh_CN.GBK
* zh_TW.Big5
* ja_JP.SJIS
* fr_FR.Latin1
* de_DE.Latin1

However, Japanese, French, and German have problems with UTF-8:

* ja_JP.UTF-8
* fr_FR.UTF-8
* de_DE.UTF-8

French and German cannot display the accented letters in the prompt
messages correctly. E.g., "Détacher ce menu" becomes "D<e9>tacher ce
menu", and "E15: ungültiger Ausdruck" becomes "E15: ung<fc>ltiger
Ausdruck". However, the translations for "Press ENTER or type command
to continue" are always correct, and all other things are OK.
Japanese has all the prompt messages wrong, and also has the wrong
welcome screen; menus and tool tips are correct.

I am not sure whether this is a problem in the program or in the
translation files, but I think it is worth reporting and recording.

I found another minor issue during the test. When $LANG=='zh_CN' (the
default value on Windows when the locale is set to Chinese (PRC), with no
manual setting of the LANG environment variable), I would like to
choose UTF-8 as the default encoding (the above _vimrc does this).
The welcome message and prompt messages are then displayed
incorrectly. To work around this, I added the following code:

--------------------------- begin code ---------------------------
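if $LANG !~ '\.'
  " $LANG has no codeset: ask for the UTF-8 messages explicitly.
  exec 'language messages ' . $LANG . '.UTF-8'
  " Delete the menus that have already been loaded.
  runtime! delmenu.vim
endif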
--------------------------- end code ---------------------------

Then LANG settings like "zh_CN" work perfectly (of course, ja_JP
and fr_FR still have the same problem as above). However, a
side effect appears: any menus created in plug-ins appear
unlocalized. E.g., if a plug-in creates submenus under "Tools", then
both "Outils" and "Tools" appear if LANG=fr_FR. There are no
problems if the menu creation is delayed until after the first buffer is
loaded (au BufEnter *).
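
For plug-in authors, delaying the menu creation might look roughly like
the sketch below. It is only an illustration: the menu entry
"Tools.Say Hello", the group name, and the function name are made up,
and I assume a Vim with the +menu feature.

```vim
" Sketch only: the menu entry and the echoed text are placeholders.
function! s:AddPluginMenu()
  " Create the menu only once, on the first BufEnter.
  if exists('s:menu_added')
    return
  endif
  let s:menu_added = 1
  amenu Tools.Say\ Hello :echo "hello from the plug-in"<CR>
endfunction

if has('menu')
  augroup DelayedPluginMenu
    autocmd!
    " Defer the menu creation until the first buffer is entered.
    autocmd BufEnter * call s:AddPluginMenu()
  augroup END
endif
```

The guard variable keeps the function from adding the entry again on
every later BufEnter.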

Best regards,

Yongwei

--
Wu Yongwei
URL: http://wyw.dcweb.cn/

Tony Mechelynck

Aug 25, 2007, 5:33:24 AM
to vim...@googlegroups.com
[...]

The above errors seem to indicate that Vim has been reading translation texts
which were in Latin1, thinking that they were in UTF-8. Codepoints U+0000 -
U+00FF have the same ordinals in both encodings, but their byte
representation diverges above 0x7F: U+00E9 is e-acute and U+00FC is u-umlaut,
each a single byte (0xE9, 0xFC) in Latin1, but in UTF-8 they each require
two bytes.
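
The difference in byte length can be seen from within Vim itself. The
following is a minimal sketch, not taken from any runtime file; the
variable names are arbitrary. It should report one byte for the Latin1
form of e-acute and two bytes for the UTF-8 form.

```vim
" "\xe9" is stored as the single byte 0xE9, i.e. e-acute in Latin1.
let latin1_bytes = "\xe9"
" iconv() re-encodes the string; conversion between latin1 and utf-8
" is available even when Vim lacks the +iconv feature.
let utf8_bytes = iconv(latin1_bytes, 'latin1', 'utf-8')
" strlen() counts bytes, not characters.
echo strlen(latin1_bytes) strlen(utf8_bytes)
```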


Best regards,
Tony.
--
(letter from Mark to Mike, about the film's probable certificate)
I would like to get back to the Censor and agree to lose the shits, take
the odd Jesus Christ out and lose Oh fuck off, but to retain 'fart in
your general direction', 'castanets of your testicles' and 'oral sex'
and ask him for an 'A' rating on that basis.
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

Yongwei Wu

Aug 25, 2007, 9:26:23 PM
to vim...@googlegroups.com
On 25/08/07, Tony Mechelynck <antoine.m...@gmail.com> wrote:

>
> Yongwei Wu wrote:
> >
> > French and German cannot display the accented letters in the prompt
> > messages correctly. E.g., "Détacher ce menu" becomes "D<e9>tacher ce
> > menu", and "E15: ungültiger Ausdruck" becomes "E15: ung<fc>ltiger
> > Ausdruck". However, the translations for "Press ENTER or type command
> > to continue" are always correct, and all other things are OK.
> > Japanese has all the prompt message wrong, and also has the wrong
> > welcome screen; menus and tool tips are correct.
> >
> > I am not sure whether this is a problem in the program or in the
> > translation files, but I think it is worth reporting and recording.
> [...]
>
> The above errors seem to indicate that Vim has been reading translation texts
> which were in Latin1, thinking that they were in UTF-8. Codepoints U+0000 -
> U+00FF have the same ordinals in both encodings, but their bytecode
> representation diverges above 0x7F: U+00E9 is e-acute and U+00FC is u-umlaut,
> but in UTF-8 they each require two bytes.

Thanks, Tony. That was my thought too. However, since I am not
familiar with the .mo files, I cannot dig any further into what went wrong.

Tony Mechelynck

Aug 25, 2007, 10:10:48 PM
to vim...@googlegroups.com, Bram Moolenaar

The message translations ("sources" *.po and "binaries" *.mo) live in
<buildDirectory>/vim<version>/src/po/. Inspection shows that for most
languages (including French and German) there is no "encoding" in the
filename: compare (French) fr.po fr.mo with (mainland Chinese)
zh_CN.UTF-8.[pm]o zh_CN.cp936.[pm]o zh_CN.[pm]o

The following (extract from the log of "make install" on Linux) installs the
translations under $VIMRUNTIME (which is, here, at its Unix default, i.e.,
/usr/local/share/vim/vim71/):

> make[2]: Entering directory `/root/.build/vim/vim71/src/po'
> make[3]: Entering directory `/root/.build/vim/vim71/src/po'
> make[3]: Leaving directory `/root/.build/vim/vim71/src/po'
> for lang in af ca cs de en_GB es fr ga it ja ko no pl ru sk sv uk vi zh_CN zh_CN.UTF-8 zh_TW zh_TW.UTF-8; do \
> dir=/usr/local/share/vim/vim71/lang/$lang/; \
> if test ! -x "$dir"; then \
> mkdir $dir; chmod 755 $dir; \
> fi; \
> dir=/usr/local/share/vim/vim71/lang/$lang/LC_MESSAGES; \
> if test ! -x "$dir"; then \
> mkdir $dir; chmod 755 $dir; \
> fi; \
> if test -r $lang.mo; then \
> cp $lang.mo $dir/vim.mo; \
> chmod 644 $dir/vim.mo; \
> fi; \
> done
> make[2]: Leaving directory `/root/.build/vim/vim71/src/po'

For some reason the zh_??.cp936.mo files are apparently not installed.

src/po/README.txt starts as follows:

> TRANSLATING VIM MESSAGES
>
> In this directory you will find xx.po files, where "xx" is a language code.
> Each file contains the translation of English Vim messages for one language.
> The files are in "po" format, used by the gettext package. Please refer to
> the gettext documentation for more information.
>
> The GNU gettext library, starting with version 0.10.37, supports converting
> messages from one encoding to another. This requires that it was compiled
> with HAVE_ICONV. The result is that the messages may be in any encoding
> supported by iconv and will be automatically converted to the currently used
> encoding.
>
> The GNU gettext library, starting with version 0.10.36, uses a new format for
> some encodings. This follows the C99 standard for strings. It means that
> when a multi-byte character includes the 0x5c byte, this is not recognized as
> a backslash. Since this format is incompatible with Solaris, Vim uses the old
> format. This is done by setting the OLD_PO_FILE_OUTPUT and OLD_PO_FILE_INPUT
> environment variables. When you use the Makefile in this directory that will
> be done for you. This does NOT work with gettext 0.10.36. Don't use it, get
> 0.10.37.

It seems that the run-time conversion of Latin1 to UTF-8 (required for
codepoints 0x80 to 0xFF when the messages are in Latin1 and 'encoding' is
UTF-8) does not happen on your system. I wonder how you can check that your
gettext library was compiled with HAVE_ICONV.

Bram, what do you think?


Best regards,
Tony.
--
Man is the only animal that can remain on friendly terms with the
victims he intends to eat until he eats them.
-- Samuel Butler

Yongwei Wu

Aug 25, 2007, 10:53:57 PM
to vim...@googlegroups.com, Bram Moolenaar
On 26/08/07, Tony Mechelynck <antoine.m...@gmail.com> wrote:
> It seems that the run-time conversion of Latin1 to UTF-8 (required
> for codepoints 0x80 to 0xFF when the messages are in Latin1 and
> 'encoding' is UTF-8) does not happen on your system. I wonder how
> you can check that your gettext library was compiled with
> HAVE_ICONV.

I am really not sure. I downloaded gettext-win32 and libiconv-win32
from http://sourceforge.net/project/showfiles.php?group_id=25167, and
extracted iconv.dll and intl.dll into the path. I guess other Windows
users may do something similar. And my Windows Vim is built with
+gettext/dyn and +iconv/dyn.

I *guess* there are no problems here.
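
For reference, what Vim itself reports can be checked with something
like the sketch below. Note the limits: this only shows Vim's own
features and the current language, not whether the gettext DLL was
compiled with HAVE_ICONV. My understanding (an assumption, not verified
here) is that with +iconv/dyn, has('iconv') is true only when the DLL
could actually be loaded.

```vim
" 1 if Vim was compiled with message translation support.
echo 'gettext:' has('gettext')
" 1 if iconv() conversions are available to Vim.
echo 'iconv:' has('iconv')
" Without an argument, :language prints the current language settings.
language
```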

Patrick Texier

Aug 26, 2007, 3:52:52 PM
to vim...@googlegroups.com
On Sun, 26 Aug 2007 10:53:57 +0800, Yongwei Wu wrote:
<8eeef57f0708251953l1a7...@mail.gmail.com> :

> I am really not sure. I downloaded gettext-win32 and libiconv-win32
> from http://sourceforge.net/project/showfiles.php?group_id=25167, and
> extracted iconv.dll and intl.dll in the path. I guess other Windows
> users may do something similar. And my Windows Vim are built with
> +gettext/dyn and +iconv/dyn.
>
> I *guess* there are no problems here.

The libintl.dll (gettext) in the Vim 7.1 Windows distribution is too old. On
Windows 98, I'm using version 0.14.1 (2004) and I have no problems with
UTF-8 files and the French Latin-1 vim.mo.

--
Patrick Texier

Bram Moolenaar

Aug 26, 2007, 4:28:24 PM
to Yongwei Wu, vim...@googlegroups.com

Yongwei Wu wrote:

I suspect this is caused by the gettext library used. The ordinary one
doesn't support encoding conversion. There is also one that uses iconv
for conversion, but this means you need to use a library that's much
bigger. It should be possible to make a gettext library that at least
knows to convert latin1 to utf-8, and use the MS-Windows libraries for
most other conversions. I don't know if this exists.

If you did install a gettext that supports iconv, make sure that Vim
uses it. You may need to delete a gettext.dll in the Vim directory.

> Another minor issue I found during the test. When $LANG=='zh_CN' (the
> default value on Windows when locale is set to Chinese (PRC), with no
> manual setting of the LANG environment variable), I would like to
> choose UTF-8 as the default encoding (the above _vimrc does this).
> Then the display for the welcome message and prompt messages will be
> incorrect. In order to work around it, I add the following code:
>
> --------------------------- begin code ---------------------------
> if $LANG !~ '\.'
> exec 'language messages ' . $LANG . '.UTF-8'
> runtime! delmenu.vim
> endif
> --------------------------- end code ---------------------------
>
> Then LANG settings like "zh_CN" will work perfectly (of course, ja_JP
> and fr_FR still have the same problem as above). However, a
> side-effect appears. Any menus created in plug-ins will appear
> unlocalized. E.g., if a plug-in creates submenus under "Tools", then
> both "Outils" and "Tools" will appear if LANG=fr_FR. There are no
> problems if the menu creation is delayed to after the first buffer is
> loaded (au BufEnter *).

This is caused by a gettext library that doesn't support specifying an
encoding different from that of the system locale. This feature only appeared
in more recent versions of the library.

--
Vi is clearly superior to emacs, since "vi" has only two characters
(and two keystrokes), while "emacs" has five. (Randy C. Ford)

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Yongwei Wu

Aug 26, 2007, 9:45:00 PM
to vim...@googlegroups.com, Bram Moolenaar
Hi Patrick,

You hit on the point! After I removed libintl.dll from the vim71
directory, and copied my intl.dll to it (renamed to libintl.dll), the
problem with translation disappeared for all the three languages that
had had problems with UTF-8!

So Bram, I think in the next release of Vim (7.2) you should bundle an
updated version of libintl.dll :-).

Yongwei Wu

Aug 26, 2007, 10:05:24 PM
to Bram Moolenaar, vim...@googlegroups.com
On 27/08/07, Bram Moolenaar <Br...@moolenaar.net> wrote:

>
> Yongwei Wu wrote:
>
> > French and German cannot display the accented letters in the prompt
> > messages correctly. E.g., "Détacher ce menu" becomes "D<e9>tacher ce
> > menu", and "E15: ungültiger Ausdruck" becomes "E15: ung<fc>ltiger
> > Ausdruck". However, the translations for "Press ENTER or type command
> > to continue" are always correct, and all other things are OK.
> > Japanese has all the prompt message wrong, and also has the wrong
> > welcome screen; menus and tool tips are correct.
> >
> > I am not sure whether this is a problem in the program or in the
> > translation files, but I think it is worth reporting and recording.
>
> I suspect this is caused by the gettext library used. The ordinary one
> doesn't support encoding conversion. There is also one that uses iconv
> for conversion, but this means you need to use a library that's much
> bigger. It should be possible to make a gettext library that at least
> knows to convert latin1 to utf-8, and use the MS-Windows libraries for
> most other conversions. I don't know if this exists.
>
> If you did install a gettext that supports iconv, make sure that Vim
> uses it. You may need to delete a gettext.dll in the Vim directory.

You are right. After I removed libintl.dll, and copied a new version,
the problem was gone.

The new version I use is 45,056 bytes. However, it requires
iconv.dll, which is 892,928 bytes.

Maybe we should add some Windows notes on the download page. I can
write something if you think this is the way to go (not wanting to
ship with iconv.dll).

> > Another minor issue I found during the test. When $LANG=='zh_CN' (the
> > default value on Windows when locale is set to Chinese (PRC), with no
> > manual setting of the LANG environment variable), I would like to
> > choose UTF-8 as the default encoding (the above _vimrc does this).
> > Then the display for the welcome message and prompt messages will be
> > incorrect. In order to work around it, I add the following code:
> >
> > --------------------------- begin code ---------------------------
> > if $LANG !~ '\.'
> > exec 'language messages ' . $LANG . '.UTF-8'
> > runtime! delmenu.vim
> > endif
> > --------------------------- end code ---------------------------
> >
> > Then LANG settings like "zh_CN" will work perfectly (of course, ja_JP
> > and fr_FR still have the same problem as above). However, a
> > side-effect appears. Any menus created in plug-ins will appear
> > unlocalized. E.g., if a plug-in creates submenus under "Tools", then
> > both "Outils" and "Tools" will appear if LANG=fr_FR. There are no
> > problems if the menu creation is delayed to after the first buffer is
> > loaded (au BufEnter *).
>
> This is caused by a gettext library that doesn't support specifying the
> encoding different from the system locale. This feature only appeared
> in more recent versions of the library.

The delmenu problem persists. If I create a menu in the plugin, it
appears unlocalized. If I do the same manually in Command-line mode, it
appears localized. It should not be related to gettext.

Oh, wait, do you mean delmenu is not necessary? ... That is true. The
problem itself is gone :-).

Cyril Slobin

Aug 27, 2007, 7:28:28 AM
to vim...@googlegroups.com
On 8/27/07, Yongwei Wu <wuyo...@gmail.com> wrote:

> You hit on the point! After I removed libintl.dll from the vim71
> directory, and copied my intl.dll to it (renamed to libintl.dll), the
> problem with translation disappeared for all the three languages that
> had had problems with UTF-8!

Another problem appears: with the new libintl.dll, Russian messages appear
in the proper encoding, but I cannot disable them! I have tried :language C,
or :language english, or :language en_US, or :language message C, or
anything I can imagine -- messages are still Russian, not the default English.

--
Cyril Slobin <slo...@ice.ru> `When I use a word,' Humpty Dumpty said,
http://wagner.pp.ru/~slobin/ `it means just what I choose it to mean'

Yongwei Wu

Aug 27, 2007, 8:49:24 AM
to vim...@googlegroups.com
On 27/08/07, Cyril Slobin <slo...@ice.ru> wrote:
>
> On 8/27/07, Yongwei Wu <wuyo...@gmail.com> wrote:
>
> > You hit on the point! After I removed libintl.dll from the vim71
> > directory, and copied my intl.dll to it (renamed to libintl.dll),
> > the problem with translation disappeared for all the three
> > languages that had had problems with UTF-8!
>
> Another problem appears: with new libintl.dll Russian messages
> appears in the proper encoding, but I cannot disable them! I have
> write :language C, or :language english, or :language en_US, or
> :language message C, or anything I can imagine -- messages still are
> Russian, not default English.

Exactly! This behaviour is reproduced here. Now one can set the
language only through $LANG at start-up, but not later.

Can any gettext experts help?

Bram Moolenaar

Aug 27, 2007, 4:55:08 PM
to Yongwei Wu, vim...@googlegroups.com

Yongwei Wu wrote:

> > > French and German cannot display the accented letters in the prompt
> > > messages correctly. E.g., "Détacher ce menu" becomes "D<e9>tacher ce
> > > menu", and "E15: ungültiger Ausdruck" becomes "E15: ung<fc>ltiger
> > > Ausdruck". However, the translations for "Press ENTER or type command
> > > to continue" are always correct, and all other things are OK.
> > > Japanese has all the prompt message wrong, and also has the wrong
> > > welcome screen; menus and tool tips are correct.
> > >
> > > I am not sure whether this is a problem in the program or in the
> > > translation files, but I think it is worth reporting and recording.
> >
> > I suspect this is caused by the gettext library used. The ordinary one
> > doesn't support encoding conversion. There is also one that uses iconv
> > for conversion, but this means you need to use a library that's much
> > bigger. It should be possible to make a gettext library that at least
> > knows to convert latin1 to utf-8, and use the MS-Windows libraries for
> > most other conversions. I don't know if this exists.
> >
> > If you did install a gettext that supports iconv, make sure that Vim
> > uses it. You may need to delete a gettext.dll in the Vim directory.
>
> You are right. After I removed libintl.dll, and copied a new version,
> the problem was gone.

I'm glad you figured it out.

> The new version I use is 45,056 bytes. However, it requires
> iconv.dll, which is 892,928 bytes.

This was discussed before. I don't want to include a dll that requires
a megabyte of space, especially since it's rarely used. We really need
a libintl.dll that dynamically loads iconv.dll, or uses the MS-Windows
native conversion support.

> Maybe we should add some Windows notes on the download page. I can
> write something if you think this is the way to go (not wanting to
> ship with iconv.dll).

Or make a tip on the wiki.

--
hundred-and-one symptoms of being an internet addict:
257. Your "hundred-and-one" lists include well over 101 items, since you
automatically interpret all numbers in hexadecimal notation.
(hex 101 = decimal 257)

Tony Mechelynck

Aug 27, 2007, 9:27:57 PM
to vim...@googlegroups.com
Cyril Slobin wrote:
> On 8/27/07, Yongwei Wu <wuyo...@gmail.com> wrote:
>
>> You hit on the point! After I removed libintl.dll from the vim71
>> directory, and copied my intl.dll to it (renamed to libintl.dll), the
>> problem with translation disappeared for all the three languages that
>> had had problems with UTF-8!
>
> Another problem appears: with new libintl.dll Russian messages appears
> in the proper encoding, but I cannot disable them! I have write :language C,
> or :language english, or :language en_US, or :language message C, or
> anything I can imagine -- messages still are Russian, not default English.
>

The following (at the *very top* of the vimrc) works for me to disable French
messages. It doesn't work if called after menus have been set up.


" set Vim (not vi) defaults. This is only required when invoked with -u
if &cp
  set nocompatible
endif
" force English messages and menus
if has('unix')
  language messages C
else
  language messages en
endif
" set a number of useful settings
runtime vimrc_example.vim
" add additional customizations below this line

Best regards,
Tony.
--
A diplomat is a man who can convince his wife she'd look stout in a fur
coat.

Patrick Texier

Aug 28, 2007, 8:46:48 AM
to vim...@googlegroups.com
On Mon, 27 Aug 2007 09:45:00 +0800, Yongwei Wu wrote:

> > libintl.dll (gettext) is too old in Vim7.1 Windows distribution. On
> > Windows 98, I'm using 0.14.1 version (2004) and I have no problems with
> > UTF-8 files and french Latin-1 vim.mo
>
> You hit on the point! After I removed libintl.dll from the vim71
> directory, and copied my intl.dll to it (renamed to libintl.dll), the
> problem with translation disappeared for all the three languages that
> had had problems with UTF-8!

I had big problems with gettext 0.14.1 (File Explorer crashing on W98), so
I am now using the old libintl.dll, a UTF-8 French .mo file, and :let
$LANG='fr.UTF-8' for UTF-8 editing.

--
Patrick Texier

Cyril Slobin

Aug 28, 2007, 6:23:13 PM
to vim...@googlegroups.com
On 8/28/07, Tony Mechelynck <antoine.m...@gmail.com> wrote:

> The following (at the *very top* of the vimrc) works for me to disable French
> messages. It doesn't work if called after menus have been set up.

Tested, failed. :-(

Cyril Slobin

Aug 28, 2007, 6:29:27 PM
to vim...@googlegroups.com
On 8/28/07, Patrick Texier <p.te...@genindre.org> wrote:

> I had big problems with gettext 0.14.1 (File Explorer crashing W98) and

Oh, you've enlightened me! My File Explorer crashed too, but I never blamed
gettext for this. In the end I decided that I need non-English messages very
rarely, and removed both versions of libintl from the vim directory entirely.
A crude solution, but it works for me.

Yongwei Wu

Aug 28, 2007, 10:49:30 PM
to vim...@googlegroups.com

I do not use File Explorer. I have verified that VimExplorer (script#1950)
has no problems with iconv-enabled gettext 0.13.1/0.16.1 (which will
cure your small glitch about "Détacher ce menu" too). It is very
powerful, and works nicely with the mouse and keyboard.
