UTF-8 in FreeBSD

From banana_wiki
Revision as of 15:50, 10 April 2014 by Bananafish

Unicode is a character encoding standard that is kept compatible with the Universal Coded Character Set (UCS) defined by ISO/IEC 10646. Unicode was designed to replace earlier character encodings such as the American Standard Code for Information Interchange (US-ASCII) and ISO/IEC 8859.

UTF-8, which is also described in RFC 3629, is a variable-length Unicode character encoding that is backwards compatible with US-ASCII. That is, all US-ASCII characters have the same encoding under both US-ASCII and UTF-8. Due to the widespread use of US-ASCII in computing environments, this backwards compatibility makes UTF-8 convenient to deploy and therefore a popular choice for multilingual computing environments.
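The backwards compatibility described above can be seen by dumping the bytes of a few characters. This is a minimal sketch using only printf, od and tr; the variable names are illustrative and the two sample characters are arbitrary choices.

```shell
# 'A' is U+0041: a single byte, identical in US-ASCII and UTF-8.
ascii_hex=$(printf 'A' | od -An -tx1 | tr -d ' \n')

# U+00E9 (e with acute accent) lies outside US-ASCII and takes two bytes
# in UTF-8. The bytes are written as octal escapes so that the script
# itself stays pure ASCII.
eacute_hex=$(printf '\303\251' | od -An -tx1 | tr -d ' \n')

echo "U+0041 -> $ascii_hex"
echo "U+00E9 -> $eacute_hex"
```

The first line shows the familiar ASCII value 41, while the second shows the two-byte sequence c3a9, which is why UTF-8 is called a variable-length encoding.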

FreeBSD, like many UNIX-based operating systems, is unfortunately not configured to use UTF-8 by default. This sometimes causes confusion about whether Unicode is supported on FreeBSD. Fortunately, it is easy to enable UTF-8 on FreeBSD.

First, we can check which UTF-8 locales are available to us:

locale -a | grep '\.UTF-8$'
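To check for one particular locale rather than reading the whole list, something like the following sketch may help. The locale name en_US.UTF-8 is only an example, and the variable names are made up for illustration.

```shell
# Locale we would like to use; substitute your own.
want="en_US.UTF-8"

# -F treats the name as a fixed string (so '.' is not a wildcard),
# -x requires the whole line to match, -q suppresses output.
if locale -a 2>/dev/null | grep -Fqx "$want"; then
    status="available"
else
    status="missing"
fi

echo "$want: $status"
```

Other systems may spell locale names differently (for example en_US.utf8), so a "missing" result on a non-FreeBSD machine does not necessarily mean the locale is absent.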

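Now if we want to change the locale for the entire system, we open /etc/login.conf and add the following capabilities to the default login class. If the class already has a setenv capability, merge LC_COLLATE=C into its comma-separated list rather than adding a second setenv line.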

        :charset=UTF-8:\
        :lang=en_US.UTF-8:\
        :setenv=LC_COLLATE=C:

If we want to make the locale change on a per-user basis instead, we add the above to ~/.login_conf in that user's home directory.
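According to login.conf(5), a per-user ~/.login_conf is only consulted for a record named me, so the capabilities need to sit under that entry. The sketch below writes such a file to a scratch path first so nothing is overwritten by accident; the TARGET variable and the login_conf.example file name are hypothetical, and the locale is just the example used above.

```shell
# Hypothetical scratch path; set TARGET="$HOME/.login_conf" once the
# generated file looks right.
target="${TARGET:-./login_conf.example}"

# The quoted 'EOF' keeps the trailing backslashes literal.
cat > "$target" <<'EOF'
me:\
	:charset=UTF-8:\
	:lang=en_US.UTF-8:\
	:setenv=LC_COLLATE=C:
EOF

echo "wrote $target"
```

Writing to a scratch file and copying it into place by hand is a deliberate choice here, since an existing ~/.login_conf may already hold other settings that should be merged rather than replaced.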

Now we rebuild the login class capabilities database (as root):

cap_mkdb /etc/login.conf

Finally, we reboot or log out of all sessions so that new logins pick up the new locale.
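After logging back in, a quick way to confirm the change is to ask locale for the active character map. The following is a small sketch; the function name check_utf8 is made up for this example.

```shell
# Report whether the current session is using a UTF-8 character map.
check_utf8() {
    case "$(locale charmap 2>/dev/null)" in
        UTF-8) echo "UTF-8 locale active" ;;
        *)     echo "not UTF-8: LANG=${LANG:-unset}" ;;
    esac
}

check_utf8
```

Running plain locale with no arguments prints every LC_* category, which is useful for checking that LC_COLLATE is still C while the other categories follow the new lang setting.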