Changing to UTF-8

From TNG Wiki

Notes on changing a TNG Site and database to UTF-8 provided by User:TheKiwi


Create a copy of your database

1 - Use phpMyAdmin to make a copy of your database. Always make a copy in case something goes wrong!

1.1 - In the databases list on the left, click on the database to select it.

1.2 - On the right side, click on the Operations tab. In the section called "Copy database to", enter a name for your new database, make sure that "Structure and data" and "CREATE DATABASE before copying" are both selected, then click "Go".

When this finishes, you will have a copy of your database.
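If you have shell access, a dump file is another way to keep a safety copy. The sketch below only builds a mysqldump command line and prints it. It is not part of the phpMyAdmin steps above: it assumes the mysqldump client is installed on your server, and the database name, user name and output file are placeholders.

```python
import shlex

def build_dump_command(database, user, outfile, host="localhost"):
    """Return a mysqldump argument list for a backup of one database."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "-p",                        # prompt for the password instead of storing it
        f"--result-file={outfile}",  # write the dump to this file
        database,
    ]

# Placeholder names: replace them with your own database, user and file name.
cmd = build_dump_command("tng_db", "tng_user", "tng_backup.sql")
print(shlex.join(cmd))
```

Run the printed command in a shell on the server and keep the resulting file somewhere safe before you change anything.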

Change database to UTF-8

2 - Run a script to change the structure of your database to UTF-8.

2.1 - Download the script from

http://www.phoca.cz/documents/38-tools/154-how-to-change-collation-in-database
and upload it to your site's folder. Then open the page on your site in your browser:
http://URLToYourSite/tool_phoca_changing_collation/
(NOTE - replace "URLToYourSite" above with the actual URL of your site.)

2.2 - Fill out the 5 boxes with the values for your site, including a collation of the form

utf8_xxxxxx_ci
where xxxxxx is whatever you choose as your utf8 collation, e.g. swedish, general, unicode. The collation affects how characters are sorted: does ø sort with o or come at the end of the alphabet? Does å sort with a or come at the end of the alphabet? Does ß sort with s or with ss? Choose the collation based on the principal language that your site is about. If you need help selecting a collation for your site, please read Selecting your TNG Database Collation. This page has some information about how some characters are handled in different collations:
http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html
  • I had problems using utf8_general_ci. Text using accented vowels in French was not properly displayed in any of the browsers (IE, Chrome, Firefox). When I used collation utf8_swedish_ci there were no problems. [posted by Chuck Filteau]
Once you've decided which collation you want to use and entered it into the 5th box, click the Submit button. This changes the collation of the database, tables and columns to the collation you've chosen. Progress is shown as the script runs. The last table altered is tng_xnotes, so if you don't see it as the last item listed in the output, the script didn't complete.
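To make the sorting question in step 2.2 concrete, here is a small illustration in Python. It does not use MySQL's collation rules. It only contrasts two orderings: plain code point order, where Å and Ö land after Z (roughly what a Swedish reader expects), and an accent-folded order, where they sort with A and O (roughly how utf8_general_ci treats them). The names and the fold_accents helper are made up for this example.

```python
import unicodedata

def fold_accents(word):
    """Strip combining marks so that 'Å' compares like 'A' (illustration only)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

names = ["Zorn", "Åberg", "Adams", "Öst"]

# Code point order: accented capitals come after 'Z'.
by_code_point = sorted(names)

# Accent-folded order: accented letters sort with their base letter.
by_base_letter = sorted(names, key=fold_accents)

print(by_code_point)   # ['Adams', 'Zorn', 'Åberg', 'Öst']
print(by_base_letter)  # ['Åberg', 'Adams', 'Öst', 'Zorn']
```

If the second ordering would look wrong to visitors of your site, a language-specific collation is probably the better choice.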

2.3 - When the script has completed (on a large database it can take some time), a message appears at the bottom of the page with a link back to the Home Page. Until this link appears, the script hasn't completed.

2.4 - When the script has completed, you should delete the folder "tool_phoca_changing_collation" from your website.
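For readers who want to see what a collation change looks like in SQL, the sketch below generates standard MySQL ALTER statements for a database and a list of tables. This is an assumption about the kind of statements such a tool issues, not a description of the Phoca script's actual code. The database name tng_db and the table tng_people are placeholders; tng_xnotes is the table mentioned in step 2.2. The sketch only prints the statements and does not connect to any database.

```python
def collation_statements(database, tables, collation="utf8_general_ci"):
    """Build ALTER statements that switch a database and its tables to one collation."""
    charset = collation.split("_", 1)[0]  # "utf8_general_ci" -> "utf8"
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        # CONVERT TO changes the table default and every text column in the table.
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements

# Placeholder database and table names, for illustration only.
for line in collation_statements("tng_db", ["tng_people", "tng_xnotes"], "utf8_swedish_ci"):
    print(line)
```

Try statements like these on the copy of your database from step 1 first, never on the only copy.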


Change TNG settings

3 - Change the settings that TNG is using for Character Set.

3.1 - In TNG Admin ------> Setup ------> General Settings ------> Language, change the Character Set to "UTF-8" (without the quotation marks) and save the changes.

3.2 - Do the same in TNG Admin ------> Languages ------> each language you support, changing each one to UTF-8.

Make sure that you are using UTF-8 Encoded language files

4.1 - You can download UTF-8 encoded files in all of the languages that TNG supports from the TNG Downloads page - the same page you download your TNG software and TNG updates from. There is a download for each language supported and that download includes the files text.php, admintext.php, alltext.php and an empty cust_text.php file.

4.2 - If you have made changes to any of the language files for your own purposes, move all of those changes to the file cust_text.php for each language. That way you can use the "standard" language files as supplied by TNG and keep your changes only in cust_text.php, so that you don't lose them in future TNG updates. Use the empty cust_text.php that came with your UTF-8 encoded language files from 4.1 above.

If you have a cust_text.php file that you need to convert to UTF-8, the steps below will help you do that on Macintosh and Windows computers.

On a Macintosh system

4.3 - Here's what works on Macintosh. You need TextWrangler - a free text editor. If you don't already have TextWrangler, or its big brother BBEdit, you can download the free TextWrangler from

http://www.barebones.com/products/textwrangler/

4.3.1 - For each language folder open the file cust_text.php in TextWrangler

4.3.2 - After the file is open, go to the File menu and choose "Reopen using Encoding" ------> "Western (Windows Latin 1)".

4.3.3 - At the bottom of the TextWrangler window is a pop-up menu that should now say "Western (Windows Latin 1)". Click on it and choose "Unicode™ (UTF-8, no BOM)". Note: starting with TextWrangler version 3.5, the UI terminology changed and "UTF-8, no BOM" is no longer listed as an option. Per the Release Notes, "Unicode (UTF-8, no BOM)" has been renamed to "Unicode (UTF-8)".

4.3.4 - Save the file.
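The same conversion can be sketched as a small script, which may help if you have many language folders. This is a minimal sketch, assuming your cust_text.php is currently encoded as Windows Latin 1 (cp1252 in Python); the function names and the example path are hypothetical. It keeps a .bak copy of the original file.

```python
from pathlib import Path

def latin1_to_utf8(data):
    """Decode Windows Latin 1 (cp1252) bytes and re-encode them as UTF-8."""
    # Python's "utf-8" codec never writes a BOM ("utf-8-sig" would).
    return data.decode("cp1252").encode("utf-8")

def convert_file(path):
    """Rewrite one file in place, keeping the original bytes as <name>.bak."""
    source = Path(path)
    original = source.read_bytes()
    Path(str(source) + ".bak").write_bytes(original)
    source.write_bytes(latin1_to_utf8(original))

# Example with a hypothetical path:
# convert_file("languages/English/cust_text.php")
```

Run it only once per file. A file that is already UTF-8 would be converted a second time and its accented characters would come out garbled. A few byte values are undefined in cp1252 and raise UnicodeDecodeError, which usually means the file was not Windows Latin 1 to begin with.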

On a Windows system

4.4 - On Windows, use a text editor that can save files in UTF-8 format WITHOUT writing the BOM (Byte Order Mark) to the file. A BOM causes odd characters to appear on TNG pages, and your Admin Menu may display a blank page, because the BOM puts characters before the <?php, which should be first in the file.

4.4.1 - Notepad++ (current version) can convert the language files containing the $text variables to UTF-8 without a BOM.
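Whichever editor you use, you can check the result afterwards. The sketch below looks at the raw bytes of a language file and reports the problems described in 4.4: a leading BOM, bytes that are not valid UTF-8, or anything other than <?php at the very start of the file. The function name and the example path are hypothetical.

```python
import codecs

def check_language_file(data):
    """Return a list of problems found in the raw bytes of a PHP language file."""
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("starts with a UTF-8 BOM")
        data = data[len(codecs.BOM_UTF8):]  # keep checking the rest of the file
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    if not data.startswith(b"<?php"):
        problems.append("does not begin with <?php")
    return problems

# Example with a hypothetical path:
# with open("languages/English/cust_text.php", "rb") as f:
#     print(check_language_file(f.read()))
```

An empty list means the file passed all three checks.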

That's it. Your site should now be running in UTF-8.

Related links