Changing to UTF-8

From TNG_Wiki
Revision as of 12:08, 24 March 2018 by Chris Moss

Notes on changing a TNG Site and database to UTF-8 provided by User:TheKiwi

Create a copy or backup of your database

You can use phpMyAdmin to make a copy of your database. Always make a copy before you start, in case something goes wrong.

  1. In the databases list on the left click on the Database to select it.
  2. On the right side click on the Operations tab. In the section called "Copy database to", enter a name for your new database, make sure that "Structure and data" and "CREATE DATABASE before copying" are both selected, then click "Go".

When the operation finishes, you will have a copy of your database.

Alternatively you can back up your database using any of the methods described in Database - Backup.
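
If you have command-line access, a backup can also be scripted. The following is a minimal sketch, assuming the standard mysqldump client is installed on the server; the database name, user name and output file shown are hypothetical placeholders, and the function only builds the command so that you can inspect it before running it.

```python
from datetime import date


def build_mysqldump_command(database, user, host="localhost", out_file=None):
    """Return the argument list for a mysqldump backup of one database.

    All names passed in are placeholders chosen by the caller; nothing
    here is specific to TNG.
    """
    if out_file is None:
        # Default to a dated file name such as tng-backup-20180324.sql
        out_file = f"{database}-backup-{date.today():%Y%m%d}.sql"
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",                 # no value given: mysqldump prompts for it
        f"--result-file={out_file}",  # write the dump to this file
        database,
    ]


# Example use (hypothetical names), run from a terminal session:
#   import subprocess
#   subprocess.run(build_mysqldump_command("tng", "tnguser"), check=True)
```

Keep the resulting .sql file somewhere safe until you have confirmed that the converted site displays accented characters correctly.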

Change database to UTF-8

Mod: Change coding
Summary: Allows a simple change of character set and collation sequence for the TNG database
Mod Updated: 23 Mar 2018
Download link: 11.1.0.1
Author(s): Chris Moss
Mod Support: contact author
Contact Developer: contact author
Latest Mod: 11.1.0.1
Max TNG V: 11.1.2
Notes: only available with English instructions


You need to run a script to change the structure of your database to UTF-8. The previously recommended script has some problems so a new mod has been created which is much simpler to use and works in a flexible way.

Download the mod in the normal way, unzip, place it in your mods directory and click the install button.

After running the change coding mod

When it is installed, click on the green "Installed" line and then click the "Change database" button below it. The conversion may take several seconds; when it is complete, a summary similar to the one in the screenshot appears. Note that in the example the default language is English, but the language actually in use when the conversion was done was German. The "utf8_general_ci" collation is suitable for most languages; if you want a different one, edit the options of the mod and enter another collation, as long as it is consistent with the chosen character set.

If for any reason you want to change back to one of the other two supported character sets, ISO-8859-1 and ISO-8859-2, then choose the Edit options button on the mod and put one of these in the first box. In this case you will also need to change the collation to one that belongs to the chosen character set, such as latin1_swedish_ci, which is the default collation for latin1 (MySQL's name for its ISO-8859-1 variant).
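
What "consistent with the character set" means can be illustrated with a short sketch. It relies on the MySQL convention that a collation name starts with the name of its character set. The mapping from the labels used on this page to MySQL character set names, the function names, and the example table name are assumptions made for illustration; this is not the mod's own code, and the statement it produces is only an example of the standard MySQL syntax for converting a table.

```python
# Assumed mapping from the labels used on this page to MySQL charset names.
MYSQL_CHARSETS = {
    "UTF-8": "utf8",
    "ISO-8859-1": "latin1",
    "ISO-8859-2": "latin2",
}


def collation_matches(charset_label, collation):
    """True if the collation name belongs to the given character set."""
    mysql_charset = MYSQL_CHARSETS[charset_label.upper()]
    # MySQL collation names begin with "<charset>_", e.g. utf8_general_ci.
    return collation.lower().startswith(mysql_charset + "_")


def convert_table_sql(table, charset_label, collation):
    """Build an ALTER TABLE statement, refusing inconsistent pairs."""
    if not collation_matches(charset_label, collation):
        raise ValueError(f"{collation} does not belong to {charset_label}")
    mysql_charset = MYSQL_CHARSETS[charset_label.upper()]
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET {mysql_charset} COLLATE {collation};"
    )


# Hypothetical table name, for illustration only.
print(convert_table_sql("tng_people", "UTF-8", "utf8_general_ci"))
```

A pairing such as UTF-8 with latin1_swedish_ci is rejected by the check, which is the kind of mismatch to avoid when editing the mod's options.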

The mod alters all the appropriate settings for the different language folders, including cookies and session variables, so you shouldn't need to do anything else. Once you have changed the database, it's OK to uninstall the mod, as it shouldn't be needed again. It's also OK to run the mod again if, for instance, an incompatible table has been installed into your database or the settings have been disturbed by an update.

But don't try to import an ISO-encoded GEDCOM file into a UTF-8 database (or vice versa). There is no checking or conversion, and any accented characters will end up garbled.
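
Since the import does no checking, it can help to check a GEDCOM file yourself before importing it. The sketch below is one possible approach, not part of TNG: it reads the character set declared on the "1 CHAR" line of the GEDCOM header and also confirms that the bytes really decode as UTF-8. The function names are made up for this example.

```python
import re


def declared_charset(raw: bytes):
    """Return the value of the GEDCOM header's '1 CHAR' line, or None."""
    match = re.search(rb"^\s*1\s+CHAR\s+(\S+)", raw, re.MULTILINE)
    if match is None:
        return None
    return match.group(1).decode("ascii", "replace").upper()


def is_valid_utf8(raw: bytes) -> bool:
    """True if the whole byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def safe_for_utf8_database(raw: bytes) -> bool:
    """Conservative check: declared as UTF-8 and actually valid UTF-8."""
    return declared_charset(raw) in ("UTF-8", "UTF8") and is_valid_utf8(raw)


# Example use with a hypothetical file name:
#   with open("family.ged", "rb") as handle:
#       print(safe_for_utf8_database(handle.read()))
```

If the check fails, re-export the file from your genealogy program with UTF-8 selected rather than importing it as it is.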

Make sure that you have all the necessary UTF-8 Encoded language files

In the languages directory, there are pairs of folders for each language, e.g. French and French-UTF8, which have all the strings used in the two different character sets. If you have deleted the -UTF8 directories to save space, then these should be restored before running the change script above. Most mods will also include changes to both of these sets, normally changing the cust_text.php file.
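
To confirm that no -UTF8 folder is missing before you run the conversion, a quick check along these lines can be run against a local copy of the site. It is a minimal sketch that assumes only the naming pattern described above (for example French and French-UTF8); the function name and the path in the usage comment are hypothetical.

```python
from pathlib import Path


def missing_utf8_folders(languages_dir):
    """List language folders that have no matching '-UTF8' folder."""
    root = Path(languages_dir)
    names = {entry.name for entry in root.iterdir() if entry.is_dir()}
    return sorted(
        name
        for name in names
        if not name.endswith("-UTF8") and f"{name}-UTF8" not in names
    )


# Example use with a hypothetical path to a local copy of the site:
#   print(missing_utf8_folders("tng/languages"))
```

An empty list means every language folder has its UTF-8 counterpart.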

However, if you made changes to the cust_text.php file yourself, then you need to make sure these changes are also in the corresponding -UTF8 version of cust_text.php. The steps below will help you do that on Macintosh and Windows computers.

On Macintosh system

Here's what works on Macintosh. You need BBEdit - a free text editor. If you don't already have BBEdit, you can download it from https://www.barebones.com/products/bbedit/download.html

  1. For each language folder open the file cust_text.php in BBEdit
  2. After the file is open, go to the File menu and choose "Reopen Using Encoding" and then "Western (Windows Latin 1)".
  3. At the bottom of the BBEdit window is a pop-up menu that should now say "Western (Windows Latin 1)". Click on this and choose "Unicode (UTF-8)".
  4. Save the file and place in the appropriate directory.

On Windows system

You need to use a text editor, such as Notepad++, that can save files in UTF-8 format WITHOUT writing the BOM (Byte Order Mark) to the file.

Load the cust_text.php file and save it specifying UTF-8 encoding without BOM, and place it in the corresponding folder (e.g. French-UTF8).
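
On either platform the same re-encoding can be scripted instead of done by hand. The sketch below assumes the original file is in Windows Latin 1 (Python's "cp1252" codec), matching the encoding chosen in the BBEdit steps; the function names and the paths in the usage comment are hypothetical. It strips a UTF-8 BOM if one is present and leaves a file that is already valid UTF-8 untouched, so running it twice does not double-encode the text.

```python
import codecs
from pathlib import Path


def to_utf8_without_bom(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Return the content re-encoded as UTF-8 with no byte order mark."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]  # drop the BOM
    try:
        raw.decode("utf-8")
        return raw  # already UTF-8 (or plain ASCII): nothing to convert
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if the file contains bytes that are
    # undefined in the assumed source encoding.
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(source, destination):
    """Read one file, convert it, and write the result to destination."""
    Path(destination).write_bytes(to_utf8_without_bom(Path(source).read_bytes()))


# Example use with hypothetical paths:
#   convert_file("languages/French/cust_text.php",
#                "languages/French-UTF8/cust_text.php")
```

Check the converted file in a browser or editor afterwards, since a file that was not really in Windows Latin 1 will convert without error but show the wrong characters.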

Related links