Hi,
I set up a website on 1and1 hosting with a MySQL db containing different languages, including Greek, Hebrew and Coptic, and it all works fine with Firefox.

I no longer maintain that site, so I have set up my own MySQL db on my computer (Windows XP) and copied the pages to another 1and1 site. I cannot get the scripts to display properly now with the new site, linking into my own db. Using Navicat I can see that at least the Greek and Hebrew have imported and are displaying fine in my db.

I went into the original site and briefly switched the db source to mine, and it did not display the scripts correctly either - all '?' again. I switched back to the original database, and all displayed fine again.

Using the Console function in Navicat, I can get the Greek and Hebrew to display correctly.

So, the source of the problem must be with my database - possibly some setting somewhere.

I am using php, with ...
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
on both sites.

Any suggestions as to why my db is not working properly?

TIA

--
Iain
On 03/08/12 04:51, Iain wrote:
> Hi,
>
> I set up a website on a 1and1 hosting with a MySQL db containing
> different languages, including Greek, Hebrew and Coptic and it all works
> fine with Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the Greek
> and Hebrew have imported and are displaying fine in my db.

I guess your export/import broke the site. When you do a MySQL dump, you can specify which charset to use; you may also have to specify the charset you are using on your computer and the one used on the MySQL server. When you import, you should use the same values again.

> So, the source of the problem must be with my database - possibly some
> setting somewhere.
>
> I am using php, with ...
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> on both sites.

You can't use two charsets for one page, but in your case you may get some pages to work if you switch to another charset on your secondary site.

I suggest you do a new dump which you import into your new database.

--
//Aho
On 03/08/2012 4:51, Iain wrote:
> I set up a website on a 1and1 hosting with a MySQL db containing
> different languages, including Greek, Hebrew and Coptic and it all works
> fine with Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the Greek
> and Hebrew have imported and are displaying fine in my db.
>
> I went into the original site and briefly switched the db source to
> mine, and it did not display the scripts correctly either - all '?'
> again. I switched back to the original database, and all displayed fine
> again.
>
> Using the Console function in Navicat, I can get the Greek and Hebrew to
> display correctly.
>
> So, the source of the problem must be with my database - possibly some
> setting somewhere.
>
> I am using php, with ...
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> "http://www.w3.org/TR/html4/loose.dtd">
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> on both sites.
>
> Any suggestions as to why my db is not working properly?

I have the impression that the problem can be at any step of the process, not necessarily the database, and that you are not sure how to verify it. You'll find pretty good advice in this article:

http://www.itnewb.com/tutorial/UTF-8...and-JavaScript

Read it carefully and debug each part separately.

--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humour site: http://www.demogracia.com
--
"Iain" <spam@smaps.net> wrote in message
news:a80sl8FvblU1@mid.individual.net...
> Hi,
>
> I set up a website on a 1and1 hosting with a MySQL db containing different
> languages, including Greek, Hebrew and Coptic and it all works fine with
> Firefox.
>
> I no longer maintain that site, so have set up my own MySQL db on my
> computer (Windows XP) and copied the pages to another 1and1 site. I
> cannot get the scripts to display properly now with the new site, and
> linking into my own db. Using Navicat I can see that at least the Greek
> and Hebrew have imported and are displaying fine in my db.
>
> I went into the original site and briefly switched the db source to mine,
> and it did not display the scripts correctly either - all '?' again. I
> switched back to the original database, and all displayed fine again.
>
> Using the Console function in Navicat, I can get the Greek and Hebrew to
> display correctly.
>
> So, the source of the problem must be with my database - possibly some
> setting somewhere.

My guess (and apologies if you've already checked this) is that it's the charset used for the database. I had similar issues once and it turned out that the database was using the default charset (latin1) and not utf8. The problem went away when I switched charsets and re-imported.

If this is your issue then the notes I made at the time might help:
http://www.cryer.co.uk/brian/mysql/h...tion_order.htm

If it is the charset, then once you've changed it you will need to re-import your data.

Hope this helps.

--
Brian Cryer
http://www.cryer.co.uk/brian
Brian Cryer wrote:
> "Iain" <spam@smaps.net> wrote in message
> news:a80sl8FvblU1@mid.individual.net...
> > Hi,
> >
> > I set up a website on a 1and1 hosting with a MySQL db containing
> > different languages, including Greek, Hebrew and Coptic and it all
> > works fine with Firefox.
> >
> > I no longer maintain that site, so have set up my own MySQL db on my
> > computer (Windows XP) and copied the pages to another 1and1 site. I
> > cannot get the scripts to display properly now with the new site,
> > and linking into my own db. Using Navicat I can see that at least
> > the Greek and Hebrew have imported and are displaying fine in my db.
> >
> > I went into the original site and briefly switched the db source to
> > mine, and it did not display the scripts correctly either - all '?'
> > again. I switched back to the original database, and all displayed
> > fine again.
> >
> > Using the Console function in Navicat, I can get the Greek and
> > Hebrew to display correctly.
> >
> > So, the source of the problem must be with my database - possibly
> > some setting somewhere.
>
> My guess (and apologies if you've already checked this) is that it's
> the charset used for the database. I had similar issues once and it
> turned out that the database was using the default charset (latin1)
> and not utf8. The problem went away when I switched charsets and
> re-imported.
>
> If this is your issue then the notes I made at the time might help:
> http://www.cryer.co.uk/brian/mysql/h...tion_order.htm
>
> If it is the charset then once you've changed it then you will need to
> re-import your data.
>
> Hope this helps.

I checked these earlier and changed them. Viewing the database properties through Navicat, they are:
Character set: utf8 -- UTF-8 Unicode
Collation: utf8_general_ci

That is what the original db is - getting that info from querying the 'INFORMATION_SCHEMA.COLUMNS' table ...
CHARACTER_SET_NAME: utf8 - utf8_general_ci
DEFAULT_COLLATE_NAME: utf8 - utf8_general_ci

The strange thing is that when I do a .sql dump of the various tables, which can include a 'show create table ...', the tables always come up with 'latin1' and 'latin1_general_ci'. This is also confirmed by doing another query on the 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table. The results are always:
field name: code, type: varchar(4), latin1, latin1_general_ci
field name: texts, type: text, latin1, latin1_general_ci

However, if I do not edit the 'create table ...' (which uses the 'latin1'), at the first instance of trying to 'insert', it comes up with an error:
[Err] 1366 - Incorrect string value: '\xCE\xB2\xCE\xB9 ...'
which is the UTF-8 coding (in this instance, Greek).

Replying to Alvaro:
Thanks for the link. I checked the settings and set up the .htaccess file (the REM '//' lines have to be removed otherwise it will not work). I also added the settings to the my.ini file. The 'default-character-set = utf8' under the server section stopped MySQL from restarting, so I had to remove that, although there was already a setting, 'character-set-server=utf8', in the server section. Unfortunately, that has not sorted it out.

Replying to Aho:
I have tried several new dumps, importing them, maybe with one or two tweaks, but still no success.

Thanks for all the suggestions so far, but still no success.

--
Iain
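
The bytes quoted in error 1366 above can be checked outside MySQL. What follows is a minimal Python sketch, assuming only the four bytes visible in the message (the rest is elided there). It shows that the bytes are valid UTF-8 for two Greek letters and that those letters have no latin1 representation, which is consistent with a latin1 column rejecting them. It is an illustration, not a reproduction of MySQL's own check.

```python
# Bytes copied from the error message: '\xCE\xB2\xCE\xB9 ...' (remainder elided).
raw = b"\xCE\xB2\xCE\xB9"

# Interpreted as UTF-8, they are two Greek letters (beta, iota).
text = raw.decode("utf-8")
print(text)

# latin1 (ISO-8859-1) has no Greek letters, so a strict conversion fails.
try:
    text.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False
print(fits_latin1)  # False
```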
Iain wrote:
> The strange thing is that when I do a .sql dump of the various tables,
> which can include a 'show create table ...', the tables always come up
> with 'latin1' and 'latin1_general_ci'. This is also confirmed by doing
> another query on the 'INFORMATION_SCHEMA.COLUMNS', with the
> corresponding table. The results are always:
> field name: code, type: varchar(4), latin1, latin1_general_ci
> field name: texts, type: text, latin1, latin1_general_ci

There is a switch on mysqldump, --create-options, which I THINK you probably need to use to recreate character sets.

Someone better versed than I will confirm or refute that, I am sure.

--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact that they know how little is really possible - and how hard it is to achieve it.
On 03/08/12 18:14, Iain wrote:
> I have tried several new dumps, with importing them and maybe with one
> or two tweaks, but still no success.

Make a dump using --skip-set-charset and see if it works better. There are always issues with charsets when exporting and importing; I hope this will improve in MySQL.

Another option is to use sed to replace the charset in the dump you made:

    sed 's/latin1/utf8/g' -i yourdatabasedump.sql

and then import it.
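
For anyone without sed (the new database here runs on Windows XP), the same relabelling can be sketched in Python. This is only an illustration: the function name `relabel_charset` and the sample dump fragment are made up for the example, and the match is case-insensitive in case the dump's spelling of the charset name varies.

```python
import re


def relabel_charset(dump_sql: str, old: str = "latin1", new: str = "utf8") -> str:
    """Replace every occurrence of `old` with `new`, ignoring case.

    Like the sed one-liner, this is a blind textual substitution: it also
    rewrites collation names (latin1_general_ci -> utf8_general_ci) and any
    row data that happens to contain the word.
    """
    return re.sub(re.escape(old), new, dump_sql, flags=re.IGNORECASE)


# Hypothetical fragment of a dump, for illustration only.
sample = (
    "CREATE TABLE `texts` (\n"
    "  `code` varchar(4) COLLATE latin1_general_ci\n"
    ") DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;\n"
)
converted = relabel_charset(sample)
print(converted)
```

Relabelling only changes what the dump claims about the data; it does not re-encode the bytes already in the file, so it helps only when the text in the dump is already UTF-8.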
The Natural Philosopher wrote:
> Iain wrote:
> >
> > The strange thing is that when I do a .sql dump of the various
> > tables, which can include a 'show create table ...', the tables
> > always come up with 'latin1' and 'latin1_general_ci'. This is also
> > confirmed by doing another query on the
> > 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table. The
> > results are always:
> > field name: code, type: varchar(4), latin1, latin1_general_ci
> > field name: texts, type: text, latin1, latin1_general_ci
>
> There is a switch on mysqldump --create-options which I THINK you
> probably need to use to recreate character sets.
>
> Someone better versed than I will confirm or refute that, I am sure.

What I have been doing is editing the .sql dump before running it. I have tried both removing the latin1 references, allowing it to re-create the table using the default utf-8, and replacing the latin1 with utf-8. Both run OK and the data is imported into the table correctly, and running a query shows both the Greek and Hebrew characters coming up correctly (? - I don't know either language, but recognise the characters).

So it seems that within the environment of the database / tables (Navicat), the utf-8 is working correctly.

Somewhere along the line, getting the data from the tables to the php changes the text from being recognisable within the table environment to becoming '???' in the php.

Viewing the page source for both the site where it works and the site where it doesn't: where it works, the correct characters appear within the page source, and on the site where it doesn't work, the '?' appears in the page source.

It's a similar story for both Firefox and MS IE, except the Coptic characters do not display properly on IE. Maybe that's just IE!

--
Iain
J.O. Aho wrote:
> On 03/08/12 18:14, Iain wrote:
>
> > I have tried several new dumps, with importing them and maybe with
> > one or two tweaks, but still no success.
>
> Make a dump and use --skip-set-charset and see if it works better,
> there are always issues with charsets when export/import, I hope this
> could improve in mysql.
>
> Other options is that you use sed to replace the charset in the dump
> you made.
>
> sed 's/latin1/utf8/g' -i yourdatabasedump.sql
>
> and then import it.

I can only access the data in the database that's working (the other one) through php - I do not have direct access to the database itself now.

I'm using a php routine 'mysqldump.php', which I have been modifying to get other bits of information, e.g. from the 'INFORMATION_SCHEMA.COLUMNS' table.
http://forums.phpfreaks.com/index.php?topic=162154.0
There seem to be various versions around.

But surely it means something that the characters display correctly after I have imported them into my new database? Does that not show that the dump is correct, and that they have also imported correctly?

(Coptic is a different story at the moment, and may follow once the Greek and Hebrew are working OK.)

--
Iain
> I checked these earlier and changed them. Viewing the database properties
> through Navicat, they are:
> Character set: utf8 -- UTF-8 Unicode
> Collation: utf8_general_ci
>
> That is what the original db is - getting that info from querying the
> 'INFORMATION_SCHEMA.COLUMNS' table ...
> CHARACTER_SET_NAME: utf8 - utf8_general_ci
> DEFAULT_COLLATE_NAME: utf8 - utf8_general_ci

You need to be concerned with:
- The character set of the connection. ("SET NAMES utf8")
- The character set of the client.
- The character set of the server (--character-set-server=latin1 is the default)
- The character set of the database (CREATE DATABASE `dbname` DEFAULT CHARACTER SET utf8)
- The character set of the table (CREATE TABLE `tablename` CHARACTER SET utf8)
- The character set of the column (CHARACTER SET utf8 clause in a column definition in CREATE TABLE). The character set of a column is the first of the column, table, database, and server character set that is specified.
- What's really stored in the column. If it doesn't match what MySQL thinks the character set is for the column, you're in trouble. One of the messier situations is having the characters actually utf8, but labelled latin1, so as long as the character sets are all labelled equally WRONG, it works, but if you set one of them correctly, it tries to convert and you get a mess. The solution is usually to re-import the data after fixing the column character sets.
- The character set of a string literal in a query is the character set of the connection. If the character set of the column is different from the character set of the connection, MySQL will try to convert it (and it may fail, e.g. converting Greek letters from utf8 to latin1).

Oh, yes, you probably need to worry about collations also.

> The strange thing is that when I do a .sql dump of the various tables, which
> can include a 'show create table ...', the tables always come up with
> 'latin1' and 'latin1_general_ci'. This is also confirmed by doing another
> query on the 'INFORMATION_SCHEMA.COLUMNS', with the corresponding table.

That sounds like trouble, perhaps indicating that the tables were created before the default character set for the database was set.

> The results are always:
> field name: code, type: varchar(4), latin1, latin1_general_ci
> field name: texts, type: text, latin1, latin1_general_ci
>
> However, if I do not edit the 'create table ...' (which uses the 'latin1'),
> at the first instance of trying to 'insert', it comes up with an error:
> [Err] 1366 - Incorrect string value: '\xCE\xB2\xCE\xB9 ...' which is the
> UTF-8 coding (in this instance, Greek).

The query:

    SHOW VARIABLES LIKE 'character%';

after setting the default database to the one you're using may be useful. Also try typing \s to the MySQL command-line client.
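
The failed conversion mentioned in the last list item is one plausible source of the '?' characters reported in this thread. As a rough Python analogy (not MySQL's own code path, and using a made-up sample word), converting Greek text to latin1 with replacement turns every letter into '?', and nothing downstream can recover the original:

```python
# Sample Greek text, for illustration only.
greek = "βιβλος"

# Converting to latin1 with replacement: every letter latin1 lacks becomes '?'.
as_latin1 = greek.encode("latin-1", errors="replace")
print(as_latin1)  # b'??????'

# The original letters cannot be recovered from the converted bytes.
print(as_latin1.decode("latin-1"))  # ??????
```

If the page source already contains literal '?' characters, as described earlier in the thread, the loss happened before the browser was involved, so the meta tags cannot be the whole story.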
Iain wrote:
> Somewhere along the line, the getting of the data from the tables to the
> php changes the text from being recognisable within the table
> environment, to becoming '???' in the php.

Hmm. Have you 'viewed source' to see if the php is spitting out UTF-8 but not informing the browser that it is?

> Viewing the page source for both the site where it works, and the site
> where it doesn't; where it works, the correct characters appear within
> the page source, and the site where it doesn't work, the '?' appears in
> the page source.

Ah... possibly there is a PHP character set variable?

"To change the character encoding in your php.ini file, find the following line and input your preferred character encoding. In the example below, UTF-8 is the character set.

default_charset = "UTF-8""

> It's similar stories for both Firefox and MS IE, except the coptic
> characters do not display properly on IE. Maybe that's just IE!

Probably not got the right fonts loaded.

--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact that they know how little is really possible - and how hard it is to achieve it.
The Natural Philosopher wrote:
> Iain wrote:

> Ah...possibly there is a PHP character set variable?
>
> "To change the character encoding in your php.ini file, find the
> following line and input your preferred character encoding. In the
> example below, UTF-8 is the character set.
>
> default_charset = "UTF-8""

I don't think that I can gain access to the php.ini file on a commercially hosted site. I have already put:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
in the corresponding page headers.

> > It's similar stories for both Firefox and MS IE, except the coptic
> > characters do not display properly on IE. Maybe that's just IE!
>
> probably not got the right fonts loaded.

It works perfectly with Firefox - maybe there's a specific IE Coptic font! Anyway, I don't normally worry too much about this in IE; it has enough quirks in javascript to keep anyone going. The main original site has the Firefox logo on it and says that it is optimized for Firefox.

--
Iain
Iain wrote:
> The Natural Philosopher wrote:
>> Iain wrote:
>
>> Ah...possibly there is a PHP character set variable?
>>
>> "To change the character encoding in your php.ini file, find the
>> following line and input your preferred character encoding. In the
>> example below, UTF-8 is the character set.
>>
>> default_charset = "UTF-8""
>>
> I don't think that I can gain access to the php.ini file in a
> commercially hosted site.

I think there is usually a locally overridable one that you can access.

Knock up a script to display the output of phpinfo and see what that says. That's how I have tracked down several 'my test site and my production site behave differently' issues.

> I have already put:
> <meta http-equiv="content-type" content="text/html;charset=utf-8" />
> in the corresponding page headers.
>
>> > It's similar stories for both Firefox and MS IE, except the coptic
>> > characters do not display properly on IE. Maybe that's just IE!
>> >
>> probably not got the right fonts loaded.
>
> It works perfectly with Firefox - maybe there's a specific IE coptic
> font! Anyway, I don't normally worry too much about this in IE; it has
> enough quirks in javascript to keep anyone going. The main original
> site has the Firefox logo on it and says that it is optimized for Firefox.

Oh, the Firefox is on a Windows system?

There is something about IE and font smoothing that is ticking away at the back of my brain. Firefox does a better job IIRC, but IE may be tweakable, or a font size change may help.

--
To people who know nothing, anything is possible.
To people who know too much, it is a sad fact that they know how little is really possible - and how hard it is to achieve it.
The Natural Philosopher wrote:
> I think there is usually a locally overridable one.
> That you can access.
>
> knock up a script to display the output of phpinfo..and see what that
> says.
> Thats how I have tracked down several - 'my test site and my
> production site behave differently' issues

I have created a phpinfo.php file that displays all of the settings for phpinfo options 1, 2, 4, 8, 16, 32 and 64.

Both sites have the same settings (they are both 1and1).

default_charset: both have 'no value' for both Local and Master values,
and both have idn.default_charset set to 'ISO-8859-1' for both Local and Master values.

So there are no differences there, nor in any of the other settings, except for the Environment and PHP variables.

......
> Oh, the firefox is on a windows system?
>
> There is something about IE that is ticking away at the back of my
> brain - and font smoothing..firefox does a better job IIRC but IE may
> be tweakable..or a font size change may help.

Thanks for the suggestion - it was worth checking.

--
Iain
>> Other options is that you use sed to replace the charset in the dump
>> you made.
>>
>> sed 's/latin1/utf8/g' -i yourdatabasedump.sql
>>
>> and then import it.

Beware that queries like:

    set names utf-8;

do not work. And if it produces an error message when running a whole dump, it may be difficult to see. MySQL managed to spell the character set names differently from what you have to put in the HTTP headers (in this case, utf8, *not* utf-8, goes in the above set names query).

Note that if you need 4-byte UTF-8 encodings, you need utf8mb4, not utf8, as the character set. Unless you need the "Ancient Greek Numbers" block, you do not need this for Greek, Hebrew, and Coptic.

It may be useful to see what is actually in the database field. This requires a little knowledge of character encoding. Something like:

    select hex(last_name) from employees where empid = 33;

(pick a record where last_name includes some non-ASCII letters) can get you a dump of the data actually there.

Edit your import file (in a UTF-8 editor) so that one of the last names is "André" (that's A n d r e-with-acute-accent; the accented e is code point 0xE9, encoded in UTF-8 as C3A9), with a known record number, here assumed to be 33. Duplicate the import process as exactly as you can. Now run:

    select hex(last_name) from employees where empid = 33;

The correct encoding is:
    416E6472C3A9
and if you got that, the correct data went *in* to the database.

If you got:
    416E6472E9
it went in encoded as iso-8859-1 or Windows-1252. Somehow it thinks your database field is latin1, not utf8.

If you got:
    416E6472C383C2A9
it got encoded as UTF-8 *TWICE*. It probably thought the data being imported was in latin1 and translated it to utf-8. Fix the character set for the database connection. This is what I got with "set names" with the charset spelled incorrectly.

If you got the correct data *in* to the database, now get the value out and display it (in a web browser?).

If you get:
    A n d r e-with-acute-accent
it's working.

If you get:
    A n d r capital-A-with-tilde copyright-symbol
it's getting the bytes out and treating them as iso-8859-1 or Windows-1252.

If you get:
    A n d r capital-A-with-tilde unknown-character capital-A-with-circumflex copyright-symbol
it's getting the bytes out, treating them as iso-8859-1 or Windows-1252, and translating them to UTF-8.

> I can only access the data in the database that's working (the other one)
> through php - I do not have direct access to the database itself now.
> I'm using a php routine 'mysqldump.php', which I have been modifying to get
> other bits of information, eg. from the 'INFORMATION_SCHEMA.COLUMNS' table.
> http://forums.phpfreaks.com/index.php?topic=162154.0
> There seem to be various versions around.
>
> But surely does it not mean something that the characters display correctly
> after I have imported them into my new database?

If you get the character set *WRONG*, but consistently *WRONG* (say, everything is consistently labelled as Romulan-13), and nobody is doing any translating, it might look like it works. Fix any one character set, and it will be translated and look wrong, even though you are now "closer" to correct.

> Does that not show that
> the dump is correct, and that they have also imported correctly?
> (Coptic is a different story at the moment, and may follow once the Greek
> and Hebrew are working OK).
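
The three hex strings above can be reproduced without a database. This is a small Python sketch of the same test name; it only mimics the encodings involved and says nothing about which step of a given import went wrong.

```python
name = "Andr\u00e9"  # "André", with the accented e as a single code point

# Correct: stored as UTF-8.
utf8_hex = name.encode("utf-8").hex().upper()

# Stored as iso-8859-1 (latin1) instead.
latin1_hex = name.encode("latin-1").hex().upper()

# Double-encoded: UTF-8 bytes misread as latin1, then encoded to UTF-8 again.
double_hex = name.encode("utf-8").decode("latin-1").encode("utf-8").hex().upper()

print(utf8_hex)    # 416E6472C3A9
print(latin1_hex)  # 416E6472E9
print(double_hex)  # 416E6472C383C2A9

# What a viewer shows when correct UTF-8 bytes are read as Windows-1252:
# capital-A-with-tilde followed by the copyright symbol.
misread = name.encode("utf-8").decode("cp1252")
print(misread)
```

Comparing the output of `select hex(...)` against these values shows whether the stored bytes are already wrong or whether the damage happens on the way out.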