Collation conversion

4 years 8 months ago #313322 by krileon
Replied by krileon on topic Collation conversion
We're not the developers of Akeeba Admin Tools so I've no idea why it didn't convert those. I recommend contacting Akeeba Admin Tools. CB does not change collations.


Kyle (Krileon)
Community Builder Team Member
Before posting on forums: Read FAQ thoroughly + Read our Documentation + Search the forums
CB links: Documentation - Localization - CB Quickstart - CB Paid Subscriptions - Add-Ons - Forge
--
If you are a Professional, Developer, or CB Paid Subscriptions subscriber and have a support issue please always post in your respective support forums for best results!
--
If I've missed your support post with a delay of 3 days or greater and are a Professional, Developer, or CBSubs subscriber please send me a private message with your thread and will reply when possible!
--
Please note I am available Monday - Friday from 8:00 AM CST to 4:00 PM CST. I am away on weekends (Saturday and Sunday) and if I've missed your post on or before a weekend after business hours please wait for the next following business day (Monday) and will get to your issue as soon as possible, thank you.
--
My role here is to provide guidance and assistance. I cannot provide custom code for each custom requirement. Please do not inquire me about custom development.
The following user(s) said Thank You: mikerotec

Please Log in to join the conversation.

4 years 8 months ago - 4 years 8 months ago #313324 by mikerotec
Replied by mikerotec on topic Collation conversion
OK, so nothing unusual about those three tables, then... will contact the AdminTools team.
For now, I guess I can just manually convert their collation,will have to look that up!

Please Log in to join the conversation.

4 years 8 months ago - 4 years 8 months ago #313325 by mikerotec
Replied by mikerotec on topic Collation conversion
FYI - Akeeba Dev says this...



This is not a bug, it’s by design.

UTF8MB4 can use up to 4 bytes per character instead of up to 3 bytes per character for UTF8. This is important for indices. A MySQL index can have a maximum index key size of 767 bytes. If you do the math that’s 191 characters for UTF8MB4 and 255 characters for UTF8. Many extensions create tables with VARCHAR(255) fields which are subsequently indexed. MySQL in this case needs to allocate an index size which is equal to the maximum number of characters in the field times the maximum number of bytes per character, i.e. 765 bytes for UTF8 and 1020 bytes for UTF8MB4. In the latter case the key size is larger than the MySQL hard limit on key sizes which makes the conversion impossible.

There are two alternatives to fixing this kind of an issue. The first is to reduce the length of the field to 191 characters. However, this might have unforeseen side-effects in the component(s) using it, making it an unsafe option. This is why we cannot enforce it.

The other option is to tell MySQL that we don’t care to index the entire field value but only a part of this. Research has shown that 64 characters is enough in most cases. In this case the index key size under UTF8MB4 would be 256 bytes which is well within MySQL’s capabilities. However, if the data in that column really tends to have a lot of common prefixes up to or exceeding 64 characters this would effectively render the index useless and cause PHP timeouts or other unwanted effects. Once again, this solution is unsafe and cannot be enforced.

Since there is no plausible automatic solution Admin Tools will detect MySQL’s error message and suppress it. This is by design. We have observed that not converting the very few tables left over is much safer than either stopping the conversion before it finishes (what would happen in case of an error) or applying an arbitrary solution. The best thing you can do is get in touch with the developer of the extension which created the table and ask them for a solution. The solution is typically a new version employing either of the two alternatives I explained above.

Finally note that this feature should no longer need to be used except on sites which began their life back in Joomla! 3.4 or earlier. Even in this case both Joomla itself and its extensions should already be taking care to upgrade their tables to UTF8MB4. If you had to use this feature to update your site despite installing new versions of Joomla and its extensions you probably still have one or more extensions which do not fully support UTF8MB4. I would strongly recommend getting in touch with their developers to address this issue. In most cases it is a simple oversight on a rarely-used table which doesn’t really hold data affected by encoding (I know because we had such a table for a very long time).

Please Log in to join the conversation.

4 years 8 months ago #313331 by mikerotec
Replied by mikerotec on topic Collation conversion
PS: we have a big extension migration coming up this weekend, so to be on the safe side (for now) I have just set all the tables to utf8_general_ci. ( to be in sync with all the CBSubs stuff!)

The AdminTools utility was fine with that ( except it left all of its own tables MB4 - but they don't interact with any of our extension tables, so no crisis I think) :)

We don't use Asian characters, and we can live without emoji support for a while...

Please Log in to join the conversation.

4 years 8 months ago #313338 by krileon
Replied by krileon on topic Collation conversion
The index length for innodb is only 767 bytes if you don't have innodb_large_prefix enabled, which then becomes a limit of 3072 bytes assuming the table row format has been set to dynamic or compressed. See the below on how to enable this.

dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix
stackoverflow.com/questions/35847015/mysql-change-innodb-large-prefix

Without doing this the index limit is simply too small. Smaller than MyISAM, which is what CBSubs tables were originally designed for before Joomla switched to InnoDB. Then Joomla switched collations. This shrunk potential index length limit even further, which becomes more and more a problem for complex indexes.

I recommend always enabling innodb_large_prefix, because it's the direction MYSQL is going. I MYSQL 5.7.7 innodb_large_prefix is deprecated as a setting is and is now just the default behavior for MYSQL.


Kyle (Krileon)
Community Builder Team Member
Before posting on forums: Read FAQ thoroughly + Read our Documentation + Search the forums
CB links: Documentation - Localization - CB Quickstart - CB Paid Subscriptions - Add-Ons - Forge
--
If you are a Professional, Developer, or CB Paid Subscriptions subscriber and have a support issue please always post in your respective support forums for best results!
--
If I've missed your support post with a delay of 3 days or greater and are a Professional, Developer, or CBSubs subscriber please send me a private message with your thread and will reply when possible!
--
Please note I am available Monday - Friday from 8:00 AM CST to 4:00 PM CST. I am away on weekends (Saturday and Sunday) and if I've missed your post on or before a weekend after business hours please wait for the next following business day (Monday) and will get to your issue as soon as possible, thank you.
--
My role here is to provide guidance and assistance. I cannot provide custom code for each custom requirement. Please do not inquire me about custom development.
The following user(s) said Thank You: mikerotec

Please Log in to join the conversation.

4 years 8 months ago #313360 by mikerotec
Replied by mikerotec on topic Collation conversion
Interesting! I am running MySQL 5.7.27 and it appears that innodb_large_prefix is already ON by default.

also, innodb_file_format=Barracuda and innodb_file_per_table=ON

I guess the only question is what is preferable for Joomla tables: ROW_FORMAT=DYNAMIC; -- or ROW_FORMAT=COMPRESSED?

( it's currently empty - no row format specified... )

Please Log in to join the conversation.

Moderators: beatnantkrileon
Time to create page: 0.210 seconds

Facebook Twitter LinkedIn