## Introduction
This week we saw some strange symbols on our launchpad, in the excerpt of one of the posts. We did not, however, see them on the blog part of our site, so there had to be a difference between the two.
## The solution
We had several ideas but most of them had to do with encoding.
This is what it looked like.
The first thing I tried after some googling was the utf8 encode command.
But there was a little problem: the little single quote (or whatever it was) was no longer there. So I set out to find the cause. Since b2evo did not have this problem, I looked up how they solved it, and I found it.
```php
/**
 * Convert all non ASCII chars (except if UTF-8, GB2312 or CP1251) to &#nnnn; unicode references.
 * Also convert entities to &#nnnn; unicode references if output is not HTML (eg XML)
 *
 * Preserves < > and quotes.
 *
 * fplanque: simplified
 * sakichan: pregs instead of loop
 */
function convert_chars( $content, $flag = 'html' )
{
    global $b2_htmltrans, $evo_charset;

    /**
     * Translation of invalid Unicode references range to valid range.
     * These are Windows CP1252 specific characters.
     * They would look weird on non-Windows browsers.
     * If you've ever pasted text from MSWord, you'll understand.
     *
     * You should not have to change this.
     */
    static $b2_htmltranswinuni = array(
        '&#128;' => '&#8364;', // the Euro sign
        '&#130;' => '&#8218;',
        '&#131;' => '&#402;',
        '&#132;' => '&#8222;',
        '&#133;' => '&#8230;',
        '&#134;' => '&#8224;',
        '&#135;' => '&#8225;',
        '&#136;' => '&#710;',
        '&#137;' => '&#8240;',
        '&#138;' => '&#352;',
        '&#139;' => '&#8249;',
        '&#140;' => '&#338;',
        '&#142;' => '&#382;',
        '&#145;' => '&#8216;',
        '&#146;' => '&#8217;',
        '&#147;' => '&#8220;',
        '&#148;' => '&#8221;',
        '&#149;' => '&#8226;',
        '&#150;' => '&#8211;',
        '&#151;' => '&#8212;',
        '&#152;' => '&#732;',
        '&#153;' => '&#8482;',
        '&#154;' => '&#353;',
        '&#155;' => '&#8250;',
        '&#156;' => '&#339;',
        '&#158;' => '&#382;',
        '&#159;' => '&#376;'
    );

    // Convert highbyte non ASCII/UTF-8 chars to urefs:
    if( ! in_array(strtolower($evo_charset), array( 'utf8', 'utf-8', 'gb2312', 'windows-1251') ) )
    { // This is a single byte charset
        // fp> why do we actually bother doing this:?
        $content = preg_replace_callback(
            '/[\x80-\xff]/',
            create_function( '$j', 'return "&#".ord($j[0]).";";' ),
            $content);
    }

    // ... rest of code in b2evo source.
```
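As an aside, the CP1252 table in the b2evo function hints at why a plain UTF-8 encode lost the quote. The sketch below assumes the "utf8 encode command" was PHP's `utf8_encode`, which treats its input as ISO-8859-1, and that the quote was the CP1252 byte `0x92` pasted from Word; both are guesses, not something the post confirms. The helper `latin1_to_utf8` is hypothetical and only mimics that Latin-1 to UTF-8 step by hand so the example runs without extensions.

```php
<?php
// Hypothetical helper: mimics a Latin-1 (ISO-8859-1) to UTF-8 conversion.
// In Latin-1 every byte value is also its Unicode code point.
function latin1_to_utf8(string $bytes): string
{
    return preg_replace_callback(
        '/[\x80-\xff]/',
        function (array $m): string {
            $b = ord($m[0]);
            // Two-byte UTF-8 sequence for code points U+0080..U+00FF.
            return chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
        },
        $bytes
    );
}

// Assumed input: "it's" with a curly quote stored as CP1252 byte 0x92.
$cp1252 = "it\x92s";

// Byte 0x92 becomes U+0092 (bytes c2 92), an invisible control character,
// instead of U+2019 (bytes e2 80 99), the right single quotation mark.
echo bin2hex(latin1_to_utf8($cp1252)), "\n"; // 6974c29273
```

If that assumption holds, the quote was not deleted but turned into a character that browsers do not draw, which matches what we saw.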
The b2evo function is one hell of a solution, and it would take a fair bit of research to find out why you need all of it. You need a good understanding of the differences between the character encodings and of how to convert between them. I did not feel the need to do that research and just used the b2evo code, and that solved the problem.
<div class="image_block">
<a href="https://lessthandot.z19.web.core.windows.net/wp-content/uploads/users/chrissie1/encoding/encoding3.png?mtime=1295782805"><img alt="" src="https://lessthandot.z19.web.core.windows.net/wp-content/uploads/users/chrissie1/encoding/encoding3.png?mtime=1295782805" width="682" height="167" /></a>
</div>
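For readers on a newer PHP version, here is a minimal sketch of the same idea in one step, assuming the input is a CP1252 string. It is not the b2evo code: the function name `cp1252_to_entities` is made up for this example, it uses a closure instead of `create_function` (which no longer exists in PHP 8), and it maps byte `0x8E` to 381 (Ž) where the table quoted earlier maps it to 382 (ž).

```php
<?php
// Hypothetical helper, not part of b2evo: turns every high byte of a
// CP1252 string into a numeric character reference.
function cp1252_to_entities(string $content): string
{
    // CP1252 bytes 0x80-0x9F and the Unicode code points they stand for.
    // Bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D are unassigned and fall through.
    static $map = [
        0x80 => 8364, 0x82 => 8218, 0x83 => 402,  0x84 => 8222,
        0x85 => 8230, 0x86 => 8224, 0x87 => 8225, 0x88 => 710,
        0x89 => 8240, 0x8A => 352,  0x8B => 8249, 0x8C => 338,
        0x8E => 381,  0x91 => 8216, 0x92 => 8217, 0x93 => 8220,
        0x94 => 8221, 0x95 => 8226, 0x96 => 8211, 0x97 => 8212,
        0x98 => 732,  0x99 => 8482, 0x9A => 353,  0x9B => 8250,
        0x9C => 339,  0x9E => 382,  0x9F => 376,
    ];

    return preg_replace_callback(
        '/[\x80-\xff]/',
        function (array $m) use ($map): string {
            $byte = ord($m[0]);
            // From 0xA0 upwards CP1252 matches Latin-1, so the byte value
            // is already the code point.
            return '&#' . ($map[$byte] ?? $byte) . ';';
        },
        $content
    );
}

echo cp1252_to_entities("it\x92s"), "\n"; // it&#8217;s
```

Like the b2evo check on `$evo_charset`, this should only be applied to single-byte content; running it over text that is already UTF-8 would mangle every multi-byte character.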
## Conclusion
No need to reinvent the wheel: use what is there, credit the maker, and move on. But I do understand what it does and why it does it, and that is important.