Erlang Central

ISO-8859-1 to UTF-8

Problem

You want to transform a string encoded as ISO-8859-1 into UTF-8 format.

Solution

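The following function takes a string as a list of ISO-8859-1 bytes (integers 0 to 255) and returns the corresponding list of UTF-8 bytes. Bytes below 16#80 are ASCII and pass through unchanged. Every other byte becomes a two-byte sequence: the lead byte is 16#C2 for 16#80 to 16#BF and 16#C3 for 16#C0 to 16#FF, where the second byte is the original value minus 64.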


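-module(iso_8859_1).
-export([to_utf8/1]).

%% ASCII is copied as is; 16#80..16#BF gets lead byte 16#C2; 16#C0..16#FF gets 16#C3.
to_utf8([H|T]) when H < 16#80 -> [H | to_utf8(T)];
to_utf8([H|T]) when H < 16#C0 -> [16#C2, H | to_utf8(T)];
to_utf8([H|T])                -> [16#C3, H-64 | to_utf8(T)];
to_utf8([])                   -> [].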
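
The constants 16#C2, 16#C3 and the subtraction of 64 follow from the generic two-byte UTF-8 layout (110xxxxx 10xxxxxx), because ISO-8859-1 byte values coincide with Unicode code points U+0000 to U+00FF. The sketch below makes that bit arithmetic explicit and compares the result with unicode:characters_to_binary/3 from the standard library. The module name latin1_utf8_sketch and the helpers encode_byte/1 and check/0 are hypothetical names chosen for illustration, not part of the original recipe.

```erlang
-module(latin1_utf8_sketch).
-export([encode_byte/1, to_utf8/1, check/0]).

%% Encode one ISO-8859-1 byte (0..255) as a list of UTF-8 bytes.
%% ISO-8859-1 byte values equal the Unicode code points U+0000..U+00FF.
encode_byte(B) when is_integer(B), B >= 0, B < 16#80 ->
    [B];                                   % ASCII: one byte, unchanged
encode_byte(B) when is_integer(B), B >= 16#80, B =< 16#FF ->
    Lead = 2#11000000 bor (B bsr 6),       % 110xxxxx: 16#C2 or 16#C3
    Cont = 2#10000000 bor (B band 16#3F),  % 10xxxxxx: low six bits
    [Lead, Cont].

%% Convert a whole ISO-8859-1 string (a list of bytes) to UTF-8 bytes.
to_utf8(Bytes) ->
    lists:flatmap(fun encode_byte/1, Bytes).

%% Compare against the standard library for every possible byte value.
check() ->
    All = lists:seq(0, 255),
    Expected = unicode:characters_to_binary(All, latin1, utf8),
    Expected =:= list_to_binary(to_utf8(All)).
```

With this sketch, latin1_utf8_sketch:to_utf8([16#E5]) (the letter å) returns the two-element list [16#C3, 16#A5], and latin1_utf8_sketch:check() returns true. On an Erlang/OTP release that ships the unicode module, calling unicode:characters_to_binary(String, latin1, utf8) directly is likely the simpler choice when a binary result is acceptable.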
                     

Example


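1> "This is some extra Swedish chars: åäö".
"This is some extra Swedish chars: åäö"
2> iso_8859_1:to_utf8("This is some extra Swedish chars: åäö").
"This is some extra Swedish chars: åäö"

The second result is a list of UTF-8 bytes, not the original string: å, ä and ö (229, 228 and 246 in ISO-8859-1) have become the byte pairs 195,165, 195,164 and 195,182. The output can look identical to the input when the terminal decodes those bytes as UTF-8; a shell that prints each list element as its own character shows Ã¥Ã¤Ã¶ instead.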