    369ff793
    add encoding conversion from/to CESU-8 · 369ff793
    Martin Dürst authored
    Add encoding conversion (transcoding) from UTF-8 to CESU-8
    and back. CESU-8 is an encoding similar to UTF-8, but it
    encodes each code point above U+FFFF as a surrogate pair,
    with each surrogate in turn encoded as if it were a code
    point in UTF-8. This gives CESU-8 the same binary sorting
    order as UTF-16. It is also somewhat similar (although not
    identical) to an encoding used internally by Java.
    
    This completes issue #15995.
    
    enc/trans/cesu_8.trans: Add encoding conversion from/to CESU-8
    test/ruby/test_transcode.rb: Add tests for above
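
The UTF-8 to CESU-8 direction described above can be illustrated with a minimal sketch. This is not the transcoder generated from enc/trans/cesu_8.trans; the helper name `utf8_to_cesu8` is hypothetical, and the code assumes valid UTF-8 input and only covers one direction.

```ruby
# Hypothetical helper for illustration only; not the code added by this commit.
# Takes a valid UTF-8 String and returns its CESU-8 bytes as a binary String.
def utf8_to_cesu8(str)
  out = String.new(encoding: Encoding::BINARY)
  str.each_codepoint do |cp|
    if cp > 0xFFFF
      # Split the supplementary code point into a UTF-16 surrogate pair.
      v = cp - 0x10000
      high = 0xD800 | (v >> 10)
      low  = 0xDC00 | (v & 0x3FF)
      [high, low].each do |s|
        # Encode each surrogate with the three-byte UTF-8 bit pattern.
        out << (0xE0 | (s >> 12))
        out << (0x80 | ((s >> 6) & 0x3F))
        out << (0x80 | (s & 0x3F))
      end
    else
      # Code points up to U+FFFF are encoded exactly as in UTF-8.
      out << [cp].pack("U").b
    end
  end
  out
end

# U+10000 becomes the surrogates D800 DC00, three bytes each.
puts utf8_to_cesu8("\u{10000}").bytes.map { |b| "%02X" % b }.join(" ")
# prints "ED A0 80 ED B0 80"
```

In this sketch U+10000 sorts before U+FFFD bytewise (ED ... versus EF ...), as it does in UTF-16, whereas plain UTF-8 orders them the other way round.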