Unicode Programming Examples

A collection of Unicode-related tasks in multiple programming languages. Feel free to add new languages or to add and improve examples.

Contents

  1. UTF-8 source code

Julia, Perl 5, Perl 6, Python, Ruby

  2. Encoded I/O

Perl 5, Perl 6

  3. Encode and decode

Julia, Perl 5, Perl 6

  4. Count encoded bytes

Julia, Perl 5, Perl 6

  5. Count characters of a string

JavaScript, Julia, Perl 5, Perl 6, PHP, Ruby

  6. Unicode normalization

C♯, Go, Java, JavaScript, Julia, Perl 5, Perl 6, PHP, Python, R, Ruby, Tcl, VB

  7. Letter casing

Go, Julia, Perl 5, Perl 6, PHP, Python, R, Ruby

  8. Sorting with the UCA

Perl 5, PHP, Python, Ruby

TODO

  • Unicode Character Database
  • Unicode regular expressions
  • CLDR

Style guide

The main goal is a cohesive style that makes examples in different languages easy to compare, rather than following the most popular style of each language.

  • document syntax: Markdown
  • code indentation: four spaces
  • naming convention: single-word lower-case names
  • string literals: preference for single quotes over double quotes
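
As a rough illustration of these conventions, here is a minimal Python sketch of a normalization example. It is not taken from the collection itself: the names `normalize`, `decomposed`, and `composed` are hypothetical and chosen only to show four-space indentation, single-word lower-case names, and single-quoted strings.

```python
import unicodedata


def normalize(text):
    # NFC composes a base letter plus combining mark into one code point
    return unicodedata.normalize('NFC', text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (decomposed form)
decomposed = 'cafe\u0301'
composed = normalize(decomposed)

# the decomposed string has one more code point than the composed one
print(len(decomposed), len(composed))
```

An example for another language would ideally keep the same variable names and structure, so the two can be read side by side.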

© 2013–2016 Nova Patch

This work is licensed under a Creative Commons Attribution 4.0 International License.

Unicode is a registered trademark of Unicode, Inc., in the United States and other countries.