RUNE(2)                                                   RUNE(2)

     NAME
          runetochar, chartorune, runelen, runenlen, fullrune,
          utfecpy, utflen, utfnlen, utfrune, utfrrune, utfutf -
          rune/UTF conversion

     SYNOPSIS
          #include <u.h>
          #include <libc.h>

          int    runetochar(char *s, Rune *r)

          int    chartorune(Rune *r, char *s)

          int    runelen(long r)

          int    runenlen(Rune *r, int n)

          int    fullrune(char *s, int n)

          char*  utfecpy(char *s1, char *es1, char *s2)

          int    utflen(char *s)

          int    utfnlen(char *s, long n)

          char*  utfrune(char *s, long c)

          char*  utfrrune(char *s, long c)

          char*  utfutf(char *s1, char *s2)

     DESCRIPTION
          These routines convert to and from a UTF byte stream and
          runes.

          Runetochar copies one rune at r to at most UTFmax bytes
          starting at s and returns the number of bytes copied.
          UTFmax, defined as 4 in <libc.h>, is the maximum number of
          bytes required to represent a rune.

          Chartorune copies at most UTFmax bytes starting at s to one
          rune at r and returns the number of bytes copied.  If the
          input is not exactly in UTF format, chartorune will convert
          to Runeerror (0xFFFD) and return 1.

          Runelen returns the number of bytes required to convert r
          into UTF.

          Runenlen returns the number of bytes required to convert the
          n runes pointed to by r into UTF.

     Page 1                       Plan 9            (printed 11/21/24)

     RUNE(2)                                                   RUNE(2)

          Fullrune returns 1 if the string s of length n is long
          enough to be decoded by chartorune and 0 otherwise.  This
          does not guarantee that the string contains a legal UTF
          encoding.  This routine is used by programs that obtain
          input a byte at a time and need to know when a full rune has
          arrived.

          The following routines are analogous to the corresponding
          string routines with utf substituted for str and rune sub-
          stituted for chr.

          Utfecpy copies UTF sequences until a null sequence has been
          copied, but writes no sequences beyond es1. If any sequences
          are copied, s1 is terminated by a null sequence, and a
          pointer to that sequence is returned.  Otherwise, the origi-
          nal s1 is returned.

          Utflen returns the number of runes that are represented by
          the UTF string s.

          Utfnlen returns the number of complete runes that are repre-
          sented by the first n bytes of UTF string s. If the last few
          bytes of the string contain an incompletely coded rune,
          utfnlen will not count them; in this way, it differs from
          utflen, which includes every byte of the string.

          Utfrune (utfrrune) returns a pointer to the first (last)
          occurrence of rune c in the UTF string s, or 0 if c does not
          occur in the string.  The NUL byte terminating a string is
          considered to be part of the string s.

          Utfutf returns a pointer to the first occurrence of the UTF
          string s2 as a UTF substring of s1, or 0 if there is none.
          If s2 is the null string, utfutf returns s1.

     SOURCE
          /sys/src/libc/port/rune.c
          /sys/src/libc/port/utfecpy.c
          /sys/src/libc/port/utfrune.c
          /sys/src/libc/port/utfrrune.c
          /sys/src/libc/port/utflen.c
          /sys/src/libc/port/utfnlen.c
          /sys/src/libc/port/utfutf.c

     SEE ALSO
          utf(6), tcs(1)

     BUGS
          When re-encoding UTF strings with chartorune and runetochar
          one has to consider that encoding a Runeerror (0xFFFD) that
          resulted from invalid encoded input can yield a longer UTF
          sequence on the output.

     Page 2                       Plan 9            (printed 11/21/24)