REGEXP(2)                                               REGEXP(2)

     NAME
          regcomp, regcomplit, regcompnl, regexec, regsub, rregexec,
          rregsub, regerror - regular expression

     SYNOPSIS
          #include <u.h>
          #include <libc.h>
          #include <regexp.h>

          Reprog  *regcomp(char *exp)

          Reprog  *regcomplit(char *exp)

          Reprog  *regcompnl(char *exp)

          int  regexec(Reprog *prog, char *string, Resub *match, int msize)

          void regsub(char *source, char *dest, int dlen, Resub *match, int msize)

          int  rregexec(Reprog *prog, Rune *string, Resub *match, int msize)

          void rregsub(Rune *source, Rune *dest, int dlen, Resub *match, int msize)

          void regerror(char *msg)

     DESCRIPTION
          Regcomp compiles a regular expression and returns a pointer
          to the generated description.  The space is allocated by
          malloc(2) and may be released by free. Regular expressions
          are exactly as in regexp(6).

          Regcomplit is like regcomp except that all characters are
          treated literally.  Regcompnl is like regcomp except that
          the . metacharacter matches all characters, including
          newlines.

          Regexec matches a null-terminated string against the
          compiled regular expression in prog. If it matches, regexec
          returns 1 and fills in the array match with character
          pointers to the substrings of string that correspond to the
          parenthesized subexpressions of exp: match[i].sp points to
          the beginning and match[i].ep points just beyond the end of
          the ith substring.  (Subexpression i begins at the ith left
          parenthesis, counting from 1.)  Pointers in match[0] pick
          out the substring that corresponds to the whole regular
          expression.  Unused elements of match are filled with zeros.
          Matches involving `*', `+', and `?'  are extended as far as
          possible.  The number of array elements in match is given by
          msize. The structure of elements of match is:

               typedef struct {
                       union {
                          char *sp;
                          Rune *rsp;
                       };
                       union {
                          char *ep;
                          Rune *rep;
                       };
               } Resub;
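
          As a rough illustration of how the two unions are meant to
          be read, the sketch below copies the layout into a
          stand-alone type and computes match lengths from it.
          ResubSketch, matchlen, and rmatchlen are hypothetical names,
          and the Rune typedef is an assumption standing in for the
          one in <u.h>; anonymous unions need C11 or a compiler
          extension outside Plan 9.

```c
#include <stddef.h>

typedef unsigned int Rune;	/* assumption: the real Rune comes from <u.h> */

/* Stand-alone copy of the layout shown above (hypothetical name). */
typedef struct {
	union {
		char *sp;	/* start of match, char strings */
		Rune *rsp;	/* start of match, Rune strings */
	};
	union {
		char *ep;	/* one past the end, char strings */
		Rune *rep;	/* one past the end, Rune strings */
	};
} ResubSketch;

/* Length in bytes of a match filled in for a char string. */
ptrdiff_t
matchlen(ResubSketch *m)
{
	return m->ep - m->sp;
}

/* Length in Runes of a match filled in for a Rune string. */
ptrdiff_t
rmatchlen(ResubSketch *m)
{
	return m->rep - m->rsp;
}
```

          Because sp and rsp share storage, as do ep and rep, a given
          element should be read through the pair that matches the
          routine that filled it in.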

          If match[0].sp is nonzero on entry, regexec starts matching
          at that point within string. If match[0].ep is nonzero on
          entry, the last character matched is the one preceding that
          point.

          Regsub places in dest a substitution instance of source in
          the context of the last regexec performed using match. Each
          instance of \n, where n is a digit, is replaced by the
          string delimited by match[n].sp and match[n].ep.  Each
          instance of `&' is replaced by the string delimited by
          match[0].sp and match[0].ep.  The result is always
          null-terminated and truncated to fit in dlen bytes.
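
          The substitution rules can be modelled in portable C roughly
          as follows.  This is a sketch written from the description
          above, not the library source: Span and subsketch are
          hypothetical names, dlen is assumed to be at least 1, and a
          backslash not followed by a digit is simply copied, which is
          an assumption about a case the text does not cover.

```c
#include <stddef.h>

/* Hypothetical stand-in for Resub, char fields only. */
typedef struct {
	char *sp;	/* start of matched substring */
	char *ep;	/* one past its end */
} Span;

/* Append the bytes in [s, e) to *dp, never writing at or past lim. */
static void
emit(char **dp, char *lim, char *s, char *e)
{
	for(; s < e; s++)
		if(*dp < lim)
			*(*dp)++ = *s;
}

/*
 * Model of the rules described above: \n and & are expanded from
 * match, everything else is copied, and the output is truncated
 * and null-terminated.  Assumes dlen >= 1.
 */
void
subsketch(char *source, char *dest, int dlen, Span *match, int msize)
{
	char *lim = dest + dlen - 1;	/* keep one byte for the terminator */
	int i;

	for(; *source != '\0'; source++){
		if(*source == '\\' && source[1] >= '0' && source[1] <= '9'){
			i = *++source - '0';	/* digit selects match[i] */
			if(i < msize && match[i].sp != NULL)
				emit(&dest, lim, match[i].sp, match[i].ep);
		}else if(*source == '&'){
			if(msize > 0 && match[0].sp != NULL)
				emit(&dest, lim, match[0].sp, match[0].ep);
		}else if(dest < lim)
			*dest++ = *source;	/* ordinary character */
	}
	*dest = '\0';
}
```

          With match[0] covering hello world and match[1] covering
          world, the source string [\1] <&> would yield
          [world] <hello world>, while a 6-byte dest given & alone
          would hold only hello.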

          Regerror, called whenever an error is detected in regcomp,
          writes the string msg on the standard error file and exits.
          Regerror can be replaced to perform special error
          processing.  If the user-supplied regerror returns rather
          than exits, regcomp will return 0.

          Rregexec and rregsub are variants of regexec and regsub that
          use strings of Runes instead of strings of chars.  With
          these routines, the rsp and rep fields of the match array
          elements should be used.

     SOURCE
          /sys/src/libregexp

     SEE ALSO
          grep(1)

     DIAGNOSTICS
          Regcomp returns 0 for an illegal expression or other
          failure.  Regexec returns 0 if string is not matched.

     BUGS
          There is no way to specify or match a NUL character; NULs
          terminate patterns and strings.  The size of a character
          class and the number of sub-expression matches are
          hard-coded limits. The library uses the worst-case space
          estimate for allocating VM runtime threads.

     HISTORY
          Regexp(2) first appeared in Plan 9 from Bell Labs. This
          implementation was written from scratch for 9front (May,
          2016).

     Page 3                       Plan 9            (printed 11/22/24)