REGEXP(6) REGEXP(6)
NAME
regexp - regular expression notation
DESCRIPTION
A regular expression specifies a set of strings of characters.
A member of this set of strings is said to be matched
by the regular expression. In many applications a delimiter
character, commonly `/', bounds a regular expression. In
the following specification for regular expressions the word
`character' means any character (rune) but newline.
The syntax for a regular expression e0 is
     e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'

     e2:  e3
       |  e2 REP

     REP: '*' | '+' | '?'

     e1:  e2
       |  e1 e2

     e0:  e1
       |  e0 '|' e1
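The grammar above implies a precedence: REP binds tightest, then
concatenation, then alternation. The sketch below illustrates this.
It uses Python's re module purely as a stand-in engine (an
assumption; it is not the Plan 9 library), and the helper name
matches_whole is hypothetical.

```python
import re

# Python's re is only a stand-in engine here; for these simple
# patterns it behaves as the grammar describes.

def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# REP applies to the preceding e3 only: ab* is 'a' then zero or more 'b'.
rep_binds_tightest = (
    matches_whole("ab*", "abbb") and not matches_whole("ab*", "abab")
)

# Parentheses turn a whole e0 back into an e3, so (ab)* repeats "ab".
group_repeats = matches_whole("(ab)*", "abab")

# Alternation binds loosest: ab|cd means (ab)|(cd), not a(b|c)d.
alternation_loosest = (
    matches_whole("ab|cd", "ab")
    and matches_whole("ab|cd", "cd")
    and not matches_whole("ab|cd", "abd")
)
```

Writing a(b|c)d explicitly is what makes "abd" match.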
A literal is any non-metacharacter, or a metacharacter (one
of .*+?[]()|\^$) or the delimiter when preceded by `\'.
A charclass is a nonempty string s bracketed [s] (or [^s]);
it matches any character in (or not in) s. A negated
character class never matches newline. A substring a-b, with a
and b in ascending order, stands for the inclusive range of
characters between a and b. In s, the metacharacters `-',
`]', an initial `^', and the regular expression delimiter
must be preceded by a `\'; other metacharacters have no
special meaning and may appear unescaped.
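The character class rules can be tried out with a small sketch.
Python's re module is assumed as a stand-in engine and the helper
in_class is hypothetical. One known difference: in Python a negated
class does match newline, unlike the behavior described above.

```python
import re

def in_class(charclass: str, ch: str) -> bool:
    """Return True if the single character ch is matched by charclass."""
    return re.fullmatch(charclass, ch) is not None

# a-f is the inclusive range from 'a' to 'f'.
range_hit = in_class("[a-f]", "c")
range_miss = in_class("[a-f]", "g")

# [^a-f] matches any character not in the range.
negated_hit = in_class("[^a-f]", "g")

# '-' and ']' are escaped with a backslash inside the brackets.
escaped_dash = in_class(r"[\-\]]", "-")
escaped_bracket = in_class(r"[\-\]]", "]")

# An initial '^' is escaped so that it does not negate the class.
escaped_caret = in_class(r"[\^a]", "^")

# Other metacharacters stand for themselves and need no escape.
plain_star = in_class("[.*+?]", "*")
plain_other = in_class("[.*+?]", "x")
```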
A `.' matches any character.
A `^' matches the beginning of a line; `$' matches the end
of the line.
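A short sketch of `.', `^', and `$' follows. It assumes Python's re
module as a stand-in engine; there the line-oriented meaning of `^'
and `$' must be requested with the re.MULTILINE flag. The sample
text is invented for illustration.

```python
import re

text = "one\ntwo\nthree"

# '^' anchors at the start of each line; '.' stops at the newline.
starts_with_t = re.findall(r"^t.*", text, flags=re.MULTILINE)

# '$' anchors at the end of each line.
ends_with_e = re.findall(r".*e$", text, flags=re.MULTILINE)

# '.' matches any character except newline.
dot_matches_letter = re.fullmatch("a.c", "abc") is not None
dot_skips_newline = re.fullmatch("a.c", "a\nc") is None
```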
The REP operators match zero or more (*), one or more (+),
or zero or one (?) instances, respectively, of the preceding
regular expression e2.
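The three REP operators can be compared side by side. The sketch
assumes Python's re module as a stand-in engine; the candidate
strings are invented for illustration.

```python
import re

candidates = ["ac", "abc", "abbc"]

# For each pattern, collect the candidates it matches in full.
accepted = {
    pattern: [s for s in candidates if re.fullmatch(pattern, s)]
    for pattern in ("ab*c", "ab+c", "ab?c")
}

# ab*c allows any number of b, ab+c needs at least one,
# and ab?c allows at most one.
```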
A concatenated regular expression, e1e2, matches a match to
e1 followed by a match to e2.
Page 1 Plan 9 (printed 10/26/25)
An alternative regular expression, e0|e1, matches either a
match to e0 or a match to e1.
A match to any part of a regular expression extends as far
as possible without preventing a match to the remainder of
the regular expression.
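The rule above can be observed with a repetition followed by more
pattern. The sketch assumes Python's re module as a stand-in
engine. Engines differ in how they choose among alternatives, so
the example deliberately avoids `|'.

```python
import re

# Alone, a* extends as far as possible and takes all three 'a'.
alone = re.match(r"a*", "aaab").group(0)

# Followed by "ab", a* gives back one 'a' so that the remainder
# of the expression can still match.
m = re.match(r"(a*)(ab)", "aaab")
first_part, remainder = m.group(1), m.group(2)
```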
SEE ALSO
awk(1), ed(1), grep(1), sam(1), sed(1), regexp(2)