regular expression converter
English-Russian Dictionary of Computer Terms:
regular expression
= regexp
Russian: регулярное выражение
A notation for describing text fragments (patterns) in "find" and "find-and-replace" operations. Regular expressions were originally developed for the Unix text-processing tools (awk and sed) and are now supported by many programming languages and applications. The notation is fairly hard to learn, but it is far more powerful than the text-search tools usually shipped with end-user applications and operating systems. A regular expression contains both ordinary characters and metacharacters, which have special meanings. Regular expressions grew out of several branches of mathematical theory, such as automata theory and formal language theory.
Free On-line Dictionary of Computing:
regular expression
1. (regexp, RE) One of the wild card patterns used by Perl
and other languages, following Unix utilities such as grep,
sed, and awk, and editors such as vi and Emacs. Regular
expressions use conventions similar to but more elaborate than
those described under glob. A regular expression is a sequence
of characters with the following meanings:
An ordinary character (not one of the special characters
discussed below) matches that character.
A backslash (\) followed by any special character matches the
special character itself. The special characters are:
"." matches any character except NEWLINE; "RE*" (where
the "*" is called the "Kleene star") matches zero or more
occurrences of RE. If there is any choice, the longest
leftmost matching string is chosen, in most regexp flavours.
"^" at the beginning of an RE matches the start of a line and
"$" at the end of an RE matches the end of a line.
[string] matches any one character in that string. If the
first character of the string is a "^" it matches any
character except the remaining characters in the string (and
also usually excluding NEWLINE). "-" may be used to indicate
a range of consecutive ASCII characters.
\( RE \) matches whatever RE matches and \n, where n is a
digit, matches whatever was matched by the RE between the nth
\( and its corresponding \) earlier in the same RE. Many
flavours use ( RE ) instead of \( RE \).
The concatenation of REs is a RE that matches the
concatenation of the strings matched by each RE. RE1 | RE2
matches whatever RE1 or RE2 matches.
\< matches the beginning of a word and \> matches the end of a
word. In many flavours of regexp, \> and \< are replaced by
"\b", the special character for "word boundary".
RE\{m\} matches m occurrences of RE. RE\{m,\} matches m or
more occurrences of RE. RE\{m,n\} matches between m and n
occurrences.
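The constructs listed above can be tried out in a short sketch. It assumes Python's standard `re` module, which is only one flavour among many: it writes groups as ( RE ), repetition counts as RE{m,n}, and word boundaries as \b rather than the backslashed forms shown above. The sample strings and variable names are invented for illustration.

```python
import re

text = "the cat sat on the mat\nconcatenate"

# A backslash makes a special character literal; an unescaped "." is a wildcard.
escaped_dot = re.search(r"1\.5", "105 1.5").group()    # '1.5'
wildcard_dot = re.search(r"1.5", "105 1.5").group()    # '105'

# "." stops at a newline, and "*" repeats the preceding RE zero or more times.
dot_star = re.search(r"c.*", text).group()             # 'cat sat on the mat'

# "^" and "$" anchor to line starts and ends (per line with re.MULTILINE).
line_starts = re.findall(r"^\w+", text, re.MULTILINE)  # ['the', 'concatenate']
line_ends = re.findall(r"\w+$", text, re.MULTILINE)    # ['mat', 'concatenate']

# [string] matches one character of the set, a leading "^" negates the set,
# and "-" denotes a range.
vowels = re.findall(r"[aeiou]", "regexp")              # ['e', 'e']
non_digits = re.findall(r"[^0-9]", "a1b22c")           # ['a', 'b', 'c']

# ( RE ) groups, and \1 matches again whatever group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "it was was fine").group(1)  # 'was'

# RE1|RE2 matches either alternative.
pets = re.findall(r"cat|dog", "cat, dog, cow")         # ['cat', 'dog']

# \b is this flavour's word-boundary marker (in place of \< and \>).
whole_word = re.findall(r"\bcat\b", "cat concatenate cat.")  # ['cat', 'cat']

# RE{m}, RE{m,} and RE{m,n} bound the number of repetitions.
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None     # True
at_least_two = re.fullmatch(r"a{2,}", "a") is not None       # False
two_to_three = re.search(r"a{2,3}", "aaaaa").group()         # 'aaa'
```

Python's `re` chooses the leftmost match and makes each repetition greedy, which is not always the same as the "longest leftmost" rule described above.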
The exact details of how regexp will work in a given
application vary greatly from flavour to flavour. A
comprehensive survey of regexp flavours is found in Friedl
1997 (see below).
[Jeffrey E.F. Friedl, "Mastering Regular Expressions",
http://enterprise.ic.gc.ca/~jfriedl/regex/index.html,
O'Reilly, 1997].
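As a small illustration of how much flavours differ, the sketch below feeds the backslashed forms from the list above to Python's `re` module (chosen here only as a convenient example flavour). In that module a backslash before a punctuation character makes it literal, so \( \) and \< are not operators. The sample strings are invented.

```python
import re

# In Python's re, "\(" and "\)" match literal parentheses, not a group.
literal_parens = re.search(r"\(ab\)", "x(ab)y").group()   # '(ab)'
no_plain_match = re.search(r"\(ab\)", "xaby")             # None

# Unescaped parentheses are the grouping operator in this flavour.
grouped = re.search(r"(ab)\1", "xababy").group()          # 'abab'

# "\<" is just a literal "<" here; "\b" is the word-boundary marker.
literal_angle = re.search(r"\<cat", "<cat>").group()      # '<cat'
no_word_start = re.search(r"\<cat", "a cat")              # None
boundary = re.search(r"\bcat\b", "a cat here").group()    # 'cat'
```

A pattern copied from one tool to another can therefore still compile but match something quite different, so it is worth checking the target flavour's documentation.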
2. Any description of a pattern composed from combinations
of symbols and the three operators:
Concatenation - pattern A concatenated with B matches a match
for A followed by a match for B.
Or - pattern A-or-B matches either a match for A or a match
for B.
Closure - zero or more matches for a pattern.
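The three operators are enough to build a working matcher. The following is a minimal sketch, not a reference implementation: the class names `Sym`, `Cat`, `Or`, `Star` and the helpers `ends` and `matches` are hypothetical, chosen for this example. Each pattern is evaluated by computing the set of positions at which a match starting at a given position can end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A single symbol that matches itself."""
    char: str


@dataclass(frozen=True)
class Cat:
    """Concatenation: a match for left followed by a match for right."""
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    """Or: a match for left or a match for right."""
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    """Closure: zero or more matches for inner."""
    inner: object


def ends(pattern, text, start):
    """Return every position where a match beginning at start can end."""
    if isinstance(pattern, Sym):
        if start < len(text) and text[start] == pattern.char:
            return {start + 1}
        return set()
    if isinstance(pattern, Cat):
        return {end
                for mid in ends(pattern.left, text, start)
                for end in ends(pattern.right, text, mid)}
    if isinstance(pattern, Or):
        return ends(pattern.left, text, start) | ends(pattern.right, text, start)
    if isinstance(pattern, Star):
        reached = {start}            # zero repetitions always succeed
        frontier = {start}
        while frontier:              # stops once no new position is found
            found = set()
            for pos in frontier:
                found |= ends(pattern.inner, text, pos)
            frontier = found - reached
            reached |= frontier
        return reached
    raise TypeError(f"unknown pattern node: {pattern!r}")


def matches(pattern, text):
    """True if the pattern matches the whole of text."""
    return len(text) in ends(pattern, text, 0)


# (a|b)*c written with the three operators
a_or_b_star_c = Cat(Star(Or(Sym("a"), Sym("b"))), Sym("c"))
```

With this pattern, `matches(a_or_b_star_c, "abbac")` and `matches(a_or_b_star_c, "c")` are true, while `matches(a_or_b_star_c, "abca")` is false. Tracking sets of positions rather than backtracking keeps the closure case from looping forever when the inner pattern can match the empty string.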
The earliest form of regular expressions (and the term itself)
was invented by mathematician Stephen Cole Kleene in the
mid-1950s, as a notation to easily manipulate "regular sets",
formal descriptions of the behaviour of finite state machines,
in regular algebra.
[S.C. Kleene, "Representation of events in nerve nets and
finite automata", 1956, Automata Studies. Princeton].
[J.H. Conway, "Regular algebra and finite machines", 1971, Eds
Chapman & Hall].
[Sedgewick, "Algorithms in C", page 294].
(2004-02-01)