Regular expressions become error-prone when they define a complex
language, even for those experienced in writing grammars. Determine
whether several smaller regular expressions can replace one large
regular expression. Also, subject your regular expression to thorough
testing techniques such as equivalence partitioning, boundary value
analysis, and robustness testing. Even after testing has established a
reasonable level of confidence, a regular expression may not be
foolproof. If an exploit slips through, record the exploit and refactor
your regular expression.
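
The advice above can be sketched in Python as follows. This is only an illustration under stated assumptions: the IPv4 dotted-quad format is a hypothetical target language chosen for the example, and the names OCTET, is_octet, and is_ipv4 are invented here. The large pattern is assembled from one small pattern that can be tested on its own.

```python
import re

# Small building block: a single decimal octet in the range 0..255,
# with no leading zeros. (Hypothetical example format, not from the entry.)
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
OCTET_RE = re.compile(OCTET)

# The larger expression is composed from the small one rather than
# written out by hand as one long pattern.
IPV4_RE = re.compile(r"\.".join([OCTET] * 4))


def is_octet(text: str) -> bool:
    # fullmatch requires the whole string to match, so no anchors are needed.
    return OCTET_RE.fullmatch(text) is not None


def is_ipv4(text: str) -> bool:
    return IPV4_RE.fullmatch(text) is not None


# Boundary value analysis on the small pattern.
assert is_octet("0") and is_octet("255")
assert not is_octet("256") and not is_octet("-1")

# Equivalence partitions on the composed pattern: valid, too few fields,
# out-of-range field.
assert is_ipv4("192.168.0.1")
assert not is_ipv4("192.168.0")
assert not is_ipv4("192.168.0.256")

# Robustness: empty input and trailing whitespace are rejected.
assert not is_ipv4("")
assert not is_ipv4("192.168.0.1\n")
```

If an input that should have been rejected is later found to pass, one way to follow the advice is to add it to these assertions before refactoring the pattern, so the exploit stays recorded as a regression test.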
Other Notes
Keywords: regexp
This weakness can seem to overlap with whitelist/blacklist problems, but
it is intended to cover improperly written regular expressions, regardless
of the values that those regular expressions match. While whitelists and
blacklists are often implemented using regular expressions, they can be
implemented using other mechanisms as well.
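
The distinction can be illustrated with a minimal Python sketch. The host names and all function names below are hypothetical. The same whitelist is expressed three ways: with an improperly written regular expression (unescaped dots, prefix match only), with a corrected one, and with a plain set lookup that uses no regular expression at all.

```python
import re

# Hypothetical whitelist values, chosen only for illustration.
ALLOWED_HOSTS = frozenset({"example.com", "www.example.com"})

# Improperly written: "." matches any character, and match() only
# checks that the string starts with the pattern.
FLAWED_RE = re.compile(r"(www.)?example.com")

# Corrected: dots are escaped and the whole string must match.
FIXED_RE = re.compile(r"(?:www\.)?example\.com")


def allowed_flawed(host: str) -> bool:
    return FLAWED_RE.match(host) is not None


def allowed_fixed(host: str) -> bool:
    return FIXED_RE.fullmatch(host) is not None


def allowed_set(host: str) -> bool:
    # A whitelist implemented without any regular expression.
    return host in ALLOWED_HOSTS


# The intended values are the same in all three cases...
assert allowed_flawed("example.com")
assert allowed_fixed("example.com")
assert allowed_set("example.com")

# ...but only the flawed expression lets other hosts through.
assert allowed_flawed("example.com.evil.net")
assert allowed_flawed("exampleXcom")
assert not allowed_fixed("example.com.evil.net")
assert not allowed_set("example.com.evil.net")
```

The whitelist values are correct in every variant; the fault in the first one lies only in how the expression is written.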
Regexp errors are likely a primary factor in many multi-factor
vulnerabilities (MFVs), especially those that require multiple
manipulations to exploit. However, they are rarely diagnosed at this
level of detail.