Vim Regex Summary
Shamelessly stolen from Mastering Vim
Overview

* Vim uses an extended version of regular expressions that is not the same as vi's
* It's also not the same as Perl's (but of comparable power)
* The basic rule is that almost every character matches itself
* With a very few exceptions, only backslash-escaped characters are special
* The main features are:

    Subpattern...             Matches...
    .                         ...any character except newline
    *                         ...zero or more of the preceding
    ^                         ...start of line (only at start of pat)
    $                         ...end of line (only at end of pat)
    [...]                     ...an explicit character class
    \<numerous characters>    ...special behaviour
    \\                        ...a literal backslash
    <any other character>     ...itself

Characters and character classes

* The following backslashed characters are short-hands for one or more "difficult" characters:

    Escape    Matches...
    \^        literal '^'
    \$        literal '$'
    \_.       any character, including newline
    \a        alphabetic character: [A-Za-z]
    \A        non-alphabetic character: [^A-Za-z]
    \b        <BS>
    \d        digit: [0-9]
    \D        non-digit: [^0-9]
    \e        <ESC>
    \f        any character that might appear in a filename
    \F        like "\f", but excluding digits
    \h        head of word character: [A-Za-z_]
    \H        non-head of word character: [^A-Za-z_]
    \i        identifier character
    \I        like "\i", but excluding digits
    \k        keyword character
    \K        like "\k", but excluding digits
    \l        lowercase character: [a-z]
    \L        non-lowercase character: [^a-z]
    \n        end-of-line
    \o        octal digit: [0-7]
    \O        non-octal digit: [^0-7]
    \p        printable character
    \P        like "\p", but excluding digits
    \r        <CR>
    \s        whitespace character: <SPACE> and <TAB>
    \_s       whitespace character: <SPACE>, <TAB>, and newline
    \S        non-whitespace character; opposite of \s
    \t        <TAB>
    \u        uppercase character: [A-Z]
    \U        non-uppercase character: [^A-Z]
    \w        word character: [0-9A-Za-z_]
    \W        non-word character: [^0-9A-Za-z_]
    \x        hex digit: [0-9A-Fa-f]
    \X        non-hex digit: [^0-9A-Fa-f]
    \%o<n>    specified octal character
    \%d<n>    specified decimal character
    \%x<n>    specified hex character
    \%u<n>    specified multibyte character
    \%U<n>    specified large multibyte character

Repetitions

* Specified by a suffix on the preceding (atomic) item
* Zero-or-more: *
* One-or-more: \+
* Zero-or-one: \?
* Exactly-M: \{M}
* M-to-N: \{M,N}
* M-or-more: \{M,}
* Zero-to-N: \{,N}
* For "as few as possible", put a minus sign in front of the first number: \{-M,N}
* For example, to match a double-quoted string at least one character long:

    /".\{-1,}"

* As a special case of that, \{-} minimally matches zero-or-more
* For example, to match everything up to the first occurrence of "__END__":

    /\_.\{-}__END__

Alternatives, synternatives, and sequences

* Alternatives are specified with: \|
* For example:

    /perl\|python\|php/

* "Synternatives" are alternatives where *both* sides have to match
    * Specified with: \&
    * The "and" equivalent of \|'s "or"
    * Every side must match starting at the same position; the text matched is that of the last side
* For example, to find a line containing the word "Java" and the word "line":

    /.*Java\&.*line

* Sequences are successive characters that may be truncated at any point
    * They are specified with: \%[...]
    * Sequences are very handy when searching for terms that might be abbreviated
* For example:

    /fun\%[ction]

* ...is the same as (longest alternative first, because \| uses the first branch that matches, whereas \%[...] matches as much as possible):

    /function\|functio\|functi\|funct\|func\|fun

Context specifiers

* The ^ and $ markers allow you to constrain where a match can occur
* There are many other such constraint specifiers
* For example, \_^ and \_$, which are the same as ^ and $, except they can appear anywhere in a pattern (which is particularly handy in alternations and synternations)
* Also \%^ and \%$: match start and end of file
* Other positional constraints include:
    * ...match only at current cursor position: \%#
    * ...match only at line N: \%Nl
    * ...match only at column N: \%Nc
    * ...match only at virtual column N: \%Nv (allowing for tabs)
* You can also put a < or > after the % to indicate "before" or "after" the specified row/column (for example, \%<10l matches only above line 10)
* The \< and \> subpatterns only match at the start/end of a word, for example:

    /\<for\>

* ...matches "for", but not "fortune" nor "wherefor" nor "enforce"

Match boundaries

* Sometimes you want to use a pattern in a substitution, where you need to match a certain line, but only change part of it
* To make that easy, Vim provides the \zs and \ze specifiers
* They allow you to mark where the pattern should be considered to have matched from and to (as opposed to where it actually matched)
* Suppose you want to find every call to the function 'update' (provided its first argument starts with a digit) and change that call to a call to 'update_num'
* You could do that with:

    :%s/\s*\zsupdate\ze(\d/update_num/g

* The \zs and \ze tell the substitution that, if it successfully matches the entire pattern...
* ...it should pretend that it only matched from just after the \zs to just before the \ze
* ...so that the substitution only replaces the "update" part of the match (i.e. not the leading whitespace or the trailing paren+digit)
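
The \zs and \ze substitution can be tried without touching a buffer by passing the same pattern to Vim's built-in substitute() and matchstr() functions. What follows is a minimal sketch in legacy Vim script; the sample text and the variable names (sample, renamed) are invented for illustration and are not part of the original notes.

```vim
" Hypothetical input: three calls to update(); only the two whose first
" argument starts with a digit should be renamed.
let sample = '  update(3, x); update(name); update(42)'

" Same pattern as in the :%s command. substitute() always treats the
" pattern as if 'magic' were set, so no extra escaping is needed.
let renamed = substitute(sample, '\s*\zsupdate\ze(\d', 'update_num', 'g')

" Expected value: '  update_num(3, x); update(name); update_num(42)'
echo renamed

" matchstr() returns only the text between \zs and \ze.
" Expected value: 'update'
echo matchstr(sample, '\s*\zsupdate\ze(\d')
```

The leading whitespace and the "(3" or "(42" still have to be present for the pattern to match, but they fall outside the reported match, so the replacement leaves them alone.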
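
The repetition, synternative, sequence, and word-boundary patterns from the earlier sections can also be checked from a script, since matchstr() and match() apply a pattern to a string instead of a buffer. This is a rough sketch in legacy Vim script, assuming default settings for 'ignorecase' and 'iskeyword'; all of the sample strings are made up.

```vim
" Greedy \{1,} runs to the last quote; minimal \{-1,} stops at the first.
" Expected: "hi" and "bye"
echo matchstr('say "hi" and "bye"', '".\{1,}"')
" Expected: "hi"
echo matchstr('say "hi" and "bye"', '".\{-1,}"')

" Synternative: both sides must match at the same starting position.
" Expected: Java online
echo matchstr('Java online', '.*Java\&.*line')
" Expected: an empty string, because "line" never appears
echo matchstr('Java only', '.*Java\&.*line')

" Sequence: as much of "ction" as possible is matched after "fun".
" Expected: func
echo matchstr('call func()', 'fun\%[ction]')

" Word boundaries: only the standalone "for" matches.
" Expected: 8 (the byte index where that word starts)
echo match('fortune for enforce', '\<for\>')
```

Comments sit on their own lines because :echo treats a double quote as part of its argument, so a trailing comment on the same line would not work.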