Title: Regular Expressions
1 Regular Expressions
- A regular expression is a pattern that describes a string or a portion of one. When the pattern is compared against a string, it either matches or it does not. If it matches, the function returns something.
- The return value depends on the specific function used and its arguments.
2 Regular Expressions: Basics
- Basics of the function
- REFind(reg_expression, string [, start] [, return_sub])
- Compares a regular expression to a string and, if it matches all or part of the string, returns the numeric position in the string where the match starts. If nothing matches, it returns 0.
- The optional start position allows the search to start anywhere in the string.
- An additional option is to return subexpressions.
- We'll deal with that a little later.
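As a minimal sketch of the start argument described above (the variable names and sample string are made up for illustration), the same pattern can be searched from two different positions:

```cfml
<cfscript>
// Illustrative sketch: variable names are hypothetical.
sample = "This is a test";

// Case-sensitive search for "is" from the start of the string.
firstHit = REFind("is", sample);        // 3, the "is" inside "This"

// Same search starting at position 4, so the "is" inside "This" is skipped.
// The returned position is still counted from the start of the string.
secondHit = REFind("is", sample, 4);    // 6

// A pattern that does not occur anywhere returns 0.
noHit = REFind("xyz", sample);          // 0

writeOutput(firstHit & " " & secondHit & " " & noHit);
</cfscript>
```

Because 0 means "no match", the result can be used directly in a condition.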
3 Regular Expressions: Basics
- Any ASCII character that is not a special character matches itself.
- A matches A
- b matches b
- A does not match a unless the NoCase version of the function is used
- REFindNoCase()
- This is slightly slower, but only someone totally anal about run times will be able to tell you how much slower.
4 Regular Expressions: Basics
- REFindNoCase('is', 'This is a test') returns 3
- REFindNoCase('This', 'This is a test') returns 1
- REFind('t', 'This is a test') returns 11
5 Regular Expressions: Special Characters
- A period (.) matches any single character.
- A pipe (|) means either what comes before it or what comes after it.
- A caret (^) at the beginning of a regex means that the regex will only match if it starts at the beginning of the comparison string.
- A dollar sign ($) at the end of a regex means that the regex will only match if it ends at the end of the comparison string.
6 Regular Expressions: Special Characters
- A backslash (\) means escape the next character if it is a special one.
- If the character after the backslash is not a special one, then it may be an escape sequence.
- Matching a literal backslash is done by escaping it (\\).
7 Regular Expressions: Special Characters
- REFindNoCase('i.', 'This is a test') returns 3
- REFindNoCase('^is', 'This is a test') returns 0
- REFindNoCase('^t', 'This is a test') returns 1
- REFindNoCase('t$', 'This is a test') returns 14
- REFindNoCase('th|te', 'This is a test') returns 1
- REFind('th|te', 'This is a test') returns 11
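The escaping rule can be sketched with a short example. The file name and path below are made up for illustration, and the sketch assumes that CFML string literals pass backslashes through to the regex engine unchanged:

```cfml
<cfscript>
// Illustrative values only; the file name and path are hypothetical.
fileName = "file.txt";

// An unescaped period matches any character, so the match is at position 1.
anyChar = REFind(".", fileName);        // 1

// An escaped period matches only a literal ".", found at position 5.
literalDot = REFind("\.", fileName);    // 5

// "\\" reaches the regex engine as two characters, which it reads as
// one literal backslash.
winPath = "C:\temp";
literalSlash = REFind("\\", winPath);   // 3

writeOutput(anyChar & " " & literalDot & " " & literalSlash);
</cfscript>
```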
8 Regular Expressions: Escape Sequences
- When certain non-special characters have a backslash (\) before them, they become special.
- REFindNoCase('\d', 'this is 4') returns 9
- \d means any digit
- REFindNoCase('is \d', 'this is 4') returns 6
9 Regular Expressions: Sets
- A character set is a group of characters from which only one is desired.
- [0123456789] matches any single digit
- Sets can use ranges of characters (think ASCII table).
- [0-9] matches any single digit
- A dash can be represented in a set by placing it first (i.e. not in a range).
- [-aeiou] matches a dash or a vowel
- A caret (^) at the beginning of a set negates it (i.e. anything BUT the characters in the set).
10 Regular Expressions: Sets
- REFindNoCase('[AEIOU]', 'This is a test') returns 3
- REFindNoCase('[0-9]', 'this is a test') returns 0
- REFindNoCase('[0-9]', 'this is a 4th test') returns 11
- REFindNoCase('[-0-9]', 'this-is a test') returns 5
- REFindNoCase('[^0-9]', 'this is a test') returns 1
- REFindNoCase('[^-]', 'this is a test') returns 1
- REFindNoCase('[-]', 'this-is a ') returns 5
11 Regular Expressions: Sets
- ColdFusion also includes a number of predefined sets.
- A predefined set is called using a special name surrounded by colons.
- [:alpha:]
- Used within a set, it would look like [[:alpha:]]
- REFindNoCase('[[:alpha:]]', '123abc') returns 4
- Can be combined with other characters in a set
- REFindNoCase('[123[:alpha:]]', '123abc') returns 1
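A few more predefined sets, as a hedged sketch. It assumes the POSIX-style names digit and space are available alongside alpha, and the sample strings are made up for illustration:

```cfml
<cfscript>
// Sample strings are hypothetical.
// First digit in the string.
firstDigit = REFind("[[:digit:]]", "abc123");        // 4

// First whitespace character in the string.
firstSpace = REFind("[[:space:]]", "no space");      // 3

// A predefined set can be negated like any other set:
// the first character that is NOT a letter.
firstNonAlpha = REFind("[^[:alpha:]]", "abc-def");   // 4

writeOutput(firstDigit & " " & firstSpace & " " & firstNonAlpha);
</cfscript>
```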
12 Regular Expressions: Groups
- A group allows a portion of a regular expression to be separated from another portion.
- Also known as subexpressions
- Uses parentheses to group things together
- REFindNoCase('(this|that)', 'find this') returns 6
- More uses later
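13 Regular Expressions: Modifiers
- A modifier takes the previous character, set or group and says how many times it can or should exist.
- REFindNoCase('ha+', 'hahaha') returns 1 (+ means one or more)
- REFindNoCase('ha*', 'hhaha') returns 1 (* means zero or more)
- REFindNoCase('ha?', 'hahaha') returns 1 (? means zero or one)
- REFindNoCase('ha{2}', 'hahaaha') returns 3 ({2} means exactly two)
- REFindNoCase('ha{2,3}', 'hahaaha') returns 3 ({2,3} means two to three)
- REFindNoCase('ha{3,}', 'hahaha') returns 0 ({3,} means three or more)
- REFindNoCase('(ha)+', 'hahaha') returns 1 (the modifier applies to the whole group)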
14 Regular Expressions: Modifiers
- Normal modifiers are greedy, i.e. they want to match as much as they can.
- Using a question mark (?) after a modifier makes it lazy, i.e. it will match as little as possible.
- REFindNoCase('a+', 'baaaa', 1, 1)
- will match aaaa
- REFindNoCase('a+?', 'baaaa', 1, 1)
- will match a
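The greedy and lazy forms can be compared side by side. This is a sketch only: the variable names are made up, and it assumes the pos and len arrays that REFindNoCase() returns when its fourth argument is true:

```cfml
<cfscript>
// Illustrative sketch: variable names are hypothetical.
sample = "baaaa";

// Greedy: a+ takes as many a's as it can.
greedy = REFindNoCase("a+", sample, 1, true);
greedyText = Mid(sample, greedy.pos[1], greedy.len[1]);   // aaaa

// Lazy: a+? stops as soon as the pattern is satisfied.
lazy = REFindNoCase("a+?", sample, 1, true);
lazyText = Mid(sample, lazy.pos[1], lazy.len[1]);         // a

writeOutput(greedyText & " / " & lazyText);
</cfscript>
```

Both matches start at position 2; only the length differs.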
15 Regular Expressions: Line Modifiers
- A line modifier changes how a regular expression is processed.
- REFind('(?i)This', 'this') returns 1
- (?i) means perform a case-insensitive search
- REFindNoCase('(?x)is a', 'this is a isa') returns 11
- (?x) means ignore spaces in the pattern
- REFindNoCase('(?m)^line3', str) returns 13, where str holds the three lines line1, Line2 and line3, separated by single newline characters
- (?m) means pay attention to the lines: ^ and $ match at the start and end of each line
16 Regular Expressions: Returning Structures
- Rather than returning a number, a regular expression function can be set to return a structure.
- The structure will contain 2 keys named pos and len.
- Each key holds an array: pos holds the start position of each match and len holds its length.
- The first item always contains the entire match and all others contain matches from subexpressions.
- Use the Mid() function to get easy access to the returned data.
17 Regular Expressions: Returning Structures
- The start location must be specified and the 4th argument must be set to yes (1, true).
- String = 'this is a finder'
- Test = REFindNoCase('f[aeiou]n.', string, 1, 1)
- Mid(string, test.pos[1], test.len[1]) returns find
- Test = REFindNoCase('f([aeiou])n.', string, 1, 1)
- Mid(string, test.pos[1], test.len[1]) returns find
- Mid(string, test.pos[2], test.len[2]) returns i
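When a pattern has several groups, the pos and len arrays can be walked in a loop. The following is a sketch under stated assumptions: the second group (.) is added here purely for illustration, and the variable names are hypothetical:

```cfml
<cfscript>
// Illustrative sketch: names and the extra (.) group are hypothetical.
sample = "this is a finder";
result = REFindNoCase("f([aeiou])n(.)", sample, 1, true);

found = ArrayNew(1);

// With no match, pos[1] is 0 and Mid() would fail, so check first.
if (result.pos[1] gt 0) {
    // Item 1 is the whole match; items 2 and up are the groups, in order.
    for (i = 1; i lte ArrayLen(result.pos); i = i + 1) {
        ArrayAppend(found, Mid(sample, result.pos[i], result.len[i]));
    }
}

writeOutput(ArrayToList(found));   // find,i,d
</cfscript>
```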
18 Regular Expressions: Replacing
- REReplace(string, regex, replace, scope)
- Replaces the regex match in the string with the replace value
- Scope is one (default) or all
- Show lots of examples here on in ?
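A few REReplace() calls, as a minimal sketch. The sample strings and variable names are made up, and the last call assumes that \1 and \2 in the replace value refer back to the first and second groups of the regex:

```cfml
<cfscript>
// Illustrative sketch: strings and names are hypothetical.
sample = "one fish, two fish";

// Default scope ("one"): only the first match is replaced.
firstOnly = REReplace(sample, "fish", "cat");           // one cat, two fish

// Scope "all": every match is replaced.
everyMatch = REReplace(sample, "fish", "cat", "all");   // one cat, two cat

// Backreferences reuse the text captured by each group.
swapped = REReplace("2024-01", "([0-9]+)-([0-9]+)", "\2/\1");   // 01/2024

writeOutput(firstOnly & " | " & everyMatch & " | " & swapped);
</cfscript>
```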
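19 Regular Expressions: Replacing
- New in MX is the ability to modify the replace values using special escape codes.
- REReplaceNoCase('make upper', '(u.*)', '\u\1')
- returns make Upper (\u uppercases the next character)
- REReplaceNoCase('make upper', '(u.*)', '\U\1')
- returns make UPPER (\U uppercases everything that follows)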
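The case-conversion codes can be combined with scope "all". This sketch capitalises each word of a made-up sample string. The group layout, (^| ) followed by ([a-z]), is one possible approach chosen for illustration, not the only one:

```cfml
<cfscript>
// Illustrative sketch: strings and names are hypothetical.

// Group 1 is the start of the string or a space; group 2 is the first letter
// of a word. \1 puts the space back and \u uppercases only the next character.
titleCase = REReplace("hello big world", "(^| )([a-z])", "\1\u\2", "all");
// Hello Big World

// \U uppercases everything that follows in the replace value.
shout = REReplace("make upper", "(u.*)", "\U\1");
// make UPPER

writeOutput(titleCase & " | " & shout);
</cfscript>
```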