Regular Expressions in ColdFusion Applications - PowerPoint PPT Presentation

Slides: 22
Provided by: jacqueline80

1
Regular Expressions in ColdFusion Applications
Dave Fauth
DOMAIN technologies
d.fauth@domain-tech.com
Knowledge Engineering : Systems Integration : Web Development : Training
2
Regular Expressions
  • A small language in its own right for pattern
    matching and text manipulation
  • Used for client-side validation, server-side
    manipulation, and virtually any other task
    requiring string matching and manipulation
  • Enhanced in CF 4.0 to include REFindNoCase and
    REReplaceNoCase
  • Available in CF Studio, CF 4.x, and JavaScript

3
CF 4.x supported statements
  • REFind
  • REFindNoCase
  • REReplace
  • REReplaceNoCase

4
REFind
  • Returns the position of the regular expression's
    first occurrence in a block of text
  • Case sensitive
  • REFind(reg_expression, string [, start]
    [, returnsubexpressions])
  • <CFSET tmpLoc = REFind("\?", "display.cfm?a=3")>
  • <CFOUTPUT>
  • #tmpLoc#
  • </CFOUTPUT>
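
The optional arguments can be illustrated with a minimal sketch. It assumes that, when returnsubexpressions is "TRUE", REFind returns a structure holding pos and len arrays whose first element describes the whole match. The names pageUrl and hit are made up for this example.

```cfml
<!--- Sample input; pageUrl and hit are illustrative names --->
<CFSET pageUrl = "display.cfm?a=3">

<!--- Search from position 1 and ask for subexpression positions --->
<CFSET hit = REFind("\?(.*)", pageUrl, 1, "TRUE")>

<CFOUTPUT>
  <!--- pos[1]/len[1] describe the whole match, pos[2]/len[2] the first group --->
  Match starts at #hit.pos[1]#, length #hit.len[1]#<BR>
  Query string: #Mid(pageUrl, hit.pos[2], hit.len[2])#
</CFOUTPUT>
```

For this input the whole match starts at position 12 and the group captures a=3.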

5
REFindNoCase
  • Returns the position of the regular expression's
    first occurrence in a block of text
  • Case insensitive
  • REFindNoCase(reg_expression, string [, start]
    [, returnsubexpressions])
  • <CFSET myPath = "c:\reportFinder.cfm">
  • <CFSET tmpLoc = REFindNoCase("(\.cfm)", myPath)>
  • <CFOUTPUT>
  • #tmpLoc#
  • </CFOUTPUT>
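
To make the case-sensitivity difference concrete, here is a small sketch that runs the same pattern through both functions. The variable names fileName, strictLoc, and looseLoc are hypothetical.

```cfml
<CFSET fileName = "REPORT.CFM">

<!--- Case-sensitive: the lowercase pattern does not match the uppercase extension --->
<CFSET strictLoc = REFind("\.cfm", fileName)>

<!--- Case-insensitive: the same pattern now matches at the period --->
<CFSET looseLoc = REFindNoCase("\.cfm", fileName)>

<CFOUTPUT>#strictLoc# / #looseLoc#</CFOUTPUT>
```

REFind reports 0 (no match), while REFindNoCase reports 7, the position of the period.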

6
REReplace
  • Returns a string with the text matched by the
    regular expression replaced by a substring, in
    the specified scope
  • Case sensitive
  • REReplace(string, reg_expression, substring
    [, scope])
  • <CFSET myPath = "c:\reportFinder.cfm">
  • <CFSET tmpLoc = REReplace(myPath, "[a-z]", "d")>
  • <CFOUTPUT>
  • #tmpLoc#
  • </CFOUTPUT>

7
REReplaceNoCase
  • Returns a string with the text matched by the
    regular expression replaced by a substring, in
    the specified scope
  • Case insensitive
  • REReplaceNoCase(string, reg_expression, substring
    [, scope])
  • <CFSET myPath = "c:\reportFinder.cfm">
  • <CFSET tmpLoc = REReplaceNoCase(myPath, "[A-Z]", "d")>
  • <CFOUTPUT>
  • #tmpLoc#
  • </CFOUTPUT>
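
The effect of the scope argument is easiest to see side by side. The sketch below assumes the scope defaults to "ONE" when omitted; the variable names are illustrative only.

```cfml
<CFSET myPath = "c:\reportFinder.cfm">

<!--- Default scope: only the first lowercase letter (the drive letter) is replaced --->
<CFSET firstOnly = REReplace(myPath, "[a-z]", "d")>

<!--- "ALL": every lowercase letter is replaced; the uppercase F is left alone --->
<CFSET allLower = REReplace(myPath, "[a-z]", "d", "ALL")>

<!--- Case-insensitive variant: the F is replaced as well --->
<CFSET allLetters = REReplaceNoCase(myPath, "[a-z]", "d", "ALL")>

<CFOUTPUT>#firstOnly#<BR>#allLower#<BR>#allLetters#</CFOUTPUT>
```

The first result is d:\reportFinder.cfm; the other two differ only in whether the F survives.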

8
Single Character Matching
  • Match a single character
  • Extensive set of rules for doing single-character
    matching
  • Rules include
  • Special characters are + * ? . [ ^ $ ( ) { | \
  • Any character that is not a special character
    matches itself
  • A backslash escapes a special character
  • A period matches any character except the newline
  • A set of characters in brackets is a one-character
    RE that matches any of the characters in the set.
    [AKM] matches A or K or M
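
A few one-line calls can illustrate these rules. This is only a sketch with made-up sample strings and variable names.

```cfml
<!--- A bracketed set matches any one of its characters: K is found at position 2 --->
<CFSET setLoc = REFind("[AKM]", "OKLAHOMA")>

<!--- An unescaped period matches any character, so the match is at position 1 --->
<CFSET anyLoc = REFind(".", "report.cfm")>

<!--- An escaped period matches only a literal period, at position 7 --->
<CFSET dotLoc = REFind("\.", "report.cfm")>

<CFOUTPUT>#setLoc# #anyLoc# #dotLoc#</CFOUTPUT>
```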

9
Single Character Matching Cont.
  • Rules Cont.
  • Any regular expression can be followed by {m,n},
    which forces a match of m through n occurrences of
    the preceding regular expression. Example: a{2,4}
    matches aa, aaa, aaaa
  • A range of characters can be indicated with a
    dash. Example: [A-Z] matches all uppercase
    letters. If the first character of the set is a
    caret (^), the RE matches any character except
    those in the set. E.g. [^AEIOU] matches any
    character that is not an uppercase vowel, which
    includes all uppercase consonants
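
The two rules can be tried out with the following sketch. It assumes the pos/len structure returned when returnsubexpressions is "TRUE"; run and masked are hypothetical names.

```cfml
<!--- a{2,4} takes as many a's as allowed: 4 of the 5, starting at position 2 --->
<CFSET run = REFind("a{2,4}", "baaaaad", 1, "TRUE")>

<!--- A negated set matches everything outside it, including lowercase letters and the space --->
<CFSET masked = REReplace("HELLO world", "[^AEIOU]", "-", "ALL")>

<CFOUTPUT>#run.pos[1]# #run.len[1]# #masked#</CFOUTPUT>
```

The output is 2 4 -E--O------, which shows that a negated set is broader than "the other letters of the same case".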

10
Multi-Character Regular Expressions
  • You can use the following rules to build
    multi-character regular expressions
  • Parentheses group parts of regular expressions
    together into grouped sub-expressions that can be
    treated as a single unit. For example, (ha)+
    matches one or more occurrences of "ha".
  • A one-character regular expression or grouped
    sub-expression followed by an asterisk (*)
    matches zero or more occurrences of the regular
    expression. For example, [a-z]* matches zero or
    more lower-case characters.
  • A one-character regular expression or grouped
    sub-expression followed by a question mark (?)
    matches zero or one occurrence of the regular
    expression. For example, xy?z matches either
    "xyz" or "xz".

11
Multi-Character cont.
  • The concatenation of regular expressions creates
    a regular expression that matches the
    corresponding concatenation of strings. For
    example, [A-Z][a-z]* matches any capitalized
    word.
  • The OR character (|) allows a choice between two
    regular expressions. For example, jell(y|ies)
    matches either "jelly" or "jellies".
  • Braces ({}) are used to indicate a range of
    occurrences of a regular expression, in the form
    {m,n}, where m is an integer greater than or equal
    to zero indicating the start of the range and n
    is equal to or greater than m, indicating the end
    of the range. For example, (ba){0,3} matches up
    to three pairs of the expression "ba".
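
The multi-character rules from the last two slides can be exercised with a short sketch. The sample strings and variable names are invented for illustration, and the pos/len structure is assumed as the return value when returnsubexpressions is "TRUE".

```cfml
<!--- Grouping with a repeat: (ha)+ matches "hahaha", position 2, length 6 --->
<CFSET laugh = REFind("(ha)+", "ahahaha!", 1, "TRUE")>

<!--- Optional character: xy?z matches "xz" at position 1 --->
<CFSET optLoc = REFind("xy?z", "xz")>

<!--- Concatenation: [A-Z][a-z]* finds "World", position 7, length 5 --->
<CFSET word = REFind("[A-Z][a-z]*", "hello World", 1, "TRUE")>

<!--- Alternation: both spellings are replaced, giving "jam and jam" --->
<CFSET jam = REReplace("jelly and jellies", "jell(y|ies)", "jam", "ALL")>

<!--- Bounded repeat: (ba){0,3} takes three of the four pairs, length 6 --->
<CFSET pairs = REFind("(ba){0,3}", "babababa", 1, "TRUE")>

<CFOUTPUT>
  #laugh.pos[1]#/#laugh.len[1]# #optLoc# #word.pos[1]#/#word.len[1]#
  #jam# #pairs.len[1]#
</CFOUTPUT>
```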

12
Character Classes
  • Special commands that can take the place of
    character ranges.
  • CF uses double brackets: [[:alpha:]]
  • Cold Fusion supports the following character
    classes
  • [:alpha:] Matches any letter. Same as [A-Za-z].
  • [:upper:] Matches any upper-case letter. Same as
    [A-Z].
  • [:lower:] Matches any lower-case letter. Same as
    [a-z].
  • [:digit:] Matches any digit. Same as [0-9].
  • [:alnum:] Matches any alphanumeric character. Same
    as [A-Za-z0-9].

13
Character Classes cont.
  • [:xdigit:] - Matches any hexadecimal digit. Same as
    [0-9A-Fa-f].
  • [:space:] - Matches a tab, new line, vertical tab,
    form feed, carriage return, or space.
  • [:print:] - Matches any printable character.
  • [:punct:] - Matches any punctuation character, that
    is, one of ! " # $ % & ' ( ) * + , - . / : ; < = >
    ? @ [ \ ] ^ _ ` { | } ~
  • [:graph:] - Matches any of the characters defined
    as a printable character except those defined to
    be part of the space character class.
  • [:cntrl:] - Matches any character not part of the
    character classes [:upper:], [:lower:],
    [:alpha:], [:digit:], [:punct:], [:graph:],
    [:print:], or [:xdigit:].
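
As a rough illustration of how a class stands in for a range, the sketch below runs the same replacement with [[:digit:]] and with [0-9]. The sample text and variable names are made up.

```cfml
<CFSET note = "Order 66, aisle 7">

<!--- Character class form: each run of digits becomes N --->
<CFSET masked = REReplace(note, "[[:digit:]]+", "N", "ALL")>

<!--- Equivalent explicit range --->
<CFSET sameMasked = REReplace(note, "[0-9]+", "N", "ALL")>

<!--- [[:space:]] finds the first whitespace character, at position 6 --->
<CFSET firstGap = REFind("[[:space:]]", note)>

<CFOUTPUT>#masked# | #sameMasked# | #firstGap#</CFOUTPUT>
```

Both replacements produce Order N, aisle N.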

14
Character Classes example
  • <cfset thistext = "<hr>Here is some text<hr>
  • <b>here is some bold text</b>
  • <i>Here is italic text.</i>">
  • <cfset mynewtext = REReplaceNoCase(thistext,
    "</*[[:print:]]*>", "", "ALL")>
  • <cfset mynewtext2 = REReplace(thistext,
    "<[^>]*>", "", "ALL")>

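The difference between a printable-character class and a negated set when stripping tags can be sketched as follows. This is an illustration on a made-up string, not the presenter's code; it relies on the quantifier being greedy and on > itself being a printable character.

```cfml
<CFSET markup = "<b>bold</b> text">

<!--- Greedy: [[:print:]]* runs from the first < to the last >, removing "bold" too --->
<CFSET greedy = REReplace(markup, "<[[:print:]]*>", "", "ALL")>

<!--- Negated set: [^>]* stops at the first >, so each tag is removed on its own --->
<CFSET perTag = REReplace(markup, "<[^>]*>", "", "ALL")>

<CFOUTPUT>[#greedy#] [#perTag#]</CFOUTPUT>
```

The greedy version leaves only " text", while the negated set leaves "bold text".
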
15
Back Referencing
  • Capability of regular expressions to remember a
    section of text and refer to it later
  • Parentheses provide grouping for back references
  • Groups are referred to using \1 through \9
  • Groups are counted from left to right by their
    opening parenthesis
  • ex. (a(bc)(d))
  • \1 = a(bc)(d)
  • \2 = bc
  • \3 = d
  • Powerful for search and replace functions
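
A minimal sketch of back references in a replacement string: three groups capture the parts of a date, and the replacement reorders them. The variable names isoDate and usDate are hypothetical.

```cfml
<CFSET isoDate = "1999-12-31">

<!--- \1 = year, \2 = month, \3 = day; the replacement string reorders them --->
<CFSET usDate = REReplace(isoDate,
    "([0-9]{4})-([0-9]{2})-([0-9]{2})", "\2/\3/\1")>

<CFOUTPUT>#usDate#</CFOUTPUT>
```

This prints 12/31/1999.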

16
Back Referencing example
  • <cfset secondstring = "here is my email address
    d.fauth@domain-tech.com ">
  • <CFSET NewString = REReplaceNoCase(
    secondstring, '([[:space:]])([a-z0-9\.]+@([[:print:]]+
    \.)+[a-z]{2,3})([[:space:]])', '\1<A
    HREF="mailto:\2">\2</A>\4', "ALL")>

17
Using Regular Expressions in Studio
  • Extended find and replace in Studio and HomeSite
    support regular expressions
  • Open the extended find or the extended replace
    dialog box. Check the regular expressions box.
    Type in your regular expression. The Studio RE
    engine evaluates the selected files and returns
    each matching pattern

18
Example Uses of Regular Expressions
  • Removing HTML Tags from Text
  • <cfset amazonPrice = "Our Price: $14.98">
  • <cfset amazonPrice = REFindNoCase('\$[[:digit:]]{1,4}
    \.[[:digit:]]{2}', text, 1, 1)>
  • Retrieving Information from a page
  • REFindNoCase("<body[^>]*>(.*)</body>", pagetext,
    1, "TRUE")
  • REFindNoCase("[[:upper:]]{6}-[[:digit:]]{2}-[[:digit:]]{4,6}",
    Body)
19
When Not To Use Regular Expressions
  • When it is easier to use something else
  • Example
  • <cfset myuser = "engr\dbrown">
  • Rather than write
  • <cfset testname = "engr\dbrown">
  • <cfset myUsername = REFindNoCase(".*\\(.*)", testname, 1, 1)>
  • <cfset myUsername = Mid(testname, myUsername.pos[2],
    myUsername.len[2])>
  • Write
  • <cfset myUsername = ListLast(testname, "\")>

20
Cold Fusion RE Limitation
  • Limiting input string size
  • In CFML RegExp functions such as REFind and
    REReplace, large input strings (greater than
    approximately 20,000 characters) will cause a
    debug assertion failure, and a regular expression
    error will be reported. To avoid this, break up
    your input into smaller chunks, as illustrated in
    the following example. Here the variable input
    holds more than 50,000 characters.
  • <CFSET test = mid(input, 1, 20000)>
  • <CFSET out1 = REReplace(test,
    "#Chr(9)##Chr(13)##Chr(10)##Chr(13)##Chr(10)#",
    "#chr(10)#", "ALL")>
  • <CFSET test = mid(input, 20001, 20000)>
  • <CFSET out2 = REReplace(test,
    "#Chr(9)##Chr(13)##Chr(10)##Chr(13)##Chr(10)#",
    "#chr(10)#", "ALL")>
  • <CFSET test = mid(input, 40001, len(input) -
    40000)>
  • <CFSET out3 = REReplace(test,
    "#Chr(9)##Chr(13)##Chr(10)##Chr(13)##Chr(10)#",
    "#chr(10)#", "ALL")>
  • <CFSET result = out1 & out2 & out3>
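
The same chunking idea can be written as a loop so that it works for any input length. This is a sketch rather than the presenter's code: the sample input, the simplified CR/LF pattern, and the names chunkSize, offset, and chunk are assumptions made for illustration.

```cfml
<!--- Hypothetical sample data: 60,000 characters of "a" + tab + CR + LF --->
<CFSET input = RepeatString("a" & Chr(9) & Chr(13) & Chr(10), 15000)>

<CFSET chunkSize = 20000>
<CFSET result = "">

<CFLOOP INDEX="offset" FROM="1" TO="#Len(input)#" STEP="#chunkSize#">
  <!--- Never ask Mid for more characters than remain in the string --->
  <CFSET chunk = Mid(input, offset, Min(chunkSize, Len(input) - offset + 1))>
  <!--- Process each chunk separately and append it to the result --->
  <CFSET result = result & REReplace(chunk, Chr(13) & Chr(10), Chr(10), "ALL")>
</CFLOOP>
```

One caveat applies to any fixed-offset split, including the three-part version on the slide: a match that straddles a chunk boundary is not seen by either call.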

21
Resources
  • JavaScript
  • http://developer.netscape.com/library/documentation/communicator/jsguide/regexp.htm
  • http://developer.netscape.com/docs/examples/javascript/regexp/overview.htm
  • JavaScript Bible, 3rd Edition, by Danny Goodman
  • CF Studio
  • file:///C:/PROGRAM FILES/ALLAIRE/COLDFUSION
    STUDIO4/Help/Developing_Web_Applications_with_ColdFusion/08_Regular_Expressions
  • Cold Fusion
  • Advanced Cold Fusion 4.0 Application Development
    by Ben Forta
  • CF-Talk Mailing List
  • cf-talk-request@houseoffusion.com
  • General
  • An excellent reference on regular expressions is
    Mastering Regular Expressions, Jeffrey E. F.
    Friedl. O'Reilly & Associates, Inc., 1997. ISBN
    1-56592-257-3, http://www.oreilly.com.