pcrepp::Pcre Class Reference

#include <pcre++.h>

Public Member Functions

 Pcre ()
 Pcre (const std::string &expression)
 Pcre (const std::string &expression, const std::string &flags)
 Pcre (const std::string &expression, unsigned int flags)
 Pcre (const Pcre &P)
const Pcre & operator= (const std::string &expression)
const Pcre & operator= (const Pcre &P)
 ~Pcre ()
bool search (const std::string &stuff)
bool search (const std::string &stuff, int OffSet)
std::vector< std::string > * get_sub_strings () const
std::string get_match (int pos) const
int get_match_start (int pos) const
int get_match_end (int pos) const
int get_match_start () const
int get_match_end () const
size_t get_match_length (int pos) const
bool matched () const
int matches () const
std::vector< std::string > split (const std::string &piece)
std::vector< std::string > split (const std::string &piece, int limit)
std::vector< std::string > split (const std::string &piece, int limit, int start_offset)
std::vector< std::string > split (const std::string &piece, int limit, int start_offset, int end_offset)
std::vector< std::string > split (const std::string &piece, std::vector< int > positions)
std::string replace (const std::string &piece, const std::string &with)
pcre * get_pcre ()
pcre_extra * get_pcre_extra ()
void study ()
bool setlocale (const char *locale)
std::string operator[] (int index)


Detailed Description

The Pcre class is a wrapper around the PCRE library.

The "pcre++" library defines a class named "Pcre", which you can use to search strings using regular expressions and to retrieve the matched substrings. It does not currently support every feature of the underlying PCRE library, but the most important ones are implemented.

Please study this example code to learn how to use this class:

/*
 *
 *  This file  is part of the PCRE++ Class Library.
 *
 *  By  accessing  this software,  PCRE++, you  are  duly informed
 *  of and agree to be  bound  by the  conditions  described below
 *  in this notice:
 *
 *  This software product,  PCRE++,  is developed by Thomas Linden
 *  and  copyrighted (C) 2002  by  Thomas Linden,  with all rights 
 *  reserved.
 *
 *  There  is no charge for PCRE++ software.  You can redistribute
 *  it and/or modify it under the terms of the GNU  Lesser General
 *  Public License, which is incorporated by reference herein.
 *
 *  PCRE++ is distributed WITHOUT ANY WARRANTY, IMPLIED OR EXPRESS,
 *  OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE or that
 *  the use of it will not infringe on any third party's intellec-
 *  tual property rights.
 *
 *  You should have received a copy of the GNU Lesser General Public
 *  License along with PCRE++.  Copies can also be obtained from:
 *
 *    http://www.gnu.org/licenses/lgpl.txt
 *
 *  or by writing to:
 *
 *  Free Software Foundation, Inc.
 *  59 Temple Place, Suite 330
 *  Boston, MA 02111-1307
 *  USA
 *
 *  Or contact:
 *
 *   "Thomas Linden" <tom@daemon.de>
 *
 *
 */


/* you need to include the pcre++ header file */
#include "../libpcre++/pcre++.h"
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

using namespace std;
using namespace pcrepp;

/* A typedef for a vector of strings (as returned by split() )*/
typedef std::vector<std::string> Array;

/* A typedef for a vector iterator */
typedef std::vector<std::string>::iterator ArrayIterator;

void regex() {
    /*
     * define a string with a regular expression
     */
    string expression = "([a-z]*) ([0-9]+)";

    /*
     * this is the string in which we want to search
     */
    string stuff = "hallo 11 robert";

    cout << "  searching in \"" << stuff << "\" for regex \""
         << expression << "\":" << endl;

    /*
     * Create a new Pcre object, search case-insensitive ("i")
     */
    Pcre reg(expression, "i");
    
    /*
     * see if the expression matched
     */
    if(reg.search(stuff) == true) {

      /*
       * see if the expression generated any substrings
       */
      if(reg.matches() >= 1) {

        /*
         * print out the number of substrings
         */
        cout << "  generated " << reg.matches() << " substrings:" << endl;
          
        /*
         * iterate over the matched sub strings
         */
        for(int pos=0; pos < reg.matches(); pos++) {
          /* print out each substring */
          cout << "  substring " << pos << ": " << reg[pos];   // also possible: reg.get_match(pos);
          /* print out the start/end offset of the current substring
           * within the searched string(stuff)
           */
          cout << " (start: " << reg.get_match_start(pos) << ", end: "
               << reg.get_match_end(pos) << ")" << endl;
        }
      }
      else {
        /*
         * we had a match, but it generated no substrings, for whatever reason
         */
        cout << "   it matched, but there were no substrings." << endl;
      }
    }
    else {
      /*
       * no match at all
       */
      cout << "   didn't match." << endl;
    }
}



void replace() {
    /*
     * Sample of replace() usage
     */
    string orig = "Hans ist 22 Jahre alt. Er ist 8 Jahre älter als Fred.";
    cout << "   orig: " << orig << endl;

    /*
     * define a regex for digits (character class)
     */
    Pcre p(" ([0-9]+) ");

    /*
     * replace the 1st occurrence of [0-9]+ with "zweiundzwanzig"
     */
    string n = p.replace(orig, " zweiundzwanzig($1) ");

    /*
     * prints out: "Hans ist zweiundzwanzig(22) Jahre alt. Er ist 8 Jahre
     * älter als Fred."
     */
    cout << "   new: " << n << endl; 
}


void replace_multi() {
  /*
   * Sample of replace() usage with multiple substrings
   */
  string orig = " 08:23 ";
  cout << "   orig: " << orig << endl;
  
  /*
   * create regex which, if it matches, creates 3 substrings
   */
  Pcre reg(" ([0-9]+)(:)([0-9]+) ", "sig");

  /*
   * remove $2 (":")
   * re-use $1 ("08") and $3 ("23") in the replace string
   */
  string n  = reg.replace(orig, "$1 Stunden und $3 Minuten");

  /*
   * prints the result: "08 Stunden und 23 Minuten"
   */
  cout << "   new:  " << n  << endl;
}


void normalize() {
  /*
   * another sample to check if normalizing using replace() works
   */
  string orig = "Heute   ist ein  schoener  Tag        gell?";
  cout << "   orig: " << orig << endl;

  /*
   * create regex for normalizing whitespace
   */
  Pcre reg("[\\s]+", "gs");

  /*
   * do the normalizing process
   */
  string n = reg.replace(orig, " ");

  /*
   * prints the result, should be: "Heute ist ein schoener Tag gell?"
   */
  cout << "   new:  " << n  << endl;
}


void split() {
  /*
   * Sample of split() usage
   */
      string sp_orig = "was21willst2387461du3alter!";
      cout << "   orig: " << sp_orig << endl;

      /*
       * define a regex for digits (character class)
       */
      string delimiter = "[0-9]+";

      /*
       * new Pcre object, match globally ("g" flag)
       */
      Pcre S(delimiter, "g");

      /*
       * split "was21willst2387461du3alter!" by digits
       */
      Array splitted = S.split(sp_orig);
      
      /*
       * iterate over the resulting list
       */
      cout << "   splitted: ";
      for(ArrayIterator A = splitted.begin(); A != splitted.end(); ++A)
        cout << *A << " ";
      cout << endl;
}


void ex() {
  /*
   * Pcre::exception Test
   */
  
  /*
   * this will generate only one substring, "This"
   */
  Pcre ex("([a-z]+)", "i");
  if(ex.search("This is a test.")) {
    cout << "  trying to access a non-existing substring:" << endl;
    cout << "  substring 2: " << ex.get_match(1) << endl; 
  }
}


void mycopy() {
  /*
   * Sample use of copy constructor and operator=
   */
    cout << "  initializing reg1(^([a-z]+?))" << endl;
    Pcre reg1("^([a-z]+?)");

    /*
     * create an empty Pcre object
     */
    Pcre reg2;
    
    /*
     * copy reg1 to reg2 (operator=)
     */
    cout << "  copying reg1 to new Pcre object reg2" << endl;
    reg2 = reg1;

    /*
     * using the copy constructor to initialize the 3rd object
     */
    cout << "  creating a new Pcre object reg3 from reg2" << endl;
    Pcre reg3(reg2);

    /*
     * doing regular stuff on reg3
     */
    if(reg3.search("anton"))
      cout << "  string 'anton' matched using reg3 object" << endl;
}

void multisearch() {
  Pcre reg("([^\\n]+\\n)");
  string str = "\nline1\nline2\nline3\n";
  size_t pos = 0;

  while (pos <= str.length()) {
    if( reg.search(str, pos)) {
      pos = reg.get_match_end(0);
      cout << "   pos: " << pos << " match: " << reg.get_match(0);
    }
    else
      break;
  }
}

int main() {
  /* 
   * the Pcre class throws errors via exceptions
   */
  try {
    cout << endl << "SEARCH() sample:" << endl;
    regex();

    cout << endl << "REPLACE() sample:" << endl;
    replace();

    cout << endl << "Multiple REPLACE() sample:" << endl;
    replace_multi();

    cout << endl << "Normalizing REPLACE() sample:" << endl;
    normalize();

    cout << endl << "SPLIT() sample:" << endl;
    split();

    cout << endl << "COPY+Operator sample:" << endl;
    mycopy();

    cout << endl << "Multi line search test:" << endl;
    multisearch();

    cout << endl << "Pcre::exception test:" << endl;
    ex();

    exit(0);
  }
  catch (Pcre::exception &E) {
    /*
     * the Pcre class has thrown an exception
     */
    cerr << "Pcre++ error: " << E.what() << endl;
    exit(-1);
  }
  exit(0);
}
  

Compile programs that use the pcre++ class with the following command lines:

   g++ -c yourcode.cc `pcre-config --cflags` `pcre++-config --cflags`
   g++ yourcode.o `pcre-config --libs` `pcre++-config --libs` -o yourprogram
 

If you want to learn more about the regular expressions that can be used with pcre++, please read the following documentation: perlre - Perl regular expressions

The pcre library itself also contains some useful documentation, which may be interesting for you: PCRE manual page

Definition at line 99 of file pcre++.h.


Constructor & Destructor Documentation

Pcre::Pcre ()

Empty constructor. Creates a new, empty Pcre object. This is the simplest constructor available; one of the other constructors is often the better choice. An object created this way must be initialized before use: assign it an expression or a copy of another Pcre object with one of the two operator= overloads.

Returns:
A new empty Pcre object

Definition at line 107 of file pcre++.cc.

Pcre::Pcre (const std::string &expression)

Constructor. Compiles the given pattern. A Pcre object created this way can be used multiple times to do searches.

Parameters:
expression a string, which must be a valid perl regular expression.
Returns:
A new Pcre object, which holds the compiled pattern.
See also:
Pcre(const std::string& expression, const std::string& flags)

Pcre(const std::string& expression, unsigned int flags)

Definition at line 50 of file pcre++.cc.

Pcre::Pcre (const std::string &expression, const std::string &flags)

Constructor. Compiles the given pattern. A Pcre object created this way can be used multiple times to do searches.

Parameters:
expression a string, which must be a valid perl regular expression.
flags can be one or more of the following letters:
  • i Search case insensitive.

  • m Match on multiple lines, thus ^ and $ match at the start and end of each line, not only at the start and end of the entire string.

  • s A dot in an expression matches newlines too (which is normally not the case).

  • x Whitespace characters will be ignored (except within character classes or if escaped).

  • g Match multiple times (global), as used with replace() and split() in the example program above.

Returns:
A new Pcre object, which holds the compiled pattern.
See also:
Pcre(const std::string& expression)

Pcre(const std::string& expression, unsigned int flags)

Definition at line 59 of file pcre++.cc.
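
The relationship between the flag letters above and the option bits accepted by the unsigned int constructor can be sketched as follows. This is only an illustration, not the pcre++ implementation: the helper name flags_from_letters and the k* bit values are hypothetical, and rejecting unknown letters is a choice made for this sketch. A real program would use the PCRE_* constants from the pcre headers.

```cpp
#include <stdexcept>
#include <string>

// Illustrative bit values only (assumption); real code would use the PCRE_*
// constants from the pcre headers and PCRE_GLOBAL from pcre++.h.
enum : unsigned int {
    kCaseless  = 1u << 0,  // 'i'
    kMultiline = 1u << 1,  // 'm'
    kDotAll    = 1u << 2,  // 's'
    kExtended  = 1u << 3,  // 'x'
    kGlobal    = 1u << 4   // 'g', stand-in for the PCRE++ internal global flag
};

// Hypothetical helper: turn a flag string such as "gs" into an option bit mask.
unsigned int flags_from_letters(const std::string& letters) {
    unsigned int bits = 0;
    for (char c : letters) {
        switch (c) {
            case 'i': bits |= kCaseless;  break;
            case 'm': bits |= kMultiline; break;
            case 's': bits |= kDotAll;    break;
            case 'x': bits |= kExtended;  break;
            case 'g': bits |= kGlobal;    break;
            default:
                // This sketch rejects unknown letters; pcre++ may behave differently.
                throw std::invalid_argument(std::string("unknown flag letter: ") + c);
        }
    }
    return bits;
}
```

With this helper, the order of the letters does not matter: "sig" and "gis" produce the same mask.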

Pcre::Pcre (const std::string &expression, unsigned int flags)

Constructor. Compiles the given pattern. A Pcre object created this way can be used multiple times to do searches.

Parameters:
expression a string, which must be a valid perl regular expression.
flags option bits; one or more of the following, combined with bitwise OR:
  • PCRE_ANCHORED anchored pattern.
  • PCRE_CASELESS case insensitive search.
  • PCRE_DOLLAR_ENDONLY dollar sign matches only at end.
  • PCRE_DOTALL a dot matches newlines too.
  • PCRE_EXTENDED whitespace characters will be ignored.
  • PCRE_EXTRA use perl incompatible pcre extensions.
  • PCRE_MULTILINE match on multiple lines.
  • PCRE_NO_AUTO_CAPTURE disable the use of numbered capturing parentheses in the pattern.
  • PCRE_UNGREEDY quantifiers are not greedy by default.
  • PCRE_UTF8 use utf8 support.
  • PCRE_GLOBAL (PCRE++ internal flag) match multiple times; used only in the replace(const std::string& piece, const std::string& with) method.

Returns:
A new Pcre object, which holds the compiled pattern.
See also:
Pcre(const std::string& expression)

Pcre(const std::string& expression, const std::string& flags)

pcreapi(3) manpage

Definition at line 80 of file pcre++.cc.

References PCRE_GLOBAL.

Pcre::Pcre (const Pcre &P)

Copy constructor. Creates a new Pcre object from an existing one.

Parameters:
P an existing Pcre object.
Returns:
A new Pcre object, which holds the compiled pattern.
See also:
Pcre(const std::string& expression)

Pcre(const std::string& expression, const std::string& flags)

Definition at line 97 of file pcre++.cc.

References _expression, _flags, _have_paren, case_t, and global_t.

Pcre::~Pcre ()

Destructor. The destructor is invoked automatically when the object is no longer used. It frees all the memory allocated by pcre++.

Definition at line 120 of file pcre++.cc.


Member Function Documentation

string Pcre::get_match (int pos) const

Get a substring at a known position. This method throws an out-of-range exception if the given position is invalid.

Parameters:
pos the position of the substring to return. Positions start at 0, as in the example program above, so position 0 corresponds to perl's $1.
Returns:
the substring at the given position.
Example:

 std::string mysub = regex.get_match(0);

Gets the first substring that matched the expression in the "regex" object.

Definition at line 60 of file get.cc.

Referenced by operator[]().
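
The bounds check described for get_match() can be pictured with a small stand-alone helper. This is a sketch under assumptions, not pcre++ code: substring_at is a hypothetical name, the captured substrings are modelled as a plain std::vector, and std::out_of_range stands in for the exception type the library actually throws.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of pcre++: bounds-checked access to a list of
// captured substrings, where position 0 is the first capture.
const std::string& substring_at(const std::vector<std::string>& subs, int pos) {
    if (pos < 0 || static_cast<std::size_t>(pos) >= subs.size())
        throw std::out_of_range("substring position out of range");
    return subs[static_cast<std::size_t>(pos)];
}
```

This mirrors the ex() function of the example program, where a pattern with a single capture is asked for a second substring and the request fails.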

int Pcre::get_match_end () const

Get the end position of the entire match within the searched string. This method returns the character position of the last character of the entire match within the searched string.

Returns:
the integer character position of the last character of the entire match.
Example:

 Pcre regex("([0-9]+[a-z]*)\\s([a-z]+)"); // search for the date (makes 2 substrings)
 regex.search("The 11th september.");     // do the search on this string
 int pos = regex.get_match_end();         // returns 17, because "11th september", which is
                                          // the entire match, ends at position 17
                                          // (counting from 0) inside the search string.

 
See also:
int get_match_start()

int get_match_start(int pos)

int get_match_end(int pos)

Definition at line 77 of file get.cc.

Referenced by replace().
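
The start and end positions described in this section (0-based, with the end position naming the last matched character) can be reproduced without pcre++. The following sketch uses std::regex from the C++ standard library as a stand-in; match_span is a hypothetical helper name, and the {-1, -1} result for "no match" is a convention chosen only for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>

// Returns {start, end} of capture group `group` (0 = entire match), both
// 0-based, where `end` is the index of the last matched character.
// Returns {-1, -1} if nothing matched or the group did not participate.
std::pair<int, int> match_span(const std::string& text,
                               const std::string& pattern,
                               std::size_t group) {
    std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re) || group >= m.size() || !m[group].matched)
        return std::make_pair(-1, -1);
    int start  = static_cast<int>(m.position(group));
    int length = static_cast<int>(m.length(group));
    // Inclusive end: one less than the usual "one past the end" offset.
    return std::make_pair(start, start + length - 1);
}
```

For the string "The 11th september.", the pattern "([0-9]+)" yields the span 4 to 5 for its first group, which corresponds to the values quoted in the examples of this section.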

int Pcre::get_match_end (int pos) const

Get the end position of a substring within the searched string. This method returns the character position of the last character of a substring within the searched string.

Parameters:
pos the position of the substring. Positions start at 0, as in the example program above, so position 0 corresponds to perl's $1.
Returns:
the integer character position of the last character of a substring. Positions start at 0.
Example:

 Pcre regex("([0-9]+)");               // search for numerical characters
 regex.search("The 11th september.");  // do the search on this string
 std::string day = regex.get_match(0); // returns "11"
 int pos = regex.get_match_end(0);     // returns 5, because "11" ends at position 5
                                       // (counting from 0) inside the search string.

 
See also:
int get_match_start(int pos)

int get_match_start()

int get_match_end()

Definition at line 96 of file get.cc.

size_t Pcre::get_match_length (int pos) const

Get the length of a substring at a known position. This method throws an out-of-range exception if the given position is invalid.

Parameters:
pos the position of the substring whose length is returned. Positions start at 0, as in the example program above, so position 0 corresponds to perl's $1.
Returns:
the length of the substring at the given position.

Definition at line 110 of file get.cc.

int Pcre::get_match_start () const

Get the start position of the entire match within the searched string. This method returns the character position of the first character of the entire match within the searched string.

Returns:
the integer character position of the first character of the entire match.
Example:

 Pcre regex("([0-9]+[a-z]*)\\s([a-z]+)"); // search for the date (makes 2 substrings)
 regex.search("The 11th september.");     // do the search on this string
 int pos = regex.get_match_start();       // returns 4, because "11th september" begins at
                                          // position 4 (counting from 0) inside the search string.

 
See also:
int get_match_start(int pos)

int get_match_end(int pos)

int get_match_end()

Definition at line 70 of file get.cc.

Referenced by replace().

int Pcre::get_match_start (int pos) const

Get the start position of a substring within the searched string. This method returns the character position of the first character of a substring within the searched string.

Parameters:
pos the position of the substring. Positions start at 0, as in the example program above, so position 0 corresponds to perl's $1.
Returns:
the integer character position of the first character of a substring. Positions start at 0.
Example:

 Pcre regex("([0-9]+)");               // search for numerical characters
 regex.search("The 11th september.");  // do the search on this string
 std::string day = regex.get_match(0); // returns "11"
 int pos = regex.get_match_start(0);   // returns 4, because "11" begins at position 4
                                       // (counting from 0) inside the search string.

 
See also:
int get_match_end(int pos)

int get_match_end()

int get_match_start()

Definition at line 84 of file get.cc.

pcre * Pcre::get_pcre ()

Return a pointer to the underlying pcre object. The pcre object allows you to access the pcre API directly, e.g. if you are using pcre version 4.x and want to use new functionality which is currently not supported by pcre++. Examples are pcre_fullinfo(), pcre_study() or the callout functionality.

Returns:
"pcre*" pointer to pcre object.
See also:
man pcre

pcre_extra* get_pcre_extra()

Definition at line 195 of file pcre++.cc.

pcre_extra * Pcre::get_pcre_extra ()

Return a pointer to the underlying pcre_extra structure. The returned pcre_extra structure can be used in conjunction with the pcre* object returned by get_pcre().

Returns:
"pcre_extra*" pointer to pcre_extra structure.
See also:
pcre* get_pcre()

Definition at line 199 of file pcre++.cc.

vector< string > * Pcre::get_sub_strings () const

Return a vector of substrings, if any.

Returns:
a pointer to a std::vector<std::string>, which may be NULL if no substrings have been found.
See also:
std::vector<std::string>

Definition at line 53 of file get.cc.

bool pcrepp::Pcre::matched () const [inline]

Test if a search was successful. This method must be invoked after calling search().

Returns:
boolean true if the search was successful, or false if not.

Definition at line 444 of file pcre++.h.

Referenced by replace().

int pcrepp::Pcre::matches () const [inline]

Get the number of substrings generated by pcre++.

Returns:
the number of substrings generated by pcre++.

Definition at line 449 of file pcre++.h.

Referenced by replace().

const Pcre & Pcre::operator= (const Pcre &P)

Operator =.

Parameters:
P a Pcre object
Returns:
a reference to the Pcre object that was assigned to
Example:

 Pcre reg1("^[a-z]+?");
 Pcre reg2;
 reg2 = reg1;
 

Definition at line 153 of file pcre++.cc.

References _expression, _flags, case_t, and global_t.

const Pcre & Pcre::operator= (const std::string &expression)

Operator =.

Parameters:
expression a valid regular expression.
Returns:
a reference to the Pcre object that was assigned to.
Example:

 Pcre regex;
 regex = std::string("(A+?)");
 

Definition at line 142 of file pcre++.cc.

std::string pcrepp::Pcre::operator[] (int index) [inline]

Return the substring of a match at a known position using the array notation. This method throws an out-of-range exception if the given position is invalid.

Parameters:
index the position of the substring to return. Positions start at 0, as in the example program above, so position 0 corresponds to perl's $1.
Returns:
the substring at the given position.
Example:

 std::string mysub = regex[0];

Gets the first substring that matched the expression in the "regex" object.

See also:
std::string get_match(int pos)

Definition at line 594 of file pcre++.h.

References get_match().

string Pcre::replace (const std::string &piece, const std::string &with)

Replace parts of a string using regular expressions. This method is the counterpart of the perl s/// operator. It replaces the substrings which matched the regular expression (given to the constructor) with the supplied string. As the example program above shows, the replacement string may refer to captured substrings as $1, $2 and so on.

Parameters:
piece the string in which you want to search and replace.
with the string which you want to place on the positions which match the expression (given to the constructor).
Returns:
the resulting string; the example program above assigns it to a new std::string.

Definition at line 51 of file replace.cc.

References __pcredebug, get_match_end(), get_match_start(), matched(), matches(), and search().
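
The difference between replacing only the first occurrence and replacing every occurrence, as well as the use of $1 and $3 in the replacement string, can be tried out with the C++ standard library. The sketch below uses std::regex_replace as a stand-in for replace(); replace_first and replace_all are hypothetical helper names, and std::regex's syntax is close to, but not identical with, the perl syntax understood by pcre++.

```cpp
#include <regex>
#include <string>

// Replace only the first match, similar to perl's s/// without the g modifier.
std::string replace_first(const std::string& piece,
                          const std::string& pattern,
                          const std::string& with) {
    return std::regex_replace(piece, std::regex(pattern), with,
                              std::regex_constants::format_first_only);
}

// Replace every match, similar to perl's s///g.
std::string replace_all(const std::string& piece,
                        const std::string& pattern,
                        const std::string& with) {
    return std::regex_replace(piece, std::regex(pattern), with);
}
```

Applied to the inputs of the example program, replace_all(" 08:23 ", " ([0-9]+)(:)([0-9]+) ", "$1 Stunden und $3 Minuten") gives "08 Stunden und 23 Minuten", and replace_all with the pattern "\\s+" and the replacement " " collapses each run of whitespace into a single space.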

bool Pcre::search (const std::string &stuff, int OffSet)

Do a search on the given string beginning at the given offset. This method does the actual search on the given string.

Parameters:
stuff the string in which you want to search for something.
OffSet the offset where to start the search.
Returns:
boolean true if the regular expression matched. false if not.
See also:
bool search(const std::string& stuff)

Definition at line 83 of file search.cc.
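
Searching repeatedly from an advancing offset, as the multisearch() function of the example program does, follows a common loop pattern. The sketch below shows that pattern with std::regex from the C++ standard library instead of pcre++; search_all is a hypothetical helper name, and the offset is advanced to one past the end of each match, which is a choice of this sketch.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Collect every match of `pattern` in `text` by searching again and again,
// each time starting at the offset just behind the previous match.
std::vector<std::string> search_all(const std::string& text,
                                    const std::string& pattern) {
    std::regex re(pattern);
    std::vector<std::string> found;
    std::size_t offset = 0;
    std::smatch m;
    while (offset <= text.size()) {
        std::string::const_iterator first =
            text.cbegin() + static_cast<std::ptrdiff_t>(offset);
        if (!std::regex_search(first, text.cend(), m, re))
            break;  // no further match
        found.push_back(m.str(0));
        // m.position(0) is relative to `first`, so add the current offset.
        std::size_t next = offset
                         + static_cast<std::size_t>(m.position(0))
                         + static_cast<std::size_t>(m.length(0));
        if (m.length(0) == 0)
            ++next;  // step over empty matches to avoid an endless loop
        offset = next;
    }
    return found;
}
```

For the string "\nline1\nline2\nline3\n" and the pattern "([^\\n]+\\n)" this collects the three lines, each including its trailing newline.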

bool Pcre::search (const std::string &stuff)

Do a search on the given string. This method does the actual search on the given string.

Parameters:
stuff the string in which you want to search for something.
Returns:
boolean true if the regular expression matched. false if not.
See also:
bool search(const std::string& stuff, int OffSet)

Definition at line 87 of file search.cc.

Referenced by replace().

bool Pcre::setlocale (const char *locale)

Sets the locale for all character operations. Returns false if the locale can't be set, otherwise true.

Parameters:
locale locale alias name you want to use.
Returns:
true if setting the locale was successful
See also:
locale(1)

Definition at line 220 of file pcre++.cc.
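
The true/false result described above matches how the C library reports locale changes. The following sketch shows that convention with std::setlocale from the C++ standard library; set_ctype_locale is a hypothetical helper, and the choice of the LC_CTYPE category is an assumption made for this example, not a statement about what pcre++ does internally.

```cpp
#include <clocale>

// Hypothetical helper: try to switch the character classification locale and
// report success as a bool. std::setlocale returns a null pointer on failure.
bool set_ctype_locale(const char* name) {
    return std::setlocale(LC_CTYPE, name) != nullptr;
}
```

The "C" locale is always available, so set_ctype_locale("C") is a safe call to try first; other locale names depend on what is installed on the system.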

vector< string > Pcre::split (const std::string &piece, std::vector< int > positions)

Split a string into pieces. This method splits the given string into a vector of strings using the compiled expression (given to the constructor).

Parameters:
piece The string you want to split into its parts.
positions a std::vector<int> of positions, which the returned vector should contain.
Returns:
a vector of strings
See also:
std::vector<std::string> split(const std::string& piece)

std::vector<std::string> split(const std::string& piece, int limit)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset, int end_offset)

std::vector<std::string>

Definition at line 151 of file split.cc.

vector< string > Pcre::split (const std::string &piece, int limit, int start_offset, int end_offset)

Split a string into pieces. This method splits the given string into a vector of strings using the compiled expression (given to the constructor).

Parameters:
piece The string you want to split into its parts.
limit the maximum number of elements you want to get back from split().
start_offset at which substring the returned vector should start.
end_offset at which substring the returned vector should end.
Returns:
a vector of strings
See also:
std::vector<std::string> split(const std::string& piece)

std::vector<std::string> split(const std::string& piece, int limit)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset)

std::vector<std::string> split(const std::string& piece, std::vector<int> positions)

std::vector<std::string>

Definition at line 147 of file split.cc.

vector< string > Pcre::split (const std::string &piece, int limit, int start_offset)

Split a string into pieces. This method splits the given string into a vector of strings using the compiled expression (given to the constructor).

Parameters:
piece The string you want to split into its parts.
limit the maximum number of elements you want to get back from split().
start_offset at which substring the returned vector should start.
Returns:
a vector of strings
See also:
std::vector<std::string>

std::vector<std::string> split(const std::string& piece)

std::vector<std::string> split(const std::string& piece, int limit)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset, int end_offset)

std::vector<std::string> split(const std::string& piece, std::vector<int> positions)

Definition at line 143 of file split.cc.

vector< string > Pcre::split (const std::string &piece, int limit)

Split a string into pieces. This method splits the given string into a vector of strings using the compiled expression (given to the constructor).

Parameters:
piece The string you want to split into its parts.
limit the maximum number of elements you want to get back from split().
Returns:
a vector of strings
See also:
std::vector<std::string>

std::vector<std::string> split(const std::string& piece)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset, int end_offset)

std::vector<std::string> split(const std::string& piece, std::vector<int> positions)

Definition at line 139 of file split.cc.

vector< string > Pcre::split (const std::string &piece)

Split a string into pieces. This method splits the given string into a vector of strings using the compiled expression (given to the constructor).

Parameters:
piece The string you want to split into its parts.
Returns:
a vector of strings
See also:
std::vector<std::string>

std::vector<std::string> split(const std::string& piece, int limit)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset)

std::vector<std::string> split(const std::string& piece, int limit, int start_offset, int end_offset)

std::vector<std::string> split(const std::string& piece, std::vector<int> positions)

Definition at line 135 of file split.cc.
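
Splitting a string at every match of a delimiter expression can also be tried with the C++ standard library. The sketch below covers only the simplest overload (no limit or offsets) and uses std::sregex_token_iterator as a stand-in for pcre++; split_by_regex is a hypothetical helper name, and its handling of leading or trailing delimiters may differ from split().

```cpp
#include <regex>
#include <string>
#include <vector>

// Split `piece` at every match of `delimiter` and return the parts in between.
std::vector<std::string> split_by_regex(const std::string& piece,
                                        const std::string& delimiter) {
    std::regex re(delimiter);
    std::vector<std::string> parts;
    // Submatch index -1 selects the text between the matches instead of the
    // matches themselves.
    std::sregex_token_iterator it(piece.begin(), piece.end(), re, -1);
    std::sregex_token_iterator end;
    for (; it != end; ++it)
        parts.push_back(it->str());
    return parts;
}
```

With the input of the example program, split_by_regex("was21willst2387461du3alter!", "[0-9]+") returns the four parts "was", "willst", "du" and "alter!".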

void Pcre::study ()

Analyze the pattern to speed up the matching process. When a pattern is going to be used several times, it is worth spending more time analyzing it in order to reduce the time taken for matching.

An exception will be thrown if analyzing the pattern fails.

Definition at line 210 of file pcre++.cc.


The documentation for this class was generated from the following files:
Generated on Wed Aug 25 01:38:04 2004 for PCRE++ by doxygen 1.3-rc3