module Agrep: sig .. end
String searching with errors
type pattern
The type of compiled search patterns
val pattern : ?transl:string -> string -> pattern
Compile a search pattern. The syntax for patterns is
similar to that of the Unix shell. The following constructs
are recognized:
? match any single character
* match any sequence of characters
[..] character set: ranges are denoted with -, as in [a-z];
an initial ^, as in [^0-9], complements the set
& conjunction (e.g. sweet&sour)
| alternative (e.g. high|low)
(..) grouping
\ escape special characters; the special characters
are \?*[]&|().
The optional argument
transl is a character translation table.
This is a string
s of length 256 that ``translates'' a
character
c to the character
s.(Char.code c). A character
of the text matches a character of the pattern if they both
translate to the same character according to
transl.
If
transl is not provided, the identity translation
(two characters match iff they are equal) is assumed.
Useful predefined translation tables are provided in
Agrep.Iso8859_15.
exception Syntax_error of int
Exception thrown by
Agrep.pattern when the given pattern
is syntactically incorrect. The integer argument is the
character number where the syntax error occurs.
val pattern_string : ?transl:string -> string -> pattern
Agrep.pattern_string s returns a pattern that matches exactly
the string
s and nothing else. The optional parameter
transl is as in
Agrep.pattern.
val string_match : pattern -> ?numerrs:int -> ?wholeword:bool -> string -> bool
string_match pat text tests whether the string text
matches the compiled pattern pat. The optional parameter
numerrs is the number of errors permitted. One error
corresponds to a substitution, an insertion or a deletion
of a character. numerrs default to 0 (exact match).
The optional parameter wholeword is true if the pattern must
match a whole word, false if it can match inside a word.
wholeword defaults to false (match inside words).
val substring_match : pattern ->
?numerrs:int -> ?wholeword:bool -> string -> pos:int -> len:int -> bool
Same as
Agrep.string_match, but restrict the match to the
substring of the given string starting at character number
pos and extending
len characters.
val errors_substring_match : pattern ->
?numerrs:int -> ?wholeword:bool -> string -> pos:int -> len:int -> int
Same as
Agrep.substring_match, but return the smallest number
of errors such that the substring matches the pattern.
That is, it returns
0 if the substring matches exactly,
1 if the substring matches with one error, etc.
Return
max_int if the substring does not match the pattern
with at most
numerrs errors.
module Iso8859_15: sig .. end
Useful translation tables for the ISO 8859-15 (Latin-1 with Euro)
character set.