This page contains various bits of information that don't fit anywhere else, or that are waiting to be inserted into some other context.
The regular expression implementation used in gentoo was written by Henry
Spencer at the University of Toronto. It is "nearly-public-domain", and has not
been modified in any way to fit into gentoo. I thank Mr Spencer for saving
me a substantial amount of work. In fact, the description below on how to write
regular expressions is greatly "inspired" by Mr Spencer's man page...
Regular expressions are a way of writing "patterns" of text strings. A regular expression can be matched against a string (called the input string), an operation that always produces one of two results: either a match, or a miss. The thing that makes regular expressions useful is that there is generally more than just one string against which a regular expression can actually match. Because of this, regular expressions can be used to classify strings by just looking for matches.
This will get a bit formal, because it is the easiest way to describe regular expressions, and I'm kinda lazy. There are several concrete examples below, though.
- A regular expression consists of branches, separated by | (the vertical bar, or "pipe" character). The expression matches anything that matches one of the branches.
- A branch is a sequence of atoms, each of which may be followed by *, + or ?:
  - An atom followed by * matches a sequence of 0 or more matches of the atom.
  - An atom followed by + matches a sequence of 1 or more matches of the atom.
  - An atom followed by ? matches either the atom or nothing (the empty string).
- An atom can be an ordinary character, which matches itself, a regular expression in parentheses, or one of the following:
  - a period (.), which matches any single character,
  - a caret (^), which matches the null string at the start of the input,
  - a dollar sign ($), which matches the null string at the end of the input,
  - a backslash (\) followed by any character, which matches that character, or
  - a range, which is a sequence of characters enclosed in square brackets ([ and ]).

A range normally matches any single character from the sequence. If the first character after the opening bracket is ^, the range is "inverted" and matches any single character not in the sequence. If two characters in the range have a hyphen (-) between them, this is shorthand notation for the range of all characters between them (according to ASCII). To include a literal ] character in the range, make it the first character after the opening bracket. To include a hyphen, make it the first or last character.

As promised above, here are a few examples of regular expressions of varying complexities:
Example Expression | Explanation |
---|---|
abc | This matches the exact string "abc", and nothing else. |
abc|def|ghi | This matches either "abc", "def", or "ghi". |
a+[b-y]z+ | This matches any string that begins with one or more "a"'s, continues with a single lowercase letter other than "a" or "z", and then ends with one or more "z"'s. Examples of strings that match are "agzz", "aaaaaaaiz", "aabz". |
a+[^az]z+ | Like the previous example, except it allows any single character except for "a" or "z" to occur between the sequences of "a"'s and "z"'s. Examples, again: "aaa-zzz", "aa(zz", "a+zzz". |
.+\..+ | This suspicious-looking expression matches any string of length at least three with a period in the "middle". Examples: "aa.7", "...", "3.14". |
.+\.(jpeg|jpg) | This matches any string of characters that ends in a period followed by either "jpeg" or "jpg". The application should be obvious; this can be used in a file type to identify JPEG files that have been properly named. |
.+\.jpe?g | This matches exactly the same set of strings as the previous example, but using a simpler expression. |
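The expressions in the table above can be tried out with a short script. This is only a sketch: it uses Python's `re` module as a stand-in for the Henry Spencer implementation that gentoo actually uses, and it assumes that an expression has to match the whole input string (which is how the table describes "abc"). The `photo.*` file names and the `matches` helper are made up for illustration.

```python
import re

# Expressions from the table above, each paired with strings that should match.
# The "photo.*" names are hypothetical; the other strings come from the table.
EXAMPLES = {
    r"abc": ["abc"],
    r"abc|def|ghi": ["abc", "def", "ghi"],
    r"a+[b-y]z+": ["agzz", "aaaaaaaiz", "aabz"],
    r"a+[^az]z+": ["aaa-zzz", "a+zzz"],
    r".+\..+": ["aa.7", "...", "3.14"],
    r".+\.(jpeg|jpg)": ["photo.jpeg", "photo.jpg"],
    r".+\.jpe?g": ["photo.jpeg", "photo.jpg"],
}


def matches(expression: str, text: str) -> bool:
    """Return True if the whole of text matches expression (assumed semantics)."""
    return re.fullmatch(expression, text) is not None


if __name__ == "__main__":
    for expression, strings in EXAMPLES.items():
        for text in strings:
            result = "match" if matches(expression, text) else "miss"
            print(f"{expression!r} against {text!r}: {result}")
```

Only the operators described on this page are used, so the behaviour should carry over, but Python's `re` supports many extensions that the Spencer implementation does not.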
A "glob" expression is something similar to a regular expression; it's a pattern that can be matched against strings. Glob patterns are often used in Un*x shells: when you type e.g. "ls -l *.c", the last part is a glob pattern. It can be seen that it is not a proper regular expression, because a RE cannot begin with an asterisk.
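The difference can be seen in a small experiment. The sketch below uses Python's `fnmatch` and `re` modules, which are not what gentoo uses internally but follow the same conventions on this point; `is_valid_re` is a hypothetical helper.

```python
import fnmatch
import re


def is_valid_re(pattern: str) -> bool:
    """Return True if Python's re module accepts pattern as a regular expression."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# As a glob pattern, "*.c" matches file names ending in ".c".
print(fnmatch.fnmatchcase("main.c", "*.c"))

# As a regular expression, "*.c" is rejected: the asterisk has no atom to repeat.
print(is_valid_re("*.c"))

# The equivalent regular expression is accepted.
print(is_valid_re(r".*\.c"))
```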
Since glob patterns are generally easier to write than are REs, and many people already know how to write simple glob patterns, gentoo sometimes allows you to enter a glob pattern, while internally translating it to a RE which is then used just as above. The following translations are done to accomplish this:
Glob Pattern | Becomes | Comment |
---|---|---|
. | \. | The period is "magic" in REs, and needs to be escaped. |
* | .* | In glob patterns you use a single asterisk to mean "zero or more of any character". Not so in regular expressions. |
+ | \+ | The plus symbol has a special meaning in REs, and therefore needs to be escaped. |
? | . | In glob patterns, the question mark means "any character". That is the meaning of the period in REs, so it is simply replaced. |
other | same | Any other symbol is just copied straight over to the RE. This includes character ranges, which are copied as a whole. |
Now, we can translate a glob pattern to its corresponding RE easily: the example "*.c" becomes ".*\.c".
Note! I don't have any documentation defining glob patterns more exactly. The above is based on empirical research (read: loads of testing) and experience. No guarantees, people.
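The translation table above can be sketched as a small function. This is not gentoo's actual code, just an illustration of the rules as listed; the name `glob_to_re` and the handling of an unterminated range (the bracket is copied like any other symbol) are assumptions.

```python
import re

# Single-character translations, taken from the table above.
TRANSLATIONS = {".": r"\.", "*": ".*", "+": r"\+", "?": "."}


def glob_to_re(glob: str) -> str:
    """Translate a glob pattern into a regular expression, rule by rule."""
    out = []
    i = 0
    while i < len(glob):
        ch = glob[i]
        if ch == "[":
            # Character ranges are copied as a whole. A "]" directly after
            # the opening bracket (or after "[^") is a literal, so skip it
            # before looking for the bracket that closes the range.
            j = i + 1
            if j < len(glob) and glob[j] == "^":
                j += 1
            if j < len(glob) and glob[j] == "]":
                j += 1
            end = glob.find("]", j)
            if end != -1:
                out.append(glob[i:end + 1])
                i = end + 1
                continue
            # Assumption: with no closing bracket, fall through and copy
            # the "[" unchanged, like any other symbol.
        out.append(TRANSLATIONS.get(ch, ch))
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(glob_to_re("*.c"))
    print(re.fullmatch(glob_to_re("*.c"), "main.c") is not None)
```

Note that the period inside a range such as "[a-c.]" is left alone, because the range is copied verbatim before the single-character rules are applied.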
When you specify an icon for gentoo to use, you don't include the search path. Instead, it is assumed that the icons will be collected in a common directory somewhere. This "somewhere" can be specified in the "Icons" text box on the "Paths" page in the config window. In fact, you can have any number of somewheres; just separate individual paths with colons as usual in Un*x. Paths that are specified as relative (i.e. their first character is not a slash) are treated as being relative to the directory that was current when gentoo was started. If this sounds arbitrary, it's probably because it is. However, having some kind of rule for this case makes things (feel) more robust, and I like that.
Icon names are just plain filenames, without paths, and always end in ".xpm". Currently, gentoo doesn't use any fancy image-loading library but only plain GDK, and therefore only supports icons in the XPM pixmap format.
When gentoo notices that an icon is required, for example because a row with a style specifying a certain icon is about to be rendered, it will search the given icon path one component at a time for the named icon. As soon as a match is found, it will be used. Therefore, two icons can never usefully share a name, even if they reside in different directories; gentoo will only ever find the first one.
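The search described above can be sketched as follows. This is an illustration under stated assumptions rather than gentoo's real code: the function name `find_icon` and its parameters are hypothetical, and empty path components are simply skipped.

```python
import os
from typing import Optional


def find_icon(name: str, icon_path: str, start_dir: str) -> Optional[str]:
    """Return the first file called name found along a colon-separated path.

    start_dir stands for the directory that was current when the program
    was started; relative path components are resolved against it.
    """
    for component in icon_path.split(":"):
        if not component:
            continue  # assumption: ignore empty components such as "a::b"
        if not component.startswith("/"):
            component = os.path.join(start_dir, component)
        candidate = os.path.join(component, name)
        if os.path.isfile(candidate):
            return candidate  # first match wins; later directories are never consulted
    return None
```

With a path of "icons:/usr/share/icons" (both directory names are made up), an icon present in both directories would always be taken from the first one.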