netrik hacker's manual
>========================<

[This file contains a description of the text search functionality. See hacking.txt or hacking.html for an overview of the manual.]

Framework

Though by far not as complicated as link handling, the text search functionality is also somewhat distributed: The serach command is given in the pager (display(), see hacking-pager.*), and after the search the new position is also activated here. The search itself however is done in main(). (There is no real reason for this, except that reading the search string has to be done in scroll mode, as well es displaying a warning/error if search is wrapped or nothing found.)

When the search command key ('/') was entered in the pager, display() quits with RET_SEARCH. Additionally, "search.type" (see Search Struct below) is set to SEARCH_FORWARD.

In main() (see hacking.*), when RET_SEARCH is encountered, first the search string is read from the user with readline("/"). If '^D' was typed, the search is immediately aborted. If nothing was typed in (just <return>), we reuse the previous search string; otherwise we store the new one.

Having this, the actual search is performed. This is described under Search below.

The result of the search operation is a new cursor position. The main loop iteration is ended now, and the pager gets restarted in the next iteration. The display() then knows to jump to the new position by "search.type" being set.

Search Struct

For passing information between the search code in the pager and in main(), as well as between multiple searches, a global "search" struct is defined in search.c.

This struct is of type "struct Search" and presently contains the following data:

Search

The search itself is done in several steps. First we need to determine where on the page the cursor stands to know where to search from, i.e. find out in which text item, and the exact position inside this item's text string.

This part is fairly complicated, as there are several possibilities; it's actually the most complicated part of all. The most basic possibility is that the cursor stands inmidst some text line. For this to find out, we have to check all text items present in the cursor line (using the page_map[], which is described under create_map() in hacking-layout.*). For each of those items we look up the beginning and end of the text line visible in the scanned page line using the helper functions from items.c (see hacking-layout.*). Note that line_start() and line_end() return the position of the line inside the text string, while line_pos() gives the page position of the line start.

Now if the cursor actually stands inside the line (the offset of the cursor position ralatively to the beginning of the line is >=0 and smaller than the line width), we simply store the string position of the cursor, which is the string position of the line start plus the offset.

page:
 
            First paragraph
          with centered text.   <-- cursor_y
          |         ^
          |<------->|
line_pos()  offset   cursor_x
 
string: "First paragraph with centered text."
                         |         ^
                         |<------->|
             line_start()  offset   start_pos

The cursor may also stand before the beginning of the line. In this case we want to search from the beginning of the line. This is achieved by just assuming the cursor *does* stand at the beginning of the line (setting "offset" to 0).

            First paragraph
          with centered text.   <--
      ^   |
      |<->|
        offset<0

A third case is that the cursor stands behind the end of the text line, but still inside the text block (i.e. this is not the last line). Here, the search starts with the beginning of the next line. (Which is equal to the end of this one.)

            line_start()   line_end()
            v              v
            First paragraph     <--
          with centered text.
            |                 ^
            |<--------------->|
               offset >= line_end()-line_start()

Another procedure is necessary if the cursor doesn't stand inside a text item at all (the above search has no result). In this case we have to search from the beginning of the next text item. For this, we simply scan all text items in the page and take the first one that starts after "cursor_y". (Alternatively we could scan the page linewise starting with the next line... This would be way more efficient, but may cause problems when tables and similar things are implemented. We will revise this at some point.)

            First paragraph     <-- item->y_start <= cursor_y
          with centered text.
                                <-- cursor_y
           Second paragraph.    <-- item->y_start > cursor_y   <== start_item
           ^
           start_pos=0

Now the real search can start. It is done item by item, till the page end. In each text item, we search for the search string inside the item's text string with strstr(). For the first item, we start with "start_pos" as offset, so only the string part after the cursor position is searched; for all following items, "start_pos" is set to 0.

As soon as strstr() returns the first match, the matching item and the position inside the string are memorized, and the search is terminated.

If nothing was found, the search is wrapped: In a second step, all items beginning with the first one on the page and ending with the start item are scanned. (Note that the start item is searched again, this time starting at the beginning of the string. The part after "start_pos" is searched a second time, but this shouldn't matter.)

There is an exception if the search started at the page end ("start_item" is NULL): We can't scan up to the start item inclusive in this case, of course; instead, we simply scan to the page end. (No special handling of this case is necessary in the first pass, as that one just scans to the page end, thus will simply terminate immediately.)

Finally the match position needs to be transformed back to page coordinates again. This is much simpler than the other way round: We just need to find out in which line of the text item the match position is located, and can calculate y and x coordinates from the line number and the line position then.

                        line_end(0)         line_end(1)
                        v                   v
string: "First paragraph with centered text."
                         |             ^
                         |<----------->|
            line_start(1)    offset     match_pos
 
page:
 
            First paragraph
          with centered text.   <-- cursor_y=line_end(1)
          |             ^
          |<----------->|
line_pos(1)   offset     cursor_x