`LifeLines` Developer Documentation

`LifeLines` Version 3.0.61

Perry Rapp

Introduction to Lifelines Developers Manual

LifeLines source code is divided into several functional subdirectories, which will be discussed individually below. They are chained together by an autotools build system, which creates executables in both the liflines and tools subdirectories.

btree module

The btree subdirectory contains the implementation for a btree database, using fixed length 8 letter keys (RKEY).

nodes. Each node in the btree is a separate file on disk (named, eg, "aa"), and the first 4096 (BUFLEN macro) bytes are the node header.

index nodes . These are the interior index nodes of the btree; they contain pointers to subordinate index or block nodes. The program performs binary searches through index nodes to find a particular key.

block nodes . These contain the actual data (keys and their associated records).

keyfile . One special file on the disk, the keyfile, contains some meta information and a pointer to the root of the btree (the master key). When the root changes (splits), the master key in the keyfile is updated accordingly.

traverse . There is a traversal function implemented at the btree level, which uses a callback.

bterrno . There is a global integer error variable, bterrno, which is set by this module upon most failure conditions.

FUTURE DIRECTIONS . bterrno must be removed for multi-threading. Traversal is more elegantly done via iterator style repeated calls in, instead of callback.

stdlib module

The stdlib directory contains various utility functions not specifically related to LifeLines, GEDCOM, or even genealogy.

String Functions

There has built up, over time, quite an assortment of string functions, split currently between mystring.c and stdstrng.c (and a few macros in standard.h).

String copy and concatenation

char *llstrncpy(char *dest, size_t n, int utf8, const char * fmt, va_list args);

char *llstrncat(char *dest, size_t n, int utf8, const char * fmt, va_list args);

These are simple wrappers around the C RTL (run time library) functions. The ANSI versions do not zero-terminate on overflow, which is greatly inconvenient, os the wrapper versions do so. Also, the wrapper versions are UTF-8 aware (they backtrack on overflow, to avoid leaving part of a UTF-8 multibyte sequence at the end).

String append (llstrapp)

char *llstrapps(char *dest, size_t limit, int utf8, const char * src);

char *llstrappc(char *dest, size_t limit, char ch);

char *llstrappc(char *dest, int limit, int utf8, const char * fmt);

char *llstrappv(char *dest, int limit, int utf8, const char * fmt, va_list args);

This family of functions is one (thin) layer higher than llstrncpy, providing an interface wherein the caller specified the buffer's start and entire size. That is,

      llstrncat(buffer, " more stuff", sizeof(buffer)-strlen(buffer));

may be replaced by

      llstrapp(buffer, sizeof(buffer), " more stuff");

There are also varargs versions, so that

      snprintf(buffer+strlen(buffer), sizeof(buffer)-strlen(buffer), ...

may be replaced by

      llstrappf(buffer, sizeof(buffer), ...

String append (appendstr)

This is a family of functions similar in purpose to the strapp family, but which uses an additional level of indirection, advancing pointers and decrementing counts.

* NOTE: FUTURE DIRECTIONS I put these in, and I would like to take them out, as I find them less intuitive than the strapp family, and more bug-prone. They are slightly faster, but I don't think it is worth it. -Perry.

char functions

There are character classification functions, which have handling particular to Latin-1 and to Finnish (if the Finnish compilation option was set).

* NOTE: FUTURE DIRECTIONS It would be very nice to see wchar-based functions, which handle unicode, replace these, and then we might be able to jettison the Latin-1 and Finnish specific character code.

string allocation functions

TODO: (strsave, strfree, strupdate, strconcat, free_array_strings)

string conversion functions

TODO: (isnumeric, lower, upper, capitalize, titlecase)

string equality functions

TODO: (eqstr, eqstr_ex, nestr, cmpstr)

string comparison functions

TODO: (cmpstrloc)

string whitespace functions

TODO: (trim, striptrail, striplead, allwhite, chomp)

string UTF-8 functions

These are the low-level functions used to do UTF-8 mechanics. These should only be called when in a database with internal codeset of UTF-8.

printpic functions

These are simple printf style functions, except they only handle string format, and they do handle reordering the inputs. These are used for strings that are internationalized, so that words or numbers (passed in string format) may be reordered in other languages. Instead of %s escapes, these handle %1, %2, and %3 escapes.

List Module

list.c and list.h implement a simple, doubly-linked list type, which takes void pointers (VPTR) as elements. The list manages its own nodes and memory (struct tag_list and struct tag_lnode), but the for the elements, it only frees them if the caller so instructs it (using list type LISTDOFREE), and of course this only works if they are stdalloc/stdfree heap blocks.

Table Module

table.c and table.h implement a fixed size hash tree (with linear buckets). As of 2005-01, Perry has been changing the implementation of the table type, so it is currently in flux.

Balanced Binary Tree (rbtree) Module

rbtree.c and rbtree.h implement a generic red/black balanced binary tree. These are not currently used by lifelines, but are planned as a replacement for the current fixed-size hash table in table.c.

gedlib module

This directory is a collection of routines for GEDCOM and for its use in a LifeLines btree database.

names

This module implements indexing names. TODO: Explain soundex indexing.

refns

This module implements indexing references (REFNs). TOD: Explain two character index.

xreffile

This module stores lists of deleted record numbers for each type. When a record is deleted, its number is added to the appropriate deleted list in xreffile. When a record is added, first the appropriate deleted list in xreffile is checked for a free record number.

messages

Traditionally all translatable strings have been stored in this file. This is not necessary with the current gettext scheme, but it would perhaps be helpful if a resource based scheme were adapted in the future.

* FUTURE DIRECTIONS When/If GUI versions are incorporated into the same codebase, how to handle translate strings shared and not shared between versions needs to be worked out.

translation tables (charmaps.c and translat.c)

The implementation of codeset translation is stored here (not to be confused with language translation for the user interface, called localization, and not associated with these files). Both custom translation tables and delegation to the iconv codeset conversion library are done here.

indiseq

The indiseq type is implemented here, a list of records (which no longer need all be persons).

brwslist

Named browse lists are implemented here (temporary record lists named by user during this session).

interp module

The LifeLines reporting language parser and interpreter are stored here. A custom lexical analyzer is in lex.c, and a yacc parser generator is in yacc.y.

The main interpreter is called with a list of files to parse, and some options. In actuality, I don't think more than one file is ever passed to the main entry point. If no file is passed, the routine will prompt (and here is where the user may choose a report from a list). But a report may be passed in, if one was specified with commandline argument to llines or llexec.

The report file is parsed, and as it is parsed, any included reports are added to the list to be parsed (unless already on the list, so circular references are not a problem).

require statements are handled at parse time. The handler puts the requested version into the file property table (stored inside the pointer in the filetab entry for the file; filetab entries are indexed by full path of report). Later, just after parse completes for that file (in the main parsing loop in the main interpreter function), require conditions are tested in check_rpt_requires(...).

pvalues

All variable values in report language interpretation are stored in a union type called pvalue.

symtab

Symbol tables are a thin wrapper around the table type provided by stdlib, specialized to hold pvalues.

date

A fairly complete GEDCOM date parser is also located here. It actually includes both a date parser, and a date formatter (which generates the thousands of possible LifeLines date formats).

* FUTURE DIRECTIONS If a date type were added to the report language, it would be possible to distinguish fully-parsed dates in the report language (so invalid or illegal dates could be flagged and handled separately in a report). The date module already implements a date type internally, and it is exposed to the rest of the program (gdate and gdate_val, which correspond to GEDCOM date types), but not to the report language.

liflines module

TODO:

autotools build system

todo

Building `LifeLines`

This chapter gives an overview of one way you can build LifeLines. It is not intended to be a comprehensive list of all techniques, but rather enough to get you started. This section does not assume you are downloading the source tarball and building it, Those instructions are in the file INSTALL. We are assuming you are checking out the sources from CVS.

Checking the CVS source out

If you are not a member of the LifeLines development project, you can check out sources anonymously. In the following assume that CVS stands for

cvs -d:pserver:anonymous@cvs.lifelines.sourceforge.net:/cvsroot/lifelines

If you are a project developer, you will be checking the files out under your user_id. so CVS will stand for

cvs -d:ext:user_id@cvs.lifelines.sourceforge.net:/cvsroot/lifelines

You will also need to export CVS_RSH

CVS_RSH=ssh
export CVS_RSH

The first time you check out sources into a build area:

CVS login anonymous
CVS checkout lifelines

When prompted for a password for anonymous, simply press the Enter key.

Once you have checked out the sources, cvs hides information in the CVS sub directories about how you accessed sourceforge so the -d option isn't needed to be typed in. After the first checkout, if you want to update your sources, you can just type:

cvs update lifelines

The cvs login command stashes information in .cvspass for remote repository access. If this is the only remote cvs archive you access, you may be able to skip the cvs login command on future access attempts. If you work on multiple projects you can logout when you are finished with

cvs logout

automake and autoconf

Many of the files you're used to editing by hand are automatically generated by automake and/or autoconf. These include any file named Makefile, Makefile.in, config.h, config.h.in, or configure.

The proper files to modify by hand are configure.ac (if there's something new you need to determine about the host system at configuration time) and Makefile.am (if source files are added or removed, targets added, or dependencies changed).

As long as you have autoconf and automake installed on your system, the Makefiles generated will be able to regenerate any file dependent on a Makefile.am or configure.ac. To regenerate the build system explicitly run the script autogen.sh:

 sh autogen.sh

autogen *must* be run after freshly checking a copy of the project out of CVS -- the files generated automatically are no longer included in the CVS repository.

Which does what:

At development (or package creation) time:

aclocal: This generates aclocal.m4 from acinclude.m4.

Please run 'aclocal -I build/autotools -I build/autotools' in order to get all the autoconf, automake and gettext macros into aclocal.m4. [ The autogen.sh script has been updated to do this. ]

autoheader: This generates acconfig.h.

automake: This generates Makefile.in files from Makefile.am files.

autoconf: This generates configure from configure.ac.

On remote machine compiling a source distribution package:

configure will generate config.h and Makefile files from Makefile.in files.

Building the code on Unix/Linux

There are lots of dependencies required to build LifeLines. Of course you need a C Compiler and make, but also a number of other tools like autoconf, automake, byacc and flex. One way to build the code is to make a subdirectory, lets say called bld in your lifelines directory, (where the toplevel Makefile.am is located) and then build all the code there. This keeps the objects and executables out of the source directories. This is the process shown here.

sh autogen.sh
mkdir bld
cd bld
../configure
make

This should build LifeLines and leave the results in subdirectories of the the directory bld.

Generating the source tarball

If you have build the code as described above, you can generate the source tarball as follows;

cd bld
make dist

While this is a source tarball it does contain a number of generated files that make it easier to generate LifeLines from the source tarball.

Generating the rpm package

The specification file to build a rpm for redhat linux is included in the cvs repository. These notes show how you can use this to build the source and binary rpm for redhat linux.

These instructions use techniques described by Mike Harris in a note entitled "Building RPM packages as a non-root user." These were found at http://www.rpm.org/hintskinks/buildtree. At that url was also a tarball that included the files README( the note), .rpmrc and .rpmmacros. The later two files are installed in your home directory. These do alter the default behavior of rpm for you and are not required to build the rpm, however, these instructions will fail.

Make sure there is a line of the form

%packager     Joe Blow  <joe@blow.com>

In your ~/.rpmmacros file. It is used to put the name and email address of the individual generating the rpm package into the file. Be sure to use your name and email address. If there is a "Packager:" entry in the lifelines.spec file, make sure it is correct, as it overrides the value in your .rpmmacros file.

From the lifelines directory (where the toplevel Makefile.am and the bld directory are, execute the following commands (with appropriate version numbers of course)

mkdir ~/rpmbuild
mkdir ~/rpmbuild/SRPMS
mkdir ~/rpmbuild/RPMS
mkdir ~/rpmbuild/BUILD
mkdir ~/rpmbuild/tmp
mkdir ~/rpmbuild/lifelines-3.0.22
cp bld/lifelines-3.0.22.tar.gz ~/rpmbuild/lifelines-3.0.22.
cp build/rpm/lifelines.spec ~/rpmbuild/lifelines-3.0.22
cd ~/rpmbuild/lifeines-3.0.22
rpmbuild -ba lifelines.spec

The mkdir commands only need to be executed if needed. If everything goes ok, this will generate a source and binary rpm.

Making a release

To release a new version, run the build/setversions.sh script to set the version in the many necessary files. Add an entry mentioning the new version in the

  ChangeLog

Tag the cvs source via (for example, for version 3.0.25)

  cvs tag v3_0_25

Finally, Send an announcement to the LINES-L mailing list

Putting a release on sourceforge

(Not all developers have the power to create or edit a file release on sourceforge, only Project Administrators and File Release Technicians.)

The instructions at http://sourceforge.net/docman/display_doc.php?docid=6445&group_id=1 are the ones that Perry followed to make many of the releases.

LifeLines Developer Documentation

LifeLines Version 3.0.61