I originally wrote this program because I tried the GEDCOM to HTML translator posted by Frode Kvam (frode@ifi.unit.no) and found it insufficiently flexible. Since his program parsed only a limited portion of the GEDCOM file, not including notes records, there wasn't an easy way to modify it to get all my notes into the output files. So I decided to write a YACC-based parser for the GEDCOM standard and to base the translator on that. The YACC parser was used in Version 1 of my program. However, as I gained experience with the GEDCOM standard and how it is actually used in practice, I decided that it was too difficult to make the YACC-based parser accept the full variety of GEDCOM files that actually exist. So for Version 2 I rewrote the parser so that it accepts essentially ``any'' GEDCOM file and complains only about grossly malformed input.
Since Version 2.0, many people have used this program to place their family history databases on the World Wide Web. Small GEDCOM files of under 1000 individuals are processed into HTML by GED2HTML in a few seconds on a modern PC running Un*x (processing is somewhat slower under Windows because of its less efficient filesystem). On a system with sufficient swap space and main memory, much larger GEDCOM files can be processed as well. The program has processed databases of well over 100,000 lines of GEDCOM and 10,000 individuals under both Un*x and Windows, and it is capable of processing all the GEDCOM files on Yvon Cyr's Acadian/French Canadian CD-ROM. The largest database I have attempted is the file ``t-roux.ged'' on that CD-ROM, a 5478458-byte, 214266-line GEDCOM file containing 15472 individuals and 7012 families. On my system (486/33 with 16MB RAM and IDE disks, running the FreeBSD 2.0.5 operating system), it took roughly 35 minutes to process this file. Under five minutes of that were spent reading the file and constructing the database; the remainder was spent writing the output: 1548 HTML files of individual data (10 individuals per file, organized into 31 directories), a three-level hierarchical index consisting of 574 HTML files, and a surname index in a single HTML file. The HTML output files consumed 18738K of disk space. The processing itself required 32MB of virtual memory.
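As a quick sanity check on the figures for ``t-roux.ged'', the file and directory counts can be reproduced with a little arithmetic. This is only a sketch: the limit of 50 files per directory is an assumption inferred from the reported totals, not something stated by the program.

```python
# Figures reported in the text above.
individuals = 15472
per_file = 10            # individuals per HTML file, as stated

# Assumption (not stated in the text): at most 50 files per directory,
# inferred from 1548 files being spread over 31 directories.
files_per_dir = 50

# Integer ceiling division: a partly filled last file or directory still counts.
html_files = (individuals + per_file - 1) // per_file
directories = (html_files + files_per_dir - 1) // files_per_dir

print(html_files, directories)
```

Under that assumption the computed counts agree with the 1548 files and 31 directories reported for this database.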
I have used this program to prepare my own data for presentation on the World Wide Web. You can view this data by starting from here. I preprocessed my GEDCOM file to produce approximately 700 individual files, which are linked to one another and to my hypertext family history document. Birger Wathne (Birger.Wathne@vest.sdata.no) and others have used various versions of this program in demonstrations of genealogy over the World Wide Web. Some of these demonstrations do not preprocess the data into HTML files, but instead use LifeLines to manage the database in GEDCOM format and ged2html to process the output of queries for presentation over the Web. At present, however, most people use this program as a ``black box'' for quickly transforming their GEDCOM data into a form suitable for presentation on the World Wide Web. A good starting point for finding many of these databases is Tim Doyle's home page.
I have developed and run this program on an Intel 486DX/33 under the FreeBSD operating system. If you are using another flavor of Un*x, you shouldn't have too much trouble getting it to run. You do need an ANSI C compiler (like GCC), as I am no longer interested in writing old-style C. I have also compiled the program for Windows using Microsoft Visual C/C++ 1.0. Most of the people presently using the program are using the Windows version.
The GEDCOM parser in the program is built around the GEDCOM 5.3 standard. Whereas Version 1 of this program checked the GEDCOM input fairly stringently for conformance to the standard, the current version attempts to make sense of anything that looks remotely like a GEDCOM file. It will complain about grossly malformed GEDCOM files, but it still tries to get through to the end and produce whatever output it can.
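The tolerant approach described above can be illustrated with a minimal sketch. This is not the parser used in GED2HTML (which is written in C); the function name `parse_gedcom_lines` and the regular expression are hypothetical, chosen only to show the idea of accepting anything of the general shape ``level, optional cross-reference ID, tag, optional value'', recording a complaint about lines that cannot be read, and carrying on to the end of the input.

```python
import re

# Lenient pattern for one GEDCOM line:
# level number, optional @XREF@, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s(.*))?$")

def parse_gedcom_lines(text):
    """Return (records, warnings); unreadable lines are reported and skipped."""
    records, warnings = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # tolerate blank lines silently
        m = LINE_RE.match(raw)
        if m is None:
            warnings.append(f"line {lineno}: cannot parse {raw!r}")
            continue  # complain, but keep going to the end of the input
        level, xref, tag, value = m.groups()
        # Tags are upper-cased so that sloppy input such as "birt" is accepted.
        records.append((int(level), xref, tag.upper(), value or ""))
    return records, warnings

sample = """0 @I101@ INDI
1 NAME John /Smith/
this line is garbage
1 birt
2 DATE 1 JAN 1900
"""
records, warnings = parse_gedcom_lines(sample)
```

Here the third line of the sample produces a warning, while the four readable lines are still returned as records, so downstream output can be generated from whatever was understood.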
The output processor is template-driven. That is, it consists of an interpreter for a simple macro language, which produces output files by processing template strings and filling in information from the GEDCOM database. The template-driven output scheme was chosen for flexibility and language independence. The default templates use the cross-reference IDs in the GEDCOM file to name the HTML files, and will insert one ``image'' file (if it exists) near the beginning of each individual file and one ``additional information'' file (if it exists) at the end of each individual file. For example, an individual with cross-reference ID ``I101'' would receive an HTML file ``I101.html''. As this file is created, the file ``I101.img'' (intended to hold an image of the person) would be inserted near the beginning, and the file ``I101.inc'' (intended to hold arbitrary additional material) would be inserted at the end. Default templates are compiled into the program, and they will be used unless you specify an alternative template using the appropriate command-line argument.
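The naming and inclusion scheme described above can be sketched as follows. This is an illustration only: it uses Python's `string.Template` as a stand-in for the program's own macro language, whose syntax is not shown here, and the names `PAGE`, `read_if_exists`, and `render_individual` are hypothetical.

```python
import os
import tempfile
from string import Template

# Hypothetical page template; a stand-in for the real macro language.
PAGE = Template("<h1>$name</h1>\n$image$body\n$extra")

def read_if_exists(path):
    """Return the file's contents, or an empty string when it is absent."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return f.read()
    return ""

def render_individual(xref, name, body, directory):
    """Write <xref>.html, splicing in <xref>.img and <xref>.inc if present."""
    image = read_if_exists(os.path.join(directory, xref + ".img"))
    extra = read_if_exists(os.path.join(directory, xref + ".inc"))
    html = PAGE.substitute(name=name, image=image, body=body, extra=extra)
    out_path = os.path.join(directory, xref + ".html")
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(html)
    return out_path

# Demonstration in a scratch directory: only the ".inc" file exists.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "I101.inc"), "w", encoding="utf-8") as f:
        f.write("<p>Extra notes.</p>\n")
    path = render_individual("I101", "John Smith", "<p>Born 1900.</p>", d)
    with open(path, encoding="utf-8") as f:
        page = f.read()
```

In the demonstration, the missing ``I101.img'' is simply skipped, while the contents of ``I101.inc'' appear at the end of ``I101.html''.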
Versions 2.3a and earlier of this program were released as freeware. Since Version 2.3a, however, I have been spending quite a bit of time responding to E-mail from users of the program. From this correspondence, it became clear that much better documentation was required. The program itself has also grown, and revisions are starting to take more time. In addition, I have begun running an Experimental GenWeb Index site on the World Wide Web to try to establish a central index of as much as possible of the data prepared using the GED2HTML program (and other compatible software). To justify the amount of time I am spending on this work, I decided to make the current and future versions of the program shareware.
THANKS go to Birger Wathne for contributing useful ideas and code for the first versions of this program; to a number of other users of various versions of the program (including, but not limited to, Annelise Anderson, Allyn Brosz, Susie and Kerry Jane Dunavant, Bob Fieg, W. Wesley Groleau, Brian Mavrogeorge, Steve Messinger, Mike Schwitzgebel, and Doug Smith) who took the trouble to send me their bug reports and problematic GEDCOM files; and to everyone else who sent kind words that make all the work I did on this program worthwhile.
Copyright © 1995 Eugene W. Stark. All rights reserved.