collection consists of 22 data files, an SGML DTD file describing the data file format, and six files describing the categories used to index the data. Each of the first 21 files as "Sets" of "Strings ...