
Diagnostic Tools for the Union Databases: Statistics, Rejects, Filters, and Deduplication

After your records are submitted for inclusion in the union databases, there are several tools available for diagnosing the accuracy of the exporting and merging processes.

UNION DATABASE STATISTICS (AGent)

To access your library's stats, go to the AGent Staff Menu (you may need to click on Staff Menu twice if your initial preference is set to go directly to a particular staff function such as ILL). Click "Union Database Statistics" in the menu on the left-hand side of the screen, then select the database you wish to run statistics for (ELN Media Cat, OutLook OnLine, or ELN Serials Cat). Two options are offered.

The first option is statistics for the complete database. This provides all the statistics for that particular union database, e.g. all bib records and holdings for ELN Media.

The second option is MARC Field Stats for your library. Select this option to see the number of entries for each MARC field and sub-field for that particular database for your library.

Please note that the total number of records for your library in the database may not exactly match the number of records submitted. This is because some preprocessing takes place before your records are merged into the database: duplicate records within your library’s submission are eliminated, and records with no holdings are rejected. A large discrepancy in the numbers, however, may mean that another problem has occurred. For example, if your submission was split into 6 separate files, a large difference in the number of records may indicate a problem with the ftp process for one of the files.
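As a rough illustration of why the two totals can differ, the preprocessing described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the actual loader: the record layout (a dict with a hypothetical `id` key and a `holdings` list) and the function name `expected_record_count` are invented for the example, and the real test for duplicates is not documented here.

```python
def expected_record_count(records):
    """Estimate how many submitted records survive preprocessing.

    Assumption: each record is a dict with a hypothetical 'id' key and a
    'holdings' list. The real duplicate test may use other criteria.
    """
    seen_ids = set()
    kept = 0
    for record in records:
        if not record.get("holdings"):
            continue  # records with no holdings are rejected
        if record["id"] in seen_ids:
            continue  # duplicate within the same submission
        seen_ids.add(record["id"])
        kept += 1
    return kept


submitted = [
    {"id": "a1", "holdings": ["MAIN"]},
    {"id": "a1", "holdings": ["MAIN"]},                    # duplicate of the first record
    {"id": "b2", "holdings": []},                          # no holdings
    {"id": "c3", "holdings": ["http://example.org/ebook"]},  # electronic holding (URL)
]
print(len(submitted), expected_record_count(submitted))
```

If the total shown in the Union Database Statistics is far below an estimate like this one, the difference is unlikely to be explained by preprocessing alone.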

REJECTS (FTP site for OutLook)

Records with no holdings (whether physical or electronic, i.e. records with a URL) are not added to the union databases. A file of rejected records is posted to the ftp site in ASCII format. If your library is not seeing records in the union database that should be there, this file can be a useful diagnostic tool.

FAILED FILTER (FTP site for OutLook)

The failed filter file on the ftp site contains records that do not meet the minimum cataloguing standard of having at least some information in the 245 $a (title) and 260 $b (publisher name). For most academic libraries, this file will be quite small, and will indicate esoteric materials (e.g. theses, items where publisher name would be difficult to identify). This file can be useful for identifying minor cataloguing errors, e.g. when information that should be in the 260 $b is in the $a instead.

Libraries that submit large numbers of records with very minimal cataloguing (e.g. reserves records) will have large failed filter files. The failed filter file is for diagnostic purposes only; records that fail this filter are still added to the database.
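A library that wants to predict which of its records will appear in the failed filter file could run a similar check locally before submitting. The sketch below is only an approximation of the rule stated above (some information in 245 $a and 260 $b). The simplified record layout, a dict mapping a MARC tag to a dict of subfield codes, and the function name `passes_filter` are assumptions made for the example.

```python
def passes_filter(record):
    """Return True if the record has some text in 245 $a and 260 $b.

    Assumption: a record is a dict mapping a MARC tag to a dict of
    subfield code -> value. Real MARC fields and subfields can repeat.
    """
    title = record.get("245", {}).get("a", "").strip()
    publisher = record.get("260", {}).get("b", "").strip()
    return bool(title) and bool(publisher)


records = [
    {"245": {"a": "Introduction to cataloguing"},
     "260": {"a": "Vancouver :", "b": "Example Press,"}},
    # Publisher name entered in 260 $a by mistake, so $b is missing
    {"245": {"a": "Annual report"},
     "260": {"a": "Example Publishing"}},
]

failed = [r for r in records if not passes_filter(r)]
for r in failed:
    print("Would fail filter:", r["245"]["a"])
```

The second sample record shows the kind of minor cataloguing error mentioned above, where the publisher name sits in the $a instead of the $b.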

ELN SERIALS / NO_ISSN_ISBN (FTP site for Serials & Media)

This is a list, in MARC format, of all records submitted to ELN Serials that have neither an ISSN nor an ISBN. As of April 2004, the median for ELN partner libraries is 86% of ELN Serials records having an ISSN. The number of ISSNs for your library can be checked using the Union Database Statistics. The NO_ISSN_ISBN file is for diagnostic purposes only; records that fail this filter are still added to the database.
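To compare your own submission against that median, the same two measures can be computed locally. The sketch below assumes the ISSN is recorded in 022 $a and the ISBN in 020 $a, as in standard MARC 21. The simplified record layout (tag mapped to a dict of subfield codes) and the function names are hypothetical.

```python
def has_subfield(record, tag, code="a"):
    """True if the given tag/subfield holds non-blank text."""
    return bool(record.get(tag, {}).get(code, "").strip())


def issn_summary(records):
    """Return (percent of records with an ISSN, records with neither number).

    Assumption: ISSN is in 022 $a and ISBN is in 020 $a.
    """
    with_issn = sum(1 for r in records if has_subfield(r, "022"))
    neither = [
        r for r in records
        if not has_subfield(r, "022") and not has_subfield(r, "020")
    ]
    percent = 100.0 * with_issn / len(records) if records else 0.0
    return percent, neither


serials = [
    {"245": {"a": "Journal A"}, "022": {"a": "1234-5678"}},
    {"245": {"a": "Journal B"}, "022": {"a": "2345-6789"}},
    {"245": {"a": "Monographic series C"}, "020": {"a": "0123456789"}},
    {"245": {"a": "Newsletter D"}},
]
percent, neither = issn_summary(serials)
print(f"{percent:.0f}% with ISSN; {len(neither)} record(s) with neither number")
```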

Access instructions for the ftp site are not posted on the website for security reasons. For assistance, please contact Korinne Hamakawa <korinnem@eln.bc.ca>

MANUAL CHECKING

It is a good idea to check a few of your library’s items manually, both to be sure that they are included in the database and that holdings information is mapped correctly.

DEDUPLICATION

Records are merged into the databases using matching algorithms – for details see the links under “Documentation” on http://www.eln.bc.ca/view.php?id=82

When two records are matched, a master record is selected on the basis of cataloguing authority information, or on the basis of the most complete record (the record with more MARC fields and subfields filled out).
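The "most complete record" rule can be pictured with a simple score. The sketch below is an assumption about how completeness might be measured (one point per field with content plus one point per filled subfield). It is not the actual merging algorithm, and it does not model selection by cataloguing authority. The function names and record layout are hypothetical.

```python
def completeness(record):
    """Score a record by counting fields and filled subfields.

    Assumption: a record is a dict mapping a MARC tag to a dict of
    subfield code -> value. The real scoring is not documented here.
    """
    score = 0
    for subfields in record.values():
        filled = [value for value in subfields.values() if value.strip()]
        if filled:
            score += 1 + len(filled)  # one for the field, one per subfield
    return score


def pick_master(record_a, record_b):
    """Return the more complete of two matched records (ties go to record_a)."""
    if completeness(record_a) >= completeness(record_b):
        return record_a
    return record_b


brief = {"245": {"a": "Title"}, "260": {"b": "Publisher"}}
full = {
    "245": {"a": "Title", "b": "subtitle"},
    "260": {"a": "Place", "b": "Publisher", "c": "2001"},
    "300": {"a": "200 p."},
}
print(completeness(brief), completeness(full))
print(pick_master(brief, full) is full)
```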

For assistance with union database submissions and diagnostic tools please contact Korinne Hamakawa <korinnem@eln.bc.ca>

For union database developmental and/or metadata issues, please contact Sunni Nishimura <sunnin@eln.bc.ca>