
Bulk Loading: Job-Specific Instructions

Scope: This procedure provides instructions for running automated import, export, cleanup, reporting and various related jobs using shell scripts and other tools created by the Library Systems Office. The list of jobs includes the frequency of the work, the list of people responsible for performing this work, as well as the function of each job. In addition, this procedure includes guidelines for scheduling batch routines.

Contact: Gary Branch

Unit: Database Management Services

Date last updated: 08/06/09

Date of next review: July 2010


 


 


ACLS

  • Run this job whenever a new file of ACLS records becomes available.
  • Save the new file in \\Library30\input\vendorRecords\ACLS
  • Type "acls_load <filename>" where <filename> is the name of the file to be loaded.
    • Example: If the target file is called acls3b.marc, the command to load this file will be: acls_load acls3b.marc

ANNEX LOCATION REPORT (Suspended)

  • Run this job monthly on the first business day.
  • Type "annex_reports".

ASIA Q-V CALL NUMBER CLEANUP (Suspended)

  • Run this job monthly on or about the 20th.
  • Type "asiaqv run".
  • FTP extracted files to PC and use NWU's Vger Location Change program to update MFHD and item records in each of the three files. 

AUTHORITY RECORDS: NAMES AND SUBJECTS

  • Run these jobs weekly or whenever new NAF and SAF files are available. Watch for the "LC files" memos from Peter Ward on CTSBULK-L.
  • Type "auth <filename>" where:
    • <filename> is the name of the file to be loaded. Files are numbered sequentially using the designations "unname" or "unsub" followed by a two-digit year and a two-digit week number.
    • Examples: If the previous bulkloaded NAF file was called unname00.17 (i.e. unspanned name authorities for the year 2000, week 17), the command to load the next file will be: auth unname00.18. If the previous bulkloaded SAF file was called unsub00.19, the command to load the next file will be: auth unsub00.20
    • Occasionally we receive a file from Peter Ward that will not load. DLIT can remove the miscoded records so that the remaining records in the file can load. After we are notified of the corrected file it can be loaded using an alternate program name.
    • Type "auth_local <filename>" where:
      • <filename> is the name of the corrected file. The load then proceeds as before.
    • To view discarded or rejected records:
      • Type "cd <directory name>" to open the appropriate file directory on Library9
      • Type "ls" to list the files in that directory
      • Type "more <file name>" to retrieve the records
      • Example: If the file containing the "Discards & Rejects DATA" is called: /usr/local/batch_data/Auth.014.0808/not_loaded.mrc, the commands to view discards and rejects will be:
        • cd /usr/local/batch_data/Auth.014.0808
        • ls
        • more not_loaded.mrc
    • Monitor the Global Headings Change queue. As needed, run PCat Jobs 11 & 12 to expand entries in the queue and PCat Job 13 to perform global headings changes (names only). The commands to run these jobs are, respectively: ghc11.pl, ghc12.pl, and ghc13.pl
    • Perform manual cleanup on remaining changed and deleted headings. Give Chinese, Japanese, and Korean work to the PCS CJK Assistant or CJK student.
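
The file-naming rule for NAF and SAF files described above (prefix, two-digit year, two-digit week) can be sketched as a small helper that derives the next expected file name from the last one loaded. This is only an illustration: the function name `next_authority_file` is hypothetical, and the handling of a new year is an open assumption because the procedure does not describe it.

```python
import re

# Pattern inferred from the examples above: prefix, two-digit year, ".", two-digit week.
AUTH_NAME = re.compile(r"(unname|unsub)(\d{2})\.(\d{2})")


def next_authority_file(previous: str) -> str:
    """Return the name expected to follow `previous` within the same year."""
    match = AUTH_NAME.fullmatch(previous)
    if match is None:
        raise ValueError(f"unexpected authority file name: {previous!r}")
    prefix, year, week = match.group(1), match.group(2), int(match.group(3))
    # Assumption: the week number increases by one. Rollover to a new
    # year is not described in the procedure and is not handled here.
    return f"{prefix}{year}.{week + 1:02d}"


print("auth " + next_authority_file("unname00.17"))  # auth unname00.18
print("auth " + next_authority_file("unsub00.19"))   # auth unsub00.20
```

Always confirm the actual file name against the "LC files" memo before loading, since a skipped week would break the simple increment.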

    BATCH DELETE

    • BIB holdings to be deleted from OCLC WorldCat are exported nightly by a Library Systems cron job that runs at 12:05 am.
    • LTS staff use BatchCat monthly to delete Voyager records, as needed. Voyager system numbers for the target records are copied cumulatively into the folder "dbqe/batchdelete" on library30 as part of the nightly cron job.

    BATCHMATCH

    • Run this job monthly, on the second Wednesday.
    • The Batchmatch process consists of eight steps: (1-4) extracting records for the current month -3, the current month -18, the current month -12, and the current month -24, and (5-8) reloading records for the current month -3, the current month -18, the current month -12, and the current month -24.
    • For the extracts, type "marc_export <Date 1> <Date 2> <Filename> <Extract Type>" where:
    • <Date 1> is the first create/update date of the monthly extract, entered in the form yyyymmdd. Example: 20021001
    • <Date 2> is the last create/update date of the monthly extract, entered in the form yyyymmdd. Example: 20021031
    • <Filename> is the name of the file to be sent to OCLC. Name the files FILEA (current month -3), FILEB (current month -18), FILEC (current month -12), or FILED (current month -24)
    • <Extract Type> is a switch to indicate which set of Voyager records are targeted. Type "reg" to retrieve backlog, CIP, and COR records for the current month -3 and current month -18 ranges. Type "supp" to retrieve COR records only for the current month -12 and current month -24 ranges.
    • Example (extract commands): In February 2003, four extracts are run. The target months are November 2002 (current month -3), August 2001 (current month -18), February 2002 (current month -12), and February 2001 (current month -24). Following the file names and extract types defined above, the commands for the four jobs will be:
      • marc_export 20021101 20021130 FILEA reg
      • marc_export 20010801 20010831 FILEB reg
      • marc_export 20020201 20020228 FILEC supp
      • marc_export 20010201 20010228 FILED supp
    • Watch for the "Batchload Processing Summary for COO Order <job number>" memo from OCLC, which will tell us when files are available for reload. Batchmatch reports are identified by id number P006353 in OCLC's PostReport.
    • Retrieve records for each file, which will be posted on OCLC's website. To determine which report matches which extracted file, compare the number of records noted in the Marcadia Export Summary message with the number of records reported in OCLC's batch notification messages.
    • Save the updated records provided by OCLC for each of the four files to Lib30/Input/Batchmatch. Rename each of the four files using the date of extract, following the pattern provided there, and give each newly named file the extension .mrc.

    Example: D080314FILEB.mrc

    • Reload each of the four files in turn. The commands for the four reloads will be:
    • marc_reload D030212FILEA.mrc
    • marc_reload D030212FILEB.mrc
    • marc_reload D030212FILEC.mrc
    • marc_reload D030212FILED.mrc
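
The date arithmetic behind the four Batchmatch extracts can be sketched as follows. The sketch only computes, for a given run date, the first and last day of each target month and the extract type stated in the procedure ("reg" for -3 and -18, "supp" for -12 and -24). The function names are hypothetical, and file names are deliberately left out.

```python
import calendar
from datetime import date

# (months back, extract type), as described in the procedure.
OFFSETS = [(3, "reg"), (18, "reg"), (12, "supp"), (24, "supp")]


def month_range(run_date: date, months_back: int):
    """Return the first and last day of the month `months_back` months earlier."""
    index = run_date.year * 12 + (run_date.month - 1) - months_back
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def batchmatch_ranges(run_date: date):
    """List (months back, Date 1, Date 2, extract type) for the four extracts."""
    rows = []
    for months_back, extract_type in OFFSETS:
        first, last = month_range(run_date, months_back)
        rows.append((months_back, first.strftime("%Y%m%d"),
                     last.strftime("%Y%m%d"), extract_type))
    return rows


for row in batchmatch_ranges(date(2003, 2, 12)):
    print(row)
```

For a run in February 2003 this yields November 2002 and August 2001 as "reg" ranges, and February 2002 and February 2001 as "supp" ranges.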

    CASALINI ENHANCED / CASALINI FULL

    • Run this job whenever new files become available.
    • For CASALINI ENHANCED: Type "vendor_load casplus <filename> <filename> <filename> <filename> <filename>" where filename is the specific name of the file to be loaded. Use lower-case letters in all file names. Note that up to five files can be handled with one command.
    • For CASALINI FULL: Type "vendor_load casfull <filename> <filename> <filename> <filename> <filename>"
      • Example: If the target files are called NYRA40b.083 and NYRA40g.055, the command to load these files will be: vendor_load casplus nyra40b.083 nyra40g.055 (use casfull for CASALINI FULL loads)
    • Retrieve the "Titles Not Loaded", if any, from the Casalini folder on the Library 30.
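
When more than five files arrive at once, the two rules above (lower-case file names, at most five files per command) can be applied mechanically. The following is a minimal sketch that only builds the command strings; `vendor_load_commands` is a hypothetical helper, not part of the Library Systems scripts.

```python
from typing import Iterable, List

MAX_FILES_PER_COMMAND = 5  # limit stated in the procedure


def vendor_load_commands(job: str, filenames: Iterable[str]) -> List[str]:
    """Build one vendor_load command per batch of up to five files."""
    names = [name.lower() for name in filenames]  # procedure asks for lower case
    commands = []
    for start in range(0, len(names), MAX_FILES_PER_COMMAND):
        batch = names[start:start + MAX_FILES_PER_COMMAND]
        commands.append(" ".join(["vendor_load", job] + batch))
    return commands


for command in vendor_load_commands("casplus", ["NYRA40b.083", "NYRA40g.055"]):
    print(command)  # vendor_load casplus nyra40b.083 nyra40g.055
```

Pass "casfull" as the job for CASALINI FULL loads.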

    CIS

    • Run this job annually or whenever CIS sends us an update file.
    • Type "cis" followed by the file name. Example: If the file is called "cull03.mrc," the command to load the records will be "cis cull03.mrc".
    • After the initial load, send a copy of the "Set A" summary report, with new BIB record range highlighted, to the BatchCat programmer. The BatchCat script will create MFHDs for the Set A records and unsuppress the BIBs.
    • LTS will use BatchDelete to delete pre-existing duplicate Voyager records. The filename for the list of BIB and MFHD records to be deleted will appear under "Summary for Project Management" in one of the system-generated bulk import summary reports.

    COPY NUMBER CLEANUP

    • Run this job monthly on or about the 15th.
    • Type "copynum".
    • FTP extracted files to PC and use BatchCat ChangeItemNumber.exe program to update item records.

    DONOR NAMES

    • Run this job monthly on or about the 15th.
    • Type "donor_names <Date 1> <Date 2>" where:

      • <Date 1> is the first date from the previous month covered by the file, entered in the form dd-mmm-yy.
        Example: 01-aug-06
      • <Date 2> is the last date from the previous month covered by the file, entered in the form dd-mmm-yy.
        Example: 31-aug-06
      • Example: On Sept. 15 you will run Donor Names for Aug. 1-31. The command to load will be: donor_names 01-aug-06 31-aug-06
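
As an illustration of the date rule above, the two arguments can be derived from the run date: the first and last day of the previous month, in dd-mmm-yy form. This is a sketch only; `donor_names_command` is a hypothetical name, and the month abbreviations are spelled out explicitly so the result does not depend on the system locale.

```python
import calendar
from datetime import date

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def dd_mmm_yy(day: date) -> str:
    """Format a date as dd-mmm-yy, for example 01-aug-06."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year % 100:02d}"


def donor_names_command(run_date: date) -> str:
    """Build the command covering the month before `run_date`."""
    year, month = run_date.year, run_date.month - 1
    if month == 0:  # a January run covers the previous December
        year, month = year - 1, 12
    last_day = calendar.monthrange(year, month)[1]
    first, last = date(year, month, 1), date(year, month, last_day)
    return f"donor_names {dd_mmm_yy(first)} {dd_mmm_yy(last)}"


print(donor_names_command(date(2006, 9, 15)))  # donor_names 01-aug-06 31-aug-06
```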

    EBRARY

    • Run this job whenever a new file of EBRARY records becomes available.
    • Save the new file in \\Library30\input\vendorRecords\ebrary
    • Type "ebrary_load <filename>" where <filename> is the name of the file to be loaded.
      • Example: If the target file is called ebrary5f.marc, the command to load this file will be: ebrary_load ebrary5f.marc

    HARRASSOWITZ

    • Run this job whenever new files become available. There are two kinds of files: approvals and standing orders.
    • Type "vendor_load harrassowitz <filename> <filename> <filename> <filename> <filename>" to load approvals and "vendor_load harrass_so <filename> <filename> <filename> <filename> <filename>" to load standing orders, where filename is the specific name of the file to be loaded. Use lower-case letters in all file names. Note that up to five files can be handled with one command.
      • Example: If the target files are called 20020920053152.24.invoices.marc21.abc and 20020924414617.24.invoices.marc21.abc, the command to load these files will be either: vendor_load harrassowitz 20020920053152.24.invoices.marc21.abc 20020924414617.24.invoices.marc21.abc or vendor_load harrass_so 20020920053152.24.invoices.marc21.abc 20020924414617.24.invoices.marc21.abc
    • Retrieve the "Titles Not Loaded", if any, from the Harrassowitz folder on the Library 30.

    IBERBOOK / ITURRIAGA

    • Run these jobs whenever new files become available. Iberbook and Iturriaga activity is divided into two loading sequences. The first sequence takes a file with many invoices and splits the records into separate files indicating the invoice number as part of the file name. The second sequence is executed using the "iberload" or "itturload" command.
    • Step 1: Type "vendor_load iberbook <filename>" or "vendor_load itturiaga <filename>" where filename is the specific name of the file to be split into separate files, one for each invoice. These split files are placed in the Iberbook or Itturiaga folder, ready for the second loading sequence.
      • Example: If the Iberbook target file is called marc15821.001, the command to load this file will be: vendor_load iberbook marc15821.001
      • Example: If the Iturriaga target file is called pe030811.mrc, the command to load this file will be: vendor_load itturiaga pe030811.mrc
    • Step 2: Type "vendor_load iberload <filename>" or "vendor_load itturload <filename>" where filename is the specific name of the file to be loaded into Voyager.
      • Example: If the Iberbook target file is called marc15821.split_marc.35868, the command to load this file will be: vendor_load iberload marc15821.split_marc.35868
      • Example: If the Iturriaga target file is called pe030811.mrc.split_marc.14341, the command to load this file will be: vendor_load itturload pe030811.mrc.split_marc.14341

       

    INITIAL ARTICLE / TITLE TAG CLEANUP

    • Run this job on the first Wednesday of the month.
    • Type "initial_article <Date 1> <Date 2>" where:

      • <Date 1> is the earliest date in the target range, entered in the form YYYYMMDD. Example: 20030801
      • <Date 2> is the latest date in the target range, entered in the form YYYYMMDD. Example: 20030831
      • Example: To run the job for August 2003, the command will be: initial_article 20030801 20030831
    • Download the reports for RLIN and individual language cleanup into Excel files and distribute to appropriate staff for manual cleanup.
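
Because the job takes two YYYYMMDD arguments, a mistyped or reversed range is an easy error to make. The following sketch checks the two dates before the command is typed. It is an optional aid under stated assumptions: `initial_article_command` and `parse_yyyymmdd` are hypothetical helper names, and the check that Date 1 must not be later than Date 2 is inferred from the description of the range.

```python
from datetime import datetime


def parse_yyyymmdd(value: str) -> datetime:
    """Parse a strict eight-digit YYYYMMDD string, rejecting impossible dates."""
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return datetime.strptime(value, "%Y%m%d")


def initial_article_command(date1: str, date2: str) -> str:
    """Return the command line once both dates have been validated."""
    if parse_yyyymmdd(date1) > parse_yyyymmdd(date2):
        raise ValueError("Date 1 must not be later than Date 2")
    return f"initial_article {date1} {date2}"


print(initial_article_command("20030801", "20030831"))
```

The same check applies to any other job in this procedure that takes a pair of YYYYMMDD dates.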

    ISBN EXTRACT

    • To create a file of ISBNs from a file on the Library 30, type: "isbn_extract <filename> <jobname>" where filename is the specific name of the target file and jobname is the type of job you wish to run.
    • Example: To extract ISBNs from an RPS file called Corn0715.mrc, the command will be: isbn_extract Corn0715.mrc rps

    LARGE-SCALE DIGITIZATION

    • Run this job whenever John Marmora, Joy Paulson, Cammie Wyckoff, or their designates post a request with circulation and shipping date information to CTSBULK-L. Shipments to Google will occur monthly.
    • Use the command "lsd_google" for all Google shipments.  Use the command "lsd_campus" for Kirtas shipments from central campus and "lsd_annex" for Kirtas shipments from the Library Annex. 
    • Type "lsd_[qualifier] <starting charge date> <ending charge date> <shipment date>" where:

      • <starting charge date> is the earliest circulation date in the target range, entered in the form dd-mmm-yy.
        Example: 13-oct-06
      • <ending charge date> is the latest circulation date in the target range, entered in the form dd-mmm-yy.
        Example: 16-oct-06
      • <shipment date> is the date the material will be picked up by the vendor, entered in the form dd-mmm-yy.
        Example: 03-nov-06
      • Example: To run the report for titles pulled between Oct. 13 and Oct. 16 and picked up from central campus by the vendor on Nov. 3, 2006, the command will be: lsd_campus 13-oct-06 16-oct-06 03-nov-06

    LC RESOURCE FILE (suspended)

    • Run this job weekly after receiving the two "LC files" memos from Peter Ward on CTSBULK-L.
    • Load the following five types of files from Peter Ward into the LC Resource File: cnbook, unmap, unmus, unser, unvis. Do not load unname and unsub files (these are authority records).
    • Type "lcload <filename> <filename> <filename> <filename> <filename>" where filename is the specific name of the file to be loaded. Note that up to five files can be handled in the weekly dataload.

      • Example: If the target files are called cnbook03.01 and unser03.01, the command to load these files will be: lcload cnbook03.01 unser03.01
      • Example: If the target files are called cnbook03.04, unmap03.01, unmus03.01, unser03.04, and unvis03.01, the command to load these files will be: lcload cnbook03.04 unmap03.01 unmus03.01 unser03.04 unvis03.01
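
The selection rule above (load cnbook, unmap, unmus, unser, and unvis files; leave unname and unsub files for the authority load) can be sketched as a simple prefix filter. The helper name `lcload_command` is hypothetical, and the sketch only assembles the command text.

```python
from typing import Iterable

LC_RESOURCE_PREFIXES = ("cnbook", "unmap", "unmus", "unser", "unvis")
MAX_FILES = 5  # up to five files per weekly dataload, per the procedure


def lcload_command(filenames: Iterable[str]) -> str:
    """Keep only LC Resource File types and build the lcload command."""
    wanted = [name for name in filenames if name.startswith(LC_RESOURCE_PREFIXES)]
    if not wanted:
        raise ValueError("no LC Resource File files in the list")
    if len(wanted) > MAX_FILES:
        raise ValueError("more than five files; split the load")
    return "lcload " + " ".join(wanted)


files = ["cnbook03.01", "unname03.01", "unser03.01", "unsub03.01"]
print(lcload_command(files))  # lcload cnbook03.01 unser03.01
```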

    LOST/MISSING WITHDRAWALS REPORT

    • Run this job on the first workday of the month.
    • Type "lostmissing <Date 1> <Date 2>" where:

      • <Date 1> is the earliest date in the target range, entered in the form YYYYMMDD. Example: 20030301
      • <Date 2> is the latest date in the target range, entered in the form YYYYMMDD. Example: 20030331
      • Example: To run the report for withdrawals done in March 2003, the command will be: lostmissing 20030301 20030331
    • Use MS Access and Snap software to reformat the report, then forward to Howard Brentlinger in IRIS, Collection Services.

    MARCADIA (Obsolete)

    MARCIVE

    • Watch for the "Data Ready for CITH" memo on CTSBULK-L, informing us that a new monthly file is available for loading. Run the load on either Tuesday or Thursday afternoon.
    • Following the instructions in the e-mail, FTP the file from the Marcive server to the \input\vendorRecords\Marcive\monthly_input folder on the Library 30. The data file must have a unique name.
    • To load the records, type "marcive_import <filename>" where filename is the specific name of the file to be loaded.
    • Use BatchCat to create MFHDs.
    • After the load is complete, FTP the following discard and reject files for manual cleanup:
      • discard_pre.mrc
      • setB.mrc
      • serD.mrc
      • marcive-rejects
    • FTP the dead_serials.L file and use the list of BIB IDs to identify records to be updated.
    • After manual cleanup and record updating is complete, run the cleanup program that identifies unwanted duplicate records. Type "marcive_cleanup run". Start this job in the early morning.
    • Retrieve the file delete.list and use BatchCat to delete the records.
    • Retrieve the file problem.list and resolve ambiguous matches manually.

    NETLIBRARY

    • Run this job monthly or whenever OCLC makes new files available. There are two kinds of files: owned and unowned. Records "purchased by our PDA" are owned. Records "added to our PDA" are unowned.
    • Follow the instructions in the e-mail announcement to retrieve the file from OCLC. FTP the file to the Vendor Records / NetLibrary folder on the Library 30.
    • In the F-Secure client, type "netowned <filename>" to load owned records and "netunowned <filename>" to load unowned records, where <filename> is the name of the file to be loaded.
      • Example: If the target file is called D030409.B7670.RECORDS.bin, the command to load this file will be either: netowned D030409.B7670.RECORDS.bin or netunowned D030409.B7670.RECORDS.bin
    • Clean up new records added during netowned loads (most will be duplicates). Resolve discards and rejects manually.

    OACIS

    • Run this job quarterly (February, May, August, November) on the first Wednesday of the month.
    • Type "oacis_ex run".
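
Several jobs in this procedure are scheduled by rules such as "the first Wednesday of the month". As a scheduling aid, the OACIS run dates for a given year can be computed as shown below. This is a sketch; `first_weekday` and `oacis_run_dates` are hypothetical names, and holidays or closures are not taken into account.

```python
from datetime import date, timedelta
from typing import List

WEDNESDAY = 2  # date.weekday() numbering: Monday is 0
OACIS_MONTHS = (2, 5, 8, 11)  # February, May, August, November


def first_weekday(year: int, month: int, weekday: int) -> date:
    """Return the first date in the month that falls on `weekday`."""
    first = date(year, month, 1)
    return first + timedelta(days=(weekday - first.weekday()) % 7)


def oacis_run_dates(year: int) -> List[date]:
    """Return the four quarterly OACIS run dates for `year`."""
    return [first_weekday(year, month, WEDNESDAY) for month in OACIS_MONTHS]


for run_date in oacis_run_dates(2009):
    print(run_date.isoformat())
```

For 2009 this gives February 4, May 6, August 5, and November 4.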

     

    OCLC CJK (Suspended Nov. 2004)

    OCLC EXPORT (Replaced by cron job Apr. 2008)

    OCLC INSTITUTION RECORDS

    • Run this job on Thursday mornings to extract and export the previous week's new and updated catalog records.
    • Type "oclc_institutional <Date 1> <Date 2>" where:

      • <Date 1> is the first create/update date of the weekly extract, entered in the form dd-mmm-yy.
        Example: 07-jan-08
      • <Date 2> is the last create/update date of the weekly extract, entered in the form dd-mmm-yy.
        Example: 13-jan-08
      • Example: To run this job on Thursday, Jan. 17, 2008, the command will be: oclc_institutional 07-jan-08 13-jan-08
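
The weekly date range can be derived from the run date. The sketch below assumes that "the previous week" means the Monday through Sunday before the week of the run, which is inferred from the example above (a run on Thursday, Jan. 17, 2008 covers Jan. 7 to Jan. 13). The function names are hypothetical.

```python
from datetime import date, timedelta

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def dd_mmm_yy(day: date) -> str:
    """Format a date as dd-mmm-yy, for example 07-jan-08."""
    return f"{day.day:02d}-{MONTHS[day.month - 1]}-{day.year % 100:02d}"


def oclc_institutional_command(run_date: date) -> str:
    """Build the command for the Monday-Sunday week before the run week."""
    this_monday = run_date - timedelta(days=run_date.weekday())
    first = this_monday - timedelta(days=7)  # Monday of the previous week
    last = this_monday - timedelta(days=1)   # Sunday of the previous week
    return f"oclc_institutional {dd_mmm_yy(first)} {dd_mmm_yy(last)}"


print(oclc_institutional_command(date(2008, 1, 17)))
# oclc_institutional 07-jan-08 13-jan-08
```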

    OCLC PASSWORD

    Change our OCLC EDX password every 8 weeks on Wednesday morning (1/7/04, 3/31/04, 6/23/04, etc.).

    • Open an FTP session for the OCLC EDX client:
    Host Name/Address: edx.oclc.org
    Host Type: IBM MVS
    User ID: TCOO1
    • In the Password box, type: "[current password]/[new password]/[new password]".
    • Example: If the current password is D4E5F5 and the new password will be D4E5F6, type: D4E5F5/D4E5F6/D4E5F6
    • Verify that new password works properly.
    • Send new password to Peter Hoyt. He will update the FTP tables for our OCLC batch jobs (OCLC Export, Batch Delete, Recon, etc.)

    OCLC SYSTEM NUMBERS (Replaced by cron job Sept. 2008)

    PROQUEST UPDATES (Suspended)

    PURL UPDATES (Suspended)

    RECON    (Completed Aug. 2005)

    RLIN EXPORT (Suspended)

    RPS (RUSSIAN PRESS SERVICE) (Suspended Feb.2006)

    SEARCHING ISBNs AGAINST OCLC

    • This is not a CTSBULK operation, but is used in conjunction with CTSBULK vendor loads.
    • Launch OCLC Connexion.
    • Choose the "Export" tab under Tools / Options.
    • Click on "Create". Choose the "File" radio button and click "OK".
    • Go to "Library30\input\vendorrecords\[jobname]".
    • Name the file using the same name as the marc file. A ".dat" extension will be automatically added to the file name (e.g. corn1234.dat).
    • Click on "Open", then "OK", then "Close".
    • Choose Tools / Local File Manager / Create File button.
    • Enter the name of the file to be processed (e.g. corn1234.dat). A ".bib" extension will be automatically added to the file name.
    • Click on "Open", then "Set as default", then "Close".
    • Click on "Batch", then "Enter", then "Bibliographic Search Keys".
    • Click on "Import", then "Browse", then "wanted.txt" file, then "Open".
    • Click on "OK", then "No", then "Save", then "Close".
    • Choose Tools / Options / Batch
    • In Tools/Options/Batch make sure "Maximum Number of Matches per Search" is set to "1" and click "OK".

    • Click on "Batch", then "Process Batch". Choose the "Local file" you want to process.
    • Check the "Online Searches" box, then "OK" (searches will start).
    • Close after the search is completed.
    • Click on "Cataloging/Search/Local Save File" and when the box appears click "OK."
    • Highlight all records. When repeating this step, highlight only records that were not exported before.
    • Click on "Export".
    • Choose View / Bibliographic Search Report. Scroll down to find the section "Too many matches" and note the highest number of matches given.
    • Choose Tools / Options / Batch Processing.
    • Set "Maximum number of matches per search" to the highest number listed in the bibliographic search report (see above). Click "OK".
    • Repeat previous steps from: Click on "Batch", then "Process Batch". Be sure to select appropriate records if there are several titles with the same ISBN. Do so with book in hand when necessary.

    SERIAL SOLUTIONS UPDATES

    • Separate files are loaded for electronic journals and books on a monthly basis.
    • Update the ERM with new aggregators and 899 codes.
      • Export the provider table into: \\Library30\lts\input\vendorRecords\SerialSolutions
    • On Library45 in the ctsbulk space, type "serial_solutions_journals [enforce|filter] <filename>" where filename is the specific name of the e-journals file to be loaded; type "serial_solutions_ebooks <filename>" where filename is the specific name of the e-books file to be loaded. Using enforce causes the job to halt if a resource name is not listed in the provider table. Using filter causes records that are not in the provider table to be written to a file and discarded.
      • Example: If the Serial Solutions file is called 20040830.mrc, the command will be: serial_solutions 20040830.mrc
    • Use BatchCat to delete obsolete records:
      • Deletes that are reported out will be examined by ERSM staff for appropriateness.
      • Records that need to be deleted will be sent to DQE for batch deletion.

    SERIES NOT STARTED FLIP

    • Run this job on the first Wednesday of the month.
    • Type "490_flip <Date 1> <Date 2>" where:
      • <Date 1> is the earliest date in the target range, entered in the form YYYYMMDD. Example: 20030301
      • <Date 2> is the latest date in the target range, entered in the form YYYYMMDD. Example: 20030331
      • Example: To flip the 490 0 fields for all cataloging done in March 2003, the command will be: 490_flip 20030301 20030331
    • FTP the "035.L" and "490.L" discard files (if any records appear there) and the "Report of MARC 830 fields." Open using Excel and perform manual cleanup, as appropriate.

    WEEDING EXCESS 948s

    • Run this job late afternoon on the first Friday of the month.
    • Type "cleanup_948 <YYYYMMDD>" where <YYYYMMDD> is the first day of the month three months ago.
      • Example: If the job is run on May 4, 2007, the command will be: cleanup_948 20070201
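
The argument is the first day of the month three months before the run date. A minimal sketch of that calculation follows; `cleanup_948_command` is a hypothetical name used only for illustration.

```python
from datetime import date

MONTHS_BACK = 3  # "three months ago", per the procedure


def cleanup_948_command(run_date: date) -> str:
    """Build the command using the first day of the month three months back."""
    index = run_date.year * 12 + (run_date.month - 1) - MONTHS_BACK
    year, month_zero_based = divmod(index, 12)
    return f"cleanup_948 {year:04d}{month_zero_based + 1:02d}01"


print(cleanup_948_command(date(2007, 5, 4)))  # cleanup_948 20070201
```

Working in whole months avoids errors around year boundaries: a run in February 2008 yields 20071101.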

    WITHDRAWALS COUNT

    Run this job on the first Thursday of the month.

    Type "withdrawn <Date 1> <Date 2>" where:

    • <Date 1> is the first date covered by the file, entered in the form YYYYMMDD. Example: 20030501
    • <Date 2> is the last date covered by the file, entered in the form YYYYMMDD. Example: 20030531
    • Example: To run the report for May 2003, the command will be: withdrawn 20030501 20030531

    YANKEE

    • Run this job on Thursday mornings.
    • Type "yank <mmddyy>" where <mmddyy> is the 2nd half of the filenames to be loaded.
      • Example: If the target file is called YANK050801.dat, the command to load this file will be: yank 050801
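
The argument can be taken directly from the file name. The sketch below assumes that file names follow the pattern of the single example above ("YANK", six digits, ".dat"); `yank_command` is a hypothetical helper, and any other naming pattern is rejected rather than guessed at.

```python
import re

# Pattern assumed from the one example in the procedure: YANK050801.dat
YANK_NAME = re.compile(r"YANK(\d{6})\.dat", flags=re.IGNORECASE)


def yank_command(filename: str) -> str:
    """Extract the six-digit part of the file name and build the command."""
    match = YANK_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected Yankee file name: {filename!r}")
    return f"yank {match.group(1)}"


print(yank_command("YANK050801.dat"))  # yank 050801
```
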
Imported Node Type: 
Procedure
Procedure Info
LTS Procedure Number: 
None
LTS Procedure Category: 
Database Maintenance