Searching for Duplicate Titles in EBook Packages

With Create Lists and Global Update you can identify some titles that are possibly duplicated within ebook packages (for example, an earlier record and a later record that was intended to replace the earlier one).

Searching on the Titlekey index in Create Lists builds a review file that you can then examine in Global Update to spot possible duplicate titles.

To run a Create Lists search using the Titlekey index:

  • Choose a review file and click “Search Records.” The usual search window opens.
  • Name the review file.
  • Store record type: Bibliographic.
  • In the dropdown menu where it says “Range,” pull down to “Index.” Next to that, select your index, “Titlekey.” The boxes to the right set the alphabetical range of the titles you retrieve. For example, to retrieve all titles that begin with a or b, type a and b in the boxes.

Create Lists Titlekey Search

Proceed with the rest of your search strategy. It is usually most effective to add limiting criteria, such as the 928 code for a package (or for a group of packages) or the 001 prepend. This keeps the review file considerably smaller and makes it easier to look through later.
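
If it helps to picture what those extra criteria do, here is a minimal sketch in Python. It is not part of Create Lists: the record layout, the package codes, and the 001 prefixes are all invented for illustration, and `narrow` is a hypothetical helper name.

```python
# Hypothetical exported records; every value here is invented for illustration.
records = [
    {"001": "spr0001", "928": "PKG-A", "245": "Adapting to an uncertain climate"},
    {"001": "spr0002", "928": "PKG-B", "245": "Coastal hazards"},
    {"001": "ebr0003", "928": "PKG-A", "245": "Sea level rise"},
]


def narrow(records, package_codes=None, prefix=None):
    """Keep records whose 928 is in package_codes and whose 001 starts with prefix.

    Either criterion is skipped when it is not supplied.
    """
    kept = []
    for rec in records:
        if package_codes and rec.get("928") not in package_codes:
            continue
        if prefix and not rec.get("001", "").startswith(prefix):
            continue
        kept.append(rec)
    return kept


# Only records in package PKG-A whose 001 begins with "spr" remain.
subset = narrow(records, package_codes={"PKG-A"}, prefix="spr")
```

The fewer records that pass the criteria, the shorter the title list you have to scroll through in Global Update.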

Click “Search,” and let your file run.

Once your file has finished running, look at it in Global Update:

  • Click on the Global Update icon on the left.
  • Find your review file by title. Click Search.
  • The Title listing should come up. If it doesn’t, right-click on the Toggle icon next to the # ENTRIES column and navigate to Title under Bibliographic Variable Fields.

Global Update Title Display

  • Scroll down the list of titles, checking each title that shows more than one entry in the # ENTRIES column.

For example, the title “Adaptation to climate change and sea level rise” has a 2 next to it in the # ENTRIES column. That means two records in the file share the same titlekey.
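
The # ENTRIES count amounts to a tally of records per key. The sketch below shows that idea in Python, assuming you had a list of (record number, titlekey) pairs to work from. The record numbers are invented, the keys are simplified placeholders rather than real titlekey values, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical (record number, titlekey) pairs; record numbers are invented
# and the keys are placeholders, not actual titlekey index values.
entries = [
    ("b1000001", "adaptation to climate change and sea level rise"),
    ("b1000002", "adaptation to climate change and sea level rise"),
    ("b1000003", "adapting to an uncertain climate"),
]


def entry_counts(entries):
    """Tally how many records share each key (the analogue of # ENTRIES)."""
    return Counter(key for _, key in entries)


def possible_duplicates(entries):
    """Return {key: [record numbers]} for keys that occur more than once."""
    counts = entry_counts(entries)
    groups = {}
    for recnum, key in entries:
        if counts[key] > 1:
            groups.setdefault(key, []).append(recnum)
    return groups


dupes = possible_duplicates(entries)
```

Here `dupes` holds one key with two record numbers, which corresponds to a title showing 2 in the # ENTRIES column.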

A couple of caveats:

One, having a matching titlekey does not guarantee that the records are duplicates, but it’s a pretty good predictor.

Two, the titlekey does not normalize or convert capital letters when deciding whether something is a duplicate or not. In the above example, “Adapting to an uncertain climate: |b lessons from practice” is not pinpointed as a duplicate of “Adapting to an Uncertain Climate |h [electronic resource]: |b Lessons From Practice.” This is probably because of the |h [electronic resource] in the second title.
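
If you export titles and want to catch pairs like this one outside the system, a looser comparison key can help. The Python sketch below is an assumption-laden illustration, not the Titlekey index's own algorithm: `comparison_key` is a hypothetical helper that drops the |h subfield, removes the remaining subfield delimiters and punctuation, and ignores case.

```python
import re


def comparison_key(title):
    """Build a loose key for comparing titles (not the Titlekey algorithm)."""
    # Drop subfield h and its contents, up to the next delimiter or the end.
    no_gmd = re.sub(r"\|h[^|]*", "", title)
    # Replace remaining subfield delimiters such as |b with a space.
    no_delims = re.sub(r"\|[a-z0-9]", " ", no_gmd)
    # Ignore case, then turn punctuation into spaces.
    no_punct = re.sub(r"[^\w\s]", " ", no_delims.casefold())
    # Collapse runs of whitespace.
    return " ".join(no_punct.split())


first = "Adapting to an uncertain climate: |b lessons from practice"
second = (
    "Adapting to an Uncertain Climate |h [electronic resource]: "
    "|b Lessons From Practice"
)

same = comparison_key(first) == comparison_key(second)
```

With this key the two example titles compare as equal, so `same` is true. A key this loose will also match some titles that are not duplicates, so each match still needs a look at the records themselves.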

For more information on the Titlekey index and how it works, see http://csdirect.iii.com/manual/Default.shtml#gmil_search_indexes_heading.html#Titlekey%20Index

(You’ll have to sign in to csdirect with our library userid and password to read this.)