Frequently Asked Questions
Can I use documents in my project?
You can use these materials for a non-commercial project if it falls under ‘Fair Use.’ Please see
Copyright and Fair Use for more information.
How do I save a document I like?
To save a document, you can "bookmark" it or download the document PDF to your computer.
- When you "bookmark" a document, you are saving the document RECORD to the Bookmarks area. This record contains pertinent information about the document such as title, author and date as well as the unique permanent ID for the document so you can easily retrieve it. Once you have finished bookmarking the documents you like, navigate to your Bookmarks area to download, email, and/or cite those saved records.
- Most documents in our library have a PDF attached that you can read and download. The PDF is not saved when you bookmark a document, just the record. To save the PDF, with or without its attached record, choose the "download" button under the thumbnail image in the search results OR use the Actions drop-down menu from the individual document viewer.
Can I download a large batch of PDF documents all at once?
Please contact us at industrydocuments@ucsf.edu if you are interested in large downloads of PDFs. For batch downloads of records/metadata, you can query our Solr index directly using an API. We also provide data sets containing metadata and OCR text for our entire corpus. Please see
Industry Documents Library API and Data Set for more information and documentation.
Do you make your entire Data Set available? Is there an API?
You can query our Solr index directly using an API. We also provide data sets containing metadata and OCR text for our entire corpus. Please see
Industry Documents Library API and Data Set for more information and documentation.
What is the format for a date query?
The date query can be expressed as YYYYMMDD.
- The year (YYYY) must be between 1760 and the current year.
- The month (MM) must be between 01 and 12 (note the leading zero).
- The day (DD) must be between 01 and 31 (note the leading zero).
An example of a valid date query: dd:19810123 or dd:[19810123 TO 20101111]
An example of an invalid date query: dd:00000000 or dd:[00000000 TO 19899999]
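The rules above can be checked before a query is submitted. The following is a minimal sketch, not the library's own validation: the function names are hypothetical, and it tests only the ranges stated above (it does not reject impossible calendar dates such as February 31, and it does not check that a range start precedes its end).

```python
import re
from datetime import date

MIN_YEAR = 1760  # lower bound for the year, as stated in the FAQ


def is_valid_date_term(value, today=None):
    """Return True if value is an 8-digit YYYYMMDD string within the stated ranges."""
    if not re.fullmatch(r"[0-9]{8}", value):
        return False
    year, month, day = int(value[:4]), int(value[4:6]), int(value[6:])
    current_year = (today or date.today()).year
    return (
        MIN_YEAR <= year <= current_year
        and 1 <= month <= 12
        and 1 <= day <= 31
    )


def is_valid_date_query(query, today=None):
    """Check a single dd:YYYYMMDD term or a dd:[YYYYMMDD TO YYYYMMDD] range."""
    single = re.fullmatch(r"dd:([0-9]{8})", query)
    if single:
        return is_valid_date_term(single.group(1), today)
    ranged = re.fullmatch(r"dd:\[([0-9]{8}) TO ([0-9]{8})\]", query)
    if ranged:
        return all(is_valid_date_term(v, today) for v in ranged.groups())
    return False


for q in ["dd:19810123", "dd:[19810123 TO 20101111]",
          "dd:00000000", "dd:[00000000 TO 19899999]"]:
    print(q, is_valid_date_query(q))
```

With the four example queries from this answer, the first two pass and the last two fail: 00000000 has a year, month, and day outside the allowed ranges, and 19899999 has month 99 and day 99.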
How do I search for variant spellings of names or terms (fuzzy search)?
Fuzzy search is very useful in searching for names or terms when you are not sure how they are spelled in the documents or you think a word might be misspelled in the text.
Use the ~ operator at the end of a term, for example: teen~
A fuzzy search like teen~ searches for words that are spelled similarly to teen. Similarity is measured by "edit distance": the number of single-character edits needed to turn one word into the other. Each edit is an insertion (teens), a deletion (ten), or a substitution (teem).
You can specify the maximum edit distance you want. For instance, teen~1 will only return words that are at most 1 edit away from teen.
cigarettes~ will return documents where the term is spelled as cigaretes (1 edit away) or cigarretes (2 edits away). cigarettes~1 is a little more conservative and matches only spellings that are 1 edit away, such as cigaretes.
If you do not specify a number, the system searches for teen~0.5, which returns words that are about 50% similar to teen (in this case, up to 2 edits away).
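To make "edit distance" concrete, here is a minimal sketch that counts insertions, deletions, and substitutions between two words (the classic Levenshtein distance). It is an illustration only: the function names are hypothetical, and the search engine's own matching may differ in details, for example in how it treats two transposed letters.

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def fuzzy_matches(term, candidates, max_edits=2):
    """Return the candidates within max_edits of term, in input order."""
    return [w for w in candidates if edit_distance(term, w) <= max_edits]


words = ["teens", "ten", "teem", "tin", "green"]
print(fuzzy_matches("teen", words, max_edits=1))  # like teen~1
print(fuzzy_matches("teen", words, max_edits=2))  # like teen~2
```

In this sketch, max_edits=1 keeps only teens, ten, and teem, while max_edits=2 also admits tin and green, which shows why a smaller number gives more conservative results.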
How do you identify "Potential Duplicates"?
Potential duplicates are identified when a document matches another in the following fields:
collection, title, documentdate, pages, availability
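If you work with downloaded metadata, a similar grouping can be approximated locally. The sketch below assumes each record is a dictionary keyed by the field names listed above plus a hypothetical "id" key, and it assumes that a match means exact equality on all five fields; the library's actual comparison (for example, any normalization of titles) is not described here.

```python
from collections import defaultdict

# Fields named in the FAQ as the basis for duplicate detection.
DUPLICATE_FIELDS = ("collection", "title", "documentdate", "pages", "availability")


def find_potential_duplicates(records):
    """Group record ids whose values agree on every field in DUPLICATE_FIELDS.

    Returns a list of id lists; only groups with two or more records are kept.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record.get(field) for field in DUPLICATE_FIELDS)
        groups[key].append(record["id"])  # "id" is an assumed key name
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample records for illustration only.
sample = [
    {"id": "a1", "collection": "X", "title": "Memo", "documentdate": "19810123",
     "pages": 2, "availability": "public"},
    {"id": "b2", "collection": "X", "title": "Memo", "documentdate": "19810123",
     "pages": 2, "availability": "public"},
    {"id": "c3", "collection": "X", "title": "Memo", "documentdate": "19810123",
     "pages": 3, "availability": "public"},
]
print(find_potential_duplicates(sample))
```

Here the first two sample records form one group, and the third is left out because its page count differs.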
What is the "More Like This" feature when viewing a document?
"More Like This" returns documents that are similar to the currently viewed document.
The recommendations are based on matches in title and author, with slightly higher weight given to title.
What is the "Previous/Next Bates" feature when viewing a document?
"Previous/Next Bates" allows you to view documents in order of Bates number, a sequential number stamped on most litigation documents.
What is the "Browse" feature when viewing a document?
"Browse" allows you to view the documents in the order they were ingested into the archive as a part of a contextual set.
Why is some text blocked out?
Some documents have "redactions": a black or white box or black highlighting that makes the original text unreadable. These redactions cover personally identifiable information that is withheld from public view because of privacy concerns.