Text-mining and more

Ethical Issues; Intellectual Property Issues

Data Mining Policies for Library Database Vendors (Examples)

Here's a good overview (it's a little old) of library publisher/vendor issues with data mining from SPARC (the Scholarly Publishing and Academic Resources Coalition; check them out!). See also the General Data Mining Resources box on the right side of this guide; it includes links to information about database API and data-mining policies.

Here's another from a European perspective:

The Hague Declaration aims to foster agreement about how to best enable access to facts, data and ideas for knowledge discovery in the Digital Age. By removing barriers to accessing and analysing the wealth of data produced by society, we can find answers to great challenges such as climate change, depleting natural resources and globalisation.

OCR and Language Issues

In order to analyze big data, you need to be able to read it. Poor-quality OCR (optical character recognition), hard-to-read handwriting, and differing language conventions can all cause difficulties.
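
As a rough illustration of why OCR quality matters, the sketch below estimates how much of a text is made up of recognizable words. It is only a minimal example, not a standard method: the function name `recognizable_share`, the tiny vocabulary, and the sample sentences are all made up for illustration, and a real check would use a full word list for the language of the corpus.

```python
import re


def recognizable_share(text, vocabulary):
    """Return the fraction of alphabetic tokens in `text` found in `vocabulary`."""
    # Lowercase the text and keep only runs of letters, so digits and
    # punctuation introduced by OCR errors split or shorten the tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocabulary)
    return known / len(tokens)


# Hypothetical, deliberately tiny vocabulary for the example.
vocabulary = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}

clean = "The quick brown fox jumps over the lazy dog"
noisy = "Tbe qu1ck brovvn fox jurnps ovcr the 1azy dog"  # simulated OCR errors

print(recognizable_share(clean, vocabulary))
print(recognizable_share(noisy, vocabulary))
```

A low score on a sample of pages suggests that word counts, searches, and other text-mining results from that corpus should be treated with caution. Historical spellings, names, and other languages will also lower the score, so it is a warning sign rather than a measurement of OCR accuracy.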

Dataset/Corpus Quality Issues

General Data Mining Resources

WSU Libraries, PO Box 645610, Washington State University, Pullman WA 99164-5610, 509-335-9671