Hi, this is Professor Charles Evans, and this
is my third take
on my notes for History 218,
and I'm going to be talking a little bit
today about data mining and 
text mining. These are subjects in which
I am not really an expert, and so I'm just going to say a couple of words about them.
Data mining and text mining
are a subset of what historians now call big data,
and big data is exactly what it sounds like: big data.
Millions and millions and millions
of pieces of 
data in some sort of database,
and then the task is to design some
sort of software
or what's called an algorithm to mine
that data 
and extract from that data useful
historical
information. That's something that's a
little bit 
beyond my capabilities, the ability to come
up with the software or the algorithm to
mine data, but increasingly those kinds
of things are appearing as
web-based tools that allow you to work
with
large amounts of data. So for example if
you were looking
through thirty or forty years
of newspaper material, you would need some sort of computer program that will sort
through all that
and give you useful data that you can interpret and 
analyze.
Now one successful approach to data mining
is what the Digital Scholarship
Lab at the University of Richmond did with
the
Richmond Dispatch. They analyzed the
Richmond Dispatch
through the Civil War years, 1861 to
1864, looking for patterns
that could give some sort of interpretation
about what the newspaper was covering, what the topics were,
what was important at the time. So
that's really the essence of what
data/text mining is all about:
analyzing huge amounts of text.
If you were a historian trying to go
through page by page by page by page
of years of a newspaper, and you were trying to summarize and analyze it by yourself
with just the physical newspaper you would be
swamped and overwhelmed.
With the computer and with digital data
you're able to do that in a 
much more coherent, quick, and useful manner.
So that's what data/text mining is all
about. So that's what we're 
introducing you to in this unit of HIS218.
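To make the newspaper example a little more concrete, here is a minimal sketch in Python of the kind of program that could sort through articles and report which words come up most often in each year. It is only an illustration under stated assumptions: the three sample articles, the stopword list, and the function names `tokenize` and `counts_by_year` are all hypothetical, and a simple word count like this is not meant to represent the method the Richmond lab actually used.

```python
import re
from collections import Counter

# Hypothetical sample data: (year, article text) pairs standing in for
# digitized newspaper articles. Real work would load thousands of these.
ARTICLES = [
    (1861, "The army marched north. The army needs volunteers."),
    (1862, "Prices of flour and salt rose again in the city markets."),
    (1862, "The army fell back; hospitals in the city are full."),
]

# A tiny, assumed list of common words to ignore; real lists are much longer.
STOPWORDS = {"the", "of", "and", "in", "are", "a", "to"}


def tokenize(text):
    """Lowercase the text and return its words, minus the stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def counts_by_year(articles):
    """Return a dict mapping each year to a Counter of word frequencies."""
    result = {}
    for year, text in articles:
        result.setdefault(year, Counter()).update(tokenize(text))
    return result


if __name__ == "__main__":
    # Print the three most frequent words for each year, in year order.
    for year, counter in sorted(counts_by_year(ARTICLES).items()):
        print(year, counter.most_common(3))
```

The historian still has to do the interpreting. The program only surfaces which words are frequent in which years, and it is up to you to decide what that says about what the newspaper was covering.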
Now since we're talking about text 
in this unit, the optional assignment
that I have for this unit is a
historical translation assignment in which
I want you to work with web-based
translation tools because there are
actually quite a few of them right now
where you can plug in phrases,
excerpts, large excerpts, even up to pages of foreign text 
and get the translation. Now the translation
is not always very good. It's 
got to be honed and improved, 
but it gives you a rough kind of translation that you can work with. And so even if you're not
really an expert, for example, in German, you can get an 
idea of what the German text means and
then use a dictionary or something to further
refine it. And there are a lot of those tools
available now.
And some of those are listed here, and so
with this assignment what I do is give students a piece of text
and ask them to translate it. It's pretty simple, but
translating historical materials is an important thing for historians
to be able to do. And so it fits with
this unit since we're dealing with text and 
translation of text, and so it all kind of comes together.
Okay, thanks.
