Welcome, avid learner of linguistics! In
this video I want to introduce you to
the wonderful world of corpora, which
can serve as a really helpful tool when
it comes to research in linguistics. I
recall reading a text about
grammatical change by Laurie Bauer in
which he describes how he tried to
figure out how the rule for comparing
adjectives changed over time. What he
did is he read through lots and lots of
newspapers to scan them for adjectives
in the comparative, to see whether
two-syllable words had an "-er" added
or were preceded by the word "more".
But, poor guy, his job would have been so much
easier had there been computers and digital
text around at the time those newspapers
were printed. So what's the point of this,
though?
Well, what Bauer did was he basically
treated those newspapers like a corpus.
Alright, what is a corpus? If you know
me, you know that I like to explain words
by saying where they come from, so here
you go.
I'm so predictable. The word "corpus" comes
from Latin and means "body", and
incidentally
it's the same root as for the word "corpse",
for obvious reasons, I guess.
So a corpus is a body, we've established that,
but what kind of body? Come to think of
it, using the word "body" is really just a
metaphor. I think I'll just post a link to
my metaphor video somewhere around here.
Definition: a corpus is a collection of
authentic texts commonly used for the
purpose of research. Yes, it's that simple.
By "authentic" we mean that the texts were written by
native speakers. And in a way, really
everything can be a corpus: newspapers,
novels, recipes, Facebook posts, tweets, you
name it. How does it all work? Well, let's
assume you're a massive Justin Bieber
fan, dear viewer, which could be about
accurate if I ever look at the demographics
of my videos. So you're a Justin Bieber fan
and you want to dedicate your time to
analyzing his linguistic prowess.
What you could do is gather
every single tweet he has ever written
and save them in one file. Now this is
your corpus: a very simple one, but
it's fully functional and ready, and with the
right software you can now look at
things in your Belieber corpus such as
word frequency, collocation or even
concordance. But what are those three
things? I'm so glad you asked.
Let me show you that with an example. I
have not compiled a Justin Bieber corpus,
though, because f*ck it, time's too precious.
I'll just take a simple text file, one of my
literary studies assignments, and run it
through a free program called AntConc.
Right, word frequency helps you check how
often a word appears in a text. This can
be really useful because it allows
you to make statements about the
register and audience of the text. My
most frequently used words here: "the", "of", "to"
and "a", and then the words "Rivers" and "shell shock", which isn't much of a surprise
because the essay actually was about
shell shock and a shell-shock
therapist called Rivers. In collocation,
you can look at the kinds of words that
seem to appear in close proximity;
they sort of form pairs. In authentic
text this will tell you a lot about
which words are more likely to combine
in a language. In my example here I screened
my corpus for collocates of the word
"shell shock". Here it tells us how often
these collocations appear and whether the
collocate is to the left or to the
right of my search item.
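
If you'd like to see the idea behind those first two tools in code, here is a minimal Python sketch. It is not how AntConc works internally, just an illustration under a few assumptions: the sample text is made up for the demo, the tokenizer is a very simple one, and the function names `tokenize` and `collocates` as well as the `window` parameter are my own. To keep things short, the search item is the single word "shock" rather than the two-word phrase.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def collocates(tokens, node, window=1):
    """Count the words up to `window` tokens to the left and right of `node`."""
    left, right = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left.update(tokens[max(0, i - window):i])
        right.update(tokens[i + 1:i + 1 + window])
    return left, right


# Invented sample text, standing in for a real corpus file.
sample = (
    "Rivers treated shell shock at the hospital. "
    "The doctors called it shell shock, and Rivers studied shell shock closely."
)

tokens = tokenize(sample)

# Word frequency: how often does each word appear?
freq = Counter(tokens)
print(freq.most_common(4))

# Collocation: which words sit directly next to the search item?
left, right = collocates(tokens, "shock", window=1)
print("left of 'shock': ", left.most_common())
print("right of 'shock':", right.most_common())
```

With a real corpus you would read the text from your file instead of using `sample`, and you would probably want a bigger `window`.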
Finally, concordance, which is probably
the most pleasing to OCD people because
not only does it make pretty, pretty
columns, it also allows you to see what bigger
chunks of language surround your search item in the corpus. Really cool. So as
you see, corpus analysis has lots of
benefits. First of all, it deals with
relatively objective data and can be
easily carried out with huge quantities
of text. Also, it's a descriptive method,
and that's always a plus, isn't it? Also, working
with corpora can be really beneficial
to high-proficiency language learners.
Not sure about the preposition you have to use
after "abide"? Just look it up in a corpus.
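
Here is roughly what such a lookup could look like as a tiny concordance (keyword in context) in Python. Again, this is only a sketch under assumptions: the three sample sentences are invented, and the function name `concordance` and the `width` parameter are my own, not anything taken from AntConc.

```python
import re


def concordance(text, node, width=20):
    """Return keyword-in-context lines with the search item lined up in a column."""
    lines = []
    pattern = r"\b" + re.escape(node) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every hit starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample sentences, standing in for a real corpus.
sample = (
    "Visitors must abide by the rules. "
    "We abide by the decision of the court. "
    "I cannot abide his constant humming."
)

kwic = concordance(sample, "abide", width=20)
for line in kwic:
    print(line)
```

Reading down the column to the right of the search item shows you at a glance which words tend to follow it.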
And here's a pro tip: for simple things
like this, even Google's frequency count
can be really helpful as well. So, moving
on: corpus linguistics grants insight
into really interesting fields of
linguistics, such as morphosyntactic
patterns, processes of language change or
even lexical traces of discourses. And
that's it, now you've got a general idea of
corpora.
Did you like the video? Please give it a
thumbs up and subscribe to my channel.
Wow, so when analyzing a text with
concordance I could actually find traces
of discourses? That's really, really neat!
But what's a discourse, though? Don't panic, avid viewer, I've got
you covered with this video. Just click
here.
