All right, great.
Welcome.
Thanks for your patience for
what's already been a long day.
And what I want to do in my time is
talk about two different data sets.
My discussion, the first half is going to
be organized around the Penn World Tables.
And the second half is going to be
organized around the decennial census data
in the US, and
the American Community Survey.
And in talking about these datasets
I want to give you some facts and
discuss some of my papers, and
some papers by other people.
And then, maybe more than Pete and
Shadman did, show you a little bit about
the programs for how to read these data.
And programs that I like to use in
MATLAB to manipulate these data sets.
Okay, so to begin I want to
talk about the Penn World Tables.
And I'll do that in the context of
some basic facts of economic growth.
Again, I'll describe the facts to
you in a couple of papers, and
then show you some MATLAB code.
So I use MATLAB for everything I do,
and I'll show you how to actually read the data.
Okay, so some facts of growth.
So here's just some basic pull from
the Penn World Tables of GDP per person,
where you normalize the value
in the United States to be 100.
So everything's relative
to the United States.
And so you can see stylized facts that
you're probably all familiar with,
which is Western Europe, and
Japan, are about 70% of the US.
Western Europe caught up a little
bit, maybe to 75% of the US. Japan
grew pretty rapidly until around 1990,
and then it's fallen a little
bit further behind, but
ending up in the same place
as Western Europe by the end.
Lots of other countries in this data set,
obviously, like Brazil or Russia.
Something really interesting,
I think, is Sub-Saharan Africa.
And if you look at the countries
in Sub-Saharan Africa in 1980,
they were 10% of the US.
And by 2000 they were down at 4% of
the US, so they fell further behind.
So it's like the US was
growing at 2% per year, and
incomes in Sub-Saharan Africa as
a whole were growing more slowly.
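A quick check on what those rounded shares imply, sketched in Python rather than the MATLAB used in the talk. The function name is made up for illustration. The only inputs are the 10% and 4% shares and the 2% US growth rate quoted above, so the output is back-of-the-envelope arithmetic, not a Penn World Tables estimate.

```python
import math

def implied_growth_gap(share_start, share_end, years):
    """Average annual growth gap (in logs) versus the benchmark country
    implied by a change in relative income over the given number of years."""
    return math.log(share_end / share_start) / years

# Sub-Saharan Africa: about 10% of the US in 1980 and 4% in 2000 (rounded, from the talk).
ssa_gap = implied_growth_gap(0.10, 0.04, 2000 - 1980)

# With the US growing at roughly 2% per year, the implied regional growth rate.
us_growth = 0.02
ssa_growth = us_growth + ssa_gap

print(f"growth gap versus the US: {ssa_gap:.2%} per year")
print(f"implied regional growth:  {ssa_growth:.2%} per year")
```

Taken at face value, a fall from 10% to 4% over twenty years is a gap of more than four percentage points per year, which with 2% US growth would mean falling income levels, not just slower growth.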
And so that region of the world
fell further behind. Since 2000,
interestingly, if you
look at this data set,
basically everyone has grown faster than
the United States, at least on this chart.
And so there's been sort of some catch-up
relative to the US since 2000.
And then finally China and India, two of
the real growth stars in recent decades.
So you see China in 1980 was just
about 7% of the United States.
By 2017, which is the last date in
the Penn World Tables currently,
it was around 25% of the US.
And this sort of kind of highlights one of
the advantages of the Penn World Tables,
which is it's using internationally
comparable prices, right?
So you find the price of blue jeans,
and apples, and rice, and oil,
in different countries.
So that you can compare the GDPs,
and getting the prices
right really does matter.
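To make the price comparison concrete, here is a toy Python sketch of a purchasing-power-parity conversion. Every number in it is hypothetical, and the single fixed basket is a simplifying assumption. The Penn World Tables rely on far more careful index-number methods than this.

```python
# All prices, quantities, and rates below are HYPOTHETICAL, for illustration only.
basket = {"blue jeans": 1, "apples": 10, "rice": 5, "oil": 2}                   # quantities
prices_us = {"blue jeans": 40.0, "apples": 1.0, "rice": 2.0, "oil": 3.0}        # dollars
prices_home = {"blue jeans": 120.0, "apples": 2.0, "rice": 4.0, "oil": 15.0}    # local currency

def basket_cost(prices, quantities):
    """Cost of buying the given quantities at the given prices."""
    return sum(prices[good] * quantities[good] for good in quantities)

cost_us = basket_cost(prices_us, basket)
cost_home = basket_cost(prices_home, basket)

# PPP exchange rate: local currency units needed to buy what one dollar buys in the US.
ppp_rate = cost_home / cost_us
market_rate = 6.0                      # hypothetical market exchange rate

gdp_per_person_local = 60000.0         # hypothetical, in local currency
gdp_ppp = gdp_per_person_local / ppp_rate
gdp_market = gdp_per_person_local / market_rate

print(f"PPP rate: {ppp_rate:.2f} local units per dollar")
print(f"GDP per person at PPP:          {gdp_ppp:,.0f}")
print(f"GDP per person at market rates: {gdp_market:,.0f}")
```

When local prices are low relative to the market exchange rate, as in this made-up case, converting at market rates understates how much people can actually buy.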
If you look at China today their
aggregate GDP is roughly the same size as
the United States,
when you do the PPP correction.
And since their population
is four times larger,
their GDP per person is
about a quarter as large.
But you can see that growing at 6 or
7% per year in recent years,
there's enormous catch up.
India started out around
maybe 4% of the US and
then since the early 1990s,
has also experienced very rapid growth.
And India is now, I think, growing even
more rapidly than China according
to the Penn World Tables.
But still only up to about 11 or
12% of the United States by 2017.
Okay, here's a different kind of graph,
and
I put this on here because I have some
nice MATLAB code which makes these graphs.
And I think they're very important
types of graphs to use in lots of
different data sets.
So here I'm showing you
GDP per person in 1960,
relative to the US, against GDP per
person in 2017,
relative to the US, with log scales for both axes,
and then labeling countries.
And this graph,
the labels look a little bit messy.
I've coded up a computer program
to help label data points, and
it tries to be as smart as it can.
But I'm not as good a computer
programmer as I should be, and so
my algorithm is not great.
And so, I'll show you some later graphs
where you can clean these up by hand,
you can move the labels around.
But this I wanted to show you,
this is what the algorithm can do
if you don't do any hand cleaning.
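For a sense of what an automatic labeler has to do, here is a minimal greedy sketch in Python. It is not the algorithm from the talk, whose details are not shown. The function names, box sizes, and country data are all assumptions made for illustration. It only keeps labels from overlapping each other, which is one reason hand cleaning would still be needed.

```python
import math

def boxes_overlap(a, b):
    """True if two axis-aligned rectangles (x0, y0, x1, y1) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def place_labels(points, width=0.30, height=0.12, gap=0.03):
    """Greedy label placement.

    points: list of (name, x, y) in plotting coordinates (logs here).
    Returns {name: (x0, y0, x1, y1)} giving each label's bounding box.
    """
    # Candidate positions relative to the dot: right, left, above, below.
    offsets = [
        (gap, -height / 2),
        (-gap - width, -height / 2),
        (-width / 2, gap),
        (-width / 2, -gap - height),
    ]
    placed = {}
    for name, x, y in points:
        candidates = [(x + dx, y + dy, x + dx + width, y + dy + height)
                      for dx, dy in offsets]
        # Take the first candidate that clears every label placed so far;
        # if none does, fall back to the first candidate and accept the clash.
        free = [c for c in candidates
                if not any(boxes_overlap(c, p) for p in placed.values())]
        placed[name] = free[0] if free else candidates[0]
    return placed

# Hypothetical relative incomes (US = 1) in two years, plotted on log scales.
countries = [("A", 0.10, 0.12), ("B", 0.11, 0.13), ("C", 0.50, 0.70)]
points = [(name, math.log(x), math.log(y)) for name, x, y in countries]
labels = place_labels(points)

for name, box in labels.items():
    print(name, tuple(round(v, 3) for v in box))
```

In this toy case the label for B cannot go to the right or left of its dot without hitting A's label, so the greedy rule moves it above the dot.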
I think the types of graphs that we show
in our paper are really, really important.
And that's I guess a theme that
comes through a lot of my research.
You want to convey your facts as clearly
as possible, and label all data points.
Legends are evil.
So sometimes people make these graphs and
they have a legend over here that says,
dark blue is the United States.
Medium blue is Western Europe, purple is
Brazil, and you have to go back and forth.
Horrible, take the time to
label your graphs nicely.
So that you and
your readers can really see,
to get the information
as quickly as possible.
So I'll come back and show you some
code for how to make graphs like this.
Now maybe on to some more facts,
again straight out of the Penn World
Tables, that are pretty interesting.
If you look at average annual
hours worked per worker,
this is per worker in
different countries over time.
An interesting fact here was
highlighted in the Boppart and Krusell
paper, and also in the Bick, Lagakos, and
Nicola Fuchs-Schündeln paper,
where they've used different data sets.
Boppart and Krusell went back
over time using the Maddison data.
And Bick, Lagakos, and Fuchs-Schündeln used a
lot of micro data from a lot of different
countries.
And you kind of see the same fact,
which is hours worked per worker
have been declining
all around the world. In the United
States in 1950, that's around 2,000.
Today I think it's around 1,800.
Japan went up and then declined; France and
the UK declined.
South Korea shows you one of
the themes of economic growth, which is:
in order to grow fast, you throw
lots of resources at the problem.
So, people start moving from
agriculture to manufacturing.
They start working a lot harder,
labor force participation rises,
education rises, capital rises.
So a big part of catch-up
growth is inputs.
You can see that in Korea, but
even in Korea hours worked has
really plummeted a lot since 1980.
And so this sort of hours worked
kind of data is fascinating.
And obviously, it says a lot about what
kind of macro models you want to write
down in order to match those facts.
Another interesting graph.
So the graphs I've shown you so far,
look at incomes treating the country
as a unit of observation.
And the Penn World Tables is great for
that, and it's very natural in many ways.
If you want to look at misallocation, or
why some countries are rich and
some countries are poor,
the country in many
ways is a natural focal point.
On the other hand,
if you care about what fraction of
the world's population lives in poverty,
if you care about just getting as many
people out of poverty as possible,
you may want to look at the distribution
of income by person, not just by country.
And so
this chart that I'm showing you here
simply weights each country's average GDP
per person by its own population.
And you can see these big yellow spikes
here; these are India and China, right?
So, with something like almost
a third of the world's population,
what happens in India and
China matters a ton for
what happens to poverty around the world.
And so,
in the first graph in green for 1960,
you can see that in 1960 two out
of three people lived in countries
with a per capita GDP less
than 5% of the 2017 US level.
So two out of three people in
the world lived on less than $7
a day in current prices in 1960.
And that changed in large part because of
the rapid growth of China and India.
China and India really take
a third of the world's population and
raise their living standards dramatically.
So that by 2017, this number had fallen
to less than 1 out of 12, right?
We went from 65% of the world
being impoverished to 8% of
the world's population being impoverished.
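The population weighting behind that statement can be sketched in a few lines of Python. The five-country world below is entirely made up, and the function name is hypothetical. The point is only the difference between counting countries and counting people.

```python
def population_share_below(countries, threshold):
    """Share of total population living in countries whose GDP per person
    is below the threshold.

    countries: list of (population, gdp_per_person) pairs.
    """
    total = sum(pop for pop, _ in countries)
    below = sum(pop for pop, income in countries if income < threshold)
    return below / total

# HYPOTHETICAL toy world: (population in millions, GDP per person relative to
# the 2017 US level). These are not Penn World Tables numbers.
toy_world = [(650, 0.02), (450, 0.03), (180, 0.30), (300, 0.20), (420, 0.04)]

threshold = 0.05  # 5% of the 2017 US level, as in the talk
weighted = population_share_below(toy_world, threshold)
unweighted = sum(1 for _, income in toy_world if income < threshold) / len(toy_world)

print(f"share of countries below the threshold: {unweighted:.0%}")
print(f"share of people below the threshold:    {weighted:.0%}")
```

Because the largest countries in this toy example are also the poorest, the share of people below the line is larger than the share of countries, which is the role India and China play in the actual chart.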
And that's in some sense maybe the most
important message of economic growth.
Economic growth is this massive engine for
lifting people out of poverty, and
that's kind of what you
see in this data set.
Now again, I mentioned that this treats
everyone in China as having the average
income in China and everyone in India
as having the average income in India.
And you can do better,
to do better requires distributional data.
And there's this great data set that
I'm sure you're all aware of now,
which is the World Wealth and Income
Database that Piketty, Saez, and Zucman and
various co-authors have put together,
where they've looked at micro datasets,
tax authority datasets in particular,
from all around the world,
to back out the distributional
numbers, including at the very top.
So it was interesting when Chapman was
talking about winsorizing data before,
cutting off the top 1% and the bottom 1%.
In many cases that's
the thing you want to do,
especially when you're worried
about measurement error.
But when you look at the income
distribution, that may be exactly the wrong
thing to do: all the action
may be in the top 1%.
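A small Python sketch of why this matters for top income shares. The income list is hypothetical and the quantile rule is a simple assumption. The code clamps values at the 1st and 99th percentiles, which is winsorizing in the strict sense. Dropping those observations outright would be trimming.

```python
def winsorize(values, lower=0.01, upper=0.99):
    """Clamp values to the given lower and upper quantiles.

    Quantiles are read off the sorted data at index int(q * (n - 1)),
    a simple rule chosen for this sketch.
    """
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[int(lower * (n - 1))]
    high_cut = ordered[int(upper * (n - 1))]
    return [min(max(v, low_cut), high_cut) for v in values]

def top_share(values, fraction=0.01):
    """Share of the total held by the top `fraction` of observations."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(round(fraction * len(ordered))))
    return sum(ordered[:k]) / sum(ordered)

# HYPOTHETICAL incomes: 99 people earn 1 and one person earns 101.
incomes = [1.0] * 99 + [101.0]

raw_share = top_share(incomes)
winsorized_share = top_share(winsorize(incomes))

print(f"top 1% share, raw data:        {raw_share:.1%}")
print(f"top 1% share, after clamping:  {winsorized_share:.1%}")
```

In this extreme example the top earner holds about half of all income, and clamping at the 99th percentile erases that entirely.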
And so that's been a theme that's emerged
from a lot of the work using tax data.
So you could do better by incorporating
inequality within countries as well.
There's other ways to do it in addition
to the world wealth and income database.
But the basic theme that's outlined
in this graph still emerges even when
you take the income distribution
within countries more seriously.
There's just been a massive
increase in the fraction
of people not in poverty since 1960, okay.
The next project I
want to talk about, where,
at least for the numbers I'm going to show you,
we used micro data in this project and
some macro data, is this
Beyond GDP project with Pete,
Where what we did is we
tried to take into account
things in addition to GDP or consumption.
So life expectancy,
leisure, and inequality.
So, one of the motivations for this came
straight out of the Penn World Tables and
came out of some numbers that
I've already alluded to today.
Which is, when you look at the Penn World
Tables, and look at GDP per person in
Western Europe, look at GDP per person in
Germany or France or the UK, the striking
thing that you see in the data is how
poor Germany, France, and the UK are.
So I'll take France because it's one
that we emphasized in this project.
France's GDP per person is about 70%
of the United States GDP per person,
and their consumption per
person is even lower.
When I go visit Europe, I don't see that.
It seems like people in France are pretty
well off, and so the question is,
what's being missed in this Penn
World Tables calculation relative to
casual introspection, or the more detailed
stuff we're going to look at here?
And one thing you appreciate is that well,
people in France live longer than
people in the United States.
I'll show you some numbers in a second
that look like the life expectancy in
France is about three years higher
than life expectancy in the US.
Leisure: people in France work less hard.
I showed you in an earlier
graph the annual hours worked.
And so they get more leisure in France,
sort of the famous taking
a month of vacation in August and 35-hour
work weeks, so there's more leisure.
That's going to reduce GDP,
if you're not working as hard,
you're going to produce less.
And yet, maybe that doesn't
reduce welfare at all, or
certainly not by the same amount
as the GDP would suggest.
And then finally, inequality in France
is much lower than inequality in
the United States from some
of these inequality datasets.
And so behind the veil of ignorance, you
don't know if you're going to be lucky and
be the rich person in France or
the United States or be unlucky and
be the poor person in France
or the United States.
That inequality would affect the welfare
of a representative agent behind the veil
of ignorance, right?
You want some insurance against
the fact that you might be unlucky and
be born poor, when the marginal
utility of consumption is very high.
And so
the lower inequality in France matters for
a welfare calculation behind
the veil of ignorance.
Even if you don't say that people care
explicitly about inequality after they
know their identity,
that's a separate question.
We're instead looking at sort of the basic
insurance motive before you know whether
you're going to be lucky or unlucky,
you might not like the inequality.
So what this graph does, so you can
see I'm labeling the dots again and
here it's much cleaner because I've moved
things around after running the program.
We got GDP per person again
on the horizontal axis,
huge income differences across countries.
That's something I probably should
have emphasized in the earlier graph.
More than 64-fold differences in
income, something like 100-fold
differences between the richest and
the poorest countries in the world.
It's just really a stunningly
large difference in income.
On the vertical axis, we've got
the welfare differences; it's
consumption-equivalent welfare.
So we put everything in consumption units,
and
then look at it relative
to the United States.
The US is one on both of these axes.
And the first thing that maybe
jumps out at you from this is
that income and
welfare are highly correlated.
The correlation's about 0.95,
I think, if I remember correctly.
And that says GDP is a good proxy for
welfare, and that's one of
the reasons why we use it so widely.
On the other hand, the dots systematically
deviate from the 45-degree line.
Look, France and Sweden are above, and a
bunch of poor countries tend to be below.
And so on the next graph I
show you that in more detail.
The horizontal axis is the same.
The vertical axis is the ratio
of welfare to income.
So if GDP and
welfare were exactly the same,
everyone would lie on
a horizontal line across at one.
And so what you could see, for
example, is that France is about
20% richer in consumption equivalent
welfare than it is in GDP.
And you can see over
here China is about half
as rich in welfare as it is in GDP, right?
So you get big differences in
consumption and welfare even though
they're highly correlated,
these corrections really matter.
And so probably the best way to
see that is with a simple chart.
So how to read this: lambda is
the consumption-equivalent welfare,
income is the GDP per person.
Again, everything in the US is
normalized to be 100 in this case.
And then the decomposition is additive.
If you use a log utility function for
consumption,
you get a nice additive
decomposition here.
And then in green, I've got data.
Right so life expectancy in
the US is 78 years roughly,
the consumption to GDP ratio is 85%.
Leisure per capita rather than
per worker is 836 hours per year,
and the standard deviation of log
consumption is about 0.6, right?
So here's France having income 70% of the
US, but their welfare is 90% of the US.
Why is that?
Well, look, life expectancy is
three years more, each year of life
expectancy in our calibration turns
out to be about 5% of consumption.
So an extra three years of life
expectancy raises France's
consumption equivalent welfare
by 15 percentage points, right?
And then you could see, I guess, sorry,
these are hours worked per capita,
not leisure.
It's the leisure contribution,
but hours worked, so
the US works more hours per person,
not per worker.
So you adjust for
labor force participation.
France gets more leisure,
so that's about 7%.
And France has lower inequality,
standard deviation of log
consumption is 0.47 instead of 0.65.
That's equivalent to about
a 10% increase in consumption.
And so France looks like 90% of the US.
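The France numbers can be added up in a short Python sketch. Two things here are assumptions rather than statements from the talk: the quoted percentages are treated as approximate log-point contributions, and the inequality term is computed as half the difference in variances, which is what log utility gives if consumption is lognormally distributed. The residual at the end is just arithmetic on rounded figures, not a number from the paper.

```python
import math

# Rounded figures quoted in the talk (US = 1).
income_ratio = 0.70       # France GDP per person relative to the US
welfare_ratio = 0.90      # France consumption-equivalent welfare relative to the US

# Life expectancy: three extra years at about 5% of consumption per year.
life_expectancy_term = 3 * 0.05

# Leisure: quoted as about 7%.
leisure_term = 0.07

# Inequality: ASSUMING log utility and lognormal consumption, expected log
# consumption equals log(mean) - sigma^2 / 2, so lower sigma raises welfare.
sigma_us, sigma_france = 0.65, 0.47
inequality_term = (sigma_us ** 2 - sigma_france ** 2) / 2

# The decomposition is additive in logs: log(welfare / income) = sum of terms.
total_log_gap = math.log(welfare_ratio / income_ratio)
quoted_terms = life_expectancy_term + leisure_term + inequality_term

# Whatever is left must come from terms not quantified above for France,
# such as the consumption share of GDP.
residual = total_log_gap - quoted_terms

print(f"inequality term:        {inequality_term:+.3f}")
print(f"sum of quoted terms:    {quoted_terms:+.3f}")
print(f"log(welfare / income):  {total_log_gap:+.3f}")
print(f"implied residual:       {residual:+.3f}")
```

Under these assumptions the inequality term comes out at about 0.10, in line with the 10% quoted, and the negative residual is consistent with France's consumption per person being even lower than its GDP per person relative to the US.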
If we go down here to
the poorest countries, so
let's take South Africa as an example.
Here, these are numbers for 2007.
You can see the devastating
effect of AIDS, right?
So life expectancy in South Africa
in 2007 was just 51 years.
And so if each year is 5% of consumption,
that difference massively
reduces welfare in South Africa.
So even though their GDP per
person is 17% of the US,
their welfare is only 5% of the US.
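As a rough check in Python, the life expectancy gap alone gets most of the way from 17% to 5%. This extends the "each year is about 5% of consumption" rule of thumb linearly over a 27-year gap, which is an assumption, and it ignores the consumption share, leisure, and inequality terms.

```python
import math

value_per_year = 0.05            # about 5% of consumption per year of life (from the talk)
life_gap_years = 51 - 78         # South Africa in 2007 versus roughly 78 in the US

# Contribution of life expectancy to log(welfare / income), assuming linearity.
life_term = value_per_year * life_gap_years

income_ratio = 0.17              # South Africa GDP per person relative to the US
welfare_from_life_alone = income_ratio * math.exp(life_term)

print(f"life expectancy term: {life_term:+.2f} log points")
print(f"welfare relative to the US, life expectancy only: {welfare_from_life_alone:.1%}")
```

The result is a little over 4% of the US level, close to the 5% quoted once the other terms are included.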
And the
magnitudes aren't so large everywhere.
But for example, the same thing
was true for China in these data.
China, if you think about it,
they don't live as long as
people in the United States.
Their investment share's higher, so
their consumption share's lower.
They work just as hard, if not harder,
in terms of hours per person.
I'm sorry,
I'm talking about China now, but
I'm clicking on the South Africa numbers.
That's not so good.
And inequality is a little bit bigger
in China than it is in the US.
So China's consumption-equivalent
welfare is lower.
So the general theme that emerges here is
that the Western European countries look
closer to the US, or
even occasionally above the US,
we have some robustness checks.
Whereas the poorer countries of the world
look like they're further behind.
So the hundredfold difference in income
actually understates the welfare
differences, because people in poorer
countries have lower life expectancy, for
example.
Okay, so that's what I wanted to say
about this quick tour of the Penn World
Tables and some things that we've done.
Let me just show you very quickly
some key MATLAB programs for
reading the Penn World Tables.
And these are all in
the zip file that's posted.
And there are two key ones, there's
a read and there's making some graphs.
And these all use some functions
from this ChadMatlab.zip file, and
these are some of the functions.
relabelaxis is good for
putting the log scale labels,
plotnamesym2 is good for
getting the dots and the names.
And let me just switch
screens here really quickly.
MATLAB has been updating
the way it reads data, and
now it's actually really, really good.
So with one command,
with a command called readtable,
you can read in data from an Excel file.
And it kind of automatically does
everything you'd like it to do.
And so this used to be even harder,
even just a year or two ago,
it's much easier now.
So MATLAB is as easy at reading data
as Stata is at reading data now.
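The talk's MATLAB programs are in the posted zip file and are not reproduced here. For readers without MATLAB, here is a rough Python analogue of the read-and-normalize step using only the standard library. The column names follow the Penn World Tables naming as best I recall it (`countrycode`, `year`, `rgdpo`, `pop`), and the three data rows are made up. For the actual Excel file, something like `pandas.read_excel` would be the closer counterpart to `readtable`.

```python
import csv
import io

# Tiny stand-in for a Penn World Tables extract. Column names are assumed to
# match the PWT convention; the VALUES ARE MADE UP for illustration.
sample = """countrycode,year,rgdpo,pop
USA,2017,18000000,325
FRA,2017,2800000,67
CHN,2017,18500000,1400
"""

def gdp_per_person_relative_to_us(rows, year):
    """GDP per person by country code for one year, with the US set to 100."""
    per_person = {
        row["countrycode"]: float(row["rgdpo"]) / float(row["pop"])
        for row in rows
        if int(row["year"]) == year
    }
    us_level = per_person["USA"]
    return {code: 100 * value / us_level for code, value in per_person.items()}

# csv.DictReader uses the header row to name the fields, much as a table
# reader would; io.StringIO lets the sketch run without a file on disk.
rows = list(csv.DictReader(io.StringIO(sample)))
relative = gdp_per_person_relative_to_us(rows, 2017)

for code, value in sorted(relative.items(), key=lambda item: -item[1]):
    print(f"{code}: {value:.1f}")
```

To use a real extract, replace the `io.StringIO(sample)` argument with an open file handle for a CSV export of the data.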
And, yeah,
I think that's the main thing I wanted
to emphasize from this program.
Let me switch to another program now.
Here's the key graphs program,
and I won't run these.
If I had more time, I would show how
they run, they run very quickly.
If you just run the read table,
read pwt, read program, and
then run the plot key graphs,
it'll plot the key graphs very easily.
Down here, this plotlog command,
there's also a namesym,
that's the command for getting the names.
You can see I'm relabeling the axes here.
So anyway, the main point I wanted to get
across here is that MATLAB is great for
working with data.
People often prefer to use Stata, or R,
or something else for working with data,
but I actually use MATLAB for everything.
I think it's incredibly useful.
All right, so
let me pause right here for questions.
And then I'll move on to the census and
American Community Survey data.
Any
