CARL MALAMUD: Hello, my
name is Carl Malamud.
I'm chief technology officer
at the Center
for American Progress.
It's a think tank in Washington,
DC, and I'm part
of a group of volunteers that
have been working on a
document that we're here to tell
you about, the Archimedes
Palimpsest. I'd like to thank
Dan Bloomberg, who is our host
from Google, for inviting us in
and Vint Cerf, who is the
one that originally gave us
the invitation to do this.
My role in this project
is pretty simple.
I'm going to be running the FTP
server, and there's not
really a lot of rocket science
to that piece.
So I'm going to turn it over
to three speakers from the
team who are here to talk
about the work.
Dr. Will Noel is the curator of manuscripts at the Walters Art Museum in Baltimore.
He's the one that worries about
old manuscripts and what
they mean, and how to
work with them.
Abigail Quandt is also here from the Walters Art Museum, and
she runs our conservation
operation.
And as you'll hear, this
document took quite a bit of
conservation.
Following Will, we're going to hear from Roger Easton, who is a professor at the Rochester Institute of Technology and is heading up a lot of our imaging efforts. Bob Morton is also here from ConocoPhillips.
He's one of the team of imaging experts from around the world that have been working on this project.
And then Mike Toth is going
to speak last and he's the
program manager.
He's administered several very
large imaging projects in the
past, and he's volunteered his
time to keep everything going
and work with scientists from
dozens of places around the
world and also specifically to
worry about things like metadata and making sure that we're
capturing the right
information as we create
these images.
So, Will.
WILL NOEL: Thank you very much,
thank you Google.
It's a great pleasure
to be here.
I am the project director and I
have a rather extraordinary
story to tell, if I can tell
it, and I have about 20
minutes to tell it.
There's a philosopher called
A.N. Whitehead who very
famously said that Western
philosophy is nothing but a
series of footnotes to Plato.
I'm going to make the case
very briefly that Western
science is nothing but a
series of footnotes to
Archimedes.
What do I mean by this?
I mean that Archimedes was the
guy who first got abstract
mathematical problems and
attached them to the physical
world, so that you could just
think entirely abstractly and
find out something that was
true about the physical
external world.
I'll give you a very simple
example: how to find the
center of gravity
of a triangle.
What you could do is hang a
whole bunch of triangles from
the ceiling and find out where
the center of gravity is for
each individual one.
That's not how Archimedes
went about it.
He was sitting in Syracuse in the 3rd century BC and he thought, how am I going to find the center of gravity of a triangle?
I'm going to draw a triangle
in the sand.
I'm going to call it A, B, C,
and I might guess that the
center of gravity in a triangle
is going to be on the
median line, AD.
But being Archimedes I'm not
going to do this simply.
I'm going to make it
complicated and
do an indirect proof.
So what I'm going to prove to
you is that it can't lie on
the line AX, say at the point
T. OK, so let's say that the
center of gravity in this
triangle is at the point T,
and I'm going to prove to
you that it can't be.
I'm going to divide
this triangle into
four individual triangles.
So the combined center of gravity of these four individual triangles is going to be, should be, identical to the center of gravity of the big one, right?
Now, these being similar triangles, the center of gravity of each should sit in the same relative place within it that the point T occupies in the big one. So for the blue triangle it should be at the point K, and for the yellow triangle at the point L. OK, now those other two triangles there, the center of gravity for those two is easy. It's going to be at the point M, because together they make a rectangle and it's right in the middle of the rectangle.
And the combined center of gravity of the two triangles at K and L should be at the point N, exactly halfway between the point K and the point L. So the center of gravity for the four triangles is going to be on the line somewhere right in the middle of NM.
OK?
If you add all those triangles together, the center of gravity is going to be in the middle of the line NM.
But you know it isn't, because
it's going to be at the point
T. Now, those two lines, the
line running through NM and
the line running through
AT to X, are
always going to be parallel.
Whatever the shape of your
triangle, they are always
going to be parallel.
They are never ever going
to intersect.
They're never going
to intersect.
And as long as they don't intersect, of course, you can't have a center of gravity.
The only place they'll intersect
is actually on that
median line.
So you know that the center of
gravity-- you have proved it
without even hanging a triangle
from the ceiling--
that the center of gravity
is going to be
on your median line.
Now the thing is, you haven't got a center of gravity yet; you just know it's on the median line.
But of course a triangle has three median lines, and they coincide at one point. So you know that the center of gravity of a triangle is going to be a third of the way along any median line, at that point where they coincide.
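In modern vector notation, by the way, the claim is quick to check; this is an aside, not Archimedes' own argument. With vertices A, B, and C,

    \[ G = \frac{A+B+C}{3}, \qquad D = \frac{B+C}{2} \quad\Longrightarrow\quad G = A + \tfrac{2}{3}(D - A), \]

so G lies on the median AD, a third of the way along it measured from the midpoint D.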
So if you're Archimedes, you know that simply by working it out entirely in your head you can find the center of gravity.
So you know if you're an eminent
historian you might
try and hang triangles from
the ceiling any old how.
But if you're Archimedes you just think about it entirely abstractly, and before you know it, you've got triangles that balance perfectly from the ceiling.
The application of abstract
mathematics to the physical
world is something that
Archimedes was
very, very good at.
And measuring surfaces is
something that he was very
interested in.
Now measuring a rectilinear
surface, however complicated
it is, is incredibly easy.
You just divide it up into
right angle triangles.
And before you know it, it's
incredibly easy to measure.
But measure an incredibly simple
thing like a sphere,
and that's incredibly
difficult.
And what Archimedes did was
to pack it with triangles.
And he had a recipe for packing
it with an infinite
number of triangles.
So that way, he could calculate curves.
Now if you can apply abstract
mathematical models to the
physical world, and if you can
calculate curves, eventually
you can send a rocket to the
moon, which is why Archimedes
is the most important scientist
who ever lived.
OK.
Everything we know about Archimedes comes from two books. They are very boringly called Codices A and B. Just two books; that's how we know about all of Archimedes' treatises.
In the Renaissance, Leonardo got hold of copies of these two books. And in them he found On the Equilibrium of Planes, which is where Archimedes starts talking about triangles.
And typical of the Renaissance
he picked up on the ancient
mathematics and found the
center of gravity for a
tetrahedron.
It's very, very clever
to do this.
You can calculate the
center of gravity of
the Pyramid of Cheops.
But there was another manuscript, Codex C. And Codex C, Leonardo didn't know about. If he had known about it, he'd
have known that Archimedes had
1700 years earlier already
calculated the center of
gravity of a segment of an
ellipse, and many more
complicated curved solid
objects besides.
So Archimedes could calculate the center of gravity of even a cut through one of those eggs.
This manuscript Codex C--
well, let me tell you about the
fate of Codices A and B.
Codex B was last heard of
in the papal archive
in Viterbo in 1311.
Codex A fell off the back
of the truck in 1546.
Codex C was only discovered in 1906 by a guy called Johan Ludvig Heiberg in a monastery in Constantinople.
And that's what it looks like.
But the Archimedes text in this
manuscript is not the
Archimedes text that you can
see running down like this.
It's coming in two columns
across the page the other way.
Along like that.
It's a palimpsest. It was made
in the 10th century.
But in the months shortly before
April 14, 1229, someone
needed to make a prayer
book and they
didn't have any parchment.
So they tore up the Archimedes manuscript, scraped off the text, cut up the pages, stacked them in a corner, rotated them 90 degrees, got some parchment from other manuscripts as well, scrubbed it very hard, wrote over it in a very black ink, and turned it into a prayer book.
And it survived as a prayer
book for approximately 777
years, until it was discovered
in 1906, and that's what it
looks like now.
It was sold at auction
on October 29, 1998.
But when it was sold an awful
lot of things had happened to
it between when it was
discovered in 1906 and when it
was sold in 1998.
One of the most remarkable
things that happened to it is
that some time after 1929,
someone painted over the pages
with gold ground icons.
We know that this
is after 1929.
So what you're looking at is
unique Archimedes text that
was overwritten in 1229, and
then someone after 1929
painted a painting
on top of that.
Fantastic.
The other thing they did was
they left it in the bottom of
their garden for quite
a long time so it
got very, very moldy.
Now I can't go into this in
great detail but medieval
manuscripts are made
of parchment.
That's the stuff your
shoes are made of.
It's tough stuff.
There are two things that can
kill it-- fire and water.
This was left in a bucket.
And that's the page as it was
in 1906 and that's the
page as it is now.
And you can see a big hole, a
hole in Archimedes' brain,
right here.
You don't get more written
off than this manuscript.
It's in a terrible state.
It is the unique surviving copy
of On Floating Bodies in
the original Greek.
It's the unique surviving
copy of The Stomachion.
And it's the unique surviving
copy of the Method of
Mechanical Theorems. It was read
by Heiberg in 1906, but
it wasn't read very well because
he didn't have the
modern technology that
we have now.
There it is.
It was bought by an
anonymous owner in
1998, sold for $2 million.
And he gave it to me.
Fantastic.
I don't read Greek,
I can't add.
I do know about manuscripts and
I am in a position to find
some of the people that are
going to help, but it was a
major problem.
We've been working on it for
eight years now and I'm going
to talk to you a bit quickly
about some of the
things that we did.
We had an article in the
Washington Post and we got
lots of responses.
You know, the grandson of
Rasputin said his name could
be found in the Archimedes
Palimpsest if only we looked
hard enough.
Lots of kooky things, and then
a nice email from Mike Toth
who is going to speak to us
later, who said that he worked
for the government and that he
had some kit that might be
able to help.
It's a privately owned book,
so we can't use government
equipment to help us.
But Mike's had a great deal of
experience in running imaging
programs and that sort of
thing, which of course I
didn't, so he works for us
as R.B. Toth Associates.
And he's the brains behind an
integrated program of imaging
scholarship and conservation
that I'm briefly
going to talk about.
Conservation of course
is the most important
part of this thing.
The way the palimpsest
was made, we
have to take it apart.
It's hard work taking
it apart.
The main reason it's hard work
taking it apart is because
there's now glue on the spine.
Now glue is put on spines of
books quite a lot, but not in
the Middle Ages.
There are two bits
of glue there.
You can see at the top that it's
white and at the bottom
it's black.
The black stuff's all
right for Abigail.
That's called hide glue.
It's made out of animal parts,
and conservators like Abigail
have been taking this apart
for time immemorial.
The trouble is the Elmer's
wood glue on the top.
The white stuff's Elmer's
wood glue.
It's tougher than the parchment
support itself.
And that was probably put on sometime after 1970 by some nice people called the [UNINTELLIGIBLE] in Paris.
So it took four years to
take apart the book.
That's a rare action shot.
Abigail started it on the 3rd
of April in the year 2000,
finished on the 26th
of November 2003.
I bought her a bottle
of champagne.
It was a long job.
Abigail is a great conservator
of manuscripts, but boy did
she need to work hard
on this book.
That on the left is the first
page of the book.
It's On Floating Bodies, that's
never been read before.
But you can see that it's in a really bad state, and how on earth are we going to read it?
This is a typical
Abigail story.
You can see the Archimedes text
running vertically, right?
And what you're looking at is
a detail of the gutter.
So Abigail performs
brain surgery.
There's a before on the left,
an after-Abigail in the
middle, and then a UV
photograph, very low quality,
low res JPEG that we send to
Reviel Netz, a professor of
ancient science at Stanford
University.
He circles the circle and he
comes back and he says this is
the earliest symbol of a circle
in the history of the
Western mathematical
tradition.
Great!
So I sent it to the owner, I say
here it is, this isn't bad
work, is it?
He says back, here's $10,000
go do the same thing again.
That's on a good day,
but there are lots
and lots of bad days.
Abigail's basic job
is to prepare the
palimpsest for imaging.
And that's the imaging stage.
And these are my imagers. Bill Christens-Barry from APL, Keith Knox, who now works for the Boeing Corporation in Hawaii, lucky fellow. And Roger Easton, who is a professor of imaging science at the Chester F. Carlson Center for Imaging Science and who is going to speak to you in a minute.
And he's going to talk to you about the imaging.
I'm just going to show you
a product of the imaging.
This is the Archimedes text
running along here.
And that's what it looks like
after he's done it.
It took a long time to get
there, but those are the
images that the scholars like.
And however ugly these images
are, that's not the point.
The point is that the scholars
can read them.
And we distribute them at the moment on hard drives, which is one of the reasons we're here.
We are about to have
a problem.
Some of the results
of our work.
This is Reviel Netz
who's at Stanford.
Proposition 14 of The Method is very important because in it Archimedes starts dealing with infinite numbers.
This is a UV photograph.
Reviel tries to read it.
The Q to AQ is a question to
Abigail Quandt asking if the
tear in the parchment is
possibly original, because if
it is then Reviel doesn't have
to try and find a letter to
put in there when he's doing
his transcription.
That's the Method proposition
14 page with RGB on the left
and processed on the right.
There's a detail you can see how
easy it is to read with a
processed image and how
difficult with normal.
And this is a result.
Dear Will, in a couple of months
the first intellectual
fruits of our labor will be
published together with a
complete transcription of one
crucial side of one page, most
of which is unknown
to modern science.
I send you the final lines of
the article as it stands in
draft form.
It's understated.
It reads as follows.
To sum up, then, the new reading
from Archimedes'
Indivisibles Proof should call
for some reconsideration of
the position of Archimedes in
some key areas of mathematics,
notably the two related fields
of calculus and of infinity.
Very learned article.
Very obscure journal.
Read by six people, five
people, because Natalie
Tchernetska isn't married.
And published in the Sunday Times colour supplement, Eureka: exclusive, it's just a few lines of scrawled Greek text, but the new technology has identified the hand of Archimedes and the results are rewriting history.
One of the interesting things
about this project is that on
a very, very academic level we
are fundamentally transforming
the landscape of Greek
mathematics, but there's also
a great popular component
to what we do.
Another example is The Stomachion, which uniquely survives in this manuscript. It was known as Archimedes' box in antiquity.
And it's just a square, and
it's divided into 14 bits.
And we knew that Archimedes was
playing some kind of game
with these 14 bits, but we
had no idea what it was.
And together with our new
reading from that page, which
as you can see is dreadful,
we found out what it was.
Archimedes wasn't playing
a game at all.
He was calculating how many ways
in which those bits of
that square can be recombined
and still make one square.
And that makes it the earliest
mathematical study in the
history of combinatorics, which
is important I think to
what a lot of you guys do.
The answer, anyone
want to guess?
17,152.
We all thought we were working
for Archimedes, but those of
you who are paying attention
will have noted that I said
the other manuscripts were also
used to make the prayer
book and they had never been
read at all simply because
they were so faint.
And we thought just for the
purposes of completion, we
should study those ones too.
This is from Natalie Tchernetska of Cambridge University.
We have five pages in this book
that belonged to speeches
by an Athenian orator called
Hypereides who was a
contemporary of Demosthenes
and Aristotle from the 4th
century BC.
Seventy-seven speeches were written by him in antiquity. None of them survived the rather important transition from the roll to the codex, that's the book form as we know it today.
This is the only Hypereides text
that has ever been found
in a codex.
We have added 30% to the
surviving Hypereides corpus so
people are rather excited
about that.
This is Nigel Wilson.
He's at Lincoln College,
Oxford.
Dear Will, excellent news.
The hard drive and photos
came safely this week.
A first glance suggests that
there is no more Hypereides
but several leaves of the
philosophical text, on which I
read the name Aristotle
clearly enough.
Well, clearly for Nigel is not clearly for the rest of us, but I circled it on the image below and you can just see it there. And there it is, you see? That's Aristotle.
We now have seven pages of an unknown commentary on Aristotle that must date from around the 3rd century AD.
Now where I come from, from
the world of medieval
manuscripts, this is just
utterly sensational.
There are quite a few
palimpsests around.
A very few of them
are important.
Cicero's De Republica only survives in palimpsest form.
But to have three unique texts,
and we're hoping for a
fourth from one book, makes
this book the eighth
wonder of the world.
And this is how the scholars read it at the moment: through hard drives and through hard copy.
And until about four weeks ago we were dealing with four scholars, because you only need about four scholars to read the Archimedes text. But with the Hypereides text and the Aristotle text we have a major distribution problem, which is one of the reasons that we're here, because we have to address it very soon. Because with these new discoveries, we're talking 100 to 150 scholars who we want to work on this project.
When they work together they can
come up with some pretty
wonderful results.
We have a website, I'm
not going into it.
We are working on other technologies to improve our pseudo color processing, including short-wave multispectral imaging, which four years ago was very, very expensive but which now you can do just with LEDs.
And our provisional results
are encouraging.
We're also doing optical character recognition, but I'm going to end my part of the talk by leaving you with the forgery problem, which is why we're here in California: we have two weeks of imaging at the Stanford Linear Accelerator Center over the next couple of weeks.
We've been going for
eight years.
There are a lot of people
involved in this project.
Nearly all of them are
on this screen.
I'm going to hand you over to Roger Easton, who is the head of our optical imaging effort, and he's going to give you a brief summary of the techniques that we've been using, what we've learned along the way, and how we image the manuscript.
Thank you very much.
Roger.
ROGER EASTON: Thanks, Will.
So as Will said I've been
working on the imaging team
for the Archimedes for
quite some time.
In fact we actually started
working on it--
Keith Knox my colleague and I--
even before Will did, because
we did some imaging for the
auction catalog that was
published in August-September of 1998 before the auction.
But our original intent, and
what we have been doing
subsequently with significant
modifications as I'll show
you, was to do multispectral
imaging of the palimpsest. And
the reason is because the two
inks, the overwritten ink that
came later and the original
ink, are different colors.
And the slide here shows images at a bunch of different wavelengths, which illustrates that.
Both inks tend to get fainter
as you go to longer
wavelengths because the
penetration depth of the light
gets longer.
But the Archimedes ink changes significantly more quickly.
So we're going to try to
take advantage of that.
Here's one of our early graphs
that we made in the summer of
the year 2000 when we
first started to
image this for Will.
And you can see there the prayer
book text is the upper
line and it's significantly
flatter than the lower line,
which is the Archimedes text.
Sorry I got that backwards.
The prayer book is
the lower line.
The Archimedes text is significantly brighter in the red wavelengths, and that is the difference that we're going to take advantage of.
So the original imaging plan was
to take advantage of this,
look at the image in 30
different wavelengths in the
visible, and then do subsequent
image processing to
try to recover, to extract,
the Archimedes text, to
literally segment it from
the original text.
And this is taking advantage
of techniques that are well
known, they've been done
in environmental
remote sensing for years.
So we weren't actually doing
too much that was very new
when we did this.
The camera we had at the time, and still use on occasion, was just marginally capable of doing this.
It was a relatively small
sensor, 1536 by 1024, so 1.5
megapixels.
These days, when you can go down to your neighborhood store and buy an 8 or even a 12 megapixel camera, this is relatively small potatoes.
This was a monochrome camera.
It was cooled, which allowed us
to image more gray levels.
We got 12 bits of data rather
than the 8 or 10 that you get
from most normal cameras.
But because it was a monochrome sensor, to be able to get the spectral resolution we had to put a color filter over the front of it. We used a liquid crystal tunable filter, which again at the time was relatively new technology but is much better known now, and which allows you to literally tune an electrical knob and change the wavelength of transmission.
And so this camera, which was
about $25,000 when I bought it
in 1999, you can get that same
capability for considerably
less money now.
So we used these techniques derived from environmental remote sensing, where we assumed a linear mixing model. We assumed that the pixel values were linear combinations of the different constituents.
And then we tried to re-extract that, undo that, by doing an unmixing model using a pseudoinverse matrix, which is again relatively straightforward, simple stuff; many people here are probably snoozing right now because it's so obvious.
So here's a diagram of what
that model would be.
We take the images in the different wavelengths, and what we're trying to do is construct the inverse of the matrix e, which is the matrix describing how the combinations of constituents get mapped into the different combinations of wavelengths at each pixel.
One other little detail that we had to deal with: in real life, of course, we've got a multiplicative model, where the different constituents actually combine multiplicatively, while in the calculation we want to assume an additive combination, so we simply had to take the logarithm to do that.
And then we went ahead and did a least-squares solution of the matrix, real straightforward stuff, a Moore-Penrose pseudoinverse, and then we could construct images of each constituent based on that.
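For anyone who wants the shape of that computation, here is a minimal sketch in Python with NumPy; the array names and the matrix of spectral signatures are invented for illustration, and this is not the team's actual processing code:

    import numpy as np

    # cube: stack of registered grayscale images, shape (n_bands, height, width)
    # e: hypothetical n_bands x n_constituents matrix of spectral signatures
    #    (parchment, prayer book ink, Archimedes ink)
    def unmix(cube, e):
        n_bands, h, w = cube.shape
        # multiplicative mixing becomes additive in log space
        log_cube = np.log(np.clip(cube, 1e-6, None))
        pixels = log_cube.reshape(n_bands, -1)       # one column per pixel
        e_pinv = np.linalg.pinv(e)                   # Moore-Penrose pseudoinverse
        abundances = e_pinv @ pixels                 # least-squares unmixing
        return abundances.reshape(e.shape[1], h, w)  # one image per constituent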
These were the first images that we got doing this.
We did this in our phase one
imaging, which happened in the
summer of the year 2000.
And so the original appearance
of this one leaf 70b is on the
left, and then the channel that
shows the Archimedes text
is on the right, where the
level of whiteness is
associated with how much the
algorithm recognized that text
to be Archimedes.
So if it's white, it says
it is Archimedes.
If it's black it
says it isn't.
And if its mid gray there's
some mixture.
There we actually felt we'd
solved the problem.
There's a magnified
view of that.
But it turns out that was not
a successful result in the
minds of the scholars.
The images look very good
to us, the imagers.
But it didn't look good
to the scholars.
The other problem was it was a
very time intensive process
both collecting and processing
the data.
And part of that was
we had to do custom
stitching at the time.
We didn't have an algorithm
that we could use.
But it turns out that even though there were significant improvements in the visibility of the text, which I think you can see in the center there (there's a diagram of Archimedes that, if you look at the original page, my memory is you couldn't even recognize was there), the scholars didn't think so.
So our reaction to the scholars' reaction was one of shame and shock.
We figured we had the problem
licked, but it
turned out we didn't.
But there are a couple of simple
reasons why we hadn't
done it for them.
One thing they really wanted was
finer spatial resolution.
And that was the problem again
with that particular camera,
being only 1.5 megapixels.
We took the images of those
leaves in two shots and then
stitched them together.
So we only got images
that were about 1.5
megapixels by 2--
sorry, 1500 pixels by 2000.
And so it worked out to be about
8 pixels per millimeter,
which is certainly sufficient
if you're just trying to
recognize the text.
But when you're trying to
extract text from other text,
it turned out it wasn't.
And again, we clearly also needed more efficient imaging and processing to be able to do this much more readily, because it took us virtually all summer to process those five leaves that we did in that original phase.
So then we adopted the technique
that was well known
to scholars for many years to
read palimpsests, which is to
use ultraviolet fluorescence
imaging where you illuminate
the page with an ultraviolet
fluorescent light and that
makes the parchment glow.
That enhances the visibility of the text.
And why is that?
Well it's because you get
a double absorption.
The ultraviolet light is
absorbed by the ink going in,
and then the visible
fluorescence is absorbed by
the ink coming out.
That double absorption gives
you an enhancement in the
contrast of a text and makes
it much more readable.
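A rough way to put numbers on that double absorption, as a back-of-the-envelope model only: if the ink transmits a fraction \(T_{\mathrm{UV}}\) of the exciting ultraviolet and a fraction \(T_{\mathrm{vis}}\) of the emitted visible fluorescence, then over inked parchment the detected signal is about

    \[ I_{\text{ink}} \approx I_0 \, T_{\mathrm{UV}} \, T_{\mathrm{vis}}, \]

so in absorbance terms the contrast against bare parchment is roughly \(A_{\mathrm{UV}} + A_{\mathrm{vis}}\), about twice what a single pass through the ink would give.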
This was something that we had
known about, but we figured
that with the technique that
we had up before we didn't
need the ultraviolet
fluorescence.
Turned out we were wrong.
So we also went to a simpler
digital camera, which gave us
a significantly better
resolution, and again,
compared to what you
can get now this is
relatively small potatoes.
But at the time we got it, it
was the limit of technology.
So we used a Kodak DCS 760,
which is a digital camera
based on a Nikon body
single lens reflex.
Kodak doesn't even sell it
anymore, they don't have any
comparable camera
at this time.
But it gave us 6 megapixels, 3000 by 2000, and it gave us images from 400 nanometers through the infrared.
And again, we now illuminate the pages with ultraviolet light, but we look at the visible fluorescence, so the 400 nanometers is sufficient.
The other point is of course
this is a color camera.
It gives you a color image,
again technology probably well
known to everybody sitting
in this room.
Over the 3000 by 2000 sensor you've got green, red, and blue filters. So in fact you have to do interpolation, and we wind up getting slightly poorer resolution than you would perhaps have expected.
The technique that we finally evolved into: we now image the pages under three illuminations. We use a xenon strobe, which strictly speaking we don't process; that is just for the visible appearance.
Then we use a low wattage
tungsten light, which gives
you a very reddish
illumination.
And I'll show you the reason
for that in a second.
And then we also illuminate with a long-wave ultraviolet, which is where we get the enhanced visibility of the under text.
With this camera we set it up
so we image 600 pixels per
inch, or 25 pixels per
millimeter roughly, and that
gives us an image size over the
whole page of about 7500
by 5000 pixels.
To do this we have to image the page in sections, so we now image it in 10 sections to cover a bifolio, which is a double-page spread of the Euchologion, a single-page spread of the original Archimedes.
And then we have to digitally
stitch those together.
And again, technology has
evolved quite a bit from when
we first started to do this.
Our stitching algorithm was
pretty poor but we've adapted
to get a new and significantly
better one.
So here's an illustration of the enhancement of the under text using the ultraviolet fluorescence. In the image on the right, the Archimedes text is running vertically, and that's of course the text we're interested in.
So now we have these three color images, so in effect we have nine bands.
The strobe RGB, the tungsten
RGB, and the ultraviolet RGB,
even though in the ultraviolet
image there really is almost
no signal at all in the red
channel and the green channel.
It's almost all in the blue.
Then we wanted to come up with
a simple method for doing the
image processing.
This was sort of a mutual effort
with my colleague Keith
Knox and myself.
It was based on the observation that the tungsten red channel, the red channel of the image viewed under this reddish tungsten illumination, shows almost no evidence of the Archimedes text.
The reason is because the
Archimedes text is reddish.
So you illuminate something
reddish on a neutral
background with a red light,
it will tend to disappear.
Whereas the ultraviolet blue
channel shows both writings
for a reason that we've
already mentioned.
We came up with this processing
strategy where we
would encode the spectral
differences of those texts in
color to create the pseudo color
images that Will already
showed you a couple
of examples of.
And these images, as we all
absolutely readily admit, are
hideously ugly.
But they are useful,
and they are
quick and easy to generate.
Here's the illustration
of the steps.
The same leaf, this is 92v-93r, and this is Archimedes' Spiral Lines.
And you can see the diagram
right there in the gutter.
This is the page that we
use for our reference.
We image this every
imaging session.
But in the image on the left,
that's the full color tungsten
where you can really see-- the
Archimedes text is virtually
invisible, whereas the image
on the right, it's actually
pretty apparent, though
certainly it's still
overshadowed by the over text.
Then we just go to
the separations.
And there's the red channel, the
tungsten on the left, and
the Archimedes text
is virtually gone.
And the blue channel, the
ultraviolet, the two texts are
at least of comparable
contrast.
So then we have to go
and process those.
This is using the software that
Keith Knox wrote, what we
call Archie, it's now Archie
1.1, where we take the red
channel tungsten and the blue
channel, the ultraviolet, and
normalize them by using
a moving window.
And then we balance
the mean and
variance inside that window.
So we get a significant
enhancement of the contrast
and also it removes a lot of
the shading variations that
are apparent in many
of the images.
And Keith did this in a very
efficient way so it winds up
being able to run in 15 seconds
for a whole image,
rather than days.
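A minimal sketch of that moving-window balancing, assuming SciPy's uniform_filter as the window; this illustrates the idea and is not Keith's actual Archie code:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def normalize_local(img, window=101, eps=1e-6):
        # img: 2-D float array; window: moving-window size in pixels
        mean = uniform_filter(img, size=window)            # local mean
        sq_mean = uniform_filter(img * img, size=window)   # local mean of squares
        std = np.sqrt(np.maximum(sq_mean - mean**2, 0.0))  # local std deviation
        # subtracting the local mean removes the shading variations;
        # dividing by the local std balances the contrast across the page
        return (img - mean) / (std + eps)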
We now have it set up so that, rather than having to haul the images back to Rochester, which was what we originally did, Keith sets up his Mac laptop and runs it in his hotel room in
We have the images
by the next day.
So then how do we display
these to the scholars?
Well we want to take advantage
of the pseudo color, the eye's
ability to distinguish
the text.
We put the blue channel in the
green and blue channels of a
pseudo color image, and we put
the tungsten red image in the
red channel.
And again, the Archimedes text
is not visible or almost not
visible in the tungsten red
channel, so it shows up as
bright, whereas it shows up as
dark in the green and blue.
And the over text shows up as dark in all three, so as a result the over text comes out neutral, and the under text, the Archimedes text, comes out with a reddish tint.
And that gives you
a color cue.
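The channel mapping itself is tiny; a sketch, assuming two registered 2-D arrays scaled to [0, 1]:

    import numpy as np

    def pseudocolor(tungsten_red, uv_blue):
        # under text: bright in tungsten red, dark in UV blue -> reddish tint
        # over text: dark in all three channels -> neutral gray
        return np.dstack([tungsten_red, uv_blue, uv_blue])  # R, G, B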
There is another example
like the one
Will showed you earlier.
The visible appearance on the
top and the pseudo color on
the bottom, and now you
can do a comparison.
The top is the visible
appearance.
The middle is the appearance
using that multi-spectral
segmentation with a pseudo
inverse, and then the bottom
is the pseudo color.
You certainly get similar kinds
of visibility of the
Archimedes text, but the
advantage of the bottom one is
you also see the over text.
That allows the scholars to be
able to tell if a character
has a break in it.
They can see if that break was caused by the over text obscuring the original Archimedes writing. In the middle image we were more successful at suppressing the over text than we probably wanted to be, though we didn't know that at the time.
We've been able to significantly
make that over
text disappear into the
background and it becomes
difficult for the scholar to
tell if that break was caused
by obscuration from
the over text.
So in these pseudo color images both texts are visible. This was the scholars' request. We get that reddish tint, and we have found that this method is useful for recognizing about 80% of the text.
There is a large image of one
of the sections, there's the
ultraviolet, and then there's
the pseudo color.
So we went into production
imaging.
There's Abigail putting
one of the pages--
this is one of the pages with
the forgery on the bottom--
into our system, which has a computer-controlled x-y stage, so we can readily drive the page to the particular section that we want to image. This setup actually holds both the Kodak camera and the earlier multispectral camera, so we can do both kinds of imaging on the same setup.
But even with that the scholars
tell us that some of
the text is still
not readable.
It's either underneath the
forgeries, which is again why
we're here up at the Stanford
synchrotron because we're
trying to do x-ray fluorescence
imaging, or it's
been eaten by mold,
or it's scorched.
So we're now trying to figure
out ways to image these
difficult sections.
One of the ways Will already
hinted at, we want to do a
little more sophisticated
multispectral with more
wavelengths, or we're going to
try this x-ray fluorescence.
Here's our team of x-ray
fluorescence imagers.
Bob Morton on the left, who is
sitting in the front row, and
Gene Hall from Rutgers in the
middle, and Uwe Bergmann from
the Stanford synchrotron
on the right.
Again, it's because of
Uwe that we're here.
So in x-ray fluorescence, the
technique here is very similar
to regular fluorescence.
You bring in an x-ray to an atom
and that x-ray kicks out
an electron from one of
the inner shells.
And then, in consequence, one of the electrons from one of the next couple of outer shells drops in to fill the hole in the inner shell and emits a photon, an x-ray photon at a wavelength characteristic of the particular material.
This allows us to do
spectroscopy on the inks at a
particular point at
the same time.
This was a test image that was
actually done with a test
system, not using
the synchrotron.
So this is one of
the forgeries.
Then, for that particular section, we take an average spectrum, and you can see a variety of x-ray peaks there, characteristic of all the materials, including the iron.
This ink is iron gall
ink and so it has
traces of iron in it.
That's what we're counting
on using to try to
recover that text.
Here are a bunch of different
images from using wavelengths
characteristic of different
materials to
try to extract that.
If you look at the calcium
image in the lower right,
there's a lot of calcium present
in this because that's
how they treated the parchment,
and there you can
see the text through
the paint.
So, our first test images.
Bob and Will spent quite a few
sleepless nights putting these
together last year.
I fortunately had to teach, so
I wasn't available to go and
spend my nights working
on this.
There's a small section and
there is an image using an
EDAX system, which is
a commercial system.
One of the interesting things
about it is you can see both
writings on both sides,
in the same image.
Here's an illustration of trying
to indicate that a
little more clearly.
The face, page 163 verso, is on the left, and then 163 recto, the back side of it, on the right, but digitally flipped to give you the mirror image. And you can match the writings in the EDAX image in the middle with the different writings on the different sides in the same image.
Again, as we've mentioned a few times, that's the reason why we're here.
We're at the Stanford
synchrotron radiation lab just
up the road and taking
advantage of that.
Here's one of the pages
in the set up.
This was when we did the first
test of this last year.
The x-rays come in through that tube on the right and then are measured by the sensors both behind and at the bottom, and we use those signals to try to recover the different wavelengths.
Here are the first examples of writings from that. Again, I didn't have the pleasure of being there, but that image on the right took 30 hours to put together.
That's the imaging section, so now I'm going to turn it over to Mike Toth, who is going to tell you a little more about what the management and some of the data issues have been.
MIKE TOTH: As both Roger and
Will have noted, we've got a
good team here, and we've got
a good process in place,
starting with the conservation
and the imaging, and working
our way through to ultimately
gaining knowledge about the
Archimedes.
Now is the point where you all come in, and we're particularly interested in your thoughts on this, because we're looking at the data.
We've been looking at the data
throughout this process but
we've been working with
a closed system.
We have had a given set of
scholars, we've had a given
set of image processors.
Now we want to make this more
broadly available for people
who may want to try different
imaging techniques, who may
see different things in the
Greek text, and they want to
get back to these
original images.
So the question for us is how
do we make these original
images available to a broader
audience, wherever they may be
around the world.
Now early on, as we said, hey, we're going to collect large amounts of data here, we created a metadata standard. Part of that's basic Dublin Core: basically, what is it, intellectual property rights; all the images are copyrighted by the owner of the Archimedes Palimpsest. We have a range of information there.
But one range of information
that I think is truly unique
for manuscript imaging is
the spatial information.
Because when we looked at this,
we said hey this is the
equivalent of having a satellite
over the globe, and
we're looking at coordinates
on this globe.
In fact, just the other day, as we were looking at the XRF, we said, we're going to be imaging from the back. Which way is our world going to spin? What's 90 degrees and what's 270 degrees? So we're trying to address all of these; we're trying to create a standard here.
So part of this was creating a
standard for our imaging, so
that you could go back to those
original images on our
coordinate system and
pick any point in
this coordinate system.
Back in 2000, when we were starting this, we went back to the Content Standard for Digital Geospatial Metadata, Extensions for Remote Sensing Metadata, following our analogy of the satellite over the earth.
And so we created this where
we have an x position, a y
position, and various
coordinates there.
So we have a spatial
capability here.
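To make that concrete, a hypothetical flat-file record along those lines might look like this; every field name here is invented for illustration, not quoted from the actual standard:

    identifier:           093r-092v_pseudocolor
    rights:               Copyright, the owner of the Archimedes Palimpsest
    x_resolution_ppi:     600
    bounding_coordinates: x_min=0, y_min=0, x_max=7500, y_max=5000
    under_text_rotation:  90 degrees relative to the over text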
Now does this help Google?
How would Google
work with this?
You've got Google maps.
You can search for Archimedes,
and you can look all around
the US, and people can do mash
ups, and find locations on
Google maps.
You also have Google books.
We have Reviel Netz's book
there, The Works of
Archimedes, Translation
and Commentary.
We have Heath's early translation, The Works of Archimedes, which actually used the Heiberg edition.
Now how do we bring those
together, is our question, and
can we use the same type of
method, the same methodology,
that you're using
for Google maps?
Can we do this on a scripto-spatial scale, if you will, instead of a geospatial scale?
Can we apply this to
our manuscript?
One question for
us is, is there
enough information there?
And then the question for you
is, how can you use that
information?
How can you make this available
to scholars, who are
saying OK I want to know
where the word--
whatever that word is, starting
with the epsilon
there-- or Reviel's
input, or Nigel's
comment, can I find those?
Now it's pretty straightforward
as we go
through the various images here,
the UV, and the natural
light, and then the
pseudo color.
Bigger challenges when
we get to the XRF.
We've got registration
issues here.
We've got to address those, and we've spent a full day, an afternoon and a full morning, trying to make sure we've got the metadata squared away for our upcoming XRF imaging session, so that ultimately you can correlate this information with the visual information here.
That's fairly straightforward.
Now the real question is
the transcriptions.
This is a transcription by
Reviel Netz of that same page
that we're looking at earlier
and these are the same points
that I pointed out earlier.
And the real challenge for us
is going to be to bring it
back to the original image.
You'll see transcriptions or
translations in books, and in
Google books or whatever
it may be,
you'll relate to those.
But how do you get back to
that original image?
And that's what the academics
want to look at.
Or a processor, who says hey,
I think I've got Archie 3.7,
let me try to crank this
through these images.
How can I apply the same type of thought to the original image, not just to the transcription, not just to the translation?
And how can I relate
these together?
How can I relate it to
references in various books?
Can I relate that to
the specific image?
Can I go back to that thought
on sphere and cylinder or
whatever in the original image
there from a book?
Can I do it from a transcription and relate my notes to that, so that if I have the word, mastin, for example, I can find that in the original note?
If I have the word enenfoy, can
I put that into the book?
And so can we relate it
to the information in
an index, for example?
This is Reviel Netz's book.
Can I relate back to those
original images?
That's our question
for you really.
Do we have enough information
here?
How do we make our data available?
There's going to
be distribution
questions on the data.
I mean, I just got a terabyte
of Archimedes data at home
very quickly.
The FedEx box arrived, sat on my front doorstep, and there's that information with the hard drives.
But we're going to have to move
this around by internet,
and we're going to have to
move around good quality
images there.
So our plan currently is to
make the data available
sometime this year.
We're going to make
it available in a
very simple flat file.
We're going to use our metadata
standard and we want
this available to the broadest
range possible of students,
scholars, educators, and
the general public.
So this is getting beyond our
narrow circle of the scholars,
the imagers, the image
scientists.
We want them to be able to link
to the information they
need, to be able to search
across this, search across the
original images, not just the
transcription, not just the
notes, not just the books.
Then we're going to make this
available to anyone, including
Google, and what do
you do with it?
What type of GUI do you
develop for it?
What front end goes on this?
That's kind of a heads up here as to what's going to be coming, and then what you do with it.
And what is it good for?
There's more than just this.
I do not believe the other
imaging projects currently
capture the spatial
information.
But there are a number
of other projects.
The Herculaneum papyri, the
Oxyrhynchus papyri, Codex
Sinaiticus, that effort is
just starting up with the
British Library, International
Dunhuang Project, also with
the British Library.
All of them are collecting large
amounts of images, large
amounts of metadata.
How do we standardize across
this, and it may be an
organization such as Google that
gives us the impetus to
standardize across our
various efforts.
So the question for
you is, what are
your thoughts on this?
That kind of got skewed over to the right there, or the left there.
What are your thoughts with
regard to what we're doing?
What are your thoughts in terms
of making this available?
What are your thoughts with
regard to how do we make these
images available to the broadest
group possible, and
make it available for them to
work with it as they need to
or as they can make it
available to others?
There's a question back here.
AUDIENCE: You were saying go
back to the original image.
Which one?
MIKE TOTH: That's
a good question.
What we've said is that right
now the TIFF image, we have
different versions
of images really.
We have the original, was that a
12 bit or a 16 bit TIFF, and
then we have the processed
images, and those
are all TIFF images.
We then push around JPEGs
but those are not
our standard images.
We want to make those available in ASCII so that you're not dependent on any other standard.
We're doing the same with the
x-ray fluorescence, where
we're going with the ASCII.
We're then converting
those into TIFFs.
So the original image as shot by
the imagers, which are the
10 images per page.
And then the TIFF image
of each page.
AUDIENCE: Can you repeat the
questions for us in the back?
MIKE TOTH: The question was
which are the original images?
And that's where I was saying
it's that TIFF and then the--
we're going to make those into
ASCII images as well.
Other questions or comments
for any of us here?
How would you use this?
You just host it on Google books
as a book, or do you use
it on Google maps,
or Google Archie?
What do you do?
We welcome your thoughts
on it.
AUDIENCE: You mentioned that the
manuscript was copyright
of the owner of the
manuscript.
I gather from what you're
saying now he's not
restricting distribution
of it.
MIKE TOTH: He's protectively copyrighting it so that someone else cannot claim the intellectual property from it. He wants to make it available, so that no one else can claim it proprietarily for their organization or for themselves.
He does not want this to become
a Dead Sea Scrolls,
where the scholars held on to the scrolls for 50 years.
He wants to make it available.
And that's why we're here,
really, because he feels that
you offer opportunity for us
to make this available.
We welcome your comments.
AUDIENCE: One suggestion, if you're going to be distributing the files directly, is to look at peer-to-peer systems. Especially BitTorrent might
lend itself to large file
distribution because it acts
in a distributed way that's
more scalable than perhaps
downloading it straight from
the server.
MIKE TOTH: The problem there
is that would work well for
pushing it around amongst
our peer group.
But this has really developed
as a project as people who
have gained insight into it,
like you all, and then say
hey, I think I can help.
Now many people come up saying, it's easy, it's obvious, I can just use Photoshop, and then they
find it's more complex.
All the people here who are
working on this project have
gotten some visibility into it
and then said hey I can help.
I'm sorry, the question was with regard to peer-to-peer networks.
That's where we want to make this more broadly available, so that people who may not be part of this peer group can access it.
And there are a lot of people out there, both imaging scientists and mathematicians, who are very interested in this, and we want to give them the opportunity.
They've had no insight into
this, or limited insight.
Give them the full information,
allow them to
work with that in any
way they want to.
AUDIENCE: You don't necessarily
have to distribute
your entire data set
to everybody.
It seems to me that representative samples would be useful for people that just want to play around with it.
MIKE TOTH: The statement was
representative samples alone
would be useful for people
to play around with.
AUDIENCE: If somebody has an
imaging idea that they want to
try out, they don't need
your entire data set.
They'll try it out on one or two
sample pieces and if that
works then they'd contact you
to get the whole thing.
MIKE TOTH: Then go peer
to peer and send them.
That's a good point.
AUDIENCE: With respect to copyright and distribution, you may wish to look into the Creative Commons or other similar licenses that allow you to retain some measure of control while still allowing for public use within certain areas.
MIKE TOTH: We'll have to take that up with the owner, and look at how that would restrict or enhance what you can do.
CARL MALAMUD: That's
a good suggestion.
Basically we're looking at a
Berkeley style license, which
is you can do whatever the hell
you want with it, please
don't sue us.
In terms of distribution, we're just starting; we'll put a terabyte of disk up at a colo: BitTorrent, HTTP, FTP, you name it.
And the core underlying
strategy is massive
replication of either small
parts or large parts.
So we're naming the files using an ISBN number and then identifying where within the document you are, so that you have at least some ability to self-identify these things.
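A hypothetical name in that spirit, purely to illustrate, since the real convention may differ:

    urn:isbn:<ISBN>/093r-092v/pseudocolor_01.tif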
The hard part though is once the
data is out there, how do
people start building the GUIs
that the scholars want, how do
people begin applying the
different imaging
technologies, and feeding those
results back in, because
we hope that in addition to our core images, there are going to be additional images generated by other people who look at this and say, well, I've got some DSP algorithms I can add to this and recover even more of the text.
AUDIENCE: So the metadata that
you were talking about, I
don't know anything about it,
but does that provide an
absolute coordinate system so
that two people anywhere can
talk about, yeah, that word?
MIKE TOTH: Yes it does, in fact
the question is does the
metadata provide an absolute
coordinate system?
Yes we have the absolute
coordinate system and then the
offset, because the leaves are
mounted so there's some offset
there, and where is 0,0?
Where is our prime meridian,
and our equator?
So we define that in terms of
bounding coordinates and then
you can work across that.
You look like that didn't quite answer it.
AUDIENCE: Are some of the
translations already
available, the new
translations?
WILL NOEL: The question is are
some of the translations
already available?
Yeah, Reviel Netz published Sphere and Cylinder I with Cambridge University Press, and Sphere and Cylinder was already in Codices A and B. But
one of the things that I
didn't say is that the
palimpsest is also the unique
source of the diagrams that
Archimedes drew and that's
really important because
mathematicians think in
diagrams, they don't
think in text.
And so the diagrams in the
palimpsest are very, very
important for that edition.
Heath's translation isn't really
a translation at all.
It's a modern interpretation of
what Archimedes was saying.
Actually, Archimedes is a truly foundational figure in the history of Western philosophy and science who had not been translated into English before.
In those learned articles that I was showing you there aren't translations, but there are expositions of the things that we found. The journal was SCIAMVS; for those that are interested, volumes 1, 3, and 5.
We want to make this available
as quickly as possible.
We're hoping for
transcriptions.
We've now got a transcription
of the whole of Codex C, the
whole of the original
Archimedes text
except for the forgeries.
That will be completed by
the end of the year.
And we're hoping to have it up on a web site together with the images upon which those transcriptions were based.
This is a very important point, because of course, when you're thinking about what the original images are, you are not working from a manuscript, you're working from processed photographs, which are man-made things, and sometimes you're having to go back to the photographs that Heiberg made in 1906 because they are now the only source for the images.
So it's a rather complicated
set of data to put together
and the other thing is that in
any edited text there's always
an awful lot of guess work.
There's a lot of editing
to be done.
AUDIENCE: Why do you call those
paintings forgeries?
WILL NOEL: Why do I call
them forgeries?
Because they're trying to look Byzantine, they're trying to look like the 12th century, and we know that they were done after 1929 because they are copied on a one-to-one scale from [UNINTELLIGIBLE], which is an album of pictures taken from Byzantine manuscripts but reproduced at a reduced scale. Abigail measured them, and they were copied from it.
We've also done some pigment
analysis and there's a
particular type of green that
only became commercially
available in Germany in 1938 so
we actually know that they
were done after 1938.
That's why they're forgeries.
The interesting question is why
don't we take them off?
There are two reasons.
One is that we might damage the
text when we're trying to
take them off.
And the other reason is that
even if we haven't got the
technology to read under them
now, and as you can see we're
beginning to get there, in 50
years time someone can have
another crack with more modern
technology and at least we
haven't destroyed the book.
You know, one of the things about palimpsests is that in the 19th and early 20th century, to try and read the text, people used to put all sorts of acidic paints on them, to paint them over with iron gall or with a gallic ink or even with hydrochloric acid, which would temporarily make the under text more visible but in the long term destroy the book.
And we can be very glad that
Heiberg didn't actually do
that, because then we'd be
in even deeper trouble.
AUDIENCE: From your experience with the scholars, if there was a nice GUI, nice streaming data, a draggable interface, would the scholars use that or would they just print it out?
WILL NOEL: The question was
would the scholars actually
use a really nice interface or
would they just print it out?
Our imagers tried very, very hard to make very, very wonderful pictures.
Nigel Wilson is a scholar
of 60 and he does his
transcription in the summer
months using a magnifying
glass from a printout.
But others of the scholars are
very very computer literate
and they do use the
hard drive.
Now, our interface that we've developed so far is effective but clunky, very clunky, and they would love a better interface.
Last question.
We got the high sign
here on time.
AUDIENCE: Why not use PNG for the images, because it's lossless and works in all the browsers, and then maybe have some JavaScript on top of that, which would let you see what the possible translations are?
CARL MALAMUD: The question was, first of all, why not use PNG instead of TIFF. That's a no-brainer; TIFF is how we capture, and transcoding to PNG is easy.
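As a trivial illustration of that transcoding step, with the Pillow library in Python; the file names are placeholders:

    from PIL import Image

    # read the captured TIFF and write it back out as a lossless PNG
    Image.open("page_093r-092v.tif").save("page_093r-092v.png")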
The second part of the question is why not build some JavaScript, pull up the PNGs, lay the text on top, and that's the great beginnings of a user interface, and we'd love to talk to you more about it.
One of the things we're trying
to focus on here is we think
there's going to be lots of UIs
out there, different ways
of people working with this
data and similar data.
Part of what we're focusing on
is making sure we've got the
metadata right, the transmission
method, we're
getting this stuff on the
net nice and solid.
So rather than spend all our
time doing Ruby on Rails
sitting on top of a MySQL
database that does this and
that, we think a lot of people
will do that level, and our
prime mission is making
sure that we get the
core data out there.
But we're also very interested
in how to build UIs on top.
MIKE TOTH: As far as the actual
image, whether it's PNG
or TIFF or JPEG, one of our
key considerations is the
archival availability
of this information.
It's got to be available a
thousand years hence, however
many years hence.
It's a very fragile object.
We don't know what its
future may be.
This may be the only history of
the object, and we've got
to preserve that for
future generations.
That's why we're actually going back to basic ASCII, with your basic flat file there, so that anyone in the future can put it into whatever they want; they don't have to worry about whatever version of the software it may be.
So it'll be available for
generations hence.
I'm told that this ends the
formal discussion but we're
available for another half
hour if you want to chat
informally so I thank
you very much.
