>> PAWLIGER: Good afternoon.
>> Hello.
>> PAWLIGER: Hi.
So my name is Marc Pawliger.
I'm engineering director for client applications.
And I'm here today with Sean Parent from Adobe
Systems.
Sean and I have known each other since '93,
'94 or something like that, long, long time,
when we worked together at Adobe on the Photoshop
Team.
And Sean has done a tremendous amount primarily
in and around Photoshop up until recent times
for the work that he's going to be talking
about today put amongst other things; automation
and UI layout and just a tremendous amount
of capability into Photoshop.
And of--I guess, the last couple of years
has moved on into more of the research side
of Adobe and has concentrated on areas primarily
centering around STL and Adobe editions to
them and also sort of looking at the whole
problem of user interfaces and their complexity
and behavior and why a lot of the software
that we have works maybe despite itself and
despite the way that it's built.
So with that, I'll turn it over to Sean.
>> PARENT: Okay.
So, is my mic on?
Yeah.
So, that was a nice introduction.
Thanks, Mark.
I've been at Adobe now for something like
13 years and about the last four years working
in either our advanced technology group or--now
I run our software technology lab which is
a small team trying to rethink how Adobe goes
about building software.
Now this talk came about because Bjarne Stroustrup,
who created C++, invited me to give a talk
down at Texas A&M and said, "You're in the
industry, why don't you come and give a talk
to the students about the future of software
development?"
And I said, "I don't know about the future
of software development.
I know about what I'm building which is hopefully
the future of software development.
So, I'll give a talk on a possible future
of software development."
For those of you who don't know much about
Adobe, a little background information.
We're a big and growing company about two
and a half billion dollars of revenue, not
growing nearly as fast as Google or as big
as Google, but we've got about 6,000 employees
worldwide, very distributed.
We do a number of products; some of these
you've probably heard of.
So, internally, we're structured loosely like
this.
It varies from business unit to business unit,
but we have product lines.
So, Photoshop is not just a product; it's
a product line.
There's Photoshop Elements, Photoshop Extended
Edition.
And we have product teams where we typically
have 20 developers working on a product team,
about 30 testers which surprises most people
more QE than engineering.
Typically one or two UI designers working
with the team, and then we have about 20 shared
technology groups which all have similar structure
to the product teams dealing with libraries
for vector graphics, typography, color, help,
localization, sure things like that.
So development process, we strive towards
an incremental development process but the
business we're in tends to push things into
more of a waterfall process.
There's a huge lag between the times that
you have to reserve press to print the manuals,
and the manuals have to be done between you--when
you can actually ship the products since we
do physical boxes, getting everything done
in the quantities we demand takes a manufacturing
ramp.
And that tends to push the development cycles
into more of a waterfall model where you sacrifice
quality early to get the features in, so you
can write the manual and ship later.
We ship mostly Macintosh and Windows, simultaneously,
English, French, German, Japanese, our tier
one languages.
And by the time we're done, usually about
three months after the first ship, we've covered
about 24 languages with the products.
So, if you're going to talk about the future
of software development, I thought I'd do
a little research on what do the analysts
say.
You've got lots of buzzwords.
I spent sometime going through analysts' reports.
You hear lots about Java, and C#, and XML,
and web services, and open source and some
of these things I agree are important and
some I don't think are very important.
Best practices and methodologies is listed
first up there because that's where most people
think the problems are.
I pulled some quotes from analysts that I
just thought were interesting.
The second one being the most interesting,
"Microsoft rolling out a new development process,"
this is Directions on Microsoft which is an
analyst publication that's usually very friendly
to Microsoft.
And the fact that they would go through the
trouble of putting together a whole development
process when analysts say they don't think
it will necessary help the company ship its
products on time or with fewer bugs.
The top one is talking about the problem of
security.
And the number of analysts who believe that
the solution to security is more and more
testing, which I think has very, very diminishing
return.
We already have 30 QE people to 20 engineers.
If you double that to 60 QE people testing
for security holes, you might find, you know,
another quarter or half of the security holes
out there.
So, you're not going to close the holes that
way.
Ultimately though, this is why I think the
status quo in software development fails.
This is a quote from Jon Bentley from his
book "Programming Pearls" published in '86.
John went around teaching classes on how to
develop software and gave a test to mostly
senior engineers at IBM in Bell Labs and asked
them to write the binary search problem.
Given a specification, write a correct binary
search.
Ninety percent of the programmers failed finding
bugs in their solutions and of the solutions
where they couldn't find bugs, many of them,
they still weren't convinced that they were
correct.
Okay.
To put it in perspective, this is a correct
solution to Jon Bentley's binary search problem.
Okay.
And they were not given just like a few minutes
to do this.
They were several hours to try to do this.
Photoshop uses the same problem and has, for
many, many years, as a take home test.
And more than 90% of the candidates just fail
miserably at coming up with anything remotely
resembling this piece of code.
I think the worst case is probably 48 pages
or something like that of code.
And once he figured out what the 48 pages
of code did, it was wrong.
>> Man.
>> PARENT: So, that's a problem.
Now Jon Bentley's own solution is considerably
more complicated than this solution, okay,
and considerably less efficient.
So, our experience at Adobe in teaching programmers
is that the same rule holds and you can take,
you know, PhDs from Stanford which comprises
a large section of Adobe's advanced technology
group, and 90% of the PhDs from Stanford can't
write a correct binary search.
So, how in the world do we ever ship a product?
Okay.
If you can't write that much code correctly,
how do you get Photoshop out the door?
How do you get Word out the door?
How do you get Google Desktop out the door?
Well, these are actual bug graphs from product
cycles inside of Adobe and this is a clue.
What's going on here is we're not solving
problems; we're approximating solutions.
And some of these curves, this one in particular,
I don't believe.
What happened here is there was a projected
ship date out here.
The little peak here is because QE stayed
over a weekend to finish their regression
tests while engineers went home so you had
a little peak which caused a panic, which
led to a nearly linear slope.
The linear slope is not a fixing bugs, it's
deferring bugs.
Okay?
So, we never hit a zero bug count, even pretended
to.
Here, you'll see we're still off zero.
But in reality, what's not shown here is the
number of bugs that were just plain deferred
off the bottom.
So, that's what our current programming methodologies
do.
They allow us to iterate and refine and approximate
a solution, but never actually come up with
a correct solution.
So, one thing my team is working on is studying
how do we go about writing correct software?
Okay.
If you--if you want to write it correct to
begin with, how would you go and do that?
And this leads us to generic programming,
where my team spends a lot our time.
The idea with generic programming is you start
with a concrete algorithm.
Okay.
You refine the algorithm, reducing it to its
minimal requirements.
Clusters of related requirements are known
as concepts, and we have a little more formal
definition that we'll get to in a minute.
Then you define the algorithm in terms of
the concept supporting maximum reuse.
So if you have an algorithm and you can then
prove that it's a correct algorithm, now what
you want to do is lift out that algorithm
so it's written in terms of its minimal requirements,
so that you can reuse it in the most number
of areas.
Now there's a parallel here between Mathematics
and generic programming.
Programming is Mathematics.
Okay.
In generic programming, we have what we call
Semantic Requirements.
In Mathematics, they're called Axioms.
Okay?
What we call a concept in generic programming
is an Algebraic structure like a ring or a
monoid in Mathematics.
Models, we use the same terminology.
In computer science and in generic programming
what we call--what we call an algorithm is
just a theorem.
Okay?
So an algorithm is written in terms of concept
just as a theorem is written in terms of an
algebraic structure.
Both rely on axioms for the correct definition.
So a regular function, a function with no
side effects, corresponds to just a mathematical
function.
And the one thing we have as computer scientists
that mathematicians don't have to worry about
is complexity, right?
We have to worry about, how long does it actually
take to compute the answer here?
Now, we can refine a concept just as mathematicians
can refine an algebraic structure.
So a mathematician would that a monoid is
a semi-group with an identity element, if
you ever took any abstract algebra.
Okay.
That's an equivalent statement to saying--roughly
equivalent--a bidirectional iterator is a
forward iterator with constant complexity
decrement.
Okay.
So you can take a concept and add additional
axioms, semantics requirements, okay, in order
to refine it and that will give you something
which is, in some sense, more powerful in
that you should have algorithms that you can
execute more quickly on it.
More algorithms, but you'll have less models.
Okay?
Less types that you can actually use with
this thing.
We also refine algorithms, which is something
mathematicians tend not to do.
Mathematicians typically do not refine a theorem.
A refined algorithm is saying if you have
a concept which is itself refined, then you
can perform the same operation that you could
on the unrefined concept on the refined concept
but potentially, you can do it with better
complexity or reduced space requirements.
Okay.
So, in C++, this is the way concepts appear
in the existing C++ standard.
They are tables that appear in the standard
itself.
They are not expressed in code.
And, in fact, the tables are very, very weak
on their semantic requirements on this end.
So, my group takes it a bit more seriously
and applies much stronger semantic requirement
to what each of the axioms are.
So, we're here by back up.
We defined the notion of copy to say that
T is equivalent to U.
The reason why the standard uses equivalence
instead of equals is because it doesn't require
that equal be defined in order to define a
copy.
Okay.
But if you want to define what it means to
have a copy, then you have to require a quality.
Now, concepts enable equational reasoning
about your code just as axioms and algebraic
structures enable equational reasoning about
Mathematics.
Through concepts, you can prove algorithm
correctness.
Okay.
And you can then reason about your system
in its entirety.
Generic programming often gets confused with,
you know, angle bracket templates in C++.
Generic programming is not a programming style
as object-oriented programming is a programming
style.
Generic Programming is a way to reason about
code.
I can reason about code through axioms defined
through generic programming even if the program
is written in an object-oriented style.
Okay?
The same axioms hold, so the axioms have to
exist and the theorems have to be correct
if the system functions correctly and works.
And where it doesn't, it's because something's
been violated.
Okay.
So you can use the same ideas from generic
programming to--programming to analyze an
existing piece of code as well as write a
correct piece of code.
Now, in C++, it's pretty horrible right now.
So, we can't do generic programming in the
language.
So, what we do is we approximate it with templates
which are really just macros, and we have
comments.
So, when I say "type name I" or an STL that
might "type name input iterator" or, you know,
"random access iterator," in here, I say,
"I models random access iterator" just to
be a little more explicit.
Random access iterator is naming the concept
that you can formally define.
Regular is naming the concept that you can
formally define.
In fact, that was the definition of regular
a couple of slides back.
Okay.
So in C++0x, if anybody's been following the
progression of the language, much of what's
written here--up here in comments, you can
now write in code in your C++.
So, this is a big part of what my team is
working on.
Now, this brings us to a small pause in the
talk.
Okay.
So, great.
So, we can write sort.
We can prove sort correct and we can write
reverse.
And maybe we can write some nice image processing
algorithms going this way.
So, how do you get from there?
How do you get from forward iterator to Photoshop?
It just seems like this very large chasm.
Okay.
How do you get from STL to Microsoft Word?
How do you span that gap?
So, we're going to change gears a little bit
here.
I have a conjecture, which is that all systems
eventually grow and turn into some kind of
network problem.
Okay.
So if you have any problem of scale, okay,
what that usually means by a problem of scale
is that you have to be able to reason about
the system as a whole by being able to reason
about it locally, okay?
Such systems become graphs with sets of rules
that apply to the graphs and those describe
an awful lot of the systems that we build.
So if you go and you look at Photoshop, what
you'll find internally is it's become this
mass of network of objects.
These objects are all interconnected.
They form some giant, implicit data structure.
There is messaging between the objects.
So those messages do something, hopefully,
that's an algorithm.
Okay.
We use tools like design tool or like design
patterns, Scott Meyers', you know, 50 rules
writing the C++ program to come up to come
with local rules that, if we apply them, hopefully,
we get a correct system out the back.
And then the--once we assemble all these objects,
we iteratively refine and refine and refine
and refine until we can ship the product,
until it's good enough.
So, take a moment and define a generic algorithm
for solving a large system.
If you're--if you're just, you know, building
word, how are you going to go do it?
Well, identify the components and how they
connect.
Okay.
This is my generic algorithm.
Organize the system into a directed acyclic
graph.
Okay.
Everybody know what a DAG is?
Hopefully engineers here.
So, the reason why a DAG is important, okay,
if you have--a DAG is just a directed graph
with no cycles.
So if you have a cycle in your system, it
makes reasoning about the correctness of that
system very difficult.
Okay?
You can no longer reason about the correctness
of the system with only local knowledge.
You have to have complete knowledge, okay,
and know what the effect of that data feeding
back on itself is.
So, you want to define your system in terms
of a DIG.
Now, of course, you're going to have loops.
You're going have things that do iterate.
And what you want to do is encapsulate those,
figure out where the loops in your graph are,
and figure out what the algorithm is that's
done by that loop and write that algorithm
and make that algorithm be just a node in
your graph.
It becomes just processing node in your graph.
And then, ensure that for all possible states
in the system, you know--this DAG is going
to be changing, things are conditional, stuff
coming on and off--that cycles don't pop up,
okay, that this thing stays a DAG.
And then make it generic which means once
you have this for Word, figure out how you
describe similar structures to build other
applications.
Now, this all sounds very complicated but
the idea actually, for me, came from this.
One of the things I worked on at Adobe was
a layout engine which is used in most of the
Adobe products at this point to lay out user
interfaces.
And it's a constraint system of sorts but
it always forms DAGs.
And I had this realization one day that I
could solve a spreadsheet inside of the layout
system.
And that I could turn the layout system into
a spreadsheet if I wanted to.
And this is why; they both form DAGs.
So, this is just a graphical representation
of this.
We're just summing up three numbers while
I can draw it like this.
Okay?
I've got three cells feeding into the plus
operator and coming out with one result.
When you start to talk about constraint systems
and declarative programming, which is what
we're talking about here, people have a panic
attack mostly because in school they might
have been exposed to the wonders of Prolog.
Prolog is a rule-based system.
In some sense, a spreadsheet is a rule-based
system.
But Prolog's was striving to be a Turing complete
system, a complete programming language where
you could write anything.
You can sort in Prolog.
You really don't want to sort in Prolog, okay?
So, sorting is an iterative loop.
If you have a rule-based system that's iterating,
it becomes very difficult to reason about.
So, what we're talking about here is more
the long--along the lines of something like
HTML or a spreadsheet or SQL or Lex & YACC.
Those are also declarative rule-based system
of sorts, okay, that strive not to be Turing
complete.
Some of them are accidentally, but in general,
you stay from the scary edges.
So part of what my team has been working on
is something called the property model library,
which is to apply these ideas to a real system.
And this is the domain we went after.
It's a piece of a user interface.
So, this is the dataflow of a small section
of an actual user interface in the product
that Marc Pawliger is very, very familiar
with.
The symbology here are basically these four
items in the middle are widgets, UI widgets.
You can think about them that way.
So, the top two editing numbers, the bottom
two are check boxes, just Booleans.
And we have some code that sets up the dialogue,
so it populates these widgets with stuff.
And then, we have a bunch of event handlers,
so if I enter a value into one thing, then
it percolates and changes the values in the
UI.
And I literally went through the code and
drew a connection, drew an arrow anytime one
piece of code was connected to another piece
of code; data could flow from there to there.
And the complete section of this dialogue
has 14 widgets, I think, in the middle.
I stopped after four and said, "Screw it,
it's just a big ball of spaghetti."
I did add this one box down here, which is
script validation.
So what's happening here is, you know, we
have a UI.
This is setting an image size for a particular
width and height.
And you can also drive it without a UI attached
by executing a script and all the validation
logic that's contained in the spaghetti code
of event handlers has to be replicated into
the script validation code so that you make
sure that, you know, what ever I record up
here, I can play back without the UI and not
end up with an error.
So, in fact, if you look at Adobe's code base,
a third of the code in our products ends up
being event handling code of this sort, part
of the big ball of spaghetti, and if you survey
our bug database, you'll find that half the
bugs during the new development cycle are
in this body of code.
Now, when I first saw this, I thought cherry
picking.
Go after the--this problem, I come up with
a solution and I'm the hero.
I eliminated a third of the code and half
the bugs.
What I didn't realize was this is indicative
that it's a very hard problem.
So--and here's why.
If writing a correct algorithm itself is difficult,
then writing a correct algorithm when it's
implicit, okay, when you don't actually have
the algorithm as loop in your--on one page,
what you've got is a message to another object
that eventually loops back to you or not and
algorithm is buried in there, gets to be almost
impossible.
So, start to break down this problem.
How would we define it?
Well, you have to start and say, "What is
this thing?"
You know, what's this event handling logic?
How can we even define this?
And we backed all the way up and said we don't
even have a good definition of what it means
to have a user interface.
So, we came up with these as working definitions
for what it means to have a user interface.
And it's just a system designed to assist
a user in selecting a function and providing
a valid set of parameters to that function.
So, what distinguishes a UI and a product
from, say, consuming a movie is the fact that
the user is trying to do something.
He's selecting a function and then executing
it.
Now, a graphical user interface is just a
user interface that's visual and interactive.
A big part of the code we're talking about
here is not the code that deals with selecting
the function, right?
You pick the menu item, you've got the dialogue
box or you're on the palette and the function
is implied that's going to set the set of
properties on your document.
But what does it mean to assist the user in
providing a valid set of arguments to that
function?
So, we have two loaded terms there.
One is valid, you know, we know what arguments
to a function are.
What does it mean that they're valid?
And what does it mean to assist?
So, valid is pretty easy.
It's just a predicate that corresponds to
the preconditions of the function.
Assist, there are many, many, many, many probably
an infinite number of ways that you can assist
a user but if you go and look at the applications,
we have general patterns and you can start
to categories these patterns.
So, here are four that we're dealing explicitly
with.
One is validation, which is given that you
have said predicate to know whether or not
your preconditions are satisfied, you can
simply--validation is simply rejecting that
input, saying no, you can't do that.
Okay. correction is the idea that, let's say
I have a valid set of input and now I poke
a value and now the system is invalid, how
do I get it back to a valid state?
That's correction.
So, prediction is letting the user specify
the values for the function in terms of the
desired result.
Okay.
So, if I want to take a file and split it--split
it into 4k chunks, then depending on the size
of the file, I'll get some number files out
the back so the user could specify the file
that they want and how many pieces they want
out the back and you can calculate how big
the chunk needs to be.
So, prediction is specifying the parameters
for the function in terms of the desired result.
Then, related values.
So, if a user is specifying, we'll use our
example here, the size of a document in pixels
which might be the only thing the application
cares about, that might not be a natural user
interface.
There's a related value there of inches, which
is related to pixels through resolution.
So, you end up with related pixels or related
values through which you can define the arguments
for the function.
So, let's make this a little more concrete.
Flip to a demo here.
Are there questions at this point?
People following?
Think I'm nuts?
Okay.
So, this is a little picture of the UI that
we had the big ball of spaghetti about before,
and this is fully functioning here.
Okay.
So, we can change values, their constraining
proportions to the original document within
height just up there.
We can toggle our display between percent
and pixels.
We can unconstraint and change the value.
And then we can re-constrain and the other
one updates appropriately, and if you put
this in percent, they'll always be equal.
So, that's the basic idea for this UI.
Over there on the left, you see the complete
description for the underlying model for that.
And we also have a description through our
layout language of what the visual portion
of the UI looks like.
So, this is--looks very much like a box model,
the solver is a little more general; it automatically
does things line up colons and has a little
more smarts underneath it, but things are
described in terms of a basic box model.
And you'll see these bind statements against
things.
You know, there's bind here, bind here.
The bind statements are what connect the UI
to the model.
So, the way most people tend to write an application
is the application logic reaches up into the
UI and mucks with it.
And if your application is doing that, your
application knows all about the UI; you can't
have an alternate UI on it, you can't re-share
that logic for scripting.
So here, what we've done is we've factored
out the description of the model which is
how these properties are connected together
independently from what UI is bound to it.
And we can demo that by opening up another
UI.
So, this is another UI that we just attached
to the same model.
We can see the original document with the
height down there and you'll see these are
actually attached to same instance.
So, if I change the UI or if I change the
values up above, they're reflected down below
and it's running non-modal so I can change
it here and it will change up above and un-constrain
and change and re-constrain.
Okay.
So, two alternate UIs.
Or I can do a third UI and here, I wanted
to represent my percentages, sliders.
So, I've got two sliders and it will rule
through the numbers and all the other views.
And it's pretty traditional model view controller
stuff that you probably could have seen demoed
on small talk systems at Xerox many years
ago but somehow, we've forgotten how to do
this.
What we've forgotten how to do is define models.
Okay?
So, that's a basic idea of what we're talking
about here.
Let's flip back.
Now, when I came up with that piece of code,
I went and found one of the--I think one of
the best engineers at Adobe, a guy named Mark
Hamburg.
He was chief architect on Photoshop for many
years.
He's responsible for Photoshop Lightroom which
is recently shipping.
And I said, "Hey Mark, this little mini image
size is just a portion of the Photoshop image
size.
It's just got a couple of connections fully
specified.
I'll tell you exactly what the specification
for this dialogue is."
Write the code in the language of your choice
and the framework of your choice.
I don't care about how your UI is laid out.
I just want the logic for this dialogue.
This, you can zoom in on it if you want or
get really close to the screen, is his resulting
code.
It's written in Objective C to Cocoa on the
Mac only.
The gray areas here roughly correspond to
the logic that's in the model, and the rest
of the code is code that, given a change in
the model, reaches up into the UI and manipulates
it.
It's all the event handlers going on there.
>> This isn't the full program, right?
The full program would also include an .nib
file.
>> PARENT: Yeah, there is no .nib file here.
There's no description of the UI.
This is just the logic for it.
>> Is this showing you're not using any bindings
[INDISTINCT]
>> PARENT: What's that?
>> Since it's normally not using Apple's binding...
>> PARENT: Oh, no.
It is--there is a .nib file and there are
bindings.
So...
>> It's just not showing it?
>> PARENT: This is just not showing it.
Okay.
>> If there are any complications using bindings.
>> PARENT: Yeah, yeah.
This is not like trying to recreate the world.
This is as small and tight as he could get
it.
Okay?
This is my solution in comparison.
>> Now, excuse me.
>> PARENT: Okay.
Yeah?
>> A huge chunk of code is [INDISTINCT] to
interpret it.
>> PARENT: There's--yeah, so the amount of
code to interpret that is 2,000 lines.
Hopefully, the next time I get around to rewriting
it will be about 1,500 lines.
In comparison, the code for the complete version
of image size in Photoshop is 6,000 lines.
So, my complete system for the UI layout engine,
the solver for the models and both the parsers
is smaller than 6,000 lines.
Okay?
So, this is the resulting structure that,
you know, if I were to draw that, this is
what we end up drawing.
Okay.
So, here, the notation here is a solid arrow
with a solid point is a fixed connection.
Okay?
That's a directive connection.
The soft arrows here can go either way, they
can flow around.
The doted line here, this is a conditional
connection, so there's a predicate attached
to that which can toggle it on or off.
And that describes the complete system.
So, a little more about the structure, it's
always expressed as a bipartite graph.
It's very similar to spreadsheets except directionality
can be determined at runtime, so you can kind
of flip things the way they're going.
Data is flowed from higher priority cells
out towards dependent lower priority cells.
So, there's a link reversal algorithm that's
employed to reserve soft cycles which there
are systems where if I tried to pin two values,
it becomes over-constrained.
What if I only pinned one?
Then it's not over-constrained, so that's
a soft cycle.
Hard cycles are very easy to detect.
So, we just detect and reject them.
And soft cycles, we just solve them.
Now, the full notation here, we have sources
which just have N output, syncs just have
a single input.
A cell can have zero or one input, meaning
it can be derived or self can be--can be a
source and the remaining are outputs on it.
And then, relationships have N-inputs and
one output always.
And a conditional relationship is the same
with just a predicate on it.
So, those are the basic computational structures
for the system.
Now, we're attacking 30% of the problem.
We're getting pretty good traction right now.
Last time I looked, we had--this is going
in as part of--part of future of Photoshop
and several other apps.
And last time I looked, we've got 20, maybe
25 dialogues re-expressed this way and all
the thousands of lines of code underneath
them ripped out.
So, it's--took a long time to get traction
but now that I have traction, the roll is
actually going pretty quickly.
My estimate is that 85% of the existing code
base can be replaced with small declarative
descriptions and a small library of generic
algorithms if you go and look at a product.
Now, that's not these small declarative descriptions.
There are other problems that manifest themselves
in the applications and they need to be solved.
But that's my gut feeling looking at the application
code.
In fact, if you would look at all of it, I
think we're about two or just a magnitude
off from the minimal expression for any large
application which Photoshop is roughly 3 million
lines of code last time I counted, which means
I think you ought to be able to do express
all of Photoshop in about 300,000 lines of
code.
Am I off there?
Sorry, 30,000 lines of code.
I'll let you guys read the rest.
I'll pretty much close it in.
So, if you guys want more information, I've
got some links up here.
We've got our open source site, all the example
here, the example application that I'm running
this through, and I can give examples all
day of the system.
It's available on opensource.adobe.com.
You can download it; all the code is there.
Stepanovpapers.com, Alex Stepanov, if people
know who he is, the guy who created STL.
He works for me on my team and we have a website
just for his collected works.
One of the things we're working on is a book,
kind of bottom to top here.
These are the drafts of the book.
The first failed draft, second failed draft
and the draft we're on, so we will get this
thing out one of these days.
But please read them, give us feedback.
We're specifically most interested in feedback
on the talk which is our current attempt.
This is the book that I think Alex should
have written when he release STL, well, almost
15 years ago now.
People look at STL and think it's a nice container
and algorithm library and not it's a different
way to think about code.
So the point of this book is to explain what
means to do generic programming, not to teach
you about how to use STL.
That's it.
Adobe's tag line.
Question.
>> What concept that here's long programming
that [INDISTINCT] address much here and also
that currency?
I mean, does it work in [INDISTINCT]?
Are there any problems?
>> PARENT: So...
>> Repeat the question or use the microphone?
>> PARENT: Sure.
I can--I can repeat the question.
Yeah, so the problem of concurrency, we've
done a little work on concurrency usually
when we're stuck on something else.
Our two side projects have been--have been
threading issues and Unicode issues.
I think the same basic techniques apply to
concurrency meaning that successful systems
that I've seen that are concurrent are written
basically in one of two ways, either in a
structure similar to a Unix pipe system which
is basically flowing data out into a dag and
at each node, you have a cue and the cue provides
enough elasticity at the join so that you
can wait until you have enough data to progress
along the dag.
The other systems are ones where they are
algorithmically parallel where I can do, you
know, a reduction algorithm, things of that
nature.
So if you look on kind of the algorithm parallelization
side, there's a lot of work that's been done
on, you know, there's the staple project at
take--Texas A&M which is parallelizing STL
algorithms.
I'm drawing blank on an example but there's
many out there, especially in the domains
of linear algebra, image processing and we
have a fair amount of work going on at Adobe
in those domains.
On the how do you structure an application
into a sequence of processes, you know, basically
even though maybe they're threaded, they're
not full processes but how do you go about
building an application like that and describing
the structure much as you would, say, on the
command line of a Unix shell to stream together
a bunch of--a bunch of Unix tools.
How would you build complete applications,
piping a bunch of components together?
And that's where my team is most interested,
and I think there's some good work to be done
there.
So I don't know if we'll ever get to it seriously
but that's my thought on it currently.
>> I see you're trying to mold STL as the
generic code right now, [INDISTINCT] ask you
then.
You didn't actually use any [INDISTINCT] to
your solution.
I'm just wondering if there's a reason for
that.
>> PARENT: Yeah, the code under the hood certainly
uses templates and the generic programming
language itself is.
>> [INDISTINCT] language in terms of a bunch
of [INDISTINCT] things, the C++.
>> PARENT: Oh, do it as like a metaprogramming
language?
It's not part of our goals.
One of the guys at Adobe has done that for
the layout engine.
There's a--there's a domain specific language
done as metaprogramming in C++.
Our goal is to get the mathematics correct
to describe what the computational structure
of these systems are.
And give a generic description of what it
means to solve one of these systems; not to
do C++ metaprogramming.
>> Well, if you recall, the [INDISTINCT] if
you--like are you trying to convince people
that C++ is a good generic programming language
if you're not even using that in your model
library?
>> PARENT: Well, all the libraries are implemented
in C++.
Do I think C++ is a good generic programming
language?
I think C++ is the closest approximation that
we have to a good generic programming language
and getting better.
So, you know, Alex wrote STL in the Ada long
before he wrote it in C++.
And it's been written in--portions of it in
Scheme and Java and C and we still tend to
product type codes, sometimes in Scheme.
So, it's the ideas of generic programming,
supersede what language you express them in.
You know, typically, what we do is we work
out what's the generic description of this.
What are the axioms and what is the algorithms?
And then you go through the painful process
of how do you map that into C++, you know.
What I want is to just be able to write a
type function and what I have to do is write
traits classes and partial specializations
to get them to dispatch in C++ because I can't
just write a function in the natural terminology
that manipulates types.
So, yeah.
So--and you had one more point there which
was--which is are we trying to convince people
of this?
I've got nothing to sell.
I mean, we're a very small team.
Everything we do is open source.
The reason why we open sourced it is because
it's interesting problems and I'm interested
in collaborating with other people.
We do a lot of work with Indiana University
and Texas A&M.
I see a general trend in this direction.
I think eventually, this approach wins out,
you know.
By this approach, I just mean a mathematical
approach to programming.
I don't think we can continue to treat programming
as an approximation.
Our systems are just getting too big and too
complex.
And without perfect approximation, it just
means more and more and more failures.
So...
>> What kind of bug rates have you seen with
this approach and are the--are the bugs easier
to solve or are they harder?
>> PARENT: So, we're seeing pretty much zero
bugs so far.
The ones you have, your description out there.
The challenge is getting the model written
in the first place and the system right now
does a fair amount of checking, not quite
as much as I would like, but it does enough
that it tends to bitch at you.
So, the whole reason for coming up with a
way to draw the picture, and eventually we're
going to wire that in with graphics and use
dots so you can--the system can automatically
draw the pictures for you, is that--is that,
you know, if the system bitches and says there's
a cycle here because you over-constrained
the system to finding 200 relationships; you
need some way to figure out well, what did
I do wrong and how do I get it right?
So, there tends to be more thought process
up front and it's more difficult to get kind
of anything working.
You know, once you get it working, you're
done.
Here it comes.
>> So, back to the question whether C++ is
a good language for generic programming.
>> PARENT: Yeah?
>> You talked about the inadequacies in C++'s
support for concepts and especially about
the goal of expressing concepts at run time
not just at compile time.
What is missing from Haskell-type classes
that would make them useful as a representation
of concepts?
>> PARENT: I would go and read--let's see,
what is that?
Doug Injako and that crew did a survey paper
not too long ago that covers that in far more
depth than I could give it justice here because
I'm not much of a Haskell programmer.
My opinion is I would program in Haskell if
I had a reason or some--you know, whatever
language worked the best if I had a path to
deploy that in products.
To fund my team, I have to ship software.
And right now, that means I ship in C++ so
I'm not a religious zealot when it comes to
C++.
I do think it's the best language out there
for concept-based programming but...
>> So when you're making UIs, like, as a programmer,
you sort of already naturally think of it
as a single when you're designing, you know,
like it's [INDISTINCT]
>> PARENT: Correct.
>> So [INDISTINCT] to you and your newer model
language for describing it.
But have you tried complying it to, like,
[INDISTINCT] that aren't already compliant?
>> PARENT: We've done somewhat wacky things
with it.
We've used the modeling system to describe
overload sets for functions in C++.
So you described the relations on a set of
parameters and the system will go through
and iterate what all the possible overloads
for that function are that have any meaning.
We've used it to, well, the layout engine
which I didn't talk too much about here.
We've used the layout engine which usually
does 2D UI layouts to describe 3D worlds and
the same thing that connects baselines, made
sure that the you had paths between rooms
that, you know, you didn't end up with completely
walled in spaces.
The solver here has been used for doing dependency
tracking for installers.
What else?
Somebody at Apple used it for doing music
mixing.
And the graph solver itself is fairly independent
of what the equation is on the node; it doesn't--it
doesn't care what the function is on the little
circles, just that things are connected.
So you can plug whatever you want in there.
Some people have used it for music processing.
You know, how far--how far it goes?
I don't know.
I mean, we're-- from the--from the UI standpoint,
we solved a good chunk of the problem right
now but we still have little pieces that are
hanging out that we're focused on, not figuring
out more domains for it.
>> Well, does this approach lend itself to
maintainability?
I mean, you said the main problem is coming
up with the model
>> PARENT: Yeah.
>> Now when we want something else or changes
a little bit, do you already think about that?
>> PARENT: Yeah, it lends itself pretty well.
The interesting thing is developers have a
hard time changing the way they think.
Okay?
So what tends to happen is frequently, the
request comes in and says, well, the UI designer
says they want it to work this way.
And that might just might be a view behavior
on the outside that has nothing to do with
the model, the model itself doesn't change.
And getting developers in the mindset where
if it does come and impact the model, I mean,
this is a model of parameters to some function
and you--and the goal is to try to make the
model as complete as possible and that gives
the UI designers much flexibility as they
want to put a big UI on it or a little UI
on it.
You know, you don't have to expose everything;
you can expose whatever sets that you want
so long as the model is consistent and correct.
So the system tends to be fairly easy to maintain.
What tends to be hard is recasting the problem
into what is--what is the relationship here
that is being asked for.
How do I express that?
And since what I'm modeling is typically a
set of parameters to a function or a set of
properties in a document, going from that
relationship to say, well, does that imply
about what the function has to do?
These things just don't exist externally.
Okay?
So you have to, you know--I'm going to go
change the behavior of this function and what
it does and then I'm going to go express that
in my model and that's--it's just a different
way to think about software.
I mean, my experience so far has been the
more years somebody has spent doing object-oriented
programming, you know, myself included, you
know, over decades worth of object-oriented
programming, the more trouble they have expressing
things in terms of the relationships within
the model as opposed to how events are flowing
through the system.
So you know, one typical example, people will
say, well, I want this checkbox--when this
checkbox is checked, I want this thing to
be disabled.
Okay?
Well, what it means, when something is disabled
in the system is that it's disconnected from
the output, that no matter what value you
put into that cell in the model, it could
have no effect on the result.
Okay?
So there isn't any way to say disable this
thing when I click this checkbox.
What's implied by that request is that there's
some relationship to that checkbox that tends
to disconnect that cell from the output.
What is that relationship?
Write that down and the UI will just work.
Okay?
So, are there questions?
I've got, I don't know, three minutes.
I can give more demos or ask--answer a question
or...
>> Have you had any interests in your QA team
using a model as sort of a [INDISTINCT] that
they can work off?
>> PARENT: Yeah, the QA team likes the models
because it gives them, you know, a very concise
spec of what's going on.
The other thing is we have a set of just command
line tools that you can check the models independent
of the app.
And it's also nice.
You can, you know, we've got a little sample
app here, so you can bring up a model, you
know, they'll function attached on the back
end so I can bring up, like, the complete
image size from Photoshop with all the 14
widgets and all the connections between them
and fully test the thing without launching
Photoshop.
Okay?
And from a UI designer standpoint, it means
that I can play with my UI and say, oh, I
want three panels instead of two panels and
I want a reveally widget and I want a slider
here and never have to touch the code.
So they like that.
The one thing that--where we do have tension
with QE is QE is used to working in kind of
a puppet string world where, you know, a lot
of the automation test tools will, like, reach
into the UI and say what's the state of this
checkbox?
And in our system, our checkbox don't have
any real, intrinsic states, they're simply
a view of the underlying model.
So you can ask the model what's the state
of this cell but it's--it doesn't make any
sense to ask the checkbox what its state is.
And our automation QE team has a hard time
getting past that hurdle.
They're like, well, you know, what if the
model is correct but the connection to the
UI is wrong?
How do we test that?
So I think that's just friction that will
fold out over time.
>> The--90% of the computer scientists that
[INDISTINCT] how do they fit into this new
model with a program?
You think this is going to be easier to teach
them or you think it's not as important as
you [INDISTINCT]
>> PARENT: Yeah, so I think eventually you
end up with a shift in what it means, you
know, for people to be programmers.
I think assembling collections of, say, generic
algorithms is an easier task than writing
the generic algorithms to begin with.
I think even, you know, constructing these
models for a system is somewhat easier.
I can show you.
It's a hundred lines for the image size in
Photoshop and it's way easier to write that
hundred lines than it is to write the 6,000
lines of code in Photoshop, and Mark can probably
attest to it.
And the 6,000 lines is also wrong and I can
prove to you it's wrong.
>> I didn't write that code.
>> PARENT: I wrote one version of it.
So, I do think it makes it easier, but I think
what you want is a shift.
I mean, at this--at this year, at this point
in history, I think it would be great, you
know, if you could get a PhD by trimming a
couple of cycles out of one of the STL algorithms.
That should be, you know, get you a PhD.
Coming up with a new algorithm that's generally
useful certainly should be worth a PhD, and
yet, right now, very few people--you know,
we don't have a massive library of this stuff.
Very small libraries and everything is very
primitive.
And my team spent the last week and a half
working on a formal definition of integers
because math--the mathematic definition of
integers doesn't hold-up real well when you
integers of a fixed range.
Right?
We can define Z but we can't define what Z
over range of zero to N is.
Because N is Z if you...
>> [INDISTINCT]
>> PARENT: What's that?
>> [INDISTINCT]
>> PARENT: Z sub N which doesn't quite capture,
say, signed arithmetic.
It works pretty good for twos complement unsigned.
So, also, Pno axioms which is how mathematics
are defined are in terms of successor.
So, you start with zero over one and then
you add you one and you can build up all of
mathematics with just successor with basic
increment.
From a computability standpoint, that's very
unsatisfactory.
Right?
We don't--we don't add two numbers by one,
two, three, four, okay?
It just doesn't work that way.
So, coming up with kind of re-casting mathematics
into axioms that apply to computer science
is a part of the work and, you know, eventually,
these should be textbooks and, you know, we
need our Oilers and our Euclids and those
folks.
Any more questions?
Is it on?
>> Yeah, this is back on.
So, thank you, Sean.
