- Hello, I'm Jean-Louis
Leroy from Bloomberg.
Welcome to my talk on open
methods and my library, yomm2.
Since we have only half an hour,
I am going to take questions,
but at the end, please.
So, let's start with
a typical case study.
You want to create a representation
of an arithmetic
expression and evaluate it,
you want to create an
AST and manipulate it.
So, typically, you can
write an abstract class node
that has a pure virtual function value,
that returns the value of the node,
and you derive several concrete
classes from the node class
and override the value function.
Here is Number, which
simply returns the value
that is stored in the number.
And you have Plus, Times,
and so on and so forth.
This is a very very common
design, very reasonable.
And then, you can now build trees
and evaluate them by calling
the value function on an expression.
And, it works nicely.
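The slides are not part of this transcript, so here is a minimal sketch of the design just described. The class names Node, Number, Plus and Times and the function value come from the talk; the member names (val, left, right), the NodePtr alias, the use of int and the sampleExpression helper are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Abstract base: every node knows how to compute its own value.
struct Node {
    virtual ~Node() = default;
    virtual int value() const = 0;
};

using NodePtr = std::shared_ptr<const Node>;

struct Number : Node {
    explicit Number(int v) : val(v) {}
    int value() const override { return val; }
    int val;
};

struct Plus : Node {
    Plus(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);  // both operands are required
    }
    int value() const override { return left->value() + right->value(); }
    NodePtr left, right;
};

struct Times : Node {
    Times(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    int value() const override { return left->value() * right->value(); }
    NodePtr left, right;
};

// Hypothetical helper: builds the tree for 2 + 3 * 4.
inline NodePtr sampleExpression() {
    return std::make_shared<Plus>(
        std::make_shared<Number>(2),
        std::make_shared<Times>(std::make_shared<Number>(3),
                                std::make_shared<Number>(4)));
}
```

Calling sampleExpression()->value() walks the tree through the virtual calls and evaluates the whole expression.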
Now, let's suppose that you
also want to render the tree.
For example, in Reverse Polish Notation,
like in the Forth language
or HP calculators, OK.
So, we want to write something,
call a function, or what not,
that is going to render
the tree as a string,
and then we print the value.
So, how can you do that?
Well, there are several options.
The most natural in C++ is to, simply,
plant a new virtual function
in the Node base class.
Let's call it toRPN,
and override it in the subclasses.
So, that is going to work well in a small,
self-contained application.
But if you start reusing your hierarchy
in several applications,
you're going to run into a problem.
And someone has described it as this.
The problem with OOP is
that you wanted a banana,
but you also got the gorilla,
and then the entire jungle comes with it.
And, indeed,
imagine that you now use
this hierarchy
in an application that
does not render strings
as Reverse Polish Notation,
maybe as infix notation.
The toRPN function is still
going to get pulled in,
because that's the way virtual
functions are implemented,
tables of pointers to functions.
And even worse, since
toRPN returns a string,
you're going to get the
string class as well.
And maybe your application
doesn't need to render the tree,
maybe it doesn't even
need the string class.
So, this is quite a big problem with OOP.
Then there are alternatives.
For example, you could use a type switch.
You know, you use dynamic_cast
to try to cast the Node base class
into all the possible concrete subtypes.
And you do what's right,
depending on the result.
The problem with this is that
it's not very maintainable.
Each time you add a new subtype to Node,
you have to update the type switch.
And it really hinders possibilities
of reusing the hierarchy
across different applications,
because applications may
add their own Node subtypes.
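A type switch of the kind described here might look like the following sketch. The hierarchy is reduced to what the example needs, and the member names, the NodePtr alias and the choice to throw on an unknown subtype are assumptions, not something shown in the talk.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Node { virtual ~Node() = default; };
using NodePtr = std::shared_ptr<const Node>;

struct Number : Node {
    explicit Number(int v) : val(v) {}
    int val;
};
struct Plus : Node {
    Plus(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};
struct Times : Node {
    Times(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};

// Try each concrete subtype in turn; every new subtype needs a new branch.
std::string toRPN(const Node& node) {
    if (auto number = dynamic_cast<const Number*>(&node))
        return std::to_string(number->val);
    if (auto plus = dynamic_cast<const Plus*>(&node))
        return toRPN(*plus->left) + " " + toRPN(*plus->right) + " +";
    if (auto times = dynamic_cast<const Times*>(&node))
        return toRPN(*times->left) + " " + toRPN(*times->right) + " *";
    throw std::logic_error("toRPN: unknown Node subtype");
}
```

The final throw is where the maintenance problem shows up: a subtype added by another application compiles fine and only fails when it reaches this function at runtime.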
Then we can use patterns like Visitor.
So, if, as a library
designer, you design your classes,
you know, the hierarchy,
so that it can be
used in many contexts,
you may provide a Visitor.
I am not going to go into the details.
You all know what that is.
There are many ways of
implementing Visitors,
whether it recurses by itself or not.
So, after quite a lot of work,
now the user of your library
can subclass Visitor
and implement rendering as an RPN string.
So, now we can add behavior,
a user of your classes can
add behavior to the hierarchy.
But we, we have the same
problem as with the type switch.
What if someone adds new subclasses?
So, once again, this is
not really extensible,
and it's very ugly, in my opinion.
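For readers who want the Visitor variant spelled out, here is one possible sketch, in which the visitor itself drives the recursion. The names Visitor, RPNVisitor, accept, visit and binary are illustrative assumptions; the talk does not show this code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Number;
struct Plus;
struct Times;

// The library author must list every concrete node type here,
// which is why new subclasses cannot be added from the outside.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Number&) = 0;
    virtual void visit(const Plus&) = 0;
    virtual void visit(const Times&) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
};
using NodePtr = std::shared_ptr<const Node>;

struct Number : Node {
    explicit Number(int v) : val(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }
    int val;
};
struct Plus : Node {
    Plus(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    void accept(Visitor& v) const override { v.visit(*this); }
    NodePtr left, right;
};
struct Times : Node {
    Times(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    void accept(Visitor& v) const override { v.visit(*this); }
    NodePtr left, right;
};

// Written by the user of the library, outside the hierarchy.
struct RPNVisitor : Visitor {
    void visit(const Number& n) override { result += std::to_string(n.val); }
    void visit(const Plus& p) override { binary(*p.left, *p.right, '+'); }
    void visit(const Times& t) override { binary(*t.left, *t.right, '*'); }
    void binary(const Node& l, const Node& r, char op) {
        l.accept(*this);
        result += ' ';
        r.accept(*this);
        result += ' ';
        result += op;
    }
    std::string result;
};

std::string toRPN(const Node& node) {
    RPNVisitor visitor;
    node.accept(visitor);
    return visitor.result;
}
```

New behavior is now a new Visitor subclass, but a new Node subclass still requires editing the Visitor interface itself.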
And then there is
yet another solution,
you could use a function table, you know,
you create a map indexed by
the type_index of the Node,
so you use typeid to
retrieve the type_info,
convert it to a type_index
and then you use that as a key
into a map
whose values are functions to call,
depending on the type.
And you can even initialize that map
using static constructors.
And, actually, this is not that bad,
because this is extensible, right.
Your application and my application
can add new Node subtypes
and new functions.
So, it's not that bad, except that
it's very verbose, you have
to do the casts yourself,
and it doesn't know
about inheritance, right.
You cannot give a function that
will work for any Node type,
you have to cover
all the possibilities.
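A function table along these lines could be sketched as follows. Only the ingredients (typeid, type_info, type_index, a map, static constructors) come from the talk; the names rpnTable, RegisterRPN and RPNFunction are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

struct Node { virtual ~Node() = default; };
using NodePtr = std::shared_ptr<const Node>;

struct Number : Node {
    explicit Number(int v) : val(v) {}
    int val;
};
struct Plus : Node {
    Plus(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};
struct Times : Node {
    Times(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};

using RPNFunction = std::string (*)(const Node&);

// A function-local static sidesteps static initialization order problems.
std::unordered_map<std::type_index, RPNFunction>& rpnTable() {
    static std::unordered_map<std::type_index, RPNFunction> table;
    return table;
}

std::string toRPN(const Node& node) {
    // typeid on a polymorphic reference yields the dynamic type.
    // at() throws std::out_of_range when the exact type has no entry:
    // the table knows nothing about base classes.
    return rpnTable().at(std::type_index(typeid(node)))(node);
}

// Static objects of this type fill the table before main() runs.
struct RegisterRPN {
    RegisterRPN(const std::type_info& type, RPNFunction fn) {
        rpnTable().emplace(std::type_index(type), fn);
    }
};

// Each entry has to cast back to the concrete type by hand.
const RegisterRPN registerNumber(typeid(Number), [](const Node& n) {
    return std::to_string(static_cast<const Number&>(n).val);
});
const RegisterRPN registerPlus(typeid(Plus), [](const Node& n) {
    const auto& plus = static_cast<const Plus&>(n);
    return toRPN(*plus.left) + " " + toRPN(*plus.right) + " +";
});
const RegisterRPN registerTimes(typeid(Times), [](const Node& n) {
    const auto& times = static_cast<const Times&>(n);
    return toRPN(*times.left) + " " + toRPN(*times.right) + " *";
});
```

Any translation unit can add entries for its own subtypes, which is what makes this approach extensible, at the price of the manual casts and exact-type lookup mentioned above.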
OK, so let's take a little survey.
Who likes the first option,
plant a new virtual function?
Raise your hand.
Oh, gosh, nobody likes it, OK.
Who likes the second...
Oh, you do.
Who likes the second?
No. (chuckling)
Who likes the second option, type switch?
One person likes the type switch.
Who likes the Visitor?
Oh, that's a few.
And who likes the function table?
Ahh, good, this is encouraging.
Because this is extensible, right.
We have, actually,
been playing with something
called the expression problem.
So, the expression problem is this.
You have a family of types,
and you have behaviors.
In C++, typically, we express
that as a class hierarchy
with virtual functions and
virtual function overrides.
And the expression problem is
that you want to add new types
to existing behaviors,
and in C++ it's easy,
you, simply, derive new classes
and override virtual functions.
But it gets much more difficult
when you want to add new
behavior to existing types.
Functional languages are better at that.
But, usually, you want
both, and, actually,
now, I gave you the example of that little,
you know, toy, expression evaluator.
But if you start, you know,
writing bigger programs,
especially with multi-layer architectures,
where you have a domain
layer, and above it,
a presentation layer, and
below it, a persistence layer.
Well, you want to vary
the behavior, you know,
the presentation depending
on the dynamic type
of the objects in the domain model.
And that's how you sometimes end up
putting virtual functions in
a domain class that returns,
I don't know, a widget or a dialog.
I've done that at one point, it's ugly.
So, actually, the expression
problem, you know,
it's quite a common problem.
Now, there is a great solution
to the expression problem.
And that is called open methods.
Open methods, we're going to look at them,
but they are just virtual functions
declared and defined
outside of the hierarchy.
yomm2 is a library that implements
support for open methods.
Maybe some day C++ will have
them as a native feature,
it has been proposed in the past.
But in the meantime, let's look at
what it's like using my library.
First, there's a bit of
boring boilerplate to do.
So, you have to include
the library's header,
which comes in two versions,
the cute header gives
you lowercase names
for a series of macros.
This is what I'm using here.
And you have to tell
yomm2 about your classes
and the inheritance tree.
And, yes, multiple and virtual
inheritance are supported.
And before calling any open method,
you have to call a function
called update_methods, right.
Now, let's solve that
RPN rendering problem
using open methods.
So, we have that declare_method,
and what follows it
looks very much like a
normal function declaration.
These macros create an
ordinary inline function
that is going to, well,
to do the right thing.
Below I have two other macro calls,
which are define_method,
and each define_method provides
a specialization of the toRPN method,
one for Number, one for Plus.
If we look again at declare_method,
we see that,
what does it say, it says
that we declare a function
called toRPN that returns
a string and that takes a Node.
Note that I have decorated
Node with virtual_,
and this says that when selecting
one of the specializations,
we take into account the type,
the runtime type of the Node.
So this is just like virtual functions,
except that it doesn't belong to a class,
it's not a member function,
it's a free function, OK.
And you call it like an ordinary function.
OK, this is the key slide.
This is a feature.
We can add polymorphic
behavior from the outside.
So, this is virtual functions
inside the class hierarchy.
This is virtual functions
outside the class hierarchy, right.
This is closed.
Open methods are open, we
can add as many of them,
your application can add them.
It's the application's business to decide
how to render the tree,
how to print the matrix,
how to let the user
edit some domain object.
With this you get the entire jungle,
with open methods you
just get the banana, OK.
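To make the idea concrete without depending on the library, here is a hand-rolled emulation of a single-dispatch open method. This is not yomm2's API and not how yomm2 is implemented; RenderMethod, registerClass, define, rpnMethod and the Integer subclass are all hypothetical names invented for this sketch, and the lookup is a deliberately naive walk up a registered parent chain.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// "Library" side: a hierarchy that knows nothing about rendering.
struct Node { virtual ~Node() = default; };
using NodePtr = std::shared_ptr<const Node>;
struct Number : Node {
    explicit Number(int v) : val(v) {}
    int val;
};
struct Integer : Number { using Number::Number; };  // no specialization of its own
struct Plus : Node {
    Plus(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};
struct Times : Node {
    Times(NodePtr l, NodePtr r) : left(std::move(l)), right(std::move(r)) {
        assert(left && right);
    }
    NodePtr left, right;
};

// Hypothetical helper (NOT yomm2): specializations live outside the classes.
class RenderMethod {
public:
    template <class Derived, class Base>
    void registerClass() {
        parents_.emplace(std::type_index(typeid(Derived)),
                         std::type_index(typeid(Base)));
    }
    template <class T, class F>
    void define(F fn) {
        impls_[std::type_index(typeid(T))] = [fn](const Node& n) {
            return fn(static_cast<const T&>(n));  // the cast is done once, here
        };
    }
    std::string operator()(const Node& node) const {
        std::type_index type = typeid(node);  // dynamic type of the argument
        for (;;) {
            auto impl = impls_.find(type);
            if (impl != impls_.end()) return impl->second(node);
            auto parent = parents_.find(type);  // fall back to the base class
            if (parent == parents_.end())
                throw std::logic_error("no applicable specialization");
            type = parent->second;
        }
    }
private:
    std::unordered_map<std::type_index, std::type_index> parents_;
    std::unordered_map<std::type_index,
                       std::function<std::string(const Node&)>> impls_;
};

// Application side: a free function, called like an ordinary function.
std::string toRPN(const Node& node);

RenderMethod& rpnMethod() {
    static RenderMethod method = [] {
        RenderMethod m;
        m.registerClass<Number, Node>();
        m.registerClass<Integer, Number>();
        m.registerClass<Plus, Node>();
        m.registerClass<Times, Node>();
        m.define<Number>([](const Number& n) { return std::to_string(n.val); });
        m.define<Plus>([](const Plus& p) {
            return toRPN(*p.left) + " " + toRPN(*p.right) + " +";
        });
        m.define<Times>([](const Times& t) {
            return toRPN(*t.left) + " " + toRPN(*t.right) + " *";
        });
        return m;
    }();
    return method;
}

std::string toRPN(const Node& node) { return rpnMethod()(node); }
```

Node and its subclasses contain no rendering code and no string dependency, and an Integer is rendered through the Number specialization because the lookup knows about inheritance. A real implementation such as yomm2 is described in the talk as using dispatch tables rather than a map walk like this one.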
That's all there is to it,
well, almost.
So, you know that value function,
it's, you know, it's a natural,
but let's think about it again.
The value function, the
value virtual function,
it's just another way of
rendering the tree as a number.
This screams interpreter.
You know, if it was not a toy,
if it were a real AST library,
and we were trying to use it in a compiler,
I would have no need
for the value function.
So, once again, value, in
fact, it's just like toRPN,
it's a cross-cutting concern,
the tree classes should
just do their tree business.
And value is processing of
the tree, just like toRPN.
So I argue that, in fact, even value
should be an open method,
outside of the class hierarchy.
At this point I'm often asked,
does it support multiple dispatch?
Yes.
So, what is multiple dispatch?
Sometimes, it's convenient to
select the right thing to do,
the right behavior,
depending on more than one
virtual argument.
A typical example is matrix addition.
If you add two matrices, in general,
you just add all the elements,
but if you know that both
matrices are diagonal matrices,
and if you know that only at runtime,
then you want to pick a better algorithm,
you just want to add the diagonals,
and return a new diagonal matrix,
which will store only the diagonal.
This is an example of a multiple dispatch.
It's something that you can
implement using double dispatch.
It's a well-known technique, yeah.
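The classic double-dispatch technique mentioned here can be sketched like this for the matrix example. The class names Dense and Diagonal, the addedTo member and the addElementwise helper are assumptions made for illustration, and matrices are square to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Dense;
struct Diagonal;

struct Matrix {
    virtual ~Matrix() = default;
    virtual std::size_t size() const = 0;
    virtual double at(std::size_t row, std::size_t col) const = 0;
    // First dispatch: on the dynamic type of the left operand.
    virtual std::unique_ptr<Matrix> add(const Matrix& rhs) const = 0;
    // Second dispatch: on the right operand; the left type is now known.
    virtual std::unique_ptr<Matrix> addedTo(const Dense& lhs) const = 0;
    virtual std::unique_ptr<Matrix> addedTo(const Diagonal& lhs) const = 0;
};

// Generic algorithm: add all the elements.
std::unique_ptr<Matrix> addElementwise(const Matrix& a, const Matrix& b);

struct Diagonal : Matrix {
    explicit Diagonal(std::vector<double> d) : diag(std::move(d)) {}
    std::size_t size() const override { return diag.size(); }
    double at(std::size_t row, std::size_t col) const override {
        return row == col ? diag[row] : 0.0;
    }
    std::unique_ptr<Matrix> add(const Matrix& rhs) const override {
        return rhs.addedTo(*this);
    }
    std::unique_ptr<Matrix> addedTo(const Dense& lhs) const override;
    // Both operands are diagonal: add only the diagonals.
    std::unique_ptr<Matrix> addedTo(const Diagonal& lhs) const override {
        assert(lhs.size() == size());
        std::vector<double> sum(diag.size());
        for (std::size_t i = 0; i < sum.size(); ++i) sum[i] = lhs.diag[i] + diag[i];
        return std::make_unique<Diagonal>(std::move(sum));
    }
    std::vector<double> diag;
};

struct Dense : Matrix {
    Dense(std::size_t n, std::vector<double> v) : dim(n), cells(std::move(v)) {
        assert(cells.size() == dim * dim);
    }
    std::size_t size() const override { return dim; }
    double at(std::size_t row, std::size_t col) const override {
        return cells[row * dim + col];
    }
    std::unique_ptr<Matrix> add(const Matrix& rhs) const override {
        return rhs.addedTo(*this);
    }
    std::unique_ptr<Matrix> addedTo(const Dense& lhs) const override {
        return addElementwise(lhs, *this);
    }
    std::unique_ptr<Matrix> addedTo(const Diagonal& lhs) const override {
        return addElementwise(lhs, *this);
    }
    std::size_t dim;
    std::vector<double> cells;
};

std::unique_ptr<Matrix> Diagonal::addedTo(const Dense& lhs) const {
    return addElementwise(lhs, *this);
}

std::unique_ptr<Matrix> addElementwise(const Matrix& a, const Matrix& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<double> cells(n * n);
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t c = 0; c < n; ++c)
            cells[r * n + c] = a.at(r, c) + b.at(r, c);
    return std::make_unique<Dense>(n, std::move(cells));
}
```

Note that the base class has to name every subclass in its interface, so this technique has the same closed-world limitation as Visitor.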
Well, sometimes, it's convenient
to have more than two virtual arguments.
For example, in text-oriented
adventure games,
you have situations where a
character fights a creature
using a device.
And, if the character is a
human and the device is an ax,
or a magic wand, he cannot
use it, but if he's a warrior,
and he's trying to use
an ax, he can use it,
because he's agile enough to wield it,
but if he tries to use
it against a dragon,
it ends badly, right.
This is an example of a multimethod
that uses three virtual
parameters, so that happens too.
Well, good luck implementing
that with triple dispatch.
And the syntax using yomm2 is very simple.
You simply use the virtual decorator
on more than one argument.
So, the only requirement
being that, of course,
the classes on which you
use the virtual specifier
have to be polymorphic types,
they need to have, at
least, one virtual function,
probably, the destructor.
And
how does it select,
in presence of multiple virtual arguments?
How is the function to be called selected?
Well, I don't need to tell you.
You already know the rules.
It's exactly the same rules as
overload resolution,
except that it happens at runtime.
It's also exactly the same rules
as partial template specialization.
So, you know the rules.
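As a reminder of the compile-time rules being referred to, here is a small sketch of ordinary overload resolution on two arguments. The function add and the helper checkStaticResolution are illustrative names, not part of any library. The last assertion shows the gap that multiple dispatch is meant to close: the same selection, but driven by dynamic types.

```cpp
#include <cassert>
#include <string>

struct Matrix { virtual ~Matrix() = default; };
struct Diagonal : Matrix {};

// Two overloads; the compiler picks the most specific viable one.
std::string add(const Matrix&, const Matrix&) { return "generic"; }
std::string add(const Diagonal&, const Diagonal&) { return "diagonal"; }

inline void checkStaticResolution() {
    Diagonal d1, d2;
    Matrix m;
    assert(add(d1, d2) == "diagonal");  // both static types are Diagonal
    assert(add(d1, m) == "generic");    // only the generic overload is viable

    // Overload resolution only sees static types: the object is a Diagonal,
    // but the reference says Matrix, so the generic overload is chosen.
    const Matrix& hidden = d1;
    assert(add(hidden, d2) == "generic");
}
```

With an open multimethod, the last call would be expected to select the diagonal version, because the choice is made from the runtime types of the arguments.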
OK, I'm going to
skip over next.
next is a way of calling the
function in a base class,
it's the equivalent of
calling the function
that has been overridden,
if you prefer to see it that way.
OK, performance.
Open methods are, at least
in this implementation,
almost as fast as virtual functions
supported by the compiler.
I have a benchmark here,
you know, benchmarks,
Pff, you take them with a grain of salt,
but, according to my measurements, in a case
which is not so favorable to open methods,
calling an open method
with one virtual argument,
which is the exact equivalent
to calling a virtual function,
is 16-20% slower
than a virtual function.
So, it is very fast.
Inside (and tomorrow I will give a
talk about what goes on inside),
it uses dispatch tables and
indexes, integer indexes,
just like virtual functions.
And here is the code which is generated
when you call an open method
with one virtual argument.
Now, I split the code in two paragraphs.
If you look closely, the
instructions in the first paragraph
are independent, meaning that
they can execute in parallel.
So, if the body of your function
does any bit of work,
you won't be able to see it.
So don't worry about the performance.
They are almost as fast, and
in practice, as fast as
normal virtual functions.
OK, now let's do a bit of philosophy.
Is this still object-oriented programming?
The history of OOP is quite interesting.
And, you know, it started with Simula
which was the major source
of inspiration for C++.
And the first language
that really popularized OOP
was Smalltalk.
And those languages come
with a nice metaphor.
Objects send messages to one another,
and each object reacts to the message
according to its nature.
You kick a dog, it barks.
You kick your pit bull, it
barks and it bites back.
Around 1985 another path was
explored by the Lisp community.
And it created the Common
Lisp Object System.
CLOS, or C-LOS, has open
methods, has open multimethods.
And, in fact, when you think
about it an open method
or an open multimethod,
it's much more like a rule,
it's not about sending messages
and seeing the world in terms of objects.
It's rather selecting the
appropriate algorithm,
depending on the type of
one or several arguments.
Once again, here in this approach,
algorithms, functions take the front stage.
That is what Stepanov has been
saying about the STL, right.
Algorithms are what matter, and then,
then, if needed you split
the world into objects,
but sometimes it's simply
not the right approach.
So, in my opinion,
this is not really OOP.
So, recently object-oriented programming,
well, in the last 10
years, fell from grace.
It's really reviled now.
But I think that we shouldn't
throw the polymorphic baby
away with the OOP bathwater.
I know that template
metaprogramming and so on is cool,
but you're not going to
write Excel at compile time.
On a side note, you know, back
to virtual member functions,
have you ever realized that,
have you ever noticed that,
you know, if you want polymorphism,
you are forced to use, to
make your function a member.
You know, there was that little
algorithm by Scott Meyers.
He said, make your
functions free functions,
and then he gave you
an algorithm to decide
whether a function should
be a member function.
And the very first rule was bizarre.
Basically, it said, if
it needs to be virtual,
make it a member function, right.
It's good advice.
It's, simply, admitting that
we have no other choice.
And if you want to be a virtual function,
you have to be a member and, snap,
you get access to the
object's private parts,
when most of the time you don't need it.
That's what's so strange about OOP,
the C++, Simula, Java way.
It's that polymorphism and
encapsulation, they're, you know,
they're like Siamese twins
for no good reason, they're,
if you want one, you get the other.
Well, not anymore.
Now you have that option, open methods.
It has been proposed for
inclusion in the language
as far back as 2006.
Bjarne Stroustrup and two of his students
wrote several papers about it.
So, I encourage you to read them.
I hope that we get them
in the language some day,
and we can retire
yomm2.
But, in the meantime, you have this option
and as you've seen, so,
let's take a survey again.
So, what do you think of
solving that RPN rendering
using open methods?
Who likes it?
Oh, splendid.
OK, so, tomorrow at half past 12,
I'll give a one hour presentation
describing the techniques
that I use inside yomm2.
Some of them are interesting.
You can use them to do
other things, you know,
of the same kind, you know,
extending runtime-type information.
So, you are welcome to come tomorrow,
and now we have eight minutes
for questions and comments.
Yes.
- [Audience member] Is
this AOP? Like, I remember
in a previous life and a previous stack,
in the middle of the 2000s
aspect-oriented programming
came out, and, you know,
kind of, it seems like that.
You mentioned cross-cutting concerns,
not sure your numbers continue.
- OK, so the question is,
is it similar to
aspect-oriented programming?
- [Audience member] Yes.
- OK, so,
I would say no.
Aspect-oriented programming
is a way of addressing
the expression problem.
Earlier today we had a
great talk about DynaMix,
which seems to me
more similar to
aspect-oriented programming
than open methods.
You know, they're just tools, I'd say.
It's good to have many
tools in your toolbox.
Yes?
- [Audience member] You
mentioned it's like rules,
and it made me think of
like Matchers in Haskell,
and stuff like that.
Is that what this is more similar to?
Like you write in Matcher?
- So, the question is,
is this similar to Matchers, to,
maybe you have in mind
open pattern matching?
- [Audience member] Yes.
- OK, so Yuriy Solodkyy wrote
a great dissertation about
several approaches to
the expression problem,
and one of them is open pattern matching.
Well, he also considers
just pattern matching.
In my opinion, open is important.
If you're going to write
libraries, you know,
that was the big promise
of OOP, right, in the 90s.
It was going to be wonderful, modular.
They called it Software ICs.
You were going to take
classes, put them together,
and it would just work.
And it failed.
Right?
So, open pattern matching
is one of the solutions.
Yuriy Solodkyy gives benchmarks
about it, he says that
open multimethods
are the fastest,
the fastest solution,
even though open matching,
open pattern matching gives you
other things like destructuring.
OK.
- [Audience member] So, you
talk mostly about open methods
and briefly mentioned multimethods.
If you had a magic wand
and could add multimethods
to C++ in any way you wanted to
and have the committee do what you wanted,
how would you go about doing it,
or would you go about doing it,
or would you just make
it a library feature,
like you've done here?
- So, if I could make the committee
accept open multimethods
in one form or the other,
would I put them in a library
or in a language?
That's the question?
- [Audience member] And
what would they look like?
- And what could they look like.
Well, Bjarne Stroustrup and his students
created an implementation
of open multimethods
as a compiler and linker extension.
If you want to know more come
to the Inside yomm2 talk.
And one of the points was
to prove that you cannot
implement open multimethods
as efficiently in a library,
and, of course, it is true.
So, if you want to know
what they would look like,
look at the papers,
they're very easy to find,
and they are very interesting papers.
What I wrote is mostly a library emulation
of what I believe should
be a language feature.
- [Audience member] Thank you.
- [Audience member] So I
noticed, you have to register
all the derived classes, or
some derived classes at least,
when you're using this library?
- Yes.
- [Audience member] You
don't have to register
all of them, right?
- So, the question is,
I have to register the classes involved
in open method dispatch,
but I don't have to register all of them.
Well, you could get away with
not registering all of them,
but
what matters are the
classes which are actually
specialized upon, right.
In our example it's Node, Number, Times.
But I would register all
of them, because, you know,
the idea behind this is to be open, right.
The idea is, especially if
you're writing a library,
you give me the library
that implements matrices,
and I am going to decide
how to render them,
how to export them to JSON or
persist them, or what not, OK.
So, if you just cover
the subset of the classes
that interest you,
your algorithm may not be as useful to me.
- [Audience member] I
guess, I'm not thinking
in terms of a library, it's more like,
if you have a base class
that is very commonly used,
and you want to add some
sort of optional features
that might have a sensible
default implementation
- Aha.
- [Audience member] It would
make sense to only specialize
derived classes that actually use
a non-default implementation now.
And I was wondering, does
your library allow that?
- Yes, so the classes
that you don't register
will be completely transparent.
It reminds me that, in fact,
in all those general features,
because C++ gives us only
virtual member functions,
when you write a library,
you have a tendency
to put all sort of things in base classes.
And that is the harm
that OOP, traditional OOP
based on member virtual
functions has done.
It has given us Visitor.
It has given us God classes.
And this is a solution
to do away with them.
- [Audience member] Thank you.
- OK, we have two minutes.
- [Audience member] Maybe
you've said it and I missed it.
Are you able to access
private data members
in the open method or not?
- OK, so is there a way of
accessing private data members
from an open method?
- Yes.
- Currently, no.
In a previous version
of the library, yomm11,
there is a feature that allows it.
And if it's needed in yomm2,
I know how to implement it.
So, when you need it,
ask me, but you know,
it sort of defeats the purpose,
because the idea is that,
you can add functionality
without touching the hierarchy.
So,
if you,
at one point I will
probably add the possibility
to declare that an open
method specialization
is a friend of a class.
But when it appears, please, don't use it.
It's defeating the purpose, right.
The purpose is to have
classes and class hierarchies
that do one job well, for
example, the AST library,
that models the tree correctly,
it has a complete public API
that allows you to manipulate trees.
And then the consumers of your library,
they will implement
functionality, behavior
using the public interface.
That's really the good way to go.
- [Audience member] Thank you.
- [Audience member] So, I imagine
if this became a language feature,
the syntax would be more convenient,
or using it would be more ergonomic.
Would you expect there to be a difference
in the runtime performance
between the library and the
language implementation?
- So, if open methods were
embraced by the language,
would the syntax be
different, would it be nicer?
Of course it would be nicer.
- [Audience member] I'm
actually more interested
in the performance.
Like I get the difference
in language feature.
- Oh, OK.
- You can make it,
syntax sugar also, can you make,
do it, would you expect the compiler
to be able to generate better code?
- OK, so, I think that
the syntax is, actually,
quite better about it.
Now, concerning the performance,
come to the talk tomorrow,
I'll go into details, but,
depending on what you accept to do,
you see, the problem with
open methods is that
we know the complete set of methods
that target a type much later
than with virtual member functions.
At the earliest, it's
at link time, but then
you have to take dynamic
loading into account, right.
So,
I think that, unless you
resort to dynamically
modifying code, open
methods cannot be as fast,
but they can be very, very, very close.
The only thing that differs, you know,
for one virtual argument open method
is that the position of the
function will be floating
inside the method table, instead
of being at a fixed index.
Come tomorrow.
- OK.
- OK, one last question, we have?
The three seconds, it has
stopped, we have forever.
(audience laughing)
OK, go ahead.
- [Audience member] So,
if I forget to register
one of my derived classes
or something like that,
and I try to define a method on it,
do I get a compiler error,
or does it just fail silently
and not call that?
- So, what will happen
is that at one point
the library navigates from the
object and, more precisely,
the object's type ID to a method pointer.
So, it won't be able to do that,
and in a debug version it will assert.
- [Audience member] OK, thank you.
- But we, we need them in the language, OK.
(audience applauding)
Thank you.
