You guys have a good lunch? Good.
Now you guys can take a nap, perfect.
So, my name is Andrew
Sutton, I am a professor
at the University of Akron,
I'm talking about reflection.
I know this is not the title
of my talk as promised,
this is a bit bait-and-switchy,
not as bad as the next talk,
which I will explain later,
but I'm gonna talk
about static reflection,
kinda how you implement it,
sort of language features.
This is definitely not a big ideas talk,
this is not a how you use it talk,
because the language features
simply aren't mature enough
to generate really good examples
to kind of explain the domain.
So I'm talking about
my work that I've done
for the past two years.
But before I start, University of Akron,
that's where I work, nice place.
Apparently we will be having
an e-sports program sometime soon.
Competitive e-sports? I don't know.
If you guys can see it,
there's a little gold star
that's where Akron is if
you've never heard of it,
it's in Northeast Ohio,
it's a lovely place.
Living there is inexpensive
as opposed to places
further east and west, near water,
and there's less likelihood of water
rushing in over your home.
This is, you guys might know
Akron from this company.
Goodyear is based there, in fact
that is actually flying over Akron.
The university is in the bottom
left corner of the screen,
my building is just cut off unfortunately.
But, nice place.
Akron is also the home of LeBron James,
which I have to add because we now have
the LeBron James Family
Foundation College of Education,
I don't actually have to add
that, I'm just a Cavs fan
and we just signed Dwyane Wade
yesterday, so that's awesome.
If anybody follows
basketball, it's big news.
Okay so on to C++.
So I got started in this work
with Herb on metaclasses,
and admittedly it was not out
of the goodness of my heart,
I actually had a plan which
was to take Herb's ideas
from metaclasses and apply
them to another project
that I was working on,
which has nothing to do
with compilers, which is in
fact software-defined networking
which complements my work
with programming languages,
compilers really well.
But I figured it would be worth a shot
and I thought this was
a good way to kind of avoid
writing front-end languages
or multiple compilers for,
say, packet decoders,
packet generators, whatever.
And that we could maybe leverage that
to generate faster and
more efficient and safer
networking protocol
implementations, or,
maybe ideally,
fuzz-testing capabilities
for arbitrary packets based
on a single C++ program.
That seems like a really nice idea.
Unfortunately, there are
a lot of language features
that you have to implement
before you get from
that point to the point
where you can apply it.
This is kind of the tip of
the iceberg for that work.
So, in order to generate code,
you really need reflection
because you have to analyze
the code you want to generate
so reflection of course
ends up being the sort of
tip of that iceberg, and so
that's what this talk is about.
So really briefly, the
obligatory overview slide
that I typically hate
giving, but whatever.
So what is reflection?
Important thing to know.
We gotta talk about static
and dynamic reflection,
I shouldn't just jump into explaining
what static reflection
is without explaining
why every other attempt at
reflection is less than good.
And then basically the rest
of the talk is going to be
how you do this in C++ which
is an interesting question.
And if you think it's easy,
then you have not thought
about how to do this in C++.
So, what is reflection?
Well, basically it's a
set of language features
that allow you to write algorithms
that consume your program
as data, and then do stuff with it.
In the very broadest
sense of the definition,
that's really what reflection is, kinda.
It's that introspection
part, being able to use
your program as data.
The other side of that, kind of,
is you know doing stuff with
those other programs, right?
So really the end goal of all
of this is metaprogramming.
Like we wanna be able to write metacode
that consumes the program
and generates a program
hopefully within the same language,
hopefully without a metalanguage,
but reflection is really the
entry point to all of that,
reflection enables
metaprogramming, and it informs it.
Alright so in the most broad sense
that I could think to define this,
reflection is basically
just a set of operators,
it translates between two domains.
One domain is a set of source
code that you might have,
the other domain is some set of values
that encodes that in a way that you can
use it in a program as data.
I am going to talk about different ways
you can encode this information later.
So the two kinds of operators we have,
they go back and forth
between these two domains,
a reflection operator
takes a source code entity
or construct, I'm gonna keep
using the word construct
because C++ has a very firm
definition of what an entity is,
do you guys know what an entity is in C++?
- [Audience Member] A thingy.
(laughter)
Did you get an early copy of my slides?
I think I actually use
the word thingy somewhere.
(laughter)
No. A macro is not an entity in C++,
a function, variable, class, template,
like those things are entities.
Usually things that we can
put a name to are entities,
although we have a couple more things
like objects and values,
those are also entities.
Is an expression an entity?
It is not. Is a statement an entity? No.
So I invented the word construct,
or co-opted the word construct
to kind of build a
bigger set from "entity".
So then we have this
other set of operators
that go in the other direction, which are,
you know I have this value that encodes
some kind of a construct, I
want to do something with it.
And I am deviating wildly
from whatever terminology
might exist in literature today,
I'm just calling these Action Operators
and I'm using the word
operator very loosely also,
it could be a function it
could actually be an operator,
it could be a method of a class,
but it's something that invokes
a behavior that's very internal
to what the language is.
How many of you are familiar
with Java and Java reflection?
I don't know how to feel about
that. That's too many hands.
So in Java, you can look up
the definition of a class
via the class loader,
the fact that I know this
is almost embarrassing, you can query that
for a particular set of methods,
that's the reflection part of it, right?
And then you can call this
function called Invoke
you package up a bunch of
arguments, you call Invoke,
and that does something
very languagey, right?
You can't actually see what Invoke does,
but it pushes a new stackframe
allocates a new set of locals
copies all that information there,
if it can, maybe throws an exception.
So that's a language-oriented action
that you take from a reflected value.
Then in the code generation, same thing.
I haven't written C#
in many many years,
but I understand that you
can actually generate C#
on the fly from reflected code,
so you have some action
associated with these values.
I'm gonna change that word
action in a little bit.
So every useful metaprogram
roughly has this kind
of behavior, this kind of strategy,
you're gonna get some reflection
of a particular entity,
you're gonna take some source
code construct that you have,
you're going to package up this data
so you can write an algorithm against it.
You analyze the
reflection, you look at it,
you decide some properties
of that particular thing,
and then based on whatever you
see, you perform an action.
This is basically how
we teach CS 1 students
to write programs, although
we never use the word "meta"
but you know, same concept, right?
So if I wanna write a to_string function
for enumerations for example,
I actually have like four steps in this.
Somebody give me an enumeration.
I go get the type, as
data, that's reflection.
I go get the list of
enumerators in the enum.
More reflection, right?
We're just getting data
about these objects.
We might wanna search
that list of enumerators
for the value that we wanna
print in one way or another,
that's analysis, very simple algorithm.
std::find would be great.
Return the declared
name of the enumerator,
that's an action.
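Written by hand today, without reflection, that recipe looks something like this for one specific enum; the enum and its table are my own illustration, and the table is exactly the per-enum boilerplate that reflection would let the compiler generate for us.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>

enum class Color { Red, Green, Blue };

std::string to_string(Color c) {
    // Step 2's "list of enumerators", maintained by hand today.
    static const std::pair<Color, const char*> table[] = {
        {Color::Red, "Red"}, {Color::Green, "Green"}, {Color::Blue, "Blue"}};
    // Step 3: analysis -- a simple std::find_if over the enumerators.
    auto it = std::find_if(std::begin(table), std::end(table),
                           [c](auto& p) { return p.first == c; });
    // Step 4: action -- return the declared name.
    return it != std::end(table) ? it->second : "<unknown>";
}
```

The whole point is that nothing in this function is interesting: it is purely mechanical, derived from the enum's definition, and must be updated by hand whenever the enum changes.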
So very simple formulas,
very simple recipes,
the only things we have
to concern ourselves with
are how do you get source code into values
and how do you get information
back out of those values,
or actions out of those values?
So a lot of languages
support dynamic reflection.
And there are some, I feel
like I missed a slide.
Oh, well, yeah.
So a lot of languages
support dynamic reflection
it's not a zero-cost abstraction,
you end up paying for
that in a particular way.
So if you actually do have
dynamic reflection, you basically
have the ability to
do all of that at run time.
So you can ask for the value of an entity
at run time, you can
write a normal algorithm
against it, and all those actions end up
being some kind of run-time system
that performs that action.
So if you actually want to ask
for the name of an enumerator
in a traditional language
that supports reflection,
you have to pre-cache
all the enumerator names
because if you don't that
data simply won't be there.
So it's clearly not going to
be a zero-cost abstraction.
It can't be, you have to keep
putting stuff into binaries
because of this.
It increases your binary overheads,
you lose optimization opportunities,
unless you're going to
be very aggressive about
compiling expressions into code,
that's a completely different discussion,
although one that's
particularly interesting
in the context of programming languages,
but if you opt in to a
language that does this,
it is almost impossible to opt out
of the reflection requirements.
It's not possible in
the case of Java at all.
That's actually how
you translate the code.
You generate all that
information you need,
because that's how it works.
So why, if there's gonna
be all that much overhead,
do you actually want dynamic reflection?
Well I mean I guess
there's some good reasons,
dynamic method invocation
or look up invocation,
that seems to be useful,
some interesting things
like runtime code generation,
C# does this I believe,
you nodded earlier, so yes.
Then you have some really
interesting questionable things
like self-modifying code,
I usually trust myself
to get a program right
the first time I write it.
I do not trust myself to write a program
that writes a program that modifies itself
and remains correct after modification,
that scares me a little bit.
But I think the real reason
that all these languages
have to opt in to dynamic reflection
is that they simply don't
have the capabilities
to compute things at compile time.
Outside of things like constant folding,
you know if you write
3+4 in Java or C#,
it's almost impossible
that any good compiler
would allow that operation
to escape to runtime,
but beyond those simple
compile time computations,
these really simple
expression calculations,
you cannot do compile time
things in these languages.
Period.
And so anything that you wanna
do that involves reflection
has to be pushed down
to runtime decisions.
I always like to point this out,
I think this is a crutch for
bad generic programming support
in those languages, because
I worked on concepts
for six years and so of
course I have to say that,
but really if you look at those languages
and the way they dispatch things,
you have a generic algorithm
and then you write a bunch
of if statements that
kind of decompose that
so you can select better
algorithms later on,
so that the kind of
specialization based on type
must be runtime decisions
because there are not a lot
of languages out there
that can do if statements
at compile time, which is really weird
cuz it's not that hard.
So meanwhile in C++, not a problem.
We got lots of great facilities
for compile time computation
specifically, we have constexpr.
And that other thing, templates.
So we have lots of facilities for this,
we have ways that you can
ask questions of types,
we can kind of ask
questions of declarations
if we're clever and we
work a little bit harder,
that certainly happens.
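A small sketch of those existing facilities, assuming nothing beyond standard C++17: ordinary constexpr evaluation, type-trait queries, and the compile-time if that those other languages lack.

```cpp
#include <string>
#include <type_traits>

// Ordinary compile-time computation with constexpr.
constexpr int factorial(int n) { return n <= 1 ? 1 : n * factorial(n - 1); }
static_assert(factorial(5) == 120, "evaluated entirely at compile time");

// The kinds of questions the type-traits library already answers.
static_assert(std::is_const_v<const int>, "constness query");
static_assert(std::is_same_v<std::remove_reference_t<int&>, int>, "type query");

// And a genuinely compile-time if: for a given T, the branches not taken
// are discarded during instantiation, so type-based dispatch is resolved
// at compile time, not at run time.
template <typename T>
std::string describe(T value) {
    if constexpr (std::is_integral_v<T>)
        return "integral: " + std::to_string(value);
    else if constexpr (std::is_floating_point_v<T>)
        return "floating-point: " + std::to_string(value);
    else
        return "something else";
}
```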
Concepts, which showed up in C++20,
and another reference
to concepts of course,
let us write queries
for how a type is used,
which is fairly novel.
As advanced as our
support for this stuff is,
we don't really have comprehensive
support for reflection.
We can't ask really interesting questions,
like no really what are
the members of a class?
Is this member public? Is
it private? Is it protected?
What constructors does this
class actually provide?
Not how can we initialize a class,
but what constructors
do you actually provide?
So the goal when we
approach metaprogramming,
one of the requirements as
we approach metaprogramming
is the ability to ask those questions.
You have to be able to do that.
If you can't do it, we can't
write meaningful algorithms.
Hopefully I just kind of
answered this, really,
but why would we, I guess I didn't.
Why would we want to add reflection,
like comprehensive support in the language
for static reflection?
Is it because we can actually write
these interesting queries?
No, not really.
It's so that we can write
better metaprograms,
so we can generate better code.
The reason that we wanna do this,
for me it all boils down to boilerplate.
It all reduces to boilerplate.
The goal should be to write
less boilerplate code.
If we can find some way to opt in
to the generation of a lot
of different facilities
based on a few parameters,
or a few simple indications,
that would make me very happy.
The day that I don't have
to write operator equal,
not equal, and less and
greater and whatever,
and I can just provide eq and less
and not use the curiously
recurring template pattern,
that would make me very happy indeed.
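The CRTP workaround in question looks something like this sketch (the relational_ops name is made up for illustration): you provide == and <, and the base class spams out the rest. This is precisely the boilerplate a reflection-based facility could generate from a simple opt-in.

```cpp
// CRTP base: synthesizes the remaining comparisons from the == and <
// that the derived class provides.
template <typename Derived>
struct relational_ops {
    friend bool operator!=(const Derived& a, const Derived& b) { return !(a == b); }
    friend bool operator>(const Derived& a, const Derived& b) { return b < a; }
    friend bool operator<=(const Derived& a, const Derived& b) { return !(b < a); }
    friend bool operator>=(const Derived& a, const Derived& b) { return !(a < b); }
};

struct Point : relational_ops<Point> {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}
    friend bool operator==(const Point& a, const Point& b) {
        return a.x == b.x && a.y == b.y;
    }
    friend bool operator<(const Point& a, const Point& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    }
};
```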
Anyways, so static reflection.
The static version of normal reflection.
Same idea, right?
So we still have a set of operations
mapping between constructs,
in the case of C++
we are literally mapping
between constructs
plus expressions and statements
back to some kind of value.
How you encode that value,
again, is really interesting.
That's going to be the
remainder of the talk.
We have another set of
operations or operators
that go the other direction,
I'm gonna call these
projection operators
because they're not runtime.
They don't actually need to do anything.
What they really need
to do is generate code.
They don't act, they just
turn back into syntax.
Reflection, abstractly;
source code entities
plus expressions and
values, I left that out.
Our reflection operators are going to take
some kind of named entity,
return some kind
of encoded value, the, what did I say here,
so this is going to support
the ability to query
properties of a declaration,
that's important,
we haven't considered
deeply looking at reflecting
on every possible construct.
As soon as you want to talk
about reflecting on expressions
for example, like this is a
cool and powerful technique,
it also opens a slightly
different Pandora's box
especially when you get
into code generation
because now you have essentially,
macros but structured.
They're interesting.
We're not talking about
doing everything in one shot.
This is very much about declarations
and properties of declarations right now.
Projection operators come in
a bunch of different flavors,
and again abstractly I'm
going to talk about these
as kind of being one per part of grammar,
because that's really the
best way to talk about it.
It's the easiest way
for me to talk about it.
These projection operators,
they take an encoded
reflection value, something that you get
from a reflection operator,
and they give you back syntax.
They don't perform an action,
they actually give you back syntax.
So for example if you want to project
a reflection value into a type specifier,
you have to generate what
that type specifier is.
Or if you're writing a
compiler for example,
you would just generate
a type in your language.
But it fits into that part of the grammar
wherever a type specifier is required.
If you want to project
and reflect the entity
as a nested name specifier, you know,
like when you write
"std::experimental::"
that's a nested name specifier.
So if I have a reference
or if I have a reflection
of std::experimental, I
should be able to take that
and project that as a
prefix for whatever follows.
So we're generating
syntax from these things.
And again, we can do the
same thing with value
and name projections right?
So if i have a reflection of a function
I might actually want the
value of the function.
What's the value of a function?
Anybody wanna guess?
Don't say an overload set.
It's just a pointer to the function,
something that would let
you call it for example.
If I have a reflection
of a global variable
or a member of a class,
the value of those things
would be a reference
to the global variable
or a pointer to the member of the class.
So we have some way to
kind of traffic between
this abstract data or, not abstract,
but this data that we use to
represent these reflections,
and the grammar that we
use to write our code.
Name projection is one of my favorites
because you can take
the name of a function
and just generate a new
unqualified ID for lookup later on.
So there's lots of ways that we can move
between these two domains.
Lots of ways we can move
between these two domains.
But it's not really
general-purpose code generation,
we're not talking about
taking these things
and generating new declarations, not yet,
that's tomorrow morning's talk.
So there's a bunch of
committee work going on
and I believe this list is
probably a little bit incomplete
but the work actually
started with P0194 and P0385,
385 gives motivation, P0194
there's a series of these
that give you a
specification for reflection.
This was done by Matus
and Axel and David Sankel,
I see David over there, put his hand up.
Thank you, David.
There's type-based
reflection that Herb and I
have worked on for the last year,
there's actually two flavors of that,
depending on how time goes I
will talk about the second one.
And then we have a bunch of
additional direction ideas
from P0633 by Daveed
Vandevoorde and Louis Dionne.
It's worth noting that Daveed
actually had proposals for
metaprogramming earlier
in his career with a metacode system,
and I think a lot of
those ideas have shown up
in different format in P0633,
and I am not ashamed at all to say
we have co-opted many of them.
Encoding reflections.
Turns out there's a bunch of
different ways you can do this.
I'm gonna talk about them each in turn.
Maybe the most recent
work last, if possible.
We can encode reflections as types.
This is an incredibly powerful technique
but there's two ways you
can look at doing it.
You could take the reflection of an entity
and make that just a type,
or you could return an object
whose type encodes the entity.
And this actually ends up
making a big difference.
You guys will see this immediately
when I start showing you the difference
between the proposals.
The alternative is you can just
represent these as objects.
Which I might be able
to talk about depending.
Choice of representation
and the way that you do this
makes a huge difference in
the way that you go about
building your library, your
library-compiler interface,
how you use it, it is not an
arbitrary decision at all.
So, P0194. Reflection operator
in this paper is spelled,
oh is that right David? Oh, it's changed.
Just reflexpr, okay I will change that.
Ignore the dollar sign, there
were different versions,
I wasn't quite sure what was current.
So it's just reflexpr, so
you write reflexpr(int),
so using meta_int = reflexpr(int),
and meta_int is now
the type, it is a type,
a class that tells you what
you want to know about int.
This is not a particularly useful example,
unless you wanna know the size of int
which I believe you could
just write size of int
and get the same information.
The definition of the
class is unspecified,
these things are created
as needed by the compiler,
the specific properties
that the class exposes
are really defined by a concept,
so the concept says you can ask for size,
probably the name of the type
that seems to be important,
I would expect because it's a type
you can also ask if it's const
or volatile or reference?
That might be a different class.
So there are things that you can ask
and all of that stuff is
conveyed by this concept.
But the implementation
details of the class
are completely unspecified and I believe
that all of the properties of the class
with one exception are
encoded as static members
of the class, so you
actually end up generating
every time you call reflexpr
something like this.
It is a new class.
Some internally computed
name for this thing
and then you just sort of spam out
the properties of that class
that you want to be able to access.
Works nicely.
If I write, again ignore the dollar sign,
earlier draft, I apologize.
So if you say using T = typename , sorry.
Projection operators are just type traits
because you kind of pack all those traits
into the class itself, you don't have
to do anything very complicated
to get them back out, they're just there.
So you can write a type trait
or something like a type trait
and simply request the
associated member through
whatever details are necessary
to compute that thing.
So if I say reflexpr(int)
and ask to get the type
of the int, that is taking
the reflected type class,
the data for that int, and it's turning it
back into a type specifier.
So now I can use that thing, I can use T,
wherever I need a type,
which would be nice.
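To make that concrete, here is a hand-written mock of the kind of class the compiler might synthesize for reflexpr(int), and a trait that unpacks it. Every name here (mock_meta_int, reflected_type, get_reflected_type_t) is made up for illustration; the proposal's actual interface is specified by concepts, not by this sketch.

```cpp
#include <cstddef>

// What the compiler-generated class for reflexpr(int) might look like:
// an empty class whose properties are baked in as members.
struct mock_meta_int {
    using reflected_type = int;                       // for type projection
    static constexpr const char* name = "int";        // declared name
    static constexpr std::size_t size = sizeof(int);  // size query
};

// A projection "operator" is then just a type trait that pulls the
// member the compiler baked in back out.
template <typename Meta>
using get_reflected_type_t = typename Meta::reflected_type;

// Usable wherever a type specifier is required:
get_reflected_type_t<mock_meta_int> forty_two = 42;
```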
Another example if you have
enum E with a bunch of things,
you can ask for the constant
of the reflection of A.
So you reflect on A,
that gives you a type,
that encodes probably the value
as a static constant expression
and you can just unpack that
guy with a type trait.
There are some other
cases that are alluded to
in various drafts of this proposal,
variously mentioned as unreflexpr,
if you're not getting a type or a value,
you need some kind of other operator
to generate a fragment of syntax.
One of the suggestions was
unreflexpr, is there an update?
No, okay. I will leave that as it is.
(chuckles)
so this is to_string
with this implementation
from P0385r2, which
hopefully is the most recent
that's been published,
I know there's an r3
that probably doesn't include this.
It's a lot of template metaprogramming,
none of this is really important
except the bottom two lines
but the way that it works is that
you invoke this helper function
to generate a static map
of values to strings, and you
just look it up at the end.
So this works nicely.
Except in the case where the
enumeration has two names
with the same value,
but every implementation
we're going to look at
today has the same problem.
Which it doesn't quite work, but.
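The problem in miniature: once two enumerators share a value, the value alone cannot tell you which name was meant, so any value-to-name lookup can recover at most one of the names.

```cpp
// Two names, one value: a to_string built on value lookup can only ever
// return one of "Read" or "Default" for the value 0.
enum class Mode { Read = 0, Default = 0, Write = 1 };

static_assert(Mode::Read == Mode::Default,
              "indistinguishable by value alone");
```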
So the choice made to encode
the value of a reflection
as a type, as a class, forces
you to use it as a class
which forces you to use
template metaprogramming.
Now that being said, this
is a very complete solution.
You can do any level of reflection
and any level of projection,
possibly with an extra operator
or two, using this system.
The downside for me is that it
uses template metaprogramming
I am not at all a fan of using templates
to express computation and
this is just computation.
If we can make it easier to
do, we should try to do that.
Benefit of this approach
is there are very few
compiler intrinsics, the only
one that I'm remotely aware of
is that if you ask for
the members of a class,
the compiler will
automatically generate you
a list of types of different type
and that is the one interface between--
besides reflexpr and
potentially unreflexpr--
I think that is the one interface
between the compiler and the library.
Minimizing that is kind
of a nice thing to have.
One of the downsides of the approach
is compile-time performance.
Every time you reflect an entity,
every time you invoke reflexpr,
you are generating a new class.
Compilers do not forget classes, ever.
In clang at least every time
you allocate a class object,
like the abstract syntax
tree for a class,
it stays with you for the
duration of translation,
it does not go away.
So for long-running metaprograms
in large translation units,
you run into a very real problem
of inflating the memory
costs of these programs
to a point where they are not sustainable.
I haven't seen it yet, but it is a thing
that we should be concerned with,
and in fact is a thing that
the Standards Committee
is very concerned with.
My approach, our approach,
Herb's and mine, works.
The goal was to make
metaprogramming go away,
we liked the ideas in 194 and 385, we did.
It made a lot of sense,
we just didn't like
the angle brackets.
So the goal was to get that out of there
and try something a little bit different.
And this is actually implemented here
on GitHub, in asutton/clang, along
with a bunch of other things,
I am proud to say that the
reflection capabilities
in this branch are actually fairly stable,
unlike the injection capabilities,
but this wasn't all
that hard to implement.
Modulo a ton of bugs and difficulties.
Our reflection operator
here, we spell it dollar,
and I know the Standards
Committee has not been fond
of dollar as a reflection operator,
I had some good discussions
with Ville earlier today,
earlier this week as to
why, but I still like it.
It's short, that's why I like it.
Really that's exactly why I like it.
So if I have a function
called f I don't know,
returns n+1, no big deal,
I can write auto metafunction
of f equals dollar f,
that gives me an object that contains
the reflected value of the function f.
So syntactically dollar is
just a primary expression,
the operand is either an ID expression
which means it refers to
a function or a variable,
or it's a type ID which means
it refers to a class name,
or it is a namespace name which means
it refers to a namespace.
So basically the idea is you
reflect on a named entity,
again not a full construct,
we don't reflect expressions
just yet, or statements, that'd be weird,
and it gives you back a value.
What value? Actually, maybe not yet.
Yeah. What value?
The type of dollar is
actually a specialization
of a class in some name space somewhere,
we call it cppx::meta::something,
I left off the cppx
here, the eventual goal
would hopefully be std::meta::.
Which template you get out of this
or which template is instantiated
depends entirely on
the entity you reflect.
So if you ask for a variable,
you might get a class
called specialization of variable,
if you ask for a function you might get
a specialization of function,
if you ask for a class,
you get class type, fundamental
type basic concept is there.
One-to-one mapping
between the kind of entity
that you request and the kind of template
that gets instantiated.
Reflected namespace is
ns because you can't use
namespace as a class name in C++, sadly.
All these templates,
they are parameterized
by one thing, which is
an encoded reference
to some internal data structure.
So it's the encoded
reflection that we get back
from these, or that we encapsulate.
And by the way I should
point out that the intent
was to make these implementation defined.
I don't care what they're called,
it doesn't actually matter
you would always just write auto.
Because you don't know what the type
of these things is going to be.
Again you would just
model some kind of concept
that tells you what you can get out of it.
Same thing above and you get the type.
The template you instantiate
would look like this:
template<reflection_t X>.
reflection_t is not a magic type,
in fact it is a type alias
for std::intptr_t.
That's it, it's just an int.
It is not a particularly
complicated data structure at all,
it is just a laundered version
of the pointer to the AST
node in this implementation.
Cuz that was really easy to do.
So basically every time
you ask for the reflection
of something you end up
instantiating a class
whose template argument is
a laundered pointer value.
Beautiful.
It could also by the way be an index
into a data structure the
compiler keeps track of,
it's not in this implementation.
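The laundering itself is the ordinary pointer-to-integer round trip. A sketch of the idea, with the caveat that the reflect/unreflect function names here are mine, and in the real implementation this happens inside the compiler, not in user code:

```cpp
#include <cstdint>

// reflection_t is just an integer wide enough to hold a pointer.
using reflection_t = std::intptr_t;

int some_ast_node = 42;  // stand-in for a compiler-internal AST node

// "Launder" the pointer into an integer, suitable for use as a
// non-type template argument...
reflection_t reflect(int* node) {
    return reinterpret_cast<reflection_t>(node);
}

// ...and recover it later. Feeding a made-up value like 4 through this
// is exactly the unchecked-cast hazard just described.
int* unreflect(reflection_t r) {
    return reinterpret_cast<int*>(r);
}
```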
This is a small, small
area of concern for me,
because reflection_t is just an int,
you could ask for like a
projection on the value 4,
and the compiler will say
sure cast four as a pointer
and dereference and that's bad.
Clearly some work needs
to be done in this area,
not a big deal, not yet.
The details of the class, we encode
all the properties of the class as,
so first of all the class is empty.
It's just an empty class with a bunch of
static constexpr members.
Well for now it's empty,
tomorrow it won't be.
So all these things are just
static constexpr functions:
name returns a constant character pointer,
type returns whatever the
type of this thing is,
the predicates just return bool.
Not particularly complicated.
The definition of these things requires
some library compiler interface,
and that is done through intrinsics,
so these are projection operators
sorry I forgot my order of slides,
so we have a set of projection operators,
some of them are compiler intrinsics,
some of them are actually operators,
the implementation of those
functions in particular
are just intrinsics, they
all start with __reflect.
They take as an argument
the encoded reflection value
and the magic happens in these functions.
Like the entire
implementation of reflection
happens in these functions.
So what happens if you
get the reflections,
if you write dollar int and call dot name,
what happens is you don't instantiate
the definition of this function
until you actually call the function name.
At the point that you call name,
the compiler is obligated to instantiate
the function definition
which means it substitutes
the compile time value of
x, which is at this point
an integer, into this
reflect name expression,
and the compiler just
says, hey you're giving me
reflect name and a value?
I know how to get that.
So it just goes to the AST
node that was passed in,
de-launders the AST node and then goes
and asks for its name.
So if you're asking for
int it'll compute int
and it replaces the
reflect expression here,
the intrinsic, with the
string literal expression.
So when you call this function
it bakes into the class the definition,
what that result is going to be.
That's how all of this stuff works.
Keep that in mind for later,
that might be a problem.
So if you ask for the type of something
we do the same thing, we
just have reflect type,
you instantiate that, the
compiler looks up the reflection,
which is again just converted
for cast from the AST node
or an integer into an AST node,
gets the type, and
constructs another object
whose type would encode
the type of that object.
So if you were asking for the name
of the function in this case,
you would get the type
of it and you would get
a function type with parameter
types and return type.
Just another reflection, it works nicely.
Value projection is just a
property of these intrinsics also
we don't have to work too hard for this.
So a bunch of variables
and a bunch of functions,
so we ended up, the first pass of this,
we said that for the most part
we just want to get pointers
to these things back
so that's why everything is
written in terms of pointers,
except enumerations of
course, cuz you can't
get the address of an
enumerator, doesn't happen.
So if you want a pointer
to a global variable,
you can ask for x dot
pointer, and that just calls
this reflect pointer trait, same thing.
We just end up replacing these things
with expressions that compute the address
of that particular
object, and it gives you
full access to that as you go.
Probably worth noting
for variables at least,
that this will not work at all
if you try to give it a local variable
because local variables do
not have a constant address.
That would be bad.
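A sketch of that constraint in plain standard C++: a global's address is a constant expression, a local's is not, which is why the pointer projection can only work for entities with constant addresses.

```cpp
// A global's address is a constant expression, so a pointer projection
// can be baked in at compile time...
int global_value = 10;
constexpr int* gp = &global_value;

// ...but a local's address is not:
void local_case() {
    int local = 20;
    // constexpr int* lp = &local;  // error: not a constant expression
    (void)local;
}
```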
But you can get the address
of a member variable,
and a member function
and so you can use that
to index into, if you
have an object you can use
that particular pointer to access
the members of the class at runtime,
which is kind of nice.
Dot value for enumerators gives
you the compile time value
which is kind of nice.
Let's see here.
Asking for the members of a class is fun,
cuz it was not easy to implement.
So this gives you a tuple.
Remember, every member of a
class could potentially be
a different kind of thing.
You can have constructors,
you can have a destructor,
you can have conversion operators,
you can have normal member functions,
you have member variables,
static member variables,
which in clang are actually
not member variables but variables.
You can have access specifiers.
Did you guys know you can
reflect an access specifier?
Turns out to be important.
You can get injected
class names, that's fun.
So there's a lot of things you can get
in that tuple if you ask
for the members of a class,
that you can get from there.
And so we generate this
tuple, I can't say tuple,
it's a tuple-like object
that gives you access
to all those numbers in turn.
So you can ask for
tuple size of that thing
and it'll tell you how
many members you have
although it is not a useful operation
because normally people don't count things
like access specifiers and injected names.
If you use get zero, you can
get access to the zeroth member
which will almost certainly be
the injected class name of the type.
It works nicely, I mean it works okay.
Get and size are lazy, they're just built
on a couple of intrinsics,
I think reflect member size,
and reflect member, I don't
remember the precise names
of them, I wrote it about a year ago.
So we don't actually build
the tuple in one shot.
Otherwise if we tried to do that,
we would actually have to build
an entire abstract syntax tree
the second you build anything
and that seemed a bit
crazy, so you don't want
to compute those values.
You definitely do not want
to create types of tuples
of types of tuples for a single operation,
even though it's gonna
stick around and be reused.
Most of the time however, these
tuples are actually filtered
cuz nobody wants all of
the members in a class,
you might want the member variables,
you might want the member functions,
or the constructors, so
we have a wrapper tuple
that tries to filter these things out.
I must report that I
am not the best person
working with these data structures,
I'm sure that Louis if
he was here, Louis? No.
Or Eric if he was here,
might be able to craft
a better implementation
of that data structure;
first pass I got it to work, I'm happy.
I will not look at that
code again it is disgusting.
Anyways, which is what I just said.
So the problem with
this is that we end up--
we still have to be able
to use these different--
we have the problem now of programming
with heterogeneous
containers, so you go from
template metaprogramming with type lists
to programming with
heterogeneous containers
and there's still a lot
of overhead to this.
We still have good tools
for it, like Boost.Hana
is fantastic, we could have foregone
any kind of additional language support
and just said, "We must
standardize Boost.Hana,"
and that is how you will work
with these template meta programs.
I didn't really like
that, Standards Committee
might have issues with it, I don't know.
Ville? Standardize Boost.Hana?
Not here. Disappointing, I know.
So the question was, can we do better?
We can do a little bit better.
We ended up creating a new
language feature on the side,
by the way this has kind of been the story
of how reflection and metaprogramming go,
like, you try to solve one problem,
you run into another, and you create
new language features as you go.
So we invented a new kind of for loop.
So if you write for dot dot dot,
and Herb alluded to this in his talk,
you don't get a loop, you
actually take that thing
that's on the right hand side,
auto x in member
variables, that's a tuple.
And so conceptually, you
iterate over the members
with the tuple, that's not right at all.
What happens is you basically
just unroll the loop
so every instance of the loop body binds
for the next member of the tuple.
And this simplifies
programming a little bit.
It doesn't solve all problems,
it solves a particular set of problems.
It certainly makes it
easier to write loops,
basically to apply a particular function
to every member of a tuple.
So this ended up getting
extended over the summer,
by a student of mine, Sam Goodrick,
honor student at U Akron,
we actually expanded that
to include--
tuples were done--
we extended it to incorporate arrays
and then we said, well why not
just destructure classes too?
So basically anything that you can use
with a structure binding you can also
throw through this thing, so simple structs.
Fantastic that works.
I think we were going to go
after parameter packs too,
as in if you have an
unexpanded parameter pack,
and you instantiate this loop
with a pack argument, you'll just unroll
over the entire pack.
I expect to write a paper for
this for C++20 in Albuquerque,
mostly just updating an old paper,
but this is not a bad feature.
I like this feature a little bit.
Normally pack expansions expand
to a sequence of arguments,
so you can only expand in
a limited set of contexts,
usually where expressions are required.
This takes that and linearizes it
to expand over sequences of statements,
which is kind of a new
thing that we can do.
Haven't gotten to expand over
sequence of declarations yet,
that might be tomorrow.
So the other projection operators
that we have for these things,
they are currently, at least
in the implementation so far,
are spelled thusly: typename,
namespace, and idexpr.
Herb and I have been discussing
whether it's possible
to compress all of
these things to a single
projection operator whose meaning depended
on the thing that you
project, it might be nice
to simplify all these.
I think in some cases it
might not be possible,
but again we're still exploring the idea.
These ideas do actually come from P0633,
I think they were almost directly taken
from Daveed Vandevoorde's metacode ideas.
And the idexpr I think is a reference
to something that appeared
in an early draft of 385.
So reflecting the type name
is pretty straightforward.
If you have a reflection value,
if you have one of these
objects of meta::type,
you just call typename and that
gives you a type specifier.
So if you have metax,
you can ask for its type,
pass out the typename, you get its type.
But of course if you give
it a type declaration,
the operator should be
smart enough to know
that you actually want the
type of the type's declaration,
in this case an int so you get int,
or you could just directly
go ahead and pass in
the reflection of a type and generate that
as a type specifier on its own.
All three would give you
int y whatever equals x.
Namespace projection, to be honest,
I have not implemented this guy yet,
just demand hasn't been particularly high,
I think it's useful, I suspect
it's useful. I don't know.
But this is how it would work.
You can apply namespace to one
of these reflection objects,
context by the way will
give you a reflection
of the enclosing context
of the declaration,
for those of you who might work on clang,
that would be the semantic context
not the lexical context.
So you could write using
N equals that thing
and then N becomes a namespace
that you can use wherever,
or as a namespecifier that you
can use in different contexts
or you could use it directly
in a nested name specifier
and expand to these kinds of things.
It seems reasonable.
Idexpr is actually one of my
favorite, I like this one.
So if you ask for the ID expression
of a named declaration,
you get an identifier,
actually you get an unqualified ID.
That unqualified ID typically
forms an ID expression
that is then looked up,
especially if you're in an expression.
So if you ask for the idexpr of F,
you get the identifier
F which gets looked up,
in this case there's
only one definition of F,
but because it's just
an ID that could invoke
an overload set, it's
just an unqualified ID.
Nice. Useful? Maybe, maybe not.
Until you actually wanna
start building IDs.
So we have a generalization of this
that allows you to add
multiple arguments in,
and I have to be honest this supports
the code-generation side
of this a little bit more
than in reflection and projection,
but you can start building an ID.
So you can pass in reflections,
and that'll just take the
name of the reflection
and add that to the
string, or you can pass
in a string literal, and it'll
append the string literal;
you can pass in an integer,
and it'll append the integer.
I haven't figured what else I
might want to push in there,
but those are the three
things I can deal with so far.
And we can make IDs
like dollar F function 3
and it generates function F,
function 3 for lookup later on.
Because it also generates
an unqualified ID,
it turns out you can use
these to declare things too.
So if you really need to generate the name
of a declaration programmatically,
this is possible.
You can certainly do this.
And again this will show up, I suspect,
extensively when you start talking
about generative metaprogramming.
It's not specifically about reflection,
but it fell out of the reflection work,
and it's not actually a happy coincidence,
this is very much by design.
So implementing stringification
in this particular context
is fairly easy, that's what it looks like.
Any questions?
And yes your optimizer may in some cases
reduce that to a switch statement.
So what happens is you run
the source-- ah, I missed a dot.
Man, this talk is ruined.
(laughter)
Probably ruined it when I started.
Anyways so, for dot dot
dot auto x in enumerators
that will expand to the
sequence of compound statements
that include the if statement,
each consecutive thing will--
that might not, well that might reduce--
basically tests for the value
then return the name if you can.
I would hope that an
optimizer would change this
into a switch statement
if at all possible,
not entirely clear.
Herb actually gave an example of work
that I'll talk about tomorrow
where you could do this
directly in a switch statement.
Although again, not guaranteed
to work for all enums
if you have multiple
names of the same value,
cuz switch statements
are picky about that.
The work is very much equivalent
to the semantics of 194.
I didn't think this when I started,
but it wasn't very hard for
me to figure out that it was.
Going from 589 to 194 is
just defining reflexpr
as the decltype of reflection,
that's how you get there.
The library and the interfaces
still need to be adapted
or whatever, but that's just programming.
Going the other way if
you create an object
of reflexpr type from 194,
you get back essentially
what is returned by dollar
x, so that's the basic idea.
We have the exact same problems
that those guys did earlier,
with one difference,
there is one difference.
The approach taken earlier,
they create a new class
for every reflection, we
instantiate a template.
So you run into this particular problem,
this caught me late, maybe in July.
I was trying to write a
metaclass for something
and I ran into this problem,
I couldn't figure it
out for a little while.
When you call a function on
one of these reflected objects,
you are instantiating the
definition of that function.
It will not change, you
have instantiated it.
So if I have a struct
S, and I static assert
that S is complete, it's not, it's false.
That call to S dot complete,
that locks in the definition,
the falsity of that
behavior, so even if you
define the class later on,
it will continue to be false.
Which gets a little bit weird
if you ask for the members
of an incomplete class
you get an empty tuple and then later on
you ask for the members
of a complete class
and you get an empty tuple,
like oh my God what's happened?
My class has evaporated.
Leads to some surprising results,
but again if you depend on
the completeness of classes
for any part of your design, sorry,
if your design depends on
changing the completeness
of classes from incomplete to complete,
you probably deserve
this kind of behavior.
This is not a thing that sane
people write programs to do.
Moving forward.
Direction was given by SG7 to
pursue a reflection approach
that included only values,
so that we do not create
or instantiate types per reflection.
In other words, take classes
out of the implementation.
Values are cheap, cheap to create,
there's no lasting memory
footprint for them,
they go away usually, they're
simple discriminated unions
inside of clang, pretty much.
Pretty sure they're
the same thing in GCC.
If anybody here works on clang
it's called APValue, it's a nice class.
So I did this, I have an
implementation of that, it's here.
It's not complete.
I know I have 15 minutes left,
I have extra slides I can talk about that,
but I'm gonna wrap up and
give time for questions first.
I had to, there was a
context for all the things.
So look, all the work so far has focused
on reflecting properties of declarations,
cuz that's kind of the thing
that we want to deal with the most,
we want to generate declarations,
analyze declarations,
write programs that operate
over the members of declarations,
like specifically classes.
There is a whole wide open world
when you start moving to
reflecting on expressions.
Anybody heard of Template Haskell?
Yeah, that's that approach.
So we're not very far away from
doing that particular thing,
which it's a good thing I
talked to Gabby yesterday
and he pointed me at that
so I could look at it this morning.
Anyways, so the idea is that
we want to be able to reflect,
I think that we wanna be able
to reflect on more things.
I don't know yet, I
don't know if we wanna go
in that direction; I suspect that we do.
So maybe let's just be
reasonable about this,
we don't need to constexpr
all the things,
we don't need to template
metaprogram everything,
but certainly there are
times that this could happen.
So let's see acknowledgement,
some NSF funding,
some support from Microsoft,
thanks to Matt Godbolt
for hosting the experimental compiler,
although not that one yet, cuz that's
not quite ready for prime time.
And thank you to any early users,
and stop sending bug reports, please.
It's not that I don't want them,
it's that I'm moving forward
and not trying to be
perfect, that's all.
So, if there are any questions now,
I will be happy to answer them,
otherwise I have 10 more slides
on the other stuff I've done.
Yes.
- [Audience Member]
So, the backbone's like
one of the things that can be reflected?
28 you say? Yes.
- [Audience Member] So
I think the fact that
(mumbles)
you can reflect on a
vector of eight but not...
So good question. You can indeed.
Sorry repeat the question I'm being told.
I have typeid here,
sorry I keep forgetting
to repeat the question.
Can I reflect on a template specialization
and can I reflect on a template?
- [Audience Member] Well,
I assume you can reflect
on a template specialization, but you seem
to leave out just templates.
So the answer is that you should be able
to reflect on both, I
left templatename out
because I haven't looked at it yet.
And reflecting on a
template specialization
requires a library design that
I haven't quite gotten to,
but it's interesting cuz
then you can start asking
for the template arguments of the thing
and it starts opening up a whole new,
a brave new world as it were.
Like you get the template
from a template specialization
and now your rebind operations go away.
Questions? Yes.
(audience member mumbles question)
Ask that one more time.
- [Audience Member] If
I went home with DSL,
could I run a computation for that?
Probably not yet.
Sorry the question is, thank you Jennifer,
the question is can I
use the implementation
to build a DSL by iterating
over the members and types
and generating new things from these?
The answer is probably not yet.
More on this tomorrow,
but I've had to step back
and rethink how all the
code generation stuff works.
You can certainly reflect
on a lot of things,
not templates or template IDs just yet,
cuz we don't have the
library design for it,
but you can't use that to
generate interesting things.
- [Audience Member] So I
was playing with it earlier,
and one of the things that I wanted to do
is see what (mumbles)
basically in all modules.
So I wanted to make the
syntax prettier by doing that,
so what I was curious about is,
would you be able to make it reflect
on whether a member function
or a variable is static?
And if it's non static, make it static?
Or if it's static make it non-static?
So the question is can you
reflect on a member variable
or a variable to determine
whether it is static
or non-static and if
so, make it different?
- [Audience Member] Yes.
Yes. You can detect the storage duration
of a variable or member variable
and in the code generation stuff
that I am not talking about today,
you can actually change,
as you generate new code,
whether that should be
static or non-static.
Sorry let me clarify,
you can make it static.
But once you've written something,
and this really is about the next talk,
once you've written a thing,
we don't erase things.
If you declare something to be static,
you mean it to be static.
If you just write int something,
you might not necessarily,
there may be transformations
you can apply to make it
a little bit different.
But we don't erase specifiers.
- [Audience Member] So if
it's a member function,
like a method, I can make
a static member function
and it will be like?
Through code generation, yes.
- [Audience Member] Cool
Can I make a member function
a static member function
through code generation? Yes.
- [Audience Member]
Please use microphones.
Yes, or come up.
- [Audience Member] We have microphones.
Any other questions?
How much time do I have Jennifer?
- [Jennifer] 10 minutes.
10 minutes? You guys wanna see what I?
Oh, wait two questions.
- [Audience Member] It's about this slide,
so a member can be a type ID?
Yes. Could you? Say that again.
- [Audience Member] Can
that be a runtime type ID,
or does that have to be a compile-time expression?
Can the type ID be a runtime type ID?
- [Jennifer] Can you
rephrase the question?
- [Audience Member] Can
I reflect on the type ID
at runtime and say, write
function that symbolizes a ?
Can I reflect a type ID at runtime? No.
That is the syntactic
form of type identifier,
it is the type that you write there.
If you want runtime type information,
dynamic cast and RTTI or
type ID are your operators.
- [Audience Member] So type
ID you don't switch type?
No, type ID here refers to a
production in the C++ grammar
not a type underscore ID.
Yes?
(audience member mumbles question)
(laughter)
So the question is, can we
lose the dots in the for loop?
And the answer is Evolution suggested no.
Because this is not in fact a for loop,
it does not iterate, it generates code
and so the request was by Evolution,
because it does something so different,
there should probably be something
that visually signifies what
it actually does. So, dots.
- [Audience Member] Actually
my question was also about
the for dot dot dot.
I think there's already
actually a function
called apply, which
is basically a one-liner
that lets you apply a variadic lambda
to every member of the tuple,
it's pretty straightforward
you can tweak this so you
also have early (mumbles)
and so on. So given that
one, sort of the rationale
of having a language
feature at all-- it seems
for dot dot dot is not really necessary.
Like is it compiled by
or for something else?
The question is, is for dot dot dot
really necessary, given the fact
that we have a function
called standard apply?
There's overlap, so I think
that there's certainly overlap.
So I know that there are a
lot of people in the room,
that if they see a for loop would tell you
to use an algorithm and I
don't necessarily disagree
with that but at some point
a lot of those algorithms
reduce to for loops so if
you really wanna think of it
this way, it's implementation support
for people writing apply.
It is implementation support
for the people who have to write apply.
- [Audience Member] Apply
is a one-liner as it is.
Implementation-wise?
- [Audience Member] Yeah.
I hear to the contrary. You guys can talk
about that afterwards, I
will bow out. Questions?
- [Jennifer] FYI, further
questions can use the microphone.
Further questions can use the microphone.
Thank you, Jennifer.
Come on down.
I like this. Dance.
(laughter)
Sorry, Roland.
- [Questioner] So you have
a very simple example here,
where you pass everything to C out,
could I also use this
to aggregate something?
Dammit. Can this thing change the type?
In the fold expression, for instance,
if the initial value on
the left is whatever,
I can pass something in
that returns a new type
and continue this for
instance to build a typeset,
can I use something here as well?
So as far as I understand, probably not.
You can't change the value of these things
as you iterate through
them because they expand.
You can still carry variable
state through the expansion,
but as far as modifying things
that you're expanding over, not yet.
So there's a limited range of things
that you can do with the expansion.
I don't really know how
useful it is in practice,
I think the template metaprogram approach
gives more mileage because
you can break out early,
and we don't have really control
for doing that with this.
Because in mathematics we get numbers
and we get groups and we thought well,
this is pretty abstract,
but then we got categories,
and categories of small
categories and etc.
Do you think it's going
to be the same case in C++?
No, I don't really
think of metaprogramming
as an abstraction mechanism, per se.
I think it's a tool you use
to buy into abstractions
that C++ already has.
Does that make sense?
- [Questioner] Yeah.
So I think that concepts are the height
of the abstraction that we have in C++,
they specify very broadly
what values you have
and what expressions
work with those values.
Almost everything that we do in C++,
every line of code that we write in C++,
is at least when you write a class,
is an attempt to buy into one
of those concepts or another.
Whether it's default construction
or copy construction or writing the code
that locks down a resource,
a resource value, against
copying, for example.
Or equality comparison or less comparison.
So all of the things that
we build around a class,
a lot of the things that
we build around a class,
tend very much to be about the concept
that you are trying to satisfy,
the thing that you're trying to model.
If you wanna talk about categories,
maybe there's a mapping there,
I don't think of it that way,
but then again I'm not
that kind of professor,
I write code.
(chuckles)
So I see this as a way of opting in
to abstraction facilities
that the language already has,
not as a way of making things
that much more abstract.
And I suspect that if you
want to think of it that way,
you might not be thinking of it correctly.
- [Questioner] Alright, thank you.
- [Questioner] Concretely,
I had a need or a desire,
is pointer-free predicate
and is static reflection
as proposed necessary and sufficient
feature condition for having that?
To detect whether a
type is pointer-free?
So, if a class doesn't
contain any pointers,
is it pointer-free?
I think so. I'll go out
on a limb and say yes.
Okay, I thought so.
But don't ask me to put
money on that just yet.
So does this only work for for-each loops?
Or can you do for dot dot dot, open brace,
and then a classic C-style for loop
and have that kind of work?
For dot dot dot and then what?
C-style for loop? Like auto i equals zero,
i is less than blah?
Potentially. I've actually
given this some thought.
So what's required to make that loop work
for expanding over a range
is that begin and end
have to be constant, your
iterator has to be literal type,
actually no you don't even
really need an iterator,
you just have to do
really interesting things.
You almost certainly require
a random access iterator
to do it, but yeah if you have
constants at the begin to end
and your iterator is
literal and random access,
I think that you can do that, yeah.
That'd be interesting, at least.
It is a thing I have thought of
because of the other 10
slides that I'm not showing.
(laughs)
So you mentioned using the
idexpr to define an identifier,
programatically, is that
something you could use
to declare an identifier
that is not actually normally
legally typeable as an identifier?
Such as with a space in it or something?
For better error messaging.
Easiest question all day. No.
And no you can't put a
poop emoji in there either.
(laughter)
I know string literals support that.
If you have more questions,
I will be around.
Thank you for coming.
(applause)
