>> CROWL: Good morning.
Welcome to Halloween's TechTalk here.
We thought we'd dress up as the scariest
thing we could find which is two language
lawyers in front of a large room.
So--here.
So, this is Matt Austern.
He's going to be handling the second half
of the talk, and I'm Lawrence Crowl.
And we have a fairly whirlwind talk here.
So what we'd like you to do is hold on to
your questions until the end of the talk because
you might find that your question turns into
something more interesting.
And the other thing we'd like to ask is that
if you can ask your question in a way that's
not Google confidential that would allow us
to put the talk up on YouTube.
So let's get started.
And the first thing we're going to do is go--a
little bit of history.
C++ has a very long history and it developed
through the '80s.
And then, about 1990, the International Standards
Committee formed, to try and get a formal
standard coming out.
That was successful in producing C++ 1998,
which was the first real standard.
Everything else before that was de facto.
And then five years later, we produced what's
called Technical Corrigendum 1.
So that's a formal update to the standard,
but it was mostly a bug-fix release.
And now we're working on the first major revision
of the language which is currently scheduled
to come out in about 2010 where that X is
the Roman digit.
There are four major thrusts of the language
right now.
The first one is concurrency because multi-cores
arrived.
The second one is, a lot of clean up in the
language to make programs easier to write
and more robust in the face of working programmers.
And a major feature is Concepts which helps
do type checking for templates.
And then, we have a significantly expanded
standard library that helps us get through
Evolve.
First major topic is concurrency.
This work is not something that you would
find a big stretch.
However, there's an awful lot of details to
make it all work together and play together.
And none of this is going to be particularly
new except perhaps in the memory model which
is at the bottom layer here.
The first thing is that you don't get to read
and write to memory at random once you have
threads.
You have to control how each thread reads
and writes to memory.
So if you write in one thread and read from
another thread, as in the first example here,
that's a data race.
Your program is undefined; it could launch
the missiles and order anchovy pizza.
So don't do that.
The procedure to make this work is with release
and acquire.
So you write to a regular variable on one
thread, then you write to an atomic variable
with release semantics, then another thread
reads from that same variable with acquire
semantics, and now it can see the write to
the regular variable that you did earlier.
And you can do this with locks as well.
But it's very important to note here that
you have to use the same variable.
Here, we change from A to B. Those are two
different atomic variables.
They don't carry that same information.
You have to make sure it's the same one.
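A minimal sketch of the release/acquire pattern just described, written with the names that were eventually standardized in `<atomic>` (the slides may have used slightly different spellings). The variable names `payload` and `ready` and the driver `run_release_acquire` are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                  // ordinary, non-atomic variable
std::atomic<bool> ready(false);   // the single atomic both threads use

// Writer: plain write first, then a release store to the atomic.
void producer() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Reader: acquire load of the SAME atomic; after it sees true,
// the earlier plain write to payload is visible.
int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Hypothetical driver: runs both sides and returns what the reader saw.
int run_release_acquire() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    assert(seen == 42);
    return seen;
}
```

If the reader polled a different atomic than the one the writer stored to, nothing would order the two accesses to `payload`, and the program would have a data race.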
And the atomics have a bunch of useful stuff:
reads, writes, atomic increment, atomic XOR.
And in addition, we have a compare and swap
member function which allows you to get the
full generality of the kinds of things you'd
want to do, linked lists, all of that stuff
you can now do, and that's designed to work
with a loop where you try and figure out what
you want and see if you can get to what you
want.
And there are weaker primitives available
for super experts.
I won't go into those.
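The compare-and-swap loop mentioned above might look roughly like this. It is only a sketch: it uses `compare_exchange_weak`, the name the operation received in the final standard, and the helper `atomic_store_max` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Raise `target` to at least `candidate` with a compare-and-swap loop.
void atomic_store_max(std::atomic<int>& target, int candidate) {
    int observed = target.load();
    // If the exchange fails, `observed` is refreshed with the current
    // value, so every retry re-decides whether an update is still needed.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate)) {
    }
}

int demo_store_max() {
    std::atomic<int> high_water(10);
    atomic_store_max(high_water, 7);    // smaller: leaves the value alone
    atomic_store_max(high_water, 25);   // larger: replaces the value
    assert(high_water.load() >= 25);
    return high_water.load();
}
```

The loop shape is the general one: read the current state, compute the state you want, and try to install it only if nobody changed the variable in between.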
Global variables need to be initialized.
And once you start having threads, you have
to start having initialization scenarios.
There's two approaches.
One is for regular global variables.
The initialization is a bit weaker.
So you used to be able to rely on a global
variable defined in another translation unit
either being fully initialized or being zero
initialized.
You now cannot rely on that.
You have to either not use it or know that
it's been initialized.
For function-local statics, which is something
near and dear to the hearts of Google, those
initializations are now synchronized.
So you can rely on those things being right.
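A small sketch of what synchronized initialization of a function-local static buys you. The `Registry` type and the counting are invented for the example; the point is that many threads can race into `registry()` and the constructor still runs once.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions(0);

struct Registry {
    Registry() { ++constructions; }
    int id() const { return 7; }
};

Registry& registry() {
    static Registry instance;   // initialized once, even if threads race here
    return instance;
}

// Hypothetical driver: hit registry() from several threads at once.
int count_constructions(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([] { registry().id(); });
    }
    for (std::thread& t : workers) t.join();
    return constructions.load();
}
```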
The language also has thread-local storage.
A lot of compilers already have the first
line with simple primitive types and PODs.
That's being formalized and extended to allow
types with constructors and destructors.
But when you have thread-local variables,
their addresses are not constant so you can't
use them as arguments to templates.
The--probably the biggest use of thread-local
storage is in two areas.
One is if you have a multi-process program
that you're trying to turn into a multi-threaded
program, and the other is for thread-local
caching.
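As an illustration of thread-local storage, here is a sketch using the `thread_local` keyword as it ended up in the standard. The counter and the `Counts` struct are hypothetical; each thread sees its own copy of the variable.

```cpp
#include <cassert>
#include <thread>

thread_local int calls_on_this_thread = 0;   // one copy per thread

int bump() { return ++calls_on_this_thread; }

struct Counts {
    int caller;   // value seen by the calling thread
    int worker;   // value seen by the spawned thread
};

Counts demo_thread_local() {
    calls_on_this_thread = 0;
    bump();
    bump();                                   // caller's copy is now 2
    int worker_seen = -1;
    std::thread worker([&worker_seen] { worker_seen = bump(); });
    worker.join();                            // worker's copy started at 0
    assert(worker_seen != -1);
    return Counts{calls_on_this_thread, worker_seen};
}
```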
We have mutexes and locks, and there's a distinction
here.
A mutex is a data object that will live in
your--in your objects.
A lock is something that is a local variable.
And creating the lock then acquires--creating
the lock then acquires the mutex, and when
it goes out of scope, it releases the mutex.
You can do that without the--this lock guard
here but then you have to worry more about
exceptions.
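The mutex-versus-lock distinction can be sketched like this, using `std::mutex` and `std::lock_guard` as standardized. The `Counter` class is made up: the mutex is a data member, and the lock is a local variable whose scope is the critical section.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);   // acquires the mutex
        ++value_;
    }                                               // released here, even on a throw
    int value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }
private:
    mutable std::mutex mutex_;   // the mutex lives inside the object
    int value_ = 0;
};

// Hypothetical driver: several threads incrementing one counter.
int count_with_threads(int thread_count, int per_thread) {
    assert(thread_count >= 0 && per_thread >= 0);
    Counter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) counter.increment();
        });
    }
    for (std::thread& w : workers) w.join();
    return counter.value();
}
```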
And a special case of mutexes is the once
from pthreads, and this has been formalized
to be more general than just simple functions
and it says, "We're only going to call this
argument once."
So for any given flag, you'll only fire once.
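A sketch of the once mechanism, assuming the final names `std::once_flag` and `std::call_once`. The `initialize` function and the counter are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::once_flag init_flag;
int init_runs = 0;   // only written inside call_once

void initialize() { ++init_runs; }

// Hypothetical driver: many threads all request the initialization.
int run_once_from_many(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < thread_count; ++i) {
        workers.emplace_back([] { std::call_once(init_flag, initialize); });
    }
    for (std::thread& w : workers) w.join();
    return init_runs;   // safe to read: all workers have been joined
}
```

Because the flag is global, calling the driver again does not run `initialize` a second time.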
We also have condition variables which allow
you to move locks and--from one thread to
another until certain conditions are met.
And in particular, if we have a buffer, there
are two conditions that are important to controlling
the flow.
One is whether or not it's full, and one is
whether or not it's empty.
And typically, you want to know when it's
not full or not empty.
And then, you can wait for a certain condition
to be true and that has to occur in a loop,
reevaluating the condition because you may
get spurious wakeups.
So this is not guaranteed that the condition
will be true when you return from the wait.
But then it's likely to be true.
You still have to test.
And then, of course, when you've achieved
some condition, you want to make sure that
you notify that condition as well.
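The buffer with "not full" and "not empty" conditions could be sketched as follows. This is not the slide's code; the class and member names are invented, and it uses `std::condition_variable` as standardized. Note the `while` loops around each wait, which handle spurious wakeups.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    void put(int item) {
        std::unique_lock<std::mutex> lock(mutex_);
        while (items_.size() == capacity_) {   // re-test after every wakeup
            not_full_.wait(lock);
        }
        items_.push_back(item);
        not_empty_.notify_one();               // we just made it non-empty
    }

    int get() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (items_.empty()) {               // re-test after every wakeup
            not_empty_.wait(lock);
        }
        int item = items_.front();
        items_.pop_front();
        not_full_.notify_one();                // we just made room
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<int> items_;
    const std::size_t capacity_;
};

// Hypothetical driver: push 1..count through a two-slot buffer.
int sum_through_buffer(int count) {
    BoundedBuffer buffer(2);
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) sum += buffer.get();
    });
    for (int i = 1; i <= count; ++i) buffer.put(i);
    consumer.join();
    return sum;
}
```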
The way we create threads is there's a standard
thread class and then when you construct that
thread, you give it something to execute which
could be a function or in--or a class that
has a call operator.
And once you create that thread, that function
starts executing in that thread.
And then, when you want to synchronize with
the termination of that thread, you can join
with that thread later.
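Creating and joining a thread, sketched with `std::thread` and a class that has a call operator. The `Summer` type is hypothetical.

```cpp
#include <cassert>
#include <thread>

// A class with a call operator; the thread runs operator() on a copy of it.
struct Summer {
    int* out;
    int limit;
    void operator()() const {
        int total = 0;
        for (int i = 1; i <= limit; ++i) total += i;
        *out = total;
    }
};

int sum_in_thread(int limit) {
    assert(limit >= 0);
    int result = 0;
    std::thread worker(Summer{&result, limit});   // starts executing right away
    worker.join();                                // wait for it to finish
    return result;
}
```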
The big problem with threads as they stand
is that if you get an exception in here, the
exception will then destroy this thread which
will detach the thread and detached threads
have much weaker semantics and are really
hard to get a hold of.
So you really want to avoid having detached
threads.
And so those exceptions cause problems.
We have mechanisms to deal with that.
In particular, you can catch an exception,
find out what that current exception is, copy
that current exception into a global variable
or some other data structure, and then in
another thread, you can re-throw that exception
which allows you to catch an exception in
a thread and then pass it to another thread
to be re-thrown which shows up in futures
and so forth.
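A sketch of carrying an exception from one thread to another, assuming the standardized names `std::exception_ptr`, `std::current_exception`, and `std::rethrow_exception`. The function name and the message are made up.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

std::string rethrow_across_threads() {
    std::exception_ptr captured;   // empty until the worker stores something
    std::thread worker([&captured] {
        try {
            throw std::runtime_error("worker failed");
        } catch (...) {
            captured = std::current_exception();   // copy it out of the thread
        }
    });
    worker.join();
    assert(captured);
    try {
        std::rethrow_exception(captured);          // re-throw in this thread
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "no exception";
}
```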
So what happens here is that we are creating
our work and now we have an intermediate.
So we create a packaged task.
This wraps the work, and then, we will move
that task into the thread and now the thread
is executing that task.
But before we sent that away, we grabbed a
future from that task which we can now wait
on that future and get its value.
And if that work happened to throw an exception
when we called the get, we would get that
exception.
So now, we have an exception-safe way to deal
with the threads.
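The packaged-task flow described above, sketched with `std::packaged_task` and `std::future` as they were standardized. `risky_work`, `run_task`, and `task_throws` are hypothetical names.

```cpp
#include <cassert>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

int risky_work(int x) {
    if (x < 0) throw std::invalid_argument("negative input");
    return x * 2;
}

int run_task(int x) {
    std::packaged_task<int(int)> task(risky_work);   // wrap the work
    std::future<int> result = task.get_future();     // grab the future first
    std::thread worker(std::move(task), x);          // move the task into the thread
    worker.join();
    return result.get();   // returns the value, or re-throws the work's exception
}

bool task_throws(int x) {
    try {
        run_task(x);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

The exception thrown inside the worker never escapes the thread function; it is stored in the shared state and surfaces at `get()`.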
The next major area of talk is this sort of
general bucket of making programs better.
And some of it is C99 compatibility, and some
of it is just moving around in different areas.
The simple things from C99 have been carried
forward, all the changes to how integer types
work have been carried forward, the preprocessor
work has been carried forward.
So that's all the same now and you can write
C preprocessor stuff that's agnostic to the
language.
A lot of the headers have come forward.
There are some things in C99 that didn't come
forward.
In particular, variable length arrays.
And the reason is the typing of variable length
arrays in C is just incompatible with the
C++ typing model.
Some things that just didn't have enough support
to go anywhere like the restrict qualifier.
Compound literals also didn't match C++ semantics.
Complex numbers, C++ already had complex.
There has been some adjustments but the basic
syntax is still going to be a bit different.
The current language only lets you put PODs
in unions, and that restriction has been eliminated.
So you can now put classes with full constructors
and destructors inside of unions, but it's
not free.
Among other things, you have to define the
special member functions yourself.
So you have to define a default constructor,
you have to define the copy constructor.
All of that, you have to do yourself because
the language doesn't know what to do with
it.
And when you want to change fields, you can't
just assign a new value to the field.
You have to first delete the old field and
then construct the new field with the placement
new.
So you have to say, "That one's gone.
This one's coming in the scope."
So you have to remember where you were.
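A sketch of a union holding a class with a real constructor and destructor. Everything here (`IntOrString`, `Storage`, the flag) is invented; it shows the bookkeeping the speaker describes: you write the special members yourself, destroy the old member explicitly, and construct the new one with placement new.

```cpp
#include <cassert>
#include <new>
#include <string>

class IntOrString {
public:
    IntOrString() : is_string_(false) {}
    ~IntOrString() { clear(); }
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    void set(int n) {
        clear();
        value_.number = n;
    }
    void set(const std::string& s) {
        clear();
        new (&value_.text) std::string(s);   // bring the string member to life
        is_string_ = true;
    }
    bool holds_string() const { return is_string_; }
    int number() const { assert(!is_string_); return value_.number; }
    const std::string& text() const { assert(is_string_); return value_.text; }

private:
    union Storage {
        int number;
        std::string text;
        Storage() : number(0) {}   // must be user-written
        ~Storage() {}              // the owner decides what to destroy
    };
    void clear() {
        if (is_string_) {
            value_.text.~basic_string();     // end the old member's lifetime
            is_string_ = false;
        }
    }
    Storage value_;
    bool is_string_;   // remembers which member is active
};
```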
We also have generalized constant expressions.
Now what this does is basically allow you
to have function calls in constant expressions.
So we declare a function as something we want
to be in constant expressions.
The body of it has to consist of a single
return statement and the expression here,
assuming the parameters were constant, would
have to be a constant.
And now we can use this in array declarations.
We've extended this to also allow you to create
classes that can be constant expressions.
So we have a complex number here.
We declare its constructor to be a constant
expression and it has to be initialized with
full member initializations and the body has
to be empty.
And because we have a constant expression
class, we can have member functions on that
that are also constant expression.
And we can define a constant P, which is a
complex number, and we can initialize that.
So now, P is the constant expression and we
can use that constant expression later, in
particular here, real--the real member function
is a constant expression.
So all of this is a constant expression which
means we can now initialize this double statically,
based on all of that constant expression we're
following through.
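A sketch of generalized constant expressions under the single-return-statement rule described here, using the `constexpr` keyword. The names `square`, `grid`, and this small `Complex` class are illustrative, not the slide's code.

```cpp
#include <cassert>

// Body is a single return statement; constant when the argument is.
constexpr int square(int x) { return x * x; }

int grid[square(3)];   // function call used as an array bound

class Complex {
public:
    // Members fully initialized in the initializer list, empty body.
    constexpr Complex(double re, double im) : re_(re), im_(im) {}
    constexpr double real() const { return re_; }
    constexpr double imag() const { return im_; }
private:
    double re_;
    double im_;
};

constexpr Complex p(1.5, -2.0);
constexpr double p_real = p.real();   // initialized statically

static_assert(square(3) == 9, "square is usable at compile time");

int grid_size() {
    assert(p.imag() < 0);
    return static_cast<int>(sizeof(grid) / sizeof(grid[0]));
}
```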
And we can extend this with user-defined literals,
which is a way to add a little suffix like
you see in C with, you know, 2.0F. That F
is a suffix that helps you extend the literal
space.
We can write a function that extends that
and recognizes this I as a suffix, matches
it to this function and does the right thing.
And so now, we can get a full constant expression
Z that is a constant.
And the committee has plans to do ISO units,
you know, for length, meters, kilometers in
that, but it's not going to be in the current
standard.
We also have delegating constructors.
This has been a long standing complaint of
programmers, where you had to repeat all of
the constructors every time even though one
constructor was very similar to another.
Now, you can define your regular constructor
and define new constructors that appeal to
the earlier constructor, and just reuse all
of that code.
Now, it's possible to write mutually recursive
delegating constructors.
That's undefined behavior.
Don't do it.
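A sketch of constructors that appeal to another constructor of the same class. The `Rect` class is hypothetical.

```cpp
#include <cassert>

class Rect {
public:
    Rect(int w, int h) : width_(w), height_(h) {
        assert(w >= 0 && h >= 0);
    }
    explicit Rect(int side) : Rect(side, side) {}   // reuses the two-argument one
    Rect() : Rect(1) {}                             // reuses the square one
    int area() const { return width_ * height_; }
private:
    int width_;
    int height_;
};
```

All the validation lives in one constructor; the others only choose arguments for it.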
There's been a lot of work trying to regularize
and extend the power of initialization syntax.
So the current language has, like, five different
ways to initialize things.
The goal was to get that down to exactly one
way.
And the syntax we use is these curly braces
basically instead of parentheses or assignment
operators.
And so we can initialize the variable X with
3 but using the curly braces.
We can initialize member constructors with
the curly braces.
And here, we have a list and we can initialize
the vector with a list.
Think Python here where you can just write
a list and the right things happen.
There's one place where we couldn't quite
make curly braces work everywhere, and that
is that vector has--its constructor has parameters
to say how big you want the vector to be.
Well, we need to distinguish that from, how
big do you want the vector to be, to a vector
of one element.
So the parentheses here are going to say,
"You want three elements."
If you were to use braces here, that would
say, "You want a list of one element whose
value happens to be three."
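A sketch of brace initialization, including the vector case where parentheses and braces mean different things. The names are illustrative.

```cpp
#include <cassert>
#include <vector>

struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x{x_}, y{y_} {}   // braces in member initializers
};

int answer{3};                        // a plain variable
std::vector<int> primes{2, 3, 5, 7};  // initialized from a list
std::vector<int> three_zeros(3);      // parentheses: three elements
std::vector<int> one_three{3};        // braces: one element with value 3

int point_sum() {
    Point p{4, 5};
    assert(p.x == 4);
    return p.x + p.y;
}
```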
Enumerations in the current language are kind
of weak because they will decay to an integer
at the earliest opportunity, so there's a lot
of syntactic problems.
We now have stronger enums with this enum
class.
You can specify the base type.
So it's not--it's not inferred.
You can specify it ahead of time.
This allows us to do some forward declarations.
You can have the elements as per normal, the
enumerators as per normal.
But you don't automatically get those enumerators
exported to the containing scope.
So this is an error because green is not in
scope.
In order to get to the green, you have to
go ahead and qualify that.
Likewise, because there's no automatic decay
to int, this comparison is wrong because the
color yellow is not the same as the alert
yellow.
Those are two different types.
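A sketch of the stronger enumerations. The enumerator names follow the talk's colors and alerts, but the exact declarations are assumed; the two erroneous lines are left as comments so the block compiles.

```cpp
#include <cassert>

enum class Color : unsigned char { red, green, yellow };   // base type specified
enum class Alert : unsigned char { green, yellow, red };

// Color c = green;                           // error: green is not in scope
// bool b = (Color::yellow == Alert::yellow); // error: two different types

Color favorite = Color::green;                // enumerators must be qualified

int as_int(Color c) {
    return static_cast<int>(c);               // no automatic decay to int
}

bool is_warm(Color c) {
    assert(as_int(c) <= as_int(Color::yellow));
    return c == Color::red || c == Color::yellow;
}
```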
The language now has static assertions which
allow us to compile-time check certain things.
So this template parameter, we want to make
sure it's greater than three.
We state that with a static assert, and
this text message is what comes out in the
error diagnostics from the compiler.
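A sketch of a static assertion on a template parameter. The `Buffer` template and the message text are made up.

```cpp
#include <cassert>
#include <cstddef>

template <int N>
struct Buffer {
    static_assert(N > 3, "Buffer needs more than three slots");
    int slots[N];
};

Buffer<8> ok;      // fine
// Buffer<2> bad;  // rejected at compile time with the message above

std::size_t slot_count() {
    std::size_t n = sizeof(ok.slots) / sizeof(ok.slots[0]);
    assert(n > 3);
    return n;
}
```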
We can now say I want the default behavior.
So, in the past, there's always a big fight
over whether or not you declare the default
constructor and so that everybody knows you've
thought about it, versus not declaring it
and getting efficiency.
Well now, you can get both.
This says, "I want the efficiency and here's
the--and use the default behavior."
This is particularly important when you want
to have a user-defined copy constructor but
still want the default constructor.
There's no way to do that in the current language.
Likewise, we have it for the copy assignment
operator and the destructor.
We also allow you to get the default behavior
without having it be trivial.
So the problem with writing this is you're
now committed.
That's part of the ABI.
If you don't want to commit to it in the ABI,
you sort of have to write a non-trivial constructor.
This allows you to write that non-committed
interface and still get the default implementation.
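A sketch of both uses of `= default`. The type names are hypothetical, and the checks use `<type_traits>` from the final standard library. `Plain` defaults its constructor in the class and stays trivial; `Opaque` only declares it and defaults it out of line, which keeps the compiler-generated body without promising triviality in the interface.

```cpp
#include <cassert>
#include <type_traits>

struct Widget {
    Widget() = default;   // keep the default constructor...
    Widget(const Widget& other) : copies(other.copies + 1) {}   // ...despite this
    int copies = 0;
};

struct Plain {
    Plain() = default;    // defaulted in the class: trivial
    int value;
};

struct Opaque {
    Opaque();             // declared only: not trivial
    int value;
};
Opaque::Opaque() = default;   // still the default implementation

static_assert(std::is_trivially_default_constructible<Plain>::value,
              "defaulted in-class stays trivial");
static_assert(!std::is_trivially_default_constructible<Opaque>::value,
              "defaulted out of line is not trivial");

int copies_after_one_copy() {
    Widget a;
    Widget b(a);
    assert(a.copies == 0);
    return b.copies;
}
```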
The language also has move semantics which
is implemented with a language feature called
rvalue references, which is kind of a tricky
feature.
But basically what it allows you to do is
have structures that are moved rather than
repeatedly copied.
And this can substantially improve the efficiency
of code, in particular, things that you want
to write in a functional style or where
you don't really want multiple copies of something.
So here, we have a wrapper for FILE* and
we don't want--there's no suitable default.
We don't want to copy it.
So what we're doing is we're deleting that
function.
The simple--the symbol still exists but if
you try to call it, that's an ill-formed program
and the compiler will catch that.
And here, we have a move constructor.
What this says is, "I will only accept parameters
that are sort of rvalues and not lvalues."
And then we can close--we can close that file.
And here, we're opening a file foo and the
push_back operation is declared to take an
rvalue reference parameter so it can now
move from the file created.
This allows us to shuffle data through many
calls--many call parameters without actually
doing any significant copying.
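A sketch of a move-only wrapper around `FILE*`, loosely following the description above. It is not the slide's code: the class is invented, and it uses `std::tmpfile` instead of opening a named file so that it does not depend on the file system layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <utility>
#include <vector>

class File {
public:
    explicit File(std::FILE* handle) : handle_(handle) {}
    File(const File&) = delete;              // no suitable copy: delete it
    File& operator=(const File&) = delete;
    File(File&& other) noexcept : handle_(other.handle_) {
        other.handle_ = nullptr;             // the source gives up ownership
    }
    ~File() {
        if (handle_) std::fclose(handle_);
    }
    bool is_open() const { return handle_ != nullptr; }
private:
    std::FILE* handle_;
};

std::size_t store_temp_files() {
    std::vector<File> files;
    files.push_back(File(std::tmpfile()));   // temporary is an rvalue: moved in
    File named(std::tmpfile());
    files.push_back(std::move(named));       // lvalue must be moved explicitly
    // files.push_back(named);               // error: copy constructor is deleted
    assert(!named.is_open());
    return files.size();
}
```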
We also have explicit conversion operators.
So given this class here with its conversion
operators and these two functions, this call
would be ambiguous because there are two matching
conversion operators.
And sometimes, you don't get the right ones.
So, if we re-define those conversion operators
and make one of them explicit, now if we do
this call, the explicit one doesn't apply
because this is an implicit conversion so
it can only be this one, and the conversion
goes through.
What's probably better is if you make
both of them explicit so that the right things
happen.
So typically, you only ever want one of these
to be not explicit and all the others should
be explicit.
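A sketch of explicit conversion operators. The `Handle` type and `twice` function are hypothetical; only the non-explicit operator takes part in the implicit conversion at the call.

```cpp
#include <cassert>

struct Handle {
    int id;
    operator int() const { return id; }                  // implicit use allowed
    explicit operator bool() const { return id != 0; }   // only on request
};

int twice(int v) { return 2 * v; }

int twice_handle(const Handle& h) {
    assert(static_cast<bool>(h));   // explicit conversion, asked for by name
    return twice(h);                // implicit conversion: operator int only
}
```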
The language also does a better job of type
inference.
So, a common thing you'll see in C++ code
is these common sub-expressions, p->f(), p->f() with
different trailing function calls.
And this occurs a lot because people don't
want to figure out what the return type of
F is and write that and so forth.
So we get a lot of duplicate execution of
code.
Now, we can make the type specifier be auto
and this says, "Infer the type of auto from
the return type of this expression."
And now, we can go ahead and use this variable
later and avoid the redundant call to F. Sort
of simple, don't have to think about ways to
optimize your code.
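A sketch of `auto` removing a repeated call. The `Directory` type, its `find` member, and the global call counter are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

int lookups = 0;   // counts calls to find()

struct Directory {
    std::string owner;
    const std::string& find() const {
        ++lookups;
        return owner;
    }
};

bool is_short_and_nonempty(const Directory* p) {
    assert(p != nullptr);
    auto name = p->find();   // called once; the result is reused below
    // Plain auto deduces a value type, not a reference.
    static_assert(std::is_same<decltype(name), std::string>::value,
                  "auto drops the reference and the const");
    return !name.empty() && name.size() < 8;
}
```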
We also have what's called late-specified
return types.
The problem with putting a return type in
front of the function is that if you want
to infer the return type from the argument
types, you're out of scope.
The argument type--arguments aren't in scope
yet.
So we now can move the return type to the
end of the function and we can then use the
decltype sort of, like, the GCC feature to
compute the type from the arguments and away
we go.
Now, there's still a little bit of debate
about the exact syntax here and that will
get resolved before the final language.
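A sketch of a late-specified return type, using the `->` spelling and `decltype` as they were eventually standardized (the syntax was still under debate at the time of the talk). The function name is illustrative.

```cpp
#include <cassert>
#include <type_traits>

// The return type is written after the parameters, where a and b are in scope.
template <typename A, typename B>
auto add(A a, B b) -> decltype(a + b) {
    return a + b;
}

static_assert(std::is_same<decltype(add(1, 2.5)), double>::value,
              "int + double yields double");

double add_demo() {
    double r = add(1, 2.5);
    assert(r > 3.0);
    return r;
}
```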
We also have a new style of for statement
which infers the appropriate iterators and
ranges and so forth.
So given an array, this will do the appropriate
inference for iterators and you can run the
body.
And it's even more powerful when you combine
that with the auto keyword so we can take
some data structure.
We don't even have to know what the elements
of the data structure are right now.
Say, we want auto reference to each of those
and then we can modify those.
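A sketch of the new for statement, first over an array and then combined with `auto&` to modify elements in place. The function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

int sum_array() {
    int values[] = {1, 2, 3, 4};
    int total = 0;
    for (int v : values) {     // iterators and range are inferred
        total += v;
    }
    return total;
}

std::vector<double> doubled(std::vector<double> data) {
    for (auto& x : data) {     // a reference to each element
        x *= 2;
    }
    assert(data.empty() || data[0] == data[0]);
    return data;
}
```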
>> On the previous slide, you had p->f().
I noticed that at the end you had auto
followed by an ampersand but at the beginning
you didn't.
If p->f() returns a reference, is that
first auto variable also a reference or is
it not a reference?
>> CROWL: It's not a reference.
So, the next major feature of the language
is Lambda Expressions.
So this is the Lambda introducer which says,
"Here's a lambda coming," and then the parameters
to the invocation of lambda and the body of
the lambda.
This lambda introducer also allows you to
specify a capture list.
It says, "What do you do about the containing
environment?"
That ampersand says, "Okay.
I want to get to the containing environment
by reference."
Which means that if we were to modify min
salary here, that would be reflected in the
environment.
I can alternatively say, "Well, I want to
capture min salary."
So when I create the lambda expression--the
closure, this will be captured and it won't
change.
So, if I try to modify min salary in here,
it will actually not escape the lambda.
So that gives you a great deal of control
over what you do about your referencing environment.
This is good if you're trying to simulate
something like a traditional C for loop.
Capturing variables is good in a multi-threaded
context where you really don't want a lot
of concurrent access to the environment.
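A sketch of the two capture styles. The salary data and function names are made up; `min_salary` follows the variable named in the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

int count_at_least(const std::vector<int>& salaries, int min_salary) {
    // [min_salary]: the value is copied into the closure when it is created.
    return static_cast<int>(std::count_if(
        salaries.begin(), salaries.end(),
        [min_salary](int s) { return s >= min_salary; }));
}

int highest(const std::vector<int>& salaries) {
    int top = 0;
    // [&]: the closure reaches the enclosing variables by reference,
    // so the writes to `top` are visible after the loop.
    std::for_each(salaries.begin(), salaries.end(),
                  [&](int s) { if (s > top) top = s; });
    return top;
}

int snapshot_demo() {
    int min_salary = 10;
    auto get = [min_salary] { return min_salary; };   // captured by value
    min_salary = 99;                                  // closure does not see this
    assert(min_salary == 99);
    return get();
}
```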
And then, there are a number of syntactic
improvements that have been dropping in.
One is being able to put the two angle brackets
right next to each other.
There's now a syntactic name for the null
pointer instead of doing inference on zero.
You can apply sizeof to member variables.
There's now a full attribute syntax which
allows you to sort of get away from all the
vendor specific ways to write attributes and
go to a standard way.
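The small syntactic improvements listed above, gathered into one sketch. All names are illustrative, and the attribute shown is `[[noreturn]]`, one of the attributes in the final standard.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

std::vector<std::vector<int>> table(2);   // ">>" no longer needs a space

int which(int) { return 1; }
int which(int*) { return 2; }             // nullptr selects this overload

struct Packet {
    char header[4];
    int payload;
};

[[noreturn]] void fail(const char* why) { // standard attribute syntax
    throw std::runtime_error(why);
}

int header_size() {
    int n = static_cast<int>(sizeof(Packet::header));   // sizeof on a member
    assert(n > 0);
    return n;
}
```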
And I'm going to hand it over to my colleague,
Matt Austern.
>> AUSTERN: Well, Doug Gregor came here last
year and spent an hour-long talk on concepts,
so I'm going to have about five minutes.
So let's try to get going quickly.
So, first thing to know about concepts is,
in a sense, they are a language feature.
In a sense, they're also something that we
have in any language not just in C++.
People started thinking about this idea in
the context of Scheme and Ada and all sorts
of other languages and we've always had them.
The basic idea of--let's say this style of
programming--of generic programming is, you
classify the things that you're talking about,
what's the fundamental category.
You write algorithms in an abstract sense
instead of relying on the details of any individual.
And one of these things--and then you write
specific models that conform to these abstract
concepts.
This is the way mathematics has always worked
and this is essentially--programming as mathematics.
Well, that works reasonably well whether you
have language support for it or not.
But we found in the context of C++ that we
really do want language support for it.
In C++, the way we typically think of this
kind of programming is with templates.
And templates just need a little bit of taming.
So one way to think about concepts as the
language feature is that they're constrained
genericity.
You can think of them as type classes,
if you're a Haskell fan. You can think of
them as types of types.
So, let's first talk about why we need them.
What's painful about programming in C++ templates
as it stands right now.
Well, the first thing is we've written this
very nice template.
It copies things from an input range to
an output range--what are the requirements
on this?
If I'm a user and I want to plug classes and
objects in here, what do I have to do for
it not to explode?
I haven't told you.
I've just given you names here.
If you happen to be familiar with the naming
conventions of STL, then you can make a pretty
good guess, or you can read the implementation
and just figure out something that will satisfy
this implementation.
I'm really not a fan of requirements that
can only be expressed in comments or naming
conventions or where you have to read the
implementation.
Requirements ought to be in signatures, and
here they are not.
So, another pretty painful thing--let's suppose
that you are familiar with STL.
You know that sort takes iterators, so you
know if I want to sort a range, I pass it
a couple of iterators.
Okay.
Well, you're not super familiar with STL,
so you know that you have to pass in iterators
but you don't happen to know that list provides
the wrong kind of iterator.
Oops.
I was actually a little pleasantly surprised
GCC has gotten better.
Now it only provides 18 lines of error messages
for this.
Unfortunately, even though GCC has gotten
a little bit better, something you don't get
in these 18 lines is the one fundamental
thing: you passed in the wrong kind of iterator.
You should be passing in a random access iterator.
And that's a problem, too.
We don't have a way of specifying this in
the language itself so the compiler doesn't
have a way of talking about it.
And then, the third problem is when you do
get these errors, they come too late.
Everyone sees the bug in this code, right?
It's just glaringly obvious.
This is the bug.
That should be bang equals, not less than.
And the bug here is that this actually would
be correct if we were taking in random access
iterators.
Instead, we're taking in input iterators.
But, oops, the compiler doesn't have any way
to know that because input iterator is just
a name.
The compiler doesn't know if there's anything
special about that.
So we've got this mismatch.
And the worst thing about this bug is it won't
show up in testing.
Because probably, when you test this, you're
going to give it vector iterators to try out
and it'll work and it will only be a month
later when some unlucky user sees the problem
in their code.
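The slide's code is not in the transcript, so this is only a guess at its shape: a hypothetical algorithm whose template parameter is named `InputIterator`. With `first < last` in the loop it would compile and pass tests on vector iterators, and only fail to compile when someone passes list iterators. The version below uses `!=`, which is what input iterators actually support.

```cpp
#include <cassert>
#include <list>
#include <vector>

template <typename InputIterator, typename T>
int count_equal(InputIterator first, InputIterator last, const T& value) {
    int n = 0;
    // Buggy variant: for (; first < last; ++first)
    // That compiles only for random access iterators such as vector's.
    for (; first != last; ++first) {
        if (*first == value) ++n;
    }
    return n;
}

int count_in_both(int value) {
    std::vector<int> v{1, 2, 2, 3};
    std::list<int> l{2, 2, 2};
    int from_vector = count_equal(v.begin(), v.end(), value);
    int from_list = count_equal(l.begin(), l.end(), value);   // needs !=
    assert(from_vector >= 0 && from_list >= 0);
    return from_vector + from_list;
}
```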
So, what does this look like with--in concepts?
What we want to do is actually constrain what
a possible--what a template parameter can
be instantiated with.
You have to say something about what the types
are in this case.
And here's the syntax for doing it.
We're saying, "I have a concept LessThanComparable,
and any type that conforms to
that concept has to provide this kind of operation."
This is simple--this is very simplistic by
the way.
First of all, it's a simple concept.
And second of all, it's not actually even
enough constraints for this very simple concept.
But, you know, you do what you can to get
things onto a single slide.
So, here's how you use a concept to constrain
code.
You've got about the simplest kind of operation
that you could imagine that requires less
than comparison.
You take two objects, return the smaller of
them.
And this is very simple but it provides the
error checking that you want.
If, when I'm writing this function, I make
a mistake, if I--if I perform some operation
here that the concept doesn't guarantee is
there, I will get an error and the error will
be, you've used something that isn't in the
concept.
So that last example where I used the less
than by mistake, that's taken care of.
The way you use this is the same way you use
any other template.
You just call the function, and if you are
calling it with a type that does not conform
to this concept, you will get an error message,
"you've passed in the wrong type.
You need to pass in less than comparable type.
You haven't."
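The concept syntax from the slides cannot be shown as compilable code here, so the following is a rough approximation using a different technique: a `static_assert` on the iterator category. It is not concepts, and `checked_sort` is a hypothetical wrapper, but it gives the flavor of the diagnostic being described, where the compiler states the requirement the caller missed.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

template <typename Iter>
void checked_sort(Iter first, Iter last) {
    // Stand-in for a concept check: state the requirement in the signature's
    // neighborhood and give it a readable message.
    static_assert(
        std::is_base_of<
            std::random_access_iterator_tag,
            typename std::iterator_traits<Iter>::iterator_category>::value,
        "checked_sort requires random access iterators");
    std::sort(first, last);
}

std::vector<int> sorted_copy(std::vector<int> v) {
    checked_sort(v.begin(), v.end());
    assert(std::is_sorted(v.begin(), v.end()));
    return v;
    // std::list<int> l; checked_sort(l.begin(), l.end());
    // would be rejected with the message above.
}
```

Unlike concepts as described in the talk, this checks only the caller's side; it does nothing to verify that the template body sticks to the operations it claims to need.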
And before I move on to the next slide, notice
this one word here, auto.
That word, auto, means if a type provides
the appropriate operations, then we automatically
agree that it conforms to the concept.
That's not always what you want.
It's fine for a case where you've just
got less than, greater than.
You don't want to go through the nuisance
of specifying for every class
you define that it conforms to LessThanComparable,
EqualityComparable, CopyConstructible,
blah, blah.
But for a bigger type like, say, an iterator,
then it probably is good for you to express
your intentions explicitly.
You've written some kind of an iterator.
I've put in a bunch of ellipses here because
an iterator is a fairly large class even in
a simple case.
And the way you explicitly declare your intentions
then, the way you explicitly say that this
conforms to the forward iterator concept is
with the concept_map keyword.
You're explicitly saying my type conforms
to forward iterator.
And ultimately, that's all that it comes down
to.
You declare a concept, a set of requirements.
There's special syntax for doing that.
You declare when you define a template what
the concepts are that the template parameters
have to conform to.
And when you define a class, if appropriate,
you say which concept does it conform to.
There are a whole lot of other features with
concepts.
Some of which I expect will be used commonly,
some of which I expect will only be used by
experts in rare circumstances.
Just to name a few of the more important ones,
we've got refinement or concept inheritance.
You can say anything that conforms to this
concept also conforms to that one.
If you're familiar with the STL--the sort
of classic example of this is the iterator
hierarchy.
You want to have the random access iterator
concept inherit from the forward iterator
concept.
There is--I've showed you the simplest kind
of requirement on types, just saying that
a template parameter has to conform to a concept.
There are more complicated cases where you
have multiple template parameters and there
has to be some kind of a relationship between
them.
That's what the requires keyword is for.
And then, the concept_map keyword--I showed
how you can use it to say that a type conforms
to a particular concept.
There's also a sort of interesting case.
What happens if a type--in some philosophical
sense, conforms to a concept but doesn't have
exactly the right syntax?
And the classic example here again is from
iterators.
You want to be able to say that pointers are
random access iterators.
But pointers don't have member functions.
So you have to say a little bit about how this
syntax maps from one to the other and that's
what the other feature of concept mapping
is for, that's how syntax adaptation works.
And I think that my five minutes are up.
So if you want to know more about concepts
ask us after the talk or go watch Doug's talk
again or wait for the tutorials to get written.
So that's--there's been other new template
features.
Concepts are the one that probably took the
committee the most time.
I think it's actually fair to say concepts
were 10 years in the making.
But there have been other interesting template
features.
And the one that is actually worth--well,
I'm going to mention--to spend--actually spend
some time on one.
This falls into the category of, "Why didn't
you do that earlier?"
It's a restriction that there was never any
really good reason for, that you couldn't
have template typedefs. Well, now you can.
This syntax looks a little bit funny for uninteresting
historical reasons, but the feature is there.
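A sketch of a template typedef, written with the `using` alias syntax that was eventually standardized. The alias name `StringMap` is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// A typedef that is itself a template: one parameter is fixed, one is left open.
template <typename Value>
using StringMap = std::map<std::string, Value>;

static_assert(
    std::is_same<StringMap<int>, std::map<std::string, int>>::value,
    "the alias names the same type");

std::size_t alias_demo() {
    StringMap<int> ages;
    ages["ada"] = 36;
    ages["alan"] = 41;
    assert(ages["ada"] == 36);
    return ages.size();
}
```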
Something that's a little more interesting
is variadic templates.
So again, the first question isn't so much
how the feature works, but why do you care
about it?
Why is this something you would want?
Well, this is the classic example.
You'd like to have a tuple type and you'd
like to just be able--well, it's obvious what
it does.
It's an obvious generalization of pair.
Pair takes two template arguments, stores
two things of arbitrary types.
This takes N arguments, stores N things of
arbitrary types.
It is actually possible to implement this
without variadic templates, but is extraordinarily
ugly and there are limitations.
This is really the way that you want to do
it.
Another pretty obvious example: you'd like
to be able to pass in N arguments and
find the minimum of them.
And a slightly less obvious, but still interesting
example is type-safe printf.
We've--this probably looks pretty familiar
to you if you've got Python.
We've got three percent S things here.
And the semantics is just--you can pass in
anything that converts to a string, the conversion
will happen at compile time.
It will just work.
So, how does it work?
Well, if you're actually defining variadic
templates, if you've used Haskell or ML, it'll
actually look fairly familiar to you.
It's pattern matching.
You--this is a variadic template declaration.
It says that you can take an arbitrary number of
arguments.
And then, you have to define partial specializations.
This is where the pattern matching comes in.
Here's a partial specialization for the case
where there is at least one template argument
provided here.
We've now named that head.
You can do something with it.
Here's all of the rest of them.
You then recursively invoke the template again,
and it finally bottoms out because here is the
specialization for there are no arguments.
And you can use various tricks for how you
work with one argument at a time.
It's--I'm not going to go into this in detail.
But if you're familiar with this sort of pattern
matching style of programming, it should be
fairly obvious.
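A sketch of the head-and-tail pattern matching just described, plus the minimum-of-N example. `Count` and `min_of` are hypothetical names, not the slide's code.

```cpp
#include <cassert>
#include <cstddef>

// Variadic declaration: takes an arbitrary number of type arguments.
template <typename... Types>
struct Count;

// At least one argument: name the head, recurse on the rest.
template <typename Head, typename... Tail>
struct Count<Head, Tail...> {
    static const std::size_t value = 1 + Count<Tail...>::value;
};

// No arguments: the recursion bottoms out here.
template <>
struct Count<> {
    static const std::size_t value = 0;
};

// The same idea for functions: one argument is the base case.
template <typename T>
T min_of(T only) {
    return only;
}

template <typename T, typename... Rest>
T min_of(T first, Rest... rest) {
    T tail = min_of(rest...);   // minimum of everything after the head
    return tail < first ? tail : first;
}

int min_demo() {
    int m = min_of(3, 1, 2);
    assert(m <= 3);
    return m;
}
```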
Well, that's all I'm going to say about template
features now.
So, we've got about five minutes now to talk
about everything in the library.
The library is now about twice as large as
it was in C++ 98.
And one--the one line description of what's
in the library is everything in TR1, except
special functions sort of and then just a
few other things.
So, this is an--this list even isn't quite
complete but it contains most of it and I'm
just going to touch on a few of the highlights.
Regular expressions--it's not as if nobody
has been using regular expressions in C++
before, but there have been a whole bunch
of different libraries, none of them standard.
And this is the standard syntax for doing
it.
Again, if you're used to Python, it should
look fairly familiar.
It's more of the Python model than the Perl
model.
You have a regex class.
You invoke the constructor,
passing in a pattern for the regular expression.
And, oh yes, here's another language feature
that we hadn't mentioned before.
You see this R here before the string literal?
This is a raw string.
And in a raw string, you don't have to escape
backslashes and you can put in newlines.
Having regular expressions without that would
have been pretty painful.
So, you create a regular expression pattern.
You invoke the regex_match function on the input,
the pattern, and an object here that gives
you access to the individual fields.
Then, you can just extract each of the individual
fields you care about.
And, of course, there are more things that
you can do.
It's got search-and-replace or format, depending
on which way you want to think about regular
expressions.
We can iterate through matches.
Match one thing, then match another thing,
then match another thing.
But in the--in the simplest case, it's really
as simple as this.
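A short sketch of the simplest case described above: build a pattern
from a raw string, match it against the input, and pull out the fields.
It is written against the names that ended up in the C++11 `<regex>`
header, which may differ in detail from the draft discussed in the
talk. The `Date` struct and `parse_date` function are made up for
illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

struct Date {
    int year;
    int month;
    int day;
};

// Returns true and fills `out` if `input` has the form YYYY-MM-DD.
bool parse_date(const std::string& input, Date& out) {
    // R"(...)" is a raw string literal: the backslashes need no escaping.
    static const std::regex pattern(R"((\d{4})-(\d{2})-(\d{2}))");

    std::smatch fields;  // gives access to the individual sub-matches
    if (!std::regex_match(input, fields, pattern))
        return false;

    assert(fields.size() == 4);  // whole match plus three groups
    out.year = std::stoi(fields[1].str());
    out.month = std::stoi(fields[2].str());
    out.day = std::stoi(fields[3].str());
    return true;
}
```

`std::regex_match` requires the whole input to match; `std::regex_search`
and `std::regex_replace` cover the search and search-and-replace cases
mentioned above.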
Random numbers.
The--again, it's not as if we haven't been
using random numbers in C++ since forever.
We've had the rand function.
The model here is we want to distinguish between
the engines.
Where do you actually have your underlying
source of randomness?
Is it linear congruential, is it Mersenne
twister, is it some kind of physical entropy
generation?
And then once you have this physical source
of randomness, how do you actually get a set
of numbers distributed in some pattern?
You've got the uniform distribution, you've
got the exponential distribution, Poisson,
normal, lognormal--all of these things.
So here's a simple example where we take one
engine.
This is recommended as the one to use if you
don't want to think about it too hard.
It's just a reasonable default.
You've got a distribution here, a normal distribution
here with the parameters, and you get a number
or--well, presumably more than one number
from that.
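A minimal sketch of the engine/distribution split, using the C++11
`<random>` names. The transcript does not say which engine the slide
recommends, so `std::mt19937` here is an assumption, as are the
function name `sample_normal` and its parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draws `n` values from a normal distribution with the given parameters.
std::vector<double> sample_normal(std::size_t n, double mean, double stddev,
                                  unsigned seed) {
    assert(stddev > 0.0);

    // The engine is the underlying source of (pseudo-)randomness.
    std::mt19937 engine(seed);
    // The distribution shapes the engine's raw output.
    std::normal_distribution<double> distribution(mean, stddev);

    std::vector<double> samples;
    samples.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        samples.push_back(distribution(engine));
    return samples;
}
```

Swapping the engine (for example to `std::random_device` for
hardware entropy where available) or the distribution (for example
`std::uniform_int_distribution<int>`) leaves the rest of the code
unchanged, which is the point of separating the two.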
Smart pointers.
These are pretty familiar from TR1.
We've got the shared pointer model which is
simple reference counting.
If you make a copy of a shared pointer, then
they point to the same thing and the underlying
object gets deleted when the last smart pointer
to it disappears.
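The reference-counting behavior can be sketched like this, using
`std::shared_ptr` as it appears in C++11. The `Tracked` type and
`shared_ptr_demo` function are hypothetical, there only to make the
moment of destruction observable.

```cpp
#include <cassert>
#include <memory>

// Sets a flag when destroyed, so the demo can see when deletion happens.
struct Tracked {
    bool* destroyed;
    explicit Tracked(bool* flag) : destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
};

inline void shared_ptr_demo() {
    bool destroyed = false;
    {
        std::shared_ptr<Tracked> first = std::make_shared<Tracked>(&destroyed);
        {
            std::shared_ptr<Tracked> second = first;  // copy: same object
            assert(first.get() == second.get());
            assert(first.use_count() == 2);
        }
        // `second` is gone, but `first` still keeps the object alive.
        assert(first.use_count() == 1);
        assert(!destroyed);
    }
    // The last shared_ptr disappeared, so the object was deleted.
    assert(destroyed);
}
```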
Unique pointer is a little bit less familiar
because you couldn't have it without the move
semantics that was introduced in C++0x and
that Lawrence talked about.
Unique pointer, the model is--an object is
owned just by a single smart pointer but that
pointer can change.
You cannot make a copy of a unique pointer
because then, the ownership wouldn't be unique,
but you can do a move.
So here's an example, again, of how you might
use unique pointer.
We have a function that creates something.
There's a temporary variable of type unique
pointer in here, but it goes away when you
return it so it's still unique.
You have a vector of unique pointers.
You can push something into the vector.
At any moment, the ownership is still unique.
At any moment, there is still only one owner
but that owner can change over time.
No copy but move.
And I think at Google, that's probably going
to be fairly popular since the moving but
no copying is a fairly common technique here.
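A sketch along the lines of the example described above: a factory
function that returns a unique pointer, and a vector that takes
ownership by move. `Widget`, `make_widget`, and `make_widgets` are
illustrative names, not the ones on the slide.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Widget {
    std::string name;
    explicit Widget(const std::string& n) : name(n) {}
};

// The local unique_ptr goes away on return; ownership moves to the caller.
std::unique_ptr<Widget> make_widget(const std::string& name) {
    std::unique_ptr<Widget> result(new Widget(name));
    return result;
}

std::vector<std::unique_ptr<Widget>> make_widgets() {
    std::vector<std::unique_ptr<Widget>> widgets;

    widgets.push_back(make_widget("first"));  // temporary: moved in

    std::unique_ptr<Widget> named = make_widget("second");
    // widgets.push_back(named);              // would not compile: no copy
    widgets.push_back(std::move(named));      // explicit move
    assert(!named);                           // `named` gave up ownership

    return widgets;
}
```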
A whole lot of new features to support manipulating
function objects, I'm not going to go into
all of them in detail.
I'll just show a simple example that combines
a lot of them.
This is a way of naming an arbitrary function-like
thing that takes a single argument of type
Person and returns a string.
And this can bind to a member function, it
can bind to a function object, it can bind
to an ordinary C style function.
This just abstracts away all of these differences,
and just lets you talk about the argument
types and the return type.
And here is an example of how you can use
this to return something that takes a string
and returns a bool.
And that's a result of a fairly complicated
bind expression that lets you test whether
a person has a name equal to something arbitrary
that's passed in.
This is--this whole bind facility is probably
largely redundant now that we have lambdas.
It--there might still be cases where it's
more useful than lambdas.
I can't actually think of very many.
This is probably still useful.
It really is useful to be able to talk about
function-like things without the specifics
of the syntax.
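A sketch of the function wrapper and the bind expression, using the
C++11 `<functional>` names. The `Person` type, `greet`, `Initial`, and
`has_name` are assumptions made for illustration; the slide's actual
code is not in the transcript. A lambda version is included for
comparison.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct Person {
    std::string name;
    std::string get_name() const { return name; }
};

// An ordinary function and a function object with the same call shape.
std::string greet(Person p) { return "Hello, " + p.name; }

struct Initial {
    std::string operator()(Person p) const { return p.name.substr(0, 1); }
};

// Names "anything callable with a Person that returns a string".
typedef std::function<std::string(Person)> PersonToString;

// Builds a predicate with nested bind: equal_to(get_name(person), wanted).
std::function<bool(Person)> has_name(const std::string& wanted) {
    using namespace std::placeholders;
    return std::bind(std::equal_to<std::string>(),
                     std::bind(&Person::get_name, _1), wanted);
}

// The same predicate written as a lambda.
std::function<bool(Person)> has_name_lambda(const std::string& wanted) {
    return [wanted](const Person& p) { return p.get_name() == wanted; };
}

inline void function_demo() {
    Person ann = {"Ann"};

    PersonToString f = &Person::get_name;  // member function
    assert(f(ann) == "Ann");
    f = Initial();                         // function object
    assert(f(ann) == "A");
    f = greet;                             // ordinary function
    assert(f(ann) == "Hello, Ann");

    assert(has_name("Ann")(ann));
    assert(!has_name_lambda("Bob")(ann));
}
```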
We've got some very, very limited support
for talking about time.
This was really an adjunct to the threads
facility, because you really like to be able
to say when you're talking about threads,
"Run this in five seconds.
Sleep until such and such a time."
It can be the basis for a more complicated
time library.
But at the moment, let's
think of it as a somewhat
more explicit and somewhat more type-safe
way to talk about clocks.
And it's a little bit more explicit mostly
because again, the question of what a clock
represents, you've got this integer.
Is that microseconds, milliseconds, nanoseconds,
what have you.
Now, you're actually able to talk about that
in the type system.
So it's at least a slight improvement over
what we had before and it's something we can
build on.
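A small sketch of what "the unit is in the type system" looks like,
using the C++11 `<chrono>` names. The helper functions here are
hypothetical and exist only to show the conversions.

```cpp
#include <cassert>
#include <chrono>

// Mixed units add up exactly; the result takes the finer unit.
std::chrono::milliseconds total_timeout(std::chrono::seconds base,
                                        std::chrono::milliseconds extra) {
    return base + extra;
}

// Converting to a finer unit is lossless; count() yields the raw number.
long long as_microseconds(std::chrono::milliseconds d) {
    return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
}

// Converting to a coarser unit loses precision, so it must be spelled out.
// std::chrono::seconds s = std::chrono::milliseconds(1500);  // would not compile
std::chrono::seconds whole_seconds(std::chrono::milliseconds d) {
    return std::chrono::duration_cast<std::chrono::seconds>(d);  // truncates
}

inline void chrono_demo() {
    using std::chrono::milliseconds;
    using std::chrono::seconds;
    assert(total_timeout(seconds(5), milliseconds(250)).count() == 5250);
    assert(as_microseconds(milliseconds(3)) == 3000);
    assert(whole_seconds(milliseconds(1500)).count() == 1);
}
```

The thread facility takes these types directly: in C++11,
`std::this_thread::sleep_for(std::chrono::seconds(5))` and
`std::this_thread::sleep_until(deadline)` correspond to the two
phrases quoted above.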
Well, a whole lot more library features that
I'm not going to talk about.
You can ask us later or again read things.
Probably the most interesting of the things
that I've left out are improved Unicode support.
We've got new types to represent wide characters.
We've got them extended into the library somewhat.
We've got some features for converting between
UTF-8 and UTF-16 and UTF-32.
And we've made some other relatively minor
improvements in string.
Notably, we've made it a little bit less thread
unfriendly.
So a whole lot of things that didn't get into
C++0x.
Suddenly, the committee has started feeling
that time is very short.
This is a list of some of the things that
didn't get in but the committee is reasonably
committed to for the future.
Probably, what's going to happen is the next
major revision of the standard will still
be a while from now but we're going to be
releasing these things as technical reports
one at a time.
Those won't officially be normative but will
at least be a way to design and describe the
feature so that compiler writers can get practice
implementing them.
And everyone has their own hot list of what
features they think are most interesting from
this list.
I suppose I would probably have to say that
improved thread support would be at the top
of my list.
We've got very limited support for
higher-order concurrent programming
in the C++ standard.
It's not really much higher level than POSIX
threads.
You can create threads, you can do locks,
you've got condition variables but we don't
have all of the other data structures that
we're used to.
We don't have thread-safe queues, we don't
have thread pools.
It would be very nice if we could standardize
those things using our low level thread structures
as a foundation.
And that's something where I would hope that
Google can help by defining some of these
facilities and open sourcing them
and trying to get them standardized.
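To illustrate what building on the low-level pieces might look like,
here is a minimal sketch of a thread-safe queue made from a mutex and
a condition variable, using the C++11 names. `ConcurrentQueue` is a
hypothetical class, not a standard or proposed facility, and it leaves
out shutdown, capacity limits, and exception-safety refinements.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class ConcurrentQueue {
public:
    // Adds an item and wakes one waiting consumer, if any.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();
    }

    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

    // Non-blocking variant: returns false if the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty())
            return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```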
So, what is happening now?
Well officially, C++0x has not been released
yet.
And in fact, I should take out that word officially.
It just isn't finished yet.
What we have right now is called a committee
draft, meaning, this is the first draft of
the new standard that the committee is even
willing to talk about in public.
And now, we're in the middle of the public
comment period.
We're asking for people to find bugs in the
standard, find features that should be removed,
find some features that aren't as complete
as they should be.
Notice the deadline.
There is a reason that this is in red.
We have some time to submit public comments.
Lawrence and I are collecting them.
I'll tell you in a minute some ways that you
can help with this and we should try to get
them in soon.
What's going to happen after this is the standards
committee is going to look at all of the defects
that have been reported.
It's going to look at all of the defects that
committee members have found on their own
and it's just going to be--well, you can think
of the CD as an alpha.
So, what we want to have one year from now
is our beta, the final committee draft.
There are still going to be some more comments
on that, but we expect them to be much smaller.
We expect that here, things are going to be
really feature complete and it's going to
be small bugs that we're looking for.
A year after that, we will be shipping the
final draft international standard.
That's still not officially, officially a
standard but except for typos, you should
think of this as final.
It's just going to take six months or a
year after that, for it to get through all
of the official bureaucracy.
So I think we've left enough time for questions
now.
And I would encourage everybody to use the
mic for the questions since some people are
listening to this talk remotely.
>> I think, between the two of you, you have
a reasonable idea of what the C Compiler Team
is up to, what GCC is up to, and--kind of
how the C Style people think.
So when do you think we might be able to start
using some of these in conforming Google 3
code?
At least some fun subset of them.
>> AUSTERN: I think that some of these features
will be easier than others.
One problem, of course, is that we use more
than one compiler at Google.
So, we're not going to be able to use any
feature until--let's put it generally--until
whatever code that feature is in can be compiled
by all of the compilers that are ever going
to be touching that code.
That's probably the biggest concern because
it's a big company and we don't necessarily--there
isn't necessarily a single person who even
knows what all of the compilers out there are.
And if we solve that procedural concern, we'll
probably be able to get these features to
you sooner.
The style issue, well, some of these features
are going to be a little bit more controversial
than others.
I would expect that, say, auto is probably
going to meet very little resistance.
I would expect that some of these features
address areas of the language that Google
just stays away from in general and that probably
won't change very quickly.
>> So currently, C++ is the slowest compiling
language that I use.
How do you expect these changes to affect
the speed of compilation?
>> CROWL: So the speed of the compilation
will go down.
The current--the current estimate--a lot of
it comes from the concepts and improved checking
on the templates.
The current expectation is that in full production
compilers, there will be about a 10% loss
in performance for code that uses the standard
library in a--in a heavy way.
For code that doesn't use the standard library
and isn't using these features, you probably
won't see a major slowdown in the compilation.
>> AUSTERN: The other answer that I should
give to that is that the range of compilation
speed in different compilers is much larger
than you might think.
I used to work at Apple before I came
here and people at Apple were constantly comparing
the speed of GCC and Metrowerks and wondering
why GCC couldn't be anywhere near the speed
of CodeWarrior.
I am hoping that--it would probably be hard
to modify GCC to match that speed but other
compilers that are better than GCC are entirely
practical and there is work
on new C++ front ends.
>> CROWL: And I'd also like to point out,
for optimized builds, most of the compilation
time is in the back end and the optimization
and so forth.
So, it--the front end speed is really only
an issue when you're doing debug compilations.
>> Yes.
Two quick questions.
I see they address the thread safety of the
strings, does that mean that the copy-on-write
is gone?
Are they using the atomics for that?
And also, what about other reference-counted
semantics like locales and facets?
>> AUSTERN: Strings--copy-on-write is gone.
So, yes.
That's--the existing standard has very complicated
guarantees that were basically designed to
allow but not require copy-on-write.
The new standard has guarantees that I think
basically cannot be satisfied with copy-on-write.
That doesn't bother me.
Copy-on-write is one of these things that
sounded like a great performance improvement
at first, but that most library vendors have
decided to move away from anyway.
And the question is what about other reference
counted facilities?
Shared pointer is just plain thread safe.
It's designed to be--it uses--every implementation
that I know of uses atomics internally.
Locales and facets, the--yes.
I'm trying to think what terminology to use
here.
There is a general guarantee of thread safety
in the library introduction: essentially, unless
anything says otherwise, using two objects of
the same class in two different parts of the
program, in two different threads, won't explode
on you--and the implementation has to do whatever
is necessary to meet that guarantee.
>> I--question here.
Is there going to be template specializations
based on concept conformity?
So could you have sort work for a linked list
just as well, in a form that performs?
>> AUSTERN: Yes.
Yes.
That was--that was on the list of other concept
features that I would not discuss.
I think of it more as overloading on concepts
and it's actually a pretty essential feature.
One of the classic examples of why you need
this is in the existing STL.
We've got this advance function that takes an iterator
and a count.
If it's a forward iterator, you just have
to advance one at a time.
[INDISTINCT] If it's a random access iterator,
you just add--so, yes, you can do that overloading.
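For comparison, here is a sketch of how the existing STL gets this
effect without concepts, by overloading on iterator category tags.
This is the classic tag-dispatch technique, not the concept-based
overloading the answer refers to, whose syntax is not shown in this
part of the talk. `my_advance`, `advance_impl`, and the two `nth_in_*`
helpers are illustrative names.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Forward (and bidirectional) iterators: step one element at a time.
template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, std::forward_iterator_tag) {
    assert(n >= 0);
    for (; n > 0; --n)
        ++it;
}

// Random access iterators: just add.
template <typename Iterator, typename Distance>
void advance_impl(Iterator& it, Distance n, std::random_access_iterator_tag) {
    it += n;
}

// Picks the overload from the iterator's category at compile time.
template <typename Iterator, typename Distance>
void my_advance(Iterator& it, Distance n) {
    typedef typename std::iterator_traits<Iterator>::iterator_category Category;
    advance_impl(it, n, Category());
}

// Uses the one-step-at-a-time overload (list iterators are bidirectional).
int nth_in_list(const std::list<int>& values, int n) {
    std::list<int>::const_iterator it = values.begin();
    my_advance(it, n);
    return *it;
}

// Uses the constant-time overload (vector iterators are random access).
int nth_in_vector(const std::vector<int>& values, int n) {
    std::vector<int>::const_iterator it = values.begin();
    my_advance(it, n);
    return *it;
}
```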
>> Is the committee considering any actions
that would make C++ easier to parse for automatic
refactoring tools?
>> CROWL: So the problem with that is that
it's very difficult to make C++ easier to
parse and make it compatible with the past.
So, we--the general approach of the committee
is to try not to make it too much worse.
So, you know, it's part of the price of dealing
with, you know, a trillion lines of code.
>> AUSTERN: It's very hard to remove features.
There's one feature that it wasn't necessarily
the feature I hate the most, but it's on my
list and it's one that I thought was completely
unused so I proposed to the committee that
we get rid of it.
And the IBM rep said, "No, no.
We have all of this code internally that you
can't find with Google code search that we
would break if we remove that."
So, we are still going to have trigraphs.
And if you can't even get rid of trigraphs,
you can't get rid of anything.
>> Okay.
Okay.
So with introduction of threads and especially
all of--all of the mutexes and condition variables,
was any thought given to the ability to automatically
do thread safety reasoning?
>> CROWL: Well, the GCC compiler has facilities
in it to annotate the association between
variables and locks and so forth.
So that stuff is going forward.
There's no specific language design feature
to address that but it was--it was in the
mind of the people putting this together that
that should be possible.
>> C++ has been my preferred language for more than
10 years.
And usually when people complained that C++
was complicated and complex, I used to explain
that as a kind of misperception.
The explanation is lengthy.
I won't give it here.
But now that the new standard is almost twice
as long as the old one, isn't that too complex?
Has the committee considered this situation?
>> CROWL: Okay.
So, there's a couple of answers there.
First is the standard is much bigger, but
most of the growth has been in the library.
So the--and that's fairly isolated.
You know, you could not read a chapter and
not be upset.
The core language part has grown but it hasn't
grown a whole lot.
>> AUSTERN: The third answer is everybody
on the committee agrees that C++ is too large
now.
I mean everybody.
It's unanimous.
Everybody agrees that we shouldn't add too
many features and everybody says, "But just
this one feature."
And we've tried to strike a balance between
restraint and growth.
I hope we've struck the right balance.
There are some features we've added that I
wish we hadn't, but not very many.
>> CROWL: And the other thing to remember
is many of the features that have been added
are, in a way, generalizing things.
So, in C++ 98, if you try and mix
this feature and that feature, you find out,
"Well, you can't really do that."
And so, a lot of the complexity comes in things
you couldn't do and a lot of those barriers
have been reduced.
The initialization--you could simply not learn
the old way to do initialization and go with
the new syntax and cover almost all of your
needs.
So, there are simpler ways to approach the
language but if you want to know the whole
language, it has gotten a little more complicated.
>> Are mutexes re-entrant?
Are mutexes re-entrant?
>> CROWL: So, there are regular mutexes, timed
mutexes, re-entrant mutexes, and timed
re-entrant mutexes.
Now, what we don't have is reader-writer locks.
That was something the committee decided they
didn't have enough time to put into the standard.
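In C++11 these four kinds are spelled `std::mutex`, `std::timed_mutex`,
`std::recursive_mutex`, and `std::recursive_timed_mutex`. A minimal
sketch of why the re-entrant kind matters follows; the `Counter` class
and `counter_demo` function are hypothetical.

```cpp
#include <cassert>
#include <mutex>

class Counter {
public:
    Counter() : value_(0) {}

    void add(int n) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        value_ += n;
    }

    // Holds the lock and then calls add(), which locks again on the same
    // thread. With a plain std::mutex this second lock would be undefined
    // behavior (typically a deadlock); a recursive mutex allows it.
    void add_twice(int n) {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        add(n);
        add(n);
    }

    int get() {
        std::lock_guard<std::recursive_mutex> lock(mutex_);
        return value_;
    }

private:
    std::recursive_mutex mutex_;
    int value_;
};

inline void counter_demo() {
    Counter c;
    c.add_twice(3);
    c.add(1);
    assert(c.get() == 7);
}
```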
>> I think it's fantastic work.
I haven't looked at this for a long time.
I'm just surprised in the number of features
you put in this.
How are you going to name this new language?
More seriously, the perfect forwarding issue,
is that going to be taken care of?
The reference...
>> CROWL: Yes.
>> ...to a reference?
>> AUSTERN: Rvalue references--one of
the design goals for rvalue references
was to take care of that problem.
Yes.
>> CROWL: And that's used heavily in the standard
library now.
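A minimal sketch of perfect forwarding with rvalue references, using
`std::forward` from C++11. `Label`, `make`, and `forwarding_demo` are
hypothetical names. When `make` is called with an lvalue, `Arg` is
deduced as `std::string&` and `Arg&&` collapses to `std::string&`,
which is how the reference-to-a-reference case is resolved.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Records which constructor was chosen, so forwarding is observable.
struct Label {
    std::string text;
    bool built_by_move;
    explicit Label(const std::string& s) : text(s), built_by_move(false) {}
    explicit Label(std::string&& s) : text(std::move(s)), built_by_move(true) {}
};

// Forwards `arg` to T's constructor, keeping it an lvalue or an rvalue
// exactly as the caller passed it.
template <typename T, typename Arg>
T make(Arg&& arg) {
    return T(std::forward<Arg>(arg));
}

inline void forwarding_demo() {
    std::string name = "lvalue";
    Label copied = make<Label>(name);  // lvalue: copy constructor path
    assert(!copied.built_by_move);
    assert(name == "lvalue");          // the caller's string is untouched

    Label moved = make<Label>(std::string("rvalue"));  // rvalue: move path
    assert(moved.built_by_move);
    assert(moved.text == "rvalue");
}
```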
>> AUSTERN: I think we're out of time now.
So, thank you.
>> CROWL: Thank you very much.
