- [Instructor] Welcome back.
This week our most important topic
is to review the general structure
of programming languages.
This applies to languages
in all paradigms.
In fact, this applies not
only to programming languages,
but to languages in general.
Let's see, the structure
of a programming language
is defined by its rules.
There are three kinds of
rules that define a language,
lexical rules, syntactical
rules, also called the grammar,
and semantical rules.
So lexical, syntactic and semantic.
Let's review each of them.
Lexical rules tell us
how to combine symbols,
letters, numbers and so on,
to create words.
And usually you separate
words with white spaces
or operators, or some delimiter.
This is what you do when
you're reviewing a program
and trying to figure out
if it is lexically correct.
You read character by character,
and you put the characters together
to create words,
unless you identify an
operator, a delimiter,
a white space, or the end of the line.
Those, the operator, delimiter,
white spaces and end of the line,
are each correct by themselves.
Also, they separate words.
So when you find one of those,
all the characters before it
create a new word,
and when you have a new word
the next step is to review if that word
follows the rules that make it
a literal, a keyword, or an identifier.
If that is the case, that word is correct.
If not, that word is
considered a lexical error.
Let's see an example.
This is source code in Java,
and it has four lexical
errors, highlighted in red.
Probably you are thinking
there are more errors here,
but we are talking about lexical errors.
We are reviewing the lexical rules,
and there are only four cases
that do not follow
the lexical rules.
Probably you are thinking
there's an error here,
there is something missing,
but you are applying a syntactical rule.
So yes, here, there is
a syntactical error,
but in terms of lexical
rules, this is correct.
All the words there are correct.
I will talk about syntactical rules later.
For now, talking about the lexical rules,
in Java, the first error here
is an error because
a word that starts with a digit
has to be a number.
I mean, if you have a digit,
the other elements after that digit
should be more digits,
or maybe a dot for a
floating point number.
This 3y is not a keyword,
is not an identifier, is
not a number, et cetera.
So because there is no
lexical category for this
combination of symbols,
this is considered incorrect.
Again, following the lexical
rules defined in Java.
Maybe another language
could have different rules,
and according to those rules,
this could be correct,
but in Java this is incorrect.
The next error is this one here.
The error is that
if you start something
with a quotation mark,
you need to put a
quotation mark at the end.
In this case, we do not have
a quotation mark at the end.
We have a line break, and that is an error.
Again, according to the
lexical rules of Java.
Maybe in another language,
this could be correct,
and we could continue looking
for the other quotation
mark in the next line.
But that is not the case in Java.
The next error, this one,
you know that there is an error
because you cannot use this
symbol, the at, inside a word in Java.
You cannot have the at symbol
in the middle of an
identifier, or a keyword,
or anything else.
Finally, here, there is an error because
you can have one dot
between the first two numbers,
but when you have another
dot, that is incorrect,
because this one is not an integer,
is not a floating point number,
is not an identifier, is not a keyword,
so because you cannot classify
this combination of symbols
in any of the categories
that we defined before,
then you know that you have an error.
And remember the computer doesn't care
about extra white spaces,
so for the computer the previous program is
exactly the same as this one.
In summary, the lexical rules
define the following categories.
Those categories are called tokens.
So the lexical rules define
what is an identifier,
what is a keyword,
what is an operator,
what is a delimiter,
what is a literal,
and what is a comment in
the particular language.
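
The word-by-word check described above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not how the Java compiler is implemented: the class name `WordClassifier`, the shortened keyword list, and the sample words (other than `3y`) are hypothetical, and the patterns ignore details of real Java such as escape sequences, hexadecimal numbers, and suffixes like `3L`.

```java
import java.util.Set;

class WordClassifier {
    // A small subset of Java keywords, enough for the illustration (assumption).
    static final Set<String> KEYWORDS =
            Set.of("class", "public", "int", "if", "else", "return");

    // Returns the lexical category of one word, or "ERROR" if no category fits.
    static String classify(String word) {
        if (KEYWORDS.contains(word)) return "KEYWORD";
        // Identifier: starts with a letter, '_' or '$'; digits allowed afterwards.
        if (word.matches("[A-Za-z_$][A-Za-z0-9_$]*")) return "IDENTIFIER";
        // Integer literal: only digits.
        if (word.matches("[0-9]+")) return "INTEGER";
        // Floating point literal: digits, exactly one dot, digits.
        if (word.matches("[0-9]+\\.[0-9]+")) return "FLOAT";
        // String literal: opening and closing quotation mark within the word.
        if (word.matches("\"[^\"]*\"")) return "STRING";
        // The word fits no category, so it is a lexical error.
        return "ERROR";
    }

    public static void main(String[] args) {
        String[] words = {"class", "total", "42", "3.14", "\"hi\"",
                          "3y", "\"hi", "na@me", "1.2.3"};
        for (String w : words) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

The last four sample words mirror the four kinds of error in the example: a word starting with a digit, an unclosed quotation mark, an at symbol inside a word, and a number with two dots. Each fits no category and is reported as an error.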
The next set of rules are
the syntactical rules.
Syntactical rules define
how we can combine words
to create sentences that are correct
in the particular language.
The syntactical rules of a language
can be represented
using BNF notation,
or syntax diagrams.
We're not gonna review that in detail,
but I'm gonna show you some examples
so you can have an idea
of how that is done.
This is a subset of the Java grammar.
For each rule you can see the BNF notation
and the diagram.
Both BNF notation and diagrams are given.
I'm gonna use the diagrams to explain
the following example,
just because the diagrams
are easy to follow.
In the diagram, as you can
notice, there are two elements,
ovals, and boxes.
The ovals represent lexical elements,
tokens, for instance, class, extends, comma,
open curly bracket, closing curly bracket.
The boxes represent rules.
For instance, there is a rule modifier
and a rule field declaration.
The first rule defines that a Java program
should start with the keyword class,
but also that
we could have a modifier before it.
It is optional.
Then, in order to know what is a modifier,
we need to review the rule modifier.
As you see, the rule
modifier is just ovals,
and basically this is
a list of all the words
that you can put before the keyword class.
The rules tell you that you
can start a program in Java
with the word class.
But it's also correct if
you put the keyword public
before class, because
public is a modifier.
However, if you make a mistake
and put here something
different like this,
that is going to be an error,
because this word is not listed
as a valid modifier.
If you continue with the rule
after the keyword class,
the first rule tells you
that you need to put an identifier.
Something like this will be correct,
but this will be incorrect.
After the identifier you
can use the keyword extends,
that is optional,
so if you put extends, then good,
but if you do not put extends,
it's also fine.
However if you decide to put extends,
you need to put after that a class name.
You can also continue and put here
the keyword implements,
which is also optional.
But if you put implements
then you need to put the
name of an interface,
and you can have several interface names,
but you need to separate
them with a comma.
It's mandatory to put
the open curly bracket.
That is not optional.
If you review here,
the open curly bracket does
not have any optional path around it,
so you cannot avoid putting
the open curly bracket,
and so on.
You can continue the rule.
The last thing is a closing curly bracket.
The closing curly bracket
is mandatory also.
Now, between the open
and closing curly brackets,
if you have nothing,
according to the rule,
that is correct.
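
The rule just walked through can be sketched as a small checker in Java. This is a sketch under stated assumptions, not the actual Java grammar: the class name `ClassHeaderChecker` and the modifier list are hypothetical, the input is assumed to be already split into words, any number of modifiers is accepted (the diagram may allow only one), and the body between the curly brackets must be empty because field declarations are left out.

```java
import java.util.List;
import java.util.Set;

class ClassHeaderChecker {
    // Words accepted before the keyword class (assumed subset of the modifier rule).
    static final Set<String> MODIFIERS = Set.of("public", "abstract", "final");
    static final Set<String> RESERVED =
            Set.of("class", "extends", "implements", "public", "abstract", "final");

    static boolean isIdentifier(String t) {
        return t.matches("[A-Za-z_$][A-Za-z0-9_$]*") && !RESERVED.contains(t);
    }

    // Checks: {modifier} class ID [extends ID] [implements ID {, ID}] { }
    static boolean isValid(List<String> tokens) {
        int i = 0;
        int n = tokens.size();
        while (i < n && MODIFIERS.contains(tokens.get(i))) i++;       // optional modifiers
        if (i >= n || !tokens.get(i).equals("class")) return false;   // mandatory keyword
        i++;
        if (i >= n || !isIdentifier(tokens.get(i))) return false;     // mandatory class name
        i++;
        if (i < n && tokens.get(i).equals("extends")) {               // optional part
            i++;
            if (i >= n || !isIdentifier(tokens.get(i))) return false; // name is then required
            i++;
        }
        if (i < n && tokens.get(i).equals("implements")) {            // optional part
            i++;
            if (i >= n || !isIdentifier(tokens.get(i))) return false;
            i++;
            while (i < n && tokens.get(i).equals(",")) {              // more names after commas
                i++;
                if (i >= n || !isIdentifier(tokens.get(i))) return false;
                i++;
            }
        }
        if (i >= n || !tokens.get(i).equals("{")) return false;       // mandatory
        i++;
        if (i >= n || !tokens.get(i).equals("}")) return false;       // mandatory
        i++;
        return i == n;                                                // nothing may follow
    }

    public static void main(String[] args) {
        System.out.println(isValid(List.of("public", "class", "A", "{", "}")));
        System.out.println(isValid(List.of("publik", "class", "A", "{", "}")));
        System.out.println(isValid(List.of("class", "A", "extends", "{", "}")));
    }
}
```

Each `if` that returns false corresponds to a mandatory element in the diagram, and each block that is skipped when its keyword is absent corresponds to an optional path.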
Syntactical rules
help us to identify syntactical errors.
For instance, in the first line,
there is a missing semicolon.
In the second line,
that identifier hi
cannot be between two expressions.
An operator could be correct there,
but not an identifier.
The same case is with the equal operator,
the equal needs as a third element an ID,
not another expression,
so an equal with (mumbles)
expression is incorrect.
Similarly, after the
parentheses of the if statement,
you need an expression,
or an open curly bracket,
but not a closing curly bracket.
The last of the rules
are the semantic rules.
Semantic rules will review
whether sentences that are
syntactically correct
also have a meaning in the language.
These are examples of semantic rules.
Semantic rules would review, for instance,
that you declare a variable only one time.
You cannot have two
variables with the same name,
in the same function.
So according to this rule,
the code shown here has
two semantic errors.
Semantic rules would
review that the types match
between variables and values.
And this has three semantic errors
according to this rule.
Semantic rules also review
that the index of an array
is an integer number.
Or that the condition in an if statement
or loop statement is a Boolean value.
That a function has a return type,
and that the returned value matches
the return type of the function.
And also that the function is called
with the same number of parameters
that was declared.
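
The semantic rules listed above can be illustrated with a short Java class. This is a hypothetical example, not the code from the slides: the names `SemanticExamples`, `twice`, and `demo` are made up. Every commented-out line is lexically and syntactically fine, but the Java compiler would reject it for the reason noted next to it.

```java
class SemanticExamples {
    // Declared with one int parameter and an int return type.
    static int twice(int x) {
        return 2 * x;
        // return "two";          // semantic error: returned value does not match int
    }

    static int demo() {
        int count = 1;
        // int count = 2;         // semantic error: count is declared twice in this method
        // int total = "ten";     // semantic error: a String value does not match type int
        int[] data = {10, 20, 30};
        // int v = data[1.5];     // semantic error: an array index must be an integer
        // if (count) { }         // semantic error: the condition must be a boolean
        // return twice(1, 2);    // semantic error: twice is declared with one parameter
        return twice(data[count]);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Uncommenting any one of those lines keeps every word and every sentence well formed, yet the program no longer compiles, which is the difference between a syntactical error and a semantic one.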
Lexical, syntactical, and semantical rules
define the structure of
a programming language.
Programming languages that belong
to the same paradigm have similar rules.
Our work here is to compare,
across the different paradigms,
the lexical, syntactical,
and semantic elements.
That is gonna be our goal for the next weeks.
Have a nice day.
