Hello,
everyone.
My name is David Malan,
and I'm so sorry I couldn't be there in person with you this week,
but I'm here to present on behalf of myself,
Brian Yu and Doug Lloyd from CS50 at Harvard.
If at any point during today's talk you have any questions or after the talk,
please feel free to drop me a note.
I'll keep an eye on my email throughout the day.
And
if you'd like to get a copy of the paper in question,
head over, now or later,
to cs50.ly/sigcse20-paper or, for
these very slides,
you can go to cs50.ly/sigcse20-slides.
So,
without further ado,
CS50 is our introductory course in computer science for majors and non-majors at Harvard.
It's a one-semester amalgam
of the courses generally known elsewhere
as CS1 and CS2.
We teach it primarily in C,
followed by Python.
And as of the past few years CS50 happens to be Harvard's largest course,
with 800 or so students on campus as well as online through Harvard's Extension
School, 2/3 of whom have never taken a computer science course before.
The workload itself is nontrivial,
with most students spending 12+ hours per week on the course's
problem sets, or programming projects.
And in terms of the course's support structure:
We have lectures once per week,
followed by sections or recitations led by the course's teaching fellows,
or TFs,
as well as tutorials and office hours held by all of the staff,
which are one-on-one opportunities for help with those same problem sets.
Now,
within the course itself,
we have different tracks,
so to speak,
for those less comfortable,
those more comfortable, and those somewhere in between,
whereby different demographics of students convene for different sections so that they are
among like-minded and similarly experienced or inexperienced classmates.
And even within the course's problem sets do we tend to have different versions of problems for those less comfortable
or more comfortable that students can choose between week by week,
depending on the ceiling or floor that they would like to have ahead of them.
So the course
has a long-standing and very detailed policy on academic honesty in the syllabus that
defines what is reasonable and what is not.
In fact,
if you'd like to look at the full detail, feel free to forge ahead to
cs50.ly/sigcse20-syllabus.
But in essence,
the syllabus allows and encourages students to collaborate,
particularly in pseudocode and when it comes to the thought before their programs.
But it does prescribe guard rails on what collaboration is allowed,
insofar as ultimately we prescribe
for all of the course's
problem sets that students implement their programs largely on their own and
write their own code.
An exception to this is the course's final project at term's end, where they're free to collaborate with
one or two classmates.
But in essence,
within the syllabus,
we provide this heuristic that generally guides these guard rails for students.
"When asking for help,
you may show your code to others,
but you may not view
theirs."
In other words,
if two students, A and B, happen to be sitting next to each other at office hours or in their dorms or the like,
and one of those students is struggling,
it's fine for the other student who isn't struggling to look at the struggling student's screen and perhaps
point out rhetorically or,
more explicitly,
what lines of code, or bugs,
or syntax they might want to focus on.
But it would be crossing the line if the student who's struggling outright looks at the code of another
student.
But again, see the syllabus for more precise guard rails as well.
Even so,
despite these constraints,
in the course's syllabus,
we have the unfortunate distinction in CS50 of referring more students most every year to the
university's honor council or disciplinary process as a result of students having
crossed some line prescribed in the syllabus.
How we know this is that we, like a lot of universities or intro courses,
do run our student submissions through software, be it ETector or MOSS or some
other such tool with which you might be familiar.
And it outputs as part of that process any number of similarities that we then apply human
eyes to.
So just for context, in fall 2019, most recently, among our 800 or so
students
and our 10 or so problem sets, did we have some 6 million+ pairwise comparisons
conducted by this automated software,
out of which might come 1200 seemingly worrisome matches: side by side,
two submissions that look unduly the same,
at which point we have a gauntlet of humans,
either two or three pairs of eyes, that then iteratively review those matches and whittle them down to a more
manageable list.
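To give a rough sense of why the number of comparisons runs into the millions, here is a minimal sketch in Python. It assumes, purely for illustration, that each submission to a problem set is compared against every other submission from the same term; the function name is hypothetical, and the round figures of 800 students and 10 problem sets are the approximate ones from the talk. This simple count comes to a bit over 3 million, lower than the 6 million+ mentioned above, so the real pool of submissions being compared is presumably larger than the one modeled here. That last point is a guess, not something the talk states.

```python
import math


def pairwise_comparisons(submissions: int, problem_sets: int) -> int:
    """Count unordered pairs of submissions, summed over all problem sets.

    Assumption for illustration only: every submission to a problem set is
    compared with every other submission to that same problem set.
    """
    return math.comb(submissions, 2) * problem_sets


if __name__ == "__main__":
    # Approximate figures from the talk: ~800 students, ~10 problem sets.
    print(pairwise_comparisons(800, 10))
```

The count grows quadratically with class size, which is why human review only begins after software has narrowed the field.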
For instance,
we ended up with 116 matches after initial human review,
and then myself and the course's more senior staff
each week sit down and review those matches and decide
collectively whether or not to send a case forward to Harvard's Honor Council.
In this most recent fall semester did we refer 38 students to the Honor Council,
at which point we document the case,
we write up the similarities,
provide the samples of those submissions,
and then it's an independent body,
this Honor Council that ultimately adjudicates the outcome.
Now,
in terms of what these cases involve,
it generally involves students having based their code on someone else's,
whether that's someone else's on campus or perhaps some online solutions.
In terms of the numbers each semester,
it's highly variable over the years
but it generally ranges between 0 and 10% of the class that we ultimately
refer to Harvard's disciplinary body.
From conversations with peer institutions,
we gather,
unfortunately,
that the 0 to 10% is within range of other courses, as well.
But it's worth noting in our case it seems to be trending upwards,
and that's in part because of some of the interventions we'll present here today.
But even so,
it's been of interest to us,
certainly over the years and to the university to reduce the frequency of these acts
of dishonesty themselves,
students copying unduly,
some online or some on-campus source.
So this paper, and in turn
today's talk, is really a look at everything we've tried over the years.
And the spoiler,
unfortunately,
is that nothing really has worked,
or, more precisely,
nothing has worked in our case that has put a consistent downward pressure on the
total number of cases.
But even so,
we do think a number of the interventions we've tried have been successful educationally,
ultimately transforming what have historically been purely punitive processes into more
teachable moments.
So what have we considered and
what have we tried? Well,
it's often suggested here on campus,
and perhaps beyond, that
we simply change
the course's problem sets each year, the programming assignments.
Unfortunately,
at least in our case,
the homework assignments
tend not to be as simple as problems that allow you to change the inputs and
expect different outputs.
They do tend to be programming projects unto themselves,
so changing the problem sets would involve rewriting and coming up with brand new problem sets.
And at least in my case,
I actually find that our problem sets get better over time
as I work out the kinks,
fix bugs, and address FAQs within the specifications themselves.
And so we feel it would actually be a net negative, for the sake of that smaller demographic of students,
simply to throw out problems that seem to be challenging a majority of students and working well overall.
So we've generally shied away from changing problem sets for changing problem sets' sake,
though they certainly do evolve by choice on our part
over time.
It's often been suggested, too: why don't we just re-weight the problem sets? Make them worth much less
in terms of a student's final grade to put downward pressure,
therefore,
on
the pressure they might be putting on themselves and also the effect that plagiarized work might
have on a student's overall quantitative performance.
Here,
too,
we feel that that would be a disservice to the sheer amount of time that students are spending on these
programming assignments.
Indeed,
it is largely the problem sets that characterize students' learning and hands-on experience throughout
the class.
And so I've never been comfortable,
for instance,
with the idea of instituting proctored
exams or weighting the exams in the class much more heavily than that which the students are spending most of
their time on in the first place.
But we have tried communication. Each year
have we spent more and more,
sometimes less,
but in general,
more and more time on communicating to students exactly what our expectations are in terms of
academic honesty,
reviewing past cases that have gotten some of their prior classmates in trouble,
albeit anonymously,
and sharing outright examples of code that we adjudicated to be
too similar or indeed identical in some cases.
And we even go so far as to point out examples of the kinds of things we, via software, are looking for:
variables whose names are somewhat transposed but are still in essence serving the same function.
And certainly for multiple lines of code do we point out similarities there as well.
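To illustrate why renamed variables alone do not disguise similar code, here is a minimal sketch in Python. It is not a description of the tools named in this talk; it only assumes one common idea, replacing every identifier with a placeholder before comparing, and it uses Python's standard tokenize module on Python source for convenience. The function name and the sample snippets are hypothetical.

```python
import io
import keyword
import tokenize

# Token types that carry layout or comments rather than program structure.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def normalized_tokens(source: str) -> list[str]:
    """Return the tokens of `source` with every identifier replaced by 'ID'."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIPPED:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")  # variable and function names all look alike
        else:
            tokens.append(tok.string)  # keywords, operators, literals stay
    return tokens


# Two hypothetical submissions that differ only in their variable names,
# and a third that computes something different.
original = "total = 0\nfor x in items:\n    total += x\n"
renamed = "s = 0\nfor item in values:\n    s += item\n"
different = "total = 1\nfor x in items:\n    total *= x\n"

if __name__ == "__main__":
    print(normalized_tokens(original) == normalized_tokens(renamed))
    print(normalized_tokens(original) == normalized_tokens(different))
```

After normalization the first two snippets yield the same token sequence, while the third still differs, because its literal and operator are not the same.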
We've also focused on raising awareness.
Some years ago
did we have
the university's dean of undergraduate education even come to one of our orientation meetings
at term's start to speak to students not only about academic honesty more generally,
but also about the course's
own policies and his own experience,
and his colleagues' own experience, with,
unfortunately,
the disciplinary process that all too often results at the end of the term.
Also,
by nature of our having referred so many students over the years to this Honor Council
has there generally been,
in some years,
more than others,
just more awareness on campus,
of classmates who have been disciplined in some way.
And so we find that, in part, our numbers trend up and down
depending on the year as a reaction to the past year's numbers of cases and students' awareness
thereof.
But we've also taken more software-based approaches. So built into,
for instance,
the course's software via which students submit
their work nowadays is an explicit prompt
to which they must answer yes or, conversely, no.
We ask them explicitly:
"Keeping in mind the course's policy on academic honesty,
are you sure you want to submit these files?"
The presumption being that even if a student has unfortunately crossed some line or made some poor
judgment,
they have this final chance before they submit their work and put their name on that work to say "No,
this is not,
in fact,
my own."
Unfortunately,
all too often have students still typed in
"yes"
and submitted their work despite this final prompt here.
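A prompt of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of the course's actual submission software; the function name, the accepted answers, and the re-prompting behavior are all assumptions. Only the wording of the question comes from the talk.

```python
PROMPT = (
    "Keeping in mind the course's policy on academic honesty, "
    "are you sure you want to submit these files? (yes/no) "
)


def confirm_submission(ask=input) -> bool:
    """Ask the question until the student answers 'yes' or 'no'.

    `ask` is any function that shows a prompt and returns the reply;
    it defaults to the built-in input() and can be swapped out in tests.
    """
    while True:
        answer = ask(PROMPT).strip().lower()
        if answer == "yes":
            return True
        if answer == "no":
            return False
        # Anything else is ignored, and the question is asked again.


if __name__ == "__main__":
    if confirm_submission():
        print("Submitting...")
    else:
        print("Submission cancelled.")
```

Requiring the full word rather than accepting a bare Enter key is a deliberate choice in this sketch: the student has to type an answer, so the decision cannot be made by accident.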
But we've also added other prompts over the years to the course's
homework assignments via forms that the students submit, a la
Google Forms, whereby we have an opportunity to ask them not only short-answer,
code-unrelated questions,
but also checkboxes and the like that we require that they check. We,
for instance,
in almost every problem, present students with a link to the course's
academic honesty policy
and the syllabus and ask them to check a box, by requirement, that they have read
the course's policy on academic honesty.
And then later in the term,
do we ask them a few other related questions as well to ensure that they're not just clicking.
Unfortunately,
this too does not seem to have had a measurable effect on students' behavior.
But we've also focused in recent years on more human interventions: interventional conversations,
so to speak,
whereby when the senior staff and I are reviewing that week's submissions,
if we notice that some pairs of submissions seem to be awfully similar and awfully
close to suggesting that some line has been crossed,
but it's not clear-cut evidence thereof,
we might instead ask those students to come in for a conversation to better understand how it
is they approached that week's problem set, how they did or didn't collaborate that particular week,
and ultimately, not to interrogate them toward an end of some punitive outcome,
but rather just to understand that process and to help them better navigate the waters next time
around.
So the outcome of this process,
these interventional conversations,
really is guidance and suggestions for how they can better structure their time together
if they're indeed working with some classmate in some reasonable way for the coming week's
problem sets, as well.
But perhaps most impactful was one sentence that we introduced to the course's syllabus in
2014,
a so-called regret clause.
At the time,
we realized,
after years of reflection on data past, that all too often were students'
transgressions
the result of late-night panic,
late-night stress, often well after midnight, when they simply, under a looming
deadline in our course or others, opted to make a poor decision: grabbing some code off the
Internet,
copying from some friend,
with or without their knowledge, and submitting it as their own.
And up until then,
we didn't have a well-defined process for how a student might own up to that sort
of late-night mistake.
To be fair,
there was no mechanism or process preventing such students from coming forward the next day and admitting
they had crossed some line.
But with the potential penalty so high
(indeed,
the university might, as part of the disciplinary process,
ask that they withdraw from the college for one or more terms),
it's no surprise that no students really availed themselves of that option.
And so we opted in 2014 to make it more explicit,
introducing this language into the course's syllabus:
"If you commit some act
that is not reasonable,
but bring it to the attention of the course's heads within 72 hours,
the course may impose local sanctions that may include an unsatisfactory or failing grade for work
submitted,
but the course will not refer the matter for further disciplinary action, except in
cases of repeated acts."
In other words,
we hypothesized that if we allowed students some 2 to 3 days, in our case, to come
forward after some rest, after some sleep, after some reflection,
after having crossed some line, they may very well take us up on that offer,
reach out to us
before we had even noticed the transgression ourselves, and indeed turn a process that would have
otherwise become purely punitive into,
we'd hope, a teachable moment.
Indeed,
after these invocations of the regret clause,
would we sit down, I or one of the senior staff, with the student to, first and foremost, understand what had
happened.
We would then typically zero the problem
or problem sets in question,
but quite explicitly assure the student that we then considered the matter behind us.
But we also invited students to share with us any extenuating circumstances or
stressors that indeed had been looming on them when they made that poor choice.
And in fact,
as a side effect of this regret clause,
not only did we hopefully transform within the course a historically punitive process to more
teachable moments and opportunities,
it also came to light
in some of these conversations that students were having particular troubles back home or in
relationships with roommates or friends or the like, struggling in other classes, or struggling with issues of mental
health.
And so in those cases,
when a student brought forth those circumstances,
could we then connect them all the more successfully than in years past with the appropriate support
structures on campus,
particularly in cases of mental health.
And so it's those conversations that had previously never happened that were the byproduct of
having introduced this clause.
And that first year some 19 students availed themselves of that clause,
coming forward to me for heartfelt and, in a couple of cases,
tearful conversations as to what had led them to that particular point,
at which point we then would connect those students with the right resources on campus. The year after,
in large part,
we think,
because of awareness of the policy's introduction,
did some 26 students out of the course's student body avail themselves as well.
After that,
I suspect we weren't as vocal in the class and beyond the class about the
availability of this clause,
particularly as the campus newspaper's attention dwindled after a couple of years of it being in play.
But in more recent years,
with 18, 11 and 8 students having availed themselves most recently of this particular
clause,
do I suspect we're more in a band of equilibrium. Indeed,
it seems to correlate with just how vocal we are as to the availability of this particular
clause.
Now,
the introduction of this regret
clause was not without issue or concern early on.
Indeed,
insofar as the Honor Council at Harvard is intended to handle all cases centrally,
there was concern in some circles,
particularly within the dean's office as well as the Honor Council itself,
that we were effectively proposing to handle some of our situations internally.
Now,
to be fair,
we were only proposing to handle those cases that students themselves brought forward and
not the cases that we ourselves detected by our automated and human process.
However,
ultimately assuaging folks' concerns was the introduction of one final clause to that sentence,
which was,
"except in cases of repeated acts."
Indeed,
if a student availed themselves of our clause but then made a second mistake despite that intervention,
we did agree to inform the Honor Council lest that student be crossing similar lines in
other classes as well.
Now there are other interventions we tried over the years,
inspired by these clauses and others, that were not so successful.
In fact,
we tried deploying two years ago a so-called brink clause as well to the course's
syllabus, inspired,
in fact,
by a conversation I had with Princeton's Christopher Moretti at a past SIGCSE,
whereby this clause's design was meant to preemptively catch students
before they crossed some line and was meant,
therefore,
to be much more proactive than reactive.
In this case,
we asked students in formal language in the syllabus, if they feel themselves late at night about to make some
poor decision,
Googling more than they should, copying and
pasting more than they should, to email us,
then close their laptop,
and we'll deal with it in the morning,
effectively giving them a blanket extension for the day until we can handle things after they've had some rest.
Unfortunately, among the students who invoked this particular clause,
as best we could tell,
it was almost always effectively to obtain for oneself a self-granted
extension. In few,
if any, cases did we actually sense,
based on the wording of the students'
emails and subsequent conversations, that they were indeed, ironically, on the brink of doing
something dishonest.
And so we actually discontinued this clause last year as a result of those impressions.
But we also have tried deploying even changes to some of the problem sets.
And in fact,
a few years ago did we introduce a problem set on document
similarity,
ostensibly about plagiarism,
whereby we challenged students to implement code that cross-compared two files and then
visualized, as by highlighting, the similarities or differences between the two, very similar in
spirit to the software we ourselves use.
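The kind of cross-comparison described here can be sketched with Python's standard difflib module. This is an assumption-laden illustration rather than the problem set's actual specification or solution: the function name, the choice to compare line by line, and the "=" and "-" markers standing in for highlighting are all hypothetical.

```python
import difflib


def highlight(a: str, b: str) -> str:
    """Mark each line of `a`: '= ' if it also appears in `b`, else '- '."""
    a_lines, b_lines = a.splitlines(), b.splitlines()
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    marked = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            marker = "= "  # lines shared by both files
        elif tag in ("delete", "replace"):
            marker = "- "  # lines found only in the first file
        else:
            continue       # 'insert': lines found only in the second file
        marked.extend(marker + line for line in a_lines[i1:i2])
    return "\n".join(marked)


# Two hypothetical files that differ in their middle line.
first = "int x = 1;\nint y = 2;\nreturn x + y;"
second = "int x = 1;\nint z = 3;\nreturn x + y;"

if __name__ == "__main__":
    print(highlight(first, second))
```

Comparing whole lines is the simplest granularity; comparing sentences or substrings instead would only change what is passed to the matcher.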
But we did so without describing it as a problem set on plagiarism. And yet somehow, as of
2018,
when we last offered that problem set,
did we end up referring 6 students to the university's Honor Council for having
crossed some line on that particular problem set. Now,
in terms of results, our cases have indeed varied between 0% and 10% of the course's student
body over the past 10 or so years.
Most recently, in 2019, did we refer 5% of the course's student body
to the Honor Council,
even despite any number of these interventions in place throughout the semester.
You'll notice,
though, some curiosities:
for instance,
in 2016, a spike from 2 to 10% of the course's
student body that actually follows,
two years later,
the course's regret clause.
And this is not the result, do
I think, of any change
fundamentally in students' behavior. There does tend to be ebb and flow over the years,
based on again just how much awareness there is in a prior year of students and their classmates
having been disciplined through this process.
That partly explains the upticks and downticks.
But after 2014, truth be told,
did we, and did I personally, become all the more comfortable referring students to the university's
Honor Council simply because we had,
after 2014, offered them this window of opportunity to come forward, to meet us
halfway, by the course's regret clause.
And therefore in more recent years have I been much more comfortable, when presented with seemingly clear-cut evidence,
simply documenting the case and
referring it to the Honor Council because the students have indeed had an opportunity to handle it
otherwise.
But that particular 10%, of course,
was then noticed
in many circles on campus by way of the campus's newspaper as well.
And so we think, there
too,
the 4% in 2017 was,
at least in part, a reaction to students' awareness of just how seriously the course and others
take these processes on campus.
So if you'd like to learn more about this intervention in particular,
I would invite you to take a look at the specific language in the syllabus here again at
cs50.ly/sigcse20-syllabus.
If you'd like to take a look at the paper or slides, here are those other two URLs again. And
by all means, if you would like to experiment with the software we use,
please feel free to take a look at this GitHub repository:
github.com/cs50/compare50
This is CS50's own implementation of a tool similar in spirit to MOSS and ETector.
But in this case is the repository
open source, and is the design of the software meant to be extensible so that others can contribute heuristics and
algorithms and variations thereof.
So by all means please feel free to consider this part of the open source community now as well.
If you do have any questions whatsoever now or hereafter,
please feel free to reach out.
And on behalf of myself,
Brian and Doug, this was CS50.
