It’s time for students to grade their professors.
Each semester, through the Q, the Faculty of Arts and Sciences’ course evaluation system, students get the chance to anonymously review their professors, teaching fellows, and courses on a series of metrics like workload and lecture effectiveness. They are also given a free-write section in which they can offer more personalized feedback.
While the Q has been used by administrative bodies since 2008 to evaluate student workload and professor quality, it has recently become a topic of lively debate. Faculty administrators contemplate whether the Q evaluations give instructive feedback or simply make it easier for students to find gems — easy courses for those who just want to get by. At the same time, this feedback is still used to evaluate instructors. Professors and teaching fellows worry that their worth and career success hang heavily on the Q. Some administrators are concerned that the pressure to earn a stellar report might tempt instructors to hand out higher grades, hoping their generosity is proportionally repaid.
Among those concerned is Dean of Undergraduate Education Amanda Claybaugh. She recently authored a report titled “Recentering Academics at Harvard College: Update on Grading and Workload.” In the report, she identifies several culprits contributing to Harvard’s grade inflation — among them, the Q.
“It’s important to me that we base our policies on rigorous analysis, rather than anecdotes and vibes,” wrote Claybaugh in an email to The Crimson about the Q.
This report comes at a moment when the University’s academic rigor has become a topic of national debate and a concern the Harvard administration has sought to address.
Tracing the history of Harvard course evaluations — from The Crimson’s early Confidential Guide of College Courses (the “Confi”) to the Committee on Undergraduate Education (CUE) guide, to its current form, the Q guide — may help us understand what role student feedback can and should play now.
***
The original incarnation of the Q, the “Confidential Guide of College Courses,” or “Confi,” was composed and published by the editors of The Crimson beginning in 1925. Crimson editors felt that “the past discussion of the merits and defects of college courses has been altogether too meagre to be of any value either to instructors or prospective students.” The Confi was somewhat anecdotal, relying less on data and more on the personal experiences of upperclassmen editors.
The Crimson wrote in its 1930 issue that the Confi was intended to “furnish students with the frank opinions of other students who have taken the courses being offered.” Indeed, frank they were: “Zoology 3 has, unquestionably, the worst reputation among the courses in the field of Biology; in general, this reputation is deserved,” the 1934 guide read.
Of a certain Comparative Literature professor called Mr. Babbitt, the 1930 Confi wrote, “Anyone may guess what happens when the acid of Mr. Babbitt’s mind meets the syrup of romanticism.” Babbitt’s reported arch-enemy was Jean Jacques Rousseau, and he was known to write on the papers of “non-conformist” students the note: “Good argument. You’ll get over this after a while.”
Not all were pleased with the humorous timbre of the Guide, however. An anonymous letter to The Crimson after the first issue claimed that “such destructive criticism as comprised most of your ‘Guide’ is not calculated to help the student in selected [sic] his courses.”
It’s difficult to tell when The Crimson stopped publishing the Confi (you’d really think we’d know), but as late as the 1990s, instructors were still disputing the accuracy of the Confi’s “biting” reviews.
“A for no effort,” the 1995-1996 Confi said of “Chamber Music from Mozart to Ravel.” Professor of Music Robert D. Levin ’68 did not appreciate the sentiment, warning students the next year that they would not, in fact, be skating by.
Much more objective, many professors thought, was the “CUE Guide.” After the 1973 consolidation of the Committee on Undergraduate Education (CUE), the College began distributing questionnaires and summarizing the results in formal course evaluations.
Like the Confi, the CUE was written by students primarily for other students. The CUE, however, sought to be more faithful to reality, carrying out a much more extensive survey process and introducing quantitative measures of difficulty and workload.
Still, some faculty weren’t quite enamored with the evaluation system. In 1973, Professor Harvey C. Mansfield ’53, who might rightfully be considered (in the language of today’s youth) the “OG hater” of the CUE/Q, reportedly refused to allow the inaugural distribution of CUE questionnaires in his class Government 1a, “Introduction to Political Philosophy.”
With FAS oversight and funding, however, came the potential for the College to influence reports. In 1985, Dean K. Whitla, director of the Office of Instructional Research and Evaluation, allegedly pressured student leadership to change the “harsh tone” in 17 of 250 write-ups in the CUE guide — many about such influential professors as evolutionary biologist Stephen J. Gould.
Though the CUE was intended primarily for students selecting courses, professors still paid attention and responded to their courses’ ratings. Professors in the 1990s who discovered that the CUE rated their course as a “gut” — the etymological predecessor of “gem” — tended to promptly raise standards. Hollis Research Professor of Divinity Harvey G. Cox told The Crimson in 1990 that if his class “Jesus and the Easy Life” ever had a “gut” reputation, “it certainly won’t after last spring.”
The Office of Institutional Research and Analytics took over the CUE in 2008, renaming it the “Q,” and required all courses with five or more students to offer evaluations.
For students, the reform had been a long time coming: a persistent criticism of the CUE was that instructors could opt not to participate in the evaluation process. Mandatory evaluations, some students reasoned, were a matter of responsibility: the University ought to give students information about the courses they might enroll in. “Students retain a legitimate claim to the opinions of their fellows before enrolling in a course,” Crimson writer Dan E. Markel ’95 wrote in 1994.
The Q, in contrast to the CUE, wasn’t just intended as a course-picking tool for students but also as a tool for the formal evaluation of faculty. Until March of 2025, a number of teaching awards were determined based solely on Q scores, a decision Professor of Government Paul E. Peterson cited in a telephone interview as a probable contributor to grade inflation.
Another difference: Q reviews aren’t a synthesized work of prose, but a disjointed mosaic of comments. Faculty like Mansfield and German Professor Peter J. Burgard continued to object, arguing course evaluations — especially in their new, more official form — encouraged instructors to give higher grades.
In a 2006 faculty meeting, Mansfield said that he thought evaluations “subjected the wise to the judgment and scrutiny of the unwise” and intruded on “the sovereignty of the professor in his classroom,” possibly violating “academic freedom.” German and Comparative Literature Professor Judith L. Ryan responded, “I think it would be a sad day when I or anyone else considers me wise, because I as a professor am here to learn.”
Mansfield, for his part, thinks that we ought to go back to just the Confi, which he recalled using during his undergraduate years at Harvard. “It had no standing,” Mansfield says, “and everybody knew, you know, it was just the opinions of one or two people in The Crimson.”
***
Today, the Q has a life of its own.
Burnt-out students complete the Q on dark nights during reading period, only remembering because of a stream of Canvas notification reminders filling up their inboxes. However tired and overworked students may be, they are asked to set aside a few minutes to grade their professors and TFs, incentivized to do so by the promise that they will be able to view their own final grades a few days before they are officially released.
“I want to get my grades back early, so generally I fill it out,” says David D. Dickson ’28.
Despite this obvious incentive to fill out the Q, Dickson says he is still intentional with his responses. “Unless I really don’t like the class, I’ll be like, ‘Oh this person’s so amazing. Blah, blah, blah. This class is so amazing.’”
Even when a class is far from a match made in heaven, Dickson says he tries to “give grace” when writing his evaluations. For example, if a class is fine, he may just give it a good score. If a class is not “taught well,” he says, he feels more inclined to be “a little more strict.”
Other students argue that not everyone is as motivated to fill out the Q, and that the reports tend to be skewed towards the most passionate.
“It’s mostly an opportunity for students who have a really strong opinion to share that, because it’s mostly people who either really, really enjoyed the class or really, really didn’t,” Audrey Chalfie ’28 says.
A couple of weeks after students fill out the Q, instructors are able to see how they scored. For tenured faculty, these scores have lower stakes and can be taken as mere suggestions.
Peterson says he certainly pays attention to his scores to see what appeals to students — yet he makes clear that a bad review here or there would not radically change how he approaches the classroom. The Q “informs the way I think about how to teach my classes,” he says — but so do many other things.
Non-tenured faculty, however, are affected to a higher degree. Mansfield, for one, says the Q creates an “unspoken contract” between younger faculty who don’t have tenure and their students, where high ratings are exchanged for high grades.
The latest report from the Office of Undergraduate Education raises a similar concern about the reciprocity of evaluations and questions whether the Q contributes to grade inflation. Even though grading and workload don’t necessarily affect the scores students give, “faculty nonetheless believe that the grades they award and the work they assign determine the Q scores they receive,” Claybaugh wrote in the report.
However, Senior Preceptor in Physics Jacob A. Barandes — once a TF himself at Harvard — rejects the idea that grading may be influenced by a desire to score well on the Q. He says that for himself, grading was such a demanding task that it was “very hard to focus on anything other than just getting each next p-set done.”
At the same time, Barandes is well aware that the moment TFs open their Q scores may be just as stressful as when undergraduates receive their final grades.
Barandes — who taught a teaching practicum course to graduate students in his former position as Co-Director of Graduate Studies for the Department of Physics — devoted an entire class to explaining the various ways that the Q impacts TFs.
The Q, he explained, is used to give feedback to TFs to improve as educators and to provide accountability. Getting an ‘A’ on the Q guide can help TFs be recognized for their stellar teaching and can also help them land future job opportunities — good reviews from the Q function as a kind of letter of recommendation written by students. The Q can also signal to the department when TFs need more support in their capacity as developing educators.
Because of these uses, instructors aren’t the only ones reading the Q. FAS’s review committees and department leadership use the Q as a tool to celebrate great educators and to evaluate the state of academic rigor at the College. As of spring 2024, the FAS also uses the Q to gauge the range of viewpoints present in a given class, asking about courses’ openness to diverse opinions and about how comfortable students feel speaking about controversial topics.
During course registration, the Q reaches students again. When clicking on a course title, students have the option to see what has been written by both fans and survivors of the class.
Yareh Constant ’29 says the Q is particularly helpful for first-year students.
“I always check the Q reports to see how the workload is, how the teacher is, and get a comprehensive view of what the course will feel like to take, beyond the course material,” he says.
However, the information provided in any given Q report varies greatly.
Student feedback on the Q for General Education 1038: Sleep, which was taught this past spring, mentioned the word “gem” a total of 27 times. Some students claimed the class had been “de-gemified” — suggesting its difficulty has increased.
Another student, however, responded more concisely: “Gem.”
This blunt feedback can still be helpful for students when trying to balance demanding classes that are required for their concentration with electives and the College’s general education and divisional distribution requirements.
“I don’t really want an elective or a Gen-Ed to be taking up hours,” Olivia Sullivan ’26 says while explaining how she uses the Q.
In an email to The Crimson, Claybaugh wrote that some students “do use the Q to find gems” — which she feels undermines the Q’s overall utility.
“I do wish that students understood that the Q is supposed to serve three discrete functions,” Claybaugh wrote in the email. These functions are to be a source of feedback to help instructors, a teaching evaluation to help review committees, and a guide to help students register for classes.
“It sometimes seems as if students are filling out the Q with only that last purpose in mind (just writing “GEM” or whatever), and that can keep instructors and review committees from getting what they need from the Q,” she wrote.
Looking through the history of student evaluations at Harvard, it seems inevitable that, should their beloved Q guide change significantly, students will innovate. Though the current gem website (which is not officially affiliated with Harvard) is based on existing Q reports, it is evidence of students’ willingness to create a version of Harvard’s course evaluation system that best fits their needs and desires — whether or not professors and administrators deem those desires “ethical.”