How do Princetonians feel about their courses, really? Machine learning analysis offers an answer.

As courses recommence for the spring 2024 semester, The Daily Princetonian Data section took a look ahead, analyzing a typical end-of-semester ritual for Princeton students: course evaluations. At the end of every semester, the University encourages students to submit numerical course evaluations, rating their courses on a scale from one to five in various categories.
Questions in the survey ask about various aspects of the course experience, including “What advice would you give to another student taking this course?” These evaluations are available as a reference for future students who are considering enrolling in a given course, and some resources let students easily compare numerical course ratings.

However, the usefulness of this type of evaluation has been widely called into question recently because of its inherent subjectivity. For example, the difference in quality between a score of “1” and a score of “2” may differ among students, which makes comparing average “Course Quality” ratings particularly problematic. The ‘Prince’ set out to test a metric that avoids the shortcomings of numerical evaluations: sentiment analysis through natural language processing (NLP).

The Computer Science (COS) department scored an average of 4.02/5. As a fraction, 4.02/5 would suggest a high course rating; it is the highest average numerical evaluation of any sequence analyzed. However, out of all written course evaluations for intro COS, only roughly 66 percent were classified as positive.
RoBERTa, the type of machine learning model we used, is pre-trained on thousands of books and English Wikipedia articles. The specific model we used is a version of RoBERTa trained on an additional 58 million tweets. The differences between Princeton course evaluations and the data RoBERTa was trained on (books, articles, and tweets) may have affected our findings: tweets contain a large amount of linguistic noise, and this model is unfamiliar with Princeton-specific terminology (such as PDF, PSet, and precept).
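To make "classified as positive" concrete, here is a minimal sketch of how a three-class sentiment model's raw scores can be turned into a single label. It assumes the model emits one score each for negative, neutral, and positive, and that the highest-probability class is taken as the label. The article does not describe this step, so the label order, the function names, and the example scores are all hypothetical.

```python
import math

# Assumed label order, for illustration only; a real model documents
# its own mapping from class index to label.
LABELS = ("negative", "neutral", "positive")


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up scores for one hypothetical written review, not real model output.
label, confidence = classify([-1.2, 0.3, 2.1])
print(label, round(confidence, 2))
```

Under this scheme a review that is only mildly favorable can still be labeled neutral, which is one reason the share of positive comments can sit below what a numerical average suggests.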
Introductory and core courses
Introductory courses, which are often required for students pursuing a particular degree track or concentration, are among the largest courses offered at the University. We evaluated introductory course sequences across various popular departments, such as Computer Science (COS) and Economics (ECO), the two departments that awarded the most degrees in the 2022–2023 academic year. For some departments, like COS, the sentiment of written reviews was consistently positive; for others, like ECO, findings varied.

The proportion of students leaving a negative written evaluation of COS 126, COS 226, and COS 217 has risen above ten percent only twice since fall 2014: once in fall 2018, and again in fall 2022. Positivity in written feedback increased during the pandemic by roughly 11 percent, before returning to pre-pandemic levels. In spring 2023, 65 percent of students left a positive review, 26 percent were neutral, and only 9 percent were negative. In comparison, the average numerical evaluations from fall 2014 to spring 2023 were 3.96 for COS 126, 4.29 for COS 226, and 3.82 for COS 217, for an overall average of 4.02 out of 5. This numerical rating suggests a potentially more positive assessment than the comments do.
COS 226: Algorithms and Data Structures consistently accounts for a large proportion of the positive reviews.
“I appreciated the fact that our programming assignments were related to real world applications,” wrote Rayan Elahmadi ’26 in an email to the ‘Prince.’ Elahmadi cited assignments such as Autocomplete, where students implement a text auto-completion algorithm à la Google search, as particularly applicable.
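The arithmetic behind the COS comparison can be checked in a few lines. This sketch assumes the overall figure is a simple unweighted mean of the three course averages. The article does not say whether enrollment weighting was used, but the unweighted mean does reproduce the reported 4.02.

```python
# Average numerical ratings quoted in the article (fall 2014 to spring 2023).
course_averages = {"COS 126": 3.96, "COS 226": 4.29, "COS 217": 3.82}

# Assumption: each course counts equally, regardless of enrollment.
overall = sum(course_averages.values()) / len(course_averages)

# Express the average as a fraction of the maximum score of 5.
as_fraction = overall / 5

# Spring 2023 sentiment shares for written COS reviews, in percent.
sentiment_shares = {"positive": 65, "neutral": 26, "negative": 9}

print(f"overall average: {overall:.2f} out of 5")
print(f"as a fraction of the maximum: {as_fraction:.0%}")
print(f"written reviews classified positive: {sentiment_shares['positive']}%")
```

The two measures are not directly comparable, since one averages scores and the other counts labeled comments, so the gap is suggestive rather than conclusive.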

Before the end of junior year, all economics students at Princeton are required to complete ECO 300/310: Microeconomics, ECO 301/311: Macroeconomics, and ECO 302/312: Econometrics. Our findings for these courses varied: from fall 2015 to spring 2016, the proportion of students leaving a positive review for these six classes decreased from 54 percent to 29 percent.
During the COVID-19 pandemic, reviews gained positivity before gradually returning to pre-COVID-19 levels. In the spring of 2023, 44 percent of all submissions were positive, 40 percent were neutral, and 16 percent were negative. For comparison with the typical numerical averages, the core ECO classes scored an average of 3.48 out of 5 over our period of interest.

The overall trend is slightly positive for introductory BSE math and physics courses, with an increase in the proportion of positive reviews from 33 percent in fall 2014 to 45 percent in spring 2023. In the spring of 2020, there was a noticeable jump in negative reviews. From fall 2014 to spring 2023, the average numerical rating for these courses was 3.23.

The EGR sequence, which began in 2017, was introduced as an alternative to the traditional math and physics courses for first-year engineering students. Responses were very positive in the first several semesters: in spring 2018 and fall 2019, the proportion of positive reviews approached 80 percent of all written evaluations submitted, compared to roughly 45 percent for traditional BSE introductory courses.
There were noticeable oscillations in the data from fall to spring semesters; in some instances there was ten percent more negative feedback in the spring compared to the previous fall. Only two EGR sequence courses are offered in the spring: EGR 153: Electricity, Magnetism, and Photonics, and EGR 154: Linear Systems. The other three, EGR 151: Mechanics, Energy, and Waves, EGR 152: The Mathematics of Shape and Motion, and EGR 156: Multivariable Calculus, tend to receive more positive feedback.
The spring 2023 semester showed an all-time negativity high for the EGR sequence. The average numerical evaluation for the EGR sequence was 3.84 out of 5.
Myles Anderson is an assistant Data editor for the ‘Prince.’
Additional consulting provided by emerita head Web Design and Development editor Anika Maskara.
