title: 'Undergraduate Projects: Pinnacle or Detour?'
author: ED76004A (Module 4) / 33347617
mainfont: Helvetica Neue
linestretch: 1.5
papersize: a4paper
geometry: margin=1.5in
header-includes:
  - \renewcommand{\abstractname}{Declaration}
  - \usepackage{tikz}

abstract: '\noindent By submitting this, I declare it to be my own work. Any quotations from other sources have been credited to their authors. Approximate word count: 1950 words'

It is now common for undergraduate degrees to include a substantial dissertation or project component in the final year. Over the past two decades, the perception that a project is an important and integral part of the degree has strengthened (Fry et al., 2008, chap. 11). Benefits are seen not only in the immediate production of knowledge or synthesis as a result of the project’s execution, but also in the development of independence in the student, who must in large part take responsibility for planning and executing the project work, and additionally in the production of a new and substantial element in the student’s portfolio, suitable for discussion or demonstration when job-seeking or applying for further study. Indeed, in my experience of conducting interviews for admission to postgraduate study, and of sitting in on interviews at major employers, it is common for a substantial part of the interview to be given over to discussion of the student’s final project: as the project is often seen as the culmination of the student’s learning journey, and a demonstration of their personal effectiveness (Fry et al., 2008, chap. 8), it provides the clearest evidence available at interview time of the student’s knowledge and skills.

It is expected that part of the curriculum delivery in undergraduate Computing degrees will be “a major activity allowing students to demonstrate ability in applying practical and analytical skills ... this will often take the form of a project carried out in the final year” (QAA, 2015, para. 4.3). In the degrees offered in the Department of Computing at Goldsmiths, this is certainly the case: all the undergraduate degree programmes culminate in a 60-CATS project module, usually carried out as the only activity in the second term of the final year (to remove, as far as possible, distractions from the experience of independent work on a single project). These projects are broadly ‘unstructured’ (in the terminology of Fry et al., 2008, chap. 11), though there are interim milestones, and individual supervisory practice varies as to how much structure is suggested or imposed on the overall unstructured framework.

There are a number of sub-disciplines in the discipline of Computing; the draft subject benchmark (QAA, sect. 2) identifies Computer Science at the foundation, and discipline areas of Computer Engineering, Software Engineering, Information Systems, and Information Technology as having distinct emphases. At Goldsmiths, degrees are offered in Computer Science, Business Computing (closely related to Information Systems) and Creative Computing, as well as more recently in joint and targeted programmes (Music Computing, Digital Arts Computing and Games Programming) which will not concern us here. Importantly, the curriculum for each of these degree programmes is similar in shape: the first year of study is almost entirely shared between single-honours programmes, with 15 or 30 CATS of the 120 being taught separately (though often covering very similar learning outcomes). In the second year of study, there is more specialization, including a 30-CATS module reflecting the distinct identity of the degree programme within the general field of Computing; and in the third year of study, students choose four 15-CATS modules and execute a project.

Given this curricular system, if the project is the pinnacle of a student’s learning experience in the degree programme – if it is truly to reflect the knowledge and skills gained, and if the knowledge and skills are in any way distinct between the degree programmes in different Computing sub-disciplines – we would expect to see some kind of association between a student’s performance in signature modules for their degree programme, and their performance in the final project. The rest of this essay is a quantitative analysis of historical marks drawn from the Department of Computing’s database, to assess the evidence for or against this hypothesis.

The data used are the subset of the marks covering the results of those students achieving results in 120 CATS^1 of level 5 modules in the 2009-10 academic session, and a mark in the final project in the 2010-11 session. This excludes part-time students, and also students who were required to resit a year in part-time attendance through failure, but does not exclude students who progressed to their final year despite failing one or two modules at level 5: a total of 51 students’ marks were included in this dataset. Records were anonymised, and no information enabling identification of any individual or small group is presented in this essay or stored in any persistent format.

^1 … framework; in this case, there is a straightforward relationship between course units and CATS: 1 c.u. = 30 CATS.

There are a number of subtleties to take account of when modelling these data. The full details of the modelling assumptions are in the appendix; we briefly motivate and describe each modelling step here.

Firstly, viewing marks as a collection of observations of the extent to which relevant learning has occurred, we transform the raw marks using the logistic (log-odds) transformation. Since individual markers are probably more or less generous when interpreting descriptors, for each module we subtract that module’s overall average (transformed) mark, so that the marks for each module have a mean of zero.

Secondly, a correlation between a student’s marks across modules is entirely to be expected: broadly, students come in with a certain aptitude towards Computing and towards study in general, and that manifests itself in strong correlations between marks in all modules. At the risk of oversimplifying, we assume that each student has a single factor in the sense of Spearman’s $g$ (1904), though restricted to aptitude in Computing and related cognitive skills such as Computational Thinking (Wing, 2006); module marks as transformed above are assumed to be drawn from a Cauchy (heavy-tailed) distribution with a per-student median.

Having made these modelling assumptions, we can probe for associations between marks in a level 5 module and the final project by positing a deterministic component to the project mark coming from the level 5 module in question. Again, the details of this modelling are in the appendix; for the purposes of this essay, the strength of the association is measured with a single number $C$ per module, where a $C$ of 0 indicates no association at all, and a $C$ of 1 reflects a project mark fully determined by the level 5 module mark.

\begin{figure} \input{/home/csr21/goldsmiths/research/marks-data-mining/coursehist.tikz} \caption{\it \setstretch{1.1} Graph summarizing the empirical influence of each module, taken independently, on the final project. A value of 0 would indicate no relationship at all, while a value of 1 would indicate that the assessment of the final project was predetermined given the assessment of the module. The vertical ordering reflects the strength of the evidence for or against influence, from strong evidence of influence at the top to strong evidence against influence at the bottom.} \label{f:associations} \end{figure}

We used Gibbs sampling, as implemented in JAGS (Plummer, 2003), to estimate model parameters (median and precision parameters for per-student mark distributions, and coupling parameter $C$) for each level 5 module in turn; the results are displayed in figure \ref{f:associations}. The per-student estimates for median and precision parameters were discarded, and only the coupling parameter $C$ was considered.

Taken at face value, the results show a very strong association between IS52023A and the final project, and a somewhat strong association between IS52020A and the final project. There is also strong evidence against any association between either IS52016A or IS52014B and the final project, and (broadly) no strong evidence either way for any of the other modules.

IS52023A was a module in “Creative Projects”, which was a second-year group project module taken by BSc Creative Computing students. It is therefore perhaps unsurprising that its marks are related to the final project mark, though it is surprising that the association is so strong – particularly since the analogous module taken by BSc Computer Science and BSc Business Computing students, IS52018B “Software Projects”, with broadly the same learning outcomes but different delivery did not show strong evidence for association. In this (historical) case, we can conjecture that the fact that the lecturer and lead marker for the Creative Projects module was also heavily involved in supervising and marking many of the Creative Computing students’ final-year projects might explain the unexpectedly strong consistency in the marking.

IS52020A “Creative Computing II” was the signature module for Creative Computing students. The evidence of an association between marks in that module and in the final project suggests that, at least to some extent, the Creative Computing students are applying material that they have learnt. This is in stark contrast to IS52014B “Problem Solving using Creative Programming”, which was a module taken only by BSc Computer Science students: if there was a signature module for BSc Computer Science, this was it.

These findings are somewhat troubling, as they indicate that (at least as of the time in question) there were diametrically opposed practices or outcomes related to the final project in two of our undergraduate degree programmes. One interpretation of the results is that students in BSc Creative Computing tend to integrate the core part of their sub-discipline into their final project, whereas students in BSc Computer Science actively avoid the core part of their discipline when executing their project^2.

^2 … the department’s programmes, and anecdotally this rings true: BSc Computer Science students tend to produce projects which could be characterized as in the subdisciplines (QAA, 2015) of Software Engineering or Information Systems.

The difference in student project approach could be explained by a number of factors: the BSc in Creative Computing is a programme attempting to exploit Goldsmiths’ strengths, both in the department and reputationally. It has a small cohort, with (historically) more stringent requirements on entry, enabling more focused teaching, and does not have very many competitor programmes in the sector. By contrast, the BSc in Computer Science is more of a generic programme, with almost every other HE institution offering a comparable programme; its entry requirements have historically been less stringent, and the spread of interests and ambition among students enrolled is wider, needing more varied project supervision provision. Additionally, Computing is a subject with both vocational and traditionally-academic aspects; Computer Science is the programme likely to have the strongest academic emphasis, but this may be at odds with the aspirations and expectations of the student body.

The results and their interpretation in this essay need to be confirmed. We have repeated the analysis for students one year behind (level 5 modules in 2010-11 and final project in 2011-12), finding broadly similar results. The modular structure changed substantially subsequent to the 2011-12 session, meaning that the analysis presented in this essay could no longer be straightforwardly applied; it would be of interest to see whether the change in the module structure, or any other changes to departmental curriculum delivery, have affected the associations between modules and the final project on particular programmes.

If the evidence suggesting a divergence of status of the final project among our degree programmes persists, we propose firstly that this evidence be discussed among the department, to verify that the evidence is broadly in accord with colleagues’ perceptions, and also to determine whether it is in fact a problem. If the general feeling in the department is that the final project should have more of an association with the core knowledge and skills of a particular sub-discipline, then it will be incumbent on programme leaders to specify more tightly and to brief project supervisors thoroughly on the expected project content and learning outcomes, so that any structure that they impose on the students’ projects is appropriate for that particular student’s programme of study.

References

Fry, H, Ketteridge, S and Marshall, S (eds.) (2008) ‘A Handbook for Teaching and Learning in Higher Education’. Routledge

Plummer, M (2003) ‘JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling’. Proc. Distributed Statistical Computing, Vienna, Austria

QAA (2015) ‘Subject Benchmark Statement: Computing (draft for consultation)’

Spearman, C E (1904) ‘“General Intelligence,” Objectively Determined and Measured’, American Journal of Psychology 15, 201–293

Wing, J M (2006) ‘Computational Thinking’, Communications of the ACM 49(3)

Appendix: Modelling

Logistic links and transformations

It will be helpful in modelling assessment scores to convert between a compactly-supported distribution (on the range of possible scores, which we will take as being between 0 and 1) and a distribution whose support is the real line. The natural transformation for this, which can be theoretically motivated by considering the assessment scores as a collection of Bernoulli trials, is the logistic (logit, or log-odds) transformation

\[ x = \log\left(\frac{s}{1-s}\right) \]

which maps scores $s$ (expressed as a fraction of the maximum achievable) to a real number.

Marking styles

Although some effort is made, by the use of double-marking and by consideration at Examination Boards, to assure a broadly consistent view of learning outcomes and attainment descriptors, there is a substantial subjectivity in assigning marks. We attempt to correct for this by subtracting the mean logistically-transformed mark for each module $j$ from each of the individual logistically-transformed marks before modelling:

\[ y^j_i = x^j_i - \bar{x}^j \]
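The two steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis: the function names, the student and module labels and the marks are all hypothetical, and marks are assumed to be percentages strictly between 0 and 100 (the transformation is undefined at the endpoints, and the text does not say how such marks were treated).

```python
import math
from collections import defaultdict


def logit(score):
    """Map a score in (0, 1) to the real line; fails for scores of exactly 0 or 1."""
    return math.log(score / (1.0 - score))


def centre_by_module(marks):
    """Transform and mean-centre marks per module.

    marks: iterable of (student, module, percent) tuples.
    Returns a dict mapping (student, module) to the centred value y.
    """
    # Step 1: logit-transform every mark, expressed as a fraction of the maximum.
    transformed = {(s, m): logit(p / 100.0) for s, m, p in marks}

    # Step 2: compute the mean transformed mark of each module.
    by_module = defaultdict(list)
    for (_, m), x in transformed.items():
        by_module[m].append(x)
    means = {m: sum(xs) / len(xs) for m, xs in by_module.items()}

    # Step 3: subtract the module mean from each transformed mark.
    return {(s, m): x - means[m] for (s, m), x in transformed.items()}


# Hypothetical marks for two students in two made-up modules.
marks = [("s1", "M1", 50), ("s2", "M1", 70), ("s1", "M2", 60), ("s2", "M2", 40)]
centred = centre_by_module(marks)
```

Within each hypothetical module the centred values sum to zero, so a uniformly generous or severe marker shifts no student relative to their peers.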

Aptitude

For a student $i$, we assume that the mean-corrected logistic transform of their score in level 5 module $j$ is drawn from a Cauchy distribution with median $\mu_i$ and precision $\tau_i$:

\[ y^j_i \sim t_1(\mu_i, \tau_i). \]

Note that under this assumption there is no dependence of the parameters of the Cauchy distribution on the particular module $j$.
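For concreteness, and assuming the precision parametrization that JAGS uses for its Student-$t$ distribution (an assumption here, since the text does not state the convention), the scale of the distribution is $\tau_i^{-1/2}$ and the density is

\[ p(y^j_i \mid \mu_i, \tau_i) = \frac{\sqrt{\tau_i}}{\pi\left(1 + \tau_i\,(y^j_i - \mu_i)^2\right)}. \]

Under this convention half of a student’s transformed marks are expected to fall within $\tau_i^{-1/2}$ of the median $\mu_i$; the mean and variance of the distribution are undefined, which is why the model is described in terms of a median and a precision rather than a mean and a variance.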

Causal influence

How can we measure the causal influence of one course on another? This too can only partially be addressed in this study – the standard way of investigating the effect of a module might be to take cohorts of students, and randomly assign some fraction of them to take the module and the rest not; this is impractical in general for undergraduate tuition. We must therefore make another simplifying assumption, and model the influence of module $j$ on the results of a different module $k$ by saying that the result from module $k$ is a weighted combination of the result from module $j$ and a random draw from a narrower distribution. Specifically, when considering an association between the final project mark and a given level 5 module, we say that

\begin{eqnarray}
y^k_i & \sim & Cy^j_i + (1-C)\,t_1(\mu_i, \tau_i)\\
      & \sim & t_1\left(Cy^j_i + (1-C)\mu_i, \frac{\tau_i}{(1-C)^2}\right)
\end{eqnarray}

Note that a value of 0 for $C$ in this expression causes the distribution of $y^k_i$ to reduce to $t_1(\mu_i, \tau_i)$ – in other words, $C = 0$ indicates no influence at all. A value of 1 for $C$ indicates that the project mark is fully determined by the module mark: the precision $\tau_i/(1-C)^2$ is infinite in that case, so there is no dispersion around the central point $y^j_i$. Because a model with this coupling reduces to one without it when $C = 0$, we can compare models including coupling to no-coupling models using standard measures of goodness-of-fit.
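As a rough illustration of this nesting, the sketch below evaluates the coupled density in Python and recovers $C$ from synthetic marks by maximising the likelihood over a grid. It is not the analysis reported in the essay, which estimated $\mu_i$, $\tau_i$ and $C$ jointly by Gibbs sampling in JAGS: here each student’s $\mu_i$ and $\tau_i$ are treated as known, the scale of $t_1$ is assumed to be $\tau^{-1/2}$, and all function names, the cohort size and the parameter values are invented for the example.

```python
import math
import random


def cauchy_logpdf(y, mu, tau):
    """Log-density of t_1(mu, tau): Cauchy with median mu and scale tau ** -0.5."""
    return 0.5 * math.log(tau) - math.log(math.pi) - math.log1p(tau * (y - mu) ** 2)


def coupled_logpdf(y_k, y_j, mu, tau, c):
    """Log-density of project mark y_k given module mark y_j (requires c < 1)."""
    location = c * y_j + (1.0 - c) * mu
    precision = tau / (1.0 - c) ** 2
    return cauchy_logpdf(y_k, location, precision)


def cauchy_draw(rng, mu, tau):
    """Inverse-CDF draw from t_1(mu, tau)."""
    return mu + math.tan(math.pi * (rng.random() - 0.5)) / math.sqrt(tau)


def simulate(rng, n_students, c_true, tau=4.0):
    """Synthetic (y_k, y_j, mu, tau) rows; every value here is made up."""
    rows = []
    for _ in range(n_students):
        mu = rng.gauss(0.0, 1.0)              # hypothetical per-student median
        y_j = cauchy_draw(rng, mu, tau)       # level 5 module mark
        y_k = c_true * y_j + (1.0 - c_true) * cauchy_draw(rng, mu, tau)  # project
        rows.append((y_k, y_j, mu, tau))
    return rows


def log_likelihood(rows, c):
    """Total log-likelihood of the project marks for a given coupling c."""
    return sum(coupled_logpdf(y_k, y_j, mu, tau, c) for y_k, y_j, mu, tau in rows)


rng = random.Random(0)
rows = simulate(rng, n_students=500, c_true=0.6)

# Grid search over c in [0, 0.99]; c = 1 is excluded because the precision diverges.
grid = [i / 100 for i in range(100)]
c_hat = max(grid, key=lambda c: log_likelihood(rows, c))
```

Comparing `log_likelihood(rows, c_hat)` with `log_likelihood(rows, 0.0)` gives the simplest likelihood-based comparison between the coupled and uncoupled models on the same data.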