Social Music Network Analysis
The project is to study user behaviour on music-related social networks, and to see whether any useful information can be extracted from it. As an example, consider two sites: MusicBrainz, a site where interested parties input, edit and collate metadata about audio recordings, and RapGenius, where users input, analyse and discuss the lyrics of hip-hop and rock songs.
We'd be interested in learning whether it's possible to
- automatically “align” (link) the same works between the two social networks;
- use editor behaviour to identify clusters of related works (maybe related to “genre”, or maybe some more specific concept), by looking at groups of works;
- compare any clusters found between the two social networks, to see if similar clusters are identified in both cases.
The alignment task will teach the student about describing and publishing data using Semantic Web technologies. The social network analysis will involve a limited amount of programming in Python, along with a significant amount of data visualization and exploration using statistical programming environments such as R.
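As a first feel for the alignment task, the sketch below links works between two catalogues by normalising artist and title strings and comparing them with a fuzzy string ratio. It is only an illustration under stated assumptions: the ids (`mb:1`, `rg:9`), the `(artist, title)` record shape, the function names and the 0.85 threshold are all hypothetical, and it does not use either site's actual API or data model.

```python
import difflib
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(kept.split())


def align(works_a, works_b, threshold=0.85):
    """Link each work in works_a to its most similar work in works_b.

    Both arguments map a site-specific id to an (artist, title) pair.
    Returns a list of (id_a, id_b, score); works whose best score falls
    below the (assumed) threshold are left unlinked.
    """
    keys_b = {
        id_b: normalise(f"{artist} {title}")
        for id_b, (artist, title) in works_b.items()
    }
    links = []
    for id_a, (artist, title) in works_a.items():
        key_a = normalise(f"{artist} {title}")
        best_id, best_score = None, 0.0
        for id_b, key_b in keys_b.items():
            score = difflib.SequenceMatcher(None, key_a, key_b).ratio()
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links.append((id_a, best_id, best_score))
    return links


if __name__ == "__main__":
    # Made-up ids and records, for illustration only.
    site_a = {"mb:1": ("Beyoncé", "Halo"), "mb:2": ("Nas", "N.Y. State of Mind")}
    site_b = {"rg:9": ("Beyonce", "Halo"), "rg:7": ("Nas", "NY State of Mind")}
    for id_a, id_b, score in align(site_a, site_b):
        print(id_a, "->", id_b, round(score, 3))
```

Comparing every pair is quadratic in the number of works, which is acceptable for exploration but would need an index or blocking step for full catalogues.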
The project is to work on the server-side of a social network for interacting with and commenting on collections of recorded music. Specifically, there is work to do to help implement a REST API designed in conjunction with the PRAISE project, and to validate it by designing a suite of tests and their expected results, automatically generating reports for developers to see the impact of their changes.
Some or all of the following:
- contributions to the implementation of the agreed API design;
- a documented set of manual tests to run to verify that portions of the API are implemented correctly;
- an automated test suite (using a ‘headless’ web browser such as PhantomJS) exercising the majority of the API implementation, generating a report of successes, failures, and time taken when run;
- deployment of a continuous integration platform (such as Jenkins) to run the test suite at regular intervals or on every version control system commit.
The work on the backend itself will teach the student about server-side web programming and advanced uses of web server technology. Developing a testing framework and suite of tests will teach the student about good software engineering practice, along with automation tools such as Jenkins and web client technology such as PhantomJS.
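The reporting side of such a test suite can be sketched independently of the browser or HTTP client used. The following is a minimal illustration, assuming each test is just a method, a path and an expected status code. The `Case` and `Result` types, the routes (`/recordings`, `/comments`) and the `fake_api` stand-in are all hypothetical and are not taken from the actual API design.

```python
import time
from dataclasses import dataclass


@dataclass
class Case:
    name: str
    method: str
    path: str
    expected_status: int


@dataclass
class Result:
    name: str
    passed: bool
    seconds: float
    detail: str


def run_suite(cases, call):
    """Run every case through call(method, path) -> status code, timing each."""
    results = []
    for case in cases:
        start = time.perf_counter()
        try:
            status = call(case.method, case.path)
            passed = status == case.expected_status
            detail = f"expected {case.expected_status}, got {status}"
        except Exception as exc:  # a crashing request counts as a failure
            passed, detail = False, f"error: {exc!r}"
        elapsed = time.perf_counter() - start
        results.append(Result(case.name, passed, elapsed, detail))
    return results


def format_report(results):
    """Render one line per test plus a summary of successes, failures and time."""
    lines = [
        f"{'PASS' if r.passed else 'FAIL'}  {r.name}  ({r.seconds:.3f}s)  {r.detail}"
        for r in results
    ]
    failures = sum(not r.passed for r in results)
    total = sum(r.seconds for r in results)
    lines.append(f"{len(results) - failures} passed, {failures} failed in {total:.3f}s")
    return "\n".join(lines)


def fake_api(method, path):
    """Stand-in for the real server: hypothetical routes, for illustration only."""
    routes = {("GET", "/recordings"): 200, ("POST", "/comments"): 201}
    return routes.get((method, path), 404)


if __name__ == "__main__":
    suite = [
        Case("list recordings", "GET", "/recordings", 200),
        Case("create comment", "POST", "/comments", 201),
        Case("delete recording", "DELETE", "/recordings", 204),
    ]
    print(format_report(run_suite(suite, fake_api)))
```

Passing the client in as `call` keeps the runner testable without a network. In a real deployment it would wrap whichever HTTP client or headless browser the project settles on, and the report would be what the continuous integration job publishes.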
Music and Image Similarity Search
The project is to take a prototype implementation of a tool to search for similar images within a database, turn it into slightly more of a product, and test it on a variety of images from digitized collections of early music manuscripts, such as those from Early Music Online.
- an implementation capable of detecting near-duplicate images within a small collection (around 100 images);
- automatically generated test cases for this implementation, with known near-duplicates (and known non-duplicates) to validate the implementation;
- prototype work towards an implementation capable of scaling to a larger number of images, using Locality Sensitive Hashing.
The student will learn about information retrieval, image feature extraction, and probabilistic data structures; the implementation work can be carried out in the programming language of the student's choice.
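To make the scaling idea concrete, here is a minimal sketch of near-duplicate detection with a banding form of Locality Sensitive Hashing over binary fingerprints. It makes several assumptions: the existing prototype's features are not described here, so a simple average hash stands in for them; images are represented as small greyscale grids (lists of rows of 0-255 integers) already downsampled to 8×8; and the function names and thresholds are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def average_hash(pixels):
    """Fingerprint a small greyscale image: one bit per pixel, set when the
    pixel is brighter than the image mean. (A stand-in feature, assumed.)"""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_pairs(hashes, bands=8):
    """Split each fingerprint into bands; images that agree exactly on any
    one band land in the same bucket and become a candidate pair.
    Assumes the fingerprint length is divisible by the number of bands."""
    buckets = defaultdict(set)
    for name, bits in hashes.items():
        width = len(bits) // bands
        for i in range(bands):
            buckets[(i, bits[i * width:(i + 1) * width])].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


def near_duplicates(images, max_distance=6, bands=8):
    """Return sorted pairs of image names whose fingerprints are close."""
    hashes = {name: average_hash(px) for name, px in images.items()}
    return sorted(
        (a, b)
        for a, b in candidate_pairs(hashes, bands)
        if hamming(hashes[a], hashes[b]) <= max_distance
    )
```

Only images sharing a bucket are compared, so the expensive pairwise check runs on a small fraction of pairs. Because two fingerprints differing in fewer bits than there are bands must agree on at least one whole band, choosing `max_distance` below `bands` means no qualifying pair is missed.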
Ethnographic Musical Instrument
The project is to design and implement a prototype musical instrument on mobile devices. Working in conjunction with an anthropologist and musician, the student should produce an instrument capable of being used in improvisation contexts, while also collecting data about its use and its context for later offline analysis.
- a mobile application, targeting iOS and Android, implementing a design made in collaboration with a musician and anthropologist;
- documentation (including video documentation) on the use of the application;
- preliminary data analysis on the use of the application in musical contexts.
The student will be exposed to ideas from human-centric study and user interface design, and will work on the production and documentation of a mobile application with high-quality sound production (and possibly sound recording and analysis too).
Industrial Partnership Work
TCP Congestion Control performance
The project is to implement simulations of TCP congestion control algorithms in the presence of faults associated with mobile networks, and to study the performance that TCP achieves under those kinds of faults.
- simulations of TCP in mobile networks, whether through network simulators such as ns2 or a set of virtual machines networked together through Linux user-space networking devices such as tun/tap and virtual bridges;
- methods for injecting faults characteristic of mobile networks into those simulations;
- data on the performance of a variety of current TCP congestion control algorithms in the presence of these faults.
The student will learn about the details of TCP and mobile networking, along with how to simulate and implement virtual networks. The data on congestion control performance would be of interest to an industrial partner.
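Before committing to ns2 or a virtual network, the interaction between congestion control and fault patterns can be explored with a toy model. The sketch below is a deliberately simplified, per-round-trip approximation of Reno-style window growth (slow start, additive increase, halving on loss) with pluggable fault generators. The function names, the two-state bursty loss model and all parameter values are assumptions for illustration, not a faithful TCP implementation or a validated model of any mobile network.

```python
import random


def bernoulli_faults(rng, loss_prob):
    """Independent losses: each round is lost with probability loss_prob."""
    while True:
        yield rng.random() < loss_prob


def bursty_faults(rng, p_enter, p_leave):
    """Two-state model: once in the 'bad' state (e.g. a fade or handover),
    rounds keep being lost until the state is left with probability p_leave."""
    bad = False
    while True:
        bad = (rng.random() >= p_leave) if bad else (rng.random() < p_enter)
        yield bad


def simulate(rounds, faults, ssthresh=64.0):
    """Return the number of segments delivered over the given number of rounds."""
    cwnd, delivered = 1.0, 0
    for _, lost in zip(range(rounds), faults):
        if lost:
            ssthresh = max(cwnd / 2.0, 2.0)
            cwnd = ssthresh                       # multiplicative decrease
        else:
            delivered += int(cwnd)
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2.0, ssthresh)  # slow start
            else:
                cwnd += 1.0                       # additive increase
    return delivered


if __name__ == "__main__":
    # Same long-run loss rate is not guaranteed here; values are illustrative.
    rng = random.Random(1)
    print("independent:", simulate(1000, bernoulli_faults(rng, 0.02)))
    print("bursty:     ", simulate(1000, bursty_faults(rng, 0.005, 0.25)))
```

Keeping the fault source separate from the window logic mirrors the deliverables above: new fault patterns or alternative congestion control rules can be swapped in without touching the other half.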
Lossy image compression and perception
This project is to investigate the perceptual effects of lossy JPEG encoding, and to assess whether it is possible to determine from an encoded JPEG file whether further lossy compression could be applied to the image without introducing unacceptable artifacts.
- a report on a systematic investigation of the perception of encoding artifacts in lossily-compressed image files;
- an implementation of a tool which can predict with a certain amount of accuracy the degree to which a given JPEG file contains noticeable compression artifacts;
- (ideally) a custom implementation of a JPEG encoder, able to further compress images to a given baseline acceptability level.
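One simple, non-perceptual cue that such a tool might start from is blockiness: JPEG encodes images in 8×8 blocks, so heavy compression tends to leave larger intensity jumps at block boundaries than inside blocks. The sketch below computes a crude ratio of the two on an already-decoded greyscale image (a list of rows of integers). The function name, the `+ 1.0` smoothing and the interpretation of the score are assumptions; relating the score to what viewers actually notice is exactly what the investigation would need to establish.

```python
def blockiness(pixels, block=8):
    """Ratio of the mean intensity jump across block boundaries to the mean
    jump between neighbouring pixels inside blocks.

    Values near 1 suggest no visible block structure; larger values suggest
    stronger blocking. Assumes the image is larger than one block in both
    dimensions and that the block grid starts at the top-left corner.
    """
    height, width = len(pixels), len(pixels[0])
    if height <= block or width <= block:
        raise ValueError("image must be larger than one block")
    edge, inner = [], []
    # Horizontal neighbours: a boundary lies between columns 7|8, 15|16, ...
    for y in range(height):
        for x in range(width - 1):
            diff = abs(pixels[y][x] - pixels[y][x + 1])
            (edge if (x + 1) % block == 0 else inner).append(diff)
    # Vertical neighbours: a boundary lies between rows 7|8, 15|16, ...
    for y in range(height - 1):
        for x in range(width):
            diff = abs(pixels[y][x] - pixels[y + 1][x])
            (edge if (y + 1) % block == 0 else inner).append(diff)
    mean_edge = sum(edge) / len(edge)
    mean_inner = sum(inner) / len(inner)
    # The +1.0 terms avoid division by zero on perfectly flat regions.
    return (mean_edge + 1.0) / (mean_inner + 1.0)
```

A score like this could serve as one input feature among several, to be compared against the perceptual judgements gathered in the systematic investigation.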
Steel Bank Common Lisp / Google Summer of Code
SBCL is a mentoring organization in Google's Summer of Code programme, funding students to work on open source projects for the summer. There are a number of ideas listed on the page the SBCL project has created, which could form the basis of a solid student project even in the absence of funding (for which the application deadline is Friday 21st March).
One project not listed on the SBCL ideas page is to provide an implementation in C++ of as many as possible of the Virtual Operations making up the lowest-level intermediate representation. This would have the benefit of providing an initial (slow) implementation on platforms not currently supported by SBCL, such as ARM, with minimal architecture-specific code needing to be written.
- incremental modification of the x86 or x86-64 backends, replacing as many as possible of the custom-written assembly fragments with calls out to C++ routines implementing the same functionality;
- verification that the modification of the backend is portable, by running it on a distinct platform, such as powerpc or sparc;
- (ideally) using it as the basis of a completely distinct port, for example on ARM.
The student would learn about the architecture of a production compiler and runtime system, and about the details of calling conventions and assembly language. The student will also learn about the peculiarities of incremental modification of compiler processes.
Flexible (pseudo)random number generation
See the entry here