Student Projects

Below are some ideas that I think would make good student projects. Most could be adapted for either Part I group or Part II individual projects, or as summer projects for internships. I tend to have a focus on practical software development with a usable piece of software being the end goal. Several projects are programs that I want to use in teaching or research but don't have the time to write myself. Where possible, I try to tie in Open Source or Free Software to increase the longevity of the work to more than an abandoned year's worth of "homework". Projects often become a major part of your early CV so anything that makes them look more realistic to an employer or graduate admissions is a plus.

Some of the projects below may have already been done or are in the process of being done; if so they will be marked "Completed" or "In progress". They have been left up both so you can see the scope of a typical student project and so you could potentially propose to do an extension to one of them. Be aware, however, that any extension would have to be significant enough work to count as a project by itself.

I am happy to hear proposals from students on variations of the topics below, or pretty much anything connected to programming languages, games, or making contributions to existing Free or Open software. Please contact me via email so we can discuss it or arrange a meeting; do not submit a project form without talking to me.

NOBA: Ningbo Online Battle Arena (a MOBA game)

Type: Individual/Group
Background: Some familiarity with concepts in MOBA games.

NOBA is an ongoing umbrella project I am running to develop an open source, modular MOBA game with data logging for research. Last year, a group project started off by creating the basic client/server infrastructure needed for the game. There are multiple possible projects to expand that into a proper game below. These will require you to learn an existing code-base before you start and properly document your work for students coming after you. Not that if more than one of these is being done you do NOT need to coordinate with each-other; they are run as separate projects and should be all your own work.

Dylan programming language

Type: Individual
Background: An interest in programming languages and self-motivation.

I have a long-standing interest in the Dylan programming language. When I was in my 2nd year of university I did a tour of all the programming languages I could find and decided that I liked Dylan the most. Unfortunately, the active Dylan community is quite small and developing a compiler and surrounding ecosystem is a huge and constant amount of work. Pick something from the OpenDylan list of proposed work, check with me that it is big enough for a project, then do it. This is extremely likely to require you to learn the Dylan programming language and maybe more depending on the type of project (eg, LLVM internals or machine-generated C code for compiler-related projects).

Working on a Dylan-related project will require you interacting with the community (IRC, bugtrackers) to check you are not duplicating effort and that what you are doing will be suitable to integrate into the compiler/runtime/etc. Upsetting people or expecting them to do your work for you will be looked on very negatively!

Using program AST similarity for suggesting identifier names

One metric for comparing how similar two functions are to each other is to build the Abstract Syntax Tree for each of them and in some way compare the two trees. If the two functions are identical then the trees will be identical. If the functions are identical except for variable names then the trees will be the same shape but with different labels. Given the large amount of publicly available code nowadays, this can be used to explore a few aspects of software engineering. One particular aspect I wish to explore is automatic identifier naming.

Consider a function in which the function name and all the variable names inside it are given as placeholders (eg, fn, var1, var2, etc). Is it then possible to explore a database of all ASTs (built from many different projects) to find those which are similar to that function's AST and suggest names for those identifiers? How useful/accurate would this be? The underlying assumption is that, over a large number of programs, the same or very similar functions are written many times. Further, it is assumed that people will name functions and variables in a "useful" way. Therefore, identical or very similar ASTs may provide a good source of naming suggestions.

This is interesting by itself, but my interest is motivated by an actual use-case for the suggestions. Writing programs on mobile devices (eg, smartphones) has been very awkward because virtual keyboards are not very nice to use either for extended periods of time, or for text involving lots of different kinds of characters. While other programming methods exist, such as different levels of drag-and-drop programming and templates, names for things still need to typed in. This slows down the speed of interaction with the programming system. The system described above would allow a function to be built up with placeholder names which are automatically updated with suggested names as the AST is constructed and matches known ASTs in the database. If effective, this may result in the user not having to override the suggested names at all and therefore avoid having to use the virtual keyboard. If they do override it, then that newly-named AST can be added to the database for the future too.

You will need to write a program which can parse and create an AST for at least one popular programming language (eg, Java or C). You will need to explore existing tools to see if they can help you in this task (eg, ANTLR, LLVM/clang). Once you have an AST builder then you will need to come up with a metric for scoring tree similarity and use that to design an appropriate storage method for huge numbers of AST and identifier names. Then you will need to crawl large open-source repositories like GitHub to download large numbers of real projects using your target language. These will then be processed into the forms required for the data-store, no doubt encountering unexpected problems due to the real-world nature of the programs being processed. The final program will then take an input program with placeholder identifiers of some form and generate naming suggestions for it. Finally, you will need to evaluate the usefulness of these suggestions in some way (yourself or surveying students?).

Depending on how you view this project, it could fall under "Big Data", Software Engineering, or program analysis. I am interested in researching around this (hence the project) so any good results could lead to a jointly authored paper. Similar AST concepts have been used in plagiarism detection for programs, which may be useful for initial research.

ExPo2: High-speed bot interfaces

ExPo2 is a stock exchange simulator created by John Cartlidge, Paul Dempster, and others, which allows humans and automated trading programs (bots) to trade stocks on a simulated stock market. The stock market and the bots can be set up in different ways to allow researchers to run experiments on how markets function in different conditions. Technically, the system is written in Java and consists of two different servers which run together and two different clients (one web-based for humans, one using Java for bots). The clients and servers communicate using WebSockets, since this is the only bi-directional communication technology available in most web browsers.

The bots use the same WebSocket communications protocol as the human web-page client. This works well and meant that only one communication system had to be implemented. The performance is also very good, giving sub-10 millisecond response time on that same machine without any serious optimisation. Going below 1ms, however, won't be possible with WebSockets. Specialised high-frequency trading equipment used in actual companies can have microsecond response times, so we would like to support that as a possibility in ExPo2. To do so, we would like to add two new communication methods between bot clients and the servers - the first over the network using ZeroMQ, the second on the same machine using an in-process communications API.

ZeroMQ is a very high-speed messaging protocol. ZeroMQ libraries are available for most programming languages, including Java which ExPo2 is written in. We couldn't use ZeroMQ for the human clients, because web browsers do not support it, but we can use it for bots since they can be written using whatever languages and libraries the programmer wishes. You will have to integrate ZeroMQ into the communications system of the server so that clients can connect with either ZeroMQ or WebSockets without the rest of the server caring. You will also need to write a client-side library that makes it easy to write a bot which uses ZeroMQ. The client-side language(s) can be anything with a ZeroMQ library; the most interesting for ExPo2 are Java, C, and Haskell.

As part of finding out how fast we can get bots to run, we need to find the upper limit on speed. For this, we also want to include an in-process API based communication method for the bots. This means the bot is running as part of the server and it's "communications" is really just method calls into the communication system which then act similarly to the WebSockets and ZeroMQ protocols. As above, you will need to integrate this into the existing communications system and write a Java package that presents an easy-to-use API to the bot writer.

For both communications methods, you will also need to write example bots that act of tutorial examples for other developers who want to learn how to use either of the communication methods.

User Interfaces for mobile programming

Type: Individual
Background: An interest in programming languages, Android programming.

Traditional programming interfaces are heavily dependent on keyboards and large monitors which can display a lot of information at once. The vast majority of programming languages are text-based and usually use an extensive array of punctuation characters. This makes a straight port of a typical programming user interface to mobile devices rather difficult. Text entry in particular has many problems, for example, commonly required punctuation that isn't available on mobile keyboards or require scrolling; difficulty to accurately type each character since auto-correct is typically not useful to programming texts; and moving the cursor to a different position is slow and inaccurate.

This project is to compare 3 different mobile programming user interfaces. The first interface is a traditional desktop-style interface consisting of a text window and the mobile device's default keyboard. This is the baseline for comparison. The second interface replaces the default keyboard with a "macro" keyboard which allows language-sensitive words/structures to be inserted in a single "keypress". The text editor window remains as a normal character-based window. The third interface replaces both the keyboard and the text editor window. The keyboard is similar to the second interface with language-sensitive keys. The editor window is replaced by an editor that visually appears to be a character-based editor, but if the user taps on the editor, it interacts with the entire language "block" rather than individual characters. Interacting with the editor character-by-character is not possible.

This project will implement these three interfaces on an Android device for programming Java. Since the actual development is not the focus of the project, compiling the programs will be done by sending the source code to a remote server and receiving the compiler output in response. Once all three interfaces have been completed, the project will then run a study to evaluate the effectiveness of the different interfaces when used by beginner-to-intermediate level programmers. The results of the project will be the three working Android programs and an evaluation of the results of the study.

Memory diagrams: automatic generation from C programs

A previous student project developed a JavaScript-based system to show memory diagrams - visualisations of how memory is used in C programs. This system used a JSON-based format to specify what should be shown in the diagrams for each step of the presentation. This format has to be written by hand by the person wanting to show a diagram. While this allows customisation, to only show the bits of the program activity that the presenter is interested in, it also makes it difficult to use this system quickly.

This project is to automatically generate such diagrams directly from the C source code of short programs. To do so will require you to write a program that is aware of the rules of C and how programs execute, and then be able to generate the appropriate JSON-format steps for a given program. Since the rules of C program compilation and execution are complex, possible solutions may involve using an existing compiler such as LLVM to analyse the source code and generate the JSON, or annotate the program to generate extra output when the program is run which is post-processed to generate the diagrams.

Note that you should already have some interest/idea of what this source code analysis might involve at the beginning of the project - waiting until you have finished the semester 1 compilers module will be too late to do the project.

VR-based social sports watching

Consumer VR has rapidly grown in popularity in the last few years with the Occulus Rift Kickstarter (and subsequent Facebook purchase), Google Cardboard and Samsung Gear adapters for smartphones, and the HTC Vive room-scale VR system. As with any technology, having a "killer app" is required to encourage widescale adoption. This is especially true for VR, which has attempted to break into the consumer space several times in the last 30 years. While games are typically seen as the primary potential killer app, there may be another possibility with "social VR". This project is to implement and evaluate one possible social VR experience - watching sports.

The XBox One launch heavily promoted the social aspect of XBox One. This was done by showing the XBox One owner watching American Football alone at home, but with Skype video-conference videos of his friends along the side of the screen. They were all doing the same at home, so that they were watching the same match and able to talk about it in real-time with their friends also watching it. This was promoted as a "social experience", although it was as clumsy as you would expect from a Skype conference call. This project suggests an alternative.

Imagine you want to watch the match with your friends but you are all at home. Instead of using TV and Skype, you put on your VR headset. Inside the VR world, you are in the stadium where the match is being held. You can look around and see your friends are also sitting beside you. Your friends are shown as 3D models that replicate their movements, who are also connected using the VR system. Your group can now all watch the game together, looking around and chatting in a more embodied manner than with video conference calls. You can even use controller-tracking technology to make beer glasses that are tracked so you can "cheers!" in the real and virtual world!

You actually watch the game in one of two ways. The first, and most basic, would be just to project the TV coverage onto a virtual screen where the pitch should be in the stadium. This would be less immersive but relatively easy to program. The second, more advanced, method would be to virtually recreate the game in real-time, using a game engine for that sport. For example, for American Football, there would be a team of "programmer/actors" who would watch the TV coverage and recreate, in real-time, what is happening on the pitch using the avatars in the game. Such recreations are technically possible, although they have only been shown as technology demonstrations by sports games companies. Since this would be a 3D recreation, it would open up all sorts of possibilities such being able to walk down and watch the game actually on the pitch! Non-sports could also be viewed with the same technology, such as eSports (in which case the recreations could be automatic from the game code) or political events (in which case not much movement will happen anyway).

This project would implement the minimal viable prototype of this idea for evaluation. You will need to implement a VR world to which multiple "viewers" can connect. Each viewer should be able to see the avatar of the others and the user should have the ability to perform basic movements of the kind that would happen while viewing sports (stand up, sit down, wave hands, etc). Depending on the VR technology used these may be artificial or generated from actual movements (eg, with HTC Vive controller tracking). To do this, you will need to use an existing 3D engine technology with VR support and use that to create the 3D world and support the networking code between different users computers. Finally, you will also need to be able to display video in the virtual world (the basic viewing method described above) so that the users have something to watch. Once all this is implemented, you then need to perform an evaluation study to see how people feel about using the system.

Note that this project is dependent on getting access to the appropriate VR equipment, which is still being confirmed. If access is not possible, you will have to do a different project. I'll know if it is possible before the project deadline.

Daifugo multi-player card game

Type: Individual
Background: Interest in Android programming if choosing to do a mobile implementation.

Daifugo is a Japanese four-player card game in the same genre as Rummy and Landlords and Peasants. Each player is dealt a hand of cards and the objective is to be the first player to get rid of all of their cards. Turns rotate between players, giving them the opportunity to remove some cards from their hand if they are related to the last removed cards by certain rules. These rules are typically to do with discarding the same number of matching cards of higher or lower value, or consecutive sequences of the same number of cards. There are also a few combinations of cards that change the rules themselves.

This project is to implement a multi-player version of the Daifugo. This will involve writing a client and a server. You will also need to implement some way of having a single-player game, perhaps by writing bots to run as computer players. The platform (desktop, web, or mobile) can be decided on during the project.

Memory Diagrams [Completed]

Type: Individual
Background: JavaScript, HTML/DOM, git

The goal of this project is to create a Javascript library that can create memory diagrams of stack (primarily) and heap, of the sort used in the PRG module. These diagrams are used to help explain the stack and heap, how variables and pointers are stored in memory, and how function calls and scope are related to the stack. These diagrams will be made up of a number of steps and the software should be able to move forward and back through these steps. The format and features of the diagrams and their description have already been (mostly) decided so you need to take the partial designs and specifications and implement them, although there may be opportunities for modifications if appropriate. The completed Javascript library will be used in teaching the C programming module next semester so working software is the most important deliverable of this project. The library will be open-source licensed (either BSD or GPL, to be decided later) and publicly available so there will be the opportunity to keep working on it if you wish to do so.

Since the library will be in Javascript, and having a working version is the main objective, you should have a good understanding of Javascript and HTML DOM prior to this project. You will also have to use the "git" version control system. Prior knowledge of git would be beneficial but, if not, you can learn it at the beginning of the project. You should be confident working independently since, although this is a supervised project, most of the implementation decisions will be decided by you.

The library should take a JSON description of steps in the diagram and display the current step as part of an HTML page. This should be done using HTML DOM elements or SVG, avoiding using Canvas elements if possible. The user can then step through each stage of the diagram as they wish. You do not need to implement editing or creating diagrams; these will be supplied as valid JSON objects created by hand in the page source.

Coin-counting via image recognition [Completed]

Type: Individual/Group
Background: If you do not already have experience of writing Android apps then you will need to learn this as part of the project but that should not pose a difficulty for students that already know how to program.

Write an Android app to allow you to point the camera at a collection on coins lying flat on the table and have it tell you the total. This will require using image recognition algorithms (perhaps from OpenCV?) and a nice user interface to bring it all together smoothly. Design, implementation, and evaluation of training; user documentation (manual); and user evaluation will all take time. Some quick searching shows a only few programs which try to do this but none appear very fast or widely used. Extension points are recognising multiple currencies; partially obscured coins.

Advanced Xiangqi [Completed]

Type: Individual/Group
Background: Existing knowledge of Xiangqi would be useful but the game can be picked up in a single day.

Advanced Chess is is a variant of chess in which a human and a computer work together as a team against another human and computer team. The human is in control of the team and can use the computer to analyze moves, look-up positions, check strategies, etc. As a result, the team should perform better by the computer removing the human's weaknesses (eg, overlooking a possible move) and the human removing the computer's weaknesses (eg, lack of overall strategy and innovative play).

Xiangqi is also known as Chinese Chess and is very similar to International Chess. Many pieces play very similarly although there are a few special rules that give the game a unique flavour.

This project is to create a Xiangqi program which can be used for Advanced Xiangqi. A player should be able to use this program to check threats to and opportunities for their side given the current state of the game board. The program must therefore know the rules of Xiangqi, allow the player to enter their and the oppositions moves, and facilitate user-driven querying and exploration possible moves in the future.

Things to consider:

Are there any chess programs specific to Advanced Chess and, if so, what features do they offer?
Since the player will need this program while playing, should it be an Android app rather than a desktop program?
Xiangqi games have a time-limit so the user interface will have to be fast and intuitive.
Would interfacing with an existing Xiangqi engine be more effective than writing custom search functions?
How would you evaluate the usefulness of the program? Would the Chess Society be a useful contact?