UKRI Centre for Doctoral Training in Artificial Intelligence, Machine Learning & Advanced Computing


Research projects

Our doctoral training programme is constructed around three research themes:
  • T1: data from large science facilities (particle physics, astronomy, cosmology)
  • T2: biological, health and clinical sciences (medical imaging, electronic health records, bioinformatics)
  • T3: novel mathematical, physical, and computer science approaches (data, hardware, software, algorithms)
Research projects are placed in one of the three themes. The CDT particularly encourages the development of synergies between the themes through the sharing of common methods and an interdisciplinary supervisory team. Not all themes are available at all of the partner universities.

All positions for the 2021 cohort have now been filled.

A sample of research projects is given further down this page.

Research contacts

To discuss a PhD position or project at one of the partner universities, please contact:

T1: data from large science facilities

T2: biological, health and clinical sciences

T3: novel mathematical, physical, and computer science approaches


Example research projects, organised by host university, 2021 cohort


ABERYSTWYTH UNIVERSITY

Project title: Robotic assisted eating for independent living

1st supervisor: Dr Patricia Shaw
2nd supervisor: Dr Fred Labrosse
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical and computer science approaches

Project description: This project addresses one aspect of assisted home living for users with highly limited mobility. Specifically, it investigates how a robot manipulator can learn to support and assist a user in daily tasks such as eating. This task requires fine dexterous control and coordination to transport food on a spoon from a bowl to the user's mouth. Different techniques can be considered for optimising the control of the robot manipulator, including data-intensive machine learning over multiple trials [1], modelling how children learn to transport food to their mouth [2], and mathematical optimisation of trajectories, e.g. [3]. As part of this project, different approaches will be evaluated to achieve a system that can not only perform the proposed task but also adapt to other tasks that support independent living. The input to the system will be images captured by an RGB-D camera overlooking the area, from which the face of the person and the bowl will be detected using standard techniques such as Haar cascades and CNNs. The project will use existing hardware and link into related projects on assisted living.

[1] H. A. Pierson and M. S. Gashler, 'Deep learning in robotics: a review of recent research', Advanced Robotics, vol. 31, no. 16, pp. 821-835, Aug. 2017
[2] M. E. McCarty, R. K. Clifton, and R. R. Collard, 'Problem solving in infancy: The emergence of an action plan.', Developmental Psychology, vol. 35, no. 4, pp. 1091-1101, 1999.
[3] L. Janson, E. Schmerling, and M. Pavone, 'Monte Carlo Motion Planning for Robot Trajectory Optimization Under Uncertainty', in Robotics Research, vol. 3, A. Bicchi and W. Burgard, Eds. Cham: Springer International Publishing, 2018, pp. 343-361.
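As a flavour of the modelling in [2], the minimum-jerk profile is a standard model of smooth human reaching and could serve as a baseline bowl-to-mouth spoon trajectory. The sketch below is purely illustrative (the coordinates are made up and it is not part of the project specification):

```python
def minimum_jerk(p0, p1, t, T):
    """Position at time t along a minimum-jerk reach from p0 to p1 over
    duration T. Each coordinate follows 10*tau**3 - 15*tau**4 + 6*tau**5,
    a standard model of smooth human reaching movements."""
    tau = t / T
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return [a + (b - a) * s for a, b in zip(p0, p1)]

# Illustrative bowl-to-mouth reach in 3-D (coordinates in metres, made up).
bowl = [0.3, 0.0, 0.1]    # spoon start, above the bowl
mouth = [0.5, 0.2, 0.4]   # mouth position, e.g. from face detection
trajectory = [minimum_jerk(bowl, mouth, t, 1.0) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

The profile starts and ends at rest, which is one reason it is often used to describe (and imitate) human reaching.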


Project title: Explainable Artificial Intelligence Systems for Massive-Scale Nonstationary Data Streams

1st supervisor: Dr Xiaowei Gu
2nd supervisor: Dr Changjing Shang
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Thanks to the rapid development of information technology and the electronic manufacturing industry, massive volumes of streaming data are generated by Internet-based activities, in forms such as text, images, audio and video. The information embedded within streaming data is of paramount importance for insight into, and decision-making about, the underlying problem. The need to extract this valuable information has led numerous international organisations and companies to deploy advanced data mining techniques. However, the very high volume, velocity, variability and complexity of streaming data pose great challenges to traditional AI technologies in data-intensive applications. In particular, the lack of transparency and explainability has been a major barrier to the practical deployment of such techniques in life-critical and financial applications. There is therefore significant demand for more advanced data-intensive technologies that deliver high performance and efficiency while remaining transparent and explainable. The main aim of this project is to develop cutting-edge computational intelligence technologies for massive-scale data stream mining and modelling in nonstationary environments. In particular, the project will construct an advanced explainable AI methodology by integrating the latest developments in deep learning, ensemble learning, fuzzy systems and pattern recognition. The methodology will then be applied to a carefully selected real-world application, chosen from a range of possible problem domains including autonomous driving scene analysis, remotely sensed imagery analysis, and high-frequency trading data analysis.
The project will provide a platform for an exceptional doctoral candidate to undertake research, involving both theoretical development and experimental investigation, within a world-leading research team for computational intelligence.


Project title: Automated correction of stray light in images

1st supervisor: Dr Helen Miles
2nd supervisor: Dr Matt Gunn
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Stray light in an optical system is unwanted light: light that enters the optical system but is scattered from internal surfaces, reaching the image plane and contributing to the noise and uncertainty in the image. Modern optical design techniques and coatings allow stray light to be reduced to an acceptable level in many applications, but it cannot be eliminated entirely. Stray light cannot be corrected by conventional image processing techniques such as flat-field correction, because the light reaching any point in the image may originate from anywhere in the scene; the correction therefore depends on the brightness distribution in the image. The stray light behaviour of a camera system can be characterised through calibration and testing, but we currently have no way of removing it from real images. This project will investigate the use of AI and ML techniques for advanced processing of such images to aid the characterisation and removal of stray light artefacts.
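To see why flat-field correction cannot remove stray light, the toy model below (an illustrative sketch, not the project's method) adds a veiling glare proportional to total scene brightness to two scenes that share identical calibration frames; the residual error after flat-fielding differs between the scenes, so no fixed calibration can subtract it:

```python
def flat_field_correct(raw, dark, flat):
    """Conventional flat-field correction: subtract the dark frame and
    divide out per-pixel gain variation measured from a flat frame."""
    gain = [f - d for f, d in zip(flat, dark)]
    mean_gain = sum(gain) / len(gain)
    return [(r - d) * mean_gain / g for r, d, g in zip(raw, dark, gain)]

def add_stray_light(scene, fraction=0.05):
    """Toy stray-light model (an assumed form): a uniform veiling glare
    proportional to the *total* scene brightness reaches every pixel."""
    glare = fraction * sum(scene)
    return [s + glare for s in scene]

dark = [0.0] * 4
flat = [1.0] * 4   # uniform gain, so flat-fielding changes nothing here
bright_scene = [10.0, 20.0, 30.0, 40.0]
dim_scene = [1.0, 2.0, 3.0, 4.0]

# Residual error at pixel 0 after flat-field correction:
bright_err = flat_field_correct(add_stray_light(bright_scene), dark, flat)[0] - bright_scene[0]
dim_err = flat_field_correct(add_stray_light(dim_scene), dark, flat)[0] - dim_scene[0]
# The residual differs between the two scenes (5.0 vs 0.5), so no fixed
# calibration frame can remove it.
```

This scene dependence is exactly what makes learned, image-dependent corrections attractive.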


Project title: A virtual environment for visualizing multimodal geological data

1st supervisor: Dr Helen Miles
2nd supervisor: TBD
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: In the field, geologists walk around to look at evidence in context: curiosity drives movement, inspection, and analysis. Many types of site data are regularly captured for off-site analysis, some of which will not be visible to a human in the field, e.g. multispectral or subsurface data. During analysis, this information is typically displayed on a 2D computer screen, which limits the ability of geologists to interpret the data. This project will develop an immersive virtual reality environment to present geological data in the 3D form in which it was originally captured, with AI-powered tools to support interpretation. This will allow users the best of both worlds - that of the human field geologist and that of the instrumentation used to capture the data.


Project title: Using artificial intelligence to simulate reaching and grasping

1st supervisor: Dr David Hunter
2nd supervisor: Dr Patricia Shaw
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: Over the past decades, researchers have shown that training neural networks which mimic known features of human physiology can teach us much about how human visual processing functions and develops. This project will look at reaching and grasping, a highly complex visual task that requires close coordination of eye gaze and motor control. Unlike many other gaze-related visual tasks, reaching and grasping requires top-down, intent-driven control of visual processing, and therefore allows us to study how high-level intent drives low-level, feature-driven visual processing. The project will involve creating immersive environments to track people's hand and eye motions while they interact with objects. Data from these experiments will be used to design and train a novel artificial neural network that can learn to accurately replicate hand and eye motions. The Higher Education Funding Council for Wales Research Capital Funding has recently invested in a new eXtended Realities laboratory in Aberystwyth that will allow us to accurately track both eye gaze (saccades) and hand motion. This project will suit students with a strong mathematical background and an interest in either cognitive psychology or neuroscience.

Additional projects can be found here.


BANGOR UNIVERSITY

Project title: AI to engender Fast Visualization Ideation Design

1st supervisor: Professor Jonathan C. Roberts
2nd supervisor: Dr Panagiotis (Panos) Ritsos
Department/Institution: School of Computer Science and Electronic Engineering
Research theme:

Project description: Ideating new creative visualizations takes much time; design sketching is often used, and many alternative ideas are sketched. While researchers have recently been developing tools to help users create visualizations, these focus on developing a final visualization design (Roberts et al 2020) rather than capturing, or helping the user through, the ideation process. Tools like Tableau's show-me, D3, Lyra, Keshif and the Improvise environment enable different visualisations to be crafted, but it is less easy to create or manage new ideations. Dashboards can be crafted in many ways, using D3.js, high-charts.com, datahero.com and tools such as Tableau, SAS or Power BI, but these again do not help the user analyse, collaborate on, or share ideas during the ideation process.
Researchers have started to make smart design interfaces (e.g., Chen et al 2020), which help designers create new data visualizations, recommend design ideas, encourage good design, suggest ideas and return to previous (perhaps forgotten) ideas. This research will investigate current visualization design strategies such as the Five Design-Sheet method (Roberts et al. 2016), multiple views (Roberts et al 2019), deep learning, ideation techniques, recommender systems, and version control, to make the ideation process smart. The work will investigate how to mix smart guidance led by AI recommendations with version control, and how to incorporate learnt behaviour into the visualization ideation process, encouraging novel designs to be explored and captured.

[1] Chen, X., Zeng, W., Al-Maneea, H.M.A., Roberts, J.C. and Chang, R., 2020. Composition and Configuration Patterns in Multiple-View Visualizations. IEEE Transactions on Visualization and Computer Graphics.
[2] Roberts, J.C., Headleand, C. and Ritsos, P.D., 2016. Sketching Designs Using the Five Design-Sheet Methodology. IEEE Transactions on Visualization and Computer Graphics, 22(1), 419-428. https://doi.org/10.1109/TVCG.2015.2467271
[3] Roberts, J.C., Al-Maneea, H.M.A., Butcher, P., Lew, R., Rees, G., Sharma, N. and Frankenberg-Garcia, A., 2019. Multiple Views: different meanings and collocated words. Computer Graphics Forum, 38(3), 79-93. https://doi.org/10.1111/cgf.13673


Project title: Edge-based object recognition for immersive analytics in Web-based XR

1st supervisor: Dr Panagiotis (Panos) Ritsos
2nd supervisor: Prof Jonathan C. Roberts
Department/Institution: School of Computer Science and Electronic Engineering
Research theme:

Project description: We are increasingly immersed in a technology-mediated world, where the omnipresence of data increases the need for mechanisms that facilitate in-situ cognition, reasoning and sensemaking [2, 3]. In parallel, edge computing, facilitated by future networks such as 5G, is transforming the way data is processed and delivered from millions of devices around the world, bringing computing and analytics close to where the data is created [1]. Building on these synergies, this project will investigate the use of edge-based object recognition using distributed neural networks (DNN) as a mechanism for in-situ registration and data processing for mobile, Web-based Immersive Analytics (IA) in Extended Reality (XR).
Object recognition can provide accurate and real-time registration [1], yet its practical application still faces important challenges. Current object-recognition systems are either self-contained or cloud-based, suffering from limited on-device resources and high latency respectively, both of which degrade the user experience. Deep Learning, and DNNs, can provide effective solutions for object detection and ameliorate these challenges [1]. In addition, they have the potential to provide adaptive MR interfaces and multimodal sensing capabilities useful for advanced IA experiences [2].

[1] P. Ren, X. Qiao, Y. Huang, L. Liu, S. Dustdar and J. Chen, Edge-Assisted Distributed DNN Collaborative Computing Approach for Mobile Web Augmented Reality in 5G Networks, in IEEE Network, vol. 34, no. 2, pp. 254-261, March/April 2020, doi: 10.1109/MNET.011.1900305.
[2] P. W. S. Butcher, N. W. John and P.D. Ritsos, VRIA: A Web-based Framework for Creating Immersive Analytics Experiences, in IEEE Transactions on Visualization and Computer Graphics (Early Access), 2020 doi: 10.1109/TVCG.2020.2965109.
[3] J. C. Roberts, P.D. Ritsos, S. K. Badam, D. Brodbeck, J. Kennedy and N. Elmqvist, Visualization beyond the Desktop--the Next Big Thing, in IEEE Computer Graphics and Applications, vol. 34, no. 6, pp. 26-34, Nov.-Dec. 2014, doi: 10.1109/MCG.2014.82.


Project title: FLOOD-AI: Using Artificial Intelligence to Investigate the Impact of Land Management Decisions on River Flood Risk

1st supervisor: Dr Sopan Patil
2nd supervisor: Dr Panagiotis Ritsos
Department/Institution: School of Natural Sciences and School of Computer Science and Electronic Engineering
Research theme:

Project description: Hydrological models are essential tools for simulating streamflow in river basins and are widely used for understanding, and forecasting, a river’s flood response to storm events. However, appropriate application of hydrological models requires a priori calibration of parameters using historical measured streamflow data. Previous research has shown that the relationship between hydrological model parameters and physical river basin properties (e.g., topography, soils, land use) is too complex to characterise using traditional statistical models. This limits our ability to determine how parameter values will change if land use change alters the physical structure of a river basin. Recent advances in Artificial Intelligence (AI), specifically in Deep Learning (DL), provide efficient high-dimensional interpolators that can handle multi-dimensional, heterogeneous information such as that encountered in hydrological modelling.
In this project, our goal is to develop AI techniques that can help improve the ability of hydrological models to predict the impact of land use change on river flood risk. Specifically, we propose a novel use of AI and information visualization to interactively relate hydrological model parameters to the physical properties of river basins. Our approach will involve development of DL techniques to extract high level abstractions in the hydrological model and physical river basin data, which can be used to test the impact of land management decisions on river flood risk. This abstraction will be made available to end-users via an interactive visualization interface to facilitate the flood risk investigation of multiple scenarios of land management changes (e.g., increase in urbanization by 10%). Our training dataset will include data from >1000 river basins across the UK, and the coupled AI-hydrological modelling workflow will be streamlined to operate on Supercomputing Wales High-Performance Computing (HPC) framework.

[1] Patil, S. and M. Stieglitz, Modelling daily streamflow at ungauged catchments: What information is necessary?, Hydrological Processes, 28(3), 1159-1169, 2014.
[2] Patil, S. D., Y. Gu, F. S. A. Dias, M. Stieglitz, and G. Turk, Predicting the spectral information of future land cover using machine learning, International Journal of Remote Sensing, 38(20), 5592-5607, 2017.
[3] S. Rizou, K. Kenda, D. Kofinas, N. M. Mellios, P. Pergar, P. D. Ritsos, J. Vardakas, K. Kalaboukas, C. Laspidou, M. Senožetnik, and A. Spyropoulou, Water4Cities: An ICT platform enabling Holistic Surface Water and Groundwater Management for Sustainable Cities, in Proceedings of 3rd EWaS International Conference, Lefkada, Greece, 2018.
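The parameter-regionalisation idea can be sketched in miniature. Below, a toy linear model learns a mapping from one basin property (urban land-cover fraction) to a hydrological model parameter; the synthetic data and the linear form are illustrative assumptions only, as the project would use deep networks over many real basin attributes:

```python
import random

# Synthetic stand-in for parameter regionalisation (all numbers are
# illustrative): learn a mapping from a physical basin property (urban
# land-cover fraction) to a calibrated hydrological model parameter.
random.seed(0)
basins = [(u, 0.2 + 0.6 * u + random.gauss(0, 0.01))
          for u in [i / 20 for i in range(21)]]   # (urban fraction, parameter)

w, b = 0.0, 0.0   # linear model: parameter ~ w * urban_fraction + b
for _ in range(2000):   # plain gradient descent on mean squared error
    gw = sum(2 * (w * u + b - p) * u for u, p in basins) / len(basins)
    gb = sum(2 * (w * u + b - p) for u, p in basins) / len(basins)
    w -= 0.1 * gw
    b -= 0.1 * gb

# The fitted mapping can then predict the parameter under a changed land
# use (e.g. 50% urban cover) without re-calibrating against streamflow.
predicted = w * 0.5 + b
```

The point of the sketch is the workflow, learn from calibrated basins, then query under altered land use, rather than the model class.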


Project title: Predicting the Relative Coastal Weather and Conditions

1st supervisor: Peter Robins
2nd supervisor: Matt Lewis
Department/Institution: School of Ocean Science
Research theme:

Project description: This project will use AI and artificial neural networks (ANNs) to develop a novel Met Ocean prediction tool, with end users training the algorithm through their own assessments of conditions and risk. The outcome will be a new app that users download and use to make decisions (with constant confirmation of 'forecast accuracy'): coastal forecasts, combined with each user's assessment of forecast quality, will be fed through ANNs to produce a relative forecast based on that user's opinion of previous forecasts.
Reduced levels of risk and maintenance costs in coastal activities can thus be estimated. Impact will include integration with two world-leading companies in the operation and maintenance realm of offshore energies: Turbine Transfers (www.turbinetransfers.co.uk/) and MetOcean Solutions (www.metocean.co.nz/). Current Met-Ocean forecasts have focused on the physical conditions for users of the coastal environment, for example wave height and period, and wind strength and direction. In the coastal zone, recreational users and industry currently have to make decisions using this forecast information alongside their own experience and skill level, including their appetite for risk (e.g. two water-sport enthusiasts may have completely different views of the quality of conditions due to many other factors, which has led to a widespread perception of forecast inaccuracy in the coastal zone). Indeed, offshore wind farm maintenance operatives have different boats and levels of acceptable risk, typically based on wave steepness and wave-tide interaction processes that are not included in forecast models due to the computational burden.
Such human interpretation of forecast data often falls into classical heuristic risk traps. Current physical Met-Ocean forecast products have recognised this change in user requirements, but are still process-based, forward-stepping physical models, with a new focus on uncertainty quantification to improve the user's perception of accuracy. This novel application of machine learning could reduce uncertainty and improve confidence in weather forecasting.

[1] Lewis, M.J., Palmer, T., Hashemi, R., et al. Wave-tide interaction modulates nearshore wave height. https://doi.org/10.1007/s10236-018-01245-z
[2] Hashemi, M.R., et al. and Lewis, M., 2016. An efficient artificial intelligence model for prediction of tropical storm surge. Natural Hazards, 82(1), 471-491.
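The "users train the forecast" idea can be illustrated with a minimal online-learning sketch: a single-feature, perceptron-style update rule (an illustrative stand-in for the full ANN over many met-ocean variables) learns one user's personal threshold for acceptable wave height from their good/bad ratings of past forecasts:

```python
def update(threshold, wave_height, user_said_ok, lr=0.05):
    """Nudge a user's personal 'go / no-go' wave-height threshold towards
    their feedback on each forecast outcome."""
    predicted_ok = wave_height <= threshold
    if predicted_ok and not user_said_ok:
        return threshold - lr    # we were too permissive for this user
    if not predicted_ok and user_said_ok:
        return threshold + lr    # we were too conservative for this user
    return threshold             # prediction matched the user's view

# A cautious user who only rates conditions below about 1.2 m as acceptable.
ratings = [(0.8, True), (1.5, False), (1.4, False), (1.0, True), (1.3, False)] * 20
threshold = 2.0                  # start from a generic, impersonal threshold
for height, ok in ratings:
    threshold = update(threshold, height, ok)
```

After repeated feedback the threshold settles between this user's accepted and rejected conditions, giving a forecast that is "relative" to that individual.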


Project title: Ensembles of Deep Neural Networks for Semi-supervised Learning

1st supervisor: Prof Ludmila Kuncheva
2nd supervisor: Franck Vidal
Department/Institution: School of Computer Science and Electronic Engineering
Research theme:

Project description: In semi-supervised learning, some of the data are labelled but most are not (Engelen and Hoos, 2020). This type of data is widespread because labelling is often infeasible, destructive, or too expensive. Consider as an example a sample from a population affected by a pandemic. A small proportion of the people in the sample would have been tested for the disease, returning a positive or a negative result. This is the labelled part of the data. The rest of the sample is unlabelled but could still be of great help in extracting an accurate proxy for the test, which can then be deployed to the whole population.
Deep learning has hijacked research in machine learning and pattern recognition for a good reason: the undisputed success of large-scale models for complex data. Ensemble models are known to be more accurate than single models, which explains the interest in ensembles of Deep Learning Neural Networks (DLNN). Ensembles of DLNN for semi-supervised data are just taking off, with a few heuristically crafted models. A considerable amount of knowledge on standard classifier ensembles has been accumulated over the past twenty years.
Grounding our work in that, we will attempt to improve on the state-of-the-art in semi-supervised learning. Our analysis will gauge the need for a DLNN ensemble based on data size and characteristics. We will examine the contribution of diversity within the DLNN ensemble to the ensemble performance as well as the extent of usability of unlabelled data. New ensemble methods will be devised based on robust training and testing protocols, a suite of classifier combination methods, and a novel idea about feature transformation. In the first instance, we will illustrate the proposed methods on videos related to animal monitoring.

[1] van Engelen, J.E. and Hoos, H.H., 2020. A survey on semi-supervised learning. Machine Learning, 109, 373-440. https://doi.org/10.1007/s10994-019-05855-6
[2] Laine, S. and Aila, T., 2016. Temporal Ensembling for Semi-Supervised Learning. arXiv preprint arXiv:1610.02242. https://arxiv.org/abs/1610.02242
[3] Kuncheva, L.I., 2014. Combining Pattern Classifiers: Methods and Algorithms. 2nd ed. Wiley.
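A minimal sketch of ensemble-based self-training, with k-NN members standing in for the deep networks discussed above (purely illustrative): unlabelled points on which the ensemble votes unanimously receive pseudo-labels and join the training pool.

```python
def knn(labelled, x, k):
    """k-nearest-neighbour majority vote from a list of (point, label) pairs."""
    nearest = sorted(labelled, key=lambda pl: abs(pl[0] - x))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def self_train(labelled, unlabelled, ks=(1, 3), rounds=3):
    """Self-training with a small ensemble: an unlabelled point is
    pseudo-labelled only when every member agrees, then joins the
    labelled pool for later predictions."""
    labelled, unlabelled = list(labelled), list(unlabelled)
    for _ in range(rounds):
        remaining = []
        for x in unlabelled:
            votes = {knn(labelled, x, k) for k in ks}
            if len(votes) == 1:               # ensemble is unanimous
                labelled.append((x, votes.pop()))
            else:                             # defer to a later round
                remaining.append(x)
        unlabelled = remaining
    return labelled

# Four labelled points and six unlabelled ones from two 1-D clusters.
seed_data = [(0.0, "neg"), (1.0, "neg"), (9.0, "pos"), (10.0, "pos")]
pool = [0.5, 2.0, 3.0, 7.0, 8.0, 9.5]
final = self_train(seed_data, pool)
```

The unanimity requirement is one simple way diversity within an ensemble controls which pseudo-labels are trusted.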


Project title: Bringing big-data to social science

1st supervisor: Simon Willcock
2nd supervisors: William Teahan & Prof Jonathan Roberts
Department/Institution: School of Natural Sciences and School of Computer Science and Electronic Engineering
Research theme:

Project description: We live in a period of unprecedented data availability, but not all data are equal, with quantitative data sometimes viewed as more reliable, robust and/or useful than qualitative data. This is particularly problematic when conducting the interdisciplinary research necessary to address the most important global challenges, such as climate change and sustainable development. For example, the benefits humans derive from nature are categorised into three types: provisioning services (products obtained from ecosystems, e.g. food), regulating services (benefits obtained from the regulation of ecosystem processes, e.g. regulation of air quality), and cultural services (non-material benefits people obtain from ecosystems, e.g. cultural heritage or spiritual enrichment). Whilst both provisioning and regulating services can be quantified at local and global scales, cultural services are often viewed as 'unquantifiable', being spatially and temporally distinct, intangible, subtle, mutable and intuitive in nature, based on ethical and philosophical perception, and thus largely unique to the individual. As such, most nature-based research is dominated by the relatively easily quantified provisioning and regulating services, which are readily monetised to enable comparisons across services. The same is not true of cultural services, and how to combine these data to holistically value nature's contributions to people remains unknown. We seek to address this here.
We will use existing data from seven national surveys across Wales (~1000 respondents per survey; 3 surveys complete [Jan-Jun '20], 1 ongoing, 3 planned for Jan-Jun '21). Using Supercomputing Wales, we will apply bespoke Natural Language Processing (NLP) to analyse the qualitative data within these surveys, understanding how people's reasons for spending time in greenspace changed before, during and after the ongoing coronavirus crisis. These qualitative data contain free-text responses in both English and Welsh, and our cross-language analysis will compare responses to see if there are any language-specific differences, as well as differences between genders and socioeconomic groups. Finally, advanced visualization techniques will be developed to enable comparison of the qualitative free-text responses with the quantitative survey data, which include distance travelled and length and regularity of visits. The ability to visualise both quantitative and qualitative data at national scales may transform sustainable decision-making.

[1] Willcock, S., Camp, B.J. and Peh, K.S.H., 2017. A comparison of cultural ecosystem service survey methods within South England. Ecosystem Services, 26, 445-450.
[2] Teahan, W.J., 2018. A Compression-Based Toolkit for Modelling and Processing Natural Language Text. Information, 9, 294. MDPI.
[3] Walker, R., ap Cenydd, L., Pop, S., Miles, H.C., Hughes, C.J., Teahan, W.J. and Roberts, J.C., 2013. Storyboarding for visual analytics. Information Visualization.
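The compression-based modelling of [2] can be crudely approximated with a general-purpose compressor: a snippet should cost fewer extra compressed bytes after a corpus in its own language. The sketch below uses zlib and tiny made-up corpora (illustrative only; real survey corpora and PPM models would behave far better):

```python
import zlib

def compressed_size(text):
    return len(zlib.compress(text.encode("utf-8"), 9))

def extra_bytes(corpus, snippet):
    """Extra compressed bytes needed for the snippet after the corpus: a
    crude stand-in for the cross-entropy a PPM language model assigns."""
    return compressed_size(corpus + " " + snippet) - compressed_size(corpus)

# Tiny made-up 'corpora'; real survey corpora would be far larger.
english = ("i enjoy walking in the park every morning with my dog "
           "the fresh air and green space help me relax and unwind ") * 5
welsh = ("rwyf yn mwynhau cerdded yn y parc bob bore gyda fy nghi "
         "mae awyr iach a mannau gwyrdd yn fy helpu i ymlacio ") * 5

snippet = "walking in the park helps me relax"
guess = "en" if extra_bytes(english, snippet) < extra_bytes(welsh, snippet) else "cy"
```

The same idea scales to comparing free-text responses across languages or demographic groups by how well each group's model predicts them.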


Project title: Optimization of co-located offshore wind and wave energy arrays

1st supervisor: Prof. Simon Neill
2nd supervisor: Dr David Christie (SEEC research fellow) and TBA
Department/Institution: School of Natural Sciences
Research theme:

Project description: Co-locating wave energy with offshore wind developments leads to synergies which can improve efficiency and reduce cost, including the advantages of a common consenting process, shared grid connection, and logistics. Power output can be smoothed by the phasing between wave and wind peaks within the same weather footprint and by consideration of the swell component of waves, and further improved by strategic placement (micro-siting) within the array to modify the phase relationship between individual Wave Energy Converters (WECs). By absorbing wave energy, the WECs can reduce wave loading on the wind turbine structures, which in turn may modify the wave resource through reflection and diffraction. Energy yield, power quality, loading forces and cabling cost will all depend on the spatial configuration of the devices within the array. This project will apply genetic algorithms to determine the optimal device configuration.
The student will use a coupled wind/wave resolving model running on Supercomputing Wales to determine the effect of WECs and wind turbines on the surrounding wave and wind fields. Combined with wind and wave data, boundary conditions from global models will allow an estimation of annual energy yield for a given configuration. A cost function will be synthesised from an estimate of the variable contribution to infrastructure cost representing wave loading forces, as well as cabling costs (which will be affected by both the internal layout and the phase relationship). The number of devices and the spatial footprint will be constrained by optimisation with respect to levelised cost.

[1] Neill, S.P. and Hashemi, M.R. 2018. Chapter 9 - Optimization, Fundamentals of Ocean Renewable Energy - Generating Electricity from the Sea, Academic Press, Pages 237-270.
[2] Neill, S. P., Vögler, A., Goward-Brown, A. J., Baston, S., Lewis, M. J., Gillibrand, P. A., ... & Woolf, D. K. (2017). The wave and tidal resource of Scotland. Renewable Energy 114, 3-17.
[3] Neill, S. P., Hashemi, M. R., & Lewis, M. J. 2014. Optimal phasing of the European tidal stream resource using the greedy algorithm with penalty function. Energy 73, 997-1006.
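A genetic algorithm for array layout can be sketched in a few lines. Here the fitness is a toy minimum-spacing objective standing in for the coupled wind/wave model evaluation and levelised-cost function the project would actually use (all parameters are illustrative):

```python
import random

def fitness(layout):
    """Toy objective: the minimum spacing between devices along a 1-D
    transect, a crude proxy for reducing interference losses."""
    return min(abs(a - b) for i, a in enumerate(layout) for b in layout[i + 1:])

def genetic_algorithm(n_devices=4, pop_size=40, generations=60, seed=7):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_devices)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # rank layouts by fitness
        survivors = pop[:pop_size // 2]       # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_devices)          # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:                     # random-reset mutation
                child[rng.randrange(n_devices)] = rng.random()
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = genetic_algorithm()   # positions (0-1) of four devices on the transect
```

Because each fitness evaluation in the real problem is an expensive model run, the GA's population size and generation count become the dominant computational cost.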


BRISTOL UNIVERSITY

Project title: Machine Learning to Find New Physics in Muon Decays

1st supervisor: Prof Joel Goldstein
2nd supervisor: TBD
Department/Institution: Particle Physics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: The Mu3e experiment at PSI will look for extremely rare muon decays; in particular it is designed to try to identify the lepton flavour-violating decay of a muon to three electrons at the level of one event in 10^16. The experiment will use the latest advances in detector technology to identify electrons with high spatial and temporal resolution, and advanced pattern recognition algorithms will be implemented electronically to filter the data in real time.
In this project the student will apply the latest developments in machine learning to Mu3e event reconstruction and filtering, developing new techniques that could be faster, more flexible and/or more effective than conventional algorithms.
This could lead not only to the optimisation of the physics reach for the three-electron channel, but also the capability to perform real-time detailed analysis to look for different signatures. The student will start by developing and optimising algorithms in simulation, and then will have the opportunity to commission and test them in early data from the running experiment.


Project title: Fast Inference with FPGAs for Particle Physics

1st supervisor: Dr Jim Brooke
2nd supervisor: TBD
Department/Institution: Particle Physics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: This project will study the implementation of machine learning algorithms in programmable logic, for potential applications in large particle physics experiments. Such experiments produce increasingly large volumes of data, with increasingly sophisticated online data acquisition systems that are often required to perform fast, low-latency, online processing to properly handle and store the data. Machine learning algorithms have thus far generally been restricted to extracting information in offline data analysis.
However, the implementation of machine learning algorithms in Field Programmable Gate Array (FPGA) technology may bring high performance image recognition and classification to online data acquisition systems, thereby extending the reach of the next generation of particle physics experiments. Challenges well suited to an ML approach will be identified at the Compact Muon Solenoid and/or Deep Underground Neutrino Experiment, followed by training of appropriate networks and evaluation of the physics impact. Implementations for inference in FPGA logic will be developed and demonstrated in hardware.
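A core ingredient of FPGA inference is replacing floating-point arithmetic with fixed-point integer arithmetic that maps onto DSP slices. The sketch below (illustrative Q-format and weights, not a specific toolflow) evaluates one dense layer both ways and shows the quantised result tracking the float result to within the quantisation step:

```python
FRAC_BITS = 8            # Q.8 fixed point: 8 fractional bits (assumed format)
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantise a float to a fixed-point integer, as FPGA logic stores it."""
    return round(x * SCALE)

def fixed_dense(weights, inputs):
    """One dense layer evaluated entirely in integer arithmetic, the form
    of computation that maps onto FPGA DSP slices; the shift rescales the
    product back to Q.8 after the multiply-accumulate."""
    acc = sum(to_fixed(w) * to_fixed(x) for w, x in zip(weights, inputs))
    return acc >> FRAC_BITS          # still a Q.8 integer

weights = [0.75, -0.5, 0.25]         # illustrative trained weights
inputs = [1.0, 2.0, 3.0]
float_out = sum(w * x for w, x in zip(weights, inputs))
fixed_out = fixed_dense(weights, inputs) / SCALE   # back to float for comparison
```

Choosing the number of fractional bits trades resource usage and latency against accuracy, which is exactly the kind of physics-impact evaluation the project would perform.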


Project title: Advanced computational methods for dosimetry, planning and verification in emergent radiotherapy treatments

1st supervisor: Dr Jaap Velthuis (UoB Physics & Swansea University Medical School)
2nd supervisor: Dr Richard Hugtenburg (Swansea University Medical School & UoB Physics)
Department/Institution: Particle Physics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: This project will support the development of novel verification and dosimetry techniques, better suited to advanced radiotherapy practice than existing technologies. In particular, computational methods are needed in the development of detectors in silicon and diamond materials, for high-throughput, large-format detector arrays and specialised dosimetry systems, working towards treatment planning and verification systems in novel therapies like diffusing alpha radio therapy (DART) and boron-neutron capture therapy (BNCT).
Monte Carlo modelling is used in combination with post-processing of Mbps-scale data streams from detector arrays, providing diagnostics of beam and dynamic collimator systems in intensity modulated radiotherapy (IMRT) and verification of treatments for improved patient safety. Together with industry partners, we are close to a product that verifies IMRT in real time during treatment. We have successfully used machine learning techniques to reconstruct beam shapes and intensities. Careful data mining allows the systems to use anomaly detection for preventive maintenance; more interestingly, by combining data from several centres we can disentangle misbehaving sensor systems from misbehaving linacs. This is particularly important for the global roll-out of IMRT in low- and middle-income countries (LMICs), where an absence of expertise is often a key factor in linac downtime.
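As an illustration of the cross-centre idea, the toy sketch below (entirely hypothetical data, sensor names, and threshold, not the project's real pipeline) groups the same readings two ways: a deviation across all of one centre's sensors points at the linac, while a deviation across all centres using one sensor model points at the sensor.

```python
from statistics import mean

# Hypothetical daily dose-rate ratios indexed by [centre][sensor model];
# the nominal value is 1.0. All names and numbers are made up.
readings = {
    "centre_A": {"sensor_X": 1.02, "sensor_Z": 1.13},
    "centre_B": {"sensor_X": 1.00, "sensor_Z": 1.12},
    "centre_C": {"sensor_X": 1.15, "sensor_Y": 1.14},  # high on every sensor
    "centre_D": {"sensor_X": 0.99, "sensor_Y": 1.01},
}

def flag(groups, thresh=0.1):
    """Flag any group whose mean absolute deviation from nominal exceeds thresh."""
    return sorted(g for g, vals in groups.items()
                  if mean(abs(v - 1.0) for v in vals) > thresh)

# Group the same readings by centre and by sensor model.
by_centre = {c: list(sensors.values()) for c, sensors in readings.items()}
by_sensor = {}
for c, sensors in readings.items():
    for s, v in sensors.items():
        by_sensor.setdefault(s, []).append(v)

print("suspect linacs:", flag(by_centre))    # anomaly across one centre's sensors
print("suspect sensors:", flag(by_sensor))   # anomaly across centres sharing a model
```

Here "centre_C" is flagged because every sensor it hosts reads high, while "sensor_Z" is flagged because it reads high at every centre that uses it.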
New radiotherapy techniques that present high levels of complexity in terms of accurate dosimetry will also be a focus, including the development of tissue-equivalent microdosimetry systems in silicon and diamond and the translation of data from experiments into biophysical models. Challenges include the modelling of the diffusion of alpha-emitting radon, and the similarly challenging use of BNCT in cancer treatment, with both avenues offering new hope in the treatment of radioresistant tumours. As such, this project combines distinct but closely linked big data and machine learning challenges.


Project title: Machine learning to study accretion flows around black holes

1st supervisor: Dr Andy Young
2nd supervisor: TBC
Department/Institution: Astrophysics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: The goal of this project is to make use of machine learning techniques to study accretion flows around black holes. Gas falling towards a black hole forms a thin disk. The inner regions of this disk produce strong X-ray emission that we can observe using space-based telescopes. The intensity of these X-rays varies as a function of time and energy, and careful study of this variability can be used to investigate the astrophysics of the accretion disk, the hot X-ray emitting corona above the disk, and the black hole spacetime. However, the data and theoretical models can be complex and multi-dimensional. Machine learning provides an opportunity to cut through this complexity and significantly improve the efficiency of data analysis and modelling. In this project we will investigate a range of machine learning techniques and how they might be applied to this problem. Examples include:
[1] Simplifying complex, multi-dimensional regression models using regularisation and dimensionality reduction techniques.
[2] Characterising the time variability of X-ray spectra using principal component analysis.
[3] Improving the efficiency of data modelling using parallel machine learning algorithms and hardware such as GPUs.
[4] Classifying accretion states using X-ray fluxes and colours.
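As a flavour of the principal component analysis approach, the sketch below runs PCA (via the singular value decomposition) on a purely synthetic set of spectra built from two underlying variability components; the spectral shapes and noise level are illustrative assumptions, not real X-ray data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data set: 100 "observations" of a 64-bin spectrum, built from
# two underlying variability components plus a little noise.
n_obs, n_bins = 100, 64
energy = np.linspace(0.0, 1.0, n_bins)
comp1 = np.exp(-energy * 3.0)            # softer illustrative component
comp2 = energy * np.exp(-energy * 2.0)   # harder illustrative component
weights = rng.normal(size=(n_obs, 2))
spectra = weights[:, :1] * comp1 + weights[:, 1:] * comp2
spectra += 0.01 * rng.normal(size=(n_obs, n_bins))

# PCA: centre the data, then take the SVD. The principal components are the
# rows of Vt, and the squared singular values give the variance captured by
# each component.
mean_spec = spectra.mean(axis=0)
centred = spectra - mean_spec
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
explained = S**2 / np.sum(S**2)

print("variance explained by first two components:",
      round(float(explained[:2].sum()), 3))
```

Because the synthetic spectra were generated from two components, almost all of the variance lands in the first two principal components; on real spectra the interest lies in how many components are needed and what spectral shapes they take.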


Project title: Identifying and characterising the highest redshift clusters and proto-clusters in huge multi-wavelength data sets

1st supervisor: Prof Malcolm Bremer
2nd supervisor: TBC
Department/Institution: Astrophysics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: Galaxy clusters are highly sensitive probes of both the cosmological evolution of structure in the universe and the astrophysical effect of environment on galaxy evolution. Identifying and studying these systems during their early evolution has, up until now, been challenging to do in an unambiguous manner, but is vital to make progress on both the evolution of structure and that of galaxies. The field is due to be revolutionised by the availability of deep, large-volume data sets across multiple wavebands. Using these, we will be able to obtain a clearer view of early clusters and proto-clusters, and in particular to establish whether our currently very limited picture is skewed by selection effects.
Current selection techniques are unlikely to be of much use in identifying the signatures of these systems within multiple huge data sets: they risk yielding either too many false positives or too few genuine systems (or samples skewed in character by assumptions inherent in the selection technique), a significant issue given the volumes (both data volumes and physical volumes) involved. By applying appropriate ML and AI techniques to the discovery and characterisation of these systems, we aim to efficiently generate statistically valid samples, and to compare them with our theoretical expectations and computer models of their evolution and growth, in a way that is unlikely to be possible using current techniques.


CARDIFF UNIVERSITY

Project title: Feeding and feedback: how massive galaxies interact with their gaseous haloes in the era of Big Data

1st supervisor: Freeke van de Voort
2nd supervisor: Mattia Negrello
Department/Institution: School of Physics and Astronomy
Research theme: T1 - data from large science facilities (astronomy) with links to T3 - novel mathematical, physical and computer science approaches (through high-performance computing and advanced data analysis)

Project description: Galaxies grow by forming stars from the gas in their interstellar medium, which they obtain mostly by accreting it from their surrounding haloes. Feedback events like supernova explosions and active supermassive black holes create powerful galaxy-scale outflows that blow gas out of the galaxies and back into the haloes around them. In this way, galaxies are intimately connected to their gaseous haloes - also called the circumgalactic medium (CGM). Cosmological, magnetohydrodynamical simulations of structure formation, run with massively parallel code on high-performance computing clusters, allow us to study these interactions in the CGM.
New methods have been developed to increase the resolution of such simulations by a thousand-fold by improving the scalability of the code, which opens up new avenues to study this important but elusive part of the cosmos. The huge amount of data generated (hundreds of Terabytes) can no longer be handled with traditional analysis techniques.
Novel approaches are therefore required for the analysis and visualization of these data as well as new ways to store and access the data. This PhD project will focus on massive galaxies and their environments with simulations that go far beyond the current state-of-the-art.
The PhD student will take the lead in developing new analysis and visualization techniques and have the opportunity to exploit this unique dataset to study the multiphase structure of the CGM, the flow of gas into and out of massive galaxies, and the feedback from stars and black holes and their role in quenching star formation. They will also be able to provide the community with efficient ways of storing, analyzing, and visualizing the next generation of cosmological simulations.


Project title: Novel algorithms and machine learning for topological quantum materials

1st supervisor: Dr. Felix Flicker (Cardiff)
2nd supervisor: Prof. Biagio Lucini (Swansea)
3rd supervisor: Dr. Tom Machon (Bristol)
Department / Institution: Physics and Astronomy, Cardiff
Research theme: T3 - novel mathematical, physical, and computer science approaches

Project description: Two of the central focuses of quantum materials research are high-temperature superconductivity and topological quantum computation. The former promises the possibility of dissipationless energy transport, revolutionising energy production and storage by facilitating mass adoption of renewable energy, while the latter would allow certain calculations to be performed exponentially faster than is possible on any classical supercomputer, with applications ranging from modelling currently intractable physical systems through to cyber security and cryptography. Major developments have occurred in both fields in recent months, with the world's first room-temperature superconductor identified (albeit at high pressure), and 'anyons', the basis for topological quantum computation, identified in fractional quantum Hall insulators.
These systems, and many others of broad interest to the community, have evaded a theoretical understanding owing to the importance of interactions between particles, necessitating the development of novel numerical techniques to handle them. This project will involve the development and application of such techniques to 'topological' materials, both quantum and classical.
Dimer models -- how to arrange dominoes on a chess board -- provide a simple but powerful mathematical model of strongly-interacting matter. Their quantum extension was introduced as a simple model of high-temperature superconductivity, but is now understood to host a much wider range of exotic phenomena, including topological order and quantum spin liquids. Closely related are 'spin ice' models, which have successfully accounted for experimental observations consistent with the emergence of 'magnetic monopoles' in real materials. Open questions include obtaining statistics on correlations in dimer models at finite temperature; robustness to disorder; and confinement and the mass gap in the quantum model (including potential relevance to exact solutions in QCD, one of the Clay Institute's Millennium Prize Problems).
This project will develop novel numerical approaches based on quantum and classical Monte Carlo techniques for modelling these systems. We will introduce kinetic Monte Carlo techniques to model the dynamics of magnetic monopoles in spin ices; we will then use machine learning to identify the monopoles' existence via dynamical signatures, in collaboration with groups in Oxford and Cornell, supporting ongoing experimental work in Cardiff and elsewhere. We will also apply quantum Monte Carlo techniques to the study of dimer models. These algorithms are computationally very intensive, and will require the development of new methods for running efficiently on the latest heterogeneous high-performance computing architectures (including GPUs and FPGAs).
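The core of this Monte Carlo machinery can be illustrated with a deliberately minimal example: single-spin-flip Metropolis sampling of a 1D Ising chain. This is a generic toy stand-in chosen for brevity, not the project's kinetic or quantum Monte Carlo for spin ices and dimer models, but the accept/reject logic is the same ingredient those methods build on.

```python
import numpy as np

rng = np.random.default_rng(1)

def ising_energy(spins):
    """Nearest-neighbour energy of a 1D Ising chain with periodic boundaries."""
    return -np.sum(spins * np.roll(spins, 1))

def metropolis_sweep(spins, beta, rng):
    """One Metropolis sweep: propose one flip per spin, accept with the
    standard min(1, exp(-beta * dE)) rule."""
    n = len(spins)
    for i in rng.permutation(n):
        # Energy change from flipping spin i (periodic neighbours).
        dE = 2.0 * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i] = -spins[i]
    return spins

n, beta, n_sweeps = 50, 5.0, 500
spins = rng.choice([-1, 1], size=n)
e0 = ising_energy(spins)
for _ in range(n_sweeps):
    spins = metropolis_sweep(spins, beta, rng)
e1 = ising_energy(spins)
print(f"energy before: {e0}, after: {e1}")
```

At this low temperature the chain relaxes from a random configuration towards the aligned ground state; kinetic Monte Carlo replaces the blind proposal step with rates for physically meaningful moves (e.g. monopole hops), which is what makes dynamical signatures accessible.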


Project title: Non-invasive characterisation of tissue microstructure from MRI using Deep Learning

1st supervisor: Leandro Beltrachini (PHYSX)
2nd supervisor: Matthias Treder (COMSC)
3rd supervisor: Derek Jones (PSYCH)
Department/Institution: Schools of Physics and Astronomy (Main), Computer Science, and Psychology, Cardiff University, and Cardiff University Brain Research Imaging Centre (CUBRIC)
Research theme: both T2 (in particular medical imaging) and T3 (in particular novel physical and computer science approaches)

Project description: The non-invasive characterisation of biological tissue microstructure in vivo is of utmost importance to medicine, neuroscience, and basic biological research. To this end, researchers have combined MRI with mathematical models requiring assumptions on the size and shape of cellular components to make the problem mathematically tractable. However, such approximations introduce unwanted errors. Addressing this problem, the supervisory team is exploring a potentially disruptive methodology borrowed from materials science (Torquato, Annu. Rev. Mater. Res., 40:101-29, 2010). It consists of measuring statistical descriptors (SDs) of tissue microstructure using signals from the MRI scanner, from which histology-like representations may be reconstructed. These SDs have the advantage of describing the statistical nature of tissue components without relying on prior assumptions on cell shapes and arrangements, with huge potential to depict microarchitectures in the living body. Nevertheless, initial experiments were unstable and computationally demanding, lasting for days even on modern computers.
The Ph.D. student will develop a solution to the problem by introducing machine learning (ML) approaches: first, to generate fast tissue microstructure reconstructions based on MRI-based SDs; and second, to perform quick simulations of MRI signals for any given microstructure. Convolutional Neural Networks will be developed to solve these issues due to their flexibility and accuracy. Synthetic datasets representing biological tissues will be utilised to train/test the algorithms, with special emphasis on prostate cancer.
The PhD project will take place in the Cardiff University Brain Research Imaging Centre (CUBRIC), with key strengths in microstructural MRI. CUBRIC is a vibrant multidisciplinary research community housing >200 researchers across Schools and Colleges. Moreover, the student will benefit from access to state-of-the-art neuroimaging equipment, including the Connectom microstructural scanner with ultra-strong gradients (only 4 globally).


Project title: Evolving Ethical Deep Neural Networks

1st supervisor: Prof Roger Whitaker, Cardiff University
2nd supervisor: Dr Liam Turner, Cardiff University
Department/Institution: Cardiff University School of Computer Science and Informatics in cooperation with the Cardiff University Crime & Security Research Institute
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: AI is often trained to make decisions which can potentially adversely affect humans. For example, a self-driving car may be forced to crash in an unfamiliar scenario (aligned to "the trolley problem") or a recommender system may (de-)prioritise an applicant CV for a role based on its perception of the person's suitability. Because AI lacks human context, and because it may be trained on specific scenarios, it cannot be assured that ethical issues will be suitably handled without explicit consideration. Therefore, this project will look at the problem of evolving ethical AI. It will focus on using one of a range of techniques (neural evolution, generative adversarial networks) through which a neural network can become intrinsically structured to "do the right thing". This project can be embedded in supervised learning, deep reinforcement learning (evolutionary AI such as evolution strategies), unsupervised learning or artificial life (ALIFE) simulation (see, e.g., bibites).
The work will be developed in collaboration with IBM and other interested stakeholders. This PhD is suitable for someone keen to gain in-depth knowledge of state-of-the-art deep reinforcement learning through evolutionary processes or applying recent techniques such as adversarial AI. There is significant scope for technical creativity and experimentation, as well as engagement with a wide range of stakeholders.


Project title: Scalable anomaly detection for lipidomics

1st supervisor: Prof Stuart Allen, Cardiff University
2nd supervisor: Prof Valerie O’Donnell, Cardiff University
Department/Institution: Cardiff University School of Computer Science and Informatics in cooperation with the Cardiff University Systems Immunity Research Institute
Research theme: T2: biological, health and clinical sciences (medical imaging, electronic health records, bioinformatics) and T3 - novel mathematical, physical and computer science approaches

Project description: Lipids (fats) are molecules that are essential for life: (i) for energy metabolism and storage, (ii) for signalling during inflammation and development, and (iii) as the membranes that hold our cells together. Large lipidomics mass spectrometry datasets are available that have the potential to provide biological insight, helping us to understand underlying disease processes and identify new biomarkers. However, researching lipids requires novel computational approaches to enable efficient processing and analysis of these datasets.
This project will look at the problem of developing robust, scalable and efficient processes to identify significant similarities and anomalies within lipidomics datasets. Initial focus will be on a motif/graphlet based approach, which has been demonstrated to be successful in other domains, including gene analysis (Machanick, P. and Bailey, T.L., 2011. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics, 27(12), pp.1696-1697.) and social networks (Topirceanu, A., Duma, A. and Udrescu, M., 2016. Uncovering the fingerprint of online social networks using a network motif based approach. Computer Communications, 73, pp.167-175.). However, motif approaches are computationally expensive, and this project will require novelty to allow comparison based on larger motifs.
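A minimal sketch of the motif idea is given below, using the simplest graphlet (the triangle) and an illustrative z-score rule on a toy graph; the project itself targets larger, computationally far more expensive motifs, so everything here is a simplified assumption for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def triangle_counts(edges):
    """Count, for each node, the triangles (simplest 3-node motif) it sits in."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    counts = {}
    for node in adj:
        nbrs = list(adj[node])
        counts[node] = sum(1 for i in range(len(nbrs))
                             for j in range(i + 1, len(nbrs))
                             if nbrs[j] in adj[nbrs[i]])
    return counts

def anomalies(counts, z_thresh=2.0):
    """Flag nodes whose motif count deviates strongly from the rest."""
    vals = list(counts.values())
    mu, sd = mean(vals), pstdev(vals)
    if sd == 0:
        return []
    return [n for n, c in counts.items() if abs(c - mu) / sd > z_thresh]

# Toy graph: a 10-node chain plus one densely connected "anomalous" hub (node 20).
edges = [(i, i + 1) for i in range(9)] + [(20, i) for i in range(10)]
counts = triangle_counts(edges)
print("anomalous nodes:", anomalies(counts))   # only the hub stands out
```

The hub sits in nine triangles while every chain node sits in one or two, so its motif count is the only one more than two standard deviations from the mean; scaling this idea to larger motifs without the combinatorial blow-up is exactly where the project's novelty is needed.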
The PhD is suitable for someone keen to develop their knowledge of state-of-the-art AI and data science techniques and apply these for important biological applications with the potential for significant impact. The student will benefit from a multidisciplinary supervisory team whose previous collaborations led to the development of LipidFinder, a bespoke tool to support computational pipelines for lipid analysis, enabling publications in Bioinformatics and Cell Metabolism.

Additional projects can be found here.


SWANSEA UNIVERSITY

Project title: AI-based approaches to multi-dimensional functional genomics

1st supervisor: Prof Steve Conlan (Medical School)
2nd supervisor: Prof Paul Rees (Engineering)
Department/Institution: Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: This project will advance the fundamental understanding of functional genomic status and dynamic responses in cancer environments, with project opportunities including single-cell and spatial functional genomics, spanning transcriptome and epigenome analysis. Project examples include:
- Exploring the spatial epigenomics of 3D cellular models to understand the impact of pharmacological interventions on histone methylation (e.g. H3K4me3, H3K9me2, H3K27me3). Epigenome profiling in response to interventions will be assessed at the cellular level using the Microscopic Imaging of Epigenetic Landscapes technique, involving high-content multi-parameter data acquisition and analysis to reconstruct high-resolution spatial epigenome models.
- Single cell and spatial analysis of cancer functional genomes obtained using 10X Genomics platforms to generate data from immune cell complements (ATAC and RNA seq) or spatial analysis of tumour biopsies (10X Visium, RNA seq) from gynaecological cancer patients.
- Cell painting and/or data mining approaches developed in Swansea will be applied to overcome challenges in high-content analysis, including feature extraction, data analysis and interpretation, which require the use of AI technologies (using Swansea's new ATOS supercomputing capability). The successful applicant will develop and implement AI-based strategies for the high-content data generated from cellular models of tumour microenvironments/cancer patient samples using advanced and computationally expensive algorithms.
The successful applicant will join the Reproductive Biology and Gynaecology Oncology research group in Swansea's Medical School in collaboration with Prof Paul Rees in Swansea's College of Engineering. The successful applicant will be involved in data acquisition and analysis, and should have a degree in molecular biology or computer science or similar.


Project title: Tropical Quantum Field Theory

1st supervisor: Prof Biagio Lucini
2nd supervisor: Dr Yue Ren
3rd supervisor: Prof Gert Aarts
Department/Institution: Department of Mathematics, Swansea University
Research theme: T3 - novel mathematical, physical, and computer science approaches

Project description: Quantum Field Theories describe elementary particles and their interactions. The structure of the interaction term plays a crucial role in the definition of the observed physical phenomena. When the interaction is strong, Quantum Field Theories are generally intractable analytically. In these cases, one resorts to numerical computations or to emerging field theories that are easier to tame. Alternative or complementary methods include spatial transformations such as coarse-graining. In this approach, fields describing the particles of the theory are replaced with their integral over a finite region of space whose radius is smaller than the interaction scale. The structure that emerges after repeated coarse-graining reveals non-perturbative properties of the theory, such as scaling dimensions.
An alternative set of transformations can be provided by the application and development of tropical geometry methods. Tropical geometry is the study of the properties of shapes when the addition operation is replaced with minimisation and multiplication with addition. Although tropical geometry is still a relatively young field, insightful applications are emerging in a variety of contexts, such as financial markets, job scheduling and the theory of phase transitions.
This project will be the first exploration of tropical geometry applications to strongly interacting field theories such as Quantum Chromodynamics, which describes the interaction binding protons and neutrons in nuclei. Together with the development of analytic methods, the project will devise a set of techniques and computational algorithms that can be used as a general framework for the tropicalisation of Quantum Field Theories. It is anticipated that the corresponding numerical calculations will require high-performance computing capabilities.
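The tropical (min-plus) semiring itself is easy to illustrate: replacing (+, x) with (min, +) turns ordinary matrix powers into all-pairs shortest paths, a standard first example of tropicalisation and, of course, far removed from the field-theoretic application the project targets.

```python
import numpy as np

INF = np.inf

def tropical_matmul(A, B):
    """Min-plus 'matrix product': ordinary addition -> min, multiplication -> +."""
    n, p = A.shape[0], B.shape[1]
    C = np.full((n, p), INF)
    for i in range(n):
        for j in range(p):
            C[i, j] = np.min(A[i, :] + B[:, j])
    return C

# A small weighted graph as a min-plus adjacency matrix (INF = no edge,
# 0 on the diagonal). Repeated tropical squaring yields all-pairs shortest
# path lengths, the tropical analogue of taking powers of a matrix.
D = np.array([[0,   3, INF,   7],
              [3,   0,   1, INF],
              [INF, 1,   0,   2],
              [7, INF,   2,   0]])

paths = D
for _ in range(2):          # two tropical squarings suffice for 4 nodes
    paths = tropical_matmul(paths, paths)

print(paths)
```

For instance the direct 0-3 edge has weight 7, but the tropical power finds the shorter route 0-1-2-3 of total weight 6, exactly because "sum over paths" has become "minimum over paths".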


Project title: Machine learning for multidimensional ultrafast x-ray spectroscopy

1st supervisor: Dr Kevin O'Keeffe
2nd supervisor: Dr Adam Wyatt
Department/Institution: Physics Department, Swansea University and Central Laser Facility
Research theme: T1 - data from large science facilities, T3 - novel mathematical, physical and computer science approaches

Project description: Observing the dynamics of molecular systems on their natural timescale is a fundamental challenge in physics and chemistry. Recently, multidimensional spectroscopy using ultrafast x-ray pulses has emerged as a powerful method for tracking the motion of electrons during the first few femtoseconds of a light-atom interaction.
This technique records the spatially and spectrally-resolved interference pattern from two laser-generated x-ray sources at multiple source positions, providing access to phase information crucial for resolving ultrafast dynamics. Although this technique enables measurements with unprecedented temporal stability, the 4-dimensional interferograms which are generated are highly structured and challenging to analyse. The primary goal of this project will be to develop a machine learning tool capable of reliably identifying the key signatures in the interferogram related to electronic motion in atomic systems such as argon.
The algorithm will be trained using simulated interferograms based on strong-field calculations before being implemented on real data sets. The algorithm will then be extended to the analysis of interferograms generated using more complex targets such as molecular nitrogen and carbon dioxide. Developing robust methods for extracting data from such interferograms will provide new opportunities for understanding the behaviour of bond formation and breaking at the natural timescale of chemical reactions.


Project title: Understanding and optimizing therapeutics for ovarian cancer through an AI and advanced computing twin

1st supervisor: Prof R. S. Conlan (Medical School)
2nd supervisor: Prof P. Nithiarasu (Engineering)
3rd supervisor: Prof D. Gonzalez (Medical School)
4th supervisor: Dr L.W. Francis (Medical School)
5th supervisor (external): Dr L McKnight (Consultant Radiologist, Swansea Bay UHB)
Department/Institution: Medical School, Engineering and Swansea Bay UHB
Research theme: T2 - biological, health and clinical sciences & T3 - novel mathematical, physical and computer science approaches

Project description: Ovarian cancer (OC) is the seventh leading cause of cancer-related death in women worldwide. It causes around 4,100 deaths annually in the UK, where recurrence rates are up to 75% and the five-year survival rate is only 46%. High vascular permeability and compromised lymphatic drainage result in ascites (the accumulation of fluid in the peritoneum), which is associated with advanced OC.
We will develop a new paradigm for effective drug delivery/treatment based around microparticle drug delivery that will restrict particles (and therefore the drugs carried within them) to the peritoneal cavity. Furthermore, particles will be geometrically designed to preferentially accumulate at cancer sites (i.e. the walls of the cavity). Together this will result in increased efficacy, reduced cytotoxicity and reduced loss of therapeutic agents from the peritoneal cavity.
The project will exploit available computed tomography (CT) images of OC patients to generate virtual models (digital twins) of the ascites-containing peritoneal cavity and surrounding structures. From this, complex fluid-flow modelling will be developed to predict microparticle movement within the digital twin and determine the particle geometry needed to target cancer sites. This proposal builds on and integrates academic research in the fields of patient imaging, computational modelling, AI and cancer biology, and will deliver an optimised pathway for CT scan simulation for drug delivery. The AI outputs will be immediately usable by other research/clinical teams, with their application not limited to OC or peritoneal disease but extensible to any CT scan. The project will involve interdisciplinary research between Engineering, Medicine and the NHS. Successful completion will lead to novel drug delivery pathways for OC therapeutics through utilisation of imaging data (20-40 CT images), and an exemplar for how patient data and samples can be used to drive healthcare developments.


Project title: ML-aided identification of social calls in pipistrelle bats at wind farms

1st supervisor: Dr Noemi Picco, Mathematics, Swansea University
2nd supervisor: Dr Farzad Fathi Zadeh (Computer Science, Swansea University)
Potential supervisors/collaborators: Dr Thomas Woolley (Cardiff University) & Prof Fiona Mathews
Department/Institution: Mathematics/ Swansea University
Research theme: T3: novel mathematical, physical and computer science approaches

Project description: Wind turbines are a significant threat to bats. We currently do not understand why bats collide with turbine blades, and collision rates seem high relative to the amount of observed activity. Recent studies found that bat activity appears to be higher at turbines than at control sites, and limited thermal-imaging evidence from the USA appears to show bats making repeated approaches to the blades. These pieces of evidence suggest that there may be some attraction to the structures. Earlier this year, a paper was published describing a specific type of social call, produced by common pipistrelle bats, that is associated with chasing behaviour (Götze, Denzinger et al. 2020). We will test the hypothesis that these social calls are more common at wind turbines than at control sites, suggesting that bats may be chasing either other individuals or indeed the turbine blades.
The student will work on a large dataset of sound files available from >50 wind farms across Britain. Matched control sites are available for a subset (c. 20) of these. Initially, sound analysis can be carried out using methods specifically developed for nonlinear and non-stationary data (Huang et al. 1998). The project will involve the adoption of machine learning approaches to data analysis, developing automated pattern-recognition workflows to identify bat species and patterns in the nature of the calls.


Project title: ML-guided dynamical systems modelling of sepsis

1st supervisor: Dr Noemi Picco, Mathematics, Swansea University
2nd supervisor: Dr Farzad Fathi Zadeh (Computer Science, Swansea University)
Potential supervisors/collaborators: Dr Thomas Woolley (Cardiff University) & Prof Peter Ghazal (Cardiff University)
Department/Institution: Mathematics/ Swansea University
Research theme: T3: novel mathematical, physical and computer science approaches

Project description: Sepsis is defined as an abnormal and uncontrolled response of the immune system to infection. Because of our lack of understanding of why and how the immune response goes awry, at present there is no cure for sepsis, but only strategies to manage the symptoms. Critically, in many cases patients present with mild symptoms that very quickly degenerate into organ failure and life-threatening conditions, requiring resuscitation. It is therefore crucial that we understand the early onset of sepsis and recognise the patterns of drastic transition to severe condition.
A dynamical systems approach is well suited to capturing the key interactions between the main players in the immune system and the virus concentration. This project will develop a workflow for the systematic identification of the correct functional forms (currently chosen arbitrarily), as well as the identification of critical thresholds resulting in severe sepsis. We will design machine learning and artificial intelligence approaches to match the model behaviour to the large dataset available, which includes time-series readings of an extensive set of markers characterising the immune status of mice at multiple time points. The ultimate goal is to be able to design early intervention strategies that can avoid the rapid escalation and the need for drastic action.
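As a sketch of what "arbitrarily chosen functional forms" means in practice, the toy model below couples a pathogen load to an immune response with simple mass-action terms and integrates it with forward Euler. Every rate constant and functional form here is a hypothetical placeholder of exactly the kind the project would instead infer from data.

```python
import numpy as np

# Hypothetical two-variable model: pathogen load P grows and is cleared by
# the immune response I, which is recruited by the pathogen and decays.
# All rate constants are illustrative, not fitted to any data.
r, k, a, d = 1.0, 2.0, 1.5, 0.5

def step(P, I, dt):
    """One forward-Euler step of the toy pathogen-immune system."""
    dP = r * P - k * I * P       # growth minus immune-mediated clearance
    dI = a * P - d * I           # recruitment minus natural decay
    return P + dt * dP, I + dt * dI

P, I, dt = 0.01, 0.0, 0.001
trajectory = []
for _ in range(20000):           # simulate 20 time units
    P, I = step(P, I, dt)
    trajectory.append(P)

peak = max(trajectory)
print(f"peak pathogen load: {peak:.3f}, final load: {P:.3f}")
```

Even this crude model reproduces the qualitative shape of interest: a small initial load grows, overshoots, and is brought down as the response builds, then settles towards a steady state. The project's task is the inverse one: learning which functional forms and thresholds actually reproduce the measured marker time series.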

Additional projects can be found here.


Example research projects from previous years can be found here (2020 cohort) and here (2019 cohort).