CDT-AIMLAC Research

UKRI Centre for Doctoral Training in Artificial Intelligence, Machine Learning & Advanced Computing

Research projects

Our doctoral training programme is constructed around three research themes:

T1: data from large science facilities (particle physics, astronomy, cosmology)
T2: biological, health and clinical sciences (medical imaging, electronic health records, bioinformatics)
T3: novel mathematical, physical, and computer science approaches (data, hardware, software, algorithms)

Research projects are placed in one of the three themes. The CDT particularly encourages the development of synergies between the themes, via the sharing of common methods and an interdisciplinary supervisory team. Not all themes are available at all of the partner universities.

A sample of research projects is given further down on this page.

Research contacts

In order to discuss a PhD position/project at one of the partner universities, please contact:

T1: data from large science facilities

Astronomy/Cosmology: Prof Malcolm Bremer (Bristol), Prof Stephen Fairhurst (Cardiff)
Experimental Particle Physics: Dr Henning Flaecher (Bristol)
Theoretical Particle Physics/Physics: Prof Gert Aarts (Swansea), Prof Biagio Lucini (Swansea)

T2: biological, health and clinical sciences

Medical imaging: Prof Reyer Zwiggelaar (Aberystwyth)
Genomics, bioinformatics: Dr Tom Connor (Cardiff), Dr Shang-ming Zhou (Swansea)

T3: novel mathematical, physical, and computer science approaches

Computer science: Prof Reyer Zwiggelaar (Aberystwyth), Prof Jonathan Roberts (Bangor), Prof Roger Whitaker (Cardiff)
Mathematics, computational science: Prof Biagio Lucini (Swansea)

Example research projects, organised by host university, 2020 cohort

Aberystwyth University

Project title: Principled Application of Evolutionary Algorithms

1st supervisor: Dr Christine Zarges
2nd supervisor: TBD
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T3 - novel mathematical, physical, and computer science approaches

Project description: Evolutionary algorithms are general and robust problem solvers that are inspired by the concept of natural evolution. Over the last decades, they have been applied successfully to a wide range of optimisation and learning tasks in real-world applications. Recently, some researchers have argued that evolutionary computation now has the potential to become more powerful than deep learning: while deep learning focuses on models of existing knowledge, evolutionary computation has the additional ability to discover new knowledge by creating novel and sometimes even surprising solutions through massive exploration of the search space. This project will build upon recent momentum and progress in both the theory and the applications of evolutionary algorithms and related randomised search heuristics. While such heuristics are often easy to implement and apply, it is usually necessary to adjust them to the problem at hand in order to achieve good performance. Thus, the main goal of the project is to provide mathematically founded insights into the working principles of different randomised search heuristics to improve their applicability. This will include the development of novel mathematical approaches to facilitate their analysis as well as the development of new randomised search heuristics in a principled, theory-driven way. Interdisciplinary collaboration and the involvement of industry partners will support recent efforts to bridge the gap between theory and practice in this important research area.
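As an illustration of the kind of randomised search heuristic this description refers to, the following is a minimal sketch of a (1+1) evolutionary algorithm on the OneMax benchmark (maximise the number of ones in a bit string). The choice of algorithm and benchmark, the function names `one_max` and `one_plus_one_ea`, and the parameter values are assumptions made for illustration only; they are not taken from the project itself.

```python
import random


def one_max(bits):
    """Fitness for the OneMax benchmark: the number of ones in the bit string."""
    return sum(bits)


def one_plus_one_ea(n, fitness, max_evals=100_000, seed=0):
    """(1+1) EA: keep one parent, create one child per step by flipping
    each bit independently with probability 1/n, and keep the child if
    it is at least as fit as the parent.

    Assumption: the optimum has fitness n (true for OneMax), which is
    used here only as a stopping criterion.
    """
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    parent_fit = fitness(parent)
    evals = 1
    while parent_fit < n and evals < max_evals:
        # Standard bit mutation with mutation rate 1/n.
        child = [1 - b if rng.random() < 1.0 / n else b for b in parent]
        child_fit = fitness(child)
        evals += 1
        # Elitist selection: accept the child if it is not worse.
        if child_fit >= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit, evals


best, best_fit, evals = one_plus_one_ea(20, one_max)
print(best_fit, evals)
```

The mutation rate 1/n used here is one such parameter that typically needs adjusting to the problem at hand, which is the kind of question a theory-driven analysis aims to answer.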

Project title: Automatic Stroke Recovery Prediction using Artificial Intelligence

1st supervisor: Dr Otar Akanyeti
2nd supervisor: Prof Reyer Zwiggelaar
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: Stroke is a leading cause of adult disability across the UK and worldwide. People with stroke have reduced mobility and, because of a sedentary lifestyle, they are more likely to develop other health issues including cardiovascular problems, dementia and depression. Previous studies have shown that exercise can lead to improvements in balance, walking and aerobic fitness of stroke survivors. So far, however, we have limited understanding of the neural, physiological and molecular mechanisms of exercise-induced stroke recovery. To elucidate some of these mechanisms, we have recently started an NHS-funded stroke rehabilitation project. Chronic stroke survivors participate in a longitudinal study focused on intensive-exercise rehabilitation with a minimum duration of three months. During the study, we continuously monitor the progress of subjects by collecting a wide range of data from a motion capture system, wearable sensors, omics technologies, physiological and cognitive tests, observer-rated clinical measurements, medical records, patient interviews and surveys. Over time, the project will produce a rich dataset in various formats and at different temporal resolutions. The PhD student will contribute to the project by developing new rule-based, probabilistic and statistical machine learning approaches (e.g. fuzzy logic, Bayesian networks and convolutional neural networks) to integrate and analyse the multimodal, multifaceted data. The overall goal is to design an AI-based recommendation system to assist health care professionals in monitoring stroke recovery and predicting its outcome. This is a challenging AI problem, and the student will be trained in cluster computing, machine learning and big data approaches as part of their PhD programme. They will also have the opportunity to apply their knowledge to other domains dealing with complex and multimodal data sets.
The student will be part of a vibrant, multi-disciplinary research team including computer scientists, physicians, exercise rehabilitation experts, molecular biologists, psychologists and neuroscientists. In addition, they will gain first-hand experience in interacting with real patients according to NHS ethical guidelines and data protection regulations.

Project title: Granular computing for explainable and interpretable knowledge modelling

1st supervisor: Dr Neil Mac Parthalain
2nd supervisor: TBD
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: The rise of important initiatives such as Explainable AI has placed increased focus on the accountability and human understanding of data and knowledge modelling techniques. Such initiatives focus upon sub-symbolic techniques, with the aim of creating some form of mechanism or framework that can be applied to make them interpretable in some way. Granular computing (GrC) is a recent paradigm concerned with the processing of complex but interpretable information entities called information granules. It has been applied extensively to problems such as data pre-processing and analysis, and to real-world problems such as human decision support and health informatics. The information granules emerge from the processes of abstraction and knowledge modelling of information or data. From a general perspective, such information granules are collections of entities that are usually formed at an atomic or numeric level and are arranged together based upon a form of relatedness or topological adjacency. This means that such modelling techniques are already explainable and interpretable using the concepts a human would use, and do not require an additional framework. This project focuses upon proposing new models of hybrid granular computing techniques and extending them to the areas of semi-supervised and unsupervised learning, both of which are emerging areas in knowledge modelling domains due to the prevalence of unlabelled or partially labelled data. The project will include the development of novel formal approaches to facilitate the analysis, as well as the development of new hybrid granular techniques using robust theoretical foundations. Collaboration with other disciplines and industrial partners will assist in supporting recent efforts to link the theoretical contributions and applications in this area.

Project title: Approximating the Colour of Mars

1st supervisor: Dr Helen Miles
2nd supervisor: TBD
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T1 - data from large science facilities; T3 - novel mathematical, physical, and computer science approaches

Project description: The Panoramic Camera 'PanCam' is a multispectral stereo camera system and the primary remote sensing instrument for the ESA ExoMars rover, an astrobiology mission to search for signs of life on Mars. PanCam includes two wide-angle cameras, each with a monochrome sensor combined with a series of filters: broadband Red, Green and Blue filters for colour images, and a variety of narrowband 'geology' filters optimized for detecting hydrated minerals. Colour images are desirable, but time and power budgets mean that it will not always be possible to capture images using the three broadband colour filters. The aim of this project is to investigate the use of machine learning and AI approaches to automatically produce approximate RGB colour images given one or more images from the geology filters. We have a PanCam emulator available and a large collection of images already captured using the emulator during field trials at Mars analogue sites, with the opportunity to collect more data using the emulator. This work could expand to include images released by NASA from their previous and current missions to Mars, which also use multispectral camera systems with filters of different wavelengths. With the rover due to land on Mars in 2021, this is an opportunity to work on data for the ExoMars mission as it happens.

Project title: Knowledge Interpolation for Breast Cancer Abnormality Detection

1st supervisor: Prof Qiang Shen
2nd supervisor: Prof Reyer Zwiggelaar
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: Breast cancer is the most common malignant tumour affecting women worldwide. Mammography has been internationally accepted as the most reliable method for breast cancer screening, playing an important role in the diagnosis of breast cancer. Whilst computer-aided diagnostic systems aid in the early detection of developing breast cancer, they often risk missing potential abnormalities owing to a lack of extensive and precise knowledge that covers the full problem space. This project will look into how knowledge-based systems may be improved to help perform mass classification of mammographic images, in support of the determination of malignancy. In particular, the project will build an effective tool, through the exploitation of fuzzy logic, to deal with the imprecision and vagueness typically incurred in real-world screening, including the description of mammographic mass characteristics. More significantly, it will examine the critical situations where observations do not match any of the rules in the knowledge base (be it learned from historical data, directly provided by the domain experts, or a mixture of both), for which existing approaches derive no diagnosis or a wrong one. This will be based on the utilization of the award-winning fuzzy rule interpolation techniques to enable approximate reasoning for regions where explicit diagnostic knowledge is lacking. The project will investigate how the underlying ideas of the recent advancement in weighted fuzzy rule interpolation, which works on classical fuzzy rule models (represented by Mamdani-type knowledge encoding), may be extended to adaptive network-based fuzzy inference systems (ANFIS). As such, this project will involve both theoretical developments in automated reasoning techniques and empirical studies in applying the resulting weighted ANFIS interpolation mechanism, with the aim of improving the accuracy and reliability of mammographic image classification.
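To give a feel for what interpolating between fuzzy rules means, the sketch below shows a deliberately simplified linear scheme for two rules with triangular fuzzy sets. It is not the weighted fuzzy rule interpolation or ANFIS method named in the description; the triangular representation `(left, peak, right)`, the use of the mean of the three points as a representative value, and the names `representative` and `interpolate_rule` are all assumptions chosen for illustration.

```python
def representative(tri):
    """Representative value of a triangular fuzzy set (left, peak, right).

    Assumption: the mean of the three defining points is used here.
    """
    a, b, c = tri
    return (a + b + c) / 3.0


def interpolate_rule(observation, rule1, rule2):
    """Linearly interpolate a conclusion for an observation that matches
    neither rule, where each rule is (antecedent, consequent) and all
    fuzzy sets are triangles given as (left, peak, right).
    """
    (ante1, cons1), (ante2, cons2) = rule1, rule2
    r1, r2 = representative(ante1), representative(ante2)
    if r1 == r2:
        raise ValueError("antecedents must have distinct representative values")
    # Relative position of the observation between the two antecedents.
    lam = (representative(observation) - r1) / (r2 - r1)
    # Interpolate each defining point of the consequent triangle.
    return tuple((1 - lam) * x + lam * y for x, y in zip(cons1, cons2))


# Hypothetical rules: "if A is low then B is low", "if A is high then B is high".
rule_low = ((0.0, 1.0, 2.0), (0.0, 1.0, 2.0))
rule_high = ((8.0, 9.0, 10.0), (10.0, 11.0, 12.0))
print(interpolate_rule((4.0, 5.0, 6.0), rule_low, rule_high))  # (5.0, 6.0, 7.0)
```

An observation lying halfway between the two antecedents yields a conclusion halfway between the two consequents, even though no rule fires directly.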

Project title: Detecting when deep learning goes wrong in medical image analysis

1st supervisor: Dr Bernie Tiddeman
2nd supervisor: Prof Reyer Zwiggelaar
Department/Institution: Department of Computer Science, Aberystwyth University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: Deep learning has made huge progress in both general and medical image analysis in recent years. Nevertheless, deep learning algorithms have been shown to fail silently on certain images. For example, a classifier may seriously misclassify an image with a high apparent confidence in the (incorrect) result (in the sense of a high softmax probability). Clearly such errors could be catastrophic for patient outcomes. This project would investigate the identification of medical images likely to prove problematic to a deep learning system. Previous work on general image classification has used various approaches such as: estimating the uncertainty in the input directly; use of a reject function (i.e. training a separate good/bad image classifier); detecting anomalies from the dataset (e.g. by finding the nearest neighbours, or reconstructing the image using an autoencoder); or identifying new classes not in the original dataset. Recent work has developed a novel evaluation approach and concluded that current approaches are unreliable in practice. In this PhD, existing approaches would be investigated in the context of medical image analysis - focussing on mammography and prostate MRI - and novel algorithms would be explored.
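The notion of "apparent confidence" can be made concrete with a small sketch of the simplest possible baseline: computing softmax probabilities from a classifier's raw scores and refusing to predict when the highest probability is below a threshold. This is only an illustration of the quantity the description refers to, not one of the approaches the project would develop; the function names and the threshold of 0.9 are assumptions.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(logits, threshold=0.9):
    """Return the index of the most probable class, or None when the
    highest softmax probability falls below the (assumed) threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


print(predict_or_reject([10.0, 0.0, 0.0]))  # 0: accepted with high confidence
print(predict_or_reject([1.0, 0.9, 0.8]))   # None: rejected as uncertain
```

A high softmax probability only reflects the relative size of the scores, so an input far from the training data can still pass such a threshold with a wrong label, which is the silent failure described above.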

Bangor University

Project title: Learning from Badly Behaving Data

1st supervisor: Prof Lucy Kuncheva
2nd supervisor: Dr Franck Vidal
Department/Institution: School of Computer Science and Electronic Engineering, Bangor University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Focusing on deep learning systems, this PhD will investigate the modern data challenge of data "behaving badly". In addition to coming in massive volumes, data can be streaming, drifting, partially labelled, multi-labelled, contaminated, imbalanced, wide, and so on. A prime example of considerable interest is image and video analysis where the same object, person, or animal must be detected, learned, identified and then re-identified in the subsequent image collection or video stream. To solve this problem, we should look into semi-supervised learning in the presence of concept drift, adaptive learning, transductive learning, and more. Deep learning neural networks may prove valuable at the stages when large labelled data sets have been accumulated. Given that multiple objects of interest may be present within the same image, methods from the area of restricted set classification should be explored. This project will seek to offer novel and effective solutions for "badly behaving" data. Where possible, we will aspire to offer theoretical grounds for those solutions to ensure transferability across application domains. A curious potential application is identification of individual animals in a herd or a group for the purposes of non-invasive monitoring. Such an application will cross over to the area of environmental studies, specifically ecosystem conservation and behavioural ecology.

Project title: Developing Artificial Intelligence Techniques to Improve Hydrological Model Regionalisation

1st supervisor: Dr Sopan Patil
2nd supervisor: Dr Panagiotis Ritsos
Department/Institution: School of Natural Sciences / School of Computer Science and Electronic Engineering, Bangor University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The focus of this PhD is to develop AI techniques that can help improve hydrological model regionalisation. Specifically, the research will investigate novel use of AI and information visualization to interactively relate hydrological model parameters to the physical properties of river basins. Hydrological models are essential tools for simulating streamflow in river basins and are widely used for forecasting floods and droughts. However, appropriate application of hydrological models requires a priori calibration of parameters using historical measured streamflow data. To make matters worse, previous research has shown that hydrological model parameters are not strongly correlated with the physical properties of river basins (e.g., topography, soils, land use). This limits the ability to regionalise hydrological models, i.e. to estimate model parameters at ungauged river basins or modify parameter values if land use changes in a river basin. Recent advances in Artificial Intelligence (AI), specifically in Deep Learning, have resulted in the ability to provide efficient high-dimensional interpolators that can handle data of multiple dimensions and heterogeneous information, such as those encountered in hydrological modelling. Our approach will involve the development of Deep Learning techniques to extract high-level abstractions in hydrological models and physical river basin data, which can be used to test the impact of land management decisions on river basin hydrology. This abstraction will be made available to relevant stakeholders via an interactive visualization interface to facilitate the investigation of multiple hydrological and land-use change scenarios using interpolation, classification and, where possible, generalization. Our training dataset will include data from >1000 river basins across the UK, and the coupled AI-hydrological modelling workflow will be streamlined to operate on an HPC framework.
This research will advance the field of AI through application of novel techniques for hydrological model regionalisation and help improve the assessment of land management decisions on flood/drought risk.

Project title: Automated data cleaning, analysis and visualization from smartphone captured data for climate change

1st supervisor: Dr Simon Willcock
2nd supervisor: Dr William Teahan, Prof Jonathan Roberts
Department/Institution: School of Natural Sciences / School of Computer Science and Electronic Engineering, Bangor University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: This research will investigate AI techniques to automate data cleaning, analysis and visualization of results from smartphone-captured data in the area of population growth, sustainability and climate change. It builds on collaborations between Natural Sciences and Computing, and requires bespoke AI, Natural Language Processing (NLP) and visualisation solutions to be developed. Faced with population growth and climate change, sustainability has become one of the most important global challenges. To address this, we need high-resolution spatiotemporal data on the state of natural systems and on how we as a society are impacting them. Whilst artificial intelligence (AI) techniques are well established for the former (e.g. providing standardised hourly/daily/weekly data and analyses ranging from site-specific sensors up to remote satellites), most societal data are of extremely limited spatial and/or temporal resolution. To address this, we have pioneered the use of smartphone-based surveys in two countries with large sustainable development challenges: Bangladesh and Cambodia (https://msds.tools/). Our existing data demonstrate that real-time social data collection at large scales is now feasible and affordable. However, in order to up-scale these surveys, AI techniques to automate data cleaning, analysis and visualization of results are required, building on and enhancing bespoke AI, Natural Language Processing (NLP) and visualisation solutions developed at Bangor (see references below). Here, we will develop these techniques further. Machine learning (ML) algorithms automating data cleaning will be developed. Novel AI techniques will also be developed to analyse different types of social data, including quantitative (e.g. how much water was collected), qualitative (e.g. free text of how the water was used), spatial (e.g. locations of where the water was collected, or a GPS track of how to get there) and image (e.g. a photo of the water source) data. Such analyses will bring about an unprecedented level of survey data availability and so will require advanced visualization techniques a) to understand the results ourselves, and b) to share these results back to the survey respondents via their smartphones. Finally, we will further advance survey methods by developing an AI approach that lets a survey evolve iteratively depending on the previous answers from each respondent, so that future widespread, generalized surveys adapt to earlier responses and become increasingly tailored to each individual over time.

Project title: Automated optimisation of industrial X-ray Computed Tomography

1st supervisor: Dr Franck Vidal
2nd supervisor: Dr Simon Middleburgh
Department/Institution: School of Computer Science and Electronic Engineering, Bangor University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: This PhD will investigate how to automate the parametrisation of non-destructive testing (NDT) with Computerised Tomography (CT) for customised components. High-performance computing will be used to scan and tune multidimensional parameters, which is challenging using today's algorithms. Predictive analytics and machine learning techniques will be deployed to accelerate the process by reducing the number of simulations required. The PhD will extend the current state of the art, where we have already demonstrated that it is possible to model scanning parameters to simulate highly accurate CT data acquisition processes. An additional aim is to automatically design the geometry of the holder on which the scanned object will be placed during the CT scan, by combining high-performance X-ray simulation on GPU with mathematical optimisation. This approach will generate a large number of scanning parameters and corresponding simulated data. This multi-disciplinary PhD combines simulation, optimisation, machine learning and material science at Bangor University and Swansea University. The potential impact of this research is to reduce the time needed to perform a successful NDT examination by porting some of the time-consuming manual experiments (finding the right parameters, and finding the right holder geometry) into a fully automated computerised framework.

Project title: Programme and Curriculum Analytics

1st supervisor: Dr Cameron Gray
2nd supervisor: Dr Dave Perkins
Department/Institution: School of Computer Science and Electronic Engineering, Bangor University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: This research will investigate appropriate Machine Learning algorithms, tools and processes and apply them to examine how educational design influences both achievement and experience in modules and programmes. From there, models will be generated to predict the success and popularity of any proposed curriculum or programme changes. A necessary part of the project will also be examining how best to communicate any insight generated to academics to promote necessary change and best practice. There is future scope to meld this information with other forms of Learning Analytics (LA) using ensemble methods in order to predict student metrics more accurately. Indeed, Learning Analytics is already gaining ground within UK institutions, with their renewed focus on data, on the possibilities of big data, and on using it to drive decisions and influence policies. Research within the LA sphere is now shifting from prediction of student achievement and retention to other aspects of student engagement.

University of Bristol

Project title: Machine Learning to Find New Physics in Muon Decays

1st supervisor: Prof Joel Goldstein
2nd supervisor: TBD
Department/Institution: Particle Physics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: The Mu3e experiment at PSI will look for extremely rare muon decays; in particular it is designed to try to identify the lepton flavour-violating decay of a muon to three electrons at the level of one event in 10^16. The experiment will use the latest advances in detector technology to identify electrons with high spatial and temporal resolution, and advanced pattern recognition algorithms will be implemented electronically to filter the data in real time. In this project the student will apply the latest developments in machine learning to Mu3e event reconstruction and filtering, developing new techniques that could be faster, more flexible and/or more effective than conventional algorithms. This could lead not only to the optimisation of the physics reach for the three-electron channel, but also the capability to perform real-time detailed analysis to look for different signatures. The student will start by developing and optimising algorithms in simulation, and then will have the opportunity to commission and test them in early data from the running experiment.

Project title: New Physics searches in B and D meson decays with Machine Learning

1st supervisor: Dr Kostas Petridis
2nd supervisor: TBD
Department/Institution: Particle Physics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: This project aims to discover physics beyond the Standard Model (SM) by using advanced machine learning techniques to study a vast number of B- and D-hadrons (bound states of beauty or charm quarks, respectively) with unprecedented precision at current and future CERN facilities. The proposed research has two main branches: 1) development of GPU-based 4-body amplitude fits of decays of B- and D-hadrons using TensorFlow; 2) development of fast simulation of both the collisions and the response of particle physics detectors using Generative Networks.

Project title: Real time radiotherapy verification

1st supervisor: Dr Jaap Velthuis
Department/Institution: Particle Physics/Detectors, University of Bristol
2nd supervisor: Dr Richard Hugtenburg (Swansea University)
Research theme: T2 - biological, health and clinical sciences

Project description: We are currently developing a device that will be operated upstream of the patient during radiotherapy treatment to verify in real time the beam shape, which changes continually, and the dose map. This requires fast online analysis and many Monte Carlo (MC) simulations to verify the treatment. This MC generation is fairly 'inefficient' as the photon cross section is very low. We are therefore looking at alternative ways to do this. In addition, we expect that the device will be installed in many radiotherapy centres. Clever data mining will allow the systems to use anomaly detection for preventive maintenance; more interestingly, by combining the data from several centres we can disentangle misbehaving sensor systems from misbehaving X-ray linacs. The key challenge is to get the individual systems to learn to signal these faults while sharing as little data as possible, e.g. for privacy reasons. This is very important as wrongly delivered radiotherapy treatments are extremely dangerous. As such, this project combines three different but closely linked big data and machine learning challenges.

Project title: Using Machine Learning to Explore the Evolution of Active Galaxies with Euclid

1st supervisor: Dr Sotiria Photopoulou
Department/Institution: Astrophysics, University of Bristol
2nd supervisor: Prof Malcolm Bremer
Department/Institution: Astrophysics, University of Bristol
Research theme: T1 - data from large science facilities

Project description: Euclid is a European Space Agency (ESA) M-class mission, aiming to uncover the nature of the Dark Universe. This space telescope will map the majority of the extra-Galactic sky (15,000 sq. deg.) in the optical and near-infrared bands with excellent spatial resolution. The combined data of Euclid and ground-based observations, e.g. with the Large Synoptic Survey Telescope (LSST), will form possibly the largest astronomical dataset of the next decade, with 10 billion detected sources. This PhD project pertains to the preparation and exploitation of Euclid data. Specifically, the candidate will be part of the Photometric Redshift Organizational Unit (OU-PHZ) and the Galaxy and AGN Evolution Science Working Group (GAE-SWG). In anticipation of the Euclid launch (~2022), we will work with existing large public datasets (ESO VISTA Public Surveys, KiDS, DECaLS, Pan-STARRS, etc). The focus points of this project include - but are not limited to - source classification with machine-learning methods, and AGN/galaxy coevolution studies.

Cardiff University

Project title: Empowering supernova astronomy with Artificial Intelligence

1st supervisor: Dr Cosimo Inserra
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Supernovae are catastrophic stellar explosions shaping the visible Universe and affecting many diverse areas of astrophysics. Supernovae arising from massive stars, referred to as core-collapse supernovae, play a major role in many intriguing astronomical problems since they produce neutron stars, black holes, and gamma-ray bursts. We are now living in the golden era of transient astronomy, with roughly 11,000 transients discovered per year. The advent of the Large Synoptic Survey Telescope will boost the number of yearly discoveries by a factor of 100. Task-specific algorithms employed until now for transient classification have limitations in taming the zoo of transients. The main project goal is to develop an Artificial Intelligence tool (deep learning algorithm) that can process time-series (e.g. luminosity evolution) and non-time-series (e.g. environment information) data and that can identify core-collapse supernovae less than 15 days after explosion, which is when we can retrieve crucial information about the nature of the progenitor. A secondary goal is to build such an AI tool in a way that is scalable enough to be applied to the environment of compact-star mergers producing gravitational waves. This application can predict the merger type (what objects are merging and their masses) and allow for rate and population studies at large distances.

Project title: Examination of SARS-CoV-2 severity, transmissibility and spread within Wales through the analysis of linked patient health records and genomic sequence data

1st supervisor: Dr Tom Connor
2nd supervisor: TBC
Department/Institution: Cardiff School of Biosciences, Cardiff University
Research theme: T2 - biological, health and clinical sciences

Project description: As part of the COVID-19 response we have sequenced in excess of 5,000 SARS-CoV-2 genomes, representing approximately 40% of all Welsh COVID-19 cases. We have linked the viral genomic data we have generated to the linked, anonymised, patient health record dataset held by SAIL. Within our sequenced dataset, we have identified a set of variants that are present at varying frequencies across our dataset. We have already undertaken analyses to examine one of these - a mutation in the spike protein which may increase viral transmissibility. This project will focus on extending this work through the examination of the enormously rich and complex SAIL dataset to perform linked analyses of viral sequence data with the extensive linked medical records within SAIL in order to quantify the effect of viral variants on disease severity and transmission in Wales. This project makes use of a dataset of unprecedented size and scope, and will provide opportunities to extend the approaches developed and deployed to examine other pathogenic species for which whole genome sequence data is available in Wales.

Project title: AI for Gravitational Waves environments

1st supervisor: Dr Cosimo Inserra
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Core-collapse supernovae are the final, explosive demise of massive stars and are responsible for black hole formation. As a consequence of the prevalence of binarity amongst massive stars, they provide the leading progenitor channel for producing compact object binary systems with two black holes. General Relativity tells us that their binary orbit will shrink owing to energy losses via gravitational-wave (GW) emission. Following this shrinking, during the final few orbits, a prominent gravitational waveform is produced. Mergers of compact binaries therefore represent the true final fates of massive stars. However, unlike mergers in which a neutron star is involved, the merger of two black holes produces no electromagnetic emission, owing to their intrinsic nature. Hence, every effort so far to link a gravitational waveform produced by a black-hole merger to astrophysical information and/or its progenitor stars has produced null results. The project will focus on the environment of black-hole mergers, which is the only way to retrieve useful astrophysical information from such events. It will retrieve information on galaxies in the likelihood region of previous GW mergers via electromagnetic spectroscopy. Machine learning approaches will then be used to search for any link between the environmental information and the information retrieved from the waveform. Once the dataset is rich enough, an Artificial Intelligence algorithm will be built to predict what kind of galaxy is likely to host future, distant black hole mergers.

Project title: Using AI to study black holes and neutron stars via gravitational waves

1st supervisor: Dr Vivien Raymond
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Ripples in the fabric of space-time, called gravitational waves, carry information about invisible matter in the universe, and can allow for extreme tests of modern physical theories. Since 2015, we have been able to measure those ripples directly with gravitational-wave observatories such as LIGO and Virgo. This project will use machine learning, neural networks and artificial intelligence to develop efficient and evolving techniques to infer the properties of future detections. Detectors are becoming so sensitive, and signals so frequent and complex, that traditional manual analyses will soon no longer be practical. Instead, AI can be leveraged to extract the scientific information from a growing set of gravitational-wave detections, while adapting to changes and unknowns in the detectors and the data.

Project title: Machine learning for big astronomical data

1st supervisor: Dr Mikako Matsuura
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: While modern telescopes have made large-area surveys of the Galaxy and nearby galaxies, it is a challenging task to extract and pinpoint the exact locations of the objects we are looking for in these surveys. Recently, machine learning techniques have been developed for pattern recognition in images, which could help to identify and classify astronomical objects. The project is to develop and optimise machine learning to identify and classify supernova remnants, star-forming regions and young stellar objects in Herschel and NIKA surveys of the Galactic plane and nearby galaxies. The extracted data will be used to examine whether supernova remnants compress the surrounding gas and trigger star formation nearby.

Project title: Observing and understanding gravitational wave signals from black hole and neutron star mergers

1st supervisor: Prof Stephen Fairhurst
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Following the first observation of gravitational waves in 2015, we have now observed around 50 signals from merging black holes and neutron stars. During your PhD, we expect to observe hundreds more signals. This PhD project involves looking for the most unusual signals, such as those from highly spinning black holes exhibiting orbital precession, or from binaries in which the masses of the two components are very different. The project will involve obtaining a better understanding of the gravitational waves emitted by such systems, developing machine learning and AI techniques to ensure that our searches are capable of identifying these signals buried in the detector data, and ensuring that the astrophysical parameters of the system, such as the masses and spins of the black holes, are accurately measured. The search results will be used to obtain a better understanding of how these black hole and neutron star binaries are formed.

Project title: Disentangling dark matter and baryons in the era of exascale astronomy

1st supervisor: Dr Tim Davis
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Our universe is a dark and mysterious place. Only around 15% of the matter density in our universe is baryonic, while the rest is dark matter, an unknown substance which seems to interact with visible matter only gravitationally. This dark matter component is very important in the formation and growth of galaxies, defining the "cosmic web" backbone of our universe. One of the key pieces of evidence for dark matter comes from the rotation of gas in galaxies, as revealed by radio telescopes. At large radii this cold gas rotates much faster than it should, revealing the presence of a massive halo of dark material. Disentangling this signal and measuring the amount of dark matter typically requires an intensive manual modelling process. However, in the next decade new large projects such as the Square Kilometre Array (SKA) will deliver an exabyte of data each day, meaning that highly efficient, automated, unsupervised algorithms will need to be developed to disentangle the dark matter signal and better constrain the properties of this mysterious part of our universe. A student taking on this project would develop new methods to model the kinematics of galaxies, using convolutional neural networks, autoencoders, and other novel algorithms to build the tools we need to take best advantage of the upcoming era of exascale astronomy.

Project title: The gaseous haloes around massive galaxies in the era of Big Data

1st supervisor: Dr Freeke van de Voort
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: Cosmological, magnetohydrodynamical simulations aim to capture all relevant processes that lead to the large-scale structure of the Universe and the formation of galaxies. They have to be run with massively parallel code on high-performance computing clusters. New methods have been developed to increase the resolution of the environments of galaxies, called the circumgalactic medium (CGM), a thousandfold, launching this field into the Big Data regime. Unsurprisingly, the huge amount of data, hundreds of terabytes, can no longer be handled with traditional analysis techniques. Novel approaches are therefore required for the analysis and visualisation of these data, such as Principal Component Analysis and machine learning, but also new ways of designing databases that enable the most efficient storage of and access to the data. This PhD project will focus on massive galaxies and their environments, with simulations that go far beyond the state of the art. The PhD student will take charge of developing new analysis and visualisation techniques and have the opportunity to exploit this unique dataset to the fullest, to study the multiphase structure of the CGM, the flow of gas into and out of massive galaxies, their chemical enrichment, and the feedback from supermassive black holes and supernovae. The PhD student will be able to reveal the crucial role the CGM plays in the formation of massive galaxies as well as provide the community with better ways of storing, analysing, and visualising the simulations of the future.

Project title: Investigating the Epoch of Galaxy Formation Using Artificial Intelligence

1st supervisor: Prof Steve Eales
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: We recently completed the largest far-infrared survey of the extragalactic sky, the Herschel ATLAS, which detected almost 500,000 sources, ranging from nearby galaxies to dust-enshrouded galaxies at redshifts > 4 seen during their initial galaxy-building burst of star formation. NASA and ESA currently have no plans for a future far-infrared space telescope, and so our survey is likely to remain the main source of information about the far-infrared and submm sky for several decades. The poor angular resolution of the Herschel Space Observatory meant that we faced a major challenge in identifying the optical counterparts to the far-infrared sources. We used a simple Bayesian technique that took account of the distance of the possible counterpart from the far-infrared source and the optical magnitude of the counterpart (a fainter counterpart is more likely to be close to the far-infrared source by chance). The H-ATLAS team (160 members in 16 countries, led from Cardiff) released all their catalogues last year but there is still a huge amount to be done. First, lack of time meant that we never looked for counterparts at all in our largest field (~200,000 sources). Second, there are several new, deeper sets of images available on which to look for counterparts. Third, the rapid development of machine-learning techniques means that we should be able to develop a method that uses all the properties of the potential counterpart (its flux densities in all the available photometric bands, not just the flux density in a single band) to estimate the probability that it is associated with the far-infrared source. The student will initially produce a set of training data for the identification analysis using the much deeper and smaller (in area) COSMOS field where we can use deep radio data to identify all the counterparts.
The student will then use a neural network, trained using the COSMOS data, to find the most probable counterparts to all the far-infrared sources. The student will write this up as a paper and release the catalogues of counterparts to the worldwide astronomical community. If time permits, we will proceed to deep learning techniques.
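As an illustration of the kind of simple Bayesian matching described above, the sketch below scores each candidate by a likelihood ratio that combines its offset from the far-infrared position with its magnitude. The functional form (a 2-D Gaussian positional error), the function names and every numerical value are assumptions chosen for illustration, not the H-ATLAS team's actual implementation.

```python
import math

def positional_density(offset_arcsec, sigma_arcsec):
    """Assumed 2-D Gaussian density (per arcsec^2) of finding the true
    counterpart at a given offset from the far-infrared position."""
    norm = 2 * math.pi * sigma_arcsec ** 2
    return math.exp(-offset_arcsec ** 2 / (2 * sigma_arcsec ** 2)) / norm

def likelihood_ratio(offset_arcsec, sigma_arcsec, q_mag, n_mag):
    """Ratio of 'true counterpart' to 'chance alignment' probability.

    q_mag: assumed probability density that a true counterpart has this magnitude.
    n_mag: assumed surface density (per arcsec^2) of unrelated objects at this
           magnitude; it is larger for fainter objects, which lowers the ratio.
    """
    return q_mag * positional_density(offset_arcsec, sigma_arcsec) / n_mag

def reliabilities(ratios, q0):
    """Probability that each candidate is the true counterpart, given all the
    candidates; q0 is the assumed fraction of sources with a visible counterpart."""
    total = sum(ratios) + (1.0 - q0)
    return [lr / total for lr in ratios]

# Hypothetical candidates near one far-infrared source (illustrative values only).
candidates = [
    {"offset": 1.0, "q_mag": 0.10, "n_mag": 0.001},  # close and bright
    {"offset": 4.0, "q_mag": 0.10, "n_mag": 0.010},  # further away and fainter
]
ratios = [likelihood_ratio(c["offset"], 2.4, c["q_mag"], c["n_mag"]) for c in candidates]
probs = reliabilities(ratios, q0=0.6)
```

A machine-learning approach of the kind proposed would replace the single-band magnitude term with a model that uses all available photometric bands.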

Project title: Developing an automatic supernova and star-forming region detector

1st supervisor: Dr Mikako Matsuura
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T1 - data from large science facilities

Project description: AI is a very powerful tool for investigating and processing large quantities of astronomical data. Using image recognition software supported by AI, we will develop software for the automatic identification of supernovae and star-forming regions. The project uses the existing catalogue from the Herschel Space Observatory's Galactic plane survey as a starting point, and will find further supernovae and star-forming regions. We anticipate finding differences in dust properties between these two types of region and hence understanding the evolution of dust in the interstellar medium. It is also expected that the project can capture events in which supernovae trigger star formation.

Project title: Inferring brain tissue microstructure from standard structural imaging

1st supervisor: Dr Leo Beltrachini
2nd supervisor: TBD
Department/Institution: School of Physics and Astronomy, Cardiff University
Research theme: T2 - biological, health and clinical sciences

Project description: The characterisation of brain structure in vivo and non-invasively is crucial for understanding biological processes in health and disease. One technique used to perform such a characterisation is diffusion Magnetic Resonance Imaging (dMRI), which can depict brain tissue structure in exquisite detail. Despite all the advantages provided by dMRI, it has a major limitation: the acquisition time. For this reason, dMRI measurements are not usually part of routine imaging protocols, which usually focus on more standardised structural measurements. This limits the usability of clinical data for performing accurate computational experiments to better inform medical diagnosis. There is therefore an interest in inferring brain tissue microstructure from available information, such as standard structural images. If achieved, this would also be of unprecedented importance for speeding up dMRI acquisitions by using the generated data as prior knowledge in the corresponding post-processing analysis. In this project, the student will address the issue by using artificial intelligence tools to predict personalised information on brain tissue micro-architecture using statistical models based on existing, high-quality data. This idea is grounded on the hypothesis that there exists a relationship between tissue anisotropy on the one hand, and scalar magnetic properties (e.g. T1, T2) and morphology on the other, which will be modelled statistically based on training data available in CUBRIC. This will be done by designing and implementing a data-driven statistical learning framework based on partial least squares regression models. An implementation on graphics processing units (GPUs) will be pursued to reduce the computing time.
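To make the partial least squares idea concrete, here is a minimal single-component PLS regression sketch in plain Python. It assumes each row of X holds scalar properties for one voxel (for example T1 and T2) and y holds the corresponding anisotropy value; the function names and toy numbers are hypothetical, and a real framework would use several components, many more predictors and real training data.

```python
def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model: project the centred predictors onto the
    direction of maximum covariance with y, then regress y on that score."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]

    # Weight vector: covariance between each predictor and the response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0:
        raise ValueError("predictors are uncorrelated with the response")
    w = [v / norm for v in w]

    # Latent scores and the regression coefficient of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, b

def predict(model, x):
    """Predict the response for one new sample x."""
    x_mean, y_mean, w, b = model
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + b * score

# Toy data (hypothetical): the response depends only on the first predictor.
X = [[1.0, 0.5], [2.0, 0.5], [3.0, 0.5], [4.0, 0.5]]
y = [2.0, 4.0, 6.0, 8.0]
model = fit_pls1_one_component(X, y)
prediction = predict(model, [5.0, 0.5])
```

With correlated predictors, as is typical for imaging data, the latent-score step is what distinguishes this from ordinary least squares.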

Project title: ST-AI; Application of AI approaches to improve patient outcomes for sexually transmitted infections

1st supervisor: Dr Thomas Connor
2nd supervisor: TBD
3rd supervisor: Dr Zoe Couzens (Public Health Wales, Health Protection)
Department/Institution: School of Biosciences, Cardiff University
Research theme: T2 - biological, health and clinical sciences

Project description: Neisseria gonorrhoeae (NG) poses a major public health challenge. It is the second most frequently diagnosed sexually transmitted infection in Europe, and isolates are increasingly resistant to key treatments - ceftriaxone and azithromycin. Increasing resistance is driven by a multitude of factors. The nature of sexual behaviour and of the way that patients present for care for sexually transmitted infections, combined with variable provision of care for STIs across the UK, complicates the delivery of targeted treatments, and may affect the increase in resistance that is being observed. This study seeks to utilise AI approaches to improve our ability to type, track and treat NG infections in Wales. It builds upon work already being undertaken within Public Health Wales, and seeks to extend what is currently possible. Broadly, it has three interrelated elements, the outcomes from which will provide a route to improved patient care: 1) The interrogation of population-level health data to identify complex risk factors that relate to NG disease; 2) The linking of genomic sequence data to population-level health data to inform the development of molecular tests and to gain increased resolution on NG risk factors; 3) The development of systems to perform risk assessment of a patient in real time, using information collected from patients via an online client to either reassure a patient or trigger the sending out of a self-test kit and a visit to an STI clinic. Depending on background, the successful candidate will begin work examining the population-level patterns and risk factors of NG disease using SAIL, and then move on to either utilising genomic sequence data or undertaking the research that will underpin the potential patient-facing system.

Project title: Human-Machine Collaboration with Deep Learning Agents

1st supervisor: Prof Alun Preece
2nd supervisor: TBD
Department/Institution: Cardiff University School of Computer Science and Informatics in cooperation with the Cardiff University Crime and Security Research Institute
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Recent advances in artificial intelligence (AI) and machine learning (ML), especially ML based on deep neural networks (so-called deep learning), have led to a range of successful applications in tasks such as image recognition, classification, and anomaly detection. However, it is widely acknowledged that in many applications, effective task performance involves a combination of machine intelligence and human judgement; i.e., collaboration between human and machine agents. The first generation of deep learning systems suffered from being 'black boxes', with minimal ability to explain their decisions, making them of limited use in human-machine collaboration applications. This situation is steadily improving, with increasingly sophisticated explainable AI (XAI) techniques. However, there is still a lack of knowledge about how best to equip a deep-learning-based machine agent with XAI capabilities geared to help improve task performance in a human-machine team. Specifically, we are interested in applications that involve some combination of the following factors that complicate the problem: (1) decision-making where the ML agent needs to process multimodal data (for example, including imagery, audio, and text classification) so an explanation must itself be multimodal and also will likely involve the 'fusion' of data from multiple modalities; (2) decision-making where the machine agent needs to learn from the human agent (for example, the human 'tells' the machine some important piece of information to modify its model of the world); (3) decision-making where the human agent needs to learn from the machine agent (for example, the machine needs to communicate and hence transfer some 'insight' to the human, allowing them to have the same insight themselves in future). This PhD is suitable for someone keen to gain in-depth knowledge of state-of-the-art deep learning and XAI, and with an interest in human-computer collaboration in general.
The project will be carried out in collaboration with IBM UK, Dstl, Cardiff University Crime and Security Research Institute and with international cooperation (University partners in the US via the Distributed Analytics and Information Sciences International Technology Alliance).

Project title: Adaptive Neural Networks through Epigenetic Processes

1st supervisor: Prof Roger Whitaker
2nd supervisor: TBD
Department/Institution: Cardiff University School of Computer Science and Informatics in cooperation with the Cardiff University Crime and Security Research Institute
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The recent increased availability of both data and computation has led to global interest in artificial neural networks - these have been shown to be highly effective in creating artificial intelligence, particularly when functioning at scale (e.g., depth). Deploying neural networks typically involves training to establish appropriate weights between inputs and outputs. However, such training is scenario dependent, with retraining needed if there is a significant change in the underlying environment. Therefore it is highly desirable for future neural networks to have the ability to self-adapt in relation to changes in their deployment scenario, so that flexibility is apparent and performance is maintained without full-scale retraining. This will become more important as AI engages with dynamic situations where neural networks may ideally need to learn and adapt with greater agility. In this project the aim is to take inspiration from genetic algorithms (GAs) for neural network self-adaptation. Longstanding work has demonstrated that GAs can successfully be used to evolve a high-performance population of neural networks (e.g., Evolving Neural Networks through Augmenting Topologies - NEAT) while more recent work by Uber AI Labs has demonstrated that genetic algorithms are a remarkably competitive alternative for training deep neural networks for reinforcement learning. The project will involve working in the relatively new field of epigenetic algorithms, and using these to promote adaptation of neural networks that are represented in a genetic form. Genetic processes inspire epigenetic algorithms, where the interaction with the environment influences the genes held by an individual between cycles of population reproduction. Epigenetics is a phenomenon that is now recognised in biology, pharmacology and medicine with many applications.
The methodology will involve developing suitable representations, epigenetic techniques and evaluation methods to explore the problem of self-adaptation in neural networks. This is an ambitious goal and this project will seek to establish fundamental steps in this area by assessing benchmarks that are well understood. This PhD is suitable for someone keen to gain in-depth knowledge of state-of-the-art deep learning and neuroevolution, and with an interest in developing future AI with new and general capabilities. The project will be carried out in collaboration with IBM UK, Dstl, Cardiff University Crime and Security Research Institute and with international cooperation (University partners in the US via the Distributed Analytics and Information Sciences International Technology Alliance).

Project title: Exploiting network motifs to enhance prediction of contagion in complex networks

1st supervisor: Prof Roger Whitaker
2nd supervisor: Prof Alun Preece
Department/Institution: Cardiff University Crime and Security Research Institute
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Network motifs are small induced sub-structures that are over-represented in a directed network, compared to what can be expected against some baseline (e.g., at random). Motifs are useful for characterising complex networks, which may be too large or dynamic for other types of network analysis. Network motifs have been established as a useful methodology for determining the underlying and often hidden characteristics of a network. This project will consider using motifs as a basis to predict the susceptibility of a network to different forms of contagion (e.g., both simple and complex contagion). The work will be undertaken in close collaboration with Crime and Security Research Institute researchers, using large-scale data sources to investigate the potential of motifs to offer advance warning of different forms of social contagion in a variety of networks and scenarios. These will centre on, but will not be restricted to, social media, and will consider the potential to address dynamic and (near-) real time scenarios. The project will involve considering a range of prediction strategies, based on supervised (and potentially other types of) learning.
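The notion of over-representation against a baseline can be sketched in a few lines of Python. The example below counts one particular three-node sub-structure, the induced feed-forward loop, and compares the count with uniformly random directed graphs that have the same number of nodes and edges. The choice of motif, the baseline and all function names are assumptions made for illustration; degree-preserving randomisation is a common alternative baseline.

```python
import random
from itertools import combinations, permutations

def count_feed_forward_loops(nodes, edges):
    """Count induced feed-forward loops: a->b, b->c and a->c are present and
    no other edge exists among the three nodes."""
    edge_set = set(edges)
    count = 0
    for trio in combinations(nodes, 3):
        present = {(u, v) for u, v in permutations(trio, 2) if (u, v) in edge_set}
        if len(present) != 3:
            continue
        # At most one ordering (a, b, c) can match, so each trio counts once.
        if any({(a, b), (b, c), (a, c)} == present for a, b, c in permutations(trio)):
            count += 1
    return count

def random_directed_graph(nodes, n_edges, rng):
    """Baseline assumption: edges drawn uniformly without replacement."""
    possible = list(permutations(nodes, 2))
    return rng.sample(possible, n_edges)

def motif_statistics(nodes, edges, n_random=200, seed=0):
    """Return the observed count plus the mean and standard deviation of the
    count over the random baseline graphs."""
    rng = random.Random(seed)
    observed = count_feed_forward_loops(nodes, edges)
    baseline = [
        count_feed_forward_loops(nodes, random_directed_graph(nodes, len(edges), rng))
        for _ in range(n_random)
    ]
    mean = sum(baseline) / n_random
    std = (sum((x - mean) ** 2 for x in baseline) / n_random) ** 0.5
    return observed, mean, std

# Small hypothetical network containing one feed-forward loop (0 -> 1 -> 2, 0 -> 2).
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
observed, baseline_mean, baseline_std = motif_statistics(nodes, edges)
```

A z-score such as (observed - baseline_mean) / baseline_std, computed whenever baseline_std is non-zero, is one simple way to turn these numbers into a feature for a downstream predictor.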

Swansea University

Project title: Machine learning for lattice QCD

1st supervisor: Prof Gert Aarts
2nd supervisor: Prof Chris Allton, Prof Biagio Lucini
Department/Institution: Physics Department, Swansea University
Research theme: T1 - data from large science facilities

Project description: Lattice QCD aims to understand the strong force via the use of large-scale numerical simulations of QCD. Recently there has been increased interest in applying machine-learning techniques to a variety of problems. These include the determination of simulation parameters, the generation of configurations, the analysis of configurations in the absence of an order parameter, spectral reconstruction, and possible uses for theories with a sign problem. In this project we will first explore the (limited) lattice QCD literature on machine learning and subsequently determine which direction to pursue, based on the interests of the student and the promise of the methods.

Project title: Machine learning with anti-hydrogen

1st supervisor: Prof Niels Madsen
2nd supervisor: TBD
Department/Institution: Physics Department, Swansea University and CERN
Research theme: T1 - data from large science facilities

Project description: The ALPHA Antihydrogen experiment makes use of several particle detector technologies, including a Silicon Vertex Detector, Time Projection Chamber, and a barrel of scintillating bars. One of the key challenges for these detector systems is to distinguish between antihydrogen annihilations and cosmic rays, a classification problem to which machine learning is well suited. Presently this task is done by the use of cuts based on two high-level variables from the detectors for online analysis, and boosted decision trees with high-level variables in offline analysis. This project would take a student into the future of machine learning. High-level variables are a powerful tool for discrimination; however, they are slow to pre-process. The challenge of this PhD project would be to build both online and offline analyses that have different processing budgets. Initially the plan is to investigate the application of modern machine learning techniques, such as deep learning, to attempt to beat the current cutting-edge decision tree analysis used by the collaboration. Subsequently the project will expand to look at replacing the high-level variables with lower-level variables to reduce pre-processing time. Ultimately, a small enough model that can interpret raw detector output can make a real-time online analysis, with the final goal of programming an FPGA or micro-controller to perform accurate, real-time classification of detector events. The combination of these projects would build a robust and comprehensive thesis that investigates machine learning applied to particle detectors. It will clearly illustrate that good data preparation is the key to accurate classification models, as well as demonstrate the speed that can be achieved using simple models to handle low-level data.
Demonstration of micro-controller- and FPGA-level classification would have a large impact on the particle detector community, contributing to detector trigger systems and live diagnostics beyond the scope of the ALPHA experiment.

Project title: Machine learning for multidimensional ultrafast x-ray interferometry

1st supervisor: Dr Kevin O'Keeffe
2nd supervisor: Dr Adam Wyatt
Department/Institution: Physics Department, Swansea University and Central Laser Facility
Research theme: T1 - data from large science facilities, T3 - novel mathematical, physical and computer science approaches

Project description: Observing the dynamics of molecular systems on their natural timescale is a fundamental challenge in physics and chemistry. Recently, multidimensional interferometry using ultrafast x-ray pulses has emerged as a powerful method for tracking the motion of electrons during the first few femtoseconds of a light-atom interaction. This technique records the spatially and spectrally resolved interference pattern from two laser-generated x-ray sources at multiple source positions, providing access to phase information crucial for resolving ultrafast dynamics. Although this technique enables measurements with unprecedented temporal stability, the 4-dimensional interferograms that are generated are highly structured and challenging to analyse. The primary goal of this project will be to develop a machine learning tool capable of reliably identifying the key signatures in the interferogram related to electronic motion in atomic systems such as argon. The algorithm will be trained using simulated interferograms based on strong-field calculations before being applied to real data sets. The algorithm will then be extended to the analysis of interferograms generated using more complex targets such as molecular nitrogen and carbon dioxide. Developing robust methods for extracting data from such interferograms will provide new opportunities for understanding the behaviour of bond formation and breaking at the natural timescale of chemical reactions.

Project title: Enhancing the diagnostic performance of a bowel cancer blood test using advanced machine learning algorithms and the incorporation of information from the patient's medical record

1st supervisor: Prof Peter Dunstan
2nd supervisor: Prof Dean Harris
Department/Institution: Physics Department and Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical and computer science approaches

Project description: This project concerns the extended development of a blood-based diagnostic for bowel cancer by factoring in patient record information. In Europe, colorectal cancer (CRC) is the second most common cancer, with approximately 450,000 new cases per year. CRC is the third most common cancer in the UK, with 60% presenting at a late stage (III/IV). Early diagnosis makes a significant difference to survival rates. The Swansea Biospectroscopy group, led by Professors Harris and Dunstan, has developed an effective blood test based upon laser spectroscopy. The test utilises machine learning algorithms and is trained on spectral pattern recognition to optimise the diagnostic for early detection. It is possible to further improve the algorithms by including patient record information, so that additional patient factors can be used to reduce false positives and eliminate false negatives. In particular, the effect of co-morbidities, clinical features including age and family history of cancer, and the patient's current medication are key factors which the project will aim to incorporate. To further advance the early diagnostic potential of the test, the project can also improve its diagnostic accuracy in the detection of polyps and identify those patients most likely to develop malignancy. The findings of this study will advance the translation of a blood-based diagnostic for bowel cancer into the healthcare system. It is anticipated that the doctoral researcher will become highly skilled in HPC and the development of appropriate machine learning codes using the infrastructure offered by the CDT.

Project title: Imaging Flow Cytometry of Tumour-Educated Platelets For Cancer Diagnostics

1st supervisor: Prof Kenith Meissner
2nd supervisor: Prof Paul Rees (College of Engineering/Swansea University)
External collaborators: Dr Bethan Psaila (Oxford), Dr Henkjan Gersen (Bristol), Dr Christopher Gregory (Edinburgh)
Department/Institution: Department of Physics, Swansea University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical and computer science approaches

Project description: Liquid biopsy - the non-invasive sampling of cancer cell-derived biomarkers from peripheral blood - is showing promise in early detection of cancers. The majority of current approaches focus on circulating tumour cells (CTC), cell free DNA (cfDNA) and extracellular vesicles (EVs), which are technically challenging to isolate and insufficiently sensitive. Platelets have also been investigated as a liquid biopsy tool. These are highly abundant, anucleated cells produced primarily by megakaryocytes in the bone marrow. They are vital for coagulation, but also play key roles in tumour growth, invasion, metastasis and suppression of anti-tumour immunity. Platelets perfuse tumours, adhere to activated tumour endothelium and also coat CTCs in the bloodstream. Platelet coating of CTCs facilitates CTC immune evasion, survival, tethering and invasion at sites of metastasis. Platelet-tumour cell interactions lead to platelet sequestration of tumour-derived biomolecules including proteins and RNA ('tumour-educated platelets' - TEPs). The presence of cancer alters RNA splicing events occurring in platelets, resulting in specific platelet mRNA 'signatures' that have been shown to identify patients with cancers with remarkable sensitivity. TEPs may thereby act as easily accessible, physiological 'sentinels' of malignancy. This project supports an ongoing collaboration between researchers at Swansea University, Bristol University, the University of Oxford and Edinburgh University. The overarching goal of the ongoing research collaboration is to validate TEPs as a platform for early cancer diagnosis and molecular profiling of tumours that is superior to current plasma-based cfDNA techniques.
This project will focus on developing AI/machine learning analysis of Imaging Flow Cytometry (IFC) data on platelets to answer the key question: Are the immunophenotypic and topological properties of platelets altered following exposure to tumour cells or tumour-cell derived EVs? IFC is a high throughput technique that produces brightfield and fluorescence images of individual cells to enable both fluorometric and topological analysis of the cells. The figure shows typical IFC data for resting (top) and activated platelets (bottom). Thus, the analysis can identify changes in the activation state of populations of platelets which can then be correlated with quantifiable individual platelet topological changes, such as area, perimeter, shape and fractal dimension. In this highly interdisciplinary project, the doctoral researcher will be exposed to both technical developments as well as the translation of clinical samples. The doctoral researcher will develop the skills to: (1) fuse fluorometric and topological image analysis techniques; (2) develop AI/machine learning techniques to distinguish sub-populations of cells; (3) optimise AI/machine learning techniques toward diagnostics for early cancer detection. Upon completion of this research project, the doctoral researcher will be positioned at the intersection of data science, physical sciences and medicine.
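As a rough illustration of the kind of topological descriptors mentioned above (area, perimeter, shape), the following minimal sketch computes them for a cell outline given as a polygon. It is not the project's pipeline: the function name `shape_descriptors`, the circularity measure 4πA/P², and the two synthetic outlines are assumptions made purely for illustration, and real IFC images would first need to be segmented into outlines.

```python
import math

def shape_descriptors(outline):
    """Return (area, perimeter, circularity) of a closed polygon outline.

    `outline` is a list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first.
    """
    n = len(outline)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0            # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)  # edge length
    area = abs(twice_area) / 2.0
    # Circularity is 1 for a perfect circle and smaller for irregular shapes.
    circularity = 4.0 * math.pi * area / perimeter ** 2
    return area, perimeter, circularity

# Hypothetical outlines (not real data): a smooth, near-circular shape and a
# spiky shape whose radius alternates between 1.0 and 0.4.
resting = [(math.cos(2 * math.pi * k / 64), math.sin(2 * math.pi * k / 64))
           for k in range(64)]
activated = [((1.0 if k % 2 == 0 else 0.4) * math.cos(2 * math.pi * k / 16),
              (1.0 if k % 2 == 0 else 0.4) * math.sin(2 * math.pi * k / 16))
             for k in range(16)]

for name, outline in [("resting", resting), ("activated", activated)]:
    area, perimeter, circularity = shape_descriptors(outline)
    print(name, round(area, 3), round(perimeter, 3), round(circularity, 3))
```

Descriptors of this kind could serve as per-cell features alongside fluorescence intensities when distinguishing sub-populations; the fractal dimension mentioned above would need a separate estimator (for example box counting) and is not shown here.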

Project title: Development of AI for quantification of Magnetic Resonance Spectroscopy and multi-parametric MRI data

1st supervisor: Dr Sophie Shermer
2nd supervisor: Dr Frank Langbein (Computer Science, Cardiff)
Department/Institution: Physics Department, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: Magnetic resonance spectroscopy (MRS) enables the detection and potentially quantification of metabolites and chemical biomarkers for disease based on non-invasive in vivo imaging techniques. However, the complexity of in vivo spectra, combined with experimental, technical and theoretical limitations, makes robust, reliable quantification of MRS data extremely challenging. Recent work comparing existing tools for quantification of edited spectra has shown a considerable lack of consistency, reliability and accuracy of MRS quantification results compared to ground truth [1], and machine learning has shown promise [2] in improving these results. Your project will build on these preliminary results and aim to develop new AI based techniques for robust quantification of MR spectra. The project will combine experimental design and data acquisition with the development, testing and validation of new algorithms. [1] arXiv:1909.02163 [2] arXiv:1909.03836.

Project title: Predictors associated with the development of multiple sclerosis: A machine learning driven study with linked MS register and routine health care datasets

1st supervisor: Dr Shang-Ming Zhou
2nd supervisor: Mr Rod Middleton
3rd supervisor: Prof Sinead Brophy
Department/Institution: HDRUK, Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: Multiple sclerosis (MS) is the commonest cause of disability in younger people, with a substantial cost to the person with the condition, their family and society as a whole. The development of disease modifying therapies (DMTs) over the last 25 years has revolutionised the approach to the condition, but if the condition could be identified earlier, these therapies could be better used to prevent disability. The ability to identify predictors of MS early would mean we could speed up access to diagnosis and to disease modifying treatment, reducing disability. This study is unique in that it brings together patient reported data and routine data on health records from across England and Wales. This means we have the potential to have patient reported symptoms, laboratory test results, prescriptions, diagnoses, medications and procedures. The MS Register at Swansea University has been collecting Multiple Sclerosis data continuously for 8 years from people with MS and the NHS, making it the largest repository of MS data in the UK. The MS Register contains patient reported data, clinical data and other structured outcome data collected from specialist secondary care clinics, including medications and measures of severity. This project will link data from across the UK, including: General Practice, hospital admission and outpatient data, laboratory (results) data, the UK MS Register and, where appropriate, Office for National Statistics (ONS) data on patient mortality. In this project, machine learning and natural language processing (text mining) techniques will be developed to extract clinical signals and symptoms from clinical notes. These techniques will be used to code text from notes and letters and so enhance the information on patients before diagnosis. This PhD student will examine whether combining the extracted signals and structured data (as in the model above) can improve the predictive value of the model.
This work will help to develop hypotheses as to what factors may be on the causal pathway to developing MS. This PhD project will thus demonstrate the promising application of Machine Learning to healthcare data by bringing together both routine medical data and patient reported data and examining the added knowledge gained from texts. The findings from this work can be extrapolated and applied to other health conditions.

Project title: Understanding and optimizing therapeutics for ovarian cancer through an AI and advanced computing twin

1st supervisor: Prof Steve Conlan (Medical School)
2nd supervisor: Prof Perumal Nithiarasu (Engineering)
3rd supervisor: Dr Shang-Ming Zhou (Medical School)
External: Dr Liam McKnight (Consultant Radiologist, Swansea Bay UHB)
Department/Institution: Medical School and Engineering, Swansea University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical and computer science approaches

Project description: Ovarian cancer (OC) is the seventh leading cause of cancer-related death in women worldwide. It causes around 4,100 deaths annually in the UK, where recurrence rates are up to 75% and the five-year survival rate is only 46%. High vascular permeability and compromised lymphatic drainage result in ascites, the accumulation of fluid in the peritoneum, which is associated with advanced OC. We will develop a new paradigm for effective drug treatment based around microparticle drug delivery that will restrict particles (and therefore drugs carried within them) to the peritoneal cavity. Furthermore, particles will be designed that will preferentially accumulate at cancer sites (i.e. the walls of the cavity). Together this will result in increased efficacy, and reduced cytotoxicity and loss of therapeutic agents from the peritoneal cavity. The project will exploit available computed tomography (CT) images of OC patients to generate virtual models (digital twins) of the ascites-containing peritoneal cavity and surrounding structures. From this, complex fluid flow modelling will be developed to predict microparticle movement within the digital twin and determine the particle geometry needed to 'target' cancer sites. This proposal builds on and integrates academic research in the fields of patient imaging, computational modelling, AI and cancer biology, and will deliver an optimised pathway for CT scan simulation for drug delivery. The AI outputs will be immediately usable by other research/clinical teams, with their application not limited to OC or peritoneal disease but extensible to any CT scan and, through adaptation, magnetic resonance (MR) imaging data.
The project will involve interdisciplinary research between Engineering, Computer Science, Medicine and the NHS. Successful completion will lead to novel drug delivery pathways for OC therapeutics through utilisation of imaging data (20-40 of CT images); an exemplar for how patient data and samples can be used to drive healthcare developments. Machine learning will play a major role in the proposed project. Once the digital twin model is established, we will have plenty of opportunities to generate synthetic data on parametric variations. The information obtained from both synthetic and clinical data will be used to develop LSTM models, as we anticipate strong temporal components in the results. Towards the end of this project we should be able to demonstrate how deep learning can be used in treatment planning. The demonstrated tool will be tested on new geometries to build a better understanding of treatment planning. We anticipate that the proposed treatment planning tool will lead to substantial further funding applications towards implementation.

Project title: Advanced machine learning for unravelling unknown patterns of polypharmacy and multi-morbidity from social media and electronic health records to enhance care of patients with chronic conditions

1st supervisor: Dr Shang-Ming Zhou
2nd supervisors: Prof Andrew Morris, Prof Sinead Brophy
Department/Institution: Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: As the population ages, caring for patients with multimorbidity, and the extent to which their needs are met, are sharp exemplars of the most important tasks facing healthcare services across the world in the 21st century. This project is intended to contribute to solutions to two of the greatest challenges currently confronting healthcare: the linked problems of multimorbidity and polypharmacy. This project will develop and use advanced machine learning and AI techniques to discover previously unknown patterns of polypharmacy and multimorbidity from electronic health records and social media, predict patient cohorts at risk, detect adverse drug events caused by a combination of drugs, and identify patterns of prescriptions for intervention to facilitate drug verification. This project will therefore help gain in-depth knowledge of pharmacovigilance for patient safety, and more insight into what constitutes "best care" for patients with multimorbidity.

Project title: Exploring patterns of polypharmacy in Type 2 diabetes

1st supervisor: Prof Sinead Brophy
2nd supervisor: Dr Shang-Ming Zhou
3rd supervisor: Dr James Coulson (All Wales Therapeutics and Toxicology Centre)
Department/Institution: HDRUK, Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: Type 2 diabetes mellitus is a disease characterised by insulin resistance, pancreatic dysfunction and hyperglycaemia. People with diabetes often have or develop other conditions and are on multiple medications: for example, medications for their diabetes, for protecting the heart (beta blockers, ACE inhibitors) and for cholesterol control, in addition to medications for unrelated conditions (arthritis, asthma, kidney conditions). People with diabetes are exposed to a high degree of polypharmacy, but it is not clear what patterns of drug prescriptions are common or what the long term implications of exposure to multiple drugs are. This study will examine temporal patterns of prescriptions and profile patients into drug pattern groups based on their medications. The study will examine the frequency of specific drug patterns and the outcomes given a specific drug profile pattern, e.g. emergency hospital admissions, coded adverse drug events. This study will use prescription data and dispensing data (from the All Wales Therapeutics and Toxicology Centre) at the GP level and hospital discharge prescription data (this analysis will include identifying medications that are not taken up by the GP but were prescribed from hospital). The PhD student will use linked medications data, GP records and hospital records to identify patterns of drug prescriptions and their temporal relationships. The PhD student will develop temporal data mining techniques to examine adverse events/lack of adverse events associated with specific medication profiles. For example, a profile with statins may include fewer cardiac admission codes than a profile without statins for people with diabetes. This work will greatly inform our knowledge of which polypharmacy patterns patients are exposed to and the prevalence of these patterns among people with diabetes. It will give an evidence base for best practice in treating people with diabetes and so improve patient care.

Project title: Cell morphologies predict compound efficacy and mechanism of action; deep learning approaches in pharmaceutical development

1st supervisor: Lewis Francis
Additional supervisors: Claire Barnes, Deya Gonzalez, Paul Rees
Department/Institution: Swansea University Medical School
Research theme: T2 - biological, health and clinical sciences

Project description: Inefficient drug discovery is a prevalent problem in the pharmaceutical industry, with low numbers of new drugs being approved by the FDA. Current estimations put compound development costs at circa 1 billion dollars. High content screening methods based on disease specific phenotype mapping have begun to supplement and replace more traditional high throughput screening processes to predict drug effect in drug discovery pipelines. Improved scalability, decreasing costs and novel multiparametric analysis supplement high throughput microscopy based cellular imaging and other flow cytometric technologies, incorporating real biological responses earlier in the hit to lead development stage. Ovarian cancer (OC) is the 6th most lethal cancer in women worldwide. Treatment outcomes have improved only marginally over the last fifty years, with the UK 10-year survival rate for ovarian cancer patients reaching only 35%. Disease heterogeneity driven by epigenetic modifications is thought to be central to the therapeutic challenges faced in the clinic, with the 'one drug for all patients' ethos proving ineffective for advanced disease. Through CEAT, a large industry-academic partnership with partners GSK, GE, Bruker, Axis bio and Porvair life Sciences, the Reproductive Biology and Gynaecological Oncology Group (RBGO) in Swansea University has developed large scale, phenotypic data sets interrogating the pharmacogenetic effects of drug compounds in High Grade Serous Ovarian Cancer cells. Complemented by analytical frameworks and high-performance computational resources, this project will open new avenues for data-rich phenotypic profiling of tested small molecule candidates. Developing better drug candidates motivates the focus on integration of novel approaches and new technology development.
Image-based chemical genetics offers great potential to systematically characterize small molecules in human cell-based systems and to probe chemical-genetic interactions in complex and human diseases such as OC. The project will optimise the ability to identify, understand and process OC specific, patient derived phenotypic screening outputs. Applying deep learning approaches to interrogate image based, high dimensional profiling data in patient derived models by integrating drug response and genetic/epigenetic data sets, this PhD student will focus on identifying cancer specific vulnerabilities.

Project title: MAchine LeaRning for The BuIlt ENvIronment (MARTINI)

1st supervisor: Dr Richard Fry
2nd supervisor: Dr Lucy Griffiths
Associate supervisor: Dr Ben Beck, Monash University, Australia
Department/Institution: Swansea University Medical School, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: The built environment (BE) in which we live is integral to human health. Research has shown that residing in 'liveable' neighbourhoods characterised by good access to shops, services and quality parks, connected streets to facilitate walking, sufficient residential densities to support public transport services and local businesses, minimal crime, and opportunities for social connectedness, is associated with improved health outcomes in adults. The volume, quality and availability of remotely sensed data (e.g. Aerial Imagery, Synthetic Aperture Radar (SAR) and Light Detection and Ranging (LiDAR)) has improved significantly over the past 10 years, with repeat daily satellite coverages of the globe now available. This provides a unique opportunity to develop analytical frameworks to characterise the BE in terms of both hard engineered and human aspects (e.g. cars, flows of people and soft engineering). For example, recent work has shown that cyclists are more vulnerable to road traffic where there are soft engineered elements to the BE (i.e. painted road separation). However, characterisation of the BE using traditional survey methods is time consuming and expensive. Building on this recent work, this PhD would develop a computer vision and machine learning framework which can be used to develop risk models of the built environment for cycling. Specifically, the research objectives are: 1) Assess current and develop novel GIS based machine learning and computer vision techniques suitable for developing risk models for the built (cycling) environment (BCE); 2) Statistically analyse how these models relate to objectively measured data captured by cyclists for close passes and perceptions of risk.

Project title: Improving Precision and Convergence of Machine Learning Algorithms with application to Lattice QCD and GPUs

1st supervisor: Dr Benjamin Mora
2nd supervisor: TBD
Department/Institution: Computational Foundry
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Lattice QCD researchers are used to improving numerical algorithms in order to understand particles like quarks or gluons better. The algorithms are at the crossroads between Monte Carlo techniques and inverse problem solvers, and are usually improved variants of the conjugate gradient (CG) algorithm. Similarly, Machine Learning (ML) researchers try to solve (or at least minimise) inverse problems using randomised methods like Stochastic Gradient Descent (SGD). With the advent of better algorithms and concepts (e.g. Generative Adversarial Networks, GANs), high performance graphics cards (GPUs) and specialised accelerators (e.g. Google’s TPUs), some reasonably good levels of AI can nowadays be obtained with current methods in specific applications. Hence, it is clear that both Physics and ML have common interchangeable areas of research and there has never been a more exciting time to combine Physics and ML knowledge. This project will therefore address the following questions: 1) Can we ensure faster learning with Machine Learning? More precisely, are there efficient methods that can replace algorithms based on SGD? This is an extremely important problem due to the huge quantity of calculations needed to train a neural network. 2) What is the influence of arithmetic precision in ML applications? One aim is to provide new algorithms that are more robust at calculating dot products from standard types (e.g. floats or doubles), especially on GPUs. The new techniques will possibly improve the stability of CG methods and reduce the number of times the algorithm needs to be restarted. 3) In the opposite direction, can (low complexity) approximation methods for linear algebra operators be useful to neural networks, and in particular in the context of GANs? Can approximation techniques also be useful to lattice QCD?
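To make the point about arithmetic precision in dot products concrete, the sketch below compares a naive accumulation loop with Kahan (compensated) summation, a classic textbook technique for reducing rounding error. It is only an illustration of the problem, not one of the new algorithms the project aims to develop; the function names and the test vectors are assumptions chosen for the example, and Python floats are IEEE double precision rather than the GPU float types mentioned above.

```python
import math

def naive_dot(xs, ys):
    """Dot product with a plain running sum; small terms can be lost."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def kahan_dot(xs, ys):
    """Dot product using Kahan compensated summation of the products."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x, y in zip(xs, ys):
        term = x * y - compensation
        new_total = total + term
        # (new_total - total) is what was actually added; subtracting `term`
        # recovers the rounding error made in this step.
        compensation = (new_total - total) - term
        total = new_total
    return total

# Illustrative input: one large term followed by many terms that are each too
# small to change the running sum on their own in double precision.
xs = [1.0] + [1e-16] * 100000
ys = [1.0] * len(xs)

reference = math.fsum(x * y for x, y in zip(xs, ys))  # correctly rounded sum
print("naive error:", abs(naive_dot(xs, ys) - reference))
print("kahan error:", abs(kahan_dot(xs, ys) - reference))
```

In this example the naive loop never moves away from 1.0, while the compensated version stays close to the reference value. Iterative solvers such as CG rely on many such inner products, which is why their accuracy can matter for stability.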

Project title: Clinical decision support tool for suicide prevention using machine learning and electronic health records

1st supervisor: Prof Ann John
2nd supervisor: Dr Marcos del Pozo Banos
Department/Institution: Swansea University Medical School
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical and computer science approaches

Project description: 800,000 people are estimated to die by suicide annually worldwide, accounting for 1-2 deaths in every 100. Only one-third are in contact with mental health services during the preceding year. However, almost all have contact with other health services in that period, often for seemingly unrelated conditions. This project will seek to develop 'always-on' AI screening tools that use existing healthcare and administrative data to reliably identify opportunities to offer targeted support to individuals at risk of suicide when they present to healthcare services, alerting clinicians in a variety of healthcare contexts. The project objectives are: (1) to define the user specifications and the legal and ethical implications of the proposed tools through research and facilitated meetings and workshops with stakeholders; (2) to use AI and anonymously-linked, person-level, routinely-collected health and demographic data in order to develop risk estimation models that meet the requirements set by stakeholders; and (3) to scrutinise the AI-generated models in order to advance our knowledge of the clinical and social-cultural factors underpinning suicide - thereby targeting a lack of clinically-sound underpinning research, particularly research relating to implementation of AI in clinical practice (beyond radiology). The solution will use routine primary and secondary care data processed independently and in tandem, and integrate into the workflow of mental-health screening, diagnosis and treatment. It will be developed following a 'clinical-first' approach - demonstrating the validity of the results in clinical terms, thereby gaining the trust of the healthcare community and helping to ensure future uptake. The project will use existing data science and informatics infrastructure and cross-disciplinary expertise at Swansea University's Farr Institute and Health Data Research UK substantive site.
It will use real data stored in the Secure Anonymised Information Linkage (SAIL) Databank [saildatabank.com], a regularly updated repository of anonymised, person-based linkable data for over 3 million people in Wales. This is a highly interdisciplinary project with both the technical and the clinical sides being equally developed. The doctoral researcher will develop the skills to: (1) process big data and electronic health records, characterized by high volumes of incomplete, noisy and mostly binary data; (2) analyse this data using AI and epidemiological tools, developing new, optimized designs of the former and validating them with the latter; (3) dissect AI models to produce first and foremost clinically relevant results; and (4) incorporate into the design the input from physicians, nurses, emergency staff, psychiatrists, experts on healthcare law and ethics, patients and members of the public. These skills will position the doctoral researcher at the interface between the disciplines of data science, epidemiology and mental health: a role which the MRC has identified as suffering from a lack of capacity.

Project title: Robust Parameter Optimisation for Image Segmentation

1st supervisor: Dr Thomas Torsney-Weir
2nd supervisor: Dr Alma Rahat
Department/Institution: Computational Foundry, Swansea University
Research theme: T2 - biological, health and clinical sciences; T3 - novel mathematical, physical, and computer science approaches

Project description: Medical professionals rely on robust image segmentation algorithms to highlight anomalous tissues in a patient, e.g. cancer. However, image segmentation algorithms are typically calibrated on a limited number of training images through a tedious trial and error process. This gives limited insight into robustness against overfitting, which makes it difficult to predict how well the segmentation algorithm will perform on unseen images. This could lead to an incorrect diagnosis. The goal of this project is to use a combination of visualization techniques and Bayesian model calibration techniques to develop a system for robust parameter optimisation with a focus on image segmentation algorithms.

Project title: Precision Medicine with Deep Learning and Big Data

1st supervisor: Dr Gary KL Tam
2nd supervisor: Dr Shang-Ming Zhou
Department/Institution: Computer Science and Biomedical Science, Swansea University
Research theme: T2 - biological, health and clinical sciences

Project description: Conventional medicine designs and uses generic treatments to treat large groups of people without much consideration of patients' genetics, medical history, family history or environmental factors. More personalized, data-driven treatments can benefit both patients and clinicians, improving therapy effectiveness, reducing its time and cost, and even discovering new cures. With the prevalence of Electronic Health Records (EHR) in healthcare, a lot of data is collected. It offers information about patients' progression throughout various diseases. This project will consider modern deep learning approaches, explore the possibility of analyzing such prevalent but anonymised patient records, and discover treatments, drugs and procedures that may be effective. It will provide clinicians with useful insights into what might be the best treatment for a given patient, improving healthcare quality and cost efficiency.

Project title: Geometric learning for multi-dimensional time series

1st supervisor: Dr Farzad Fathizadeh
2nd supervisor: Prof Biagio Lucini
Department/Institution: Department of Mathematics, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Advances in concurrent recording of behaviours and neural activities in the intact brain have led to an invaluable source of information for fathoming the properties of brain networks, and for determining the statistical properties of animal cognition and social behaviour. Multi-modal recordings of neural activities at different spatiotemporal scales are accompanied by considerable noise and require advanced and novel analytical and statistical techniques for signal detection in the corresponding time series. In previous work, a statistical method for the detection and sorting of neuronal signals in noisy time series through large scale hypothesis testing and so-called geometric learning has been devised. Geometric learning is a method that associates a graph to a given data set; one can then read off the local geometry of the data in the heat kernel of the Laplacian of the graph (viewed as an approximation of the Laplacian of a curved geometry). In this project, this analysis technique will be generalised to multi-dimensional (correlated) time series and machine learning will be used for classification of the detected signals. Part of the project will be about detection of decision boundaries for hypothesis rejections by simulations, and working out theoretical aspects of the observed boundaries.
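The idea of reading local geometry from the heat kernel of a graph Laplacian can be illustrated on a toy graph. The sketch below is not the method from the previous work referred to above: it assumes a small path graph instead of a graph built from data, uses the combinatorial Laplacian L = D - A, and approximates exp(-tL) with a truncated Taylor series plus repeated squaring. All function names and parameter values are hypothetical choices for the example.

```python
def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of a path graph with n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L

def matmul(A, B):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heat_kernel(L, t, squarings=10, terms=12):
    """Approximate exp(-t L) by Taylor expansion with scaling and squaring."""
    n = len(L)
    scale = t / (2 ** squarings)
    M = [[-scale * L[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, M)]  # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)  # undo the scaling: exp(A)^2 = exp(2A)
    return result

K = heat_kernel(path_graph_laplacian(5), t=0.1)
# Diagonal entries: how much heat started at a node is still there at time t.
print([round(K[i][i], 4) for i in range(5)])
```

For small t the diagonal entry at a node is approximately 1 - t × degree, so the end nodes of the path retain more heat than the interior nodes. In this sense the heat kernel encodes local structure of the graph, which is the kind of information geometric learning reads off.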

Project title: Monitoring, decision making and data structures in AI

1st supervisor: Dr Edwin Beggs
2nd supervisor: Prof John Tucker
Department/Institution: Maths / Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Much of the data that is used in modern AI comes from a process of monitoring. This is true of the systems of science, engineering and medicine, and of natural environments. We propose to study a fundamental theory of monitoring that combines measurement, data collection, classification, and interventions and actions. Central to monitoring is classification and categorisation, which in turn require decisions to be made in a particular context. Furthermore, decisions are commonly based on ambiguous or partial information. For example, the sensors used to collect data can be error prone and/or have a delayed response (or can even be completely corrupted by a cyber attack or fail altogether). In addition, classification is often made by non-explainable tools, such as neural nets. In this project, states of systems will be partitioned into natural modes of behaviour. These modes are modelled using a multi-dimensional geometric arrangement called a simplicial complex. These geometric data types have a rich theory, where the higher dimensional structure models contested (multi-possibility) decisions which have to be made. In an error prone world, decisions can be based on probabilistic methods, using mathematical theories of belief already established in AI. We hope to apply this model to both physical experiments and control systems. In addition, we hope to incorporate human based reasoning into AI by using the above methods, such as human methods of decision making and paradigm shifts. The theory may be applicable to human-centred systems.

Project title: Explainable Graph-Based Machine Learning for Multivariate Graph Analysis in Public Health

1st supervisor: Dr Daniel Archambault
2nd supervisor: Dr Michael Edwards
Department/Institution: Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches; T2 - biological, health and clinical sciences

Project description: In public health settings, social networks are often encoded as multivariate graphs with both static and dynamic information. The human actors in these networks have demographic and survey information associated with them along with their social ties. Given this information, the analyst wants to understand how the information associated with the nodes and the social ties influence behaviours (under-age drinking, mental health, non-suicidal self injury etc). In this project, we explore how graph-based machine learning can be used as part of the analytics process in order to create explainable AI visual analytics systems to help find solutions to these important societal problems.

Project title: High-order methods for computational design and data-driven engineering

1st supervisor: Dr Nelly Villamizar
2nd supervisor: Dr Gary Tam
Department/Institution: Department of Mathematics and Department of Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The aim of this project is to advance the development of a mathematical framework for the integration of geometric modeling and simulation using spline-based finite elements of high degree of smoothness. High-order methods are known to provide a robust and efficient methodology to tackle complex challenges in multi-physics simulations, shape optimisation, and the analysis of large-scale datasets arising in data-driven engineering and design. However, the analysis and design of high-order methods is a daunting task requiring a concurrent effort from diverse fields such as applied algebraic geometry, approximation theory and splines, topological data analysis, and computational mathematics. Building on current tools and taking advantage of the diverse expertise at Swansea University, the goal of this project is to extend the theoretical foundations of these techniques and to adapt them to new challenges arising in applications such as CAD, solid modeling industry and the creative industry (part of the UK digital economy strategy).

Project title: Putting the Human Back in the Loop of Bayesian Optimisation for Aerospace Design

1st supervisor: Dr Alma Rahat
2nd supervisor: Dr Sean Walton
Department/Institution: Computational Foundry, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Bayesian optimisation algorithms have proved useful for solving many complex aerospace design problems. In simple terms, a designer defines a problem with a starting shape for a component such as a wing, as well as some kind of performance metric the algorithm tries to maximise. Computational fluid dynamics (CFD) simulations, which may be computationally expensive and time-consuming, are run to estimate the performance metric that drives the optimisation process. A surrogate model for performance is thus used to identify the most promising solutions that may be subjected to CFD simulations. As new simulations are performed, the surrogate model is retrained and predictions for good solutions improve, and after a certain number of simulations, the best estimation of the optimal design is presented to the designer. Most of the current research has focused on reducing the number of simulations required to estimate the best design by introducing new utility functions that define how to balance exploration and exploitation, based on the predictions of surrogate models, when locating promising solutions. However, these approaches mostly ignore the designer's preferences on some of the most important aspects of design, such as aesthetics or fabrication feasibility, that cannot be simply modelled by a computer during the optimisation process. Therefore, the successful PhD candidate will develop techniques for putting the human back into the loop and allowing the optimiser to be driven by designer preference as well as performance metrics.

Project title: Predicting Effective Control Parameters for Evolutionary Algorithms Using Machine Learning Techniques

1st supervisor: Dr Sean Walton
2nd supervisor: Dr Alma Rahat
Department/Institution: Computational Foundry, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Evolutionary algorithms have proven capable of solving complex problems. A limitation of these techniques is that they often have a number of control parameters which need tuning for every new problem they are applied to in order to maximise performance. The most common way to address this problem is to allow these control parameters to adapt as the algorithm runs. Recent developments, however, have shown that by sampling a new problem it is possible to predict effective control parameters before the start of the optimisation. In this project you will work to develop new techniques using machine learning to predict effective control parameters for new problems.

Project title: A Multiscale Modelling Approach to Study the Effects and Responses of DNA Damage Response (DDR) Inhibitor Drugs

1st supervisor: Dr Gibin Powathil
2nd supervisor: Dr Noemi Picco
Department/Institution: Department of Mathematics, Swansea University
Potential supervisor/collaborators: Professor Mark Chaplain (University of St Andrews), Dr James Yates (AstraZeneca)
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The increasing complexity of the clinical and biological effects of multimodality therapies often results in substantial challenges to the clinical and preclinical development of novel therapeutic drugs. Mathematical modelling, based on a systems approach and informed by experimental data, can often be very helpful in understanding and studying the multiple (nonlinear) therapeutic effects and responses of these drugs, supporting their preclinical design and development and their clinical implementation. The multiscale complexity of cancer as a disease necessitates the adoption of a multiscale approach, incorporating appropriate mechanisms to obtain meaningful and predictive mathematical models to study the therapeutic effects and outcomes. This highly interdisciplinary project aims to develop multiscale, experimental-data-driven mathematical and computational models to study and analyse the effects and efficacy of DNA damage response inhibiting drugs. Once the model is developed and fully calibrated and validated, it will be used to study optimal sequencing, scheduling and dosing, alone and in combination with multimodality therapies. It will also be used to inform in vivo and preclinical studies, moving a step closer to potential drug development and clinical trial designs.

Project title: Efficient Learning of the Optimal Probability Distribution over the Policy Space in Reinforcement Learning

1st supervisor: Dr Alma Rahat
2nd supervisor: Dr Sean Walton
Department/Institution: Computational Foundry, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Reinforcement Learning is an important technique in Machine Learning for control problems, and it is closely related to how we learn in an unknown environment. In this method, an agent performs a possible action based on the state it is in and its prior belief about the reward for that action, and receives a reward or punishment as a consequence of that action. This helps it to differentiate between good and bad actions in a given state, as the agent can update its belief with more experience. Essentially, the learner aims to develop an estimate of the optimal probability distribution for selecting an action over the state-action space (also known as the policy space). Traditional approaches use repeated trial and error to estimate this distribution over the policy space, which can be time-consuming due to the sheer number of repetitions required for a good estimate. Given the utility of reinforcement learning, it would be game changing to be able to improve the speed of learning by reducing the number of repetitions required. In this project, inspired by Bayesian optimisation approaches, we propose to investigate a novel data-driven direct policy search approach in which you will model the probability distribution from carefully selected data, take an entropy-based search approach to identify the most informative trials to perform, and sequentially improve the estimate of the optimal probability distribution over the policy space.
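
The traditional trial-and-error loop described above (act on the current belief, receive a reward, update the belief) can be sketched with tabular Q-learning on a toy problem. This is only an illustration of the baseline that the project aims to improve on, not the proposed entropy-based policy search; the corridor environment, the function names and all parameter values are assumptions made for the example.

```python
import random

N_STATES = 4                  # corridor cells 0..3; reaching cell 3 ends an episode
ACTIONS = ("left", "right")


def step(state, action):
    """Toy environment: move along the corridor, reward 1.0 on reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def choose(q, state, epsilon, rng):
    """Epsilon-greedy: usually act on the current belief, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Estimate action values by repeated trial and error."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):   # cap the episode length
            action = choose(q, state, epsilon, rng)
            nxt, reward = step(state, action)
            # The goal cell is never updated, so its value stays 0.
            target = reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == N_STATES - 1:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal cell under the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Even on this four-cell problem the agent needs many episodes to settle its beliefs, which hints at why reducing the number of trials matters for realistic control tasks.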

Project title: Multi-Objective Evolutionary Approach Towards Dynamic Graph Drawing

1st supervisor: Dr Alma Rahat
2nd supervisor: Dr Daniel Archambault
Department/Institution: Computational Foundry, Swansea University
Research theme: T3 - novel mathematical, physical, and computer science approaches

Project description: To draw an appealing dynamic graph, it is important to strike a balance between readability and stability, as there is a natural conflict between these objectives. Current approaches set up force systems which can be optimised to reach a reasonable trade-off between the two. In this project, we investigate how multi-objective evolutionary algorithms can be used to explore the trade-offs between readability and stability.
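
To make the idea of a trade-off between two objectives concrete, the sketch below filters a set of candidate drawings down to the non-dominated (Pareto-optimal) ones, which is the selection principle that multi-objective evolutionary algorithms build on. The (readability, stability) scores are made-up numbers for illustration, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (both objectives are maximised)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (readability, stability) scores for five candidate layouts.
layouts = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9), (0.3, 0.3)]
front = pareto_front(layouts)
# front == [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9)]
```

The three remaining layouts cannot be ranked against one another without a preference between readability and stability; an evolutionary algorithm would evolve a whole population towards such a front rather than a single compromise.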

Project title: Tropical geometry of deep neural networks

1st supervisor: Yue Ren
2nd supervisor: TBD
Department/Institution: Computational Foundry, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Tropical geometry is a relatively new area of mathematics that studies piecewise-linear structures arising from polynomial equations. These so-called tropical varieties arise naturally in many areas within and beyond mathematics, such as algebraic geometry, combinatorics, and optimization, as well as biology, economics, and physics. Wherever they appear, tropical varieties often allow for new computational approaches to existing problems. In this project, we explore the connection between tropical geometry and machine learning that was uncovered very recently. We study tropical varieties of deep neural networks and examine how their intrinsic geometry and combinatorics affect the underlying neural network and its inner workings.
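
One way to see such a connection, offered here only as an illustrative sketch, is that a ReLU network computes a difference of two max-plus ("tropical") expressions, since ReLU(t) = max(t, 0) and a positively weighted sum of maxima is itself a maximum of affine functions. The Python example below checks this numerically for a small one-hidden-layer network with one input. The weights are arbitrary assumptions, the function names are hypothetical, and because the slopes are real numbers rather than integers the expressions are, strictly speaking, generalised tropical polynomials.

```python
from itertools import combinations


def relu_network(x, units):
    """One-hidden-layer ReLU network; units are (weight, bias, output_weight)."""
    return sum(v * max(w * x + b, 0.0) for w, b, v in units)


def tropical_terms(units):
    """Expand sum_i v_i * max(w_i * x + b_i, 0), with all v_i > 0, into affine
    terms (slope, intercept), one per subset of hidden units."""
    terms = []
    for r in range(len(units) + 1):
        for subset in combinations(units, r):
            slope = sum(v * w for w, b, v in subset)
            intercept = sum(v * b for w, b, v in subset)
            terms.append((slope, intercept))
    return terms


def tropical_polynomial(x, terms):
    """Max-plus evaluation: the maximum of the affine terms at x."""
    return max(s * x + c for s, c in terms)


# Arbitrary example weights (assumptions for illustration only).
units = [(1.0, -0.5, 2.0), (-2.0, 1.0, 1.5), (0.5, 0.25, -1.0), (3.0, -1.0, -0.5)]
positive = [(w, b, v) for w, b, v in units if v > 0]
negative = [(w, b, -v) for w, b, v in units if v < 0]   # flip sign so v > 0
P, Q = tropical_terms(positive), tropical_terms(negative)


def tropical_form(x):
    """The same network written as a difference of two max-plus expressions."""
    return tropical_polynomial(x, P) - tropical_polynomial(x, Q)
```

For any input x, relu_network(x, units) and tropical_form(x) agree up to floating-point rounding. The subset expansion grows exponentially with the number of hidden units, so this is a way of exhibiting the structure rather than a practical algorithm.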

Project title: Visualization of regression surfaces

1st supervisor: Thomas Torsney-Weir
2nd supervisor: TBD
Department/Institution: Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The goal of this project is to develop interactive visual exploration methods to explain complex regression models. This project contributes to the field of explainable AI. Machine learning models have shown remarkable predictive capabilities. However, they produce complex models that are difficult to interpret. Efforts in interpretable machine learning have concentrated on classification models, and the resulting methods do not work with continuous data (i.e. regression models). Regression models are important to a variety of domains such as finance, weather prediction, economics, and utilities. These models bring a number of unique challenges compared to classification problems, for example understanding the rate of change of predictions and visualizing continuous spaces.

Project title: Mathematical and Computational Modelling of Brain Evolution, Development and Disease

1st supervisor: Dr Noemi Picco
2nd supervisor: Dr Gibin Powathil
Department/Institution: Department of Mathematics, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: The brain is the seat of our highest cognitive functions. Critically, the precise composition and positioning of neurons are determined during development and are key to the emergence of these cognitive functions. Variations of the developmental programme can lead to speciation as well as malformations such as schizophrenia, epilepsy, and microcephaly. The recent Zika virus epidemic exposed our lack of basic understanding of the fundamental mechanisms of neural development. The developmental programme leading to the formation of the brain is the result of a complex regulation of cellular processes in space and time. To date, brain development has been studied through analysis of sparse temporal data that may miss crucial information. The project aims to develop novel mathematical and computational approaches that account for both the spatial and temporal aspects of this process, which leads to the vast array of brain architectures, shapes and sizes that we see in different animal species. The project will explore the hypothesis that this variety emerged from a trade-off between proliferative and spatial constraints and preferential expansion of certain proliferative zones of the developing brain. Drawing on techniques from machine learning and optimisation, the project aims to map all the possible evolutionary pathways of the brain, to highlight the evolutionary principles and fundamental mechanisms of normal brain development shared across species, and to provide insight into disease and malformations.

Project title: Visualising Extremely Large Dynamic Networks through AI and HPC

1st supervisor: Dr Daniel Archambault
2nd supervisor: Prof Jonathan Roberts (Department of Computer Science, Bangor University)
Department/Institution: Department of Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Until recently, state-of-the-art dynamic graph drawing algorithms used the timeslice as a basis for drawing the network. New event-based approaches [1,2] have changed this and are designed to draw such networks directly in 2D+t. In either case, all algorithms use metrics computed on small, local areas of the network as a basis. Many of these localised structures are considered in parallel, propagating upwards to realise an overarching global behaviour that allows domain scientists to visualise the network. However, when humans try to understand large graphs, we use a top-down approach, looking at the global, high-level features first, followed by individual details (Overview First, Zoom and Filter, Details on Demand). Artificial intelligence provides an opportunity to manage these two simultaneously, by considering top-down examples validated by a human and applying scalable bottom-up algorithms in the background for localised detail refinement. This PhD would consider the following goals: 1) Produce novel visualisation approaches for extremely large dynamic graphs. These approaches would consider top-down cases suggested by supervised learning techniques while simultaneously using the more localised refinement frequently seen in the dynamic graph drawing literature. 2) Dynamic graph drawing and visualisation are highly parallelisable. We will use HPC technology to further scale the visualisation and drawing process to larger data sets.

Project title: Graph Matching for Big Data - an AI and Machine Learning approach

1st supervisor: Dr Gary KL Tam
2nd supervisor: Dr Yukun Lai, Prof Paul Rosin (Department of Computer Science, Cardiff University)
Department/Institution: Department of Computer Science, Swansea University
Research theme: T3 - novel mathematical, physical and computer science approaches

Project description: Graph matching is a fundamental problem in many research areas such as computer vision, image and geometry processing, robotics, and medical imaging. It is also related to many disciplines (e.g. bioinformatics, psychology, dentistry, physics) and supports many downstream applications (intelligent image editing, image morphing, biometrics, evaluation of surgery outcomes, high performance data analysis and knowledge discovery, the study of phylogenetic evolution, and even drug discovery). Graph matching, however, is a computationally very hard problem, and traditional techniques use approximation algorithms to seek the best solution. These techniques often require specific domain knowledge and tailored constraints to drive the search for a solution. When the dataset is highly complex (e.g. with some form of hierarchical or temporal correlation) or there is little knowledge about the dataset, existing generic graph matching techniques struggle to perform. In this project, we will use AI and big data, together with a new mathematical formulation (using spectral graph theory and the cycle consistency constraint), to explore a special class of deep graph matching algorithms for such challenging tasks. The interested candidate is expected to have a sound mathematical background and programming skills. Knowledge of general machine learning is desirable.

For details on how to submit your application, see the Applications page.