PyKale: Knowledge-aware machine learning from multiple sources in Python

Abstract

Machine learning is a general-purpose technology holding promise for many interdisciplinary research problems. However, significant barriers exist in crossing disciplinary boundaries when most machine learning tools are developed in different areas separately. We present PyKale, a Python library for knowledge-aware machine learning on graphs, images, texts, and videos, to enable and accelerate interdisciplinary research. We formulate new green machine learning guidelines based on standard software engineering practices and propose a novel pipeline-based application programming interface (API). PyKale focuses on leveraging knowledge from multiple sources for accurate and interpretable prediction, thus supporting multimodal learning and transfer learning (particularly domain adaptation) with the latest deep learning and dimensionality reduction models. We build PyKale on PyTorch and leverage the rich PyTorch ecosystem. Our pipeline-based API design enforces standardization and minimalism, embracing green machine learning concepts by reducing repetition and redundancy, reusing existing resources, and recycling learning models across areas. We demonstrate its interdisciplinary nature via examples in bioinformatics, knowledge graphs, image/video recognition, and medical imaging.

Publication
arXiv preprint arXiv:2106.09756
Haiping Lu
Professor of Machine Learning, Head of AI Research Engineering, and Turing Academic Lead

I am a Professor of Machine Learning. I develop translational AI technologies for better analysing multimodal data in healthcare and beyond.

Xianyuan Liu
Visiting PhD Student
Robert Turner
Senior Research Software Engineer at University of Sheffield
Peizhen Bai
PhD Student
Shuo Zhou
Academic Fellow at University of Sheffield (past PhD Student)
Lawrence Schobs
PhD Student