PyKale: Knowledge-aware machine learning from multiple sources in Python

Abstract

Machine learning is a general-purpose technology that holds promise for many interdisciplinary research problems. However, significant barriers to crossing disciplinary boundaries exist because most machine learning tools are developed separately in different areas. We present PyKale, a Python library for knowledge-aware machine learning on graphs, images, texts, and videos to enable and accelerate interdisciplinary research. We formulate new green machine learning guidelines based on standard software engineering practices and propose a novel pipeline-based application programming interface (API). PyKale focuses on leveraging knowledge from multiple sources for accurate and interpretable prediction, thus supporting multimodal learning and transfer learning (particularly domain adaptation) with the latest deep learning and dimensionality reduction models. We build PyKale on PyTorch and leverage the rich PyTorch ecosystem. Our pipeline-based API design enforces standardization and minimalism, embracing green machine learning concepts by reducing repetition and redundancy, reusing existing resources, and recycling learning models across areas. We demonstrate its interdisciplinary nature via examples in bioinformatics, knowledge graphs, image/video recognition, and medical imaging.

Publication
arXiv preprint arXiv:2106.09756
Haiping Lu
Director of the UK Open Multimodal AI Network, Professor of Machine Learning, and Head of AI Research Engineering

I am a Professor of Machine Learning. I develop translational multimodal AI technologies for advancing healthcare and scientific discovery.

Xianyuan Liu
Assistant Head of AI Research Engineering & Senior AI Research Engineer
Peizhen Bai
PhD Student (now a Senior Machine Learning Scientist at AstraZeneca)
Shuo Zhou
Academic Fellow at University of Sheffield (past PhD Student)
Lawrence Schobs
PhD Student