Mathematics of the Data Science Ontology

Warning: This document is under construction.

In this guide, we explain the mathematical underpinnings of the Data Science Ontology. To make sense of it, you should first read the beginning of the introductory guide, to learn informally what we mean by “concepts” and “annotations”. You should also have a working knowledge of monoidal categories and their graphical languages. There are now several excellent introductions to these ideas, pitched at nonspecialists and focusing on applications outside pure math [BS10, CP10, Sel10]. The math described here belongs to a small but growing line of work on categorical knowledge representation [SK12, Pat17].

Throughout this website and the ontology data format, we use the terminology of programming language theory, whereas category theory has its own distinctive terminology. The following dictionary may help in translating between the two.

Programming	Category theory	Notation
type	object	$X$
function	morphism	$f: X \to Y$
inputs	domain	$\text{dom}(f)$
outputs	codomain	$\text{codom}(f)$
composition	composition	$f \cdot g$ or $fg$
product	cartesian monoidal product	$X \times Y$ and $f \times g$
unit type	monoidal unit	$1$
copy	internal comultiplication	$\Delta_X: X \to X \times X$
delete	internal counit	$\lozenge_X: X \to 1$
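
To make the dictionary concrete, here is a minimal sketch that reads each row as ordinary Python, treating types as objects and functions as morphisms. It is an illustration only, not part of the ontology or its data format: the names `compose`, `product`, `copy`, `delete`, and `identity` are hypothetical, the unit type is modeled as the empty tuple, and composition is assumed to be in diagrammatic order (apply $f$ first, then $g$), which the table itself does not specify.

```python
def identity(x):
    """Identity morphism on any object."""
    return x


def compose(f, g):
    """Composition f . g, ASSUMED diagrammatic: apply f first, then g."""
    return lambda x: g(f(x))


def product(f, g):
    """Product f x g of morphisms: acts componentwise on a pair."""
    return lambda pair: (f(pair[0]), g(pair[1]))


def copy(x):
    """Copy (comultiplication) Delta_X: X -> X x X."""
    return (x, x)


def delete(x):
    """Delete (counit): X -> 1, with the unit type 1 modeled as ()."""
    return ()


# Two sample morphisms on integers (hypothetical examples).
def increment(n):
    return n + 1


def double(n):
    return n * 2


increment_then_double = compose(increment, double)
both = product(increment, double)

# Copying and then deleting the first copy leaves the value unchanged,
# up to the pairing with the unit value.
copy_then_delete_left = compose(copy, product(delete, identity))
```

In this sketch, `copy_then_delete_left(7)` evaluates to `((), 7)`, which corresponds to `7` under the isomorphism $1 \times X \cong X$. This is one of the counit laws that copying and deleting are expected to satisfy.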

References

[BS10]: John C. Baez and Mike Stay, 2010. Physics, topology, logic and computation: A Rosetta Stone. DOI, arXiv

[CP10]: Bob Coecke and Eric Oliver Paquette, 2010. Categories for the practising physicist. DOI, arXiv

[Pat17]: Evan Patterson, 2017. Knowledge representation in bicategories of relations. arXiv

[Sel10]: Peter Selinger, 2010. A survey of graphical languages for monoidal categories. DOI, arXiv

[SK12]: David I. Spivak and Robert E. Kent, 2012. Ologs: a categorical framework for knowledge representation. DOI, arXiv