\noindent\textbf{Mathematics as a Motor for Innovation}: Innovations based on mathematical knowledge and algorithms yield many improvements in economy, ecology, health care, security, and in society overall.

Our global positioning system (GPS) needs the mathematics of relativistic physics, our mobile phones use frequencies allocated through combinatorial optimization, the combinatorics of our genome yields clues to curing rare diseases, the privacy of our communications depends on cryptographic protocols steeped in number theory, and our national security relies on the mathematical analysis of increasingly complex networks.

Fundamental mathematical research and its direct application in practical situations enable many engineering, science, and business innovations that enrich society and mankind.

%

Such applications increasingly drive modern mathematical research, which in turn depends critically, and to a growing extent, on collaborative tools, computational environments, and online databases.

Many of these digital tools have revolutionized the way mathematical research is conducted and how it is turned into applications.

For example, engineers now use mathematical tools to build and simulate physical models that are based on systems of differential equations with millions of variables, combining building blocks and algorithms taken from libraries published all over the internet.

%Another example are automated representations for high-dimensional datasets via structured tensors.

\smallskip

\noindent\textbf{Problem: Oligopolization of Mathematics}: There is very high commercial interest in developing mathematical representations as proprietary services and datasets, which creates the danger that their availability will be monopolized.

Indeed we are seeing that the large engineering and internet companies are strategically buying (all) the relevant, innovative startups and hiring top researchers, essentially privatizing and oligopolizing public data, knowledge, and technological know-how.

Even in the field of mathematics, which could be assumed to be ``pure'' and thus immune, this is the case, e.g., for machine learning algorithms or for the data curation of Wolfram Inc., which started integrating mathematical data into the Wolfram Language almost a decade ago. \smallskip

\noindent\textbf{The Cure: Open Data/Software}: We are strongly convinced that mathematical data and algorithms should be openly available for the research community and industry according to the FAIR principles~\cite{FAIR} and that there should be open access to all resources.

The members of this consortium have demonstrated this commitment and its benefits with the open access services and datasets they have developed or contributed to in the past, such as Modelica~\cite{Modelica:on}, SLICOT~\cite{BenMeh:sslsct99}, SageMath~\cite{SageMath:on}, EuDML~\cite{EuDML:on}, swMATH~\cite{swMATH:on}, or LMFDB~\cite{lmfdb:github}. \smallskip

% what is this proposal about - aim

\noindent\textbf{Project Aim}: We will provide mathematicians and scientists with

...

...

\smallskip

% How will we achieve this?

\noindent\textbf{Prerequisite: Deep FAIRness}: To achieve this, we will need to build services that understand the semantics~\cite[Rec. 7]{FAIR} of the mathematical data they operate on --- only if the mathematical meaning of the data is accessible in all its depth can computer applications provide mathematically sound, interoperable services. We call this \emph{deep} FAIRness.

\smallskip

\highlight{Due to the mathematical standard of rigor and the inherent complexity of mathematical data, deep FAIRness is both more difficult and more important for mathematics than for other scientific disciplines. That also means that mathematics is an ideal test case for developing the semantic aspects of the FAIR principles in general.}

A lot of FAIR-motivated knowledge sharing has already been done.

A few examples from different walks of mathematics are:

\begin{compactenum}[\em i\rm.]

\item The Modelica language uses symbolic representations of differential equations and control algorithms to model cyber-physical systems and bases simulation services on that. Hundreds of reusable libraries are available on GitHub alone.

\item Highly standardized subroutine libraries like LAPACK~\cite{LAPACK:on}, SLICOT, or MUMPS~\cite{MUMPS:on} form the backbone of almost all engineering software packages.

\item Mathematical information services like zbMATH~\cite{zbMATH:on}, EuDML, and swMATH extend bibliographic metadata of mathematical publications with math subject classifications (essentially taxonomic semantic information) and use automatic extraction to give users enhanced, semantic search capabilities.

\item Libraries of formalized mathematics directly specify the meaning of mathematical definitions, theorems, and proofs in a machine-verifiable way. Tens of thousands of such formal proofs are available in open libraries.

\item Mathematical databases like the L-functions and Modular Forms DataBase (LMFDB~\cite{lmfdb:on}), the GAP Small Groups Library~\cite{GapSmallGroups:on}, or the On-Line Encyclopedia of Integer Sequences (OEIS~\cite{OEIS:on}) store millions of mathematical objects together with their semantic properties, both human-curated and machine-generated.

\end{compactenum}

These are used industrially (\emph{i}.--\emph{iv}.) and academically (\emph{i}.--\emph{v}.), both inner-mathematically (\emph{ii}., \emph{iv}., \emph{v}.) and transdisciplinarily (e.g., \emph{i}.--\emph{ii}. in engineering and \emph{iii}. in program verification).

However, the various representations are not interoperable, and the datasets are therefore not reusable across systems and communities. This leads to large gaps in the FAIRness of mathematical data and results in missed opportunities for innovative services that could revolutionize mathematical research and applications.\smallskip

\noindent\textbf{Open Source/Data Ethos}: The mathematical community predominantly shares the ethos of open access to publications, software (including source code), and datasets. In fact, all of the examples above are fully open, partly open, or currently in the process of opening up their data/software further.

For mathematical software, the open source ethos has been established for more than 50 years in subroutine libraries such as LAPACK, % SCALAPACK,
SLICOT, or MUMPS, which are produced according to a widely accepted documentation and implementation standard and are at the core of almost all successful commercial and non-commercial software packages, including MATLAB~\cite{MATLAB:on} and SageMath~\cite{sagemath:on}.

Throughout this project we will reuse and extend open source code, and \TheProject will benefit from future open source contributions during and beyond the lifetime of the project. Moreover, the \pn project will follow the example of the H2020 OpenDreamKit project and conduct all of its development openly in public repositories. In fact, like the OpenDreamKit project, which \pn follows up on, the \pn proposal was developed publicly (on \url{https://gl.kwarc.info/mathhub/data-proposal/}).

Thanks to this by-users-for-users model, \TheProject will be steered by the actual needs of the community.

\noindent\textbf{The \TheProject team} is a Europe-wide collaboration that brings together a leading body of mathematicians and transdisciplinary computational researchers with an extensive track record of delivering innovative open source software solutions.

\smallskip

\noindent\textbf{Impact}: Standardizing a data framework, unifying services, and hosting all on a public, high-profile infrastructure like the EOSC will enable huge progress in effective research, research communication, and reproducibility in computational mathematics and related sciences.

By focusing on public, open standards and service interoperability \TheProject will simultaneously maximize sustainability and impact. Even though the primary target users are researchers in mathematics, the set of beneficiaries extends to researchers, teachers, and industry practitioners in, e.g., scientific computing, physics, chemistry, biology, engineering, medicine, earth sciences and geography, as well as social sciences and finance.

\TheProject will foster the development of models that are mutually beneficial to academia and highly innovative SMEs and enable tool chains that bridge the gap between fundamental mathematical research and domain-specific computational technology, thus supporting the faster application, exploitation, and commercialization of basic research.

Finally, by preparing an ISO standard for mathematical data, \TheProject will scale up these impacts and make them sustainable.\smallskip

\smallskip

\noindent\textbf{Sustainability}: The result of the \pn project will be a software infrastructure consisting of

\begin{inparaenum}[\em i\rm)]

\item a uniform data representation standard that is ready for semantics-aware FAIR data sharing,
\item innovative user-oriented services that validate, serve, compute with, and visualize this data and leverage it in existing widely used applications, and
\item the uniform integration of multiple community-driven mathematical datasets that jointly comprise multi-terabyte databases.

\end{inparaenum}

To make these results sustainable beyond the project duration, we will submit the above standard for ISO certification, develop all services in a way that allows for easy deployment on the EOSC Hub, and exploit the datasets made available uniformly through \TheProject in extensive community outreach efforts to publicize the EOSC Hub and FAIR data sharing.

Moreover, the complete \pn software infrastructure will run on a single, well-equipped, modern commodity-grade server that can be maintained sustainably by a part-time (1/4 FTE) experienced system administrator.

In particular, the \pn project neither applies for dedicated hardware infrastructure or large maintenance teams nor creates a future need for them.

% why do we need money at all

\noindent\textbf{Funding Need}: One might think that many of the solutions described above will eventually be organized by the mathematical community anyway. But a coordinated effort is needed to create a single, coherent data representation standard and the corresponding FAIR service framework. Without such a concerted effort --- which requires the funding and institutional support as provided by the EOSC --- we will likely see the continued development of a multitude of non-interoperable system-specific ``standards'' and competing commercial offerings, which are already becoming more and more entrenched.

As mathematics is a small --- albeit foundational --- discipline, the \pn proposal stays well below the recommended funding level provided in the INFRAEOSC-02-2019 call in order to make the proposed project cost-effective.

\smallskip

The timing of the \pn project is ideal, as it follows up on the successful OpenDreamKit project (2015--2019), which built a virtual research environment toolkit for mathematics.

OpenDreamKit has identified many of the problems and designed many of the solutions described in this proposal.