Metadata management with Mauro Data Mapper
Data is often said to have overtaken oil as the most valuable resource in the global economy, with data giants such as Apple and Amazon ranked among the most powerful companies in the world.
Data is relevant in every market sector. Businesses use the information they collect to:
- Understand customers better
- Improve internal processes
- Accurately predict market trends
Data on its own is not useful unless its context and meaning are clear. Without additional contextual information (known as metadata), data cannot be interpreted accurately and any conclusions drawn from it may be unreliable.
What is metadata?
Metadata is data that describes other data, providing additional information such as its quality, source and collection method. Gathering metadata allows data to be categorised and stored, making it easier to find, analyse and compare. It also makes it possible to transfer data between different systems where required. This ability to use data in a different system or context without losing its original meaning or intention is referred to as semantic interoperability.
A practically unlimited amount of metadata can be associated with each dataset, so managing and filtering it can be a time-consuming and resource-intensive task. Creating and maintaining this metadata adds to the complex data management challenges that businesses already face.
What is a metadata catalogue?
Metadata catalogues can help businesses to alleviate this problem. These are centralised repositories where metadata is stored. The data is analysed or inspected, and the resulting metadata, which may arrive in different formats from multiple data sources, is merged into one structured database, ideally automatically.
Often, metadata is stored in a project-specific location, which can have the unintended consequence that information valuable to other areas of the business is difficult to find. Metadata catalogues allow linked datasets to be shared both internally and externally where necessary. Users can be assigned specific roles with different access levels for creating, sharing and updating data descriptions. Catalogues facilitate collaborative learning and allow businesses to extract the maximum value from their metadata.
Compliance with the General Data Protection Regulation (GDPR)
Metadata is a valuable resource, but organisations must be aware of their responsibilities when it comes to managing this information. Businesses have adapted to complying with GDPR, but the implications of the regulation may be less apparent for metadata. Yet GDPR applies to personal data wherever it appears, including in metadata. Fines and other repercussions for businesses that fail to comply with GDPR requirements can be hefty. Organisations should therefore catalogue and understand exactly what embedded data they hold. They should also heed the relevant GDPR requirements when it comes to storing and sharing this data.
In the context of Mauro, the metadata for certain types of datasets will be made available for research purposes. The metadata catalogue can state whether each of these data points contains identifying information. For example:
- Patient Name (contains personal identifying information)
- Diagnosis code (does not)
Classifying the metadata means that the owners of each dataset will be able to take steps as relevant to ensure that it is used safely and in compliance with GDPR.
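As a rough illustration of the classification described above, the sketch below records whether each catalogue entry contains identifying information and then separates the two groups. It is not Mauro's data format or API: the `DataElement` class, its field names and the example descriptions are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """One catalogue entry describing a field in a dataset (hypothetical structure)."""
    name: str
    description: str  # illustrative text, not taken from a real catalogue
    contains_identifying_info: bool


def split_by_sensitivity(elements):
    """Return two lists of element names: (identifying, non_identifying)."""
    identifying = [e.name for e in elements if e.contains_identifying_info]
    non_identifying = [e.name for e in elements if not e.contains_identifying_info]
    return identifying, non_identifying


# The two examples from the text, with the flag set as the text describes.
catalogue = [
    DataElement("Patient Name", "Full name of the patient", True),
    DataElement("Diagnosis code", "Coded diagnosis", False),
]

identifying, non_identifying = split_by_sensitivity(catalogue)
print("Needs safeguards before sharing:", identifying)
print("No identifying information:", non_identifying)
```

With a flag like this held in the catalogue, a dataset owner could review the identifying fields before any data is released for research.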
Mauro Data Mapper
To help businesses in their efforts to organise their metadata, OCC have collaborated with the Clinical Informatics Group at the University of Oxford’s Big Data Institute (BDI) to develop a platform that can create bespoke metadata catalogues. Mauro Data Mapper is a toolkit for the design and documentation of databases, data flows, and data standards, as well as related software artefacts.
Mauro stores and manages descriptions of data using data models, which are structured collections of metadata. These data models employ standard terminologies and semantic links to ensure data definitions are comparable. Each data model is version controlled to keep track of changes in design, implementation or understanding.
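The idea of a versioned data model with semantic links can be sketched in a few lines. This is a simplified illustration only, not Mauro's actual schema or API: the `DataModel` and `SemanticLink` classes, the `new_version` method, the link type "refines" and the "Admission date" and "Registry dataset" names are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataModel:
    """A structured collection of metadata (hypothetical, much simplified)."""
    label: str
    version: int
    elements: tuple  # names of the data elements this model describes
    change_note: str = "Initial version"

    def new_version(self, elements, change_note):
        """Return a new model with the version bumped; the old one is unchanged."""
        return replace(
            self,
            version=self.version + 1,
            elements=tuple(elements),
            change_note=change_note,
        )


@dataclass(frozen=True)
class SemanticLink:
    """States how a definition in one model relates to one in another model."""
    source: str
    target: str
    link_type: str  # illustrative value only, e.g. "refines"


v1 = DataModel("Clinic dataset", 1, ("Patient Name", "Diagnosis code"))
v2 = v1.new_version(v1.elements + ("Admission date",), "Added admission date")
history = [v1, v2]  # keeping every version preserves the record of changes

link = SemanticLink("Clinic dataset.Diagnosis code", "Registry dataset.Diagnosis", "refines")

for model in history:
    print(model.version, model.change_note, model.elements)
```

Making each version immutable means an earlier definition can always be compared with a later one, which is the point of tracking changes in design, implementation or understanding.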
Mauro has a fast-growing community of developers and users from a range of healthcare and other public sector organisations, collaborating to improve and use the metadata catalogue. Healthcare is just one of many sectors that can benefit from Mauro’s data mapping capabilities to organise metadata into a single, understandable resource.
Mauro’s extensible and open-source design means it can be customised to suit a wide range of applications. OCC can offer the following services:
- Developing a new front end and/or working with an existing system
- Adding new functionality
- Building new plugins, or customising existing ones
- Providing UX design services for new use cases
- Setting up new instances of Mauro
- Integrating Mauro with another system (such as a data warehouse)
- Using Mauro to federate metadata across multiple organisations
Companies with in-house developers, or those who want to develop the catalogue themselves, can take the core of Mauro and build their own front end. If required, they can seek support from either the open-source community or OCC themselves.
Alternatively, OCC’s developers can work with companies to tailor Mauro to suit their specific needs. Drawing on their experience of developing Mauro itself, OCC can:
- Provide custom development services. This can include anything from modifying the user interface to meet company standards, to integrating new plugins for specific data formats
- Help with hosting and deploying new features as well as ongoing maintenance
Mauro tracks how data has flowed through an organisation as well as how it has been interpreted.
No matter the type, location or current structure of your data assets, OCC can assist you in developing a bespoke metadata management solution which will help you to organise, understand and utilise every item of metadata.
Find out more about OCC projects here.
For more information, contact us at Info@oxfordcc.co.uk.