The Cohort Migrator Toolkit
A methodology to harmonise different cohorts into a standard data schema, helping the research community to generate evidence from a wider variety of data sources.

GitHub Code here.

About CMToolkit

CMToolkit is a Python-based application designed to migrate and harmonise clinical cohorts from CSV format into the OHDSI OMOP CDM schema. Because every cohort ends up in the same schema, the same scripts can be reused to export several cohorts into a new system, which increases the interoperability of the data.
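To make the idea concrete, here is a minimal sketch of the kind of transformation involved: wide CSV rows (one column per cohort attribute) become long records shaped like rows of the OMOP CDM measurement table. This is not CMToolkit's actual code. The function `migrate_rows`, the column names, and the concept IDs are all hypothetical and chosen only for illustration; real concept IDs would come from the OMOP standardized vocabularies.

```python
import csv
import io

# Hypothetical mapping from cohort CSV columns to concept IDs.
# These IDs are placeholders, NOT real OMOP vocabulary entries.
COLUMN_TO_CONCEPT = {
    "weight_kg": 900001,
    "height_cm": 900002,
}


def migrate_rows(csv_text, mapping=COLUMN_TO_CONCEPT):
    """Turn wide cohort rows into long, measurement-style records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, concept_id in mapping.items():
            value = (row.get(column) or "").strip()
            if not value:
                continue  # skip attributes that are missing for this subject
            records.append({
                "person_id": int(row["subject_id"]),
                "measurement_concept_id": concept_id,
                "value_as_number": float(value),
                "measurement_source_value": column,  # keep the original column name
            })
    return records


# Tiny made-up cohort: subject 2 has no recorded weight.
SAMPLE = "subject_id,weight_kg,height_cm\n1,70.5,172\n2,,160\n"
records = migrate_rows(SAMPLE)
```

Under these assumptions, migrating a second cohort only requires a different mapping dictionary; the function itself is reused unchanged.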





6,669 subjects

398 cohort attributes

172 standardized concepts

Why CMToolkit?

  • CMToolkit proposes a strategy for the semi-automatic harmonisation of large numbers of medical concepts in clinical studies.

  • It creates new opportunities for the study of rare conditions, where isolated cohorts are typically too small to provide enough statistical evidence.

  • The results can augment clinical knowledge by automatically computing new patient information during the migration stage.
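As an illustration of the last point, the sketch below derives a value that the source cohort did not record directly. The choice of body mass index, the function name `add_bmi`, and the field names are assumptions made for this example; the text above does not say which values CMToolkit computes or how.

```python
def add_bmi(patient):
    """Return a copy of the patient record with BMI added when it can be computed.

    Hypothetical field names: weight_kg (kilograms) and height_cm (centimetres).
    """
    enriched = dict(patient)  # leave the input record untouched
    weight = patient.get("weight_kg")
    height = patient.get("height_cm")
    if weight is not None and height:  # also guards against a zero height
        height_m = height / 100
        enriched["bmi"] = round(weight / height_m ** 2, 1)
    return enriched


complete = add_bmi({"weight_kg": 72.0, "height_cm": 180.0})
incomplete = add_bmi({"weight_kg": 72.0})
```

Here `complete` gains a `bmi` entry of 22.2, while `incomplete` is returned without one because the height is missing.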

Core Team

This methodology was validated by the cohort data owners and developed by the following team members:

João R. Almeida

Researcher at University of Aveiro

Luís B. Silva

CTO at BMD-Software

José L. Oliveira

Full professor at University of Aveiro