From definitions to design: why seven dimensions matter for data visiting

20 Jan 2026

The Seven-Dimensional Data Visiting Framework provides a practical, governance-by-design structure for configuring proportionate, lawful, and trustworthy data visiting in genomic research, grounded in the GA4GH data visiting lexicon.

By Donrich Thaldar (Professor of Law, University of KwaZulu-Natal)

In recent years, data visiting has moved from an interesting technical concept to a practical necessity in genomics and health research. As legal constraints on cross-border data transfers tighten and expectations around privacy, sovereignty, and accountability rise, simply “sharing data” by copying datasets from one institution to another is increasingly untenable. GA4GH has responded to this reality by clarifying what we mean when we speak about data visiting, federated data analysis, and related concepts, through a shared lexicon that enables researchers, institutions, and policymakers to speak the same language (https://www.ga4gh.org/document/ga4gh-data-sharing-lexicon/ and https://doi.org/10.1186/s40246-025-00784-z). 

But clarity of terminology, while essential, is only the first step. The harder question comes next: how should data visiting actually be designed and governed in practice?

That question motivated the development of the Seven-Dimensional Data Visiting Framework (7D-DVF), which takes the GA4GH lexicon as its starting point and translates it into a practical, configurable governance tool (https://doi.org/10.1186/s40246-025-00864-0). Instead of treating data visiting as a single, monolithic model, the framework disaggregates it into seven adjustable dimensions, each of which functions as a governance lever that can be tuned to fit a specific legal, ethical, and technical context. The framework is descriptive, not prescriptive: it provides a structured way to describe, compare, and justify governance choices across different data visiting models, and it does not tell users which configuration to adopt.

Why a multidimensional approach?

Discussions about data visiting in genomics frequently treat it as a single, coherent model with a fixed set of properties. Even where data visiting is formally defined, it is often spoken about as if it were inherently secure, privacy-preserving, or legally safer, rather than as a practice that can be implemented in many different ways. This tendency to treat data visiting as a unified object obscures the fact that its governance-relevant features are neither uniform nor inevitable.

The 7D-DVF responds to this conceptual compression by disaggregating data visiting into a set of independent, adjustable dimensions. In doing so, it reframes data visiting not as a single solution, but as a configurable space of possible arrangements, in which properties such as security, visibility, autonomy, control, and auditability vary by design. This multidimensional structure makes the complexity of data visiting explicit and usable, turning it into a set of governance levers rather than a monolithic model. Which configurations are permissible or appropriate in any given case will, in turn, be constrained by applicable legal and ethical requirements, which vary by jurisdiction and context.

The seven dimensions as practical levers

The first dimension, researcher autonomy, concerns how much freedom a visiting researcher (or algorithm) has. Can custom code be executed, or are users limited to fixed queries? High autonomy may accelerate discovery but requires stronger compensating safeguards; low autonomy can dramatically reduce risk but may constrain scientific creativity.

Data location asks a deceptively simple question: where does the data physically or virtually reside during analysis? Centralised clouds, institutional servers, and fully decentralised in-situ models all carry different implications for sovereignty, jurisdiction, and control. In many African and Asian contexts, this dimension is decisive for compliance with data localisation laws.

Closely related is data visibility — what the researcher can actually see. Full dataset access, metadata-only views, and query-only interfaces all fall under data visiting, but they support very different risk–utility trade-offs. Query-only visibility, for example, can enable international collaboration even where raw data may never lawfully leave a jurisdiction.

The nature of the shared data makes explicit what is sometimes left vague: are we dealing with identifiable or anonymised data? Each category maps onto different legal obligations and ethical expectations. Treating this as a configurable dimension helps avoid both over-regulation and under-protection.

At the other end of the analytical pipeline lies output governance. What happens to results once an analysis is complete? Are outputs released automatically, reviewed before export, or subjected to privacy-enhancing techniques such as aggregation or noise injection? In many real-world systems, output governance is the last — and most important — line of defence against unintended disclosure.
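To make the output-governance options concrete, the following is a minimal sketch of two of the techniques mentioned above, aggregation with small-cell suppression and noise injection, applied to aggregate counts before they leave the environment. The function name, the minimum cell size of 5, and the sample counts are illustrative assumptions, not part of the 7D-DVF or any GA4GH specification, and the noise scale is not calibrated to any formal privacy guarantee.

```python
import random
from typing import Dict, Optional


def govern_counts(
    counts: Dict[str, int],
    min_cell_size: int = 5,      # assumed threshold, chosen for illustration only
    noise_scale: float = 0.0,    # 0.0 disables noise injection
    rng: Optional[random.Random] = None,
) -> Dict[str, Optional[float]]:
    """Apply two illustrative output controls to aggregate counts before export."""
    rng = rng or random.Random()
    released: Dict[str, Optional[float]] = {}
    for group, count in counts.items():
        if count < min_cell_size:
            # Small cells are suppressed instead of being released.
            released[group] = None
            continue
        noise = 0.0
        if noise_scale > 0:
            # The difference of two exponential draws with mean `noise_scale`
            # follows a Laplace distribution with that scale.
            noise = rng.expovariate(1 / noise_scale) - rng.expovariate(1 / noise_scale)
        released[group] = count + noise
    return released


# Hypothetical counts, invented for this example.
raw_counts = {"variant_A": 120, "variant_B": 3, "variant_C": 47}
print(govern_counts(raw_counts))                   # suppression only
print(govern_counts(raw_counts, noise_scale=2.0))  # suppression plus noise
```

In a real deployment these automatic controls would typically sit alongside, not replace, the human review before export that the paragraph above describes.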

The trust and control model captures who is in charge. Is governance centralised in a single trusted research environment? Distributed across a federation of peers? Or embedded computationally through technical controls that enforce rules automatically? This dimension is particularly important where historical power imbalances or concerns about data extraction shape institutional and community trust.

Finally, auditability and traceability ask whether actions can be reconstructed and scrutinised. Comprehensive logging, selective monitoring, and embedded provenance metadata all offer different ways of ensuring accountability. In an era of AI-driven analysis, traceability is increasingly central to both legal defensibility and public trust.

Figure 1. Seven-Dimensional Data Visiting Framework (7D-DVF) (Source: author)

From abstraction to application

What makes the 7D-DVF especially useful is that it does not prescribe a single “correct” configuration. Instead, it supports proportionality. A rare disease registry spanning multiple jurisdictions might combine restricted visibility, medium autonomy, strong output governance, and full auditability. A low-risk federated AI training exercise on de-identified data might legitimately relax some of these controls to maximise scalability and efficiency.
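One way to picture a configuration is as a record with one field per dimension. The sketch below is a hypothetical illustration, not an official schema: the class name, field names, and free-text values are assumptions, with the values copied from the rare disease registry example above. Dimensions that the example leaves open stay unset, so the record also works as a checklist of choices still to be made and justified.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class DataVisitingConfig:
    """One setting per 7D-DVF dimension; None means 'not yet decided'."""

    researcher_autonomy: Optional[str] = None
    data_location: Optional[str] = None
    data_visibility: Optional[str] = None
    nature_of_data: Optional[str] = None
    output_governance: Optional[str] = None
    trust_and_control: Optional[str] = None
    auditability: Optional[str] = None

    def undecided(self) -> List[str]:
        """Names of the dimensions that still need an explicit choice."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Values taken from the rare disease registry example in the text;
# the remaining dimensions are deliberately left unset.
rare_disease_registry = DataVisitingConfig(
    researcher_autonomy="medium",
    data_visibility="restricted",
    output_governance="strong",
    auditability="full",
)

print(rare_disease_registry.undecided())
```

Free-text values keep the sketch neutral about which settings each dimension admits; a production schema would more likely use a controlled vocabulary agreed by the parties involved.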

This modularity also makes the framework legible to non-technical stakeholders. Ethics committees can use it as a structured checklist. Policymakers can see how legal requirements translate into system design choices. Technical teams gain a shared vocabulary for explaining why a particular architecture looks the way it does.

Why this matters for GA4GH

This work builds on and complements ongoing efforts within GA4GH, including the Data Visiting Study Group, which has played a central role in developing the shared lexicon for data visiting and continues to provide a forum for advancing technical, legal, and governance discussions in this area.

As federated systems, trusted research environments, and AI-driven analytics continue to mature, the central challenge will not be whether data visiting is technically possible, but whether it is governed well. The 7D-DVF provides a structured way to move from reactive compliance to governance-by-design — enabling privacy, sovereignty, and research utility to be balanced deliberately rather than by accident. The framework is intended to support the work of the Data Visiting Study Group and related GA4GH initiatives by offering a shared analytical structure for examining how data visiting can be designed and operationalised across differing legal and ethical contexts.

In this sense, the framework serves as an invitation to design data visiting systems that are transparent, adaptable, and worthy of the trust placed in them.

Donrich Thaldar is a Professor of Law at the University of KwaZulu-Natal and a co-chair of the GA4GH Data Visiting Study Group, with research interests in data governance, health law, and genomic research.
