Visual metaphor showing locked data icons inside an “open” box, symbolizing how access to supposedly open data can still be controlled by gatekeepers.

Open Data or Closed Doors? 3 Powerful Insights on Who Controls Access and Sharing

I. Open Data: Ideal vs. Reality

In academic, policy, and technology circles, open data is frequently presented as a universal public good—capable of unlocking innovation, enabling cross-border collaboration, and accelerating scientific discovery. The narrative paints a picture of frictionless exchange, where knowledge flows without barriers across disciplines, regions, and institutional boundaries.

However, the practical experience of many data contributors tells a more complex story. The label “open” often conceals a layered set of gatekeeping mechanisms—technical, legal, and institutional—that ultimately shape who can use the data, under what conditions, and at what pace. These filters may not be explicit, yet they profoundly influence both the reach and the impact of datasets.

Barriers can take subtle forms: restrictive file formats that require proprietary software, metadata stripped of crucial context, or access protocols that are technically public but operationally opaque. In such cases, openness exists more as a formal declaration than as a functional reality, limiting practical reuse despite nominal availability.

To be genuinely open, data must meet more than the existence test of having a public URL. It must be findable via well-indexed repositories, usable without excessive technical overhead, and accountable through transparent provenance and governance that safeguard the interests of both contributors and users.
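
To make the distinction concrete, here is a minimal sketch, in Python, of what a "functional openness" review might check before a dataset is treated as genuinely open. The field names, format list, and license list are illustrative assumptions, not drawn from any particular repository or standard.

```python
# A minimal sketch of a "functional openness" check for a dataset record.
# Field names (doi, file_format, license_id, provenance, access_notes) are
# hypothetical placeholders, not part of any real repository schema.

OPEN_FORMATS = {"csv", "json", "geojson", "txt"}        # readable without proprietary software
MACHINE_READABLE_LICENSES = {"CC0-1.0", "CC-BY-4.0"}    # unambiguous, widely understood reuse terms

def functional_openness_report(record: dict) -> dict:
    """Return pass/fail checks that go beyond 'has a public URL'."""
    return {
        "findable": bool(record.get("doi") or record.get("catalog_entry")),
        "open_format": record.get("file_format", "").lower() in OPEN_FORMATS,
        "clear_license": record.get("license_id") in MACHINE_READABLE_LICENSES,
        "has_provenance": bool(record.get("provenance")),       # who collected it, when, under what terms
        "documented_access": bool(record.get("access_notes")),  # how to actually obtain and use it
    }

record = {
    "doi": "10.1234/example",        # hypothetical identifier
    "file_format": "xlsx",           # needs spreadsheet software to open comfortably
    "license_id": "custom-terms",    # not machine readable, legally ambiguous
    "provenance": None,
}
report = functional_openness_report(record)
print(report)                                     # visible online, yet several checks fail
print("functionally open:", all(report.values()))
```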

Absent these conditions, open data risks becoming an exercise in performative transparency—visible enough to claim compliance with openness principles, yet structured in ways that preserve control for a limited set of intermediaries. This raises a critical question: is “open” merely a label, or is it a measurable practice of shared stewardship?

II. Hidden Gatekeeping in “Open” Systems

While open platforms are often celebrated for their accessibility, many conceal structural layers of control that operate beneath the surface. These mechanisms may be invisible to casual users but can profoundly shape whose data is visible, whose is marginalized, and whose is excluded entirely. Such gatekeeping can emerge from infrastructure ownership, the design of data standards, and economic systems that reward certain actors while sidelining others.

II.A. Infrastructure and Portal Ownership

In digital biodiversity infrastructures, the entity that owns or operates the platform wields significant influence over access conditions and the pace of data curation. Control over hosting environments, authentication systems, and primary access points allows these actors to set priorities and determine the terms of engagement.

Even in systems branded as community-driven, decision-making power may reside with a small administrative group. The architecture—APIs, rate limits, uptime commitments—reflects the priorities of those operators, meaning that a single policy or technical change can redefine who participates in data-driven research.
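
A toy calculation makes this tangible: a single operator-set parameter, the API rate limit, decides how long a full harvest of a platform's records takes, and therefore who can realistically attempt one. The record counts and limits below are invented for illustration, not taken from any real platform.

```python
# Illustrative only: how an operator-chosen rate limit translates into
# wall-clock cost for anyone working with a platform at scale.

def harvest_days(total_records: int, records_per_request: int, requests_per_hour: int) -> float:
    """Days needed to page through a dataset under a given rate limit."""
    requests_needed = -(-total_records // records_per_request)  # ceiling division
    return requests_needed / requests_per_hour / 24

TOTAL = 50_000_000  # hypothetical number of occurrence records
for limit in (60, 600, 6_000):  # requests per hour allowed by the operator
    print(f"{limit:>6} requests/hour -> {harvest_days(TOTAL, 300, limit):7.1f} days")
```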

II.B. Curation, Standards, and Editorial Bias

Standards make data interoperable, enabling datasets from different sources to work together. Yet every standard encodes choices: what counts as valid, how it is labeled, and what is left undefined. These decisions can privilege well-documented subjects while sidelining data that does not fit established schemas.

During curation, content outside the dominant taxonomies or formats—such as local classification systems or unconventional field observations—may be altered or removed to fit the standard. While this may improve technical uniformity, it can also erase valuable nuance, narrowing the diversity of information preserved.
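
A simplified sketch of how this happens in practice: forcing incoming records into a fixed target schema (loosely modeled on Darwin Core-style terms) quietly discards anything the schema has no slot for, such as a vernacular name or a local habitat classification. The schema and the example record are illustrative assumptions.

```python
# Sketch of a curation step that conforms records to a fixed schema.
# TARGET_SCHEMA loosely mimics a Darwin Core-style term list; the local
# fields in the example record are hypothetical.

TARGET_SCHEMA = {"scientificName", "decimalLatitude", "decimalLongitude", "eventDate", "recordedBy"}

def conform(record: dict) -> tuple[dict, dict]:
    """Split a record into fields the schema keeps and fields it silently drops."""
    kept = {k: v for k, v in record.items() if k in TARGET_SCHEMA}
    dropped = {k: v for k, v in record.items() if k not in TARGET_SCHEMA}
    return kept, dropped

incoming = {
    "scientificName": "Espeletia grandiflora",
    "decimalLatitude": 4.59,
    "decimalLongitude": -74.07,
    "localName": "frailejón",               # vernacular name carrying cultural context
    "localHabitatClass": "páramo húmedo",   # community classification with no schema slot
    "fieldNotes": "flowering after the first rains",
}

kept, dropped = conform(incoming)
print("published:", kept)
print("lost in curation:", dropped)  # the nuance the standard has no place for
```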

II.C. Funding Flows and Citation Economies

Financial support determines which datasets remain accessible and up to date. Resources typically flow toward projects with established visibility, creating a self-reinforcing loop: well-funded datasets attract more citations, which in turn secure additional funding and policy influence.

Meanwhile, valuable but less-visible datasets—often maintained by smaller teams—may receive limited attention and risk obsolescence. Without systems that recognize and resource these contributors, open data ecosystems risk becoming concentrated around a small set of dominant actors.

III. Ethics of Access: Consent, Provenance, and Benefit Sharing

Behind every dataset lies a chain of decisions, relationships, and contexts. Ethical access is not simply about allowing downloads—it is about recognizing the people, ecosystems, and histories represented within the data. Consent, provenance, and benefit sharing are the cornerstones of responsible openness.

Without these elements, the promise of open data risks becoming transactional rather than collaborative, privileging speed and scale over fairness and long-term trust.

III.A. Informed Consent in Data Collection

The assumption that digitized data can always be freely shared ignores the necessity of informed consent—particularly for datasets tied to culturally significant areas or sensitive ecological sites.

Consent is not a static agreement. It should evolve alongside the ways data is reused, reinterpreted, or combined with other sources. Without this flexibility, contributors may lose control over how their information is represented and applied.

Clear, revisitable consent protocols build confidence and reduce the risk of ethical breaches, ensuring that openness does not come at the expense of autonomy.

III.B. Provenance as Accountability

Provenance is the documented record of where data comes from, how it has been modified, and who has been responsible for its stewardship. It is critical for trust, reproducibility, and recognition.

Yet, during aggregation or integration, provenance is often lost or stripped away—breaking the ability to verify, attribute, or contest data use. Once that link is gone, accountability becomes harder to enforce.
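
The sketch below contrasts a naive merge, which discards origin information, with an aggregation step that carries a provenance trail alongside each record. The provenance structure is a simplified, hypothetical stand-in for richer models such as W3C PROV, and all field names are assumptions.

```python
# Two ways to aggregate records: one drops provenance, one carries it forward.
from datetime import date

def merge_naive(sources: list[dict]) -> list[dict]:
    """Pool the observations and throw the origin information away."""
    return [obs for src in sources for obs in src["observations"]]

def merge_with_provenance(sources: list[dict]) -> list[dict]:
    """Pool the observations while recording who supplied each one, and under what terms."""
    merged = []
    for src in sources:
        for obs in src["observations"]:
            merged.append({
                **obs,
                "provenance": {
                    "steward": src["steward"],
                    "license": src["license"],
                    "retrieved": date.today().isoformat(),
                },
            })
    return merged

sources = [
    {"steward": "Community herbarium A", "license": "CC-BY-4.0",
     "observations": [{"taxon": "Polylepis sp.", "count": 12}]},
    {"steward": "Regional agency B", "license": "CC0-1.0",
     "observations": [{"taxon": "Polylepis sp.", "count": 3}]},
]

print(merge_naive(sources))            # the counts survive, the accountability does not
print(merge_with_provenance(sources))  # every record still answers "who, and on what terms?"
```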

III.C. Fair and Equitable Benefit Sharing

For openness to be more than a one-way extraction of information, benefits must flow back to the contributors. These can include shared research findings, co-authorship opportunities, improved conservation outcomes, or technical tools tailored to local needs.

Benefit sharing can strengthen trust and encourage participation, turning data release into a mutual exchange rather than a unilateral transaction.

Mechanisms for benefit sharing should be embedded into agreements and workflows from the start, not treated as an afterthought.

Without them, open data infrastructures risk becoming extractive systems that erode the very relationships needed for their sustainability.

IV. Licensing, Governance, and Infrastructure Geography

Licensing terms, governance arrangements, and the physical and institutional locations of infrastructure shape how “open” data can truly be. These factors are often perceived as background details, yet they determine practical access, participation, and influence.

Examining them closely reveals that openness is not a static state—it is a negotiated condition shaped by decisions at multiple levels.

IV.A. Licensing as a Tool of Control

Licenses can enable openness, but they can also limit it. Clauses restricting commercial use, derivative works, or redistribution may narrow who can apply the data and under what circumstances.

Complex legal language can discourage smaller organizations or researchers without legal expertise from engaging with a dataset, even when it is technically “open.”

Clarity, simplicity, and transparency in licensing not only broaden potential use but also reduce misunderstandings and legal risks for all parties involved.
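
As a hedged illustration, the snippet below encodes a few common license clauses as data and checks an intended use against them before reuse proceeds. The license set and the use categories are simplified examples for the sketch, not legal guidance.

```python
# Illustrative mapping from license identifiers to what they permit.
# Real licenses carry far more nuance; this is not legal advice.

LICENSE_TERMS = {
    "CC0-1.0":      {"commercial": True,  "derivatives": True,  "redistribution": True},
    "CC-BY-4.0":    {"commercial": True,  "derivatives": True,  "redistribution": True},
    "CC-BY-NC-4.0": {"commercial": False, "derivatives": True,  "redistribution": True},
    "CC-BY-ND-4.0": {"commercial": True,  "derivatives": False, "redistribution": True},
}

def reuse_allowed(license_id: str, intended_use: str) -> bool:
    """Permit reuse only when the license is known and explicitly allows the intended use."""
    terms = LICENSE_TERMS.get(license_id)
    if terms is None:   # unknown or bespoke terms: ambiguity itself becomes a barrier
        return False
    return terms.get(intended_use, False)

print(reuse_allowed("CC-BY-NC-4.0", "commercial"))                 # False: the NC clause narrows who can apply the data
print(reuse_allowed("custom-institutional-terms", "derivatives"))  # False: unclear terms deter reuse in practice
```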

IV.B. Governance Structures and Decision-Making

Every open data platform operates under a governance model—formal or informal—that determines how priorities are set, disputes are resolved, and changes are implemented.

When governance is concentrated within a small set of actors, decision-making may reflect narrow perspectives. This can limit innovation, exclude valuable contributions, and reduce trust in the system.

IV.C. Geography and Concentration of Infrastructure

Hosting locations, funding sources, and maintenance responsibilities influence not only performance and reliability, but also governance and data flow patterns. Infrastructure concentration in a few regions can lead to dependency on external decision-making and technical standards.

Distributed models—where data storage, processing, and governance are shared—can help balance influence and increase resilience against technical or policy shifts in any single location.

Such distribution also supports redundancy, reduces single points of failure, and fosters local capacity to manage and adapt systems as needs evolve.

In practice, infrastructure geography is as much a strategic choice as it is a technical one, with lasting effects on how “open” an open data ecosystem can be.

V. Pathways Forward: From Open Data to Just and Equitable Data

Turning open data from an aspirational slogan into a living, equitable practice requires more than releasing datasets into the public domain. It calls for reshaping the systems, policies, and relationships that govern how data is produced, accessed, and used. This transformation is not only technical—it is cultural, procedural, and relational.

The task ahead is to build infrastructures that are transparent in their governance, fair in their distribution of benefits, and resilient in the face of evolving social, environmental, and technological challenges. Doing so requires an explicit shift from seeing openness as a transactional act to treating it as an ongoing commitment to shared responsibility and accountability.

Below are key strategies that can move open data systems toward that vision.

V.A. Redefining “Open” with Equity at Its Core

The conventional definition of “open” often stops at accessibility—whether the data can be viewed or downloaded. This view is incomplete. A richer definition includes the capacity for diverse actors to influence priorities, participate in governance, co-create standards, and benefit from outcomes.

Equity-centered openness acknowledges that not all participants enter the data ecosystem with equal resources, capacities, or influence. Without mechanisms to level the playing field, the promise of openness risks reproducing existing disparities.

Embedding equity requires revisiting governance charters, funding criteria, and operational norms so that they explicitly include provisions for representation, fair attribution, and benefit sharing. Infrastructures that adopt this approach not only expand participation but also enhance the quality and relevance of the data they host.

V.B. Building Participatory Governance Models

Governance is where the principles of openness are either reinforced or undermined. A participatory model ensures that decision-making reflects the diversity of those affected by data use—not just the perspectives of technical or administrative elites.

Such models can incorporate mechanisms like rotating leadership roles, open nomination processes, and public consultation periods for policy changes. These tools prevent the consolidation of power and encourage transparent, accountable processes.

When governance structures are inclusive, they tend to produce more adaptive and context-aware policies. This adaptability strengthens the system’s ability to address emerging challenges—whether technological, ecological, or social—without compromising its core values.

V.C. Investing in Local Infrastructure and Skills

Equity in open data is impossible without equitable access to the tools, infrastructure, and expertise needed to participate fully. This means going beyond one-off workshops or temporary funding streams and committing to long-term investment in locally controlled infrastructure.

Such investment includes physical assets like servers and high-speed connectivity, but also human capacity—data managers, curators, and technical specialists embedded within local institutions. These teams ensure that data is not only stored but actively managed, interpreted, and used to address local priorities.

By strengthening local infrastructure, data ecosystems reduce dependency on external platforms, enhance resilience against geopolitical or market shifts, and foster innovations that emerge from on-the-ground realities rather than imported templates.

Ultimately, this approach transforms data contributors into co-governors, ensuring that their role extends far beyond initial collection.

V.D. Embedding Ethical Frameworks in Data Workflows

Ethics cannot be an afterthought applied at the point of data release; it must be embedded in every stage of the data lifecycle. This means designing workflows that include explicit steps for informed consent, culturally appropriate metadata, and transparent documentation of data provenance.

Operationalizing ethics might involve automated systems that prompt curators to review consent terms before ingestion, metadata fields that capture community permissions, and dashboards that display benefit-sharing commitments in real time.
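
One possible shape for such a safeguard, sketched here under assumed field names rather than any existing platform's metadata profile, is an ingestion gate that holds datasets whose consent metadata is missing, expired, or does not cover the requested use.

```python
# Sketch of an ingestion gate that checks consent metadata before a dataset is accepted.
# Field names (consent_status, permitted_uses, consent_review_due) are hypothetical.
from datetime import date

def consent_check(metadata: dict, requested_use: str, today: date | None = None) -> list[str]:
    """Return human-readable issues; an empty list means ingestion may proceed."""
    today = today or date.today()
    issues = []
    if metadata.get("consent_status") != "granted":
        issues.append("consent not recorded as granted")
    if requested_use not in metadata.get("permitted_uses", []):
        issues.append(f"use '{requested_use}' is not covered by the recorded consent")
    review_due = metadata.get("consent_review_due")
    if review_due is None or date.fromisoformat(review_due) < today:
        issues.append("consent terms are due for review (consent is not a static agreement)")
    return issues

metadata = {
    "consent_status": "granted",
    "permitted_uses": ["research", "conservation_planning"],
    "consent_review_due": "2023-01-01",   # past its review date: prompts the curator to revisit terms
}

for issue in consent_check(metadata, requested_use="commercial_analytics"):
    print("HOLD:", issue)
```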

Such integration turns ethics from a reactive compliance measure into a proactive, visible safeguard—one that strengthens trust between data providers, managers, and users.

V.E. Moving from Extraction to Reciprocity

Historically, many data systems have functioned on an extractive model: data flows outward from contributors to centralized repositories, while benefits flow in the opposite direction, often bypassing the origin entirely. A reciprocal model reverses this logic.

Reciprocity involves tangible returns—shared analytical tools, access to processed datasets, capacity-building programs, or revenue-sharing agreements when commercial applications emerge from shared resources.

This shift is more than ethical; it is strategic. When communities see direct, sustained benefits from participation, they are more likely to continue contributing high-quality, timely data. Trust deepens, collaborations endure, and the system as a whole becomes stronger.

Ultimately, reciprocity transforms open data from a static repository into a living network of mutual investment and shared outcomes.

Conclusion: Reclaiming the Meaning of Openness

Openness has long been celebrated as both a moral commitment and a technical achievement. Yet in practice, its meaning is often reduced to a procedural checkbox—declared in policy, but thinly enacted in reality. We must ask: who gets to define “open,” and whose interests are served when that definition is sustained without challenge?

To reclaim openness, we must face difficult truths: data is never neutral; infrastructures are built upon political choices; and governance frameworks inevitably decide whose voices will resonate and whose will be muted. Justice is not a decorative principle—it is the foundation of legitimacy. If openness continues to be measured solely by accessibility metrics, it will reproduce the very asymmetries it once promised to dismantle.

The deeper question is not simply whether the world can reach the data, but whether the data ecosystems we are constructing can genuinely serve biodiversity, science, and the diverse communities whose knowledge and resources they rely upon. Reclaiming the meaning of openness requires courage: the courage to redistribute authority, to embed equity into every stage of the data lifecycle, and to acknowledge that true openness is measured not by the volume of data shared, but by the fairness and reciprocity of the relationships it sustains.

References

Borgman, C. L. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press. DOI: https://doi.org/10.7551/mitpress/9964.001.0001

Bezuidenhout, L., Leonelli, S., Kelly, A., & Rappert, B. (2017). Beyond the digital divide: Towards a situated approach to open data. Science and Public Policy, 44(4), 464–475. DOI: https://doi.org/10.1093/scipol/scw036

International Indigenous Data Sovereignty Interest Group. (2019). CARE Principles for Indigenous Data Governance. Global Indigenous Data Alliance. URL: https://www.gida-global.org/care

Parsons, M., Fisher, K., & Nalau, J. (2022). Ethics of data sharing in environmental sciences. Nature Sustainability, 5, 15–21. DOI: https://doi.org/10.1038/s41893-021-00790-4

Convention on Biological Diversity (CBD). (2010). Nagoya Protocol on Access to Genetic Resources and the Fair and Equitable Sharing of Benefits Arising from their Utilization. URL: https://www.cbd.int/abs/

Leonelli, S. (2016). Data-Centric Biology: A Philosophical Study. University of Chicago Press. DOI: https://doi.org/10.7208/chicago/9780226416502.001.0001

FAQ: Questions on Biodiversity Data Governance

Why do discussions about biodiversity data often center on FAIR and CARE principles?

FAIR (Findable, Accessible, Interoperable, Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) are widely cited in policy and data infrastructure debates. However, they are not the only ways to think about data stewardship, and each has its blind spots. FAIR tends to emphasize technical accessibility, while CARE focuses on rights and ethics—but neither automatically guarantees equity or justice. Other approaches rooted in local governance, customary law, or environmental ethics may be equally relevant, yet often remain underrepresented. See more diverse perspectives at Ecolonical.

How does provenance reveal more than just a data trail?

Provenance is not only about tracing datasets for reproducibility—it can expose who has authority, whose contributions are visible, and whose are ignored. It can uncover histories of exclusion or exploitation, as well as chains of trust and consent. When provenance is incomplete, it’s worth asking: who benefits from those gaps, and who is left invisible?

Why might “open” biodiversity data still be inaccessible in practice?

Open data labels can mask practical barriers: restrictive licensing, proprietary formats, institutional approvals, or technical complexity. Openness on paper doesn’t always translate into openness in reality. The question is less “is it open?” and more “open for whom, and under what terms?”. See examples of such barriers at Ecolonical Open Data Barriers.

What challenges arise when applying community-oriented data governance models?

Whether labeled CARE or otherwise, community-oriented models require more than statements of principle. Barriers include lack of standardized cultural metadata, power imbalances in decision-making, and the absence of enforceable benefit-sharing. Tokenistic adoption of such frameworks can create an appearance of fairness without delivering real change.

Once biodiversity data is public, can it still be governed locally?

It’s possible, but only if governance and technical safeguards are built in from the start: custom licensing, Traditional Knowledge (TK) labels, and binding agreements that survive beyond initial publication. Otherwise, public release can effectively mean loss of control. Sometimes, the most ethical decision may be not to publish certain datasets at all.

How can benefit-sharing move beyond symbolic gestures?

Real benefit-sharing goes beyond co-authorship mentions—it involves resource access, training, and genuine co-decision-making power. If benefits are defined only by the institutions holding the data, they risk reinforcing the same inequities that open data was meant to challenge.

Who decides the rules of “open” biodiversity data, and who should?

Governance is about power: who sets the rules, who enforces them, and who gets excluded from the conversation. A truly equitable system would require shifting decision-making authority toward those most affected by data use, not just those with the infrastructure to store and distribute it. For critical reflections, visit Ecolonical Data Governance.

Author

  • Milena-Jael Silva-Morales, AI and Data Expert in Urban & Territorial Systems, Energy-Biodiversity-Water Nexus, and Ethical AI.

    Milena-Jael Silva-Morales is a systems engineer with a PhD in Urban and Territorial Systems and the founder of Ecolonical LAB, an independent research lab integrating data science, AI, and territorial systems to address local and global sustainability challenges. With over 15 years of experience leading international, multidisciplinary R&D initiatives, she is recognized for bridging science, technology, and policy to deliver transformative solutions in water, energy, and biodiversity systems.

This article is governed by the Ecolonical Open Knowledge License (EOKL Lite V1). This license explicitly prohibits the use of its contents for AI model training, dataset integration, algorithmic processing, or automated decision-making systems. Unauthorized computational aggregation, reproduction beyond permitted terms, and any use conflicting with open knowledge principles are strictly restricted.

For legally binding terms, compliance obligations, and permitted exceptions, refer to the License Usage Policy.

Under specific conditions, this content aligns with the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. However, any AI-related processing, direct commercial exploitation, or automated derivative work remains subject to EOKL Lite V1 restrictions.
