
DataKind’s Ethical Principles

The following is inspired by DataKind UK's original publication of their principles at https://www.datakind.org/2018/01/22/doing-data-for-good-right/

Intended audience: DataKind Volunteers

Ensuring our work is done in the service of humanity and in alignment with our values is our top priority at DataKind. At each stage of a DataKind project, there are various actions DataKinders take to make sure that the project is done ethically. These actions help DataKind teams minimize the potential harm of solutions, allowing them to be more sustainable, impactful, and ethical.

This article outlines DataKind’s ethical principles; what each principle looks like within a specific project stage is described in articles throughout the Projects section of the Playbook. At DataKind, we return to these principles at every stage of the project process:

1) Harms & Benefits Evaluation

I will actively seek to consider the potential harms and benefits of my work with DataKind.

We evaluate who will be most impacted by our work and what that impact will be. Will our project empower marginalized communities, or become an oppressive tool for the powerful? When an ethical question arises, DataKind teams work with project partners to understand their intent and goals and to ensure the recommended solution won’t harm the impacted community. This analysis takes various forms, such as evaluating the potential harms of model outputs or considering the different ways an organization might make decisions based on our analyses. Weighing benefits against harms also lets us compare to the baseline: what happens if we do nothing? Evaluation continues even through project handoff, because we can never fully predict how our models will be used in the future.

2) Context & Impact

I will actively seek to understand the context of the data and tools I use, and mitigate negative impacts to the best of my ability.

Understanding the context of a data solution is complex. It involves interrogating the biases not only of the data, but also of the partner organization, ourselves, and the community being served. This requires us to delve into the historical context of those we work with and to assess how their data was collected. Every solution has affordances and assumptions built in; we must ensure these respect the context in which the solution will be deployed, so that it has the best possible impact. Moreover, ensuring that those who are impacted consent to data collection and to proposed solutions is vital: it respects how the impacted community defines privacy, and it opens doors for participation in the solutions we build.

Addressing internal and external biases allows us to create more robust data solutions. We believe that it is essential to ensure impacted populations are equitably represented, which could be as simple as making sure there are representative and balanced identifiers in a dataset or as complex as digging into a data collection pipeline to make sure that certain groups were not left out. By making sure the impacted population is represented in the dataset and consents to the solution being developed, unforeseen discrimination and harm can be avoided.

3) Communication & Transparency

I will enable others to understand the data and analysis choices I have made, now and in the future.

We believe that people have a right to understand how the models that affect them work. We prioritize excellent documentation that takes accessibility (such as local language), readability, sustainability, and understandability into account. Good communication starts from transparency and equity, whether for impacted communities that need plain-language project briefs or for technical partners who need to understand the parameters of a model. Disaggregating datasets, diagramming model architectures, and reporting results simply but rigorously allow all participants to evaluate the state of a project, detect errors or privacy concerns, and reproduce results to validate the solution.

4) Capacity & Humility

I will actively seek to understand my own limits and the limits of the organizations involved.

As volunteers, we have finite time to dedicate to DataKind projects. Partner organizations may have limited funding, minimal technical infrastructure, or missing data. To proceed ethically in a project, we must understand our own limits and biases as well as those of our partner organizations. Understanding these constraints allows us to accurately represent what is possible to our partners and to build more creative solutions.

5) Openness & Learning

I will debate and discuss ethical choices.

We understand that the ethical principles laid out in this document may be flawed or incomplete. Ethical questions in data science rarely have a clear-cut answer, so openness to discussion, debate, and iteration is essential to reaching the best decision possible. We commit to improving our ethical toolkit by engaging with the research and technologies developed by others. By tackling ethical challenges head-on, we can become better data scientists and better people.

Conclusion

This article provides an overview of the ethical considerations that apply throughout our work at DataKind. In each Playbook article on specific project stages, you will find more detailed descriptions, actionable steps to implement these five pillars, and tools to integrate them throughout the project process. Now go forth, and do some good with data!


Contributor(s): Caitlin Augustin, Benjamin Kinsella, Daniel Nissani, Rachel Wells

Contact us

If you would like to learn more about us, partner with us, or get in touch, visit our website or email community@datakind.org.
