Summary of Ecosystem Interviews Released

June 30, 2023

Interviews with ecosystem participants have been an important way for us to learn more about the CKAN ecosystem. We’re pleased to present a summary of the themes that have emerged from these interviews with people who play very different roles in the ecosystem. We are indebted to Dr. Elise Silva for coding, analyzing, and drafting this report, which synthesizes information gathered over the span of five months from 30 different people. During these conversations, interviewees shared their motivations for being involved and their perspectives as members of the CKAN ecosystem. Earlier this year, members of the ecosystem helped to develop many of the questions and lines of inquiry that guided our conversations through a crowd-sourced collaborative document.

The findings from these conversations have informed the design of the sense-making workshops we’ve been conducting this summer. Please visit our workshop blog post to register for the remaining workshops, and feel free to share your thoughts in a comment beneath this post.

Qualitative Report

By Elise Silva, Nora Mattern, Bob Gradeck, Sami Baig, Liz Monk, Joel Natividad, Steve Saylor, Ross Reilly, David Walker

Introduction

Purpose of research:

To develop a plan to support the sustainability of the CKAN ecosystem

From the project proposal:

Following the documentation and research, the project team will engage members of the POSE in interviews and virtual workshops or discussions to better understand the perspectives, experiences, motivations, values, priorities, goals, and contexts of members of the OSE, including the core development team, systems integrators, code contributors, and implementing organizations. These conversations will help to uncover important issues affecting sustainability, growth, and inclusiveness of the OSE, and will inform planning efforts. To build capacity in conducting engaging and effective virtual workshops, the project team will participate in training in how to develop effective virtual sessions. Where possible, these engagements will be integrated with existing CKAN programming, such as the monthly “CKAN Live” presentations, weekly development meetings, and annual CKAN conference.

Methods

  • The dataset included 19 interviews with 30 individuals associated with the CKAN ecosystem.
    • The team took care to solicit interviews from international users, especially from the Global South.
  • Unstructured data consisted of notes taken during each interview by members of the POSE research team.
  • Notes were coded using constructivist grounded theory, a hybrid open/closed coding approach, and a descriptive coding method.
  • The codes were categorized and thematized using a reflexive approach to identify patterns.
  • Basic textual analysis and querying were conducted to understand thematized patterns.
  • Findings were corroborated with another layer of textual/observational data from sense-making workshops.

Interview Team:

  • Nora Mattern, University of Pittsburgh
  • Steve Saylor, University of Pittsburgh
  • Joel Natividad, DatHere
  • Sami Baig, DatHere
  • Bob Gradeck, University of Pittsburgh
  • Ross Reilly, University of Pittsburgh
  • David Walker, University of Pittsburgh
  • Liz Monk, University of Pittsburgh

Summary of Findings

Below are summarized lists of what participants said either directly or indirectly regarding what they like, dislike, or would like to change either in the CKAN product or the wider ecosystem.

Positive Sentiments

  • Appreciative of people coming together, core team’s commitment
  • Extensions and plugins help customize
  • Comfortable and widely used product
  • Customizable metadata, flexibility around metadata
  • Data request feature
  • Open source is positive and important

Pain Points

  • Implementation, onboarding, training
  • More features, such as data visualization, would be appreciated
  • Difficulty communicating, and therefore difficulty improving the product
  • Difficulty connecting community members together to collaborate or learn
  • Barriers to contributing including time, money, expertise
  • Slow response rate
  • Documentation is old
  • Requires deep expertise in order to use and customize, steep learning curve
  • Eurocentric
  • UX not user friendly
  • Few checks and balances to make sure it’s all working well together
  • Fear of open source meaning less reliable

Suggestions from Community

  • Community manager
  • Some funding or support to help people “find time” to contribute or other incentives
  • More marketing, outreach, and community-building events, communications, trainings
  • Better documentation standards
  • Training videos or materials
  • Better understanding of various user needs from those implementing to end users
  • Faster decision making
  • More support within the ecosystem for upgrades, etc.
  • Bigger community involvement via Slack channels, discussion boards, or other means
  • Modernize look
  • Advocate for platform – give people a sense of why to invest time
  • Better search functionality
  • Improve data quality
  • Data visualization features
  • Lower barriers to get people started
  • Improve onboarding
  • Advocate for open data

Overarching Themes

The purpose of thematizing a dataset is to understand patterns and thereby uncover (or create) meaning. While the POSE group had research questions, some of those changed and morphed throughout the project depending on the team’s experiences interviewing. As such, I used a reflexive thematic analysis approach, a flexible approach that lets themes arise from the data rather than starting with predefined themes in mind. This allowed me to change, remove, and add themes as I analyzed the codes. This kind of coding is iterative and reflective and recognizes the subjective nature of qualitative analysis.

Access

There are multiple levels of access, including how people talk about barriers to accessing, understanding, and implementing the product on a technical level. Related but distinct are the technical, social, or linguistic barriers to access for diverse user groups, especially non-Eurocentric ones. Such issues of access are closely tied to the social-justice-related advocacy and activism that motivate individuals to adopt the product or work to change it to be more inclusive.

Affinity

Affinity means closeness, proximity, or likelihood. There is structural affinity in the way the ecosystem is built to help parts work together. There is a personal affinity in wanting to be a member of the community; a likelihood of contributing or “giving back;” a proximity to other people in the network; or exhibiting values of citizenship/volunteerism. Then there is an affinity to the product itself: choosing it because it’s free, usable, works for organization/user needs, etc. Ultimately affinity encapsulates what does or does not appeal to members about the product or their involvement in the ecosystem. I found that there were personal, financial, educational, and social affinities for CKAN’s ecosystem.

Synergy

This refers to the level of cooperation or interaction of people and systems. It could refer to how people interact with one another hierarchically and laterally through communication or personnel channels. It encapsulates how the system evolves through community input and how responsive it is to change. It describes how effectively individuals and parts of the ecosystem are connected. The idea of responsiveness is connected to synergy, whether it be responsiveness within the ecosystem’s membership or how responsive the product/system is to multifaceted user needs.

Resources

Resources are the people and things that go into creating, maintaining, and improving the ecosystem. Some are quantifiable and some are not. Money, personnel, time, giving back, and expertise are some resources identified.

Use-Ability

This refers to both people and the product. What is a user’s ability to implement the product or troubleshoot when there are issues? Where is the point of entry? What are the literacies, limitations, and questions users face? Also, how useful and usable is the product? Where are the pain points, and what features would make it more usable?

Values

There was much discussion regarding values, whether organizational values or personal ones surrounding ideas of open source or open data. Ultimately, there are discrete personal values, organizational/institutional values, social values, and mission-based values that were discussed in the interviews. These different levels of values motivate or constrain individuals in participating in open source, open data, or the CKAN ecosystem itself. Those who hold strong values regarding open source, open data, or CKAN use those values to advocate or evangelize for the system/product.

Wider Framing and Implications

It remains worth exploring how other similar ecosystems are functioning and, in later phases of this project, perhaps talking to their members to see how those ecosystems evolved. Drupal was mentioned many times. This could answer questions about long-term sustainability or a maturing product. While there were many suggestions for improvement, participants pinpointed very real resource constraints, so looking at similar systems might help prioritize which interventions are viable in the CKAN ecosystem. Further considerations include the very real contradictions of any open-source enterprise in a corporatized society. That is, open-source products are meant to be free, but no labor is truly free. Further, while open data/open source is built upon the tenets of accessibility, these systems so often struggle with that very thing, CKAN being no exception. In many ways, the central work of this study speaks to these wider issues through the lens of CKAN as a case study.

Limitations

As with all research, there are limitations that constrain what the dataset can reasonably tell us. First, the interviews were not recorded or transcribed, so as to help participants be more at ease. While they may have spoken with more candor without being recorded, the notes were taken by an intermediary from the POSE group, who may have missed certain things or recorded them in a way that wasn’t fully representative of what was said. Further, having different note-takers limited the ability to standardize the notes. The notes were semi-structured and changed depending on user groups, which makes it hard to systematically compare all groups equally. However, regardless of these limitations, very clear themes arose from the interviews and, most importantly, many suggestions for improvements that are further corroborated by workshop data.

Conclusions

The interviews identified several important places to make strategic interventions. While the POSE team foresaw these as being important areas of development, it was helpful and validating to see these issues discussed at length by other members of the community.

Resources: Gather resources to help incentivize ecosystem members to give back. This could include fundraising that would allow for individuals to spend more time developing new features, etc.
Community Building: A community manager would help with advertising, facilitating collaboration, encouraging interactions, scheduling trainings and workshops, and generally overseeing the health of the ecosystem.
Process Optimization: Better documentation and stronger feedback loops (or communication channels like maintained discussion boards) will help improve responsiveness and keep the product up to date.
Training (Scaffolding): More robust onboarding help and training materials in written or visual forms (multiple languages) would be helpful to attract, retain, and empower users.
Inclusivity Practices: More attention needs to be paid to users in the Global South, including bolstering representation, increasing language diversity, and building accessibility measures.

Codebook

Numbers are the total number of times each code shows up in the complete dataset. Sub codes are not aggregated into parent code numbers.

  • CKAN (general mentions) 20
    • Challenges 70
    • Successes or Likes 22
    • Suggestions 58
  • Communication 30
    • Discourse or Language 3
    • Discussion Board 7
    • Documentation 13
    • Feedback Loops 20
      • Roadmap 3
    • Question 7
  • Competition 4
    • Socrata 9
  • Data 13
    • Data Culture 17
      • Closed Data Culture 3
      • Open Data Culture or Mandate 26
    • Linked Data 6
    • Metadata 18
    • Standards 10
  • Derilinx 4
  • Ecosystem Actions, Involvement, Interactions
    • Adoption 5
      • Deployment 9
      • Procurement 3
    • Advocacy 8
      • Activism 2
    • Co-Create 2
    • Collaborate 20
    • Community Building 19
      • Deepening Engagement 12
      • Outreach 10
      • Scaffolding 4
    • Convey/Consume 7
    • Partnership 10
  • Ecosystem Roles
    • Adopter 7
      • Former Adopter 2
      • Never Adopter 1
    • Champion 8
    • Contractor 5
    • Contributor, Contribute 32
    • Core Tech Team 23
    • Developer 14
    • Integrators 5
    • Stewards 16
    • Vendors 10
  • Governance Model 11
  • Harvesting 9
  • Learn 7
    • Literacies 2
  • Metrics 7
  • Models (Other Open Source) 7
    • Drupal 8
  • Onboarding 11
    • Training Resources 5
  • Open Source 25
  • Org Type 1
    • Government 34
    • Library 14
    • Non-Profit 2
    • Universities 3
    • Open Knowledge Foundation 6
    • Public 3
  • Organizational Values 7
    • Privacy 2
  • Product Development, Ongoing 1
    • Customization, Unique requirements 6
    • Features 4
      • Extension 14
    • Maintenance 13
      • Troubleshoot 3
    • New Features 3
    • Sustainability 2
  • Resources 12
    • Compensation 2
    • Funding 19
    • Personnel 13
    • Time 13
  • Social Justice 7
  • Useability 10
    • Comfort 3
    • User Needs 22
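
Because sub code counts are not aggregated into their parent codes, a reader who wants a rolled-up total for a whole branch has to add the counts up. The following is a minimal sketch of that arithmetic in Python, using the Resources and Communication branches listed above. The `Code` class and `rolled_up` function are hypothetical names introduced only for this illustration and are not part of the report's analysis. A rolled-up total counts code applications, so it may overstate the number of distinct passages if a single note received both a parent code and one of its sub codes.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """One codebook entry: its own count plus any nested sub codes."""
    name: str
    count: int = 0  # times this code itself was applied
    children: list["Code"] = field(default_factory=list)


def rolled_up(code: Code) -> int:
    """Return the code's own count plus the counts of all nested sub codes."""
    return code.count + sum(rolled_up(child) for child in code.children)


# Counts copied from the codebook above.
resources = Code("Resources", 12, [
    Code("Compensation", 2),
    Code("Funding", 19),
    Code("Personnel", 13),
    Code("Time", 13),
])

communication = Code("Communication", 30, [
    Code("Discourse or Language", 3),
    Code("Discussion Board", 7),
    Code("Documentation", 13),
    Code("Feedback Loops", 20, [Code("Roadmap", 3)]),
    Code("Question", 7),
])

for code in (resources, communication):
    print(f"{code.name}: {code.count} direct, {rolled_up(code)} including sub codes")
```

Keeping the direct count and the rolled-up count separate preserves the codebook's original numbers while still allowing branch-level comparisons.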