One of the side-events of the Open Government Data Camp last week was an Organisational Identifiers Workshop put together by Tim Davies and Chris Taggart. The meeting discussed the various challenges in linking information about organisations held in separate data sets. Although participants were careful to avoid the word “ontology”, one of the break-out groups did look at describing relations between organisations. Since I did my graduation research on “part-of” relations in an ontology, and what you can infer from them, I joined that discussion. Here’s what we came up with.
The workshop was a good chance to catch up with where things are right now, with several organisations at the table and participating online that have to deal with information about organisations:
- The IATI standard needs organisational identifiers to refer to individual donors and recipients of grant money and payments. IATI does not want to provide this standard itself, but to rely on an external one. They will need some way to represent organisations up to the level of government departments as part of an upcoming pilot project, to capture intended donor flows in a meaningful way.
- The Open Corporates website, and its companion the Open Charities website, capture information about organisations, but also lack a common identifier scheme, as well as ways to describe relations between organisational entities (especially the complicated relations between companies).
- Within the open government data movement, and the Open Knowledge Foundation, there is a need to represent organisational units such as departments, and be able to deal with renaming and reorganisation of such units over time.
- The Sunlight Foundation is dealing with, for instance, DUNS numbers, which are often too detailed for the purpose of identifying a larger organisation (every outlet of a supermarket chain will have its own number).
- GlobalGiving, OpenSpending and IATI are looking into decentralised registrars, but each registrar basically expresses a different type of relation between a legal or organisational entity and a purpose, such as tax registration or legal entity.
- Everyone faced the difficulty of dealing with entities that cannot register as such (e.g. informal associations), and so are not in any registrar’s database.
- To end this list, many people will talk about a known brand as if it were a company, and would expect to access information that way, but even brands have no single register.
How to create identifiers for organisations across the world, which might not be registered anywhere, and which relate to each other and to more generic concepts, in such a way that we can capture all the meaningful relations and data we want to capture?
How to make sure it works with the schemas already in use in big organisations? And that it works with data stores that are not open? Without introducing another naming authority?
- You should be able to determine an ID without requesting it from anyone.
- You should be able to resolve it to commonly known registrars.
- You should know where to find the list of those registrars.
- You should be able to represent granularity (aggregating detailed levels of information, and splitting up individual entities into smaller ones).
- Who decides what is a good registrar?
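To make these requirements a bit more concrete, here is a minimal Python sketch of one way they could fit together: an identifier built from a registrar code plus the number that registrar already uses. This is only an illustration, not something the workshop agreed on. The registrar codes, the URL templates and all function names are made up for the example.

```python
from dataclasses import dataclass

# Hypothetical list of known registrars (requirement: you should know where
# to find this list). Codes and URL templates are invented for illustration.
REGISTRARS = {
    "XX-COMP": "https://registrar.example/companies/{local_id}",
    "XX-CHAR": "https://registrar.example/charities/{local_id}",
}


@dataclass(frozen=True)
class OrgId:
    registrar: str  # code of the registrar that issued the local number
    local_id: str   # the number as that registrar knows it

    def __str__(self) -> str:
        return f"{self.registrar}:{self.local_id}"


def make_id(registrar: str, local_id: str) -> OrgId:
    """Build an identifier locally, without asking any central authority."""
    if registrar not in REGISTRARS:
        raise ValueError(f"unknown registrar: {registrar!r}")
    local_id = local_id.strip()
    if not local_id:
        raise ValueError("local_id must not be empty")
    return OrgId(registrar, local_id)


def parse_id(text: str) -> OrgId:
    """Split 'REGISTRAR:LOCAL' back into its two parts."""
    registrar, sep, local_id = text.partition(":")
    if not sep:
        raise ValueError(f"expected 'REGISTRAR:LOCAL', got {text!r}")
    return make_id(registrar, local_id)


def resolve(org_id: OrgId) -> str:
    """Return the URL where the registrar would describe this entity."""
    return REGISTRARS[org_id.registrar].format(local_id=org_id.local_id)
```

With this, `make_id("XX-COMP", "12345")` can be computed by anyone who knows the registrar and the local number, and `resolve()` points back to that registrar. The sketch deliberately leaves open the last question in the list (who decides what goes into `REGISTRARS`), and it does nothing for entities that are not registered anywhere.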
We split up into a few groups, one looking at identifying public bodies, another at the technical architecture that might be needed, and a third at common terms to describe relations between organisational entities. I joined that third group.
We spent some time discussing various types of relations, and I also looked around to find possible candidate schemes, but without much luck. I couldn’t find an obvious example comparable to the FOAF standard for personal relations. A few standards, like OrgPedia, or the Organizational Ontology, seem likely candidates, but don’t cover this area (yet?).
We looked at some use cases:
- A company wants to show their supply chain, to demonstrate that their suppliers are ok, or perhaps to “crowd-source” the question whether they are: “these are our suppliers, if you think they’re not ok, let us know”.
- A campaigning organisation wants to express what they know about organisational ties, to support their arguments on why the ties should be broken.
- A reporting entity wants to express their donation relations, for instance to a government department, and be able to deal with changes due to reorganisation.
- A watch-dog organisation wants to express that a certain company has changed names or merged or split operations, but still pursues the same activities.
- A consumer wants to find out what a certain company has done, but basically only knows that company through a name or brand, without knowing the exact structure behind it.
We acknowledged additional cases, like finding influential relations between corporate or organisational entities based on board membership or roles of individuals, but decided not to take that on in this discussion.
We came up with a first-version typology of relations. The naming and exact semantics will need further review.
These are relations between entities that have a “permanent” and “structural” character. Of course, all these relations are bound in time, but the beginning and end points may not be known.
We distinguished two sub categories.
- Organisational relations express membership, ownership, or hierarchy.
- “is member of” (an association, group, cabal); “is affiliated to”; “is organisational unit of” (department, location); “is shareholder of”; “is owner of”
- Contractual relations express transactions between entities. For instance, a relation “donates to” would express a sizable or structural donation from one entity to another. In the IATI standard, this would mean there should be at least one, and probably more, “activities” records or “transactions” records.
- “has contract with”, with e.g. the subcategories “owes money to” (long-term debts, mortgages), “is supplier to”, and “licenses to”; “in legal conflict with”; “donates to”
(This typology still fails to capture something like a brand as abstract entity.)
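As a rough sketch of how this first typology might be written down as data, the Python below records each relation type with its sub category, lets a relation carry optional begin and end dates (since these may not be known), and models the subcategories of “has contract with”. The class, field and function names are my own assumptions for illustration, and entities are referred to by plain placeholder strings rather than real identifiers.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Category(Enum):
    ORGANISATIONAL = "organisational"
    CONTRACTUAL = "contractual"


# Relation types from the first-version typology, mapped to their sub category.
RELATION_TYPES = {
    "is member of": Category.ORGANISATIONAL,
    "is affiliated to": Category.ORGANISATIONAL,
    "is organisational unit of": Category.ORGANISATIONAL,
    "is shareholder of": Category.ORGANISATIONAL,
    "is owner of": Category.ORGANISATIONAL,
    "has contract with": Category.CONTRACTUAL,
    "owes money to": Category.CONTRACTUAL,
    "is supplier to": Category.CONTRACTUAL,
    "licenses to": Category.CONTRACTUAL,
    "in legal conflict with": Category.CONTRACTUAL,
    "donates to": Category.CONTRACTUAL,
}

# More specific relation -> the more general relation it implies.
SUBTYPE_OF = {
    "owes money to": "has contract with",
    "is supplier to": "has contract with",
    "licenses to": "has contract with",
}


def is_a(predicate: str, general: str) -> bool:
    """True if `predicate` equals `general` or is one of its subcategories."""
    return predicate == general or SUBTYPE_OF.get(predicate) == general


@dataclass(frozen=True)
class Relation:
    subject: str                  # placeholder identifier of the first entity
    predicate: str                # one of the keys of RELATION_TYPES
    obj: str                      # placeholder identifier of the second entity
    start: Optional[date] = None  # None means: beginning not known
    end: Optional[date] = None    # None means: end not known, or still ongoing

    def __post_init__(self) -> None:
        if self.predicate not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.predicate!r}")

    @property
    def category(self) -> Category:
        return RELATION_TYPES[self.predicate]

    def holds_on(self, day: date) -> bool:
        """Unknown boundaries are treated as 'no restriction on that side'."""
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end
```

For example, `Relation("A", "is supplier to", "B", start=date(2010, 1, 1))` is a contractual relation that counts as a “has contract with” relation via `is_a`, and it holds on any day from 2010 onwards. Treating an unknown end date as open-ended is a design choice of this sketch, not something we settled in the group.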
These are relations that express a change in the structure or responsibilities of some entities, often the beginning or the end of particular entities. We identified four basic types:
- Split into: A splits into B, C, … A ceases to exist, B, C, … come into existence.
- Spin-off: A creates B as a separate entity.
- Merger: A, B, … merge into C. A, B, … cease to exist, C comes into existence.
- Acquisition: A acquires B and moves B’s assets into A. B ceases to exist.
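The four types above differ mainly in which entities cease to exist and which come into existence. A small Python sketch can make that explicit by applying each change to a set of currently existing entities. The `Change` record and the helper names are assumptions of mine for illustration, and the entity names are placeholders.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Change:
    kind: str
    requires: Tuple[str, ...]  # entities that must exist before the change
    ceases: Tuple[str, ...]    # entities that cease to exist
    creates: Tuple[str, ...]   # entities that come into existence


def split(a: str, *parts: str) -> Change:
    # A splits into B, C, ...: A ceases, the parts come into existence.
    return Change("split", (a,), (a,), parts)


def spin_off(a: str, b: str) -> Change:
    # A creates B as a separate entity: A continues to exist.
    return Change("spin-off", (a,), (), (b,))


def merger(result: str, *sources: str) -> Change:
    # A, B, ... merge into C: the sources cease, C comes into existence.
    return Change("merger", sources, sources, (result,))


def acquisition(acquirer: str, acquired: str) -> Change:
    # A acquires B: B ceases to exist, A continues.
    return Change("acquisition", (acquirer, acquired), (acquired,), ())


def apply_change(existing: FrozenSet[str], change: Change) -> FrozenSet[str]:
    """Return the set of existing entities after the change."""
    missing = set(change.requires) - existing
    if missing:
        raise ValueError(f"{change.kind}: not existing: {sorted(missing)}")
    clash = set(change.creates) & existing
    if clash:
        raise ValueError(f"{change.kind}: already existing: {sorted(clash)}")
    return (existing - set(change.ceases)) | set(change.creates)
```

Starting from a single entity A, applying `split("A", "B", "C")`, then `spin_off("B", "D")`, then `merger("E", "C", "D")` and finally `acquisition("B", "E")` leaves only B. Keeping the list of applied changes would give a watch-dog organisation the history it needs to trace a current entity back to its predecessors, although this sketch does not implement that lookup.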
More work is needed to mold this into a useful standard (relations are currently described from the perspective of one end, there is still plenty of room for interpretation, things have not been tested on real-life examples described as use cases, and so on).
And, of course, we’d need those organisational identifiers to refer to other entities, and find ways to delegate resolving identifiers to services that can provide additional information on those identities. See the whole report of the workshop on the OGDCamp wiki for the results in the other discussions as well.
But thinking about and discussing relations between those entities brought back memories of all the fun in making machines infer and report unknown relations 🙂