I am in the (very) early stages of a personal project that will involve building out the SOLI ontology, and I want to make sure I’m contributing in the correct way. I can update an OWL file locally for purposes of the project, but I’d like, if possible, for my work to be available to others.
At a very high level, my plan is to convert raw corporate registry data from the Illinois* Secretary of State (stored in fixed-width text files) into a graph database that can be queried to find relationships between individual entities. The idea is that the end user could search for a specific LLC and quickly see other entities that might be related to it by, for example, the same business address or shared members.
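To make the idea concrete, here is a minimal Python sketch of the first step: slicing fixed-width records into fields and grouping entities that share a business address. The column layout, field names, and sample records are all made up for illustration; the real Illinois file layout will differ, and a real pipeline would load the results into a graph database rather than a dictionary.

```python
from collections import defaultdict
from typing import NamedTuple

# Hypothetical column layout: (field name, start offset, end offset).
# These offsets are assumptions, not the actual Illinois file format.
LAYOUT = [
    ("file_number", 0, 8),
    ("entity_name", 8, 38),
    ("address", 38, 68),
]


class Entity(NamedTuple):
    file_number: str
    entity_name: str
    address: str


def parse_line(line: str) -> Entity:
    """Slice one fixed-width record into named, whitespace-trimmed fields."""
    return Entity(**{name: line[start:end].strip() for name, start, end in LAYOUT})


def shared_address_groups(entities):
    """Map each normalized address to the entity names registered there,
    keeping only addresses used by more than one entity."""
    by_address = defaultdict(list)
    for entity in entities:
        # Crude normalization: uppercase and collapse runs of whitespace.
        key = " ".join(entity.address.upper().split())
        if key:
            by_address[key].append(entity.entity_name)
    return {addr: names for addr, names in by_address.items() if len(names) > 1}


def make_line(file_number: str, name: str, address: str) -> str:
    """Build a sample record padded to the hypothetical layout above."""
    return f"{file_number:<8}{name:<30}{address:<30}"


# Invented sample records, not real registry data.
SAMPLE = [
    make_line("00000001", "ACME HOLDINGS LLC", "100 MAIN ST SPRINGFIELD"),
    make_line("00000002", "BETA VENTURES LLC", "100 Main St  Springfield"),
    make_line("00000003", "GAMMA SERVICES LLC", "200 OAK AVE CHICAGO"),
]

entities = [parse_line(line) for line in SAMPLE]
groups = shared_address_groups(entities)
print(groups)
```

In a graph database, each group would become a shared address node with an edge to every entity registered there, which is what makes the "related entities" lookup fast.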
SOLI is the perfect ontology for this project, but even from my limited browsing I know that I’m going to spend some time building it out further as it relates to corporate entities. Just as one example, right now SOLI does not have a class for “registered agent,” which is something that I’ll want to map for each individual entity. Once I do all the work of building out this part of SOLI, I want to make sure it can be used beyond my little side project. Thus, before starting, I want to make sure that as I create new classes, etc., I’m adhering to the correct standards.
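For reference, this is roughly the kind of addition I have in mind, sketched as a small Python script that writes out a Turtle class declaration. The namespace is a placeholder (`https://example.org/soli-ext#`), not a real SOLI IRI, and the label, comment, and choice of annotation properties are my own guesses; which IRI scheme, parent class, and annotations SOLI actually expects is exactly what I’m asking about.

```python
# Placeholder namespace for illustration only; NOT a real SOLI IRI.
EXT = "https://example.org/soli-ext#"

PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ext": EXT,
}


def class_to_turtle(local_name: str, label: str, comment: str) -> str:
    """Render a bare owl:Class declaration as Turtle text.

    Assumes label and comment contain no double quotes or backslashes
    (no escaping is done in this sketch).
    """
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"ext:{local_name} a owl:Class ;")
    lines.append(f'    rdfs:label "{label}"@en ;')
    lines.append(f'    rdfs:comment "{comment}"@en .')
    return "\n".join(lines)


turtle = class_to_turtle(
    "RegisteredAgent",
    "Registered Agent",
    "A person or entity designated to receive service of process for a business entity.",
)
print(turtle)
```

I have deliberately left out a parent class, since I don’t yet know where a registered agent should sit in the existing hierarchy.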
I noticed on GitHub that there are plans to write a markdown file with a guide for contributors, but that isn’t complete yet. This is not meant to rush anyone; I just want to make sure I’m doing things right from the outset so that it saves everyone work later on.
*I’m starting with Illinois because it’s where I live and practice, and because they make the bulk registry available for free (albeit in an outdated format). I’ve looked, and I think there are ~15 states that provide free bulk data, and another ~15 that make you pay in some fashion. I hope to incorporate bulk data from more states much later on, but am keeping it as simple as possible for now.