We got the opportunity to build on the bioregional lifeboard at the CTC22 climate hackathon. At the CTC21 hackathon we took one bioregion, the watershed of the River Dee in Aberdeenshire, and combined mapping and data science. Building on that learning, we decided to focus on the tools that would allow anyone to build their own bioregion.
Where to start? Enter a place name or, for this weekend, a river name. The text entered queries Wikipedia and brings back a list of rivers. On selecting a river, either a lifeboard and map are displayed for that river, or build buttons appear to start constructing a bioregion. Day one focused on all the data that needs linking together and where that data should be saved: locally or in a public resource. The resource chosen was Wikidata. For some rivers, given they have a Wikipedia page, it is highly likely a Wikidata page exists too, e.g. the River Don. The other source of river data is OpenStreetMap, the River Don again as an example. And finally there is data on the watershed itself: rainfall, river flow, soil, biodiversity metrics and so on. The team split into three to focus on user experience, querying OpenStreetMap, and data science.
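The first step of that flow, turning typed text into a list of candidate river pages, can be sketched against the MediaWiki search API. This is our own minimal illustration (function name and parameters are ours), not the project's code:

```python
from urllib.parse import urlencode

# Illustrative sketch: build a MediaWiki "opensearch" request that returns
# page titles matching the text a user typed (e.g. a river name).
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"

def river_search_url(text: str, limit: int = 10) -> str:
    """Return the API URL that lists Wikipedia pages matching `text`."""
    params = urlencode({
        "action": "opensearch",
        "search": text,
        "limit": limit,
        "format": "json",
    })
    return f"{WIKIPEDIA_API}?{params}"

# river_search_url("River Don") can be fetched with any HTTP client; the
# JSON response includes a list of matching page titles to show the user.
```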
The goal for the rest of the weekend was to connect, or provide joining indexes between, the open data sources and APIs.
Wikipedia to Wikidata → The solution found is to create an index, or a just-in-time (JIT) SPARQL query to Wikidata, based on the Wikipedia page name of the river. From Wikidata, a more advanced query can extract information on the place/river to give starting coordinates that can be handed to an OpenStreetMap API search.
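A minimal sketch of that JIT query idea (the function name is ours): given an English Wikipedia page title, ask the Wikidata query service for the matching item and its coordinate location (property P625), which can then seed the OpenStreetMap search:

```python
# Wikidata's public SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

def coord_query(page_title: str) -> str:
    """Build a SPARQL query resolving an enwiki page title to its
    Wikidata item and, where present, its coordinates (wdt:P625)."""
    return f"""
    SELECT ?item ?coord WHERE {{
      ?article schema:about ?item ;
               schema:isPartOf <https://en.wikipedia.org/> ;
               schema:name "{page_title}"@en .
      OPTIONAL {{ ?item wdt:P625 ?coord . }}
    }}
    """

# The returned text can be POSTed to WIKIDATA_ENDPOINT with any HTTP
# client, e.g. coord_query("River Don, Aberdeenshire").
```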
OpenStreetMap search query: A software library was built to traverse the river coordinates. The key challenge is finding the root relation in the XML structure that connects all the points on the river and its tributaries. The River Leven was plotted using the library by the end of the weekend. Alternatively, a more DIY approach was taken with a mapping trace tool, used to create a new GeoJSON dataset that could be saved to Wikidata.
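The traversal problem can be sketched with the standard library: an OSM waterway relation lists member ways, and each way lists ordered node refs whose coordinates trace the river. The sample XML below is illustrative, not real River Leven data:

```python
import xml.etree.ElementTree as ET

# Toy OSM extract: one waterway relation pointing at two member ways.
SAMPLE = """
<osm>
  <relation id="1">
    <member type="way" ref="10" role="main_stream"/>
    <member type="way" ref="11" role="side_stream"/>
    <tag k="waterway" v="river"/>
  </relation>
  <way id="10"><nd ref="100"/><nd ref="101"/></way>
  <way id="11"><nd ref="101"/><nd ref="102"/></way>
</osm>
"""

def member_ways(root, rel_id):
    """Collect the way ids referenced by a waterway relation."""
    rel = root.find(f"relation[@id='{rel_id}']")
    return [m.get("ref") for m in rel.findall("member") if m.get("type") == "way"]

def way_nodes(root, way_id):
    """Return the ordered node refs of one way."""
    way = root.find(f"way[@id='{way_id}']")
    return [nd.get("ref") for nd in way.findall("nd")]

root = ET.fromstring(SAMPLE)
ways = member_ways(root, "1")                  # the river's member ways
points = [way_nodes(root, w) for w in ways]    # node refs per way
```

Walking relation → ways → nodes like this yields the ordered point refs; a real traversal would then look up each node's lat/lon from the same extract.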
Data Science: Having set the context of the river coordinates, the next challenge was to capture the river's watershed coordinates in GeoJSON. The River Archive provides that information. A manual query of their site allowed us to identify the measurement stations and the shapefiles. The shapefiles were manually converted using openMapper. A watershed is made up of one or more sub-areas, so work is still required to index all sub-watersheds that belong to the same river and to extract the outer region coordinates. The water flow and rainfall data were then imported into Python notebooks to produce charts. The charts are stored on GitHub and indexed by Wikidata reference.
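The missing indexing step could start as simply as grouping sub-catchment GeoJSON features by the river they belong to. The feature data and the "river"/"id" property names below are our assumptions for illustration, not the River Archive's schema:

```python
import json
from collections import defaultdict

# Toy FeatureCollection of sub-catchments (coordinates are placeholders).
SUBCATCHMENTS = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"id": "dee-01", "river": "Dee"},
   "geometry": {"type": "Polygon", "coordinates": [[[0,0],[1,0],[1,1],[0,0]]]}},
  {"type": "Feature", "properties": {"id": "dee-02", "river": "Dee"},
   "geometry": {"type": "Polygon", "coordinates": [[[1,0],[2,0],[2,1],[1,0]]]}},
  {"type": "Feature", "properties": {"id": "don-01", "river": "Don"},
   "geometry": {"type": "Polygon", "coordinates": [[[3,0],[4,0],[4,1],[3,0]]]}}
]}
""")

def index_by_river(collection):
    """Map river name -> list of sub-catchment ids."""
    index = defaultdict(list)
    for feature in collection["features"]:
        props = feature["properties"]
        index[props["river"]].append(props["id"])
    return dict(index)
```

Dissolving each group's polygons into a single outer boundary would be the follow-on step, e.g. with a geometry library such as shapely.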
The user experience did not quite get to the stage where we could display these charts in the context of the watershed area and the river.
What is next? The key challenge is connecting the data and, where possible, automating software to produce the SPARQL queries on demand, or starting a software library to query for data or to convert file formats ready for the web. These are still significant challenges, especially if the query is open to any bioregion anywhere in the world. That remains the goal, but the next stage will be to connect the parts above for four watershed areas in Scotland. From there, automation routines for the rest of Scotland and the UK can be addressed. Then international datasets can be identified, or other inspired communities can use their local hackathons to liberate and connect those datasets.
Thank you to the team and CodeTheCity for working on the project and facilitating the weekend.