A Principled Approach to Pay-as-you-go Data Management
Within the Real-time Linked Dataspace (RLD), the pay-as-you-go approach to data management is complemented by a principled, tiered approach to the design of support services, in which an increase in the level of active data management brings a corresponding increase in the associated effort (Curry et al., 2019). This tiered approach provides flexibility by reducing the initial effort and barriers to joining the dataspace. The tiers for the RLD are a specialisation of the 5 star scheme defined by Tim Berners-Lee for publishing open data on the Web (Heath and Bizer, 2011).
Tim Berners-Lee’s Five Star Data
The W3C Linking Open Data (LOD) project started in 2007 and began publishing datasets under open licenses in accordance with the linked data principles. To encourage people to publish linked data, the inventor of the Web and initiator of the linked data paradigm, Tim Berners-Lee, proposed a five star rating system (Berners-Lee, 2006). The rating system helps data publishers evaluate how closely their datasets conform to the linked data principles. The first star is awarded for making data available on the Web; each additional star corresponds to increased reusability and interoperability of the published data as more of the linked data principles are followed.
5 Star Pay-as-you-go Model for Dataspace Services
In contrast to the classical one-time integration of datasets, which incurs a significant upfront overhead, the RLD adopts a principled pay-as-you-go paradigm that supports an incremental approach to data management. The first principle is that the publisher of the data is responsible for paying the cost of joining the dataspace. This pragmatic decision allows the RLD to grow and improve gradually, with participants joining or leaving the dataspace at any time. The second principle is that data is managed following a tiered approach, where an increase in the level of active data management brings a corresponding increase in associated costs.
The tiered approach to data management provides flexibility by reducing the initial cost and barriers to joining the dataspace. The tiers are described using a specialisation of the 5 star scheme defined by Tim Berners-Lee. The original star scheme has been extended to consider the level of integration of the data sources with the support services of a dataspace. At the minimum level, a data source only needs to be made available within the dataspace. Over time, the level of integration with the support services can be improved incrementally on an as-needed basis. The greater the investment made in integrating with the support services, the better the integration achievable in the dataspace. The 5 star pay-as-you-go model for the RLD has the following tiers, listed from one star to five stars:
- 1 star: The participant is published in the dataspace with limited or no integration with support services.
- 2 stars: The participant publishes data in a machine-readable format. This enables services to provide a minimal level of support with basic functionality (e.g. browsing), where basic interfaces are exposed.
- 3 stars: The use of a non-proprietary format enables support services to provide essential services at the data-item/entity level, with support for simple functionality (e.g. keyword search).
- 4 stars: The participant is integrated with most support service features (e.g. structured queries) and is aware of its relationships to other participants, with basic support for federation.
- 5 stars: The participant is fully integrated into the support services (e.g. question answering) and linked to relevant participants. It plays its full role in the global view of the dataspace.
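The tiers above can be sketched as a small data model. This is an illustrative sketch only, not the RLD implementation: the names `Tier`, `Participant`, `UNLOCKED_AT`, and the participant name are hypothetical, and the sketch assumes that functionality is cumulative, so a participant at a given tier also keeps the functionality of the lower tiers. The example functionality for each tier is taken from the tier descriptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Star rating of a participant, in ascending order of integration."""
    ONE_STAR = 1
    TWO_STARS = 2
    THREE_STARS = 3
    FOUR_STARS = 4
    FIVE_STARS = 5


# Example functionality first offered at each tier, as named in the tier list.
# One star offers none: the participant is only published in the dataspace.
UNLOCKED_AT = {
    Tier.TWO_STARS: "browsing",
    Tier.THREE_STARS: "keyword search",
    Tier.FOUR_STARS: "structured queries",
    Tier.FIVE_STARS: "question answering",
}


@dataclass
class Participant:
    """A data source in the dataspace (hypothetical model)."""
    name: str
    tier: Tier = Tier.ONE_STAR  # joining requires only the minimum level

    def available_functionality(self) -> list[str]:
        # Assumption: tiers are cumulative, so include every lower tier too.
        return [f for t, f in sorted(UNLOCKED_AT.items()) if t <= self.tier]

    def upgrade(self) -> None:
        # Pay-as-you-go: invest one step at a time, only when needed.
        if self.tier < Tier.FIVE_STARS:
            self.tier = Tier(self.tier + 1)


sensor = Participant("water-sensor")      # joins at one star
print(sensor.available_functionality())   # []
sensor.upgrade()
sensor.upgrade()                          # now at three stars
print(sensor.available_functionality())   # ['browsing', 'keyword search']
```

Modelling the tiers as an ordered enumeration keeps the incremental nature of the approach explicit: a participant starts at the lowest tier at minimal cost and moves up one step at a time as further integration effort is invested.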
Berners-Lee, T. (2006) ‘Linked data-design issues’, https://www.w3.org/DesignIssues/LinkedData.html.
Curry, E. et al. (2019) ‘A Real-time Linked Dataspace for the Internet of Things: Enabling “Pay-As-You-Go” Data Management in Smart Environments’, Future Generation Computer Systems, 90, pp. 405–422. doi: 10.1016/j.future.2018.07.019.
Heath, T. and Bizer, C. (2011) ‘Linked Data: Evolving the Web into a Global Data Space’, Synthesis Lectures on the Semantic Web: Theory and Technology, 1(1), pp. 1–136. doi: 10.2200/S00334ED1V01Y201102WBE001.