Geolocation of Data Sources

All data source definitions (primary, compose record store, federation) will provide geographical points (latitude and longitude) with textual descriptions of the location and infrastructure ownership.

REST Endpoints

Method Path Description
POST /system/data-privacy/requests Submit new data privacy request.
GET /system/data-privacy/requests List data privacy requests.
GET /system/data-privacy/requests/{ID} Get details about specific request
GET /system/data-privacy/requests/{ID}/responses List responses for a specific request
POST /system/data-privacy/requests/{ID}/responses New response
GET /system/stores List of all known data stores (primary + CRS + federated) with their geolocation info and other data-privacy related details
GET /system/stores/drivers List of all available store drivers, their capabilities and configuration details
GET /compose/record-stores List of all available CRS
POST /compose/record-stores Add new CRS
GET /compose/record-stores/{ID} Get details on a single CRS
PUT /compose/record-stores/{ID} Update existing CRS
DELETE /compose/record-stores/{ID} Remove existing CRS
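
As a rough illustration of how a client might address the endpoints listed above, the following sketch builds (but does not send) two requests. The base URL and the payload field `kind` are assumptions made for the example; the table only defines methods and paths.

```python
import json
from urllib.request import Request

BASE_URL = "https://corteza.example.org/api"  # hypothetical base URL


def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request for one of the listed endpoints."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# Hypothetical payload: the field name "kind" is illustrative, not part of the spec.
submit = build_request("POST", "/system/data-privacy/requests", {"kind": "export"})
list_stores = build_request("GET", "/system/stores")
```

Passing either object to `urllib.request.urlopen` would perform the call against a running instance.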

Capabilities

Level Name Additional description MoSCoW Effort
Driver SensitivityLevel Maximum data sensitivity level supported Must Low
Driver Partitioning Can records be partitioned?

Does data source (and driver) support partitioning?

Can records be saved in separated partitions (tables, collections, files)?

Must Mid
Driver PartitionFormatValidator Validator for generated partition name

When a new module is connected to a CRS whose driver has partitioning enabled, the partition name generated for that module (using PartitionFormat and actual values from the module) is verified.

Must Mid
Driver ValueEncodingStrategies All supported strategies for value encoding

key-value-appendix: Legacy, additional table with “_values” suffix that stores record id, field name, string, order. Obsolete

json: Fields stored in a single field as object;
Multi-value fields are stored as arrays
Native JSON types (bool, number, string) are used

Must High
CRS Encrypted Informative, Is data encrypted?
CRS GeoLocation Where is data stored Low
Module PartitionFormat How are records partitioned? Allow different partitioning formats using simple syntax with placeholders:

“fixed_name”: records are stored in the fixed partition

“{module}”: records are stored in per-module partitions (for example: separate table)

“corteza_{module}”: records are stored in prefixed, per-module partitions

Placeholders
{module}: Module’s partition handle; module’s handle used by default

Must High
Field SensitivityLevel Field data sensitivity Must Low
Module ValueEncodingStrategy Selected strategy for value encoding

See “ValueEncodingStrategies” in this table for details

Must High
Module Action logging Selected strategy for value encoding Must High
Module Action logging Action logging policy Should Low
Module RBAC Allow record-level access control Could Low
Module Automation Event triggering policy

Allow control over events triggered by particular module

Event types:

before-create
after-create
before-update
after-update
before-delete
after-delete

Could Low
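
The PartitionFormat and PartitionFormatValidator rows above can be sketched together as a single helper that expands the {module} placeholder and then verifies the generated name. The allowed character set is an assumption; the table does not say which names a driver accepts.

```python
import re

# Assumption: partition names are limited to letters, digits and underscores.
PARTITION_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def partition_name(partition_format, module_handle):
    """Expand the {module} placeholder and validate the resulting name."""
    name = partition_format.replace("{module}", module_handle)
    if not PARTITION_NAME.fullmatch(name):
        raise ValueError(f"invalid partition name: {name!r}")
    return name
```

For example, `partition_name("corteza_{module}", "contacts")` yields `corteza_contacts`, while a handle containing a space is rejected with a `ValueError`.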

Resources

Types and Resources

List of Corteza types and resources that need to be implemented or extended.

Name Action Kind Component
Corteza Privacy (5)
Store driver

Naming

  • StoreDriver
  • store-driver
  • store_driver
Implement Type Corteza
Primary store

Naming

  • PrimaryStore
  • primary-store
  • primary_store
Implement Instance Corteza
Primary compose record store Implement Instance Compose
Compose record store

RBAC Operations

compose component

create.record-store

CRS resource

read
update
delete

If CRS can be read it can be assigned to a module (if the role is allowed to create/update a module)

Events: No events are triggered.

Implement Resource Compose
Compose namespace Extend Resource Compose
Compose module Extend Resource Compose
Compose module field Extend Resource Compose
Federation node Extend Resource Federation
Privacy request

Naming

  • PrivacyRequest
  • privacy_request
  • privacy-request

RBAC Operations: none

Events

  • before create
  • after create
Implement Resource System
Privacy response

Naming

  • PrivacyRequest
  • privacy_request
  • privacy-request

RBAC Operations

system component

data-privacy-requests.manage

Events

  • before create
  • after create
Implement Resource System

Additional details on type/resource properties

Access control: what kind of access control should be implemented
Automation: new
General information: none of the new resources support resource translations

Encoding and Decoding

Operations and data that flows in and out of the store need to be properly encoded and decoded.

Standard Resource Store Operations

Lookup

  • Finds one resource and returns an error if it is not found.
  • CortezaID is the most commonly used filter as the primary key. In addition to that, a handle (textual identifier with enforced character range constraints) and other single or multiple fields can be used.
  • Preprocessors can be used to apply certain rules. Casting to lower-case and avoiding case sensitivity is the most usual one.
  • Ensures resource is decoded in a way that no information is lost.

Search

  • Returns all matching resources but does not yield an error if none are found.
  • Searching can support structured filtering, FTS, sorting and cursored pagination.
  • Ensures resource is decoded in a way that no information is lost.
  • Must support per-resource access-control callbacks to ensure proper pagination

Create

  • Must support per-resource access-control callbacks to ensure proper pagination
  • Store can support storing of new resources.
  • Stores a new resource using provided resource ID.

Update

  • Store can support updating existing resources
  • Updates an existing resource using provided resource ID.
  • Ensures resource is encoded in a way that no information is lost.

Delete

  • Removes a resource. It accepts the same conditions as the lookup operation.

Truncate

  • Removes all resources
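
The operations above can be sketched with a small in-memory store. This is only an illustration of the contract (lookup fails when nothing matches, search returns an empty list, access control is applied per resource before paging); the class and method names are hypothetical and resources are plain dictionaries.

```python
class NotFound(Exception):
    """Raised by lookup when no resource matches."""


class MemoryStore:
    def __init__(self):
        self._items = {}  # resource ID -> resource (a plain dict)

    def create(self, resource):
        if resource["id"] in self._items:
            raise ValueError("resource already exists")
        self._items[resource["id"]] = dict(resource)

    def update(self, resource):
        if resource["id"] not in self._items:
            raise NotFound(resource["id"])
        self._items[resource["id"]] = dict(resource)

    def lookup(self, **conditions):
        for item in self._items.values():
            if all(item.get(k) == v for k, v in conditions.items()):
                return dict(item)
        raise NotFound(conditions)

    def lookup_by_handle(self, handle):
        # Preprocessor: compare handles case-insensitively.
        wanted = handle.lower()
        for item in self._items.values():
            if item.get("handle", "").lower() == wanted:
                return dict(item)
        raise NotFound(handle)

    def search(self, predicate=lambda r: True, can_read=lambda r: True, limit=None):
        # Access control is checked per resource before the limit is applied,
        # so a page is never shortened by resources the caller may not see.
        matches = [dict(r) for r in self._items.values() if predicate(r) and can_read(r)]
        return matches if limit is None else matches[:limit]

    def delete(self, **conditions):
        found = self.lookup(**conditions)
        del self._items[found["id"]]

    def truncate(self):
        self._items.clear()
```

Filtering by access before applying the limit is the reason the search operation needs per-resource callbacks: filtering after paging would return short or empty pages.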

Compose Records Store Operations

In addition to standard resource store operations, CRS handles mapping and type encoding values from module field types to store’s internal types and vice-versa for decoding.

Each field kind must be properly encoded so that it is compatible with the store and can make full use of the store’s native capabilities.
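
A minimal sketch of the “json” value encoding strategy listed in the capabilities table, assuming record values arrive as a mapping from field name to a list of values. The input shape and the function names are assumptions made for illustration.

```python
import json


def encode_values(values, multi_value_fields=frozenset()):
    """Encode record values as one JSON object (the "json" strategy).

    `values` maps field name -> list of values. Empty single-value fields
    are omitted from the encoded object.
    """
    doc = {}
    for field, vals in values.items():
        if field in multi_value_fields:
            doc[field] = list(vals)  # multi-value fields become arrays
        elif vals:
            doc[field] = vals[0]     # single value keeps its native JSON type
    return json.dumps(doc, sort_keys=True)


def decode_values(payload, multi_value_fields=frozenset()):
    """Inverse of encode_values: restore the field -> list-of-values shape."""
    doc = json.loads(payload)
    return {
        field: list(val) if field in multi_value_fields else [val]
        for field, val in doc.items()
    }
```

Booleans, numbers and strings survive the round trip with their native JSON types, which is what lets the store index or filter on them directly.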

Privacy request policy

Defines how new requests are treated.

Options:

  • auto-approve lookup
  • auto-approve change
  • auto-approve removal

Policies are stored in a setting.

Extra level of access control is implemented to protect certain setting prefixes according to allowed RBAC operations on component.
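
One possible reading of the policy options above, sketched as a lookup from request kind to the status a new request receives. The setting keys, request kinds and status names are hypothetical; the document only names the three auto-approve options.

```python
# Hypothetical setting keys; the document only names the three options.
DEFAULT_POLICY = {
    "auto-approve-lookup": False,
    "auto-approve-change": False,
    "auto-approve-removal": False,
}

# Hypothetical request kinds mapped to the policy option that governs them.
OPTION_FOR_KIND = {
    "lookup": "auto-approve-lookup",
    "change": "auto-approve-change",
    "removal": "auto-approve-removal",
}


def initial_status(kind, policy=DEFAULT_POLICY):
    """Return the status a newly submitted request of the given kind receives."""
    option = OPTION_FOR_KIND[kind]  # raises KeyError for unknown kinds
    return "approved" if policy.get(option, False) else "pending"
```

With only `auto-approve-lookup` enabled, a lookup request would be approved immediately while change and removal requests would wait for a data privacy officer.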

Features

Abbr Feature User stories Status
SD Store drivers

Related to User stories:

  • As a developer I can use store driver API endpoint because I want to get list of available store drivers
  • As an admin I can see list of all available store drivers because I want to decide which ones I’ll use

Summary: How can we connect to data storage, what are its restrictions and capabilities

Store driver serves as an interface between Corteza services and the underlying storage. It can provide full support for all Corteza store needs and act as the primary store, or provide partial, specialised features, e.g. a caching layer or compose record read and write features.

The architecture is designed to support SQL, NoSQL, plain-text files on disk, remote web services and REST APIs. Some of these storage types might not support all features and therefore cannot be a primary store.

Each driver needs to inform the CRS that uses it what kind of behaviour to expect, and which capabilities can be used and how. This information is then passed on to the compose module that is using the CRS.

All relevant capabilities from driver, CRS and module are merged and accessible via API to enable clients (frontend web applications) to provide the best possible user experience.

In addition to capabilities, driver also provides required and available configuration options. These options are then used to configure a CRS.

Available drivers are statically defined in the code only. A handpicked few are included in the codebase; additional drivers can be brought in via plugins.

Examples of driver capabilities and features:

– Can it be used as a primary store driver or is it intended only for CRS?
– Can schema be modified (create, alter tables, indexes)
– Data type mapping rules for module field types (to/from store type)
– Encoding/decoding functions for module field types

2 Defined
CRF Corteza record federation

Related to User stories:

  • As an admin I can see configured federation nodes as a store because I want to see all stores on one list
  • As an admin I can define location information for every federation node
  • As an admin I can see a warning on the UI when I want to share a module with private data because I might have made a mistake

Summary: Additional information on Corteza record federation nodes

Each node also needs to be represented as a Store.

Each node has information and capabilities.

3
CRS Compose record store

Related to User stories:

  • As an admin I can edit module and select a different compose record store because I want to move records to another CRS.
  • As an admin I can see a list of available CRSs in the admin application because I want to see where in the world records are stored
  • As an admin I can configure a new CRS because I want to enable users to store records in it
  • As an admin I can manage existing CRS because I want to adjust attributes and capabilities
  • As an admin I can set location information for every CRS because I want to inform users where the data is stored
  • As a developer I can use CRS API endpoint because I want to get list of configured CRS resources
  • As a developer I can use CRS API endpoint because I want to add a new CRS resource
  • As a developer I can use CRS API endpoint because I want to remove an existing CRS resource
  • As a developer I can use CRS API endpoint because I want to change a CRS resource

Summary: How and where compose records are stored, and what the restrictions and capabilities are

Compose record store resource makes alternative and external data sources available for record interaction. It allows records to be stored and accessed from different databases.

Each CRS is configured with exactly one store driver and must provide all configuration options required by driver.

Store driver comes with a set of attributes and capabilities that the CRS must comply with and utilise when accessing and storing records.

Each CRS can have its data sensitivity level raised. The level can be raised at any time, but it can be lowered only to the highest level used by any of the valid compose module fields.

9 Defined
CMF Compose module field

Related to User stories:

  • As an administrator I can use a special UI that helps me configure data privacy on fields because I do not want to manually set RBAC rules on modules, fields and records.
  • As an admin I can modify fields because I want to change data sensitivity on a field and describe how private data is used

Summary: Additional capabilities on compose module field to indicate use of private data, location and other information and capabilities provided by CRS (via Module)

Module fields can have data sensitivity level raised from low (default) to moderate, high or restricted. Module field sensitivity cannot be set higher than the one set on the CRS.

This allows data privacy officers to filter and locate relevant records and values in case of data-privacy request.

2 Defined
CM Compose module

Related to User stories:

  • As an admin I can use Compose Admin because I want to manage CRS settings on a module
  • As an admin I can use Compose Admin because I want to describe how private data in the module is used

Summary: Additional capabilities on compose module to indicate use of private data, location and other information and capabilities provided by CRS

Compose modules can have alternative (to primary) CRS configured.

2
CRV Compose record values

Related to User stories:

  • Empty

Summary: How Corteza encodes record values for storage

Notes: encoders / decoders.

0
DPC Data privacy console

Related to User stories:

  • As a user I can log in to Data Privacy Console because I want to see my old submissions and responses
  • As a user I can log in to Data Privacy Console because I want to download my personal data
  • As a user I can log in to Data Privacy Console because I want to submit a new privacy request

Summary: How (anonymous) users can submit requests for review, deletion or correction of private data

3
DPO Data privacy officer console

Related to User stories:

  • As a DPO I can create a privacy-request policy because I want to control what happens to privacy requests
  • As a DPO I can access list of data privacy requests because I want to find ones that need response
  • As a DPO I can review submitted request and prepare response
  • As a DPO I can use a tool to help me find and link sensitive data with the request
  • As a DPO I can search for records with data-sensitivity fields because I want to collect requested data
  • As a DPO I can update records with sensitive data because I want to comply with the data update or removal request
  • As a DPO, I can reject a submitted request because it’s unsubstantiated or not aligned with the defined privacy policies.

Summary: How assigned data officers can review and act on data-privacy requests

7
RF Record formats

Related to User stories:

  • As a compose admin I can import schema from Schema.org because I want to create my models according to standard structures
  • As a developer I can use Compose API to retrieve records formatted according to JSON-LD specifications.
  • As a compose admin I can adjust default names and types used for generating JSON-LD payload.

Provide basic support for JSON-LD.

  1. Compose admin will be able to import pre-existing schema from Schema.org to create modules on the fly.
  2. Compose admin will be able to modify module fields and change default field names and types used for generating JSON-LD payload
  3. If request is sent with appropriate “accept” HTTP header (application/ld+json) response handler will encode record and its values into a standard JSON LD payload.
3
PS Primary store

Related to User stories:

  • Empty

Summary: Where the primary data and resources are stored

Primary store is a virtual concept that serves as a link between Corteza services and the configured store driver.

It holds basic store attributes that are needed to satisfy data privacy requirements:

  • Geographical location of the data
  • Level of acceptable data sensitivity
0 Defined
DSL Data sensitivity levels

Related to User stories:

  • As an admin I can see list of all available data sensitivity levels because I want to decide which ones I’ll use when configuring CRS or module fields
  • As an admin I can modify list of available data sensitivity levels because I want to set a different data sensitivity policy

Summary: How sensitive is the stored data?

By default, 4 levels are installed:

  • low
  • medium
  • confidential
  • restricted

Corteza administrator can modify and adjust them.

2
Compose Record

Related to User stories:

  • As a user I want to access record’s system fields because I want to change owner or source
1
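
The sensitivity rules described for the compose record store and compose module field features can be sketched as two checks. The level names follow the default list given under data sensitivity levels; the function names are hypothetical, and since administrators can change the levels, a real implementation would load them from configuration.

```python
LEVELS = ["low", "medium", "confidential", "restricted"]  # default levels, lowest first


def rank(level):
    """Position of a level in the ordered list; raises ValueError if unknown."""
    return LEVELS.index(level)


def can_set_field_level(field_level, crs_level):
    """A field may not be more sensitive than its CRS allows."""
    return rank(field_level) <= rank(crs_level)


def can_set_crs_level(new_level, field_levels):
    """A CRS level may be raised freely, but not lowered below any field in use."""
    highest_in_use = max((rank(level) for level in field_levels), default=0)
    return rank(new_level) >= highest_in_use
```

For a CRS whose fields use `low` and `confidential`, lowering the CRS to `medium` would be refused, while `confidential` or `restricted` would be accepted.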

User Stories

Feature Story Status
Compose module field As an administrator I can use a special UI that helps me configure data privacy on fields because I do not want to manually set RBAC rules on modules, fields and records.
Compose module field As an admin I can modify fields because I want to change data sensitivity on a field and describe how private data is used
Corteza record federation As an admin I can define location information for every federation node
Corteza record federation As an admin I can see configured federation nodes as a store because I want to see all stores on one list
Compose Record As a user I want to access record’s system fields because I want to change owner or source
Data sensitivity levels As an admin I can see list of all available data sensitivity levels because I want to decide which ones I’ll use when configuring CRS or module fields
Data sensitivity levels As an admin I can modify list of available data sensitivity levels because I want to set a different data sensitivity policy

For now, Corteza will not enforce or prevent changes or rules. It is up to the admin to properly configure the instance.

A new RBAC operation is added: `data-privacy.manage-data-sensitivity-levels`.

Record Formats As a compose admin I can adjust default names and types used for generating JSON-LD payload.
Record Formats As a developer I can use Compose API to retrieve records formatted according to JSON-LD specifications.
Data privacy officer console As a DPO I can search for records with data-sensitivity fields because I want to collect requested data
Data privacy officer console As a DPO, I can reject a submitted request because it’s unsubstantiated or not aligned with the defined privacy policies.
Data privacy officer console As a DPO I can review submitted request and prepare response
Data privacy officer console As a DPO I can update records with sensitive data because I want to comply with the data update or removal request
Data privacy officer console As a DPO I can access list of data privacy requests because I want to find ones that need response
Data privacy officer console As a DPO I can create a privacy-request policy because I want to control what happens to privacy requests
Data privacy console As a user I can log in to Data Privacy Console because I want to see my old submissions and responses
Data privacy console As a user I can log in to Data Privacy Console because I want to submit a new privacy request
Corteza record federation As an admin I can see a warning on the UI when I want to share a module with private data because I might have made a mistake
Compose module As an admin I can use Compose Admin because I want to describe how private data in the module is used
Compose module As an admin I can use Compose Admin because I want to manage CRS settings on a module
Compose record store As a developer I can use CRS API endpoint because I want to change a CRS resource
Compose record store As a developer I can use CRS API endpoint because I want to remove an existing CRS resource
Compose record store As a developer I can use CRS API endpoint because I want to add a new CRS resource
Data privacy console As a user I can log in to Data Privacy Console because I want to download my personal data
Compose record store As a developer I can use CRS API endpoint because I want to get list of configured CRS resources
Record formats As a compose admin I can import schema from Schema.org because I want to create my models according to standard structures
Compose record store As an admin I can set location information for every CRS because I want to inform users where the data is stored
Store drivers As a developer I can use store driver API endpoint because I want to get list of available store drivers
Compose record store As an admin I can manage existing CRS because I want to adjust attributes and capabilities
Compose record store As an admin I can configure a new CRS because I want to enable users to store records in it
Compose record store As an admin I can edit module and select a different compose record store because I want to move records to another CRS.
Compose record store As an admin I can see a list of available CRSs in the admin application because I want to see where in the world records are stored
Store drivers As an admin I can see list of all available store drivers because I want to decide which ones I’ll use

Roles

Two new roles are preinstalled in Corteza:

Data Privacy Officer

Type: common role

Handle: data-privacy-officer

Users that are allowed to view and manage privacy requests

Sensitive data owner

Type: contextual role

Handle: sensitive-data-owner

Expression:

(corteza::compose:record) resource.dataSourceID == userID

(corteza::compose:module-field) record.dataSourceID == userID

Used to allow access to records and values.

JSON-LD

All data in Corteza will be serialized to JSON-LD format (currently it is all serialized to JSON) with initial support for at least one semantic web ontology (e.g. schema.org).

The JSON-LD format will also be used to describe data privacy properties.
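
A minimal sketch of the content negotiation described for record formats: a record is wrapped in a JSON-LD document only when the Accept header asks for application/ld+json. The function name, the default type `Person` and the naive substring check on the header are assumptions; a real handler would parse the header properly and take the type from the module configuration.

```python
import json

JSON_LD = "application/ld+json"


def encode_record(record, accept, type_name="Person", context="https://schema.org"):
    """Return (content_type, body) for a record, honouring the Accept header."""
    if JSON_LD in accept:  # naive check; ignores q-values and wildcards
        doc = {"@context": context, "@type": type_name, **record}
        return JSON_LD, json.dumps(doc)
    return "application/json", json.dumps(record)
```

Calling `encode_record({"name": "Ada"}, "application/ld+json")` returns a body carrying `@context` and `@type` next to the record values, while any other Accept value falls back to plain JSON.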

Corteza Privacy pages

Corteza Privacy: UI Prototype
Corteza Privacy: Architecture and Work Plan
Corteza Privacy: Proof of Concept Development