Corteza Privacy - Architecture and Work Plan

Geolocation of Data Sources

All data source definitions (primary, compose record store, federation) will provide geographical points (latitude and longitude) with textual descriptions of the location and infrastructure ownership.
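
As a rough illustration, the location metadata described above could be modelled as a small value type. The sketch below uses Python only for brevity; the class name `StoreLocation` and its field names are assumptions and not part of Corteza.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreLocation:
    """Hypothetical location record attached to a data source definition."""
    latitude: float    # degrees, -90..90
    longitude: float   # degrees, -180..180
    description: str   # textual description of the location
    ownership: str     # who owns the infrastructure

    def __post_init__(self):
        # Reject coordinates outside the valid geographic ranges.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


# Example values are made up for illustration.
loc = StoreLocation(46.05, 14.51, "Example data centre", "Example hosting provider")
```

The same structure would apply to the primary store, each compose record store and each federation node.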

REST Endpoints

Method	Path	Description
POST	/system/data-privacy/requests	Submit new data privacy request.
GET	/system/data-privacy/requests	List data privacy requests.
GET	/system/data-privacy/requests/{ID}	Get details about a specific request.
GET	/system/data-privacy/requests/{ID}/responses	List responses for a specific request.
POST	/system/data-privacy/requests/{ID}/responses	Submit a new response.
GET	/system/stores	List all known data stores (primary + CRS + federated) with their geolocation info and other data-privacy related details.
GET	/system/stores/drivers	List all available store drivers with their capabilities and configuration details.
GET	/compose/record-stores	List all available CRS.
POST	/compose/record-stores	Add a new CRS.
GET	/compose/record-stores/{ID}	Get details on a single CRS.
PUT	/compose/record-stores/{ID}	Update an existing CRS.
DELETE	/compose/record-stores/{ID}	Remove an existing CRS.
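
A client could address the privacy request endpoints listed above roughly as follows. This is a minimal sketch that only builds the HTTP requests and does not send them; the base URL, the bearer-token header and the payload fields are assumptions, since the document does not define them.

```python
import json
from urllib.request import Request

# Assumption: hypothetical deployment URL, not a real Corteza instance.
BASE_URL = "https://corteza.example.org/api"


def new_privacy_request(payload: dict, token: str) -> Request:
    """Build (but do not send) a POST to /system/data-privacy/requests."""
    return Request(
        BASE_URL + "/system/data-privacy/requests",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # assumed auth scheme
        },
        method="POST",
    )


def list_responses(request_id: str, token: str) -> Request:
    """Build a GET to /system/data-privacy/requests/{ID}/responses."""
    return Request(
        BASE_URL + "/system/data-privacy/requests/" + request_id + "/responses",
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )


# The payload shape is illustrative only.
req = new_privacy_request({"kind": "lookup"}, token="example-token")
```

Passing either object to `urllib.request.urlopen` would perform the call against a real server.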

Capabilities

Level	Name	Additional description	MoSCoW	Effort
Driver	SensitivityLevel	Maximum data sensitivity level supported	Must	Low
Driver	Partitioning	Can records be partitioned? Does the data source (and driver) support partitioning? Can records be saved in separate partitions (tables, collections, files)?	Must	Mid
Driver	PartitionFormatValidator	Validator for the generated partition name. When a new module is connected to a CRS whose driver has partitioning enabled, the partition name generated for that module (using PartitionFormat and actual values from the module) is verified.	Must	Mid
Driver	ValueEncodingStrategies	All supported strategies for value encoding. key-value-appendix: legacy (obsolete); an additional table with the “_values” suffix stores record ID, field name, string value and order. json: fields are stored in a single field as an object; multi-value fields are stored as arrays; native JSON types (bool, number, string) are used.	Must	High
CRS	Encrypted	Informative: is data encrypted?
CRS	GeoLocation	Where is data stored?		Low
Module	PartitionFormat	How are records partitioned? Allows different partitioning formats using a simple syntax with placeholders. Examples: “fixed_name”: records are stored in the fixed partition; “{module}”: records are stored in per-module partitions (for example, a separate table); “corteza_{module}”: records are stored in prefixed, per-module partitions. Placeholders: {module}: the module’s partition handle; the module’s handle is used by default.	Must	High
Field	SensitivityLevel	Field data sensitivity	Must	Low
Module	ValueEncodingStrategy	Selected strategy for value encoding. See “ValueEncodingStrategies” in this table for details.	Must	High
Module	Action logging	Selected strategy for value encoding	Must	High
Module	Action logging	Action logging policy	Should	Low
Module	RBAC	Allow record-level access control	Could	Low
Module	Automation	Event triggering policy. Allows control over events triggered by a particular module. Event types: before-create, after-create, before-update, after-update, before-delete, after-delete.	Could	Low
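
The PartitionFormat and PartitionFormatValidator rows in the table above can be illustrated with a short sketch. The function names and the validation pattern are assumptions; an actual driver would apply whatever naming rules its underlying storage requires.

```python
import re

# Assumption: partition names must look like simple identifiers
# (a letter or underscore followed by up to 62 letters, digits or underscores).
_VALID_PARTITION = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def render_partition(fmt: str, module_handle: str) -> str:
    """Replace the {module} placeholder with the module's partition handle."""
    return fmt.replace("{module}", module_handle)


def validate_partition(name: str) -> bool:
    """Hypothetical driver-side check of a generated partition name."""
    return _VALID_PARTITION.fullmatch(name) is not None


name = render_partition("corteza_{module}", "contact")
```

Here `name` is `corteza_contact`, which passes the assumed validator, while a name containing spaces or punctuation would be rejected before the module is connected to the CRS.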

Resources

Types and Resources

List of Corteza types and resources that need to be implemented or extended.

Name	Action	Kind	Component
Corteza Privacy (5)
Store driver. Naming: StoreDriver, store-driver, store_driver.	Implement	Type	Corteza
Primary store. Naming: PrimaryStore, primary-store, primary_store.	Implement	Instance	Corteza
Primary compose record store	Implement	Instance	Compose
Compose record store. RBAC operations: on the compose component: create.record-store; on the CRS resource: read, update, delete. If a CRS can be read, it can be assigned to a module (if the role is allowed to create/update a module). Events: no events are triggered.	Implement	Resource	Compose
Compose namespace	Extend	Resource	Compose
Compose module	Extend	Resource	Compose
Compose module field	Extend	Resource	Compose
Federation node	Extend	Resource	Federation
Privacy request. Naming: PrivacyRequest, privacy_request, privacy-request. RBAC operations: none. Events: before create, after create.	Implement	Resource	System
Privacy response. Naming: PrivacyRequest, privacy_request, privacy-request. RBAC operations: on the system component: data-privacy-requests.manage. Events: before create, after create.	Implement	Resource	System

Additional details on type/resource properties

Access control: what kind of access control should be implemented
Automation: new
General information: none of the new resources support resource translations

Encoding and Decoding

Operations and data that flows in and out of the store need to be properly encoded and decoded.

Standard Resource Store Operations

Lookup

Finds one resource and returns an error if it is not found.
CortezaID is the most commonly used filter, as it is the primary key. In addition to that, a handle (a textual identifier with enforced character range constraints) and other single or multiple fields can be used.
Preprocessors can be used to apply certain rules. Casting to lower case to avoid case sensitivity is the most common one.
Ensures resource is decoded in a way that no information is lost.
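
The lookup behaviour described above (one result or an error, with a lower-casing preprocessor) might look like this minimal in-memory sketch. The `NotFoundError` type and the dictionary-based resources are assumptions made for illustration.

```python
class NotFoundError(LookupError):
    """Raised when a lookup matches no resource (hypothetical error type)."""


def lookup_by_handle(resources, handle):
    """Return the single resource with the given handle, ignoring case."""
    wanted = handle.lower()  # preprocessor: cast the filter value to lower case
    for res in resources:
        if res["handle"].lower() == wanted:
            return res
    raise NotFoundError("no resource with handle " + repr(handle))


stores = [{"id": 1, "handle": "primary"}, {"id": 2, "handle": "EU_Archive"}]
found = lookup_by_handle(stores, "eu_archive")
```

A search operation over the same data would instead return an empty list when nothing matches.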

Search

Returns all matching resources but does not yield an error if none are found.
Searching can support structured filtering, full-text search (FTS), sorting and cursor-based pagination.
Ensures resource is decoded in a way that no information is lost.
Must support per-resource access-control callbacks to ensure proper pagination.
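
Why access control has to run inside the search loop can be shown with a small sketch: if inaccessible resources were filtered out only after a page had been cut, pages would come back short. The function below is an in-memory illustration under assumed names (`search`, `can_read`, `after_id`), not the store's actual API.

```python
from typing import Callable, Dict, List, Optional, Tuple


def search(rows: List[Dict], limit: int,
           can_read: Callable[[Dict], bool],
           after_id: Optional[int] = None) -> Tuple[List[Dict], Optional[int]]:
    """Return up to `limit` readable rows after the cursor, plus the next cursor."""
    page: List[Dict] = []
    for row in sorted(rows, key=lambda r: r["id"]):
        if after_id is not None and row["id"] <= after_id:
            continue  # already returned on a previous page
        if not can_read(row):
            continue  # inaccessible rows do not count toward the page size
        page.append(row)
        if len(page) == limit:
            break
    # A full page may have more rows after it; a short page is the last one.
    next_cursor = page[-1]["id"] if len(page) == limit else None
    return page, next_cursor


rows = [{"id": i} for i in range(1, 7)]
only_even = lambda r: r["id"] % 2 == 0
first_page, cursor = search(rows, 2, only_even)
second_page, end = search(rows, 2, only_even, after_id=cursor)
```

With six rows of which only the even IDs are readable, the first page holds IDs 2 and 4 and the second page holds ID 6 with no further cursor.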

Create

Must support per-resource access-control callbacks to ensure proper pagination
Store can support storing of new resources.
Stores a new resource using provided resource ID.

Update

Store can support updating existing resources.
Updates an existing resource using provided resource ID.
Ensures resource is encoded in a way that no information is lost.

Delete

Removes a resource. It accepts the same conditions as the lookup operation.

Truncate

Removes all resources.

Compose Records Store Operations

In addition to the standard resource store operations, a CRS maps and encodes values from module field types to the store’s internal types, and decodes them in the opposite direction.

Each field kind must be encoded in a way that is compatible with the store and makes full use of the store’s native capabilities.
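
As an illustration of the “json” value encoding strategy from the Capabilities table (all fields in a single object, multi-value fields as arrays, native JSON types), a round trip could be sketched as follows. The function names and the convention that every field holds a list of values in memory are assumptions.

```python
import json


def encode_values(values: dict, multi_value_fields: set) -> str:
    """Encode record values (field name -> list of values) as one JSON object."""
    doc = {}
    for name, vals in values.items():
        if name in multi_value_fields:
            doc[name] = list(vals)                 # multi-value: JSON array
        else:
            doc[name] = vals[0] if vals else None  # single value: native type
    return json.dumps(doc, sort_keys=True)


def decode_values(raw: str, multi_value_fields: set) -> dict:
    """Decode the JSON object back to field name -> list of values."""
    out = {}
    for name, val in json.loads(raw).items():
        if name in multi_value_fields:
            out[name] = val
        else:
            out[name] = [] if val is None else [val]
    return out


record = {"name": ["Ada"], "tags": ["a", "b"], "active": [True], "age": [36]}
multi = {"tags"}
encoded = encode_values(record, multi)
```

Decoding `encoded` with the same set of multi-value fields yields the original `record`, which is the lossless property the store operations require.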

Privacy request policy

Defines how new requests are treated.

Options:

auto-approve lookup
auto-approve change
auto-approve removal

Policies are stored in a setting.

An extra level of access control is implemented to protect certain setting prefixes, according to the RBAC operations allowed on the component.
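
The effect of the three policy options could be sketched like this. The setting keys and the status names `approved` and `pending` are assumptions; the document only names the options themselves.

```python
# Assumption: hypothetical setting keys, one per policy option.
POLICY_KEYS = {
    "lookup": "privacy.request-policy.auto-approve-lookup",
    "change": "privacy.request-policy.auto-approve-change",
    "removal": "privacy.request-policy.auto-approve-removal",
}


def initial_status(kind: str, settings: dict) -> str:
    """Return the status a newly submitted request of the given kind starts in."""
    if kind not in POLICY_KEYS:
        raise ValueError("unknown request kind: " + kind)
    # A missing setting is treated as "not auto-approved" (assumed default).
    return "approved" if settings.get(POLICY_KEYS[kind], False) else "pending"


settings = {POLICY_KEYS["lookup"]: True}
```

With these settings a lookup request is approved immediately, while change and removal requests wait for a data privacy officer.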

Features

Abbr	Feature	User stories	Status
SD	Store drivers. Related user stories: (1) As a developer I can use store driver API endpoint because I want to get list of available store drivers. (2) As an admin I can see list of all available store drivers because I want to decide which ones I’ll use. Summary: How can we connect to data storage, and what are its restrictions and capabilities? A store driver serves as an interface between Corteza services and the underlying storage. It can provide full support for all Corteza store needs and act as the primary store, or provide partial, specialised features, e.g. a caching layer or compose record read and write features. The architecture is designed to support SQL, NoSQL, plain-text files on disk, or remote web services (REST API). Some of these storage types might not support all features and therefore cannot be a primary store. Each driver needs to inform the CRS using it what kind of behaviour to expect, and what capabilities can be used and how. This information is then transferred to the compose module that is using the CRS. All relevant capabilities from the driver, CRS and module are merged and accessible via API so that clients (frontend web applications) can provide the best possible user experience. In addition to capabilities, the driver also provides required and available configuration options. These options are then used to configure a CRS. Available drivers are statically defined in the code only. A handpicked few are included in the codebase; additional drivers can be brought in via plugins. Examples of driver capabilities and features: whether it can be used as a primary store driver or is intended only for CRS; whether the schema can be modified (create and alter tables, indexes); data type mapping rules for module field types (to/from store type); encoding/decoding functions for module field types.	2	Defined
CRF	Corteza record federation. Related user stories: (1) As an admin I can see configured federation nodes as a store because I want to see all stores on one list. (2) As an admin I can define location information for every federation node. (3) As an admin I can see a warning on the UI when I want to share a module with private data because I might have made a mistake. Summary: Additional information on Corteza record federation nodes. Each node also needs to be represented as a store. Each node has information and capabilities.	3
CRS	Compose record store. Related user stories: (1) As an admin I can edit module and select a different compose record store because I want to move records to another CRS. (2) As an admin I can see a list of available CRSs in the admin application because I want to see where in the world records are stored. (3) As an admin I can configure a new CRS because I want to enable users to store records in it. (4) As an admin I can manage existing CRS because I want to adjust attributes and capabilities. (5) As an admin I can set location information for every CRS because I want to inform users where the data is stored. (6) As a developer I can use CRS API endpoint because I want to get list of configured CRS resources. (7) As a developer I can use CRS API endpoint because I want to add a new CRS resource. (8) As a developer I can use CRS API endpoint because I want to remove an existing CRS resource. (9) As a developer I can use CRS API endpoint because I want to change a CRS resource. Summary: How and where are compose records stored, and what are the restrictions and capabilities? The compose record store resource makes alternative and external data sources available for record interaction. It allows records to be stored in and accessed from different databases. Each CRS is configured with exactly one store driver and must provide all configuration options required by the driver. A store driver comes with a set of attributes and capabilities that the CRS must comply with and utilise when accessing and storing records. Each CRS can have its data sensitivity level raised. A CRS’ data sensitivity level can be raised at any time, but it can be lowered only to the highest level used by any of the valid compose module fields.	9	Defined
CMF	Compose module field. Related user stories: (1) As an administrator I can use a special UI that helps me configure data privacy on fields because I do not want to manually set RBAC rules on modules, fields and records. (2) As an admin I can modify fields because I want to change data sensitivity on a field and describe how private data is used. Summary: Additional capabilities on compose module field to indicate use of private data, location and other information and capabilities provided by the CRS (via the module). Module fields can have their data sensitivity level raised from low (default) to moderate, high or restricted. Module field sensitivity cannot be set higher than the one set on the CRS. This allows data privacy officers to filter and locate relevant records and values in case of a data privacy request.	2	Defined
CM	Compose module. Related user stories: (1) As an admin I can use Compose Admin because I want to manage CRS settings on a module. (2) As an admin I can use Compose Admin because I want to describe how private data in the module is used. Summary: Additional capabilities on compose module to indicate use of private data, location and other information and capabilities provided by the CRS. Compose modules can have an alternative (to primary) CRS configured.	2
CRV	Compose record values. Related user stories: none. Summary: How Corteza encodes record values for storage. Notes: encoders/decoders.	0
DPC	Data privacy console. Related user stories: (1) As a user I can log in to Data Privacy Console because I want to see my old submissions and responses. (2) As a user I can log in to Data Privacy Console because I want to download my personal data. (3) As a user I can log in to Data Privacy Console because I want to submit new privacy request. Summary: How (anonymous) users can submit requests for review, deletion or correction of private data.	3
DPO	Data privacy officer console. Related user stories: (1) As a DPO I can create a privacy-request policy because I want to control what happens to privacy requests. (2) As a DPO I can access list of data privacy requests because I want to find ones that need response. (3) As a DPO I can review submitted request and prepare response. (4) As a DPO I can use a tool to help me find and link sensitive data with the request. (5) As a DPO I can search for records with data-sensitivity fields because I want to collect requested data. (6) As a DPO I can update records with sensitive data because I want to comply with the data update or removal request. (7) As a DPO I can reject submitted request because it is unsubstantiated or not aligned with the defined privacy policies. Summary: How assigned data officers can review and act on data privacy requests.	7
RF	Record formats. Related user stories: (1) As a compose admin I can import schema from Schema.org because I want to create my models according to standard structures. (2) As a developer I can use Compose API to retrieve records formatted according to JSON-LD specifications. (3) As a compose admin I can adjust default names and types used for generating JSON-LD payload. Summary: Provide basic support for JSON-LD. Compose admin will be able to import a pre-existing schema from Schema.org to create modules on the fly. Compose admin will be able to modify module fields and change the default field names and types used for generating the JSON-LD payload. If a request is sent with the appropriate “Accept” HTTP header (application/ld+json), the response handler will encode the record and its values into a standard JSON-LD payload.	3
PS	Primary store. Related user stories: none. Summary: Where the primary data and resources are stored. The primary store is a virtual concept that serves as a link between Corteza services and the configured store driver. It holds basic store attributes that are needed to satisfy data privacy requirements: geographical location of the data; level of acceptable data sensitivity.	0	Defined
DSL	Data sensitivity levels. Related user stories: (1) As an admin I can see list of all available data sensitivity levels because I want to decide which ones I’ll use when configuring CRS or module fields. (2) As an admin I can modify list of available data sensitivity levels because I want to set a different data sensitivity policy. Summary: How sensitive is the stored data? By default, 4 levels are installed: low, medium, confidential, restricted. The Corteza administrator can modify and adjust them.	2
	Compose Record. Related user stories: (1) As a user I want to access record’s system fields because I want to change owner or source.	1
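
The sensitivity rules from the CRS and CMF rows above (a field may not exceed its CRS, and a CRS may be lowered only to the highest level its fields use) can be expressed as two small checks. The sketch assumes the four default level names from the DSL row, ordered from least to most sensitive; the function names are hypothetical.

```python
# Default levels from the "Data sensitivity levels" feature, lowest first.
LEVELS = ["low", "medium", "confidential", "restricted"]


def rank(level: str) -> int:
    """Position of a level in the ordered list; raises ValueError if unknown."""
    return LEVELS.index(level)


def can_set_field_level(field_level: str, crs_level: str) -> bool:
    """A module field may not be more sensitive than its CRS allows."""
    return rank(field_level) <= rank(crs_level)


def can_set_crs_level(new_level: str, field_levels: list) -> bool:
    """A CRS level may not drop below the highest level used by its fields."""
    highest_used = max((rank(lvl) for lvl in field_levels), default=0)
    return rank(new_level) >= highest_used


used = ["low", "confidential"]
```

With fields using `low` and `confidential`, the CRS can be set to `confidential` or `restricted` but not to `medium`.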

User Stories

Feature	Story	Status
Compose module field	As an administrator I can use a special UI that helps me configure data privacy on fields because I do not want to manually set RBAC rules on modules, fields and records.
Compose module field	As an admin I can modify fields because I want to change data sensitivity on a field and describe how private data is used
Corteza record federation	As an admin I can define location information for every federation node
Corteza record federation	As an admin I can see configured federation nodes as a store because I want to see all stores on one list

Compose Record	As a user I want to access record’s system fields because I want to change owner or source

Data sensitivity levels	As an admin I can see list of all available data sensitivity levels because I want to decide which ones I’ll use when configuring CRS or module fields
Data sensitivity levels	As an admin I can modify list of available data sensitivity levels because I want to set a different data sensitivity policy. For now, Corteza will not enforce or prevent changes or rules; it is up to the admin to properly configure the instance. A new RBAC operation is added: `data-privacy.manage-data-sensitivity-levels`.
Record formats	As a compose admin I can adjust default names and types used for generating JSON-LD payload.
Record formats	As a developer I can use Compose API to retrieve records formatted according to JSON-LD specifications.

Data privacy officer console	As a DPO I can search for records with data-sensitivity fields because I want to collect requested data
Data privacy officer console	As a DPO I can reject submitted request because it is unsubstantiated or not aligned with the defined privacy policies.
Data privacy officer console	As a DPO I can review submitted request and prepare response
Data privacy officer console	As a DPO I can update records with sensitive data because I want to comply with the data update or removal request
Data privacy officer console	As a DPO I can access list of data privacy requests because I want to find ones that need response
Data privacy officer console	As a DPO I can create a privacy-request policy because I want to control what happens to privacy requests

Data privacy console	As a user I can log in to Data Privacy Console because I want to see my old submissions and responses
Data privacy console	As a user I can log in to Data Privacy Console because I want to submit new privacy request

Corteza record federation	As an admin I can see a warning on the UI when I want to share a module with private data because I might have made a mistake
Compose module	As an admin I can use Compose Admin because I want to describe how private data in the module is used

Compose module	As an admin I can use Compose Admin because I want to manage CRS settings on a module
Compose record store	As a developer I can use CRS API endpoint because I want to change a CRS resource
Compose record store	As a developer I can use CRS API endpoint because I want to remove an existing CRS resource

Compose record store	As a developer I can use CRS API endpoint because I want to add a new CRS resource
Data privacy console	As a user I can log in to Data Privacy Console because I want to download my personal data

Compose record store	As a developer I can use CRS API endpoint because I want to get list of configured CRS resources
Record formats	As a compose admin I can import schema from Schema.org because I want to create my models according to standard structures
Compose record store	As an admin I can set location information for every CRS because I want to inform users where the data is stored
Store drivers	As a developer I can use store driver API endpoint because I want to get list of available store drivers
Compose record store	As an admin I can manage existing CRS because I want to adjust attributes and capabilities
Compose record store	As an admin I can configure a new CRS because I want to enable users to store records in it
Compose record store	As an admin I can edit module and select a different compose record store because I want to move records to another CRS.
Compose record store	As an admin I can see a list of available CRSs in the admin application because I want to see where in the world records are stored
Store drivers	As an admin I can see list of all available store drivers because I want to decide which ones I’ll use

Roles

The following new roles are preinstalled in Corteza:

Data Privacy Officer

Type: common role

Handle: data-privacy-officer

Users that are allowed to view and manage privacy requests.

Sensitive data owner

Type: contextual role

Handle: sensitive-data-owner

Expression:

(corteza::compose:record) resource.dataSourceID == userID

(corteza::compose:module-field) record.dataSourceID == userID

Used to allow access to records and values.

JSON-LD

All data in Corteza will be serialized to JSON-LD format (currently it is all serialized to JSON) with initial support for at least one semantic web ontology (e.g. schema.org).

The JSON-LD format will also be used to describe data privacy properties.
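
A minimal sketch of what JSON-LD serialization with schema.org could look like follows. `@context` and `@type` are standard JSON-LD keywords and `application/ld+json` is the media type named in the Record formats feature; the function names, the field mapping and the sample record are assumptions.

```python
import json


def to_json_ld(record_values: dict, schema_type: str, field_map: dict) -> str:
    """Encode record values as a JSON-LD document using the schema.org context."""
    doc = {"@context": "https://schema.org", "@type": schema_type}
    for field, value in record_values.items():
        # Use the admin-configured JSON-LD name, falling back to the field name.
        doc[field_map.get(field, field)] = value
    return json.dumps(doc)


def wants_json_ld(accept_header: str) -> bool:
    """Check whether an Accept header lists the JSON-LD media type."""
    media_types = (part.split(";")[0].strip().lower()
                   for part in accept_header.split(","))
    return "application/ld+json" in media_types


# Hypothetical module fields mapped to schema.org Person properties.
payload = to_json_ld(
    {"first_name": "Ada", "email": "ada@example.org"},
    schema_type="Person",
    field_map={"first_name": "givenName"},
)
```

A response handler could call `wants_json_ld` on the incoming request and fall back to plain JSON when it returns false.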

Corteza Privacy pages

– Corteza Privacy: UI Prototype
– Corteza Privacy: Architecture and Work Plan
– Corteza Privacy: Proof of Concept Development
