Data Architecture

Data architecture is the process of structuring data so that it supports all Data Use Scenarios. In typical data warehouses, data marts, and operational data stores, data may be structured in third normal form (3NF), in dimensional or snowflake form to facilitate drill-down dashboards, as denormalized extracts for end users, or in some combination of these. In each case, the underlying technology is usually a relational database management system.

However, with the advent of text, voice-to-digital, social media, sensor, and other semi-structured and unstructured data, data does not always fit neatly into a relational database schema. In addition, data architecture has become more sophisticated as data use scenarios require time-sliced and time-aware data; hierarchies, classifications, and aggregates at the enterprise, business unit, and local levels; and other specific provisions for trustworthy data.

Data architecture for trustworthy data needs to address:

  • Capturing unmodified data values from SSoRs, while appending columns to each SSoR table row to record pertinent extract-related facts such as SSoR name/provenance, data acquisition run information, and so forth;
  • Persisting SSoR data as needed to support real-time integration;
  • Storing each data element at its lowest level of grain (its most granular version);
  • Supporting a complete history of data over time (every change to a data element);
  • Providing multiple perspectives of data elements;
  • Combining time-variance and multiple perspectives of content;
  • Providing Audit, Balance, and Control (ABC) data controls, based on SSoR data content, to verify critical values such as row counts, unique identifiers, total dollars, and so forth;
  • Including ABC for data stored in Errors and Exceptions (E+E) for User follow-up and corrective actions;
  • Including data objects for storing and analyzing ABC and E+E results;
  • Persisting ABC and E+E metadata and data for analysis, metrics, and reporting;
  • Supporting use of interim data in a work area while it is made “trustworthy.”
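
Two of the items above, appending extract-related columns to unmodified source rows and computing ABC control totals, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from any particular tool: the functions tag_extract, control_totals, and balance, the column names _source_name, _run_id, and _extracted_at, and the sample "billing" source are all hypothetical. The sketch assumes rows are plain dictionaries with an "id" and an "amount" field.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


def tag_extract(rows, source_name, run_id, extracted_at=None):
    """Return copies of source rows with extract metadata appended.

    Source values are left unmodified; only new keys are added.
    The underscore-prefixed column names are hypothetical.
    """
    extracted_at = extracted_at or datetime.now(timezone.utc)
    return [
        {
            **row,
            "_source_name": source_name,
            "_run_id": run_id,
            "_extracted_at": extracted_at,
        }
        for row in rows
    ]


@dataclass(frozen=True)
class ControlTotals:
    """Control values of the kind listed above: counts, identifiers, dollars."""
    row_count: int
    distinct_ids: int
    total_amount: Decimal


def control_totals(rows, id_key="id", amount_key="amount"):
    """Compute control totals over a set of rows."""
    return ControlTotals(
        row_count=len(rows),
        distinct_ids=len({row[id_key] for row in rows}),
        total_amount=sum(
            (Decimal(str(row[amount_key])) for row in rows), Decimal("0")
        ),
    )


def balance(expected, actual):
    """Compare two sets of control totals.

    Returns one record per mismatched control; in a real pipeline these
    records would be persisted for follow-up (an assumption of this sketch).
    """
    exceptions = []
    for name in ("row_count", "distinct_ids", "total_amount"):
        exp, act = getattr(expected, name), getattr(actual, name)
        if exp != act:
            exceptions.append({"control": name, "expected": exp, "actual": act})
    return exceptions


# Usage: land two source rows, then balance the landed copy against the source.
source_rows = [
    {"id": 1, "amount": "10.50"},
    {"id": 2, "amount": "4.25"},
]
landed = tag_extract(source_rows, source_name="billing", run_id="run-001")
print(balance(control_totals(source_rows), control_totals(landed)))  # prints []

# Simulate a row lost during loading; every control now reports a mismatch.
for record in balance(control_totals(source_rows), control_totals(landed[:1])):
    print(record)
```

Amounts are summed as Decimal rather than float so that dollar totals compare exactly. The mismatch records returned by balance stand in for the E+E entries described above; how they are stored and reported is left open here.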