Data Acquisition

Data acquisition is the sourcing of data for a Trustworthy Data Repository. When Business Intelligence began, the idea was to source data for a data warehouse from a Source System of Record (SSoR), usually the system that first captured the data. However, as data stores proliferated throughout companies, ETL (extract, transform, and load) processes began to source data from convenient locations rather than from a true SSoR. As a result, it is not unusual for an organization not to know where its data originates, which undermines data trustworthiness.

Whether an organization uses ETL or ELT (extract, load, and transform, where data is transformed in place after loading), it is essential to source every data element from its SSoR whenever possible, so that it reaches a trustworthy data repository quickly and consistently.

But what must a data acquisition process do to make data trustworthy? Trustworthy data is data that is:

  • Ready to use without any further manipulation or preparation by a user of the data, for any Data Use Scenario;
  • Compliant with all known business rules established for it, which requires knowing, documenting, and applying these rules in the data acquisition process and presenting them, as needed, so users can understand how the data they rely on has been prepared for their use;
  • Integrated into the Governance process for correcting data exceptions, because data exceptions must not simply flow into a data repository uncorrected; otherwise they might corrupt business analytic results.
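
The last two points can be illustrated with a minimal Python sketch. It is not a prescribed implementation: the names BusinessRule, AcquisitionResult, and acquire, the two example rules, and the sample order records are all hypothetical, chosen only to show documented rules being applied during acquisition and failing records being held for Governance instead of being loaded.

```python
from dataclasses import dataclass, field
from typing import Callable

# A record is modelled as a plain dictionary (an assumption for this sketch).
Record = dict


@dataclass(frozen=True)
class BusinessRule:
    """A documented business rule (hypothetical structure)."""
    name: str                          # short identifier reported with exceptions
    description: str                   # human-readable text that can be shown to data users
    check: Callable[[Record], bool]    # returns True when the record complies


@dataclass
class AcquisitionResult:
    loaded: list = field(default_factory=list)      # records that passed every rule
    exceptions: list = field(default_factory=list)  # (record, failed rule names) pairs


def acquire(records: list, rules: list) -> AcquisitionResult:
    """Apply every rule to every record; hold back any record that fails."""
    result = AcquisitionResult()
    for record in records:
        failed = [rule.name for rule in rules if not rule.check(record)]
        if failed:
            # Held for correction through Governance, never loaded as-is.
            result.exceptions.append((record, failed))
        else:
            result.loaded.append(record)
    return result


# Illustrative rules only; real rules come from the business.
RULES = [
    BusinessRule(
        "customer_id_present",
        "Every order carries a customer id.",
        lambda r: bool(r.get("customer_id")),
    ),
    BusinessRule(
        "amount_non_negative",
        "Order amounts are never negative.",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

# Made-up sample data for the sketch.
orders = [
    {"order_id": 1, "customer_id": "C-17", "amount": 42.0},
    {"order_id": 2, "customer_id": "", "amount": 10.0},
    {"order_id": 3, "customer_id": "C-09", "amount": -5.0},
]

result = acquire(orders, RULES)
for record, failed in result.exceptions:
    print(f"order {record['order_id']} held for Governance: {failed}")
```

In this sketch only order 1 reaches the repository; orders 2 and 3 are kept aside together with the names of the rules they broke. Because each rule carries its own description, the same rule list could also be presented to users who want to understand how the data was prepared.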