
Mastering Master Data

Understanding Master Data

Master data is emerging as an essential business capability.

It can provide accuracy and control over critical business information and objects. Yet success with master data is not being realized because it remains poorly understood in the business community. The root cause of this situation is viewing master data as a data problem rather than a business problem.

What is Master Data?

Unfortunately, this simple question is difficult to answer because master data has consistently been viewed as a technical problem arising from duplicated data used in an inconsistent manner by business units and application systems.

Consider this definition: “Master Data Management (MDM), also known as Reference Data Management (NB: I’ll make the distinction between Master and Reference Data later in this analysis), is a discipline in Information Technology (IT) that focuses on the management of reference or master data that is shared by several disparate IT systems and groups. MDM is required to warrant consistent computing between diverse system architectures and business functions” (Wikipedia).

Or this view of master data as: “A new way to correct the age-old problem in companies that the left-hand does not know what the right hand is doing. The goal: Merge all the disparate, oft conflicting records you have on customers and transactions into one authenticated master file” (Primer: Master Data Management, by S. Jae Yang,

And one last definition from a respected expert on data and data quality: “Master data sets are synchronized copies of core business entities used in traditional or analytical applications across the organization, and subjected to enterprise governance policies, along with their associated metadata, attributes, definitions, roles, connections and taxonomies” (Defining Master Data, by David Loshin,

With the prevailing view that master data is a technical issue, it is no wonder that master data is poorly understood by business people. Master data is first and foremost a business issue and treating its technical aspects treats the symptoms, not the underlying problem. Unless this basic fact is accepted and then acted upon, a master data initiative will fail because it will influence the information technology environment only, not the business and its operation.

Why Master Data is a Business Problem

Where can a business person go for an accurate list of all the company’s customers, products, suppliers, or contracts?

The business needs access to pertinent data on these entities for many purposes, including validating customers and suppliers, orders and invoices, and receivables and payables. Every business has this critical information distributed among several transaction processing and back-office systems, but having it readily available in one place lets the business deal with these needs, and others, easily and effectively. If no such single place exists, these business entities cannot be controlled or managed effectively. Creating this single place of business information for critical business entities is the purpose of master data.

My definition of master data is this: “Master data is the official data representation of the real entities which are part of the business, where real entities are those that physically exist in the world.” Let’s look at this definition piece by piece:

  • Official means that master data is always the correct, accurate, complete, and, for all purposes, official information about the physical business entity. As such, master data is a critical element for regulatory compliance and audit support.
  • Data representation means that master data is data, not the thing itself. However, images of products, offices, employees, contracts, and so forth can also be elements of master data.
  • Real entities are those that physically exist in the world including customers, suppliers, products, contracts, office locations, employees, and other entities that a company uses or creates in doing business.
  • Which are part of the business means that only the entities and facts pertinent to the business are included in master data.

Looked at in this manner, master data is the accurate recording of data attributes about critical business entities that can be correlated with the “real world.” Master data is the representation of all that is real in the business.

When viewed as an important business element, master data has a clear impact on the business.

For example, if you sell a product to a customer, it is increasingly necessary to know what other products that customer has bought from you. If you are a financial services company, the Patriot Act requires this, as does HIPAA for hospitals, and so forth. Even if you sell products that have warranties, such as auto parts, it can be necessary to know that other required parts were purchased so the warranty can be honored. Customers, products, locations, contracts, sales, and all other master data are central to the operation of business today.

Some important business elements are not included in this definition. These include organization (used to organize information, reports, metrics, and so forth), categorizations (such as a hierarchy of products, classification of offices or locations, and so forth), and other means used to understand the business. These elements are used throughout a business but they are abstract; that is, they don’t physically exist in the real world.

This is reference data, for which my definition is this: “Reference data is the official data representation of the abstract structures used throughout a company to understand the organization, classification, or other perspectives of a business’s real entities.” Business Intelligence professionals will recognize reference data as consistent with dimensions in dimensional models.

Master data and reference data cover the real (companies, customers, products, offices, vendors and so forth) and abstract (organization structure, hierarchies and classifications of products, customers, vendors and so forth) elements of a business. Master and reference data are business problems and not data problems because the business, not IT, is responsible for knowing, managing and keeping these elements of the business correct and consistent. Business elements and associated master data are subject to compliance and audit; and reference data, though less so, should be handled as rigorously as master data in all other regards. For this reason, master data will be the term used to refer to both master data and reference data for the remainder of this series.
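The distinction between the two kinds of data can be made concrete with a small sketch. The Python below is illustrative only: the `Product` and `ProductCategory` classes, their attributes, and the sample values are assumptions made up for this example, not a prescribed model. It shows a master data record (a real product) pointing at reference data (an abstract product hierarchy).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProductCategory:
    """Reference data (hypothetical): an abstract classification, not a physical thing."""
    code: str
    name: str
    parent_code: Optional[str] = None


@dataclass(frozen=True)
class Product:
    """Master data (hypothetical): represents a real product and refers to reference data."""
    master_id: str
    name: str
    category_code: str


def category_path(code, categories):
    """Walk up the hierarchy from a category to its root and return the names, root first."""
    path = []
    while code is not None:
        category = categories[code]
        path.append(category.name)
        code = category.parent_code
    return list(reversed(path))


# Sample values are invented for illustration.
categories = {
    c.code: c
    for c in [
        ProductCategory("AUTO", "Auto parts"),
        ProductCategory("BRAKE", "Brake components", parent_code="AUTO"),
    ]
}
product = Product("P-100", "Ceramic brake pad", "BRAKE")

print(category_path(product.category_code, categories))
```

The product exists in the real world; the hierarchy exists only as a way of understanding products, which is why it corresponds to a dimension in a dimensional model.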

How IT Contributes to Problems with Master Data

Let me be blunt: decades of business and IT focus on applications and functionality with little or no regard to managing data separately from application systems have created an unmanaged data environment.

In fact, it cannot even be called a data environment – data in nearly every business is simply an extension of its application systems. Addressing master data requires IT to separate data from the applications that use it.

This is not easy. Applications and their data structures are tightly coupled and lack the flexibility needed to separate them easily. This problem is aggravated by purchased software where software vendors strive to protect their intellectual property. Anyone who has acquired a major back or front office application knows that it is not easy to access a software system’s data stores to add, update, or delete data or alter physical data structures without affecting the integrity of the application. While SOA is meant to decouple software units and data, it is clear that this is not happening fast enough to solve the master data problem, especially with purchased and proprietary software.

There are several reasons why tying a master data solution to a business application or platform is a risk:

  • master data is used in many applications (according to a survey by Tower Group, companies maintain master data separately in 11 or more source systems), so maintaining master data in one application system to serve all applications that use it will likely overload the throughput capability of the chosen application;
  • different applications use different subsets of master data, so the chosen application will need to be modified to store all master data attributes needed by all the applications in the business;
  • modifying an application may affect the ability to keep the application current with application upgrades and enhancements, especially if the application is purchased (I have on more than one occasion worked with a client in the unfortunate circumstance of being unable to upgrade to a new software release because of the time and cost associated with reapplying custom ‘enhancements’ already implemented and used in the business – a situation that festers until the application itself must be, at great cost, replaced or upgraded with significant rework);
  • this complicates the future of your master data, especially if the vendor goes out of business, the application is superseded, sold, or simply dropped, or if the business unit that uses the application wants to replace it for business reasons.

It isn’t as if the tactic of making an application “strategic” has worked before. Many companies have had initiatives to develop a customer, product, or other “master file,” many more than once, but because data has not been separated from applications the problem reappeared overnight.

IT contributes to problems with master data because IT treats master data solutions as another application. When master data is tackled, it is typically as a customer data integration (CDI) or product information management (PIM) application. This adds another application process, albeit one that can resolve some issues with subsets of master data, into the application infrastructure of IT.

The overall impact of this approach is problematic because implementing the solution as a new application continues the process of correcting data after it has been entered into a transaction processing system. “Master data” constructed in this way is not always the correct, accurate, complete, and, for all purposes, official information, because it is updated only after one or another transaction processing system has been updated. A true master data solution ensures that the master data repository is the only place where master data is added, changed, or deleted.

This aspect fundamentally changes the interaction between master data and applications that use it. Applications will not add, update, or delete master data records. Instead, they will need to use an API to invoke a data delivery service for master data to perform add, update, or delete functions in the master data repository. If master data is physically required to reside in an application system, it is copied directly from the master data repository, never added independently to the application’s data store.
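A minimal sketch of this interaction, in Python, may make the idea easier to picture. Everything in it is hypothetical: the `MasterDataService` class, the `CustomerRecord` attributes, and the in-memory dictionary stand in for whatever repository and data delivery services a real architecture would define. The point it illustrates is only that every add, update, or delete goes through the service, and that an application holding master data locally receives a read-only copy.

```python
from dataclasses import dataclass, replace
from types import MappingProxyType


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical master record; the attributes are illustrative only."""
    master_id: str
    name: str
    country: str


class MasterDataService:
    """Hypothetical data delivery service: the only write path to the repository."""

    def __init__(self):
        self._records = {}  # master_id -> CustomerRecord

    def add(self, record):
        if record.master_id in self._records:
            raise ValueError(f"{record.master_id} already exists")
        self._records[record.master_id] = record

    def update(self, master_id, **changes):
        # replace() builds a new frozen record with the changed attributes.
        self._records[master_id] = replace(self._records[master_id], **changes)

    def delete(self, master_id):
        del self._records[master_id]

    def get(self, master_id):
        return self._records[master_id]

    def snapshot(self):
        """Return a read-only copy for an application that must hold master data locally."""
        return MappingProxyType(dict(self._records))


service = MasterDataService()
service.add(CustomerRecord("C-001", "Acme Ltd", "GB"))
service.update("C-001", name="Acme Limited")

# The billing application's copy comes from the repository; it is never keyed in locally.
billing_copy = service.snapshot()
```

Because the snapshot is a `MappingProxyType` over frozen records, an application that tries to add or alter a record in its copy gets an error instead of silently diverging from the repository. A real deployment would enforce the same rule through service permissions rather than language features.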

While this will create application problems, it is necessary for success because master data is an enterprise issue and must be handled as such.

When decisions are made at the division or department level, the master data problem cannot be solved by either the business or IT. Fundamentally, the process for making these decisions, called governance, needs to be addressed in order to deal with master data effectively.

However, master data for now is stuck inside a wide array of application systems where it is often inconsistent, incorrect, and unreliable. Yes, distributed or federated master data is a data problem and developing a comprehensive master data solution must tackle these data problems head on. This requires dealing with the quality and integration of data in application systems, a process which is well defined and supported by products, sources of business data external to the company, and other resources to supplement and improve a business’s in-house data.

These processes and technologies for data quality and integration have been promoted by vendors as the solution to master data management, but this is not the case. The full extent of the data problem for master data goes beyond data quality and integration: it begins with information architecture. Consider the following issues and questions:

  • Platform and logistic issues: Where will the master data repository reside? How will master data content be delivered to the applications, data warehouses and marts, and business users who need it? Will all technical platforms used for these be supported and if not, how will master data inconsistencies be avoided?
  • Data and content issues: Which data attributes will be included in master data and which will be left in application systems? Why are the selected attributes considered master data (centralizing the data is not a valid reason)? Is unstructured data required?
  • Master data and application integration issues: When an application needs master data to be in its data store, how will that data be kept consistent with the master data repository? When applications “share” or transmit master data to one another, how will they be kept in synch across the business? When these applications have identifiers and keys different from each other and/or from the master data repository, how will the data be kept correct even when different keys are being used?
  • Data delivery and security issues: Master data will be some of the most sensitive data in the business, so how will access to it be controlled? Will master data be encrypted? Will its transmission to an application be encrypted? What protocols will be used for data delivery services and authorizations used for master data?
  • Data audit, control, and compliance issues: What audit processes will be put in place to ensure that the master data repository is the official, correct version of master data and that no exceptions exist in application data stores? What processes and controls are in place to ensure that master data is only accessed by authorized people and applications? How is regulatory compliance, such as for financial or patient data, ensured for appropriate master data?
  • Master data management issues: Who is accountable for managing master data? How is master data managed? How is the usage of master data by business people, applications, and IT controlled? How is master data kept current and correct? How are master data reconciled with corresponding data values in application data stores?

Only by developing answers, processes, and structures for these issues and questions will a comprehensive master data solution be possible. Once master data has been architected into the information framework of the business, a successful master data solution can begin to be developed.
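One of the integration questions above, how data stays correct when applications use keys that differ from each other and from the master data repository, is commonly handled with a key cross-reference. The Python sketch below is one possible shape for it, not a recommended design: the `KeyCrossReference` class, the system names, and the key values are all invented for illustration, and it assumes each master record has at most one key per application.

```python
class KeyCrossReference:
    """Hypothetical cross-reference between application keys and master IDs.

    Assumes one local key per application for each master record.
    """

    def __init__(self):
        self._to_master = {}  # (system, local_key) -> master_id
        self._to_local = {}   # (master_id, system) -> local_key

    def link(self, system, local_key, master_id):
        existing = self._to_master.get((system, local_key))
        if existing is not None and existing != master_id:
            # The same application key must never point at two master records.
            raise ValueError(f"{system}:{local_key} is already linked to {existing}")
        self._to_master[(system, local_key)] = master_id
        self._to_local[(master_id, system)] = local_key

    def master_id(self, system, local_key):
        return self._to_master[(system, local_key)]

    def translate(self, from_system, local_key, to_system):
        """Translate one application's key into another's by way of the master ID."""
        return self._to_local[(self.master_id(from_system, local_key), to_system)]


# System names and keys are made up for the example.
xref = KeyCrossReference()
xref.link("crm", "CUST-8841", "C-001")
xref.link("billing", "000417", "C-001")

print(xref.translate("crm", "CUST-8841", "billing"))
```

Applications never translate keys between themselves directly; every translation passes through the master ID, so the repository remains the one place where the identity of a customer or product is decided.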

Succeeding with Master Data

Success with master data is a challenging proposition.

Thinking strategically, master data is at the heart of critical initiatives in many businesses today. Whether your company is committed to being customer-centric, product-focused, or operationally excellent, master data is essential for success and the degree of success will depend on how your company manages master data and its necessary responsibilities.

Because master data is a business problem first and foremost, business objectives and issues must be addressed on an enterprise, not line of business, basis: For customer master data, who is responsible for customer satisfaction and what data do they need? For product master data, who decides what products and services are offered by the company and who is responsible for ensuring that pricing is consistent across all distribution channels? These questions are typical and their answers cross organizational boundaries and must be addressed to achieve business objectives successfully.

The first requirement for success with master data is that it must tie to a strategic business initiative. Let’s look at a few strategic initiatives occurring in business today and see what they tell us about master data:

Becoming customer-centric

Many businesses understand that customers need to be understood as individuals and households rather than simply as transaction-based entities. These businesses are looking at the overall value of customers: their lifetime value to the company; the nature of their business relationship, whether it’s based on service, price, or other factors; and the profitability of the customer.

Customer-centric companies recognize the importance of consolidating, integrating, and confirming data on their customers and often this is where they start. However, it is critical, if the company truly wants to become customer-centric, that essential business questions be addressed. If your company knows a customer has a high lifetime value and is worth keeping, which organization unit is responsible for keeping the customer satisfied with your company’s products and services? Generally, the answer is no one business unit.

This is the first challenge to becoming customer-centric: recognizing that a series of fragmented customer interactions delivered by separate customer service, account, product, or billing business units does not ensure customer satisfaction. Businesses where customers churn are beginning to tackle this challenge in order to keep their high-value customers.

The best practice in this case is to establish a customer retention unit to monitor customer interactions with the business and take appropriate actions. By looking at all of a customer’s interactions with the business, patterns that indicate a customer may be considering severing his or her relationship with the company can be detected. An example of the need for this is a global financial services company. How will a branch service representative know that the customer across the desk is an important customer traveling internationally—and by not knowing, what treatment will the customer receive? How will this treatment affect the customer’s loyalty to the company?

Scenarios such as this help define the content of customer master data: all of a customer’s accounts, interactions, and other data required to understand the importance of the customer to the company need to be available everywhere a customer can interact with the company.

Also, establishing a customer retention unit requires rethinking account and customer service responsibilities. Customer master data needs to include the data needed to detect behavioral patterns so that the need for a customer retention intervention can be determined. This separate business unit can intervene and act on the customer’s behalf to resolve any issues and answer any questions. Where this has been done, churn has been successfully reduced.

It is important to note that being customer-centric does not mean treating all customers the same. Being customer-centric means that you know who your high-value customers are, who your unprofitable customers are, and treat each accordingly.

Becoming product-centric

Many businesses desire to differentiate themselves on the quality and innovation of their product offerings. However, in large companies, product offerings originate in different departments and business units. This results in a line-of-business approach to products with little thought given to a company’s overall product strategy. Product-centric companies recognize that a focus on product management is critical for success, particularly focusing on product profitability, product differentiation, and product appeal to customer demographics.

These companies recognize the importance of looking at products as critical factors in driving market success. This requires looking at: how well a product is doing in the market (is its market share increasing faster than the competition’s, is the rate of growth consistent, growing, or declining, and so forth); product quality, especially if quality can be affected by suppliers; and product profitability.

This presents challenges to the business as well. Assessing product mix, profitability, and market share requires applying a consistent methodology and criteria not usually found in companies with multiple lines of business. Further, assessing quality and root causes of quality problems usually crosses organizational boundaries.

The best practice in this case is to establish a product management unit. This unit evaluates products, markets, product performance, and product plans. A product management unit presents challenges to the business organization, especially around product evaluations, plans, and investment decisions.

Product master data that supports product management therefore needs to include all products offered by the company, the cost and revenue factors for each product, its quality measurements, its market data, including market data for competitors and their products, and data on customers’ needs, problems, and demographics for market analysis and product development.

Becoming operationally excellent

Many businesses are focused on improving business operations intending to become operationally excellent. This requires a focus on business processes, performance, and effectiveness. Operationally excellent companies recognize that a focus on operations, especially performance metrics, is critical for success.

The business challenge here is to look beyond organizational structure as the sole means to assess operational performance. Business processes, which often cross organizational boundaries, need to be considered. Individual business “pain points,” like increasing inventory turns, optimizing service staff, and any identifiable instance which can affect the bottom line also need to be included.

This is different than becoming customer- or product-centric. Here the business is looked at in organizational and abstract ways, such as by work location, business process, organization structure, and so forth. There is no business unit responsible for operational excellence—this is a responsibility of executive management. The challenge is for individual managers to be assigned accountability for improving a performance metric or an identified “pain point.”

The best practice in this case is to establish a strategic performance management unit. This unit tracks strategic business goals and the metrics that affect or measure them. This requires defining and understanding business operations from the perspectives of the organizational structure and the flow of work through business processes.

Master data in this context consists of reference data (see the definitions of master and reference data above), business processes, metrics, key performance indicators (KPIs), and the underlying data that support them. Industry benchmarks are also required in order to determine operational superiority objectively. These are familiar to data warehousing, business intelligence, and executive scorecard/dashboard professionals.

Students of business literature will recognize the above categories as areas of strategic emphasis written about, I believe, by Michael Treacy. While a company may set its strategic focus on one of these areas, every company is struggling with customer value and retention, product profitability and management, and operational excellence. Therefore, an effective master data strategy needs to address all three.

What do these strategic initiatives show us about master data? First, they challenge the business to address new organizational challenges and structures to be effective. Second, all require data that does not exist in existing transaction systems, including: customer demographic data for determining lifetime (potential) value; market and competitive data for products; and industry benchmarks for performance comparisons.

There are operational and IT implications in how master data is implemented: How will master data be established? How will it fit in with existing and future application systems? How will this affect decision making about application systems and business functionality? These questions also cross existing organizational boundaries and must be addressed for the success of the business. There are important IT factors that are essential for the success of a master data solution:

Technologies used for master data must conform to a rigorous architecture

Too many IT organizations focus on technology issues such as whether a master data repository solution should be federated, distributed, or a single data store. Also, there are specialized needs for Customer master data and for Product master data and it seems as if different technologies and solutions may be required to satisfy their unique problems. Generally, these issues are driven by capabilities of products promoted by vendors as master data solutions. Products and vendors are essential for success with master data but their selection and use must be driven by a rigorous master data architecture.

Master data must be maintained in a single master data environment

There are significant differences in the domains, characteristics, and uses of customer, product, and reference data. Product master data, for example, must serve pricing, distribution channels and catalogs, warranty and return processes, and profitability analysis and planning, while customer data serves customer service, billings and receivables, and order taking. Because these domains differ so widely, they are often implemented as separate solutions.

However, the control issues of managing access to and use of master data, the audit and compliance issues of ensuring the correctness of master data, and the security issues of data encryption are best served by maintaining all master data, in spite of its diversity, in a single master data environment. Using master data in transaction applications, data warehouses, and enterprise reporting should be facilitated by a set of SOA-compliant data delivery services constructed to address these issues.

Master data must have a consistent set of policies and practices

The master data solution can only be effective if it is administered consistently across data domains and their application. This requires a consistent set of policies and practices, particularly monitoring master data content, quality, and usage for overall management and control.

Master data requires effective governance

Clearly, management and control of master data requires an effective governance and decision-making process. This must go beyond data quality and address the role and structure of master data in the company’s business, information, and application architectures. Governance is critical for effective IT support of the business. For more on this topic, see my paper, “A Manager’s Guide to Successful Governance.”

While master data is a responsibility of the business, how IT implements and manages it will determine its success.

Establishing an effective master data solution requires planning, activating, and controlling steps.

How to Succeed with Master Data

Here are a few tips to help your team succeed.

Planning for Master Data

  1. Identify ways in which centralized master data will be used to improve the business and set business improvement targets. This is the first step toward developing a master data solution. Without this, master data will become a temporary IT technical fix.
  2. Establish the customer retention, product management, and strategic performance management business units and name the executive sponsor of each. This builds on the business focus by chartering the business units that will use master data for new capabilities and naming the senior executives who will be responsible for their implementation and results.
  3. Define the mission, goals, objectives, and measurements for the new business units. This formalizes each unit’s purpose and expected results and makes master data critical for meeting business objectives.
  4. Establish master data management governance, policies, and practices. This provides the specification for master data governance and management.
  5. Architect the master data environment. This establishes the technical foundation for the master data solution.

Activating Master Data

  1. Establish one master data repository for all master data. This requires consolidating and integrating data from applications and other systems to create a single repository of master data. An information architecture is essential for developing an effective master data solution.
  2. Ensure that master data aligns with the ‘real world.’ This is a different level of integrity than for a data warehouse, where data is expected to match its source system of record. Here, customer master data is expected to be true for the real-world customer. This is another reason why the business is responsible for master data.
  3. Ensure that the master data repository is the one, official source for master data that is always correct and up-to-date at any given instant on any given day. This is the most rigorous technical standard for master data. If this condition is not met, then the master data solution is a never-ending data integration effort which will require reconciliation and transaction correction processes to keep master data correct.
  4. Establish data delivery services for all master data. Making master data accessible to applications and systems that need it will require data delivery services for the master data repository. This is important for integrating master data into applications.
  5. Establish external sources of data as required. As noted above, external data will be required to maximize the value of your master data. Plan for it and include it from the beginning of your master data solution.

Controlling Master Data

  1. Monitor master data quality and correctness. The real world is likely to change faster than your master data, and external data sources can be used to confirm or correct aspects of it. In any case, monitoring is essential for the long-term usefulness of your master data.
  2. Ensure that IT applications integrate properly with master data. Unless applications adhere to the use of data delivery services for master data, and their ability to add, change, or delete master data values remains disabled, master data inconsistencies will creep back into the applications and systems environment and create uncertainties about the usefulness of your master data.
  3. Make adjustments as required to eliminate problems and improve effectiveness. New business uses for master data will develop, occasional problems will be found, new data will be required, and other demands will be made. Adjustments should be made as needed to keep your master data usable and useful.
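The second control step, checking that applications have not drifted from the repository, lends itself to a simple periodic comparison. The sketch below is a hedged illustration in Python: the `find_discrepancies` function, the record layout (a dictionary of attributes per master ID), and the sample customers are assumptions for this example, and a real audit would read from the actual repository and application data stores.

```python
def find_discrepancies(master, app_copy):
    """Compare an application's copy with the master repository.

    Both arguments map master_id -> dict of attribute values. Returns a list of
    (master_id, attribute, master_value, app_value) tuples in a stable order.
    """
    issues = []
    for master_id in sorted(master.keys() | app_copy.keys()):
        if master_id not in master:
            # The application holds a record that was never created in the repository.
            issues.append((master_id, "<record>", None, "not in master"))
            continue
        if master_id not in app_copy:
            continue  # the application simply does not use this record
        for attr, app_value in sorted(app_copy[master_id].items()):
            master_value = master[master_id].get(attr)
            if app_value != master_value:
                issues.append((master_id, attr, master_value, app_value))
    return issues


# Sample data is invented for illustration.
master = {
    "C-001": {"name": "Acme Limited", "country": "GB"},
    "C-002": {"name": "Globex", "country": "US"},
}
billing = {
    "C-001": {"name": "Acme Ltd", "country": "GB"},  # stale value in the application
    "C-009": {"name": "Initech", "country": "US"},   # added outside the repository
}

issues = find_discrepancies(master, billing)
for issue in issues:
    print(issue)
```

Only the attributes the application actually stores are compared, since different applications use different subsets of master data. Any discrepancy found is corrected from the repository, never the other way around.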

None of this is easy, as seen in the business and IT challenges discussed above, and master data solutions are just now coming into focus in many companies. Master data is a discipline and a journey, not simply a project or a task. It requires determining how to use master data for the benefit of the business, addressing data quality and integration issues, establishing decision processes for application integration with master data, and measuring effectiveness. Master data will continue to be an ongoing business element and, if these guidelines are followed, can be an asset to any business.
