|
Return to Newsletter Contents...
Reference Data Management: with Excel?
by: Sean Beal, Senior Consultant,
CRCP
Most reporting environments are a challenging mix of
ever-changing requirements, moving targets, and ad hoc requests.
It can be difficult to manage these challenges and respond in a way that
satisfies the needs of the end user and maintains the integrity of your
company’s data. A particularly
sticky area is that of reference data management.
The Set Up
What is ‘reference data’?
Well, that depends on who you ask!
Often it means that kind of data that is sometimes referred to as ‘lookup
data’ or ‘domain data’. This is the
data that categorizes other data. Some common examples of this might include
country, state, and county codes, gender codes, or transaction codes.
When these examples are implemented in an ODS (Operational Data Store) or
Data Warehouse, they typically have a common structure that includes a single
primary key and a description field.
They also typically have a slow rate of change.
However, there is also that type of reference data that
describes the manner in which a business (or some portion of the business) is
managed or conducted. Examples of
this could include state tax rate tables used when generating payroll, or
commission calculation matrixes. These
examples don’t have a common structure, and can change rapidly, making them
difficult to manage.
So how can we handle this rapidly changing, variably
structured reference data? One
common way would be to purchase or develop custom software to allow the owners
of the data to enter and update the data as they please.
This data could then be extracted and integrated into the ODS or Data
Warehouse. In a perfect world (one
with unlimited budgets and fixed requirements), this would work…well, perfectly.
Even in the very imperfect real world,
we might be able whip up a web front-end in order to let users manage back-end
data like this; however, these types of projects typically suffer from a lot of
scope creep as additional features are requested and new types of reference data
are needed. We then end up bolting
on interface after interface, until our simple web front-end becomes a tangled
beast. Perhaps the budget simply
doesn’t allow for software development or purchase.
What can we do?
Make it work…
There already exists, on nearly every end-user’s computer,
a flexible front-end interface for managing data:
that’s right, Microsoft Excel: that labyrinth of formulas referencing
cells referencing tabs, the bane of ‘single source of the truth’, that scourge
of Crystal Report designers everywhere.
It can be put to good use; you simply need convention and a good ETL tool
like SQL Server Integration Services (SSIS) or Data Integrator.
Let’s consider one of the examples mentioned above, that of
commission calculation data. A
commission plan’s purpose is usually to drive sales or motivate performance
while advancing the business goals of the company.
As the methods for this change, the commission plan evolves as well.
This could happen from one year to the next, one quarter to the next, or
even month to month. We need to put
a plan in place for gathering this data, integrating it into our ODS or Data
Warehouse, and using it to generate reports for the Payroll department.
For the purposes of this article, we’ll assume what is usually true:
there is little time and even less budget with which to accomplish our
goals. What we do have is
Excel and SSIS.
As always, our first steps are to define and design.
We’ll need to know what output is required, what needs stored and for how
long, and with what frequency we’ll be aggregating or manipulating the data.
The answers to these questions will allow us to build the back-end
structures and processes; the machinery into which our reference data gets fed.
These back-end structures are not tied to our reference data management
solution, and would likely look the same no matter what we used.
Now we can leverage Excel, convention, and a good ETL tool.
By creating an Excel template for data entry and modification, and
defining a known storage location for these templates (convention), we can use
an SSIS package to loop through any existing files, load the data into our
target structures, and invoke our back-end processes.
The process might look something like this:
1. Owners of the commission data create, update, or
delete information in the Excel template.
The Excel file or files are then stored in a pre-determined network
location.
2.
ETL processes (SSIS, Data Integrator) use the
Excel templates as a data source to feed the back-end structures and processes
that calculate the commission amounts based on a pre-determined schedule.
3.
Calculation results are delivered to the Payroll
department for further evaluation and use.
Make it work better…
Of course, there are some risks here.
Users may enter data incorrectly in the spreadsheet template, or store
the template in the wrong location altogether.
There can be difficulties with managing primary keys.
In general, there is less control of the process than a purchased or
custom-developed entry/management system might provide.
On the other hand, we can quickly and easily build the
scaffolding to support our business process, reducing the time to deployment and
allowing the primary objectives to be met.
We can then come back at a later date, when the requirements have
solidified and the system has settled (and maybe even when the budget has been
renewed), and ‘make it work better’ by implementing custom or purchased
software.
Go to Top |
Return to Newsletter Contents
|