|
Return to Newsletter Contents...
Database Change Management
by: Sean Beal, Sr. Consultant, CRCP
Whether you are developing a web application with a
database back end, an Operational Data Store, or a full blown Data Warehouse,
managing the evolution of the development and production database schemas
presents a serious challenge.
In our line of work, nearly everything we do involves a
database. We need to knock out those
five Crystal Reports for Client ‘A’ - point me to the database.
Client ‘B’ has requested some changes to that HR web application - I need
to modify the database. The work
load for that data warehouse for Client ‘C’ is enough for 3 developers - they
need to modify the database. All of
these scenarios have common elements:
a database schema is established or defined; it then changes over time.
How do we track these changes and synchronize them between multiple
developers? How do we propagate
these changes to the production system?
How do we roll back a change?
In other areas of development, we already have the answer:
a version control system. At its
most basic, version control provides protected management of the text files (the
code) related to a project, automatically merges changes to the same file by
different developers, tracks the history of these changes, and provides
a mechanism to revert a file to any given revision, or point in time.
Why not just use a version control system for the database
schema? You can…and you can’t.
The problem lies with the nature of the changes themselves.
Changes to the ‘code’ of a project are usually just changes to a text
file. When we update or get the
latest version of the file, we don’t really care about what steps the developer
took to arrive at the new code…we just want the new code!
I stated earlier that ‘you can’ use version control because there are
certain areas of database schema change that are much like this: stored
procedures and views, for instance.
With these objects, it is probable that we need only the latest version of the
code or text.
The Problem
This is not true, however, when we consider something like
a database table. When tracking
changes to a database table, we not only need to know what the new table schema
is (including columns, data types, indexes, etc.) but we need to know the method
of transition.
Remember, the table consists not only of its definition or schema,
but of its data as well.
Consider this simple example:
Your HR database contains a table called ‘employee’.
This table contains a record for each employee in your organization.
The original schema looked like this:

Employees were originally defined as active if they had a
hire date but no termination date.
This logic was beginning to be propagated throughout many reports, views, and
stored procedures, which could potentially cause a problem should this
definition ever change. It was
decided then that each employee record needed an active flag. The new schema
looks like this:

In this example, we have two states…the original schema and
the new schema. The transition in
this example is made up of the changes applied to the table to arrive at the new
state without losing our data. The
combination might look like this:
-
STATE 1 – Original Schema
o
add the ‘is_active’ column to the table
o
update the’ is_active’ column via business logic
computation
-
STATE 2 – New Schema
You can see that in order to reproduce this modification we
need not only the beginning and ending state, but also the transitional steps in
between. This is true whether we are
propagating the change to multiple development environments or pushing it out to
production.
The Solution?
We can ‘solve’ the problem in different ways depending on
the requirements of the project:
Handle it manually
Your project or situation may not be large enough or
complex enough to warrant the effort involved in tracking this kind of change.
Need a change to the development database?
Make the change and forget it.
You can point all the developers at a single development database so
everyone is always working on the current version.
Deploy to production as soon as a change is approved so the steps aren’t
forgotten, and be prepared every now and then to spend some time tracking down
issues brought about by changes you weren’t aware of.
Pros: low
investment in time and resources, flexible
Cons: no change history, requires greater communication and
ongoing management of developers, deployment to production is more difficult,
difficult to scale
Use (or build) a Patch Management system
Every change to the database schema and data is tracked as
a patch or migration. These are
scripted and their application to database is automated.
Each change or patch is stored in the version control system so that they
are available to all developers; a developer needs only to check out the latest
patches and run the automated update before beginning new work in order to
ensure that his or her environment is up to date.
When work is completed, tested, and approved, it can be easily rolled out
to production via the same automation methodology.
Pros:
history of changes preserved, makes deployment very easy, scales well
Cons: increased initial investment in time via communicating
methods and learning new tools or scripting languages
Conclusion
Change control in any sense is challenging, but can be
especially so for database development.
There is no one answer, and this article is meant only to provide a high
level look at the issue. I’d be very
interested in hearing about how you handle this in your environment.
Drop me a line at
sean.beal@resultdata.com. I’ll
share your feedback in an upcoming article.
Go to Top |
Return to Newsletter Contents
|