SQL version control methodology

There are several questions on SO about version control for SQL and lots of resources on the web, but I can’t find something that quite covers what I’m trying to do.

First off, I’m talking about a methodology here. I’m familiar with the various source control applications out there and I’m familiar with tools like Red Gate’s SQL Compare, etc. and I know how to write an application to check things in and out of my source control system automatically. If there is a tool which would be particularly helpful in providing a whole new methodology or which have a useful and uncommon functionality then great, but for the tasks mentioned above I’m already set.

  • Git - ClearCase methodology
  • Is it a good idea to put db/schema.rb to .gitignore list ??
  • Can Git handles this use cases?
  • Git stopped working in Xcode 4 after migration
  • How to migrate gitorious to gitlab repositories
  • Clearcase to Git migration issues
  • The requirements that I’m trying to meet are:

    • The database schema and look-up table data are versioned
    • DML scripts for data fixes to larger tables are versioned
    • A server can be promoted from version N to version N + X where X may not always be 1
    • Code isn’t duplicated within the version control system – for example, if I add a column to a table I don’t want to have to make sure that the change is in both a create script and an alter script
    • The system needs to support multiple clients who are at various versions for the application (trying to get them all up to within 1 or 2 releases, but not there yet)

    Some organizations keep incremental change scripts in their version control and to get from version N to N + 3 you would have to run scripts for N->N+1 then N+1->N+2 then N+2->N+3. Some of these scripts can be repetitive (for example, a column is added but then later it is altered to change the data type). We’re trying to avoid that repetitiveness since some of the client DBs can be very large, so these changes might take longer than necessary.

    Some organizations will simply keep a full database build script at each version level then use a tool like SQL Compare to bring a database up to one of those versions. The problem here is that intermixing DML scripts can be a problem. Imagine a scenario where I add a column, use a DML script to fill said column, then in a later version that column name is changed.

    Perhaps there is some hybrid solution? Maybe I’m just asking for too much? Any ideas or suggestions would be greatly appreciated though.

    If the moderators think that this would be more appropriate as a community wiki, please let me know.

    Thanks!

  • Preserving only file additions from a series of Git commits?
  • git-svn: re-link git-svn cloned repo to svn
  • Getting “fatal: This operation must be run in a work tree?” on bare repository
  • Version control: merging changes between dev and live sites
  • Do you keep your build tools in version control?
  • Why does re-merging result in a repeated merge conflict in git?
  • 6 Solutions collect form web for “SQL version control methodology”

    I struggled with this for several years before recently adopting a strategy that seems to work pretty well. Key points I live by:

    • The database doesn’t need to be independently versioned from the app
    • All database update scripts should be idempotent

    As a result, I no longer create any kind of version tables. I simply add changes to a numbered sequence of .sql files that can be applied at any given time without corrupting the database. If it makes things easier, I’ll write a simple installer screen for the app to allow administrators to run these scripts whenever they like.

    Of course, this method does impose a few requirements on the database design:

    • All schema changes are done through script – no GUI work.
    • Extra care must be taken to ensure all keys, constraints, etc.. are named so they can be referenced by a later update script, if necessary.
    • All update scripts should check for existing conditions.

    Examples from a recent project:

    001.sql:

    if object_id(N'dbo.Registrations') is null 
    begin
        create table dbo.Registrations
        (
            [Id]                    uniqueidentifier not null,
            [SourceA]               nvarchar(50)     null,
            [SourceB]               nvarchar(50)     null,
            [Title]                 nvarchar(50)     not null,
            [Occupation]            nvarchar(50)     not null,
            [EmailAddress]          nvarchar(100)    not null,
            [FirstName]             nvarchar(50)     not null,
            [LastName]              nvarchar(50)     not null,
            [ClinicName]            nvarchar(200)    not null,
            [ClinicAddress]         nvarchar(50)     not null,
            [ClinicCity]            nvarchar(50)     not null,
            [ClinicState]           nchar(2)         not null,
            [ClinicPostal]          nvarchar(10)     not null,
            [ClinicPhoneNumber]     nvarchar(10)     not null,
            [ClinicPhoneExtension]  nvarchar(10)     not null,
            [ClinicFaxNumber]       nvarchar(10)     not null,
            [NumberOfVets]          int              not null,  
            [IpAddress]             nvarchar(20)     not null,
            [MailOptIn]             bit              not null,
            [EmailOptIn]            bit              not null,
            [Created]               datetime         not null,
            [Modified]              datetime         not null,
            [Deleted]               datetime         null
        );
    end
    
    if not exists(select 1 from information_schema.table_constraints where constraint_name = 'pk_registrations')
        alter table dbo.Registrations add
            constraint pk_registrations primary key nonclustered (Id);
    
    if not exists (select 1 from sysindexes where [name] = 'ix_registrations_created')
        create clustered index ix_registrations_created
            on dbo.Registrations(Created);
    
    if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
        create index ix_registrations_email
            on dbo.Registrations(EmailAddress);
    
    if not exists (select 1 from sysindexes where [name] = 'ix_registrations_email')
        create index ix_registrations_name_and_clinic
            on dbo.Registrations (FirstName,
                                  LastName,
                                  ClinicName);
    

    002.sql

    /**********************************************************************
      The original schema allowed null for these columns, but we don't want
      that, so update existing nulls and change the columns to disallow 
      null values
     *********************************************************************/
    
    update dbo.Registrations set SourceA = '' where SourceA is null;
    update dbo.Registrations set SourceB = '' where SourceB is null;
    alter table dbo.Registrations alter column SourceA nvarchar(50) not null;
    alter table dbo.Registrations alter column SourceB nvarchar(50) not null;
    
    /**********************************************************************
      The client wanted to modify the signup form to include a fax opt-in
     *********************************************************************/
    
    if not exists 
    (
        select 1 
          from information_schema.columns
         where table_schema = 'dbo'
           and table_name   = 'Registrations'
           and column_name  = 'FaxOptIn'
    )
    alter table dbo.Registrations 
        add FaxOptIn bit null 
            constraint df_registrations_faxoptin default 0;
    

    003.sql, 004.sql, etc…

    At any given time I can run the entire series of scripts against the database in any state and know that things will be immediately brought up to speed with the current version of the app. Because everything is scripted, it’s much easier to build a simple installer to do this, and it’s adding the schema changes to source control is no problem at all.

    You’ve got quite a rigorous set of requirements, I’m not sure whether you’ll find something that puts checks in all the boxes, especially the multiple concurrent schemas and the intelligent version control.

    The most promising tool that I’ve read about that kind of fits is Liquibase.
    Here are some additional links:

    Yes, you’re asking for a lot, but they’re all really pertinent points! Here at Red Gate we’re moving towards a complete database development solution with our SQL Source Control SSMS extension and we’re facing similar challenges.

    http://www.red-gate.com/products/SQL_Source_Control/index.htm

    For the upcoming release we’re fully supporting schema changes, and supporting static data indirectly via our SQL Data Compare tool. All changes are saved as creation scripts, although when you’re updating or deploying to a database, the tool will ensure that the changes are applied appropriately as an ALTER or CREATE.

    The most challenging requirement that doesn’t yet have a simple solution is version management and deployment, which you describe very clearly. If you make complex changes to the schema and data, it may be inevitable that a handcrafted migration script is constructed to get between two adjacent versions, as not all of the ‘intent’ is always saved alongside a newer version. Column renames are a prime example. The solution could be for a system to be devised that saves the intent, or if this is too complex, allows the user to supply a custom script to perform the complex change. Some sort of version management framework would manage these and “magically” construct deployment scripts from two arbitrary versions.

    for this kind of issue use Visual studio team system 2008 for version controlling of your sql database.

    In tsf there are no. of feature avialbe like

    • Datacompare
    • Schemacompare
    • version controlling

    about database version control : http://www.codinghorror.com/blog/2006/12/is-your-database-under-version-control.html
    for more detail check : http://msdn.microsoft.com/en-us/library/ms364062(VS.80).aspx

    We are using SQL Examiner for keeping database schema under version control. I’ve tried the VS2010 also, but in my opinion VS approach is too complex for small and mid-size projects. With SQL Examiner I mostly work with SSMS and use SQL Examiner to check-in updates to SVN (TFS and SourceSafe is supported also, but I never tried it).

    Here is description of SQL Examiner’s approach: How to get your database under version control

    Try DBSourceTools. (http://dbsourcetools.codeplex.com)
    Its open source, and specifically designed to script an entire database – tables, views, procs to disk, and then re-create that database through a deployment target.
    You can script all data, or just specify which tables to script data for.
    Additionally, you can zip up the results for distribution.
    We use it for source control of databases, and to test update patches for new releases.
    In the back-end it’s built around SMO, and thus supports SQL 2000, 2005 and 2008.
    DBDiff is integrated, to allow for schema comparisons.
    Have fun,
    – Nathan.

    Git Baby is a git and github fan, let's start git clone.