Re: cgen, "incremental database updates", etc.

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: cgen, "incremental database updates", etc.

jason dwyer
just reading through this thread now,

this is a pretty common issue in ongoing systems development
environments with db backends, not just for cayenne projects.

on my current project ( no cayenne unfortunately ) we have a side
project which manages the database structure ( and some content)
transition from version to version.

it basically uses ant , extending JDBCTask to do the work.

a basic xml file contains a series of statements, grouped into software
version elements, and the extended ant task simply digests that file and
executes each statement as needed, simply using raw JDBC Statements.

it requires one small table in the db containing the name of the patch,
and the software version it was patched into, and the task skips any
already applied revisions. as patches are applied, the details are added
to the table.

seems to work pretty well, and the project has undergone several hundred
db patches. only recently have we had to re-organise it so patches are
split out into branch-specific xml files ( one big xml chunk gets pretty
ugly to manage across >3 cvs branches...).

the dev team (6 members) keep a cvs watch on the xml files, and run the
ant task whenever theres an incoming change for the branch.

in the end, this strategy makes the changes to the db independent of the
app, and we've incorporated it into the deployment process.

works quite well, and you maintain complete control over the statements
being executed.

j



 

On Thu, 2005-07-21 at 08:05 -0500, Robert Zeigler wrote:

> Sounds like it's about the scope of what I'm looking at.
> Don't drop anything, just add missing stuff. I'm a bit leery
> about writing anything that auto-drops information. :)
> Thanks for the input.
>
> Robert
>
> J�rgen Saar wrote:
> > To avoid part of those problems, we decided to automate only
> > one kind of structur changes:
> >   Adding missing columns and tables.
> >
> > So after this process we can be sure that nothing in database structure
> > is missing. (There maybe unused structure elements)
> >
> > After this, possibly neccesary helper programs do their work
> > to convert data and add missing informations.
> > (This is started by hand and after a backup)
> >
> > Then there is a last option:
> > There is a helper programm that drops all columns and tables
> > that are not part of the actual repository.
> > This maybe very critical and is normaly done after piloting
> > but rarely on normal productive installations
> >
> >
> > I think the first part could sometime be done by cayenne
> >
> >
> > [hidden email] schrieb am 21.07.05 09:41:48:
> >
> >>Hi Robert,
> >>
> >>IMO generic full-featured database upgrade and versioning management  
> >>system is a problem with scope at least as big as Cayenne or maybe  
> >>even bigger. I think this is a good idea for a new open source (and  
> >>even commercial) product. I wonder what's already out there in this  
> >>area?
> >>
> >>I observed DB evolution on a number of projects, and automating db  
> >>synchronization is anything but simple. There are two groups of  
> >>problems to be solved - upgrading schema and migrating data to the  
> >>new version.
> >>
> >>Cayenne can be a starting point for the first problem - building  
> >>schema diffs and constructing SQL to incrementally change the schema.  
> >>DbLoader class can provide a DataMap view of the actual schema on the  
> >>fly, so you can build some merge algorithm based on that. But there  
> >>is a number of serious limitations. First is JDBC - drivers may not  
> >>provide the information about certain DB features. Second (something  
> >>that we can address to some point in the future) DbLoader ignores a  
> >>number of metadata parameters that are not relevant to Cayenne. Such  
> >>as the distinction between tables and views, indexes, triggers, etc.  
> >>Views and stored procedure schemas are reengineered, but their actual  
> >>implementations are treated as black box.
> >>
> >>Now the second part - upgrading the data... This is the worst part. I  
> >>don't think you can automate this at all. DBAs are charging their  
> >>fees for a reason. You can have giga or even terabytes of data and  
> >>you'll have to apply changes that are pretty much arbitrary (say you  
> >>normalized one table into multiple tables, or split data in one  
> >>column in two or more other columns, etc.)... DB-specific SQL that  
> >>performs such transformation is unavoidable.
> >>
> >>So if I am to approach writing such tool, I guess I'd go with the  
> >>general paradigm used in business software - if you can't completely  
> >>replace humans with a computer program, write a program that helps  
> >>human to do what they are already doing. In this case it probably  
> >>means management of a repository of version-specific SQL scripts and  
> >>some Cayenne-based intelligence to auto-detect version that user is  
> >>running and building the right script from repository. Something you  
> >>can manually do now with CVS... As for the version detection, I guess  
> >>the simplest thing would be a DB table that stores version explicitly.
> >>
> >>
> >>Andrus
> >>
> >>
> >>
> >>
> >>On Jul 21, 2005, at 2:30 AM, Robert Zeigler wrote:
> >>
> >>
> >>>Hi all,
> >>>
> >>>I'm looking into the possibility of writing some stuff to "gracefully"
> >>>handle upgrading "older" versions of a database to the current version
> >>>of a database.  Part of the core problem lies in not knowing where an
> >>>"old" database might be starting from... it could be a "fresh install"
> >>>scenario, where the db is starting from scratch.  It could also be an
> >>>update scenario from any of a number of previous versions of the
> >>>database.  I know the subject has come up briefly here before, and one
> >>>suggestion was to record changes in a sort of incremental file.
> >>>Good idea. :) But... there are a couple of... issues I have with that.
> >>>That would basically be recording raw-sql; one of the things I like
> >>>about cayenne is the ease with with different databases can be  
> >>>used. I'd
> >>>rather stay away from raw sql if possible...
> >>>So... I've started looking at what sort of api cayenne exposes for
> >>>examining the database-structure as it is supposed to be (ie, as  
> >>>defined
> >>>in the mapping file), and also the api, if any, for manipulating  
> >>>and/or
> >>>querying table meta-data.  I would appreciate any feedback on this
> >>>subject before I embark down a potentially long, dead-end road. :)  
> >>>(And
> >>>yes, I will be looking closely at the cgen ant task. :) Thanks in  
> >>>advance!!!
> >>
> >
> >
> > _________________________________________________________________________
> > Mit der Gruppen-SMS von WEB.DE FreeMail k�nnen Sie eine SMS an alle
> > Freunde gleichzeitig schicken: http://freemail.web.de/features/?mc=021179
> >
> >
> >

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: cgen, "incremental database updates", etc.

Robert Zeigler-2
Thanks for the info. In the end, I've decided to use an approach
along these lines; I'll definitely have a look at JDBCTask.

Robert

jason dwyer wrote:

> just reading through this thread now,
>
> this is a pretty common issue in ongoing systems development
> environments with db backends, not just for cayenne projects.
>
> on my current project ( no cayenne unfortunately ) we have a side
> project which manages the database structure ( and some content)
> transition from version to version.
>
> it basically uses ant , extending JDBCTask to do the work.
>
> a basic xml file contains a series of statements, grouped into software
> version elements, and the extended ant task simply digests that file and
> executes each statement as needed, simply using raw JDBC Statements.
>
> it requires one small table in the db containing the name of the patch,
> and the software version it was patched into, and the task skips any
> already applied revisions. as patches are applied, the details are added
> to the table.
>
> seems to work pretty well, and the project has undergone several hundred
> db patches. only recently have we had to re-organise it so patches are
> split out into branch-specific xml files ( one big xml chunk gets pretty
> ugly to manage across >3 cvs branches...).
>
> the dev team (6 members) keep a cvs watch on the xml files, and run the
> ant task whenever theres an incoming change for the branch.
>
> in the end, this strategy makes the changes to the db independent of the
> app, and we've incorporated it into the deployment process.
>
> works quite well, and you maintain complete control over the statements
> being executed.
>
> j
>
>
>
>  
>
> On Thu, 2005-07-21 at 08:05 -0500, Robert Zeigler wrote:
>
>>Sounds like it's about the scope of what I'm looking at.
>>Don't drop anything, just add missing stuff. I'm a bit leery
>>about writing anything that auto-drops information. :)
>>Thanks for the input.
>>
>>Robert
>>
>>J�rgen Saar wrote:
>>
>>>To avoid part of those problems, we decided to automate only
>>>one kind of structur changes:
>>>  Adding missing columns and tables.
>>>
>>>So after this process we can be sure that nothing in database structure
>>>is missing. (There maybe unused structure elements)
>>>
>>>After this, possibly neccesary helper programs do their work
>>>to convert data and add missing informations.
>>>(This is started by hand and after a backup)
>>>
>>>Then there is a last option:
>>>There is a helper programm that drops all columns and tables
>>>that are not part of the actual repository.
>>>This maybe very critical and is normaly done after piloting
>>>but rarely on normal productive installations
>>>
>>>
>>>I think the first part could sometime be done by cayenne
>>>
>>>
>>>[hidden email] schrieb am 21.07.05 09:41:48:
>>>
>>>
>>>>Hi Robert,
>>>>
>>>>IMO generic full-featured database upgrade and versioning management  
>>>>system is a problem with scope at least as big as Cayenne or maybe  
>>>>even bigger. I think this is a good idea for a new open source (and  
>>>>even commercial) product. I wonder what's already out there in this  
>>>>area?
>>>>
>>>>I observed DB evolution on a number of projects, and automating db  
>>>>synchronization is anything but simple. There are two groups of  
>>>>problems to be solved - upgrading schema and migrating data to the  
>>>>new version.
>>>>
>>>>Cayenne can be a starting point for the first problem - building  
>>>>schema diffs and constructing SQL to incrementally change the schema.  
>>>>DbLoader class can provide a DataMap view of the actual schema on the  
>>>>fly, so you can build some merge algorithm based on that. But there  
>>>>is a number of serious limitations. First is JDBC - drivers may not  
>>>>provide the information about certain DB features. Second (something  
>>>>that we can address to some point in the future) DbLoader ignores a  
>>>>number of metadata parameters that are not relevant to Cayenne. Such  
>>>>as the distinction between tables and views, indexes, triggers, etc.  
>>>>Views and stored procedure schemas are reengineered, but their actual  
>>>>implementations are treated as black box.
>>>>
>>>>Now the second part - upgrading the data... This is the worst part. I  
>>>>don't think you can automate this at all. DBAs are charging their  
>>>>fees for a reason. You can have giga or even terabytes of data and  
>>>>you'll have to apply changes that are pretty much arbitrary (say you  
>>>>normalized one table into multiple tables, or split data in one  
>>>>column in two or more other columns, etc.)... DB-specific SQL that  
>>>>performs such transformation is unavoidable.
>>>>
>>>>So if I am to approach writing such tool, I guess I'd go with the  
>>>>general paradigm used in business software - if you can't completely  
>>>>replace humans with a computer program, write a program that helps  
>>>>human to do what they are already doing. In this case it probably  
>>>>means management of a repository of version-specific SQL scripts and  
>>>>some Cayenne-based intelligence to auto-detect version that user is  
>>>>running and building the right script from repository. Something you  
>>>>can manually do now with CVS... As for the version detection, I guess  
>>>>the simplest thing would be a DB table that stores version explicitly.
>>>>
>>>>
>>>>Andrus
>>>>
>>>>
>>>>
>>>>
>>>>On Jul 21, 2005, at 2:30 AM, Robert Zeigler wrote:
>>>>
>>>>
>>>>
>>>>>Hi all,
>>>>>
>>>>>I'm looking into the possibility of writing some stuff to "gracefully"
>>>>>handle upgrading "older" versions of a database to the current version
>>>>>of a database.  Part of the core problem lies in not knowing where an
>>>>>"old" database might be starting from... it could be a "fresh install"
>>>>>scenario, where the db is starting from scratch.  It could also be an
>>>>>update scenario from any of a number of previous versions of the
>>>>>database.  I know the subject has come up briefly here before, and one
>>>>>suggestion was to record changes in a sort of incremental file.
>>>>>Good idea. :) But... there are a couple of... issues I have with that.
>>>>>That would basically be recording raw-sql; one of the things I like
>>>>>about cayenne is the ease with with different databases can be  
>>>>>used. I'd
>>>>>rather stay away from raw sql if possible...
>>>>>So... I've started looking at what sort of api cayenne exposes for
>>>>>examining the database-structure as it is supposed to be (ie, as  
>>>>>defined
>>>>>in the mapping file), and also the api, if any, for manipulating  
>>>>>and/or
>>>>>querying table meta-data.  I would appreciate any feedback on this
>>>>>subject before I embark down a potentially long, dead-end road. :)  
>>>>>(And
>>>>>yes, I will be looking closely at the cgen ant task. :) Thanks in  
>>>>>advance!!!
>>>>
>>>
>>>_________________________________________________________________________
>>>Mit der Gruppen-SMS von WEB.DE FreeMail k�nnen Sie eine SMS an alle
>>>Freunde gleichzeitig schicken: http://freemail.web.de/features/?mc=021179
>>>
>>>
>>>
>
>

Loading...