Architecture and Data Blog

Thoughts about the intersection of data, devops, design and software architecture

Data specialists should pair with developers

Model to share specialists on agile projects

Traditionally, the data team sits in its own area and works for many project teams, handling requests either via a ticketing system or via email. Handing work over, or throwing it over the wall, creates knowledge silos and inefficiencies.


Automatically adding columns to Rails migrations

Allowing for common audit columns to be added to all Rails migrations

Many projects need identical columns added to all the tables created by the project. Audit columns are an example of such a requirement: columns such as created_by, created_date, modified_by and modified_date must be added to every table. These columns store who created the row, when the row was created, who last modified the row and when it was modified. created_by and created_date must be present when the row is inserted, so they must be not nullable. Adding these columns to each and every table is a lot of work for developers.
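The idea of adding the audit columns in one place, instead of in every table definition, can be sketched outside of Rails as well. The following is a minimal Python sketch using the standard sqlite3 module, not the Rails mechanism itself; the helper names create_table_sql and column_names, the customer table and the column types are assumptions made for illustration.

```python
import sqlite3

# Audit columns appended to every table. The names come from the text above;
# the TEXT types are an assumption made for this sketch.
AUDIT_COLUMNS = [
    "created_by TEXT NOT NULL",
    "created_date TEXT NOT NULL",
    "modified_by TEXT",
    "modified_date TEXT",
]


def create_table_sql(table_name, column_defs):
    """Build a CREATE TABLE statement with the audit columns appended."""
    all_columns = list(column_defs) + AUDIT_COLUMNS
    return "CREATE TABLE {} ({})".format(table_name, ", ".join(all_columns))


def column_names(conn, table_name):
    """Return the column names of a table, in definition order."""
    rows = conn.execute("PRAGMA table_info({})".format(table_name)).fetchall()
    return [row[1] for row in rows]


# Hypothetical table: the developer lists only the business columns.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("customer", ["id INTEGER PRIMARY KEY", "name TEXT"]))
```

Developers then declare only the business columns, and every table created through the helper gets the four audit columns, with created_by and created_date not nullable.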


Database naming conventions in different environments

Allowing for change in environment configuration

Every enterprise and every project ends up with multiple environments. The database side of the enterprise in particular tends to stick around longer and has many more dependencies and application integrations than, say, application URLs. Given this, how to name the servers, databases and schemas becomes a very important decision: do the names make the application easy to use, without making it harder for the developers to access the database?


Setup and Teardown of database during testing

When doing performance testing or running unit/functional tests on a database, there is a need to periodically get the database to a known state, so that the tests behave in a predictable way and all the data created by the tests is removed. Here are some of the ways to get a clean database. Using scripts: recreate the database using the same scripts that are used in the development environment.
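The script-based approach can be sketched as follows. This is a minimal Python sketch against an in-memory SQLite database; the schema script, the customer table and the function names reset_database and customer_count are assumptions made for illustration, standing in for whatever scripts the development environment uses.

```python
import sqlite3

# Hypothetical schema script. In practice this would be the same script
# that builds the database in the development environment.
SCHEMA_SCRIPT = """
DROP TABLE IF EXISTS customer;
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO customer (id, name) VALUES (1, 'baseline');
"""


def reset_database(conn):
    """Drop and recreate everything so each test starts from a known state."""
    conn.executescript(SCHEMA_SCRIPT)


def customer_count(conn):
    """Count the rows currently in the customer table."""
    return conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]


conn = sqlite3.connect(":memory:")
reset_database(conn)  # setup: known state with one baseline row
conn.execute("INSERT INTO customer (name) VALUES ('created by a test')")
rows_during_test = customer_count(conn)
reset_database(conn)  # teardown: data created by the test is gone
rows_after_reset = customer_count(conn)
```

Calling the same reset function before and after each test run means a failed or interrupted test cannot leave data behind for the next one.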

Experience using DBDeploy on my project

We have been using DBDeploy on my project for more than 6 months now and wanted to show how things are going. First, let's talk about setup: we are using dbdeploy in our Java development environment with ANT as our build scripting tool, against an Oracle 10g database. Define the ANT task first: <taskdef name="dbdeploy" classname="net.sf.dbdeploy.AntTarget" classpath="lib/dbdeploy.jar"/>. Now we create the main dbinitialize task, an ANT task to create your database schema using the upgrade file generated by dbdeploy, shown below.

Why do Evolutionary Design

What are the motivations for doing evolutionary design?

Why do Evolutionary Design or Iterative Design or Incremental Design? Everyone who has not worked in an evolutionary manner asks this. My answer: if you think the system you designed is NOT GOING TO CHANGE EVER, then sure, you can design once, deploy once, and you are done; move on to the next project. But tell me one project you have been on that did not have any changes in requirements, technology, look and feel and so on after it was deployed.

When does evolutionary design happen?

In what part of the product, project, iteration, story or day does evolutionary design happen?

This is a question I get often, mostly in relation to evolutionary database design and development. When the pair (team) gets a new feature (story) to work on, the team looks at the existing table/database design and sees if the current design is enough to implement the feature they are working on. If the current database design does support the feature they are trying to implement, then they do not have to change the database at all; they move on to implement the feature and change the application code as necessary.

Parsing mapping files for usage information

I am currently working on a legacy application that has been in production for a long time now. We could see that some table columns were not being used, so I wanted to find out which tables and columns the application actually uses. We are using an Object Relational mapping framework on the project, so we decided to write some code that would parse all the mapping files and give us a list of table names and columns.
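A minimal sketch of such a parser is shown below, in Python. The post does not name the mapping framework, so the sample assumes a Hibernate-style XML mapping with table and column attributes; the sample mapping, the class and table names and the function tables_and_columns are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical Hibernate-style mapping, used only to illustrate the idea.
# Real mapping files would be read from disk instead.
SAMPLE_MAPPING = """
<hibernate-mapping>
  <class name="Customer" table="CUSTOMER">
    <id name="id" column="CUSTOMER_ID"/>
    <property name="name" column="NAME"/>
    <property name="email" column="EMAIL"/>
  </class>
  <class name="Charge" table="CHARGE">
    <id name="id" column="CHARGE_ID"/>
    <property name="amount" column="AMOUNT"/>
  </class>
</hibernate-mapping>
"""


def tables_and_columns(mapping_xml):
    """Return {table name: [column names]} for every mapped class."""
    root = ET.fromstring(mapping_xml.strip())
    usage = {}
    for mapped_class in root.iter("class"):
        columns = usage.setdefault(mapped_class.get("table"), [])
        # Any element under the class that carries a column attribute
        # counts as a column the application uses.
        for element in mapped_class.iter():
            column = element.get("column")
            if column and column not in columns:
                columns.append(column)
    return usage


usage = tables_and_columns(SAMPLE_MAPPING)
```

Comparing the resulting dictionary with the columns in the database catalog gives the list of columns that no mapping refers to.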

Long Running Data Migrations during Database Refactorings

How to manage long running data migrations

When you are refactoring large databases, you will have certain tables that have millions of rows. Let's say we are doing the Move Column refactoring, moving the TaxAmount column from the Charge table, which has millions of rows, to the TaxCharge table. First, create the TaxAmount column in the TaxCharge table. Then move the data from the TaxAmount column in the Charge table to the TaxAmount column you created in the TaxCharge table.
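The two steps can be sketched as follows, in Python against an in-memory SQLite database. Copying the data in small committed batches is one possible way to keep each transaction short, and is an assumption of this sketch rather than something the text prescribes. The key columns ChargeId and TaxChargeId, the link between the two tables and the function move_tax_amount are also hypothetical.

```python
import sqlite3


def move_tax_amount(conn, batch_size):
    """Copy TaxAmount from Charge to TaxCharge in small committed batches."""
    # Step 1: create the new column on the target table.
    conn.execute("ALTER TABLE TaxCharge ADD COLUMN TaxAmount REAL")
    conn.commit()

    # Step 2: walk TaxCharge in primary key order, one batch at a time.
    last_id = 0
    batches = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT TaxChargeId FROM TaxCharge WHERE TaxChargeId > ? "
            "ORDER BY TaxChargeId LIMIT ?", (last_id, batch_size))]
        if not ids:
            return batches
        conn.execute(
            "UPDATE TaxCharge SET TaxAmount = "
            "(SELECT c.TaxAmount FROM Charge c WHERE c.ChargeId = TaxCharge.ChargeId) "
            "WHERE TaxChargeId BETWEEN ? AND ?", (ids[0], ids[-1]))
        conn.commit()  # short transactions instead of one huge one
        last_id = ids[-1]
        batches += 1


# Hypothetical schema and data: ten charges, each with one tax charge row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Charge (ChargeId INTEGER PRIMARY KEY, TaxAmount REAL);
CREATE TABLE TaxCharge (TaxChargeId INTEGER PRIMARY KEY,
                        ChargeId INTEGER REFERENCES Charge(ChargeId));
""")
conn.executemany("INSERT INTO Charge VALUES (?, ?)", [(i, i * 1.5) for i in range(1, 11)])
conn.executemany("INSERT INTO TaxCharge VALUES (?, ?)", [(i, i) for i in range(1, 11)])
conn.commit()
batches_run = move_tax_amount(conn, batch_size=4)
```

Because each batch is committed on its own, the migration can be stopped and resumed from the last processed key, and locks are held only for the duration of one batch.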

Moscow

How to use stored procedures as interface to the data

Last week I was at SD Best Practices in Moscow, doing a presentation on "Refactoring Databases: Evolutionary Database Design". Moscow seems like an interesting place, with loads of huge buildings, squares, fountains and roads. Things somehow feel rundown, like a player trying to regain his former ability or glory. The opening keynote by Jim McCarthy, about how teams should operate, was interesting. He proposed 11 principles, or protocols as he calls them, to be followed by the members of a team so that the team becomes more productive. Many of these protocols are about avoiding waste and promoting clear communication channels.