Architecture and Data Blog

Thoughts about intersection of data, devops, design and software architecture

Automated testing of custom sql

Avoid silent failures

Lots of time hand coded SQL is written in the applications for performance sake, getting data using complex SQL out of the tables. As the database structure changes it becomes really hard to find out all the SQL that needs to be changed. In other cases SQL is generated based on certain conditions in the code and its hard to find if the generated SQL is valid after database changes are done.


Generate boiler plate code

Avoid hand coding boiler plate code

When accessing the database using stored procedures for basic Create, Read, Update and Delete (CRUD) functions, or when you want to write triggers that capture the before and after values in tables, or when you want to create Plain Old Java Objects (POJO’s) that match the database objects, writing these by hand takes a lot of effort and also since the requirements are changing in an agile projects at a frequent rate, there will be design changes to meet the requirement changes, so the triggers, CRUD stored procedures or Data Access Objects (DAO) are going to be out of data, instead of hand coding, its better to generate the code, using the metadata of the database


Data specialists should pair with developers

Model to share specialists on agile projects

Traditionally the data-team is used to sitting in their own area and working for many project teams by handling requests either via a ticketing system or vi email. The hand-over of work or throwing of work over the wall creates knowledge silos and inefficiencies.


Publish data models in CI Pipeline

Enable wider usage of data models by publishing the latest models on team Wikis

Many a times ER models are created by the data team and are not shared outside of the data team generally for the lack of tools licenses, since its not feasible for the entire team to purchase licenses for the ER modelling tools such as Erwin Data Modeller or Er Studio


Trusting search results from Google & Others

Should we trust the results shown by any search engine?

Search engines have been great to find information on the internet with their ease of use and the ability to find cross linked content with specific keywords. This ease of use is being exploited by scammers and imposters.


SSL Connection to AWS Aurora

Creating secure connection to the AWS aurora involves multiple steps

On a recent project we had to connect to AWS Aurora postgres 10.6 version of the database in SSL mode using JDBC and Java 11 JRE. When the Aurora cluster is setup, we can force all connections to use SSL by using the options group settings (forceSSL=true), establishing secure connection from the application to the database is not as easy as it looks.


Static Analysis of PL/SQL code

Applying modern development practices to database code

In all new development and sometimes during legacy codebase modernization, developers tend to add code quality checks and static analysis of codebase such as style checks, bug finders, cyclomatic complexity checking etc. into the CI/CD pipeline. When we inherit a codebase that has much PL/SQL and there is a desire to put the PL/SQL code base through the same types of code analysis, what options does a developer/dba have?


ID generation with ORM's Table, Sequence or Identity strategy

In this blog post, I will discuss ID generation techniques using the Object Relation Mapping frameworks such as Hibernate, Toplink, ActiveRecord, Entity Framework. When using hibernate or any other ORM mapping framework. There is a need to generate primary key values for the “id” columns. These values can be generated by using IDENTITY, SEQUENCE or TABLE strategies. Generating custom values for primary keys or other values is a topic for another blog post.


Using liquibase to load data and ignore some columns

Loading data into tables is needed many times on projects to load test, Liquibase provides a method to load data into tables with lots of customization. In the example shown below, I’m loading zip code data with the following column layout

"Zipcode","ZipCodeType","City","State","LocationType","Lat","Long","Location","Decommisioned","TaxReturnsFiled","EstimatedPopulation","TotalWages"

Two factor authentication to authorize credit"

Simple flow of credit file usage by Individual Citizens