IT Wisdom Misunderstood: Data Stores should be Flexible to All Possible Future Data

Today I am continuing my reminiscence over system design problems I have addressed in the past, this time looking specifically at a data-structure a colleague of mine used in a particular situation, and the consequences of that decision.  Let’s begin!

The scenario was that of running expensive credit-checks for loan applicants.  The ‘Risk’ team wanted easy access to the results, both for processing on-the-fly as part of the loan application decision process, and also for future analysis to see if they could identify markers for successful repayment.  Credit-checks like the ones my client used are expensive, and can return hundreds of varying fields in complex data structures and the general thinking about the process was something like this:

  • There are about 30 key fields we regularly retrieve (but not quite always) and we would like access to those fields to be easy;
  • We may need to add more important fields to this list of easy-access fields in future, so we don’t want to commit to a fixed structure – ‘that would be wasteful’;
  • We have to store the full result set for later analysis as XML, as we paid a lot for it and it may prove to have useful fields deep in the belly which we may need to access one day.

While I am writing this in 2017, the environment I am referring to was at a client from about 2008 – 2010, at a time when the client was using SQL Server 2005.  While later versions of SQL Server have introduced functionality that might have helped us here (such as Sparse Columns) they were not available to us then.

Continue reading

IT Wisdom Misunderstood: ‘SELECT *’ Is Bad!

Select Star is BadI was reminiscing over my last two main roles (which together account for 10 years!) and I reminded myself of some of the challenges that I faced in that time.  In two separate businesses I found coding styles and system-design scenarios that seemed to have been borne from some nugget of wisdom that seemed to me to have been miunderstood, and had led to code that had huge problems.  As ever, I do not want to pretend that the lessons I learnt from these scenarios can always be applied; I imagine that in-general my conclusion will be something like ‘your mileage may vary’… but I hope that I can at least demonstrate some harm came about from these misapprehension in the given scenario.

Today I want to discuss the idea that ‘SELECT *’ is bad; and what I really mean in this context is that ‘selecting all the data from a record in a database table is wasteful‘… with a possible additional proviso added ‘…if you don’t need it all‘. To try and put that into a wider context; the ideal some people have in mind is that we should only retrieve data from a database if we really need to use it.  It is probably also appropriate to mention that the environment I am referring to was one in which only very simple Object Relational Mapping (ORM) was in-use, meaning that selection of records was generally hand-written and mostly referred to a single table at a time, and not companies with huge volumes of database use.

I believe that this view leaves out several key factors; which will hopefully be obvious once described: Continue reading

The One Missing Feature

Sometimes the success or failure of a new system function can end up depending on one feature; and if that feature or function is missing the whole reason for the system existing can be undermined.

This system we are looking at today is one which can track vehicles, and one feature is the ability to plot journeys on a map in a web-browser; it’s actually very cool, but the plotting of the route was a little slow. Continue reading

– Make it consisent – Make it consisent – Make it consisent

We made a mistake recently, breaking one of our own rules; Be Consistent.  Now, of course it is not always possible to ‘be consistent’, sometimes because you are doing something truly new; but often because one incorrectly sees differences – when you may be better off seeing patterns and similarities (and thus implementing something to fit an existing pattern)!

Continue reading

Premature Simplification – Allowing UI Display Formats to Drive Data Storage Formats

What is Premature Simplification – other than ‘Allowing UI Display Formats to Drive Data Storage Formats’? It will be easiest to start with an example: A company receives records from many devices, and decides that the end-user of the system web site will never want to view detail at finer grain than 1 second… so they decide that all time formats should be stored without milliseconds (or ticks) – that is to say, timing data is rounded or truncated to seconds. Continue reading

Using Live Data as a Source of Test Databases

Many modern development tools are providing ways to create databases and populate them with test data; often with the idea that unit tests can then be run against them. But there is an alternative approach available to some people; which is to use live data as a source for our test environments.  Now, there may be reasons why this is not possible (not the least of which is ‘compliance’), and there are certainly issues of practicality that will need to be considered, but if you are allowed to do this there can be huge benefits.

Continue reading

Top 10 Ways to Write Easy-to-Use SQL

This post is about some top tips and tricks to use when writing and modifying SQL.  At first sight they may seem a little bit strange, but hopefully after explanation they will make sense. Let’s take a look at the first bit of SQL:

SELECT    SomeTableId
          , TeamName            -- (2)
          --, EmployeeId
          , EmployeeName
          --, BigXmlCol         -- (3)
FROM      dbo.SomeTable         -- (4)
WHERE     1 = 1                 -- (5)
AND       TeamName LIKE 'Blah%'
--AND       EmployeeName LIKE 'Nigel%'  -- (6)
ORDER     BY TeamName
--ORDER    BY EmployeeName

Continue reading