The Dangers of Duplicate Data

I received an email from Amazon today explaining that delivery of my parcel was delayed. I think it demonstrates the dangers of duplicate data:

29 September 2016 is actually a Thursday!

The duplication I am thinking of is that, in addition to providing the date (29/09/2016), they have also chosen to provide a refinement: ‘Friday’.

The problem here is that the date is actually a Thursday!

No doubt someone thought that, while providing the date was specific, some people might like to see a day of the week as well; I guess we all have days when we know the day but not the date! But providing duplicate interpretations of the same data is only helpful if they are correct!

I’d be fascinated to know how they came to get this wrong in this particular email!

In this simple case, the bigger problem is that of extracting the day of the week from the date. But you may see real-world examples not dissimilar to this in other settings. In addition to ensuring that the first conversion from one format to another is correct, if duplicate forms of the same data are persisted in a database you need to be really certain that any modification to one field correctly updates the ‘duplications’ in the other related fields.
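
One way to avoid the problem entirely is to store only the date and derive the day name whenever it is needed. The following is a minimal Python sketch of that idea; the `Delivery` class and its field names are hypothetical and purely for illustration, not anything taken from Amazon's systems.

```python
from dataclasses import dataclass
from datetime import date

# English day names indexed by date.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


@dataclass
class Delivery:
    """Hypothetical record: only the date is stored, never the day name."""
    expected_on: date

    @property
    def expected_day(self) -> str:
        # Derived on demand, so it cannot drift out of step with expected_on.
        return DAY_NAMES[self.expected_on.weekday()]


delivery = Delivery(expected_on=date(2016, 9, 29))
print(delivery.expected_day)  # Thursday

# Changing the date needs no second update: the day name simply follows.
delivery.expected_on = date(2016, 9, 30)
print(delivery.expected_day)  # Friday
```

Because there is only one stored value, there is nothing to keep in sync. If the derived value really must be persisted (for reporting, say), the same function should be the only code path that writes it.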

 

Another thing you need to be sure of when providing your customers with information is that it is correct. A few minutes after receiving the email above, I got another email telling me that the parcel had been delivered successfully!

Software Inventory – Joel on Software

While the Software Inventory post by Joel Spolsky is a thinly veiled advertisement for Trello, Fog Creek’s (then new) ‘project management’ tool, the points Joel makes are nevertheless deeply meaningful to me. Points like this:

“The trouble is that 90% of the things in the feature backlog will never get implemented, ever. So every minute you spent writing down, designing, thinking about, or discussing features that are never going to get implemented is just time wasted.”

Or this:

“… the desire never to miss any bug report leads to bug bankruptcy, where you wake up one day and discover that there are 3000 open bugs in the database, some of which are so old they may not apply any more, some of which can never be reproduced, and most of which are not even worth fixing because they’re so tiny.”

…though you probably won’t need any more than 20 bug reports to find outdated or incorrect tickets!

I’ve just had another reason to hunt out this blog post and refer to it professionally: encouraging a client to accept that a bunch of years-old tickets, assigned to someone but ‘on hold’ long after the original requester has left the business, is a sure sign that those projects are never going to happen and are probably not a good idea anyway!

Make it Consistent

We made a mistake recently, breaking one of our own rules: Be Consistent. Of course, it is not always possible to ‘be consistent’. Sometimes that is because you are doing something truly new, but often it is because you incorrectly see differences where you would be better off seeing patterns and similarities (and thus implementing something to fit an existing pattern)!

Premature Simplification – Allowing UI Display Formats to Drive Data Storage Formats

What is Premature Simplification, other than ‘Allowing UI Display Formats to Drive Data Storage Formats’? It is easiest to start with an example. A company receives records from many devices and decides that the end user of the system’s web site will never want to view detail at a finer grain than one second. So they decide that all times should be stored without milliseconds (or ticks); that is to say, timing data is rounded or truncated to seconds.
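
To make the distinction concrete, here is a small Python sketch in which timestamps are stored at full precision and only the display is limited to whole seconds. The `display` function and the sample readings are assumptions made up for illustration; they are not taken from the system described.

```python
from datetime import datetime, timedelta


def display(ts: datetime) -> str:
    """Format for the UI at one-second granularity; storage is untouched."""
    return ts.strftime("%Y-%m-%d %H:%M:%S")


# Two hypothetical device readings 250 ms apart, stored at full precision.
first = datetime(2016, 9, 29, 10, 15, 30, 100_000)
second = first + timedelta(milliseconds=250)

# On screen they look identical, which is all the end user asked for.
print(display(first))   # 2016-09-29 10:15:30
print(display(second))  # 2016-09-29 10:15:30

# The stored values still preserve the ordering and the gap between them.
print(first < second)                    # True
print((second - first).total_seconds())  # 0.25

# Had we truncated before storing, that information would be gone for good.
truncated_first = first.replace(microsecond=0)
truncated_second = second.replace(microsecond=0)
print(truncated_first == truncated_second)  # True
```

Truncating at display time is reversible: the UI can be changed later to show more detail. Truncating at storage time is not.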

Two Hard Things

This post by Martin Fowler quotes Phil Karlton:

“There are only two hard things in Computer Science: cache invalidation and naming things.”

Martin also adds the derivative quote, ‘there are two hard things in computer science: cache invalidation, naming things, and off-by-one errors’, which is quite nice. Today’s post was originally about naming things being hard, but I think I can extend it to something close to the two topics in the quote.

Rejuvenating a Stalled Project

The client had a stalled project with many changes to an existing system; there were several areas of new functionality, but they were not completely trusted. There was considerable concern regarding the reliability and performance of the database, and in some senses the database was seen as the whole problem.  Not only had this set of changes stalled, but minor fixes had also stacked up waiting to be released and all were dependent upon the main project.

Using Live Data as a Source of Test Databases

Many modern development tools provide ways to create databases and populate them with test data, often with the idea that unit tests can then be run against them. But there is an alternative approach available to some people: using live data as a source for test environments. There may be reasons why this is not possible (not the least of which is ‘compliance’), and there are certainly issues of practicality to consider, but if you are allowed to do this there can be huge benefits.

Making Use-Case Diagrams Useful

We have found that communicating our system designs to clients is most usefully done with diagrams rather than large chunks of text. Some years ago, we looked at using UML Use Case Diagrams for this communication, but see what Martin Fowler has to say about them in ‘UML Distilled’:

“But almost all the value of use cases lies in the content [of the textual cases], and the diagram is of limited value.”

In other words, in his opinion, you should use the textual Use Cases, not the diagrams (which just map those texts visually).

‘Pure’ UML also seems to disappoint in that it produces very dry, monochrome diagrams with stick figures and simple primitives such as boxes and ovals. Is this really the best we can do to convey the use of a system?
