It's Y2K all over again

We've all seen a lot of press blaming COBOL for the COVID-19 unemployment-claim delays, along with shock that so many systems still run on COBOL. I have also seen comments along the lines of "I thought we had got rid of COBOL during Y2K", and certainly lots of discussion about how this is similar to the Y2K situation.

For those of us who worked on Y2K remediation, it certainly brings back memories - some good, and some bad.

One memory that suddenly came back to me was of the "shortcuts" that were taken - shortcuts that seemed reasonable at the time, but might indeed bring some of the Y2K problems back.

A little background: the Y2K problem that caught everyone's attention was the use of two-digit years. When applications were being designed in the 1970s, programmers needed to save space everywhere. Disk space was very expensive, so you certainly were not going to keep 4-digit years in fields like birthdates, expiry dates, etc. With millions of rows and a few dates per row, that would add up to megabytes of "wasted" space if every year contained the same two digits '19'.

This sounds silly now when your phone has 32GB, but in the 1970s, the IBM 3340 (Winchester) disk drive was the size of a washing machine and held two 30MB disks, so yes, it really was necessary to save a few MB on a file.

This all worked just fine, with systems doing date calculations like subtracting the birth year from the current year to determine age, etc.

Programmers and IT management knew that by the year 2000 there would be problems IF these programs were still being used, but during the 1980s and even early 1990s, the expectation was that these programs would be replaced or rewritten before 2000.

For example, in 1999, you could subtract a two-digit birth year of 80 from a current year of 99, and get an answer of 19 years old, but in 2000, subtracting a birth year of 80 from a current year of 00 would give -80, rather than the correct answer of 20.
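The failure mode is easy to demonstrate in a few lines of Python (a sketch of the arithmetic, not the original COBOL):

```python
def age_2digit(current_yy, birth_yy):
    """Compute an age the way the old programs did:
    two-digit year arithmetic, with no notion of century."""
    return current_yy - birth_yy

age_2digit(99, 80)  # 19: correct in 1999
age_2digit(0, 80)   # -80 in 2000, instead of the correct 20
```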

During the 1990s, it became common for new programs (and even maintenance on older programs) to use 4-digit years internally, taking the two-digit year from the file and prefixing it with '19'. This meant that if/when the year fields were expanded in the records, there would be very little change needed in the programs.
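In Python terms, that interim fix amounts to something like this (a sketch; the real code was COBOL working on record fields):

```python
def internal_year(record_yy):
    """Take the two-digit year from the file and prefix it with '19'
    to get a four-digit year for internal calculations. Correct only
    as long as every stored date really is in the 1900s."""
    return int("19" + record_yy)

internal_year("80")  # 1980
```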

After 1990, forward-thinking companies brought in standards stating that all dates used in new files should use 4-digit years, and sometimes went as far as to start expanding date fields any time a change was being made to a file. This sounds simple, but it was a lot of work.

In well-organized sites, there was only one definition of a file's layout, in a shared copybook, so when you expanded a year field in the copybook, you had to find and check out every program that used that copybook, write a conversion routine to update all the records in the file, then compile and test every program. There were often programs that did not use the date fields at all, but they still had to be compiled and tested because the length of the record had changed or fields had moved. And all this work then had to be scheduled to move into production.

This was a lot of work even in well-organized sites, and when there were multiple copybooks it was more work. This effort was impossible to justify within a regular maintenance budget, so some companies gave this task its own budget. Many did not.

There were various reasons: a planned migration to a purchased package, a planned rewrite of the entire application, or some other reason why the application was only expected to be used for another couple of years. In any case, they saw no reason to update code that would be going away before the year 2000.

Of course, quite a few of these migration or rewrite projects failed or did not get funded, and the code kept on running in production.

Then around 1998, IT management started to get very worried. They called in consultants to perform audits and develop action plans. That took care of that year.

So in 1999 companies were staring at a hard date and they simply did not have enough time to do the work and testing to expand all the date fields in all the files.

What to do? Then the same thing seemed to happen in companies all over the world: a bright spark would point out that most programs already used 4-digit years internally, and just needed a bit of logic to decide when to use '20' rather than '19', and there was enough time to retrofit this kind of code into all the programs that used dates. There would be no need to change the record lengths, so programs that did not actually use the dates could be left alone. This could work! They would all be heroes!

Someone would ask: how do you decide when to use '20'? A bit of thinking would ensue, and a general plan would emerge: if the date was a birthdate and the oldest person in the file was born in 1925 (i.e., was no more than 74 years old), then you could safely assume that any year from '00' to '24' should start with '20'. That would allow the two-digit field in the file to represent a 100-year window from 1925 to 2024. And if the date was to do with a policy or a loan, and the company had no documents prior to 1950, then you could use the same logic to represent a 100-year window from 1950 to 2049.
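This "windowing" technique can be sketched in Python (the pivot values here are illustrative; each site picked them per field):

```python
def expand_year(yy, pivot=25):
    """Expand a two-digit year using a fixed 100-year window.

    With pivot=25, '00'-'24' become 20xx and '25'-'99' become 19xx,
    covering 1925-2024. A loan file with nothing before 1950 would
    use pivot=50 instead, covering 1950-2049.
    """
    return (2000 if yy < pivot else 1900) + yy

expand_year(24)            # 2024
expand_year(80)            # 1980
expand_year(60, pivot=50)  # 1960
```

The catch, of course, is that a fixed window only works until real dates start landing on the wrong side of the pivot.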

Brilliant, everyone would say - let's write this up, hire some contract programmers, and get moving! Another bright spark would then pipe up "hey - I hope this system gets replaced before 2020", and everyone would laugh.
