First Project
Sunday, 11 October 2009

The primary goal of our team was to regain the trust of our internal clients. The company’s systems were running very, very slowly, and since that was the main complaint of almost everyone, we decided to focus our efforts on solving that problem first.

The first part of the project was a full mapping of the company’s IT environment, hardware and software, from workstations to servers. Along the way we found some minor things to fix in the physical infrastructure, but nothing that could explain why our Oracle server, a powerful Sun Solaris machine, was running at nearly 100% CPU usage all day long. The answer to that question, we thought, had to be inside our internal software applications.

The team that built the software is no longer at the company and, to our sorrow, the code they left behind has no documentation at all. Still, it was not that difficult for a good team of technical guys, which we actually have, to find the source of our server’s misery.

The programmers and analysts who preceded us had a COBOL background. When they started to build the second generation of software to run the company’s business using Oracle and Microsoft Visual Basic, they tried to apply some concepts that may work fine in COBOL but are a recipe for disaster in a relational database environment.

If you are a COBOL guy, then let me tell you, before anything else, that I have nothing against it. Actually, COBOL was the programming language of choice when I started technical school (a long time ago). The problem is that those guys tried to push a square peg through a round hole, and obviously, it didn’t fit.

Just to illustrate the situation, I’ll tell you about the most critical mistake they made:

[Tech mode ON]

It is natural for COBOL programmers to store dates in ISAM (Indexed Sequential Access Method) databases as strings in YMD (Year, Month, Day) format. Modern databases, though, have a native data type for storing date information.

The guys actually used the native date type in their table design. Despite that, whenever they needed to write an SQL statement to filter records by date, they went back to the YMD concept, forcing Oracle to convert millions of date fields from the native format into strings in order to perform the comparisons. Result? Very slow system response, reports taking an eternity to start printing, and anger, lots of anger, from all users.
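For the curious, here is a minimal sketch of the difference, not our actual code. It uses Python with an in-memory SQLite database as a stand-in for Oracle; the `orders` table, the `order_date` column, and the index on it are hypothetical names I made up for illustration, and SQLite keeps dates as ISO text rather than in a native date type. The idea carries over, though: wrapping the column in a conversion function (in Oracle that would presumably be something like `TO_CHAR`) makes the database process every row, while comparing the untouched column against a date range lets it use an index, assuming one exists.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT)")
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")
conn.executemany(
    "INSERT INTO orders (order_date) VALUES (?)",
    [("2009-%02d-%02d" % (m, d),) for m in range(1, 13) for d in range(1, 29)],
)


def plan(sql, params=()):
    """Return the plan description lines SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Legacy style: convert every stored date to a YMD string, then compare.
slow_sql = "SELECT id FROM orders WHERE strftime('%Y%m%d', order_date) = ?"
# Tuned style: leave the column untouched and compare against a date range.
fast_sql = "SELECT id FROM orders WHERE order_date >= ? AND order_date < ?"

slow_plan = plan(slow_sql, ("20091011",))
fast_plan = plan(fast_sql, ("2009-10-11", "2009-10-12"))
slow_rows = conn.execute(slow_sql, ("20091011",)).fetchall()
fast_rows = conn.execute(fast_sql, ("2009-10-11", "2009-10-12")).fetchall()

print(slow_plan)  # a full SCAN: the function runs on every row
print(fast_plan)  # a SEARCH that uses the index on order_date
```

Both queries return the same rows; only the amount of work the database does to find them changes. The exact wording of the plan lines varies between SQLite versions.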

[Tech mode OFF]

Once we had identified the cause of the problem, we moved on to a tuning phase for all major applications, browsing through countless lines of code in order to change the way they searched the database by date. It took time, and a lot of testing, but it worked like a charm.

When we finally put the tuned applications into production, users were amazed. Some of them called us right away saying things like: “Hey, I’m printing a report that usually takes about 30 minutes to run, and now it finished in just 15 seconds; something must be wrong.” And it was a joy for us to be able to answer back: “No, dude, it’s just the opposite. Something is finally right!”

Lessons Learned:

• Walk the field and listen to the people working there. Since they are the ones performing the daily workflow, they may point out obvious things that are easily overlooked from higher up.

• Whenever you move to new ground, try to get advice from someone who has been there already. The fact that you’ve successfully traveled through a swamp doesn’t mean you are prepared to walk through the desert.

• Know thy enemy. If you are opening a project to solve a problem, then you must first understand what the problem really is.

 

The real leader has no need to lead. He is content to point the way.

Henry Miller