Our Perl division has worked on several large-scale Perl projects over the years, each with its own set of challenges. In almost every case, however, there was a common denominator: the projects were, in one form or another, legacy systems.
This article is about how we tackled one of those projects and rewrote a legacy Perl system in modern Perl.
The decision to rewrite was made after consulting the client and weighing the current state of the system against the desired one. Refactoring alone was not an option, as the architecture itself was not suitable for the current needs.
The rewrite took 9 months and a team of 3 developers.
We rewrote a core part of the business: an ETL system with a huge impact on the business processes. Our client’s entire product line relied on this system, so we had to take into account every detail that might affect quality or data integrity.
Legacy code is often equated with old code, but this was actually a fairly new project: just 4 years old, with several add-ons, fixes and patches applied over the years. In those 4 years many people worked on the project, so the code carried a hefty dose of personal style. None of the people who had worked on the code were still with the company when we took over, so there was no transfer of knowledge at the code level, only at the business level.
We changed a lot of the project logic, improving the process itself and adding a few things. Although it was mostly a rewrite, some of the improvements were closer to refactoring.
Over time the data volume had increased substantially, and it kept growing. The system could no longer handle it, and the processing time was too long and still increasing. The lack of performance was damaging the business. At the same time, the data complexity also grew, making the system difficult to maintain and grueling to build on.
The constant patching and developer turnover made it very bug-prone. Many things were overlooked, so the quality of the products that relied on the data processed by the ETL system suffered. The system was really hard to maintain and could not scale to meet the current business needs.
Challenges
The main challenge was rethinking the entire architecture so that it would be easy to maintain in the future while remaining scalable. The rewrite was necessary, but we were fully aware that we could introduce new mistakes of our own that might hamper other people in the future, so planning ahead and making sure we covered every corner was our greatest challenge.
The entire project was data-oriented, so it was crucial to understand the dataset well enough to grasp all of its implications. Even though none of the developers who had worked on the existing code were around, the business analysts on the client side were very helpful. They also helped us a lot in understanding what we could get rid of and what should stay.
When it came to the code itself, things changed a bit. The TMTOWTDI principle (There’s More Than One Way To Do It) and the freedom in syntax made our lives really hard. We needed a lot of time and patience to understand the code and why certain things were done in a certain way. We gave up on understanding a few of them, as they were too convoluted. Overall, the code was very hackish.
Another drawback, specific to Perl, was that several of the CPAN modules in use were no longer maintained or had little to no documentation. We spent a lot of time analyzing what these modules did and how they worked in order to understand their role in the grand scheme of things.
As the system relied on many paid AWS services, we also had to balance efficiency against financial cost. Last but not least, an overall challenge was managing and structuring what at first glance seemed, and was, a tremendous amount of work.
Approach
The first step was analyzing the current implementation in great detail and identifying major and minor issues. We did this by isolating the code logic into pieces, then analyzing each piece as integrated with the other parts of the system. It was a long and time-consuming process, but it proved very effective.
Afterwards we regrouped and reimplemented the system logic, structuring it into many separate pieces.
Result
The overall result was a scalable, maintainable product with several performance improvements. It ticked every box the client had when we started the project.
Although it was a rough experience, in the end it had its rewards. There were a lot of teachable moments on clean programming, on architecture, and on the other side of the coin when it comes to flexibility in coding. We documented our work so that this future legacy code will be a lot easier to handle for the people who come after us.
The Perl Team