Hacker vs. UniKL – TA perspective

Editor note: As part of responsible disclosure, the matter has been reported to MOHE IT/Network Security and MyCERT under the reference number MyCERT-202103221082. I recently obtained a contact for the CEO of UniKL, and the article was forwarded to him for further action.

In most breach stories, we hear only one side of the story. Since I reported the breach, UniKL has not reached out to me, nor has any press release been issued on the matter. As observers, you only see the well-drafted press release, which often conceals the details of what happened. The extent of any incident is only determined when and if the attacker decides to publish the data. While I had no intention of writing anything further on this matter, a close peer nudged me to write a second piece on this story. There wasn’t much to pursue, but fate, as it turned out, had other plans.

As the earlier article went live on LinkedIn, the attacker, Marwaan (I think I spelt it right), came forward publicly, replying to the article thread and claiming responsibility. This is a rare opportunity, giving all of us a look into the attacker and the attack. Marwaan agreed to an interview. The full-length interview will be published by the SecurityLah podcast.

But first, if someone claims responsibility, I need to be certain about the claim. Trust, but verify. So I asked for proof, in the form of unpublished information that would validate Marwaan’s claim. A screenshot was provided, attached below.

That pretty much, to me, confirms that he is indeed the attacker or, at the very least, someone who has access to the data. Good enough for me. Let’s continue.

(No spoilers here, but listen to the full-length interview, to be published in 2 parts starting Monday 29 March 2021 at SecurityLah.)

I was curious about whether Marwaan had indeed contacted UniKL regarding this matter. I asked him for proof, and he provided screenshots of emails sent to UniKL pertaining to this matter.

From this, it seems to corroborate Marwaan’s narrative that he reached out to UniKL regarding the system weaknesses.

Some key pointers I picked up throughout the interview: UniKL seems to have taken the matter lightly and not carried out an assessment and a full incident response. Marwaan also confirmed that, to his knowledge, there is only an IT and Communications team, with no cyber security team in place. Marwaan went on to explain that they had taken the “google” approach of searching for and deploying controls without understanding what needs to be done, at times blindly trusting information provided by Marwaan.

This doesn’t bode well for UniKL, which appears not to have managed the situation or responded accordingly. I’m happy to get details from UniKL to present a balanced view of what happened from their perspective, with the necessary proof to back up their claims, just as I did with Marwaan. So far, everything that has been shown seems to put UniKL in a negative light.

Two key issues I picked up from this incident for this article.

First, incident reporting. Do organizations have a way for the general public to report cyber security incidents? When I googled “UniKL report cybersecurity incident”, I saw links to UniKL and its cybersecurity programs, but nothing that actually allows the general public to report cyber security incidents. A case of not practicing what they preach, given that they teach cyber security? This certainly erodes my confidence to even think of studying there, especially cyber security. Marwaan also explained that he was given the runaround, with staff not even knowing what to do when someone reports such issues.

Does your organization suffer from such problems? Only you know.

Secondly, organizations are ill-prepared to face such issues. Incident response and coordination need serious improvement. Ordinarily, when such incidents happen, organizations alert key stakeholders to the incident, prepare a holding statement to manage the press, and issue a first response on the matter. The approach of “lets-be-silent-and-this-will-go-away” usually ends up making the organization look guilty of concealment, lowers trust in the ability of the management and creates opportunity for further speculation. At this point, I have written 2 articles, and the data of students and staff, along with bank details, may be circulating somewhere, which opens up the opportunity for future attacks to be even more deadly. Imagine if the bank account had been cleared out, given that the attacker has access to the machines logged into the bank portal?

What can happen from here?

It all depends on the authorities. The spillage of the attack has been confirmed to even hit MOHE, which raises the severity of this matter. MyCERT has been involved (or notified, by myself and also the attacker), and will most likely issue a holding statement if this matter blows up. The Personal Data Protection Commission has yet to be seen on this matter (I wonder if MyCERT will reach out, inform them and do a joint investigation). While MOHE falls under the category of CNII (Critical National Information Infrastructure), UniKL doesn’t. However, by nature of processing personal data, UniKL will come under PDPA requirements.

Malaysia lacks reporting requirements for breaches. FireEye made the SolarWinds hack announcement as part of an SEC filing. It’s time Malaysia started looking at something similar, or better. Until such reporting becomes mandatory, we will continue to see organizations sweeping such issues under the carpet, paving the way for more deadly and catastrophic attacks. We as a country may have recently launched a strategy, but it remains a strategy until something firm is implemented and enforced. Understandably, the world is facing a pandemic and the focus is on that issue, but cyber threats don’t care whether there is a pandemic or not, and will continue to persist.

I’m sitting by the sideline, with my box of [redacted] popcorn watching to see how this unfolds. One thing’s for sure, there are tonnes of wisdom to be learnt from this incident.

Reference:

  1. UniKL Hack – Dr. Suresh Ramasamy – https://www.linkedin.com/pulse/case-study-unikl-hacked-ramasamy-cissp-cism-gcti-gnfa-gcda-cipm/
  2. CNII – CyberSecurity Malaysia – https://cnii.cybersecurity.my/main/about.html

Hacker Scorn – Tale of UniKL

This isn’t your typical “I-got-breached/hacked” case study. In fact, as interesting as it turned out to be, I didn’t pay much attention to it initially. What got to me was the level of detail that the hacker was able to provide to prove the hack was indeed real, which pretty much placed the smoking gun in his hands.

Let’s dive into the details.

UniKL is a Malaysia-based university wholly owned by MARA (Majlis Amanah Rakyat), an agency under the Malaysian Ministry of Rural Development. UniKL has campuses spread across the country.

So what happened?

A hacker was seen selling information obtained by hacking UniKL. The following is the posting from the hacker.

It seems that the hacker was not happy with UniKL’s response, which resulted in the posting of the leak. The conversation with the hacker was made publicly available by a Twitter user.

The hacker in fact reached out to UniKL on this matter.

However, UniKL didn’t seem to take heed initially, claiming it was under control. The hacker, unhappy with the situation, took the matter to Facebook to complain about it.

Is it just me, or does the hacker seem to be emotionally involved with the hack? Getting emotionally involved with the hack seems (I don’t know) dangerous? Emotions aside, was there really a hack?

Looking into the breach data

The hacker proceeded to provide proofs of breach.

Exhibit 1: Seems like a listing of students.

Exhibit 2: More student listing.

Exhibit 3: Student details

Exhibit 4: ASP.NET application configuration

Wow. Just wow. In my lingo, we call that pwnage! But wait, there’s more!

Exhibit: Videos from a shared folder

Exhibit: Shared folder listing

Exhibit: SQL data dump

Exhibit (I lost count): e-Procurement System (wait, this is supposed to be super confidential!).

Talking to the hacker

There was a conversation recorded between a person and the hacker.

Seems like the hacker is pretty deep into UniKL’s infrastructure. What scares me is that the hacker also had access to UniKL’s CIMB account!

While CIMB Corporate banking requires 2FA, the attacker most likely had remote access into the shared PC which is used by Finance department for processing payments. Imagine the ability of doing a fund transfer fraudulently?

The hacker seems to know how badly the authentication system is broken.

A giveaway on tradecraft – SQL injection.
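For readers unfamiliar with the technique, here is a minimal sketch of what SQL injection looks like in general. It is not a reconstruction of the UniKL systems: the table, column names and payload are made up for illustration, and it uses Python’s built-in sqlite3 module rather than whatever database UniKL runs.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, secret TEXT)"
    )
    conn.executemany(
        "INSERT INTO students (name, secret) VALUES (?, ?)",
        [("alice", "A-123"), ("bob", "B-456")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is concatenated straight into the SQL text,
    # so quotes inside the input change the meaning of the query.
    query = "SELECT name, secret FROM students WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Safer: the input is passed as a bound parameter and is only ever
    # treated as a value, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM students WHERE name = ?", (name,)
    ).fetchall()


conn = build_demo_db()
payload = "nobody' OR '1'='1"  # classic illustrative injection string

unsafe_rows = lookup_unsafe(conn, payload)  # condition becomes always true
safe_rows = lookup_safe(conn, payload)      # no student has this literal name

print("unsafe lookup returned", len(unsafe_rows), "rows")
print("safe lookup returned", len(safe_rows), "rows")
```

The unsafe lookup hands back every row in the table, while the parameterized one returns nothing, which is the whole difference between a dumped database and a failed login.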

The hacker does not like being ignored, and seems to have presence/persistence in the network.

The hacker even divulged the identity of the staff member he/she spoke to regarding the matter. (I don’t know if this qualifies as doxxing, since it is a public profile.)

My assessment on this matter

Based on all the details provided, it can be said with high confidence that UniKL was breached. The extent of the breach warrants a serious look into the IT management and indicates poor cyber security hygiene, judging from the amount of data amassed by the attacker.

It’s also confirmed with high confidence that this is NOT a nation state threat actor. It’s most likely the work of an individual who seems to have a vested interest in UniKL, judging from the emotional outburst.

There seems to have been dialogue between the hacker and UniKL. From the conversation between the 2 parties, it seems the approach taken went sour to the point of the attacker publishing the breach. Unconfirmed news mentioned that the hacker tried to extort UniKL and it didn’t work, while the hacker claims that he/she was trying to remedy the situation. Either way, it is obvious that the situation went south, which led to the breach going public. Also noted is that UniKL did not make a press release about the matter (as of the writing of this article), which indicates either downplaying the issue or hoping it goes away. That is a poor approach, and it has caused the issue to blow up publicly (PR needs to be improved).

A lot of lessons learnt in this incident, a lot of what-you-should-not-do in such incidents.

Reference

  1. UniKL – About Us – https://www.unikl.edu.my/about-us/
  2. Twitter – Kimohitomo http://www.twitter.com/kimmohito

Singtel breach (2021) – case study

What happened at Singtel?

Singtel released a statement saying that they are currently investigating a data breach involving customer data. For those who aren’t familiar, Singtel is a Singapore-based group of telecommunications companies operating around Asia, as well as a telco licensee in Singapore.

Singtel was notified by Accellion that the data breach occurred through its file sharing system. The system was breached by unidentified threat actors (aka hackers). Singtel explains that it’s a standalone system used to share information internally and with external parties.

Singtel explains that its use of the Accellion FTA product was legitimate and that the product had support running till April 2021. In mid-December 2020, Accellion issued a patch within 72 hours of the zero-day notification. Accellion noted attacks based on the reported zero days till the end of January 2021.

What about Accellion?

Accellion, through its own website had a press release on the matter.

Interestingly, Accellion made a clear note that the product affected was a 20-year-old “approaching end-of-life” product. In typical corporate sales fashion, Accellion uses this opportunity to urge its customers to migrate to its newer platforms. It is interesting to note that Accellion makes it clear that the FTA platform is “legacy” and implies that, while the product is under support, organizations should either have migrated to newer platforms or start doing so (preferably by “upgrading” to its own new version).

Analysis of the incident

Let’s look at each part of this and the claims made by the respective organizations.

  1. The FTA system is a standalone system.

My assessment? True and False.

Let’s look into the function of the FTA. Essentially, it plays the role of an FTP (File Transfer Protocol) server, used for transferring files in and out of the organization. There seem to be some issues with this setup. Singtel further explains that the platform is used by both internal and external parties.

Did anyone notice a huge blinking red flag here? No? I’ll explain why.

In a typical telco setup, these FTP servers are a crucial part of the equation. CDRs (call data records) are often put onto FTP servers before they get passed to mediation and eventually to billing and charging. Again, big red blinking light – CDR!!!

Why would file transfer be needed for external parties?

It’s used for many reasons; I’ll outline 2 as examples. The first is bill payments. Some bill payments use a REST API for immediate settlement, while others use bulk payment (aka batch), which uses file transfer via FTP. A bank may receive payments from its respective customers and send an update every night at 3am. Another scenario would be an outsourcing arrangement involving a third party that performs corporate account provisioning and then does bulk activation based on the files provided.

As good hygiene practice, the file transfer platform should be completely separate and isolated between internal and external parties.

Next, the question of whether the system is isolated or not. For me, an isolated system is a system that doesn’t have connectivity to any other systems, like a Windows 10 PC at home connected only to the internet. But a file transfer system? You can see that the system/network/security admins would have punched holes in the firewall in order for the system to be able to receive and transfer files. Yes, it is interconnected, but whether it can access the other interfaces (both Ethernet and 3G specific) depends on what ports are open.
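To make the “punched holes” point concrete, here is a small sketch that treats firewall allow rules as a graph and asks what a compromised file transfer host could eventually reach. Everything in it is hypothetical: the zone names, ports and rules are invented for illustration and say nothing about Singtel’s actual network. It also assumes the worst case, where every hop an attacker can reach becomes a stepping stone.

```python
from collections import deque

# Hypothetical allow rules: (source zone, destination zone, destination port).
ALLOW_RULES = [
    ("internet", "fta", 443),        # external parties upload/download files
    ("partner_bank", "fta", 22),     # nightly batch file drop
    ("fta", "mediation", 22),        # FTA pushes files onward
    ("mediation", "billing", 1521),  # mediation feeds billing
]


def reachable_from(start, rules):
    """Return every zone reachable from `start` by following allow rules."""
    graph = {}
    for src, dst, _port in rules:
        graph.setdefault(src, set()).add(dst)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    seen.discard(start)  # report only the zones beyond the starting point
    return seen


exposed = reachable_from("fta", ALLOW_RULES)
print("zones reachable from the file transfer host:", sorted(exposed))
```

With these made-up rules, the “standalone” box has a path to mediation and, one hop later, to billing. The real answer for any environment depends entirely on its actual ruleset, which is exactly why the claim deserves scrutiny.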

  2. Usage of legacy platform.

This is where both parties seem to have differing views. Singtel seems to think that the product is supported (noting that EOL is around the corner), hence safe to use. Accellion, however, minces no words and bluntly puts a legacy tag on the platform.

The logical ensuing question – why didn’t Singtel migrate their platform to a newer one? (This is the part where I throw theories into the equation; only Singtel would know the real reason.)

Firstly, don’t fix what’s not broken. Remember it’s a 20-year-old platform, and assuming that Singtel had used it for half of its lifetime, that’s easily 10 years! The folks who provisioned and configured the platform may have moved on, or even retired! So, it works, it continues to work, hence don’t touch!

A system migration can make or break a CIO/CTO’s career. Let’s look back at the statements made by Singtel. The FTA platform is used by internal and external parties. This means firewall rulesets need to be migrated. New service accounts need to be created. Permissions need to be mapped. Application IDs need to be created. Batch jobs or cron jobs running on the server need to be modified. God knows what else needs to be done! Now that’s just the internal part. Beyond the system itself, you’d have internal application owners screaming blood at you over missed KPIs!

The next big headache is coordinating initiatives with the external parties. I’ve had experience during migration where one of the external parties wanted to bill me for their migration! We, of course, declined politely and said that migrations are handled by individual organizations at their own cost (providing timelines to migrate across).

  3. Why didn’t the patch work?

Singtel seems to indicate that the patch provided by Accellion didn’t work. As Accellion mentioned, the patch was produced within 72 hours. One has to wonder if proper regression and quality checks were performed before the patches were released. It reminds me of Microsoft, who previously released a patch for a patch (to their credit, they’ve come a long way).

Conclusion

Tech debt is real, and in Singtel’s case it just hit them with hefty interest. While one can argue it’s a zero-day issue, there is no doubt that the legacy platform should have been managed out. Reminds me of the switch issue in MAHB? At a glance, it seems like Singtel has lots of work ahead of them. They are moving in the right direction; I only hope they take a comprehensive look at their environment and not “scope down” to just the FTA.

Reference

  1. ZDNet: Singtel breach – https://www.zdnet.com/article/singtel-hit-by-third-party-vendors-security-breach-customer-data-may-be-leaked/
  2. Singtel Release: https://www.singtel.com/personal/support/about-accellion-security-incident
  3. Accellion Press Release: https://www.accellion.com/company/press-releases/accellion-provides-update-to-recent-fta-security-incident/
  4. MAHB Airport Case Study – https://www.drsuresh.net/2019/09/mahb-case-study-aug2019/
  5. Tech Debt – https://www.drsuresh.net/2019/08/cyber-tech-debt/

 

e-Pay data breach – a case study

Introduction

e-Pay is a solution that is part of the GHL group of companies. According to their website, e-Pay “was founded when Malaysia’s telco industry was just emerging in the late nineties. We have been providing top-up services ever since prepaid mobile plans became popular. Since our simpler beginnings, e-pay has expanded to include a host of other e-payment services, allowing us to drop our earlier moniker “One Stop Prepaid Reload” to adopt the now more accurate “One Stop E-Payment Service Provider”. Our network has grown fast, as we continue to build bridges between Product Partners, Merchants and Consumers across the entire nation, and now expanding out into Asia Pacific.”

If you visit e-Pay’s website, it seems like business as usual. They have the promos on their website on their product and services. Everything seems normal. But is it?

In Feb 2021

@bank_security revealed that a data breach had occurred, exposing the personal details of over 300,000 e-Pay customers online. A threat actor was spotted selling a database of 380,000 customers on a data sharing forum for USD 300 (about RM1,215), which translates to about 0.32 sen per user.
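The per-user figure is simple division, sketched here for anyone who wants to check it. The only inputs are the numbers quoted above (USD 300, roughly RM1,215, 380,000 records); the exchange rate is merely what those two prices imply, not an official rate.

```python
# Figures quoted in the article.
asking_price_usd = 300
asking_price_myr = 1215
records = 380_000

# Exchange rate implied by the two quoted prices (not an official rate).
myr_per_usd = asking_price_myr / asking_price_usd

# Price per record, in ringgit and then in sen (1 ringgit = 100 sen).
myr_per_record = asking_price_myr / records
sen_per_record = myr_per_record * 100

# Same calculation in US cents for comparison.
usd_cents_per_record = asking_price_usd / records * 100

print(f"implied rate: RM{myr_per_usd:.2f} per USD")
print(f"price per record: {sen_per_record:.2f} sen")
print(f"price per record: {usd_cents_per_record:.2f} US cents")
```

At well under one sen per person, the asking price says a lot about how cheaply bulk personal data changes hands.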

Looking further into the metadata reveals that a wealth of personal information was made available. There is the usual stuff (login, password), but what caught my attention was date of birth (?) and nationality (?), which seems to indicate that the breached data isn’t just payment information, but information that is typically captured during a signup process.

These types of data are common in breaches. You’ll find past breaches exposing similar types of data.

What did e-Pay say?

The Edge Markets took a step further and contacted GHL on the matter. GHL released a statement saying that they are “currently investigating these serious allegations and are checking our system”. GHL asserts that the allegations are limited only to the e-pay online reload and bill payment collection system (E.V.E) and do not impact other e-pay and GHL businesses and operations.

For a company that’s investigating a supposed breach, they seem to know exactly where the breach may have happened. A good sign of system awareness, I suppose.

GHL adds further that the E.V.E system operates on an independent stand-alone system which does not interfere with the technical operations of other e-pay and GHL merchant acquiring systems and servers.

“Investigations are still underway and we will continue to update on the progress and any new findings. In the meantime, we would advise E.V.E users to go to our official website and change their passwords as precautionary measures.”

“E.V.E users should not click on unverified e-mail links urging them to update their credentials but to do so only on our official website,” it said.

Nothing out of the ordinary, but I’m curious as to why a bill payment system needs date of birth or even nationality?

It would have passed as just another breach, but then a new piece of the puzzle emerged.

What happened in 2020?

KELA, an organization that performs dark net monitoring, issued a startling statement on their social media account.

According to KELA, based on the posting above, the breach data has been around for some time. Specifically, the same breach data has been published before, in March 2020 and August 2020. The reporting on the breach by @bank_security was in February 2021.

This raises some fundamental questions about the whole incident.

  1. Was e-Pay aware in 2020 that they had been breached?
  2. Was e-Pay only made aware of the breach because of the recent reporting?
  3. By stating that only one system (E.V.E) was affected, how did they exclude the others? It seems that either they know more about the breach, or they are trying to stem the bleeding by keeping doubt from being cast on other systems.
  4. Will they release a full statement on this matter? (We’ll have to wait and see.)
  5. Instead of making a blanket statement, have GHL/e-Pay reached out to their customers regarding this matter?
  6. As this is a PII breach, I’m wondering if PDPC (Personal Data Protection Commission)/MCMC (Communications and Multimedia Commission) will issue a statement on this matter?

Conclusion

While this incident highlights the issue with one organization, it exposes a larger problem. There is no framework that compels organizations to publicly disclose breaches. This is due to 2 factors: the lack of laws governing data breaches, and the shame factor. I’m reminded of an organization (I believe it was a financial institution) that had a huge IT meltdown which severely affected its operations. It was rumoured to be a ransomware attack, but was downplayed as an IT glitch. After that, no further reporting or coverage was seen on the matter.

I laud the efforts of FireEye for coming forward about the SolarWinds attack. Malaysia needs to normalise breach reporting and notification, especially when personal information, and in this case payment information, is exposed in a data breach.

Reference:

  1. E-pay website – https://www.e-pay.com.my/
  2. SoyaCincau website – https://www.soyacincau.com/2021/02/04/e-pay-customer-database-breach-380000-sale-forum/
  3. The Edge Markets – https://www.theedgemarkets.com/article/ghl-system-investigating-epay-data-breach-claims
  4. KELA – https://ke-la.com/

Selayang Hospital IT system case study – Jan 2021

Digitization and Hospital Management

As part of the digital move introduced by the former Prime Minister of Malaysia, Tun Dr. Mahathir Mohamad, Selayang Hospital underwent a major transformation, introducing THIS, the Total Hospital Information System. This system was aimed at providing a comprehensive hospital solution that covers imaging and patient information. Research done to gauge the acceptance and satisfaction levels amongst nurses in the hospital showed marked improvement in turnaround time and in satisfaction with the digitization effort.

That was the early 2000s.

In May 2019

Coming to more recent times, in May 2019, New Straits Times reported that technical problems were plaguing the once-famed THIS solution deployed in Selayang Hospital. Since May 4th, some 40% of elective surgeries had been rescheduled due to system failure.

The Ministry of Health was quoted by Bernama as stating that the elective surgeries had to be postponed due to lack of access to the pathology reports, which are critical for cancer patients. The technical issues also affected the hospital’s ability to serve patients seeking outpatient treatment. The ministry added that inpatient services, emergency treatment and emergency surgery had to be carried out manually (not sure what’s meant by manually…).

The THIS system has had its challenges; however, staff working in the hospital mentioned that the 2019 outage was the longest that they had ever experienced (I read that as the system has failed before, just not for that long).

A doctor was quoted as saying that the computerization was done a long time ago (quoting the year 2000) and that there have been numerous instances where the system was down, forcing the staff to chart details manually. System downtime would last 6-7 hours. (Wow!)

The then Deputy Health Minister, Dr. Lee Boon Chye, confirmed that it is an old system that needs to be upgraded, and was quoted (in 2019) as saying that the upgrade was being done. Patients were given the option of either continuing with their appointment, which might mean a longer waiting time, or rescheduling to a later date.

Come January 2021

On 9 January 2021, Malaysiakini carried a report stating that Selayang Hospital had another downtime. The hospital had had to access records manually for the past 3 days due to the failure of its IT system, burdening the already overloaded hospital operations. This time a system called PowerChart was blamed for the issue (Is PowerChart part of THIS? Or is it a replacement?). PowerChart had been down since January 6, 2021. Hospital officials were quoted as saying that the system is used to record everything, from patient history to patient progress. This included lab investigations, CT scans, blood test results and reports.

Looking at what has happened, I’m wondering what’s going on at Selayang Hospital. The hospital is already facing a pandemic situation, and IT system reliability is causing additional headaches for the overburdened staff.

Is this new?

There seems to be a spate of technology issues affecting Malaysian healthcare. About one year ago, I wrote about Sg Buloh hospital, which had IT issues due to continued use of Windows XP. Sg Buloh hospital was reported to be using Windows XP as its server platform, instead of a Microsoft server operating system. The problem was also attributed to a 32-bit operating system limitation. I covered this in extensive detail in my Sg Buloh Hospital case study (reference 5).

Some glaring questions on the system

  1. Is PowerChart a replacement for THIS? Or is it part of it?
  2. Has MOH completed the system upgrade, which it stated in 2019?
  3. Constant failures of IT systems at a critical national infrastructure warrant a relook at how these systems are selected and deployed, and whether they remain effective over time.
  4. Based on the 2019 report, it seems that the system may not have the necessary hardware or software upgrades. Is capacity management and system availability tracked, measured and actioned upon?
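On the question of whether availability is tracked and measured, the basic measure is easy to sketch. The outage durations below come from the reports above (a 3-day outage, and the 6-7 hour outages the doctor described, taken here as 6.5 hours); the 30-day measurement window and the assumption of a single outage per window are mine, purely for illustration.

```python
def availability(downtime_hours, period_hours):
    """Fraction of the period during which the system was up."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must be between 0 and period_hours")
    return 1 - downtime_hours / period_hours


# Assumption: a 30-day measurement window with a single outage in it.
HOURS_IN_30_DAYS = 30 * 24

# A 3-day outage, as reported in January 2021.
three_day_outage = availability(3 * 24, HOURS_IN_30_DAYS)

# One 6-7 hour outage, using 6.5 hours as the midpoint.
single_short_outage = availability(6.5, HOURS_IN_30_DAYS)

print(f"3-day outage in 30 days: {three_day_outage:.1%} available")
print(f"one 6.5-hour outage in 30 days: {single_short_outage:.2%} available")
```

Under these assumptions, a single 3-day outage drags the month down to 90% availability, a figure that would be hard to defend for any system clinicians depend on around the clock.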

Conclusion

In order for healthcare to serve the population effectively, there is a serious need for healthcare systems to be functioning in prime condition. There can be no lapse, as these situations create more opportunity for healthcare failure and workforce exhaustion. Systems are meant to ease the burden and workload of hospital staff, and should rightfully function in an effective manner.

Reference

  1. Malaysiakini (9 Jan 2021) – Selayang Hospital’s IT system down for 3 days and counting – https://www.malaysiakini.com/news/558254
  2. New Straits Times (17 May 2019) – 22 year old Selayang Hospital getting much needed technical upgrade – https://www.nst.com.my/news/nation/2019/05/489365/22-year-old-selayang-hospital-getting-much-needed-technical-upgrade
  3. Rosnah H., Zubir M.A., Akma Y.N. (2004) International Perspective Selayang Hospital: A Paperless and Filmless Environment in Malaysia. In: Ball M.J., Weaver C.A., Kiel J.M. (eds) Healthcare Information Management Systems. Health Informatics Series. Springer, New York, NY. https://doi.org/10.1007/978-1-4757-4041-7_11
  4. Mohamad Yunus, N., Ab Latiff, D., Abdul Mulud, Z., & Ma’on, S. (2013) Acceptance of Total Hospital Information System (THIS), International Journal of Future Computer & Communications
  5. Sg Buloh Hospital: Jan 2020 Case Study – https://www.drsuresh.net/2020/01/sg-buloh-hospital-jan2020-case-study/

 

Singapore to propose Infosec tech rating – a review

Black Hat Asia recently hosted Brigadier General Gaurav Keerthi, Deputy Chief Executive of Singapore’s Cyber Security Agency (CSA). Gaurav Keerthi spoke on Singapore’s initiative for a voluntary “Cybersecurity Labelling Scheme” aimed at rating consumers’ broadband gateways.

In his speech, Gaurav Keerthi drew a parallel with public utilities such as water supply and sewerage, focusing on the aspects of fresh water and good governance, which are crucial for public health and safety. On this note, he equates infosec/cyber security to having access to clean water and proper sewerage. He opines that the citizens can be “scolded” into better behavior.

Today, Singapore already offers Singpass to its citizens, an authentication/authorization service which allows citizens to access services in a secure manner. This service is also offered to corporations such as banks and others in the financial sector, to prevent islands of authentication services.

To make the Cybersecurity Labelling Scheme wholesome, Gaurav Keerthi explains that it will be expanded to connected devices. The first phase will focus on ISP-provided gateways and smart hubs, using a 4-star rating. Details on how the devices will be rated are to be released during Singapore International Cyber Week, which starts on October 5th, 2020. With Singapore being the Asian leader, Gaurav Keerthi mentions that Singapore plans to share the labelling scheme with other countries in the interest of public good.

My take on the matter

It’s a laudable effort from Singapore, taking the forefront on consumer devices. It’s no secret that consumer devices have wreaked havoc due to poor security. Cheap embedded devices such as IP CCTV cameras have been known to cause issues, and were a prime source of the Mirai botnet.

Firstly, let’s understand why smart hubs and ISP gateways were chosen as the first stab at the scheme. ISP gateways are the most common devices, and a necessity for a household to get internet access. The device is either provided by the service provider as part of a bundle, or purchased by the end user. Today, these devices are required, from a physical perspective, to go through type approval, which is a mandatory process. One example of such a process is CE marking in the EU, which ensures interoperability and no conflict of operation with other devices, while conforming to requirements such as operating frequency/spectrum. Some might even say this will be a similar attempt, but from the cyber security perspective.

Analysis and Questions

To dissect this further, and raise some pertinent questions about the implementation of the Cybersecurity labelling scheme, I have grouped it into 8 categories.

Devices

Physical devices are often manufactured by an OEM and sold by an ISP. Often, we see that the OEM also sells these devices on the open market. Bundled packages would include devices thrown in as a freebie as part of the offering. These devices have a lifetime, and some ISPs do not refresh them after their End-Of-Life, but at times position a new package as the way for the consumer to get new devices. The consumer may opt for a new package, or be happy with the current offering and choose to update the devices themselves by purchasing from a retailer. This creates a responsibility conundrum between the ISP and the OEM, because some models of these devices may be specific to an ISP (due to exclusivity contracts).

Question:

  1. Who is responsible for ensuring that the devices are certified under the scheme? The ISP? The OEM? The reseller? The consumer? What governs the relationship? Contractual? Explicit requirements?

Responsibility

Each device has 4 parties associated with it: the end-user, usually the consumer; the retailer, if it’s purchased from them; the reseller, in this case the ISP, who pre-selects and provides the device; and the manufacturer, who produced the device. Each party is part and parcel of the whole value chain and will have some role to play. The assessor will be the one conducting the scheme audit and certification.

Questions

  1. Are these lines of responsibilities clear?
  2. Are they defined and the relationship in the ecosystem explicit, outlining their roles and responsibility?

Process

The process of this scheme defines how the whole thing works. It starts by establishing the requirements, roles and responsibilities, liabilities and limitations. While the scheme is voluntary, in the subsequent section, you will see how it can become a precursor to other things.

Questions

  1. Probably the easiest question to answer – What is the certification process?
  2. Is the certification a one-time exercise?
    1. Does it recur?
    2. How often is the device retested to ascertain continuous compliance?
  3. Once a device is certified, is the rating valid for its lifetime?
  4. How does device obsolescence affect rating?
  5. Is there a requirement on minimum device lifetime support as part of this scheme?
  6. How much does the process cost?
  7. Is there a requirement for the responsible parties to ensure that devices continue to perform at the rating awarded?
    1. If so, what do they need to do? (i.e. timeframe to issue patches, automated patching, etc.)
    2. If not, is there a downgrade process and how is that communicated?

Scope of Scheme

There are many variables in such a scheme. While the device is one, the moving parts (metaphorically) vary, depending on the complexity of the device. Hence, it is necessary to see what is being certified.

Questions

  1. Is the hardware itself certified?
  2. Are the supporting cables and peripherals certified?
  3. What are the determining factors for a peripheral or supporting item, such as a cable, to require certification?
  4. Is the software/firmware certified?
    1. Is the certification based on version?
    2. Is the certification based on family of hardware support?
    3. A user compiles his/her own firmware (e.g. Tomato) for his/her router. Is that firmware certified? If so, how? By codebase? By version? This also leads back to the Responsibility questions.
  5. Some devices have embedded microcode. Does the scheme cover microcode? SoC? Chip-level instructions?
  6. How does the star scoring work? Equal weightage for all components, or in parts? Weighted? Percentage? Traffic lights style?
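
To make the scoring question concrete, here is a minimal sketch of how the choice of aggregation alone changes the final rating. The component names, star values and weights below are all hypothetical; I do not know how the scheme actually computes its stars.

```python
# Hypothetical per-component ratings (1 to 4 stars); not taken from the scheme.
ratings = {"hardware": 4, "firmware": 2, "cables": 4, "microcode": 3}

# Hypothetical weights, assumed to sum to 1.0.
weights = {"hardware": 0.3, "firmware": 0.4, "cables": 0.1, "microcode": 0.2}


def equal_weight(r):
    """Plain average: every component counts the same."""
    return sum(r.values()) / len(r)


def weighted(r, w):
    """Weighted average: some components matter more than others."""
    return sum(r[name] * w[name] for name in r)


def weakest_link(r):
    """The device is only as good as its worst component."""
    return min(r.values())


print("equal weight:", equal_weight(ratings))
print("weighted:    ", round(weighted(ratings, weights), 2))
print("weakest link:", weakest_link(ratings))
```

The same device lands anywhere between 2 and just over 3 stars depending on the method, which is why the scoring model needs to be spelled out.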

Let’s look at more complex aspects of this venture. While some of this may sound hypothetical, history has shown that anything and everything will eventually take place.

Vulnerabilities

Anything that has a piece of code may also have an error or bug. It can be at the compiler level, code level, or even binary level. The security of the device is ultimately defined by how well the code executes and resists attacks. Devices often contain 3 types of code: (i) specific code (usually due to hardware), (ii) common code (free or open source) and (iii) proprietary code. The term bug is used loosely in this article to refer to a vulnerability or software defect.

  1. Hardware bugs have been quite common, with the recent Intel CPU hardware vulnerabilities as an example. Does the scheme take hardware-based vulnerability checks into account? This will have an impact even on laptops, as most Intel CPUs still have some vulnerability to these types of attacks.
  2. Hardware bugs are often resolved with updated drivers (though drivers themselves pose security risks). Are these components part of the scheme? Drivers and even hardware/firmware are often guarded by strong agreements.
  3. I am a gateway developer who purchases a hardware solution and develops an application on top of the platform. I do not have access to hardware-layer information, but I have the application code. Can I get a star rating under the scheme?
  4. Open source code is a common sight in any commercial product. Is this code tested as part of the product, or evaluated separately?
  5. What is the impact when a common open source component (e.g. OpenSSL) has a vulnerability? Does this warrant a review of the component on its own? If so, will the scheme cover all open source components?

Pricing

The time and effort that go into producing any product will shape its pricing strategy.

  1. Will higher-rated devices be priced higher?
  2. Will pricing also cause product isolation (i.e. cheaper devices will not run secure firmware)?

Consumer View – Simple vs Complex

At the end of the day, it’s the consumer who will take the simplified view: “I will only go for a 4-star rating because I want to be secure.” Herein lies the problem.

Ah Chong (a local food court hawker) walks into a shop wanting to buy a gateway. He looks at this set of components, which his friend advised him meets the scheme’s 4-star requirements.

  • CPU – MCX1234 – 4 star
  • Hardware – Cap Ayam revision 2.2.5 – 4 star
  • Cables – Xiao Kia power cord with Xiao Kia Ethernet cable – 4 star
  • Firmware – Potato Router version 1.2.99 – 4 star

Ah Chong checks every single component to make sure it’s 4 star before buying, by going to the scheme’s website and looking up each item one by one.

Is this simple? Or complicated?

2 days later, Potato Router firmware, which uses OpenSSL 0.9.8h, is found to have a Remote Code Execution vulnerability, and there is active exploitation (for this scenario). The router has a firmware update page, but it requires Ah Chong to log in and download the latest firmware.

Is this simple? Or complicated?

Ah Chong buys the router 4 days after the vulnerability was announced. Being non-technical, he has no clue about the matter and is not aware of the latest tech/cyber security developments. The packaging shows a 4-star validation, which was accurate at the point of manufacturing.

Is Ah Chong deceived? Is he still confident of his purchase decision?
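
Ah Chong’s problem can be sketched as a simple date comparison: a printed label is frozen at the date it was issued, while advisories keep arriving. The dates and the function name below are hypothetical and only illustrate the gap; nothing here reflects how the scheme actually tracks advisories.

```python
from datetime import date


def label_is_stale(label_issued, advisory_dates, checked_on):
    """Return True if any advisory was published after the label was issued
    and on or before the day the buyer checks the label."""
    return any(label_issued < published <= checked_on for published in advisory_dates)


# Hypothetical dates for the Ah Chong scenario.
label_issued = date(2020, 10, 1)     # rating printed on the box at manufacturing
advisory = date(2020, 10, 10)        # vulnerability announced
purchase = date(2020, 10, 14)        # Ah Chong buys 4 days later

print(label_is_stale(label_issued, [advisory], purchase))
```

A label on the box cannot answer this check on its own; only something looked up at the point of sale can.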

Future of the Scheme

For all intents and purposes, these schemes are beneficial and bring good to society. They help raise the bar for social and economic improvement through cyber security enhancements as a whole. However, future considerations need to be taken into account.

  1. Today the scheme is voluntary. Once the scheme proves itself, it’s only natural for governments to make it mandatory. Will it stay voluntary?
  2. Once it becomes mandatory, it’s easy to include requirements such as TLS MITM, traffic snooping, etc. for these devices (especially gateways). Will this be an eventuality in the name of cyber security?

Moving Forward

There is no doubt that this initiative is necessary. Consumer devices have long suffered from insufficient security controls. However, the issue isn’t that simple. I fear that CSA Singapore may have oversimplified the matter. Oversimplification is a problem in cyber security and will, in due course, become a problem in itself. I understand the parallel drawn between the water issues of the 1800s and cyber security, but the issue and its complexity are far from the same. This is indeed an interesting development, and I am curious to know how CSA intends to address the concerns raised above.

P/S: Fellow journalists who happen to use this material, do give me some credit, thanks!

References

  1. The Register – https://www.theregister.com/2020/10/01/singapore_infosec_strategy/
  2. Mirai Botnet – https://www.csoonline.com/article/3258748/the-mirai-botnet-explained-how-teen-scammers-and-cctv-cameras-almost-brought-down-the-internet.html
  3. Spectre and Meltdown – https://meltdownattack.com/

Privacy – White elephant in the room with COVID-19?

If there’s one thing life has taught me, it’s that “almost” everything has a price. For a good sum, you can get a person to sell his phone. For others, something else. It’s a known fact that we live in a world of data.

Everything we do today generates data. Every step you take, every move you make (no, it’s not a song), every interaction. Our lives have become a digital data lake, filled with details of what happens. Data can take many forms. Audio – conversations. Video – CCTV footage. Logs of transactions, usage, and patterns formed based on behavior.

In 2018, Strava, a company that produces a fitness tracking solution, inadvertently revealed secret military bases through its user heatmap. The visualization component of the app showed heat maps of user clustering, which indicated secret military presence. This wasn’t an outcome Strava had foreseen, but such unintended exposure has undoubtedly become prevalent.

I have a saying about data. “Once you create data/information, you are forever doomed to tend to it till it ceases to exist”. Something like Sisyphus, who was condemned to roll the stone up the mountain, only to find it back at the bottom the very next day.

If you attend a conference and visit the booths just for a “look-see”, you will most often find a simple glass bowl. In it, a stack of name cards. Name cards are a wealth of personal information. One could argue that it’s corporate information, which is apt. However, you’d also find mobile phone numbers. Unless those are company issued (remember the good ol’ days of Blackberry?), you’ve just handed over your personal mobile phone number to (not one, but) countless individuals who will have access to that information. Ever wonder how a completely unknown salesperson calls you up about similar products… *cricket sound*

Ironically, all that for a “possibility” of winning <insert the latest gadget name> or a booth token/premium. I remember the time we had a week-long debate, together with the then Commissioner of PDPC, about whether personal information in the form of name cards counts as a business or a personal matter, during the implementation of Malaysia’s PDPA for a telco.

With the MCO, contact tracing became a “new normal” (see, I can also do buzzwords). Contact tracing is when the outlet you visit requires you to record your details, such as name, phone number and temperature. It’s implemented quite simply, using a piece of paper or a book in which the visitor jots down his/her details. Hypothetically, if you see a person of interest walking up to the same outlet, all you have to do is glance over and note down the number written in the contact log. There are 2 schools of thought on this matter. First, the contact details given are, in some instances, bogus, precisely to prevent such a situation, which sadly defeats the purpose. Second, it’s a requirement, hence the burden of protecting such information belongs to the establishment collecting it… *cricket sound*

Point to note, what happens after MCO? Is the log book going to end up in a dump somewhere with all of the contact details?

What about contact tracing apps? I’d like to cite the example of the AarogyaSetu app from India. When it was initially launched, its creators were barraged with queries about privacy and surveillance, which eventually led to the app being open sourced. Closer inspection of the published code revealed that a few critical parts were missing, and that the app retained logs of other devices it had come into contact with (a database inside the app stores all of the Bluetooth addresses). The internet community celebrated its victory in compelling the authors to publish the code on GitHub.

There are a few reasons why publishing the code of such applications matters. It allows the collective hive of the internet to find potential bugs or issues, so that the app can be improved and become safer over time. Transparency of code helps to allay fears of surveillance and privacy concerns. There is research on privacy-preserving schemes (such as DP3T) which can be used to ensure that the app captures only relevant information. At a time of rising concern about police states, as seen in the Black Lives Matter movement (re: George Floyd) in the US and worldwide, taking such steps shows the commitment of nation states to their rakyat (meaning citizens in Malay). Finally, data leakages have been seen to happen due to poor security on the backend (such as data buckets exposed on the internet).

There is no doubt that the new normal has everyone adjusting to doing things differently. But that doesn’t mean privacy needs to take a back seat. Things can be done in a proper manner; it just needs some serious thought. The age of smartphones has made it much easier for anyone and everyone to do contact tracing, but it takes serious forethought for it to be effective.

In Malaysia, we have a number of mobile apps. The state government of Selangor published the “SeLangkah” app for simple contact tracing. The Malaysian central government introduced MySejahtera and MyTrace for COVID-19 tracking. MySejahtera has been adopted as part of the wider strategy, while SeLangkah seems to be most retailers’ choice.

With these apps in place in Malaysia, there are questions left hanging:

  1. What are the security considerations and controls put in place to ensure that the mobile application is secure?
  2. Will the codes be published (quoting the Minister of MOSTI who made the statement on 10 May 2020) ?
  3. Where is the data captured by these apps stored? Is that storage secure? Who has access to the data? What type of data is secured?
  4. How is the security of the backend servers and services of these mobile applications?
  5. Has the mobile app undergone necessary security validation (i.e. vulnerability assessment/penetration testing/code audits)?
  6. What happens after the Movement Control Order (MCO) is lifted? What happens to the application and the data being captured? Who is responsible for ensuring that the data is not kept beyond its use and is disposed of securely?

References

1. Strava fitness band gives up military presence – https://www.theguardian.com/world/2018/jan/28/fitness-tracking-app-gives-away-location-of-secret-us-army-bases

2. Myth of Sisyphus – https://en.m.wikipedia.org/wiki/The_Myth_of_Sisyphus

3. AarogyaSetu Android app Github page – https://github.com/nic-delhi/AarogyaSetu_Android

4. DP3T – Decentralised Privacy-Preserving Contact Tracing – https://github.com/DP-3T/documents

5. Minister allays privacy fears in contact tracing – https://www.thestar.com.my/news/nation/2020/05/10/khairy-allays-privacy-concerns-over-contact-tracing-app

 

Hackers for Hire – The case of Dark Basin

Mad kudos to Toronto-based Citizen Lab for this excellent work!

Citizen Lab just published (about 13 hours ago) an exposé of an Indian company, dubbed ‘Dark Basin’, which is responsible for hacking thousands of individuals across six continents. The victim list isn’t just random Joes, but includes public figures, the rich and affluent, and NGOs including the Electronic Frontier Foundation (EFF).

I wasn’t really surprised when an Indian company was named for this. India is known as a tech factory, producing software development and technology talent that has gone all over the world. And this includes the dark side of technology.

Not too long ago, I was involved in the forensic investigation of a high-level intrusion affecting the Board of Directors and senior management of a European telecommunications provider. Working closely with law enforcement agencies, we were able to trace the perpetrators and eventually found that they were from India and had a vast infrastructure for such clandestine operations. The Norman Hangover report was published, detailing the bits and bytes of the attack.

Back to the recent exposé: the company used a variety of methods to target its victims. The primary mode of attack is phishing. Their success rates were high, simply because they were extremely persistent. They would gather intelligence on their targets and attempt multiple times from different angles until the targets fell prey to the attack. In the background, a server is set up to masquerade as valid login pages, such as Google or Facebook. Once victims enter their password, their credentials are exposed to the attackers and used for whatever other purposes are deemed fit. In some attacks, the attackers were seen using these illegally obtained credentials to send phishing emails to other related entities, making them fall prey to the attack as well.

This series of attacks is attributed with high confidence to Belltrox InfoTech Services. The attribution is based on a few factors. The domain previously used by Belltrox, belltrox.org, was registered using a Yahoo email address that was also used to register other phishing sites. Eventually this email address was changed. The hours at which the phishing emails were sent correspond to IST (GMT+05:30). References to Indian festivals were made on their URL shortener (powered by phurl). Incidentally, the same URL shortener was used by the attackers to link back to their CVs. The founder of the company, Sumit Gupta (named as Sumit Vishnoi in DOJ documents), was previously indicted in a hacking-for-hire scheme. In short, they were identified due to a severe lack of opsec (I think the staff didn’t know that they were supposed to keep it hush-hush).

LinkedIn provides a wealth of information about Belltrox and its circle. Based on the recommendations received by Belltrox and its staff, it’s clear that Belltrox has been working with private investigators and government agencies. This includes Canadian government officials, local and state law enforcement agencies, and former intelligence agency staff who have most likely gone private.

Victimology indicates a large pool of diverse targets, which shows that the nature of the business is not specific, but demand driven. Targets include NGOs that go after large corporations, such as those behind the #Exxonknew campaign. Interesting targets to note include friends and family members of those involved in the campaign, as well as their legal counsel.

At this point, it is certain that Belltrox is the source of the phishing campaigns. Who hired them remains unknown. Sumit Gupta, when contacted, denied any wrongdoing and stated that his firm assists its clients in retrieving emails for private investigators based on the credentials provided. (Yup… eyes rolling here.)

Belltrox’s targets span a wide range of industries and groups. This includes short sellers, hedge funds, financial journalists, global financial services, legal services, targets in Eastern and Central Europe and Russia, government agencies, and even individuals involved in private disputes.

Tactics, Techniques and Procedures – aka Tradecraft

The key modus operandi of Belltrox is phishing. They deploy a number of phishing kits (which they even leave open/available). To power these phishing kits, a URL shortener is used. The URL shortener is based on a package called phurl, which creates sequentially numbered short URLs, making it easy for the good guys ™ to enumerate them and identify the actual long URLs. Through this, the list of domains used for phishing was identified.
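
To illustrate why sequential numbering is such an opsec gift, here is a minimal sketch. The host name and the plain decimal ID format are assumptions for illustration only, not the attackers’ real infrastructure or phurl’s exact scheme. Knowing one short URL is enough to list its neighbours, which a researcher could then resolve one by one.

```python
from itertools import islice


def sequential_short_urls(base_url, start_id=1):
    """Yield short URLs for consecutive numeric IDs, starting at start_id."""
    n = start_id
    while True:
        yield f"{base_url}/{n}"
        n += 1


# Hypothetical shortener host; one observed link (ID 100) reveals the rest.
candidates = list(islice(sequential_short_urls("https://short.example", 100), 5))
for url in candidates:
    print(url)
```

A shortener that issues long random tokens instead of a counter gives away no such list.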

While phishing isn’t really new, this revelation strengthens the idea that phishing is still very much relevant and effective. Fake login pages for commonly used services such as Google and Facebook are hosted, creating the opportunity for the attacker to capture credentials.

Hacking-as-a-Service (HaaS) – Global issue

HaaS is becoming a global thorn in the cyber realm. The emergence of such players, including those in reports on Dark Matter, highlights a lucrative market for such services; while the services remain clandestine, demand for them continues to thrive. Legal frameworks to handle and dismantle such services are still developing.

HaaS also presents an issue for attack attribution. In this case, Belltrox was named as the attacker, but the actual puppet master remains hidden. The same can apply to nation-state-sponsored attacks, where the sponsor completely washes its hands while engaging a contractor to do the dirty work.

Protecting Yourself

These attacks highlight a key need for 2-factor authentication (2FA). It is worth noting that any security control put in place makes it harder, but not impossible, for attackers to get through. The Dark Basin attacks run on the premise that the victims did not secure their Google accounts with 2FA, making it easy for the attacker to use the ill-gotten credentials to gain access.
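
For readers curious about what a second factor adds, below is a minimal sketch of the time-based one-time password (TOTP) algorithm from RFC 6238, built on HOTP from RFC 4226. This is only an illustration of one common 2FA method, not a description of how Google implements it. The secret is the published RFC test secret and must never be used for a real account.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from the clock."""
    return hotp(secret, unix_time // step, digits)


secret = b"12345678901234567890"  # RFC test secret, for illustration only
print(totp(secret, 59))  # code valid for the 30-second window containing t=59
```

Because the code changes every 30 seconds and depends on a secret the phishing page never sees, a stolen password alone is not enough. A phishing page that relays the code in real time can still get through, which is the “harder, but not impossible” part.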

References

  1. Norman Hangover Report – https://paper.seebug.org/papers/APT/APT_CyberCriminal_Campagin/2013/Norman_HangOver%20report_Executive%20Summary_042513.pdf
  2. Citizen Lab – Dark Basin – https://citizenlab.ca/2020/06/dark-basin-uncovering-a-massive-hack-for-hire-operation/
  3. EFF phishing attempts – https://www.eff.org/deeplinks/2017/09/phish-future

 

Sg Buloh hospital – Jan 2020 case study

Introduction

The Malay Mail reported that Sungai Buloh hospital (SBH) was recently hit with IT failures. Sg Buloh hospital is quite well known to the denizens of Klang Valley, being the government hospital of choice for many. I personally find the service very good: the doctors are friendly and professional, and I don’t spend much time waiting as the processes are quite efficient.

The news report highlights difficulties in retrieving patients’ medical and investigation reports. The problem was pinned on the hospital’s use of Windows XP as the operating system. Reports also mention that the hospital’s main servers were down for some time. It was also reported that the issue stemmed from limited storage space caused by a Windows XP limitation.

Some facts in the report, offered here for consideration, sounded odd to me, but plausible. The lack of clarity on the matter unfortunately fuels speculation (some of which is discussed in this article).

What about Windows XP?

Firstly, the use of Windows XP. Microsoft positioned Windows XP as an end-user operating system, targeted at desktops and laptops. It was never meant to be used in server environments, though I have seen small organizations turn a desktop into a file share. It can act as a server in a minuscule deployment, but surely not for a hospital the size of SBH.

Microsoft ended support for Windows XP on April 8, 2014. That means that from the EOL (end-of-life) date there is no support: if there is a bug or vulnerability in the OS, it will not be fixed. This also implies that the software ecosystem, such as anti-virus, endpoint protection and other critical software, will stop being made available, as the effort to maintain that software shifts to supported operating systems. So not only do you have an OS that doesn’t receive any updates, you also lose the updates for other software that runs on it. Essentially, a ticking time bomb.

One of the issues faced by SBH was the server failure. This could potentially be attributed to the use of Windows XP as the server OS (god forbid, but based on experience it can happen), or to a general failure of the server, be it at the OS, application or hardware level (it may even be the network; the server might be running on a 10Mbps network card supporting the whole hospital). Again, without further details, one can only speculate on this matter.

Another interesting point is that the report mentions a storage limitation in Windows XP. A quick check shows that the first limitation is at the memory level, due to the 32-bit architecture, although Microsoft also released a 64-bit version of Windows XP. There are also limits at the file system level. Windows XP supports NTFS as well as FAT/FAT32 (File Allocation Table), but the format tool supplied with XP does not allow the creation of FAT32 volumes larger than 32GB. (Primer: FAT/FAT32 keeps an index of where files are located on the disk, which lets the OS find each file.)
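
The limits in play here are mostly powers of two, so a quick back-of-the-envelope sketch helps. The figures below are general facts about 32-bit addressing and FAT32 as I understand them, not details confirmed for the SBH environment, and the variable names are mine.

```python
GIB = 2 ** 30  # bytes in one gibibyte

# A 32-bit pointer can address at most 2^32 bytes.
address_space_32bit = 2 ** 32
print("32-bit address space:", address_space_32bit // GIB, "GiB")

# FAT32 stores file sizes in a 32-bit field, so a single file tops out
# one byte short of 4 GiB.
fat32_max_file_bytes = 2 ** 32 - 1
print("FAT32 max file size: ", fat32_max_file_bytes, "bytes")

# The XP format tool's cap on new FAT32 volumes (a tool limit, not a
# limit of the file system itself).
xp_format_fat32_cap_gb = 32
print("XP FAT32 format cap: ", xp_format_fat32_cap_gb, "GB")
```

None of these numbers is large by hospital standards, which makes the “storage limitation” explanation plausible only if the system was set up on FAT32 or was hitting 32-bit memory limits.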

Software Obsolescence

Software obsolescence isn’t new, but it’s worth revisiting to understand how it can contribute to this situation.

When a software company declares that a product is at End-Of-Life, it’s a statement to inform customers that the company will no longer support that product, and that customers should move to newer software. While at face value this looks pretty simple, the impact is far reaching.

Hardware Compatibility

Firstly, upgrading an OS requires the hardware to be compatible. This means that if my father has a PC that he uses at home, he first needs to check whether the PC can be upgraded. At times, the change in the OS can be drastic, and the PC may not be compatible for several reasons. For example, he may need to move to new PC hardware because the processor and its architecture are also obsolete. A new OS may support only 64-bit architecture, with no backwards compatibility for 32-bit hardware. Windows XP supported both 32-bit and 64-bit architectures, paving the way for the move from 32-bit to 64-bit hardware, which offers better scalability and flexibility. While this illustrates the link between OS and hardware, subtle changes in software can have a similar effect. Apple introduced macOS Catalina, which dropped support for 32-bit applications, making it a pure 64-bit operating system. Many users reported issues with application availability, and some ended up reverting to macOS Mojave.

Secondly, beyond the hardware itself, another point to consider is the hardware’s compatibility with the OS from the point of view of drivers. Drivers are software that allows the OS to “talk” to the hardware. Without drivers, the hardware is useless. I remember the time when we had a spectrum analyzer in one of my previous roles, and the spectrum analyzer only worked on Windows 3.11 for Workgroups due to limited driver support. We couldn’t move it to the latest OS as the manufacturer had stopped support, and their recommendation was to spend an equal amount of money on a new set of equipment with the latest software and drivers. The organization, of course, chose the path of least expenditure and isolated the PC for that single use, making it a standalone, independent analyzer. I still remember pulling data off the analyzer using floppy disks. The same can be said of medical equipment. Imagine an MRI machine that was built with Windows XP as its operating system and cannot be migrated because hardware support no longer exists. No “sane” hospital would buy a new MRI machine just because the OS is outdated. As some IT experts will tell you – “If it ain’t broken, don’t fix it…”. The whole security industry was built around managing security risks, mind you.

Thirdly, organizations do not actively manage obsolescence, because obsolescence is an expensive affair. A cost-conscious organization will do its best to “sweat its assets” and make its investments live past zero net book value. Obsolescence management creates additional cost for the business, and we potentially saw this play out in the MAHB incident. There is a cost in replacing equipment and in migrating data, as well as the time and resources required to carry out the project, not to mention training everyone involved so that they know how to use the new system (which may come with a new UI/UX and workflow). It’s a very daunting affair, and one that has far-reaching effects throughout the organization.

Software Dependencies

We depend on a number of software packages for our systems to work. Having a computer and the OS alone doesn’t do much; a business runs on its Line-Of-Business applications, for example ERP (Enterprise Resource Planning), CRM (Customer Relationship Management) and many more. When the OS becomes deprecated, software developers also stop developing functionality for the now-defunct OS and move their codebase to a new platform. Depending on the extent of the change, this may render the old code useless. Some platforms provide a degree of cross-version support, but that is usually limited and tends to favor newer platforms. Remember that the same support required by the user is also required by software developers to build their code on. There are some cross-platform frameworks available, but even these may stop supporting older, deprecated OSes.

The OS also comes with an SDK (Software Development Kit), crucial for software development teams to harness the power of the OS. Just as the OS gets deprecated, so will the platform SDKs.

Moving Forward

I have discussed technology debt in greater detail before, and it seems to be a recurring theme in large organizations in Malaysia. Obsolescence is a huge debt which most organizations overlook, and it eventually comes back to haunt them. It’s not a discussion that any CEO/CFO would ever like to have, especially when the cost balloons and creates a huge dent in the balance sheet. The same can be seen in government departments, where stretching taxpayers’ money can be seen as the prudent outcome of a well-oiled administration.

Non-technology organizations fail to grapple with the complexities of technology. It took us long enough to start trusting automation and computing, and now this becomes another headache to manage. Some organizations even opt not to capitalize on the computing/internet era, effectively creating a barrier to efficiency and economies of scale (when it comes to data/records management). Giving IT focus and attention helps to alleviate such issues.

Most organizations establish an IT Steering Committee, comprising the senior leadership team, to address such risks. Strategic discussions on project prioritization, maximizing annual budgets and technology risks become a staple, periodic review of IT and its associated risks/debts.

If the business has its back to the wall, with no option but to run on the existing infrastructure, there are some mitigations that can be applied. Ideally these systems should be run in isolation (similar to the spectrum analyzer case I mentioned earlier). If a system has to be networked due to the nature of the application, thorough backup and restoration procedures need to be established, run and periodically tested. I’ve seen organizations that pride themselves on doing backups, but had never tested their backups, nor even tried to restore one (there’s a reason why those systems are called backup software and not restore software). Having a recovery plan helps, but the plan is only as good as its periodic testing. Worst-case scenarios need to be tested, including break-glass procedures and how the business can run in the event that complete failure is imminent. If all else fails, start saving for a new system, and maybe look at cloud as an alternative?
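
As a small illustration of what “testing a restore” can mean in practice, the sketch below compares checksums of the live files against a restored copy. The folder and file names are made up, and a real restore test would cover far more (databases, permissions, application start-up); this only shows the basic idea.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def verify_restore(source: Path, restored: Path) -> list:
    """Return the files that are missing or different in the restored copy."""
    want, got = manifest(source), manifest(restored)
    return sorted(name for name in want if got.get(name) != want[name])


# Simulate a restore that silently truncated one file.
with tempfile.TemporaryDirectory() as tmp:
    live, restored = Path(tmp, "live"), Path(tmp, "restored")
    live.mkdir()
    restored.mkdir()
    (live / "records.db").write_bytes(b"patient data")
    (restored / "records.db").write_bytes(b"patient dat")
    problems = verify_restore(live, restored)

print(problems)
```

An empty list means the restored copy matches; anything else is a backup you would rather find out about during a drill than during an outage.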

References

  1. Malay Mail (22 Jan 2020)- https://www.malaymail.com/news/malaysia/2020/01/22/report-sungai-buloh-hospital-hit-by-it-breakdown/1830364
  2. Microsoft XP EOL statement – https://www.microsoft.com/en-us/microsoft-365/windows/end-of-windows-xp-support
  3. MAHB Case Study – https://www.drsuresh.net/2019/09/mahb-case-study-aug2019/
  4. Technology Debt – https://www.drsuresh.net/2019/08/cyber-tech-debt/

Malaysian Airport Incident – A case study

Last updated: 4 September 2019

Acknowledgement

The information provided in this post was through crowdsourcing, thanks to the IT Security SIG set up by Nigel Rodrigues, contributed by many, with candid discussion which inspired me to write this article.

As this incident is still developing, this article will be updated with the latest information, and what you see here is a snapshot at this point in time.

The incident

On 21 August 2019, KLIA/KLIA2 airports began to experience system and technical difficulties. The failure affected check-in counters, flight information display systems (FIDS), baggage handling, the airport mobile app, as well as payment systems which rely on the airport’s networks. Understandably, tempers and dissatisfaction among airport users ran high.

Timeline

20 August 2019 – MAHB signed an MOU with Huawei on technology modernization.

21 August 2019 – KLIA/KLIA2 reported system/technical issues affecting multiple systems in the airport. Initial news indicated a failure of network equipment.

22 August 2019 – The Star reported that MAHB had said the situation would be resolved by 23 August 2019, as it had received new equipment to replace the existing units, with testing to be conducted that same night.

23 August 2019 – MAHB updated their website (as at 6am) explaining that they are in the midst of stabilizing their system and had deployed additional buses to ferry the passengers to their respective terminals.

24 August 2019 – The Malay Mail reported that the situation had improved, with passenger flow reported as smooth apart from intermittent disruptions.

24 August 2019 – NACSA issued a statement affirming that the network issue at KLIA/KLIA2 was not the result of a cyber attack.

25 August 2019 – The Malay Mail reported that KLIA/KLIA2 operations had been restored to normal, based on a check by BERNAMA at 0930.

26 August 2019 – The Ministry of Transport announced a panel to investigate the system failure of TAMS (Total Airport Management System). NACSA is one of the members of the committee.

26 August 2019 – MAHB was quoted in a statement as saying that it was not dismissing the possibility that malicious intent may have caused the incident.

26 August 2019 – Airport passengers stated that it was not a full service recovery: the information system was still down and the airports were operating at partial system availability.

27 August 2019 – Airlines seek compensation from MAHB over the airport system downtime.

27 August 2019 – MAHB lodges a police report over possible malicious intent being the cause of the downtime.

28 August 2019 – PM orders a probe into the airport downtime incident.

29 August 2019 – PDRM was said to be probing four individuals in relation to the airport system failure, based on the report made by the IT division senior general manager.

30 August 2019 – AirAsia, a Malaysian carrier, reportedly confirmed that it would not sue MAHB over the recent airport system failure.

2 September 2019 – Police were said to have recorded statements from 12 MAHB staff over the system failure incident.

3 September 2019 – Four pioneer MAHB IT officers lodged counter police reports against MAHB. They had been suspended, and claimed they were falsely accused.

The cause

The details are vague; however, the incident was attributed to a faulty IP network switch which caused the IP network traffic to grind to a halt. The switch in question seems to be the core switch, which processes all the network traffic for the airport.

A core switch is usually responsible for traffic between segments, and also acts as an aggregation point. Each area is connected via a smaller switch, which points to an intermediate or aggregation switch, which in turn leads to the core switch. In this case, since the core switch was down, the segments were disconnected from one another, even though each machine still showed its network connection as connected. Access to upstream networks, such as the Internet, which is used for the credit card payment gateway, was also interrupted, as the traffic stops at the core when it is not working.

Social Media Buzz

It was noted that a user claiming to be a subcontractor to MAHB said that the network switch was 17 years old and had never been replaced. This is unconfirmed, pending an official statement from MAHB.

A report from Utusan Malaysia also mentioned something similar; an excerpt is included here.

Related news

Just one day before the incident, on 20 August 2019, MAHB signed an MOU with Huawei “to drive MAHB’s digital transformation framework by enhancing connectivity and real-time information by connecting all stakeholders in one fully integrated digital ecosystem. The collaboration would also seek to set up a fully integrated network communication managed platform to manage above technology and integrated data to enable future big data analysis throughout the entire airport, further improving airport operation efficiency and reduce overall ICT cost.”

It’s probably sheer coincidence that the network equipment failed the very next day, seemingly catapulting the priority of this initiative.

Assessment

At this point, the lack of official news seems to have led to much speculation. The first theory was that the airport was under a cyber attack. This was quickly quashed by NACSA, which confirmed that there were no attacks.

Another discussion led to the belief that there should have been sufficient DR (Disaster Recovery) infrastructure to ensure business runs as usual. Assuming the social media news was right, most networks designed at that time would have had a typical star topology, whereby layer one connectivity would cascade back to a single core switch. Using Cisco as an example, the spine and leaf architecture would have allowed the network traffic to be redirected to a different core, had that been the architecture. Spine and leaf is still a new concept, and there may be other designs which any organization can adopt.
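The difference between the two designs can be illustrated with a quick single-point-of-failure check in Python. This is only a sketch under assumptions: both topologies below are hypothetical toy examples, and the check looks purely at whether the remaining devices stay connected. It ignores routing protocols, link capacity and everything else a real design has to consider.

```python
from collections import deque


def connected(nodes, links):
    """True if every node in `nodes` can reach every other over `links`."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return True
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in adjacency[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes


def single_points_of_failure(links):
    """Devices whose loss, on its own, splits the rest of the network."""
    nodes = {n for link in links for n in link}
    return sorted(n for n in nodes if not connected(nodes - {n}, links))


# Hypothetical star: every access switch hangs off one core.
STAR = [("core", "access-1"), ("core", "access-2"), ("core", "access-3")]

# Hypothetical spine and leaf: every leaf connects to every spine.
SPINE_LEAF = [
    (spine, leaf)
    for spine in ("spine-1", "spine-2")
    for leaf in ("leaf-1", "leaf-2", "leaf-3")
]

if __name__ == "__main__":
    print(single_points_of_failure(STAR))        # ['core']
    print(single_points_of_failure(SPINE_LEAF))  # []
```

In the star, losing the core isolates every access switch; in the spine and leaf layout, any one device can fail and the rest can still reach each other through the surviving spine.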

The Good

MAHB had been mobilizing its own staff, recruiting them and promoting initiatives to get them to assist passengers during these trying times. A poster dated 22 August 2019 was seen circulating on social media, asking staff to assist with the situation at KUL during peak hours (12 – 2pm & 4 – 10pm).

MAHB exhibited a strong understanding of the airport processes, being able to fall back on manual processes and sheer manpower to handle the airport’s operations while the system was down.

Flipside

Assuming the theory about the 17-year-old network equipment is true, there can be 2 possible outcomes. The first: an overzealous CIO might end up saying “We should sweat our assets more, make sure you don’t buy anything new for the next 15 years! (BTW are we using the same brand as the airport?)”. Scary, to say the least! It is worthwhile to remember that computer/network hardware is susceptible to degradation over time, right down to the copper network cabling, hence some data centers make it a point to “re-cable” their infrastructure periodically! Other views include “we’re not an airport, we won’t need to worry about it”.

The second outcome is that investment in IT now becomes justifiable, as part of technology refresh. A more prudent approach to the technology life cycle emerges, and the MAHB story becomes a talking point at the Board level, raising the question of whether the assets in use are still (1) maintained, with the necessary support, and (2) not yet past End-of-Life/End-of-Support. This is in line with managing tech debt, ensuring that such compounding interest doesn’t suddenly pop up!
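The two Board-level questions above lend themselves to a simple recurring check against the asset register. The Python sketch below is illustrative only: the `Asset` fields, the one-year warning window and the example dates are all assumptions, not data from MAHB or any vendor.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Asset:
    name: str
    support_contract_ends: Optional[date]  # None means no support contract at all
    end_of_support: date                   # vendor's announced End-of-Support date


def lifecycle_flags(asset: Asset, today: date, warn_days: int = 365) -> list[str]:
    """Return the lifecycle concerns worth raising for one asset."""
    flags = []
    # Question (1): is the asset still maintained, with support in place?
    if asset.support_contract_ends is None or asset.support_contract_ends < today:
        flags.append("no active support")
    # Question (2): is the asset past, or close to, End-of-Support?
    days_left = (asset.end_of_support - today).days
    if days_left < 0:
        flags.append("past End-of-Support")
    elif days_left <= warn_days:
        flags.append("End-of-Support within warning window")
    return flags


if __name__ == "__main__":
    today = date(2019, 9, 4)
    # Hypothetical inventory entries with made-up dates.
    inventory = [
        Asset("old-core-switch", None, date(2015, 1, 1)),
        Asset("branch-router", date(2021, 1, 1), date(2020, 6, 1)),
        Asset("new-firewall", date(2025, 1, 1), date(2030, 1, 1)),
    ]
    for asset in inventory:
        print(asset.name, lifecycle_flags(asset, today) or "ok")
```

Run periodically, a report like this turns tech debt into a list the Board can see growing, rather than a surprise on the day the equipment fails.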

Lessons learnt – so far

1. Have manual processes that will stand in if something fails. Can you operate without technology?

2. Understand the implications of tech debt. It’s a matter of time before it catches up, and the organization then pays the compounding interest. Reputational damage becomes severe and takes time to recover from.

References

  1. Malay Mail – https://www.malaymail.com/news/malaysia/2019/08/23/mahb-network-failure-caused-systems-disruption-at-klia/1783638
  2. MAHB Official PR – https://www.malaysiaairports.com.my/media-centre/news/klia-network-disruption
  3. NACSA PR – https://www.nacsa.gov.my/doc/Press_Release_MAHB_KLIA_English.pdf
  4. TheStar MAHB Huawei MOU – https://www.thestar.com.my/business/business-news/2019/08/20/huawei-malaysia-to-support-mahb039s-digital-transformation
  5. Cisco Spine & Leaf Architecture – https://www.cisco.com/c/en/us/products/collateral/switches/nexus-7000-series-switches/white-paper-c11-737022.html
  6. Copper degradation – https://www.quora.com/Does-a-signal-sent-over-a-cable-network-degrade-over-time
  7. Potential malicious intent – https://www.thestar.com.my/news/nation/2019/08/26/mahb-not-ruling-out-malicious-intent-behind-klia-glitch
  8. The Star (22 Aug 2019)  – https://www.thestar.com.my/news/nation/2019/08/22/mahb-expects-klia-glitch-to-be-resolved-by-friday-morning-aug-23
  9. MAHB update (23 Aug 2019) – http://www.malaysiaairports.com.my/media-centre/news/latest-update-systems-disruption-klia-0
  10. The Malay Mail – Day 3 – https://www.malaymail.com/news/malaysia/2019/08/24/klia-systems-still-crippled-but-operations-improving-on-third-day-video/1783804
  11. The Malay Mail – Day 4 – https://www.malaymail.com/news/malaysia/2019/08/25/klia-operations-back-to-normal-after-system-outage/1784006
  12. The Star – https://www.thestar.com.my/business/business-news/2019/08/27/airlines-to-seek-mahb-compensation-for-delays-losses
  13. https://www.nst.com.my/news/nation/2019/08/516444/mahb-lodges-police-report-klia-systems-disruption
  14. https://www.thestar.com.my/news/nation/2019/08/28/pm-wants-probe-into-klia-systems-malfunction
  15. https://www.malaymail.com/news/malaysia/2019/08/29/report-police-to-probe-four-over-klia-systems-disruption/1785325
  16. https://www.nst.com.my/business/2019/08/517398/airasia-wont-sue-mahb-system-glitches-klia-and-klia2
  17. https://www.malaymail.com/news/malaysia/2019/09/02/klia-systems-disruption-police-record-statements-from-12-mahb-staff/1786552
  18. https://www.nst.com.my/news/crime-courts/2019/09/518461/4-mahb-it-officers-lodge-police-reports-against-their-employer-over