Sg Buloh hospital – Jan 2020 case study


The Malay Mail reported that Sungai Buloh Hospital (SBH) was recently hit by IT failures. SBH is well known to the denizens of the Klang Valley, being the government hospital of choice for many. I personally find the service very good: the doctors are friendly and professional, and I don’t spend much time waiting as the processes are quite efficient.

The news report highlights difficulties in retrieving patients’ medical and investigation reports. The problem was pinned on the hospital’s use of Windows XP as the operating system. Reports also mention that the hospital’s main servers were down for some time, and that the issue stemmed partly from limited storage space caused by a Windows XP limitation.

Some of the facts in the report sounded odd to me, though plausible. The lack of clarity on the matter unfortunately fuels speculation, some of which is discussed in this article.

What about Windows XP?

Firstly, the use of Windows XP. Microsoft positioned Windows XP as an end-user operating system, meant to be installed on desktops and laptops. It was never meant for server environments, though I have seen small organizations turn a desktop into a file share. It can act as a server in a minuscule deployment, but surely not for a hospital the size of SBH.

Microsoft ended support for Windows XP on April 8, 2014. From that EOL (end-of-life) date there is no support: if there is a bug or vulnerability in the OS, it will not be fixed. The surrounding software ecosystem, such as anti-virus, endpoint protection and other critical software, also stops being made available, as vendors focus their maintenance effort on supported operating systems. So not only do you have an OS that gets no updates, you also lose the updates for the other software that runs on it. Essentially, a ticking time bomb.

Without further information, one of the issues faced by SBH was the server failure. This could potentially be attributed to the use of Windows XP as the server OS (god forbid, but based on experience it can happen), or to a general failure at the server, be it at the OS, application or hardware level (or even the network: the server might be running on a 10 Mbps network card while supporting the whole hospital). Again, without further details, one can only speculate.

Another interesting point is that the report says Windows XP has a storage limitation. A quick check shows that the first limitation is at the memory level, and it is due to the 32-bit architecture: a 32-bit OS can address at most 4 GB of memory. It is worth noting that Microsoft also released a 64-bit version of Windows XP. There are also limits at the file system level. XP supports both NTFS and the older FAT32 (File Allocation Table) format. FAT32 cannot store a single file larger than 4 GB, and the format tool supplied with XP does not allow the creation of FAT32 partitions larger than 32 GB. (Primer: FAT/FAT32 keeps an index of where each file is located on the disk, which is how the OS finds the files.)
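As a rough check on where such ceilings come from, the arithmetic can be worked through directly. This is a minimal sketch: the constants are the commonly documented limits (a 32-bit address space, FAT32’s 32-bit file-size field, and the 32 GB cap that XP’s own format tool applies to new FAT32 volumes), and the variable names are purely illustrative.

```python
# Back-of-the-envelope limits commonly associated with 32-bit Windows XP.
GIB = 2**30  # bytes in one gibibyte

# A 32-bit pointer can address 2**32 distinct bytes.
addressable_memory_gib = 2**32 // GIB

# FAT32 records each file's size in a 32-bit field, so a single file
# tops out one byte short of 4 GiB.
fat32_max_file_bytes = 2**32 - 1

# XP's built-in format tool refuses to create FAT32 volumes above 32 GB.
xp_fat32_format_cap_gib = 32

print(f"32-bit address space: {addressable_memory_gib} GiB")
print(f"Largest FAT32 file:   {fat32_max_file_bytes / GIB:.2f} GiB")
print(f"XP FAT32 format cap:  {xp_fat32_format_cap_gib} GiB")
```

Whether any of these limits was actually the one SBH ran into is not something the news report confirms.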

Software Obsolescence

Software obsolescence isn’t new, but it’s worth revisiting to understand how it can contribute to this situation.

When a software company declares that a product is at end-of-life, it is informing customers that it will no longer support that product and that they should move to newer software. While at face value it looks pretty simple, the impact is far-reaching.

Hardware Compatibility

Firstly, upgrading an OS requires the hardware to be compatible. If my father has a PC that he uses at home, he first needs to check whether the PC can be upgraded. At times the change in the OS can be drastic, and the PC may not be compatible for several reasons. For example, he may need to move to new hardware because the processor and its architecture are also obsolete: a new OS may only support 64-bit architecture, with no backward compatibility for 32-bit. Windows XP came in both 32-bit and 64-bit editions, paving the way for the move from 32-bit to 64-bit hardware, which offers better scalability and flexibility. While this illustrates the link between OS and hardware, subtle changes in software may have a similar effect. Apple’s macOS Catalina dropped support for running 32-bit applications, making it a purely 64-bit operating system. Many users reported problems with application availability, and some ended up reverting to macOS Mojave.

Secondly, beyond the hardware itself, another point to consider is hardware compatibility with the OS from the point of view of drivers. Drivers are software that allows the OS to “talk” to the hardware. Without the drivers, the hardware is useless. I remember a spectrum analyzer in one of my previous roles that only worked on Windows 3.11 for Workgroups due to limited driver support. We couldn’t move it to the latest OS as the manufacturer had stopped support, and their recommendation was to spend an equal amount of money on a new set of equipment with the latest software and drivers. The organization, of course, chose the path of least expenditure and isolated the PC so that it served a single use, as a standalone analyzer. I still remember pulling the data off the analyzer using floppy disks. The same can be said of medical equipment. Imagine an MRI machine that was built with Windows XP as its operating system and cannot be migrated because the hardware support is no longer there. No “sane” hospital would buy a new MRI machine just because the OS is outdated. As some IT experts will tell you: “If it ain’t broke, don’t fix it.” The whole security industry was built around managing security risks, mind you.

Thirdly, organizations do not actively manage obsolescence, because obsolescence is an expensive affair. A cost-conscious organization does its best to “sweat its assets” and make its investments live past zero net book value. Obsolescence management creates additional cost to the business, and we potentially saw this play out in the MAHB incident. There is the cost of replacing equipment, the cost of migrating data, the time and resources required to carry out the project, and the training needed so that everyone involved knows how to use the new system (which may come with a new UI/UX and workflow). It’s a very daunting affair, and one with far-reaching effects throughout the organization.

Software Dependencies

We depend on a number of software packages for our systems to work. A computer and an OS alone don’t do much; a business runs on its line-of-business applications, for example ERP (Enterprise Resource Planning), CRM (Customer Relationship Management) and many more. When the OS is deprecated, software developers also stop developing for the now-defunct OS and move their codebase to a new platform. Depending on how extensive the change is, this may render the old code useless. Some platforms provide a degree of cross-version support, but it is usually limited and favors newer platforms. Remember that the support the user relies on is also required by the software developers who build their code on the platform. There are cross-platform frameworks available, but even these may stop supporting older, deprecated operating systems.

The OS also comes with an SDK (Software Development Kit), crucial for software development teams to harness the power of the OS. Just as the OS gets deprecated, so do the platform SDKs.

Moving Forward

I have discussed technology debt in greater detail before, and it seems to be a recurring theme in large organizations in Malaysia. Obsolescence is a huge debt that most organizations overlook, and it eventually comes back to haunt them. It’s not a discussion any CEO/CFO would like to have, especially when the cost balloons and creates a huge dent in the balance sheet. The same can be seen in government departments, where stretching the taxpayers’ money is seen as the prudent outcome of a well-oiled administration.

Non-technology organizations fail to grapple with the complexities of technology. It took us long enough to start trusting automation and computing, and now this becomes another headache to manage. Some organizations even opt not to capitalize on the computing/internet era, effectively creating a barrier to efficiency and economies of scale (when it comes to data/records management). Giving IT focus and attention helps to alleviate such issues.

Most organizations establish an IT Steering Committee, comprising the senior leadership team, to address such risks. Project prioritization, making the most of annual budgets and technology risk become staple items in periodic discussions of IT and its associated risks and debts.

If the business has its back to the wall, with no option but to run on the existing infrastructure, there are some mitigations that can be applied. Ideally these systems should be run in isolation (similar to the spectrum analyzer case I mentioned earlier). If a system has to be networked due to the nature of the application, thorough backup and restoration procedures need to be established, run and periodically tested. I’ve seen organizations that pride themselves on doing backups but have never tested them, nor even tried to restore one (there’s a reason those systems are called backup software and not restore software). Having a recovery plan helps, but the plan is only as good as its last test. Worst-case scenarios need to be tested, including break-glass procedures and how the business can run in the event of complete failure. If all else fails, start saving for a new system, and maybe look at cloud as an alternative?
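To make the “test your restores” point concrete, here is a minimal sketch of a restore drill using only the Python standard library. It assumes a simple file-tree backup into a zip archive; the function names (hash_tree, make_backup, restore_matches_source) are hypothetical, and a real backup product would have its own tooling for this.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def make_backup(source: Path, archive: Path) -> None:
    """Write every file under source into a zip archive (archive must live outside source)."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(source.rglob("*")):
            if p.is_file():
                zf.write(p, p.relative_to(source).as_posix())


def restore_matches_source(source: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it with the source."""
    with tempfile.TemporaryDirectory() as scratch:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(scratch)
        return hash_tree(Path(scratch)) == hash_tree(source)
```

Calling restore_matches_source right after make_backup, and again on a schedule, turns “we do backups” into “we know we can restore”. Comparing against the live source only makes sense while the source has not changed since the backup; for older archives, the digests from hash_tree could be stored alongside the archive instead.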


  1. Malay Mail (22 Jan 2020)-
  2. Microsoft XP EOL statement –
  3. MAHB Case Study –
  4. Technology Debt –

Malaysian Airport Incident – A case study

Last updated: 4 September 2019


The information in this post was crowdsourced through the IT Security SIG set up by Nigel Rodrigues. Many members contributed, and the candid discussion inspired me to write this article.

As this incident is still developing, this article will be updated with the latest information. What you see here is a snapshot at a point in time.

The incident

On 21 August 2019, the KLIA/KLIA2 airports began to experience system and technical difficulties. The failure affected check-in counters, flight information display systems (FIDS), baggage handling, the airport mobile app, and payment systems that rely on the airport’s network. Understandably, tempers and dissatisfaction among airport users ran high.


20 August 2019 – MAHB signed an MOU with Huawei on technology modernization.

21 August 2019 – KLIA/KLIA2 reported system/technical issues affecting multiple systems in the airport. Initial news indicated a failure in the network equipment.

22 August 2019 – The Star reported that, according to MAHB, the situation would be resolved by 23 August 2019, as it had received new equipment to replace the existing units, with testing to be conducted the same night.

23 August 2019 – MAHB updated its website (as at 6am), explaining that it was in the midst of stabilizing its systems and had deployed additional buses to ferry passengers to their respective terminals.

24 August 2019 – The Malay Mail reported that the situation had improved, with passenger flow reported as smooth despite intermittent disruptions.

24 August 2019 – NACSA issued a statement affirming that the network issue at KLIA/KLIA2 was not the result of a cyber attack.

25 August 2019 – The Malay Mail reported that KLIA/KLIA2 operations had been restored to normal, based on a check by BERNAMA at 0930.

26 August 2019 – The Ministry of Transport announced a panel to investigate the failure of TAMS (Total Airport Management System). NACSA is one of the members of the committee.

26 August 2019 – MAHB was quoted as saying that it was not dismissing the possibility that malicious intent had caused the incident.

26 August 2019 – Airport passengers stated that service had not fully recovered: the information system was still down and the airports were operating at partial system availability.

27 August 2019 – Airlines sought compensation from MAHB over the airport system downtime.

27 August 2019 – MAHB lodged a police report over possible malicious intent being the cause of the downtime.

28 August 2019 – The PM ordered a probe into the airport downtime incident.

29 August 2019 – PDRM was said to be probing four individuals in relation to the airport system failure, based on the report made by the IT division’s senior general manager.

30 August 2019 – AirAsia, a Malaysian carrier, was said to have confirmed that it would not sue MAHB over the recent airport system failure.

2 September 2019 – Police were said to have recorded statements from 12 MAHB staff over the system failure incident.

3 September 2019 – Four pioneer MAHB IT officers lodged counter police reports against MAHB. They had been suspended, and claimed they were falsely accused.

The cause

The details are vague, but the incident was attributed to a faulty IP network switch, which brought IP network traffic to a grinding halt. The switch in question seems to be the core switch, which processes all the network traffic for the airport.

A core switch is usually responsible for traffic between segments, and also acts as an aggregation point. Each area is connected via a smaller switch that points to an intermediate or aggregation switch, which in turn leads to the core switch. In this case, since the core switch is down, the segments are disconnected from one another, though each machine still shows its network connection as connected. Access to upstream services such as the Internet, which is used for the credit card payment gateway, is also interrupted, as the traffic stops at the non-working core.
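The effect of losing a single core switch can be sketched as a reachability check on a small graph. The topology and device names below (checkin-sw, fids-sw, agg-1, agg-2, core, internet) are hypothetical and are not MAHB’s actual network; the point is only to show how every segment is cut off from the others once the one shared node fails.

```python
from collections import deque


def reachable(links, start, failed=frozenset()):
    """Return the set of nodes reachable from start, skipping failed nodes."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start in failed:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen and neighbour not in failed:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical star-style layout: access switch -> aggregation switch -> single core.
links = [
    ("checkin-sw", "agg-1"),
    ("fids-sw", "agg-2"),
    ("agg-1", "core"),
    ("agg-2", "core"),
    ("core", "internet"),
]

healthy = reachable(links, "checkin-sw")
core_down = reachable(links, "checkin-sw", failed={"core"})
print("internet" in healthy)    # True
print("internet" in core_down)  # False
print(sorted(core_down))        # ['agg-1', 'checkin-sw']
```

With the core marked as failed, the check-in switch can still see its own aggregation switch, which matches the observation that machines report a live link even though nothing useful is reachable.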

Social Media Buzz

It was noted that a user claiming to be a subcontractor to MAHB said that the network switch was 17 years old and had never been replaced. This is unconfirmed, pending an official statement from MAHB.

A report from Utusan Malaysia also mentioned something similar.

Related news

Just one day before the incident, on 20 August 2019, MAHB signed an MOU with Huawei “to drive MAHB’s digital transformation framework by enhancing connectivity and real-time information by connecting all stakeholders in one fully integrated digital ecosystem. The collaboration would also seek to set up a fully integrated network communication managed platform to manage above technology and integrated data to enable future big data analysis throughout the entire airport, further improving airport operation efficiency and reduce overall ICT cost.”

It’s probably sheer coincidence that the network equipment failed the very next day, seemingly catapulting the priority of this initiative.


At this point, the lack of official news seems to have led to much speculation. The first theory was that the airport was under a cyber attack. This was quickly quashed by NACSA, which confirmed that there were no attacks.

Another discussion led to the belief that there should have been sufficient DR (Disaster Recovery) infrastructure to ensure business ran as usual. Assuming the social media claim was right, most networks designed at that time would have had a typical star topology, whereby layer one connectivity cascades back to a single core switch. Using Cisco as an example, a spine-and-leaf architecture would have allowed traffic to be redirected through a different spine switch, had that been the architecture. Spine and leaf is still a relatively new concept, and there may be other designs an organization can adopt.

The Good

MAHB mobilized its own staff, recruiting them and promoting initiatives to get them to assist passengers during these trying times. A poster dated 22 August 2019 was seen circulating on social media, asking for help with the situation at KUL during peak hours (12 – 2pm & 4 – 10pm).

MAHB exhibited a strong understanding of the airport’s processes, managing with manual processes and sheer manpower to handle the airports’ operations while the system was down.


Assuming the theory about the 17-year-old network equipment is true, there can be two possible outcomes. In the first, an overzealous CIO might end up saying “We should sweat our assets more, make sure you don’t buy anything new for the next 15 years! (BTW are we using the same brand as the airport?)”. Scary, to say the least! It is worth remembering that computer and network hardware is susceptible to degradation over time, even down to the copper network cabling, hence some data centers make it a point to “re-cable” their infrastructure periodically! Other views include “we’re not an airport, we won’t need to worry about it”.

The second outcome is that investment in IT becomes justifiable as part of a technology refresh. A more prudent approach to the technology life cycle emerges, and the MAHB story becomes a talking point at Board level, raising the question of whether the assets in use are still (1) maintained, with the necessary support, and (2) short of End-of-Life/End-of-Support. This is in line with managing tech debt, ensuring that such compounding interest doesn’t suddenly pop up!
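The two Board-level questions lend themselves to a simple, repeatable check against an asset register. The sketch below is illustrative only: the Asset fields, the needs_attention helper, and the inventory entries and dates are all made up for the example and do not describe any real organization’s equipment.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Asset:
    name: str
    end_of_support: date        # vendor's published End-of-Support date
    has_support_contract: bool  # is a maintenance contract currently in place?


def needs_attention(assets, today):
    """Names of assets that are unsupported or at/past End-of-Support."""
    return [
        a.name
        for a in assets
        if not a.has_support_contract or a.end_of_support <= today
    ]


# Hypothetical register entries, purely for illustration.
inventory = [
    Asset("core-switch-01", date(2012, 6, 30), has_support_contract=False),
    Asset("records-server", date(2030, 1, 1), has_support_contract=True),
    Asset("ward-pc-17", date(2030, 1, 1), has_support_contract=False),
]

print(needs_attention(inventory, today=date(2019, 9, 4)))
```

Passing the date in explicitly keeps the check reproducible, so the same report can be rerun for a future budget year to see what will fall out of support by then.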

Lessons learnt – so far

1. Have manual processes that will stand in if something fails. Can you operate without technology?

2. Understand the implications of tech debt. It’s only a matter of time before it catches up and the organization pays the compounding interest. Reputational damage can be severe and takes time to recover from.


  1. Malay Mail –
  2. MAHB Official PR –
  3. NACSA PR –
  4. TheStar MAHB Huawei MOU –
  5. Cisco Spine & Leaf Architecture –
  6. Copper degradation –
  7. Potential malicious intent –
  8. The Star (22 Aug 2019)  –
  9. MAHB update (23 Aug 2019) –
  10. The Malay Mail – Day 3 –
  11. The Malay Mail – Day 4 –
  12. The Star –

IT vs Cyber Security – Technology Debt

Where are we today?

Almost daily, we are bombarded with news of cyber attacks, breaches, data leaks and more. Cyber-related issues are becoming the norm, so much so that someone was quoted as saying “There are 2 types of organization: the ones that have been breached, and the ones that have yet to be”. As such, all organizations are putting emphasis on spending for continuity, and one question gets asked quite frequently: how much is enough? Is there a magical percentage that a CEO should treat as healthy spending to ensure that the safeguards are sufficient to manage today’s and tomorrow’s risks?

While there is research on average spend by organizations, it is not an accurate reflection of what a particular organization should spend to protect its assets. This article aims to demystify technology debt, using security as a factor, in order to identify the right level of spending for an organization. Technology debt is just one of the considerations when evaluating tech spend vs. security spend.

What is a technology debt? 

A debt is something owed. When someone borrows money, they are obliged to return it (in most instances with interest). Technology debt is no different from conventional debt, except that what is owed takes the form of the considerations, protection and governance that should accompany the rollout of current and new technology.

The interest on technology debt is the occurrence of an event that creates an additional burden for the organization. For example, a cyber breach causes additional overheads: manpower, engagement of third parties for services such as recovery and forensics, and other additional expenditure.

Does Technology incur a debt? How does it work?

To illustrate, here are 3 examples of how technology debt is incurred.

Scenario 1

An end user procures a computer for home use. The user gets the operating system installed and starts using it. Finding the computer very useful and engaging, the user starts using it not just for work and assignments but also for personal content consumption such as videos, websites and even social media. One day, the user encounters a phishing email, which leads to downloading an attachment that infects the machine with ransomware. As the work is important and needs to be sent to a customer, the user ends up paying the ransom.

In this case, the tech debt was incurred at the point the user started using the computer. The debt was the obligation to ensure that the computer was secured and had the necessary protection in place, such as endpoint protection and phishing alerts. Because the debt had been incurred and left unpaid, the user ends up paying with interest, i.e. the ransom needed to retrieve the data.

Question: does the debt end here? Yes and no. While the ransom (interest) is paid, the debt (principal) is still there. The debt only goes away when the user secures the endpoint/laptop/machine.

Scenario 2

A hardware store has purchased a Point-of-Sale (POS) terminal, primarily to ensure sales tax calculations are done and the reports are available for submission to the authorities. There is a thermal printer to print receipts with the computed tax value as per regulations. A barcode scanner is attached to make it easy to capture item codes during checkout. The system became so convenient that even inventory was managed effectively. Life seemed easier, thanks to the new technology. The POS came with a 1TB hard drive, which is almost impossible to fill up.

One day, for some unfortunate reason, the hard drive in the POS machine crashed. This caused some inconvenience, as totals had to be computed manually. Because of the convenience of the POS system, prices were no longer printed on the items, as the store relied on product barcodes. A manual price list had to be put together after calling the vendors for price confirmation. What made it worse? The taxation department decided to show up for an audit, demanding to see the taxation report that the system was supposed to produce on demand as part of the tax requirements.

The technology debt in this case is the inability to back up and restore the system. While reliance on the system is fine, the debt (backup/recovery) had been incurred, and the user ended up paying interest (fines for non-compliance, additional recovery services, instituting a manual process, wasted time).

Scenario 3

A mobile app development firm has purchased a server to store its source code. The server is backed up daily to DVD and a copy is kept at a separate site. The server is configured with a detailed access control list to ensure only the right people have access to the right set of code.

A disgruntled employee decided to take matters into his/her own hands and deleted a portion of the code on the day he/she was leaving. The manager discovered the issue when reviewing the CI/CD logs after a build failure and found files missing. Upon inspecting the version control software, the manager identified the malicious action that had taken place, recovered the part of the tree that was lost, and compared it against the backup to ensure that the recovered code was consistent.

This case shows a zero-debt scenario. While deploying the solution, the IT team took into consideration the requirements for backup, audit logs and a continuity plan. When a potential “interest” scenario came up, because the debt was zero, there was minimal or no impact on the organization.

How does tech debt influence budgeting?

As technology gets deployed, as illustrated above, debt starts accruing. In some organizations, the debt is addressed up front as the technology is deployed, to avoid interest. Other organizations spread the debt out over time, in the hope that the interest never comes due.

How does this influence budgeting? The budget to manage security includes ensuring that the debt is addressed in a timely manner. For organizations that have incurred debt, clearing it requires expenditure. As IT is usually a single budget line item for an organization, this shows up in the percentage split between IT spend and security spend.

Hence, for an organization with heavy tech debt, the budget will go more towards resolving the debt than towards expanding IT. The percentage split will be skewed, as the debt now influences the spend percentage.

Another reason the spend will be skewed is when the interest comes into play. Due to an incident, the interest matures and becomes payable. This creates additional expenditure which eats into the budget. After an incident, organizations usually put more emphasis on governance and control, almost writing a blank cheque to show commitment, and in most instances hiring a CISO who reports directly to the CEO and Board.

The result is a security spend percentage, relative to the overall budget, that differs with how much debt has been resolved and with the state of the organization. Mature organizations resolve debt as the technology is incorporated, while others play catch-up due to business and budget limitations. What’s important is to be mindful that the debt may generate interest at any time, causing the organization to end up spending more. Delayed investment may result in heightened expenditure.

While the scenarios presented above may be simplistic, it is worth remembering that technology debt is often multi-dimensional and requires an in-depth study to ascertain the respective areas of protection required. In a future article, we can discuss this multi-dimensional aspect of tech debt and how to go about resolving the debt and preventing interest.
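The trade-off between paying the principal now and risking interest later can be put into rough numbers. What follows is a toy model, not an established costing method: it assumes a fixed yearly chance of an incident, charges the incident cost at most once over the period, and ignores discounting. The function names and all figures are hypothetical.

```python
def expected_interest(annual_incident_probability: float,
                      incident_cost: float,
                      years: int) -> float:
    """Expected cost of deferring: chance of at least one incident times its cost."""
    p_at_least_one = 1 - (1 - annual_incident_probability) ** years
    return p_at_least_one * incident_cost


def cheaper_to_pay_principal(control_cost: float,
                             annual_incident_probability: float,
                             incident_cost: float,
                             years: int) -> bool:
    """True when fixing the debt up front costs less than the expected interest."""
    return control_cost < expected_interest(
        annual_incident_probability, incident_cost, years
    )


# Hypothetical figures: a 10% yearly chance of a 100,000 incident over 5 years,
# weighed against a 15,000 control that would remove the exposure.
print(round(expected_interest(0.10, 100_000, 5)))
print(cheaper_to_pay_principal(15_000, 0.10, 100_000, 5))
```

Even a crude estimate like this gives the CEO/CFO conversation something to anchor on, though real incident costs (reputation, fines, downtime) are far harder to pin down than a single number suggests.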

Moving forward

The crux of this article is to explain why different organizations have different budget splits. Though a spending baseline helps CEOs identify whether their spend is healthy, understanding technology debt helps to justify why the spend needs to be higher for some organizations. While most organizations look at analyst reports on average security spend, it is wise to keep technology debt in check so that interest does not pop up.

Perhaps if there is enough interest, then I can write up on identifying and resolving technology debt.

Geopolitical considerations as part of Technology risk

This thread started off as a discussion at the local Mamak (the Malaysian colloquial term for a local cafe). A bunch of security and tech folks met up to ponder the world and its business woes.

The discussion started off with the question “How do you decide on your tech purchase? What are your consideration factors?”

Our conservative buddy came up and said “You can never go wrong with Brand X! Tried and tested”. That seems to indicate that the selection criteria are market presence, branding, prominence and adoption.

The bleeding edge/challenge the status quo person came up and said “Why not Open Source?” It’s mature enough for adoption, and more organisations are cozying up to the idea that Open Source will work, provided that support is available.

Then in comes the CIO, who made it clear that his/her choice would be cost based. Why pay a premium or weigh up alternatives when you can get a good bargain? Pricing would be the ultimate deciding factor, provided that the product meets the bare minimum.

I had to open my mouth and ask, “What about geopolitical considerations?” Everyone had a flustered look, some in amazement, while some pretended that was not even a factor. Geopolitical? Is that even necessary?

What is geopolitical consideration/risk?

This is the practice of looking at the country of origin of a technology and consciously choosing to use technology from more than one country. For example, if the first tier of firewalls originates from the US, the second tier may be purchased from Russia. (This ignores the fact that the underlying hardware may all originate from China; the consideration here is vendor origin, not part origin, although separating by part origin would be a more stringent version of geopolitical risk separation.)
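As a small illustration of the firewall example, a design review could flag adjacent tiers whose vendors share a country of origin. This is a sketch under simple assumptions: each tier is reduced to a (tier name, vendor country) pair, and the tier names and the same_origin_neighbours helper are hypothetical.

```python
def same_origin_neighbours(tiers):
    """Return pairs of adjacent tiers whose vendors share a country of origin."""
    return [
        (upper[0], lower[0])
        for upper, lower in zip(tiers, tiers[1:])
        if upper[1] == lower[1]
    ]


# Hypothetical layered design, outermost tier first: (tier name, vendor country).
diversified = [("edge-firewall", "US"), ("internal-firewall", "RU")]
single_origin = [("edge-firewall", "US"), ("internal-firewall", "US")]

print(same_origin_neighbours(diversified))    # []
print(same_origin_neighbours(single_origin))  # [('edge-firewall', 'internal-firewall')]
```

An empty result only says the vendors are headquartered in different countries; it says nothing about where the parts inside the boxes come from.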

History Lesson – PGP

A little bit of a history lesson on technology, starting with cryptography. PGP was created by Phil Zimmermann in 1991, with the intention of securing communications between activists and preventing snooping. The software was free to use, as long as it was not for commercial use. Eventually PGP ended up on the Internet, being adopted for widespread use as an added encryption layer on top of email.

In 1993, Zimmermann became the target of a criminal investigation. Cryptographic software using keys larger than 40 bits was subject to US export restrictions, and PGP used far larger keys, with defaults of 1024 bits. The alleged violation was “munitions export without a license”; the definition of munitions included “guns, bombs and even software”. For unknown reasons, the case never proceeded and was eventually dropped without any criminal charges filed.

Zimmermann was determined to make his software public. He identified a loophole: the First Amendment protects the export of books. Through MIT Press, Zimmermann published the source code of PGP. One simply had to procure the book, scan the contents and digitize them using OCR (Optical Character Recognition), or simply type the code in by hand.

More challenges on export

A similar situation happened to D.J. Bernstein. He wanted to publish the source code of his Snuffle encryption system. Together with the EFF, Bernstein challenged the export ruling. After 4 years and one regulatory change, the Ninth Circuit Court of Appeals ruled that software source code is protected by the First Amendment, and that the government’s regulations preventing its publication were unconstitutional.

Why geopolitical risk?

The world is already borderless. Technology crosses boundaries easily, without much hassle. However, G2G (Government to Government) relationships are never that smooth. Technology sold by a company is governed by the laws of the country in which that company is headquartered. Hence the law of the land, and by extension the government, plays an indirect but crucial role in determining the availability of technology.

The most common technology denominator is the USA, which produces the majority of the technology innovations the world uses. An example used earlier in this article is encryption/cryptography technology. As algorithms become prevalent, their use often becomes subject to export restrictions.

The rise of nation states

A borderless world creates borderless problems. The hacking scene (not the “Texas Chainsaw Massacre” type) used to be fuelled by hormone-driven, idealistic teens, or just curious cats trying to learn tech. But today, dominance in cyberspace is seen as a sign of “cyber-sovereignty”, and an arms race towards cyber dominance seems imminent. (Man, I really abused the word cyber this time…)

As explained earlier, the battleground has shifted into the cyber world. Corporations are becoming unwilling victims in the fight for dominance. Nation-states may infiltrate large corporations to further their agenda, planting their own tech people who directly influence the product build. This means that a product that gets shipped may be laced with malicious code, backdoors or even intentional vulnerabilities for nation-state actors to freely abuse.

Export laws, sanctions and politics

Open any news site right now and you’d read about trade wars between governments. In recent news, one government has stood firm and taken action against another country for alleged espionage. This resulted in key companies from that country being denied business and hit with high levies and taxes. The situation created a “tit-for-tat” reaction, causing a downward spiral of impact on other organizations that form part of the ecosystem.

Standards and tech volition

If export restrictions were not enough, in a new twist to the developing story, standards organisations are now becoming subject to such rulings. One standards body which is referred to worldwide has stepped up and banned researchers from a certain country from being moderators or participating in standards building. This has far-reaching impact on the global community.

Firstly, other countries who are not part of the trade war are now unwilling victims as the standards body aligns itself with one country’s stance. Secondly, countries now have to re-evaluate and establish their own standards, or subscribe to a common standard in which all vendors are given a chance to participate. ISO (International Organization for Standardization) is a global standards body which prides itself on being independent from country-level politics (although the standards are voted on along country lines and affiliations).

On one hand, you need a standards body as a reference point, and on the other you’ll need to start excluding standards bodies which show affiliation with country-level policies. Aligning standards into a country-specific set will be another arduous task.

Long story short

Countries today can no longer exclude geopolitical factors from risk. This is evident from recent developments in the international arena, the current trade wars and Brexit. While moving towards Industrial Revolution 4.0, it is important to no longer live under a shell, but to understand that borderless is a reality and that new sets of regulations are emerging to govern tech and its use.

Insider Threat – A look at AT&T incident

A recent exposé published by SecureWorld, based on court documents, has suddenly put this issue in the spotlight.

The damning question, can your employees be bought?

Let’s look at the reported news on the incident experienced by AT&T Wireless. It began at the AT&T Wireless call center in Bothell, Washington. Call center employees knowingly shared their credentials with the cybercriminal, in exchange for money. According to the DOJ indictment documents, the call center employee who made the most was paid $428,000 over the 5-year scheme.

There were 3 things that the employees did:

  1. Following instructions, the employees installed malware on their machines.
  2. The employees installed unauthorized access points, hardware devices to create a backdoor into the network.
  3. The employees installed a specialized malware that performs phone unlocking through AT&T’s internal network using valid AT&T credentials that were obtained from the call center agents.

The objective of the “intrusion” was to create unlocked phones. Phones that are sold in the US are carrier locked, meaning once the phone is provisioned, only AT&T services can be used on those phones. Having the phones carrier unlocked creates a huge market, selling them on eBay and other online stores.

This raises the question, why would the phones need to be unlocked in the first place? The phones are locked to a carrier because they are subsidized and require a contract. When a user travels overseas, the phone may need to be unlocked for roaming purposes, hence unlocking is a legitimate function of the call center.

This racket resulted in more than 2 million phones being unlocked and sold. At the price of an iPhone, one can only imagine how much money there is to be made.

According to the official documents, the scheme began somewhere around 2012, and around October 2013, AT&T discovered the unlocking malware. When questioned, the AT&T staff involved left the organization. The criminals were determined, recruiting new insiders in the same call center the following year. Recruitment happened through Facebook (surprise, surprise, and not LinkedIn) and the bribes were paid in person. The cybercriminal, known as Muhammad Fahd, is now in jail.

A breakdown of this issue

  • Call center agents sold their access and performed illegal acts in exchange for money. Insider threat will remain a key issue, and a challenging one to tackle. While a potential solution is a “lifestyle audit”, getting trustworthy staff will always be a challenge in a market where skills are limited.
  • Valid access used for illegal activities – this can potentially be addressed by monitoring the activities performed with a given ID. This requires sufficient logs, plus systems to correlate and analyze usage behavior and baseline activities so that anomalies stand out. If someone performs an unusually high number of unlocks, a check on what was actually done should be triggered. This review process (often slow, painful and most of the time manual) is usually avoided because of the extra workload, though I am sure AT&T would enforce it as a requirement now.
  • Installation of malware – call center agents should never ever have administrative access. The ability to install or run applications should be limited through application whitelisting. However, there is still the issue of a malicious IT technician, which may have been possible in this scenario.
  • Installation of access points – rogue access points can be detected with Wireless Intrusion Prevention Systems (WIPS). However, WIPS presents a different set of problems, as it may block legitimate wireless use when neighboring buildings’ APs spill signal over, effectively causing a denial-of-service condition.
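
The baselining idea in the second point can be sketched in a few lines of Python. This is only a minimal illustration, not how AT&T or any monitoring product does it: the agent IDs, the daily counts and the threshold of 3 standard deviations are all made-up assumptions.

```python
from statistics import mean, stdev

def flag_unusual_agents(daily_unlocks, threshold=3.0):
    """Return agent IDs whose latest daily count is far above their own baseline.

    daily_unlocks maps a (hypothetical) agent ID to a list of daily unlock
    counts, oldest first. The last entry is the day under review.
    """
    flagged = []
    for agent_id, counts in daily_unlocks.items():
        history, latest = counts[:-1], counts[-1]
        if len(history) < 2:
            continue  # not enough history to build a baseline
        baseline = mean(history)
        spread = stdev(history) or 1.0  # avoid dividing by zero on flat history
        if (latest - baseline) / spread > threshold:
            flagged.append(agent_id)
    return flagged

# Made-up sample data: agent_b suddenly performs 240 unlocks in a day.
sample = {
    "agent_a": [4, 6, 5, 5, 7, 6],
    "agent_b": [5, 4, 6, 5, 5, 240],
}
print(flag_unusual_agents(sample))
```

A flag like this does not prove wrongdoing. It only tells the reviewer which IDs deserve the slow, manual look first.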

Taking the original question into perspective, can employees be bought? While the technical challenges can be addressed, the answer to this question is multi-faceted:

  1. Getting more money is always appealing to everyone. Looking at the money being made, a call center agent would have jumped at the opportunity because of the sheer amount on offer.
  2. Moral obligation to do the right thing. In any such case, you’d hear many reasons why the staff did what he/she did. From making ends meet, to doing something that didn’t hurt anyone, moral standing has always been on shaky ground.
  3. Economics of organizations also play a part. Income disparity and job satisfaction vs load become talking points. Most call center agents bear the brunt of the customers, and are often even yelled at. Hence call centers become a churning pot for most organizations, and those who stay are often resilient, understanding that it is a thankless job.
  4. Making examples – some organizations motivate employees to do the right thing by (1) having a whistleblowing policy to aid reporting and (2) showing examples of action taken against wrongdoing. The threat of losing one’s livelihood has always been a key motivator to do the right thing.


SecureWorld –


Capital One – The Breach

The incident

Capital One issued a press release on 29 July 2019 stating that an outside individual had gained unauthorized access to its customer information. The information obtained was credit card application information, for applications from 2005 to early 2019. Information breached includes:

– Name

– Addresses, ZIP/Postal Codes

– Phone number

– Email addresses

– Date of Birth

– Income information (self reported)

– Status information – credit scores, credit limits, balances, payment history, contact information

– Transaction data from a total of 23 days during 2016, 2017 & 2018

– Social Security numbers of about 140,000 customers and about 80,000 linked bank account numbers of secured credit card customers

Existing customers don’t seem to be affected, as the system in question was specific to the credit card application facility.

What happened?

According to Capital One, the “highly sophisticated individual” was able to exploit a certain configuration vulnerability. NYT added that it was a misconfiguration of a firewall on a web application, and this echoes the court documents pointing to a misconfigured firewall on Capital One’s Amazon Web Services cloud server. The information was accessed between March 12 and July 17.

More than 700 folders of data were stored on the server.

The hacker

The FBI has arrested Paige A. Thompson, going by the nick “erratic”, according to the Justice Department. Ms Thompson made an appearance in Seattle District Court on July 29, 2019 and was ordered to be detained pending a hearing on August 1, 2019.

Ms Thompson posted on GitHub about the information theft, which a GitHub user reported to Capital One on July 17, 2019. Capital One contacted the FBI on July 19, 2019 after confirming the breach to be legitimate. The FBI confirmed the identity of the attacker.

Ms Thompson has worked with Amazon Web Services before. It was also evident that Ms Thompson left online trails of her hacker activities. She is listed as an organizer for “Seattle Warez Kiddies”, a group on Meetup, which led to her online identities on other social media such as Twitter and Slack. The nick “erratic” was traced back to Ms Thompson as she had previously posted a photograph of an invoice for veterinary care services.

Ms Thompson was quoted as saying “I’ve basically strapped myself with a bomb vest” in a related Slack posting, according to the prosecutors. If convicted, Ms Thompson faces the possibility of a USD250K fine and a jail term of up to 5 years.

The victim (?)

Capital One anticipated that it would incur losses of up to USD150 million, which includes paying for customers’ credit monitoring services. The credit monitoring and identity protection services are offered as part of the compensation for those affected.

Capital One may also be facing potential regulatory fines/sanctions, which at this point of time is still undetermined, as well as lawsuits.

The New York Times also reported that Amazon has refused any blame for the incident. Amazon told Newsweek that “this type of vulnerability is not specific to the cloud“. Misconfiguration, be it at the application or data bucket layer, seems to be the leading cause of data theft from cloud infrastructures, as seen in past incidents such as Attunity. Amazon maintains that “you choose how your content is secured“.

Situational Analysis

SocMed seems to be abuzz about whether the focus should be on the attacker, since it’s a criminal offense, while Capital One walks free. While the attacker may have committed a crime, the question is, could it have been prevented?

From a criminal aspect, what Ms Thompson did is illegal. The proof was seemingly handed over by Ms Thompson herself, through a number of posts/articles as well as poor opsec in posting the invoice. The public persona of Ms Thompson indicates her leaning towards hacking, and her postings on social media channels amount to admission. By following the digital trail, the prosecutors would have all the evidence necessary to convict Ms Thompson. In my opinion, it seems like the prosecutors have an open-and-shut case in their hands.

Capital One was also dissected on social media for its role in the incident. The question remains whether Capital One had done everything it possibly could to ensure such issues do not occur. Reading from the press release, it seems that Capital One looks to “augment routine automated scanning to look for this issue on a continuous basis”. I’m not sure how to interpret that: whether a routine automated scan has been recently introduced, or whether the scan itself was enhanced to include misconfiguration-related issues.

What’s next?

Companies with a cloud presence have a different set of security concerns to address, while a traditional on-prem presence seems to offer better control. Some quick action items for organizations concerned with such issues:

I. Train your staff on cloud security. It can be provider specific as well as provider agnostic.

II. Providers such as Amazon/Azure have configuration templates which can be used to roll out services securely. These templates are meant to be secure by default. Any setup that deviates from them should be reviewed and follow the internal process for deployment and approval.

III. Deploy tools to check for misconfiguration on a periodic basis.

IV. Separate instances by type of environment – Development/Testing/Production

V. Enforce strict IAM/PAM (Identity Access Management/Privilege Access Management) to ensure access is managed effectively
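
As a rough idea of what item III looks like in practice, here is a minimal Python sketch that scans a list of firewall rules for sensitive ports left open to the whole internet. The rule format, the rule names and the list of sensitive ports are hypothetical. This is not any cloud provider’s real API or export schema, and real tools check far more than this.

```python
# Hypothetical rule format, made up for illustration only.
SENSITIVE_PORTS = {22, 3389, 5432}   # SSH, RDP, PostgreSQL (assumed list)
WORLD = {"0.0.0.0/0", "::/0"}        # "anyone on the internet"

def find_risky_rules(rules):
    """Return the names of allow-rules exposing a sensitive port to the world."""
    risky = []
    for rule in rules:
        exposed = rule["action"] == "allow" and rule["source"] in WORLD
        if exposed and rule["port"] in SENSITIVE_PORTS:
            risky.append(rule["name"])
    return risky

rules = [
    {"name": "web-https", "action": "allow", "source": "0.0.0.0/0", "port": 443},
    {"name": "admin-ssh", "action": "allow", "source": "0.0.0.0/0", "port": 22},
    {"name": "db-internal", "action": "allow", "source": "10.0.0.0/8", "port": 5432},
]
print(find_risky_rules(rules))
```

Run on a schedule against an export of the actual configuration, a check like this catches drift between what was approved and what is deployed.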


  1. New York Times –
  2. TechRadar –
  3. CNET –
  4. US Dept of Justice –
  5. Capital One –
  6. NewsWeek –
  7. US Department of Justice – Case details –

Do you need BCP for Cloud?

I woke up feeling very warm. I thought I missed the alarm, but it’s just 3:23 am. Very sure I don’t need a potty break, extremely sleepy and obviously upset. Leaned over to see the AC (air-conditioner), and I found that it was off. I’m very sure it’s too warm and by now the AC should have kicked in. Mumbling, I dragged my already tired and weary body towards the thermostat to see what’s happening.

After blinking a few times to get my sight back to normal, I found that the Nest thermostat isn’t working. Walking back to my bedside table to grab my phone (I know, it’s a bad habit), I checked to see if the internet was down. WiFi seems up, checked my public IP (instead of good ol’ ping), everything seems okay. Google search shows up okay. Still with sleep in my head, I rummaged through my bedside drawer for the remote and turned it on. “This is too much work” – grumbled my half sleepy head. That’s enough for the night.

Woke up in the morning with a sleep hangover (yes, it’s possible, when you don’t have enough sleep), and tried to figure out what happened. Turned on twtr and true enough, reports of a Google Cloud services failure started trickling in.


The horror! Google Cloud services went down?

*My panicked head screaming – The sky has fallen! The Sky has fallen!*

This pretty much explains why the thermostat went down. I wondered how many threat actors lost their C2 hosted on Google services, and how many IoT devices like the Nest thermostat and other dependent services stopped working. If I, as an end user, am grumbling about service availability, how about corporate organisations relying on Cloud services?

Today’s organizations rely heavily on cloud. Business today runs on cloud. Social media runs on cloud. Almost everything runs on cloud. Whether it’s servers/virtual servers, serverless, functions (you name it), it runs on cloud. (Disclaimer, most of my stuff also runs on cloud…)

But, is a cloud outage a rarity? Well it depends on what you deem as rare. The Internet forgives, but never forgets. On August 25, 2013, AWS suffered an outage, bringing down Vine and Instagram with it. On March 14, 2019, Facebook went down, taking WhatsApp with it, apparently due to a server configuration change.

The impact is obvious: businesses lose revenue when services go down. Local companies such as AirAsia run their kit mostly on Cloud. The impact is devastating, imagine flight bookings going dark. The same goes for a lot of other businesses. Hence this brings up an interesting point: What is your business continuity plan if cloud goes down?

When I had this conversation a few years ago, most CIOs I spoke to boldly claimed that their BCP is the cloud (we never reached the part about cloud and security because the conversation was most often dominated by the cost debate). There was no need for one, they said, due to the apparent global redundancies of cloud infrastructure. The once-sleeping-soundly-at-night CIOs are now rudely awakened (just like me, due to the broken thermostat) to the fact that cloud no longer offers the comfort they counted on, after investing years of CAPEX (capital expenditure) and happily paying cloud services their monthly dues to show that their services are up.

A few points to note for those interested in even thinking about Cloud BCP. Yes, it’s time we take the skeletons out of the closet and start talking about this.

Firstly, can your applications and services run on a completely different cloud provider? Let’s look at the layers of services before we answer this question.

XKCD - The Cloud

If you are running server images (compute cloud), it’s completely possible to run on a different cloud provider. You’ll need to be able to replicate the server image across cloud providers. You can capture the setup of your cloud server in scripts, create a repository to host your configuration files and execute the setup script to bring up the services on a separate cloud provider. The setup and configuration can be hosted in a private git/svn repository and called up when needed.

What about data? Most database services provide replication and data backup. For “modern” database services, data can be spread across multiple databases for better data availability and redundancy.

The actual stickler for hybrid cloud is serverless/function based hosting. If the organization invests heavily in one particular cloud provider’s technology (without naming any particular provider), then it depends on the portability of that technology. If something common such as Python is used, the portability is pretty much assured. Technologies that are exclusive for a cloud provider will have issues of portability across different cloud providers.

Another question that needs to be answered is, how would you “swing” your services across different cloud providers? A common approach for internet availability is to use DNS. Using DNS, the organization can change the location of services by changing the DNS records. This allows failover without having to change the URL. However, the speed of failover is determined by the DNS TTL (time-to-live) configured on that record. Set it too low and your DNS will be constantly hit with queries, but changes take effect almost immediately (a low TTL is typically measured in seconds to a few minutes). Set it too high and your DNS infrastructure will see low traffic, but it takes a long time before the failover actually happens. DNS-based failover also creates an administrative headache for firewall administrators, as they have to change their approach from IP-based to DNS-based access control lists.
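
To put rough numbers on the TTL trade-off, here is a back-of-envelope Python sketch. Everything in it is an assumption for illustration: the 60-second detection time, the 1,000 caching resolvers, and the simplification that every resolver refreshes the record exactly once per TTL. Real resolver behaviour varies.

```python
SECONDS_PER_DAY = 86_400

def failover_tradeoff(ttl_seconds, detection_seconds=60, resolvers=1000):
    """Estimate (worst-case switch time in seconds, record refreshes per day).

    Worst case: the outage is detected, the record is changed, and a resolver
    that cached the old answer just before the change keeps it for a full TTL.
    """
    worst_case_switch = detection_seconds + ttl_seconds
    refreshes_per_day = resolvers * (SECONDS_PER_DAY // ttl_seconds)
    return worst_case_switch, refreshes_per_day

for ttl in (60, 3600):
    switch, load = failover_tradeoff(ttl)
    print(f"TTL {ttl}s: up to {switch}s to fail over, about {load} refreshes/day")
```

With these assumed numbers, a 60-second TTL fails over within about 2 minutes at the cost of roughly 1.4 million refreshes a day, while a one-hour TTL cuts the load to 24,000 refreshes but can leave users pointed at the dead provider for about an hour.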

Cloud isn’t all just hot air. Moving towards Industry 4.0 (now I’m just throwing buzzwords around), Cloud adoption is definitely a core component of the technology strategy that each organisation needs to have. As time goes by, we find that even cloud is fallible, hence a proper approach towards Cloud is key to business continuity.

So, what’s your approach towards Cloud Services BCP?

Information Mismanagement – the need for proper Information Security

In this day and age, it is difficult NOT to automate/computerise your business/data.
Your receipts are part of an elaborate data capture/retention/warehouse infrastructure which constantly crunches numbers, creating meaningful information in a vast cloud of networks, systems and storage. As such, one cannot run away from the responsibilities of protecting that data, which is key to any business in this modern age.

It is nearly impossible to operate a business in total isolation. One might say that he is a petty trader and does not need much information management. Well, you might get into trouble if your books are not in order, your stocks mismanaged, your payments unmet, and your cash mismanaged. You can run your business into the ground, or even end up being chased by the tax collector.

I’ve seen that most SME organizations tend to have a very small IT outfit, and treat everything as part of the IT responsibility. The reality is, the web designer you hired may be able to fix some common IT issues, but will not be able to tell you the real risks of information mismanagement. Your organization gets hit by a worm/virus infection, and you invest in some anti-virus solution. Your website gets hacked, you just reinstall the OS. After a while, you realize that your competitors seem to know your every move, and you feel helpless trying to move your business further. It can be convenient to blame the IT Guy a.k.a Programmer a.k.a Security Guy.

Then comes the crude Information Security program. You hire someone who’s heard of information security, put him way down the food chain (or the reporting hierarchy) and expect everything to be secure. The person comes with the standard kit approach. Have firewalls, install anti-virus. Spend a little, and get more maybe? Sure, that sounds reasonable. But guess what? You still get attacked, you blame your security vendor and eventually fire your security guy. Again, doesn’t sound that workable, right?

You grow further, having a team, but still buried down the food chain. You have people advising you at the project level on your implementations and doing periodic reviews/audits. Sounds good right? But here’s the problem. Projects have the word COST tied to them. And security is a line item that’s “nice to have“. So when push comes to shove, the line item called security gets pushed aside because the project must go on, at a cut-throat price. Even before the team can say anything, their own boss muffles their voice. Risk doesn’t get documented, and is easily swept under the carpet. (Sounds familiar?)

You reach a stumbling block where things keep failing. You start wondering: is it the people? The process? What gives?

Herein lies the problem of implementing Information Security in an organization. How successful the Information Security program will be depends on the goals of the organization and its level of governance.

For the CEO/Board of Directors, Information Security needs to come as a mandate of corporate governance. The CEO/Board of Directors needs to agree that Information Security is an agenda item for review (either as a line item by itself, or as part of the Audit Committee review, or as part of the Enterprise Risk Management review). Establishing a clear escalation process to the Board provides visibility and accountability on the company’s status, and gives the Directors a clearer view of the organization. Besides that, the Board gains assurance that the organization is in compliance with the information security/privacy laws that govern the business. The CEO is accountable at the company level for ensuring that the Information Security program is running, that reviews are conducted, and that escalation reports are discussed and closed in a timely manner. Key message here: visibility and reporting.

The CEO also has many other functions, so this particular function then goes down to the CSO/CISO. The CSO (Chief Security Officer) covers the 2 large security domains, namely physical & information security, whereas the CISO (Chief Information Security Officer) is responsible for Information Security controls & governance. When establishing the hierarchy, position and reporting visibility also need to be thought through. The reporting line (both official and unofficial) ensures that the subject matter gets the right attention. In a highly governed environment, the CISO/CSO reports directly to the COO/CEO level, and has a reporting requirement to the Board of Directors. Otherwise the CISO function is absorbed within the Audit/Assurance structure. In a slightly less governed environment, the CISO/CSO reports to a Head under the COO/CEO level (usually under the CIO/CTO reporting line). In other organizations, the CISO role is just a manager role within the large IT/Technology enclave. Key message here: reporting structure and empowerment.

The success of information management in any organization depends on how well information is governed. Process and policy come into play. Having a well-defined policy (using a standards-based policy like ISO 27002:2005 as a baseline) helps to ensure that you’ve got all your bases covered. But having policies alone does not help. Policies need to be translated into standards and guidelines, and then woven into the fabric of everyday process. These changes should improve the process, while carefully ensuring that they do not disrupt business through unnecessary red tape or throw the process into a state of limbo. Take time to get the policies reviewed at all levels of the organization, as that helps you get buy-in from everyone. Policies are living documents, so be prepared to schedule reviews and get the documents approved at the right level (usually the CEO). The review interval should be kept at one year. Have the ability to enforce new policy requirements immediately (due to urgent business needs) without having to do a full review, as this enables immediate steps to be taken to prevent further issues/damage, but be prudent with this ability. Key message here: properly defined policy which can be adopted into everyday processes.

The structure of an infosec team makes a difference in how the organization’s needs are managed. Understand the roles that other departments, such as Audit, play, as they would be performing some of the functions. Having 2 divisions performing the same function is ridiculous; you might as well empower the right divisions to manage the right responsibilities. Clearly state the boundaries of each team (use RACI charts), and identify their abilities and functions. Even within the infosec team, you can structure further. The operational aspect of information security can remain with the operations team, doing the day-to-day operational tasks, whereas the more strategic/tactical roles can reside in a different hierarchy. Key message here: checks and balances, even within information security.

Lastly, the organization itself needs to move as a unit. In some organizations, information security is often perceived as a stumbling block. You’d probably hear more NO’s than YES’s, or more grouses than actual solutions. In those cases, the organization’s objectives are clearly overshadowed by individual preference. Becoming the solution provider goes a long way in building rapport and getting things done. If you get put in cold storage, you will not move anywhere, nor will you get the right level of participation to see your goals through. Information Security goals must tie back to the overall organizational goals. In cases where the book doesn’t work, a rational mind becomes important. Establish an exemption process which is a catch-all/release-all mechanism, but at the same time ensure that it’s not easily abused. Hence reporting structure and responsibility need to be clearly established. Key message: TEAMWORK.

Links: Twitter runs foul of FTC

Information Security & Cryptography

Cryptography, or the cryptic art, started off as the art & science of encryption. It is a wide area of research and implementation. You will find it touching a wide variety of areas: quantum physics, law, hardware design, advanced mathematics, user interface and even politics! This makes cryptography an interesting area of study and in fact is one of the key reasons why I’m personally passionate about it.

Cryptography is one of the key components in the ecosystem. Cryptography by itself is not that fancy or useful. It adds a layer of protection to an existing deployment/infrastructure/functionality. A physical equivalent of cryptography would be a metal lock. A lock recognizes no legitimate owner (though some electronic locks claim to), only the right metal key to open it. The wielder of the key could be anybody, legit or not. As I said earlier, cryptography alone is not so useful, but when deployed properly, it serves a critical role.

In the current technology world, you’d find that most attacks aren’t really against cryptography (yes, the more learned would disagree, citing rainbow tables and collisions, but that’s another story – keyspace). Instead, current attacks (such as race conditions and buffer overflows) are centered on other parts of the ecosystem.

Security is only as strong as the weakest link. One can only improve the state of security by strengthening the weakest link. Alternatively you could use the layered onion approach, whereby your weakest link is concealed within layers of added security or risk mitigation.

Attacks on the cryptography layer can be deadly. This is because the system can only recognize whether an access is “legitimate” or not. It will not be able to detect whether the cryptography has been broken. It is similar to burglary: if one pries the lock open, the physical damage to the lock can be seen. However if the assailant picks the lock, there is no longer any proof of the crime.

What’s ironic about this situation is that even security systems are vulnerable. The “over-confidence” that vendors dealing in security are “supposed” to be secure has yet to be justified. We see reports of security vendors scurrying to patch their systems when vulnerabilities are found in core cryptography components such as OpenSSL (which is used widely, even in router OSes such as Cisco’s IOS and Juniper’s JunOS).

Unlike nature, which is governed by laws of physics like gravity, there are no such laws when it comes to threats to cryptography. One cannot assume that functions will be called properly, that the right types are passed as parameters, or that bounds/limits are respected. As such, writing cryptography becomes a daunting task of ensuring that all factors are carefully considered, and all risks identified and accounted for.
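
To make the point about types and bounds concrete, here is a small Python sketch of a defensive wrapper around the standard library’s hmac module. The 32-byte key length and the 1 MB message limit are arbitrary assumptions for this illustration, and the function names are my own. It is a sketch of the habit, not a reviewed design.

```python
import hashlib
import hmac

KEY_BYTES = 32                 # assumed key length for this sketch
MAX_MESSAGE_BYTES = 1_000_000  # assumed upper bound on message size

def sign(key, message):
    """Return an HMAC-SHA256 tag, refusing malformed input instead of guessing."""
    if not isinstance(key, bytes) or not isinstance(message, bytes):
        raise TypeError("key and message must be bytes")
    if len(key) != KEY_BYTES:
        raise ValueError("key must be exactly 32 bytes")
    if len(message) > MAX_MESSAGE_BYTES:
        raise ValueError("message too large")
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Check a tag using a constant-time comparison."""
    if not isinstance(tag, bytes):
        return False
    return hmac.compare_digest(sign(key, message), tag)

key = bytes(32)  # all-zero demo key; never use a fixed key in real code
tag = sign(key, b"unlock request 42")
print(verify(key, b"unlock request 42", tag))
print(verify(key, b"unlock request 43", tag))
```

Note that the primitive itself comes from a vetted library. The wrapper only adds the checks that a careless caller would otherwise skip.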

Again, it is stressed that cryptography alone does not make a system secure, just like the widely accepted misconception that having a firewall protects your system. When deployed correctly, cryptography provides key protection to data. However, vendors tend to attempt implementing “proprietary” encryption, which has not gone through the peer reviews, extensive tests and verification needed to prove the strength and ability of those algorithms.

The reality is, cryptography stands somewhere near nuclear physics. It is extremely difficult, has complex mathematical equations in its core functions and is usually the subject of doctoral studies. It does require a fair amount of effort and understanding of the subject matter.

Operating Systems – Introduction

Operating System Brains

A computer’s heart is the operating system. The core processing is done by the CPU, but it is the operating system that makes that processing usable by applications. So what is an Operating System? An operating system is a set of software, typically written in a low-level programming language (C/C++ or Assembly).

The operating system is responsible for managing the requests made by software applications, and directing them to be executed on the hardware that it’s installed upon. In essence, it acts as an interface between the software and the hardware. You might be wondering “Why do I even need an Operating System? I might as well code to use the hardware directly!!”. Valid concerns, but your application will not be the only application running, and something has to arbitrate access to the shared hardware. If you need your application to run at the Operating System level, that can be achieved via kernel mode access (which will be covered at a later stage).

So, you need an operating system. But what exactly does an Operating System do?

  • Process Management – makes sure that your applications run smoothly without interruption, and that they execute successfully
  • Memory Management – the CPU can only execute a limited number of processes/applications at one time. As these applications run, they need storage space to manipulate data. This storage (RAM) needs to be managed so that both the applications and the operating system have their own space.
  • Input/Output – Your applications will leverage the existing hardware. As such, the Operating System provides a structured means of accessing these devices (through a generic access layer made up of device drivers), so that applications can use myriads of hardware without having to worry about the specifics.
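
The Input/Output point can be pictured with a toy Python model. This is not how a real kernel is written (those live in C/Assembly, as noted above), and every class and device name here is made up. It only shows the shape of the idea: the application calls one generic write, and the driver behind it handles the specifics.

```python
from abc import ABC, abstractmethod

class Driver(ABC):
    """Generic interface every (hypothetical) device driver must implement."""
    @abstractmethod
    def write(self, data: bytes) -> int:
        """Send data to the device and return the number of bytes accepted."""

class ConsoleDriver(Driver):
    def __init__(self):
        self.lines = []
    def write(self, data: bytes) -> int:
        self.lines.append(data.decode())  # a console shows text
        return len(data)

class DiskDriver(Driver):
    def __init__(self):
        self.blocks = bytearray()
    def write(self, data: bytes) -> int:
        self.blocks.extend(data)          # a disk stores raw bytes
        return len(data)

class ToyOS:
    """Routes application requests to whichever driver owns the device."""
    def __init__(self):
        self.devices = {}
    def register(self, name: str, driver: Driver) -> None:
        self.devices[name] = driver
    def write(self, name: str, data: bytes) -> int:
        return self.devices[name].write(data)

toy = ToyOS()
toy.register("console", ConsoleDriver())
toy.register("disk0", DiskDriver())

# The application uses the same call for both devices.
for device in ("console", "disk0"):
    print(device, toy.write(device, b"hello"))
```

Swapping the disk for a different model means writing a new driver, not rewriting the application.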

Though this is a limited list, most other functions are some variation of these basic ones. The exact functions will be covered in later blog entries.