The number of malware samples found on Internet of Things (IoT) devices has increased over the past few years, driven by two factors: default credentials are often left unchanged, and devices are misconfigured or not updated as often as they should be. As a result, these devices become easy targets for attackers, who take control of them either to spy on users or to launch attacks against other networks or systems. Additionally, malware authors are using open-source tools to modify their malware so that it evades detection by antivirus software.
In our ongoing effort to understand the evolving threat landscape for IoT devices, we recently collected 728 malware samples from our IoT honeypots over the course of 15 days. We then analyzed the malware samples and discovered new modification techniques malware authors are using to evade detection. They are also adopting new methods for crafting malicious files, exploiting a variety of vulnerabilities in IoT devices, and using command-and-control (C&C) servers to maintain control of compromised devices.
In this blog, we analyze collected samples, provide new detection and analysis techniques, and share Indicators of Compromise (IoCs) you can use to search your network for threats.
In our previous blog, How IoT Botnets Evade Detection and Analysis, published in March 2022, we discussed some widely used modifications that malware authors apply to samples after packing them with Ultimate Packer for Executables (UPX) to make decompression more difficult. Then, in August 2022, we published a blog detailing a UPX recovery tool we created to automate the decompression process for security researchers; a snapshot of that tool in use can be seen in Figure 1.
While using this tool consistently over the past few months, we discovered additional evasion techniques malware authors use in their samples. The idea behind these small modifications is to keep the file executable while preventing analysis tools from parsing it completely or even partially. This can be seen in Figure 2, which shows a failed recovery run caused by a modified Executable and Linkable Format (ELF) header.
We analyzed some protection techniques implemented in standard botnet samples that share code with several malware families, notably Kaiten/Tsunami. The behavior and functionality of this malware family were described in technical analysis blogs by MalwareMustDie and Stratosphere Lab.
The first protection methods, discussed in the MalwareMustDie blog, concerned a sample in which the UPX! signatures had been overwritten with 0A 00 00 00 and the entry point had been modified to start with a call instruction pointing to the original entry point code.
Later, according to Stratosphere Lab, the malware authors included additional countermeasures, such as adding junk bytes at the end of the file (in the overlay). Even if an analyst can find and fix the original offsets of all the UPX! signatures, the standard UPX tool still won’t be able to find the PackHeader structure because it expects that structure at the end of the file.
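The effect of these two countermeasures can be checked with a few lines of Python. The following is a minimal sketch, not the recovery tool mentioned above: it lists every offset where the UPX! magic still appears and reports whether one of them sits near the end of the file, where the standard UPX tool looks for the PackHeader. The function names and the 64-byte tail window are assumptions chosen for illustration.

```python
from typing import List

UPX_MAGIC = b"UPX!"


def find_upx_magic(data: bytes) -> List[int]:
    """Return every file offset at which the UPX! magic appears."""
    offsets = []
    pos = data.find(UPX_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(UPX_MAGIC, pos + 1)
    return offsets


def magic_near_end(data: bytes, window: int = 64) -> bool:
    """Report whether a UPX! magic sits within the last `window` bytes.

    The window size is an assumption for this sketch, not a value taken
    from the UPX sources.
    """
    tail_start = max(0, len(data) - window)
    return data.find(UPX_MAGIC, tail_start) != -1
```

An empty list from find_upx_magic suggests the signatures were overwritten, while magic offsets that exist but are far from the end of the file suggest that junk bytes were appended in the overlay.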
After our analysis, we noticed that the samples from this family have some new modifications that make them more difficult to automatically parse and analyze. The only new UPX-related protection we have seen (compared to the protections reported in other blogs) has been found in a couple of samples where, in addition to the overlay, the authors overwrote the PackHeader structure.
ELF Header Modifications
Focusing on the non-UPX protections, we compare an executable that we packed with UPX ourselves against one of the samples we found in our honeypots.
The screenshot in Figure 3 shows the ELF header of a normal UPX-packed file (left) versus the modified header from one of these standard bot samples that we harvested using our honeypots (right).
As seen in Figure 4, the most notable differences are in the e_shoff (offset 0x20), e_shnum (0x30), and e_shstrndx (0x32) fields, which have been overwritten with 0xFFFF values. These artificial modifications do not affect the execution of the malware, but they raise different kinds of errors when the sample is processed with different tools.
It makes sense that modifying these fields doesn’t affect execution: the section header table holds data needed for linking, relocating, and building the executable, whereas the loader relies on the program headers (segments) to map and run the file.
During our investigation, we found research showing, through fuzzing of the ELF header, that these three fields (e_shoff, e_shnum, e_shstrndx) can be freely modified, inducing errors in several analysis tools while keeping the sample executable. The researcher also shared C code that modifies ELF executables in the same way we see in the samples in our sandboxes, and we think these malware creators are using this same tool. Both the malware authors and the researcher’s tool overwrite only 2 bytes of the e_shoff field, which is 4 bytes long in a 32-bit ELF header.
For example, the Interactive Disassembler (IDA) showed a couple of warnings but was still able to open and analyze the samples. Hiew, another popular tool, was not able to open the file at all and terminated immediately. Even our UPX recovery tool had issues handling these samples: it crashed while parsing the ELF headers. In this case, pyelftools was the link in the analysis chain that failed. These unusual values appear in only a very small percentage of existing ELF files, which is why the library was not prepared to open these malware files. Because the project is open source, we were able to submit a couple of improvements that make the library more resilient to these kinds of unexpected values.
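To make the three fields concrete, here is a minimal Python sketch that reads them from a 32-bit ELF header at the offsets given above (0x20, 0x30, 0x32) and reports values that look tampered with. It is an illustration only, not the researcher’s tool or our own detection logic; the function name and the wording of the findings are assumptions, and the sketch handles only the 32-bit header layout.

```python
import struct
from typing import List

ELF32_HEADER_SIZE = 52  # size of the ELF32 file header in bytes


def section_header_anomalies(data: bytes) -> List[str]:
    """Return human-readable findings about the section header fields."""
    if len(data) < ELF32_HEADER_SIZE or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file, or header truncated")
    if data[4] != 1:  # EI_CLASS: 1 means ELFCLASS32
        raise ValueError("this sketch only handles 32-bit ELF files")

    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big endian
    (e_shoff,) = struct.unpack_from(endian + "I", data, 0x20)
    # e_shentsize (0x2E), e_shnum (0x30), and e_shstrndx (0x32) are adjacent.
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from(endian + "HHH", data, 0x2E)

    findings = []
    if e_shnum == 0xFFFF:
        findings.append("e_shnum is 0xFFFF")
    if e_shstrndx == 0xFFFF:
        findings.append("e_shstrndx is 0xFFFF")
    # A genuine section header table must fit inside the file.
    if e_shnum and e_shoff + e_shnum * e_shentsize > len(data):
        findings.append("section header table extends past end of file")
    return findings
```

The findings are hints rather than proof: a truncated download can also leave the section header table out of bounds, so a result like this is best treated as a reason to look closer at the sample.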
Debugging Using QEMU User-mode Emulation
To properly analyze all aspects of malicious behavior, it makes sense to debug the samples of interest rather than perform static analysis only. In this particular case, all of them were created to run on Linux environments common for IoT devices (mainly ARM and MIPS architectures) rather than for x86 (32- or 64-bit) which is common for analysts’ host machines. To be able to debug them without a need to set up ARM or MIPS hardware, it is common to use emulators. Rather than the full-system emulation mode, we decided to use QEMU user-mode emulation. This tool has a very simple setup where you can directly install it from some Linux distributions’ repositories and execute IoT samples on x86 hardware straight away; a snapshot can be seen in Figure 5.
However, this simplicity has a major downside. If you decide to debug the sample remotely via the GDB protocol, the first thing that may seem unusual is that the sample needs execute permission on the local file system. When restarting the sample a few times, you may also notice that its behavior changes. In Figure 6, for example, the ARM sample at first ran without printing anything, but subsequent executions printed the message shown in the figure.
This is the main difference between virtualization software that emulates the whole OS and this QEMU user-mode setup. In user mode, QEMU translates only the guest CPU instructions and forwards the sample’s system calls to the host kernel. The emulated sample therefore had full access to the local x86-64 host machine on which it ran, and it was able to change local files and add a new crontab task. Be extremely careful when using QEMU in user mode to emulate samples compiled for other architectures. It may not be obvious at first, but a sample can do real harm even though the architectures do not match. Always perform the analysis on a dedicated, isolated machine (physical or virtual).
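As an illustration of the setup described in this section, the following Python sketch picks a QEMU user-mode binary from a sample’s ELF header and builds a command line that waits for a GDB connection. It only constructs the argument list and never runs the sample, which should happen solely inside an isolated analysis machine. The function name, the default port 1234, and the set of architectures covered are assumptions for this example; the binary names and the -g option are those provided by QEMU user-mode emulation.

```python
from typing import List

# QEMU user-mode binaries keyed by (e_machine, EI_DATA).
# e_machine: 40 = ARM, 8 = MIPS, 3 = Intel 386.
# EI_DATA: 1 = little endian, 2 = big endian.
QEMU_USER_BINARIES = {
    (40, 1): "qemu-arm",
    (40, 2): "qemu-armeb",
    (8, 1): "qemu-mipsel",
    (8, 2): "qemu-mips",
    (3, 1): "qemu-i386",
}


def qemu_gdb_command(header: bytes, sample_path: str, gdb_port: int = 1234) -> List[str]:
    """Build (but do not run) a QEMU user-mode command that waits for GDB."""
    if len(header) < 0x14 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file, or header truncated")
    ei_data = header[5]
    byteorder = "little" if ei_data == 1 else "big"
    e_machine = int.from_bytes(header[0x12:0x14], byteorder)
    try:
        binary = QEMU_USER_BINARIES[(e_machine, ei_data)]
    except KeyError:
        raise ValueError("architecture not covered by this sketch") from None
    # "-g PORT" makes QEMU wait for a debugger on that port before starting.
    return [binary, "-g", str(gdb_port), sample_path]
```

Once such a command is running inside the isolated machine, a multi-architecture GDB build can attach with target remote localhost:1234.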
ELF Header Modification YARA Detections
Analyzing the samples we receive in our honeypots let us detect anti-analysis techniques that evade some tools by preventing them from loading the file properly. We wanted to know whether we were receiving more samples that implement similar techniques, both to make sure we were extracting all the information from them and to keep learning about the latest techniques in the IoT malware environment.
Our research led us to several YARA rules that describe and detect sneaky modifications that keep the ELF files executable but can cause some analysis software to return errors.
Let’s take a look at the 728 malicious samples we harvested from our IoT honeypots within 15 days (Figure 7). Of the total, 696 (95%) are ELF files. ARM is the most targeted architecture (539 samples), followed by MIPS (40 samples) and 386 (35 samples), each accounting for less than a tenth of the samples. The remaining non-ELF files are mainly bash and Python scripts used in different stages of infecting the machine.
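A breakdown like this can be produced directly from the ELF headers. The sketch below is a hypothetical illustration rather than the pipeline we used: it reads the e_machine field (offset 0x12) of each sample and counts the results. The function names and category labels are assumptions.

```python
from collections import Counter
from typing import Iterable

# e_machine values for the architectures named above.
MACHINE_NAMES = {40: "ARM", 8: "MIPS", 3: "386"}


def classify_sample(data: bytes) -> str:
    """Label a sample by architecture, or as non-ELF (for example a script)."""
    if len(data) < 0x14 or data[:4] != b"\x7fELF":
        return "non-ELF"
    # EI_DATA (byte 5) gives the byte order of the multi-byte header fields.
    byteorder = "little" if data[5] == 1 else "big"
    e_machine = int.from_bytes(data[0x12:0x14], byteorder)
    return MACHINE_NAMES.get(e_machine, "other ELF")


def tally_architectures(samples: Iterable[bytes]) -> Counter:
    """Count samples per architecture label."""
    return Counter(classify_sample(sample) for sample in samples)
```

Because only e_machine and the byte-order flag are read, this tally keeps working on samples whose section header fields have been overwritten.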
The constant analysis of samples helps us track their evolution and understand how malware developers protect their creations from analysis tools and improve these protections over time. To protect yourself from these attacks, we recommend using strong passwords on devices that are accessible from outside networks, setting up firewalls between your network and external connections, installing antivirus software on all devices, and monitoring any changes in device behavior or performance (for example, an increase in CPU usage).