By Geoff Twardokus
Given the notoriety that ransomware has achieved over the last few years, I thought it would be interesting to compare some of the major variants in circulation, looking for similarities in design and execution that might suggest preventative actions users could take to protect themselves from multiple threats at once. In my testing I looked at various samples of GandCrab and Sodinokibi (aka REvil), which have previously been linked to a common author based on their distinctive method of constructing URLs[i]. Unfortunately, the results of my experiments were not encouraging. Not only do many variants of the same strain differ from one another, but in general the different variants have very little in common beyond their ultimate result – an encrypted filesystem.
For all of the analysis performed, I used two virtual machines. The first, the analysis machine used for disassembling and decompiling ransomware samples, was running Windows 10 version 1903. The second, the victim machine on which all dynamic analysis was performed, was running Windows XP SP3. Both machines were virtualized with VMware Workstation 15.3.
The samples themselves were obtained from http://virusshare.com. I should note that I cherry-picked samples to a certain extent: I have limited experience with decompiling malware, and I was not able to successfully unpack some of the more heavily obfuscated samples (particularly of Sodinokibi/REvil) that are designed to resist analysis and reverse engineering.
The first step for any malware research is some basic static analysis. For this, I used two tools – Detect It Easy (DiE) and PEStudio. The most obvious difference between GandCrab and Sodinokibi at this stage is that cryptographic API imports are visible in the former and not in the latter. Both families do in fact use crypt32.dll to generate an RSA key pair for encrypting the victim’s files, but REvil obfuscates these library calls to avoid detection by antivirus. As a result, static analysis shows nothing related to cryptography in the REvil executables, whereas GandCrab’s use of the native cryptography API is evident, as shown below.
Figure 1 – GandCrab imports native cryptography API functions
Figure 2 – GandCrab uses the crypt32.dll library
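For readers without DiE or PEStudio to hand, the same first-pass check can be roughly approximated with a plain byte scan. The sketch below is only that: it does not parse the PE import table, and the marker list is my own illustrative selection rather than anything taken from the tools above. An obfuscated sample such as REvil would be expected to return an empty list, which is exactly the blind spot described above.

```python
# Illustrative, non-exhaustive marker list chosen for this sketch.
CRYPTO_MARKERS = [
    b"crypt32.dll",
    b"CryptGenKey",
    b"CryptExportKey",
    b"CryptEncrypt",
]


def find_crypto_markers(data):
    """Return the markers that appear (case-insensitively) in raw bytes."""
    lowered = data.lower()
    return [m.decode("ascii") for m in CRYPTO_MARKERS if m.lower() in lowered]


def scan_file(path):
    """Read a file from disk and report which markers it contains."""
    with open(path, "rb") as fh:
        return find_crypto_markers(fh.read())
```

Calling `scan_file` on a sample path returns a list of the matched names. A hit only suggests that the name is present as a plain string; it says nothing about whether the function is actually called.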
The next area of comparison was network traffic generated by the various strains when they are executed. First, I looked at what GandCrab does over the network. The figure below shows a Wireshark capture of the network traffic generated by a GandCrab sample immediately after it is executed.
Figure 3 – Wireshark capture of GandCrab network traffic
As shown in frames 6/7 and 18/19, GandCrab immediately attempts to determine the local machine’s IP address and hostname. At the same time, this version of GandCrab establishes a TCP session to retrieve some data over HTTP. The HTTP reply (frame 13 above) contained only the IP address of the victim machine, perhaps verifying the victim’s identity to the locally executing binary. This is shown here:
Figure 4 – GandCrab HTTP response contents
I suspect this exchange is intended either to verify that the machine is connected to the Internet or to avoid executing in a virtual environment: a virtual machine that is not bridged to the host network may appear to the Internet under a different IP address than the one reported by the local network utilities.
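To make the second hypothesis concrete, here is a minimal sketch of the kind of comparison I have in mind. I did not recover this logic from the sample, so the function name and the three outcomes are purely my own assumptions about how such a check could be written.

```python
import ipaddress


def describe(local_ip, external_ip):
    """Compare the locally reported address with the one seen from outside."""
    local = ipaddress.ip_address(local_ip)
    external = ipaddress.ip_address(external_ip)
    if local == external:
        return "match"
    if local.is_private:
        # Typical of NAT, which is how a non-bridged VM usually reaches the Internet.
        return "mismatch (local address is private, likely behind NAT)"
    return "mismatch"
```

Note that most home and office machines sit behind NAT as well, so a mismatch on its own would be a weak signal of virtualization.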
Unfortunately, all of the other GandCrab versions I tested attempted to reach out to domains that were not up at time of execution, so I was not able to determine what data would have been transferred upon successful connection. This is likely due to the documented tendency of GandCrab’s authors to frequently change domains for download links[ii], presumably to avoid AV detection and preemptively dodge takedown orders from law enforcement. However, it is interesting that GandCrab, possibly uniquely[iii], uses the “.bit” TLD for its URLs. This can be seen in the Host header of the figure below, which shows the first HTTP request sent by GandCrab.
Figure 5 – GandCrab HTTP request showing use of “.bit” TLD
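If the “.bit” TLD really is distinctive, it makes for a cheap indicator when triaging captured traffic. The following is a rough sketch that pulls the Host header out of a raw HTTP request and tests its TLD. The helper names are hypothetical, and the parsing is deliberately simple (no IPv6 literals, no folded headers).

```python
def host_header(raw_request):
    """Return the value of the Host header from a raw HTTP request, or None."""
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:  # blank line marks the end of the headers
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip()
    return None


def uses_bit_tld(host):
    """True if the host name (ignoring any port) ends in the .bit TLD."""
    if not host:
        return False
    hostname = host.split(":")[0].lower().rstrip(".")
    return hostname.endswith(".bit")
```

For example, `uses_bit_tld(host_header(request_text))` would flag a request whose Host header names a “.bit” domain.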
Based on this behavior, I believe this sample was an early version of GandCrab because later versions are known to use an entirely different method of connecting to URLs. This later, more sophisticated method can be seen in the figure below, which shows the randomly selected Request URI and Host header of a different GandCrab strain’s first HTTP request.
Figure 6 – Alternate GandCrab HTTP request
This difference in behavior demonstrates the difficulty of comparing samples as they can behave completely differently even within a single family of ransomware.
After looking at GandCrab’s network traffic, I looked at what REvil sends over the wire. A caveat is necessary here – some of what REvil does is over SSL/TLS, so my analysis cannot completely describe what REvil sends and receives. However, it does send some unencrypted HTTP traffic as well, as shown below.
Figure 7 – REvil’s first HTTP request
Unlike either variant of GandCrab, REvil’s first HTTP request uses the POST method (as opposed to the GET method used by GandCrab) and the request is sent using HTTP/1.0 rather than version 1.1. The use of an outdated HTTP version is interesting and might indicate an effort to fly under the radar, although it might also have to do with this test being run on Windows XP.
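The method and protocol version both live on the first line of the request, so they are easy to extract when comparing captures side by side. This is a small sketch under the assumption that the request has been exported as text; the function names are my own.

```python
def parse_request_line(raw_request):
    """Split the first line of a raw HTTP request into its three parts."""
    request_line = raw_request.split("\r\n", 1)[0]
    method, target, version = request_line.split(" ", 2)
    return {"method": method, "target": target, "version": version}


def is_http10_post(raw_request):
    """True for the combination observed in the REvil capture: POST over HTTP/1.0."""
    parsed = parse_request_line(raw_request)
    return parsed["method"] == "POST" and parsed["version"] == "HTTP/1.0"
```

Plenty of legitimate software still speaks HTTP/1.0, so this combination is better treated as a reason to look closer than as a signature.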
The most notable difference I observed between the variants tested was the time to encrypt the entire hard disk and deploy a ransom note. I have constructed a table with the variants tested. All times are based on encrypting a 20GB virtual hard drive with 2.43GB in use at time of encryption.
| Variant | Executable SHA-256 hash | Time (m:s) |
|---|---|---|
| REvil | 968197f146fdc42ee0f7985b992349f8ae6ddc7725b50860067fbd52b8a283f7 | Failed to execute |
| REvil | 6606987e6513c7738bcdfaa3d8422ef8a0385aa229ebea26de11e27074f6882e | Failed to execute |
| GandCrab | 446ff9c070965e04507a2a2761ebfccea84ba603cdd1742274568107d0ee55f6 | Got stuck trying to resolve “dns.soprodns.ru” |
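Anyone wanting to check a sample against the hashes in the table can compute the SHA-256 digest locally. The sketch below reads the input in chunks so that large files do not need to fit in memory; the function names and the chunk size are my own choices.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary file-like object in fixed-size chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Hash a file on disk and return the lowercase hex digest."""
    with open(path, "rb") as fh:
        return sha256_of_stream(fh)
```

The returned string can be compared directly with the table entries, which are lowercase hex as well.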
This data shows that REvil encrypts the victim’s files significantly faster than GandCrab does. This is because REvil makes use of I/O completion ports (IOCPs)[iii] to efficiently multithread the encryption process, whereas GandCrab sequentially encrypts files without any apparent optimization.
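The general effect of overlapping I/O-bound work can be illustrated with a harmless stand-in. To be clear about the substitutions: the sketch below uses a Python thread pool rather than Windows I/O completion ports, hashes in-memory buffers rather than encrypting files, and simulates disk latency with a short sleep. It is not a model of either family’s code, only of why processing items one at a time is slower when each item spends most of its time waiting.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def process_item(payload):
    """Stand-in for per-file work: a simulated I/O wait followed by a hash."""
    time.sleep(0.01)  # simulated disk latency (an assumption of this sketch)
    return hashlib.sha256(payload).hexdigest()


def process_sequentially(items):
    """Handle one item at a time, waiting out each simulated I/O delay in turn."""
    return [process_item(p) for p in items]


def process_concurrently(items, workers=4):
    """Handle items on a small thread pool so the waits overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_item, items))
```

Both functions return identical results in the same order; only the wall-clock time differs, and the gap grows with the share of time each item spends waiting on I/O.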
Using IDA Freeware v7, I was able to locate the GandCrab key generation subroutines and look up the constant values pushed onto the stack to confirm RSA was the algorithm used for encrypting the user’s files.
Figure 8 – GandCrab RSA key generation subroutine – note the “8000001h” hex value passed as dwFlags indicates a 2048-bit key pair is to be generated with the call to CryptGenKey
Figure 9 – GandCrab RSA KeyBlob export subroutines
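The reading of the dwFlags constant in Figure 8 can be checked by hand. For CryptGenKey, the upper 16 bits of dwFlags carry the requested key size in bits and the low bits carry option flags such as CRYPT_EXPORTABLE (0x00000001). A short sketch of that decoding follows; the function name is my own.

```python
CRYPT_EXPORTABLE = 0x00000001  # value from wincrypt.h


def decode_genkey_flags(dw_flags):
    """Split a CryptGenKey dwFlags value into (key size in bits, exportable?)."""
    key_bits = (dw_flags >> 16) & 0xFFFF
    exportable = bool(dw_flags & CRYPT_EXPORTABLE)
    return key_bits, exportable


print(decode_genkey_flags(0x08000001))  # (2048, True)
```

The exportable bit being set is consistent with the KeyBlob export subroutines shown in Figure 9.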
REvil performs encryption in a similar manner, but the API calls to crypt32.dll are obfuscated within the original sample so as to avoid detection[iv]. Since the crypt32.dll imports cannot be seen in IDA, I used procmon from the Sysinternals Suite to observe REvil’s use of the cryptography libraries during execution.
Figure 10 – REvil’s initial loading of crypt32.dll to generate a key pair
Figure 11 – REvil’s storage of the key pair in the registry
Figure 12 – The REvil key pair stored in the registry at HKEY_LOCAL_MACHINE\SOFTWARE\recfg
Immediately after generating the key pair, REvil attempts to take control of a native process in order to run as a whitelisted program. On the victim machine I used for testing, REvil overwrote mspaint.exe with a malicious executable.
Figure 13 – REvil overwriting mspaint.exe with a malicious executable
Before hijacking MS Paint, REvil attempted to find several other binaries native to later versions of Windows. This indicates that REvil contains a list of native Windows binaries that it can attempt to hijack at this stage of infection, which gives it some versatility and makes infections more difficult to identify in their initial stages.
This analysis provides an overview of several differences between the GandCrab and REvil ransomware families. As I continue to learn the intricacies of disassembling and decompiling malware, I hope to be able to take a deeper look at this to search for code-level similarities that might be present yet invisible to this type of higher-level inspection.