A close look at malicious documents (Part II)

This is the second article about the analysis of malicious documents observed in March 2018.

You can read the first part here: A close look at malicious documents (Part I)


  • rtfobj tool – part of python-oletools package – “rtfobj is a Python module to detect and extract embedded objects stored in RTF files, such as OLE objects. It can also detect OLE Package objects, and extract the embedded files.” https://github.com/decalage2/oletools/wiki/rtfobj
  • 7zip – archiver
  • Hexinator – a powerful hex editor
  • Notepad++ – a powerful text editor

Sample 3 – march 2018 order.docx

Link: https://www.hybrid-analysis.com/sample/5b9465b5297fbb2031f3fc55f010f6080a3bb50e4a05be77b2cedfa26dc3155b/5aa089f27ca3e12b460a0316

Format: docx

SHA256: 5b9465b5297fbb2031f3fc55f010f6080a3bb50e4a05be77b2cedfa26dc3155b

Analysis date: 7 March 2018

This is a docx file – in OOXML format – so we can unzip it with 7zip. We can see the following directory structure:

Fig 1: directory structure of unzipped “march 2018 order.docx”
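The same listing can be produced without 7zip, since an OOXML file is an ordinary zip archive. A minimal sketch using only the Python standard library; the function name list_docx_parts is hypothetical, and the sample should only be handled on an isolated analysis machine:

```python
import io
import zipfile


def list_docx_parts(docx_bytes):
    """Return the sorted names of all parts stored in an OOXML (zip) container."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return sorted(archive.namelist())


# Usage (path is a placeholder):
#   with open("/path/to/sample.docx", "rb") as handle:
#       for name in list_docx_parts(handle.read()):
#           print(name)
```

Reading the archive in memory avoids writing the individual parts to disk before they have been inspected.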

By examining footer2.xml and its relationship file, footer2.xml.rels, we can observe that the document references an external Relationship of type oleObject.

Fig 2: footer2.xml.rels contains external object.
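External relationships like this one can also be found programmatically. The sketch below parses a .rels part and returns every Relationship whose TargetMode attribute is "External". The function name is hypothetical; the namespace is the standard OOXML package relationships namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by OOXML package relationship (.rels) parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(rels_xml):
    """Return (type, target) pairs for relationships marked TargetMode="External"."""
    root = ET.fromstring(rels_xml)
    found = []
    for rel in root.iter(RELS_NS + "Relationship"):
        if rel.get("TargetMode") == "External":
            found.append((rel.get("Type"), rel.get("Target")))
    return found
```

Running this over every *.rels entry of the unzipped document would flag remote targets in headers, footers, and the main document body alike.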

The link to the external resource is hxxp://bit.ly/2FAcCe3, which redirects to hxxp://babymama.co.ke/0802/3/word.doc

We can append a plus sign (+) to the end of a shortened bit.ly URL to see its visit statistics.

Fig 3: visit statistics for the embedded bit.ly link

word.doc is an rtf document. In the A close look at malicious documents (Part I) post, I manually extracted the OLE objects embedded in the rtf file (sample 2). This time, I use the rtfobj tool to extract the OLE objects and dump them to the file system.

rtfobj.exe [/path/to/rtffile] -s all -d [/path/to/dump/dir]

Fig 4: using rtfobj to extract embedded OLE objects/packages
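To illustrate the kind of work rtfobj automates, here is a deliberately simplified sketch that finds \objdata groups in an RTF file and hex-decodes them. This is not how rtfobj is implemented; it assumes clean, unobfuscated hex following the control word, which malicious samples often do not provide. The function name is hypothetical.

```python
import re

# Matches the \objdata control word followed by whitespace and a run of hex digits.
OBJDATA_RE = re.compile(rb"\\objdata\s+([0-9A-Fa-f\s]+)")


def extract_objdata(rtf_bytes):
    """Return (offset, decoded_bytes) for each \\objdata hex blob in an RTF file."""
    blobs = []
    for match in OBJDATA_RE.finditer(rtf_bytes):
        hex_digits = re.sub(rb"\s+", b"", match.group(1))
        if len(hex_digits) % 2:
            # Drop a dangling nibble so the hex string decodes cleanly.
            hex_digits = hex_digits[:-1]
        blobs.append((match.start(), bytes.fromhex(hex_digits.decode("ascii"))))
    return blobs
```

The decoded bytes are the raw object data as stored in the RTF, so any embedded file still has to be located inside them.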

Opening these files in Hexinator, we can observe that “2FAcCe3_object_000000EF.package” contains a PE file. The 4 bytes before “MZ” give the length of the embedded object (which is a PE file).

Fig 5: examining the content of the OLE package
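The length field can be used to carve the PE file out of the package. A minimal sketch, assuming the 4-byte length is stored little-endian and that the first "MZ" in the file marks the start of the PE (a file name or path containing "MZ" earlier in the package would break this assumption). The function name is hypothetical.

```python
import struct


def carve_pe_from_package(package_bytes):
    """Read the 4-byte length stored just before 'MZ' and return that many bytes."""
    offset = package_bytes.find(b"MZ")
    if offset < 4:
        # No 'MZ' marker, or no room for a length field in front of it.
        return None
    # Assumption: the length is an unsigned 32-bit little-endian integer.
    (size,) = struct.unpack_from("<I", package_bytes, offset - 4)
    return package_bytes[offset:offset + size]
```

Comparing the carved size with the value shown in the hex editor is a quick sanity check before hashing or uploading the file.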

I uploaded the extracted PE file to VirusTotal on March 31st.

Fig 6: VirusTotal result

[Update] After publishing this post, James () mentioned that this is #formbook malware


docx (external Relationship) -> rtf (OLE package) -> PE file

Sample 4 – document2018-03-20-104831.doc

Link: https://www.hybrid-analysis.com/sample/1f039c63654cbaab2d666427c74a0d5f4c4f3cd3eb581ae7f6351e65f57173e3

Format: rtf

SHA256: 1f039c63654cbaab2d666427c74a0d5f4c4f3cd3eb581ae7f6351e65f57173e3

Analysis date: 21 March 2018

First, we use rtfobj to examine this rtf document.

Fig 7: using rtfobj to extract OLE objects/packages

The rtf file contains one OLE object. Its class name, b'CNBo8Z3Oqbh02JzEwftSelDsq', is not a known class name (such as Package), so the class name is ignored.

When we open the OLE object in Hexinator, we see:

Fig 8: examining the content of the OLE object with Hexinator
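The hex view in Fig 8 can be complemented by a short script that searches an extracted object for URLs. The sketch below looks for both ASCII and UTF-16LE encoded URLs, since strings inside OLE data may be stored either way; which encoding this particular sample uses is not assumed. The function name and patterns are hypothetical.

```python
import re

# Printable, non-space ASCII after the scheme.
URL_ASCII = re.compile(rb"https?://[\x21-\x7e]+")
# The same pattern with each character followed by a zero byte (UTF-16LE).
URL_UTF16 = re.compile(
    rb"h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00(?:[\x21-\x7e]\x00)+"
)


def find_urls(blob):
    """Return URLs found in a binary blob, ASCII matches first, then UTF-16LE."""
    urls = [m.group(0).decode("ascii") for m in URL_ASCII.finditer(blob)]
    urls += [m.group(0).decode("utf-16-le") for m in URL_UTF16.finditer(blob)]
    return urls
```

Any URL recovered this way should be defanged (for example hxxp instead of http) before it is pasted into notes or reports.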

Fig 8 depicts the content of the extracted OLE object. We can see a link to a remote WSDL file (hxxp://my.mixtape.moe/tzsfgh). When we open that file, it contains a lot of extra whitespace (e.g. the first few hundred lines are empty). We can use the replace feature in Notepad++ to remove the extra whitespace:

Fig 9: using notepad++ to remove extra whitespaces from the WSDL file
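The same cleanup can be scripted. A minimal sketch that drops empty lines and collapses runs of spaces and tabs; the exact search pattern used in Notepad++ is not stated above, so this is an approximation, and the function name is hypothetical.

```python
import re


def strip_padding(text):
    """Drop blank lines and collapse runs of spaces/tabs to a single space."""
    lines = []
    for line in text.splitlines():
        collapsed = re.sub(r"[ \t]+", " ", line).strip()
        if collapsed:
            lines.append(collapsed)
    return "\n".join(lines)
```

Note that this also collapses whitespace inside string literals, which is acceptable for reading the code but not for reproducing it byte for byte.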

After removing the extra characters, we can see the following:

Fig 10: after removing the whitespaces

The embedded code is in C#. To see what is going on in the code, we can debug it with the Visual C# debugger. I created a Console app, copied the whole C# code into its main function, and then formatted the code. You can see this code at https://pastebin.com/dSdwau7p.

Fig 11: compiling the c# code

UxSul is an array with a length of 6522 chars (each char is 2 bytes). UxSul is converted to a String object on line 6568. Let’s run the code and see the generated string.

Fig 12: content of generated string (BCvBK)
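The char-array-to-string step can also be reproduced without running the sample's code. A minimal sketch in Python, assuming the array values have been copied out of the C# source as integer UTF-16 code units; the function name and the input list are hypothetical.

```python
import struct


def chars_to_string(code_units):
    """Rebuild a string from a list of 16-bit code units, as a C# char[] would hold."""
    # Pack each value as an unsigned 16-bit little-endian integer, then decode.
    raw = struct.pack("<%dH" % len(code_units), *code_units)
    return raw.decode("utf-16-le")


# Usage with a short stand-in list (not the real UxSul contents):
#   chars_to_string([99, 108, 97, 115, 115])
```

At 2 bytes per char, the 6522-element array corresponds to 13,044 bytes of packed data.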

Fig 12 shows the content of the BCvBK variable, which is another C# application. The generated C# code is compiled and then invoked by the code on lines 6569-6581. You can get the embedded C# code from https://pastebin.com/3i8TZryf.

The embedded C# code contains machine code, which is injected into its own process and executed, as shown in Fig 13.

Fig 13: embedded c# code contains machine code


