WeChat (微信), the most popular mobile IM app in China, provides no way for users to export well-formatted message history. This tool can parse and export WeChat messages from a rooted Android phone.
Right now it can dump messages in text-only mode, or generate a single-file HTML page containing voice messages, images, emoji, etc.
NEWS: WeChat 6.0+ uses silk to encode audio. The code is updated.
NEWS: WeChat 6.3 uses a new avatar storage. The code is updated.
HELP NEEDED: Since May 2016, the first 1KB of every emoji file in resource/emoji has been encrypted. Right now I'm using the emoji URLs, which cover most of them.
If you are good at cryptography or reverse engineering, or you work at Tencent, feel free to contact me or help take a look. It should also be possible to recover an image without the first 1KB (by detecting chunks without knowing the metadata), but I don't have time to do that either.
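As a rough illustration of the chunk-detection idea, the sketch below scans a byte string for CRC-valid PNG chunks, starting after the first 1KB. It assumes the emoji is a PNG, which is not guaranteed (a GIF would need a different approach), and `find_png_chunks` is a hypothetical helper that is not part of this tool.

```python
import struct
import zlib


def find_png_chunks(data: bytes, start: int = 1024):
    """Return (offset, chunk_type, data_length) for each CRC-valid PNG chunk
    found at or after `start`. Assumes PNG layout: 4-byte big-endian length,
    4-byte ASCII type, payload, 4-byte CRC32 over type + payload."""
    found = []
    pos = start
    while pos + 12 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        # A plausible chunk has an alphabetic type and fits inside the file.
        if ctype.isalpha() and end <= len(data):
            (crc,) = struct.unpack(">I", data[end - 4:end])
            if zlib.crc32(data[pos + 4:end - 4]) == crc:
                found.append((pos, ctype.decode("ascii"), length))
                pos = end  # jump to the byte after this chunk
                continue
        pos += 1  # no valid chunk here; slide forward one byte
    return found
```

Checking the CRC of every candidate keeps false positives low, at the cost of a slow byte-by-byte scan. The image header (IHDR) would still have to be guessed or rebuilt separately, since it lives in the encrypted region.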
If this tool works for you, please take a moment to add your phone/OS to the wiki. If it doesn't work, you will probably have to investigate it yourself, as the behavior may differ from phone to phone.
- adb and a rooted Android phone connected to a machine running Linux, Mac OS X, or Win10+Bash. If the phone does not come with adb support, you can download an app such as https://play.google.com/store/apps/details?id=eu.chainfire.adbd
- Python >= 3.6
- PyQuery, javaobj-py3, Pillow, requests
- sqlcipher >= 4.1, pysqlcipher3
- sox (command line tool)
- csscompressor (suggested, optional)
- Silk audio decoder (included; build with ./third-party/compile_silk.sh)
- Pull database file and avatar index:
  - Automatic: ./android-interact.sh db. It may use an incorrect userid.
  - Manual:
    - Figure out your ${userid} by inspecting the contents of /data/data/com.tencent.mm/MicroMsg on the root filesystem of the device. It should be a 32-character-long name consisting of hexadecimal digits.
    - Get /data/data/com.tencent.mm/MicroMsg/${userid}/{EnMicroMsg.db,sfs/avatar.index} from the device.
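To narrow down the manual search, the sketch below filters a directory listing for names that look like a userid, that is, exactly 32 hexadecimal digits. `candidate_userids` is a hypothetical helper, and the listing itself is assumed to have been obtained separately (for example by running ls on the MicroMsg directory in a root shell).

```python
import re

# A userid is described as a 32-character name made of hexadecimal digits.
USERID_RE = re.compile(r"[0-9a-f]{32}", re.IGNORECASE)


def candidate_userids(names):
    """Return the entries of `names` that look like a WeChat userid directory."""
    return [name for name in names if USERID_RE.fullmatch(name)]


if __name__ == "__main__":
    # Example listing; the entries here are made up for illustration.
    listing = ["CheckResUpdate", "0123456789abcdef0123456789abcdef", "crash"]
    print(candidate_userids(listing))
```

If more than one name matches, the phone has probably been used with several accounts, and each candidate may need to be tried.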
- Decrypt database file:
  - Automatic: ./decrypt-db.py decrypt --input EnMicroMsg.db
  - Manual:
    - Get the WeChat uin (an integer). Possible ways are:
      - ./decrypt-db.py uin, which looks for the uin in /data/data/com.tencent.mm/shared_prefs/
      - Log in to web WeChat and get wxuin=1234567 from document.cookie
    - Get your device id (a positive integer). Possible ways are:
      - ./decrypt-db.py imei implements some ways to find the device id.
      - Call *#06# on your phone
      - Find the IMEI in system settings
    - Decrypt the database with the combination of uin and device id:
      ./decrypt-db.py decrypt --input EnMicroMsg.db --imei <device id> --uin <uin>
      NOTE: you may need to try different ways to get the device id and find one that can decrypt the database. Some phones have multiple IMEIs; you may need to try them all. See #33. The command writes the decrypted database to EnMicroMsg.db.decrypted.
    - If decryption doesn't work, you can also try the password cracker to brute-force the password.
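For readers wondering how the uin and device id combine, the scheme commonly reported for EnMicroMsg.db is that the database password is the first 7 hex characters of MD5(device id + uin). The sketch below assumes that scheme. `derive_db_key` and `candidate_keys` are hypothetical names for illustration, not this tool's API.

```python
import hashlib
import itertools

HEX_DIGITS = "0123456789abcdef"
KEY_LENGTH = 7  # assumed password length under the commonly reported scheme


def derive_db_key(device_id: str, uin: str) -> str:
    """Assumed derivation: first 7 hex chars of md5(device_id + uin)."""
    digest = hashlib.md5((device_id + uin).encode("ascii")).hexdigest()
    return digest[:KEY_LENGTH]


def candidate_keys(length: int = KEY_LENGTH):
    """Yield every lowercase hex string of the given length, in order."""
    for combo in itertools.product(HEX_DIGITS, repeat=length):
        yield "".join(combo)


if __name__ == "__main__":
    # Placeholder values; substitute your own device id and uin.
    print(derive_db_key("358000000000000", "1234567"))
    # Size of the search space a brute-force cracker would have to cover.
    print(len(HEX_DIGITS) ** KEY_LENGTH)
```

Under this assumption the key space is only 16^7 (about 268 million) candidates, which is why brute-forcing the password is a realistic fallback when the right device id cannot be found.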
- Copy the WeChat user resource directory /mnt/sdcard/tencent/MicroMsg/${userid}/{avatar,emoji,image2,sfs,video,voice2} from the phone to the resource directory: ./android-interact.sh res
  - You might need to change RES_DIR in the script if the default is incorrect on your phone.
  - This script needs the tar and base64 commands on your phone. If they are not available, there is a slow fallback method in the script you can use.
  - This can take a few minutes. One way to do it faster:
    - If there's enough free space on your phone, you can log in and archive all required files via tar, with or without compression, and use adb pull to copy the archive. Note that busybox is needed, as the Android system's tar may choke on long paths.
  - What you'll need in the end is a resource directory with the following subdirectories: avatar, emoji, image2, sfs, video, voice2.
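Before moving on, it can help to confirm that the copy produced the expected layout. The sketch below reports which of the six subdirectories are missing. `missing_resource_subdirs` is a hypothetical helper, not something shipped with this tool.

```python
from pathlib import Path

# The subdirectories the resource directory is expected to contain.
REQUIRED_SUBDIRS = ("avatar", "emoji", "image2", "sfs", "video", "voice2")


def missing_resource_subdirs(resource_dir):
    """Return the required subdirectory names that are absent from `resource_dir`."""
    root = Path(resource_dir)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_resource_subdirs("resource")
    if missing:
        print("Missing subdirectories:", ", ".join(missing))
    else:
        print("resource directory looks complete")
```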
- (Optional) Download the emoji cache from here and decompress it under wechat-dump. This will avoid downloading too many emojis during rendering.
    wget -c https://github.com/ppwwyyxx/wechat-dump/releases/download/0.1/emoji.cache.tar.bz2
    tar xf emoji.cache.tar.bz2
- Parse and dump text messages of every chat (requires decrypted database): ./dump-msg.py decrypted.db output_dir
- List all chats (requires decrypted database): ./list-chats.py decrypted.db
- Generate a statistics report on text messages (requires output_dir from ./dump-msg.py): ./count-message.sh output_dir
- Dump messages of one contact to HTML, containing voice messages, emojis, and images (requires decrypted database, avatar.index, and resource): ./dump-html.py "<contact_display_name>"
  The output file is output.html. Check ./dump-html.py -h to use different paths.
Screenshots of generated html:
See here for an example html.
- Attack the emoji encryption problem
- Fix rare unhandled message types: > 10000 and < 0
- Better user experiences... see grep 'TODO' wechat -R