Automatically extracts API Keys from APK files
Automatically extracts API Keys from APK (Android Package) files
For technical details, check out my thesis (Automatic extraction of API Keys from Android applications) and, in particular, Chapter 3 and 4 of the work.
The library responsible for identifying the API Keys is a standalone project.
Searches for API Keys embedded in Android String Resources, Manifest metadata, Java code (included Gradle's BuildConfig), Native code.
pip3 install -r requirements.txt
$ git clone --recursive https://github.com/alessandrodd/apk_api_key_extractor.git
$ cd apk_api_key_extractor
$ cp config.example.yml config.yml
$ pip3 install -r requirements.txt
$ python3 main.py
$ git clone https://github.com/alessandrodd/ApiKeyTestApp.git
$ cd apk_api_key_extractor
$ python3 main.py --analyze-apk ../ApiKeyTestApp/apk/apikeytestapp_obfuscated.apk
usage: main.py [-h] [--debug] [--analyze-apk APK_PATH] [--monitor-apks-folder]
A python program that finds API-KEYS and secrets hidden inside strings
optional arguments:
-h, --help show this help message and exit
--debug Print debug information
--analyze-apk APK_PATH
Analyze a single apk to find hidden API Keys
--monitor-apks-folder
Monitors the configured apks folder for new apks. When
a new apk is detected, the file is locked and analysis
starts.
Let's say you want to find any API Key hidden inside a set of Android apps.
Using the default settings:
python3 main.py --monitor-apks-folder
Note that this software is process-safe, meaning that you can start multiple instances of the same script without conflicts. You can also configure the apks folder as a remote folder in a Network File System (NFS) to parallelize the analysis on multiple hosts.
NOTE: make sure your .APK file is in the 'apks' folder.
$ git clone --recursive https://github.com/alessandrodd/apk_api_key_extractor.git
$ cd apk_api_key_extractor
$ docker build -t apk_key_extractor:latest .
$ docker run -it apk_key_extractor:latest
Rebuild your image when you've added other .apk's in your 'apks' folder.
apks_dir => The folder containing the .apk files to be analyzed
apks_decoded_dir => The folder that will temporarily contain the decompiled apk files
apks_analyzed_dir => The folder that will contain the already analyzed .apk files. Each time an APK is analyzed, it gets moved from the apks_folder to this folder
save_analyzed_apks => If false, the .apk files gets removed instead of being moved to the apks_analyzed folder
apktool => Path of the main apktool jar file, used to decode the apks
lib_blacklists => Txt files containing names of the native libraries (one for each line) that should be ignored during the analysis
shared_object_sections => When analyzing native libraries, all the ELF sections listed here will be searched for API Keys
dump_all_strings => If true, the script dumps not only API Keys but also every other string found in the APK. Useful to make a dataset of common not-api-key strings that can be used to train the model itself.
dump_location => Where to dump the API Keys found (as well as every other strin, if dump_all_strings is set to true). Possible values are console (stdout), jsonlines (text files in the jsonlines format), mongodb.
jsonlines.dump_file => Path of the jsonlines file where the API Keys will be dumped
jsonlines.strings_file => Path of the jsonlines file where the every string will be dumped, if dump_all_strings is true
mongodb.name => Used if key_dump is set to mongodb; name of the MongoDB database
mongodb.address => Address of the MongoDB database
mongodb.port => Port to which to contact the MongoDB database (default 27017)
mongodb.user => MongoDB database credentials
mongodb.password => MongoDB database credentials
I'm a curious guy; if you make something cool with this and you want to tell me, drop me an email at didiego.alessandro+keyextract (domain is gmail.com)