# ExtensionCrawler

A collection of utilities for downloading and analyzing browser
extensions from the Chrome Web Store.

* `crawler`: A crawler for extensions from the Chrome Web Store.
* `crx-tool`: A tool for analyzing and extracting `*.crx` files
  (i.e., Chrome extensions). Calling `crx-tool.py <extension>.crx`
  will check the integrity of the extension.
* `crx-extract`: A simple tool for extracting `*.crx` files from the
  tar-based archive hierarchy.
* `crx-jsinventory`: Build a JavaScript inventory of a `*.crx` file using a
  JavaScript decomposition analysis.
* `crx-jsstrings`: A tool for extracting code blocks, comment blocks, and
  string literals from JavaScript.
* `create-db`: A tool for updating a remote MariaDB database from
  existing extension archives.
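As a rough illustration of what a basic integrity check on a `*.crx` file can look like, the sketch below validates the CRX container header and the CRC checksums of the embedded ZIP payload. It is not the implementation used by `crx-tool`: the function names `crx_payload_offset` and `check_crx` are made up for this example, and the signature stored in the header is not verified.

```python
import io
import struct
import zipfile


def crx_payload_offset(data: bytes) -> int:
    """Return the offset at which the ZIP payload of a CRX blob starts."""
    if data[:4] != b"Cr24":
        raise ValueError("not a CRX file: missing 'Cr24' magic")
    (version,) = struct.unpack_from("<I", data, 4)
    if version == 2:
        # CRX2 header: magic, version, public-key length, signature length
        pubkey_len, sig_len = struct.unpack_from("<II", data, 8)
        return 16 + pubkey_len + sig_len
    if version == 3:
        # CRX3 header: magic, version, length of the header that follows
        (header_len,) = struct.unpack_from("<I", data, 8)
        return 12 + header_len
    raise ValueError(f"unsupported CRX version: {version}")


def check_crx(data: bytes) -> list:
    """Structurally check a CRX blob and return the names of the packed files.

    Hypothetical helper: only the container layout and the ZIP checksums are
    checked, the signature is NOT verified.
    """
    offset = crx_payload_offset(data)
    with zipfile.ZipFile(io.BytesIO(data[offset:])) as archive:
        corrupt = archive.testzip()  # first member with a bad CRC, or None
        if corrupt is not None:
            raise ValueError(f"corrupt archive member: {corrupt}")
        return archive.namelist()
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `check_crx` either returns the list of packed files or raises `ValueError`.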

The utilities store the extensions in the following directory
hierarchy:

```shell
archive
├── conf
│   └── forums.conf
├── data
│   └── ...
└── log
    └── ...
```

The crawler downloads the most recent extension (i.e., the `*.crx`
file) as well as the overview page. In addition, the `conf` directory
may contain one file, called `forums.conf`, that lists the IDs of
extensions for which the forums and support pages should be downloaded
as well. The `data` directory will contain the downloaded extensions.
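The layout above can be prepared and inspected with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names `init_archive` and `read_forum_ids` are hypothetical, and the format of `forums.conf` is assumed to be one extension ID per line, which may differ from what the crawler actually expects.

```python
import re
import tempfile
from pathlib import Path

# Chrome extension IDs are 32 characters drawn from the letters a-p.
EXTENSION_ID = re.compile(r"^[a-p]{32}$")


def init_archive(root: Path) -> None:
    """Create the top-level `conf`, `data`, and `log` directories."""
    for name in ("conf", "data", "log"):
        (root / name).mkdir(parents=True, exist_ok=True)


def read_forum_ids(root: Path) -> list:
    """Read extension IDs from conf/forums.conf (ASSUMED: one ID per line)."""
    conf = root / "conf" / "forums.conf"
    if not conf.exists():  # the file is optional
        return []
    lines = (line.strip() for line in conf.read_text().splitlines())
    # Keep only lines that look like extension IDs; skip blanks and anything else.
    return [line for line in lines if EXTENSION_ID.match(line)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "archive"
        init_archive(archive)
        # Placeholder ID for illustration only, not a real extension.
        (archive / "conf" / "forums.conf").write_text("a" * 32 + "\n")
        print(read_forum_ids(archive))
```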

The `crawler` and `create-db` scripts access and update a MariaDB database.
They use the host, database, and credentials found in `~/.my.cnf`.
Since they make use of various JSON features, it is recommended to use at
least version 10.2.8 of MariaDB.
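To make the `~/.my.cnf` lookup more concrete, the sketch below parses an option file with Python's standard `configparser`. It assumes the settings sit in a `[client]` group, which is the group MariaDB client programs read by default. The group and key names the scripts actually expect are not stated here, and the sample values and the helper `read_client_options` are hypothetical.

```python
import configparser

# Hypothetical example content; a real file would live at ~/.my.cnf.
SAMPLE_MY_CNF = """\
[client]
host = db.example.org
database = extensions
user = crawler
password = change-me
"""


def read_client_options(text: str) -> dict:
    """Return the options of the [client] group of an option file."""
    # allow_no_value: option files may contain bare flags without a value.
    # interpolation=None: a '%' in a password must be taken literally.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    if not parser.has_section("client"):
        return {}
    return dict(parser["client"])


options = read_client_options(SAMPLE_MY_CNF)
print(options["host"], options["database"], options["user"])
```

Since the file holds a password, it is usually a good idea to make it readable only by its owner (for example with `chmod 600 ~/.my.cnf`).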

All utilities are written in Python 3.6. The required modules are listed
in the file `requirements.txt`.

## Installation

Clone the repository and use `pip3` to install it as a package.

```shell
git clone git@logicalhacking.com:BrowserSecurity/ExtensionCrawler.git
pip3 install --user -e ExtensionCrawler
```

## Team

* [Achim D. Brucker](http://www.brucker.ch/)
* [Michael Herzberg](http://www.dcs.shef.ac.uk/cgi-bin/makeperson?M.Herzberg)

### Contributors

* Mehmet Balande

## License

This project is licensed under the GPL 3.0 (or any later version).