A Python crawler for extensions from the Chrome Web Store.
Achim D. Brucker 9c4ba39558 Refactoring. 2017-01-28 12:52:18 +00:00
ExtensionCrawler Refactoring. 2017-01-28 12:52:18 +00:00
.gitignore initial commit 2016-09-08 20:43:35 +02:00
LICENSE initial commit 2016-09-08 20:43:35 +02:00
README.md Documented dateutil dependency. 2016-12-18 18:32:36 +00:00
crawler Refactoring. 2017-01-28 12:52:18 +00:00
crawler.py Added optional summary to stderr. 2017-01-17 19:29:03 +00:00
crx-tool.py Restored executability. 2016-12-03 23:29:31 +00:00
discover_extensions.py Improved discovery script. 2017-01-17 16:08:20 +00:00
parse_sitemap Renaming main script for parsing/downloading the sitemap. 2017-01-28 12:26:17 +00:00
permstats.py Restored executability. 2016-12-03 23:29:31 +00:00

ExtensionCrawler

A collection of utilities for downloading and analyzing browser extensions from the Chrome Web Store.

  • crawler.py: A crawler for extensions from the Chrome Web Store. Calling crawler.py downloads 200 extensions from all categories into a folder named downloaded in the current directory. If an extension has already been downloaded, the script skips it.
  • permstats.py: A tool for generating statistical data from the crawled extensions.
  • crx-tool.py: A tool for analyzing and extracting *.crx files (i.e., Chrome extensions). Calling crx-tool.py <extension>.crx checks the integrity of the extension.
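
To illustrate what an integrity check on a *.crx file can involve, here is a minimal sketch. It is not the code of crx-tool.py; the function names crx_zip_offset and check_crx are hypothetical. It assumes the publicly documented CRX container layout (a "Cr24" magic number, a little-endian version field, a header, then a ZIP archive) and only validates the header and the CRC of each ZIP member. It does not verify the cryptographic signature.

```python
import io
import struct
import zipfile


def crx_zip_offset(data: bytes) -> int:
    """Return the byte offset at which the ZIP payload of a CRX file starts."""
    if len(data) < 12 or data[:4] != b"Cr24":
        raise ValueError("not a CRX file (missing 'Cr24' magic)")
    (version,) = struct.unpack_from("<I", data, 4)
    if version == 2:
        # CRX2: magic, version, public key length, signature length,
        # followed by the key and the signature themselves.
        if len(data) < 16:
            raise ValueError("truncated CRX2 header")
        pubkey_len, sig_len = struct.unpack_from("<II", data, 8)
        offset = 16 + pubkey_len + sig_len
    elif version == 3:
        # CRX3: magic, version, header length, followed by the header.
        (header_len,) = struct.unpack_from("<I", data, 8)
        offset = 12 + header_len
    else:
        raise ValueError(f"unsupported CRX version {version}")
    if offset > len(data):
        raise ValueError("header lengths exceed file size")
    return offset


def check_crx(data: bytes) -> bool:
    """Return True if the header parses and every ZIP member passes its CRC check."""
    try:
        offset = crx_zip_offset(data)
        with zipfile.ZipFile(io.BytesIO(data[offset:])) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except (ValueError, zipfile.BadZipFile):
        return False
```

A caller would read the file in binary mode and pass the bytes to check_crx, for example check_crx(open("example.crx", "rb").read()), where example.crx is a placeholder file name.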

All utilities are written in Python 3.x. The following non-standard modules might be required:

  • requests (apt-get install python3-requests)
  • dateutil (apt-get install python3-dateutil)
  • jsmin (apt-get install python3-jsmin)
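
Before running the utilities, it can help to check which of these modules are importable. The following is a small sketch using only the standard library; the helper name missing_modules is hypothetical and not part of this repository, and the package names are the ones listed above.

```python
import importlib.util

# Import name -> Debian/Ubuntu package name, as listed above.
REQUIRED = {
    "requests": "python3-requests",
    "dateutil": "python3-dateutil",
    "jsmin": "python3-jsmin",
}


def missing_modules(required=REQUIRED):
    """Return the import names from `required` that cannot be found."""
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules():
        print(f"missing {name}: apt-get install {REQUIRED[name]}")
```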

Team

License

This project is licensed under the GPL 3.0 (or any later version).