Commit Graph

301 Commits

Author SHA1 Message Date
Michael Herzberg 11604c0fa5 Collect jsfilesize instead of jsloc. 2017-07-26 12:05:54 +01:00
Achim D. Brucker 349adb4b39 Updated README.md to reflect new requirements.txt. 2017-07-26 07:41:37 +01:00
Achim D. Brucker 9c8cedad74 Add hoc generated requirements.txt. 2017-07-26 07:38:17 +01:00
Achim D. Brucker 73eedab07d Log time delta for each extension update. 2017-07-26 07:29:48 +01:00
Michael Herzberg b3d1ab912e Wait a maximum of 10min before stopping jsbeautifier. 2017-07-25 22:57:55 +01:00
Achim D. Brucker 04d242ce7e Added sqlite3 version to config log. 2017-07-24 11:10:16 +01:00
Achim D. Brucker 9b291d02ae Added local bin/lib-directory to PATH. 2017-07-24 10:47:42 +01:00
Achim D. Brucker d6d1d21d05 Bug fix: Append log to log instead of overwriting it. 2017-07-24 10:26:03 +01:00
Achim D. Brucker 2bf32b7a8f Merge branch 'production' of logicalhacking.com:BrowserSecurity/ExtensionCrawler into production 2017-07-22 21:14:07 +01:00
Achim D. Brucker 052748af07 Create log file for profiling. 2017-07-20 07:51:30 +01:00
Michael Herzberg 072e008fe2 Run the garbage collector manually after using jsbeautify. 2017-07-19 17:25:21 +01:00
Achim D. Brucker c73f4cda3e Fixed log output. 2017-07-19 07:01:38 +01:00
Michael Herzberg cf0478d740 Added sge file for merging the individual dbs into the individual dbs of our archive. 2017-07-18 23:51:11 +01:00
Michael Herzberg 186f6162af Fixed NoneType str conversion exception. 2017-07-17 15:29:00 +01:00
Michael Herzberg 9b1e5db96f Check for attributes key first and use traceback module instead of printing str(e). 2017-07-17 14:00:39 +01:00
Michael Herzberg 50123f14b8 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-16 20:14:59 +01:00
Michael Herzberg eded1ca893 Only attempt to search for replies when we actually have search parameters. 2017-07-16 20:14:50 +01:00
Achim D. Brucker 3346b1b61a Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-16 14:02:40 +01:00
Achim D. Brucker 049db03021 Add user local to binary and library search paths. 2017-07-16 14:02:22 +01:00
Michael Herzberg 5e60715d98 Removed schema creation from scripts as it is now being done by the merge script. 2017-07-13 14:47:42 +01:00
Michael Herzberg 34e62d4e13 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-13 14:43:50 +01:00
Michael Herzberg a901fa30bb Adjusted merge scripts for fts tables. 2017-07-13 14:43:37 +01:00
Achim D. Brucker 02ce0cfd2c Print info: sqlite3 binary used. 2017-07-12 20:36:29 +01:00
Michael Herzberg 26bddde328 Removed primary keys from fts tables as that had no effect. 2017-07-12 18:30:37 +01:00
Michael Herzberg 2cb868b3d0 Removed removal of review row ids as we now use author ids as primary keys. 2017-07-12 18:13:04 +01:00
Michael Herzberg 6a6a12c88a Added parsing of support to sqlite. 2017-07-12 18:11:31 +01:00
Michael Herzberg 16a44cf499 Added parsing of review replies to sqlite. 2017-07-12 17:56:40 +01:00
Michael Herzberg 0ed8c15a2d Made review a fts table. 2017-07-12 17:04:56 +01:00
Michael Herzberg 51bdcb4f16 Also download replies for support forum. 2017-07-12 16:57:16 +01:00
Michael Herzberg 11b0ccee4a Added download of review replies. 2017-07-12 16:10:47 +01:00
Michael Herzberg d6ae9d28b8 Fixed bug that led to downloading the first review page twice instead of the first and second review page. 2017-07-12 14:09:01 +01:00
Michael Herzberg 60dd98e60e Fixed parsing of developer from overview page. 2017-07-12 13:54:44 +01:00
Michael Herzberg 885f580f2e Fixed set -u. 2017-07-11 11:15:43 +01:00
Michael Herzberg 6d03e8da71 Refactored sge scripts. 2017-07-11 11:11:32 +01:00
Michael Herzberg 8ca76f373e Made extension crawler repo path mandatory for sge scripts. 2017-07-10 15:25:56 +01:00
Michael Herzberg 00ad40e3a0 Added more sge scripts. 2017-07-10 14:42:03 +01:00
Michael Herzberg 3b3bbfbb6d Improved grepper script. 2017-07-10 14:09:45 +01:00
Michael Herzberg dac1f3e998 Improved grepper script. 2017-07-10 13:45:25 +01:00
Michael Herzberg 5f360e40e4 Improved grepper script. 2017-07-10 13:17:48 +01:00
Michael Herzberg e60265975f Renamed etag to crx_etag. 2017-07-10 12:46:41 +01:00
Michael Herzberg fe6b563f31 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-10 12:29:45 +01:00
Michael Herzberg af68a95139 Fixed sed string. 2017-07-10 12:28:55 +01:00
Achim D. Brucker c4e13daae5 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-09 18:55:24 +01:00
Michael Herzberg 77023e001e Do not treat js file decoding strictly. 2017-07-07 22:17:56 +01:00
Michael Herzberg 38c88d7461 Added parsing of content_script_urls. 2017-07-07 20:09:22 +01:00
Michael Herzberg fbc0a7c87c Added crx size and jsloc. 2017-07-07 19:47:14 +01:00
Michael Herzberg 62dc61826a Added parsing of itemcategory. 2017-07-07 19:29:51 +01:00
Achim D. Brucker 9c4b1225f0 Added parameter for specifying output directory. 2017-07-07 09:18:53 +01:00
Achim D. Brucker 449d3e1d6c Initial commit: utility to extract crx file from an extension archive. 2017-07-06 09:03:59 +01:00
Michael Herzberg c8a51c7a3c Added js grepper. 2017-07-05 20:55:41 +01:00