Commit Graph

915 Commits

Author SHA1 Message Date
Achim D. Brucker 2fc154d643 Use UTC-based time/dates for logging. 2018-04-07 15:54:39 +01:00
Achim D. Brucker 91a76091e3 Use UTC-based time/dates for logging. 2018-04-07 15:42:29 +01:00
Achim D. Brucker c7d28d2c9e Merge branch 'production' of logicalhacking.com:BrowserSecurity/ExtensionCrawler into production 2018-04-07 13:29:48 +01:00
Achim D. Brucker f6a9d49da1 Reverted processing in chunks back into processing only one large list. 2018-04-07 13:17:33 +01:00
Michael Herzberg 558bff402a Removed --writable flag from read-only ExtensionCrawler image. 2018-04-07 00:42:39 +01:00
Michael Herzberg 0c3423dcd8 Fitted db connection log messages into our logging framework. 2018-04-07 00:42:39 +01:00
Michael Herzberg 9c1d48fcbe Added 'wheel' to dependencies to fix build error with simhash. 2018-04-07 00:42:39 +01:00
Achim D. Brucker 7756ad2963 Bug fix: actually use max_workers. 2018-04-06 23:04:01 +01:00
Achim D. Brucker 6a86b37e7c Increase number of parallel downloads. 2018-04-06 21:36:11 +01:00
Achim D. Brucker 14a30a570d Process extensions in chunks. 2018-04-06 21:34:09 +01:00
Achim D. Brucker d5df43c5c3 Moved heuristic for parallel download into separate method. 2018-04-06 20:32:24 +01:00
Achim D. Brucker 9434df1b28 Set max task to 100. 2018-04-06 16:37:49 +01:00
Achim D. Brucker 69f1618db2 Reduced number of parallel downloads, as pebble seems to be much more memory hungry ... 2018-04-06 13:34:36 +01:00
Achim D. Brucker d3fe5e758a New default download timeout to 2 hours. 2018-04-06 12:08:02 +01:00
Achim D. Brucker d9fc65a089 Reformatting. 2018-04-06 07:27:57 +01:00
Achim D. Brucker 8c9aab8216 Converted timeout into a proper configuration parameter. 2018-04-06 07:25:21 +01:00
Achim D. Brucker 9586eed280 Added documentation. 2018-04-06 07:18:15 +01:00
Achim D. Brucker fd9cc1855a Improved command line interface for selecting which type of extensions should be crawled. 2018-04-06 07:17:20 +01:00
Achim D. Brucker 47f4af5d1b Fixed spelling of constant 'False'. 2018-04-06 06:46:25 +01:00
Achim D. Brucker 054fdb62cb Prefix top-level exception logs for workers with WorkerException. 2018-04-05 23:16:27 +01:00
Achim D. Brucker 5d70bf1831 Switched to pebble.ProcessPool() for concurrency. 2018-04-05 22:51:27 +01:00
Achim D. Brucker fee88ed0fe Implemented sequential download mode. 2018-04-05 17:32:11 +01:00
Achim D. Brucker bf6269c600 Bug fix: warning mail for stalled download. 2018-04-05 17:05:51 +01:00
Achim D. Brucker d0c185fa69 Log If-Modified-Since request for timing analysis. 2018-04-04 23:14:24 +01:00
Achim D. Brucker 2d33f3bebe Added pdf output. 2018-04-04 09:35:35 +01:00
Achim D. Brucker de2519130a Improved derivative (downloads per eight hours) and png output. 2018-04-04 09:20:29 +01:00
Achim D. Brucker 423d3c35fa Improved line types for png output. 2018-04-04 08:27:02 +01:00
Achim D. Brucker 0f1c53a011 More aggressive download heuristics. 2018-04-03 22:20:09 +01:00
Achim D. Brucker 7c688afee8 Configured 32 parallel downloads. 2018-04-03 22:14:02 +01:00
Achim D. Brucker ca42e4026f Added plot of an approximation of the first derivative. 2018-04-03 16:17:14 +01:00
Achim D. Brucker 1102fc102d Kill running downloads more aggressively. 2018-04-02 16:41:07 +01:00
Achim D. Brucker 234cbef539 Improved mode for terminating running downloads. 2018-04-02 12:01:54 +01:00
Achim D. Brucker ec30fdda5a Added missing ". 2018-04-02 11:37:50 +01:00
Achim D. Brucker c05ad75bca Minor increase of parallel downloads (testing). 2018-04-02 11:36:22 +01:00
Achim D. Brucker 2412a66731 Fixed kill mode. 2018-04-02 11:35:49 +01:00
Achim D. Brucker 2a7e94f325 Introduced mode that kills (still) running downloads. 2018-04-02 10:58:09 +01:00
Achim D. Brucker 2d046a5e4d Removed unnecessary pipes. 2018-04-02 10:19:06 +01:00
Achim D. Brucker 46639927d7 Removed unused configuration variables. 2018-04-02 10:08:51 +01:00
Achim D. Brucker 8041e7366d Improved x and y ticks. 2018-04-01 20:14:09 +01:00
Achim D. Brucker b5af89e8c3 Increased chunk size to 500. 2018-04-01 19:54:28 +01:00
Achim D. Brucker 57b0b82326 Fixed last mail check and separator for max extensions. 2018-04-01 19:00:21 +01:00
Achim D. Brucker fcaf4144dc Fixed last mail check. 2018-04-01 18:50:17 +01:00
Achim D. Brucker c032de575a Fixed location of updates.csv. 2018-04-01 18:25:45 +01:00
Achim D. Brucker 51fe4bc1e2 Fixed type error (chunksize is a parameter of map, not the list constructor). 2018-04-01 18:19:27 +01:00
Achim D. Brucker c7a4df17db Fixed typo. 2018-04-01 17:02:25 +01:00
Achim D. Brucker b9a9225a07 Added plot generation and moved location of CSV file. 2018-04-01 15:15:09 +01:00
Achim D. Brucker fa74c46f5a Initial commit: gnuplot script for generating graph of the downloads of the last week. 2018-04-01 15:13:48 +01:00
Achim D. Brucker b9a9ed7f2b Merged date/time fields. 2018-04-01 12:44:01 +01:00
Achim D. Brucker 337178f3cf Fixed column numbers in csv file. 2018-03-31 21:12:40 +01:00
Achim D. Brucker afd7a18f3d Added PIDs to logged data. 2018-03-31 21:10:55 +01:00