859 Commits

Author SHA1 Message Date
Achim D. Brucker cbc1fd9a58 Added install option. 2018-03-01 07:21:03 +00:00
Achim D. Brucker 79f8d15aeb Report running PIDs. 2018-02-28 22:12:27 +00:00
Achim D. Brucker 1aed6cb1f2 Merge branch 'master' into production 2018-02-26 21:23:13 +00:00
Achim D. Brucker d6c7fbd306 Ignore invalid bytes during character decoding. 2018-02-26 21:23:00 +00:00
Achim D. Brucker 527fca78bc Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2018-02-18 19:32:47 +00:00
Achim D. Brucker 535e5c934b Simple monitoring script for global_update.sh. 2018-02-18 19:06:10 +00:00
Achim D. Brucker 3547cdc26a Merge branch 'master' into production 2018-02-16 21:32:55 +00:00
Achim D. Brucker 02320e51f7 Simple monitoring script for global_update.sh. 2018-02-16 21:32:11 +00:00
Achim D. Brucker a32abc1d48 Merge branch 'master' into production 2018-02-12 08:48:46 +00:00
Achim D. Brucker fd194d0103 Increased number of parallel downloads. 2018-02-12 08:48:28 +00:00
Achim D. Brucker 51f7337639 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2018-02-12 08:46:58 +00:00
Achim D. Brucker a17e71529b Updated to build interface of latest singularity and read-only squashfs images for default crawler image. 2018-02-12 08:46:35 +00:00
Michael Herzberg c633b4a085 Made arguments for CrawlError optional in the hopes of fixing some pickle error... 2017-12-14 00:05:28 +00:00
Michael Herzberg 2a751d69e4 Made arguments for CrawlError optional in the hopes of fixing some pickle error... 2017-12-14 00:02:26 +00:00
Achim D. Brucker 11f8d40a17 Reformatting. 2017-11-29 08:40:38 +09:00
Achim D. Brucker b5f6b273a3 Reformatting. 2017-11-29 08:40:32 +09:00
Achim D. Brucker 781148be4f Reformatting. 2017-11-29 08:40:28 +09:00
Achim D. Brucker ae165e5527 Reformatting. 2017-11-28 14:10:31 +00:00
Achim D. Brucker f2632c02df Reformatting. 2017-11-27 06:50:25 +00:00
Achim D. Brucker b3e5b9bb37 Reformatting. 2017-11-26 23:35:35 +00:00
Achim D. Brucker 7b501a1d71 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-11-25 23:11:20 +00:00
Achim D. Brucker 360abdf072 Reformatting. 2017-11-25 23:10:15 +00:00
Achim D. Brucker 9b397a8c42 Reformatting. 2017-11-24 23:11:10 +00:00
Achim D. Brucker e0d587cd94 Reformatting. 2017-11-23 21:57:58 +00:00
Achim D. Brucker 41bfe0cbf0 Reformatting. 2017-11-22 23:08:12 +00:00
Achim D. Brucker 06caca81e9 Store simhash in database. 2017-11-22 06:52:31 +00:00
Achim D. Brucker f7cdc03133 Compute simhash for string representation of binary data. 2017-11-21 07:43:49 +00:00
Achim D. Brucker 52045ed53d Reformatting. 2017-11-20 22:42:21 +00:00
Achim D. Brucker 76c4c57ea3 Basic integration of simhash computation. 2017-11-20 22:41:31 +00:00
Achim D. Brucker 6ba5906ffb Added docstring. 2017-11-20 20:25:40 +00:00
Achim D. Brucker 74da2e9c08 Initial simhash integration. 2017-11-19 00:36:15 +00:00
Achim D. Brucker dbd0ff6bf3 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-11-18 23:21:33 +00:00
Achim D. Brucker acfdb9ee50 Removed unused function analyse_comment_blocks. 2017-11-18 23:21:19 +00:00
Achim D. Brucker d63207410b Removed unused function. 2017-11-18 23:19:23 +00:00
Achim D. Brucker e3519f012d Reformatting. 2017-11-17 16:58:48 +00:00
Achim D. Brucker 32c08672d9 Added log output for failed data decoding. 2017-11-16 07:13:55 +00:00
Achim D. Brucker 64e0054975 Added logging infrastructure. 2017-11-16 07:08:42 +00:00
Achim D. Brucker 3db3435c07 Refactoring of heuristic detection stubs. 2017-11-15 08:05:40 +00:00
Achim D. Brucker c5dce7bcd0 Fixed decoding of content (str_data). 2017-11-15 07:12:41 +00:00
Achim D. Brucker 0667c5e0f2 Removed outdated cdnjs crawler (replaced by git miner instead). 2017-11-13 17:05:37 +00:00
Achim D. Brucker 24bdd9a1c5 Added log messages for cron report. 2017-11-12 14:49:51 +00:00
Achim D. Brucker 91e6014c6c Moved to single-threaded mode. 2017-11-12 14:07:25 +00:00
Achim D. Brucker 4cb49f2281 Merge branch 'production' 2017-11-11 21:56:33 +00:00
Achim D. Brucker 9bd283f35a Fixed use of append. 2017-11-10 00:13:06 +00:00
Achim D. Brucker 7dfbdac670 Disabled parallel updates (for debugging a deadlock situation). 2017-11-09 23:38:05 +00:00
Achim D. Brucker 5cc7a92f90 Fixed typo. 2017-11-09 00:17:09 +00:00
Achim D. Brucker d8956bae04 Updated latest python dependency. 2017-11-08 22:11:15 +00:00
Achim D. Brucker ac910bf819 Updated python version to 3.6. 2017-11-07 20:58:24 +00:00
Achim D. Brucker 631f461d1f Removed not supported connection_timeout parameter. 2017-11-06 06:11:14 +00:00
Achim D. Brucker d0b173c55f Disable automatic python module update. 2017-11-05 20:15:05 +00:00