Commit Graph

718 Commits

Author SHA1 Message Date
Achim D. Brucker 06ff5f3057 Method for computing basic file identifiers. 2017-09-10 15:57:07 +01:00
Achim D. Brucker a6e90794bc Extended const_basedir to check environment variable EXTENSION_ARCHIVE and modified main scripts to actually use const_basedir. 2017-09-10 15:55:22 +01:00
Achim D. Brucker 4b31097975 Added function for computing a list of normalized code blocks for a JavaScript file. 2017-09-10 15:02:57 +01:00
Michael Herzberg fbef566466 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-10 12:20:33 +01:00
Michael Herzberg e09cb16083 Updated path to archive. 2017-09-10 12:20:23 +01:00
Achim D. Brucker 52b42dfaef Changed pull method to return list of changed files. 2017-09-10 11:01:29 +01:00
Achim D. Brucker c3053427c0 Added method for obtaining initial commit date and pulling git repos. 2017-09-09 23:13:26 +01:00
Achim D. Brucker 08b70ed63a Updated archive dir to reflect new file hierarchy by default. 2017-09-08 21:10:40 +01:00
Achim D. Brucker a519495096 Removed outdated sync script (only useful for old sqlite-based setup). 2017-09-08 20:58:36 +01:00
Achim D. Brucker b93c84f948 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-08 11:42:42 +01:00
Achim D. Brucker de314c1112 Added GitPython dependency. 2017-09-07 20:25:05 +01:00
Achim D. Brucker 8c33558934 Reformatting. 2017-09-07 20:09:29 +01:00
Michael Herzberg 69a04c0a7b Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-07 12:44:19 +01:00
Michael Herzberg 66adacccad Adjusted parameters in grepper sge script. 2017-09-07 12:44:08 +01:00
Achim D. Brucker 2b63192bc2 Initial commit. 2017-09-06 23:32:03 +01:00
Achim D. Brucker 3b2913616b Skip first_seen if not defined. 2017-09-05 10:15:48 +01:00
Michael Herzberg a9173345e8 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-04 15:54:38 +01:00
Michael Herzberg 36d36facfe Relaxed mysql retries. 2017-09-04 15:54:28 +01:00
Achim D. Brucker 6395d98443 Relaxed handling of network errors. 2017-09-04 09:11:27 +01:00
Achim D. Brucker cfeb29d95f Clean-up of logging infrastructure. 2017-09-03 15:56:27 +01:00
Achim D. Brucker f42f8e3d03 Improved error handling for request failures. 2017-09-03 15:43:33 +01:00
Achim D. Brucker 872346fa61 Add timeout parameter to http get requests. 2017-09-03 12:03:51 +01:00
Achim D. Brucker 0b0268e320 Copy outphased date to hash map of files archive. 2017-09-03 11:13:27 +01:00
Achim D. Brucker 0f716e98da Bug fix: only try to preserve outphased library information if there is any stored locally. 2017-09-03 11:09:39 +01:00
Achim D. Brucker 80c8e7caa0 Preserve outphased library versions. 2017-09-03 11:00:05 +01:00
Achim D. Brucker 03504ff81a Improved error handling. 2017-09-03 10:45:56 +01:00
Achim D. Brucker 13191f1ce0 Renaming: date -> first_seen. 2017-09-03 10:32:45 +01:00
Achim D. Brucker 59f9b47a81 Switched to Logging framework. 2017-09-03 10:29:57 +01:00
Achim D. Brucker 074447064c Enabled parallel download. 2017-09-03 10:06:55 +01:00
Achim D. Brucker e3aa92f1b8 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-02 22:15:36 +01:00
Achim D. Brucker 515a462938 Added methods for generating/updating index files based on the file hash. 2017-09-02 22:10:43 +01:00
Achim D. Brucker 9ae5905973 Generalized hash map builders. 2017-09-02 21:53:58 +01:00
Achim D. Brucker 22c3a7581d Reformatting. 2017-09-02 21:44:20 +01:00
Achim D. Brucker 3097db3790 Added methods for generating sha1 indexed dictionary. 2017-09-02 21:40:44 +01:00
Achim D. Brucker e5c2372222 Improved log output (verbose mode). 2017-09-02 20:57:01 +01:00
Achim D. Brucker c32ab6bc94 print URL of downloaded library files in verbose mode. 2017-09-02 20:44:47 +01:00
Achim D. Brucker ea8460f1b8 Updated local update. 2017-09-02 20:41:16 +01:00
Achim D. Brucker 030a4b36ca Added functionality for deleting information of orphaned libraries. 2017-09-02 19:43:10 +01:00
Achim D. Brucker 247b96db6d Refactoring: moved core functionality in own module. 2017-09-02 18:47:41 +01:00
Achim D. Brucker 7bcf9aca8e Removed executable flag. 2017-09-02 18:08:20 +01:00
Achim D. Brucker 99028c3763 Removed executable flag. 2017-09-02 18:08:06 +01:00
Achim D. Brucker 6b3ef921ff Reformatting and minor refactoring. 2017-09-02 18:00:24 +01:00
Michael Herzberg 45496a0d5d Log parameters. 2017-09-02 17:52:51 +01:00
Michael Herzberg d7dcfdbcbd Use $* instead of $@. 2017-09-02 17:50:16 +01:00
Michael Herzberg 54475b97a8 Added arg option to sge script. 2017-09-02 17:45:12 +01:00
Achim D. Brucker 7e07c6d734 Initial commit: tool for crawling cdnjs.com. 2017-09-02 17:42:46 +01:00
Michael Herzberg c94f23dcee Added --from-date option for create-db. 2017-09-02 17:42:18 +01:00
Michael Herzberg c33e8204ea Cleaned up create-db sge script a bit. 2017-09-02 17:05:42 +01:00
Michael Herzberg 1647aac086 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-02 16:46:04 +01:00
Michael Herzberg 08fced735c Fixed queries. 2017-09-02 16:45:38 +01:00