503 Commits

Author SHA1 Message Date
Achim D. Brucker 936f2d3189 Log git info before starting pull (update). 2017-09-14 22:54:37 +01:00
Achim D. Brucker 2ff30f7382 Parallel execution of git date queries. 2017-09-14 15:11:53 +01:00
Achim D. Brucker 12a1e282aa The method pull_get_updated_lib_files(...) now also returns unique library/version information. 2017-09-14 10:44:30 +01:00
Achim D. Brucker e3f1202e44 Use version dictionary. 2017-09-14 10:33:00 +01:00
Achim D. Brucker f54f29c9ba Added build_release_date_dic(...). 2017-09-14 09:50:09 +01:00
Achim D. Brucker 3b217922c5 Added line count. 2017-09-13 16:41:01 +01:00
Achim D. Brucker 420eec7462 Minor memory optimizations. 2017-09-13 11:12:33 +01:00
Achim D. Brucker ec1c47625a Added support for parallel update of database. 2017-09-13 09:13:35 +01:00
Achim D. Brucker c386bd01dd Added missing string conversion. 2017-09-13 08:29:23 +01:00
Achim D. Brucker 42e685ee32 Added missing string conversion. 2017-09-13 08:01:02 +01:00
Achim D. Brucker 18fb23d3dc Use glob instead of os.walk() to avoid memory leak in the latter. 2017-09-13 04:04:38 +01:00
Achim D. Brucker 76d5993794 Added logging output. 2017-09-13 03:02:39 +01:00
Achim D. Brucker c30f7fdd7c Implemented skeleton of main routine. 2017-09-13 02:56:13 +01:00
Achim D. Brucker a8a5534be1 Renamed module. 2017-09-13 01:13:17 +01:00
Achim D. Brucker bdb84c2120 Renamed module. 2017-09-13 01:09:30 +01:00
Achim D. Brucker 4e5b52617f Catch exception during decompression and increase max. allowed size of decompressed data to 100 times of compressed size. 2017-09-13 00:23:17 +01:00
Achim D. Brucker 88efe2b8a4 Reformatting. 2017-09-13 00:02:20 +01:00
Achim D. Brucker ea9339bc53 Compute data identifiers for uncompressed content of gzip compressed files. 2017-09-13 00:01:15 +01:00
Achim D. Brucker f9cf7bd35f Refactoring: moved computation of data related identifiers into own method. 2017-09-12 23:52:52 +01:00
Achim D. Brucker 8243664974 Use StringIO representation for normalizing js/css files (avoid re-reading the file content from disk). 2017-09-12 23:43:09 +01:00
Achim D. Brucker 933c4d4d11 Determine file description from buffer instead of from file (avoid reading file twice). 2017-09-12 23:23:22 +01:00
Achim D. Brucker 6353202ee8 Renaming: fileinfo -> filedb. 2017-09-10 22:59:07 +01:00
Achim D. Brucker 0426d7d3d1 Reformatting. 2017-09-10 22:39:47 +01:00
Achim D. Brucker e5da9abaea Added get_file_libinfo(...). 2017-09-10 22:38:49 +01:00
Achim D. Brucker ad2af517a3 Aggressively try to normalize as many filetypes as possible. 2017-09-10 17:40:30 +01:00
Achim D. Brucker 06ff5f3057 Method for computing basic file identifiers. 2017-09-10 15:57:07 +01:00
Achim D. Brucker a6e90794bc Extended const_basedir to check environment variable EXTENSION_ARCHIVE and modified main scripts to actually use const_basedir. 2017-09-10 15:55:22 +01:00
Achim D. Brucker 4b31097975 Added function for computing a list of normalized code blocks for a JavaScript file. 2017-09-10 15:02:57 +01:00
Achim D. Brucker 52b42dfaef Changed pull method to return list of changed files. 2017-09-10 11:01:29 +01:00
Achim D. Brucker c3053427c0 Added method for obtaining initial commit date and pulling git repos. 2017-09-09 23:13:26 +01:00
Achim D. Brucker 8c33558934 Reformatting. 2017-09-07 20:09:29 +01:00
Achim D. Brucker 3b2913616b Skip first_seen if not defined. 2017-09-05 10:15:48 +01:00
Michael Herzberg a9173345e8 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-04 15:54:38 +01:00
Michael Herzberg 36d36facfe Relaxed mysql retries. 2017-09-04 15:54:28 +01:00
Achim D. Brucker 6395d98443 Relaxed handling of network errors. 2017-09-04 09:11:27 +01:00
Achim D. Brucker cfeb29d95f Clean-up of logging infrastructure. 2017-09-03 15:56:27 +01:00
Achim D. Brucker f42f8e3d03 Improved error handling for request failures. 2017-09-03 15:43:33 +01:00
Achim D. Brucker 872346fa61 Add timeout parameter to http get requests. 2017-09-03 12:03:51 +01:00
Achim D. Brucker 0b0268e320 Copy outphased date to hash map of files archive. 2017-09-03 11:13:27 +01:00
Achim D. Brucker 0f716e98da Bug fix: only try to preserve outphased library information if there is any stored locally. 2017-09-03 11:09:39 +01:00
Achim D. Brucker 80c8e7caa0 Preserve outphased library versions. 2017-09-03 11:00:05 +01:00
Achim D. Brucker 03504ff81a Improved error handling. 2017-09-03 10:45:56 +01:00
Achim D. Brucker 13191f1ce0 Renaming: date -> first_seen. 2017-09-03 10:32:45 +01:00
Achim D. Brucker 59f9b47a81 Switched to Logging framework. 2017-09-03 10:29:57 +01:00
Achim D. Brucker 074447064c Enabled parallel download. 2017-09-03 10:06:55 +01:00
Achim D. Brucker 515a462938 Added methods for generating/updating index files based on the file hash. 2017-09-02 22:10:43 +01:00
Achim D. Brucker 9ae5905973 Generalized hash map builders. 2017-09-02 21:53:58 +01:00
Achim D. Brucker 22c3a7581d Reformatting. 2017-09-02 21:44:20 +01:00
Achim D. Brucker 3097db3790 Added methods for generating sha1 indexed dictionary. 2017-09-02 21:40:44 +01:00
Achim D. Brucker e5c2372222 Improved log output (verbose mode). 2017-09-02 20:57:01 +01:00
Achim D. Brucker c32ab6bc94 print URL of downloaded library files in verbose mode. 2017-09-02 20:44:47 +01:00
Achim D. Brucker ea8460f1b8 Updated local update. 2017-09-02 20:41:16 +01:00
Achim D. Brucker 030a4b36ca Added functionality for deleting information of orphaned libraries. 2017-09-02 19:43:10 +01:00
Achim D. Brucker 247b96db6d Refactoring: moved core functionality in own module. 2017-09-02 18:47:41 +01:00
Achim D. Brucker 7bcf9aca8e Removed executable flag. 2017-09-02 18:08:20 +01:00
Achim D. Brucker 99028c3763 Removed executable flag. 2017-09-02 18:08:06 +01:00
Achim D. Brucker 9ed8f5f926 Improved reporting. 2017-09-02 00:05:07 +01:00
Achim D. Brucker a69c173064 Activated preliminary check of regexps for specific libs. 2017-09-01 23:41:45 +01:00
Achim D. Brucker 28f6aa5f45 Bug fix: indentation 2017-09-01 23:24:55 +01:00
Achim D. Brucker 5c987833a4 Bug fix: NoneType object is not iterable. 2017-09-01 23:23:11 +01:00
Michael Herzberg bb03a67a29 Deleted ropeproject stuff. 2017-09-01 17:04:25 +01:00
Achim D. Brucker 2693fb0fcd Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-01 16:28:18 +01:00
Achim D. Brucker 3fb0d740c0 Bug fix: exception due to reading from the wrong dictionary. 2017-09-01 16:27:44 +01:00
Michael Herzberg ab943c87f0 Expand user directory for mysql config file. 2017-09-01 16:17:51 +01:00
Michael Herzberg abd9605ebc Use python3.5 for all files. 2017-09-01 14:12:05 +01:00
Michael Herzberg cbcb3bc3b0 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-01 13:30:57 +01:00
Michael Herzberg 5c24608c4d Added --max-discover <N> option to limit the number of new extensions. 2017-09-01 13:30:42 +01:00
Achim D. Brucker 258269abb6 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-01 12:14:53 +01:00
Achim D. Brucker 8289d50d38 Download all extensions in parallel and later do a second download for a subset including forums/reviews. 2017-09-01 12:14:39 +01:00
Michael Herzberg b5fd382ab8 Use utf8mb4 for mysql connections. 2017-09-01 12:11:37 +01:00
Achim D. Brucker 53f080ba36 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-01 12:02:17 +01:00
Achim D. Brucker 22264fb9e0 Changed download order: first download parallel all extensions without forum/review download, then download extensions with forums. 2017-09-01 12:02:12 +01:00
Michael Herzberg 62c353f647 Removed crawling restriction to 10 extids. 2017-09-01 10:42:21 +01:00
Michael Herzberg 21a7741f0c Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-09-01 08:15:36 +01:00
Michael Herzberg 05ffdc6e24 Added explicit utf-8 request to mysql connector. 2017-09-01 08:15:22 +01:00
Achim D. Brucker 9446c20d01 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-31 23:45:00 +01:00
Achim D. Brucker 883e7ddcd8 Report details of matches. 2017-08-31 23:44:51 +01:00
Michael Herzberg e06d3f4ac4 Reduced timeout and fixed logging. 2017-08-31 23:01:05 +01:00
Achim D. Brucker e0db2a5f47 Added detection details. 2017-08-31 08:43:19 +01:00
Michael Herzberg ccf43de3d0 Pad process id to 6 chars. 2017-08-30 20:05:17 +01:00
Michael Herzberg 906d81ab86 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-30 19:59:33 +01:00
Michael Herzberg 4145f88a94 MySQL wait +/- 20%. 2017-08-30 19:59:15 +01:00
Achim D. Brucker e70cf5d28f Bug fix: missing hex decoding of md5/sha1 hashes. 2017-08-30 19:25:58 +01:00
Michael Herzberg b76eef12d5 Added randomized delay for MySQL. 2017-08-30 18:55:13 +01:00
Michael Herzberg bde59c5040 Fixed crx_etag select bug and some logging. 2017-08-30 16:32:36 +01:00
Michael Herzberg cbd2dea820 Removed everything related to sqlite and updated README. 2017-08-30 15:38:04 +01:00
Michael Herzberg c7a808db3f Don't process replies if there are none. 2017-08-30 15:15:12 +01:00
Michael Herzberg 5f234d8539 Improved logging. 2017-08-30 15:12:54 +01:00
Michael Herzberg 12c111ca11 Once more, make mysql the default. Also, increased timeout. 2017-08-30 12:13:25 +01:00
Michael Herzberg 9b8a693a5f Changed logging a little bit. 2017-08-30 12:12:57 +01:00
Michael Herzberg f8c8382919 Merge. 2017-08-30 11:42:12 +01:00
Michael Herzberg 6a9a1cda63 Moved crx logging to where crx will actually be parsed. 2017-08-30 11:29:30 +01:00
Michael Herzberg d99142f8d0 Added and changed a few columns. 2017-08-30 10:07:06 +01:00
Achim D. Brucker 3269a4900c Bug fix: printing of file name in Javascript mode. 2017-08-30 09:56:19 +01:00
Achim D. Brucker b5b6a17ee5 Support analysis of crx files and plain JavaScript files. 2017-08-30 09:11:55 +01:00
Achim D. Brucker cacdf1f727 Refactoring. 2017-08-30 08:28:39 +01:00
Achim D. Brucker 85d6ec084d Bug fix: missing detection method for empty files. 2017-08-30 08:24:16 +01:00
Achim D. Brucker d7120fad45 Bug fix: update char if the loop reads another char (next_char). This avoids missing an escape character or a newline. 2017-08-30 01:43:18 +01:00
Achim D. Brucker 66818b2fa6 Renamed hash to md5 in JSON file and added support for sha1 hashes. 2017-08-30 00:38:30 +01:00
Achim D. Brucker e947e69f37 Define type and detection method for all generated entries. 2017-08-30 00:24:19 +01:00
Achim D. Brucker ae3bbd7339 Using values of enumeration to obtain nice and short human readable representations. 2017-08-30 00:12:57 +01:00
Michael Herzberg 47f424cf2f Added more logging. 2017-08-29 23:10:46 +01:00
Michael Herzberg 080f00f17c Added new columns for jsfile table. 2017-08-29 22:40:01 +01:00
Michael Herzberg 95d71a9edc Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-29 22:29:49 +01:00
Michael Herzberg 3e24d1f08c Changed logging to use logging library. 2017-08-29 22:29:38 +01:00
Achim D. Brucker 39cd03dccc Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-29 18:01:42 +01:00
Achim D. Brucker 97f5b14158 Compute sha1 for JavaScript files. 2017-08-29 18:01:28 +01:00
Michael Herzberg bddd80c138 Made removal of manifest.json comments stricter. 2017-08-29 15:43:04 +01:00
Michael Herzberg 7ffdf30545 Push manifest into table crx column manifest. 2017-08-29 15:41:13 +01:00
Michael Herzberg 2b11117b6f Always process crx, regardless whether or not crx_etag is already in db. 2017-08-29 15:24:59 +01:00
Michael Herzberg 8b91957372 Reduced default MySQL timeout. 2017-08-29 15:20:58 +01:00
Michael Herzberg 6a99d41471 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-29 15:11:37 +01:00
Achim D. Brucker d4ad5f96f8 Report empty files as own category/type. 2017-08-28 22:38:06 +01:00
Michael Herzberg f81aac7c61 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-28 22:38:05 +01:00
Achim D. Brucker 2ace19f453 Compute js_info (including md5 hash and character set detection) only once per file. 2017-08-28 21:05:15 +01:00
Achim D. Brucker 91dfe67513 Auto-detect character encoding of JavaScript files using cchardet. 2017-08-28 20:53:55 +01:00
Michael Herzberg c30f0c4147 Removed database and host setting. To be set in ~/.my.cnf file now. 2017-08-28 20:17:11 +01:00
Achim D. Brucker 5cff2bc1b7 New check based on file hash (md5). 2017-08-28 20:09:34 +01:00
Achim D. Brucker 030adb6adc Minor refactoring and cleanup. 2017-08-28 19:20:50 +01:00
Michael Herzberg 5175d28edc Convert some stuff to string for db insert. 2017-08-28 17:12:32 +01:00
Michael Herzberg 0a4e8839a1 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-28 11:50:49 +01:00
Michael Herzberg 81077b807c Added mysql retry logic and use time.time() now. 2017-08-28 11:50:41 +01:00
Achim D. Brucker 9bf0b47f98 Minor improvement of string conversion for JsBlock. 2017-08-28 10:50:52 +01:00
Achim D. Brucker c721e6fdbf Merge with upstream. 2017-08-28 10:49:01 +01:00
Achim D. Brucker f10923af03 Integrated js_mincer into decomposition analysis to allow checking comments, code, and string literals explicitly in the future. 2017-08-28 10:40:37 +01:00
Achim D. Brucker 9ef27f9ac9 Added missing return statements. 2017-08-28 10:28:21 +01:00
Achim D. Brucker 90b1db4a25 Added additional comment checks. 2017-08-28 01:26:13 +01:00
Achim D. Brucker 9b272c9302 Added option to merge subsequent single line comments into a single line comment block. 2017-08-28 01:17:00 +01:00
Achim D. Brucker 111777c821 Improved position counting. 2017-08-28 00:57:58 +01:00
Achim D. Brucker d4de20efc1 Bug fix: start position of blocks and omit empty code blocks. 2017-08-28 00:19:28 +01:00
Achim D. Brucker e2e92594ce Bug fix: catch also last block of file. 2017-08-27 23:34:33 +01:00
Michael Herzberg 257afe92f0 Use selective insert. 2017-08-27 23:00:28 +01:00
Achim D. Brucker 629f492fa7 Added tests for code blocks and comments. 2017-08-27 22:58:09 +01:00
Achim D. Brucker 7ff1623bc6 Introduced JavaScript mincer working on file objects. 2017-08-27 22:51:55 +01:00
Michael Herzberg b98b7bc0f7 Fixed column typo. 2017-08-27 22:49:07 +01:00
Achim D. Brucker e324ab9483 Re-formatted and added documentation. 2017-08-27 22:41:04 +01:00
Achim D. Brucker 9376b4056f Collect string literals in code blocks. 2017-08-27 22:27:35 +01:00
Achim D. Brucker 41ca506b9f Return iterator that iterates over JavaScript blocks. 2017-08-27 22:17:04 +01:00
Achim D. Brucker 5add586da3 Initial commit. 2017-08-27 20:47:24 +01:00
Achim D. Brucker f6f0bc0394 Renamed jsdecompose.py to js_decomposer.py. 2017-08-27 20:45:56 +01:00
Michael Herzberg 9521240d90 Make stuff configurable. 2017-08-27 18:28:19 +01:00
Michael Herzberg 0cff600861 Fixed etag keys. 2017-08-27 17:35:58 +01:00
Michael Herzberg d4b0a6535b Fixed some things. 2017-08-27 16:57:23 +01:00
Michael Herzberg f075192b44 made sqlite default again. 2017-08-27 03:26:29 +01:00
Michael Herzberg 22c90dcb4f Truncate timezone from timestamps for mysql, make mysql default. 2017-08-27 03:14:43 +01:00
Michael Herzberg 585c8faf0e Added mysql, but still commented out. 2017-08-27 02:53:15 +01:00
Michael Herzberg c5c04cd1ed Refactored sqlite-specifics into own class. 2017-08-27 00:22:19 +01:00
Achim D. Brucker 0bd6a55adb Added documentation for analyse_filename. 2017-08-26 22:45:14 +01:00
Achim D. Brucker df472fbbe8 Refactored filename check. 2017-08-26 22:43:57 +01:00
Achim D. Brucker b2c862ede1 Added fields for storing evidence information for detected library/version information. 2017-08-25 07:07:34 +01:00
Achim D. Brucker 807af6f32d Refactoring: proper use of enumerations. 2017-08-24 21:37:35 +01:00
Achim D. Brucker 45d2c7ad44 Fundamental refactoring. 2017-08-24 19:43:48 +01:00
Achim D. Brucker 676cc5ac9d Renamed detectLibraries to decompose_js. 2017-08-24 00:47:35 +01:00
Achim D. Brucker 486b967d2d Refactoring. 2017-08-24 00:44:34 +01:00
Achim D. Brucker 9ced7ea3b5 Refactoring and bug fix in library classification. 2017-08-24 00:29:44 +01:00
Achim D. Brucker 94bd0f9a95 Refactoring. 2017-08-23 23:37:15 +01:00
Achim D. Brucker 2bbd6281f7 Reformatting. 2017-08-23 20:09:02 +01:00
Achim D. Brucker 4c5f8889d2 Refactoring. 2017-08-23 20:04:52 +01:00
Achim D. Brucker cd217f57a6 Integrated JavaScript decomposition analysis. 2017-08-23 19:42:00 +01:00
Achim D. Brucker 5d89e28486 Cleanup. 2017-08-23 19:17:35 +01:00
Achim D. Brucker 123623b111 Minor code cleanup. 2017-08-23 17:36:41 +01:00
Achim D. Brucker 3208a6e58a Initial import of JavaScript decomposition framework. 2017-08-23 17:22:58 +01:00
Michael Herzberg 68e7e72e93 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-08-09 13:06:42 +01:00
Michael Herzberg 40f800b4de Check if 'annotations' exists in search results. 2017-08-09 13:06:22 +01:00
Achim D. Brucker d3da686e16 Changed formula for computing download delay. 2017-08-05 10:57:45 +01:00
Michael Herzberg c61f19e860 Use INSERT OR IGNORE. 2017-07-31 23:55:21 +01:00
Michael Herzberg b8f57196c7 Changed fts table structure. 2017-07-31 23:23:57 +01:00
Michael Herzberg b34d45c4dc Added md5sum to sqlite. 2017-07-31 20:38:21 +01:00
Achim D. Brucker 35c133e395 Slightly more aggressive implementation of google_dos_protection. 2017-07-30 14:41:20 +01:00
Achim D. Brucker 5268f2a732 Refactoring: clean-up of imports and a few other minor improvements. 2017-07-29 16:13:39 +01:00
Achim D. Brucker eb0054b47d Refactoring: Moved default configuration to config module. 2017-07-29 12:36:20 +01:00
Achim D. Brucker 0b24fb15fe Refactoring. 2017-07-29 11:32:06 +01:00
Achim D. Brucker 0ca3476b09 Slightly more aggressive implementation of google_dos_protection. 2017-07-29 11:21:15 +01:00
Achim D. Brucker d05ca9678e Refactoring. 2017-07-29 10:57:35 +01:00
Achim D. Brucker ac663299b3 Refactoring. 2017-07-29 10:17:16 +01:00
Achim D. Brucker 10cce2859d Renamed variable/attribute pk to public_key. 2017-07-29 09:15:22 +01:00
Achim D. Brucker 333bcaa62d Strip path from crx file. 2017-07-29 09:12:01 +01:00
Achim D. Brucker e5d671c7c4 Refactoring. 2017-07-29 09:05:16 +01:00
Michael Herzberg 11604c0fa5 Collect jsfilesize instead of jsloc. 2017-07-26 12:05:54 +01:00
Achim D. Brucker 73eedab07d Log time delta for each extension update. 2017-07-26 07:29:48 +01:00
Michael Herzberg b3d1ab912e Wait a maximum of 10min before stopping jsbeautifier. 2017-07-25 22:57:55 +01:00
Michael Herzberg 072e008fe2 Run the garbage collector manually after using jsbeautify. 2017-07-19 17:25:21 +01:00
Michael Herzberg 186f6162af Fixed NoneType str conversion exception. 2017-07-17 15:29:00 +01:00
Michael Herzberg 9b1e5db96f Check for attributes key first and use traceback module instead of printing str(e). 2017-07-17 14:00:39 +01:00
Michael Herzberg eded1ca893 Only attempt to search for replies when we actually have search parameters. 2017-07-16 20:14:50 +01:00
Michael Herzberg 26bddde328 Removed primary keys from fts tables as that had no effect. 2017-07-12 18:30:37 +01:00
Michael Herzberg 6a6a12c88a Added parsing of support to sqlite. 2017-07-12 18:11:31 +01:00
Michael Herzberg 16a44cf499 Added parsing of review replies to sqlite. 2017-07-12 17:56:40 +01:00
Michael Herzberg 0ed8c15a2d Made review a fts table. 2017-07-12 17:04:56 +01:00
Michael Herzberg 51bdcb4f16 Also download replies for support forum. 2017-07-12 16:57:16 +01:00
Michael Herzberg 11b0ccee4a Added download of review replies. 2017-07-12 16:10:47 +01:00
Michael Herzberg d6ae9d28b8 Fixed bug that led to downloading the first review page twice instead of the first and second review page. 2017-07-12 14:09:01 +01:00
Michael Herzberg 60dd98e60e Fixed parsing of developer from overview page. 2017-07-12 13:54:44 +01:00
Michael Herzberg e60265975f Renamed etag to crx_etag. 2017-07-10 12:46:41 +01:00
Achim D. Brucker c4e13daae5 Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler 2017-07-09 18:55:24 +01:00
Michael Herzberg 77023e001e Do not treat js file decoding strictly. 2017-07-07 22:17:56 +01:00
Michael Herzberg 38c88d7461 Added parsing of content_script_urls. 2017-07-07 20:09:22 +01:00
Michael Herzberg fbc0a7c87c Added crx size and jsloc. 2017-07-07 19:47:14 +01:00
Michael Herzberg 62dc61826a Added parsing of itemcategory. 2017-07-07 19:29:51 +01:00
Michael Herzberg dbe8a26a6b Fixed download parsing. 2017-07-05 16:20:52 +01:00
Michael Herzberg cface0128c Changed download number extraction to also work with Google Docs extensions (and potentially others). 2017-07-05 16:08:15 +01:00
Michael Herzberg 4c01b95f69 Added ratingValue and ratingCount to db. 2017-07-05 14:23:45 +01:00
Achim D. Brucker d5d0a44b69 Reformatting. 2017-07-05 08:21:40 +01:00
Achim D. Brucker 600ec933f4 Introduced optional argument to last_crx - return latest crx that is not newer than the passed date/time. 2017-07-05 08:21:00 +01:00
Achim D. Brucker 30c0b92979 Ignore empty crx files in calculating last crx file date. 2017-07-04 09:30:33 +01:00
Achim D. Brucker 939b29f55a Use getmembers instead of getnames in last_crx(). 2017-07-03 07:04:03 +01:00
Michael Herzberg 6d5221c5d7 Make db path configurable. 2017-06-22 17:46:18 +01:00
Michael Herzberg 6833ba6683 Fixed sqlite creation, added missing commit 2017-06-20 23:47:31 +01:00
Michael Herzberg 4220d48d34 Close db when an exception is thrown. 2017-06-20 23:15:15 +01:00
Achim D. Brucker 8dbd535e3e Merge branch 'master' into production 2017-06-20 20:03:10 +01:00
Achim D. Brucker d9ebe265ae Re-formatting 2017-06-20 18:17:44 +01:00
Achim D. Brucker 05227494d6 Re-formatting 2017-06-20 18:17:36 +01:00
Michael Herzberg dae5d4caa9 Fixed creation of empty .crx files. 2017-06-20 18:05:22 +01:00
Michael Herzberg 7ef8ecf3b1 Relax json parsing of manifest. 2017-06-20 17:45:13 +01:00
Michael Herzberg 1cfe1bdab9 Also ignore /* style comments in manifests. 2017-06-20 15:22:09 +01:00
Michael Herzberg 437a00d256 Don't print warning when crx status is 404. 2017-06-20 15:07:44 +01:00
Michael Herzberg 69cdcd7174 Remove JavaScript-style comments from manifest before parsing. 2017-06-20 11:22:54 +01:00
Michael Herzberg b6bf280d1e Fixed error. 2017-06-20 08:49:01 +01:00
Michael Herzberg 3496e89460 Fixed error. 2017-06-20 08:43:43 +01:00
Michael Herzberg aa259807e2 Catch exceptions due to empty crx header file. 2017-06-20 08:42:30 +01:00
Michael Herzberg c47ba57c97 Changed handling of manifest parsing exceptions. 2017-06-20 08:28:50 +01:00
Michael Herzberg d2dd2aaf81 Moved db path into config file. 2017-06-20 08:10:28 +01:00
Michael Herzberg 39d7bf0330 Deal with missing annotation block in reviews. 2017-06-20 08:03:15 +01:00
Michael Herzberg fa82129c2b Deal with a possibly missing overview.html.status file. 2017-06-19 21:34:42 +01:00
Michael Herzberg c1cd41c2e1 Improved logging. 2017-06-19 18:41:29 +01:00
Michael Herzberg 282e2c4e8c Worked on sqlite stuff. 2017-06-19 16:42:35 +01:00
Achim D. Brucker 456bf292c8 Merge branch 'master' into production 2017-06-19 05:57:29 +01:00
Achim D. Brucker be293b1ba8 Fixed import. 2017-06-19 05:57:17 +01:00
Achim D. Brucker f95619670c Merge branch 'master' into production 2017-06-18 15:38:13 +01:00
Achim D. Brucker d9195c8174 Max. number of concurrent download can now be configured via command line. 2017-06-18 15:36:21 +01:00
Achim D. Brucker 66eff6780d Fixed passing is_new. 2017-06-17 18:26:04 +01:00
Achim D. Brucker 85c8f6a546 Pass is_new flag to sqlite update. 2017-06-17 18:19:44 +01:00
Achim D. Brucker 2e6323c8c5 Report number of extensions for which the SQL database was updated. 2017-06-17 18:15:08 +01:00
Michael Herzberg 7f24a9da7a Split db creation into incremental part and separate full regeneration script. 2017-06-17 17:10:18 +01:00
Achim D. Brucker ea71c5b6e3 Removed debugging code raising exceptions. 2017-06-17 15:38:17 +01:00
Achim D. Brucker 8fcc7ab99f Fixed logging. 2017-06-17 00:48:34 +01:00
Achim D. Brucker 97460c498f Basic support for logging of errors related to SQL import/update. 2017-06-17 00:43:40 +01:00
Achim D. Brucker c4a5c5a231 Ignore non extensions ids in forums.conf. 2017-06-16 23:32:52 +01:00
Achim D. Brucker 86a608c6a1 Re-formatting. 2017-06-16 23:19:13 +01:00
Achim D. Brucker 1c8d68d495 Moved path utility functions into config module. 2017-06-16 23:09:23 +01:00
Achim D. Brucker 9f174f6785 Downport to python 3.5. 2017-06-16 22:38:48 +01:00
Michael Herzberg 6e2772711f Next version of sqlite generator. 2017-06-16 20:40:48 +01:00
Michael Herzberg c08124fa17 First version of sqlite generator. 2017-06-16 14:56:23 +01:00
Achim D. Brucker ab4c0ad002 Fixed logging. 2017-06-16 12:07:51 +01:00
Achim D. Brucker b9e5ca6f82 Stub for updating sqlite. 2017-06-16 11:06:04 +01:00
Achim D. Brucker 73baef61b2 Re-formatting. 2017-06-16 10:29:07 +01:00
Achim D. Brucker 64778c783e Check etags in addition to modified-since (basic implementation). 2017-06-16 10:28:47 +01:00
Achim D. Brucker 763ac137b2 Re-introduced parallel download (they are not causing the If-Modified-Since problem). 2017-06-15 23:14:07 +01:00
Achim D. Brucker 3ff67bddc8 Disabled parallel download (IF-Modified Bug Hunting). 2017-06-14 16:56:05 +01:00
Achim D. Brucker fd717d1516 Reformatting. 2017-05-27 20:39:00 +01:00