Achim D. Brucker
|
24c65daecf
|
Bug fix: check for dirty missed actual function application.
|
2017-09-16 13:41:47 +01:00 |
Achim D. Brucker
|
c274b96f66
|
Added csv output for debugging.
|
2017-09-16 13:21:49 +01:00 |
Michael Herzberg
|
69e95fdf13
|
Catch JSON parse exceptions for reviews etc. more nicely.
|
2017-09-16 12:53:35 +01:00 |
Achim D. Brucker
|
de6dde5269
|
Updated help text to include taskid/maxtaskid.
|
2017-09-16 12:41:18 +01:00 |
Michael Herzberg
|
58aacef3ff
|
Reopen connection after every exception.
|
2017-09-16 12:31:00 +01:00 |
Michael Herzberg
|
a514c0001e
|
Added check for empty crx files.
|
2017-09-16 12:14:41 +01:00 |
Michael Herzberg
|
b51de8577f
|
Added compression for mysql.
|
2017-09-16 12:04:35 +01:00 |
Achim D. Brucker
|
92e1c4c2e5
|
Skip deleted files.
|
2017-09-16 11:41:21 +01:00 |
Achim D. Brucker
|
082cd2fc65
|
Added hackish pull method that uses the regular git binary. While this method will not work well with filenames containing spaces and there might be other glitches, it allows pulling an update of the cdnjs git repository (> 100GB) within a couple of minutes, compared to the couple of days that the non-hackish solution needs.
|
2017-09-16 11:36:40 +01:00 |
Achim D. Brucker
|
5d3343acf1
|
Refactoring: moved git_repo creation into pull_get_list_changed_files(...).
|
2017-09-16 10:33:11 +01:00 |
Achim D. Brucker
|
7b0e63da10
|
Implemented n/N options for external parallelisation (only for fresh initialization).
|
2017-09-15 22:40:46 +01:00 |
Michael Herzberg
|
a1781b9ff9
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-15 21:32:25 +01:00 |
Michael Herzberg
|
1814b1738a
|
Added email notifications on abort.
|
2017-09-15 21:32:12 +01:00 |
Achim D. Brucker
|
400e74ae3f
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-15 20:21:45 +01:00 |
Achim D. Brucker
|
26678636eb
|
Ignore commits where blobs are None.
|
2017-09-15 20:21:05 +01:00 |
Michael Herzberg
|
85680d360b
|
Automatically reopen database connection on failure.
|
2017-09-15 18:23:25 +01:00 |
Michael Herzberg
|
ddbbc2672d
|
Try to insert also other data if some inserts fail. Use autocommit to prevent data loss on retries.
|
2017-09-15 18:15:03 +01:00 |
Michael Herzberg
|
c57bce2491
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-15 17:42:05 +01:00 |
Achim D. Brucker
|
936f2d3189
|
Log git info before starting pull (update).
|
2017-09-14 22:54:37 +01:00 |
Achim D. Brucker
|
2ff30f7382
|
Parallel execution of git date queries.
|
2017-09-14 15:11:53 +01:00 |
Achim D. Brucker
|
12a1e282aa
|
The method pull_get_updated_lib_files(...) now also returns unique library/version information.
|
2017-09-14 10:44:30 +01:00 |
Achim D. Brucker
|
e3f1202e44
|
Use version dictionary.
|
2017-09-14 10:33:00 +01:00 |
Achim D. Brucker
|
f54f29c9ba
|
Added build_release_date_dic(...).
|
2017-09-14 09:50:09 +01:00 |
Achim D. Brucker
|
3b217922c5
|
Added line count.
|
2017-09-13 16:41:01 +01:00 |
Achim D. Brucker
|
420eec7462
|
Minor memory optimizations.
|
2017-09-13 11:12:33 +01:00 |
Achim D. Brucker
|
ec1c47625a
|
Added support for parallel update of database.
|
2017-09-13 09:13:35 +01:00 |
Achim D. Brucker
|
c386bd01dd
|
Added missing string conversion.
|
2017-09-13 08:29:23 +01:00 |
Achim D. Brucker
|
42e685ee32
|
Added missing string conversion.
|
2017-09-13 08:01:02 +01:00 |
Achim D. Brucker
|
18fb23d3dc
|
Use glob instead of os.walk() to avoid memory leak in the latter.
|
2017-09-13 04:04:38 +01:00 |
Achim D. Brucker
|
76d5993794
|
Added logging output.
|
2017-09-13 03:02:39 +01:00 |
Achim D. Brucker
|
c30f7fdd7c
|
Implemented skeleton of main routine.
|
2017-09-13 02:56:13 +01:00 |
Achim D. Brucker
|
a8a5534be1
|
Renamed module.
|
2017-09-13 01:13:17 +01:00 |
Achim D. Brucker
|
bdb84c2120
|
Renamed module.
|
2017-09-13 01:09:30 +01:00 |
Achim D. Brucker
|
4e5b52617f
|
Catch exception during decompression and increase max. allowed size of decompressed data to 100 times the compressed size.
|
2017-09-13 00:23:17 +01:00 |
Achim D. Brucker
|
88efe2b8a4
|
Reformatting.
|
2017-09-13 00:02:20 +01:00 |
Achim D. Brucker
|
ea9339bc53
|
Compute data identifiers for uncompressed content of gzip compressed files.
|
2017-09-13 00:01:15 +01:00 |
Achim D. Brucker
|
f9cf7bd35f
|
Refactoring: moved computation of data related identifiers into own method.
|
2017-09-12 23:52:52 +01:00 |
Achim D. Brucker
|
8243664974
|
Use StringIO representation for normalizing js/css files (avoid re-reading the file content from disk).
|
2017-09-12 23:43:09 +01:00 |
Achim D. Brucker
|
933c4d4d11
|
Determine file description from buffer instead of from file (avoid reading file twice).
|
2017-09-12 23:23:22 +01:00 |
Michael Herzberg
|
5ce3f2a148
|
Added until-date option.
|
2017-09-12 11:01:44 +01:00 |
Achim D. Brucker
|
6353202ee8
|
Renaming: fileinfo -> filedb.
|
2017-09-10 22:59:07 +01:00 |
Achim D. Brucker
|
0426d7d3d1
|
Reformatting.
|
2017-09-10 22:39:47 +01:00 |
Achim D. Brucker
|
e5da9abaea
|
Added get_file_libinfo(...).
|
2017-09-10 22:38:49 +01:00 |
Achim D. Brucker
|
8d9f6e4fa1
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-10 17:40:45 +01:00 |
Achim D. Brucker
|
ad2af517a3
|
Aggressively try to normalize as many filetypes as possible.
|
2017-09-10 17:40:30 +01:00 |
Achim D. Brucker
|
06ff5f3057
|
Method for computing basic file identifiers.
|
2017-09-10 15:57:07 +01:00 |
Achim D. Brucker
|
a6e90794bc
|
Extended const_basedir to check environment variable EXTENSION_ARCHIVE and modified main scripts to actually use const_basedir.
|
2017-09-10 15:55:22 +01:00 |
Achim D. Brucker
|
4b31097975
|
Added function for computing a list of normalized code blocks for a JavaScript file.
|
2017-09-10 15:02:57 +01:00 |
Michael Herzberg
|
fbef566466
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-10 12:20:33 +01:00 |
Michael Herzberg
|
e09cb16083
|
Updated path to archive.
|
2017-09-10 12:20:23 +01:00 |