Achim D. Brucker
|
936f2d3189
|
Log git info before starting pull (update).
|
2017-09-14 22:54:37 +01:00 |
Achim D. Brucker
|
2ff30f7382
|
Parallel execution of git date queries.
|
2017-09-14 15:11:53 +01:00 |
Achim D. Brucker
|
12a1e282aa
|
The method pull_get_updated_lib_files(...) now also returns unique library/version information.
|
2017-09-14 10:44:30 +01:00 |
Achim D. Brucker
|
e3f1202e44
|
Use version dictionary.
|
2017-09-14 10:33:00 +01:00 |
Achim D. Brucker
|
f54f29c9ba
|
Added build_release_date_dic(...).
|
2017-09-14 09:50:09 +01:00 |
Achim D. Brucker
|
3b217922c5
|
Added line count.
|
2017-09-13 16:41:01 +01:00 |
Achim D. Brucker
|
420eec7462
|
Minor memory optimizations.
|
2017-09-13 11:12:33 +01:00 |
Achim D. Brucker
|
ec1c47625a
|
Added support for parallel update of database.
|
2017-09-13 09:13:35 +01:00 |
Achim D. Brucker
|
c386bd01dd
|
Added missing string conversion.
|
2017-09-13 08:29:23 +01:00 |
Achim D. Brucker
|
42e685ee32
|
Added missing string conversion.
|
2017-09-13 08:01:02 +01:00 |
Achim D. Brucker
|
18fb23d3dc
|
Use glob instead of os.walk() to avoid memory leak in the latter.
|
2017-09-13 04:04:38 +01:00 |
Achim D. Brucker
|
76d5993794
|
Added logging output.
|
2017-09-13 03:02:39 +01:00 |
Achim D. Brucker
|
c30f7fdd7c
|
Implemented skeleton of main routine.
|
2017-09-13 02:56:13 +01:00 |
Achim D. Brucker
|
a8a5534be1
|
Renamed module.
|
2017-09-13 01:13:17 +01:00 |
Achim D. Brucker
|
bdb84c2120
|
Renamed module.
|
2017-09-13 01:09:30 +01:00 |
Achim D. Brucker
|
4e5b52617f
|
Catch exception during decompression and increase max. allowed size of decompressed data to 100 times the compressed size.
|
2017-09-13 00:23:17 +01:00 |
Achim D. Brucker
|
88efe2b8a4
|
Reformatting.
|
2017-09-13 00:02:20 +01:00 |
Achim D. Brucker
|
ea9339bc53
|
Compute data identifiers for uncompressed content of gzip compressed files.
|
2017-09-13 00:01:15 +01:00 |
Achim D. Brucker
|
f9cf7bd35f
|
Refactoring: moved computation of data related identifiers into own method.
|
2017-09-12 23:52:52 +01:00 |
Achim D. Brucker
|
8243664974
|
Use StringIO representation for normalizing js/css files (avoid re-reading the file content from disk).
|
2017-09-12 23:43:09 +01:00 |
Achim D. Brucker
|
933c4d4d11
|
Determine file description from buffer instead of from file (avoid reading file twice).
|
2017-09-12 23:23:22 +01:00 |
Achim D. Brucker
|
6353202ee8
|
Renaming: fileinfo -> filedb.
|
2017-09-10 22:59:07 +01:00 |
Achim D. Brucker
|
0426d7d3d1
|
Reformatting.
|
2017-09-10 22:39:47 +01:00 |
Achim D. Brucker
|
e5da9abaea
|
Added get_file_libinfo(...).
|
2017-09-10 22:38:49 +01:00 |
Achim D. Brucker
|
ad2af517a3
|
Aggressively try to normalize as many filetypes as possible.
|
2017-09-10 17:40:30 +01:00 |
Achim D. Brucker
|
06ff5f3057
|
Method for computing basic file identifiers.
|
2017-09-10 15:57:07 +01:00 |
Achim D. Brucker
|
a6e90794bc
|
Extended const_basedir to check environment variable EXTENSION_ARCHIVE and modified main scripts to actually use const_basedir.
|
2017-09-10 15:55:22 +01:00 |
Achim D. Brucker
|
4b31097975
|
Added function for computing a list of normalized code blocks for a JavaScript file.
|
2017-09-10 15:02:57 +01:00 |
Achim D. Brucker
|
52b42dfaef
|
Changed pull method to return list of changed files.
|
2017-09-10 11:01:29 +01:00 |
Achim D. Brucker
|
c3053427c0
|
Added method for obtaining initial commit date and pulling git repos.
|
2017-09-09 23:13:26 +01:00 |
Achim D. Brucker
|
8c33558934
|
Reformatting.
|
2017-09-07 20:09:29 +01:00 |
Achim D. Brucker
|
3b2913616b
|
Skip first_seen if not defined.
|
2017-09-05 10:15:48 +01:00 |
Michael Herzberg
|
a9173345e8
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-04 15:54:38 +01:00 |
Michael Herzberg
|
36d36facfe
|
Relaxed mysql retries.
|
2017-09-04 15:54:28 +01:00 |
Achim D. Brucker
|
6395d98443
|
Relaxed handling of network errors.
|
2017-09-04 09:11:27 +01:00 |
Achim D. Brucker
|
cfeb29d95f
|
Clean-up of logging infrastructure.
|
2017-09-03 15:56:27 +01:00 |
Achim D. Brucker
|
f42f8e3d03
|
Improved error handling for request failures.
|
2017-09-03 15:43:33 +01:00 |
Achim D. Brucker
|
872346fa61
|
Add timeout parameter to http get requests.
|
2017-09-03 12:03:51 +01:00 |
Achim D. Brucker
|
0b0268e320
|
Copy outphased date to hash map of files archive.
|
2017-09-03 11:13:27 +01:00 |
Achim D. Brucker
|
0f716e98da
|
Bug fix: only try to preserve outphased library information if there is any stored locally.
|
2017-09-03 11:09:39 +01:00 |
Achim D. Brucker
|
80c8e7caa0
|
Preserve outphased library versions.
|
2017-09-03 11:00:05 +01:00 |
Achim D. Brucker
|
03504ff81a
|
Improved error handling.
|
2017-09-03 10:45:56 +01:00 |
Achim D. Brucker
|
13191f1ce0
|
Renaming: date -> first_seen.
|
2017-09-03 10:32:45 +01:00 |
Achim D. Brucker
|
59f9b47a81
|
Switched to Logging framework.
|
2017-09-03 10:29:57 +01:00 |
Achim D. Brucker
|
074447064c
|
Enabled parallel download.
|
2017-09-03 10:06:55 +01:00 |
Achim D. Brucker
|
515a462938
|
Added methods for generating/updating index files based on the file hash.
|
2017-09-02 22:10:43 +01:00 |
Achim D. Brucker
|
9ae5905973
|
Generalized hash map builders.
|
2017-09-02 21:53:58 +01:00 |
Achim D. Brucker
|
22c3a7581d
|
Reformatting.
|
2017-09-02 21:44:20 +01:00 |
Achim D. Brucker
|
3097db3790
|
Added methods for generating sha1 indexed dictionary.
|
2017-09-02 21:40:44 +01:00 |
Achim D. Brucker
|
e5c2372222
|
Improved log output (verbose mode).
|
2017-09-02 20:57:01 +01:00 |
Achim D. Brucker
|
c32ab6bc94
|
Print URL of downloaded library files in verbose mode.
|
2017-09-02 20:44:47 +01:00 |
Achim D. Brucker
|
ea8460f1b8
|
Updated local update.
|
2017-09-02 20:41:16 +01:00 |
Achim D. Brucker
|
030a4b36ca
|
Added functionality for deleting information of orphaned libraries.
|
2017-09-02 19:43:10 +01:00 |
Achim D. Brucker
|
247b96db6d
|
Refactoring: moved core functionality in own module.
|
2017-09-02 18:47:41 +01:00 |
Achim D. Brucker
|
7bcf9aca8e
|
Removed executable flag.
|
2017-09-02 18:08:20 +01:00 |
Achim D. Brucker
|
99028c3763
|
Removed executable flag.
|
2017-09-02 18:08:06 +01:00 |
Achim D. Brucker
|
9ed8f5f926
|
Improved reporting.
|
2017-09-02 00:05:07 +01:00 |
Achim D. Brucker
|
a69c173064
|
Activated preliminary check of regexps for specific libs.
|
2017-09-01 23:41:45 +01:00 |
Achim D. Brucker
|
28f6aa5f45
|
Bug fix: indentation
|
2017-09-01 23:24:55 +01:00 |
Achim D. Brucker
|
5c987833a4
|
Bug fix: NoneType object is not iterable.
|
2017-09-01 23:23:11 +01:00 |
Michael Herzberg
|
bb03a67a29
|
Deleted ropeproject stuff.
|
2017-09-01 17:04:25 +01:00 |
Achim D. Brucker
|
2693fb0fcd
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-01 16:28:18 +01:00 |
Achim D. Brucker
|
3fb0d740c0
|
Bug fix: exception due to reading from the wrong dictionary.
|
2017-09-01 16:27:44 +01:00 |
Michael Herzberg
|
ab943c87f0
|
Expand user directory for mysql config file.
|
2017-09-01 16:17:51 +01:00 |
Michael Herzberg
|
abd9605ebc
|
Use python3.5 for all files.
|
2017-09-01 14:12:05 +01:00 |
Michael Herzberg
|
cbcb3bc3b0
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-01 13:30:57 +01:00 |
Michael Herzberg
|
5c24608c4d
|
Added --max-discover <N> option to limit the number of new extensions.
|
2017-09-01 13:30:42 +01:00 |
Achim D. Brucker
|
258269abb6
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-01 12:14:53 +01:00 |
Achim D. Brucker
|
8289d50d38
|
Download all extensions in parallel and later do a second download for a subset including forums/reviews.
|
2017-09-01 12:14:39 +01:00 |
Michael Herzberg
|
b5fd382ab8
|
Use utf8mb4 for mysql connections.
|
2017-09-01 12:11:37 +01:00 |
Achim D. Brucker
|
53f080ba36
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-01 12:02:17 +01:00 |
Achim D. Brucker
|
22264fb9e0
|
Changed download order: first download all extensions in parallel without forum/review download, then download extensions with forums.
|
2017-09-01 12:02:12 +01:00 |
Michael Herzberg
|
62c353f647
|
Removed crawling restriction to 10 extids.
|
2017-09-01 10:42:21 +01:00 |
Michael Herzberg
|
21a7741f0c
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-09-01 08:15:36 +01:00 |
Michael Herzberg
|
05ffdc6e24
|
Added explicit utf-8 request to mysql connector.
|
2017-09-01 08:15:22 +01:00 |
Achim D. Brucker
|
9446c20d01
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-31 23:45:00 +01:00 |
Achim D. Brucker
|
883e7ddcd8
|
Report details of matches.
|
2017-08-31 23:44:51 +01:00 |
Michael Herzberg
|
e06d3f4ac4
|
Reduced timeout and fixed logging.
|
2017-08-31 23:01:05 +01:00 |
Achim D. Brucker
|
e0db2a5f47
|
Added detection details.
|
2017-08-31 08:43:19 +01:00 |
Michael Herzberg
|
ccf43de3d0
|
Pad process id to 6 chars.
|
2017-08-30 20:05:17 +01:00 |
Michael Herzberg
|
906d81ab86
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-30 19:59:33 +01:00 |
Michael Herzberg
|
4145f88a94
|
MySQL wait +/- 20%.
|
2017-08-30 19:59:15 +01:00 |
Achim D. Brucker
|
e70cf5d28f
|
Bug fix: missing hex decoding of md5/sha1 hashes.
|
2017-08-30 19:25:58 +01:00 |
Michael Herzberg
|
b76eef12d5
|
Added randomized delay for MySQL.
|
2017-08-30 18:55:13 +01:00 |
Michael Herzberg
|
bde59c5040
|
Fixed crx_etag select bug and some logging.
|
2017-08-30 16:32:36 +01:00 |
Michael Herzberg
|
cbd2dea820
|
Removed everything related to sqlite and updated README.
|
2017-08-30 15:38:04 +01:00 |
Michael Herzberg
|
c7a808db3f
|
Don't process replies if there are none.
|
2017-08-30 15:15:12 +01:00 |
Michael Herzberg
|
5f234d8539
|
Improved logging.
|
2017-08-30 15:12:54 +01:00 |
Michael Herzberg
|
12c111ca11
|
Once more, make mysql the default. Also, increased timeout.
|
2017-08-30 12:13:25 +01:00 |
Michael Herzberg
|
9b8a693a5f
|
Changed logging a little bit.
|
2017-08-30 12:12:57 +01:00 |
Michael Herzberg
|
f8c8382919
|
Merge.
|
2017-08-30 11:42:12 +01:00 |
Michael Herzberg
|
6a9a1cda63
|
Moved crx logging to where crx will actually be parsed.
|
2017-08-30 11:29:30 +01:00 |
Michael Herzberg
|
d99142f8d0
|
Added and changed a few columns.
|
2017-08-30 10:07:06 +01:00 |
Achim D. Brucker
|
3269a4900c
|
Bug fix: printing of file name in JavaScript mode.
|
2017-08-30 09:56:19 +01:00 |
Achim D. Brucker
|
b5b6a17ee5
|
Support analysis of crx files and plain JavaScript files.
|
2017-08-30 09:11:55 +01:00 |
Achim D. Brucker
|
cacdf1f727
|
Refactoring.
|
2017-08-30 08:28:39 +01:00 |
Achim D. Brucker
|
85d6ec084d
|
Bug fix: missing detection method for empty files.
|
2017-08-30 08:24:16 +01:00 |
Achim D. Brucker
|
d7120fad45
|
Bug fix: update char if the loop reads another char (next_char). This avoids missing an escape character or a newline.
|
2017-08-30 01:43:18 +01:00 |
Achim D. Brucker
|
66818b2fa6
|
Renamed hash to md5 in JSON file and added support for sha1 hashes.
|
2017-08-30 00:38:30 +01:00 |
Achim D. Brucker
|
e947e69f37
|
Define type and detection method for all generated entries.
|
2017-08-30 00:24:19 +01:00 |
Achim D. Brucker
|
ae3bbd7339
|
Using values of enumeration to obtain nice and short human readable representations.
|
2017-08-30 00:12:57 +01:00 |
Michael Herzberg
|
47f424cf2f
|
Added more logging.
|
2017-08-29 23:10:46 +01:00 |
Michael Herzberg
|
080f00f17c
|
Added new columns for jsfile table.
|
2017-08-29 22:40:01 +01:00 |
Michael Herzberg
|
95d71a9edc
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-29 22:29:49 +01:00 |
Michael Herzberg
|
3e24d1f08c
|
Changed logging to use logging library.
|
2017-08-29 22:29:38 +01:00 |
Achim D. Brucker
|
39cd03dccc
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-29 18:01:42 +01:00 |
Achim D. Brucker
|
97f5b14158
|
Compute sha1 for JavaScript files.
|
2017-08-29 18:01:28 +01:00 |
Michael Herzberg
|
bddd80c138
|
Made removal of manifest.json comments stricter.
|
2017-08-29 15:43:04 +01:00 |
Michael Herzberg
|
7ffdf30545
|
Push manifest into table crx column manifest.
|
2017-08-29 15:41:13 +01:00 |
Michael Herzberg
|
2b11117b6f
|
Always process crx, regardless of whether crx_etag is already in db.
|
2017-08-29 15:24:59 +01:00 |
Michael Herzberg
|
8b91957372
|
Reduced default MySQL timeout.
|
2017-08-29 15:20:58 +01:00 |
Michael Herzberg
|
6a99d41471
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-29 15:11:37 +01:00 |
Achim D. Brucker
|
d4ad5f96f8
|
Report empty files as own category/type.
|
2017-08-28 22:38:06 +01:00 |
Michael Herzberg
|
f81aac7c61
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-28 22:38:05 +01:00 |
Achim D. Brucker
|
2ace19f453
|
Compute js_info (including md5 hash and character set detection) only once per file.
|
2017-08-28 21:05:15 +01:00 |
Achim D. Brucker
|
91dfe67513
|
Auto-detect character encoding of JavaScript files using cchardet.
|
2017-08-28 20:53:55 +01:00 |
Michael Herzberg
|
c30f0c4147
|
Removed database and host setting. To be set in ~/.my.cnf file now.
|
2017-08-28 20:17:11 +01:00 |
Achim D. Brucker
|
5cff2bc1b7
|
New check based on file hash (md5).
|
2017-08-28 20:09:34 +01:00 |
Achim D. Brucker
|
030adb6adc
|
Minor refactoring and cleanup.
|
2017-08-28 19:20:50 +01:00 |
Michael Herzberg
|
5175d28edc
|
Convert some stuff to string for db insert.
|
2017-08-28 17:12:32 +01:00 |
Michael Herzberg
|
0a4e8839a1
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-28 11:50:49 +01:00 |
Michael Herzberg
|
81077b807c
|
Added mysql retry logic and use time.time() now.
|
2017-08-28 11:50:41 +01:00 |
Achim D. Brucker
|
9bf0b47f98
|
Minor improvement of string conversion for JsBlock.
|
2017-08-28 10:50:52 +01:00 |
Achim D. Brucker
|
c721e6fdbf
|
Merge with upstream.
|
2017-08-28 10:49:01 +01:00 |
Achim D. Brucker
|
f10923af03
|
Integrated js_mincer into decomposition analysis to allow, in the future, checking comments, code, and string literals explicitly.
|
2017-08-28 10:40:37 +01:00 |
Achim D. Brucker
|
9ef27f9ac9
|
Added missing return statements.
|
2017-08-28 10:28:21 +01:00 |
Achim D. Brucker
|
90b1db4a25
|
Added additional comment checks.
|
2017-08-28 01:26:13 +01:00 |
Achim D. Brucker
|
9b272c9302
|
Added option to merge subsequent single line comments into a single line comment block.
|
2017-08-28 01:17:00 +01:00 |
Achim D. Brucker
|
111777c821
|
Improved position counting.
|
2017-08-28 00:57:58 +01:00 |
Achim D. Brucker
|
d4de20efc1
|
Bug fix: start position of blocks and omit empty code blocks.
|
2017-08-28 00:19:28 +01:00 |
Achim D. Brucker
|
e2e92594ce
|
Bug fix: catch also last block of file.
|
2017-08-27 23:34:33 +01:00 |
Michael Herzberg
|
257afe92f0
|
Use selective insert.
|
2017-08-27 23:00:28 +01:00 |
Achim D. Brucker
|
629f492fa7
|
Added tests for code blocks and comments.
|
2017-08-27 22:58:09 +01:00 |
Achim D. Brucker
|
7ff1623bc6
|
Introduced JavaScript mincer working on file objects.
|
2017-08-27 22:51:55 +01:00 |
Michael Herzberg
|
b98b7bc0f7
|
Fixed column typo.
|
2017-08-27 22:49:07 +01:00 |
Achim D. Brucker
|
e324ab9483
|
Re-formatted and added documentation.
|
2017-08-27 22:41:04 +01:00 |
Achim D. Brucker
|
9376b4056f
|
Collect string literals in code blocks.
|
2017-08-27 22:27:35 +01:00 |
Achim D. Brucker
|
41ca506b9f
|
Return iterator that iterates over JavaScript blocks.
|
2017-08-27 22:17:04 +01:00 |
Achim D. Brucker
|
5add586da3
|
Initial commit.
|
2017-08-27 20:47:24 +01:00 |
Achim D. Brucker
|
f6f0bc0394
|
Renamed jsdecompose.py to js_decomposer.py.
|
2017-08-27 20:45:56 +01:00 |
Michael Herzberg
|
9521240d90
|
Make stuff configurable.
|
2017-08-27 18:28:19 +01:00 |
Michael Herzberg
|
0cff600861
|
Fixed etag keys.
|
2017-08-27 17:35:58 +01:00 |
Michael Herzberg
|
d4b0a6535b
|
Fixed some things.
|
2017-08-27 16:57:23 +01:00 |
Michael Herzberg
|
f075192b44
|
Made sqlite default again.
|
2017-08-27 03:26:29 +01:00 |
Michael Herzberg
|
22c90dcb4f
|
Truncate timezone from timestamps for mysql, make mysql default.
|
2017-08-27 03:14:43 +01:00 |
Michael Herzberg
|
585c8faf0e
|
Added mysql, but still commented out.
|
2017-08-27 02:53:15 +01:00 |
Michael Herzberg
|
c5c04cd1ed
|
Refactored sqlite-specifics into own class.
|
2017-08-27 00:22:19 +01:00 |
Achim D. Brucker
|
0bd6a55adb
|
Added documentation for analyse_filename.
|
2017-08-26 22:45:14 +01:00 |
Achim D. Brucker
|
df472fbbe8
|
Refactored filename check.
|
2017-08-26 22:43:57 +01:00 |
Achim D. Brucker
|
b2c862ede1
|
Added fields for storing evidence information for detected library/version information.
|
2017-08-25 07:07:34 +01:00 |
Achim D. Brucker
|
807af6f32d
|
Refactoring: proper use of enumerations.
|
2017-08-24 21:37:35 +01:00 |
Achim D. Brucker
|
45d2c7ad44
|
Fundamental refactoring.
|
2017-08-24 19:43:48 +01:00 |
Achim D. Brucker
|
676cc5ac9d
|
Renamed detectLibraries to decompose_js.
|
2017-08-24 00:47:35 +01:00 |
Achim D. Brucker
|
486b967d2d
|
Refactoring.
|
2017-08-24 00:44:34 +01:00 |
Achim D. Brucker
|
9ced7ea3b5
|
Refactoring and bug fix in library classification.
|
2017-08-24 00:29:44 +01:00 |
Achim D. Brucker
|
94bd0f9a95
|
Refactoring.
|
2017-08-23 23:37:15 +01:00 |
Achim D. Brucker
|
2bbd6281f7
|
Reformatting.
|
2017-08-23 20:09:02 +01:00 |
Achim D. Brucker
|
4c5f8889d2
|
Refactoring.
|
2017-08-23 20:04:52 +01:00 |
Achim D. Brucker
|
cd217f57a6
|
Integrated JavaScript decomposition analysis.
|
2017-08-23 19:42:00 +01:00 |
Achim D. Brucker
|
5d89e28486
|
Cleanup.
|
2017-08-23 19:17:35 +01:00 |
Achim D. Brucker
|
123623b111
|
Minor code cleanup.
|
2017-08-23 17:36:41 +01:00 |
Achim D. Brucker
|
3208a6e58a
|
Initial import of JavaScript decomposition framework.
|
2017-08-23 17:22:58 +01:00 |
Michael Herzberg
|
68e7e72e93
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-08-09 13:06:42 +01:00 |
Michael Herzberg
|
40f800b4de
|
Check if 'annotations' exists in search results.
|
2017-08-09 13:06:22 +01:00 |
Achim D. Brucker
|
d3da686e16
|
Changed formula for computing download delay.
|
2017-08-05 10:57:45 +01:00 |
Michael Herzberg
|
c61f19e860
|
Use INSERT OR IGNORE.
|
2017-07-31 23:55:21 +01:00 |
Michael Herzberg
|
b8f57196c7
|
Changed fts table structure.
|
2017-07-31 23:23:57 +01:00 |
Michael Herzberg
|
b34d45c4dc
|
Added md5sum to sqlite.
|
2017-07-31 20:38:21 +01:00 |
Achim D. Brucker
|
35c133e395
|
Slightly more aggressive implementation of google_dos_protection.
|
2017-07-30 14:41:20 +01:00 |
Achim D. Brucker
|
5268f2a732
|
Refactoring: clean-up of imports and a few other minor improvements.
|
2017-07-29 16:13:39 +01:00 |
Achim D. Brucker
|
eb0054b47d
|
Refactoring: Moved default configuration to config module.
|
2017-07-29 12:36:20 +01:00 |
Achim D. Brucker
|
0b24fb15fe
|
Refactoring.
|
2017-07-29 11:32:06 +01:00 |
Achim D. Brucker
|
0ca3476b09
|
Slightly more aggressive implementation of google_dos_protection.
|
2017-07-29 11:21:15 +01:00 |
Achim D. Brucker
|
d05ca9678e
|
Refactoring.
|
2017-07-29 10:57:35 +01:00 |
Achim D. Brucker
|
ac663299b3
|
Refactoring.
|
2017-07-29 10:17:16 +01:00 |
Achim D. Brucker
|
10cce2859d
|
Renamed variable/attribute pk to public_key.
|
2017-07-29 09:15:22 +01:00 |
Achim D. Brucker
|
333bcaa62d
|
Strip path from crx file.
|
2017-07-29 09:12:01 +01:00 |
Achim D. Brucker
|
e5d671c7c4
|
Refactoring.
|
2017-07-29 09:05:16 +01:00 |
Michael Herzberg
|
11604c0fa5
|
Collect jsfilesize instead of jsloc.
|
2017-07-26 12:05:54 +01:00 |
Achim D. Brucker
|
73eedab07d
|
Log time delta for each extension update.
|
2017-07-26 07:29:48 +01:00 |
Michael Herzberg
|
b3d1ab912e
|
Wait a maximum of 10min before stopping jsbeautifier.
|
2017-07-25 22:57:55 +01:00 |
Michael Herzberg
|
072e008fe2
|
Run the garbage collector manually after using jsbeautify.
|
2017-07-19 17:25:21 +01:00 |
Michael Herzberg
|
186f6162af
|
Fixed NoneType str conversion exception.
|
2017-07-17 15:29:00 +01:00 |
Michael Herzberg
|
9b1e5db96f
|
Check for attributes key first and use traceback module instead of printing str(e).
|
2017-07-17 14:00:39 +01:00 |
Michael Herzberg
|
eded1ca893
|
Only attempt to search for replies when we actually have search parameters.
|
2017-07-16 20:14:50 +01:00 |
Michael Herzberg
|
26bddde328
|
Removed primary keys from fts tables as that had no effect.
|
2017-07-12 18:30:37 +01:00 |
Michael Herzberg
|
6a6a12c88a
|
Added parsing of support to sqlite.
|
2017-07-12 18:11:31 +01:00 |
Michael Herzberg
|
16a44cf499
|
Added parsing of review replies to sqlite.
|
2017-07-12 17:56:40 +01:00 |
Michael Herzberg
|
0ed8c15a2d
|
Made review a fts table.
|
2017-07-12 17:04:56 +01:00 |
Michael Herzberg
|
51bdcb4f16
|
Also download replies for support forum.
|
2017-07-12 16:57:16 +01:00 |
Michael Herzberg
|
11b0ccee4a
|
Added download of review replies.
|
2017-07-12 16:10:47 +01:00 |
Michael Herzberg
|
d6ae9d28b8
|
Fixed bug that led to downloading the first review page twice instead of the first and second review page.
|
2017-07-12 14:09:01 +01:00 |
Michael Herzberg
|
60dd98e60e
|
Fixed parsing of developer from overview page.
|
2017-07-12 13:54:44 +01:00 |
Michael Herzberg
|
e60265975f
|
Renamed etag to crx_etag.
|
2017-07-10 12:46:41 +01:00 |
Achim D. Brucker
|
c4e13daae5
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-07-09 18:55:24 +01:00 |
Michael Herzberg
|
77023e001e
|
Do not treat js file decoding strictly.
|
2017-07-07 22:17:56 +01:00 |
Michael Herzberg
|
38c88d7461
|
Added parsing of content_script_urls.
|
2017-07-07 20:09:22 +01:00 |
Michael Herzberg
|
fbc0a7c87c
|
Added crx size and jsloc.
|
2017-07-07 19:47:14 +01:00 |
Michael Herzberg
|
62dc61826a
|
Added parsing of itemcategory.
|
2017-07-07 19:29:51 +01:00 |
Michael Herzberg
|
dbe8a26a6b
|
Fixed download parsing.
|
2017-07-05 16:20:52 +01:00 |
Michael Herzberg
|
cface0128c
|
Changed download number extraction to also work with Google Docs extensions (and potentially others).
|
2017-07-05 16:08:15 +01:00 |
Michael Herzberg
|
4c01b95f69
|
Added ratingValue and ratingCount to db.
|
2017-07-05 14:23:45 +01:00 |
Achim D. Brucker
|
d5d0a44b69
|
Reformatting.
|
2017-07-05 08:21:40 +01:00 |
Achim D. Brucker
|
600ec933f4
|
Introduced optional argument to last_crx - return latest crx that is not newer than the passed date/time.
|
2017-07-05 08:21:00 +01:00 |
Achim D. Brucker
|
30c0b92979
|
Ignore empty crx files in calculating last crx file date.
|
2017-07-04 09:30:33 +01:00 |
Achim D. Brucker
|
939b29f55a
|
Use getmembers instead of getnames in last_crx().
|
2017-07-03 07:04:03 +01:00 |
Michael Herzberg
|
6d5221c5d7
|
Make db path configurable.
|
2017-06-22 17:46:18 +01:00 |
Michael Herzberg
|
6833ba6683
|
Fixed sqlite creation, added missing commit.
|
2017-06-20 23:47:31 +01:00 |
Michael Herzberg
|
4220d48d34
|
Close db when an exception is thrown.
|
2017-06-20 23:15:15 +01:00 |
Achim D. Brucker
|
8dbd535e3e
|
Merge branch 'master' into production
|
2017-06-20 20:03:10 +01:00 |
Achim D. Brucker
|
d9ebe265ae
|
Re-formatting
|
2017-06-20 18:17:44 +01:00 |
Achim D. Brucker
|
05227494d6
|
Re-formatting
|
2017-06-20 18:17:36 +01:00 |
Michael Herzberg
|
dae5d4caa9
|
Fixed creation of empty .crx files.
|
2017-06-20 18:05:22 +01:00 |
Michael Herzberg
|
7ef8ecf3b1
|
Relax json parsing of manifest.
|
2017-06-20 17:45:13 +01:00 |
Michael Herzberg
|
1cfe1bdab9
|
Also ignore /* style comments in manifests.
|
2017-06-20 15:22:09 +01:00 |
Michael Herzberg
|
437a00d256
|
Don't print warning when crx status is 404.
|
2017-06-20 15:07:44 +01:00 |
Michael Herzberg
|
69cdcd7174
|
Remove JavaScript-style comments from manifest before parsing.
|
2017-06-20 11:22:54 +01:00 |
Michael Herzberg
|
b6bf280d1e
|
Fixed error.
|
2017-06-20 08:49:01 +01:00 |
Michael Herzberg
|
3496e89460
|
Fixed error.
|
2017-06-20 08:43:43 +01:00 |
Michael Herzberg
|
aa259807e2
|
Catch exceptions due to empty crx header file.
|
2017-06-20 08:42:30 +01:00 |
Michael Herzberg
|
c47ba57c97
|
Changed handling of manifest parsing exceptions.
|
2017-06-20 08:28:50 +01:00 |
Michael Herzberg
|
d2dd2aaf81
|
Moved db path into config file.
|
2017-06-20 08:10:28 +01:00 |
Michael Herzberg
|
39d7bf0330
|
Deal with missing annotation block in reviews.
|
2017-06-20 08:03:15 +01:00 |
Michael Herzberg
|
fa82129c2b
|
Deal with a possibly missing overview.html.status file.
|
2017-06-19 21:34:42 +01:00 |
Michael Herzberg
|
c1cd41c2e1
|
Improved logging.
|
2017-06-19 18:41:29 +01:00 |
Michael Herzberg
|
282e2c4e8c
|
Worked on sqlite stuff.
|
2017-06-19 16:42:35 +01:00 |
Achim D. Brucker
|
456bf292c8
|
Merge branch 'master' into production
|
2017-06-19 05:57:29 +01:00 |
Achim D. Brucker
|
be293b1ba8
|
Fixed import.
|
2017-06-19 05:57:17 +01:00 |
Achim D. Brucker
|
f95619670c
|
Merge branch 'master' into production
|
2017-06-18 15:38:13 +01:00 |
Achim D. Brucker
|
d9195c8174
|
Max. number of concurrent downloads can now be configured via command line.
|
2017-06-18 15:36:21 +01:00 |
Achim D. Brucker
|
66eff6780d
|
Fixed passing is_new.
|
2017-06-17 18:26:04 +01:00 |
Achim D. Brucker
|
85c8f6a546
|
Pass is_new flag to sqlite update.
|
2017-06-17 18:19:44 +01:00 |
Achim D. Brucker
|
2e6323c8c5
|
Report number of extensions for which the SQL database was updated.
|
2017-06-17 18:15:08 +01:00 |
Michael Herzberg
|
7f24a9da7a
|
Split db creation into incremental part and separate full regeneration script.
|
2017-06-17 17:10:18 +01:00 |
Achim D. Brucker
|
ea71c5b6e3
|
Removed debugging code raising exceptions.
|
2017-06-17 15:38:17 +01:00 |
Achim D. Brucker
|
8fcc7ab99f
|
Fixed logging.
|
2017-06-17 00:48:34 +01:00 |
Achim D. Brucker
|
97460c498f
|
Basic support for logging of errors related to SQL import/update.
|
2017-06-17 00:43:40 +01:00 |
Achim D. Brucker
|
c4a5c5a231
|
Ignore non-extension ids in forums.conf.
|
2017-06-16 23:32:52 +01:00 |
Achim D. Brucker
|
86a608c6a1
|
Re-formatting.
|
2017-06-16 23:19:13 +01:00 |
Achim D. Brucker
|
1c8d68d495
|
Moved path utility functions into config module.
|
2017-06-16 23:09:23 +01:00 |
Achim D. Brucker
|
9f174f6785
|
Downport to python 3.5.
|
2017-06-16 22:38:48 +01:00 |
Michael Herzberg
|
6e2772711f
|
Next version of sqlite generator.
|
2017-06-16 20:40:48 +01:00 |
Michael Herzberg
|
c08124fa17
|
First version of sqlite generator.
|
2017-06-16 14:56:23 +01:00 |
Achim D. Brucker
|
ab4c0ad002
|
Fixed logging.
|
2017-06-16 12:07:51 +01:00 |
Achim D. Brucker
|
b9e5ca6f82
|
Stub for updating sqlite.
|
2017-06-16 11:06:04 +01:00 |
Achim D. Brucker
|
73baef61b2
|
Re-formatting.
|
2017-06-16 10:29:07 +01:00 |
Achim D. Brucker
|
64778c783e
|
Check etags in addition to modified-since (basic implementation).
|
2017-06-16 10:28:47 +01:00 |
Achim D. Brucker
|
763ac137b2
|
Re-introduced parallel downloads (they are not causing the If-Modified-Since problem).
|
2017-06-15 23:14:07 +01:00 |
Achim D. Brucker
|
3ff67bddc8
|
Disabled parallel download (IF-Modified Bug Hunting).
|
2017-06-14 16:56:05 +01:00 |
Achim D. Brucker
|
fd717d1516
|
Reformatting.
|
2017-05-27 20:39:00 +01:00 |
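
Several commits in this log ("Added randomized delay for MySQL.", "MySQL wait +/- 20%.") describe spreading database retries with a randomized wait. The log does not show the code itself, so the following is only a minimal sketch of that general idea in Python. The function name `jittered_delay` and its parameters are hypothetical and are not taken from the ExtensionCrawler sources.

```python
import random


def jittered_delay(base_seconds, jitter=0.2, rng=random):
    """Return base_seconds scaled by a random factor in [1 - jitter, 1 + jitter].

    Hypothetical helper: with jitter=0.2 the result varies by +/- 20%
    around base_seconds, so parallel workers do not all retry at once.
    """
    return base_seconds * rng.uniform(1.0 - jitter, 1.0 + jitter)


if __name__ == "__main__":
    # Print a few sample waits around a 10 second base delay.
    for _ in range(3):
        print(round(jittered_delay(10.0), 2))
```

A caller would typically pass the result to `time.sleep()` before retrying a failed database operation. The actual retry policy in the project may differ.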