Michael Herzberg
|
34e62d4e13
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-07-13 14:43:50 +01:00 |
Michael Herzberg
|
a901fa30bb
|
Adjusted merge scripts for fts tables.
|
2017-07-13 14:43:37 +01:00 |
Achim D. Brucker
|
02ce0cfd2c
|
Print info: sqlite3 binary used.
|
2017-07-12 20:36:29 +01:00 |
Michael Herzberg
|
26bddde328
|
Removed primary keys from fts tables as that had no effect.
|
2017-07-12 18:30:37 +01:00 |
Michael Herzberg
|
2cb868b3d0
|
Removed removal of review row ids as we now use author ids as primary keys.
|
2017-07-12 18:13:04 +01:00 |
Michael Herzberg
|
6a6a12c88a
|
Added parsing of support to sqlite.
|
2017-07-12 18:11:31 +01:00 |
Michael Herzberg
|
16a44cf499
|
Added parsing of review replies to sqlite.
|
2017-07-12 17:56:40 +01:00 |
Michael Herzberg
|
0ed8c15a2d
|
Made review a fts table.
|
2017-07-12 17:04:56 +01:00 |
Michael Herzberg
|
51bdcb4f16
|
Also download replies for support forum.
|
2017-07-12 16:57:16 +01:00 |
Michael Herzberg
|
11b0ccee4a
|
Added download of review replies.
|
2017-07-12 16:10:47 +01:00 |
Michael Herzberg
|
d6ae9d28b8
|
Fixed bug that led to downloading the first review page twice instead of the first and second review page.
|
2017-07-12 14:09:01 +01:00 |
Michael Herzberg
|
60dd98e60e
|
Fixed parsing of developer from overview page.
|
2017-07-12 13:54:44 +01:00 |
Michael Herzberg
|
885f580f2e
|
Fixed set -u.
|
2017-07-11 11:15:43 +01:00 |
Michael Herzberg
|
6d03e8da71
|
Refactored sge scripts.
|
2017-07-11 11:11:32 +01:00 |
Michael Herzberg
|
8ca76f373e
|
Made extension crawler repo path mandatory for sge scripts.
|
2017-07-10 15:25:56 +01:00 |
Michael Herzberg
|
00ad40e3a0
|
Added more sge scripts.
|
2017-07-10 14:42:03 +01:00 |
Michael Herzberg
|
3b3bbfbb6d
|
Improved grepper script.
|
2017-07-10 14:09:45 +01:00 |
Michael Herzberg
|
dac1f3e998
|
Improved grepper script.
|
2017-07-10 13:45:25 +01:00 |
Michael Herzberg
|
5f360e40e4
|
Improved grepper script.
|
2017-07-10 13:17:48 +01:00 |
Michael Herzberg
|
e60265975f
|
Renamed etag to crx_etag.
|
2017-07-10 12:46:41 +01:00 |
Michael Herzberg
|
fe6b563f31
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-07-10 12:29:45 +01:00 |
Michael Herzberg
|
af68a95139
|
Fixed sed string.
|
2017-07-10 12:28:55 +01:00 |
Achim D. Brucker
|
c4e13daae5
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-07-09 18:55:24 +01:00 |
Michael Herzberg
|
77023e001e
|
Do not treat js file decoding strictly.
|
2017-07-07 22:17:56 +01:00 |
Michael Herzberg
|
38c88d7461
|
Added parsing of content_script_urls.
|
2017-07-07 20:09:22 +01:00 |
Michael Herzberg
|
fbc0a7c87c
|
Added crx size and jsloc.
|
2017-07-07 19:47:14 +01:00 |
Michael Herzberg
|
62dc61826a
|
Added parsing of itemcategory.
|
2017-07-07 19:29:51 +01:00 |
Achim D. Brucker
|
9c4b1225f0
|
Added parameter for specifying output directory.
|
2017-07-07 09:18:53 +01:00 |
Achim D. Brucker
|
449d3e1d6c
|
Initial commit: utility to extract crx file from an extension archive.
|
2017-07-06 09:03:59 +01:00 |
Michael Herzberg
|
c8a51c7a3c
|
Added js grepper.
|
2017-07-05 20:55:41 +01:00 |
Michael Herzberg
|
dbe8a26a6b
|
Fixed download parsing.
|
2017-07-05 16:20:52 +01:00 |
Michael Herzberg
|
cface0128c
|
Changed download number extraction to also work with Google Docs extensions (and potentially others).
|
2017-07-05 16:08:15 +01:00 |
Michael Herzberg
|
7d0a68a808
|
Merge branch 'master' of logicalhacking.com:BrowserSecurity/ExtensionCrawler
|
2017-07-05 14:23:55 +01:00 |
Michael Herzberg
|
4c01b95f69
|
Added ratingValue and ratingCount to db.
|
2017-07-05 14:23:45 +01:00 |
Achim D. Brucker
|
d5d0a44b69
|
Reformatting.
|
2017-07-05 08:21:40 +01:00 |
Achim D. Brucker
|
600ec933f4
|
Introduced optional argument to last_crx - return latest crx that is not newer than the passed date/time.
|
2017-07-05 08:21:00 +01:00 |
Achim D. Brucker
|
30c0b92979
|
Ignore empty crx files in calculating last crx file date.
|
2017-07-04 09:30:33 +01:00 |
Achim D. Brucker
|
939b29f55a
|
Use getmembers instead of getnames in last_crx().
|
2017-07-03 07:04:03 +01:00 |
Achim D. Brucker
|
817cf11ca7
|
Update script via cron.
|
2017-07-02 17:35:51 +01:00 |
Achim D. Brucker
|
c7a06dcf8e
|
Remove old compressed sqlite file prior to compressing updated sqlite file.
|
2017-07-02 17:31:48 +01:00 |
Achim D. Brucker
|
cc80684630
|
Disabled compressing of log files due to race condition problems on network mounted drive.
|
2017-07-02 17:30:09 +01:00 |
Achim D. Brucker
|
4aad117c12
|
Simple script for syncing a local archive (excerpt of the total archive) for development or debugging.
|
2017-07-02 16:11:20 +01:00 |
Achim D. Brucker
|
b73ed18b67
|
Ignore errors while compressing log files.
|
2017-07-01 20:09:00 +01:00 |
Achim D. Brucker
|
797328b855
|
Overwrite compressed files.
|
2017-07-01 19:57:29 +01:00 |
Achim D. Brucker
|
5702a43368
|
Use parallel bzip2 (pbzip2) instead of regular bzip2.
|
2017-06-30 20:54:08 +01:00 |
Achim D. Brucker
|
062f2dea26
|
Report size before/after compressing of database files.
|
2017-06-30 20:02:49 +01:00 |
Achim D. Brucker
|
931ae4f7ed
|
Normalized script name.
|
2017-06-30 18:39:26 +01:00 |
Achim D. Brucker
|
70f69f652f
|
Compress aa-ac.sqlite.
|
2017-06-29 22:46:59 +01:00 |
Achim D. Brucker
|
ef04c71314
|
Delete compressed database before re-creating it.
|
2017-06-29 20:38:31 +01:00 |
Achim D. Brucker
|
77c1fae66d
|
Compress log files.
|
2017-06-29 20:22:13 +01:00 |