Achim D. Brucker | bad8334df1 | Report parallel downloads instead of total downloads as fourth column. | 2018-04-27 16:29:32 +01:00
Achim D. Brucker | fd4ed697a7 | Added default value for ext_id in const_log_format() to ensure backwards compatibility. | 2018-04-22 22:50:27 +01:00
Michael Herzberg | 9eb164bb81 | Fixed refactor bug. | 2018-04-22 21:47:30 +01:00
Michael Herzberg | 49ea3bb496 | Make sure semaphore is released if an exception occurs during HTTP request. | 2018-04-22 13:59:15 +01:00
Michael Herzberg | 756dcb3ed1 | Increased wait time again... | 2018-04-21 21:33:36 +01:00
Michael Herzberg | 1dab51d3f5 | Reduced bot detection timeout. | 2018-04-21 20:50:08 +01:00
Michael Herzberg | 5b0f49b35a | Deleted annoying Creating DB Connection message. | 2018-04-21 20:35:23 +01:00
Michael Herzberg | 13e4ee050c | Reset mysqlclient version. | 2018-04-21 20:17:18 +01:00
Michael Herzberg | 738d7e9b4f | Adjusted monitor script for new log line. | 2018-04-21 20:13:08 +01:00
Michael Herzberg | d8d49b1b80 | Moved ext_id into logger formatter to make logger output more uniform. | 2018-04-21 19:59:02 +01:00
Michael Herzberg | dd011aaad1 | Removed -P option. | 2018-04-21 19:28:47 +01:00
Michael Herzberg | ecb00f6009 | Merge branch 'master' into mixed_forums | 2018-04-21 19:19:07 +01:00
Michael Herzberg | a789fe505f | Fixed style errors and warnings. | 2018-04-21 19:00:07 +01:00
Michael Herzberg | ac3c1c7f20 | Removed plain multiprocessing option. | 2018-04-21 17:25:22 +01:00
Michael Herzberg | 0613ac1ac1 | Removed explicitly calling the garbage collector. | 2018-04-21 16:52:58 +01:00
Michael Herzberg | 2715e95665 | Only try to add review and support pages if HTTP return code is 200. | 2018-04-21 16:50:33 +01:00
Michael Herzberg | dbeba9e9bf | Use a lock to mix forum downloads into the parallel mode. | 2018-04-21 13:59:33 +01:00
Michael Herzberg | aee916a629 | Moved setting of forkserver further outwards... | 2018-04-15 16:26:26 +01:00
Michael Herzberg | ff78f8e7d8 | Fixed missing parameter. | 2018-04-12 23:25:31 +01:00
Michael Herzberg | a758134c97 | Re-added mimetype from mimetypes. TODO: add mysql columns | 2018-04-11 16:52:22 +01:00
Michael Herzberg | 87b2847c6e | Make ProcessPool and pystuck the default (for now). | 2018-04-11 15:39:23 +01:00
Michael Herzberg | cd09e2509d | Removed retry of worker exceptions; instead, properly log them similarly to tar and sql exceptions. | 2018-04-11 15:38:32 +01:00
Michael Herzberg | 22dc8f8263 | Added --pystuck option to start pystuck servers for all processes. | 2018-04-11 15:15:52 +01:00
Michael Herzberg | 46494ec18b | Re-setup logging in new processes. | 2018-04-10 18:19:12 +01:00
Michael Herzberg | 410fa3cf1c | Moved setting of forkserver to prevent multiple invocations. | 2018-04-10 17:24:10 +01:00
Michael Herzberg | 12bdc1b00f | Don't crash if something is wrong with the etag file. | 2018-04-10 16:32:12 +01:00
Michael Herzberg | 385003771a | Set chunksize, maxtasksperchild, and max_tasks to 100. | 2018-04-10 16:23:22 +01:00
Michael Herzberg | bbe575d07b | Pebble: start processing results right away. | 2018-04-10 16:15:33 +01:00
Michael Herzberg | 6bee81b711 | Use forkserver. | 2018-04-10 16:13:31 +01:00
Michael Herzberg | 778736e2d3 | Fixed logging of if-modified-since. | 2018-04-10 10:55:03 +01:00
Michael Herzberg | f677258f83 | Added use of garbage collector. | 2018-04-10 10:51:33 +01:00
Michael Herzberg | d27106d7a9 | Added creation of separate .etag files outside the .tar file. | 2018-04-09 19:42:41 +01:00
Michael Herzberg | 50b598993f | Bugfix: actually download forums on sequential run. | 2018-04-09 18:38:51 +01:00
Michael Herzberg | f4c0ff56ff | Use magic for mimetypes and don't attempt text-based analyses on binary resources. | 2018-04-09 14:25:47 +01:00
Michael Herzberg | fcfa58fb3d | Wheel needs to be installed before ExtensionCrawler. | 2018-04-09 00:14:07 +01:00
Achim D. Brucker | 0c70b2e20b | Increase number of parallel downloads. | 2018-04-08 22:45:56 +01:00
Michael Herzberg | 3d136daae3 | Various small bug fixes. | 2018-04-08 17:44:59 +01:00
Michael Herzberg | faa2214af4 | Timeout must be an integer. | 2018-04-08 13:10:26 +01:00
Achim D. Brucker | 33898a4cf3 | Updated help text. | 2018-04-08 10:10:30 +01:00
Achim D. Brucker | e1ef0758f7 | Made the choice of Pool vs. ProcessPool a configuration option. | 2018-04-08 10:06:26 +01:00
Achim D. Brucker | 70b64616e1 | Ensure the use of /usr/bin/mail. | 2018-04-08 09:59:03 +01:00
Achim D. Brucker | 7f71a40ff4 | Configured number of parallel processes. | 2018-04-07 21:14:36 +01:00
Achim D. Brucker | 66023b6b72 | Reverted test of ThreadPools. | 2018-04-07 21:13:32 +01:00
Achim D. Brucker | a75380b0c5 | Merge branch 'production' of logicalhacking.com:BrowserSecurity/ExtensionCrawler into production | 2018-04-07 19:49:03 +01:00
Achim D. Brucker | 987236958e | Testing ThreadPools. | 2018-04-07 19:48:45 +01:00
Achim D. Brucker | c3d8de9b81 | Testing ThreadPools. | 2018-04-07 19:37:55 +01:00
Achim D. Brucker | a3c60c0ae8 | Ensure that mail recipient is defined. | 2018-04-07 17:54:15 +01:00
Achim D. Brucker | a7f0b26ead | Log memory usage. | 2018-04-07 16:26:00 +01:00
Achim D. Brucker | 2fc154d643 | Use UTC-based time/dates for logging. | 2018-04-07 15:54:39 +01:00
Achim D. Brucker | 91a76091e3 | Use UTC-based time/dates for logging. | 2018-04-07 15:42:29 +01:00