Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/ArchiveTeam/wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
https://github.com/ArchiveTeam/wget-lua

Version 1.21.3-at.20231215.01. Do not write Archive Team as operator to warcinfo record.

c1fe6093eda544fc7a933f7646225bec1ff4bd8d authored about 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20231213.03. Write Wget build information to warcinfo record. Add Archive Team contact details to --version and --help messages.

28bec41b7f31352115077999696b3f8ea873a02e authored about 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20231213.02. Use HTTP/1.1 for the request, not HTTP/1.0.

0e7d903275ee835970c8343dcf5e66a65c9168cd authored about 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20231213.01. Write WARC-Protocol and WARC-Cipher-Suite WARC headers. Fix writing HTTP URL to WARC headers when transformed to HTTPS with HSTS. Fix setting specific single SSL/TLS protocol version instead of minimum version.

d7855b0a875a87fbe90f9757064f7c28dae38c0d authored about 1 year ago by arkiver <[email protected]>
Use Debian bookworm.

c9820674ffe7c68751f77fb690a6d3ec765d268d authored about 1 year ago by arkiver <[email protected]>
Merge pull request #21 from chosak/patch-1

Fix WARC CDX writing

01bae48b489b93efe26fee97f10f6f5b5ba4583e authored about 1 year ago by arkiver <[email protected]>
Fix WARC CDX writing

Commit fd873c1ecb96467e633f145aeaad256ca36fcd63
introduced a bug in the CDX file writing logic ...

33515e0478e4dbd62b6aec8f89c0d68b96718883 authored about 1 year ago by Andy Chosak <[email protected]>
Fix --with-ssl option in Dockerfile.

e237033799009a5b4ef594d8da02258ec770d279 authored over 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20230825.01. Build with libluajit-5.1 if available. Print Lua info. Install PCRE2 in Dockerfile.

da43582bfda92c9f5848f7b1fc15edf78d9e1b41 authored over 1 year ago by arkiver <[email protected]>
Merge pull request #19 from OrIdow6/traceback-with-lua-errors

Print traceback with Lua errors.

d7530c88e745beaff2622f2e676d20c5f82fd49e authored over 1 year ago by arkiver <[email protected]>
Print traceback with Lua errors.

1903fca838bfac84c14a23809ba6fd02fa142df7 authored over 1 year ago by OrIdow6 <[email protected]>
Version 1.21.3-at.20230623.01. Calculate WARC-Payload-Digest value over payload stripped of transfer encoding.

4fda5f2cc5188a1db15ff5b0d535d56aafc5a4de authored over 1 year ago by arkiver <[email protected]>
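
The idea in the commit above, hashing the payload only after the transfer encoding has been removed, can be illustrated with a minimal sketch. This is not Wget-AT's code: the helper names `dechunk` and `payload_digest` are hypothetical, chunked encoding is assumed as the transfer encoding, and the `sha1:` plus Base32 digest format is an assumption based on common WARC practice.

```python
import base64
import hashlib


def dechunk(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (chunk extensions and trailers are ignored)."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(body[pos:eol].split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            break
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(out)


def payload_digest(payload: bytes) -> str:
    """Return a digest string in the assumed 'sha1:<Base32>' form."""
    return "sha1:" + base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


chunked = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
digest = payload_digest(dechunk(chunked))
```

With this approach, the same content yields the same digest whether or not the server used chunked encoding, which is what makes the digest usable for deduplication.
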
Version 1.21.3-at.20230621.01. Add options --reject-subnets SUBNETS, --accept-subnets SUBNETS, --reject-reserved-subnets.

aa346a3a958692a2405b0cb35ccb2019e7dd772f authored over 1 year ago by arkiver <[email protected]>
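
A rough sketch of the kind of check that subnet filtering implies is shown here. The commit message does not state the syntax of SUBNETS, so CIDR notation is an assumption, and the function name `in_any_subnet` is hypothetical; this is not how Wget-AT implements the options.

```python
import ipaddress


def in_any_subnet(ip: str, subnets: list) -> bool:
    """Return True if ip falls inside at least one of the given CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    # Membership is False when the address and network differ in IP version.
    return any(addr in ipaddress.ip_network(subnet) for subnet in subnets)


rejected = ["10.0.0.0/8", "192.168.0.0/16"]  # example values only
blocked = in_any_subnet("10.1.2.3", rejected)
allowed = not in_any_subnet("8.8.8.8", rejected)
```
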
Fix compile warnings for luahooks_dedup_response.

bda3c51ca5f3aec3b893a726b874b4f809ebba8e authored over 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20230605.01. Add options --host-lookups, --hosts-file and --resolvconf-file when compiled with c-ares. Build Zstd 1.4.4 and c-ares 1.19.1 from source in Dockerfile and upgrade to bullseye.

765e7ad6c14a521067a779daaa938130859df1e3 authored over 1 year ago by arkiver <[email protected]>
Version 1.21.3-at.20230208.01. Throw configuration error if --with-cares is used but c-ares is not found. Build Wget-AT in Dockerfile with c-ares.

d07a7ce1b71c3c74342f6a6e44e0ed7d4a9dd310 authored almost 2 years ago by arkiver <[email protected]>
Version 1.21.3-at.20220608.02. Treat several attribute URLs as HTML.

6fdf2683499f567f09931bf4066ed8a8af99f4f9 authored over 2 years ago by arkiver <[email protected]>
Version 1.21.3-at.20220608.01. Fix following refresh meta tag URL.

bfb6153fffff2c2bca1756da287f90be2952e6cf authored over 2 years ago by arkiver <[email protected]>
Version 1.21.3-at.20220528.01.

ab8f6966a8b64ab27910009bcc57abc725f4a935 authored over 2 years ago by arkiver <[email protected]>
Handle srcset and data-srcset for img, source and span tags. Extract from various tags and attributes. Conditionally extract attributes from certain tags. Make ignore_when_downloading usable in the Lua hooks.

47f293924df65481985d117aaa8670b03dcb89f2 authored over 2 years ago by arkiver <[email protected]>
Version 1.21.3-at.20220503.02

4347ca39f29fa9d5b5dff896b3a8724e1601a1db authored over 2 years ago by arkiver <[email protected]>
Merge branch 'master' of git://git.savannah.gnu.org/wget into v1.21.3-at

* 'master' of git://git.savannah.gnu.org/wget:
* src/main.c (print_help): Add --retry-on-host-...

0aecc3808eebf7a1b45fb6d77e24a566eccee570 authored over 2 years ago by arkiver <[email protected]>
Version 1.21.3-at.20220503.01.

f36487ebbb8c038e58cc496b06029df9e14f5cdd authored over 2 years ago by arkiver <[email protected]>
Merge v1.21.3 from Wget

c9cca721013e4b664161c668ee37be010930b592 authored over 2 years ago by arkiver <[email protected]>
Next revision of the revisit dedup support. Remove the code from http.c and move it to warc.c. Remove now unused imports from http.c. Change luahooks_dedup_to_warc to retur...

b45a055c3e886faf3a72fd3fa4c58ca0d7459abb authored over 2 years ago by Jake L <[email protected]>
Version 1.20.3-at.20210410.01.

7f4f36a7bdb1ea250c891bbffd575c4975ab9728 authored over 2 years ago by arkiver <[email protected]>
Flush stderr as well

80d459777d3ad577ec2422053faeeff5369725d8 authored over 2 years ago by OrIdow6 <[email protected]>
Add initial revisit dedup support.

bd475b132026d895c838f86abb7a6fb46fc83fce authored over 2 years ago by Jake L <[email protected]>
Add proper null check to warc.c, rename luahooks_dedup_to_warc to luahooks_dedup_response.

370155251e1ffa5b8138508de8ea55cab06f00d9 authored over 2 years ago by Jake L <[email protected]>
Version 1.20.3-at.20210504.01. Fix clearing of headers given with --headers after getting first URL.

024b93d11081d5e334b71aab26d8ea85dc715871 authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20211001.01. Fix implicit conversion off_t to int.

1991671b2aa7bf17a69d32074a6fdf49a204e4ad authored over 2 years ago by arkiver <[email protected]>
Change dedup_response to expect a table with date and target_uri and, optionally, response_uuid. Properly return all of this to warc.c and properly free up data after ...

799b88069a204fb6d98374f9703886690d70ed58 authored over 2 years ago by Jake L <[email protected]>
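
To make the shape of that table concrete, here is a small sketch of a digest-keyed lookup that returns the three fields named in the commit message (date, target_uri and, optionally, response_uuid). The function name `dedup_lookup`, the in-memory `seen` store and the example values are all hypothetical; the real hook is written in Lua and its exact signature is not shown in this listing.

```python
from typing import Dict, Optional

# Maps a payload digest to information about the first capture of that payload.
seen: Dict[str, dict] = {}


def dedup_lookup(digest: str, uri: str, date: str,
                 record_id: Optional[str] = None) -> Optional[dict]:
    """Return the earlier capture for this digest, or remember this one and return None."""
    earlier = seen.get(digest)
    if earlier is not None:
        return earlier
    entry = {"date": date, "target_uri": uri}
    if record_id is not None:
        entry["response_uuid"] = record_id  # optional field
    seen[digest] = entry
    return None


first = dedup_lookup("sha1:EXAMPLE", "https://example.org/a", "2021-01-01T00:00:00Z")
second = dedup_lookup("sha1:EXAMPLE", "https://example.org/b", "2021-01-02T00:00:00Z")
```

Because the lookup is keyed on the digest rather than the URL, the second request is matched to the first even though the URLs differ.
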
Version 1.20.3-at.20210212.01. Add various checks to warc.c.

fd873c1ecb96467e633f145aeaad256ca36fcd63 authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20210212.02. More fixes for warc.c.

0187a8bf000b1ce3a60471566845bc7af71402cc authored over 2 years ago by arkiver <[email protected]>
Abort on Lua errors by default

Pass --no-abort-on-lua-error to go back to the old behavior.

0999c7ed9a20560809c706eeb3d7af0090832bbb authored over 2 years ago by OrIdow6 <[email protected]>
move build_args back into settings

e99dac621df2af09a7cd7ffe5611801fee3642d1 authored over 2 years ago by Katie Holly <[email protected]>
actually build gnutls-type wget for gnutls image

da59f4638b4bdfd4b68cb6835b238779f3d6e4ec authored over 2 years ago by Katie Holly <[email protected]>
use tlstype instead of ssltype, default to openssl

f0c4767959e02fed8ef3b2697843fe72d61747f1 authored over 2 years ago by Katie Holly <[email protected]>
Version 1.20.3-at.20201030.01. Add --warc-item-name option to specify the item name for which the WARC is created, and allow it to be set using item-name:// URLs mid-crawl.

199877e6509aaa0ea3628d3ec6dfaf0e7e4b0428 authored over 2 years ago by arkiver <[email protected]>
fixed "default: Linter: duplicate step name"

4078cfb4f9738faecb171fb51807c40feeb3b750 authored over 2 years ago by Katie Holly <[email protected]>
move build_args from settings to step object

321bbee1baf9491dbb9d90fbc14b123c244e84c8 authored over 2 years ago by Katie Holly <[email protected]>
move build arg in .drone.yml out of env

16a217cf61134b8650005098fe5e29e23ec56ad6 authored over 2 years ago by Katie Holly <[email protected]>
adding Dockerfile and .drone.yml for automated builds

84b0462ff195cad0c54a525d418fea50360aa090 authored over 2 years ago by Katie Holly <[email protected]>
Version 1.20.3-at.20200902.01. Add write_to_warc Lua hook.

0cbdd6bf08ad3ca06652a407d4024047647ceed0 authored over 2 years ago by arkiver <[email protected]>
Include the null terminator when processing the result of wget.callbacks.lookup_host

Before, wget-lua did not copy over the null terminator with the rest of the returned string. As a...

6dc93ed84f98cc10b2fdb4274fd29e11569ccd5c authored over 2 years ago by OrIdow6 <[email protected]>
Version 1.20.3-at.20200917.01. Add --content-on-redirect option.

6a65105b83f9117a6aa7b2869a9e2537ef87580a authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20200919.01. Move write_to_warc Lua hook check to make body available in hook.

2c3c12128427aa351fdf9aa5ffeaf0bdd67079fe authored over 2 years ago by arkiver <[email protected]>
Make HTTP headers customizable through get_urls Lua hook.

42755d548de1e8a536de0fd8647c1e9bf74b9f52 authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20200322.01. Add ZSTD with dictionary compression support for WARC compression.

79fbff2682b2b16d9fa04e685c3221f9e4242ab2 authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20200401.01.

0bb2e748e1dd3a6b9a88ad57696fc7c7ee7e4a4a authored over 2 years ago by arkiver <[email protected]>
Always write LE representation of dict size. Fix options check.

7ea3d502618ccdb9fcc64eeea14f036c84181cb2 authored over 2 years ago by arkiver <[email protected]>
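
Writing a size in little-endian (LE) form regardless of the host's byte order can be sketched as follows. The 4-byte unsigned width and the name `encode_dict_size` are assumptions for illustration; the commit message does not specify the field width Wget-AT uses.

```python
import struct


def encode_dict_size(size: int) -> bytes:
    """Encode size as little-endian unsigned 32-bit, independent of host byte order."""
    # "<" forces little-endian; "I" is an unsigned 32-bit integer (assumed width).
    return struct.pack("<I", size)


encoded = encode_dict_size(0x0102)
decoded = struct.unpack("<I", encoded)[0]
```
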
Version 1.20.3-at.20200804.01.

1f39c8dfc948d56156cc3ba6754f5a1035e4fe7f authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20200322.02. Configure for ZSTD.

72500a9389afc7e57397396ebdd0d704167da6c3 authored over 2 years ago by arkiver <[email protected]>
Version 1.20.3-at.20200329.01. Move to WARC/1.1. Support URL agnostic deduplication.

cb412ef6a317637b73dcbec64f68787ce75d78eb authored over 2 years ago by arkiver <[email protected]>
v1.20.3 release

9643449afb9ab65b87d20dc5f0b57e52397a9906 authored over 2 years ago by Katie Holly <[email protected]>
remove angle brackets from warc header

0a430d7fab7d46f8f0a9520f2613334a2aaef95f authored over 2 years ago by Katie Holly <[email protected]>
v1.20.3 lua test

5264d804f4fcf5e51befd40b89c27439a458637e authored over 2 years ago by Katie Holly <[email protected]>
* src/main.c (print_help): Add --retry-on-host-error to help text

aab539bb44b5d7aeb093014c21fd4f4e4e728136 authored almost 3 years ago by Tim Rühsen <[email protected]>
Fix HSTS portability by using int64_t instead of time_t.

* src/hsts.c: Use int64_t instead of time_t.
* src/http.c: Use int64_t for parsing Strict-Transp...

cb114fbbf73eb687d28b01341c8d4266ffa96c9d authored almost 3 years ago by Tim Rühsen <[email protected]>
src/http.c (time_to_rfc1123): Fix -Wformat-nonliteral

1cda2bb5d591f564db2f5bbfcb60ba31b6351ca0 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/warc.c (warc_process_cdx_line): Fix variable type to idx_t

565f566fab57a84a21671da8f67119873e6d2ce1 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/main.c (secs_to_human_time): Use snprintf instead of sprintf

c7e6e378e5e90b853edcda0f11f99843cc4f615e authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/main.c (main): Remove unused variable

59d08d32902e3f7531ea74292b00a8556a975775 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/netrc.c (test_parse_netrc): Check if HAVE_FMEMOPEN is defined

8d5cdef9a9c13fbd7d11ced7d13d41152ad420fe authored almost 3 years ago by Tim Rühsen <[email protected]>
maint: post-release administrivia

* NEWS: Add header line for next release.
* .prev-version: Record previous version.
* cfg.mk (ol...

9f93ffb44b638a7119ac76931f9ffe002b22f9d3 authored almost 3 years ago by Darshit Shah <[email protected]>
* NEWS: Update NEWS items for release

40747a11e44ced5a8ac628a41f879ced3e2ebce9 authored almost 3 years ago by Darshit Shah <[email protected]>
* .gitignore: Remove empty line at EOF

b6f3c6153edeedfeee23e378045c64d8f35d6aa2 authored almost 3 years ago by Darshit Shah <[email protected]>
* .gitignore: Update file

e1fa51206c1c4501626217e032e3e23cdd13c023 authored almost 3 years ago by Darshit Shah <[email protected]>
* src/hsts.c (hsts_read_database): Use SCNd64 for portable format flags

6d4a4e56c82b806758e1240d7746475da876df9c authored almost 3 years ago by Darshit Shah <[email protected]>
* configure.ac: Add some warning flags to ignore

14a7f68f46bd02f9ca582cc49ca4c72c8988e082 authored almost 3 years ago by Darshit Shah <[email protected]>
* cfg.mk: Remove passing syntax-checks from skip list

ccc7866fead1f71db48189e9004867e29aadb14e authored almost 3 years ago by Darshit Shah <[email protected]>
Fix issues from syntax-check

* doc/wget.texi: s/time stamp/timestamp/
* src/ftp-ls.c(clean_line): Same
(ftp_parse_vms_ls): ...

2730a00c0d690dccd46ce4951505218a88fd3e10 authored almost 3 years ago by Darshit Shah <[email protected]>
* Update Copyright years

be936bda564963d7903a6966d3f40f92c95ca7f0 authored almost 3 years ago by Darshit Shah <[email protected]>
* bootstrap: Update script

7ba0a44939c13f0e2ef3087e73b49836a7f9352c authored almost 3 years ago by Darshit Shah <[email protected]>
* gnulib: Pull forward

98c23153a2f7b8d9e5944c31bf4fb74d659dfbd8 authored almost 3 years ago by Darshit Shah <[email protected]>
Cleanup some incorrect uses of AM Conditionals

* configure.ac: Replace IRI_IS_ENABLED with WITH_IRI and
METALINK_IS_ENABLED with WITH...

3a470a90f2811832166c47edb7636b5d7652a2ab authored almost 3 years ago by Darshit Shah <[email protected]>
Replace incorrect usage of AC_LIBOBJ in configure.ac

AC_LIBOBJ is to be used for providing replacement functions for
compatibility reasons. Not for c...

f5263969fa4aa518384b44d7bb334fdb232f7a0a authored almost 3 years ago by Darshit Shah <[email protected]>
Fix case where installed gettext is newer than minimum version

* configure.ac: Use AM_GNU_GETTEXT_REQUIRE_VERSION to specify a minimum
version of gettext nee...

cc5ec2a158a7cd9e3e50ff37831f4598bff1cb6c authored almost 3 years ago by Darshit Shah <[email protected]>
* src/netrc.c (test_parse_netrc): Free netrc structure

f7ce79fd85c39128d255a15a01ee28191cd68e71 authored almost 3 years ago by Tim Rühsen <[email protected]>
* .gitlab-ci.yml: Fix path to llvm-symbolizer

d139fecbe8dd537393b20f535ddc5977bd99e9c3 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/netrc.c (test_parse_netrc): New unit test function

74a9d9e7c4864d22f93422bd5d71f7b68f1f829e authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/http.c (parse_strict_transport_security): Fix typo in string

Copyright-paperwork-exempt: Yes

446afdca21060e1be804509e687a299bfec505d6 authored almost 3 years ago by Aarni Koskela <[email protected]>
* .gitlab-ci.yml (Scan-Build): Allow failure due to two false positives

e6fa409a4d959fc710ea8e044db403c2c6248ee9 authored almost 3 years ago by Tim Rühsen <[email protected]>
* configure.ac: Use pkg-config for gpgme, libidn2 and nettle

a24e67e239ef949cc77a4c4e5a0beb703026a296 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/ftp.c: Small cleanups

c984cb316a790bf672b71d14d3b903921aacc00d authored almost 3 years ago by Tim Rühsen <[email protected]>
Print newline after dot progress bar in non-verbose mode

* src/progress.c (dot_finish): Print newline in all progress bar contexts
instead of just verbose
...

35a6317b9976f578677098901c48173e826e9f4a authored almost 3 years ago by Nik Soggia <[email protected]>
* .gitlab-ci.yml: Fix artifact path for the Scan-Build runner

9474a2c6f4ca4072e55198c5d74531fb4e45a975 authored almost 3 years ago by Tim Rühsen <[email protected]>
* src/main.c (main): Unlink output document when --unlink is given

e7a4d818fa4e721e9adc72008afc73bc1c6376d4 authored almost 3 years ago by Tim Rühsen <[email protected]>
fuzz/*.in: Update fuzzer corpora

f3545297088e29b83807334a6641fcea127b5300 authored about 3 years ago by Tim Rühsen <[email protected]>
* .gitlab-ci.yml (CoverageReports): Fix artifacts paths

67d4cb3ab68a77926683ff02ddd89c2682439b43 authored about 3 years ago by Tim Rühsen <[email protected]>
* tests/valgrind-suppressions: Fix libidn rule

d2af84fbb3f3b88ec614a504afff986b1ee90258 authored about 3 years ago by Tim Rühsen <[email protected]>
* .gitlab-ci.yml: Fix artifacts paths

bfb5bedf7da42beb0ec6c0401701aa1f67a9fef9 authored about 3 years ago by Tim Rühsen <[email protected]>
* tests/valgrind-suppressions: Extend libidn rule

8c5a620f0f5a073b3568281544b0831c0ac0ecf2 authored about 3 years ago by Tim Rühsen <[email protected]>
* src/log.c (logprintf): Check earlier for verbosity

c34c2529dc84450c9d0a6f38b8e9d3f6b0743021 authored about 3 years ago by Tim Rühsen <[email protected]>
* src/http.c (http_loop): Fix memleak

c7a37d82eefa5e3ca8043eb0102725d27d582c96 authored about 3 years ago by Tim Rühsen <[email protected]>
Switch fuzzing build from C++ to C

* Makefile.am (oss-fuzz): Build with $CC instead of $CXX.
* README.md: Remove CXX and CXXFLAGS e...

c81042295e6ef8c6cc82f9a5e590134fc268a8f9 authored about 3 years ago by Tim Rühsen <[email protected]>
* src/http.c (http_loop): Hide password when printing status with -nv

Reported-By: Per Lundberg <[email protected]>
Closes: #61492

f75fcf2985bf7ace36051f5d00a9f7c53e125a2b authored about 3 years ago by Darshit Shah <[email protected]>
* gnulib: Pull forward

22611a77baf9bfb460657c74087300a0f6afb003 authored about 3 years ago by Darshit Shah <[email protected]>
* src/hsts.c (hsts_read_database): Read time_t values as long long

e1bacd2fa5f026f9493bd74e9179b859fa9ce8f1 authored about 3 years ago by Darshit Shah <[email protected]>
* src/main.c (print_help): Add command line option for TLS 1.3

faeb4d90c2b7835f5c86279187e7f1d4a3a8a089 authored about 3 years ago by Thomas Niederberger <[email protected]>