Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/webrecorder/warcio

Streaming WARC/ARC library for fast web archive IO
https://github.com/webrecorder/warcio

bump version to 1.7.5

ikreymer opened this pull request 29 days ago
Handle deprecation of naive datetime functions like utcnow()

tw4l opened this pull request 2 months ago
feat: try py 3.13, plus typos

wumpus opened this pull request 3 months ago
Stream Recompressor

white-gecko opened this pull request 4 months ago
Add docs and https://warcio.readthedocs.io

Florents-Tselai opened this pull request 4 months ago
py3.12 and setuptools

wumpus opened this issue 4 months ago
feat: test old ubuntu version

wumpus opened this pull request 4 months ago
doc: document how to use brotli; test brotli

wumpus opened this pull request 4 months ago
feat: add darwin and windows CI

wumpus opened this pull request 4 months ago
feat: try darwin and windows [skip actions]

wumpus opened this pull request 4 months ago
chore: finish py3.12

wumpus opened this pull request 5 months ago
Test python 3.12

white-gecko opened this pull request 5 months ago
Remove superfluous ci step

white-gecko opened this pull request 5 months ago
Run pytest directly. setup.py test was removed in setuptools 72.

white-gecko opened this pull request 5 months ago
Update codecov/codecov-action from v1 to v4

white-gecko opened this pull request 5 months ago
Adjust classifiers to the actually tested build matrix

white-gecko opened this pull request 5 months ago
Migrate from setup.py to poetry/pyproject.toml

white-gecko opened this pull request 5 months ago
Add dependency for setuptools, which is required by cli get_version command

white-gecko opened this pull request 5 months ago
Bump urllib3 from 1.25.11 to 1.26.19

dependabot[bot] opened this pull request 7 months ago
Bump urllib3 from 1.25.11 to 1.26.18

dependabot[bot] opened this pull request 8 months ago
Add test to HTTPS proxies

tw4l opened this issue 8 months ago
Migrate to GitHub Actions CI and resolve dependency issues

tw4l opened this pull request 8 months ago
Fix typos discovered by codespell

cclauss opened this pull request about 1 year ago
Delete .travis.yml because Travis CI is no longer free

cclauss opened this pull request about 1 year ago
warcio accepts a bare LF everywhere a CRLF is required by the spec

acidus99 opened this issue over 1 year ago
doc bugs linking to source code files

wumpus opened this issue over 1 year ago
Deimos/add https type

Deimos4Flare opened this pull request over 1 year ago
Add support for the 1995 NCSA 1.5.1 webserver

omgoo opened this pull request almost 2 years ago
wget warc status code?

JohnMaguire opened this issue almost 2 years ago
webrecorder fails to open IA warc file on MacOS X Ventura 13.2.1

theopathic opened this issue almost 2 years ago
warcio cannot write wet files

mraslann opened this issue over 2 years ago
Patching WARCs using warcio

wsdookadr opened this issue over 2 years ago
GitHub Action to lint Python code

cclauss opened this pull request over 2 years ago
Trying to write to closed file when using `requests.Session`

maxyousif15 opened this issue over 2 years ago
Empty WARC files when deploying warcio on Airflow

maxyousif15 opened this issue over 2 years ago
fix utf-8 encoding

tomeksporczyk opened this pull request over 2 years ago
warcio.exceptions.ArchiveLoadFailed: Unknown archive format

KyloPrem opened this issue over 2 years ago
Issues with encoding of http-answers

Weyaaron opened this issue almost 3 years ago
Warcio does not support replay of sites hosted on NCSA 1.5

omgoo opened this issue almost 3 years ago
Record not followed by newline (conversion error)

mw0000 opened this issue almost 3 years ago
`capture_http` fails in tests, but works otherwise

maxyousif15 opened this issue about 3 years ago
warcio check does not raise error when GZip records are truncated

anjackson opened this issue about 3 years ago
extract entire warc file?

catharsis71 opened this issue about 3 years ago
CLI Indexer: silently ignore brokenpipe signal

sebastian-nagel opened this pull request about 3 years ago
Add offline mode to skip tests that require an internet connection

Luflosi opened this pull request over 3 years ago
Failsafe if it fails to % - encode headers

manueldeprada opened this pull request over 3 years ago
Offline tests

Luflosi opened this issue over 3 years ago
get_test_file missing from the PyPI release

Apteryks opened this issue over 3 years ago
Not compatible with WARC-files/records written by ArchiveSpark

parismic opened this issue over 3 years ago
quoted-string WARC header values are not parsed correctly

JustAnotherArchivist opened this issue over 3 years ago
warcio does not preserve HTTP header whitespace

JustAnotherArchivist opened this issue over 3 years ago
warcio mangles non-ASCII HTTP headers

JustAnotherArchivist opened this issue over 3 years ago
Invalid WARCs are silently accepted instead of raising an error

JustAnotherArchivist opened this issue about 4 years ago
Add version tags to the repository

JustAnotherArchivist opened this issue about 4 years ago
Header methods do not work well with repeated headers

JustAnotherArchivist opened this issue about 4 years ago
check_digests is under-documented, confusing everyone

wumpus opened this issue about 4 years ago
Block digest verification fails on some copied record

dlazesz opened this issue about 4 years ago
Suspicion of incorrect handling of content length in WARC records

ThomasA opened this issue about 4 years ago
Migrate CI

wumpus opened this issue about 4 years ago
add digest_algorithm option in writer

ThomasLiennard opened this pull request about 4 years ago
Support ZStd Compression for WARCs

ikreymer opened this issue over 4 years ago
Plans for adding type annotations?

dnaaun opened this issue over 4 years ago
capture_http/indexer tweaks

ikreymer opened this pull request over 4 years ago
Enable writing block digests for warcinfo records

JustAnotherArchivist opened this pull request over 4 years ago
Fix capture_http() with http and https proxies

ikreymer opened this pull request over 4 years ago
Fix ordering of arguments in README

baali opened this pull request almost 5 years ago
Confusing documentation around request filter

baali opened this issue almost 5 years ago
Develop->Master for 1.7.2

ikreymer opened this pull request almost 5 years ago
%-encoding fix: if header value does not contain a multi-value separa…

ikreymer opened this pull request almost 5 years ago
Fix issues with read/write same record

ikreymer opened this pull request almost 5 years ago
ci: bound jinja2<3.0.0 for py27 fix, possible fix for #103

ikreymer opened this pull request almost 5 years ago
Error reading WAT files

MohammedElsayyed opened this issue almost 5 years ago
Using warcio with scrapy - what does the payload need to look like?

Chris8080 opened this issue about 5 years ago
Use scrapy together with warcio

CuloArdido opened this issue about 5 years ago
Add feature to skip past corrupted records in a warc.gz file

lukeplausin opened this pull request about 5 years ago
include record offsets in `warcio check` output

nlevitt opened this pull request about 5 years ago
fix payload digest for chunked response in test warc

nlevitt opened this pull request about 5 years ago
writer: use 1.1 revisit profile when writing WARC/1.1 revisits, fixes #94

ikreymer opened this pull request about 5 years ago
UnicodeEncodeError when using 'warcio recompress'

zuny26 opened this issue over 5 years ago
Incorrect WARC-Profile for revisit records when using WARC/1.1

JustAnotherArchivist opened this issue over 5 years ago
WARC-Payload-Digest should only be written for HTTP records

JustAnotherArchivist opened this issue over 5 years ago
Undocumented and non-standardised default Content-Type application/warc-record

JustAnotherArchivist opened this issue over 5 years ago
Threadpool executor creates zero byte warc files

naumansiddiqui4 opened this issue over 5 years ago
UTF-8 characters in Link header parameters raises exception

staylor-ds opened this issue over 5 years ago
No block digest written for warcinfo records

JustAnotherArchivist opened this issue over 5 years ago