Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/mozilla/translations

The code, training pipeline, and models that power Firefox Translations
https://github.com/mozilla/translations

deescape-special-chars.perl should be part of the clean step

XapaJIaMnu opened this issue about 3 years ago
bicleaner should be optional for some corpora

XapaJIaMnu opened this issue about 3 years ago
spm vocabulary training could fail on high res languages

XapaJIaMnu opened this issue about 3 years ago
find-corpus.py is outdated?

XapaJIaMnu opened this issue about 3 years ago
Change sharding mode in training to improve speed

XapaJIaMnu opened this issue about 3 years ago
Version of marian used for training

XapaJIaMnu opened this issue about 3 years ago
Quality improvements

eu9ene opened this pull request about 3 years ago
Add chrF evaluation metric

eu9ene opened this issue about 3 years ago
Evaluate ensemble of teachers

eu9ene opened this issue about 3 years ago
Handle soft hyphens with custom normalization tables

eu9ene opened this issue about 3 years ago
Remove punctuation normalization

eu9ene opened this issue about 3 years ago
Snakemake integration

eu9ene opened this pull request about 3 years ago
Fine tune teacher on parallel data only

eu9ene opened this issue about 3 years ago
Add dataset specific bicleaner thresholds

eu9ene opened this issue about 3 years ago
Out of memory on shuffling huge datasets

eu9ene opened this issue over 3 years ago
Minor improvements

eu9ene opened this pull request over 3 years ago
Add more languages to corpus cleaning script

eu9ene opened this issue over 3 years ago
Try other teacher hyperparameters

eu9ene opened this issue over 3 years ago
Add flores dataset importer

eu9ene opened this issue over 3 years ago
Fix Chinese compatibility

eu9ene opened this issue over 3 years ago
Add dataset specific fixes

eu9ene opened this issue over 3 years ago
Track experiments metadata

eu9ene opened this issue over 3 years ago
Bicleaner support + fixes

eu9ene opened this pull request over 3 years ago
Add ability to use custom backward model

eu9ene opened this issue over 3 years ago
Custom training/test dataset

eu9ene opened this issue over 3 years ago
Workflow manager integration

eu9ene opened this issue over 3 years ago
Replace absolute paths with relative ones

eu9ene opened this issue over 3 years ago
Add support of ensemble of teachers for translation

eu9ene opened this issue over 3 years ago
Add a script to find datasets

eu9ene opened this issue over 3 years ago
Cache datasets across experiments and language pairs

eu9ene opened this issue over 3 years ago
Improve tensorboard

eu9ene opened this issue over 3 years ago
Support any importer for evaluation

eu9ene opened this issue over 3 years ago
Add bicleaner step

eu9ene opened this issue over 3 years ago
Requirements for the python venv

kirianguiller opened this issue over 3 years ago
Initial pipeline

eu9ene opened this pull request over 3 years ago