Ecosyste.ms: OpenCollective

An open API service for software projects hosted on Open Collective.

github.com/FedML-AI/FedML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI job on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
https://github.com/FedML-AI/FedML

Alexleung/dev v070

fedml-alex opened this pull request 9 months ago
Dev/v0.7.0

fedml-alex opened this pull request 9 months ago
[Deploy] Set setting to "DEPLOYED" when no action needed.

Raphael-Jin opened this pull request 9 months ago
[Deploy] Refine Autoscaling Algorithm

Raphael-Jin opened this pull request 9 months ago
Refine metrics collection

Raphael-Jin opened this pull request 9 months ago
Refine metrics collection

Raphael-Jin opened this pull request 9 months ago
Update remote_storage.py

zhouLion opened this pull request 9 months ago
Update Launch Job Docker Image name

alaydshah opened this pull request 9 months ago
Hotfix to make detect status logging less noisy

alaydshah opened this pull request 9 months ago
[Deploy] Refine Database Readability; Format Code.

Raphael-Jin opened this pull request 9 months ago
Logging Timestamp with nanosecond granularity

alaydshah opened this pull request 9 months ago
Rearranging checking conditions of the autoscaler's prediction operations

fedml-dimitris opened this pull request 9 months ago
[Deploy] Support fail rollback for scale out.

Raphael-Jin opened this pull request 9 months ago
Fix logging + minor bugs

alaydshah opened this pull request 9 months ago
Changing the help option displayed for adding user metadata to storag…

bhargav191098 opened this pull request 9 months ago
Autoscaler hotfix

fedml-dimitris opened this pull request 9 months ago
[Deploy] Fix Proxy inference; Simplify Logs

Raphael-Jin opened this pull request 9 months ago
[Deploy] typo: modify end point to endpoint

ASCE1885 opened this pull request 9 months ago
[Deploy] Support autoscaling

Raphael-Jin opened this pull request 9 months ago
Refactored Logging Test

alaydshah opened this pull request 9 months ago
MQTT Refactoring

alaydshah opened this pull request 9 months ago
[Deploy] Fix endless rollback issue when multiple replica.

Raphael-Jin opened this pull request 9 months ago
Sync dev to alex-branch-latest-swap

fedml-alex opened this pull request 9 months ago
Alexleung/dev branch online

fedml-alex opened this pull request 9 months ago
Alexleung/dev branch online

fedml-alex opened this pull request 9 months ago
sync dev to my branch

fedml-alex opened this pull request 9 months ago
[DevOps] update devops files.

fedml-alex opened this pull request 9 months ago
Pr update fail rollback

Raphael-Jin opened this pull request 9 months ago
[Deploy] Hotfix crosstalk issue

Raphael-Jin opened this pull request 9 months ago
Test/v0.7.0

fedml-alex opened this pull request 9 months ago
Dev/v0.7.0

fedml-alex opened this pull request 9 months ago
[Deploy] Support update fail-rollback.

Raphael-Jin opened this pull request 9 months ago
Logging timestamp with millisecond granularity

alaydshah opened this pull request 9 months ago
Unified Log Prefix for Logs Inside Inference Container

Raphael-Jin opened this pull request 9 months ago
Raphael/refactor inf runtime logging

Raphael-Jin opened this pull request 9 months ago
Minor fixes + Test

alaydshah opened this pull request 10 months ago
Rotating Upload Fix Initialization

alaydshah opened this pull request 10 months ago
Alexleung/dev branch latest sync

fedml-alex opened this pull request 10 months ago
Alexleung/dev branch latest sync

fedml-alex opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
[WIP] [Deploy] Refactor Logging System for deploy

Raphael-Jin opened this pull request 10 months ago
typo "salve" instead of "slave" in identifiers

bene-ges opened this issue 10 months ago
Sync the workflow to dev

fedml-alex opened this pull request 10 months ago
Update Launch Driver Example

alaydshah opened this pull request 10 months ago
Refactor Logging + Fix Rotating Log Upload Bug

alaydshah opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
Scheduler Logging Nits

alaydshah opened this pull request 10 months ago
Enhance Workflow

alaydshah opened this pull request 10 months ago
Fix Race Conditions

Raphael-Jin opened this pull request 10 months ago
[CoreEngine] Use the original url to download packages.

fedml-alex opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
Fail loudly and terminate if version upgrade fails

alaydshah opened this pull request 10 months ago
[CoreEngine] replace the direct function call with posting launching …

fedml-alex opened this pull request 10 months ago
Test/v0.7.0

fedml-alex opened this pull request 10 months ago
[CoreEngine] close the ota.

fedml-alex opened this pull request 10 months ago
Test/v0.7.0

fedml-alex opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
[Deploy] Add replica_no column if missing.

Raphael-Jin opened this pull request 10 months ago
[Deploy] Add replica_no column if missing.

Raphael-Jin opened this pull request 10 months ago
[DevOps] update devops files.

fedml-alex opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
Refactor Replica Logic

Raphael-Jin opened this pull request 10 months ago
Support Replica Logic

Raphael-Jin opened this pull request 10 months ago
[CoreEngine] update the subscribed topics for reporting device info t…

fedml-alex opened this pull request 10 months ago
[CoreEngine] update the subscribed topics for reporting device info t…

fedml-alex opened this pull request 10 months ago
[CoreEngine] subscribe topics for reporting device info to mlops.

fedml-alex opened this pull request 10 months ago
Dev/v0.7.0

fedml-alex opened this pull request 10 months ago
Rookie question

Salterwater23 opened this issue 10 months ago
Enhance Build Packaging

alaydshah opened this pull request 10 months ago
[unitedllm, train.llm] migrate & integrate UnitedLLM

fedml-zijianhu opened this pull request 10 months ago
sync `fedml.train.llm`: bugfix falcon issue, update flash attention integrations

fedml-zijianhu opened this pull request 10 months ago
[CoreEngine] 1. fix the issue which the gpu id is not released into t…

fedml-alex opened this pull request 10 months ago
avoid remove the endpoint when cannot match GPU resource

fedml-alex opened this pull request 11 months ago
Dev/v0.7.0

fedml-alex opened this pull request 11 months ago
add the workflow with connected inputs and outputs.

fedml-alex opened this pull request 11 months ago
[CoreEngine] add the workflow with connected inputs and outputs.

fedml-alex opened this pull request 11 months ago
[Deploy] Download template from s3; Using home sign.

Raphael-Jin opened this pull request 11 months ago
Make run cleanup idempotent

alaydshah opened this pull request 11 months ago
Fix occupy_gpu_ids race condition

alaydshah opened this pull request 11 months ago
Fix

alaydshah opened this pull request 11 months ago
Alaydshah/fix/race condition

alaydshah opened this pull request 11 months ago
Dev/v0.7.0

chaoyanghe opened this pull request 11 months ago
Fix job processor logs

alaydshah opened this pull request 11 months ago
Share trace id between api calls

alaydshah opened this pull request 11 months ago
log_file_dir arg not work

flylzj opened this issue 11 months ago
Update trim unavailable gpu id logic

alaydshah opened this pull request 11 months ago
Remove sys and process utils logs

alaydshah opened this pull request 11 months ago
Bump test to 27a1

alaydshah opened this pull request 11 months ago
Correct dev version

alaydshah opened this pull request 11 months ago
Bump Dev

alaydshah opened this pull request 11 months ago
Merge Test to Prod

alaydshah opened this pull request 11 months ago
Merge Dev To Test

alaydshah opened this pull request 11 months ago
Fix: Related to Serializable Issue

alaydshah opened this pull request 11 months ago
Make setting log levels more straightforward

alaydshah opened this pull request 11 months ago