Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/FedML-AI/FedML
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
https://github.com/FedML-AI/FedML
Fix: Related to Serializable Issue
alaydshah opened this pull request 11 months ago
Fix: Clean up containers and release gpus only if job_type is not serve or deploy
alaydshah opened this pull request 11 months ago
Dev/v0.7.0
fedml-alex opened this pull request 11 months ago
Update monitoring metrics sending frequency to 5s
alaydshah opened this pull request 11 months ago
Fix GPU Release bugs and training exit routine
alaydshah opened this pull request 11 months ago
limit the paho-mqtt to less than 2.0.0 version.
fedml-alex opened this pull request 11 months ago
Fix Login
alaydshah opened this pull request 11 months ago
Dev/v0.7.0
fedml-alex opened this pull request 11 months ago
Merge Dev to Test
alaydshah opened this pull request 11 months ago
GPU Reporting Bug Fix
alaydshah opened this pull request 11 months ago
Raise exception if gpu_match fail
alaydshah opened this pull request 11 months ago
[`train.llm`] fix falcon loading issue, update README
fedml-zijianhu opened this pull request 11 months ago
[train.llm] update README, requirements
ghost opened this pull request 12 months ago
Add more logs, fail loudly
alaydshah opened this pull request 12 months ago
Merge Dev To Test
alaydshah opened this pull request 12 months ago
Dev/v0.7.0
alaydshah opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Support rolling update
Raphael-Jin opened this pull request 12 months ago
Add and enable logs and decorate problematic functions with debug
alaydshah opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] fixed the edge id error.
fedml-alex opened this pull request 12 months ago
Quickstart Guide
ArturNiederfahrenhorst opened this issue 12 months ago
[Deploy] Fix port indication reading in yaml file
Raphael-Jin opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
[Deploy] Fix port indication in yaml file
Raphael-Jin opened this pull request 12 months ago
Workflow Fix + Nits
alaydshah opened this pull request 12 months ago
[CoreEngine] add the model deploy job, model inference job to the wor…
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] change the log event to make it independent.
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] upload the inference logs with a virtual edge id.
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] update the endpoint test.
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
Update setup.py
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] add the customized workflow and example.
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] upload the inferences logs.
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] add the environment variables in the deploy containers.
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] set on-premise host and port in the train container.
fedml-alex opened this pull request 12 months ago
[Deploy] Deploy with image pull policy
Raphael-Jin opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] add on-premise port.
fedml-alex opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
[CoreEngine] add on-premise deployment files.
fedml-alex opened this pull request 12 months ago
Workflow v1
alaydshah opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
[Deploy] Fix updating model card.
Raphael-Jin opened this pull request 12 months ago
[Deploy] Support rolling update
Raphael-Jin opened this pull request 12 months ago
[Deploy] Refactor dummy example; Add test file.
Raphael-Jin opened this pull request 12 months ago
Add local constants
alaydshah opened this pull request 12 months ago
Test/v0.7.0
fedml-alex opened this pull request 12 months ago
Dev/v0.7.0
fedml-alex opened this pull request 12 months ago
Merge pull request #1862 from FedML-AI/alexleung/dev_branch
fedml-alex opened this pull request 12 months ago
Alexleung/dev branch
fedml-alex opened this pull request 12 months ago
Notify platform if job failed at server runner due to unavailability of resources
alaydshah opened this pull request 12 months ago
Fix Docker Cleanup Bug upon stopping Job through UI or CLI
alaydshah opened this pull request almost 1 year ago
Fix Docker Mapping Issue
alaydshah opened this pull request almost 1 year ago
Dev/v0.7.0
fedml-alex opened this pull request about 1 year ago
Confirm cluster only if user check is True
alaydshah opened this pull request about 1 year ago
Dev/v0.7.0
fedml-alex opened this pull request about 1 year ago
[Deploy] Support Internal / External Port Configuration
Raphael-Jin opened this pull request about 1 year ago
The compatibility issues of Nvidia Jetson
rekkles2 opened this issue about 1 year ago
Dimitris adding back CNN MNIST in Model Hub
fedml-dimitris opened this pull request about 1 year ago
Fix Storage Metadata Bug
alaydshah opened this pull request about 1 year ago
[train.llm] example: update job config, requirements
fedml-zijianhu opened this pull request about 1 year ago
[Deploy] Fix deploy record update logic
Raphael-Jin opened this pull request about 1 year ago
Dev/v0.7.0
fedml-alex opened this pull request about 1 year ago
Pass in docker args through `job.yaml`
alaydshah opened this pull request about 1 year ago
Docker Bootstrap Fix 2
alaydshah opened this pull request about 1 year ago
sync master to test
fedml-alex opened this pull request about 1 year ago
Simplify customized image
Raphael-Jin opened this pull request about 1 year ago
Fix bootstrapping in docker
alaydshah opened this pull request about 1 year ago
Containerize Launch v1
alaydshah opened this pull request about 1 year ago
trained model path in single process simulation examples
Sina-Najafi opened this issue about 1 year ago
[Deploy] update HF template
fedml-zijianhu opened this pull request about 1 year ago
[Deploy] update HF template
fedml-zijianhu opened this pull request about 1 year ago