Federated learning, which trains machine learning models across distributed datasets without centralising the raw data, has matured from research concept to production deployment, with PySyft and TensorFlow Federated (TFF) as the two leading open-source frameworks. PySyft (now OpenMined's Python library) enables privacy-preserving ML with secure aggregation and differential privacy on arbitrary Python ML code; TFF provides a functional federated programming model for TensorFlow and JAX models with Google's production-tested aggregation protocols. This guide covers when to use each framework and the implementation patterns for enterprise federated learning.
When to Choose Federated Learning Over Centralised Training
PySyft vs TensorFlow Federated
| Framework | Language | Privacy Mechanisms | Best For |
|---|---|---|---|
| PySyft (OpenMined) | Python (any ML framework) | Secure aggregation, DP, SMPC, HE integration | Cross-silo FL; healthcare/finance; arbitrary Python code |
| TensorFlow Federated | Python (TF/JAX) | DP-SGD (TF Privacy), secure aggregation | Cross-device FL; mobile; Google-stack teams |
| Flower (flwr) | Python (any framework) | Framework-agnostic aggregation strategies | Research; mixed framework environments; PyTorch FL |
Install: pip install syft. Each data owner (hospital, bank, company) runs a PySyft Datasite server. The model owner sends a study request (a Python script defining the training code) to each Datasite. The Datasite operator reviews, approves, and executes the code against their local data. Only model updates (gradients or weights) are returned, never raw data. PySyft's approval workflow is designed for cross-organisational FL in regulated industries. The model owner then aggregates the returned updates using FedAvg, a weighted average in which each Datasite's update counts in proportion to its number of local training examples. Deploy each Datasite on the data owner's own infrastructure for complete data sovereignty.
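The aggregation step on the model owner's side can be sketched in plain Python. This is a minimal illustration of FedAvg, not PySyft's API: the `fed_avg` function, the `Weights` type, and the two hospital updates are hypothetical names and values chosen for the example, and real model weights would normally be tensors rather than flat lists.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fed_avg(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local example count."""
    if not updates:
        raise ValueError("need at least one client update")
    total_examples = sum(n for _, n in updates)
    if total_examples <= 0:
        raise ValueError("total example count must be positive")

    first_weights, _ = updates[0]
    aggregated: Weights = {
        name: [0.0] * len(values) for name, values in first_weights.items()
    }
    for weights, n_examples in updates:
        share = n_examples / total_examples  # this client's share of the data
        for name, values in weights.items():
            for i, value in enumerate(values):
                aggregated[name][i] += share * value
    return aggregated


# Hypothetical updates returned by two Datasites: (weights, local example count).
hospital_a = ({"dense": [1.0, 2.0], "bias": [0.0]}, 100)
hospital_b = ({"dense": [3.0, 4.0], "bias": [1.0]}, 300)

global_weights = fed_avg([hospital_a, hospital_b])
print(global_weights)
```

Because hospital_b holds three times as many examples, its weights contribute 75% of the aggregate. Only the weight dictionaries and example counts cross the organisational boundary in this sketch; the training data itself never appears.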
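Install: pip install tensorflow-federated. Define the federated dataset as one tf.data.Dataset per client: train_data = [tf.data.Dataset.from_tensor_slices(client_data) for client_data in clients]. Build the built-in FedAvg process: iterative_process = tff.learning.algorithms.build_weighted_fed_avg(model_fn, client_optimizer_fn, server_optimizer_fn). Simulate a training round: state = iterative_process.initialize(); result = iterative_process.next(state, train_data[:10]); state, metrics = result.state, result.metrics (recent TFF releases return an output object with state and metrics fields rather than a plain tuple). For production deployment, TFF provides a production runtime that communicates with actual client devices via gRPC. To add DP, pass tff.learning.dp_aggregator(noise_multiplier, clients_per_round) as the model_aggregator argument; because it returns an unweighted aggregator, pair it with tff.learning.algorithms.build_unweighted_fed_avg rather than the weighted builder.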
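To make the DP step less abstract, the following sketch shows the general idea behind a differentially private aggregator: clip each client update to a fixed L2 norm, sum the clipped updates, add Gaussian noise scaled to the clip norm, and divide by the number of clients. It is a conceptual illustration in plain Python under stated assumptions, not TFF's implementation; `clip_by_l2_norm`, `dp_mean`, and the example values are hypothetical, and no privacy accounting is performed here.

```python
import math
import random
from typing import List


def clip_by_l2_norm(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    scale = clip_norm / norm
    return [v * scale for v in update]


def dp_mean(
    updates: List[List[float]],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Unweighted mean of clipped updates with Gaussian noise added to the sum."""
    if not updates:
        raise ValueError("need at least one client update")
    clipped = [clip_by_l2_norm(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    # Noise standard deviation is tied to the per-client sensitivity (clip_norm).
    stddev = noise_multiplier * clip_norm
    noisy_sum = [
        sum(u[i] for u in clipped) + rng.gauss(0.0, stddev) for i in range(dim)
    ]
    return [s / len(clipped) for s in noisy_sum]


# Hypothetical client updates; a seeded generator keeps the run repeatable.
client_updates = [[3.0, 4.0], [0.1, -0.2], [0.0, 0.5]]
rng = random.Random(0)
noisy_average = dp_mean(client_updates, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
print(noisy_average)
```

The mean is deliberately unweighted: weighting clients by example count would let a single large client dominate the sum, which is why a DP aggregator is normally paired with an unweighted averaging process. A larger noise_multiplier gives stronger privacy at the cost of noisier updates.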