71 changes: 63 additions & 8 deletions examples/vertical-fl/README.md
---
tags: [vertical, tabular, advanced]
tags: [vertical, tabular, advanced, fds]
dataset: [Titanic]
framework: [torch, pandas, scikit-learn]
---

# Vertical Federated Learning with Flower

This example will showcase how you can perform Vertical Federated Learning using
Flower. We'll be using the [Titanic dataset](https://huggingface.co/datasets/julien-c/titanic-survival)
to train simple regression models for binary classification. We will go into
more detail below, but the main idea of Vertical Federated Learning is that
each client holds different feature sets of the same dataset and that the
```shell
git clone --depth=1 https://github.com/adap/flower.git _tmp \
        && mv _tmp/examples/vertical-fl . \
        && rm -rf _tmp \
        && cd vertical-fl
```

This will create a new directory called `vertical-fl` with the following structure:

```shell
vertical-fl
├── vertical_fl
│   ├── __init__.py
│   ├── client_app.py   # Defines your ClientApp
│   ├── server_app.py   # Defines your ServerApp
│   ├── strategy.py     # Defines your Strategy
│   └── task.py         # Defines your model, training and data loading
├── pyproject.toml      # Project metadata like dependencies and configs
├── data/train.csv
└── README.md
```

### Install dependencies and project

Install the dependencies defined in `pyproject.toml` as well as the `vertical_fl` package.

```bash
pip install -e .
```

## Vertical data partitioning

In this example we use the [VerticalSizePartitioner](https://flower.ai/docs/datasets/ref-api/flwr_datasets.partitioner.VerticalSizePartitioner.html#flwr_datasets.partitioner.VerticalSizePartitioner) from [Flower Datasets](https://flower.ai/docs/datasets/) to split the dataset vertically into three partitions (one per client), with the target column (i.e., whether the passenger survived the Titanic sinking) available only to the `ServerApp`.

```python
from flwr_datasets import FederatedDataset
from flwr_datasets.partitioner import VerticalSizePartitioner

partitioner = VerticalSizePartitioner(
    partition_sizes=[2, 3, 2],  # three partitions with 2, 3, and 2 features
    active_party_columns="Survived",  # the target column
    # An additional partition containing only the target column is created last
    active_party_columns_mode="create_as_last",
)

fds = FederatedDataset(
    dataset="julien-c/titanic-survival",
    partitioners={"train": partitioner},
)

# Load all partitions
num_partitions = fds.partitioners["train"].num_partitions
partitions = [fds.load_partition(i) for i in range(num_partitions)]

for partition in partitions:
print(partition.column_names)

# ['Age', 'Sex'] <----------------------------------- ClientApp #0
# ['Fare', 'Siblings/Spouses Aboard', 'Name'] <------ ClientApp #1
# ['Parents/Children Aboard', 'Pclass'] <------------ ClientApp #2
# ['Survived'] <------------------------------------- ServerApp
```

You can control the number of partitions, and how many features each one holds, by modifying `feature-splits` (defaults to `2,3,2`) in the `[tool.flwr.app.config]` section of `pyproject.toml`.
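For reference, such a setting would sit in `pyproject.toml` roughly as sketched below. The surrounding keys and the exact value format used by this example may differ; the round count matches the 250-round default mentioned in the next section:

```toml
[tool.flwr.app.config]
num-server-rounds = 250
feature-splits = "2,3,2"
```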

## Run the project

You can run your Flower project in both _simulation_ and _deployment_ mode without making changes to the code. If you are starting with Flower, we recommend using the _simulation_ mode as it requires fewer components to be launched manually. By default, `flwr run` will make use of the Simulation Engine.
> [!NOTE]
> Check the [Simulation Engine documentation](https://flower.ai/docs/framework/how-to-run-simulations.html) to learn more about Flower simulations and how to optimize them.

By default, the example runs for 250 rounds using three clients. Launch it with the default settings:

```bash
flwr run .
```

The expected last lines of the log should look like:

```shell
...
INFO : --- ServerApp Round 250 / 250 ---
INFO : Requesting embeddings from 3 nodes...
INFO : Received 3/3 results
INFO : Round 249, Loss: 0.3892, Accuracy: 80.83%
INFO : Sending gradients to 3 nodes...
INFO :
INFO : === Final Results ===
INFO : Round 0 -> Loss: 0.7235 | Accuracy: 56.03%
INFO : Round 25 -> Loss: 0.6482 | Accuracy: 63.25%
INFO : Round 50 -> Loss: 0.6141 | Accuracy: 65.61%
INFO : Round 75 -> Loss: 0.5654 | Accuracy: 69.22%
INFO : Round 100 -> Loss: 0.5161 | Accuracy: 72.60%
INFO : Round 125 -> Loss: 0.4967 | Accuracy: 73.51%
INFO : Round 150 -> Loss: 0.4562 | Accuracy: 75.31%
INFO : Round 175 -> Loss: 0.4392 | Accuracy: 77.56%
INFO : Round 200 -> Loss: 0.4222 | Accuracy: 79.14%
INFO : Round 225 -> Loss: 0.4043 | Accuracy: 78.92%
INFO : Round 249 -> Loss: 0.3892 | Accuracy: 81.83%
```
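The per-round exchange visible in the logs above (embeddings requested from the clients, gradients sent back) can be sketched outside of Flower's APIs. The snippet below uses NumPy with made-up random data purely to illustrate the split-learning flow; the actual example implements this with PyTorch models inside the `ClientApp`, `ServerApp`, and `Strategy`:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [2, 3, 2]  # number of features held by each of the three clients
emb_dim = 4        # embedding size each client sends to the server
n = 8              # rows in the mini-batch (the same passengers on every client)

# Each client holds a different slice of features for the SAME rows...
client_x = [rng.normal(size=(n, d)) for d in sizes]
# ...while only the server holds the label (Survived).
y = rng.integers(0, 2, size=(n, 1)).astype(float)

# Step 1: each client computes embeddings with its local model (here: linear).
client_w = [rng.normal(scale=0.1, size=(d, emb_dim)) for d in sizes]
embeddings = [x @ w for x, w in zip(client_x, client_w)]

# Step 2: the server concatenates the embeddings and applies its head + sigmoid.
concat = np.concatenate(embeddings, axis=1)  # shape (n, 3 * emb_dim)
server_w = rng.normal(scale=0.1, size=(3 * emb_dim, 1))
p = 1.0 / (1.0 + np.exp(-(concat @ server_w)))

# Step 3: binary cross-entropy on the labels only the server can see.
loss = float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Step 4: gradient w.r.t. the concatenated embeddings, split per client and
# sent back so each client can backpropagate through its local model.
grad_concat = ((p - y) @ server_w.T) / n
client_grads = np.split(grad_concat, [emb_dim, 2 * emb_dim], axis=1)
```

Each client would then continue the backward pass locally, so raw features and labels never leave their owners; only embeddings and their gradients cross the wire.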

You can also override some of the settings for your `ClientApp` and `ServerApp` defined in `pyproject.toml`. For example:

```bash
flwr run . --run-config "num-server-rounds=5 learning-rate=0.05"
flwr run . --run-config "num-server-rounds=500 learning-rate=0.05"
```

### Run with the Deployment Engine

Follow this [how-to guide](https://flower.ai/docs/framework/how-to-run-flower-with-deployment-engine.html) to run the same app in this example but with Flower's Deployment Engine. After that, you might be interested in setting up [secure TLS-enabled communications](https://flower.ai/docs/framework/how-to-enable-tls-connections.html) and [SuperNode authentication](https://flower.ai/docs/framework/how-to-authenticate-supernodes.html) in your federation.

If you are already familiar with how the Deployment Engine works, you may want to learn how to run it using Docker. Check out the [Flower with Docker](https://flower.ai/docs/framework/docker/index.html) documentation.