<font size=4><b>Deep Learning with Differential Privacy</b></font>

Authors:
Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

Open Sourced By: Xin Pan (xpan@google.com, github: panyx0718)


<b>Introduction</b>

Machine learning techniques based on neural networks are achieving remarkable
results in a wide variety of domains. Often, the training of models requires
large, representative datasets, which may be crowdsourced and contain sensitive
information. The models should not expose private information in these datasets.
Addressing this goal, we develop new algorithmic techniques for learning and a
refined analysis of privacy costs within the framework of differential privacy.
Our implementation and experiments demonstrate that we can train deep neural
networks with non-convex objectives, under a modest privacy budget, and at a
manageable cost in software complexity, training efficiency, and model quality.

paper: https://arxiv.org/abs/1607.00133
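The core idea of the paper, per-example gradient clipping followed by Gaussian noise, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the repository's actual API: the function name `dp_sgd_step` and its parameters (`clip_norm`, `noise_multiplier`) are made up here for clarity.

```python
# A minimal sketch of one DP-SGD step: clip each example's gradient,
# sum, add Gaussian noise, then take an averaged gradient step.
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, lr, params):
    """per_example_grads: array of shape (batch_size, num_params)."""
    # 1. Clip each example's gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # 2. Sum the clipped gradients and add Gaussian noise whose
    #    standard deviation is noise_multiplier * clip_norm.
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=clipped.shape[1])
    noisy_sum = clipped.sum(axis=0) + noise

    # 3. Average over the batch and take a gradient step.
    return params - lr * noisy_sum / per_example_grads.shape[0]
```

Clipping bounds each example's influence on the update (the sensitivity), which is what lets the added Gaussian noise translate into a differential privacy guarantee.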


<b>Requirements:</b>

1. TensorFlow 0.10.0 (master branch)

   Note: r0.11 may exhibit some problems.

2. Bazel 0.3.1 (<em>Optional</em>)

3. Download MNIST data (tfrecord format) <br>
   ```shell
   cd models/research/slim
   DATA_DIR=/tmp/mnist
   mkdir -p "${DATA_DIR}"
   python download_and_convert_data.py --dataset_name=mnist --dataset_dir="${DATA_DIR}"
   ```

<b>How to run:</b>

```shell
# Clone the code into a directory named differential_privacy.
# Create an empty WORKSPACE file next to it (required by Bazel).

# List the code (optional).
$ ls -R differential_privacy/
differential_privacy/:
dp_sgd  __init__.py  privacy_accountant  README.md

differential_privacy/dp_sgd:
dp_mnist  dp_optimizer  per_example_gradients  README.md

differential_privacy/dp_sgd/dp_mnist:
BUILD  dp_mnist.py

differential_privacy/dp_sgd/dp_optimizer:
BUILD  dp_optimizer.py  dp_pca.py  sanitizer.py  utils.py

differential_privacy/dp_sgd/per_example_gradients:
BUILD  per_example_gradients.py

differential_privacy/privacy_accountant:
python  tf

differential_privacy/privacy_accountant/python:
BUILD  gaussian_moments.py

differential_privacy/privacy_accountant/tf:
accountant.py  accountant_test.py  BUILD

# Move the data into place and list it (optional).
$ mkdir -p data
$ mv /tmp/mnist/mnist_train.tfrecord data
$ mv /tmp/mnist/mnist_test.tfrecord data
$ ls -R data/

./data:
mnist_test.tfrecord  mnist_train.tfrecord

# Build the code (optional).
$ bazel build -c opt differential_privacy/...

# Run the MNIST differentially private training.
# 1. With Bazel:
$ bazel-bin/differential_privacy/dp_sgd/dp_mnist/dp_mnist \
    --training_data_path=data/mnist_train.tfrecord \
    --eval_data_path=data/mnist_test.tfrecord \
    --save_path=/tmp/mnist_dir

# 2. Or without Bazel (by default the data is read from /tmp/mnist):
$ python dp_sgd/dp_mnist/dp_mnist.py

...
step: 1
step: 2
...
step: 9
spent privacy: eps 0.1250 delta 0.72709
spent privacy: eps 0.2500 delta 0.24708
spent privacy: eps 0.5000 delta 0.0029139
spent privacy: eps 1.0000 delta 6.494e-10
spent privacy: eps 2.0000 delta 8.2242e-24
spent privacy: eps 4.0000 delta 1.319e-51
spent privacy: eps 8.0000 delta 3.3927e-107
train_accuracy: 0.53
eval_accuracy: 0.53
...

$ ls /tmp/mnist_dir/
checkpoint  ckpt  ckpt.meta  results-0.json
```
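The "spent privacy" lines in the output above come from the moments accountant (implemented under `privacy_accountant/`), which tracks log moments of the privacy loss and converts them into (eps, delta) pairs via the bound delta <= min over lambda of exp(alpha(lambda) - lambda * eps). The sketch below illustrates only that final conversion step; the log-moment formula used here, alpha(lambda) = T * lambda * (lambda + 1) / (2 * sigma^2), is the textbook value for T compositions of the Gaussian mechanism with sensitivity 1 and noise sigma, and ignores the subsampling amplification the real accountant handles. The function name and the values of `sigma` and `steps` are illustrative assumptions, so the printed deltas will not match the run above.

```python
# Convert tracked log moments into a delta for each target eps,
# using the moments-accountant tail bound.
import math

def delta_for_eps(log_moments, eps):
    """log_moments: list of (lambda, alpha(lambda)) pairs."""
    return min(math.exp(alpha - lam * eps) for lam, alpha in log_moments)

sigma, steps = 4.0, 9  # illustrative noise level and step count
log_moments = [(lam, steps * lam * (lam + 1) / (2 * sigma ** 2))
               for lam in range(1, 33)]
for eps in (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0):
    delta = min(1.0, delta_for_eps(log_moments, eps))
    print(f"eps {eps:.4f} delta {delta:.4g}")
```

As in the training output, a single run yields a whole curve of (eps, delta) trade-offs: larger eps targets come with far smaller deltas.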