---
layout: default
title: K Means using PyTorch
---

A PyTorch implementation of k-means clustering that can run on the GPU.

![Alt Text](https://media.giphy.com/media/WsYIwIHHXUcuiR8BeS/giphy.gif)

# Getting Started
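```
import torch
import numpy as np
from kmeans_pytorch import kmeans

# data: 1000 random 2-D points to split into 3 clusters
data_size, dims, num_clusters = 1000, 2, 3
x = np.random.randn(data_size, dims) / 6
x = torch.from_numpy(x)

# use the first GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# kmeans: returns a cluster index per sample and the cluster centers
cluster_ids_x, cluster_centers = kmeans(
    X=x, num_clusters=num_clusters, distance='euclidean', device=device
)
```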

See [`example.ipynb`](https://github.com/subhadarship/kmeans_pytorch/blob/master/example.ipynb) for a more elaborate example.
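
For readers who want to see what a k-means run does to the tensors, here is a minimal sketch of Lloyd's algorithm in plain PyTorch. It is an illustration only, not the code inside `kmeans_pytorch`; the function name `lloyd_kmeans` and its `num_iters` and `seed` parameters are assumptions made for this example.

```python
import torch


def lloyd_kmeans(x, num_clusters, num_iters=20, seed=0):
    """Illustrative Lloyd's algorithm; not the kmeans_pytorch implementation."""
    gen = torch.Generator().manual_seed(seed)
    # Start from distinct, randomly chosen samples as the initial centers.
    init_idx = torch.randperm(x.shape[0], generator=gen)[:num_clusters]
    centers = x[init_idx.to(x.device)].clone()
    for _ in range(num_iters):
        # Assignment step: nearest center by Euclidean distance, shape (n,).
        ids = torch.cdist(x, centers).argmin(dim=1)
        # Update step: move each center to the mean of its members.
        for k in range(num_clusters):
            members = x[ids == k]
            if members.shape[0] > 0:  # an empty cluster keeps its old center
                centers[k] = members.mean(dim=0)
    # Final assignment against the updated centers.
    ids = torch.cdist(x, centers).argmin(dim=1)
    return ids, centers


device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
x = torch.randn(1000, 2, device=device) / 6
ids, centers = lloyd_kmeans(x, num_clusters=3)
```

Both steps are batched tensor operations, which is why moving `x` to a GPU can pay off for large inputs. The sketch runs a fixed number of iterations instead of checking for convergence, to keep it short.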

# Requirements
* [PyTorch](http://pytorch.org/) version >= 1.0.0
* Python version >= 3.6

# Installation

Install with `pip`:
```
pip install kmeans-pytorch
```

**Installing from source**

To install from source and develop locally:
```
git clone https://github.com/subhadarship/kmeans_pytorch
cd kmeans_pytorch
pip install --editable .
```

# CPU vs GPU
See [`cpu_vs_gpu.ipynb`](https://github.com/subhadarship/kmeans_pytorch/blob/master/cpu_vs_gpu.ipynb) for a comparison between CPU and GPU.

# Notes
- Useful when clustering a large number of samples
- Uses the GPU for faster matrix computations
- Supports Euclidean and cosine distances (for now)
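
To make the two distance options concrete, the sketch below computes pairwise distances between samples and cluster centers in plain PyTorch. The helper names `pairwise_euclidean` and `pairwise_cosine` are hypothetical, and the exact formulas used inside the package may differ (for example, squared versus unsquared Euclidean distance); this shows only the general idea.

```python
import torch
import torch.nn.functional as F


def pairwise_euclidean(x, centers):
    # (n, d) samples and (k, d) centers -> (n, k) squared Euclidean distances.
    diff = x.unsqueeze(1) - centers.unsqueeze(0)  # (n, k, d) via broadcasting
    return (diff ** 2).sum(dim=-1)


def pairwise_cosine(x, centers):
    # Cosine distance = 1 - cosine similarity, shape (n, k).
    x_unit = F.normalize(x, dim=1)
    c_unit = F.normalize(centers, dim=1)
    return 1.0 - x_unit @ c_unit.t()


x = torch.tensor([[1.0, 0.0], [0.0, 2.0]])
centers = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

d_euc = pairwise_euclidean(x, centers)  # [[0., 2.], [5., 1.]]
d_cos = pairwise_cosine(x, centers)     # [[0., 1.], [1., 0.]]
nearest = d_euc.argmin(dim=1)           # index of the closest center per sample
```

Cosine distance ignores vector length, so the sample `[0, 2]` is at distance 0 from the center `[0, 1]` even though their squared Euclidean distance is 1.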

# Credits
- This implementation closely follows the style of [overshiki/kmeans_pytorch](https://github.com/overshiki/kmeans_pytorch)
- Documentation is done using the awesome theme [jekyllbook](https://github.com/ebetica/jekyllbook)