# Android

## Build on Android using Termux

[Termux](https://termux.dev/en/) is an Android terminal emulator and Linux environment app (no root required). As of writing, Termux is available experimentally in the Google Play Store; otherwise, it may be obtained directly from the project repo or on F-Droid.

With Termux, you can install and run `llama.cpp` as if the environment were Linux. Once in the Termux shell:

```
$ apt update && apt upgrade -y
$ apt install git cmake
```
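
Before moving on to the build, it can help to confirm that the tools installed above are actually on `PATH`. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, not something provided by Termux or `llama.cpp`:

```shell
# Print the name of every listed tool that is not found on PATH.
# missing_tools is a hypothetical helper, not part of Termux or llama.cpp.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools git cmake)"
if [ -n "$missing" ]; then
  echo "Still missing: $missing"
else
  echo "git and cmake are available"
fi
```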

Then, follow the [build instructions](https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md), specifically for CMake.

Once the binaries are built, download your model of choice (e.g., from Hugging Face). It's recommended to place it in the `~/` directory for best performance:

```
$ curl -L {model-url} -o ~/{model}.gguf
```
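
A download that follows redirects can silently save an error page instead of the model. GGUF files begin with the four ASCII bytes `GGUF`, so a quick sanity check is possible. This is only a sketch: `is_gguf` and the `MODEL_PATH` variable are hypothetical names, and the check says nothing about whether the rest of the file is intact.

```shell
# Succeed only if the file starts with the 4-byte GGUF magic ("GGUF").
# is_gguf is a hypothetical helper name.
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Assumption: the downloaded file's path is supplied via MODEL_PATH.
model="${MODEL_PATH:-}"
if [ -n "$model" ]; then
  if is_gguf "$model"; then
    echo "$model looks like a GGUF file"
  else
    echo "$model does not start with the GGUF magic" >&2
  fi
fi
```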

Then, if you are not already in the repo directory, `cd` into `llama.cpp` and:

```
$ ./build/bin/llama-simple -m ~/{model}.gguf -c {context-size} -p "{your-prompt}"
```

Here, we show `llama-simple`, but any of the executables under `examples` should work, in theory. Be sure to set `context-size` to a reasonable number (say, 4096) to start with; otherwise, memory could spike and kill your terminal.
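
Since memory is the limiting factor here, it may be worth checking how much is free before raising the context size. The sketch below reads `MemAvailable` from `/proc/meminfo`, assuming that file is readable in your environment; `mem_available_mib` is a hypothetical helper name, and how much memory a given context size needs depends on the model, so no threshold is suggested.

```shell
# Print MemAvailable in MiB from a meminfo-style file (default: /proc/meminfo).
# mem_available_mib is a hypothetical helper name.
mem_available_mib() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
  echo "Available memory: $(mem_available_mib) MiB"
fi
```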

To see what it might look like visually, here's an old demo of an interactive session running on a Pixel 5 phone:

https://user-images.githubusercontent.com/271616/225014776-1d567049-ad71-4ef2-b050-55b0b3b9274c.mp4

## Cross-compile using Android NDK
It's possible to build `llama.cpp` for Android on your host system via CMake and the Android NDK. If you are interested in this path, ensure you already have an environment prepared to cross-compile programs for Android (i.e., install the Android SDK). Note that, unlike desktop environments, the Android environment ships with a limited set of native libraries, so only those libraries are available to CMake when building with the Android NDK (see: https://developer.android.com/ndk/guides/stable_apis).
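
The configure command in this section expects `ANDROID_NDK` to point at the NDK root. A small check like the following can catch a missing or mistyped path early. It is a sketch only, and `ndk_toolchain_file` is a hypothetical helper name:

```shell
# Print the path to the NDK's CMake toolchain file, or fail with a message.
# Assumes ANDROID_NDK points at the NDK root.
# ndk_toolchain_file is a hypothetical helper name.
ndk_toolchain_file() {
  toolchain="${ANDROID_NDK:-}/build/cmake/android.toolchain.cmake"
  if [ -f "$toolchain" ]; then
    printf '%s\n' "$toolchain"
  else
    echo "toolchain file not found: $toolchain" >&2
    return 1
  fi
}

ndk_toolchain_file || echo "Set ANDROID_NDK to your NDK root before configuring."
```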

Once you're ready and have cloned `llama.cpp`, invoke the following in the project directory:

```
$ cmake \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-28 \
  -DCMAKE_C_FLAGS="-march=armv8.7a" \
  -DCMAKE_CXX_FLAGS="-march=armv8.7a" \
  -DGGML_OPENMP=OFF \
  -DGGML_LLAMAFILE=OFF \
  -B build-android
```

Notes:
  - While later versions of the Android NDK ship with OpenMP, it must still be installed by CMake as a dependency, which is not supported at this time.
  - `llamafile` does not appear to support Android devices (see: https://github.com/Mozilla-Ocho/llamafile/issues/325)

The above command should configure `llama.cpp` with the most performant options for modern devices. Even if your device's CPU does not implement `armv8.7a`, `llama.cpp` includes runtime checks for the CPU features it can use.
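
The `-DANDROID_ABI` value in the configure command above has to match the target device. With a device attached, `adb shell getprop ro.product.cpu.abi` reports its primary ABI. The sketch below validates an ABI name before it is passed to CMake; `is_known_abi` is a hypothetical helper, and its list covers only the common NDK ABI names, so it may not be exhaustive. Note that the `-march` flags above are ARM-specific and would presumably need to be dropped or changed for a non-ARM ABI.

```shell
# Succeed if the argument is one of the common NDK ABI names.
# is_known_abi is a hypothetical helper; the list may not be exhaustive.
is_known_abi() {
  case "$1" in
    arm64-v8a|armeabi-v7a|x86|x86_64) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: the desired ABI is supplied via an ANDROID_ABI shell variable.
abi="${ANDROID_ABI:-arm64-v8a}"
if is_known_abi "$abi"; then
  echo "Configure with -DANDROID_ABI=$abi"
else
  echo "Unrecognized ABI: $abi" >&2
fi
```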

Feel free to adjust the Android ABI for your target. Once the project is configured:

```
$ cmake --build build-android --config Release -j{n}
$ cmake --install build-android --prefix {install-dir} --config Release
```
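
For the `{n}` in `-j{n}`, the number of CPUs on the host is a common starting point. One way to obtain it, sketched here with a hypothetical `build_jobs` helper that falls back to 1 when `getconf` cannot report a value:

```shell
# Print a job count for -j: the number of online CPUs, or 1 as a fallback.
# build_jobs is a hypothetical helper name.
build_jobs() {
  n="$(getconf _NPROCESSORS_ONLN 2>/dev/null || true)"
  case "$n" in
    ''|*[!0-9]*) n=1 ;;
  esac
  printf '%s\n' "$n"
}

# Show the resulting build command rather than running it.
echo "cmake --build build-android --config Release -j$(build_jobs)"
```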

After installing, go ahead and download the model of your choice to your host system. Then:

```
$ adb shell "mkdir -p /data/local/tmp/llama.cpp"
$ adb push {install-dir} /data/local/tmp/llama.cpp/
$ adb push {model}.gguf /data/local/tmp/llama.cpp/
$ adb shell
```

In the `adb shell`:

```
$ cd /data/local/tmp/llama.cpp
$ LD_LIBRARY_PATH=lib ./bin/llama-simple -m {model}.gguf -c {context-size} -p "{your-prompt}"
```

That's it!

Be aware that Android will not find the library path `lib` on its own, so we must specify `LD_LIBRARY_PATH` in order to run the installed executables. Android does support `RPATH` in later API levels, so this could change in the future. Refer to the previous section for information about `context-size` (very important!) and running other `examples`.
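
If you run the installed executables often, the `LD_LIBRARY_PATH` prefix can be wrapped in a small shell function. This is a sketch under the assumption that the install root contains the `bin` and `lib` directories used above; `llama_run` is a hypothetical name, not something shipped with `llama.cpp`.

```shell
# Run an executable from <root>/bin with <root>/lib on the library search path.
# llama_run is a hypothetical wrapper; it assumes the bin/ and lib/ layout used above.
llama_run() {
  root="$1"
  exe="$2"
  shift 2
  # Keep any existing LD_LIBRARY_PATH entries after the install's lib directory.
  LD_LIBRARY_PATH="$root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" "$root/bin/$exe" "$@"
}

# Example (placeholders as in the commands above):
#   llama_run /data/local/tmp/llama.cpp llama-simple -m {model}.gguf -c {context-size} -p "{your-prompt}"
```

Using an absolute root also means the function works regardless of the current directory.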