Latest version: v0.9.2

1. Prerequisites

You should already have:
  • An Android project created in Android Studio. You may create an empty project with the wizard. The LEAP Android SDK is Kotlin-first, so we recommend using it from Kotlin only.
  • The LEAP Android SDK needs Kotlin Android plugin v2.2.0 or above and Android Gradle Plugin v8.12.0 or above to build. Declare them in build.gradle.kts as
    plugins {
      id("com.android.application") version "8.13.2" apply false
      id("com.android.library") version "8.13.2" apply false
      id("org.jetbrains.kotlin.android") version "2.3.0" apply false
    }
    
  • A working Android device that supports the arm64-v8a ABI, with developer mode enabled. We recommend at least 3 GB of RAM to run the models.
  • The minimum supported SDK level is API 31. Declare it in build.gradle.kts as
    android {
      defaultConfig {
        minSdk = 31
        targetSdk = 36
      }
    }
    
The SDK may crash when loading model bundles on emulators, so a physical Android device is recommended.
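The device requirements above can also be checked at runtime before a model is loaded. The following is a minimal sketch, not part of the LEAP SDK: `meetsLeapRequirements` is a hypothetical helper, and its constants simply mirror the ABI and API level stated in this section. On a device you would pass `Build.SUPPORTED_ABIS` and `Build.VERSION.SDK_INT` from `android.os.Build`; the function itself is plain Kotlin so it can be unit tested off-device.

```kotlin
// Hypothetical helper for illustration; not part of the LEAP SDK.
// The constants mirror the requirements listed in this guide.
const val REQUIRED_ABI = "arm64-v8a"
const val MIN_API_LEVEL = 31

// Returns true when the device reports the required ABI and API level.
fun meetsLeapRequirements(supportedAbis: Array<String>, sdkInt: Int): Boolean =
    REQUIRED_ABI in supportedAbis && sdkInt >= MIN_API_LEVEL

fun main() {
    // On Android: meetsLeapRequirements(Build.SUPPORTED_ABIS, Build.VERSION.SDK_INT)
    println(meetsLeapRequirements(arrayOf("arm64-v8a", "armeabi-v7a"), 34)) // true
    println(meetsLeapRequirements(arrayOf("x86_64"), 34))                   // false
}
```

A check like this lets the app show a clear message on unsupported hardware instead of failing during model loading.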

2. Import the LeapSDK

Add the following dependency to $PROJECT_ROOT/app/build.gradle.kts:
dependencies {
  implementation("ai.liquid.leap:leap-sdk:0.9.2")
}
Then perform a project sync in Android Studio to fetch the LeapSDK artifacts.

3. Getting and Loading Models

The SDK loads models from GGUF manifests. This is the recommended path for all new projects because it offers better inference performance and better default generation parameters.
Legacy Executorch bundle support remains available for existing projects; see the bundle instructions later in this section.

Loading from GGUF Manifest

The LEAP Edge SDK can download LEAP models in GGUF format directly. Given the model name and quantization method (both listed in the LEAP Model Library), the SDK automatically downloads the necessary GGUF files along with the generation parameters for optimal performance. The LeapDownloader.loadModel suspend function loads a model and returns a model runner instance for invoking the model. Loading is a heavy I/O operation, so the function takes some time to finish, but it is safe to call on the main thread. Because it is a suspend function, call it from a coroutine scope.
lifecycleScope.launch {
  try {
    // Model files are saved under the app's private files directory.
    val baseDir = File(context.filesDir, "model_files").absolutePath
    val modelDownloader = LeapDownloader(config = LeapDownloaderConfig(saveDir = baseDir))
    // modelRunner is a property declared outside this block, so it outlives the try scope.
    modelRunner = modelDownloader.loadModel(
        modelSlug = "LFM2-1.2B",
        quantizationSlug = "Q5_K_M"
    )
  }
  catch (e: LeapModelLoadingException) {
    Log.e(TAG, "Failed to load the model. Error message: ${e.message}")
  }
}
The SDK will automatically download the required GGUF files to the device’s cache and load the model with the appropriate generation parameters specified in the manifest.

For the legacy Executorch path, browse the Leap Model Library to find and download a model bundle that matches your needs.

Download and transfer bundle

Push the bundle file to the device using adb push. Assuming the downloaded model file is located at ~/Downloads/model.bundle, run the following commands:
adb shell mkdir -p /data/local/tmp/leap
adb push ~/Downloads/model.bundle /data/local/tmp/leap/model.bundle

Loading from local bundle file

The LeapClient.loadModel suspend function loads a model bundle file and returns a model runner instance for invoking the model. Loading is a heavy I/O operation, so the function takes some time to finish, but it is safe to call on the main thread. Because it is a suspend function, call it from a coroutine scope.
lifecycleScope.launch {
  try {
    modelRunner = LeapClient.loadModel("/data/local/tmp/leap/model.bundle")
  }
  catch (e: LeapModelLoadingException) {
    Log.e(TAG, "Failed to load the model. Error message: ${e.message}")
  }
}
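A loading failure is easier to diagnose if the bundle path is checked first, for example when the adb push step was skipped. The following is a minimal sketch using only `java.io.File`; `bundleProblem` is a hypothetical helper and not part of the LEAP SDK, and this guide does not say which of these conditions the SDK checks itself.

```kotlin
import java.io.File

// Hypothetical pre-flight check for illustration; not part of the LEAP SDK.
// Returns null when the path looks loadable, otherwise a short reason.
fun bundleProblem(path: String): String? {
    val file = File(path)
    return when {
        !file.exists() -> "No file at $path (was adb push run?)"
        !file.isFile -> "$path is not a regular file"
        !file.canRead() -> "$path is not readable by this process"
        file.length() == 0L -> "$path is empty"
        else -> null
    }
}

fun main() {
    // Path used by the adb push command in this guide.
    val path = "/data/local/tmp/leap/model.bundle"
    println(bundleProblem(path) ?: "Bundle looks loadable: $path")
}
```

In an app, the result could be logged before calling LeapClient.loadModel, so a missing file is reported separately from a genuine loading error.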

4. Generate content with the model

To generate content, a conversation object should be created from the model runner:
val conversation = modelRunner.createConversation()
Given the user's input text, call the Conversation.generateResponse function to start generation. It returns a Kotlin asynchronous flow of MessageResponse, which can be processed with Kotlin flow operators:
val input = "Here is a user message!"
val generationJob = lifecycleScope.launch {
  conversation.generateResponse(input).onEach {
    when (it) {
      is MessageResponse.Chunk -> {
        Log.d(TAG, "text chunk: ${it.text}")
      }
      is MessageResponse.ReasoningChunk -> {
        Log.d(TAG, "reasoning chunk: ${it.text}")
      }
      else -> {
        // ignore other response
      }
    }
  }
  .onCompletion {
    Log.d(TAG, "Generation done!")
  }
  .catch { exception ->
    Log.e(TAG, "Error in generation: $exception")
  }
  .collect()
}
In this code snippet:
  • The onEach callback is called each time the model generates a chunk of content.
  • The onCompletion callback is called when the generation is done (as a Kotlin flow operator, it also runs if the flow fails or is cancelled). After a successful generation, conversation.history contains the latest message generated by the model.
  • The catch callback is called if an exception is thrown during the generation.
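The callbacks above only log each chunk. To build the full reply, the chunks can be appended as they arrive. The following is a minimal sketch under stated assumptions: `Response` and `ResponseAccumulator` are stand-ins invented for illustration so the example runs without the SDK. Only the Chunk and ReasoningChunk cases and their `text` field are taken from this guide; the real MessageResponse type comes from the LEAP SDK.

```kotlin
// Stand-in types for illustration only; the real MessageResponse is defined by
// the LEAP SDK. Only Chunk, ReasoningChunk and `text` are taken from this guide.
sealed interface Response {
    data class Chunk(val text: String) : Response
    data class ReasoningChunk(val text: String) : Response
    object Other : Response
}

// Hypothetical helper that keeps answer text and reasoning text apart.
class ResponseAccumulator {
    private val answer = StringBuilder()
    private val reasoning = StringBuilder()

    fun accept(response: Response) {
        when (response) {
            is Response.Chunk -> answer.append(response.text)
            is Response.ReasoningChunk -> reasoning.append(response.text)
            Response.Other -> Unit // ignore other responses
        }
    }

    fun answerText(): String = answer.toString()
    fun reasoningText(): String = reasoning.toString()
}

fun main() {
    val acc = ResponseAccumulator()
    listOf(
        Response.ReasoningChunk("think"),
        Response.Chunk("Hel"),
        Response.Chunk("lo"),
        Response.Other,
    ).forEach(acc::accept)
    println(acc.answerText())    // Hello
    println(acc.reasoningText()) // think
}
```

In the real flow, the same accumulation would happen inside onEach, and the accumulated text could be shown in the UI from onCompletion.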
To interrupt the generation, cancel the job returned by the coroutine scope's launch method:
generationJob.cancel()
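How cancellation interacts with the flow operators can be tried outside Android. The following is a minimal sketch that assumes kotlinx-coroutines-core is on the classpath; `fakeGeneration` and `runAndCancel` are hypothetical stand-ins for conversation.generateResponse and the lifecycleScope code above, not SDK functions. It shows that onCompletion still runs after cancel and receives the cancellation as its cause, while catch is not invoked for cancellation.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Stand-in for conversation.generateResponse(input): emits a chunk every 50 ms.
fun fakeGeneration(): Flow<String> = flow {
    repeat(100) { i ->
        delay(50)
        emit("chunk-$i")
    }
}

// Starts the stand-in generation, cancels it after cancelAfterMs, and reports
// how many chunks arrived and which cause onCompletion observed.
fun runAndCancel(cancelAfterMs: Long): Pair<Int, Throwable?> = runBlocking {
    var received = 0
    var completionCause: Throwable? = null
    val generationJob = launch {
        fakeGeneration()
            .onEach { received++ }
            // cause is null on normal completion, non-null on failure or cancellation.
            .onCompletion { cause -> completionCause = cause }
            // Not invoked when the collecting job is cancelled.
            .catch { e -> println("Error in generation: $e") }
            .collect()
    }
    delay(cancelAfterMs)
    generationJob.cancelAndJoin() // cancel() plus waiting for the job to finish
    received to completionCause
}

fun main() {
    val (received, cause) = runAndCancel(175)
    println("Received $received chunks before cancellation")
    println("onCompletion cause: $cause")
}
```

If the app needs to distinguish a finished generation from an interrupted one, checking whether the onCompletion cause is null is one way to do it.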

5. Examples

See LeapSDK-Examples for complete example apps using LeapSDK.