Loading the model
While models can be directly used on the cloud-based APIs once the API client is created, LeapSDK requires the developers to explicitly load the model before requesting the generation. It is necessary because the model will run locally. This step generally takes a few seconds depending on the model size and the device performance. On cloud API, you need to create a API client:Request for generation
In the cloud API calls,client.chat.completions.create will return a stream object for
caller to fetch the generated contents.
generateResponse on
the conversation object to obtain a Swift AsyncStream (equivalent to a Python stream) for generation. Since
the model runner object contains all information about the model, we don’t need to indicate the model name
in the call again.
Process generated contents
In clould API Python code, a for-loop on the stream object retrieves the contents.for await loop on the Swift AsyncStream to process the content. When the completion is done,
a MessageResponse.complete case will be received.