Use Baseten for production ML model deployments with optimized inference, autoscaling, and multi-cloud support.
Clone the repository:

```bash
git clone https://github.com/Liquid4All/lfm-inference
```
The deployment script is based on Baseten's "Run any LLM with vLLM" guide.
Launch commands:

```bash
cd baseten
pip install truss
truss push lfm2-8b --publish
```
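`truss push` deploys the Truss bundle in the `lfm2-8b` directory. For orientation, a config along the lines of Baseten's vLLM guide might look like the sketch below; the base image, GPU type, and endpoint paths are illustrative assumptions, not the repository's actual file.

```yaml
# config.yaml -- illustrative sketch modeled on Baseten's "Run any LLM with
# vLLM" guide; see lfm2-8b/ in the repository for the real configuration.
model_name: lfm2-8b
base_image:
  image: vllm/vllm-openai:latest        # assumed vLLM OpenAI-compatible image
docker_server:
  start_command: vllm serve LiquidAI/LFM2-8B-A1B
  server_port: 8000
  predict_endpoint: /v1/chat/completions
  readiness_endpoint: /health
  liveness_endpoint: /health
resources:
  accelerator: H100                     # illustrative accelerator choice
  use_gpu: true
```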
Query the deployed model:

```bash
curl -X POST https://<model-id>.api.baseten.co/environments/production/predict \
  -H "Authorization: Api-Key $BASETEN_API_KEY" \
  -d '{
    "model": "LiquidAI/LFM2-8B-A1B",
    "messages": [
      {"role": "user", "content": "What is the melting temperature of silver?"}
    ],
    "max_tokens": 32,
    "temperature": 0
  }'
```

Baseten endpoints expect the `Api-Key` prefix in the `Authorization` header.
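The same request can be made programmatically; here is a minimal Python sketch using the `requests` library, mirroring the curl call above (the `<model-id>` placeholder must be replaced with your deployment's ID):

```python
import os

import requests

# Replace with the model ID from your Baseten dashboard.
MODEL_ID = "<model-id>"
URL = f"https://{MODEL_ID}.api.baseten.co/environments/production/predict"

resp = requests.post(
    URL,
    # Baseten expects the "Api-Key" prefix in the Authorization header.
    headers={"Authorization": f"Api-Key {os.environ['BASETEN_API_KEY']}"},
    json={
        "model": "LiquidAI/LFM2-8B-A1B",
        "messages": [
            {"role": "user", "content": "What is the melting temperature of silver?"}
        ],
        "max_tokens": 32,
        "temperature": 0,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```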