WebLLM on WebGPU
Blazing-fast inference with WebGPU and WebLLM, running locally in your browser.
Load Model
Download
Prompt: -
Completion: -
Prefill: - tokens/sec
Decoding: - tokens/sec
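The readouts above are populated at runtime once a model has been downloaded and loaded. A minimal sketch of how a page like this might drive WebLLM, assuming the `@mlc-ai/web-llm` npm package and a WebGPU-capable browser (the model ID below is illustrative, not part of this page):

```javascript
// Sketch only: requires a WebGPU-capable browser. The model ID is illustrative.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function run(prompt) {
  // Downloads and compiles the model weights on first use (cached afterwards);
  // initProgressCallback reports download/compile progress for the UI.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
    initProgressCallback: (report) => console.log(report.text),
  });

  // OpenAI-style chat completion, executed entirely on the local GPU.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
  });
  console.log("Completion:", reply.choices[0].message.content);

  // runtimeStatsText() reports prefill and decoding throughput,
  // the numbers shown in the readout above.
  console.log(await engine.runtimeStatsText());
}

run("What is WebGPU?");
```

Prefill throughput measures how fast the prompt tokens are processed in a single batched pass; decoding throughput measures how fast new tokens are generated one at a time, which is why the two numbers typically differ.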