Show HN: LLM, a Rust Crate/CLI for CPU Inference of LLMs (LLaMA, GPT-NeoX, etc.) https://ift.tt/YKTVRpD
G'day, HN! I'm one of the maintainers of `llm`. I've been working alongside a trusty group of contributors to bring this project to life, and we're now at a point where we're ready to share it with the world.

Large language models (LLMs) are taking the computing world by storm thanks to emergent abilities that let them perform a wide variety of tasks, including translation, summarization, code generation, and even some degree of reasoning. However, the ecosystem around LLMs is still in its infancy, and it can be difficult to get started with these models.

`llm` is a one-stop shop for running inference on large language models (of the kind that power ChatGPT and more): it provides a CLI and a Rust crate for running inference on these models, all entirely open-source. The crate can be embedded in your own projects, making it easy to integrate LLMs into your own applications.

We hope that `llm` can alleviate some of the pain points users face when working with LLMs. Our goal is to build a robust solution for running inference on LLMs that users can rely on for their projects, so that we can provide a moment of peace in the chaos of the LLM ecosystem.

At present, we are powered by `ggml` (similar to `llama.cpp`), but we intend to add additional backends in the near future. This means that we currently support CPU inference only, though we have several ideas in mind for adding GPU support, as well as other accelerators.

We're looking for feedback on the project, and we'd love to hear from you! If you're interested in contributing, please reach out to us on our Discord ( https://ift.tt/MdZY0s9 ), or open an issue on our GitHub ( https://ift.tt/vX4NMUp ).

https://ift.tt/yKMsW2q
May 9, 2023 at 05:37PM
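For readers wanting to try embedding the crate: since `llm` is published as an ordinary Rust crate, pulling it in is a standard Cargo dependency. A minimal sketch of the manifest entry follows; the version number here is illustrative only, so check crates.io for the current release before copying it.

```toml
# Cargo.toml (sketch): add the `llm` crate as a dependency.
# The version shown is a placeholder, not the latest release.
[dependencies]
llm = "0.1"
```

From there, the crate's own documentation and examples cover loading a ggml-format model file and running inference against it.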