cross-posted from: https://lemmy.world/post/1894070
> ## **Welcome to the Llama-2 FOSAI & LLM Roundup Series!**
>
> **(Summer 2023 Edition)**
>
> Hello everyone!
>
> The wave of innovation I mentioned in our [Llama-2 announcement](https://lemmy.world/post/1750098) is already on its way. The first tsunami of base models and configurations is being released as you read this post.
>
> That being said, I'd like to take a moment to shoutout [TheBloke](https://huggingface.co/TheBloke), who is rapidly converting many of these models for the greater good of FOSS & FOSAI.
>
> You can support [TheBloke](https://huggingface.co/TheBloke) here.
> - https://ko-fi.com/TheBlokeAI
>
> Below you will find all of the latest Llama-2 models that are FOSAI friendly, meaning they are licensed for commercial use, ready to run, and open for development. I will be continuing this series exclusively for Llama models; I have a feeling they will remain a popular choice for quite some time. I will consider giving other foundational models a similar series if they garner enough interest. For now, enjoy this new herd of Llamas!
>
> All you need to get started is capable hardware and a few moments setting up your inference platform (selected from any of your preferred software choices in the [Lemmy Crash Course for Free Open-Source AI](https://lemmy.world/post/76020) or the [FOSAI Nexus](https://lemmy.world/post/814816) resource, both of which are also shared at the bottom of this post).
>
> Keep reading to learn more about the exciting new models coming out of Llama-2!
>
> ### **8-bit System Requirements**
>
> | Model | VRAM Used | Minimum Total VRAM | Card Examples | RAM/Swap to Load* |
> |-----------|-----------|--------------------|-------------------|-------------------|
> | LLaMA-7B | 9.2GB | 10GB | 3060 12GB, 3080 10GB | 24 GB |
> | LLaMA-13B | 16.3GB | 20GB | 3090, 3090 Ti, 4090 | 32 GB |
> | LLaMA-30B | 36GB | 40GB | A6000 48GB, A100 40GB | 64 GB |
> | LLaMA-65B | 74GB | 80GB | A100 80GB | 128 GB |
>
> ### **4-bit System Requirements**
>
> | Model | Minimum Total VRAM | Card Examples | RAM/Swap to Load* |
> |-----------|--------------------|--------------------------------|-------------------|
> | LLaMA-7B | 6GB | GTX 1660, 2060, AMD 5700 XT, RTX 3050, 3060 | 6 GB |
> | LLaMA-13B | 10GB | AMD 6900 XT, RTX 2060 12GB, 3060 12GB, 3080, A2000 | 12 GB |
> | LLaMA-30B | 20GB | RTX 3080 20GB, A4500, A5000, 3090, 4090, 6000, Tesla V100 | 32 GB |
> | LLaMA-65B | 40GB | A100 40GB, 2x3090, 2x4090, A40, RTX A6000, 8000 | 64 GB |
>
> *System RAM (not VRAM) is used to initially load a model. You can use swap space if you do not have enough RAM to hold your LLM.
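As a rough sanity check on the tables above: the weights alone need about (parameter count × bits per weight ÷ 8) bytes, and the extra headroom in the tables covers activations, the KV cache, and framework overhead. A back-of-the-envelope sketch (illustrative only, not a measurement):

```python
def weight_memory_gb(n_params: float, bits: int) -> float:
    """Approximate size of the model weights alone, in gigabytes."""
    return n_params * bits / 8 / 1e9

# Weights-only estimates; real VRAM use is higher once activations
# and the KV cache are added (compare against the tables above).
for n_params, label in [(7e9, "7B"), (13e9, "13B"), (30e9, "30B"), (65e9, "65B")]:
    print(f"LLaMA-{label}: 8-bit ~{weight_memory_gb(n_params, 8):.1f} GB, "
          f"4-bit ~{weight_memory_gb(n_params, 4):.1f} GB")
```

This is why LLaMA-7B fits in a 10GB card at 8-bit (~7GB of weights plus overhead) but squeezes into 6GB at 4-bit (~3.5GB of weights).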
>
> ---
>
> ### **The Bloke**
> One of the most popular and consistent developers releasing consumer-friendly versions of LLMs. His ongoing conversions of trending models let many of us run GPTQ or GGML variants at home on our own PCs and hardware.
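Both GPTQ and GGML boil down to the same core idea: store each weight in only a few bits, plus a shared scale per small group of weights. The sketch below shows plain round-to-nearest 4-bit group quantization to illustrate the principle; it is not the actual GPTQ or GGML algorithm (GPTQ, for instance, additionally compensates quantization error layer by layer):

```python
import numpy as np

def quantize_4bit(w, group_size=64):
    """Symmetric 4-bit group quantization: each group of weights shares
    one scale, and values are stored as small ints in [-8, 7]."""
    groups = w.reshape(-1, group_size)
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp32 weights from ints and per-group scales."""
    return (q * scale).reshape(-1).astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)  # toy weight tensor
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

The per-group scale keeps the rounding error small relative to each group's magnitude, which is why 4-bit models remain usable at a quarter of the fp16 footprint.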
>
> **70B**
>
> - [TheBloke/Llama-2-70B-chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ)
>
> - [TheBloke/Llama-2-70B-Chat-fp16](https://huggingface.co/TheBloke/Llama-2-70B-Chat-fp16)
>
> - [TheBloke/Llama-2-70B-GPTQ](https://huggingface.co/TheBloke/Llama-2-70B-GPTQ)
>
> - [TheBloke/Llama-2-70B-fp16](https://huggingface.co/TheBloke/Llama-2-70B-fp16)
>
> **13B**
>
> - [TheBloke/Llama-2-13B-chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-13B-chat-GPTQ)
>
> - [TheBloke/Llama-2-13B-chat-GGML](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML)
>
> - [TheBloke/Llama-2-13B-GPTQ](https://huggingface.co/TheBloke/Llama-2-13B-GPTQ)
>
> - [TheBloke/Llama-2-13B-GGML](https://huggingface.co/TheBloke/Llama-2-13B-GGML)
>
> - [TheBloke/Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16)
>
> **7B**
>
> - [TheBloke/Llama-2-7B-GPTQ](https://huggingface.co/TheBloke/Llama-2-7B-GPTQ)
>
> - [TheBloke/Llama-2-7B-GGML](https://huggingface.co/TheBloke/Llama-2-7B-GGML)
>
> - [TheBloke/Llama-2-7B-fp16](https://huggingface.co/TheBloke/Llama-2-7B-fp16)
>
> - [TheBloke/Llama-2-7b-Chat-GPTQ](https://huggingface.co/TheBloke/Llama-2-7b-Chat-GPTQ)
>
> ### **LLongMA**
> LLongMA-2 is a suite of Llama-2 models trained at an 8k context length using linear positional interpolation scaling.
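The trick behind linear positional interpolation is simple: compress position indices by (trained context ÷ target context) so an 8k sequence maps back into the 4k positional range the base model saw during training. A minimal sketch of the idea applied to rotary-embedding angles (the function name and defaults here are illustrative, not LLongMA's actual code):

```python
import numpy as np

def rope_angles(positions, dim=128, base=10000.0, scale=1.0):
    """Rotary-embedding angles for the given positions; scale < 1
    compresses positions (linear positional interpolation)."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions * scale, inv_freq)

# Llama-2 is trained at a 4k context; extending to 8k halves each position.
train_ctx, target_ctx = 4096, 8192
scale = train_ctx / target_ctx  # 0.5
angles = rope_angles(np.arange(target_ctx), scale=scale)
# Position 8190 now yields the same angles position 4095 did originally.
```

Because every interpolated position stays inside the trained range, the model never sees an angle it wasn't trained on; a small amount of fine-tuning at the new length then recovers quality.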
>
> **13B**
>
> - [conceptofmind/LLongMA-2-13b](https://huggingface.co/conceptofmind/LLongMA-2-13b)
>
> **7B**
>
> - [conceptofmind/LLongMA-2-7b](https://huggingface.co/conceptofmind/LLongMA-2-7b)
>
> Also available from The Bloke in GPTQ and GGML formats:
>
> **7B**
>
> - [TheBloke/LLongMA-2-7B-GPTQ](https://huggingface.co/TheBloke/LLongMA-2-7B-GPTQ)
>
> - [TheBloke/LLongMA-2-7B-GGML](https://huggingface.co/TheBloke/LLongMA-2-7B-GGML)
>
> ### **Puffin**
> The first commercially usable language model released by Nous Research, available at 13B parameters!
>
> **13B**
>
> - [NousResearch/Redmond-Puffin-13B-GGML](https://huggingface.co/NousResearch/Redmond-Puffin-13B-GGML)
>
> - [NousResearch/Redmond-Puffin-13B](https://huggingface.co/NousResearch/Redmond-Puffin-13B)
>
> Also available from The Bloke in GPTQ and GGML formats:
>
> **13B**
>
> - [TheBloke/Redmond-Puffin-13B-GPTQ](https://huggingface.co/TheBloke/Redmond-Puffin-13B-GPTQ)
>
> - [TheBloke/Redmond-Puffin-13B-GGML](https://huggingface.co/TheBloke/Redmond-Puffin-13B-GGML)
>
> ### **Other Models**
> A section for other LLMs and fine-tunes derived from Llama-2 models.
>
> **7B**
>
> - [georgesung/llama2_7b_chat_uncensored](https://huggingface.co/georgesung/llama2_7b_chat_uncensored)
>
> ---
>
> ### **Getting Started w/ FOSAI!**
>
> Have no idea where to begin with AI/LLMs? Try starting with [UnderstandGPT](https://understandgpt.ai/)'s [What is a LLM?](https://understandgpt.ai/docs/getting-started/what-is-a-llm) guide to learn the basics of LLMs before visiting our [Lemmy Crash Course for Free Open-Source AI](https://lemmy.world/post/76020).
>
> If you're looking to explore more resources, see our [FOSAI Nexus](https://lemmy.world/post/814816) for a list of all the major FOSS/FOSAI in the space.
>
> If you're looking to jump right in, visit some of the links below and stick to models of 13B parameters or fewer (unless you have the power and hardware to spare).
>
> **FOSAI Resources**
>
> **Fediverse / FOSAI**
> - [The Internet is Healing](https://www.youtube.com/watch?v=TrNE2fSCeFo)
> - [FOSAI Welcome Message](https://lemmy.world/post/67758)
> - [FOSAI Crash Course](https://lemmy.world/post/76020)
> - [FOSAI Nexus Resource Hub](https://lemmy.world/post/814816)
>
> **LLM Leaderboards**
> - [HF Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
> - [LMSYS Chatbot Arena](https://chat.lmsys.org/?leaderboard)
>
> **LLM Search Tools**
> - [LLM Explorer](https://llm.extractum.io/)
> - [Open LLMs](https://github.com/eugeneyan/open-llms)
>
> ### **GL, HF!**
>
> If you found anything about this post interesting, consider subscribing to [email protected], where I do my best to keep you in the know about the most important updates in free open-source artificial intelligence.
>
> I will try to continue doing this series season by season, making this a living post for the rest of this summer. If I have missed a noteworthy model, don't hesitate to let me know in the comments so I can keep this resource up-to-date.
>
> Thank you for reading! I hope you find what you're looking for. Be sure to subscribe and bookmark the [main post](https://lemmy.world/post/1894070) if you want a quick one-stop shop for all of the new Llama-2 models that will be emerging the rest of this summer!