karpathy / llama2.c
Inference Llama 2 in one file of pure C
See what the GitHub community is most excited about today.
Inference Llama 2 in one file of pure C
Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization, GPU inference (6 GB VRAM), and CPU inference.
Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Meta-Transformer for Unified Multimodal Learning
The open-source community's first Chinese LLaMA2 model that you can download and run!
Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
Llama Chinese community: the best Chinese Llama large models, fully open source and available for commercial use
Interact with your documents using the power of GPT, 100% privately, with no data leaks
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Bot for 2022 r/place
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Nginxpwner is a simple tool to look for common Nginx misconfigurations and vulnerabilities.
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
Firefox Decrypt is a tool to extract passwords from Mozilla (Firefox™, Waterfox™, Thunderbird®, SeaMonkey®) profiles
Seamlessly integrate powerful language models like ChatGPT into scikit-learn for enhanced text analysis tasks.
Play LLaMA2 (official / Chinese version / INT4 / llama2.cpp) Together! ONLY 3 STEPS! (no GPU / 5 GB VRAM / 8~14 GB VRAM)
All Algorithms implemented in Python
A minimalist music download tool