llama-fpga is most likely the world’s first open-source project for building an FPGA-based Large Language Model (LLM) accelerator, capable of running LLaMA2-7B in AWQ 4-bit quantized format. This ...