I kept seeing people ask "Which model can I run on my GPU?" and "Will model X fit on my GPU?" That's why I built a filter on whichllmmodel that lets you search models by what will actually fit on your hardware (8GB, 16GB, 24GB, etc.) at a given quantization level.
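For anyone curious about the arithmetic behind a "will it fit" check, here is a rough sketch. It is not necessarily how the filter on whichllmmodel works. It assumes the common rule of thumb that weight memory is roughly parameter count times bits per weight divided by 8, plus a flat overhead for KV cache and activations. The 20% overhead, the function names, and the example model names are all assumptions made up for illustration.

```python
from typing import Dict, List


def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead: float = 0.2) -> float:
    """Rough VRAM estimate in decimal GB.

    Weights: 1e9 params * (bits / 8) bytes = params_billion * bits / 8 GB.
    The flat `overhead` fraction (KV cache, activations) is an assumed
    rule of thumb, not a measured value; real usage depends on context
    length and runtime.
    """
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb * (1 + overhead)


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead: float = 0.2) -> bool:
    """True if the estimated footprint is within the given VRAM budget."""
    return estimate_vram_gb(params_billion, bits_per_weight, overhead) <= vram_gb


def filter_models(models: Dict[str, float], bits_per_weight: float,
                  vram_gb: float) -> List[str]:
    """Return names of models (name -> billions of params) that fit."""
    return [name for name, size in models.items()
            if fits(size, bits_per_weight, vram_gb)]


if __name__ == "__main__":
    # Hypothetical catalogue: placeholder names, sizes in billions of params.
    catalogue = {"example-7b": 7, "example-13b": 13, "example-70b": 70}
    for vram in (8, 16, 24):
        print(f"{vram}GB @ 4-bit:", filter_models(catalogue, 4, vram))
```

Treat the result as a first-pass filter only: long context windows can push the KV cache well past a flat 20%, so a model that "fits" on paper may still run out of memory in practice.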
Comments URL: https://news.ycombinator.com/item?id=48681981