There is a Space on Hugging Face where you can select a model and your graphics card to see whether you can run it, or how many cards you would need: https://huggingface.co/spaces/Vokturz/can-it-run-llm
With quantization, you should be able to run inference on any model with 7B parameters or fewer.
This is probably the only reason Microsoft Recall exists, as it is completely useless for anything else.
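
For a rough sanity check on the 7B claim, here is a back-of-the-envelope sketch of how much memory the weights take at different quantization levels. The function name and the 1.2x overhead factor (for KV cache, activations, and runtime buffers) are my own assumptions, not something taken from the linked Space, so treat the numbers as ballpark only.

```python
def estimate_memory_gb(n_params: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Rough memory estimate in GB (1 GB = 1e9 bytes) for running inference.

    n_params:        number of model parameters, e.g. 7e9 for a 7B model
    bits_per_weight: 16 for fp16, 8 or 4 for common quantization levels
    overhead:        assumed multiplier for KV cache and buffers (a guess, not measured)
    """
    weight_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return weight_bytes * overhead / 1e9


if __name__ == "__main__":
    # Compare an unquantized fp16 model against 8-bit and 4-bit quantization.
    for bits in (16, 8, 4):
        print(f"7B model at {bits}-bit: ~{estimate_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions a 7B model needs roughly 4.2 GB at 4-bit, compared with roughly 16.8 GB at 16-bit, which is why quantization is what makes it fit on a typical consumer card. Long contexts will need more than the flat overhead assumed here.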