• 13 Posts
  • 486 Comments
Joined 7 months ago
Cake day: March 22nd, 2024

  • Pretty much everything has an API :P

    ollama is OK because it’s easy and automated, but you can get higher performance, better VRAM efficiency, and better samplers from either kobold.cpp or TabbyAPI, with the catch that they need more manual configuration. But this is good, as it “forces” you to pick and test an optimal config for your system.

    I’d recommend kobold.cpp for very short context (around 6K or less) or if you need to partially offload the model to CPU because your GPU is low on VRAM. Use a good IQ quantization (IQ4_M, for instance).

    Otherwise use TabbyAPI with an exl2 quantization, as it’s generally faster (but GPU only) and handles long context much better thanks to its k/v cache quantization.

    They all expose OpenAI-compatible APIs, though kobold.cpp also has its own web UI.
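
    Since they all speak the same OpenAI-compatible API, the client side doesn’t care which backend you pick. Here’s a minimal sketch using the openai Python client, assuming kobold.cpp on its default port 5001 (for TabbyAPI you’d swap in its base URL and the API key from its config):

```python
# pip install openai -- chat with a local OpenAI-compatible server
from openai import OpenAI

# Assumption: kobold.cpp is listening on its default port 5001.
client = OpenAI(base_url="http://localhost:5001/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # most local backends ignore or loosely match this name
    messages=[{"role": "user", "content": "Explain k/v cache quantization in two sentences."}],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)
```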


  • No, it’s a nail-biting toss-up, well within the margin of statistical error:

    https://projects.fivethirtyeight.com/2024-election-forecast/

    Our latest forecast shows a toss-up race between Vice President Kamala Harris and former President Donald Trump. Harris has a 56-in-100 chance of winning the majority of Electoral College votes, according to our model on Tuesday, Oct. 1 at 11 a.m. Eastern. To put that into perspective, it’s somewhere between the probability of flipping a coin and getting heads, and flipping a coin twice and getting at least one heads. In other words: It’s close! (Go ahead and grab a coin and test this out for yourself; it happens more than you think!)

    But the race being close in terms of win probability does not mean a big win is not possible for either candidate. Our model simulates thousands of potential outcomes for the election by adding randomly generated polling errors (of various severity) to our current forecast vote margins in each state. In 2020, polls overestimated Joe Biden’s margin over Donald Trump by about 4.5 points on average. If polls underestimate Harris by the same amount, she will win all of the states 538 currently rates as toss-up states up to and including Florida. If polls go the other way, and underestimate Trump again, the Republican will win all swing states up to (but not including) Minnesota. That would be 349 Electoral College votes for Harris, or 312 for Trump, in the case where they beat the polls.

    TL;DR Tiny errors translate to massive wins/losses.

    IMO Harris is the one who has to sweat this out, as negative press basically doesn’t affect Trump’s polls, while even the littlest slip-up could hurt hers, given the wave of positivity she’s riding right now.



  • I wouldn’t assume it’s a brigade; honestly, most of Lemmy.world is so mega American-left that anything even remotely critical gets downvoted.

    Like, I’m American. I’m a mega Trump and Republican hater, I freaking love Kamala, I’m not voting 3rd party atm, and I have family who would punch Trump on sight lol, and I kinda disagree with you. I’m sympathetic to some radical policy.

    But even I feel alienated.

    It’s like I can’t even point out that a post/source is straight-up spam, fake, or propaganda without getting downvoted to oblivion. I can’t acknowledge any positive thing any Republican has ever done, or an isolated policy that kinda makes sense. Nope, every single one should be in jail forever.

    You must be this left to participate in /c/politics, and everywhere it leaks.

    So… yeah, there’s my rant.



  • I have an old Lenovo laptop with an NVIDIA graphics card.

    @Maroon@lemmy.world The biggest question I have for you is what graphics card, but generally speaking this is… less than ideal.

    To answer your question, Open Web UI is the new hotness: https://github.com/open-webui/open-webui

    I personally use exui for a lot of my LLM work, but that’s because I’m an uber minimalist.

    And on your setup, I would host the best model you can on kobold.cpp or the built-in llama.cpp server (just not Ollama) and use Open Web UI as your front end. You can also use llama.cpp to host an embeddings model for RAG, if you wish.
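
    If you do go the RAG route, llama.cpp’s server can expose an OpenAI-style embeddings endpoint (the exact flag, e.g. --embeddings, varies by version). A rough sketch of scoring chunks against a query, assuming it’s running locally on port 8080:

```python
# pip install openai numpy -- embeddings-based retrieval against a local llama.cpp server
import numpy as np
from openai import OpenAI

# Assumption: llama-server was started with an embeddings model and embeddings enabled, on port 8080.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

def embed(texts):
    out = client.embeddings.create(model="local-embeddings", input=texts)
    return np.array([item.embedding for item in out.data])

chunks = ["chunk one of your document...", "chunk two...", "chunk three..."]
doc_vecs = embed(chunks)
query_vec = embed(["What does the document say about power usage?"])[0]

# Rank chunks by cosine similarity, highest first.
scores = doc_vecs @ query_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
for idx in np.argsort(scores)[::-1]:
    print(f"{scores[idx]:.3f}  {chunks[idx]}")
```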

    This is a general ranking of the “best” models for document answering and summarization: https://huggingface.co/spaces/vectara/Hallucination-evaluation-leaderboard

    …But generally, I prefer not to mess with RAG retrieval and just slap the context I want into the LLM myself, and for this the performance of your machine is kind of critical (depending on just how much “context” you want it to cover). I know this is !selfhosted, but once you get your setup dialed in, you may consider making calls to an API like Groq or Cerebras, or even renting a RunPod GPU instance if that’s in your time/money budget.
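
    The “slap the context in yourself” approach is really just prompt assembly. A hypothetical sketch (the file path, length cap, and model name are placeholders), pointing the same kind of OpenAI-compatible client at Groq’s hosted endpoint instead of a local server:

```python
# pip install openai -- manual context stuffing instead of RAG retrieval
from openai import OpenAI

# Assumption: Groq's OpenAI-compatible endpoint; any provider (or your local server) works the same way.
client = OpenAI(base_url="https://api.groq.com/openai/v1", api_key="YOUR_GROQ_KEY")

# Crude length cap so the document fits in the model's context window.
document = open("meeting_notes.txt", encoding="utf-8").read()[:20_000]

response = client.chat.completions.create(
    model="placeholder-model-name",  # use whatever model the provider currently lists
    messages=[
        {"role": "system", "content": "Answer only from the provided document."},
        {"role": "user", "content": f"Document:\n{document}\n\nQuestion: What were the action items?"},
    ],
)
print(response.choices[0].message.content)
```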




  • A letter seen by Reuters, sent by Vivaldi, Waterfox, and Wavebox, and supported by a group of web developers, also supports Opera’s move to take the EC to court over its decision to exclude Microsoft Edge from being subject to the Digital Markets Act (DMA).

    OK…

    Shouldn’t they be fighting Chrome, more than anything? Surely there’s a legal avenue for that, though I guess there’s a risk of getting deprioritized by Google and basically disappearing.