Every time I open the right sidebar in VSCode, the Copilot panel shows up. I hid it countless times. Just one of the many ways companies shit all over your preferences nowadays
@sergaderg Oh yeah, that completely slipped my mind. And yet, it doesn't seem like it helps a lot considering the massive hardware requirements.
edit: I looked into the performance characteristics and it seems there's a threshold of batch size 64 after which performance stops improving. On a scale of millions of requests, that's pretty much negligible.
Snapchat made AI pictures with my face and added them to my memories and I don’t remember ever giving them consent or clicking anything
Tech companies on their way to create the most dystopian future possible. Also love the contradiction of having pictures that never existed appear in a section named "memories".
We recently got a phishing mail at work which would've been incredibly convincing if they had spoofed a proper sender and made the link target domains more plausible. There were no weird grammar errors and they used your full name.
Context is that GMail crippled newsletter emails from that website. It took German emails, interpreted them as English and "translated" them into German.
I remember when a teacher played a game with us in 5th grade where we had to send a text through google translate over and over again to get hilarious results. GMail gives you that automatically now.
When you run an LLM, and then another one for a different user, they will use twice the amount of VRAM and twice the number of cores to get the same performance as the original single run.
Let's say you have a database server used by one application, and then you add another application. How much do the resource requirements increase? Not by another 100%, that's for sure.
The problem tools like Cursor have is that unlike classic software, AI is horrible to run at scale. With something like a social network, the cost per user goes down as the number of user increases. With AI, you can't have this kind of parallelism that brings the cost down and that means there's linear growth. Computations on the GPU are specific to one model invocation, and a model invocation can't handle multiple requests at once.