AI coding assistance discussion

Page 2 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.
Jul 27, 2020
23,462
16,510
146

Pretty CRAZY model.

Don't believe me? Ask it something and watch it go. Like really, really go.

It doesn't stop until it runs against some sort of limit. Keeps going through different code possibilities.
 



Speculative decoding now allows a larger 231B model to oversee the draft work of the smaller 13B model, resulting in improved response times.
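For anyone curious how the draft/verify split works, here's a toy sketch of greedy speculative decoding. The two "models" are hypothetical stand-in functions, not real LLMs, and this isn't LM Studio's implementation — just the shape of the technique:

```python
# Toy sketch of greedy speculative decoding. The two "models" here are
# hypothetical stand-in functions, not real LLMs; in a real setup the
# small draft model proposes tokens and the large model verifies them.

def speculative_decode(draft_next, target_next, prompt, k=4, max_new=16):
    """draft_next/target_next: fn(token_list) -> next token (greedy).

    The draft model speculates k tokens cheaply; the target model checks
    them and keeps the longest agreeing prefix, plus one corrected token
    on the first mismatch. In a real system the verification is a single
    batched forward pass, which is where the speedup comes from.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # 1) Draft k speculative tokens with the cheap model.
        draft, ctx = [], list(seq)
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2) Verify with the target model.
        verified = list(seq)
        for t in draft:
            expected = target_next(verified)
            if expected == t:
                verified.append(t)          # draft token accepted
            else:
                verified.append(expected)   # take the target's correction
                break
        seq = verified
    return seq[len(prompt):len(prompt) + max_new]
```

If the draft agrees with the target most of the time, each verification round yields several tokens instead of one; when it never agrees, you fall back to one corrected token per round, which is the worst case.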
 
It's LIVE!


Kids can now create their own CPU benchmarks!

(yes, I'm a 44 year old kid...)
 
Reactions: Red Squirrel
RAM latency checker: https://www.overclock.net/posts/29439133/

As described there, it doesn't report absolute latency, but the readings seem fairly consistent.

Tested and working on Haswell and onwards. I don't think I can try it on my Epyc today, so would someone want to volunteer and test it on their Ryzen? Thanks!

EDIT: Tested and working as intended on Tiger Lake. The average latency deviation isn't wild, which means it can be useful.
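For the curious, the core trick in latency testers like that one is a dependent pointer chase. Here's a rough pure-Python sketch of the idea — interpreter overhead dominates in Python, so treat the numbers as relative only; real tools do this in native code:

```python
# Rough pure-Python sketch of the pointer-chase idea behind RAM latency
# testers. Each load depends on the previous one, so the CPU can't
# overlap or prefetch the accesses. Interpreter overhead dominates here,
# so treat results as relative only; real tools do this in native code.
import random
import time

def pointer_chase_ns(n, hops=200_000):
    # Build a random single-cycle permutation so successive loads jump
    # around memory in a cache-unfriendly order.
    order = list(range(n))
    random.shuffle(order)
    nxt = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    i = 0
    t0 = time.perf_counter()
    for _ in range(hops):
        i = nxt[i]          # dependent load: next index comes from memory
    t1 = time.perf_counter()
    return (t1 - t0) / hops * 1e9  # ns per hop, including loop overhead

# Compare a table that fits in cache against one that mostly doesn't.
small = pointer_chase_ns(1 << 10)   # small index table, cache-resident
large = pointer_chase_ns(1 << 20)   # tens of MB of scattered objects
```

The difference between the small and large runs, rather than either absolute number, is what hints at memory latency — which matches the "not absolute latency, but consistent" behavior described in that thread.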
 
A disappointment to report; hopefully it dissuades someone else from investing in expensive hardware (good thing LLMs weren't the only reason I bought the laptop).

So my ThinkPad now has 128GB of RAM and an RTX 5000 16GB dGPU. I was hoping I would be able to run Llama 3.3 70B. It loads at a context length of 16384 and consumes 71GB of system RAM plus all of the VRAM.

Unfortunately, the computation is not offloaded to the GPU: it stays at 0% utilization, even with the CPU core count lowered to 1 and all 80 layers assigned to the GPU. Processing happens on the CPU instead, and even when set to a max of 6 cores (HT not supported by LM Studio, I guess), CPU utilization doesn't go beyond 17%. It does give a response, at a horrible speed of something like 0.05 tokens per second or even lower.

I gave up on it and am now downloading another 8B LLM at F16 and Q8 to take advantage of speculative decoding. If I still don't get any GPU utilization, I will need to troubleshoot (maybe a driver issue?).
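For context, here's a back-of-envelope estimate of why a 70B model at 16K context blows past 16GB of VRAM. This is a rough rule of thumb, not LM Studio's exact accounting, and the KV-cache dimensions (80 layers, 8 KV heads, head dim 128, Llama-70B-like) are my assumption:

```python
# Back-of-envelope memory estimate for a local 70B model. Rough rule of
# thumb only, not LM Studio's exact accounting; the KV-cache numbers
# assume Llama-70B-like dimensions (80 layers, 8 KV heads, head dim
# 128), which are an assumption on my part.

def weights_gb(params_billion, bits_per_weight):
    # Quantized weights: parameters * bits per weight / 8 -> bytes, in GB.
    return params_billion * bits_per_weight / 8

def kv_cache_gb(ctx_len, layers=80, kv_heads=8, head_dim=128, bytes_per=2):
    # A K and a V vector per layer per token, stored in fp16 (2 bytes).
    return 2 * layers * kv_heads * head_dim * ctx_len * bytes_per / 1e9

w = weights_gb(70, 4)      # 70B at 4-bit: 35.0 GB of weights alone
kv = kv_cache_gb(16384)    # roughly 5.4 GB of KV cache at 16K context
```

Even at 4-bit, weights plus KV cache land around 40GB — far beyond 16GB of VRAM — so most of the model spills into system RAM regardless of the offload setting, and higher-bit quants only make it worse.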
 