Does anyone even know what MASM is anymore?


knutinh

Member
Jan 13, 2006
61
3
66
Never said it could. I was just disputing the claim that ASM will always be at least as fast as higher level languages.
Then you are arguing against a straw man. I said that _optimal_ ASM will be at least as fast as compiled code. That much is evident: ASM is a strict superset of compiled C code; it occupies a larger "space". Whatever a compiler does with C code, a team of monkeys with lots of time could (in principle) do with ASM. The opposite is not true.

I have made it clear that writing (and maintaining) that "optimal" ASM can either be "very hard" or "close to impossible" (just like writing a compiler seems like black magic to me).
Many of the cool tricks compilers play are simply too difficult to do by hand at any significant scale.
Sure. And many of the tricks that a programmer might pull off are too difficult to expect from a generic compiler reading somewhat generic C code.

-k
 

knutinh

Member
Jan 13, 2006
61
3
66
ASM isn't important when it comes to performance. It isn't even the 10th step when you are looking at optimizing something.
I fully agree that for 99% of the applications, 99% of the time, ASM is not the solution. Lots of pain, lots of bugs, and the overall speedup may not be "enough".
Don't believe me? Then look at how little ASM is in performance critical things such as V8, SpiderMonkey, the Linux kernel, most VMs. Yet these things are constantly seeing performance improvements and gains.
The fact that these see performance improvements tells us nothing about how much ASM would have mattered. If some JavaScript app sees a 2x year-over-year performance improvement, does that mean that a low-level implementation could not be 10x or 20x faster?

x264 has ASM for some platforms. Are you suggesting that the developers are wasting their time? If not, then we seem to be in agreement: ASM _can_ be the right thing in a specific scenario, but usually it is not.
http://git.videolan.org/?p=x264.git;a=history;f=common/x86;hb=HEAD

ffmpeg seems to have some ASM:
http://git.videolan.org/?p=ffmpeg.g...b157e6d4f69a70148a47071fc0b34d155f216;hb=HEAD

I don't know much about the applications that you are listing. It may well be that implementing them in e.g. C makes them "fast enough" and/or "about as fast as hardware allows". It is also possible that the Linux kernel is willing to take some unknown performance hit in order to improve security or stability, to recruit developers, or to gain some other nice-to-have that makes ASM a bad choice.
These are things that have performance at the top of their priority lists, yet they don't write things in ASM. Why? Because the gains are minimal/nonexistent for most application logic.
I think that is a hasty conclusion. I think that the (possible) lack of ASM in those applications is a reflection that:
1. Programmer time is expensive. Do any ASM, and the development cost increases.
2. Being able to run the same code across platforms is a neat way to have more customers.
3. Customers don't like their application crashing, lacking features or being 3 years late to market.
4. Many/most applications don't have a localized hotspot where the equivalent of 100 lines of C code takes >90% of the computation time.
5. User satisfaction might scale somewhat linearly with execution time (within limits). I.e. if an operation takes 50% longer in Excel, the user would only be somewhat less happy. If you are a pacemaker customer, then being 50% late with a heartbeat should make you a grumpy customer.

The (possible) speedup of ASM (or more generally: optimization techniques such as intrinsics/pragmas/compiler switches/choice of compiler/...) would have to be weighed against 1-4.
On top of that, it excludes the application from doing compiler optimizations on the ASM block.
Well, if it is faster then it is faster. If it ain't, then it ain't.

-k
 

knutinh

Member
Jan 13, 2006
61
3
66
One thing I find interesting is comparing common attitudes towards software efficiency improvements vs hardware efficiency improvements.

At this point, new generations of desktop/laptop processors are increasing performance on average roughly 10% for a uarch update. perf/W may be increased somewhat more, but not a lot more. While many people do find something like Ivy Bridge good enough, there was still enough demand for something like Haswell, at least enough to justify what had to have been hundreds of millions of dollars revising the core uarch (as opposed to uncore, packaging, etc). You would think there'd have to be at least some set of programs where that relatively low performance and perf/W improvement was justified. Should it not also be the case that there's some set of software where such incremental improvements would also be justified? But for most people, the idea of improving any software's performance by 10% every other year or so is viewed as absurd. Even improving things by 40-50% isn't very interesting, while in the hardware world that's enough to make AMD look like a joke in single threaded performance to the eyes of many.

Granted, improving software performance doesn't always mean improving perf/W. There could even be cases where perf/W goes down. But generally perf/W will probably improve too, and if that's more the interesting optimization point you're probably not going to get something with great perf/W by accident either.
I think that your thought is interesting.

If my hardware has an overall efficiency boost of 10%, then I expect it to apply to all of my applications (on average). If any one of my applications has a 10% speedup, then that will be only for this single application. Thus, while it might be worth it for me for Intel to spend amount X on a 10% speedup, it may not be worth X for Adobe to do a 10% speedup on Photoshop _unless_ I am only/mainly running Photoshop on my computer (or it is the only application that feels sluggish).

I think that hardware is very cost-vs-performance oriented: things are measured, and many customers choose products based on that metric. Software customers also make choices based on stability, features, user interface, etc.

Hardware also has the perceived quality of being "finished". When you buy a computer, you expect to use it for e.g. 5 years. When you buy a software product, you expect it to continually improve for some time. Thus I might purchase the program that "gets the job done" today, hoping that it will be less sluggish in a future release.

-k
 

VirtualLarry

No Lifer
Aug 25, 2001
56,452
10,120
126
The current x86 instruction set contains over 1,000 instructions. How are you remembering all those?

When I was doing x86 ASM programming with MASM, I always kept the spiral-bound opcode reference handy. Nowadays, there are probably twice as many opcodes, what with all of the ISA extensions since the 386/486 days.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
The current x86 instruction set contains over 1,000 instructions. How are you remembering all those?

That doesn't translate to over 1000 assembly instructions with separate names. A lot of opcodes have implicit operands encoded.

The ones that are hardest to remember are esoteric SSE instructions that are there for pretty niche use cases and that a compiler (especially GCC) is very unlikely to ever use. You kind of get a feel for the classes these instructions fall under and scan through the documentation when working on a particular algorithm if you think something might fit.
 