Hyperthreading and Firefox Quantum

gmaslin

Junior Member
Aug 12, 2004
6
0
61
If you haven't heard, Firefox Quantum supports multiple simultaneous processes. The question I have is whether an Intel CPU with Hyperthreading is able to take advantage of this new rendering engine. Specifically, how far back in Hyperthreading history can CPUs handle the additional process threads?
 

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
"The question I have is whether an Intel CPU with Hyperthreading is able to take advantage of this new rendering engine. "
Hyperthreading is taken advantage of whenever software could also take advantage of that many real cores.

Specifically, how far back in Hyperthreading history will handle the additional process threads.
Huh? What?
 

TheRyuu

Diamond Member
Dec 3, 2005
5,479
14
81
If you haven't heard, Firefox Quantum supports multiple simultaneous processes. The question I have is whether an Intel CPU with Hyperthreading is able to take advantage of this new rendering engine. Specifically, how far back in Hyperthreading history will handle the additional process threads.

Can you reword your question? It doesn't really make any sense as currently written.
 

gmaslin

Junior Member
Aug 12, 2004
6
0
61
Okay, Firefox Quantum and later versions use a multi-process engine. The default setting allows it to run 4 simultaneous processes. A program that can run multiple simultaneous processes will benefit from multiple processors. What I want to know is whether Intel CPUs that have Hyperthreading take advantage of this new capability in Firefox the same way a real processor would.
 

TheRyuu

Diamond Member
Dec 3, 2005
5,479
14
81
Okay, Firefox Quantum and later versions use a multi-process engine. The default setting allows it to run 4 simultaneous processes. A program that can run multiple simultaneous processes will benefit from multiple processors. What I want to know is whether Intel CPUs that have Hyperthreading take advantage of this new capability in Firefox the same way a real processor would.

Typically one browser process isn't going to really load up more than one core. I'm not sure about the specifics of how Firefox chooses what to do with each process, but I can't imagine it benefiting from the multiple processes in the way you're describing. They probably approach it more from a stability perspective, meaning that if one process crashes you don't lose the others.

The performance benefit of SMT (hyperthreading) on Intel CPUs is very workload dependent. The performance hit is negligible for single-threaded stuff (like web browsing) and you can get up to a 25% performance increase in heavily threaded stuff like video encoding.
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Typically one browser process isn't going to really load up more than one core. I'm not sure about the specifics of how Firefox chooses what to do with each process, but I can't imagine it benefiting from the multiple processes in the way you're describing. They probably approach it more from a stability perspective, meaning that if one process crashes you don't lose the others.

The performance benefit of SMT (hyperthreading) on Intel CPUs is very workload dependent. The performance hit is negligible for single-threaded stuff (like web browsing) and you can get up to a 25% performance increase in heavily threaded stuff like video encoding.

The new web renderer makes page rendering highly parallel, and Stylo does the same thing for CSS evaluation. While JS execution is unaffected (JavaScript is effectively always single-threaded), layout and rendering are.

With that said, how much benefit will HT have? It would be really hard to measure. HT works by assigning more work when a thread becomes stalled (usually loading stuff from memory). So for highly parallel tasks involving lots of memory (video encoding) there are clear benefits. On the other hand, a lot of localized memory parallel work won't see near the benefit and in some cases may even see performance losses (games).

My shot-in-the-dark guess is that HT will have minimal impact on Firefox Quantum. I don't think the amount of memory used in browser rendering is large or spread out, but enough is going on, potentially scattered across memory, that I don't think it will be a strict negative.

But again, really hard to measure.

Obviously, this is going to only really apply to super complex webpages.
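
To put rough numbers on the "fill in while the other thread is stalled" idea, here is a minimal sketch of a toy model. It assumes two identical threads on one core, each busy for a fixed fraction of cycles, with the sibling able to use every stalled cycle at no cost. The function name `smt_speedup` and the stall fractions are made up for illustration; real cores share caches and execution ports, so actual gains will differ.

```python
def smt_speedup(stall_fraction: float) -> float:
    """Throughput of two identical threads sharing a core vs. one thread alone.

    Toy assumption: each thread keeps the core busy for (1 - stall_fraction)
    of the time, and the sibling thread can use every stalled cycle for free.
    """
    if not 0.0 <= stall_fraction < 1.0:
        raise ValueError("stall_fraction must be in [0, 1)")
    busy = 1.0 - stall_fraction
    single = busy                  # useful work per cycle with one thread
    shared = min(1.0, 2.0 * busy)  # two threads cannot exceed one full core
    return shared / single


# Hypothetical stall fractions, chosen only to show the shape of the curve.
for s in (0.0, 0.2, 0.5):
    print(f"stalled {s:.0%} of the time: x{smt_speedup(s):.2f}")
```

Under these assumptions a thread that never stalls gains nothing from a sibling, one that stalls 20% of the time gains about 25%, and only a thread stalled half the time would see a doubling.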
 

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
It would be really hard to measure. HT works by assigning more work when a thread becomes stalled (usually loading stuff from memory). So for highly parallel tasks involving lots of memory (video encoding) there are clear benefits. On the other hand, a lot of localized memory parallel work won't see near the benefit and in some cases may even see performance losses (games).
WOW, how do people even come up with this stuff?
HT doubles your throughput (speed) as long as there is no thread that uses more than half the instructions available on a core.
Here, compared to the G4400 it's 97% faster in multicore (~200 MHz difference)

In games the difference can be even more spectacular, depending on how badly the game is made.
Here the game (BF1) completely kills the G4500 because the game doesn't adapt to core count, so HT TRIPLES the performance from 30 to 90 FPS...

The 25-30% you get in video transcoding /3d rendering is actually the lowest possible benefit.
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
WOW, how do people even come up with this stuff?
HT doubles your throughput (speed) as long as there is no thread that uses more than half the instructions available on a core.
Here, compared to the G4400 it's 97% faster in multicore (~200 MHz difference)

In games the difference can be even more spectacular, depending on how badly the game is made.
Here the game (BF1) completely kills the G4500 because the game doesn't adapt to core count, so HT TRIPLES the performance from 30 to 90 FPS...

The 25-30% you get in video transcoding /3d rendering is actually the lowest possible benefit.

Just read up on how hyperthreading is implemented.

Here you go, I'll wait.
https://en.wikipedia.org/wiki/Hyper-threading
https://software.intel.com/en-us/ar...per-threading-technology-with-an-application/

As I said before, how effective hyperthreading is will generally depend on how localized things like memory access are. That there are some games that do a lot of memory thrashing and therefore can benefit heavily from hyper threading does not surprise me or even negate what I said about hyper threading. However, generally, games tend to have well optimized memory access patterns (for performance reasons). Further, games tend to be heavy on floating point math, so they are using the FPU heavily (which also does not allow for much parallelism).

I come up with this stuff from reading the technical manuals, documents, and reviews published by AnandTech, Wikipedia, Intel, and others. It is part of my job to know this kind of stuff, and it is interesting to me.
 

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
When execution resources would not be used by the current task in a processor without hyper-threading, and especially when the processor is stalled, a hyper-threading equipped processor can use those execution resources to execute another scheduled task. (The processor may stall due to a cache miss, branch misprediction, or data dependency.)[citation needed]

This technology is transparent to operating systems and programs. The minimum that is required to take advantage of hyper-threading is symmetric multiprocessing (SMP) support in the operating system, as the logical processors appear as standard separate processors.
Especially does not mean exclusively...
It's transparent to operating systems and programs, which means that it always happens and not only when one thread is stalled; that's just when the logical thread has access to the most resources.

Measured performance on the Intel® Xeon® processor MP with Hyper-Threading Technology shows performance gains of up to 30% on common server application benchmarks for this technology¹.
Server applications have little if anything to do with common everyday usage.


The gain from HT depends only on the amount of instructions that the threads use.
 

mv2devnull

Golden Member
Apr 13, 2010
1,503
145
106
The question is how one benchmarks a browser. Once you decide that, the HT part is (almost) trivial: simply run the benchmark with HT enabled and again with HT disabled.
 

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
The question is how one benchmarks a browser. Once you decide that, the HT part is (almost) trivial: simply run the benchmark with HT enabled and again with HT disabled.
Set up the browser to reopen all tabs on startup.
Load up a set amount of decently heavy webpages.
Restart the browser.
Count how much time is needed for the CPU to return to normal utilization.
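
The last step above could be automated along these lines. This is only a sketch: `seconds_until_idle`, the 10% threshold, and the "5 calm samples in a row" rule are assumptions, not an established benchmark. The CPU reading comes from whatever sampler you pass in, and the clock and sleep functions are injectable so the logic can be checked without waiting.

```python
import time
from typing import Callable


def seconds_until_idle(sample_cpu: Callable[[], float],
                       idle_below: float = 10.0,
                       stable_samples: int = 5,
                       interval: float = 1.0,
                       timeout: float = 300.0,
                       clock: Callable[[], float] = time.monotonic,
                       sleep: Callable[[float], None] = time.sleep) -> float:
    """Return seconds from the first sample until CPU load settles.

    "Settled" (an assumed definition) means `stable_samples` consecutive
    readings below `idle_below` percent; the returned time is measured up
    to the first reading of that calm streak.
    """
    start = clock()
    calm = 0
    calm_since = start
    while clock() - start < timeout:
        if sample_cpu() < idle_below:
            if calm == 0:
                calm_since = clock()  # a new calm streak begins here
            calm += 1
            if calm >= stable_samples:
                return calm_since - start
        else:
            calm = 0                  # a busy reading breaks the streak
        sleep(interval)
    raise TimeoutError("CPU never settled below the idle threshold")
```

To use it, start the call right after relaunching the browser and pass a sampler that returns total CPU utilization in percent (for example `psutil.cpu_percent` from the third-party psutil package). Run it a few times with HT enabled and a few times with HT disabled, then compare the averages.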
 

Merad

Platinum Member
May 31, 2010
2,586
19
81
WOW, how do people even come up with this stuff?
HT doubles your throughput (speed) as long as there is no thread that uses more than half the instructions available on a core.

Dude(tte), no offense but your second sentence in this quote is basically complete nonsense, along with several other statements you've made in this thread. You're obviously very excited about computers and performance benchmarking and so on, but on a technical level you really don't know what you're talking about here.
 
Reactions: VirtualLarry

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
Dude(tte), no offense but your second sentence in this quote is basically complete nonsense, along with several other statements you've made in this thread. You're obviously very excited about computers and performance benchmarking and so on, but on a technical level you really don't know what you're talking about here.
https://web.archive.org/web/2015112...insights-to-intel-hyper-threading-technology/
The execution pipeline of processors based on Intel® Core™ microarchitecture is four instructions wide, meaning that it can execute up to four micro-operations per clock cycle. As shown in Figure 3, however, the software thread being executed often does not have four instructions eligible for simultaneous execution. Common reasons for fewer than four instructions per clock being retired include dependencies between the output of one instruction and the input of another, as well as latency waiting for data to be fetched from cache or memory.

Intel HT Technology improves performance through increased instruction level parallelism by having two threads with independent instruction streams, eliminating data dependencies between threads and increasing utilization of the available execution units. This effect typically increases the number of instructions executed in a given amount of time within a core, as shown in Figure 3. The impact of this greater efficiency is experienced by users as higher throughput (since more work gets completed per clock cycle) and higher performance per watt (since fewer idle execution units consume power without contributing to performance). In addition, when one thread has a cache miss, branch mispredict, or any other pipeline stall, the other thread continues processing instructions at nearly the same rate as a single thread running on the core.
four instructions wide
the software thread being executed often does not have four instructions eligible for simultaneous execution
It might only have two or maybe even just one
This effect typically increases the number of instructions executed in a given amount of time within a core
It might run a second thread that also only has two, or maybe even just one, instruction eligible for simultaneous execution.
In this case you would have doubled your throughput.
Cache misses and so on come in addition to that.


If I don't know what I'm talking about then neither do Sr. Software Engineers working at Intel...
Garrett Drysdale is a Sr. Software Performance Engineer for Intel.
Antonio C. Valles is a Senior Software Engineer at Intel
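
The issue-width argument above can be written down as a toy calculation. This sketch assumes a core with four interchangeable issue slots and two identical threads that each have a fixed number of eligible instructions per cycle. The function name `smt_gain` is made up for illustration, and the model deliberately ignores that real execution ports are specialised by instruction type.

```python
ISSUE_WIDTH = 4  # "four instructions wide", per the quoted Intel article


def smt_gain(eligible_per_cycle: int, width: int = ISSUE_WIDTH) -> float:
    """Throughput of two identical threads on one core vs. one thread alone.

    Toy assumption: any instruction can use any of the `width` issue slots.
    """
    if eligible_per_cycle < 1:
        raise ValueError("a thread needs at least one eligible instruction")
    alone = min(width, eligible_per_cycle)
    together = min(width, 2 * eligible_per_cycle)
    return together / alone


for k in range(1, ISSUE_WIDTH + 1):
    print(f"{k} eligible instruction(s) per thread: x{smt_gain(k):.2f}")
```

In this simplified model the gain is 2x only while each thread fills at most half the slots, and it drops to 1x once a single thread can fill all four.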
 

VirtualLarry

No Lifer
Aug 25, 2001
56,447
10,117
126
No offense, @TheELF, but that glosses over a lot of specifics. The pipeline on Core is not "four wide" for any kind of instruction; there are separate execution pipelines for different categories of instructions.

The only way to achieve maximal throughput is if, for example, you have one thread doing integer ops and one thread doing FP ops.
 
Reactions: Drazick

TheELF

Diamond Member
Dec 22, 2012
3,991
744
126
No offense, @TheELF, but that glosses over a lot of specifics. The pipeline on Core is not "four wide" for any kind of instruction; there are separate execution pipelines for different categories of instructions.

The only way to achieve maximal throughput is if, for example, you have one thread doing integer ops and one thread doing FP ops.
So how does that change anything? Who talked about maximum throughput until now?
If you have software that only uses a few instructions per thread, it's possible for it to get up to twice the throughput with HTT. Nobody is saying that the hardware magically gets more throughput; it's all about how the software runs. You could have one thread that only does one read per cycle and one thread that only does one write per cycle. Those two could run on the same core with HTT at the same speed they would on a normal core. They wouldn't need to wait for stalls, and they wouldn't get just a 30% improvement.
 