Recent content by bronxzv

  1. B

    Intel Skylake / Kaby Lake

    What do you mean by "fully enabled": two 512-bit FMA units per core, i.e. twice the throughput with AVX-512 code vs. AVX2 code? And what is your source for this?
  2. B

    3DPM: Can we fix it? Yes we can!

    I don't see how you can reach this conclusion (log N instead of N) based on the spherical vector distribution method, which has an extra inner loop for each step with a hard-to-predict termination branch. IIRC the example discussed here the other day...
  3. B

    3DPM: Can we fix it? Yes we can!

    I don't see what this has to do with the problem at hand, whose complexity is O(M*N) with M particles and N steps, for both the original and your proposal. Maybe some pseudo-code will convince me that I'm wrong?
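    To make the O(M*N) claim concrete, here is a minimal sketch of a particle random walk. It is not the 3DPM source: the names `Vec3`, `random_unit_step` and `simulate`, the use of `rand()`, and the particular way of picking a random direction are all assumptions made for illustration. The only point is that the step counter ends at exactly M*N whatever the per-step cost is.

    ```c
    #include <assert.h>
    #include <math.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Hypothetical particle position; not taken from the 3DPM code. */
    typedef struct { float x, y, z; } Vec3;

    /* Uniform float in [0, 1]; rand() is only a placeholder generator. */
    static float uniform01(void)
    {
        return (float)rand() / (float)RAND_MAX;
    }

    /* One unit-length step in a random direction (assumed scheme). */
    static Vec3 random_unit_step(void)
    {
        const float two_pi = 6.28318530718f;
        float z   = 2.0f * uniform01() - 1.0f; /* height, uniform in [-1, 1] */
        float phi = two_pi * uniform01();      /* azimuth */
        float r   = sqrtf(1.0f - z * z);       /* circle radius at height z */
        Vec3 s = { r * cosf(phi), r * sinf(phi), z };
        return s;
    }

    /* Moves m particles n steps each, starting from the origin.
       Returns the number of steps executed, which is always m * n. */
    size_t simulate(Vec3 *pos, size_t m, size_t n)
    {
        size_t steps = 0;
        assert(pos != NULL || m == 0);
        for (size_t p = 0; p < m; p++) {        /* M particles */
            Vec3 q = { 0.0f, 0.0f, 0.0f };
            for (size_t s = 0; s < n; s++) {    /* N steps per particle */
                Vec3 d = random_unit_step();
                q.x += d.x; q.y += d.y; q.z += d.z;
                steps++;
            }
            pos[p] = q;
        }
        return steps;
    }
    ```

    A faster step changes the constant factor only; the count stays M*N unless particles or steps are actually removed.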
  4. B

    3DPM: Can we fix it? Yes we can!

    Neat! Thank you. A good example of pseudoscientific arguments IMHO is your analogy with "DFT vs FFT" computational complexity: it makes it sound as if your proposal is somewhat less "brute force" than the original. Your solution actually looks more complex than the original and less amenable...
  5. B

    3DPM: Can we fix it? Yes we can!

    note that I wrote "pseudo scientific" not "scientific"
  6. B

    3DPM: Can we fix it? Yes we can!

    I don't get it; I still can't tell whether you have more, fewer, or the same number of particles as the original, i.e. the number of distinct positions at the end of the simulation. Well, this is the common word for random walks and more generally for all numerical simulations; anyway, the one used by borandi in the...
  7. B

    3DPM: Can we fix it? Yes we can!

    Too bad; that explains why it wasn't so easy to try to fix the supposed cache issues I raised (hint: the out-of-this-world speedup with HT enabled). Anyway, thank you for your answer. Too bad I don't have free time these days to post a fully vectorized solution.
  8. B

    3DPM: Can we fix it? Yes we can!

    It's not clear (IMHO) from your description what you are aiming at: 1) processing fewer particles, 2) processing fewer steps per particle, 3) faster steps, or 4) something else. Some pseudo-code will probably help to understand what you are trying to explain with quite strange and (to me) irrelevant analogies.
  9. B

    3DPM: Can we fix it? Yes we can!

    Huh? I have dozens of loops perfectly vectorized with transcendentals in my code (polynomial approximations, and if-conversion when there is more than one branch). For an off-the-shelf solution, look at a concrete example here using SVML ...
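    As a rough illustration of the two techniques named above (this is neither the poster's code nor the example the post links to, and not how SVML works internally), the sketch below approximates sine on [-pi, pi] with a degree-7 Taylor polynomial and replaces the range-folding branches with selects. The names `poly_sin` and `sin_array` are made up, and a production kernel would more likely use minimax coefficients and full range reduction.

    ```c
    #include <math.h>
    #include <stddef.h>

    /* Approximate sin(x) for x in [-pi, pi]. Assumption: plain Taylor
       coefficients, good to roughly 2e-4 after folding to [-pi/2, pi/2]. */
    static inline float poly_sin(float x)
    {
        const float pi = 3.14159265f, half_pi = 1.57079633f;

        /* If-conversion: both folds are written as selects, so the loop
           body has no control flow and maps onto vector blends. */
        x = (x >  half_pi) ?  pi - x : x;   /* sin(pi - x)  == sin(x) */
        x = (x < -half_pi) ? -pi - x : x;   /* sin(-pi - x) == sin(x) */

        /* Horner form of x - x^3/6 + x^5/120 - x^7/5040. */
        float x2 = x * x;
        float p  = -1.0f / 5040.0f;
        p = p * x2 + 1.0f / 120.0f;
        p = p * x2 - 1.0f / 6.0f;
        p = p * x2 + 1.0f;
        return p * x;
    }

    /* Straight-line loop over contiguous arrays, a shape that
       auto-vectorizers typically handle. */
    void sin_array(const float *restrict x, float *restrict y, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = poly_sin(x[i]);
    }
    ```

    Whether a given compiler actually vectorizes this loop depends on its version and flags, so the generated assembly is worth checking.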
  10. B

    3DPM: Can we fix it? Yes we can!

    I don't see the point of unrolling by hand; you seem to have the same strange idea as DrMrLordX on this. Just write x[i] = r[i]*cos(valpha[i]); in a loop: it's easy to maintain and vectorizes well.
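    Spelled out as a complete function, the one-line loop could look like the sketch below. The function name `polar_to_x`, the `restrict` qualifiers and the use of `cosf` for single precision are assumptions added here; only the loop body comes from the post.

    ```c
    #include <math.h>
    #include <stddef.h>

    /* x[i] = r[i] * cos(valpha[i]) over separate contiguous arrays.
       restrict promises the compiler the arrays do not overlap, which
       removes one common obstacle to auto-vectorization. */
    void polar_to_x(const float *restrict r, const float *restrict valpha,
                    float *restrict x, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            x[i] = r[i] * cosf(valpha[i]);
    }
    ```

    Whether the call to `cosf` gets vectorized depends on the compiler and on a vector math library being available, so this is worth confirming in the assembly output rather than assuming.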
  11. B

    3DPM: Can we fix it? Yes we can!

    It is used automatically by the ICC vectorizer once you have a data layout amenable to vectorization:

        void SinTest (const float * __restrict x, float * __restrict y, size_t size)
        {
          __assume_aligned(x,32); __assume_aligned(y,32);
          for (size_t i=0; i<size; i++) y[i] = sin(x[i]);
        }
        ...
  12. B

    3DPM: Can we fix it? Yes we can!

    Are you sure SVML is used for the trig funcs? You should see stuff such as "call __svml_cosf8" in the ASM dump.