thilanliyan
Lifer
- Jun 21, 2005
Hmm, not sure if Fury X's are optimized for this or perhaps the pool is being overloaded. Getting an average of 57.5 Mh/s at http://eth.nanopool.org. Stock clocks.
How many Furys is that for?
Your firewall may be blocking the traffic?
There are some switches you can try.
First, make sure you add this; it will pick up new work faster: --farm-recheck 100 (default is 500, in ms).
You can also try to play around with the global and local work values.
--cl-global-work 4096 is the default; try 8192 or 16384 as well. I use 16384 personally.
--cl-local-work 64 is the default; 128 and 256 can be tried. I read some dev chatter that 64 was the most efficient for AMD because their wavefront is 64 threads. Nvidia should use 32.
Please note, the gains here are not as big as the ones you could get from tweaking for BTC mining.
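For anyone scripting their rigs, here is a minimal sketch of putting those switches together into one command line. The helper name `build_ethminer_args` and the `0xyour_address` placeholder are made up for illustration; the flags themselves (`-F`, `-G`, `--farm-recheck`, `--cl-global-work`, `--cl-local-work`) are only the ones quoted in this thread, and the defaults are the ones stated above, not checked against any particular ethminer build.

```python
def build_ethminer_args(pool_url, global_work=4096, local_work=64, recheck_ms=500):
    """Assemble an ethminer argument list from the switches quoted in this thread.

    Defaults mirror the values the post above calls the defaults; they are
    assumptions here, not read from the miner itself.
    """
    return [
        "ethminer",
        "-F", pool_url,                        # pool (farm) URL
        "-G",                                  # mine on the GPU via OpenCL
        "--farm-recheck", str(recheck_ms),     # how often to check for new work, in ms
        "--cl-global-work", str(global_work),
        "--cl-local-work", str(local_work),
    ]


# Placeholder address: substitute your own wallet and worker name.
args = build_ethminer_args(
    "http://eth1.nanopool.org:8888/0xyour_address/miner1",
    global_work=16384,
    recheck_ms=100,
)
print(" ".join(args))
```

Changing one value at a time and letting the reported rate settle before comparing makes it easier to tell which switch actually helped.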
I mined some a few months ago, and a single 980 Ti was getting about 20 MH/s after overclocking, and now I can't get it to break 8. What gives?
As these currencies age it becomes more difficult to calculate.
That has absolutely nothing to do with hashrate
Where are you reading your speeds from? Maybe try DDU and reinstall the latest drivers? Or perhaps try an older CUDA driver?
I'll be setting up my 970 soon so maybe I'll be of more help then.
I did that. I actually have 2 rigs, one with 780s and one with 980 Tis, and both are mining at about 1/3 of their potential speed. I think it must have something to do with Windows 10? I could install Linux and mine that way and I doubt I'd have issues, but that's a bit of a hassle. I may go ahead and do that tomorrow night anyway.
edit: reading speeds from both ethminer and ethpool.
2. If you're using my directions, then you're going to get paid via pool 2x/day. You can see how much you're making via: http://eth.nanopool.org/account/0xyour_address
Awesome, thanks for the advice. I just added my MSI 390 and it's hitting around 20.5 MH/s at 1075 MHz core with a small undervolt. I still have an Asus mini GeForce 970 to add, so hopefully with these 4 cards and your tweaks I'll be at 100 MH/s. If not, I'll throw a few Kaveri APUs at it.
I also have an old 2GB 5870 sitting around. Can you still mine on these older pre-GCN cards?
Some super rough napkin math suggests the theoretical limit for Fury is around 62.5 MH/s, based on memory accesses being the limiting factor.
The math is that one hash takes 2^16 bits (8 KB) of memory accesses in 128-byte chunks, so with 512 GB/s of memory bandwidth we arrive at around 62.5 MH/s theoretical max.
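A rough check of that napkin math, as a sketch only. It assumes the bandwidth figure is 512 gigabytes per second and that 2^16 bits per hash means 64 accesses of 128 bytes each; the constant and function names are made up for illustration. With those inputs the ceiling works out to about 62.5 MH/s, and it ignores everything except raw memory bandwidth.

```python
ACCESSES_PER_HASH = 64           # assumption: 2**16 bits / (128 bytes * 8 bits)
BYTES_PER_ACCESS = 128           # chunk size quoted in the post
BANDWIDTH_BYTES_PER_S = 512e9    # assumption: 512 GB/s (gigabytes, not gigabits)


def max_hashrate_mhs(bandwidth_bytes_per_s, bytes_per_hash):
    """Upper bound on hashrate in MH/s if memory bandwidth is the only limit."""
    return bandwidth_bytes_per_s / bytes_per_hash / 1e6


bytes_per_hash = ACCESSES_PER_HASH * BYTES_PER_ACCESS   # 8192 bytes = 2**16 bits
print(bytes_per_hash, max_hashrate_mhs(BANDWIDTH_BYTES_PER_S, bytes_per_hash))
```

Plugging in a different card's bandwidth gives the same kind of ceiling for it, which is handy for judging how far a measured rate is from the memory limit.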
Looking at the OpenCL kernels suggests the code is pretty poor, so I'd expect fairly large optimizations to be possible for Fury (X).
2GB Pitcairns and Tahiti LEs do work, so it may just be that a 5870 is too old.
FYI my 2GB 5870 fails to start due to "insufficient memory". Anyone else here with a 2GB card get theirs to work? This card used to kill it at BTC mining, would be a shame not to use it.
FYI I played around with the switches you provided. I wasn't able to hit your speeds but got close. I'm hitting around 60Mh/s with two Fury X's now.
The following gave me the best results so far.
"ethminer.exe -F http://eth1.nanopool.org:8888/"address"/miner1 -G -t 2 --cl-global-work 32768 --farm-recheck 100 --cl-local-work 32"
More interesting is that my MSI 390 at -25 mV and -10 board power, running 1100 MHz core / default memory, is giving me almost 30 MH/s, so approximately the same speed per card as the Fury X's. And this is before trying the above tweaks! I'm guessing this program was optimized for Hawaii and the Fury has a bunch of untapped potential left, given how many more shaders and how much more memory bandwidth it has at its disposal.
I'm still puzzled how you're getting 32MH/s though. Would you mind sharing your command line arguments exactly?
The fundamental unit of work on AMD GPUs is called a wavefront. Each wavefront consists of 64 work-items; thus, the optimal local work size is an integer multiple of 64 (specifically 64, 128, 192, or 256) work-items per work-group.
You are going to want a local work size of 64, 128, 192, or 256 for AMD. 64 is one wavefront. It's the smallest amount of work that can fill a CU. 32 is going to only use about half a CU thus wasting the other half. Nvidia should use 32 as a min because that is the size of their warps. See below link.
http://developer.amd.com/tools-and-...ncl-optimization-guide/#50401334_pgfId-458820
After more experimentation I'm using
--cl-global-work 16384
--cl-local-work 256
The OpenCL Optimization Guide says to use the largest global possible. I don't think you want to get too crazy, but 4096 is small. The global size is the total size of the problem array. If you want to find blocks you want to have more data to work with. Local work you have to experiment with. I believe it's better to stack work up so that your CUs are always being utilized. 256 would be 4 wavefronts of work. I think this ties in with what Zagitta said about memory bandwidth.
http://haifux.org/lectures/267/OpenCL_Dos_and_Donts.pdf
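The wavefront reasoning above can be sketched as a couple of small helpers. This is only an illustration: the function names are made up, the model just counts how many SIMD lanes a work-group fills, and it treats --cl-global-work as the total number of work-items the way the post describes it, which may not be how a given miner build interprets the flag.

```python
import math

AMD_WAVEFRONT = 64   # work-items per wavefront, per the AMD guide quoted above
NVIDIA_WARP = 32     # threads per warp


def lane_utilization(local_work, simd_width):
    """Fraction of SIMD lanes doing useful work for one work-group."""
    waves = math.ceil(local_work / simd_width)   # wavefronts/warps needed
    return local_work / (waves * simd_width)


def work_groups(global_work, local_work):
    """Number of work-groups if global_work is the total work-item count."""
    if global_work % local_work:
        raise ValueError("global work size should be a multiple of local work size")
    return global_work // local_work


for local in (32, 64, 128, 192, 256):
    print(local, lane_utilization(local, AMD_WAVEFRONT))

print(work_groups(16384, 256))
```

In this model a local size of 32 fills half of a 64-wide wavefront, while 64, 128, 192 and 256 all fill their wavefronts completely, so choosing between those still comes down to experimenting on the actual card.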
Thanks for this. I am now getting ~24 Mh/sec on my 290x, running at ~900Mhz and undervolted by 50mV. System power draw is ~215W, so I'll assume wall-power for 290x is close to 170W.
With the defaults I was getting 26.4 Mh/s on my unlocked Fury. With my tweaks I am @ 35.8 and gaining for the last 6 hours.
2GB Pitcairns and Tahiti LEs do work, so it may just be that a 5870 is too old.
I run Windows so I've been using the miner recommended by coinotron for Ethereum:
Thanks. What Ethereum app is working with 2GB Pitcairns and Tahitis?