Yeah, they're "units", but it's like saying each pipeline within an x86 core is a core in and of itself.
Here is a shot of GF100:
That's 32 shaders per cluster and 512 shaders total, but there are more units inside each cluster that do other things. Shaders are only one part of a GPU's performance; you also have ROPs, TMUs and other things contributing.
The GTX 480 had one cluster disabled, for 480 "shader cores" and 15 tess units (poly engines), while the GTX 470 had two clusters disabled, for 448 "shader cores" and 14 tess units (poly engines).
The 580 is a refined core but the same setup, with all clusters enabled for 512 "shader cores" and 16 tess units (poly engines). Clock for clock and core for core, though, GF100 and GF110 are nearly identical performance-wise, which can be seen clearly when the 560 Ti 448 is benched against a 470.
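
If you want to sanity check the counts above, they all fall out of the same arithmetic: 16 clusters on the full chip (512 / 32), 32 shaders and one tess unit per cluster, minus whatever is disabled. Here's a rough Python sketch of that. The function and constant names are just made up for illustration, and it only models the numbers quoted in this post, not ROPs, TMUs or clocks.

```python
# Rough sketch of the cluster math from the post above.
# Names here are illustrative only, not anything from NVIDIA.
SHADERS_PER_CLUSTER = 32   # "32 shaders per cluster"
TOTAL_CLUSTERS = 16        # 512 shaders total / 32 per cluster
TESS_PER_CLUSTER = 1       # assumed from 16 tess units on the full chip

def enabled_units(disabled_clusters):
    """Return (shader_cores, tess_units) for a chip with some clusters disabled."""
    enabled = TOTAL_CLUSTERS - disabled_clusters
    return enabled * SHADERS_PER_CLUSTER, enabled * TESS_PER_CLUSTER

# Disabled-cluster counts as stated in the post.
cards = {"GTX 480": 1, "GTX 470": 2, "GTX 580": 0}

for name, disabled in cards.items():
    shaders, tess = enabled_units(disabled)
    print(f"{name}: {shaders} shader cores, {tess} tess units")
```

Same chip layout every time, just with a different number of clusters fused off.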