Yes, I'm aware of it, but unfortunately I don't know what to do about it. I'm not sure exactly what causes it, but my hypothesis is that as you get closer to the mesh, it covers more pixels on screen, i.e. the fragment shader has to run more times to fill each triangle. Since a fur shader renders the model multiple times, it has to fill a lot of pixels at certain angles, especially where there are a lot of overlapping faces (from the camera projection's perspective).
To test this, I tried simply multiplying the vertex position, and indeed, when the mesh is made smaller (so fewer pixels have to be filled), the performance doesn't tank. (Obviously, I can't just multiply the vertex output, since that puts the vertices at the wrong location; it was simply a test.)
As you have discovered, it doesn't seem to matter how complex the fragment shader is. Even when discarding the pixel, performance still tanks. So it seems to have more to do with the number of fragment invocations than with what the fragment shader actually does.
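To put rough numbers on that hypothesis, here is a back-of-the-envelope sketch in Python. It is not taken from the shader itself: the function names, the sphere-shaped mesh, the 60 degree field of view, the 1080-pixel screen height and the shell count of 20 are all assumptions chosen for illustration, and it ignores clipping at the screen edges.

```python
import math

def projected_pixel_area(radius, distance, fov_y_deg=60.0, screen_height_px=1080):
    """Approximate on-screen area in pixels of a sphere-like mesh (assumed shape)."""
    # Pixels per world unit at depth 1 for a perspective camera.
    focal_px = (screen_height_px / 2) / math.tan(math.radians(fov_y_deg) / 2)
    # The projected radius shrinks linearly with distance.
    radius_px = radius * focal_px / distance
    return math.pi * radius_px ** 2

def fragment_invocations(radius, distance, shells, overdraw=1.0):
    """Estimate fragment shader runs per frame for a shell-based fur shader.

    shells   -- how many times the model is rendered (hypothetical value)
    overdraw -- average number of overlapping faces per pixel (hypothetical value)
    """
    return projected_pixel_area(radius, distance) * shells * overdraw

if __name__ == "__main__":
    for distance in (8.0, 4.0, 2.0, 1.0):
        cost = fragment_invocations(radius=1.0, distance=distance, shells=20)
        print(f"distance {distance:>4}: ~{cost:,.0f} fragment invocations")
```

Under these assumptions, halving the camera distance roughly quadruples the estimate, and halving the mesh size cuts it to about a quarter, which fits the scaling test described above. The estimate also grows linearly with the number of shells and with overdraw.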