As far as obfuscating sensitive data goes, it's fairly simple with smaller datasets, but once you're dealing with millions of records it can quickly get complicated and time-consuming. Some of the issues with large datasets (see the sketch after this list):
- the amount of time required to obfuscate the data
- exporting only the last 1,000 records can miss objects those records reference (or that reference them)
- locking tables against writes if you're copying data over to a separate table
- conflicts on unique-constraint columns
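To make the unique-constraint and locking points concrete, here's a minimal Python sketch. The `users` table, the `email` column, and SQLite itself are just stand-ins for whatever your schema and database actually are. The idea is deterministic pseudonymization with HMAC, so distinct values stay distinct (unique constraints still hold) and the same input always maps to the same output (references to that value stay consistent across tables), plus committing in small batches so write locks stay short:

```python
# Sketch: deterministic obfuscation of a hypothetical "users" table.
# SQLite is used only so the example is self-contained and runnable.
import hashlib
import hmac
import sqlite3

SECRET_KEY = b"rotate-me-outside-of-version-control"  # placeholder secret

def pseudonymize(value: str, suffix: str = "example.invalid") -> str:
    """Map a sensitive value to a stable, non-reversible placeholder."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:16]}@{suffix}"

def obfuscate_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> None:
    """Rewrite emails in small keyset-paginated batches so no transaction holds locks for long."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, email FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        conn.executemany(
            "UPDATE users SET email = ? WHERE id = ?",
            [(pseudonymize(email), row_id) for row_id, email in rows],
        )
        conn.commit()  # one commit per batch keeps write locks short
        last_id = rows[-1][0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("alice@corp.com",), ("bob@corp.com",)],
    )
    conn.commit()
    obfuscate_in_batches(conn)
    print(conn.execute("SELECT id, email FROM users").fetchall())
```

Even with an approach like this, the sheer row count is what bites you: a batched rewrite over millions of rows still takes real time, which is the first bullet above.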
I know we sometimes end up in situations where we say, "but I need the production data." At that point, I would probably ship a "hotfix" that adds more logging around the affected area of the application to see what makes this particular org/company/user so unique. From there, you should hopefully be able to replicate the problem in your staging environment. If you're seeing the same issues there, try to replicate it locally. If you still can't replicate the issue locally, you could pull the staging database where the issue is being experienced. If pulling that data into your local environment still doesn't reproduce the issue, the difference is probably in the environments themselves (at the infrastructure level, or some kind of environment flag within the code).
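If it helps to picture the "hotfix with extra logging" step, here's a rough sketch. The `Org` dataclass, `process_invoice()`, and the specific fields are all hypothetical; the point is just to log whatever context might make that org unique right before the suspect code path runs:

```python
# Sketch: temporary diagnostic logging around a suspect code path.
# All names here (Org, process_invoice, field names) are illustrative.
import json
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("billing.diagnostics")

@dataclass
class Org:
    id: int
    plan: str
    record_count: int

def log_org_context(org: Org, feature_flags: dict) -> None:
    """Emit the details that might make this org unique."""
    logger.info(
        "invoice-debug context: %s",
        json.dumps(
            {
                "org_id": org.id,
                "plan": org.plan,
                "record_count": org.record_count,
                "flags": feature_flags,
            }
        ),
    )

def process_invoice(org: Org, feature_flags: dict) -> None:
    log_org_context(org, feature_flags)  # temporary: remove once the cause is found
    # ... existing invoice logic would go here ...

if __name__ == "__main__":
    process_invoice(Org(id=42, plan="enterprise", record_count=3_000_000),
                    {"new_tax_engine": True})
```

Once that context is in your logs, you can usually reconstruct the unusual state in staging with synthetic data instead of pulling anything from production.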
So, while this is kind of avoiding the question, I don't think we should ever pull the production database, obfuscated or not.