Also, I've been reminiscing lately with Adam and the Ants, a beloved New Romantic post-punk band of my teenage years. They only released three full albums and an EP but they were a lot of fun. Adam Ant also went solo later but truth be told, I prefer his earlier stuff. Be that as it may...
Find out how test data is populated in your system. How could it be improved? You can watch Techniques for Generating and Managing Test Data by Omose Ogala for some ideas to get you started.
This is an area that I spend a lot of time dealing with. I think I'm on safe ground describing these details, as it's not a huge secret what Socialtext uses. First, it helps to understand what Socialtext is in a general sense and then in a more abstract sense. In its most basic form, Socialtext is a collaboration platform. It makes it possible for people to work together on a lot of different things. At its core is a wiki. A wiki is a way of editing text rapidly. To that end, Socialtext uses an editor called CKEditor, which is open source and used in a variety of applications. On top of that, we leverage a lot of details about the documents created with the wiki (text and spreadsheet) so that we can share that information. That information is displayed in a variety of ways, most notably through assignable modules we call widgets. Those widgets can be created and combined in a variety of ways to make Dashboards at varying levels (personal, group, or account level). Those dashboard widgets can contain the content of one or more documents, or they can be composed of metadata from the documents (such as who uses what, who has commented on what, who has revised something in the system, etc.).
At the basic level, everything is associated with an account, so often the most effective method to load test data (as well as to protect it and to use it from a known starting point) is to import an account that is already set up with the information we want to use. I actually have several of these and each is set up to help me deal with a variety of issues. I have accounts created for Localization, Responsive Design, Accessibility & Inclusive Design, and Large Customer simulation. Additionally, I have data that deals with the components of our system so that I don't have to constantly reconfigure those elements (that includes text examples, HTML and Markup formatting, videos, user details, language preferences, etc.). The key here is that I try to limit the use of test data that tries to be all things to all circumstances. While it can be helpful to include a lot of details in one place, it can also complicate the situation in that there is "too much of a good thing".
Another way that I keep test data useful and fresh is to determine which methods best generate the data I use and help me keep track of everything in a noticeable way. One of those methods is keeping both general and specific scenarios. When I test with large numbers of users, I generate that data with a tool called "Fake Name Generator". This has been my go-to tool for more than a decade. It provides individual details I can call up one at a time, and it also offers bulk downloads with tens of thousands of users at a time (the system limits you to 100,000 users for any given request, but over multiple requests it is possible to build up several hundred thousand or even millions of users).
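To make the "several requests over time" idea concrete, here is a minimal sketch of how separate bulk downloads could be merged into one user set without duplicates. This is not Socialtext tooling; the function name `merge_user_batches` and the column names (`GivenName`, `Surname`, `EmailAddress`) are assumptions for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io


def merge_user_batches(batches, key="EmailAddress"):
    """Merge several CSV batches, keeping the first row seen for each key.

    `batches` is an iterable of open text files (or file-like objects).
    The key column name is an assumption; change it to match your export.
    """
    seen = set()
    merged = []
    for batch in batches:
        for row in csv.DictReader(batch):
            # Compare keys case-insensitively so BEN@... and ben@... collide.
            identity = row[key].strip().lower()
            if identity in seen:
                continue
            seen.add(identity)
            merged.append(row)
    return merged


# Two tiny in-memory "downloads" standing in for real bulk files.
batch_one = io.StringIO(
    "GivenName,Surname,EmailAddress\n"
    "Ada,Stone,ada@example.com\n"
    "Ben,Hale,ben@example.com\n"
)
batch_two = io.StringIO(
    "GivenName,Surname,EmailAddress\n"
    "Ben,Hale,BEN@example.com\n"
    "Cara,Voss,cara@example.com\n"
)

users = merge_user_batches([batch_one, batch_two])
```

With real files you would pass open file handles instead of `io.StringIO` objects. Deduplicating on the email column is a design choice on my part: whatever field your system treats as the unique login is the one to key on.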
Still, there are times that I want to look at the way that data relates to others in a more personal way. There are several methods for this, but the one I enjoy most is to take my favorite Manga or Anime series, collect the characters, populate dossiers for each member of the cast, and then create accounts for those people. The reason? I know those stories, so if I see people who shouldn't be "mixing", I can immediately identify that. The downside is that not every member of my team is familiar with these stories, so what's obvious to me may invite a variety of questions (that and the fact that my user database ends up being overwhelmingly Japanese names rendered in Romaji ;) ).
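The "shouldn't be mixing" check I do by eye could also be written down so teammates who don't know the stories can run it. What follows is only a sketch under assumptions: the persona names, series labels, group names, and the `find_mixed_groups` helper are all hypothetical placeholders, not real data or a real API.

```python
from collections import defaultdict


def find_mixed_groups(persona_series, group_members):
    """Return groups whose members come from more than one series.

    persona_series: maps each persona name to the series it belongs to.
    group_members:  maps each group name to a list of persona names.
    """
    mixed = {}
    for group, members in group_members.items():
        by_series = defaultdict(list)
        for member in members:
            by_series[persona_series[member]].append(member)
        # More than one series in a single group means personas are "mixing".
        if len(by_series) > 1:
            mixed[group] = dict(by_series)
    return mixed


# Hypothetical personas tagged with the story they come from.
persona_series = {
    "Persona A1": "Series A",
    "Persona A2": "Series A",
    "Persona B1": "Series B",
}

# Hypothetical group membership as it might be read back from the system.
group_members = {
    "Group One": ["Persona A1", "Persona A2"],
    "Group Two": ["Persona A2", "Persona B1"],
}

mixed = find_mixed_groups(persona_series, group_members)
```

Here only "Group Two" is reported, because it contains personas from both series. The point is that the knowledge of who belongs together lives in a shared mapping rather than only in my head.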
So what can we do better? I think there's a way that we can make data that is self-referential, less niche-specific, and more easily relatable to a broader audience. I think Fake Name Generator can help with that, but it also requires a bit of pruning from time to time to make sure that the richness of the data doesn't itself cause problems or allow bugs to "hide in plain sight". To that end, meaningful personas that the whole team can understand, vote on, and share would be a major plus.