Our beloved Perfomatic is undergoing a schema change. The initial design was put together when the Talos project was just getting off the ground. There’s a very large difference between 3 Talos boxes reporting and 90, and it shows in our 60 GB database. It has become so bloated, and anything interesting involves such painful joins between giant tables, that we mostly just leave it alone to run. We’d like to branch out the graph server work to include dashboards, better statistical analysis, and administrative features (removing corrupted data, etc.), but everything ends up being hampered by the database.
With this in mind the new schema was designed. It’s broken up into more tables and will greatly reduce the redundancy found in the old schema. It should also make it dead simple to do things like “what are the last 10 data points for test X on branch Y”.
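As a rough illustration of that kind of query, here is a minimal sketch in Python using an in-memory SQLite database. The table and column names (`tests`, `branches`, `test_runs`, `date_run`, `value`) are assumptions made up for this example, not the actual new schema, and the sample test and branch names are placeholders.

```python
import sqlite3

# Hypothetical, simplified schema: these table and column names are
# assumptions for illustration only, not the real new Perfomatic schema.
SCHEMA = """
CREATE TABLE tests    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE branches (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE test_runs (
    id        INTEGER PRIMARY KEY,
    test_id   INTEGER REFERENCES tests(id),
    branch_id INTEGER REFERENCES branches(id),
    date_run  INTEGER,  -- Unix timestamp of the run
    value     REAL      -- reported number for the run
);
CREATE INDEX idx_runs ON test_runs (test_id, branch_id, date_run);
"""


def last_points(conn, test, branch, limit=10):
    """Return the most recent (date_run, value) rows, newest first."""
    return conn.execute(
        """
        SELECT r.date_run, r.value
        FROM test_runs r
        JOIN tests t ON t.id = r.test_id
        JOIN branches b ON b.id = r.branch_id
        WHERE t.name = ? AND b.name = ?
        ORDER BY r.date_run DESC
        LIMIT ?
        """,
        (test, branch, limit),
    ).fetchall()


def demo():
    """Load 25 made-up runs and fetch the last 10 for one test and branch."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO tests (id, name) VALUES (1, 'test-x')")
    conn.execute("INSERT INTO branches (id, name) VALUES (1, 'branch-y')")
    conn.executemany(
        "INSERT INTO test_runs (test_id, branch_id, date_run, value) "
        "VALUES (1, 1, ?, ?)",
        [(1000 + i, float(i)) for i in range(25)],
    )
    return last_points(conn, "test-x", "branch-y")
```

With small lookup tables for tests and branches and one narrow table of runs, the question becomes two cheap joins and an indexed sort rather than a join between giant tables.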
We are starting to put all the pieces together to make use of the new schema, but there are some drawbacks and open questions:
- The format of links to graphs is changing: graph links that work on the old graph server will not work on the new one. What does this mean for existing links in bugs?
- How much, if any, data can we migrate from the old graph server to the new? The format within the database has changed significantly, and the old data will require a large amount of massaging to fit the new schema. Is this effort worth it?
- If we are to migrate data, how long can we be without it while it gets pulled out of the old db, altered and re-assembled, and then pushed into the new?
Bug 472176 - Migration procedure has been filed to work through the issues with switching from the old database to the new. What I really need is insight from people who work with a graph server on a daily basis. What data is most important to migrate? If we were without historic data for a few days or a week while migration happened in the background (you would still have the currently reported numbers), would that be okay? Would it be acceptable to migrate no data and just run the two setups side by side until we felt there was enough data in the new one, at which point the old setup would stop accepting new data and be kept alive only for looking at old numbers?
I’d love to get feedback on these questions during the weekly graph server meeting (Mondays, 11am PST); we’ll be discussing migration for the next few meetings as we get closer to being able to make the switch from the old schema to the new. If you can’t make the meeting time, just join #perfomatic and talk with the graph server team directly, or comment in the migration bug.