I have a table say:
CREATE TABLE "DataNode" (
CREATE TABLE "DataNode_Revisions" (
So I have these two tables. I do all my reads from "DataNode" and when a change occurs I write out the current entry to "DataNode_Revisions" and then modify my existing "DataNode" record. Makes sense?
Is this the best way to go about it? I can already tell I am going to run into problems when the Schema changes. I am not seeing a better alternative but if there is one please let me know! I assume keeping this all in one table would result in massive performance losses woudl it not? I mean I would be more then quadrupling the number of records and there is already quite a few. I think Drupal stores node revisions like this, and I am curious how they do not suffer performance problems from it.
"DataNode" is constantly being read by a lot of users. However, very few writes ever occur. "DataNode_Revisions" is only read from on occasion. I am just worried about maintaining so many tables. "DataNode" is one of ~25 tables very similar to this one.
Whether there will be any performance implications from storing the old rows in the DataNode table depends on how the DataNode rows are accessed. If the reads are all single-row lookups for the current row, the number of rows in the table is relatively immaterial-- it's not going to take any more work to find the current row for a particular ID than it would to get the row for that ID from the current DataNode table (I'm assuming here that ID is the key for the table). On the other hand, if you have a number of queries that are doing table scans of the DataNode table, then quadrupling the number of rows will increase the time required to run those queries.
If you want to go down the path of putting the historical rows in the DataNode table, you would probably want to add an EXPIRATION_DATE column that is NULL for the current row and populated for the expired rows. You could then create a function-based index based on the EXPIRATION_DATE that would have data for only the current rows, i.e.
CREATE INDEX idx_current_ids
ON DataNode( (CASE WHEN expiration_date IS NULL THEN id ELSE null END) );
which would be used in a query like
WHERE (CASE WHEN expiration_date IS NULL THEN id ELSE null END) = <<some id>>
Obviously, you'd probably want to create a view that has this condition rather than rewriting it every time you need the current row, i.e.
CREATE VIEW CurrentDataNode
SELECT (CASE WHEN expiration_date IS NULL THEN id ELSE null END) id,
WHERE id = <<some value>>