This issue is a placeholder for further discussion before implementation.
Thought about this more after the recent change that returned a metadata object after an update/put.
The service currently updates records in the metadata table. One approach is to copy metadata to a separate metadata_archive table, then update the record in metadata (all in a transaction).
I think that this is a good idea. To start itemizing changes that would need to take place:
create metadata_archive table
Same schema as metadata_table.py?
We would need a reference table to associate metadata.id with metadata_archive.id, or
Add an additional foreign key metadata_archive.metadata_id that is a foreign key reference to metadata.id
Nothing is ever deleted from the metadata_archive table
new API endpoint 'ws/secure/metadata/history/:id'
get history based on metadata.id
returns array of metadata objects
update PUT request
archive existing metadata record, before updating record. Call metadata_table.get_metadata(id) and update metadata_archive_table.create_metadata(metadata) to "archive" the existing metadata record that is currently being updated by the PUT request.
Schema updates
// already existsCREATE TABLE metadata ( id int NOT NULL, ...metadata PRIMARY KEY (id),);// new table to be createdCREATE TABLE metadata_archive ( id int NOT NULL, metadata_id int NOT NULL, ...metadata PRIMARY KEY (id), FOREIGN KEY (metadata_id) REFERENCES metadata(id));
metadata_archive.metadata_id is a foreign key that references the current "master" copy of the metadata object.
Questions:
would metadata_archive only show archived records, or would it make more sense to have a metadata_history table that keeps track of all "versions" of a metadata entry.
This would make it easier to view all versions of a metadata record at once (compare new vs old)
How do user's archive a record? metadata-valid = false;
The intent is to keep the latest version of any metadata in the metadata table, and old versions of metadata in the metadata_history table. These would need to be linked, and I think an additional metadata_id column makes sense.
The metadata table currently tracks created_by and created_time, for the user and time when metadata was entered, as well as reviewed_by and reviewed_time which is less clear if there are multiple reviewers. One option is to change the review information to updated_by and updated_time, and use these to determine the history order?
metadata_valid can currently be used to say metadata is invalid. This could be null for not reviewed, false for reviewed and rejected, and true for reviewed and approved; or add a separate field reviewed that specifically means the other flags have been set.
The point of the separate table is to reduce the number of active records that need to be searched for operations, unless looking at a specific history so I think the history table should only have old versions.
Records are copied from metadata to metadata_history when updating a record. Users can have other options to hide rows using metadata_valid=false, but the latest version of the metadata with metadata_valid=false would remain in the metadata table. I'm not sure we need an option to actually delete a record as much as hiding records that have already been reviewed and rejected.
These considerations are in line with what I was thinking. I will open up sub-issues to address this in pieces.
A couple of remaining questions I have:
If metadata_valid is null and then the record is updated, will that historical copy have a null value permanently? Or will it require metadata_valid to be false before entering it into the historical table(hidden from view).
If this is the case, are we not concerned with edits made to a record before setting a metadata_valid value?
If metadata_valid is null and then the record is updated, will that historical copy have a null value permanently? Or will it require metadata_valid to be false before entering it into the historical table(hidden from view).
A record is copied as is into the metadata_history table, before it is updated by a new version with different information.
I think we should make the following changes to metadata:
Add reviewed: boolean column.
False when created, can be checked once review is complete.
Rename reviewed_by and reviewed_time to updated_by and updated_time.
Null when created, set to current user and time when existing record is updated (saved).
metadata_valid can default to True, and be set to False by a reviewer as part of the review process to "reject" metadata.
It will be a foreign key from the metadata_history table to the metadata table. It can be part of the data model too, but should be optional and None when it's a current/new metadata record.
@pcain I think it can wait for now, the main use case for the cli client is to add new metadata (adjusted matrix) or retrieve existing metadata for processing.