I’m currently attending the online lecture “in memory computing” at http://openhpi.com. Hasso Plattner, one of the founders of SAP, stated that they use an insert-only approach to keep a history of each record in the table. Since only ~10% of the records get updated during their lifetime, no significant performance penalty is to be expected.

I was curious whether this is possible with ADS as well, and implemented it using triggers.

Prepare the table

First of all, the table has to be created and prepared. To track the lifecycle of a record, two timestamp fields are used: validFrom and validTo. A field of type ROWVERSION keeps track of modifications in the table. The resulting table looks like this:

CREATE TABLE tbTest( 
      id Char(36),
      fname Char(30),
      lname Char(30),
      dob Date,
      gender Char(1),
      country Char(30),
      city Char(30),
      validFrom TimeStamp,
      validTo TimeStamp,
      rv RowVersion);

Invalid (deleted) records shouldn’t be displayed, so we need to filter them out. The easiest way is to work only with a view of the table:

CREATE VIEW test AS 
  SELECT id, fname, lname, dob, gender, country, city, rv 
  FROM tbtest
  WHERE validTo IS NULL;

Inserts

On an INSERT operation, the record should be given a unique ID if one is not already set. Then we need to make sure our validity columns get populated: validFrom has to be set to the current timestamp and validTo has to be reset to NULL. Since this is an INSTEAD OF INSERT trigger that replaces the INSERT operation, we finally need to make sure the record actually gets inserted into the table.

CREATE TRIGGER trig_ins ON tbTest INSTEAD OF INSERT 
BEGIN 
  update __new set id=newidstring(d) where id is null;
  update __new set validFrom=now(), validTo=NULL;
  insert into tbtest select * from __new;
END;

Deletes

A DELETE doesn’t remove the record from the table, but invalidates it by setting its validTo column to the current timestamp:

CREATE TRIGGER trig_del ON tbTest INSTEAD OF DELETE 
BEGIN 
  update tbtest set validTo=now() where rowid = ::stmt.TrigRowId and validTo is null;
END;

Updates

An UPDATE operation invalidates the current record and inserts a new record with the new values. This trigger is a bit more complicated because the UPDATE statement inside the delete trigger also fires the update trigger; without a check, a fresh copy would be inserted and the record would never disappear from the view. To work around this, we check whether the new record’s validTo field is still unset (a DELETE sets this field in the new record) and only insert a new version in that case.

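CREATE TRIGGER trig_upd ON tbTest INSTEAD OF UPDATE 
BEGIN 
  -- close the currently valid version of the record
  update tbtest set validto=now() where rowid = ::stmt.TrigRowId and validto is null;
  -- validto is only set when the delete trigger fired this update;
  -- in that case no new version must be written
  if __new.validto is null then
    insert into tbtest select * from __new;
  end;
END;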

Testing

As a first test let’s insert a new record:

INSERT INTO test (fname, lname, dob, gender, country, city)
  VALUES ('Joe', 'Doe', '2001-09-11', 'm', 'U.S.', 'New York');

Looking at the view “test”, the record is there. In the table “tbTest” the record can also be found, and the validFrom field is set. When updating the record (e.g. Joe Doe moves to Chicago), the updated version is displayed in the view and the table now contains two records (one with validTo set and one with validTo unset).

UPDATE test SET city='Chicago' WHERE id = '79298981-cc05-2245-b6d4-e413996f14dd';

So the behaviour is exactly what we want: only current records are shown in the view, but there’s still a full history available in the table.
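
To look at that history directly, a query against the base table along the following lines should do. This is only a sketch: the id is the example value used above, and ordering by validFrom assumes that each version's validFrom reflects the moment it was written.

```sql
-- All versions of one record, oldest first.
-- The row whose validTo is NULL is the current version.
SELECT id, fname, lname, city, validFrom, validTo
  FROM tbTest
  WHERE id = '79298981-cc05-2245-b6d4-e413996f14dd'
  ORDER BY validFrom;
```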

Let’s try a delete:

DELETE FROM test WHERE id = '79298981-cc05-2245-b6d4-e413996f14dd';

This command removes the record from the view, but the invalidated copy is still left in the table. So this works as well.
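
Because validFrom and validTo bracket the lifetime of every version, the state of the data at an earlier moment can in principle be reconstructed from the table, including records that have since been deleted. The query below is a sketch of that idea; the timestamp is a placeholder, and the literal format may need adjusting for your ADS setup.

```sql
-- Records that were valid at a given moment (placeholder timestamp).
-- A version counts as valid if it started at or before that moment
-- and had not been closed yet at that moment.
SELECT id, fname, lname, dob, gender, country, city
  FROM tbTest
  WHERE validFrom <= '2012-09-15 12:00:00'
    AND (validTo IS NULL OR validTo > '2012-09-15 12:00:00');
```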

Bulk inserts, updates and deletes

Bulk inserts and deletes still work, but if you now try to update all records at once, the statement ends up in an infinite loop:

UPDATE test SET city='Chicago';

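This is caused by using a live view: every update appends a new record, which in turn qualifies for the update, so the statement never comes to an end. To work around this, we need the ROWVERSION column (and that’s the reason why I created it in the first place). Restricting the update to rows whose rv is not above the current maximum leaves out the versions appended while the statement runs: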

UPDATE test SET city='Chicago' WHERE rv<=(SELECT max(rv) FROM test);
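
After a bulk operation like this, it may be worth checking that every id still has at most one current version. A possible sanity check, using only the table defined above (the alias current_versions is just an illustrative name):

```sql
-- Lists ids that have more than one open (current) version.
-- An empty result means the history is consistent in this respect.
SELECT id, COUNT(*) AS current_versions
  FROM tbTest
  WHERE validTo IS NULL
  GROUP BY id
  HAVING COUNT(*) > 1;
```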

Summary

Implementing an insert-only approach in ADS using triggers is possible and works well as long as you only modify one record at a time. Bulk updates, however, need extra care (such as the ROWVERSION filter shown above) in order to prevent an infinite loop.
