Versions, Upgrading And Migration

Implementing version support involves three things. An additional parameter is passed during type registration (i.e. a version_history is passed to bind()). Those types that will be passed to an encode operation are formally registered (e.g. the Job class is passed to released_document()). Lastly, the version tags provided at recovery sites must be used for codepath selection, which is where the actual “support” begins.

There are two reasons for the released_document() registration. Firstly, not all application types will be the focus of an encode operation and the requirements imposed on version-managed types are rigorous. Imposing these requirements on all types would be heavy-handed and tricky where types are shared by multiple applications.

Secondly, there is a need to maintain integrity of version-managed types that contain nested application types. An application document type may contain further types for sections, indexes and spelling dictionaries. What happens when one of those nested types changes and the document does not? Without extra attention there would be situations where the encoded document has effectively changed but this is not detected during recovery, leaving the application no opportunity to respond properly.

The versioning solution offered by this library exists in a world of software development that has evolved many different approaches. For the context of this solution refer to this assessment of the situation. The remainder of this section is focused on full version control.

In The Beginning

This is a repeat of the journey described in More About Types, except this time the changes made to Job will be tracked using version management.

Consider the original declaration:

import uuid
import ansar.encode as ar

class Job(object):
    def __init__(self, unique_id=None, title='watchdog', priority=10, service='noop', body=b''):
        self.unique_id = unique_id or uuid.uuid4()
        self.title = title
        self.priority = priority
        self.service = service
        self.body = body

ar.bind(Job)

The stored representation looks like:

{
    "value": {
        "body": "",
        "priority": 10,
        "service": "noop",
        "title": "watchdog",
        "unique_id": "b2c08bf5-99e0-41ad-810b-095a9715f48c"
    }
}

There is no evidence of version management yet. This absence of any versioning detail is deliberate and is the proper representation of an unversioned message. All encoding-decoding activity associated with the Job type at this point is referred to as “unversioned”.

Consider a store-and-recover cycle of a default Job:

>>> f = ar.File('job', Job)
>>> j = Job()
>>> f.store(j)
>>> r, v = f.recover()
>>> print(r)
<__main__.Job object at 0x7fc300dd7150>
>>> print(v)
None

The recover() method has returned None as the version tag. This is the result of a preprocessing of the version situation. It tells the application that the recovered object is at the same version as the reading application. In this case it actually means that both the encoding and decoding operations were unversioned. This preprocessing simplifies the code around the decoding site as there is no need to compare against an explicit value.

A None version tag always indicates that no further version-related processing is needed.

Everything Changes

The first change to Job was to record the creation time. This change, along with a version history, is shown here:

import time
import uuid
import ansar.encode as ar

class Job(object):
    def __init__(self, created=None, unique_id=None,
        title='watchdog', priority=10, service='noop', body=b''):
        self.created = created or time.time()
        self.unique_id = unique_id or uuid.uuid4()
        self.title = title
        self.priority = priority
        self.service = service
        self.body = body

JOB_HISTORY = (
    ('0.0', ar.Added('created'), 'Added timestamping'),
)

ar.bind(Job, version_history=JOB_HISTORY)

The simple presence of a version history activates version management. All subsequent encodings of jobs have a version tag added to the representation. The tag used comes from the end of the JOB_HISTORY. This is what happens when that application recovers the stored job created previously:

>>> f = ar.File('job', Job)
>>> r, v = f.recover()
>>> print(r)
<__main__.Job object at 0x7f3dd9197b10>
>>> print('"%s"' % (v,))
""

The library is notifying the application that the stored job is at a different version to the version in use by the application. It is returning the empty string - a special tag indicating that the recovered job is unversioned.

Storing a representation of the latest Job declaration produces the following:

{
    "value": {
        "body": "",
        "created": 1662413699.6478937,
        "priority": 10,
        "service": "noop",
        "title": "watchdog",
        "unique_id": "8a657371-53e7-4f96-a277-5e6eca4c27ca"
    },
    "version": "0.0"
}

Recovering this latest, versioned job results in a None version tag, indicating that the encoded materials and the decoding application are at the same version.
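The tag preprocessing described so far can be summarised in a few lines. The following is a minimal sketch in plain Python and is not the library's implementation. The names current_tag and reported_tag are hypothetical, and the operation column of the history is left as None so that the sketch needs no imports. The case of an unversioned application reading a versioned encoding is not covered by the text above; the sketch simply passes the stored tag through.

```python
def current_tag(history):
    """Tag applied to new encodings: the last history entry, or None if unversioned."""
    return history[-1][0] if history else None

def reported_tag(stored, history):
    """Tag handed back to the application for an encoding carrying `stored`.

    `stored` is the "version" value found in the encoding, or None if absent.
    """
    if stored == current_tag(history):
        return None      # same version (or both unversioned): nothing to do
    if stored is None:
        return ''        # unversioned encoding read by a versioned application
    return stored        # some other version: the application must respond

JOB_HISTORY = (('0.0', None, 'Added timestamping'),)

print(repr(reported_tag(None, ())))             # unversioned on both sides
print(repr(reported_tag(None, JOB_HISTORY)))    # old, unversioned file
print(repr(reported_tag('0.0', JOB_HISTORY)))   # up-to-date file
```

The three calls correspond to the three recoveries shown in this section: the original unversioned cycle, the unversioned file read by the versioned application, and the versioned file read by the same application.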

Application Type Histories

A history is a sequence of changes, where each change looks like this:

(tag, operation(s), note),

A tag is a string like 0.14 or 2.4 that includes a major and minor number separated by a dot. The minor number is assumed to increment on each subsequent change except where the major number increments and the minor number returns to zero.

The second column describes the significant operations in a “machine-readable” form. Accepted operations are Added, Moved and Deleted. These are classes defined in the library that accept one or two names as arguments. Any name Added is considered unavailable to the application before the associated version tag and any name Deleted is unavailable from the associated tag onward. Moved operations are a shorthand for a delete-add pair. An operation of None can be used to represent change in the software that processes the application data, rather than the application data itself. An operation can also be a list of operation objects.

A note is a brief explanation of the change.
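The tag rule can be checked mechanically. The sketch below is an illustration only; parse_tag and check_tags are hypothetical helpers and whether the library performs an equivalent check is not stated here. Tags are compared as pairs of integers, since a plain string comparison would place "0.10" before "0.9".

```python
def parse_tag(tag):
    """Split a 'major.minor' tag into a pair of integers."""
    major, minor = tag.split('.')
    return int(major), int(minor)

def check_tags(history):
    """Raise ValueError unless each tag follows its predecessor as described."""
    previous = None
    for tag, _operations, _note in history:
        current = parse_tag(tag)
        if previous is not None:
            next_minor = (previous[0], previous[1] + 1)   # e.g. 0.3 -> 0.4
            next_major = (previous[0] + 1, 0)             # e.g. 0.3 -> 1.0
            if current not in (next_minor, next_major):
                raise ValueError('tag "%s" does not follow "%d.%d"'
                                 % ((tag,) + previous))
        previous = current

check_tags((
    ('0.0', None, 'first change'),
    ('0.1', None, 'second change'),
    ('1.0', None, 'reset'),
))
```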

Maintaining A Good History

Type histories are intended to capture a sliding window of changes. As the history becomes long and the older entries become less relevant it will become appropriate to delete them. Whenever a line of history is deleted, any names appearing in Deleted operations can also be removed from the class declaration. This includes the previous name on a Moved operation.

A name cannot be Deleted and then Added within the known history, as this would imply that the same name can be used with (potentially) different types. Values that need to change type are Moved to a new name and type. Names can be Added and then Deleted within the known history.

All names appearing in the changes must also appear in the current class declaration, whether that name was Added, Moved or Deleted. Whether a name is actually available is dependent on a runtime version tag - a value provided at each decoding point (e.g. d, v = f.recover()), or taken from the current type history at each encoding point (e.g. the default behaviour of f.store()), or passed explicitly from the application to the encoding operation. The latter gives an application the ability to save old versions of documents, e.g. at the request of a user.

The class may contain members that do not appear in the changes, e.g. when the history no longer includes the entry that Added the relevant name.
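The rules in this section lend themselves to a small consistency check. The following sketch uses namedtuple stand-ins for the Added, Deleted and Moved classes so that it runs without the library; the stand-ins, steps and check_history are all hypothetical, and the argument order assumed for Moved (old name, then new name) is a guess for illustration.

```python
from collections import namedtuple

# Hypothetical stand-ins for the library's operation classes.
Added = namedtuple('Added', 'name')
Deleted = namedtuple('Deleted', 'name')
Moved = namedtuple('Moved', 'old new')   # shorthand for Deleted(old) + Added(new)

def steps(operations):
    """Reduce one operations column to a list of ('add'|'delete', name) steps."""
    if operations is None:
        return []
    if isinstance(operations, (Added, Deleted, Moved)):
        operations = [operations]
    out = []
    for op in operations:
        if isinstance(op, Added):
            out.append(('add', op.name))
        elif isinstance(op, Deleted):
            out.append(('delete', op.name))
        else:
            out.append(('delete', op.old))
            out.append(('add', op.new))
    return out

def check_history(history, declared):
    """Raise ValueError if the history breaks either rule described above."""
    deleted = set()
    for tag, operations, _note in history:
        for action, name in steps(operations):
            if name not in declared:
                raise ValueError('"%s" (at %s) is not in the class declaration'
                                 % (name, tag))
            if action == 'delete':
                deleted.add(name)
            elif name in deleted:
                raise ValueError('"%s" is Added at %s after being Deleted'
                                 % (name, tag))

declared = {'created', 'who', 'priority', 'level'}
check_history((
    ('0.0', Added('created'), 'Added timestamp'),
    ('0.1', [Added('who'), Moved('priority', 'level')], 'Two changes at once'),
), declared)
```

A name that is Added and later Deleted passes the check, matching the text; only the reverse order is rejected.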

Quality Encodings From Bags Of Data

Application types are necessarily an accumulation of members. There are members that are no longer used by the latest code. There are also members that have only just been added - older code does not reference these members as they do not exist. Or rather, didn’t exist at the time the older code was written.

Supporting multiple versions of application types requires this accumulation of members. Without the continued presence of those “deleted” members there is no potential to decode older encodings. The decoding process would have nowhere to store the inbound values and old code would be referencing members that no longer exist. Application types become bags of past and present members.

In versioned operation the library emits encodings containing only those members currently available, i.e. Deleted members are not included. The encodings are an expression of what the application type would look like if it wasn’t carrying historical baggage. This is not only more “correct”, it also results in encodings that are smaller and more readable. In unversioned operation everything that appears in an application type is reflected in encodings, with the exception of members that have None values.

To assist with the detection of programming errors, the library checks the value of any trimmed member. Any value other than None is considered a programming error and an exception is raised. Conceptually the code is trying to make use of a member that would not have existed at the relevant time.
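Trimming and the associated check might be pictured as follows. This is a sketch under simplifying assumptions and not the library's code: the history is reduced to (tag, action, name) triples, members are held in a plain dict, and unavailable and trim are hypothetical names.

```python
# Simplified history for illustration: (tag, action, name).
HISTORY = (
    ('0.0', 'added', 'created'),
    ('0.1', 'added', 'who'),
    ('0.2', 'deleted', 'priority'),
)

def unavailable(history, tag):
    """Names that do not exist at `tag` (which must appear in the history)."""
    tags = [t for t, _, _ in history]
    at = tags.index(tag)
    hidden = set()
    for i, (_, action, name) in enumerate(history):
        if action == 'added' and i > at:
            hidden.add(name)      # not yet added at `tag`
        elif action == 'deleted' and i <= at:
            hidden.add(name)      # already deleted at `tag`
    return hidden

def trim(members, history, tag):
    """Return the members to encode at `tag`, checking the trimmed ones."""
    hidden = unavailable(history, tag)
    for name in hidden:
        if members.get(name) is not None:
            raise ValueError('member "%s" does not exist at version %s'
                             % (name, tag))
    return {k: v for k, v in members.items() if k not in hidden}

job = {'created': 1.0, 'who': [], 'priority': None, 'title': 'watchdog'}
print(trim(job, HISTORY, '0.2'))
```

Setting priority to a number and encoding at "0.2" raises the ValueError, which mirrors the programming error described above.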

Full implementation of member trimming requires that all sub-objects are also trimmed, e.g. a document might contain sections, paragraphs, a table of contents and an index. There is a collection of application types that are aggregated to create the top-level document.

Note

Application types that include further nested types are difficult for version management to track. The released_document() registration described at the start of this section is what allows such complex types to be tracked properly.

A Typical Change

Making one more change demonstrates a more typical change, i.e. a change not involving unversioned materials. A list of email addresses is added below:

import ansar.encode as ar

class Job(object):
    def __init__(self, created=None, unique_id=None,
        title=None, priority=None,
        service=None, body=None, who=None):
        self.created = created
        self.unique_id = unique_id
        self.title = title
        self.priority = priority
        self.service = service
        self.body = body
        self.who = who or ar.default_vector()

JOB_SCHEMA = {
    'created': ar.ClockTime,
    'unique_id': ar.UUID,
    'title': ar.Unicode,
    'priority': ar.Integer4,
    'service': ar.Unicode,
    'body': ar.String,
    'who': ar.VectorOf(ar.Unicode),
}

JOB_HISTORY = (
    ('0.0', ar.Added('created'), 'Added created timestamp'),
    ('0.1', ar.Added('who'), 'Added who emails'),
)

ar.bind(Job, object_schema=JOB_SCHEMA, version_history=JOB_HISTORY)

This produces the stored representation:

{
    "value": {
        "who": []
    },
    "version": "0.1"
}

The representation reflects the updated version history. Consider the following set of test files, each recovered in turn by the session that follows the list:

  • job, an instance of the original class

  • job-created, an instance with the created member added

  • job-who, an instance with the who member added

>>> f = ar.File('job', Job)
>>> r, v = f.recover()
>>> print(r)
<__main__.Job object at 0x7fc3ef7884d0>
>>> print('"%s"' % (v,))
""
>>> f = ar.File('job-created', Job)
>>> r, v = f.recover()
>>> print(v)
0.0
>>> f = ar.File('job-who', Job)
>>> r, v = f.recover()
>>> print(v)
None

The last recovered version tag is again None, reflecting the fact that the stored version and the application version are the same.

Dropping Old Versions

During the recovery of an encoding, the library extracts the version information and compares it to the current application version information. If the minor number in the version tag is older than the oldest value in the application version history, the recovery process rejects the input and raises an exception. The encoding is considered to be unsupported. This acknowledges that the representation appears valid in all other ways but the executing application is no longer maintaining that area of code.

JOB_HISTORY = (
    ('0.0', ar.Added('created'), 'Added timestamp'),
    ('0.1', ar.Added('who'), 'Added the list of email addresses'),
    ('0.2', ar.Added('unique_id'), 'Added accounting'),
    ('0.3', ar.Deleted('priority'), 'Deleted the priority number'),
    ('0.4', ar.Added('permissions'), 'Added authorization values'),
)

By deleting the first line of the history above, the "0.0" version immediately becomes unsupported. Any encounter with encodings at this version will result in exceptions. At the same time all code specific to this version can be retired from the application.

When housekeeping work finally catches up with the "0.3" version and the relevant version-description pair is deleted from the history, the priority parameter and member can also be deleted from the Job class. Any attempt to recover encodings at that version (or before) will result in a version exception.

A Brave New World

The major version number is used to signal a complete reset. The major number is incremented by one and the minor number returns to zero. The version history contains the lone entry:

JOB_HISTORY = (
    ('1.0', None, 'Brave new world'),
)

The class is now effectively at an initial version. Recovery of any encoding tagged with the previous major number - "0.24" - results in a rejection by the library. It is considered inappropriate to distinguish it from unsupported. An exception is raised.
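The rejection rule from this section and the previous one can be sketched as a single comparison. The code below is an assumption-laden illustration: Unsupported is a hypothetical exception class (the library's actual exception type is not named here), parse_tag and check_supported are hypothetical helpers, and only tagged encodings are handled. Comparing (major, minor) pairs treats an older major number the same way as an older minor number, in line with the statement that the two cases are not distinguished.

```python
class Unsupported(Exception):
    """Hypothetical stand-in for the library's version exception."""

def parse_tag(tag):
    """Split a 'major.minor' tag into a pair of integers."""
    major, minor = tag.split('.')
    return int(major), int(minor)

def check_supported(stored, history):
    """Raise Unsupported if `stored` is older than the oldest history entry."""
    oldest = history[0][0]
    if parse_tag(stored) < parse_tag(oldest):
        raise Unsupported('version "%s" is older than "%s"' % (stored, oldest))

# The history above with its first line deleted: "0.0" is no longer supported.
TRIMMED = (
    ('0.1', None, 'Added the list of email addresses'),
    ('0.2', None, 'Added accounting'),
)
RESET = (('1.0', None, 'Brave new world'),)

check_supported('0.2', TRIMMED)          # accepted
for tag, history in (('0.0', TRIMMED), ('0.24', RESET)):
    try:
        check_supported(tag, history)
    except Unsupported as e:
        print(e)
```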

Moving to a new major number is likely to reflect significant technical changes in the application - a shift to new tools and/or architecture. Perhaps a re-targeting from customer premises deployment to the cloud. There may be commercial considerations involved.

For whatever reason the application wants to continue using the name in the class declaration (e.g. Job) but it is starting a new ecosystem of software and stored representations, and is not offering any integration with the previous ecosystem.

Resuming The Journey

Actual version support begins with the object and version tag returned by one of the two recover() methods. This section looks at how an application might respond to these values. The goal is seamless operation in a mixed-version world, but there are at least two different ways that this can be achieved.

A First Attempt

The Job class has been through two changes, giving a total of three versions (including unversioned encodings) that might be encountered by an application tasked with processing these objects:

JOB_HISTORY = (
    ('0.0', ar.Added('created'), 'Added timestamp'),
    ('0.1', ar.Added('who'), 'Added the list of email addresses'),
)

Full support implies three different codepaths:

j, v = f.recover()
if v is None:   # 0.1
    pass
elif v == '0.0':
    j.who.append(DEFAULT_EMAIL)
elif v == '':
    j.created = DEFAULT_CREATED
    j.who.append(DEFAULT_EMAIL)
else:
    not_supported()

A version value of None is ignored - no specific support processing is required. Otherwise a series of conditionals arranges for a patching of the job object, according to the detected version. Default values are assigned to those members that did not exist in the respective version of a Job. If the version remains unrecognised by the application there is a call to an error routine.

This implementation of version support meets the primary requirement, i.e. seamless processing of mixed versions. A tacit decision was made to promote or upgrade older versions to something matching the current version. The actual processing of the job can then begin without regard to the original version.

A different coding style looks long-winded but brings an advantage:

j, v = f.recover()
if v is None:
    pass
elif v == '0.0':
    j = Job(created=j.created,
        unique_id=j.unique_id, title=j.title,
        priority=j.priority, service=j.service,
        body=j.body, who=[DEFAULT_EMAIL])
elif v == '':
    j = Job(created=DEFAULT_CREATED,
        unique_id=j.unique_id, title=j.title,
        priority=j.priority, service=j.service,
        body=j.body, who=[DEFAULT_EMAIL])
else:
    not_supported()

Where an upgrade is required an entirely new Job is constructed from the information provided by each older version, plus defaults as necessary. The result, courtesy of the __init__ function, is a properly constructed and up-to-date instance of a Job.

A remaining issue with both of these coding styles is that they are inline, and over time the application will accumulate more than one call to recover(), each needing its own copy of the version handling.

An Upgrade Plan

A more forward-thinking style of coding is to move all the version-related activities to a dedicated function and call it something sensible like upgrade. This change in approach cleans up the call site:

j, v = f.recover(upgrade=upgrade)

It plays nice with the dict comprehensions associated with Folder objects:

jobs = {k: j for k, j, _ in f.recover(upgrade=upgrade)}

The version returned by recover() is always None and can be safely ignored as any related issue has been addressed by the upgrade function. The function looks like this:

def upgrade(r, v):
    # r is the recovered object, v is the detected version tag.
    if v == '0.0':
        return Job(created=r.created,
            unique_id=r.unique_id, title=r.title,
            priority=r.priority, service=r.service,
            body=r.body, who=[DEFAULT_EMAIL])
    elif v == '':
        return Job(created=DEFAULT_CREATED,
            unique_id=r.unique_id, title=r.title,
            priority=r.priority, service=r.service,
            body=r.body, who=[DEFAULT_EMAIL])
    ar.cannot_upgrade(r, v)

The upgrade function is only called where appropriate, i.e. where the detected version is not equal to the current application version. This means there is no need to include None as one of the codepaths. This function either returns a new object using the contents of an older object, or it raises an exception. The library function cannot_upgrade() raises a ValueError with a helpful diagnostic.
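The two branches of such a function differ only in where the created value comes from, which suggests a more compact arrangement. The following self-contained sketch runs without the library: Job is a cut-down stand-in for the class above, cannot_upgrade is a local stand-in that raises ValueError as described, and the two DEFAULT values are placeholders chosen for illustration.

```python
DEFAULT_CREATED = 0.0                    # placeholder values for illustration
DEFAULT_EMAIL = 'nobody@example.com'

class Job(object):
    def __init__(self, created=None, unique_id=None, title=None,
                 priority=None, service=None, body=None, who=None):
        self.created = created
        self.unique_id = unique_id
        self.title = title
        self.priority = priority
        self.service = service
        self.body = body
        self.who = who or []

def cannot_upgrade(r, v):
    """Local stand-in for the library function of the same name."""
    raise ValueError('cannot upgrade %s from version "%s"'
                     % (type(r).__name__, v))

def upgrade(r, v):
    # Decide the one value that differs between the old versions...
    if v == '0.0':
        created = r.created
    elif v == '':
        created = DEFAULT_CREATED
    else:
        cannot_upgrade(r, v)
    # ...then build a fresh, fully constructed Job in one place.
    return Job(created=created, unique_id=r.unique_id, title=r.title,
               priority=r.priority, service=r.service, body=r.body,
               who=[DEFAULT_EMAIL])

old = Job(title='watchdog', priority=10)
new = upgrade(old, '')
print(new.created, new.who, new.title)
```

Keeping a single constructor call means a later change to Job needs only one edit inside upgrade, at the cost of a slightly less literal mapping from version tag to codepath.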

Injecting Runtime Values Into Upgrades

DEFAULT_CREATED and DEFAULT_EMAIL are assigned to the relevant members when the recovered version lacks those particular values. In the previous code fragment these were hardcoded constants and there will likely be scenarios where runtime values are needed. The optimal approach is a matter of design and preference. A few suggestions follow.

Values may be computed just prior to the recovery site, or even perhaps just prior to the upgrade:

who = [get_who()]
jobs = {}
for k, j, v in f.recover():
    hi = get_hi(j)
    lo = get_lo(j)
    j = upgrade(j, v, who=who, hi=hi, lo=lo)
    jobs[k] = j

The implication is that hi and lo are values based on other values present in each Job, and so are computed inside the loop by get_hi and get_lo. The who value does not share that dependency and can be calculated ahead of the for loop. These different runtime values are gathered together by the call to upgrade and become available for population of Job members.

Another arrangement elicits a slightly different behaviour:

who = None

    ..
    global who
    who = who or [get_who()]
    jobs = {}
    for k, j, v in f.recover():
        hi = get_hi(j)
        lo = get_lo(j)
        j = upgrade(j, v, who=who, hi=hi, lo=lo)
        jobs[k] = j

The get_who function is called on the first visit to this code site only. The result is wrapped in a list, which is always truthy, so later visits reuse the cached value.

Any runtime values that have no special connection to particulars of a recover site can be located with the upgrade function. Similar options exist with respect to placement of the different code elements:

who = None

def upgrade(r, v, hi=100, lo=10):
    # r is the recovered object, v is the detected version tag.
    global who
    who = who or [get_who()]
    if v == '0.0':
        return Job(created=r.created,
            unique_id=r.unique_id, title=r.title,
            priority=r.priority,
            service=r.service,
            body=r.body, who=who)
    elif v == '':
        return Job(created=DEFAULT_CREATED,
            unique_id=r.unique_id, title=r.title,
            priority=r.priority,
            service=r.service,
            body=r.body, who=who)
    ar.cannot_upgrade(r, v)
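One further possibility, for values that are known before the recovery begins, is to bind them to the upgrade function ahead of time with functools.partial from the standard library. This is a sketch that assumes the upgrade callable is always invoked as upgrade(r, v); the upgrade shown here is a stand-in that only reports its arguments.

```python
from functools import partial

def upgrade(r, v, who=None, hi=100, lo=10):
    """Stand-in upgrade that reports what it was given."""
    return (r, v, who, hi, lo)

# Bind the runtime values once; the result still takes just (r, v).
bound = partial(upgrade, who=['ops@example.com'], hi=50)

print(bound('job', '0.0'))
```

Under that assumption the bound function could be supplied wherever a two-argument upgrade is expected. Values that depend on each recovered object, such as hi and lo in the loops above, still need to be computed per object.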

Migration - Reducing The Upgrade Workload

The goal is seamless operation in a mixed-version world and the upgrade facility ticks that box. An application with a solid implementation of upgrade can focus its attentions entirely on the current version of Job.

Any application that is repeatedly upgrading the same stored representations is missing an opportunity to avoid the overhead of all but the initial upgrade. This can happen with an application configuration file, or a job scheduler polling a folder for work. Adding a single parameter arranges for the runtime migration of an older configuration file:

j, v = f.recover(upgrade=upgrade, migrate=True)

The same facility is available on dict comprehensions based on Folder objects:

jobs = {k: j for k, j, _ in f.recover(upgrade=upgrade, migrate=True)}

Design of the recover() methods and the upgrade and migrate parameters was tilted towards convenient use of list and dict comprehensions. The price of that convenience is less freedom in how runtime data may be injected into the upgrade process. Where the method-based approach is too constrained, a small function provides another option:

def migrate(f, upgrade, *args, **kwargs):
    r, v = f.recover()
    a = upgrade(r, v, *args, **kwargs)
    if id(a) != id(r):
        f.store(a)
    return a

This exact function is available within the library - migrate(). Migration is performed on a File object rather than an instance of application data (e.g. a Job) due to the need to store() any changes. The simplest possible use is therefore:

f = ar.File('job', Job)
j = ar.migrate(f, upgrade)
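The store-only-when-changed behaviour can be observed without touching the file system. In the sketch below MemoryFile is a hypothetical in-memory stand-in for a File, the job is a plain dict, and migrate is a local copy of the function shown above with the identity test written as "is not". Note that the function as shown calls upgrade even when the tag is None, so the upgrade used here returns the object unchanged in that case.

```python
class MemoryFile(object):
    """Hypothetical stand-in for a File: recover() returns (object, tag)."""
    def __init__(self, value, tag):
        self.value, self.tag = value, tag
        self.stores = 0

    def recover(self):
        return self.value, self.tag

    def store(self, value):
        self.value, self.tag = value, None   # stored at the current version
        self.stores += 1

def migrate(f, upgrade, *args, **kwargs):
    r, v = f.recover()
    a = upgrade(r, v, *args, **kwargs)
    if a is not r:          # a new object means an upgrade took place
        f.store(a)
    return a

def upgrade(r, v, who=None):
    if v is None:
        return r            # already current: hand back the same object
    return dict(r, who=who or [])

f = MemoryFile({'title': 'watchdog'}, '0.0')
first = migrate(f, upgrade, who=['ops@example.com'])
second = migrate(f, upgrade)
print(first, f.stores, second is first)
```

The second call finds the stored job already at the current version, so nothing is written.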

Usage involving a Folder might look like:

def get_jobs(spool):
    jobs = {}
    for f in spool.each():
        j = ar.migrate(f, upgrade)
        k = spool.key(j)
        jobs[k] = j
    return jobs

Note

Runtime values for the upgrade function have been omitted for clarity. The args and kwargs parameters allow the migrate function to forward runtime information on to upgrade where needed.

Automated Migration

Software that applies the migrate style of version support, rather than the upgrade style, brings automatic data migration to every software release. Wherever the new software goes, the files it works with are brought up to date with respect to the latest application types.