Really Complicated Documents¶
The library supports arbitrarily structured objects through the type expression feature. Expressions can include sequences, collections and nested messages, covering most application persistence requirements.
Type expressions can also include the PointerTo type. This is the mechanism by which applications can store and recover graph objects - linked lists and trees. Even something as convoluted as an AST (i.e. abstract syntax tree) for a regular expression, can be stored and recovered.
The Python memory model is a handle-based model - there is no explicit support for pointers as there is in languages such as C++ and Go. However, handles can be managed in a way that simulates pointer behaviour and this is what the library PointerTo type is for.
Value Versus Pointer¶
To understand how the library can store and recover complex graph objects, the next few paragraphs demonstrate the use of PointerTo and how that manifests in the stored representations.
Consider the following Python script:
>> import ansar.encode as ar
>> f = ar.File('flag', bool)
>> b = False
>> f.store(b)
This produces perhaps the simplest possible application object:
{
"value": false
}
By making a single change to the type expression:
>> import ansar.encode as ar
>> f = ar.File('pointer-to-flag', ar.PointerTo(bool))
>> b = False
>> f.store(b)
The change in representation illustrates the effect of the change in type expression:
{
"pointer": [
[
"1100:889fd3ed-d7c8-473d-b50a-dbd29ad62fd3",
false
]
],
"value": "1100:889fd3ed-d7c8-473d-b50a-dbd29ad62fd3"
}
The value
has changed from false
to a rather long string starting
with 1100
and that same string value appears in the pointer
list,
paired up with the false
value.
There is no obvious difference when recovering the representation:
>>> r, _ = f.recover()
>>> r
False
Note
The strings used as identities for pointers are rather long. This is a necessity brought about by the inclusion of polymorphic capability (Going Incognito). Suffice to say that implementation was quite detailed and can result in pointers from multiple inbound encodings being merged into a single outbound encoding. The long string guarantees uniqueness of entries.
Many Pointers To The Same Value¶
The next step is to demonstrate how the management of PointerTo objects can be used to share a common object. An example of sharing a value in Python, looks like this:
>> b = True
>> a = [b, b, b, b]
Creating a stored representation looks like this:
>> f = ar.File('array-of-pointer', ar.ArrayOf(ar.PointerTo(bool), 4))
>> f.store(a)
The representation records the single instance of a bool
and
the multiple references to that instance:
{
"pointer": [
[
"1100:ada12766-d4c8-4e62-8241-44a05f6b21a5",
true
]
],
"value": [
"1100:ada12766-d4c8-4e62-8241-44a05f6b21a5",
"1100:ada12766-d4c8-4e62-8241-44a05f6b21a5",
"1100:ada12766-d4c8-4e62-8241-44a05f6b21a5",
"1100:ada12766-d4c8-4e62-8241-44a05f6b21a5"
]
}
After recovering this representation, a few careful Python expressions expose the true reason for the existence of PointerTo:
>>> r, _ = f.recover()
>>> r
[True, True, True, True]
>>> id(r[0]) == id(r[2])
True
>>> id(r[0]) == id(r[3])
True
Comparing the id()s
of the array elements shows that the recovered array
r
is populated with the single handle, referring to a common bool
value.
Exploiting this ability to manage Python handles leads to much more significant constructions, such as linked lists. The following sections show how such lists can be built, stored and recovered.
A Linked List¶
Consider the following class declaration:
import ansar.encode as ar
class Node(object):
def __init__(self, next=None):
self.next = next
ar.bind(Node, object_schema={
'next': ar.PointerTo(Node),
})
A Node
contains a single member next
that refers to the following
instance of a Node
, presumably until there is a None
value. This
Python session demonstrates the construction and persistence of a linked-list
of Nodes
:
>>> next = None
>>> for i in range(4):
... c = Node(next)
... next = c
...
>>> c
<__main__.Node object at 0x7fa0395507f0>
>>> c.next
<__main__.Node object at 0x7fa039550760>
>>> c.next.next
<__main__.Node object at 0x7fa0395506a0>
>>> c.next.next.next
<__main__.Node object at 0x7fa039550670>
>>> c.next.next.next.next
>>> f = ar.File('linked-list', ar.PointerTo(Node))
>>> f.store(c)
>>> r, _ = f.recover()
>>> r
<__main__.Node object at 0x7fa039550df0>
>>> r.next
<__main__.Node object at 0x7fa0395508e0>
>>> r.next.next
<__main__.Node object at 0x7fa039550f70>
>>> r.next.next.next
<__main__.Node object at 0x7fa039550e80>
>>> r.next.next.next.next
>>>
The for
loop creates a linked-list in a reverse direction - the first
Node
object created is eventually the last of 4 Nodes
in the list.
The list passes through store()
and recover()
operations. The recovered
list is examined in the same manner as the original, verifying that all
links are in place and that the list is properly terminated (i.e. the final
next
member is set to the value None
).
The stored representation looks like this:
{
"pointer": [
[
"1100:2e394750-2a07-4a08-82c4-642211222b6a",
{
"next": "1101:2e394750-2a07-4a08-82c4-642211222b6a"
}
],
[
"1101:2e394750-2a07-4a08-82c4-642211222b6a",
{
"next": "1102:2e394750-2a07-4a08-82c4-642211222b6a"
}
],
[
"1102:2e394750-2a07-4a08-82c4-642211222b6a",
{
"next": "1103:2e394750-2a07-4a08-82c4-642211222b6a"
}
],
[
"1103:2e394750-2a07-4a08-82c4-642211222b6a",
{}
]
],
"value": "1100:2e394750-2a07-4a08-82c4-642211222b6a"
}
Note
The encoding machinery omits any member with a value of None
and due to
the fact that a Node
has only one member, this results in a representation
that includes an empty JSON object - {}
.
A Loop Of Links¶
A single additional line of Python links the final Node
back to the
start of the list, creating an endless loop. The library detects “cycles”
or “circular references” and uses back-patching to complete complex
constructions of this nature. This ability is necessary for the proper
handling of graph objects:
>>> next = None
>>> for i in range(4):
... c = Node(next)
... next = c
...
>>> c
<__main__.Node object at 0x7fa0395506a0>
>>> c.next.next.next.next
>>> c.next.next.next.next = c
>>> c.next.next.next.next
<__main__.Node object at 0x7fa0395506a0>
>>> c.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7fa0395506a0>
>>> f = ar.File('linked-loop', ar.PointerTo(Node))
>>> f.store(c)
>>> r, _ = f.recover()
>>> r
<__main__.Node object at 0x7fa039555370>
>>> r.next.next.next.next
<__main__.Node object at 0x7fa039555370>
>>>
The assignment of the start Node
to the fourth link, closes the loop. The
loop is then passed through a store and recover process. The start object and
final link of the recovered loop, are shown to be the same address. The
stored representation looks like:
{
"pointer": [
[
"1100:e8e505d0-efac-4470-8eac-0046fc62d584",
{
"next": "1101:e8e505d0-efac-4470-8eac-0046fc62d584"
}
],
[
"1101:e8e505d0-efac-4470-8eac-0046fc62d584",
{
"next": "1102:e8e505d0-efac-4470-8eac-0046fc62d584"
}
],
[
"1102:e8e505d0-efac-4470-8eac-0046fc62d584",
{
"next": "1103:e8e505d0-efac-4470-8eac-0046fc62d584"
}
],
[
"1103:e8e505d0-efac-4470-8eac-0046fc62d584",
{
"next": "1100:e8e505d0-efac-4470-8eac-0046fc62d584"
}
]
],
"value": "1100:e8e505d0-efac-4470-8eac-0046fc62d584"
}
A Map Of Loops¶
Linking can be combined with the structured types. Here is the construction of a map of loops:
>>> f = ar.File('map-loop', ar.MapOf(str,ar.PointerTo(Node)))
>>> m = {}
>>> def loop():
... next = None
... for x in range(4):
... n = Node(next)
... next = n
... next.next.next.next.next = next
... return next
...
>>> m['bjorn']=loop()
>>> m['sven']=loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> bjorn = m['bjorn']
>>> bjorn
<__main__.Node object at 0x7fa039555430>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7fa039555430>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7fa039555430>
>>> f.store(m)
>>> r, _ = f.recover()
>>> bjorn = r['bjorn']
>>> bjorn
<__main__.Node object at 0x7fa039555f40>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7fa039555f40>
>>>
A dict
is populated with named loops. This creates a total of 16 Node
objects
divided across 4 loops. This can all be traced through the following representation:
{
"pointer": [
[
"1100:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1101:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1101:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1102:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1102:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1103:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1103:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1100:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1104:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1105:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1105:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1106:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1106:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1107:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1107:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1104:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1108:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1109:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1109:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1110:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1110:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1111:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1111:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1108:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1112:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1113:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1113:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1114:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1114:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1115:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
],
[
"1115:82c85c2f-62b5-4b87-99c9-8a82a3262821",
{
"next": "1112:82c85c2f-62b5-4b87-99c9-8a82a3262821"
}
]
],
"value": [
[
"bjorn",
"1100:82c85c2f-62b5-4b87-99c9-8a82a3262821"
],
[
"sven",
"1104:82c85c2f-62b5-4b87-99c9-8a82a3262821"
],
[
"hilda",
"1108:82c85c2f-62b5-4b87-99c9-8a82a3262821"
],
[
"freya",
"1112:82c85c2f-62b5-4b87-99c9-8a82a3262821"
]
]
}
Warning
The entries in the MapOf must be PointerTo(Node) not
plain Node
.
A Document With Everything¶
This section combines all the constructs presented in the preceding sections. The result is a single application object that includes significant structuring and graph object elements. The techniques demonstrated should be enough to serve as the basis for almost any imaginable data persistence requirement.
The class declaration looks like this:
import ansar.encode as ar
class Node(object):
def __init__(self, next=None):
self.next = next
ar.bind(Node, object_schema={
'next': ar.PointerTo(Node),
})
class Doc(object):
def __init__(self, flag=None, pointer_to_flag=None, array_to_flag=None, linked_list=None, loop=None, map_of_loops=None):
self.flag=flag
self.pointer_to_flag=pointer_to_flag
self.array_to_flag=array_to_flag
self.linked_list=linked_list
self.loop=loop
self.map_of_loops=map_of_loops
ar.bind(Doc, object_schema={
'flag': bool,
'pointer_to_flag': ar.PointerTo(bool),
'array_to_flag': ar.ArrayOf(ar.PointerTo(bool), 4),
'linked_list': ar.PointerTo(Node),
'loop': ar.PointerTo(Node),
'map_of_loops': ar.MapOf(str,ar.PointerTo(Node)),
})
Construction and verification of a recovered Doc
looks like:
>>> f = ar.File('doc', Doc)
>>> d = Doc()
>>> def links():
... next = None
... for x in range(4):
... n = Node(next)
... next = n
... return next
...
>>> def loop():
... next = links()
... next.next.next.next.next = next
... return next
...
>>> flag = True
>>> d.flag = flag
>>> d.pointer_to_flag = flag
>>> d.array_to_flag = [flag, flag, flag, flag]
>>> d.linked_list = links()
>>> d.loop = loop()
>>> m = {}
>>> m['bjorn']=loop()
>>> m['sven']= loop()
>>> m['hilda']=loop()
>>> m['freya']=loop()
>>> d.map_of_loops = m
>>>
>>> f.store(d)
>>> r, _ = f.recover()
>>>
... bjorn = r.map_of_loops['bjorn']
>>> bjorn
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next
<__main__.Node object at 0x7f73f3521490>
>>> bjorn.next.next.next.next.next.next.next.next
<__main__.Node object at 0x7f73f3521490>
The complete representation appears below:
{
"pointer": [
[
"1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
true
],
[
"1101:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1102:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1102:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1103:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1103:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1104:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1104:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{}
],
[
"1105:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1106:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1106:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1107:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1107:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1108:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1108:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1105:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1109:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1110:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1110:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1111:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1111:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1112:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1112:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1109:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1113:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1114:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1114:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1115:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1115:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1116:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1116:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1113:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1117:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1118:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1118:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1119:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1119:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1120:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1120:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1117:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1121:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1122:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1122:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1123:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1123:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1124:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
],
[
"1124:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
{
"next": "1121:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
]
],
"value": {
"array_to_flag": [
"1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
"1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
"1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
"1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
],
"flag": true,
"linked_list": "1101:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
"loop": "1105:6eb1e583-4adc-4ad7-9402-7c4e836ce47e",
"map_of_loops": [
[
"bjorn",
"1109:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
],
[
"sven",
"1113:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
],
[
"hilda",
"1117:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
],
[
"freya",
"1121:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
]
],
"pointer_to_flag": "1100:6eb1e583-4adc-4ad7-9402-7c4e836ce47e"
}
}