UniVerse and UniData Hashed Files - Part 2
Hashed Files are one of the key features of a MultiValue Database that make it efficient for business applications. In our last article, we talked about how the databases generate the pointer or location to a record using a hash key.
In this article we will explore how the record data is stored in the hashed file's groups.
Data Storage Within the Group
At this point, we've explained how data records are assigned to groups; now let's have a look at the way in which the data are stored in the group. Because of the variable length nature of the data, the records are treated as a list. The most significant implication of this is that locating a specific record within a group requires a scan of the records within the group. This has major performance implications that will occupy much of our series of articles.
UniVerse and UniData organize the data records within a group in very different ways. We will describe each method, beginning with UniVerse. Our description will start with the basics, ignoring exceptions such as large records. Later articles will refine the description to include the subtleties.
UniVerse treats the records in a group as a linked list - each record has a link to both the next record and the previous record. There is a twelve-byte header at the beginning of each record. Those twelve bytes contain the forward and backward links and a number of bits that reflect various pieces of status information about the record. Immediately following the record header are the record key and the record data, separated by a segment mark (the HEX "FF" character). The forward link is the physical address - the byte count within the group - to the beginning of the next record. This facilitates scanning the group because it provides the address for the next read. It also facilitates data retrieval because subtracting the current record address from the next record address yields the length of the current record.
UniData uses a different method, in which the record keys serve as an index to the data locations. A position near the middle of the group is chosen - we will refer to this location as DATAPOS. The data record keys for the group are placed in a list, with the number of characters in each key preceding the key. For example, if our record keys were ONE, TWO, THREE and FOUR the list would look like this: "3ONE3TWO5THREE4FOUR". BUT, the list is reversed - it starts at DATAPOS and goes backward toward the beginning of the group, so it would appear more like this: "FOUR4THREE5TWO3ONE3". Starting at the beginning of the group are pairs of numbers containing the length of each record and the physical address of the data for that record. The data itself is written after DATAPOS and grows toward the end of the group with a segment mark between the data for each record.
This approach offers some advantages, too. Scanning the list of record keys for a particular record is quick because we are simply parsing a short string of keys - there is no need to read past the data. Once the required record is located we get the corresponding length / address pair which tells us how much data and where it begins.
Buffer Size, Blocksize and Separation
After reading the preceding discussion of how data records are stored in groups, a natural question is: "How big is a group?" The answer is: "That depends." The size is configurable by the user at the time the file is created. Both UniVerse and UniData allow specification of the buffer size used by the file, however, the two environments use different terminology.
UniVerse uses the term "separation" to define buffer size. Separation is the number of 512 byte units that compose a buffer. So a separation of 1 means the buffer size is 512 bytes; a separation of 2 yields a buffer size of 1,024 bytes; separation 4 gives a buffer size of 2,048 bytes and so forth. The syntax of creating a file varies in UniVerse according to the "account flavor" being used. In the "Ideal" flavor, the command looks like this:
CREATE.FILE filename type modulo separation
For example:
CREATE.FILE TESTFILE 2 3 4
This will create a file named TESTFILE with a file type of 2, a modulo of 3 and a separation of 4.
UniData uses the term "blocksize" to indicate the buffer size of the file. The UniData commands use a "blocksize multiplier" to specify blocksize. A blocksize multiplier of 0 produces a buffer size of 512. Positive integers from 1 to 16 are multiplied by 1,024 so that 1 gives 1,024, 2 yields 2,048, etc. The highest buffer size available in UniData is 16K. The UniData command to create a file looks like this:
CREATE.FILE filename modulo, block.size.multiplier TYPE type
For example:
CREATE.FILE TESTFILE 3,2 TYPE 0
This will create a file named TESTFILE with a modulo of 3, a buffer size of 2,048 and which uses hash type 0.
Future articles will deal in depth with strategies for picking buffer size and will explain the interactions and performance implications of various choices.
Overflow
Now that we've discussed the size of buffers used in groups, it leads naturally to the subject of "overflow". Since the buffer size is finite, what happens when more data is hashed to a particular group than can be contained in one buffer? For instance, suppose that we have a UniVerse file with a separation of 4, thus each group starts out with one buffer of 2,048 bytes. Suppose we add 30 records that average 100 bytes in length and all 30 records happen to hash to the same group of the file. The amount of data to be stored is 3,000 bytes - more than can be held in a buffer of 2,048 bytes. What now?
Both UniVerse and UniData accommodate this situation by adding "overflow" buffers to the group. An overflow buffer is an additional buffer that is linked to the primary buffer of the group. From a logical point of view the group can be treated as though it were simply extended and doubled in size. From a physical point of view, the overflow buffer is not contiguous with the primary buffer and additional disk I/O will be required to retrieve it. The more disk I/O that is needed to retrieve data the slower the retrieval will be. Since there is more data in the group, the process of scanning for a required record will also impact performance.
If the first overflow buffer becomes full, additional overflow buffers are linked to the group. In fact, a group can be composed of as many overflow buffers as are required to hold the data. But the longer the chain of overflow buffers in a group, the slower access to data in the group will be. And the relationship between overflow and performance is not linear - it is geometric. This means that retrieving data in the tenth overflow buffer will be more than twice as slow as retrieving data in the fifth overflow buffer.
It is safe to say that overflow in hashed files can be the single biggest factor in system performance.
Resizing
Since overflow has such a huge performance impact, you might guess that there are strategies for eliminating or reducing it. Both UniVerse and UniData provide tools to resize their files. Resizing recreates the file with altered values for the modulo, file type (hashing algorithm) and buffer size. Intelligent choice of these parameters can turn a poorly performing file into a fast one.
Resizing is time-consuming and requires downtime. The analysis and selection of optimum parameters is not always straightforward. Future articles will talk in greater detail about the trade-offs and strategies for file tuning. The hashed file basics that we've explained will make these future discussions more readily understandable.