Secure Coding for MultiValue
Remember when you were a kid and all you wanted was to be a "grown-up"? Everyone told you not to rush it. "It's not as great as you think it will be," they would say, knowingly. The situation is not so different for those of us in the MultiValue community. We have been talking for years about wanting to be more mainstream and more recognized, but as we extend our reach we may find that, just like being a grown-up, being mainstream isn't all it is cracked up to be. This is particularly true as it relates to security. For years we've been able to crouch behind "security by obscurity" as cover. Unfortunately, even if our flexible database remains fairly obscure, we are all busy pushing our data out to SQL and widely used reporting databases and web interfaces. We are moving stuff around through very mainstream methods, and lauding the fact that we are doing so. User interfaces, networking, and data in motion generally are NOT unique to the MultiValue world, so we must consider all of the usual security issues that any IT entity must consider: firewalls, email filtering, passwords, encryption, and SSL. In addition to all of that, there are very specific ways that data and security breaches can be orchestrated against our own beloved database and programming environments.
My own perspective on this issue took a right turn several years ago when I was speaking to one of my typically small but fierce audiences of security enthusiasts. When I mentioned a very common SQL injection threat, Dan McGrath (now Managing Director U2 Servers Lab) began describing how a similar attack could be implemented using a Basic program and our own retrieval language. Dan was working with data that had to be secure and was doing some creative thinking about hacking into and hacking up MultiValue systems that most of us weren't. Some of this discussion came from ideas born in his devious mind. More information and some fascinating yet terrifying examples can be found at his blog http://u2tech.wordpress.com/.
There are some general security issues and ideas that are the same no matter what the platform, the threat, or the year. The details between these guideposts are the specific vulnerabilities in MultiValue and secure coding practices for the environment. To keep it all in perspective, we'll start from the top.
Awareness — Yours and Your Users'
Inarguably the best defense against any threat or attack is awareness — both within the IT community and within the larger user community. Make sure users are aware of what could happen if they clicked a link and the types of persuasion they may run into. Make a point to understand the motivation — what would someone be after, and why — and convey that to the user base. This way they can think creatively when confronted with a new angle. For example, a rising awareness issue is employment ads. When the company advertises for specific skills they are stating very publicly exactly what technology is in play. Can you hire people without describing the skills? Of course not. But risks can be mitigated with some thought to wording, so HR must not be excluded from the company-wide awareness initiative!
Threat Modeling & Defense in Depth
In order to keep users aware of the nature of new threats, and to maintain a state of vigilance in general, IT must put good security controls into effect. Threat Modeling helps in understanding the risk: the vulnerabilities and the threats. Defense in depth refers to having a multi-faceted approach to security. It's like having a lock on the door and a deadbolt and a chain. And maybe a barbed wire fence. And a big dog. Probably not a moat. Some specific security approaches do go out of favor.
Reducing the Attack Surface
This is the hip new way to say "minimize access." The "attack surface" is any point of access into the system or the data. This encompasses issues around data at rest and in motion, firewalls, VPN, and other access. These concerns are common to all IT enterprises regardless of platform or industry. The fundamental human issues related to access are 1) passwords: don't use default passwords or allow static passwords; 2) the principle of least privilege: don't allow any individual access to anything unless it is necessary; and 3) defined roles / segregation of duties: making sure that other eyes/sign-offs are involved in any critical activities. These three can be addressed in the code itself under the dual umbrellas of IT General Controls (an example of which is passwords) and Application Controls (defining and enforcing who can do what within the application software). But the real action is in sanitizing data, including validating input from human or non-human sources.
Security in All Stages of the Software Lifecycle
Requirements Phase
The best way to get the coders, the testers, and the users involved and aware of security issues is to begin by at least mentioning security when defining the requirement. It can be as simple as remembering to pose the question, "Have the security implications of this new feature been considered?"
Coding Phase
As programmers, our focus is on providing quality solutions to our users by thinking about the "user friendliness" of what we are developing. We don't want to lose that, but we may need to temper it a bit by thinking about whether we are developing applications that are "too friendly." We must ask ourselves, "Have we put too much power in the hands of an unhappy or unwelcome user?" And while we don't want to be paranoid, neither do we want to be too trusting of data that comes into the system from outside sources, or even of inside MV data that comes through third-party applications. Unlike Blanche from A Streetcar Named Desire, we mustn't depend upon the kindness of strangers when it comes to sanitizing incoming data.
When we are coding and reviewing code, here are some general risks with a MultiValue twist that we should consider:
Math: Potential integer arithmetic issues. Equations that are incorrect can corrupt data (precision loss/rounding errors) or bring the system to its knees due to overflows.
Input: When something can get into the system from the tip of human fingers it should be carefully examined before it is welcomed in to the rest of your data where it may do harm. It is not just a convenience to the user to remind them of the sort of data they should be entering, but by making certain the data is what was expected we can protect the system from accidental or intentional corruption. When input data should be a number, check for a number. When it should be a date, check for a date. Not just any date or any number, either. Make sure it is in a reasonable range. Range checks are rather infrequently found in MultiValue applications, but extremely large or incorrectly small numbers or massive ranges of dates are a very easy way to force a breakdown.
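The type and range checks described above can be sketched in a few lines. The example below is written in Python purely for illustration, and the same checks translate directly into a Basic input routine. The function names, the quantity limits, and the date window are all assumptions made up for this sketch, not values from any real application.

```python
from datetime import date, datetime


def parse_quantity(raw, low=1, high=10_000):
    """Return raw as an int if it is a plain whole number in [low, high]."""
    text = raw.strip()
    # Reject signs, decimals, and absurdly long digit strings before converting.
    if not (text.isascii() and text.isdigit()) or len(text) > 9:
        raise ValueError("quantity must be a whole number")
    value = int(text)
    # A number is not enough: it must also fall in a reasonable range.
    if not low <= value <= high:
        raise ValueError(f"quantity must be between {low} and {high}")
    return value


def parse_order_date(raw, earliest=date(2000, 1, 1), latest=date(2035, 12, 31)):
    """Return raw as a date if it is a real YYYY-MM-DD date inside the window."""
    try:
        value = datetime.strptime(raw.strip(), "%Y-%m-%d").date()
    except ValueError:
        raise ValueError("date must be a valid YYYY-MM-DD date") from None
    if not earliest <= value <= latest:
        raise ValueError("date is outside the accepted range")
    return value
```

The point of the design is that both checks happen before the value goes anywhere near a file or a calculation, so an oversized number or a date centuries away is stopped at the door.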
In addition to testing for range we may want to go further and rely more on the practice of testing input against a white list. In non-security terminology that might be a code file or a reference table. Whatever you call it, it is a list of valid inputs so that something that comes in from an outside source won't make it past the fence if it isn't specifically listed as valid.
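A white-list check is about as simple as validation gets. The sketch below, again in Python for illustration only, assumes a hypothetical set of order status codes. In a MultiValue application the list would more likely be read from a code file or reference table.

```python
# Hypothetical list of valid codes; in practice this would come from a code file.
VALID_ORDER_STATUSES = frozenset({"OPEN", "SHIPPED", "CLOSED"})


def validate_status(raw):
    """Return the normalized status code, or raise if it is not on the list."""
    candidate = raw.strip().upper()
    if candidate not in VALID_ORDER_STATUSES:
        # Anything not specifically listed as valid is refused.
        raise ValueError("unrecognized status code")
    return candidate
```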
Input: Executes. The riskiest thing we ever do in a MultiValue Basic program is an EXECUTE. This is because we can execute almost anything. Many applications cleverly build up executable statements based on input from the user. This opens the door for a creative vandal. In the same way that a SQL injection can cause a dump of a whole database, allowing untested strings to be built into executes can wreak all kinds of havoc. Here are some things to keep in mind.
- The 'USING' keyword allows a statement to be created against one file, using the dictionary of another file. An intrepid hacker could create a dictionary that executes subroutines, and more, that would not be detected even in a carefully protected data file and its associated dictionary. Then simply add the phrase 'USING OTHERDICT' when the prompt asks for customer name. It's all about the quote marks.
- Quotes and Wildcards: If allowing a user to input their own criteria, sanitize the entry for extra quotes and wildcard characters. Imagine all the things that could be added to an execute by expanding it this way.
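One way to act on the two points above is to accept only a narrow set of characters before the user's text is placed inside a quoted literal. The sketch below is Python for illustration. The file name CUSTOMER, the dictionary item NAME, and the allowed character set are assumptions chosen for the example, and a real application would tune the pattern to its own data.

```python
import re

# Allow-list: letters, digits, spaces, and hyphens only, at most 40 characters.
# Quotes, brackets, and periods are excluded because they can end the literal
# or act as wildcards in some retrieval-language flavors.
_SAFE_VALUE = re.compile(r"[A-Za-z0-9 \-]{1,40}")


def build_customer_select(name):
    """Return a select statement with the user's text kept inside one literal."""
    value = name.strip()
    if not _SAFE_VALUE.fullmatch(value):
        raise ValueError("customer name contains characters that are not allowed")
    # With no quote characters in value, it cannot break out of the literal,
    # so added keywords such as USING stay harmless text.
    return f'SELECT CUSTOMER WITH NAME = "{value}"'
```

Note that a string such as USING OTHERDICT passes the character check on its own. It is only dangerous when it arrives with a quote mark that ends the literal early, which is exactly what the allow-list refuses.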
Input: Marker Injection. Sanitize for marker/control characters such as attribute marks and value marks. Most folks that have been around the MultiValue database for any length of time have dealt with a data file corrupted by control characters. Such a simple thing can cause mysterious problems.
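A minimal sketch of marker sanitizing follows, in Python for illustration. It assumes a single-byte character set in which the system delimiters occupy character codes 251 through 255, which is the conventional MultiValue layout. On a system configured for a different encoding the codes to strip would differ.

```python
# Conventional delimiter codes: text mark 251, subvalue mark 252,
# value mark 253, attribute mark 254, item (segment) mark 255.
MARK_CODES = range(251, 256)


def strip_marks(text):
    """Return text with system delimiters and control characters removed."""
    return "".join(
        ch
        for ch in text
        # Keep printable characters only: drop codes below 32, DEL (127),
        # and the delimiter range.
        if ord(ch) >= 32 and ord(ch) != 127 and ord(ch) not in MARK_CODES
    )
```

Whether to strip the offending characters quietly or to reject the whole input is a design choice. Rejecting is the stricter option, and it also gives you something to log.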
Input: Keys. Allowing external input to be written out as the key to a file is risky business. When this is necessary, be sure to carefully validate anything that may be used as the key to a file. The wrong character in a record key can corrupt an entire database. It can skew the hashing algorithm or throw off indexes, too. These are things that will not show up immediately but will degrade performance.
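Key validation can be as strict as the application's key format allows. The sketch below, in Python for illustration, assumes a hypothetical key format of one to twenty characters made up of uppercase letters, digits, periods, and hyphens, starting with a letter or digit. Substitute the real format of your own keys.

```python
import re

# Hypothetical key format: starts with a letter or digit, then letters,
# digits, periods, or hyphens, 20 characters at most.
_KEY_PATTERN = re.compile(r"[A-Z0-9][A-Z0-9.\-]{0,19}")


def validate_key(raw):
    """Return raw unchanged if it is an acceptable record key, else raise."""
    # No trimming here: stray whitespace in a key is itself a reason to refuse.
    if not _KEY_PATTERN.fullmatch(raw):
        raise ValueError("record key has an invalid format")
    return raw
```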
Input: External input as called subroutines. Avoid using external input to set dynamic subroutine names for CALL @ statements without white-listing the input first. If the external input is compromised, it may enable unintended subroutines to be called.
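The white-listing idea for dynamic calls can be sketched as a lookup table that maps each permitted name to its routine. This is Python for illustration, and the action names and handler functions are invented for the example. In Basic the equivalent is checking the name against a list of approved subroutines before it is ever used in an indirect CALL.

```python
def post_invoice(record_id):
    """Stand-in for a real routine; returns a message for demonstration."""
    return f"posted {record_id}"


def void_invoice(record_id):
    """Stand-in for a real routine; returns a message for demonstration."""
    return f"voided {record_id}"


# Only names listed here can ever be called from external input.
ALLOWED_ACTIONS = {"POST": post_invoice, "VOID": void_invoice}


def dispatch(action_name, record_id):
    """Call the routine for action_name only if it is on the white list."""
    handler = ALLOWED_ACTIONS.get(action_name)
    if handler is None:
        raise PermissionError("action is not permitted")
    return handler(record_id)
```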
BASIC Coding vulnerabilities
- Lock / write / release matchup. A common accidental mistake, this can be readily capitalized on for nefarious purpose. When coding and when reviewing code, be sure to match up the reads and releases. If an item isn't written after it has been read-locked, it must be released. Otherwise a lock table can overflow and bring the system to its knees.
- Dynamic arrays. Similarly, a dynamic array will work very well and very flexibly, even when used badly. It will try and try. But each reference to an element far down a large array generally means scanning from the front to find it. Like the lane-line painter who leaves the can of paint at the start of the road, the work slows down as the distance back to the paint increases, while the amount of effort required continues to grow.
- Oversharing. Error messages and help prompts are tricky beasts on so many levels. Ideally they pose a training opportunity. And of course they are useless if they don't present enough of the right information to help the user. But beware of exposing too much information, especially about the underpinnings of the system. Too many specific details about the inner workings of the software can help a casual intruder.
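The lock / write / release matchup in the first bullet above comes down to one rule: every path out of the code that took the lock must give it back. The sketch below models that rule in Python with a toy lock table, which is an invention for the example and not how any MultiValue platform stores its locks. In Basic the same discipline means making sure every path after a READU ends in either a WRITE or a RELEASE.

```python
from contextlib import contextmanager


class LockTable:
    """Toy stand-in for a system lock table."""

    def __init__(self):
        self.held = set()

    def acquire(self, key):
        if key in self.held:
            raise RuntimeError("record is already locked")
        self.held.add(key)

    def release(self, key):
        self.held.discard(key)


@contextmanager
def record_lock(table, key):
    """Hold a lock on key for the duration of the block, then always release."""
    table.acquire(key)
    try:
        yield key
    finally:
        # Runs on normal exit and on errors, so no lock is left behind.
        table.release(key)
```

A reviewer can then look for a single pattern rather than tracing every branch by hand.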
Other MultiValue-specific security miscellany
- TCL commands and verbs. The availability of TCL commands that can compromise data is at the root of the cautions about building execute statements in Basic. While paying close attention to what users can enter that can be executed is still sound advice, the use of Remote VOCs can provide sweeping security by limiting what users or what processes can execute what commands against what files, even from TCL.
- Spooler closing. Another example of something that is often done by accident, not closing a spooler entry can create a monster of epic proportion. This is something that should be on the code-review checklist. It's a cheap hack.
- UniObjects. This handy tool provided by U2 can really give developers a leg-up in providing various user interfaces. Unfortunately it can give an unfriendly visitor a similar leg-up. For one thing, it is a dangerous security hole to have a port open to anything except internal servers. For another, UniObjects has no fine-grained server-side control over what actions can be performed or what commands can be issued. In the default configuration, as long as you can log in, you get a free pass to the back-end data. There are some steps that will help, though. One is to create a UOLOGIN subroutine to whitelist users who can make database calls via UniObjects. (This is particularly useful when you use middle-tier services, as you only need to allow the services' usernames to have access.) As mentioned above, Remote VOCs can be implemented to protect verbs through the use of white lists and Access Control Lists.
Hashed Files
The hashing and group structure of a MultiValue database is ripe with opportunity for a well-informed admin with a grudge. Loading up a file beyond its reasonable capacity, skewing it with certain alpha or numeric key structures, and other strategies can bring a particular file or even the entire system to its knees. This gets back to sanitizing input, particularly whatever will be the key to the file. Be sure to test it for the right characters and range and general reasonableness.
Testing Phase
Code Review / Peer Review. Happily, it is more common than in years past for companies to have a policy of peer reviewing or even to have dedicated code reviewers. Either way, this is a really good idea. Having another pair of eyes looking over any new or changed software before it goes for user testing, and especially having that review conducted with security in mind, can up the game considerably. Bear in mind that the critical point here is that it must not be the programmer reviewing his or her own work. The idea of a code review is to have someone look at it fresh, to see it differently. Another smart idea is for the security folks to build a list of things to consider during the code review. This would be specifically for security-related issues, separate from whatever other things your company may wish to review, such as coding standards.
Failing gracefully. The first item on this new list of security considerations during a technical review is to force a failure and see how it behaves — see if any gaping holes are opened in the security of the system. Does a failure drop the user to TCL? Or into a powerful debugger?
There are a couple of settings in a U2 system that can mitigate the risks that go along with a program failure. On UniData there is an option 41 which, when on, will loop back to the program when an "execute" fails, rather than dropping out to TCL (see also UDT.OPTIONS 105). An entry called ON.ABORT in the VOC can direct control to another command, process, or program whenever a ctrl-break, a debugger entry, or a drop to TCL does occur.
Includes. A peer reviewer/code reviewer would be well advised to take a look at any includes in a program, and not just assume that they do what their name seems to imply. Imagine the havoc that can be wreaked by tweaking an include statement to equate common and database variables to something other than what it sounds like they are! Particular attention should be paid to security-related COMMON and EQUATE statements.
Logs, Audits and Reports
Logs are helpful for securing, monitoring, and debugging applications. They can be particularly helpful for logging the status of phantom processes. It is important, however, that log files do not contain sensitive data that is not needed, such as passwords. Make sure you are not recording normally secure information in unsecured log files. Log files are likely to turn up on developers' machines and in print-outs, and they are generally unencrypted.
Don't log too little or useless info. Log files should contain enough information to determine when something happened, what happened exactly and by whom. Effective logging not only gives you diagnostic capabilities for when mistakes happen (user OR developer), but can also act as a deterrent for would-be malicious parties.
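The two points so far, keeping secrets out of the log while still recording when, what, and by whom, can be combined in one small helper. The sketch below is Python for illustration. The field names in the sensitive list and the layout of the log entry are assumptions for the example.

```python
import json
from datetime import datetime, timezone

# Hypothetical names of fields that must never appear in a log in the clear.
SENSITIVE_FIELDS = frozenset({"password", "card_number", "ssn"})


def build_log_entry(user, action, details, now=None):
    """Return one JSON log line recording when, who, and what, minus secrets."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # Replace the value of any sensitive field before it is written anywhere.
    safe_details = {
        key: ("[REDACTED]" if key.lower() in SENSITIVE_FIELDS else value)
        for key, value in details.items()
    }
    entry = {"when": stamp, "who": user, "what": action, "details": safe_details}
    return json.dumps(entry, sort_keys=True)
```

Redacting at the point where the entry is built, rather than when the log is read, means the secret never reaches the file, the print-out, or the developer's machine.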
Don't log too much information. Big Data is a lot like the Boogie Man. An invisible threat that is scary in theory but so far we haven't run into it face to face. Unlike the Boogie Man, however, the threat of Big Data is going to remain — and become more real — as we become grown-ups. Not only are we dealing with large amounts of varied data formats, but we are also creating big data when we log access to the data! Our log files and audit trails quickly become useless when they are too many, too much, or too detailed. Next, we want to think about protecting the logs themselves. And logging access to the logs. Log logs. This can quickly get out of hand! Not only should content, format, security, and duration of the logs be considered when you are building new audit trails but we need to consider the audit trails left by spoolers, COMOs and external communications.
Security is a big topic. Even security specific to MultiValue is a big topic. It is getting bigger and more urgent all the time. The most important piece of advice offered here is to remember that this list is incomplete, no matter how complete it may seem. We have to keep thinking about security in every new thing that we do. Talk about it, think about it, keep it in mind.