Parsing JSON data in MultiValue - BASIC
What is JSON?
JSON is a simple, text-based way to store and transmit structured data. By using a simple syntax, you can easily store anything from a single number through to strings, arrays, and objects using nothing but a string of delimited text. You can also nest arrays and objects, allowing you to create complex data structures.
JSON is much like XML, but is much more compact. While originally designed to interact with web pages and JavaScript, JSON has become the standard way of exchanging information between hybrid systems and web services.
Why would I use JSON?
If you are looking at integrating existing 3 rd party web services into your MultiValue applications, then you will likely be using JSON to exchange the data.
For example, if you need to GEO code your information, or list your information within a mapping service like Google, then JSON will be the format they will require and provide back to you.
Format of JSON
Here are the basic rules:
- JSON string is grouped together with a "Name" and "Value". The value may be another string, number, Boolean, null, or object.
- A JSON Name/Value pair consisted of a Field Name (think MultiValue dictionary) and a Value separated by a colon. I.e.: "name":"value"
This is the same as using the XML of:
<Name>Value</Name> - A JSON Object is a list of Name/Value pairs, separated by a comma, and surrounded by curly brackets. I.e.: {"name1":"value1","name2":"value2"}
This is the same as using the XML of:
<data>
<name1>value1</name1>
<name2>value2</name2>
</data>
This looks like the following in JavaScript:
data.name1 = "value1"
data.name2 = "value2"
- A JSON Array allows you to specific a group of objects, or name/Value pairs in what would be an "Indexed" Array i.e.: [{"id":"1","Name":"Bob"},{"id":"2","Name":"Joe"}]
This would be the same as doing the following in XML:
<data>
<record>
<id>1</id>
<name>Bob</name>
</record>
<record>
<id>2</id>
<name>Joe</name>
</record>
</data>
This looks like the following in JavaScript:
data[0].id = "1"
data[0].name = "Bob"
data[1].id = "2"
data[1].name = "Joe"
- JSON Value consists of a few possibilities:
Number - Integer, or decimal
String - data surrounded by double quotes
Boolean - the value is either "True" or "False"
Null - the value "null" would be the same as not assigned or ""
Object - anything surrounded in {}
Array - Anything surrounded in []
Here is an example in both XML and JSON:
JSON Example:
{ "orderID": 12345, "shopperName": "John Smith", "shopperEmail": "johnsmith@example.com", "contents": [ { "productID": 34, "productName": "SuperWidget", "quantity": 1 }, { "productID": 56, "productName": "WonderWidget", "quantity": 3 } ], "orderCompleted": true}
XML Example:
<order> <orderID>1234</orderID> <shopperName>John Smith</shopperName> < shopperEmail >John Smith</ shopperEmail > <contents> <product> <productID>34</productID> <productName>SuperWidget</productName> <quantity>1</quantity> </product> <product> <productID>56</productID> <productName>WonderWidget</productName> <quantity>3</quantity> </product> </contents> <orderCompleted>true</orderCompleted> </order>
Parsing JSON
MultiValue BASIC handles data strings really well, and parsing a JSON string is actually easier than you think. Even with the nested components.
The first think to check for though is if the JSON sting is an Object or Array. You do this by checking the 1 st character for an "{" or "[":
ST.POS = 1 BEGIN CASE CASE JSON.ITEM[ST.POS,1] = "{" GOSUB PARSE.OBJECT CASE JSON.ITEM[ST.POS,1] = "[" GOSUB PARSE.ARRAY DATA.ITEM = AM: VALUE CASE 1 ; JSON.ERROR = ST.POS END CASE
Once you know how are you parsing, you need to find the end mark. Since you may have nested objects and/or arrays, this requires a little more intelligence than a FIELD() statement. Let's take a look at how to parse a Object.
Finding the Property names
Since we know that a object will start with an "{" and end with an "}", and between the brackets will be name and value pairs, then we can build a simple dynamic array that contains the Names and the Pairs. Start by find the end of the "Property name", which will be a colon (:).
PARSE.KEY: POS = INDEX(JSON.ITEM[ST.POS,FN.POS],":",1) KEY.NAME = JSON.ITEM[ST.POS,POS-1] * ST.POS = ST.POS + POS * IF KEY.NAME[1,1] = '"' OR KEY.NAME[1,1] = "'" THEN QUOTE = KEY.NAME[1,1] LAST.CHAR = KEY.NAME[LEN(KEY.NAME),1] IF KEY.NAME = LAST.CHAR THEN JSON.ERROR = ST.POS * VALUE = KEY.NAME[2,LEN(KEY.NAME)-2] GOSUB PROCESS.ESCAPE KEY.NAME = VALUE END RETURN
Once we have the Key, or property name, for the value, we need to store the next start position of the JSON string. This helps up speed up the processing of the JSON data.
If you are not quite following what is going with the ST.POS variable, here is another way to look at it:
ST.POS = 1 { "orderID": 12345,"shopperName": "John Smith","shopperEmail": "johnsmith@example.com"} ^ POS = INDEX(JSON.STRING,":",1) ST.POS = ST.POS + POS { "orderID": 12345,"shopperName": "John Smith","shopperEmail": "johnsmith@example.com"} -----------^
There are other ways to optimize this process, but I'll leave that up to you to do.
Handle Escaped Strings
If the supplier of the JSON object was falling the specs correctly, then the key should always be quoted. Quoted strings can also include escaped characters. For example, additional quotes, or other non ASCI characters. These are all proceeded by a backsplash (\). Once the quotes are removed, then you must look to see if there are any escape values.
PROCESS.ESCAPE: *** Process the escape values in a string * DELIMITS = OCONV(VALUE,"MC/N" :VM: "MC/A") IF INDEX(DELIMITS,"\",1) THEN C.COUNT = 1 LOOP C.POS = INDEX(VALUE,"\",C.COUNT) UNTIL C.POS = 0 DO NEXT.CHAR = VALUE[C.POS+1,1] BEGIN CASE CASE NEXT.CHAR = "\" OR NEXT.CHAR = "/" OR NEXT.CHAR = "'" OR NEXT.CHAR = '"' VALUE = VALUE[1,C.POS] :NEXT.CHAR: VALUE[C.POS+2,LEN(VALUE)] C.COUNT = C.COUNT + 1 CASE NEXT.CHAR = "f" VALUE = VALUE[1,C.POS] :CHAR(12): VALUE[C.POS+2,LEN(VALUE)] CASE NEXT.CHAR = "n" VALUE = VALUE[1,C.POS] :CHAR(10): VALUE[C.POS+2,LEN(VALUE)] CASE NEXT.CHAR = "r" VALUE = VALUE[1,C.POS] :CHAR(13): VALUE[C.POS+2,LEN(VALUE)] CASE NEXT.CHAR = "t" VALUE = VALUE[1,C.POS] :CHAR(9): VALUE[C.POS+2,LEN(VALUE)] CASE NEXT.CHAR = "u" HEX.VALUE = VALUE[C.POS+1,4] VALUE = VALUE[1,C.POS] :OCONV(HEX.VALUE,"MY"): VALUE[C.POS+6,LEN(VALUE)] END CASE REPEAT END RETURN
This code is pretty straight forward, but may look complex. It really isn't that complex. The main power of this routine is the OCONV() statement. You will see that it is using "MC/N" and "MC/A". These conversion codes will remove all alpha letters (a-z), and remove all numbers (0-9), and leave everything else (quotes, spaces, punctuation, etc.).
For Example:
Before: 'Hello, world \"we are quoted \"' After: ', \" \"'
the next step is to see if we have any backslashes in the converted string. If so, use the INDEX() statement to find them in the string, and replace the escaped value with the correct value.
Before: 'Hello, world \"we are quoted\"' After: 'Hello, world "we are quoted"'
There are special escape codes that mean different things. For example, if you have a \u, then the next 4 digits after the "\u" is a hex value that needs to be converted. "\n" and "\r" are Carriage Return and Linefeed values, "\t" is a tab value, and "\f" is a formfeed.
Finding the Value
Once we have the Property Name, then we need to extract the value that is assigned to that property. Since the value is not always a string, we need to test to see what type of value it is to decide how to parse it:
PARSE.VALUE: NEXT.CHAR = JSON.ITEM[ST.POS,1] BEGIN CASE CASE NEXT.CHAR = "{" ;* Parse Object ST.DELIMIT = "{" ; FN.DELIMIT = "}" GOSUB PARSE.DELIMIT VALUE = "{": VALUE :"}" CASE NEXT.CHAR = "[" ;* Parse Array GOSUB PARSE.ARRAY CASE NEXT.CHAR = "T" ;* Value: True GOSUB PARSE.TRUE CASE NEXT.CHAR = "F" ;* Value: False GOSUB PARSE.FALSE CASE NEXT.CHAR = "N" ;* Value: Null GOSUB PARSE.NULL CASE NEXT.CHAR = "'" OR NEXT.CHAR = \"\ GOSUB PARSE.STRING CASE 1 GOSUB PARSE.NUMBER END CASE RETURN
Depending on the value, looking for the end can be tricky. There are a few obvious and easy ones: True/False/Null
PARSE.NULL: POS = 4 VALUE = "" ;* JSON.ITEM[ST.POS,POS] ST.POS = ST.POS + POS + 1 RETURN PARSE.TRUE: POS = 4 VALUE = JSON.ITEM[ST.POS,POS] ST.POS = ST.POS + POS + 1 RETURN PARSE.FALSE: POS = 5 VALUE = JSON.ITEM[ST.POS,POS] ST.POS = ST.POS + POS + 1 RETURN
Extract a number is pretty easy as well, but requires a little bit more processing to find the end of the number. Number many only have +, -, ., and 0-9 Any other character terminates the numbers:
PARSE.NUMBER: D = 1 LOOP DELIMIT = OCONV(JSON.ITEM[ST.POS,FN.POS],"MC/N":VM:"MC/A")[D,1] UNTIL INDEX(".,+-",DELIMIT,1) DO D = D + 1 REPEAT POS = INDEX(JSON.ITEM[ST.POS,FN.POS],DELIMIT,1) * VALUE = JSON.ITEM[ST.POS,POS-1] ST.POS = ST.POS + POS RETURN
Much like the escaped string, we are using the MC/N and MC/A conversion codes to get the next non-number, and looking for that character to find the end of the number.
Parse a string value again requires a little bit of processing, mainly because you have to watch for escaped quotes. You start by figuring out the type of quote the provider was using; double or single. Then find the next similar quote that is not escaped.
PARSE.STRING: VALUE = "" * QUOTE = JSON.ITEM[ST.POS,1] ST.POS = ST.POS + 1 * *** Find the next quote, but make sure we check for escaped *** quotes. * DELIMIT.STOP = 0 OCCUR = 1 LOOP QUOTE.POS = INDEX(JSON.ITEM[ST.POS,FN.POS],QUOTE,OCCUR) BEGIN CASE CASE QUOTE.POS = 0 QUOTE.POS = FN.POS + 1 CASE JSON.ITEM[ST.POS + QUOTE.POS - 1,1] = "\" OCCUR = OCCUR + 1 CASE 1 DELIMIT.STOP = 1 END CASE UNTIL DELIMIT.STOP DO REPEAT * *** Get the value, and translate the escaped information * VALUE = JSON.ITEM[ST.POS,QUOTE.POS-1] GOSUB PROCESS.ESCAPE ST.POS = ST.POS + QUOTE.POS RETURN
Now, let's put this all together to create the PARSE.OBJECT subroutine.
PARSE.OBJECT: ST.POS = ST.POS + 1 ERROR.CHECK.POS = 0 * LOOP UNTIL JSON.ITEM[ST.POS,1] = "}" OR ST.POS >= FN.POS OR JSON.ERROR > 0 DO IF ST.POS = ERROR.CHECK.POS THEN *** We have not moved any further. Parsing error * JSON.ERROR = ST.POS END ERROR.CHECK.POS = ST.POS * *** Parse the information * GOSUB PARSE.KEY GOSUB PARSE.VALUE * *** Add the Data * LOCATE KEY.NAME IN DATA.ITEM<1> SETTING KEY.POS THEN DATA.ITEM<2,-1> = VALUE END ELSE DATA.ITEM<1,-1> = KEY.NAME DATA.ITEM<2,-1> = VALUE END * *** Look for comma, since this is another value * TEST.POS = INDEX(JSON.ITEM[ST.POS,FN.POS],"}",1) IF TEST.POS = 0 THEN TEST.POS = FN.POS * NEXT.CHAR = JSON.ITEM[ST.POS,1] BEGIN CASE CASE NEXT.CHAR = "," ST.POS = ST.POS + 1 CASE NEXT.CHAR = "}" * do nothing CASE 1 POS = INDEX(JSON.ITEM[ST.POS,TEST.POS],",",1) IF POS > 0 THEN *** Place Pointer just after the ',' * ST.POS = ST.POS + POS END ELSE *** Place the pointer on the "}" * ST.POS = ST.POS + TEST.POS - 1 END END CASE REPEAT ST.POS = ST.POS + 1 RETURN
There is no real gotcha in this Object parsing code. The end result of the this code and the attached programs will generate a dynamic array contain JSON property Name and value:
DATA.ITEM<1,1> = "orderID" DATA.ITEM<2,1> = "1234" * DATA.ITEM<1,2> = "shopperName" DATA.ITEM<2,2> = "John Smith"