JSON is a must-have skill for IT professionals, since it is probably the most widely used document format for AJAX and REST web services: both are broadly used on the web, so sooner or later you will very likely have to deal with it.

Being skilled in JSON means not only being able to write JSON documents, but also knowing how to use tools such as jq to extract single values, or even a whole subset, from a JSON document.

Having these skills makes your life easier not only if you are a developer, but also if you are involved in the system integration field.

This post is an overview of what you need to know about JSON and how to work with it using jq.

By the way, this post is part of a trilogy of posts dedicated to markup and serialization formats, so be sure not to miss

What is JSON

JavaScript Object Notation (JSON) is a lightweight, easy-to-parse and language-independent data-interchange format for the serialization and exchange of structured data.

Its syntax is derived from a subset of JavaScript (ECMAScript) and is specified by the Standard ECMA-404, which defines a small set of formatting rules for the portable representation of structured data. The official JSON website is https://www.json.org.

A JSON document is typically a collection of name/value pairs separated by commas: values can be numbers, strings, lists or other nested collections. These kinds of collections map easily onto the data structures used by the vast majority of programming languages.

Therefore, JSON can be easily integrated with any language.
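As a rough illustration of that mapping, here is a minimal Python sketch (Python is just one possible choice, and the sample document is made up for this example) that parses a JSON text with the standard json module and shows which native types come out:

```python
import json

# Hypothetical sample document, invented for this illustration.
text = '{"kind": "myobject", "version": 1, "tags": ["a", "b"], "owner": {"name": "marco"}}'

doc = json.loads(text)

# JSON objects become dicts, lists become lists,
# strings become str and numbers become int or float.
types = {key: type(value).__name__ for key, value in doc.items()}
print(types)
```

Nested values are then reached with the usual indexing of the language, for example doc["owner"]["name"].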

In spite of its name, JSON is completely language-agnostic: its syntax was borrowed from JavaScript, but it is not tied to it, and it has nothing to do with Java.

In order to ease learning the JSON syntax, I introduce it step by step, with a hands-on approach that relies on the jq command line utility.

Using VIM with JSON

Of course we have to see how to use jq, but if you are ancient like me, you certainly cannot do without the old-fashioned vi, so let's install the indentLine plugin.

For our convenience we can install it using vim-plug: let's download the vim-plug plugin manager as follows:

curl -fLo ~/.vim/autoload/plug.vim --create-dirs https://raw.githubusercontent.com/junegunn/vim-plug/master/plug.vim

now let's modify our "~/.vimrc" file adding the following snippet:

call plug#begin()
Plug 'https://github.com/Yggdroot/indentLine'
call plug#end()

then let's launch vim as usual:

vim

while in command mode, type the command to install the plugin we listed in "~/.vimrc":

:PlugInstall

if everything went smoothly we can verify the status of the plugins as follows:

:PlugStatus

now simply add the following snippet to your "~/.vimrc" file:

au! BufNewFile,BufReadPost *.{json} set filetype=json
autocmd FileType json setlocal ft=javascript ts=2 sts=2 sw=2 expandtab
let g:indentLine_char = '|'
let g:indentLine_color_term = 239 

here we defined the "json" file type and bound the ".json" files to it. Then we configured some help with indentation.

This is the outcome when editing a JSON file:

it's much more readable and usable, isn't it?

JQ - An Overview

jq is a really handy command line tool to manage JSON objects in shell scripts. We can easily install it as follows:

sudo dnf -y install jq

If you are working with Red Hat versions older than Red Hat Enterprise Linux 8, you must enable the EPEL repository.

Its most basic usage is simply parsing and pretty-printing a JSON object. For example:

echo '{"version": 1, "kind": "myobject" }' | jq

pretty-prints the output as follows:

{
  "version": 1,
  "kind": "myobject"
}

as you see, jq output is colored and properly indented: this makes it much more human readable.

jq is also handy when you are working with the old-fashioned vim: if you are dealing with a plain JSON document and you want to pretty-print it in vim, simply switch to command mode and type ":%!jq .". Of course, you can apply the same trick with other tools, for example with Python: ":%!python -m json.tool".

The main aim of jq however is manipulating JSON.

It relies on filters, such as:

.Identifier
the identifier of the field you want to get: in the previous example, if you want to get the value of the "version" field, the filter is ".version". You can even specify only the dot (.) as filter to tell jq that you want to get the whole input. If you need to return more identifiers, you can list them separated by commas. For example ".version, .kind".
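.Identifier?
same as above, but does not raise an error if the input is not an object (when the input is an object and the Identifier is simply missing, both forms return null).
[i:j]
returns a slice of the input list, where "i" is the index of the first item of the slice and "j" is the index of the item right after the last one (the item at index "j" is not included). When they are missing, "i" defaults to 0 (the index of the first item) and "j" to the length of the list, so the slice runs up to the last item.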

and functions, such as:

tonumber
convert the input into a number
tostring
convert the input into a string
length
returns the length (number of elements) of the object
min
returns the minimum value
max
returns the maximum value
unique
removes duplicate items from the list of entities

filters and functions can be freely combined by chaining them with a pipe "|".

Let's see a few examples - consider the following JSON:

{
    "status": "WARN",
    "message": "Free disk space too low",
    "threshold": "10%",
    "current": "8%"
}

we can easily get the value of the "current" attribute by specifying the ".current" filter:

RESULT='{"status": "WARN", "message": "Free disk space too low", "threshold": "10%", "current": "8%"}'
echo "${RESULT}" | jq '.current'

the output is:

"8%"

if necessary, we can even embed the filters into a string, wrapping each of them in "\(...)":

echo ${RESULT} | jq -r '"Current consumed percentage is \(.current) while threshold is \(.threshold)"'

the outcome is the same string with the values fetched from the JSON:

Current consumed percentage is 8% while threshold is 10%

we can exploit the above feature to export the fetched values into shell variables: for example, the string to export the two values is as follows:

CURRENT=\(.current) THRESHOLD=\(.threshold)

so the whole command is:

export RESULT
export $(echo ${RESULT}|jq -r '"CURRENT=\(.current) THRESHOLD=\(.threshold)"')

let's try it:

echo ${CURRENT}
echo ${THRESHOLD}

the outcome is:

8%
10%

Of course I'm not advocating writing shell scripts to manage JSON: these are just my two cents for when you have to modify existing legacy scripts so that they can parse JSON documents generated by modern software without re-coding the whole script, saving time and therefore money.
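For comparison, when rewriting the script is an option, the same extraction can be sketched in Python with the standard json module. This is only a minimal sketch that reuses the sample document from the examples above; the variable names are my own choice:

```python
import json

# Same sample document used in the jq examples above.
result = '{"status": "WARN", "message": "Free disk space too low", "threshold": "10%", "current": "8%"}'

data = json.loads(result)

# Rough equivalents of the jq filters .current and .threshold
current = data["current"]
threshold = data["threshold"]

summary = f"Current consumed percentage is {current} while threshold is {threshold}"
print(summary)
```

The values land directly in native variables, so there is no need to go through exported environment variables.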

In the rest of this post I use jq to show you some things, but a thorough explanation of its usage is outside the scope of this post: if you want to learn more you can find the official documentation at https://stedolan.github.io/jq/manual/

JSON Serialization

Writing valid JSON documents requires knowing JSON syntax: we are about to see it along with the jq commands that can be used to query the sample JSON documents.

JSON Objects

A JSON object is a comma-separated list of key/value pairs enclosed in curly braces, where each key is a string.

The value of a key can be a simple type, such as a number, a string or one of the literals described later on, or a structured type, such as a list or another JSON object.

For example, this is the syntax to use to specify a JSON object as the value of the "OS" key:

"OS": {
    "OS-Family":"Linux",
    "Vendor": "Red Hat",
    "Version": "8.0"
}

accessing the values of the "OS" JSON object requires taking the nesting level into account: the syntax to get the ".Version" element of the ".OS" dictionary is:

.OS.Version

Let's see how to use jq to get the value of the "OS-Family" element of the ".OS" dictionary: you may guess that the syntax is ".OS.OS-Family", but this time there's the dash character that makes things harder, so the actual syntax is the one shown in the following snippet:

JSON_DICT='"OS":{"OS-Family":"Linux","Vendor": "Red Hat","Version": "8.0"}'
echo "{$JSON_DICT}" | jq '.OS."OS-Family"'

the output indeed is:

"Linux"

Quotes around "OS-Family" are necessary to prevent jq from getting mixed up by the dash "-" character.

I intentionally made a key with a dash (-) to show you how to deal with this special case: in a jq filter, keys that contain dashes or that start with a digit must be enclosed in double quotes. Keep this tip in mind.

A less common need is to list just the "keys" of the JSON object: we can achieve this by piping the output of the filter to the "keys" function as follows:

echo "{$JSON_DICT}" | jq '.OS|keys' 

the output is as follows:

[
  "OS-Family",
  "Vendor",
  "Version"
]
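For readers who end up handling the same document from Python, here is a minimal sketch of the equivalent lookups, assuming the standard json module. Keys are ordinary strings there, so the dash needs no special treatment:

```python
import json

doc = json.loads('{"OS": {"OS-Family": "Linux", "Vendor": "Red Hat", "Version": "8.0"}}')

# Rough equivalent of the jq filter .OS."OS-Family"
family = doc["OS"]["OS-Family"]

# Rough equivalent of .OS | keys (jq returns the keys sorted, hence sorted()).
os_keys = sorted(doc["OS"])

print(family, os_keys)
```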

Comments

The JSON format does not provide a syntax for comments. You can define your own key whose value is treated as a comment by your application, but keep in mind that this is not a standard convention.

null value and booleans

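As stated by one of the changes introduced by RFC 7159 (which obsoletes RFC 4627), a JSON text is a serialized value, and no longer only an object or an array.

This means that null, true and false, which were already valid values inside objects and lists, have officially become valid JSON texts on their own as well.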

The syntax for specifying null and booleans is depicted by the following JSON snippet:

{
    "myCount": null,
    "enabled": true,
    "autoload": false
}

jq recognizes all of these literals, and so it is able to correctly guess the type:

echo '{"myCount": null,"enabled": true,"autoload": false}' | jq 'map(type)'

as expected, the outcome is:

[
    "null",
    "boolean",
    "boolean"
]

Be wary that not all JSON parsers/deserializers handle the null literal gracefully: the risk is that they raise an invalid syntax exception.
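As an example of a parser that does handle these literals, here is a minimal sketch using Python's standard json module, which maps null to None and the two booleans to True and False:

```python
import json

doc = json.loads('{"myCount": null, "enabled": true, "autoload": false}')

# null becomes None, true and false become True and False.
python_types = [type(value).__name__ for value in doc.values()]

# Serializing goes the other way round.
round_trip = json.dumps({"myCount": None, "enabled": True})

print(python_types, round_trip)
```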

Numbers

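JSON has a single number type: a number can be written as an integer or as a floating-point value, optionally with an exponent. Many parsers store numbers as double-precision floating-point values. The actual syntax is depicted by the following JSON snippet: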

{ 
    "aninteger": 42,
    "floating-point": 1.56, 
    "other-floating": 0.9, 
    "exponent": 10E-2
}

valid exponent character markers are "e", "e+", "e-", "E", "E+", "E-".

Let's check the types reported by jq:

echo '{"aninteger": 42, "floating-point": 1.56, "other-floating": 0.9, "exponent": 10E-2}' | jq 'map(type)'

as expected, the outcome is:

[
    "number",
    "number",
    "number",
    "number"
]
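jq reports a single "number" type, but other parsers may be more specific. As a hedged illustration, this sketch shows how Python's standard json module treats the same document: literals without a fraction or exponent become int, the others become float.

```python
import json

doc = json.loads('{"aninteger": 42, "floating-point": 1.56, "other-floating": 0.9, "exponent": 10E-2}')

# Map each key to the name of the Python type chosen by the parser.
number_types = {key: type(value).__name__ for key, value in doc.items()}

print(number_types)
print(doc["exponent"])
```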

Strings

Strings are declared by enclosing the value with double quotes as shown by the following JSON snippet:

{
    "FullName": "Marco Antonio Carcano"
}

let's check the type reported by jq:

echo '{"FullName": "Marco Antonio Carcano"}' | jq 'map(type)'

as expected, the outcome is:

[
    "string"
]

Be wary also that according to the specification:

A string is a sequence of Unicode code points wrapped with quotation marks (U+0022). All characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F. There are two-character escape sequence representations of some characters.

The range from U+0000 to U+001F contains 32 control characters; the ones you are most likely to run into are:

  • U+0000 <control-0000> (NUL: NULL) (used in null-terminated strings)
  • U+0009 <control-0009> (HT: HORIZONTAL TABULATION) (inserted by the tab key)
  • U+000A <control-000A> (LF: LINE FEED) (used as a line break)
  • U+000C <control-000C> (FF: FORM FEED) (denotes a page break in a plain text file)
  • U+000D <control-000D> (CR: CARRIAGE RETURN) (used in some line-breaking conventions)

The consequence is that you cannot put any control character from U+0000 to U+001F, the quotation mark (U+0022) or the reverse solidus (U+005C) into a string without escaping it first.
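In practice a serializer takes care of the escaping for you. The following minimal sketch, assuming Python's standard json module and a made-up sample string, shows the escapes being added on output and a raw control character being rejected on input:

```python
import json

# Made-up sample: a backslash, two quotation marks, a tab and a line feed.
raw = 'C:\\temp "quoted"\tend\n'

# json.dumps escapes every character that JSON forbids inside a string.
encoded = json.dumps(raw)
print(encoded)

# A raw (unescaped) line feed inside a JSON string is not valid JSON.
try:
    json.loads('"line one\nline two"')
    rejected = False
except json.JSONDecodeError:
    rejected = True
```

Decoding the escaped text with json.loads gives back the original string unchanged.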

Multiline strings

A JSON string cannot contain a raw line break, and JSON has no dedicated syntax for multi-line strings, so it's up to the application to decide how to deal with them.

There are two possible approaches:

The first one is to escape every line feed (U+000A) as "\n", which a compliant deserializer turns back into a newline:

{
    "Multiline-string": "first line\nSecond line"
}

otherwise you can try the dirty workaround of splitting the multi-line string into an array (a list - more on this later) of single-line strings and have your application treat them as a single multi-line string:

{
    "Multiline-string": [ 
        "first line",
        "Second line"
    ]
}

let's check the type reported by jq:

echo '{"Multiline-string": [ "first line", "Second line" ]}' | jq 'map(type)' 

this time the outcome is:

[
    "array"
]

an array is actually a list, and it is often called that way.
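To see how the two approaches look from the application side, here is a minimal Python sketch, assuming the standard json module. In the first case the line feed is written as the two-character escape \n in the JSON text, in the second the application joins the items itself:

```python
import json

# Approach 1: the line feed is escaped as \n inside the JSON text.
# The raw Python string hands the backslash to the JSON parser untouched.
escaped = json.loads(r'{"Multiline-string": "first line\nSecond line"}')
text_from_escape = escaped["Multiline-string"]

# Approach 2: a list of single-line strings, joined by the application.
as_list = json.loads('{"Multiline-string": ["first line", "Second line"]}')
text_from_list = "\n".join(as_list["Multiline-string"])

print(text_from_escape)
```

Both variables end up holding the same two-line string.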

Lists

This data structure is what many programming languages call an array; the following example depicts the JSON syntax of a list called "disks":

"disks": [
    "/dev/sda",
    "/dev/sdb",
    "/dev/sdc"
]

Let's see some examples of how to use jq to deal with common needs on JSON lists.

Number Of Elements Using the Length Filter

A common need is certainly getting the number of elements in the list - we can pipe the output of the filter to the "length" function:

JSON_LIST='"disks": [ "/dev/sda", "/dev/sdb", "/dev/sdc" ]'
echo "{$JSON_LIST}" | jq '.disks | length'

Getting One Element Of The List

Just as we do with arrays in a programming language, elements of a list are accessed by index, starting from 0: jq filters work the same way. For example, we can use jq to access the 3rd element of the "disks" list as follows:

echo "{$JSON_LIST}" | jq '.disks[2]'

the output is as follows:

"/dev/sdc"

Getting A Slice Of The List

Another common need is getting a slice (that is, a subset) of a list. For example, the jq statement to get the items from the 2nd to the last one of the list is:

echo "{$JSON_LIST}" | jq '.disks[1:]'

As we previously saw, [i:j] returns a slice of the input list, where i is the index of the first item of the slice and j is the index right after the last item of the slice (the item at index j is excluded). When they are missing, i defaults to 0 (the index of the first item) and j to the length of the list.
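Python slices follow the same convention as jq ones (start index included, end index excluded), so a plain Python list is a convenient way to double-check what a slice returns. A minimal sketch, assuming the "disks" list from the examples above:

```python
disks = ["/dev/sda", "/dev/sdb", "/dev/sdc"]

# Like the jq filter .disks[1:], from the 2nd item to the end.
from_second = disks[1:]

# Like the jq filter .disks[0:2], the item at index 2 is left out.
first_two = disks[0:2]

print(from_second, first_two)
```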

Sorting a List

When dealing with the elements of a list, often we have to get them sorted - it is just a matter of piping the output of the filter to the "sort" function:

NUMBERS='"numbers": [ 1, 0.3, 2.5, -0.9 ]'
echo "{$NUMBERS}" | jq '.numbers | sort'

the output is as follows:

[
  -0.9,
  0.3,
  1,
  2.5
]

these are numbers, but of course we may have the same needs with strings:

NAMES='"names": [ "Marco", "Antonio"]'
echo "{$NAMES}" | jq '.names | sort'

the output is as follows:

[
  "Antonio",
  "Marco"
]

Getting Min and Max

We can pipe the output of the filter to the "min" function to get the minimum element:

echo "{$NUMBERS}" | jq '.numbers | min'

it returns -0.9.

Conversely, if we need the maximum element we can pipe the output of the filter to the "max" function:

echo "{$NUMBERS}" | jq '.numbers | max'

it returns 2.5.
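The same operations exist as built-ins in most scripting languages. As a point of comparison, here is a minimal Python sketch using the two sample lists from above (merged into a single document just for brevity):

```python
import json

doc = json.loads('{"numbers": [1, 0.3, 2.5, -0.9], "names": ["Marco", "Antonio"]}')

sorted_numbers = sorted(doc["numbers"])   # like .numbers | sort
sorted_names = sorted(doc["names"])       # like .names | sort
smallest = min(doc["numbers"])            # like .numbers | min
largest = max(doc["numbers"])             # like .numbers | max

print(sorted_numbers, sorted_names, smallest, largest)
```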

Converting A List Into a List Of Dictionaries

Although this is not very common, there may be situations that require you to convert a list into a list of dictionaries: for example, so that you can then use the "select" function to find out whether the list actually contains a given value.

You can achieve this using the to_entries function followed by []:

echo "{$JSON_LIST}" | jq '.disks| to_entries[]'

the output is:

{
    "key": 0,
    "value": "/dev/sda"
}
{
    "key": 1,
    "value": "/dev/sdb"
}
{
    "key": 2,
    "value": "/dev/sdc"
}

Getting The Index Of An Item Of The list With a Certain Value

After turning the list into a list of dictionaries you can rely on the "select" function to get only the dictionaries with a given value.

For example, to get the dictionary with "/dev/sdb":

echo "{$JSON_LIST}" | jq '.disks| to_entries[] |select(.value=="/dev/sdb")'

the output is

{
    "key": 1,
    "value": "/dev/sdb"
}

now, to get the key, simply append it to the filter:

echo "{$JSON_LIST}" | jq '.disks | to_entries[] | select(.value=="/dev/sdb") | .key'

the output is:

1

So the item with a value of "/dev/sdb" is the 2nd one (index=1).

The same statement can of course be used to see if a list contains a given value:

echo "{$JSON_LIST}" | jq '.disks| to_entries[] |select(.value=="/dev/sdd") .key'

since the "disks" list does not contain "/dev/sdd", the above command produces no output.
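The same lookup can be sketched in Python, where enumerate() plays a role similar to to_entries by pairing each index with its value. Note that indexes_of is a hypothetical helper written only for this illustration:

```python
disks = ["/dev/sda", "/dev/sdb", "/dev/sdc"]

# Pair each index with its value, mimicking the output of to_entries.
entries = [{"key": index, "value": value} for index, value in enumerate(disks)]


def indexes_of(items, wanted):
    """Return the indexes of every item equal to wanted (empty list if none)."""
    return [index for index, value in enumerate(items) if value == wanted]


print(indexes_of(disks, "/dev/sdb"))
print(indexes_of(disks, "/dev/sdd"))
```

An empty list plays the same role as the empty output of jq: the value is not in the list.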

List of JSON Objects

Since we just talked about them, let's see an example of nesting dictionaries as items of a list.

For example:

"foo_vg": [
    {
        "name": "bar_lv",
        "size": "2GiB",
        "type": "xfs",
        "mountpoint": "/opt/bar"
    },
    {
        "name": "baz_lv",
        "size": "1GiB",
        "type": "xfs",
        "mountpoint": "/opt/baz"
    }
]

we can get the entries from the list by specifying [] in the filter:

FS='"foo_vg":[{"name": "bar_lv","size":"2GiB","type":"xfs","mountpoint":"/opt/bar"},{"name": "baz_lv","size":"1GiB","type":"xfs","mountpoint":"/opt/baz"}]'
echo "{$FS}" | jq '.foo_vg[]'

the output is:

{
    "name": "bar_lv",
    "size": "2GiB",
    "type": "xfs",
    "mountpoint": "/opt/bar"
}
{
    "name": "baz_lv",
    "size": "1GiB",
    "type": "xfs",
    "mountpoint": "/opt/baz"
}

Selecting Elements Of A List

The select function can be used to get only the elements of a list that match certain criteria. For example, to get the entries with "xfs" filesystem that are "1GiB" in size:

echo "{$FS}" | jq '.foo_vg[] | select(.type=="xfs" and .size=="1GiB")'

we can easily get the value of the "name" key as follows:

echo "{$FS}" | jq '.foo_vg[] | select(.type=="xfs" and .size=="1GiB")| .name'

the output is:

"baz_lv"

We can of course use a regular expression inside the "select" function, by means of the "test" function:

echo "{$FS}" | jq '.foo_vg[] | select(.mountpoint | test("^/opt."))'
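For comparison, the same selections can be sketched in Python with list comprehensions, which act much like select. This is a minimal sketch that assumes the "foo_vg" list shown above, with re.search standing in for jq's test:

```python
import json
import re

foo_vg = json.loads(
    '[{"name": "bar_lv", "size": "2GiB", "type": "xfs", "mountpoint": "/opt/bar"},'
    ' {"name": "baz_lv", "size": "1GiB", "type": "xfs", "mountpoint": "/opt/baz"}]'
)

# Like select(.type=="xfs" and .size=="1GiB") | .name
small_xfs = [lv["name"] for lv in foo_vg if lv["type"] == "xfs" and lv["size"] == "1GiB"]

# Like select(.mountpoint | test("^/opt.")) | .name
under_opt = [lv["name"] for lv in foo_vg if re.search(r"^/opt.", lv["mountpoint"])]

print(small_xfs, under_opt)
```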

Select an Item From a List Of Dictionaries By Name

we can now check which dictionaries have a key called "mountpoint":

FS='"foo_vg":[{"name": "bar_lv","size":"2GiB","type":"xfs","mountpoint":"/opt/bar"},{"name": "baz_lv","size":"1GiB","type":"xfs"}]'
echo "{$FS}" | jq '.foo_vg | map(has("mountpoint"))'

as you see by the output, the first item has the "mountpoint" attribute, whereas the second one hasn't:

[
    true,
    false
]

Select an Item From a List Of Dictionaries By Value

we can now select only the dictionary whose "name" key has "baz_lv" as value:

echo "{$FS}" | jq '.foo_vg[] | select(.name=="baz_lv")'

the output is:

{
    "name": "baz_lv",
    "size": "1GiB",
    "type": "xfs"
}
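Both checks translate naturally to Python: the "in" operator does the job of has, and a list comprehension does the job of select. A minimal sketch, assuming the version of the "foo_vg" list where "baz_lv" has no "mountpoint" key:

```python
import json

foo_vg = json.loads(
    '[{"name": "bar_lv", "size": "2GiB", "type": "xfs", "mountpoint": "/opt/bar"},'
    ' {"name": "baz_lv", "size": "1GiB", "type": "xfs"}]'
)

# Like .foo_vg | map(has("mountpoint"))
has_mountpoint = ["mountpoint" in lv for lv in foo_vg]

# Like .foo_vg[] | select(.name=="baz_lv")
baz = [lv for lv in foo_vg if lv.get("name") == "baz_lv"]

print(has_mountpoint, baz)
```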

Footnotes

Here ends our quick tour of the amazing world of JSON: instead of just showing the grammar, I preferred to rely on jq to show you things in action. However, be wary that although we saw how to deal with JSON using jq, it is usually more convenient to avoid shell scripts and use scripting languages that can handle JSON natively, such as Python.

In my opinion jq is a powerful tool, but I use it only to run ad-hoc commands or to add JSON support to scripts that may not be worth the effort of rewriting in another language.
