AWK is a powerful pattern scanning and processing language developed by Alfred Aho, Peter Weinberger and Brian Kernighan at Bell Labs - the name of the tool is indeed derived by concatenating the initials of their surnames. It is one of those tools that every Linux professional (not only the more seasoned ones) must be skilled in, since it is broadly used in a lot of shell scripts that are very often inherited from predecessors and must be maintained: the sad truth is that it is very often not worth the effort to rewrite them in other, more modern languages, so knowing how to deal with AWK can really make your life easier. And anyway, sometimes it takes much less time to code an AWK one-liner than a Python script, so knowing how and when to use AWK is certainly still a valuable skill nowadays.
The aim of "The Ultimate AWK Tutorial For Professionals" is not to provide a complete explain about how to code with AWK - there are more modern and handy languages such as Python nowadays: I just want to provide a very quick yet comprehensive walkthrough on it focusing on how to write AWK one-liners that are often embedded in shell scripts or that you can use to sort out common system administration tasks. That's why I'm also showing some real-life use cases with AWK one-liners that can very quickly and easily sort things out.
Category: Pillars
Seasoned Linux professionals know data formats thoroughly: this is mandatory, since many tools use these formats as:
- output format (CSV, XML, JSON, …)
- the format for their settings files (YAML, TOML, XML, JSON, INI, …)
- the format of the document to be sent to an API (XML for SOAP, JSON for REST)
It goes without saying that being familiar with regular expressions is also mandatory: many legacy tools like grep (in all of its flavours, such as egrep) and sed use them as pattern matching criteria. They should also know how to leverage awk when a little bit of business logic is needed while processing data, and of course know most of the so-called "coreutils" (sort, cut, wc, uniq, …). Modern Linux professionals are also skilled in format-specific tools such as xpath and xmlstarlet (XML), jq (JSON) or yq (YAML) - the sketch below shows a couple of these tools in action.
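As a quick, hedged example of the kind of one-liners these tools enable (the file names and the JSON key are made up for illustration):

```sh
# extract the value of the "version" key from a JSON file
jq -r '.version' settings.json

# filter log lines matching an extended regular expression
grep -E '^(ERROR|WARN)' /var/log/messages
```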
In addition to that, Linux professionals should also have a thorough understanding of:
- encryption technologies that guarantee data integrity and confidentiality on the disk (openssl, GPG, PGP, …)
- encryption technologies that guarantee data integrity and confidentiality on the wire (TLS, X.509 certificates, Public Key Infrastructure, …)
- design patterns that exploit encryption technologies, such as Shamir’s Secret Sharing
- cryptographic APIs, such as PKCS#11
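Just to make the first point concrete, here is a hedged sketch of verifying integrity and guaranteeing confidentiality on the disk (file names are hypothetical):

```sh
# compute a SHA-256 digest to verify a file's integrity
openssl dgst -sha256 archive.tar.gz

# symmetrically encrypt a file with GPG to keep it confidential at rest
gpg --symmetric --cipher-algo AES256 secret.txt
```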
Honestly, in my experience I have seen too many technicians neglect this topic, but believe me, doing so can be very dangerous.
The aim of this post is to show a tidy way to structure a C or C++ project, managing the build lifecycle with GNU Make and packaging it as an RPM.
The post demonstrates a full-featured C project managed by make and packaged as an RPM, showing how to set up a tidy structure and how to develop and package a C application with its own shared objects that reads its configuration from a file, validates settings, logs events to a file and handles error conditions by printing to standard error and properly setting the shell return code.
This post is certainly useful not only to developers, but to anybody who wants to learn how to build third-party C or C++ software, since it clearly describes the compilation and linking process. In addition to that, we also learn how to create the product certificate that can be exploited by subscription-manager to know that the product is installed on the system.
The application is then packaged not only as a gzipped tarball but also as an RPM, creating the application package, a package with the development resource files (the C include files) and a package with the debug information that can be used with a debugger to troubleshoot things.
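To give an idea of the compile-and-link process the post walks through, here is a hedged sketch (source, library and spec file names are made up; the real post uses its own layout):

```sh
# compile a position-independent object and link it into a versioned shared library
gcc -fPIC -c foo.c -o foo.o
gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0.0 foo.o
ln -s libfoo.so.1.0.0 libfoo.so   # linker-name symlink so that -lfoo resolves

# compile the application and link it against the shared library
gcc main.c -o myapp -L. -lfoo

# build the binary, devel and debuginfo packages from the spec file
rpmbuild -ba myapp.spec
```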
Sed is a command line tool that can really do amazing stream manipulations: although it is certainly a "seasoned" tool, it is very likely that there are a lot of sed one-liners inside your company's scripts.
Having at least a working understanding of it is a must if you want to be able to maintain this legacy stuff, which is very often not worth the effort to rework.
And anyway, when dealing with quick and dirty solutions that rely on shell scripts, or when writing documentation with shell commands that the reader can easily copy and paste... it's still an excellent tool that, honestly, I cannot work without.
The aim of this post is to provide an easy tutorial to quickly learn how to use sed in every situation that can be easily sorted out with a sed one-liner.
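For instance, two classic one-liners of the kind this tutorial covers (file names are illustrative):

```sh
# replace every occurrence of "foo" with "bar", editing the file in place
sed -i 's/foo/bar/g' settings.conf

# print only lines 10 through 20 of a file
sed -n '10,20p' /var/log/messages
```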
In memory of Lee E. McMahon, contributor to early versions of the Unix operating system and, in particular, author of the sed stream editor.
Despite being a boring task, comparing files is a need that IT professionals sometimes have to deal with, for many reasons:
- verifying whether a file has been corrupted
- verifying whether a file has been tampered with
- comparing two versions of a configuration file to see where they differ - this happens quite often when an application stops working as it should after a configuration change and you have to guess why
- generating a patch that can be used to go back and forth between the current and previous versions of the same files
and so on.
This post explains how to deal with these needs on Linux using the tools provided by the Linux distribution.
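As a hedged preview of the kind of commands the post covers (file names are made up):

```sh
# verify integrity: compute a SHA-256 digest to compare with a known-good one
sha256sum archive.tar.gz

# show where two versions of a configuration file differ, in unified format
diff -u settings.conf.orig settings.conf

# generate a patch, apply it, and revert it later with patch -R if needed
diff -u settings.conf.orig settings.conf > settings.patch
patch settings.conf.orig < settings.patch
```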
YAML is a must-have skill for IT professionals, since it is probably becoming the most commonly used document format for manifests and configuration files - think for example of Kubernetes, Ansible and a lot of other modern DevOps-oriented or CI/CD tools such as drone.
Being skilled in YAML does not only mean being able to write YAML documents, but also being able to efficiently query and manipulate YAML files.
This post provides everything you are likely to need to know to exploit YAML in your daily work, explaining its syntax and showing things in action by using yq - a tool we can consider "the jq for YAML" - and Python with PyYAML.
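As a minimal sketch of the kind of queries shown in the post (assuming the Go-based mikefarah implementation of yq, version 4, and a made-up Kubernetes manifest):

```sh
# extract the name of a Kubernetes object from its manifest
yq '.metadata.name' deployment.yaml

# update a value, editing the manifest in place
yq -i '.spec.replicas = 3' deployment.yaml
```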
By the way, this post is part of a trilogy of posts dedicated to markup and serialization formats, so be sure not to miss the other two.
JSON is a must-have skill for IT professionals, since it is probably the most used document format when dealing with AJAX and REST web services: since both of them are broadly used on the web, it is very likely that sooner or later you'll have to deal with it.
Being skilled in JSON does not only mean being able to write JSON documents, but also knowing how to exploit tools such as jq to extract values or even a subset of a JSON document.
Having these skills makes your life easier not only if you are a developer, but also if you are involved in the system integration field.
This post is an overview of all you need to know about JSON and how to work with it using jq.
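For instance, a self-contained taste of the kind of jq queries the post explains (the sample document is made up):

```sh
# extract a single value from a JSON document
echo '{"name": "jq", "tags": ["json", "cli"]}' | jq -r '.name'

# extract a subset: every element of the "tags" array
echo '{"name": "jq", "tags": ["json", "cli"]}' | jq '.tags[]'
```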
By the way, this post is part of a trilogy of posts dedicated to markup and serialization formats, so be sure not to miss the other two.