
awk

Technologies notes - This article is part of a series.
Part 2: This Article

awk is a language for text parsing and manipulation. It is often used as a tokenizer in bash pipelines, but it can do a lot more than that.

Before starting, here is the most common use case of awk:

some_command_that_prints_on_stdout | awk -F'[SEPARATOR]'  '{print $[FIELD]}'
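As a concrete instance of the pattern above, the sketch below prints the first colon-separated field of each line. The sample input and the helper name `first_field` are invented for illustration.

```shell
# first_field is a hypothetical helper: print the first ":"-separated field of each line
first_field() {
  awk -F':' '{print $1}'
}

# sample input, invented for this example
printf 'alice:x:1000:1000\nbob:x:1001:1001\n' | first_field
```

This should print `alice` and `bob`, one per line.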

Processing model
#

awk starts by loading user-defined functions, then executes the BEGIN block, then processes the input one record at a time (by default a record is a line), and finally runs the END block:

flowchart TD
A[load functions]
B[initial setup\n by running the BEGIN block]
C[read line]
D[process line]
E[clean up\n by running the END block]
A --> B --> C
C --> D
D --> C
D --> E
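The phases in the flowchart can be sketched with a small line counter. The helper name `count_lines` and the awk function `report` are hypothetical, chosen only for this example.

```shell
# count_lines is a hypothetical helper that touches every phase of the model
count_lines() {
  awk '
    function report(k) { return "lines: " k }  # user-defined function, loaded first
    BEGIN { print "start" }                    # runs once, before any input is read
    { n++ }                                    # runs once per record (a line by default)
    END { print report(n) }                    # runs once, after the last record
  '
}

printf 'a\nb\nc\n' | count_lines
```

With the three input lines above, this prints `start` followed by `lines: 3`.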

Syntax
#

Blocks are delimited by {}. Each line contains one instruction; instructions on the same line can be separated by ;

BEGIN{  }
{  }
END{  }
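A minimal sketch of the two ways to separate instructions, assuming whitespace-separated input. Both helpers are hypothetical and do the same thing: print the number of fields and the first field of each line.

```shell
# one instruction per line
one_per_line() {
  awk '{
    n = NF
    print n, $1
  }'
}

# the same instructions on a single line, separated by ;
same_line() {
  awk '{ n = NF; print n, $1 }'
}

printf 'x y z\n' | one_per_line   # prints "3 x"
printf 'x y z\n' | same_line      # prints "3 x"
```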

Regex filters and ~ operator
#

Lines can be filtered with a regular expression: store the regex in a variable, then match each record against it with the ~ operator

BEGIN{
filter="REGEX"
}


$0 ~ filter{
    # operation on matched records
}
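Filled in with a real pattern, the skeleton above might look like the following sketch. The `^ERROR` regex, the sample log lines, and the helper name `only_errors` are assumptions made for the example.

```shell
# only_errors is a hypothetical helper: keep only the lines that start with ERROR
only_errors() {
  awk '
    BEGIN { filter = "^ERROR" }   # regex stored as a string in a variable
    $0 ~ filter { print $0 }      # block runs only for records that match the regex
  '
}

printf 'INFO boot\nERROR disk full\nERROR net down\n' | only_errors
```

Only the two `ERROR` lines should reach the output.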

Match function
#

The match function matches a regex against a string and puts the captured groups (backreferences) in an array; the array argument requires GNU awk

    match($0, /.* — (.*) ft\. .*/, arr)
    print arr[1]
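For awk implementations without the array argument, a rough equivalent can be sketched with the two-argument match, which sets the built-in variables RSTART and RLENGTH. The sample line, the plain ` - ` separator, and the helper name `extract_artist` are assumptions for this example.

```shell
# extract_artist is a hypothetical helper: print the text between " - " and " ft. "
extract_artist() {
  awk '{
    # match() returns the start position (0 if no match) and sets RSTART and RLENGTH
    if (match($0, / - .* ft\. /)) {
      # skip the leading " - " (3 chars) and drop the trailing " ft. " (5 chars)
      print substr($0, RSTART + 3, RLENGTH - 8)
    }
  }'
}

printf 'Song Title - Some Artist ft. Guest\n' | extract_artist   # prints "Some Artist"
```

The offsets are tied to the fixed delimiters in the regex, so they need adjusting if the pattern changes.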

Oneliners
#

  • print all tokens except the first one
awk '{$1=""; print $0}'
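Assigning to `$1` makes awk rebuild the record with the output field separator, so the one-liner leaves a leading space. The sketch below shows that behaviour next to a variant that trims it; both helper names are hypothetical.

```shell
# drop_first wraps the one-liner above
drop_first() {
  awk '{$1=""; print $0}'
}

# drop_first_trim is a variant that also removes the leftover leading separator
drop_first_trim() {
  awk '{$1=""; sub(/^ /, ""); print $0}'
}

printf 'a b c\n' | drop_first        # prints " b c" (note the leading space)
printf 'a b c\n' | drop_first_trim   # prints "b c"
```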
Author
Matteo Longhi
I’m a software engineer with a passion for music, food, dogs, video games and open source software. I’m currently working as a DevOps engineer.