The awk command in Linux: Understanding syntax, options and common examples
awk is a widely used Linux command for text-processing tasks. You can use this command directly in the terminal to extract data from a text file, scan for patterns, and perform minor actions like formatting text.
This command is also a scripting language, meaning it can be used to write full-fledged programs. However, this article will focus on what you can do with awk in the terminal to manipulate text files. We will cover syntax, common use cases, and answer the most common questions.
Syntax of the awk command
At its core, the awk command takes two kinds of input: a text file and a set of instructions. This is reflected in the basic syntax:
awk '{ action }' filename.txt
- action corresponds to the action you want to take on your text file.
- filename.txt is the text file awk reads as input.
On the most basic level, the awk command syntax is very simple. All you need is a text file to interact with and an action to perform.
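The action can also be preceded by an optional pattern, in which case awk runs the action only on the lines that match. The following is a minimal sketch of that form; the three input lines are made up for illustration and are piped in with printf instead of being read from a file.

```shell
# Illustrative input only: three made-up "name count" lines piped to awk on stdin.
# General form: awk 'pattern { action }'
result=$(printf 'alpha 3\nbeta 7\ngamma 5\n' | awk '$2 > 4 { print $1 }')

# Only the lines whose second field is greater than 4 reach the action.
echo "$result"
```

If the pattern is left out, as in the basic syntax above, the action runs on every line.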
Options and syntax variations
Your basic awk command can be further extended by adding options:
- -F: defines a field separator.
- -v: defines variables.
- -f: reads the script from a file.
By default, awk treats whitespace (spaces or tabs) as the delimiter between fields. The -F option tells it to split each line into fields on a different delimiter instead.
For a file with colon-separated records, such as 1:Big house:New York, you can pass -F on the command line to define the colon as the field separator.
awk -F':' '/house/ { print "ID:", $1, "- Type:", $2, "- Location:", $3 }' filename.txt
awk identifies the separator and interprets the fields accordingly:
ID: 1 - Type: Big house - Location: New York
ID: 2 - Type: Small house - Location: Los Angeles
Note that matching is case-sensitive: a line such as 4:Houseboat:Seattle does not contain the lowercase text "house", so /house/ skips it.
To assign a variable from the command line, you can run:
awk -v word="house" '$0 ~ word { print $0 }' filename.txt
word is now an awk variable that you can use in both patterns and actions. Here, it serves as the pattern each line is matched against.
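A common use of -v is to pass a shell value into awk. The sketch below assumes a few made-up colon-separated records (ID, type, location, size) piped in on stdin; threshold and min_size are hypothetical names chosen for this example.

```shell
# Illustrative records only (ID:type:location:size), piped in on stdin.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

threshold=80   # hypothetical shell variable

# -v copies the shell value into the awk variable min_size.
# $4 + 0 converts text such as "100 sqm" to the number 100.
result=$(printf '%s\n' "$houses" |
  awk -F ':' -v min_size="$threshold" '$4 + 0 >= min_size { print $2 }')

echo "$result"
```

Passing the value with -v keeps the awk program in single quotes, so the shell never has to expand anything inside it.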
Finally, the -f option runs an awk program stored in a file, which is convenient when the program contains several rules. Imagine you have a file simple_script.awk containing the following:
# Print a message once, before any input is read
BEGIN { print "Starting to search for 'house'..." }
# Print the line number and the line content if the line contains the word "house"
$0 ~ /house/ { print NR, $0 }
You can run this with:
awk -f simple_script.awk filename.txt
And you’ll have the following, with each matching line prefixed by its line number (NR):
Starting to search for 'house'...
1 1:Big house:New York
2 2:Small house:Los Angeles
Creating a sample file
Before we discuss use cases, you will need to create a sample file.
For the sake of this example, we will continue to use houses and locations as examples, but create a brand new input file.
To do this, simply use the touch command to create a new file:
touch houses.txt
Since the file is empty, we need to populate it. Let’s also change up the houses from our first example: we may want a small house in Vermont, a large house in San Diego, an apartment in New York, and a houseboat in London. We will also add the square meters for each home.
You can use your preferred text editor (e.g., nano or vim), or write the data directly with printf, which interprets \n the same way in every shell:
printf '1:Small house:Vermont:100 sqm\n2:Large house:San Diego:300 sqm\n3:Apartment:New York:70 sqm\n4:Houseboat:London:40 sqm\n' > houses.txt
Now, houses.txt is ready for use in our awk examples.
Examples of the awk command
Let’s see how we can use the awk command on our houses.txt in multiple use cases. Below is a list of common scenarios.
1. Printing all lines of a file
To print all the lines from an input file, run the following command:
awk '{print}' houses.txt
This will return the following:
1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm
2. Printing a specific column
As we already saw, awk splits each line of a text file into fields (or columns), using whitespace as the separator by default. In our file, the fields are separated by a colon (:) instead. To print specific columns, we need to know the column’s position within the line.
Let’s imagine we want to print the column containing the size of each home in square meters. To do this, we’ll run:
awk -F':' '{print $4}' houses.txt
The result will be:
100 sqm
300 sqm
70 sqm
40 sqm
Here:
- -F':' tells awk to use colon (:) as the field separator.
- $4 prints the fourth field (size in square meters).
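Several fields can be printed at once by separating them with commas, and $NF refers to the last field of a line, whatever its position. A short sketch, using the same four records as houses.txt but piped in on stdin so it runs on its own:

```shell
# Same four records as houses.txt, piped in on stdin so the sketch is self-contained.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

# $2 is the property type; $NF is the last field (here the same as $4).
result=$(printf '%s\n' "$houses" | awk -F ':' '{ print $2, $NF }')

echo "$result"
```

The comma in print inserts the output field separator (a single space by default) between the two values.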
3. Displaying lines that match a pattern
Let’s imagine you are only interested in the lines of your input file that contain a certain word, or that match a certain pattern. To do this, you’ll need to use regex.
Regex is a pattern-matching technique, and it can be used to create complex patterns to extract very specific parts of text. Here, we will use a very straightforward regular expression.
For example, if you want to print the entire line containing the word “Houseboat” from your input file, you’ll run:
awk -F ':' '/Houseboat/ {print}' houses.txt
Which will give you:
4:Houseboat:London:40 sqm
/Houseboat/ is the regex pattern: it tells awk to print every line that contains the text “Houseboat”. The match is case-sensitive.
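A pattern can also be restricted to a single field and made case-insensitive. One possible approach, sketched below with the houses.txt records piped in on stdin, is to lowercase the field with tolower() before matching it with the ~ operator.

```shell
# Same four records as houses.txt, piped in on stdin so the sketch is self-contained.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

# tolower($2) ~ /house/ matches "house" in the second field regardless of case,
# so "Houseboat" is included while "Apartment" is not.
result=$(printf '%s\n' "$houses" | awk -F ':' 'tolower($2) ~ /house/ { print $2 }')

echo "$result"
```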
4. Extracting and printing columns using field manipulation
You can also manipulate the fields within your text file and print them in a different order.
Let’s say you want to print each line of our text file as a real estate listing. You can do:
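awk -F ':' '{print "For sale:", $2, "in", $3 ".", "Size:", $4}' houses.txt
Running this command will print:
For sale: Small house in Vermont. Size: 100 sqm
For sale: Large house in San Diego. Size: 300 sqm
For sale: Apartment in New York. Size: 70 sqm
For sale: Houseboat in London. Size: 40 sqm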
The command allows you to rearrange and format the fields however you like. For example, you could swap $2 and $3 to print the location before the house type.
5. Calculating mathematical operations
The awk command can perform calculations.
Let’s add a column to our data containing the price of each property.
awk -F ':' '{print $0 ": $" (NR * 100000)}' houses.txt > priced_houses.txt
This command creates priced_houses.txt, with a price appended to each property. For simplicity’s sake, we make up the prices by multiplying the line number by 100,000 (NR * 100000). The pieces are joined by string concatenation rather than commas, so print adds no extra spaces between them. The new file contains:
1:Small house:Vermont:100 sqm: $100000
2:Large house:San Diego:300 sqm: $200000
3:Apartment:New York:70 sqm: $300000
4:Houseboat:London:40 sqm: $400000
Now that we have some numbers, we can try out mathematical operations.
To calculate the total cost of the properties, you’ll be summing the last column, where the prices are stored ($5):
awk -F ':' '{gsub("[$,]", "", $5); sum += $5} END {print "Total cost:", sum}' priced_houses.txt
Which prints:
Total cost: 1000000
Here:
- gsub("[$,]", "", $5) removes any dollar signs or commas from the price in the fifth field (to allow for proper calculation).
- sum += $5 adds the price to the running total.
- END {print "Total cost:", sum} prints the total cost after processing all lines.
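The same running total can be turned into an average by dividing it by NR, which in the END block holds the number of lines read. A minimal sketch, using the contents of priced_houses.txt piped in on stdin:

```shell
# Same four records as priced_houses.txt, piped in on stdin so the sketch is self-contained.
priced='1:Small house:Vermont:100 sqm: $100000
2:Large house:San Diego:300 sqm: $200000
3:Apartment:New York:70 sqm: $300000
4:Houseboat:London:40 sqm: $400000'

# Strip "$" and "," from the price, add it up, then divide by the line count.
# The NR > 0 check avoids a division by zero on empty input.
result=$(printf '%s\n' "$priced" | awk -F ':' '
  { gsub(/[$,]/, "", $5); sum += $5 }
  END { if (NR > 0) print "Average price:", sum / NR }')

echo "$result"
```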
6. Processing data based on conditional statements
To calculate only the price of selected properties – let’s say the apartment in New York and the houseboat in London – you will have to use conditional statements.
awk -F ':' '($2 == "Apartment" || $2 == "Houseboat") {gsub("[$,]", "", $5); sum += $5} END {print "NY + LDN, total cost:", sum}' priced_houses.txt
In this example:
- $2 == "Apartment" || $2 == "Houseboat" is the condition that ensures only lines whose second field is exactly “Apartment” or “Houseboat” are processed. || is the logical “OR” operator.
- gsub("[$,]", "", $5) removes the dollar signs or commas.
- sum += $5 adds the price to the sum.
- END {print "NY + LDN, total cost:", sum} prints the total cost for the selected properties.
The above command will print:
NY + LDN, total cost: 700000
Using our original input file houses.txt, you can use another conditional statement to, for example, only print a property if it’s larger than 50 sqm:
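awk -F ':' '{if ($4 + 0 > 50) print $2, "in", $3 ":", "OK"; else print $2, "in", $3 ":", "too small."}' houses.txt
The above command uses a simple if-else conditional statement separated by a semicolon (;). Because the fourth field contains text such as "100 sqm", awk would compare $4 with 50 as strings; adding 0 ($4 + 0) makes it compare the leading number instead. It will print:
Small house in Vermont: OK
Large house in San Diego: OK
Apartment in New York: OK
Houseboat in London: too small.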
You can find all conditional statements in the GNU Awk User Manual.
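Conditions can also be chained with else if. The sketch below sorts the houses.txt records into three size classes; the thresholds (200 and 60 sqm) and the variable names size and label are arbitrary choices made for this example.

```shell
# Same four records as houses.txt, piped in on stdin so the sketch is self-contained.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

# The 200 and 60 sqm thresholds are hypothetical values picked for illustration.
result=$(printf '%s\n' "$houses" | awk -F ':' '
{
  size = $4 + 0                      # "100 sqm" becomes the number 100
  if (size >= 200)      label = "large"
  else if (size >= 60)  label = "medium"
  else                  label = "small"
  print $2 ": " label
}')

echo "$result"
```

Writing the program across several lines inside the single quotes keeps longer conditionals readable without needing a separate script file.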
7. Using built-in variables
awk has several built-in variables, both numerical and string-based, that are pre-defined in the language.
Here are the most commonly used:
- NR (Number of Records)
- NF (Number of Fields)
- FS (Field Separator)
- OFS (Output Field Separator)
- FILENAME
- RS (Record Separator)
For example, to display the number of fields in each line, you’d run:
awk -F ':' '{print "Line", NR, "has", NF, "fields"}' houses.txt
Which would output:
Line 1 has 4 fields
Line 2 has 4 fields
Line 3 has 4 fields
Line 4 has 4 fields
If you want to use OFS to specify the separator between fields when printing the output, you can run:
awk -F ':' 'BEGIN {OFS="XXXX"} {print $1, $2, $3, $4}' houses.txt
Which will print:
1XXXXSmall houseXXXXVermontXXXX100 sqm
2XXXXLarge houseXXXXSan DiegoXXXX300 sqm
3XXXXApartmentXXXXNew YorkXXXX70 sqm
4XXXXHouseboatXXXXLondonXXXX40 sqm
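FS and OFS can be combined to convert a file from one delimiter to another. The following sketch turns the colon-separated houses.txt records into comma-separated lines; it relies on the fact that assigning to a field ($1 = $1) makes awk rebuild the line with OFS.

```shell
# Same four records as houses.txt, piped in on stdin so the sketch is self-contained.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

# FS splits the input on ":", OFS joins the output with ",".
# $1 = $1 changes nothing in the data but forces awk to rebuild $0 using OFS.
result=$(printf '%s\n' "$houses" | awk 'BEGIN { FS = ":"; OFS = "," } { $1 = $1; print }')

echo "$result"
```

Setting FS in a BEGIN block has the same effect as passing -F ':' on the command line.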
8. Using functions
In awk, you can manipulate your text more efficiently by calling built-in functions such as tolower() and gsub() directly in the terminal.
For example, to convert the second column (house types) to lowercase, you’ll run:
awk -F ':' '{print tolower($2)}' houses.txt
Which outputs:
small house
large house
apartment
houseboat
Here, tolower($2) is the function being used.
If you want to replace the word “house” with “mansion” in the second column:
awk -F ':' '{gsub(/house/, "mansion", $2); print $2}' houses.txt
Here, gsub(/house/, "mansion", $2) is the function. Because the match is case-sensitive, “Houseboat” is left unchanged. The command outputs:
Small mansion
Large mansion
Apartment
Houseboat
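Besides the built-in functions, awk lets you define your own with the function keyword. The sketch below defines a hypothetical helper, to_sqft, that converts square meters to square feet (1 sqm is roughly 10.7639 sq ft) and applies it to the houses.txt records piped in on stdin.

```shell
# Same four records as houses.txt, piped in on stdin so the sketch is self-contained.
houses='1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm'

result=$(printf '%s\n' "$houses" | awk -F ':' '
# to_sqft is a hypothetical helper defined for this sketch
function to_sqft(sqm) {
  return sqm * 10.7639
}
{
  # $4 + 0 turns "100 sqm" into 100; %.0f rounds to a whole number
  printf "%s: %.0f sq ft\n", $2, to_sqft($4 + 0)
}')

echo "$result"
```

Longer programs with several functions are usually easier to maintain in a script file run with the -f option.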

Conclusion
Linux’s awk command is a powerful processing tool developers can use to extract, manipulate, and process data from text files. It can be particularly useful for tasks like parsing logs or even CSV files, as it supports mathematical operations, pattern matching, and field manipulation.
By mastering the basics of awk, you’ll soon be able to use it efficiently across tasks, keeping documents streamlined and having powerful functions at your fingertips.
awk command FAQ
What is awk best used for?
awk is a powerful tool for both arithmetic and string operations. It is best used for text processing, extracting, manipulating structured data, pattern matching, field-based operations, and calculations.
How is awk different from sed?
Both of these are Linux commands. However, sed is best suited for line-based editing and basic text manipulation, while awk is a complete programming language that allows for conditionals and calculations as well as field-based data processing.
Can awk handle large datasets?
Because awk operates line by line rather than loading the entire file into memory, it can process large datasets. However, when performing extremely complex operations on very large files, performance may suffer.