Apr 23, 2025

Marta P.

6min Read

The awk command in Linux: Understanding syntax, options and common examples

awk is a widely used Linux command for text-processing tasks. You can use this command directly in the terminal to extract data from a text file, scan for patterns, and perform minor actions like formatting text.

This command is also a scripting language, meaning it can be used to write full-fledged programs. However, this article will focus on what you can do with awk in the terminal to manipulate text files. We will cover syntax, common use cases, and answer the most common questions.

Syntax of the awk command

At its core, the awk command takes two kinds of input: a text file and a set of instructions. This is reflected in the basic syntax:

awk '{ action }' filename.txt

action corresponds to the action you want to take on your text file.
filename is the text file.

On the most basic level, the awk command syntax is very simple. All you need is a text file to interact with and an action to perform.
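As a minimal sketch of this structure, the action below runs once for every line of input. Here the input arrives on standard input through a pipe instead of a file, and the two sample lines and the variable name labeled are made up for illustration:

```shell
# awk reads standard input when no file name is given.
# The action { print "Read:", $0 } runs once per input line; $0 is the whole line.
labeled=$(printf 'first line\nsecond line\n' | awk '{ print "Read:", $0 }')
echo "$labeled"
```

Swapping the pipe for a file name, as in the syntax above, gives the same result for a file with the same content.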

Options and syntax variations

Your basic awk command can be further extended by adding options:

-F: defines a field separator.
-v: defines variables.
-f: reads the script from a file.

By default, awk treats whitespace (spaces or tabs) as the delimiter between fields in a file or input. The -F option overrides this default and tells awk which delimiter to use instead. In other words, when you use -F, awk knows how to split each line into parts (fields).

Assuming filename.txt holds colon-separated records, such as 1:Big house:New York, you can use -F as a command line argument to define the colon as the field separator.

awk -F':' '/house/ { print "ID:", $1, "- Type:", $2, "- Location:", $3 }' filename.txt

awk identifies the separator and interprets the fields accordingly:

ID: 1 - Type: Big house - Location: New York
ID: 2 - Type: Small house - Location: Los Angeles

Pattern matching in awk is case-sensitive, so a record such as 4:Houseboat:Seattle is not matched by /house/.

To assign a variable from the command line, you can run:

awk -v word="house" '$0 ~ word { print $0 }' filename.txt

word is now a variable that can be used in your action.
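The -v option is also a convenient way to hand a shell variable to awk. The sketch below assumes a shell variable named term and two made-up sample records; neither comes from the original file:

```shell
# term is a hypothetical shell variable holding the search word.
term="house"

# -v copies the shell value into the awk variable "word" before the program runs.
# $0 ~ word treats the content of word as a regular expression.
matches=$(printf '1:Big house:New York\n3:Cabin:Denver\n' |
  awk -v word="$term" '$0 ~ word { print $0 }')
echo "$matches"
```

Passing the value with -v avoids splicing shell variables into the awk program text, which keeps the quoting simple.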

Finally, the -f option reads the awk program from a file instead of the command line, which is useful when a program contains several rules. Imagine you have a file simple_script.awk containing the following:

# Print the line number and the line content if the line contains the word "house"

$0 ~ /house/ { print NR, $0 }

# Print a message once, before any input is read

BEGIN { print "Starting to search for 'house'..." }

You can run this with:

awk -f simple_script.awk filename.txt

And you’ll have:

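Starting to search for 'house'...
1 1:Big house:New York
2 2:Small house:Los Angeles

Each matching line is prefixed with its line number (NR). The record 4:Houseboat:Seattle is not printed, because the match is case-sensitive.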
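A script file can combine BEGIN, a main rule, and END. The following is a rough sketch rather than a replacement for simple_script.awk: the temporary file created by mktemp, the counter name count, and the three sample records are assumptions made for the example.

```shell
# Write a small awk program to a temporary file (path chosen by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
BEGIN   { print "Starting..." }            # runs once, before any input is read
/house/ { count++ }                        # runs for every line matching /house/
END     { print count " matching line(s)" } # runs once, after the last line
EOF

# Run the program from the file with -f; input comes from the pipe.
report=$(printf '1:Big house:New York\n2:Small house:Los Angeles\n4:Houseboat:Seattle\n' |
  awk -f "$script")
echo "$report"
rm -f "$script"
```

Only two lines are counted, since the capitalized "Houseboat" does not match the lowercase pattern.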

Creating a sample file

Before we discuss use cases, you will need to create a sample file.

For the sake of this example, we will continue to use houses and locations as examples, but create a brand new input file.

To do this, simply use the touch command to create a new file:

touch houses.txt

Since the file is empty, we need to populate it. Let’s also change up the houses from our first example: we may want a small house in Vermont, a large house in San Diego, an apartment in New York, and a houseboat in London. We will also add the square meters for each home.

You can use your preferred text editor (e.g., nano or vim), or write the data directly with echo. Note that the > operator overwrites any existing content in houses.txt.

echo -e "1:Small house:Vermont:100 sqm\n2:Large house:San Diego:300 sqm\n3:Apartment:New York:70 sqm\n4:Houseboat:London:40 sqm" > houses.txt

Now, houses.txt is ready for use in our awk examples.
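Not every shell's echo understands the -e flag, so printf can be a more portable way to create the same content. A small sketch, assuming a temporary file (the hypothetical variable sample_file) so that nothing in your working directory is overwritten:

```shell
# sample_file is a hypothetical name; mktemp picks a fresh temporary path.
sample_file=$(mktemp)

# printf '%s\n' prints each argument on its own line.
printf '%s\n' \
  '1:Small house:Vermont:100 sqm' \
  '2:Large house:San Diego:300 sqm' \
  '3:Apartment:New York:70 sqm' \
  '4:Houseboat:London:40 sqm' > "$sample_file"

# Count the records with awk itself: NR holds the number of lines read so far.
line_count=$(awk 'END { print NR }' "$sample_file")
echo "$line_count"
rm -f "$sample_file"
```

To create the real sample file this way, redirect to houses.txt instead of the temporary path.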

Examples of the awk command

Let’s see how we can use the awk command on our houses.txt in multiple use cases. Below is a list of common scenarios.

1. Printing all lines of a file

To print all the lines from an input file, run the following command:

awk '{print}' houses.txt

This will return the following:

1:Small house:Vermont:100 sqm
2:Large house:San Diego:300 sqm
3:Apartment:New York:70 sqm
4:Houseboat:London:40 sqm

2. Printing a specific column

By default, awk splits each line of a text file into fields (or columns) using whitespace as the separator. Our file uses a colon (:) instead, so we need to pass it with -F. To print specific columns, we also need to know the column’s position within the line.

Let’s imagine we want to print the column containing the floor area of each home. To do this, we’ll run:

awk -F':' '{print $4}' houses.txt

The result will be:

100 sqm
300 sqm
70 sqm
40 sqm

Here:

-F':' tells awk to use colon (:) as the field separator.
$4 prints the fourth field (the floor area in square meters).
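Several fields can be printed in one go, and the last field can be addressed as $NF without counting columns. A short sketch using two of the sample records; the variable name type_and_area and the " -> " separator are chosen only for illustration:

```shell
# $2 is the house type; $NF is the last field of the line, whatever its position.
# Placing strings next to each other (without commas) joins them with no extra spaces.
type_and_area=$(printf '1:Small house:Vermont:100 sqm\n4:Houseboat:London:40 sqm\n' |
  awk -F':' '{ print $2 " -> " $NF }')
echo "$type_and_area"
```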

3. Displaying lines that match a pattern

Let’s imagine you are only interested in the lines of your input file that contain a certain word, or that match a certain pattern. To do this, you’ll need to use regex.

Regex is a pattern-matching technique, and it can be used to create complex patterns to extract very specific parts of text. Here, we will use a very straightforward regular expression.

For example, if you want to print the entire line containing the word “Houseboat” from your input file, you’ll run:

awk -F ':' '/Houseboat/ {print}' houses.txt

Which will give you:

4:Houseboat:London:40 sqm

/Houseboat/ is the regex pattern: awk checks every line against it and prints only the lines that contain the text “Houseboat”. The match is case-sensitive.
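Since matching is case-sensitive, a pattern like /house/ misses "Houseboat". One possible workaround, sketched below, is to lowercase the text before matching and to restrict the match to a single field. The variable name hits is an assumption for the example:

```shell
# tolower($2) ~ /house/ compares a lowercased copy of field 2 with the pattern,
# so "Small house" and "Houseboat" both match, while "Apartment" does not.
hits=$(printf '1:Small house:Vermont:100 sqm\n3:Apartment:New York:70 sqm\n4:Houseboat:London:40 sqm\n' |
  awk -F':' 'tolower($2) ~ /house/ { print $2 }')
echo "$hits"
```

Matching against $2 rather than the whole line also prevents accidental hits in other columns, such as a location name.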

4. Extracting and printing columns using field manipulation

You can also manipulate the fields within your text file and print them in a different order.

Let’s say you want to print each line of our text file as a real estate listing. You can do:

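awk -F ':' '{print "For sale:", $2, "in", $3 ".", "Floor area:", $4}' houses.txt

Running this command will print:

For sale: Small house in Vermont. Floor area: 100 sqm
For sale: Large house in San Diego. Floor area: 300 sqm
For sale: Apartment in New York. Floor area: 70 sqm
For sale: Houseboat in London. Floor area: 40 sqm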

The command allows you to rearrange and format the fields however you like. For example, you could swap $2 and $3 to print the location before the house type.
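When the output should line up in columns, printf gives finer control than print. The widths in this sketch (12 and 10 characters) are arbitrary choices for the sample data, and table is a hypothetical variable name:

```shell
# %-12s left-aligns a string in a 12-character column; %s prints it as-is.
# Unlike print, printf adds no newline on its own, so the format ends with \n.
table=$(printf '1:Small house:Vermont:100 sqm\n4:Houseboat:London:40 sqm\n' |
  awk -F':' '{ printf "%-12s %-10s %s\n", $2, $3, $4 }')
echo "$table"
```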

5. Calculating mathematical operations

The awk command can perform calculations.

Let’s add a column to our data containing the price of each property.

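awk -F ':' '{print $0 ": $" (NR * 100000)}' houses.txt > priced_houses.txt

This command creates a priced_houses.txt with prices for all properties. For simplicity’s sake, we will make up prices based on the line number with NR * 100000. The strings are joined by placing them next to each other, without commas, so that no extra spaces are inserted into the output.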

1:Small house:Vermont:100 sqm: $100000
2:Large house:San Diego:300 sqm: $200000
3:Apartment:New York:70 sqm: $300000
4:Houseboat:London:40 sqm: $400000

Now that we have some numbers, we can try out mathematical operations.

To calculate the total cost of the properties, you’ll be summing the last column, where the prices are stored ($5):

awk -F ':' '{gsub("[$,]", "", $5); sum += $5} END {print "Total cost:", sum}' priced_houses.txt

Which prints:

Total cost: 1000000

Here:

gsub("[$,]", "", $5) removes any dollar signs or commas from the price in the fifth field (to allow for proper calculation).
sum += $5 adds the price to the running total.
END {print "Total cost:", sum} prints the total cost after processing all lines.
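The same running-total pattern extends to other aggregates. As a rough sketch, the average price can be computed by counting the lines alongside the sum. The counter n, the variable average, and the two sample records are assumptions for the example:

```shell
# Strip "$", "," and spaces from field 5, then accumulate the sum and a line count.
# The END block guards against division by zero when the input is empty.
average=$(printf '1:Small house:Vermont:100 sqm: $100000\n2:Large house:San Diego:300 sqm: $200000\n' |
  awk -F':' '{ gsub(/[$, ]/, "", $5); sum += $5; n++ }
             END { if (n > 0) print sum / n }')
echo "$average"
```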

6. Processing data based on conditional statements

To calculate only the price of selected properties – let’s say the apartment in New York and the houseboat in London – you will have to use conditional statements.

awk -F ':' '($2 == "Apartment" || $2 == "Houseboat") {gsub("[$,]", "", $5); sum += $5} END {print "NY + LDN, total cost:", sum}' priced_houses.txt

In this example:

$2 == "Apartment" || $2 == "Houseboat" is the condition that ensures only lines whose second field is exactly "Apartment" or "Houseboat" are processed. || is the logical "OR" operator.
gsub("[$,]", "", $5) removes the dollar signs or commas.
sum += $5 adds the price to the sum.
END {print "NY + LDN, total cost:", sum} prints the total cost for the selected properties.

The above command will print:

NY + LDN, total cost: 700000

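Using our original input file houses.txt, you can use another conditional statement to, for example, label each property according to whether it is larger than 50 sqm. Because the fourth field contains text such as "100 sqm", adding 0 forces awk to compare it as a number rather than as a string:

awk -F ':' '{if ($4 + 0 > 50) print $2, "in", $3, ":", "OK"; else print $2, "in", $3, ":", "too small."}' houses.txt

The above command uses a simple if-else conditional statement separated by a semicolon (;). It will print:

Small house in Vermont : OK
Large house in San Diego : OK
Apartment in New York : OK
Houseboat in London : too small.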

You can find all conditional statements in the GNU Awk User Manual.
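A condition can also be written as a pattern in front of the action, without an explicit if. The sketch below filters on the floor area. Since a field such as "100 sqm" is text, adding 0 is used here to make awk take its leading number for the comparison. The variable name big is an assumption:

```shell
# $4 + 0 converts "100 sqm" to the number 100; the action runs only when the test is true.
big=$(printf '1:Small house:Vermont:100 sqm\n2:Large house:San Diego:300 sqm\n3:Apartment:New York:70 sqm\n4:Houseboat:London:40 sqm\n' |
  awk -F':' '$4 + 0 > 50 { print $2 }')
echo "$big"
```

Without the + 0, awk would compare the field and 50 as strings, character by character, which gives misleading results for values like "100 sqm".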

7. Using built-in variables

awk has several built-in variables, both numerical and string-based, that are pre-defined in the language.

Here are the most commonly used:

NR (Number of Records)
NF (Number of Fields)
FS (Field Separator)
OFS (Output Field Separator)
FILENAME
RS (Record Separator)

For example, to display the number of fields in each line, you’d run:

awk -F ':' '{print "Line", NR, "has", NF, "fields"}' houses.txt

Which would output:

Line 1 has 4 fields
Line 2 has 4 fields
Line 3 has 4 fields
Line 4 has 4 fields

If you want to use OFS to specify the separator between fields when printing the output, you can run:

awk -F ':' 'BEGIN {OFS="XXXX"} {print $1, $2, $3, $4}' houses.txt

Which will print:

1XXXXSmall houseXXXXVermontXXXX100 sqm
2XXXXLarge houseXXXXSan DiegoXXXX300 sqm
3XXXXApartmentXXXXNew YorkXXXX70 sqm
4XXXXHouseboatXXXXLondonXXXX40 sqm

Without -F ':', awk would split each line on whitespace instead, and the separators would land in the middle of the house types and locations.
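FS and OFS can be combined to convert a file from one delimiter to another. A minimal sketch, assuming you want commas instead of colons; the variable name converted is made up for the example:

```shell
# FS sets the input separator, OFS the output separator.
# Assigning a field to itself ($1 = $1) makes awk rebuild the line using OFS.
converted=$(printf '1:Small house:Vermont:100 sqm\n' |
  awk 'BEGIN { FS = ":"; OFS = "," } { $1 = $1; print }')
echo "$converted"
```

The $1 = $1 step matters: a plain print of an unmodified line outputs it exactly as it was read, with the original separators.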

8. Using built-in functions

awk ships with built-in functions, such as tolower() and gsub(), that let you manipulate text directly in the terminal.

For example, to convert the second column (house types) to lowercase, you’ll run:

awk -F ':' '{print tolower($2)}' houses.txt

Which outputs:

small house
large house
apartment
houseboat

Here, tolower($2) is the function being used.

If you want to replace the word “house” with “mansion” in the second column:

awk -F ':' '{gsub(/house/, "mansion", $2); print $2}' houses.txt

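Here, gsub(/house/, "mansion", $2) replaces every occurrence of "house" in the second field. Because the match is case-sensitive, "Houseboat" is left unchanged:

Small mansion
Large mansion
Apartment
Houseboat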
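Alongside functions like tolower() and gsub(), awk lets you define your own with the function keyword. The helper below, sqm_to_sqft, is a hypothetical example and not part of awk. It assumes a conversion factor of 10.7639 square feet per square meter:

```shell
# The awk program spans several lines inside one pair of single quotes.
listing=$(printf '1:Small house:Vermont:100 sqm\n' |
  awk -F':' '
    # sqm_to_sqft is a hypothetical helper, not a built-in function.
    function sqm_to_sqft(sqm) { return sqm * 10.7639 }

    # $4 + 0 takes the leading number from a field such as "100 sqm".
    { printf "%s: %.0f sq ft\n", $2, sqm_to_sqft($4 + 0) }')
echo "$listing"
```

For anything longer than a couple of functions, moving the program into a file and running it with -f tends to be easier to maintain.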

Conclusion

Linux’s awk command is a powerful processing tool developers can use to extract, manipulate, and process data from text files. It can be particularly useful for tasks like parsing logs or even CSV files, as it supports mathematical operations, pattern matching, and field manipulation.

By mastering the basics of awk, you’ll soon be able to use it efficiently across tasks, keeping documents streamlined and having powerful functions at your fingertips.

awk command FAQ

What is awk best used for?

awk is a powerful tool for both arithmetic and string operations. It is best used for text processing, extracting and manipulating structured data, pattern matching, field-based operations, and calculations.

How is awk different from sed?

Both of these are Linux commands. However, sed is best suited for line-based editing and basic text manipulation, while awk is a complete programming language that allows for conditionals and calculations as well as field-based data processing.

Can awk handle large datasets?

Because awk operates line by line rather than loading the entire file into memory, it can process large datasets. However, when performing extremely complex operations on very large files, performance may suffer.

The author

Marta Palandri

Marta Palandri is a senior technical editor with over six years of experience as a developer, working extensively with APIs and backend systems. She now combines her development experience with her editorial background to create content focused on accessibility and storytelling. Find her on LinkedIn.
