Sed & Awk

Master powerful text processing tools for stream editing and data manipulation

Introduction to Sed

Sed (Stream Editor) is a powerful text-processing utility that performs editing operations on text streams or files. It's perfect for automated editing tasks, search and replace operations, and text transformation.

When to Use Sed

  • Simple find and replace
  • Delete lines
  • Insert/append text
  • Line-based operations

Basic Syntax

sed [options] 'command' file

sed -i 's/old/new/g' file


Sed: Basic Operations

Substitution (s command)

# Basic substitution
echo "hello world" | sed 's/world/universe/'
# Output: hello universe

# Global substitution (all occurrences)
echo "foo foo foo" | sed 's/foo/bar/g'
# Output: bar bar bar

# Case-insensitive substitution (the I flag is a GNU sed extension)
echo "Hello WORLD" | sed 's/hello/hi/I'
# Output: hi WORLD

# Substitute on specific line
sed '3s/old/new/' file.txt

# Substitute on line range
sed '2,5s/old/new/g' file.txt
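
Capture groups are worth knowing early: \( \) marks a group in sed's basic regex syntax, and \1, \2 refer back to it in the replacement (they appear again in the practical examples below). A small illustration with made-up sample text:

# Swap the first two words by reusing captured groups
echo "hello world" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'
# Output: world hello

# Wrap a matched number in brackets
echo "build 42 done" | sed 's/\([0-9]\+\)/[\1]/'
# Output: build [42] done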

Delete (d command)

# Delete line 3
sed '3d' file.txt

# Delete lines 2-5
sed '2,5d' file.txt

# Delete last line
sed '$d' file.txt

# Delete lines matching pattern
sed '/pattern/d' file.txt

# Delete empty lines
sed '/^$/d' file.txt

# Delete lines starting with #
sed '/^#/d' file.txt
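
Delete commands combine naturally with multiple expressions, for example to strip comments and blank lines from a file in one pass (the file name here is only an illustration):

# Remove comment lines and empty lines in a single invocation
sed -e '/^#/d' -e '/^$/d' app.conf

# Equivalent semicolon-separated form
sed '/^#/d; /^$/d' app.conf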

Print (p command)

# Print specific line (use -n to suppress default output)
sed -n '5p' file.txt

# Print line range
sed -n '10,20p' file.txt

# Print lines matching pattern
sed -n '/error/p' file.txt

# Print lines NOT matching pattern
sed -n '/error/!p' file.txt
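
Print also accepts ranges, which is handy for pulling a block out of a file; a quick sketch (the markers are placeholders):

# Print from the first line matching START to the end of the file
sed -n '/START/,$p' file.txt

# Print only the block between two markers
sed -n '/BEGIN/,/END/p' file.txt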

Sed: Advanced Operations

In-place Editing

# Edit file in-place
sed -i 's/old/new/g' file.txt

# Create backup before editing
sed -i.bak 's/old/new/g' file.txt

# Edit multiple files
sed -i 's/old/new/g' *.txt
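
One portability caveat: BSD/macOS sed requires an explicit (possibly empty) backup suffix after -i, so the bare -i form above only works with GNU sed. A sketch of the portable options:

# macOS/BSD sed: an empty suffix means "no backup file"
sed -i '' 's/old/new/g' file.txt

# Accepted by both GNU and BSD sed: always write a backup
sed -i.bak 's/old/new/g' file.txt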

Insert and Append

# Insert line before match
sed '/pattern/i\New line before' file.txt

# Append line after match
sed '/pattern/a\New line after' file.txt

# Insert at specific line number
sed '3i\New line' file.txt

# Append to end of file
sed '$a\Last line' file.txt
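
The related c\ command (listed in the quick reference below) replaces whole lines instead of inserting around them:

# Replace every line matching the pattern
sed '/pattern/c\Replacement line' file.txt

# Replace line 3 outright
sed '3c\New third line' file.txt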

Multiple Commands

# Use -e for multiple commands
sed -e 's/foo/bar/g' -e 's/hello/hi/g' file.txt

# Use semicolon separator
sed 's/foo/bar/g; s/hello/hi/g' file.txt

# Use script file
cat << 'EOF' > script.sed
s/foo/bar/g
s/hello/hi/g
/^$/d
EOF
sed -f script.sed file.txt

Address Ranges

# From pattern to pattern
sed '/START/,/END/d' file.txt

# From pattern to line number
sed '/BEGIN/,10d' file.txt

# Every nth line (first~step is a GNU sed extension)
sed -n '1~2p' file.txt  # Print odd-numbered lines

# Last line
sed -n '$p' file.txt
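
Addresses can also scope a group of commands with braces, so an edit applies only inside part of the file; a small sketch with placeholder markers:

# Substitute only between the START and END markers
sed '/START/,/END/{s/foo/bar/g}' file.txt

# Delete blank lines, but only within the first 20 lines
sed '1,20{/^$/d}' file.txt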

Sed: Practical Examples

🔧 Configuration File Management

# Change configuration value
sed -i 's/^DEBUG=.*/DEBUG=true/' config.sh

# Comment out a line
sed -i 's/^max_connections/# max_connections/' postgresql.conf

# Uncomment a line
sed -i 's/^# *\(max_connections\)/\1/' postgresql.conf

# Update port number
sed -i 's/port=[0-9]*/port=8080/' config.ini

📝 Log File Processing

# Extract ERROR lines
sed -n '/ERROR/p' application.log

# Remove timestamps
sed 's/^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9:]\+ //' log.txt

# Keep only IP addresses
sed -n 's/.*\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/p' access.log

🔄 Text Transformation

# Convert Windows line endings to Unix
sed -i 's/\r$//' file.txt

# Add line numbers
sed = file.txt | sed 'N;s/\n/\t/'

# Reverse lines (tac alternative)
sed '1!G;h;$!d' file.txt

# Double-space a file
sed 'G' file.txt

Introduction to Awk

Awk is a powerful programming language designed for text processing and data extraction. It excels at processing columnar data, generating reports, and performing complex text manipulations.

When to Use Awk

  • Column-based data
  • Mathematical operations
  • Complex conditionals
  • Report generation

Basic Syntax

awk 'pattern { action }' file

awk -F: '{ print $1 }' /etc/passwd
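
As a quick illustration of the pattern { action } form (the input is just sample text):

# The pattern selects matching lines, the action says what to do with them
printf 'alice 82\nbob 45\n' | awk '$2 > 50 { print $1 }'
# Output: alice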


Awk: Basic Concepts

Fields and Records

# $0 = entire line
# $1, $2, $3... = fields (columns)
# NF = number of fields
# NR = record (line) number

# Print first column
awk '{ print $1 }' file.txt

# Print last column
awk '{ print $NF }' file.txt

# Print line number and first field
awk '{ print NR, $1 }' file.txt

# Custom field separator
awk -F: '{ print $1 }' /etc/passwd
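
The separators themselves are variables: FS controls how input is split, and OFS is what print places between comma-separated fields. A short sketch using /etc/passwd:

# Read colon-separated fields, write them back pipe-separated
awk 'BEGIN { FS=":"; OFS=" | " } { print $1, $7 }' /etc/passwd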

Pattern Matching

# Print lines matching pattern
awk '/error/ { print }' log.txt

# Print lines where column 3 > 100
awk '$3 > 100' data.txt

# Print lines where column 2 equals "active"
awk '$2 == "active" { print $1 }' status.txt

# Regex matching on specific field
awk '$1 ~ /^[0-9]+$/ { print }' file.txt

# NOT matching
awk '$2 !~ /test/ { print }' file.txt
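
Awk also supports range patterns, similar to sed's address ranges: the action runs from a line matching the first pattern through a line matching the second (the markers below are placeholders):

# Print everything between the two markers, inclusive
awk '/START/,/END/ { print }' file.txt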

Awk: Advanced Features

BEGIN and END Blocks

# BEGIN executes before processing
# END executes after processing

awk 'BEGIN { print "Report Start" }
     { total += $1 }
     END { print "Total:", total }' numbers.txt

# Print header and footer
awk 'BEGIN { print "Name\tAge" }
     { print $1 "\t" $2 }
     END { print "---End---" }' data.txt

Variables and Calculations

# Sum column values
awk '{ sum += $3 } END { print sum }' sales.txt

# Average
awk '{ sum += $1; count++ }
     END { print sum/count }' numbers.txt

# Count occurrences
awk '{ count[$1]++ }
     END { for (item in count) print item, count[item] }' data.txt

# Built-in variables
awk '{ print "Line", NR, "has", NF, "fields" }' file.txt

Conditional Statements

# If-else
awk '{ if ($3 > 100) 
           print $1, "high"
       else 
           print $1, "low" 
     }' data.txt

# Multiple conditions
awk '{ if ($2 == "active" && $3 > 50)
           print $1, "good"
       else if ($2 == "inactive")
           print $1, "offline"
       else
           print $1, "check"
     }' status.txt
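
For simple two-way decisions, the ternary operator is a compact alternative to if-else:

# Same high/low classification in a single expression
awk '{ print $1, ($3 > 100 ? "high" : "low") }' data.txt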

Functions

# String functions
awk '{ print length($0) }' file.txt          # Length
awk '{ print toupper($1) }' file.txt         # Uppercase
awk '{ print tolower($1) }' file.txt         # Lowercase
awk '{ print substr($1, 1, 3) }' file.txt    # Substring

# Math functions
awk '{ print sqrt($1) }' numbers.txt         # Square root
awk '{ print int($1) }' numbers.txt          # Integer part

# String matching
awk '{ if (match($0, /[0-9]+/)) print }' file.txt
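
Two other frequently used string functions are gsub, which substitutes in place, and split, which breaks a string into an array; a brief sketch:

# Replace every occurrence of foo with bar, then print the modified line
awk '{ gsub(/foo/, "bar"); print }' file.txt

# Split a comma-separated line into parts and print the first piece
awk '{ n = split($0, parts, ","); print parts[1] }' data.csv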

Awk: Practical Examples

📊 Data Analysis

# Calculate statistics
awk 'BEGIN { max = 0; min = 999999 }
     { 
       sum += $1
       if ($1 > max) max = $1
       if ($1 < min) min = $1
       count++
     }
     END {
       print "Count:", count
       print "Sum:", sum
       print "Average:", sum/count
       print "Max:", max
       print "Min:", min
     }' numbers.txt

# Group by column and sum
awk '{ sales[$1] += $2 }
     END { for (name in sales) 
           print name, sales[name] }' sales.txt

📋 Report Generation

# Format as table
awk 'BEGIN { printf "%-20s %-10s %10s\n", "Name", "Status", "Value" 
             print "-------------------------------------------" }
     { printf "%-20s %-10s %10.2f\n", $1, $2, $3 }
     END { print "===========================================" }' data.txt

# CSV to formatted output
awk -F, 'NR==1 { 
           for (i=1; i<=NF; i++) header[i]=$i
         }
         NR>1 {
           for (i=1; i<=NF; i++) print header[i]": "$i
           print "---"
         }' data.csv

🔍 Log Analysis

# Count HTTP status codes
awk '{ status[$9]++ }
     END { for (code in status)
           print code, status[code] }' access.log

# Find top 10 IP addresses
awk '{ ip[$1]++ }
     END { for (addr in ip)
           print ip[addr], addr }' access.log | sort -rn | head -10

# Calculate average response time
awk '{ sum += $NF; count++ }
     END { print "Avg response:", sum/count, "ms" }' response.log

Combining Sed and Awk

Sed and Awk work great together in pipelines:

# Clean and process data
cat data.txt |
    sed 's/[[:space:]]\+/ /g' |       # Normalize spaces
    sed '/^$/d' |                     # Remove empty lines
    awk '{ print $1, $2 }'            # Extract columns

# Extract and calculate
grep "SALE" transactions.log | \
    sed 's/.*amount=\([0-9.]*\).*/\1/' | \
    awk '{ sum += $1 } END { print "Total: $"sum }'

# Complex pipeline
cat access.log |
    sed -n '/200 /p' |                # Only successful requests
    awk '{ print $1, $7 }' |          # IP and path
    sort |
    uniq -c |
    sort -rn |
    head -20

Quick Reference

Sed Commands

  • s/old/new/ - Substitute
  • d - Delete
  • p - Print
  • i\ - Insert
  • a\ - Append
  • c\ - Change

Awk Built-ins

  • $0 - Entire line
  • $1, $2... - Fields
  • NF - Number of fields
  • NR - Record (line) number
  • FS - Input field separator
  • OFS - Output field separator

💡 Pro Tip: Use sed for simple line-based operations and awk for column-based data processing. Combine them in pipelines for powerful text processing workflows!