Sed & Awk
Master powerful text processing tools for stream editing and data manipulation
Introduction to Sed
Sed (Stream Editor) is a powerful text-processing utility that performs editing operations on text streams or files. It's perfect for automated editing tasks, search and replace operations, and text transformation.
When to Use Sed
- Simple find and replace
- Delete lines
- Insert/append text
- Line-based operations
Basic Syntax
sed [options] 'command' file
sed -i 's/old/new/g' file
Sed: Basic Operations
Substitution (s command)
# Basic substitution
echo "hello world" | sed 's/world/universe/'
# Output: hello universe
# Global substitution (all occurrences)
echo "foo foo foo" | sed 's/foo/bar/g'
# Output: bar bar bar
# Case-insensitive substitution (the I flag is a GNU sed extension)
echo "Hello WORLD" | sed 's/hello/hi/I'
# Output: hi WORLD
# Substitute on specific line
sed '3s/old/new/' file.txt
# Substitute on line range
sed '2,5s/old/new/g' file.txt
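The replacement side of `s` can reuse the matched text: `&` stands for the whole match, and `\1`–`\9` refer to `\( \)` capture groups. A minimal sketch:

```shell
# & inserts the entire match into the replacement
echo "version 42" | sed 's/[0-9]*$/(&)/'
# Output: version (42)

# \1 and \2 refer to the capture groups, in order
echo "John Smith" | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/'
# Output: Smith, John
```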
Delete (d command)
# Delete line 3
sed '3d' file.txt
# Delete lines 2-5
sed '2,5d' file.txt
# Delete last line
sed '$d' file.txt
# Delete lines matching pattern
sed '/pattern/d' file.txt
# Delete empty lines
sed '/^$/d' file.txt
# Delete lines starting with #
sed '/^#/d' file.txt
Print (p command)
# Print specific line (use -n to suppress default output)
sed -n '5p' file.txt
# Print line range
sed -n '10,20p' file.txt
# Print lines matching pattern
sed -n '/error/p' file.txt
# Print lines NOT matching pattern
sed -n '/error/!p' file.txt
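Address ranges combine with `p` as well, which is handy for extracting a block between two markers (inclusive). A sketch using inline input:

```shell
# Print only the block between two marker lines
printf 'a\nSTART\nb\nc\nEND\nd\n' | sed -n '/START/,/END/p'
# Output:
# START
# b
# c
# END
```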
Sed: Advanced Operations
In-place Editing
# Edit file in-place
sed -i 's/old/new/g' file.txt
# Create backup before editing
sed -i.bak 's/old/new/g' file.txt
# Edit multiple files
sed -i 's/old/new/g' *.txt
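Portability note: `-i` differs between implementations. GNU sed (Linux) takes an optional suffix attached to the flag, while BSD/macOS sed requires the suffix as a separate argument (an empty string for "no backup"). A sketch using a throwaway `demo.txt`:

```shell
# GNU sed (Linux): the optional backup suffix attaches directly to -i
printf 'old\n' > /tmp/demo.txt
sed -i.bak 's/old/new/' /tmp/demo.txt
cat /tmp/demo.txt      # new
cat /tmp/demo.txt.bak  # old

# BSD/macOS sed: the suffix is a separate argument; pass '' for no backup
# sed -i '' 's/old/new/' demo.txt
# sed -i .bak 's/old/new/' demo.txt
```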
Insert and Append
# Insert line before match
sed '/pattern/i\New line before' file.txt
# Append line after match
sed '/pattern/a\New line after' file.txt
# Insert at specific line number
sed '3i\New line' file.txt
# Append to end of file
sed '$a\Last line' file.txt
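Alongside `i\` and `a\` there is `c\`, which replaces the entire matching line with new text:

```shell
# Replace a whole matching line (c command)
printf 'keep\nTODO fix me\nkeep\n' | sed '/TODO/c\Done.'
# Output:
# keep
# Done.
# keep
```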
Multiple Commands
# Use -e for multiple commands
sed -e 's/foo/bar/g' -e 's/hello/hi/g' file.txt
# Use semicolon separator
sed 's/foo/bar/g; s/hello/hi/g' file.txt
# Use script file
cat << 'EOF' > script.sed
s/foo/bar/g
s/hello/hi/g
/^$/d
EOF
sed -f script.sed file.txt
Address Ranges
# From pattern to pattern
sed '/START/,/END/d' file.txt
# From pattern to line number
sed '/BEGIN/,10d' file.txt
# Every nth line
sed -n '1~2p' file.txt # Print odd lines (first~step is a GNU sed extension)
# Last line
sed -n '$p' file.txt
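GNU sed also supports relative addresses: `addr,+N` selects a match plus the N lines that follow it, useful for pulling context around a hit:

```shell
# GNU sed: print a matching line plus the 2 lines after it
printf 'x\nERROR here\ndetail 1\ndetail 2\ny\n' | sed -n '/ERROR/,+2p'
# Output:
# ERROR here
# detail 1
# detail 2
```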
Sed: Practical Examples
🔧 Configuration File Management
# Change configuration value
sed -i 's/^DEBUG=.*/DEBUG=true/' config.sh
# Comment out a line
sed -i 's/^max_connections/# max_connections/' postgresql.conf
# Uncomment a line
sed -i 's/^# *\(max_connections\)/\1/' postgresql.conf
# Update port number
sed -i 's/port=[0-9]*/port=8080/' config.ini
📝 Log File Processing
# Extract ERROR lines
sed -n '/ERROR/p' application.log
# Remove timestamps
sed 's/^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9:]\+ //' log.txt
# Keep only IP addresses
sed -n 's/.*\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\).*/\1/p' access.log
🔄 Text Transformation
# Convert Windows line endings to Unix
sed -i 's/\r$//' file.txt
# Add line numbers
sed = file.txt | sed 'N;s/\n/\t/'
# Reverse lines (tac alternative)
sed '1!G;h;$!d' file.txt
# Double-space a file
sed 'G' file.txt
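The reverse conversion, Unix to Windows line endings, is a one-liner as well (`\r` in the replacement is a GNU sed feature):

```shell
# Convert Unix line endings back to Windows (append \r to each line)
printf 'line1\nline2\n' | sed 's/$/\r/'
```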
Introduction to Awk
Awk is a powerful programming language designed for text processing and data extraction. It excels at processing columnar data, generating reports, and performing complex text manipulations.
When to Use Awk
- Column-based data
- Mathematical operations
- Complex conditionals
- Report generation
Basic Syntax
awk 'pattern { action }' file
awk -F: '{ print $1 }' /etc/passwd
Awk: Basic Concepts
Fields and Records
# $0 = entire line
# $1, $2, $3... = fields (columns)
# NF = number of fields
# NR = record (line) number
# Print first column
awk '{ print $1 }' file.txt
# Print last column
awk '{ print $NF }' file.txt
# Print line number and first field
awk '{ print NR, $1 }' file.txt
# Custom field separator
awk -F: '{ print $1 }' /etc/passwd
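`FS` and `OFS` can also be set inside the program: `FS` controls how input is split, and fields separated by commas in `print` are joined with `OFS`. A sketch:

```shell
# Set input and output separators in a BEGIN block
echo "root:x:0:0" | awk 'BEGIN { FS=":"; OFS=" | " } { print $1, $3 }'
# Output: root | 0
```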
Pattern Matching
# Print lines matching pattern
awk '/error/ { print }' log.txt
# Print lines where column 3 > 100
awk '$3 > 100' data.txt
# Print lines where column 2 equals "active"
awk '$2 == "active" { print $1 }' status.txt
# Regex matching on specific field
awk '$1 ~ /^[0-9]+$/ { print }' file.txt
# NOT matching
awk '$2 !~ /test/ { print }' file.txt
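Awk also supports range patterns, analogous to sed's `addr1,addr2` — the action fires from one match through the next:

```shell
# Range pattern: print from the START line through the STOP line
printf 'a\nSTART\nb\nSTOP\nc\n' | awk '/START/,/STOP/'
# Output:
# START
# b
# STOP
```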
Awk: Advanced Features
BEGIN and END Blocks
# BEGIN executes before processing
# END executes after processing
awk 'BEGIN { print "Report Start" }
{ total += $1 }
END { print "Total:", total }' numbers.txt
# Print header and footer
awk 'BEGIN { print "Name\tAge" }
{ print $1 "\t" $2 }
END { print "---End---" }' data.txt
Variables and Calculations
# Sum column values
awk '{ sum += $3 } END { print sum }' sales.txt
# Average
awk '{ sum += $1; count++ }
END { print sum/count }' numbers.txt
# Count occurrences
awk '{ count[$1]++ }
END { for (item in count) print item, count[item] }' data.txt
# Built-in variables
awk '{ print "Line", NR, "has", NF, "fields" }' file.txt
Conditional Statements
# If-else
awk '{ if ($3 > 100)
print $1, "high"
else
print $1, "low"
}' data.txt
# Multiple conditions
awk '{ if ($2 == "active" && $3 > 50)
print $1, "good"
else if ($2 == "inactive")
print $1, "offline"
else
print $1, "check"
}' status.txt
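For simple two-way choices, the ternary operator condenses an if/else into a single expression:

```shell
# condition ? value_if_true : value_if_false
printf '150\n30\n' | awk '{ print $1, ($1 > 100 ? "high" : "low") }'
# Output:
# 150 high
# 30 low
```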
Functions
# String functions
awk '{ print length($0) }' file.txt # Length
awk '{ print toupper($1) }' file.txt # Uppercase
awk '{ print tolower($1) }' file.txt # Lowercase
awk '{ print substr($1, 1, 3) }' file.txt # Substring
# Math functions
awk '{ print sqrt($1) }' numbers.txt # Square root
awk '{ print int($1) }' numbers.txt # Integer part
# String matching
awk '{ if (match($0, /[0-9]+/)) print }' file.txt
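Two more workhorses: `gsub` edits a string in place and returns the number of replacements, and `split` breaks a string into an array:

```shell
# gsub replaces every match and returns the replacement count
echo "a-b-c" | awk '{ n = gsub(/-/, ":"); print $0, "(" n " changed)" }'
# Output: a:b:c (2 changed)

# split breaks a string into array elements d[1], d[2], ...
echo "2024-01-15" | awk '{ split($0, d, "-"); print d[1], d[2], d[3] }'
# Output: 2024 01 15
```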
Awk: Practical Examples
📊 Data Analysis
# Calculate statistics
awk 'BEGIN { max = 0; min = 999999 }
{
sum += $1
if ($1 > max) max = $1
if ($1 < min) min = $1
count++
}
END {
print "Count:", count
print "Sum:", sum
print "Average:", sum/count
print "Max:", max
print "Min:", min
}' numbers.txt
# Group by column and sum
awk '{ sales[$1] += $2 }
END { for (name in sales)
print name, sales[name] }' sales.txt
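One caveat with `for (key in array)`: the iteration order is unspecified, so pipe the output through `sort` when the report should be ranked. A sketch with inline data:

```shell
# Group-by-and-sum, then rank by total (field 2, numeric, descending)
printf 'alice 10\nbob 5\nalice 7\n' |
  awk '{ s[$1] += $2 } END { for (n in s) print n, s[n] }' |
  sort -k2,2rn
# Output:
# alice 17
# bob 5
```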
📋 Report Generation
# Format as table
awk 'BEGIN { printf "%-20s %-10s %10s\n", "Name", "Status", "Value"
print "-------------------------------------------" }
{ printf "%-20s %-10s %10.2f\n", $1, $2, $3 }
END { print "===========================================" }' data.txt
# CSV to formatted output
awk -F, 'NR==1 {
for (i=1; i<=NF; i++) header[i]=$i
}
NR>1 {
for (i=1; i<=NF; i++) print header[i]": "$i
print "---"
}' data.csv
🔍 Log Analysis
# Count HTTP status codes ($9 is the status field in common log format)
awk '{ status[$9]++ }
END { for (code in status)
print code, status[code] }' access.log
# Find top 10 IP addresses
awk '{ ip[$1]++ }
END { for (addr in ip)
print ip[addr], addr }' access.log | sort -rn | head -10
# Calculate average response time
awk '{ sum += $NF; count++ }
END { print "Avg response:", sum/count, "ms" }' response.log
Combining Sed and Awk
Sed and Awk work great together in pipelines:
# Clean and process data
# (put comments after a trailing |, never after \, or the continuation breaks)
sed 's/[[:space:]]\+/ /g' data.txt |  # Normalize spaces
sed '/^$/d' |                         # Remove empty lines
awk '{ print $1, $2 }'                # Extract columns
# Extract and calculate
grep "SALE" transactions.log | \
sed 's/.*amount=\([0-9.]*\).*/\1/' | \
awk '{ sum += $1 } END { print "Total: $"sum }'
# Complex pipeline
sed -n '/ 200 /p' access.log |  # Only successful requests
awk '{ print $1, $7 }' |        # IP and path
sort |
uniq -c |
sort -rn |
head -20
Quick Reference
Sed Commands
s/old/new/ - Substitute
d - Delete
p - Print
i\ - Insert
a\ - Append
c\ - Change
Awk Built-ins
$0 - Entire line
$1, $2... - Fields
NF - Number of fields
NR - Line number
FS - Field separator
OFS - Output field separator
💡 Pro Tip: Use sed for simple line-based operations and awk for column-based data processing. Combine them in pipelines for powerful text processing workflows!