Python for Security Automation

Write the scripts that security professionals actually use.

🟡 Intermediate ⏱️ 2.5 hours Cybersecurity / Programming

Learn how to automate cybersecurity tasks with Python. Build a port scanner, log analyzer, and hash verification tool — practical security automation scripts used in real cybersecurity workflows.

What You're Building

You'll build three practical security automation tools from scratch:

Tool 1 — Port Scanner: Scans a target host to identify open ports and the services running on them. A fundamental reconnaissance tool in any security workflow.

Tool 2 — Log Analyzer: Parses server access logs, identifies suspicious patterns — repeated failed logins, unusual request volumes, specific error codes — and generates a structured summary report.

Tool 3 — Hash Verification Utility: Computes and compares file hashes (MD5, SHA-256) to verify file integrity — confirming whether a downloaded file or configuration has been tampered with.

These aren't toy scripts. Each tool solves a real problem that security professionals encounter regularly. By the end, you'll understand how Python applies directly to security work — and have three working tools you can adapt and extend.

Before You Start

What you need installed:

  • Python 3.8 or later — verify with python3 --version in your terminal
  • pip — Python's package manager (included with Python)
  • A text editor — VS Code recommended

What you should know:

  • Python variables, functions, and loops
  • Basic command line navigation
  • How to install Python packages with pip

If you're newer to Python, follow the "Python for Beginners" blog post first and come back here.

Project setup:

mkdir security-automation
cd security-automation

All three tools will live in this folder.

Tool 1: Port Scanner

What Is Port Scanning?

Every networked service on a computer communicates through a port — a numbered endpoint (0–65535) that routes traffic to the right service. HTTP runs on port 80. HTTPS on 443. SSH on 22. FTP on 21.

A port scanner probes a target host to determine which ports are open — which services are accepting connections. Security professionals use port scanning to:

  • Inventory the services running on their own infrastructure
  • Identify unexpected open ports that represent attack surface
  • Verify firewall rules are working as intended

Important: Only scan systems you own or have explicit written permission to scan. Unauthorized port scanning is illegal in many jurisdictions. This tutorial is for learning on your own systems or designated lab environments.

Building the Scanner

Create port_scanner.py:

#!/usr/bin/env python3
"""
Port Scanner — Security Automation Tool 1
Scans a target host for open ports and identifies common services.
"""

import socket
import concurrent.futures
import argparse
import sys
from datetime import datetime

# Common ports and their associated services
COMMON_SERVICES = {
    21: "FTP",
    22: "SSH",
    23: "Telnet",
    25: "SMTP",
    53: "DNS",
    80: "HTTP",
    110: "POP3",
    143: "IMAP",
    443: "HTTPS",
    445: "SMB",
    3306: "MySQL",
    3389: "RDP",
    5432: "PostgreSQL",
    6379: "Redis",
    8080: "HTTP-Alt",
    8443: "HTTPS-Alt",
    27017: "MongoDB"
}

def resolve_host(target):
    """Resolve hostname to IP address."""
    try:
        ip = socket.gethostbyname(target)
        return ip
    except socket.gaierror:
        print(f"[ERROR] Could not resolve hostname: {target}")
        sys.exit(1)

def scan_port(ip, port, timeout=1.0):
    """
    Attempt to connect to a specific port.
    Returns the port number if open, None if closed or filtered.
    """
    try:
        # Create a TCP socket
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        
        # Attempt connection — returns 0 if successful
        result = sock.connect_ex((ip, port))
        sock.close()
        
        if result == 0:
            return port
        return None
        
    except socket.error:
        return None

def get_service_name(port):
    """Return the service name for known ports, or 'Unknown' for others."""
    return COMMON_SERVICES.get(port, "Unknown")

def run_scan(target, ports, max_threads=100, timeout=1.0):
    """
    Run the port scan using concurrent threads for speed.
    """
    ip = resolve_host(target)
    
    print("\n" + "="*55)
    print(f"  PORT SCANNER — Security Automation Tool")
    print("="*55)
    print(f"  Target:    {target} ({ip})")
    print(f"  Ports:     {min(ports)} - {max(ports)}")
    print(f"  Threads:   {max_threads}")
    print(f"  Started:   {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*55 + "\n")
    
    open_ports = []
    
    # ThreadPoolExecutor runs multiple scans simultaneously
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_threads) as executor:
        # Submit all port scans concurrently
        futures = {
            executor.submit(scan_port, ip, port, timeout): port 
            for port in ports
        }
        
        for future in concurrent.futures.as_completed(futures):
            result = future.result()
            if result:
                open_ports.append(result)
    
    # Sort and display results
    open_ports.sort()
    
    if open_ports:
        print(f"  {'PORT':<10} {'STATE':<12} {'SERVICE'}")
        print(f"  {'-'*40}")
        for port in open_ports:
            service = get_service_name(port)
            print(f"  {port:<10} {'OPEN':<12} {service}")
    else:
        print("  No open ports found in the specified range.")
    
    print("\n" + "="*55)
    print(f"  Scan complete. {len(open_ports)} open port(s) found.")
    print(f"  Finished:  {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*55 + "\n")
    
    return open_ports

def parse_port_range(port_string):
    """Parse port range string like '1-1024' or '80,443,22'."""
    ports = []
    
    if ',' in port_string:
        # Comma-separated list: "80,443,22"
        ports = [int(p.strip()) for p in port_string.split(',')]
    elif '-' in port_string:
        # Range: "1-1024"
        start, end = port_string.split('-')
        ports = list(range(int(start), int(end) + 1))
    else:
        # Single port
        ports = [int(port_string)]
    
    return ports

def main():
    parser = argparse.ArgumentParser(
        description='Security Port Scanner',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python3 port_scanner.py localhost -p 1-1024
  python3 port_scanner.py 192.168.1.1 -p 80,443,22,3306
  python3 port_scanner.py scanme.nmap.org -p 1-100
        """
    )
    
    parser.add_argument('target', help='Target hostname or IP address')
    parser.add_argument('-p', '--ports', 
                       default='1-1024',
                       help='Port range (e.g., 1-1024 or 80,443,22)')
    parser.add_argument('-t', '--threads',
                       type=int, default=100,
                       help='Number of concurrent threads (default: 100)')
    parser.add_argument('--timeout',
                       type=float, default=1.0,
                       help='Connection timeout in seconds (default: 1.0)')
    
    args = parser.parse_args()
    ports = parse_port_range(args.ports)
    
    run_scan(args.target, ports, args.threads, args.timeout)

if __name__ == '__main__':
    main()

What the Code Does — Explained

socket.socket(socket.AF_INET, socket.SOCK_STREAM) creates a TCP socket. AF_INET specifies IPv4. SOCK_STREAM specifies TCP (a connection-oriented protocol). We're essentially creating the same kind of connection your browser makes when it loads a webpage.

sock.connect_ex((ip, port)) attempts the connection and returns 0 if it succeeds (port is open) or an error code if it fails (port is closed or filtered). Unlike connect(), connect_ex() doesn't raise an exception on failure — it just returns the error code, which is what we want for a scanner.
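If you want to see both return values of connect_ex() before running the full scanner, this small self-contained sketch starts a throwaway listener on a free loopback port, probes it while it's open, then probes again after closing it:

```python
import socket

# Start a listener so we have a known-open port to probe
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.settimeout(1.0)
open_result = probe.connect_ex(("127.0.0.1", port))    # 0: something is listening
probe.close()

server.close()                          # now nothing is listening on that port

probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.settimeout(1.0)
closed_result = probe.connect_ex(("127.0.0.1", port))  # nonzero error code
probe.close()

print(open_result, closed_result)
```

Note how no exception is raised in either case; the error code is just data, which is exactly what a scanner wants.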

concurrent.futures.ThreadPoolExecutor runs many port scans simultaneously using threads. Without threading, scanning 1,024 ports sequentially could take up to 1,024 seconds with a 1-second timeout, since each closed or filtered port waits out the full timeout. With 100 threads, the same scan finishes in roughly 10 seconds. Threading is essential for any network tool.
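The speedup is easy to demonstrate in isolation. This sketch uses a hypothetical probe function (a sleep standing in for a network round-trip, not the real scanner) to show that 20 pooled tasks finish in about the time of one:

```python
import concurrent.futures
import time

def probe(port):
    time.sleep(0.05)   # stand-in for a 50 ms network round-trip
    return port

start = time.monotonic()
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    results = list(executor.map(probe, range(20)))
elapsed = time.monotonic() - start

# Sequentially this would take about 1 second; pooled, it takes about 0.05 s
print(f"{elapsed:.2f}s for {len(results)} tasks")
```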

argparse provides a professional command-line interface with help text, argument validation, and default values — the same pattern used in real security tools.

Run the Scanner

Scan your own machine on the first 1,024 ports:

python3 port_scanner.py localhost -p 1-1024

Scan specific ports:

python3 port_scanner.py localhost -p 22,80,443,3000,3306,5432

The Nmap project provides scanme.nmap.org as a legal scan target for testing:

python3 port_scanner.py scanme.nmap.org -p 1-100

✅ Checkpoint: The scanner runs, completes, and displays a formatted table of open ports with their service names.

Tool 2: Log Analyzer

What Is Log Analysis?

Web servers, applications, firewalls, and operating systems generate logs — timestamped records of every event that occurs. Log analysis is the process of parsing those records to identify patterns, anomalies, and potential security incidents.

Security professionals analyze logs to:

  • Identify brute-force login attempts (many failed authentications from one IP)
  • Detect unusual traffic patterns that may indicate scanning or exploitation
  • Investigate incidents by reconstructing what happened and when
  • Monitor for specific error codes that indicate attacks (SQL injection attempts, path traversal)
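To make the parsing step concrete before generating test data: a single line of Apache/Nginx combined log format can be pulled apart with a named-group regex. This is a simplified sketch of the kind of pattern the analyzer below uses:

```python
import re

LINE = '203.0.113.42 - - [10/Oct/2024:13:55:36 +0000] "GET /login HTTP/1.1" 401 521 "-" "curl/7.81.0"'

# Simplified combined-log pattern: IP, timestamp, request line, status, size
pattern = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

fields = pattern.match(LINE).groupdict()
print(fields["ip"], fields["method"], fields["path"], fields["status"])
```

Each `(?P<name>...)` group captures one field by name, so `groupdict()` hands back a ready-to-use dictionary instead of positional tuple indices.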

Generating Sample Log Data

Create generate_logs.py to generate realistic test data:

#!/usr/bin/env python3
"""Generates sample web server access logs for testing the analyzer."""

import random
from datetime import datetime, timedelta

# Realistic IP pool — mostly legitimate, a few suspicious
IPS = [
    "192.168.1.100", "192.168.1.101", "10.0.0.45",
    "172.16.0.88", "203.0.113.42", "198.51.100.7",
    # Suspicious IPs that will generate lots of failed requests
    "45.33.32.156", "45.33.32.156", "45.33.32.156",  # repeated = high volume
    "23.92.24.22", "23.92.24.22",
]

PATHS = [
    "/", "/about", "/contact", "/products", "/api/users",
    "/api/products", "/login", "/dashboard",
    # Suspicious paths
    "/admin", "/wp-login.php", "/.env", "/etc/passwd",
    "/api/users?id=1 OR 1=1",  # SQL injection attempt
    "/../../../etc/shadow",  # Path traversal attempt
]

METHODS = ["GET", "GET", "GET", "POST", "GET"]  # Weighted toward GET
STATUS_CODES = [200, 200, 200, 200, 301, 404, 403, 500, 401]  # Weighted toward 200

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "curl/7.81.0",
    "python-requests/2.28.0",
    "sqlmap/1.7",  # Known attack tool
    "Nikto/2.1.6",  # Known scanner
]

def generate_logs(filename="access.log", count=500):
    base_time = datetime.now() - timedelta(hours=2)
    
    with open(filename, 'w') as f:
        for i in range(count):
            ip = random.choice(IPS)
            timestamp = base_time + timedelta(seconds=i*14)
            method = random.choice(METHODS)
            path = random.choice(PATHS)
            status = random.choice(STATUS_CODES)
            size = random.randint(200, 15000)
            agent = random.choice(USER_AGENTS)
            
            # Apache Combined Log Format
            log_line = (
                f'{ip} - - [{timestamp.strftime("%d/%b/%Y:%H:%M:%S +0000")}] '
                f'"{method} {path} HTTP/1.1" {status} {size} '
                f'"-" "{agent}"\n'
            )
            f.write(log_line)
    
    print(f"Generated {count} log entries in {filename}")

if __name__ == '__main__':
    generate_logs()

Run it:

python3 generate_logs.py

This creates access.log with 500 realistic log entries.

Building the Analyzer

Create log_analyzer.py:

#!/usr/bin/env python3
"""
Log Analyzer — Security Automation Tool 2
Parses web server access logs and generates a security-focused summary report.
"""

import re
import argparse
from collections import defaultdict, Counter
from datetime import datetime

# Regex pattern for Apache/Nginx Combined Log Format
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+)'                # IP address
    r' \S+ \S+ '                  # ident and auth (usually -)
    r'\[(?P<timestamp>[^\]]+)\]'  # timestamp
    r' "(?P<method>\S+)'          # HTTP method
    r' (?P<path>.*?)'             # requested path (may contain spaces)
    r' \S+" '                     # protocol
    r'(?P<status>\d{3})'          # status code
    r' (?P<size>\S+)'             # response size
    r' "[^"]*"'                   # referer
    r' "(?P<agent>[^"]*)"'        # user agent
)

# Patterns that indicate potential attacks or reconnaissance
SUSPICIOUS_PATTERNS = {
    'sql_injection': [
        r'union\s+select', r'or\s+1\s*=\s*1', r'drop\s+table',
        r'insert\s+into', r'exec\s*\(', r"'--", r'xp_cmdshell'
    ],
    'path_traversal': [
        r'\.\./\.\.',  r'%2e%2e', r'etc/passwd', r'etc/shadow',
        r'windows/system32'
    ],
    'sensitive_files': [
        r'\.env', r'\.git/', r'wp-login\.php', r'phpMyAdmin',
        r'\.htaccess', r'\.config', r'config\.php'
    ],
    'known_scanners': [
        r'sqlmap', r'nikto', r'nmap', r'masscan',
        r'zgrab', r'nuclei', r'dirbuster'
    ]
}

def parse_log_line(line):
    """Parse a single log line and return a dictionary of fields."""
    match = LOG_PATTERN.match(line.strip())
    if match:
        return match.groupdict()
    return None

def detect_suspicious_activity(path, agent):
    """Check a request path and user agent for suspicious patterns."""
    findings = []
    combined = (path + " " + agent).lower()
    
    for category, patterns in SUSPICIOUS_PATTERNS.items():
        for pattern in patterns:
            if re.search(pattern, combined, re.IGNORECASE):
                findings.append(category)
                break  # One match per category is enough
    
    return findings

def analyze_logs(filepath):
    """Parse the log file and collect statistics."""
    stats = {
        'total_requests': 0,
        'parse_errors': 0,
        'ip_requests': Counter(),
        'ip_errors': defaultdict(int),
        'status_codes': Counter(),
        'top_paths': Counter(),
        'suspicious_ips': defaultdict(list),
        'agent_counts': Counter(),
        'error_401_ips': Counter(),  # Unauthorized — possible brute force
    }
    
    with open(filepath, 'r') as f:
        for line in f:
            parsed = parse_log_line(line)
            
            if not parsed:
                stats['parse_errors'] += 1
                continue
            
            stats['total_requests'] += 1
            ip = parsed['ip']
            status = int(parsed['status'])
            path = parsed['path']
            agent = parsed['agent']
            
            # Count requests per IP
            stats['ip_requests'][ip] += 1
            
            # Count status codes
            stats['status_codes'][status] += 1
            
            # Count top paths
            stats['top_paths'][path] += 1
            
            # Count user agents
            stats['agent_counts'][agent[:50]] += 1
            
            # Track 401 Unauthorized by IP (brute force indicator)
            if status == 401:
                stats['error_401_ips'][ip] += 1
            
            # Track error responses per IP
            if status >= 400:
                stats['ip_errors'][ip] += 1
            
            # Check for suspicious patterns
            suspicious = detect_suspicious_activity(path, agent)
            if suspicious:
                stats['suspicious_ips'][ip].extend(suspicious)
    
    return stats

def generate_report(stats, filepath):
    """Format and print the analysis report."""
    print("\n" + "="*60)
    print("  SECURITY LOG ANALYSIS REPORT")
    print("="*60)
    print(f"  File analyzed: {filepath}")
    print(f"  Generated:     {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*60)
    
    # Overview
    print(f"\n📊 OVERVIEW")
    print(f"  Total requests:    {stats['total_requests']:,}")
    print(f"  Parse errors:      {stats['parse_errors']}")
    print(f"  Unique IPs:        {len(stats['ip_requests'])}")
    
    # Status code breakdown
    print(f"\n📈 STATUS CODES")
    for code, count in sorted(stats['status_codes'].items()):
        category = "✓ OK" if code < 400 else ("⚠ Client Error" if code < 500 else "✗ Server Error")
        print(f"  {code} {category:<20} {count:>6} requests")
    
    # High-volume IPs
    print(f"\n🔝 TOP 5 IPs BY REQUEST VOLUME")
    for ip, count in stats['ip_requests'].most_common(5):
        error_count = stats['ip_errors'].get(ip, 0)
        error_rate = (error_count / count * 100) if count > 0 else 0
        print(f"  {ip:<20} {count:>5} requests  |  {error_rate:.1f}% errors")
    
    # Brute force indicators
    brute_force_candidates = [
        (ip, count) for ip, count in stats['error_401_ips'].items() 
        if count >= 5
    ]
    if brute_force_candidates:
        print(f"\n🔐 POSSIBLE BRUTE FORCE ATTEMPTS (5+ 401 errors)")
        for ip, count in sorted(brute_force_candidates, key=lambda x: x[1], reverse=True):
            print(f"  ⚠  {ip:<20} {count} unauthorized requests")
    
    # Suspicious activity
    if stats['suspicious_ips']:
        print(f"\n🚨 SUSPICIOUS ACTIVITY DETECTED")
        for ip, findings in stats['suspicious_ips'].items():
            unique_findings = list(set(findings))
            print(f"  ⚠  {ip}")
            for finding in unique_findings:
                print(f"       → {finding.replace('_', ' ').title()}")
    else:
        print(f"\n✅ No suspicious patterns detected in requests")
    
    # Top requested paths
    print(f"\n🛤  TOP 5 REQUESTED PATHS")
    for path, count in stats['top_paths'].most_common(5):
        display_path = path[:50] + "..." if len(path) > 50 else path
        print(f"  {count:>5}  {display_path}")
    
    print("\n" + "="*60)
    print("  End of report")
    print("="*60 + "\n")

def main():
    parser = argparse.ArgumentParser(
        description='Security Log Analyzer',
        epilog='Example: python3 log_analyzer.py access.log'
    )
    parser.add_argument('logfile', help='Path to the log file to analyze')
    args = parser.parse_args()
    
    print(f"\nAnalyzing log file: {args.logfile}")
    
    stats = analyze_logs(args.logfile)
    generate_report(stats, args.logfile)

if __name__ == '__main__':
    main()

Run the Analyzer

python3 log_analyzer.py access.log

What you should see: A structured security report showing request volumes, status code distribution, high-volume IPs with error rates, brute force candidates, suspicious activity by type and IP, and top requested paths.

What the code does — explained:

LOG_PATTERN = re.compile(...) — Pre-compiling the regex at module level is a performance optimization. The pattern is compiled once and reused for every line instead of being recompiled 500 times inside the loop.

defaultdict and Counter from the collections module are specialized dictionaries. Counter automatically counts occurrences. defaultdict(int) starts every key at 0, preventing KeyError when accessing a key for the first time.
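A quick illustration of both types, independent of the analyzer:

```python
from collections import Counter, defaultdict

requests = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]

hits = Counter(requests)            # counts occurrences automatically
top = hits.most_common(1)           # most frequent IP with its count

errors = defaultdict(int)
errors["10.0.0.9"] += 1             # no KeyError on first access

print(top, dict(errors))
```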

detect_suspicious_activity checks each request against multiple lists of known malicious patterns — SQL injection strings, path traversal sequences, sensitive file names, and known security tool user agents. This is a simplified version of the pattern-matching logic used in real Web Application Firewalls (WAFs) and SIEM (Security Information and Event Management) tools.
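The core of that logic fits in a few lines. This standalone sketch, with a deliberately tiny signature list for illustration, shows the shape of the check:

```python
import re

# A deliberately tiny signature list; real WAF rule sets are far larger
SQLI_PATTERNS = [r"union\s+select", r"or\s+1\s*=\s*1"]

def looks_like_sqli(path):
    return any(re.search(p, path, re.IGNORECASE) for p in SQLI_PATTERNS)

print(looks_like_sqli("/api/users?id=1 OR 1=1"))  # True
print(looks_like_sqli("/about"))                  # False
```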

✅ Checkpoint: The analyzer produces a complete, formatted security report with suspicious activity flagged by category and IP.

Tool 3: Hash Verification Utility

What Is Hash Verification?

A cryptographic hash is a fixed-length fingerprint of a file — computed by running the file's contents through a hash function like MD5 or SHA-256. The same file always produces the same hash. Change even one bit of the file and the hash changes completely. Note that MD5 (and SHA-1) are cryptographically broken: an attacker can deliberately craft two different files with the same hash. Prefer SHA-256 for security decisions and treat MD5 as a legacy checksum.

Security professionals use hash verification to:

  • Confirm downloaded files haven't been tampered with (malware often modifies legitimate tools)
  • Verify configuration files haven't been altered
  • Detect file integrity violations in incident response
  • Support chain-of-custody verification in digital forensics
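The "change one bit, change everything" property is easy to see for yourself: hash two inputs that differ by a single character and compare the digests.

```python
import hashlib

a = hashlib.sha256(b"hello world").hexdigest()
b = hashlib.sha256(b"hello worle").hexdigest()   # one character changed

print(a)
print(b)
# Same length (64 hex characters), but the digests share almost nothing
```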

Building the Utility

Create hash_verify.py:

#!/usr/bin/env python3
"""
Hash Verification Utility — Security Automation Tool 3
Computes and verifies cryptographic hashes for file integrity checking.
"""

import hashlib
import argparse
import sys
import os
from pathlib import Path
from datetime import datetime

SUPPORTED_ALGORITHMS = ['md5', 'sha1', 'sha256', 'sha512']

def compute_hash(filepath, algorithm='sha256', buffer_size=65536):
    """
    Compute the cryptographic hash of a file.
    Uses buffered reading for memory-efficient handling of large files.
    """
    algorithm = algorithm.lower()
    
    if algorithm not in SUPPORTED_ALGORITHMS:
        print(f"[ERROR] Unsupported algorithm: {algorithm}")
        print(f"        Supported: {', '.join(SUPPORTED_ALGORITHMS)}")
        sys.exit(1)
    
    # Create hash object for the specified algorithm
    hash_obj = hashlib.new(algorithm)
    
    try:
        file_size = os.path.getsize(filepath)
        
        with open(filepath, 'rb') as f:
            while chunk := f.read(buffer_size):
                hash_obj.update(chunk)
        
        return hash_obj.hexdigest(), file_size
        
    except FileNotFoundError:
        print(f"[ERROR] File not found: {filepath}")
        sys.exit(1)
    except PermissionError:
        print(f"[ERROR] Permission denied: {filepath}")
        sys.exit(1)

def format_file_size(size_bytes):
    """Human-readable file size."""
    for unit in ['B', 'KB', 'MB', 'GB']:
        if size_bytes < 1024:
            return f"{size_bytes:.1f} {unit}"
        size_bytes /= 1024
    return f"{size_bytes:.1f} TB"

def compute_and_display(filepath, algorithms):
    """Compute and display hashes for a file using multiple algorithms."""
    print("\n" + "="*65)
    print("  HASH VERIFICATION UTILITY")
    print("="*65)
    print(f"  File:      {filepath}")
    print(f"  Computed:  {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    
    # Compute with first algorithm to get file size
    first_hash, file_size = compute_hash(filepath, algorithms[0])
    print(f"  Size:      {format_file_size(file_size)}")
    print("="*65)
    
    results = {}
    
    # Display first result
    print(f"\n  {algorithms[0].upper():<10}  {first_hash}")
    results[algorithms[0]] = first_hash
    
    # Compute remaining algorithms
    for algo in algorithms[1:]:
        hash_value, _ = compute_hash(filepath, algo)
        print(f"  {algo.upper():<10}  {hash_value}")
        results[algo] = hash_value
    
    print()
    return results

def verify_hash(filepath, expected_hash, algorithm='sha256'):
    """Verify a file against a known hash value."""
    print("\n" + "="*65)
    print("  HASH VERIFICATION — INTEGRITY CHECK")
    print("="*65)
    print(f"  File:      {filepath}")
    print(f"  Algorithm: {algorithm.upper()}")
    print(f"  Expected:  {expected_hash}")
    print("="*65)
    
    computed, file_size = compute_hash(filepath, algorithm)
    
    print(f"\n  File size:  {format_file_size(file_size)}")
    print(f"  Expected:   {expected_hash}")
    print(f"  Computed:   {computed}")
    
    if computed.lower() == expected_hash.lower():
        print("\n  ✅ INTEGRITY VERIFIED — Hash matches. File is authentic.")
    else:
        print("\n  🚨 INTEGRITY FAILURE — Hash mismatch. File may be corrupted or tampered with.")
        
        # Show where hashes differ
        expected_lower = expected_hash.lower()
        min_len = min(len(computed), len(expected_lower))
        diff_positions = [
            i for i in range(min_len)
            if computed[i] != expected_lower[i]
        ]
        if diff_positions:
            print(f"     First difference at character position: {diff_positions[0]}")
    
    print("="*65 + "\n")
    
    return computed.lower() == expected_hash.lower()

def scan_directory(directory, algorithm='sha256'):
    """
    Compute hashes for all files in a directory.
    Useful for creating a baseline integrity snapshot.
    """
    dir_path = Path(directory)
    
    if not dir_path.is_dir():
        print(f"[ERROR] Not a directory: {directory}")
        sys.exit(1)
    
    print("\n" + "="*65)
    print("  DIRECTORY HASH SCAN")
    print("="*65)
    print(f"  Directory: {directory}")
    print(f"  Algorithm: {algorithm.upper()}")
    print(f"  Scanned:   {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    print("="*65 + "\n")
    
    results = {}
    files = list(dir_path.rglob('*'))
    file_count = 0
    
    for file_path in files:
        if file_path.is_file():
            try:
                hash_value, size = compute_hash(str(file_path), algorithm)
                relative_path = file_path.relative_to(dir_path)
                results[str(relative_path)] = hash_value
                print(f"  {hash_value[:16]}...  {format_file_size(size):<10}  {relative_path}")
                file_count += 1
            except Exception as e:
                print(f"  [SKIP] {file_path.name} — {e}")
    
    print(f"\n  Scanned {file_count} file(s).")
    print("="*65 + "\n")
    
    return results

def main():
    parser = argparse.ArgumentParser(
        description='Hash Verification Utility — File Integrity Checking',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Compute SHA-256 hash of a file
  python3 hash_verify.py compute suspicious_file.exe

  # Compute multiple hash types
  python3 hash_verify.py compute downloaded_tool.zip --algo sha256 md5

  # Verify a file against a known hash
  python3 hash_verify.py verify config.json a3f5... --algo sha256

  # Scan all files in a directory
  python3 hash_verify.py scan ./config_directory
        """
    )
    
    subparsers = parser.add_subparsers(dest='command', required=True)
    
    # Compute command
    compute_parser = subparsers.add_parser('compute', help='Compute file hash(es)')
    compute_parser.add_argument('filepath', help='File to hash')
    compute_parser.add_argument('--algo', nargs='+', 
                               default=['sha256'],
                               choices=SUPPORTED_ALGORITHMS,
                               help='Hash algorithm(s) to use')
    
    # Verify command
    verify_parser = subparsers.add_parser('verify', help='Verify file against known hash')
    verify_parser.add_argument('filepath', help='File to verify')
    verify_parser.add_argument('expected_hash', help='Expected hash value')
    verify_parser.add_argument('--algo', default='sha256',
                              choices=SUPPORTED_ALGORITHMS,
                              help='Hash algorithm')
    
    # Scan command
    scan_parser = subparsers.add_parser('scan', help='Hash all files in a directory')
    scan_parser.add_argument('directory', help='Directory to scan')
    scan_parser.add_argument('--algo', default='sha256',
                            choices=SUPPORTED_ALGORITHMS,
                            help='Hash algorithm')
    
    args = parser.parse_args()
    
    if args.command == 'compute':
        compute_and_display(args.filepath, args.algo)
    elif args.command == 'verify':
        result = verify_hash(args.filepath, args.expected_hash, args.algo)
        sys.exit(0 if result else 1)
    elif args.command == 'scan':
        scan_directory(args.directory, args.algo)

if __name__ == '__main__':
    main()

Run the Utility

Compute the hash of any file:

python3 hash_verify.py compute access.log

Compute multiple hash types at once:

python3 hash_verify.py compute access.log --algo sha256 md5 sha512

Verify a file against a known hash:

First, compute the hash and copy the output:

python3 hash_verify.py compute access.log

Then verify (paste the hash you just computed):

python3 hash_verify.py verify access.log [paste-hash-here] --algo sha256

You should see ✅ INTEGRITY VERIFIED. Now open access.log, change one character, save it, and run the verify command again. You'll see 🚨 INTEGRITY FAILURE — proving that any modification, no matter how small, changes the hash completely.

Scan an entire directory:

python3 hash_verify.py scan .

This hashes every file in the current directory — useful for creating an integrity baseline of a configuration directory or a build output.

What the code does — explained:

while chunk := f.read(buffer_size) — The walrus operator (:=) reads a chunk and assigns it to chunk simultaneously. This pattern allows reading large files in chunks without loading the entire file into memory — essential for hashing files that might be gigabytes in size.
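Here is the same chunked-read pattern in isolation, hashing a small temporary file in 4-byte chunks. The result is identical to hashing the whole content in one call:

```python
import hashlib
import os
import tempfile

# Write a small file to hash
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello world")

h = hashlib.sha256()
with open(path, "rb") as f:
    while chunk := f.read(4):   # walrus: read and test in one expression
        h.update(chunk)
os.remove(path)

print(h.hexdigest() == hashlib.sha256(b"hello world").hexdigest())
```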

hashlib.new(algorithm) — Creates a hash object dynamically based on the algorithm name. This is more flexible than using hashlib.sha256() directly and allows the same code to support multiple algorithms.
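The two construction styles are interchangeable; only how the hash object is created differs:

```python
import hashlib

via_new = hashlib.new("sha256", b"abc").hexdigest()  # algorithm chosen at runtime
direct = hashlib.sha256(b"abc").hexdigest()          # algorithm fixed in code

print(via_new == direct)
```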

The directory scan creates a hash baseline — a snapshot of a directory's contents at a point in time. Run it before and after a deployment or configuration change, compare the outputs, and you know exactly what changed and what didn't. This is a core technique in file integrity monitoring.
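A minimal sketch of that comparison step, assuming two {path: hash} dictionaries like the ones scan_directory returns (diff_baselines is a hypothetical helper, not part of the tool above):

```python
def diff_baselines(before, after):
    """Compare two {path: hash} snapshots and report what changed."""
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    return changed, added, removed

before = {"app.conf": "aaa111", "db.conf": "bbb222"}
after = {"app.conf": "aaa111", "db.conf": "ccc333", "extra.conf": "ddd444"}

print(diff_baselines(before, after))
```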

✅ Final Checkpoint: All three tools run, produce correct output, and handle errors gracefully.

What You Just Built — And What You Learned

Three working security automation tools that address real security needs. Skills you practiced across this tutorial:

  • Python networking — sockets, TCP connections, concurrent scanning
  • Regex for log parsing — extracting structured data from unstructured text
  • Pattern matching for threat detection — identifying known attack signatures
  • Cryptographic hashing — computing and verifying file integrity
  • Python CLI design — argparse subcommands, help text, validation
  • Concurrent programming — ThreadPoolExecutor for parallel network operations
  • File I/O — efficient buffered reading for large file handling

These are not conceptual exercises. They're real tools that solve real security problems. Security professionals write scripts like these every day.

Python is the language of cybersecurity automation. These three scripts give you a real foundation, not just to use them, but to understand how to extend and adapt them to your own needs.

Extending These Tools

Ideas for taking each tool further:

Port Scanner: Add service version detection using banner grabbing — connect to an open port and read what the service sends back. Add output to JSON or CSV for reporting. Add ICMP ping check before scanning to confirm the host is up.

Log Analyzer: Add support for different log formats (nginx, Apache, custom). Export the report to HTML or JSON. Add real-time monitoring mode that watches the log file and alerts on new suspicious activity.

Hash Verify: Add a baseline command that saves a directory scan to a JSON file, and a compare command that compares the current state against the baseline — the core of a file integrity monitoring system.

What to Build Next

→ Explore the Cybersecurity Learning Path

→ Read: Python for Beginners — Blog Post

→ Read: Understanding Zero Trust Security

Glossary: Penetration Testing, Vulnerability, Encryption, Malware

→ Next tutorial: Network Traffic Analysis with Python *(coming soon)*