What You're Building
You'll build three practical security automation tools from scratch:
Tool 1 — Port Scanner: Scans a target host to identify open ports and the services running on them. A fundamental reconnaissance tool in any security workflow.
Tool 2 — Log Analyzer: Parses server access logs, identifies suspicious patterns — repeated failed logins, unusual request volumes, specific error codes — and generates a structured summary report.
Tool 3 — Hash Verification Utility: Computes and compares file hashes (MD5, SHA-256) to verify file integrity — confirming whether a downloaded file or configuration has been tampered with.
These aren't toy scripts. Each tool solves a real problem that security professionals encounter regularly. By the end, you'll understand how Python applies directly to security work — and have three working tools you can adapt and extend.
Before You Start
What you need installed:
- Python 3.8 or later — verify with python3 --version in your terminal
- pip — Python's package manager (included with Python)
- A text editor — VS Code recommended
What you should know:
- Python variables, functions, and loops
- Basic command line navigation
- How to install Python packages with pip
If you're newer to Python, follow the "Python for Beginners" blog post first and come back here.
Project setup:
mkdir security-automation
cd security-automation
All three tools will live in this folder.
Tool 1: Port Scanner
What Is Port Scanning?
Every networked service on a computer communicates through a port — a numbered endpoint (0–65535) that routes traffic to the right service. HTTP runs on port 80. HTTPS on 443. SSH on 22. FTP on 21.
A port scanner probes a target host to determine which ports are open — which services are accepting connections. Security professionals use port scanning to:
- Inventory the services running on their own infrastructure
- Identify unexpected open ports that represent attack surface
- Verify firewall rules are working as intended
Important: Only scan systems you own or have explicit written permission to scan. Unauthorized port scanning is illegal in many jurisdictions. This tutorial is for learning on your own systems or designated lab environments.
Building the Scanner
Create port_scanner.py:
#!/usr/bin/env python3
"""
Port Scanner — Security Automation Tool 1
Scans a target host for open ports and identifies common services.
"""
import socket
import concurrent.futures
import argparse
import sys
from datetime import datetime
# Common ports and their associated services
COMMON_SERVICES = {
21: "FTP",
22: "SSH",
23: "Telnet",
25: "SMTP",
53: "DNS",
80: "HTTP",
110: "POP3",
143: "IMAP",
443: "HTTPS",
445: "SMB",
3306: "MySQL",
3389: "RDP",
5432: "PostgreSQL",
6379: "Redis",
8080: "HTTP-Alt",
8443: "HTTPS-Alt",
27017: "MongoDB"
}
def resolve_host(target):
"""Resolve hostname to IP address."""
try:
ip = socket.gethostbyname(target)
return ip
except socket.gaierror:
print(f"[ERROR] Could not resolve hostname: {target}")
sys.exit(1)
def scan_port(ip, port, timeout=1.0):
"""
Attempt to connect to a specific port.
Returns the port number if open, None if closed or filtered.
"""
try:
# Create a TCP socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(timeout)
# Attempt connection — returns 0 if successful
result = sock.connect_ex((ip, port))
sock.close()
if result == 0:
return port
return None
except socket.error:
return None
def get_service_name(port):
"""Return the service name for known ports, or 'Unknown' for others."""
return COMMON_SERVICES.get(port, "Unknown")
def run_scan(target, ports, max_threads=100, timeout=1.0):
"""
Run the port scan using concurrent threads for speed.
"""
ip = resolve_host(target)
print("\n" + "="*55)
print(f" PORT SCANNER — Security Automation Tool")
print("="*55)
print(f" Target: {target} ({ip})")
print(f" Ports: {min(ports)} - {max(ports)}")
print(f" Threads: {max_threads}")
print(f" Started: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*55 + "\n")
open_ports = []
# ThreadPoolExecutor runs multiple scans simultaneously
with concurrent.futures.ThreadPoolExecutor(max_workers=max_threads) as executor:
# Submit all port scans concurrently
futures = {
executor.submit(scan_port, ip, port, timeout): port
for port in ports
}
for future in concurrent.futures.as_completed(futures):
result = future.result()
if result:
open_ports.append(result)
# Sort and display results
open_ports.sort()
if open_ports:
print(f" {'PORT':<10} {'STATE':<12} {'SERVICE'}")
print(f" {'-'*40}")
for port in open_ports:
service = get_service_name(port)
print(f" {port:<10} {'OPEN':<12} {service}")
else:
print(" No open ports found in the specified range.")
print("\n" + "="*55)
print(f" Scan complete. {len(open_ports)} open port(s) found.")
print(f" Finished: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*55 + "\n")
return open_ports
def parse_port_range(port_string):
"""Parse port range string like '1-1024' or '80,443,22'."""
ports = []
if ',' in port_string:
# Comma-separated list: "80,443,22"
ports = [int(p.strip()) for p in port_string.split(',')]
elif '-' in port_string:
# Range: "1-1024"
start, end = port_string.split('-')
ports = list(range(int(start), int(end) + 1))
else:
# Single port
ports = [int(port_string)]
return ports
def main():
parser = argparse.ArgumentParser(
description='Security Port Scanner',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
python3 port_scanner.py localhost -p 1-1024
python3 port_scanner.py 192.168.1.1 -p 80,443,22,3306
python3 port_scanner.py scanme.nmap.org -p 1-100
"""
)
parser.add_argument('target', help='Target hostname or IP address')
parser.add_argument('-p', '--ports',
default='1-1024',
help='Port range (e.g., 1-1024 or 80,443,22)')
parser.add_argument('-t', '--threads',
type=int, default=100,
help='Number of concurrent threads (default: 100)')
parser.add_argument('--timeout',
type=float, default=1.0,
help='Connection timeout in seconds (default: 1.0)')
args = parser.parse_args()
ports = parse_port_range(args.ports)
run_scan(args.target, ports, args.threads, args.timeout)
if __name__ == '__main__':
main()
What the Code Does — Explained
socket.socket(socket.AF_INET, socket.SOCK_STREAM) creates a TCP
socket. AF_INET specifies IPv4. SOCK_STREAM specifies TCP (a
connection-oriented protocol). We're essentially creating the same kind of connection your
browser makes when it loads a webpage.
sock.connect_ex((ip, port)) attempts the connection and returns
0 if it succeeds (port is open) or an error code if it fails (port is closed or
filtered). Unlike connect(), connect_ex() doesn't raise an exception
on failure — it just returns the error code, which is what we want for a scanner.
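To see connect_ex on its own, here's a minimal sketch. It opens a throwaway listener on localhost (so we have a port we know is open) and probes it — the helper name check_port is illustrative, not part of port_scanner.py:

```python
import socket

def check_port(ip, port, timeout=1.0):
    """Return True if a TCP connection to (ip, port) succeeds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    result = sock.connect_ex((ip, port))  # 0 on success, an errno otherwise
    sock.close()
    return result == 0

# Bind a throwaway listener so we have a known-open port to probe.
# Port 0 asks the OS to pick any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
open_port = server.getsockname()[1]

print(check_port("127.0.0.1", open_port))  # True: the listener is accepting
server.close()
```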
concurrent.futures.ThreadPoolExecutor runs many port scans
simultaneously using threads. Without threading, scanning 1,024 closed or filtered ports
sequentially with a 1-second timeout could take up to 1,024 seconds. With 100 threads, the
same scan finishes in roughly 10 seconds. Threading is essential for any network tool.
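The speedup is easy to demonstrate with a stand-in for the network call — the 0.1-second sleep below simulates a connection timeout, and slow_check is a made-up placeholder, not real scanning logic:

```python
import concurrent.futures
import time

def slow_check(n):
    time.sleep(0.1)        # simulate a 0.1-second network timeout
    return n % 7 == 0      # pretend every 7th "port" is open

start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
    results = list(executor.map(slow_check, range(100)))
elapsed = time.time() - start

# Sequentially this would take ~10 seconds; with 50 threads, about 0.2 seconds.
print(f"100 checks in {elapsed:.2f}s, {sum(results)} open")
```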
argparse provides a professional command-line interface with help
text, argument validation, and default values — the same pattern used in real security tools.
Run the Scanner
Scan your own machine on the first 1,024 ports:
python3 port_scanner.py localhost -p 1-1024
Scan specific ports:
python3 port_scanner.py localhost -p 22,80,443,3000,3306,5432
Nmap provides a legal scan target for testing:
python3 port_scanner.py scanme.nmap.org -p 1-100
✅ Checkpoint: The scanner runs, completes, and displays a formatted table of open ports with their service names.
Tool 2: Log Analyzer
What Is Log Analysis?
Web servers, applications, firewalls, and operating systems generate logs — timestamped records of every event that occurs. Log analysis is the process of parsing those records to identify patterns, anomalies, and potential security incidents.
Security professionals analyze logs to:
- Identify brute-force login attempts (many failed authentications from one IP)
- Detect unusual traffic patterns that may indicate scanning or exploitation
- Investigate incidents by reconstructing what happened and when
- Monitor for specific error codes that indicate attacks (SQL injection attempts, path traversal)
Generating Sample Log Data
Create generate_logs.py to generate realistic test data:
#!/usr/bin/env python3
"""Generates sample web server access logs for testing the analyzer."""
import random
from datetime import datetime, timedelta
# Realistic IP pool — mostly legitimate, a few suspicious
IPS = [
"192.168.1.100", "192.168.1.101", "10.0.0.45",
"172.16.0.88", "203.0.113.42", "198.51.100.7",
# Suspicious IPs that will generate lots of failed requests
"45.33.32.156", "45.33.32.156", "45.33.32.156", # repeated = high volume
"23.92.24.22", "23.92.24.22",
]
PATHS = [
"/", "/about", "/contact", "/products", "/api/users",
"/api/products", "/login", "/dashboard",
# Suspicious paths
"/admin", "/wp-login.php", "/.env", "/etc/passwd",
"/api/users?id=1 OR 1=1", # SQL injection attempt
"/../../../etc/shadow", # Path traversal attempt
]
METHODS = ["GET", "GET", "GET", "POST", "GET"] # Weighted toward GET
STATUS_CODES = [200, 200, 200, 200, 301, 404, 403, 500, 401] # Weighted toward 200
USER_AGENTS = [
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
"curl/7.81.0",
"python-requests/2.28.0",
"sqlmap/1.7", # Known attack tool
"Nikto/2.1.6", # Known scanner
]
def generate_logs(filename="access.log", count=500):
base_time = datetime.now() - timedelta(hours=2)
with open(filename, 'w') as f:
for i in range(count):
ip = random.choice(IPS)
timestamp = base_time + timedelta(seconds=i*14)
method = random.choice(METHODS)
path = random.choice(PATHS)
status = random.choice(STATUS_CODES)
size = random.randint(200, 15000)
agent = random.choice(USER_AGENTS)
# Apache Combined Log Format
log_line = (
f'{ip} - - [{timestamp.strftime("%d/%b/%Y:%H:%M:%S +0000")}] '
f'"{method} {path} HTTP/1.1" {status} {size} '
f'"-" "{agent}"\n'
)
f.write(log_line)
print(f"Generated {count} log entries in {filename}")
if __name__ == '__main__':
generate_logs()
Run it:
python3 generate_logs.py
This creates access.log with 500 realistic log entries.
Building the Analyzer
Create log_analyzer.py:
#!/usr/bin/env python3
"""
Log Analyzer — Security Automation Tool 2
Parses web server access logs and generates a security-focused summary report.
"""
import re
import argparse
from collections import defaultdict, Counter
from datetime import datetime
# Regex pattern for Apache/Nginx Combined Log Format
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+)'                 # IP address
    r' \S+ \S+ '                   # ident and auth (usually -)
    r'\[(?P<timestamp>[^\]]+)\]'   # timestamp
    r' "(?P<method>\S+)'           # HTTP method
    r' (?P<path>\S+)'              # requested path
    r' \S+" '                      # protocol
    r'(?P<status>\d{3})'           # status code
    r' (?P<size>\S+)'              # response size
    r' "[^"]*"'                    # referer
    r' "(?P<agent>[^"]*)"'         # user agent
)
# Patterns that indicate potential attacks or reconnaissance
SUSPICIOUS_PATTERNS = {
'sql_injection': [
r'union\s+select', r'or\s+1\s*=\s*1', r'drop\s+table',
r'insert\s+into', r'exec\s*\(', r"'--", r'xp_cmdshell'
],
'path_traversal': [
r'\.\./\.\.', r'%2e%2e', r'etc/passwd', r'etc/shadow',
r'windows/system32'
],
'sensitive_files': [
r'\.env', r'\.git/', r'wp-login\.php', r'phpMyAdmin',
r'\.htaccess', r'\.config', r'config\.php'
],
'known_scanners': [
r'sqlmap', r'nikto', r'nmap', r'masscan',
r'zgrab', r'nuclei', r'dirbuster'
]
}
def parse_log_line(line):
"""Parse a single log line and return a dictionary of fields."""
match = LOG_PATTERN.match(line.strip())
if match:
return match.groupdict()
return None
def detect_suspicious_activity(path, agent):
"""Check a request path and user agent for suspicious patterns."""
findings = []
combined = (path + " " + agent).lower()
for category, patterns in SUSPICIOUS_PATTERNS.items():
for pattern in patterns:
if re.search(pattern, combined, re.IGNORECASE):
findings.append(category)
break # One match per category is enough
return findings
def analyze_logs(filepath):
"""Parse the log file and collect statistics."""
stats = {
'total_requests': 0,
'parse_errors': 0,
'ip_requests': Counter(),
'ip_errors': defaultdict(int),
'status_codes': Counter(),
'top_paths': Counter(),
'suspicious_ips': defaultdict(list),
'agent_counts': Counter(),
'error_401_ips': Counter(), # Unauthorized — possible brute force
}
with open(filepath, 'r') as f:
for line in f:
parsed = parse_log_line(line)
if not parsed:
stats['parse_errors'] += 1
continue
stats['total_requests'] += 1
ip = parsed['ip']
status = int(parsed['status'])
path = parsed['path']
agent = parsed['agent']
# Count requests per IP
stats['ip_requests'][ip] += 1
# Count status codes
stats['status_codes'][status] += 1
# Count top paths
stats['top_paths'][path] += 1
# Count user agents
stats['agent_counts'][agent[:50]] += 1
# Track 401 Unauthorized by IP (brute force indicator)
if status == 401:
stats['error_401_ips'][ip] += 1
# Track error responses per IP
if status >= 400:
stats['ip_errors'][ip] += 1
# Check for suspicious patterns
suspicious = detect_suspicious_activity(path, agent)
if suspicious:
stats['suspicious_ips'][ip].extend(suspicious)
return stats
def generate_report(stats, filepath):
"""Format and print the analysis report."""
print("\n" + "="*60)
print(" SECURITY LOG ANALYSIS REPORT")
print("="*60)
print(f" File analyzed: {filepath}")
print(f" Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*60)
# Overview
print(f"\n📊 OVERVIEW")
print(f" Total requests: {stats['total_requests']:,}")
print(f" Parse errors: {stats['parse_errors']}")
print(f" Unique IPs: {len(stats['ip_requests'])}")
# Status code breakdown
print(f"\n📈 STATUS CODES")
for code, count in sorted(stats['status_codes'].items()):
category = "✓ OK" if code < 400 else ("⚠ Client Error" if code < 500 else "✗ Server Error")
print(f" {code} {category:<20} {count:>6} requests")
# High-volume IPs
print(f"\n🔝 TOP 5 IPs BY REQUEST VOLUME")
for ip, count in stats['ip_requests'].most_common(5):
error_count = stats['ip_errors'].get(ip, 0)
error_rate = (error_count / count * 100) if count > 0 else 0
print(f" {ip:<20} {count:>5} requests | {error_rate:.1f}% errors")
# Brute force indicators
brute_force_candidates = [
(ip, count) for ip, count in stats['error_401_ips'].items()
if count >= 5
]
if brute_force_candidates:
print(f"\n🔐 POSSIBLE BRUTE FORCE ATTEMPTS (5+ 401 errors)")
for ip, count in sorted(brute_force_candidates, key=lambda x: x[1], reverse=True):
print(f" ⚠ {ip:<20} {count} unauthorized requests")
# Suspicious activity
if stats['suspicious_ips']:
print(f"\n🚨 SUSPICIOUS ACTIVITY DETECTED")
for ip, findings in stats['suspicious_ips'].items():
unique_findings = list(set(findings))
print(f" ⚠ {ip}")
for finding in unique_findings:
print(f" → {finding.replace('_', ' ').title()}")
else:
print(f"\n✅ No suspicious patterns detected in requests")
# Top requested paths
print(f"\n🛤 TOP 5 REQUESTED PATHS")
for path, count in stats['top_paths'].most_common(5):
display_path = path[:50] + "..." if len(path) > 50 else path
print(f" {count:>5} {display_path}")
print("\n" + "="*60)
print(" End of report")
print("="*60 + "\n")
def main():
parser = argparse.ArgumentParser(
description='Security Log Analyzer',
epilog='Example: python3 log_analyzer.py access.log'
)
parser.add_argument('logfile', help='Path to the log file to analyze')
args = parser.parse_args()
print(f"\nAnalyzing log file: {args.logfile}")
stats = analyze_logs(args.logfile)
generate_report(stats, args.logfile)
if __name__ == '__main__':
main()
Run the Analyzer
python3 log_analyzer.py access.log
What you should see: A structured security report showing request volumes, status code distribution, high-volume IPs with error rates, brute force candidates, suspicious activity by type and IP, and top requested paths.
What the code does — explained:
re.compile(...) — Compiling the regex once at module level, outside the
parsing loop, is a performance optimization. The pattern is compiled a single time and reused
for every line instead of being recompiled on each of the 500 lines.
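For example, here's a simplified version of the combined-log pattern (fewer fields than the full one in log_analyzer.py), compiled once and applied to a sample line:

```python
import re

# Compiled once, reused for every line
pattern = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
)

line = '203.0.113.42 - - [01/Jan/2024:10:00:00 +0000] "GET /admin HTTP/1.1" 403 512 "-" "curl/7.81.0"'
fields = pattern.match(line).groupdict()
print(fields["ip"], fields["path"], fields["status"])  # 203.0.113.42 /admin 403
```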
defaultdict and Counter from the
collections module are specialized dictionaries. Counter automatically
counts occurrences. defaultdict(int) starts every key at 0, preventing
KeyError when accessing a key for the first time.
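A quick illustration of both:

```python
from collections import Counter, defaultdict

ips = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]

counts = Counter(ips)              # tallies occurrences automatically
print(counts.most_common(1))       # [('10.0.0.1', 3)]

errors = defaultdict(int)          # unseen keys start at 0
errors["10.0.0.2"] += 1            # no KeyError, even on first access
print(errors["10.0.0.2"])          # 1
```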
detect_suspicious_activity checks each request against multiple
lists of known malicious patterns — SQL injection strings, path traversal sequences, sensitive
file names, and known security tool user agents. This is a simplified version of the
pattern-matching logic used in real Web Application Firewalls (WAFs) and SIEM (Security
Information and Event Management) tools.
✅ Checkpoint: The analyzer produces a complete, formatted security report with suspicious activity flagged by category and IP.
Tool 3: Hash Verification Utility
What Is Hash Verification?
A cryptographic hash is a fixed-length fingerprint of a file — computed by running the file's contents through a hash function like MD5 or SHA-256. The same file always produces the same hash. Change even one bit of the file and the hash changes completely.
Security professionals use hash verification to:
- Confirm downloaded files haven't been tampered with (malware often modifies legitimate tools)
- Verify configuration files haven't been altered
- Detect file integrity violations in incident response
- Chain-of-custody verification in digital forensics
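You can see the "change one bit, change everything" property directly in Python. Here a single-character change to the input produces a completely different SHA-256 digest:

```python
import hashlib

h1 = hashlib.sha256(b"debug=false").hexdigest()
h2 = hashlib.sha256(b"debug=falsa").hexdigest()  # one character changed

print(h1)
print(h2)
diff = sum(a != b for a, b in zip(h1, h2))
print(f"{diff} of {len(h1)} hex characters differ")
```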
Building the Utility
Create hash_verify.py:
#!/usr/bin/env python3
"""
Hash Verification Utility — Security Automation Tool 3
Computes and verifies cryptographic hashes for file integrity checking.
"""
import hashlib
import argparse
import sys
import os
from pathlib import Path
from datetime import datetime
SUPPORTED_ALGORITHMS = ['md5', 'sha1', 'sha256', 'sha512']
def compute_hash(filepath, algorithm='sha256', buffer_size=65536):
"""
Compute the cryptographic hash of a file.
Uses buffered reading for memory-efficient handling of large files.
"""
algorithm = algorithm.lower()
if algorithm not in SUPPORTED_ALGORITHMS:
print(f"[ERROR] Unsupported algorithm: {algorithm}")
print(f" Supported: {', '.join(SUPPORTED_ALGORITHMS)}")
sys.exit(1)
# Create hash object for the specified algorithm
hash_obj = hashlib.new(algorithm)
try:
file_size = os.path.getsize(filepath)
with open(filepath, 'rb') as f:
while chunk := f.read(buffer_size):
hash_obj.update(chunk)
return hash_obj.hexdigest(), file_size
except FileNotFoundError:
print(f"[ERROR] File not found: {filepath}")
sys.exit(1)
except PermissionError:
print(f"[ERROR] Permission denied: {filepath}")
sys.exit(1)
def format_file_size(size_bytes):
"""Human-readable file size."""
for unit in ['B', 'KB', 'MB', 'GB']:
if size_bytes < 1024:
return f"{size_bytes:.1f} {unit}"
size_bytes /= 1024
return f"{size_bytes:.1f} TB"
def compute_and_display(filepath, algorithms):
"""Compute and display hashes for a file using multiple algorithms."""
print("\n" + "="*65)
print(" HASH VERIFICATION UTILITY")
print("="*65)
print(f" File: {filepath}")
print(f" Computed: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
# Compute with first algorithm to get file size
first_hash, file_size = compute_hash(filepath, algorithms[0])
print(f" Size: {format_file_size(file_size)}")
print("="*65)
results = {}
# Display first result
print(f"\n {algorithms[0].upper():<10} {first_hash}")
results[algorithms[0]] = first_hash
# Compute remaining algorithms
for algo in algorithms[1:]:
hash_value, _ = compute_hash(filepath, algo)
print(f" {algo.upper():<10} {hash_value}")
results[algo] = hash_value
print()
return results
def verify_hash(filepath, expected_hash, algorithm='sha256'):
"""Verify a file against a known hash value."""
print("\n" + "="*65)
print(" HASH VERIFICATION — INTEGRITY CHECK")
print("="*65)
print(f" File: {filepath}")
print(f" Algorithm: {algorithm.upper()}")
print(f" Expected: {expected_hash}")
print("="*65)
computed, file_size = compute_hash(filepath, algorithm)
print(f"\n File size: {format_file_size(file_size)}")
print(f" Expected: {expected_hash}")
print(f" Computed: {computed}")
if computed.lower() == expected_hash.lower():
print("\n ✅ INTEGRITY VERIFIED — Hash matches. File is authentic.")
else:
print("\n 🚨 INTEGRITY FAILURE — Hash mismatch. File may be corrupted or tampered with.")
# Show where hashes differ
min_len = min(len(computed), len(expected_hash))
diff_positions = [
i for i in range(min_len)
if computed[i] != expected_hash.lower()[i]
]
if diff_positions:
print(f" First difference at character position: {diff_positions[0]}")
print("="*65 + "\n")
return computed.lower() == expected_hash.lower()
def scan_directory(directory, algorithm='sha256'):
"""
Compute hashes for all files in a directory.
Useful for creating a baseline integrity snapshot.
"""
dir_path = Path(directory)
if not dir_path.is_dir():
print(f"[ERROR] Not a directory: {directory}")
sys.exit(1)
print("\n" + "="*65)
print(" DIRECTORY HASH SCAN")
print("="*65)
print(f" Directory: {directory}")
print(f" Algorithm: {algorithm.upper()}")
print(f" Scanned: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*65 + "\n")
results = {}
files = list(dir_path.rglob('*'))
file_count = 0
for file_path in files:
if file_path.is_file():
try:
hash_value, size = compute_hash(str(file_path), algorithm)
relative_path = file_path.relative_to(dir_path)
results[str(relative_path)] = hash_value
print(f" {hash_value[:16]}... {format_file_size(size):<10} {relative_path}")
file_count += 1
except Exception as e:
print(f" [SKIP] {file_path.name} — {e}")
print(f"\n Scanned {file_count} file(s).")
print("="*65 + "\n")
return results
def main():
parser = argparse.ArgumentParser(
description='Hash Verification Utility — File Integrity Checking',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Compute SHA-256 hash of a file
python3 hash_verify.py compute suspicious_file.exe
# Compute multiple hash types
python3 hash_verify.py compute downloaded_tool.zip --algo sha256 md5
# Verify a file against a known hash
python3 hash_verify.py verify config.json a3f5... --algo sha256
# Scan all files in a directory
python3 hash_verify.py scan ./config_directory
"""
)
subparsers = parser.add_subparsers(dest='command', required=True)
# Compute command
compute_parser = subparsers.add_parser('compute', help='Compute file hash(es)')
compute_parser.add_argument('filepath', help='File to hash')
compute_parser.add_argument('--algo', nargs='+',
default=['sha256'],
choices=SUPPORTED_ALGORITHMS,
help='Hash algorithm(s) to use')
# Verify command
verify_parser = subparsers.add_parser('verify', help='Verify file against known hash')
verify_parser.add_argument('filepath', help='File to verify')
verify_parser.add_argument('expected_hash', help='Expected hash value')
verify_parser.add_argument('--algo', default='sha256',
choices=SUPPORTED_ALGORITHMS,
help='Hash algorithm')
# Scan command
scan_parser = subparsers.add_parser('scan', help='Hash all files in a directory')
scan_parser.add_argument('directory', help='Directory to scan')
scan_parser.add_argument('--algo', default='sha256',
choices=SUPPORTED_ALGORITHMS,
help='Hash algorithm')
args = parser.parse_args()
if args.command == 'compute':
compute_and_display(args.filepath, args.algo)
elif args.command == 'verify':
result = verify_hash(args.filepath, args.expected_hash, args.algo)
sys.exit(0 if result else 1)
elif args.command == 'scan':
scan_directory(args.directory, args.algo)
if __name__ == '__main__':
main()
Run the Utility
Compute the hash of any file:
python3 hash_verify.py compute access.log
Compute multiple hash types at once:
python3 hash_verify.py compute access.log --algo sha256 md5 sha512
Verify a file against a known hash:
First, compute the hash and copy the output:
python3 hash_verify.py compute access.log
Then verify (paste the hash you just computed):
python3 hash_verify.py verify access.log [paste-hash-here] --algo sha256
You should see ✅ INTEGRITY VERIFIED. Now open access.log, change one
character, save it, and run the verify command again. You'll see
🚨 INTEGRITY FAILURE — proving that any modification, no matter how small, changes
the hash completely.
Scan an entire directory:
python3 hash_verify.py scan .
This hashes every file in the current directory — useful for creating an integrity baseline of a configuration directory or a build output.
What the code does — explained:
while chunk := f.read(buffer_size) — The walrus operator
(:=) reads a chunk and assigns it to chunk simultaneously. This
pattern allows reading large files in chunks without loading the entire file into memory —
essential for hashing files that might be gigabytes in size.
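The same chunked-reading pattern works on any file-like object. This sketch hashes an in-memory stream in 64 KiB chunks and confirms the result matches hashing the data in one shot (hash_stream is an illustrative helper, not part of hash_verify.py):

```python
import hashlib
import io

def hash_stream(stream, buffer_size=65536):
    """Hash a file-like object in fixed-size chunks."""
    h = hashlib.sha256()
    while chunk := stream.read(buffer_size):  # read and test in one step
        h.update(chunk)
    return h.hexdigest()

data = b"x" * 200_000  # larger than a single 64 KiB buffer
print(hash_stream(io.BytesIO(data)) == hashlib.sha256(data).hexdigest())  # True
```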
hashlib.new(algorithm) — Creates a hash object dynamically based on
the algorithm name. This is more flexible than using hashlib.sha256() directly and
allows the same code to support multiple algorithms.
The directory scan creates a hash baseline — a snapshot of a directory's contents at a point in time. Run it before and after a deployment or configuration change, compare the outputs, and you know exactly what changed and what didn't. This is a core technique in file integrity monitoring.
✅ Final Checkpoint: All three tools run, produce correct output, and handle errors gracefully.
What You Just Built — And What You Learned
Three working security automation tools that address real security needs. Skills you practiced across this tutorial:
- Python networking — sockets, TCP connections, concurrent scanning
- Regex for log parsing — extracting structured data from unstructured text
- Pattern matching for threat detection — identifying known attack signatures
- Cryptographic hashing — computing and verifying file integrity
- Python CLI design — argparse subcommands, help text, validation
- Concurrent programming — ThreadPoolExecutor for parallel network operations
- File I/O — efficient buffered reading for large file handling
These are not conceptual exercises. They're real tools that solve real security problems. Security professionals write scripts like these every day.
Python is the language of security automation. These three scripts give you a real foundation: not just tools to use, but a base you can extend and adapt to your own needs.
Extending These Tools
Ideas for taking each tool further:
Port Scanner: Add service version detection using banner grabbing — connect to an open port and read what the service sends back. Add output to JSON or CSV for reporting. Add ICMP ping check before scanning to confirm the host is up.
Log Analyzer: Add support for different log formats (nginx, Apache, custom). Export the report to HTML or JSON. Add real-time monitoring mode that watches the log file and alerts on new suspicious activity.
Hash Verify: Add a baseline command that saves a directory scan to
a JSON file, and a compare command that compares the current state against the
baseline — the core of a file integrity monitoring system.
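As a hypothetical starting point (the names save_baseline and compare_to_baseline are illustrative, not part of hash_verify.py), the comparison step is mostly set arithmetic on the two path-to-hash dictionaries:

```python
import json
from pathlib import Path

def save_baseline(hashes, outfile="baseline.json"):
    """Persist a {path: hash} snapshot to disk."""
    Path(outfile).write_text(json.dumps(hashes, indent=2))

def compare_to_baseline(current, baseline_file="baseline.json"):
    """Report files added, removed, or changed since the baseline."""
    baseline = json.loads(Path(baseline_file).read_text())
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in set(current) & set(baseline)
                          if current[p] != baseline[p]),
    }

# In-memory dicts standing in for two directory scans
before = {"app.conf": "aaa", "db.conf": "bbb"}
after = {"app.conf": "aaa", "db.conf": "ccc", "new.conf": "ddd"}
save_baseline(before)
print(compare_to_baseline(after))
# {'added': ['new.conf'], 'removed': [], 'changed': ['db.conf']}
```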
What to Build Next
→ Explore the Cybersecurity Learning Path
→ Read: Python for Beginners — Blog Post
→ Read: Understanding Zero Trust Security
→ Glossary: Penetration Testing, Vulnerability, Encryption, Malware
→ Next tutorial: Network Traffic Analysis with Python *(coming soon)*