Digital Forensics

1. Digital Forensics Principles

Foundational Concepts

Digital forensics is both a technical discipline and a legal process. Errors in evidence handling can render findings inadmissible or unreliable even if technically accurate.

Locard's Exchange Principle: Every contact leaves a trace. An attacker leaves artifacts on the systems they compromise, and an examiner who touches evidence also alters it, which is why acquisition relies on read-only access and documented procedures.
Order of Volatility: Collect most volatile evidence first. Memory disappears on shutdown; disk persists; logs may be rotated. Prioritize: RAM → Network connections → Running processes → Temp files → Disk → Logs → Backups.
Chain of Custody: Every person who handles evidence must be documented — who collected it, when, from where, using what tools, what hash verified integrity. Breaks in chain of custody challenge legal admissibility.
Forensic Soundness: Work on a bit-for-bit copy (forensic image) of evidence, never the original. Hash verification (MD5 + SHA-256) proves the copy is identical to the original. Any analysis that modifies the evidence is unacceptable.
Write Blockers: Hardware write blockers (Tableau, Wiebetech) or software write blockers prevent any writes to the evidence drive during acquisition. Essential for disk forensics — without write blocking, simply mounting a drive modifies metadata.
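The hash verification step described above can be sketched in a few lines. This is a minimal illustration, not a replacement for the verification built into imaging tools; the function names hash_stream and hash_image are hypothetical, and the image is read in chunks so that large files do not need to fit in memory.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=1024 * 1024):
    """Return (md5_hex, sha256_hex) for a binary stream, read in one pass."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def hash_image(path):
    """Hash a forensic image on disk (opened read-only)."""
    with open(path, "rb") as image:
        return hash_stream(image)


# Stand-ins for the original media and its copy; real use would call hash_image().
original = io.BytesIO(b"abc")
copy = io.BytesIO(b"abc")
print("copy verified:", hash_stream(original) == hash_stream(copy))
```

Record both digests at acquisition time and recompute them before analysis; any mismatch means the copy can no longer be shown to be identical to the original.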

Legal Admissibility Requirements

Forensic evidence that may be used in litigation or criminal prosecution must meet specific standards. Even if you don't anticipate court proceedings, following these standards protects the integrity of your investigation.

Authentication: Evidence must be shown to be what you claim it is. Hash verification from acquisition to analysis proves an image is an unaltered copy of the original.
Hearsay Exceptions: Digital records are admissible under business records exception if created in the regular course of business and there's testimony from a qualified witness about how they're generated and maintained.
Best Evidence Rule: Courts prefer the original. A forensic image is a duplicate, so it must be authenticated as an accurate reproduction of the original media (acquisition and verification hashes match).
Expert Witness Standards: Forensic analysts may be called as expert witnesses. Work must be reproducible (another examiner with the same image reaches the same conclusions) and methods must be scientifically accepted.
Engage legal counsel before any forensic investigation that may result in litigation, criminal prosecution, or regulatory enforcement. Attorney-client privilege may apply to IR communications.
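The authentication requirement above depends on a custody record that ties every handling step to a hash. One possible shape for such a record is sketched below. The CustodyEntry fields, the record helper, and the chain_is_intact check are all hypothetical names chosen for illustration, and the evidence is passed as bytes only to keep the example short (a real image would be hashed in chunks from disk).

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CustodyEntry:
    evidence_id: str
    action: str          # e.g. "acquired", "transferred", "verified"
    handler: str         # who handled the evidence
    tool: str            # tool used for this step
    sha256: str          # hash observed at this step
    timestamp_utc: str   # when the step happened, in UTC


def record(log, evidence_id, action, handler, tool, data):
    """Append a custody entry carrying the SHA-256 of the evidence as seen now."""
    entry = CustodyEntry(
        evidence_id=evidence_id,
        action=action,
        handler=handler,
        tool=tool,
        sha256=hashlib.sha256(data).hexdigest(),
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
    )
    log.append(entry)
    return entry


def chain_is_intact(log):
    """True when every recorded step observed the same hash."""
    return len({entry.sha256 for entry in log}) <= 1
```

A log whose entries all carry the same hash supports the claim that the image analyzed is the image acquired; a differing hash pinpoints the step at which integrity was lost.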

Evidence Type	Volatility	Acquisition Tool	Format	Integrity Verification
RAM / Memory	Very High (lost on reboot)	Winpmem, LiME, VMware snapshot	.raw, .lime, .vmem	SHA-256 immediately after acquisition
Running Processes	High (changes constantly)	tasklist, wmic, ps, Velociraptor	Text/CSV output	Timestamp + hash of output file
Network Connections	High (ephemeral)	netstat, ss, Wireshark, tcpdump	PCAP, text	Timestamp + hash
Registry (Windows)	Medium (modified by activity)	FTK Imager, reg export, Velociraptor	.hive files	Hash of exported hives
Disk Image	Low (persists)	FTK Imager, Guymager, dd	.E01, .001, .raw	MD5 + SHA-256 of full image; re-verify before analysis
System Logs	Low-Medium (rotated)	Copy evtx files, journalctl, log export	.evtx, text	Hash + backup immediately

2. Memory Forensics

Why Memory is Irreplaceable

Memory contains evidence that exists nowhere else — the running state of the system at the moment of investigation, including artifacts that malware specifically tries to avoid writing to disk.

Encryption Keys: Ransomware holds encryption keys in memory before writing encrypted files. BitLocker keys, TrueCrypt keys, malware encryption keys — all potentially recoverable from memory.
Fileless Malware: Malware that runs entirely in memory leaves no artifacts on disk. PowerShell-based attacks, reflective DLL injection, process hollowing — memory is the only place they exist.
LSASS Credentials: Windows LSASS process holds NTLM hashes and Kerberos tickets for logged-in users in memory. Mimikatz extracts these; forensic analysis of a memory dump can reconstruct what was available.
Network Connections: Memory shows the full socket table including connections that may have closed before network forensics capture began — useful for reconstructing C2 infrastructure.
Command History: Commands typed in terminal emulators, PowerShell, and cmd.exe are retained in memory buffers. May capture attacker commands that were not logged anywhere else.
Injected Code: Process injection techniques (DLL injection, process hollowing, shellcode injection) leave the injected code detectable in memory analysis even when the processes appear legitimate in a process listing.

Volatility 3 Framework

Volatility Framework is the standard open-source memory forensics tool. Version 3 is the current major version, with rewritten architecture for speed and Python 3 support.

pslist / pstree: List running processes at time of capture. pstree shows parent-child relationships — abnormal parent-child pairs (svchost spawned from explorer.exe) are immediately suspicious.
malfind: Identifies memory regions with RWX (read-write-execute) permissions that contain executable code, a common sign of code injection. Expect some false positives from JIT compilers (browsers, .NET). With the --dump option it writes the suspicious regions to disk for malware analysis.
netscan: Reconstructs the network connection table by scanning memory for socket and connection structures. Shows TCP/UDP endpoints, including connections that had already closed when the memory was captured.
dumpfiles: Extracts files cached in memory (both mapped files and loaded DLLs). Can recover malware executables that were deleted from disk after loading.
hashdump: Extracts NTLM password hashes from SAM database structures in memory. Critical for understanding what credentials the attacker could harvest.
cmdline: Recovers command-line arguments for the processes present in the capture, showing how malware was launched. Arguments of processes that exited before acquisition are usually not recoverable this way.
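The parent-child check described for pstree can be automated once the process list has been exported. The sketch below assumes the listing has already been parsed into dictionaries with pid, ppid, and name keys (a hypothetical structure, not Volatility's native output), and the EXPECTED_PARENTS baseline is a small illustrative subset rather than a complete reference.

```python
# Illustrative baseline: well-known Windows processes and their usual parent.
EXPECTED_PARENTS = {
    "svchost.exe": {"services.exe"},
    "services.exe": {"wininit.exe"},
    "lsass.exe": {"wininit.exe"},
}


def find_suspicious_parents(processes):
    """Return (pid, name, parent_name) for processes whose parent is unexpected."""
    by_pid = {proc["pid"]: proc for proc in processes}
    findings = []
    for proc in processes:
        expected = EXPECTED_PARENTS.get(proc["name"].lower())
        if expected is None:
            continue  # no baseline for this process name
        parent = by_pid.get(proc["ppid"])
        # A parent that exited before capture is reported as "<missing>".
        parent_name = parent["name"].lower() if parent else "<missing>"
        if parent_name not in expected:
            findings.append((proc["pid"], proc["name"], parent_name))
    return findings
```

Findings are leads, not verdicts: PIDs can be reused and parents can exit, so each hit should be confirmed against the full pstree output and process creation times.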

# Volatility 3: Key IR investigation commands

# List all processes with parent relationships
python3 vol.py -f memory.raw windows.pstree

# Find processes with injected code (RWX memory regions with executable content)
python3 vol.py -f memory.raw windows.malfind

# List all network connections (including recently closed)
python3 vol.py -f memory.raw windows.netscan

# Show command-line arguments for all processes (catch attacker commands)
python3 vol.py -f memory.raw windows.cmdline

# Extract local account NTLM hashes from the SAM and SYSTEM registry hives in memory
python3 vol.py -f memory.raw windows.hashdump

# Dump all files cached in memory to output directory
python3 vol.py -f memory.raw -o /output/ windows.dumpfiles

# List loaded DLLs for a specific process (check for injected DLLs)
python3 vol.py -f memory.raw windows.dlllist --pid 1234

# Inspect a process's VAD tree for signs of hollowing (legitimate image replaced by injected code)
# Look for private executable regions or a mapped image that doesn't match the on-disk file
python3 vol.py -f memory.raw windows.vadinfo --pid 1234

# Linux memory: list processes
python3 vol.py -f memory.lime linux.pslist

# Linux memory: show bash history from memory
python3 vol.py -f memory.lime linux.bash

3. Disk Forensics

Windows NTFS Artifacts

NTFS, the Windows file system, maintains extensive metadata structures that survive file deletion and record system activity beyond what any individual log captures.

$MFT (Master File Table): Every file on an NTFS volume has an MFT entry recording filename, size, timestamps (created/modified/accessed/changed), and data locations. Even deleted files have residual MFT entries recoverable with tools like MFTECmd.
$LogFile: NTFS transaction log for filesystem consistency. Records file creation, modification, and deletion operations. Useful for reconstructing recent filesystem activity.
$UsnJrnl (USN Change Journal): Records all file and directory changes. Persists longer than $LogFile and is the primary source for recovering information about deleted files — even after deletion, the USN entry often persists.
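Prefetch: C:\Windows\Prefetch\. Windows monitors roughly the first 10 seconds of application startup to speed future loads. Each .pf file records the executable name, run count, last run times (up to eight on Windows 8 and later), and the files and folders referenced. The .pf file remains after the executable itself is deleted, so it often survives attacker cleanup.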
LNK Files: Windows shortcut files created when you open a file. Record original path, timestamps, and volume information. Persist even after the referenced file is deleted — show what files an attacker opened.
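ShimCache / Amcache: Windows application compatibility databases that record executables seen on the system, with path and last modification time (Amcache also stores a SHA-1 hash). Both persist when other logs are missing. Treat an entry as evidence that a file existed at that path; on newer Windows versions a ShimCache entry alone does not prove execution, so corroborate with Prefetch or event logs.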
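The NTFS timestamps mentioned above are stored as Windows FILETIME values: a 64-bit count of 100-nanosecond intervals since 1601-01-01 UTC. When a tool reports the raw integer, a conversion such as the following sketch makes it readable. The function name filetime_to_utc is an illustrative choice, and sub-microsecond precision is dropped because Python's datetime cannot represent it.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100 ns ticks from this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(filetime: int) -> datetime:
    """Convert a Windows FILETIME integer to a timezone-aware UTC datetime."""
    # 10 ticks per microsecond; the remainder (below 1 microsecond) is discarded.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


# 116444736000000000 is the FILETIME value of the Unix epoch.
print(filetime_to_utc(116444736000000000).isoformat())
```

Keeping the result timezone-aware avoids the local-time versus UTC confusion that commonly corrupts timelines.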

Linux Forensic Artifacts

Linux systems maintain different but equally rich forensic artifacts across various log files, filesystem metadata, and shell history files.

.bash_history: Command history for each user. Location: ~/.bash_history. Root: /root/.bash_history. Attackers often clear this (history -c, unset HISTFILE), but the absence of history is itself forensically interesting. Memory may contain unsaved history.
/var/log/auth.log: Authentication events — SSH logins, sudo commands, su attempts, PAM events. Primary source for reconstructing unauthorized access. Debian/Ubuntu; RHEL uses /var/log/secure.
/var/log/syslog & kern.log: System events including service starts/stops, kernel messages, cron job execution. Useful for reconstructing what happened and when.
systemd Journal: journalctl provides comprehensive logging for systemd-based systems, including service logs, boot sequence, and kernel messages. Use journalctl -u sshd for SSH-specific events (the unit is named ssh on Debian/Ubuntu).
wtmp / utmp / btmp: Binary log files recording user login/logout (wtmp), currently logged-in users (utmp), and failed login attempts (btmp). Read with last, who, lastb.
/proc/ filesystem: Runtime kernel data structures exposed as files. /proc/[pid]/cmdline (command-line of running process), /proc/[pid]/maps (memory maps), /proc/net/ (network state). Volatile — only available on live systems.
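As an illustration of working with auth.log, the sketch below pulls SSH login results out of log lines and counts failures per source address. It assumes the common OpenSSH message wording ("Accepted ... for user from ip port n" and "Failed ... for [invalid user] user from ip port n"); message formats vary between versions and distributions, so the pattern should be checked against real samples before relying on it. The function names are hypothetical.

```python
import re
from collections import Counter

# Matches typical OpenSSH authentication result lines (format is an assumption).
SSH_EVENT = re.compile(
    r"sshd(?:-session)?\[\d+\]: (?P<result>Accepted|Failed) (?P<method>\S+) for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<ip>\S+) port (?P<port>\d+)"
)


def parse_ssh_events(lines):
    """Return a dict per SSH authentication result found in the given lines."""
    events = []
    for line in lines:
        match = SSH_EVENT.search(line)
        if match:
            events.append(match.groupdict())
    return events


def failed_logins_by_ip(events):
    """Count failed authentication attempts per source IP."""
    return Counter(ev["ip"] for ev in events if ev["result"] == "Failed")
```

Run against a copy of the log taken from the forensic image, never against the live file on the original system.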

4. Timeline Analysis & Log Forensics

Super Timeline Construction

A super timeline aggregates timestamps from all available sources — filesystem metadata, event logs, browser history, prefetch, registry — into a single chronological record of system activity.

log2timeline (plaso): Open-source tool that ingests disk images, log files, and other artifact sources and writes the extracted timestamps to a normalized .plaso storage file; the companion psort tool exports it to CSV and other formats. Parsers for hundreds of artifact types are applied automatically.
Timesketch (Google): Open-source, web-based timeline analysis platform built to work with plaso output. Supports multi-investigator collaboration, annotation, search, and filtering across millions of events. Self-hosted, either on-premises or on cloud infrastructure such as GCP.
Timeline approach: start with known-bad events (malware execution timestamp from AV logs), then look backward (how did attacker get there?) and forward (what did they do after?) in the timeline.
Timestamp manipulation: attackers may modify file timestamps (timestomping) to blend malware in with legitimate system files. Within the $MFT, compare the $STANDARD_INFORMATION timestamps with the $FILE_NAME timestamps, and compare both against $LogFile and $UsnJrnl records; discrepancies indicate tampering.
Timezone awareness: systems may use local time, UTC, or a mix. Log sources may use different timezones. Normalize everything to UTC before building timeline.
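The normalization step above can be sketched as follows. This is a simplified illustration, not how plaso works internally: each source is assumed to supply ISO 8601 timestamps, and sources that log naive local time are assumed to have a known, fixed UTC offset (daylight saving changes are ignored). The names to_utc and build_timeline are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(timestamp: str, utc_offset_hours: float = 0.0) -> datetime:
    """Parse an ISO 8601 string; naive values are treated as the source's local offset."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return parsed.astimezone(timezone.utc)


def build_timeline(sources):
    """Merge events into one UTC-ordered list.

    sources: iterable of (source_name, utc_offset_hours, [(timestamp, description), ...])
    """
    events = []
    for name, offset, records in sources:
        for timestamp, description in records:
            events.append((to_utc(timestamp, offset), name, description))
    return sorted(events, key=lambda event: event[0])
```

With every event in UTC, the backward and forward search from a known-bad event becomes a simple scan of one sorted list.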

Critical Windows Event Log IDs for Forensics

Windows Event Logs are often the primary forensic record for investigation. Knowing which Event IDs matter — and their forensic interpretation — is essential for IR.

4624 — Logon Success: Record of every successful Windows logon. LogonType field is critical: Type 2 (interactive), Type 3 (network), Type 10 (remote interactive/RDP). Source IP and account name trace attacker lateral movement.
4625 — Logon Failure: Failed authentication attempt. Multiple 4625s before a 4624 indicate brute force.
4648 — Explicit Credentials Logon: A process used explicit credentials to authenticate (RunAs, net use, Pass-the-Hash). The logged credentials may differ from the logged-in user — lateral movement indicator.
4688 — Process Created: New process creation with command-line (requires audit policy for command line logging). The richest event for detecting malicious execution. Enable "Include command line in process creation events."
4698 — Scheduled Task Created: New scheduled task registered. Attacker persistence technique. Task name, action, and creating account all logged.
4720 — Account Created: New user account created. Attacker backdoor account creation. 4722 (account enabled), 4738 (account changed) also important.
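7045 — Service Installed: New Windows service created. Logged in the System log by the Service Control Manager, not the Security log (the Security log equivalent is 4697, which requires auditing to be enabled). PsExec, Cobalt Strike beacon services, and attacker-installed backdoor services all generate this event. Service name, path, and account logged.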
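The brute-force pattern noted for 4625 and 4624 (several failures followed by a success) is easy to check once events are exported. The sketch below assumes events have been parsed into dictionaries with event_id, time, source_ip, and account keys, which is a hypothetical structure; the threshold of five failures and the ten-minute window are arbitrary illustrative defaults.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_bruteforce_successes(events, min_failures=5, window=timedelta(minutes=10)):
    """Return (source_ip, account, failure_count, success_time) for each
    successful logon preceded by enough recent failures from the same source."""
    recent_failures = defaultdict(list)  # source_ip -> list of failure times
    findings = []
    for event in sorted(events, key=lambda e: e["time"]):
        ip = event["source_ip"]
        if event["event_id"] == 4625:
            recent_failures[ip].append(event["time"])
        elif event["event_id"] == 4624:
            in_window = [t for t in recent_failures[ip] if event["time"] - t <= window]
            if len(in_window) >= min_failures:
                findings.append((ip, event["account"], len(in_window), event["time"]))
            recent_failures[ip].clear()  # start counting afresh after a success
    return findings
```

Grouping by source address misses password spraying from many addresses against one account; grouping by account instead is a straightforward variation.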

5. Network Forensics

PCAP Analysis

Full packet capture (PCAP) provides the most complete network forensic record. When PCAP is available, it enables reconstruction of data exfiltrated, commands received, and lateral movement details.

Wireshark: The standard GUI PCAP analysis tool. Key capabilities: Follow TCP/UDP stream (reconstructs full conversation), Export Objects (extracts files transferred over HTTP/SMB), Protocol Statistics (identify unusual protocols), and filter by expression.
Follow TCP Stream: Select any packet from a session, right-click → Follow → TCP Stream. Reconstructs the entire conversation as readable text — see HTTP requests/responses, FTP transfers, or SMTP emails as they occurred.
Export Objects: File → Export Objects → HTTP (or SMB, DICOM, TFTP). Extracts files transferred over supported protocols. Critical for recovering malware downloaded via HTTP or data exfiltrated via HTTP POST.
tcpdump for targeted capture: tcpdump -i eth0 -w capture.pcap host 192.168.1.100 — targeted capture by host. Add filters for specific ports, protocols, or traffic patterns.
NetworkMiner: Passive network forensics tool that parses PCAP and organizes findings by host — extracted files, credentials transmitted in cleartext, session reconstruction, OS fingerprinting.
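For quick triage without any of the tools above, the classic libpcap file layout is simple enough to walk with the standard library: a 24-byte global header followed by records that each start with a 16-byte header (seconds, microseconds, captured length, original length). The sketch below is a minimal reader under those assumptions; it does not handle pcapng or nanosecond-resolution captures, and the function name is illustrative.

```python
import struct


def read_pcap_records(data: bytes):
    """Yield (timestamp_seconds, captured_length, original_length) for each packet."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"   # file written little-endian
    elif magic == 0xD4C3B2A1:
        endian = ">"   # file written big-endian
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        if offset + incl_len > len(data):
            break  # truncated final record
        yield ts_sec + ts_usec / 1_000_000, incl_len, orig_len
        offset += incl_len
```

Counting records and checking the first and last timestamps is a fast way to confirm that a capture actually covers the incident window before loading it into Wireshark.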

Zeek / Bro Log Analysis

Zeek (formerly Bro) is a network analysis framework that converts raw network traffic into structured log files suitable for retrospective investigation and SIEM ingestion.

conn.log: Summary of every network connection: source/dest IP, port, protocol, duration, bytes sent/received (orig_bytes, resp_bytes). Primary source for identifying data exfiltration (large orig_bytes from an internal host to external IPs) and beaconing patterns.
dns.log: All DNS queries and responses. DNS-based C2, DGA (domain generation algorithm) domains, DNS tunneling, and DNS exfiltration all visible here.
http.log: HTTP requests with method, URI, user-agent, referrer, response code, and file IDs that link to the hashes in files.log. Attacker tool downloads, web shell activity, and data exfiltration via HTTP POST all appear here.
ssl.log / x509.log: TLS session and certificate information. With the JA3 package loaded, ssl.log also carries JA3 fingerprints (a hash of client hello parameters). A JA3 value characterizes a TLS client implementation and can point to specific malware families even through encrypted C2, though unrelated clients sometimes share a fingerprint.
files.log: All files transferred over protocols Zeek understands: HTTP, FTP, SMB, SMTP attachments. Includes MD5 and SHA-1 hashes by default (SHA-256 can be enabled) for VirusTotal lookup.
JA3/JA4 TLS Fingerprinting: TLS client hello parameters generate a fingerprint characteristic of the TLS implementation. Default configurations of Cobalt Strike, Metasploit, and other frameworks have known JA3 signatures even when using HTTPS. Treat a match as a lead to corroborate, since operators can alter the fingerprint.
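Beaconing in conn.log shows up as connections from one host to the same destination at near-constant intervals. One simple heuristic, sketched below, groups connections by (origin, responder, port) and flags groups whose inter-arrival times have a low coefficient of variation. The input tuples mirror the Zeek fields ts, id.orig_h, id.resp_h, and id.resp_p; the function name and the thresholds (six events, 10% jitter) are illustrative assumptions, and real C2 with deliberate jitter will need looser settings.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_beacons(connections, min_events=6, max_jitter=0.1):
    """Flag (orig_h, resp_h, resp_p) groups with regular connection intervals.

    connections: iterable of (ts, orig_h, resp_h, resp_p) with ts in epoch seconds.
    Returns a list of (group_key, mean_interval_seconds, jitter_ratio).
    """
    by_pair = defaultdict(list)
    for ts, orig_h, resp_h, resp_p in connections:
        by_pair[(orig_h, resp_h, resp_p)].append(ts)

    findings = []
    for pair, times in by_pair.items():
        if len(times) < min_events:
            continue
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        average = mean(gaps)
        if average <= 0:
            continue  # all connections share one timestamp
        jitter = pstdev(gaps) / average  # coefficient of variation
        if jitter <= max_jitter:
            findings.append((pair, round(average, 1), round(jitter, 3)))
    return findings
```

Legitimate software (update checks, monitoring agents, NTP) also produces regular traffic, so flagged groups should be reviewed against dns.log and known-good destinations.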

Cloud Log Forensics

In cloud environments, logs are often the only forensic evidence available. Unlike physical systems, cloud instances may be ephemeral — the instance that was compromised may no longer exist.

AWS CloudTrail: Records all API calls to AWS services — who did what, when, from which IP, with which credentials. Essential for detecting unauthorized IAM changes, data access, and resource modifications. Enable in all regions, including management events.
Azure AD (now Microsoft Entra ID) Audit Log & Sign-in Log: Records all identity events: user sign-ins, MFA events, conditional access outcomes, role assignments, app consent grants. Retained 7 days on the free tier and 30 days with P1/P2 licensing; extend via Log Analytics.
GCP Admin Activity Audit Logs: Records admin actions on GCP resources. Always enabled, free, retained 400 days. Supplemented by Data Access logs (disabled by default due to volume).
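Cloud forensics limitation: memory forensics on a cloud VM is only possible if the instance is still running and memory can be captured from inside the guest, or if a memory-inclusive snapshot was taken. No kernel-level visibility without an agent. Cloud provider logs are often the only evidence available for cloud-native attacks.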
Enable VPC Flow Logs (AWS), NSG Flow Logs (Azure), VPC Flow Logs (GCP) for network-level visibility. These are not enabled by default and represent the closest equivalent to network flow analysis in cloud.
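CloudTrail delivers log files as JSON documents with a top-level Records array, so a first-pass review needs nothing more than the standard library. The sketch below filters for a handful of API calls that often matter in an investigation. The SENSITIVE_EVENTS set is an illustrative starting point, not a complete list, and the function name is hypothetical.

```python
import json

# Illustrative subset of API calls worth reviewing first.
SENSITIVE_EVENTS = {
    "CreateUser",
    "CreateAccessKey",
    "AttachUserPolicy",
    "PutUserPolicy",
    "StopLogging",
    "DeleteTrail",
}


def sensitive_calls(cloudtrail_json: str):
    """Return sensitive API calls from one CloudTrail log file, oldest first."""
    records = json.loads(cloudtrail_json).get("Records", [])
    hits = []
    for rec in records:
        if rec.get("eventName") in SENSITIVE_EVENTS:
            hits.append({
                "time": rec.get("eventTime"),
                "event": rec.get("eventName"),
                "source_ip": rec.get("sourceIPAddress"),
                "actor": rec.get("userIdentity", {}).get("arn"),
            })
    # eventTime is ISO 8601 in UTC, so string order matches time order.
    return sorted(hits, key=lambda hit: hit["time"] or "")
```

StopLogging and DeleteTrail deserve particular attention, since an attacker who disables the trail removes the evidence source this whole section depends on.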

Comprehensive Logging Before the Incident is Everything

In cloud-native and hybrid environments, logs are often the only forensic evidence available. Without AWS CloudTrail, you cannot reconstruct which API calls were made during an AWS incident. Without Azure AD sign-in logs retained beyond the default period, you cannot determine if an attacker was present before the detection window. Without Windows process creation audit logging with command-line arguments enabled, Event 4688 tells you that cmd.exe ran but not what it ran. The time to configure comprehensive logging is before an incident occurs. Audit your logging coverage quarterly against a reference checklist, and treat logging gaps as security control gaps, because they are.