OSINT & Reconnaissance: Know Your Target
Passive intelligence gathering before touching the target network
OSINT happens before any active engagement. You're gathering intelligence from public sources—no packets touch the target network. This is how sophisticated attackers build target profiles for initial access and social engineering.
Why Reconnaissance Matters
The more you know about a target, the more precise your attack. Spray-and-pray phishing has low success rates. Targeted attacks that use discovered employee names, org structure, technology stack, and recent events succeed far more often.
RECONNAISSANCE PHASES

PASSIVE OSINT          ──►  SEMI-PASSIVE                ──►  ACTIVE RECON
• Public records            • Web browsing                   • Port scanning
• Social media              • DNS lookups                    • Vulnerability scanning
• Job postings              • WHOIS queries                  • Web spidering
• Leaked data               • Certificate transparency
        │                           │                                │
        ▼                           ▼                                ▼
NO DETECTION RISK           MINIMAL RISK                     DETECTABLE (firewall logs)
People Intelligence (People-Focused OSINT)
Humans are usually the weakest link. Finding the right person to target—and learning enough about them to craft a convincing pretext—is often more valuable than technical reconnaissance.
LinkedIn Reconnaissance
# LinkedIn is the gold standard for corporate intelligence
WHAT TO GATHER:
• Employee names, titles, departments
• Reporting structure (who reports to whom)
• Recent hires (less security awareness)
• Recent departures (impersonation opportunities)
• Technology skills listed (reveals tech stack)
• Job postings (reveals projects, tools, gaps)
• Company pages (org size, locations, news)
TECHNIQUES:
• Search: site:linkedin.com "Company Name"
• Boolean: "security engineer" AND "Company Name"
• Find IT staff: "system administrator" OR "network engineer"
• Recent hires: Add "started" "2024" to search
OPSEC CONSIDERATIONS:
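• LinkedIn shows targets who viewed their profile (switch profile viewing to private mode or use a separate research account; a browser's incognito window does not hide a logged-in viewer)
• Targets with Premium see a longer list of profile viewers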
• Sales Navigator searches are logged
• Consider using a burner account from VPN
Email Discovery
# Discover email format and valid addresses
# theHarvester - Aggregate multiple sources
theHarvester -d target.com -b all
# hunter.io - Email format and addresses
# API: https://hunter.io/api
curl "https://api.hunter.io/v2/domain-search?domain=target.com&api_key=YOUR_KEY"
# Common email formats to try:
# firstname.lastname@company.com
# firstnamelastname@company.com
# firstname@company.com
# flastname@company.com
# firstl@company.com
# Verify emails exist (careful - may alert target)
# smtp-user-enum, email verification APIs
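As a minimal sketch of how the format list above can be applied, the helper below builds candidate addresses for one person and reports which pattern reproduces an address already confirmed from another source. The function names, the pattern labels, and the sample employee are hypothetical; it covers only a few of the formats and assumes non-empty first and last names.

```python
from __future__ import annotations


def candidate_addresses(first: str, last: str, domain: str) -> dict[str, str]:
    """Return one candidate address per naming pattern (a subset of the list above)."""
    f, l = first.strip().lower(), last.strip().lower()
    local_parts = {
        "firstname.lastname": f"{f}.{l}",
        "firstnamelastname": f"{f}{l}",
        "flastname": f"{f[0]}{l}",
        "firstl": f"{f}{l[0]}",
    }
    return {name: f"{local}@{domain}" for name, local in local_parts.items()}


def infer_format(first: str, last: str, known_address: str) -> str | None:
    """Name the pattern that reproduces a known address, or None if none does."""
    known = known_address.strip().lower()
    domain = known.rsplit("@", 1)[-1]
    for name, address in candidate_addresses(first, last, domain).items():
        if address == known:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical employee and address, for illustration only.
    print(infer_format("Jane", "Smith", "jsmith@company.com"))
```

Once one confirmed address identifies the pattern, the same pattern can be applied to the rest of the discovered names without probing the mail server.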
Social Media Deep Dive
PLATFORM-SPECIFIC INTELLIGENCE:
TWITTER/X:
• Tech staff often discuss tools, frustrations
• Search: from:employee_handle OR @companyhandle (keep operators outside quotes)
• Company hashtags reveal events, culture
• Complaints reveal pain points
GITHUB:
• Employee personal repos may leak secrets
• Organization repos reveal tech stack
• Commit history shows developer names/emails
• Issues/PRs reveal internal processes
• GitHub dorks: "company.com" password/secret/key
FACEBOOK:
• Personal connections to employees
• Company page events, photos
• Employee check-ins reveal locations
INSTAGRAM:
• Office photos (badges, whiteboards, screens)
• Employee locations, routines
• Company culture intel
Technical Reconnaissance
Domain Enumeration
# Subdomain discovery - find attack surface
# Passive - Certificate Transparency logs
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Amass - Comprehensive subdomain enumeration
amass enum -passive -d target.com -o subdomains.txt
# Subfinder - Fast passive enumeration
subfinder -d target.com -all -o subs.txt
# Assetfinder - Quick and simple
assetfinder --subs-only target.com
# DNSdumpster - Web-based visualization
# https://dnsdumpster.com
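# Combine and dedupe (name the inputs explicitly; a *.txt glob would also pick up earlier output files)
cat subdomains.txt subs.txt | sort -u > all_subdomains.txt
# Check which are alive (this sends HTTP requests to the target, so it is no longer passive)
httpx -l all_subdomains.txt -o alive.txt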
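Certificate Transparency results need some cleanup before they are useful: a single `name_value` field can hold several names separated by newlines, and wildcard entries and mixed case are common. The sketch below normalizes that JSON into a sorted, de-duplicated host list. It assumes the crt.sh JSON layout with a `name_value` key; the function name and sample data are made up for illustration.

```python
from __future__ import annotations

import json


def hostnames_from_crtsh(raw_json: str, domain: str) -> list[str]:
    """Return sorted, unique hostnames under `domain` from crt.sh JSON output."""
    domain = domain.lower().rstrip(".")
    found: set[str] = set()
    for entry in json.loads(raw_json):
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # a wildcard still shows the parent name exists
            # Keep only the domain itself and true subdomains of it.
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    # Made-up sample in the shape crt.sh returns.
    sample = json.dumps([
        {"name_value": "www.target.com\nmail.target.com"},
        {"name_value": "*.dev.target.com"},
        {"name_value": "WWW.target.com"},
        {"name_value": "target.com.evil.example"},
    ])
    print(hostnames_from_crtsh(sample, "target.com"))
```

The suffix check matters: a lookalike such as `target.com.evil.example` contains the domain string but is not part of the target's namespace, so it is dropped.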
DNS Intelligence
# DNS records reveal infrastructure
# Basic enumeration (query record types individually; many servers refuse or minimize ANY queries per RFC 8482)
dig target.com A
dig target.com MX
dig target.com TXT
dig target.com NS
# Zone transfer attempt (usually blocked but worth trying)
dig axfr @ns1.target.com target.com
# Reverse DNS on IP ranges
# If you find their IP range, reverse lookup all IPs
# (203.0.113.0/24 is a documentation range; substitute the target's real netblock)
for ip in $(seq 1 254); do
host "203.0.113.$ip" 2>/dev/null | grep "pointer"
done
# DNS history - find old/forgotten infrastructure
# securitytrails.com, viewdns.info
# SPF records reveal email infrastructure
dig +short target.com TXT | grep -i "v=spf1"
# DMARC policy
dig _dmarc.target.com TXT
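To show what an SPF record gives away, here is a small sketch that splits a record into its `include`, `ip4`, and `ip6` mechanisms and maps a few well-known include hosts to mail providers. The provider table is a short, hand-picked assumption rather than a complete list, and the function name is hypothetical. It expects a single TXT string; long records that `dig` prints as several quoted chunks would need to be joined first.

```python
from __future__ import annotations

# A few well-known SPF include hosts; extend this table as needed.
PROVIDER_HINTS = {
    "_spf.google.com": "Google Workspace",
    "spf.protection.outlook.com": "Microsoft 365",
    "amazonses.com": "Amazon SES",
}


def parse_spf(txt_record: str) -> dict[str, list[str]]:
    """Extract include/ip4/ip6 mechanisms and provider hints from one SPF record."""
    record = txt_record.strip().strip('"')
    result: dict[str, list[str]] = {"include": [], "ip4": [], "ip6": [], "providers": []}
    if not record.lower().startswith("v=spf1"):
        return result  # not an SPF record
    for term in record.split()[1:]:
        term = term.lstrip("+-~?")  # drop the qualifier, e.g. "-all" -> "all"
        key, _, value = term.partition(":")
        key = key.lower()
        if key in ("include", "ip4", "ip6") and value:
            result[key].append(value)
            hint = PROVIDER_HINTS.get(value.lower()) if key == "include" else None
            if hint:
                result["providers"].append(hint)
    return result


if __name__ == "__main__":
    # Made-up record using a documentation IP range.
    print(parse_spf('"v=spf1 include:_spf.google.com ip4:203.0.113.0/24 -all"'))
```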
Technology Stack Identification
# Identify what technology they're running
# Wappalyzer - Browser extension or CLI
# Shows CMS, frameworks, analytics, etc.
# WhatWeb - CLI tool
whatweb target.com -v
# BuiltWith - Extensive technology profiling
# https://builtwith.com/target.com
# Headers reveal server info
curl -I https://target.com
# Common indicators:
# X-Powered-By: PHP/7.4.3
# Server: nginx/1.18.0
# X-AspNet-Version: 4.0.30319
# Set-Cookie: JSESSIONID (Java)
# Set-Cookie: PHPSESSID (PHP)
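# Favicon hash for technology identification
curl -s https://target.com/favicon.ico | md5sum
# Compare against known technology favicons (e.g. the OWASP favicon database, which uses MD5)
# Note: Shodan's http.favicon.hash filter uses a MurmurHash3 value, not MD5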
# robots.txt and sitemap.xml
curl https://target.com/robots.txt
curl https://target.com/sitemap.xml
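The header indicators listed above can be checked mechanically. The following sketch takes the text printed by `curl -I` and returns a list of technology hints. It only knows the handful of headers and cookie names mentioned in this section, and the function name and sample response are hypothetical.

```python
from __future__ import annotations


def fingerprint_headers(raw_headers: str) -> list[str]:
    """Turn the text printed by `curl -I` into a list of technology hints."""
    hints: list[str] = []
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # status line or blank line, no header here
        name, value = name.strip().lower(), value.strip()
        if name in ("server", "x-powered-by"):
            hints.append(value)
        elif name == "x-aspnet-version":
            hints.append(f"ASP.NET {value}")
        elif name == "set-cookie":
            # The cookie name is everything before the first "=".
            cookie_name = value.split("=", 1)[0].strip().upper()
            if cookie_name == "JSESSIONID":
                hints.append("Java (JSESSIONID cookie)")
            elif cookie_name == "PHPSESSID":
                hints.append("PHP (PHPSESSID cookie)")
    return hints


if __name__ == "__main__":
    # Made-up response for illustration.
    sample = (
        "HTTP/1.1 200 OK\n"
        "Server: nginx/1.18.0\n"
        "X-Powered-By: PHP/7.4.3\n"
        "Set-Cookie: PHPSESSID=abc123; path=/\n"
    )
    print(fingerprint_headers(sample))
```

Header values are easy to spoof or strip, so treat each hint as a lead to confirm rather than a finding.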
Leaked Data & Credentials
Accessing leaked databases may be illegal in your jurisdiction. For authorized pentests, checking if employee credentials appear in breaches is valuable intelligence—but handle this data carefully and within scope.
# Check if company emails appear in breaches
# Have I Been Pwned - API
# https://haveibeenpwned.com/API/v3
curl -H "hibp-api-key: YOUR_KEY" \
"https://haveibeenpwned.com/api/v3/breachedaccount/user@target.com"
# DeHashed - Search leaked databases
# https://dehashed.com
# Requires subscription, very comprehensive
# Intelligence X
# https://intelx.io
# Historical data, pastes, leaked documents
# Snusbase
# https://snusbase.com
# Email/username/password search
# WHAT YOU'RE LOOKING FOR:
# • Plaintext passwords (try across services)
# • Password patterns (Summer2024! -> Winter2025!)
# • Old passwords + small modification
# • Email addresses not found elsewhere
Code Repository Mining
# Search GitHub/GitLab for leaked secrets
# Google dorks against GitHub - search for company secrets
site:github.com "target.com" password
site:github.com "target.com" api_key
site:github.com "target.com" secret
site:github.com "target.com" AWS_ACCESS_KEY_ID
# TruffleHog - Scan repos for secrets
trufflehog github --org=target-company
# GitDorker - Automated GitHub dorking
python3 GitDorker.py -tf TOKENSFILE -q target.com -d dorks/alldorksv3
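# Gitrob - GitHub org reconnaissance
# (syntax differs by version: older Ruby releases use "gitrob analyze", the Go rewrite takes the target directly)
gitrob analyze target-company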
# Common leaks found:
# • API keys (AWS, GCP, Azure)
# • Database connection strings
# • Internal URLs and endpoints
# • JWT secrets
# • SSH private keys
# • .env files in repos
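Secret scanners work largely by pattern matching over file contents. As a rough sketch of that idea (not how TruffleHog or any other tool above is actually implemented), the snippet below flags lines that look like AWS access key IDs, PEM private key headers, or hard-coded password assignments. The pattern set is a small illustrative assumption and will produce both false positives and misses.

```python
from __future__ import annotations

import re

# Illustrative patterns only; real scanners ship far larger, tuned rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)password\s*[:=]\s*['\"]?[^\s'\"]{4,}"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) for every line that matches a pattern."""
    findings: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    # Sample content; the key is the placeholder value used in AWS documentation.
    sample = (
        "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
        'db_password = "example-only"\n'
        "-----BEGIN RSA PRIVATE KEY-----\n"
        "# nothing sensitive on this line\n"
    )
    print(scan_text(sample))
```

The same check is worth running against your own organization's repositories before someone else does.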
Infrastructure Mapping
IP Range Discovery
# Find all IP space owned by target
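# ARIN/WHOIS for IP ownership
# ARIN indexes IPs, networks, and organizations, not domain names,
# so resolve a host first and look up its address (documentation IP shown)
whois -h whois.arin.net 203.0.113.10
whois -h whois.arin.net "n Company Name"
whois -h whois.arin.net "o Company Name"
# ARIN is one of five regional registries; also check RIPE, APNIC, LACNIC, and AFRINIC
# BGP information - ASN lookup
# Hurricane Electric BGP Toolkit: bgp.he.net
# List route objects originated by an ASN
whois -h whois.radb.net -- '-i origin AS12345'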
# Shodan - Search by organization (single-quote the filter so the inner quotes reach Shodan)
shodan search 'org:"Target Company"'
# Censys - Certificate-based discovery
# Search by organization in SSL certs
# IP ranges from DNS and other sources
# Compile all discovered IPs, find netblocks
# Check if cloud-hosted
# AWS IP ranges: https://ip-ranges.amazonaws.com/ip-ranges.json
# Azure: https://www.microsoft.com/en-us/download/details.aspx?id=56519
# GCP: https://www.gstatic.com/ipranges/cloud.json
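Checking whether a discovered IP is cloud-hosted comes down to a containment test against the published prefix lists. The sketch below does this for data in the layout of AWS's `ip-ranges.json` (a `prefixes` list whose entries carry `ip_prefix`, `region`, and `service`). The function name is hypothetical, and the sample uses documentation ranges rather than real AWS address space.

```python
from __future__ import annotations

import ipaddress


def match_aws_prefixes(ip: str, ranges: dict) -> list[dict]:
    """Return the prefix entries (ip-ranges.json layout) that contain `ip`."""
    address = ipaddress.ip_address(ip)
    matches: list[dict] = []
    for entry in ranges.get("prefixes", []):
        network = ipaddress.ip_network(entry["ip_prefix"])
        if address in network:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Made-up sample in the published layout; these are documentation
    # ranges, not real AWS prefixes.
    sample = {
        "prefixes": [
            {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
            {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1", "service": "S3"},
        ]
    }
    print(match_aws_prefixes("203.0.113.25", sample))
```

In practice you would load the downloaded JSON file with `json.load` and pass the resulting dictionary in. A match also matters for scoping, since cloud-hosted assets belong to the provider's network and usually fall under separate testing rules.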
Cloud Asset Discovery
# Cloud-specific reconnaissance
# S3 Bucket enumeration
# Common naming: target-backup, target-dev, target-assets
aws s3 ls s3://target-backup --no-sign-request 2>/dev/null
# Bucket finder tools
python3 cloud_enum.py -k target
# Checks: S3, Azure blobs, GCP buckets
# Azure tenant discovery
# https://login.microsoftonline.com/target.com/.well-known/openid-configuration
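# Google Workspace / Microsoft 365 discovery
# MX and SPF records often reveal if using Google or Microsoft
dig +short MX target.com
dig +short TXT target.com
# Office 365 tenant info (quote the URL so the shell does not treat ? as a glob)
curl "https://login.microsoftonline.com/getuserrealm.srf?login=user@target.com"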
# Cloud IP to provider mapping
# Determine if assets are in AWS/Azure/GCP based on IP
Document Metadata
# Extract intelligence from public documents
# Find documents on target site
site:target.com filetype:pdf
site:target.com filetype:docx
site:target.com filetype:xlsx
site:target.com filetype:pptx
# Download and extract metadata
wget https://target.com/annual-report.pdf
exiftool annual-report.pdf
# METADATA REVEALS:
# • Author names (real employee names)
# • Software versions (Word 2016, Acrobat 11)
# • Internal paths (C:\Users\jsmith\Documents)
# • Printer names (\\CORP-PRINT01)
# • Creation dates
# • Company name / department
# FOCA - Automated metadata extraction
# Windows tool, extracts from many doc types
# Metagoofil - CLI alternative
metagoofil -d target.com -t pdf,docx,xlsx -o output/
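Metadata becomes more useful once it is aggregated across many documents. The sketch below tallies people and software names from the JSON that `exiftool -json` prints (a list with one object per file). The tag names in the two tuples are assumptions: tags vary by file type, so adjust them to what your documents actually contain. The function name and sample data are hypothetical.

```python
from __future__ import annotations

import json
from collections import Counter

# Assumed tag names; exiftool reports different tags for different file types.
PEOPLE_TAGS = ("Author", "LastModifiedBy")
SOFTWARE_TAGS = ("Producer", "CreatorTool", "Software")


def summarize_metadata(exiftool_json: str) -> dict[str, Counter]:
    """Count people and software values across `exiftool -json` output."""
    summary = {"people": Counter(), "software": Counter()}
    for doc in json.loads(exiftool_json):
        for tag in PEOPLE_TAGS:
            if doc.get(tag):
                summary["people"][str(doc[tag])] += 1
        for tag in SOFTWARE_TAGS:
            if doc.get(tag):
                summary["software"][str(doc[tag])] += 1
    return summary


if __name__ == "__main__":
    # Made-up metadata for two files.
    sample = json.dumps([
        {"SourceFile": "annual-report.pdf", "Author": "jsmith", "Producer": "Acrobat 11"},
        {"SourceFile": "budget.xlsx", "Author": "jsmith", "LastModifiedBy": "mjones"},
    ])
    result = summarize_metadata(sample)
    print(result["people"].most_common())
    print(result["software"].most_common())
```

Names that recur across many files are good candidates for confirming the username convention seen in internal paths.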