@aw-junaid
Created January 30, 2026 19:12
Ultimate guide to OSINT techniques, tools, and methodologies for digital investigations including social media analysis, image forensics, geolocation, person search, and advanced search operators with practical examples and resources.

OSINT (Open Source Intelligence) Comprehensive Guide


Advanced Search Techniques

Google Dorking & Advanced Operators

Basic Operators:

# Site-specific search
site:example.com "password"
site:github.com "API_KEY"

# File type searches
filetype:pdf "confidential"
filetype:sql "backup"
filetype:env "password"
filetype:log "error"

# URL searches
inurl:admin "login"
inurl:config "password"
inurl:backup "database"

# Title searches
intitle:"index of" "parent directory"
intitle:"Dashboard" "admin"
intitle:"Login" "username"

# Cache search
# Google retired the cache: operator in 2024; check https://web.archive.org/ for archived copies
cache:example.com

Advanced Combinations:

# Find exposed credentials
site:pastebin.com "password" "@gmail.com"
site:trello.com "API" "key"
site:github.com "SECRET_KEY" "django"

# Find exposed databases
intitle:"index of" "database.sql"
filetype:sql "CREATE TABLE" "users"
inurl:"phpmyadmin" "index.php"

# Find exposed config files
site:example.com ext:env OR ext:cfg OR ext:conf
inurl:"wp-config.php" -inurl:wp-content
filetype:xml "password" "connectionString"

# Find exposed documents
filetype:pdf "confidential" "draft"
filetype:docx "internal use only"
filetype:xlsx "employee" "salary"

Real-World Examples:

# Find WordPress exposed users
site:example.com inurl:wp-json/wp/v2/users
site:example.com "/wp-content/uploads/"

# Find exposed Jenkins instances
intitle:"Dashboard [Jenkins]" "login"
inurl:":8080/jenkins" "login"

# Find exposed S3 buckets
inurl:"s3.amazonaws.com" "bucket"
site:s3.amazonaws.com filetype:xlsx

Specialized Search Engines

OSINT-Focused Search:

# Shodan (IoT/Device Search)
# Network devices, servers, webcams
country:IR port:22
product:"Apache httpd" "Server at"
# Shodan indexes public addresses only, so use a routable range (documentation range shown)
net:198.51.100.0/24

# Censys
# Certificates, devices, cloud infrastructure
services.port:443 AND services.service_name: HTTP
# Legacy Search 1.0 field syntax; current Censys releases use different field names
443.https.tls.certificate.parsed.names: "*.google.com"

# ZoomEye
# Chinese-focused OSINT engine
app:"nginx" country:"CN"
port:3389 os:"Windows"

# GreyNoise
# Identify scanning IPs
ip:"8.8.8.8" classification:benign

People Search Engines:

# Pipl
# Deep people search with associations
name:"John Smith" location:"New York"

# Spokeo
# Address, phone, email lookups
phone:"555-123-4567"

# TruePeopleSearch
# Free people search (US focused)
name city state

# That's Them
# Reverse phone/address lookups

Document & File Search:

# FilePursuit
# Search across file hosting sites
"confidential report" filetype:pdf

# DocSearch
# Academic and technical documents
"proprietary information"

# SlideShare Search
# Business presentations
filetype:ppt "financial projections"

Boolean Search Strings for Social Media

LinkedIn Advanced Search:

# Find employees by role
"software engineer" AND "Google" AND "San Francisco"
"director" AND "Amazon" AND "Seattle"

# Find by skills
"Python" AND "machine learning" AND "IBM"
"cybersecurity" AND "CISSP" AND "bank"

# Company connections
current:"Microsoft" past:"Google" education:"Stanford"

# Location based
location:"New York City" AND "product manager"
postalcode:"10001" AND "recruiter"

Twitter/X Search Operators:

# User searches
from:elonmusk since:2023-01-01
to:twitter until:2023-12-31

# Content searches
"data breach" filter:links
#cybersecurity filter:images

# Location searches
geocode:40.7128,-74.0060,10km
near:"New York" within:15mi

# Engagement searches
min_faves:1000 min_retweets:500
filter:native_video

Facebook Graph Search Alternatives:

# Use site: operator
site:facebook.com "works at" "Apple"
site:facebook.com "lives in" "London"

# Photo searches
site:facebook.com/photo.php?fbid=
site:facebook.com/media/set/?set=

# Event searches
site:facebook.com/events/ "tech conference"

Social Media Intelligence (SOCMINT)

Platform-Specific Techniques

Instagram Investigation:

# User enumeration
# Use: https://www.picuki.com/
# Or: https://imginn.com/

# Hashtag analysis
# Track location tags
#locationname
#cityname

# Story analysis
# Use: https://storiesig.com/
# Or browser extensions

# EXIF data from images
# Instagram strips most metadata on upload, so this mainly helps with originals obtained elsewhere
exiftool instagram_photo.jpg | grep -i "gps\|location"

TikTok OSINT:

# User discovery
https://www.tiktok.com/@username
https://www.tiktok.com/tag/hashtag

# Video metadata
# Use: https://tokviz.com/
# Or: https://tikbuddy.com/

# Location tracking
# Analyze geotags in videos
# Check bio locations
# Cross-reference with other platforms

Telegram Intelligence:

# Channel discovery
https://t.me/s/channelname
https://telemetr.io/en/channels

# Member extraction
# Use: TGStat, Telepathy tools
# Telepathy takes the channel with -t (target); -c enables a comprehensive scan
telepathy -t channelname

# Message analysis (the type check skips messages whose text is a list of fragments)
jq '.messages[] | select((.text | type) == "string" and (.text | contains("keyword")))' messages.json
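
The same keyword filter can be sketched in Python, which makes it easier to handle messages whose text is split into fragments. This assumes the Telegram Desktop JSON export layout, where `text` is either a plain string or a list mixing strings and objects with a `text` key; `flatten_text` and `find_messages` are hypothetical helper names, and the inline sample stands in for a real `messages.json`.

```python
import json


def flatten_text(text):
    """Return message text as one string (assumed export layout: str or list of fragments)."""
    if isinstance(text, str):
        return text
    parts = []
    for item in text:
        # Fragments are plain strings or dicts such as {"type": "link", "text": "..."}
        parts.append(item if isinstance(item, str) else item.get("text", ""))
    return "".join(parts)


def find_messages(export, keyword):
    """Yield messages whose flattened text contains keyword, ignoring case."""
    needle = keyword.lower()
    for message in export.get("messages", []):
        if needle in flatten_text(message.get("text", "")).lower():
            yield message


# Inline sample standing in for the contents of messages.json
sample = json.loads("""
{"messages": [
  {"id": 1, "text": "Meeting at noon"},
  {"id": 2, "text": ["See ", {"type": "link", "text": "https://example.com/keyword"}]},
  {"id": 3, "text": ""}
]}
""")

matches = [m["id"] for m in find_messages(sample, "keyword")]
print(matches)
```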

Discord Investigation:

# Server discovery
# Use: disboard.org, discordservers.com
# Search for invite codes on GitHub

# User profiling
# Check connections with other platforms
# Analyze message patterns

# Bot analysis
# Investigate bot permissions and activities

Cross-Platform Correlation

Username Enumeration:

# Using Sherlock (pip install sherlock-project)
sherlock username

# Using Whatsmyname
python3 whatsmyname.py -u username

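# Custom script for cross-reference
import requests

username = "target_username"  # placeholder
profile_urls = {
    "twitter": "https://twitter.com/{}",
    "github": "https://github.com/{}",
    "instagram": "https://www.instagram.com/{}/",
    "linkedin": "https://www.linkedin.com/in/{}",
}
for platform, template in profile_urls.items():
    url = template.format(username)
    response = requests.get(url, timeout=10)
    # A 200 is only a hint: some sites return 200 or a login page for missing users
    if response.status_code == 200:
        print(f"[+] Possible match on {platform}: {url}")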

Email-Based Discovery:

# Using holehe
holehe email@example.com

# Using GHunt (v2 installs a ghunt command)
ghunt email email@example.com

# Manual checks
# Check breach databases: HaveIBeenPwned
# Check GitHub commits for email
# Search for email in code repositories

Image & Video Analysis

Reverse Image Search

Multiple Engine Search:

# Google Lens
https://lens.google.com/upload

# Yandex Images (Excellent for faces)
https://yandex.com/images/

# TinEye
https://tineye.com/

# Bing Visual Search
https://www.bing.com/visualsearch

# Baidu Images (Chinese content)
https://image.baidu.com/

EXIF Data Extraction:

# Using exiftool
exiftool image.jpg

# Extract GPS coordinates
exiftool -GPSLatitude -GPSLongitude image.jpg

# Extract creation date
exiftool -CreateDate -ModifyDate image.jpg

# Extract camera info
exiftool -Make -Model -Software image.jpg

# Remove metadata
exiftool -all= image.jpg
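
By default exiftool prints GPS values in degrees, minutes, and seconds (for example `51 deg 30' 26.64" N`), while most mapping tools expect signed decimal degrees. The sketch below shows the conversion; `dms_to_decimal` is a hypothetical helper name, and the pattern assumes exiftool's default text format (running exiftool with `-n` prints decimal values directly).

```python
import re

# Matches strings such as: 51 deg 30' 26.64" N
DMS_PATTERN = re.compile(
    r"""(?P<deg>\d+(?:\.\d+)?)\s*deg\s*
        (?P<min>\d+(?:\.\d+)?)'\s*
        (?P<sec>\d+(?:\.\d+)?)"\s*
        (?P<ref>[NSEW])""",
    re.VERBOSE,
)


def dms_to_decimal(value):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.search(value)
    if match is None:
        raise ValueError(f"Unrecognized coordinate: {value!r}")
    degrees = (
        float(match["deg"])
        + float(match["min"]) / 60
        + float(match["sec"]) / 3600
    )
    # South and West are negative in decimal notation
    return -degrees if match["ref"] in "SW" else degrees


latitude = dms_to_decimal("51 deg 30' 26.64\" N")
longitude = dms_to_decimal("0 deg 7' 40.08\" W")
print(round(latitude, 4), round(longitude, 4))
```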

Image Forensics:

# Error Level Analysis (ELA)
# Use: https://fotoforensics.com/
# Or: Forensically (https://29a.ch/photo-forensics/)

# Clone detection
# Use: https://www.imageforensic.org/

# Neural network analysis
# Use: https://aiforensic.io/

Video Analysis

YouTube Investigations:

# Channel analysis
https://www.youtube.com/@username
https://socialblade.com/youtube/user/username

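# Video metadata (yt-dlp is the actively maintained fork and accepts the same options)
youtube-dl --list-formats VIDEO_URL
youtube-dl --get-description VIDEO_URL

# Comment analysis (pip install youtube-comment-downloader)
youtube-comment-downloader --url VIDEO_URL --output comments.json

# Thumbnail extraction (skip downloading the video itself)
youtube-dl --write-thumbnail --skip-download VIDEO_URL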

Live Stream Geolocation:

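# Analyze timestamps and shadows
# Assumes the suncalc package from PyPI (suncalc-py)
import suncalc
from datetime import datetime, timezone

# Calculate sun position (date must be UTC; longitude comes before latitude)
sun_position = suncalc.get_position(
    datetime.now(timezone.utc),
    -0.1278,  # Longitude
    51.5074   # Latitude
)
# Returns azimuth and altitude in radians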

# Match with video shadows

Geolocation Intelligence (GEOINT)

Satellite & Aerial Imagery

Free Satellite Imagery:

# Google Earth Pro
# Historical imagery, measurement tools

# Sentinel Hub
https://apps.sentinel-hub.com/eo-browser/

# NASA Worldview
https://worldview.earthdata.nasa.gov/

# Planet Labs
https://www.planet.com/explorer/

# USGS EarthExplorer
https://earthexplorer.usgs.gov/

Street View & 360° Imagery:

# Google Street View
# Historical views, business info

# Mapillary (Open Street View)
https://www.mapillary.com/

# Bing Streetside
# Alternative to Google

# 360° panorama sites
https://www.360cities.net/
https://kuula.co/

Flight & Marine Tracking

Flight Tracking:

# ADS-B Exchange
https://globe.adsbexchange.com/

# FlightRadar24
https://www.flightradar24.com/

# FlightAware
https://flightaware.com/

# OpenSky Network
https://opensky-network.org/

# Specific flight search
# IATA codes: https://www.iata.org/

Marine Vessel Tracking:

# MarineTraffic
https://www.marinetraffic.com/

# VesselFinder
https://www.vesselfinder.com/

# AIS Hub
https://www.aishub.net/

# Satellite AIS
https://www.spire.com/maritime/

Wi-Fi & Bluetooth Tracking

Wi-Fi Positioning:

# WiGLE
https://wigle.net/

# WiGLE Android App
# Wardriving data collection

# Geolocation via SSID
# Search for business/network names

Bluetooth Beacon Mapping:

# Bluetooth scanners
hcitool scan
bluetoothctl scan on

# Analyze with Ubertooth
ubertooth-scan -f 2402

Business & Corporate Intelligence

Company Research

Corporate Structure:

# Official registries
# US: https://www.sec.gov/edgar/searchedgar/companysearch
# UK: https://find-and-update.company-information.service.gov.uk/
# EU: https://e-justice.europa.eu/

# Business directories
https://www.zoominfo.com/
https://www.linkedin.com/company/
https://crunchbase.com/

# Financial data
https://www.bloomberg.com/
https://finance.yahoo.com/
https://markets.ft.com/

Supply Chain Analysis:

# Import/export data
https://www.importyeti.com/
https://panjiva.com/

# Shipping manifests
# Use port authority websites
# Container tracking

# Supplier discovery
https://www.thomasnet.com/
https://www.alibaba.com/

Employee Discovery

Professional Networks:

# LinkedIn scraping (ethical boundaries!)
# Use official API or manual research

# GitHub organization analysis (unauthenticated requests list public members only)
curl https://api.github.com/orgs/company/members

# Conference speakers
site:speakerdeck.com "company name"
site:slideshare.net "company name"

# Research papers
site:researchgate.net "company"
site:academia.edu "employee name"

Email Pattern Discovery:

# Common patterns
first.last@company.com
firstl@company.com
flast@company.com
f.last@company.com

# Verification tools
https://hunter.io/
https://clearbit.com/
https://rocketreach.co/

# Breach data cross-reference
https://haveibeenpwned.com/
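
The common patterns listed above can be generated from a name and domain for an authorized engagement and then confirmed with one of the verification tools. This is a minimal sketch; `candidate_addresses` is a hypothetical helper, and it covers only the four patterns shown.

```python
def candidate_addresses(first, last, domain):
    """Return candidate addresses for the four common corporate patterns."""
    first, last = first.strip().lower(), last.strip().lower()
    if not first or not last:
        raise ValueError("first and last name are required")
    local_parts = [
        f"{first}.{last}",     # first.last
        f"{first}{last[0]}",   # firstl
        f"{first[0]}{last}",   # flast
        f"{first[0]}.{last}",  # f.last
    ]
    return [f"{local}@{domain}" for local in local_parts]


candidates = candidate_addresses("John", "Doe", "example.com")
for address in candidates:
    print(address)
```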

Technical Intelligence (TECHINT)

Domain & Infrastructure Analysis

Domain Research:

# WHOIS lookup
whois example.com
https://who.is/
https://whois.domaintools.com/

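# DNS enumeration
# Many servers return minimal answers to ANY (RFC 8482), so also query record types individually
dig example.com ANY
dig +short example.com A
dig +short example.com MX
nslookup -type=any example.com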

# Subdomain discovery
subfinder -d example.com
amass enum -d example.com
assetfinder --subs-only example.com

# Historical DNS
https://securitytrails.com/
https://viewdns.info/

Certificate Analysis:

# Certificate Transparency Logs
https://crt.sh/
https://censys.io/certificates

# SSL certificate details
openssl s_client -connect example.com:443 -servername example.com
sslscan example.com

# Certificate fingerprinting
https://sslbl.abuse.ch/
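
Certificate Transparency results are a convenient source of hostnames. The sketch below assumes the JSON shape returned by crt.sh when `output=json` is requested, where each entry has a `name_value` field holding one or more newline-separated names; `hostnames_from_ct` is a hypothetical helper, and the inline sample stands in for a downloaded response.

```python
import json


def hostnames_from_ct(entries, domain):
    """Collect unique hostnames under domain from CT search results."""
    domain = domain.lower()
    suffix = "." + domain
    names = set()
    for entry in entries:
        # name_value may hold several names separated by newlines
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its base name
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)


# Inline sample standing in for a saved CT log response
sample = json.loads("""[
  {"name_value": "www.example.com\\nexample.com"},
  {"name_value": "*.dev.example.com"},
  {"name_value": "mail.example.org"}
]""")

hosts = hostnames_from_ct(sample, "example.com")
print(hosts)
```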

Network Infrastructure

IP Address Research:

# IP geolocation
https://ipinfo.io/
https://www.maxmind.com/

# ASN information
whois -h whois.radb.net AS15169
https://bgp.he.net/

# Abuse contact lookup
whois -h whois.abuse.net 8.8.8.8

Cloud Infrastructure:

# AWS resources
# S3 buckets: companyname.s3.amazonaws.com
# CloudFront: *.cloudfront.net

# Azure resources
# *.azurewebsites.net
# *.blob.core.windows.net

# Google Cloud
# *.appspot.com
# *.googleusercontent.com

Data Analysis & Visualization

Timeline Analysis

Creating Digital Timelines:

from datetime import datetime
import pandas as pd

# Collect timestamps from various sources
timeline_data = [
    {"date": "2023-01-15", "event": "Social media post", "source": "Twitter"},
    {"date": "2023-02-20", "event": "Blog article", "source": "WordPress"},
    {"date": "2023-03-10", "event": "Conference talk", "source": "YouTube"},
]

# Create timeline
df = pd.DataFrame(timeline_data)
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values('date')

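# Visualize
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
for i, row in df.iterrows():
    plt.plot(row['date'], i, 'o', label=row['event'])
plt.legend()   # show the event labels
plt.xlabel('Date')
plt.yticks([])  # the vertical position only separates the points
plt.show()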
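
Gaps between events are often as informative as the events themselves, for example a long silence between posts. The sketch below uses only the standard library and the same sample dates as above; `gaps_between_events` is a hypothetical helper.

```python
from datetime import date

events = [
    ("2023-03-10", "Conference talk"),
    ("2023-01-15", "Social media post"),
    ("2023-02-20", "Blog article"),
]


def gaps_between_events(events):
    """Return (earlier_event, later_event, days_between) for consecutive events."""
    ordered = sorted((date.fromisoformat(day), label) for day, label in events)
    gaps = []
    for (prev_date, prev_label), (next_date, next_label) in zip(ordered, ordered[1:]):
        gaps.append((prev_label, next_label, (next_date - prev_date).days))
    return gaps


for earlier, later, days in gaps_between_events(events):
    print(f"{earlier} -> {later}: {days} days")
```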

Network Mapping

Relationship Mapping:

import networkx as nx
import matplotlib.pyplot as plt

# Create relationship graph
G = nx.Graph()

# Add nodes (people, companies, locations)
G.add_node("Person A", type="person")
G.add_node("Company X", type="company")
G.add_node("Location Y", type="location")

# Add relationships
G.add_edge("Person A", "Company X", relationship="works_at")
G.add_edge("Person A", "Location Y", relationship="lives_in")

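# Visualize
pos = nx.spring_layout(G)
nx.draw(G, pos, with_labels=True, node_color='lightblue')
# Label each edge with its relationship attribute
nx.draw_networkx_edge_labels(G, pos, edge_labels=nx.get_edge_attributes(G, 'relationship'))
plt.show()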
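
Once relationships are recorded, a common question is how two entities are connected. The sketch below answers it with a breadth-first search over a plain adjacency dictionary so that it runs without extra packages; networkx offers `nx.shortest_path` for the same purpose. "Person B" is a hypothetical node added for illustration.

```python
from collections import deque

edges = [
    ("Person A", "Company X", "works_at"),
    ("Person A", "Location Y", "lives_in"),
    ("Person B", "Company X", "works_at"),  # hypothetical extra node
]


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, target, relationship) tuples."""
    adjacency = {}
    for source, target, _relationship in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the list of nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(adjacency.get(path[-1], ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


adjacency = build_adjacency(edges)
print(shortest_path(adjacency, "Person B", "Location Y"))
```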

Data Correlation

Cross-Platform Correlation:

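import pandas as pd
from datetime import datetime
from difflib import SequenceMatcher

# Load data from different sources (each CSV is assumed to have 'text' and 'date' columns)
twitter_data = pd.read_csv('twitter_posts.csv')
instagram_data = pd.read_csv('instagram_posts.csv')
linkedin_data = pd.read_csv('linkedin_posts.csv')

# Normalize timestamps
def normalize_date(date_str):
    return datetime.strptime(date_str, '%Y-%m-%d %H:%M:%S')

# Simple text similarity; the 0.8 threshold is an arbitrary starting point
def similar_content(post_a, post_b, threshold=0.8):
    ratio = SequenceMatcher(None, str(post_a['text']), str(post_b['text'])).ratio()
    return ratio >= threshold

# Find correlations (iterrows yields rows; iterating a DataFrame directly yields column names)
common_patterns = []
for _, t_post in twitter_data.iterrows():
    for _, i_post in instagram_data.iterrows():
        if similar_content(t_post, i_post):
            common_patterns.append({
                'twitter': t_post,
                'instagram': i_post,
                'timestamp': normalize_date(t_post['date'])
            })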

Privacy & Anonymity Techniques

Safe Investigation Practices

Browser Isolation:

# Use dedicated virtual machines
# Whonix: https://www.whonix.org/
# Tails: https://tails.net/

# Browser isolation
# Use: https://github.com/arkenfox/user.js
# Or dedicated browser profiles

# VPN and proxy chains
# Multiple hop proxies
# Residential proxies for sensitive searches

Search Anonymization:

# Privacy-focused search engines
https://duckduckgo.com/
https://startpage.com/
https://searx.space/

# Anonymous viewing
# Textise dot iitty (text-only page rendering)
https://www.textise.net/
https://archive.is/

Data Protection

Secure Data Handling:

# Encryption for sensitive data
from cryptography.fernet import Fernet

key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt data
encrypted_data = cipher.encrypt(b"Sensitive information")

# Decrypt data
decrypted_data = cipher.decrypt(encrypted_data)

Secure Storage:

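# Encrypted containers (text mode; VeraCrypt containers conventionally use .hc, .tc is the TrueCrypt extension)
veracrypt --text --create volume.hc --size=100M --encryption=AES --hash=SHA-512 --filesystem=FAT

# Secure deletion (overwriting is unreliable on SSDs and on journaling or copy-on-write filesystems)
shred -u sensitive_file.txt
# Or use: wipe, srm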

OSINT Tools & Frameworks

Automated OSINT Suites

SpiderFoot:

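# Install and run (SpiderFoot is run from its source checkout)
git clone https://github.com/smicallef/spiderfoot.git
cd spiderfoot
pip3 install -r requirements.txt
python3 sf.py -l 127.0.0.1:5001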

# Web interface: http://127.0.0.1:5001
# Use modules:
- sfp_dns
- sfp_whois
- sfp_social
- sfp_leaks

Maltego:

# Transform sets for OSINT
# Built-in transforms:
- Domain analysis
- Email investigation
- Social media correlation
- Network mapping

# Custom transforms via the Maltego Transform Hub (Paterva is the vendor's former name)

OSINT Framework:

# Web-based resource directory
https://osintframework.com/

# Categories:
- Username search
- Email search
- Image analysis
- Social networks
- Documents

Specialized Tools

Image Analysis Tools:

# Forensically
https://29a.ch/photo-forensics/

# Jeffrey's Image Metadata Viewer
https://exif.regex.info/exif.cgi

# Ghiro
https://www.getghiro.org/

# Amped Authenticate
https://ampedsoftware.com/authenticate

Mobile App Analysis:

# APK decompilation
apktool d app.apk
jadx app.apk

# Traffic analysis
mitmproxy -p 8080
# Configure device proxy

# App metadata
aapt dump badging app.apk

Email Analysis:

# Email header analysis
python3 email-header-analyzer.py email.eml

# Breach correlation
https://dehashed.com/
https://leak-lookup.com/

# Email pattern generation
python3 email-guesser.py "John Doe" "company.com"

Legal & Ethical Considerations

Legal Frameworks

Jurisdictional Considerations:

# GDPR (Europe)
# Requires lawful basis for data processing

# CCPA (California)
# Consumer privacy rights

# Local privacy laws
# Check jurisdiction-specific regulations

Terms of Service Compliance:

# Always check platform ToS
# Common restrictions:
- Automated scraping
- Data aggregation
- Commercial use
- Privacy violations

Ethical Guidelines

OSINT Code of Ethics:

ethics_checklist = {
    "purpose": "Is the investigation justified?",
    "consent": "Is informed consent obtained?",
    "privacy": "Are privacy rights respected?",
    "minimization": "Is data collection minimized?",
    "accuracy": "Is information verified?",
    "security": "Is data properly secured?",
    "disclosure": "Is reporting responsible?",
    "legal": "Is activity legal?",
}
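
A checklist like this is most useful when it blocks work until every question has an explicit "yes". The sketch below shows one way to enforce that; `unresolved_items` and the sample answers are hypothetical, and the keys mirror the checklist above.

```python
ethics_checklist = {
    "purpose": "Is the investigation justified?",
    "consent": "Is informed consent obtained?",
    "privacy": "Are privacy rights respected?",
    "minimization": "Is data collection minimized?",
    "accuracy": "Is information verified?",
    "security": "Is data properly secured?",
    "disclosure": "Is reporting responsible?",
    "legal": "Is activity legal?",
}


def unresolved_items(answers, checklist=ethics_checklist):
    """Return checklist keys that were not explicitly answered True."""
    return [key for key in checklist if answers.get(key) is not True]


# Sample answers: one explicit "no" and one question left unanswered
answers = {key: True for key in ethics_checklist}
answers["consent"] = False
del answers["disclosure"]

for key in unresolved_items(answers):
    print(f"Unresolved: {ethics_checklist[key]}")
```

Treating a missing answer the same as a "no" keeps unanswered questions from being skipped silently.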

Responsible Reporting:

# Vulnerability disclosure
1. Document finding
2. Verify impact
3. Contact responsible party
4. Allow time for remediation
5. Public disclosure (if appropriate)

# Law enforcement cooperation
# When to involve authorities
# Proper evidence handling

Training & Resources

Learning Platforms

Free Training:

# SANS OSINT Summit
https://www.sans.org/cyber-security-summit/archives/osint-summit-2022

# OSINT Curious
https://www.osintcurio.us/

# Bellingcat Training
https://www.bellingcat.com/resources/

# Trace Labs
https://www.tracelabs.org/

Certifications:

# SANS SEC487
https://www.sans.org/cyber-security-courses/open-source-intelligence-gathering/

# Crest Practitioner
https://www.crest-approved.org/

# OSINT Combine Academy
https://academy.osintcombine.com/

Practice Platforms

CTF & Challenges:

# OSINT CTF
https://ctf.osint.guru/

# Trace Labs Search Party
https://www.tracelabs.org/initiatives/search-party

# Hack The Box Challenges
https://www.hackthebox.com/

# TryHackMe Rooms
https://tryhackme.com/hacktivities?tab=search&search=osint

Real-World Practice:

practice_scenarios = {
    "missing_person": "Use OSINT to locate",
    "business_intel": "Competitor analysis",
    "threat_hunting": "Identify malicious actors",
    "geolocation": "Image location challenge",
    "social_analysis": "Profile investigation",
}

Quick Reference Cheat Sheet

Essential Commands

# WHOIS lookup
whois domain.com
whois 8.8.8.8

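# DNS enumeration (many servers return minimal answers to ANY per RFC 8482; query specific types if results look empty)
dig domain.com ANY
nslookup -type=any domain.com
host -a domain.com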

# Subdomain discovery
subfinder -d domain.com
amass enum -d domain.com

# Reverse image search (Google Lens has no documented upload API; upload through the browser)
https://lens.google.com/

# Email verification
holehe email@example.com

Useful Websites

# General OSINT
https://osintframework.com/
https://start.me/p/1kvvxN/

# People Search
https://thatsthem.com/
https://fastpeoplesearch.com/

# Business Intelligence
https://opencorporates.com/
https://www.sec.gov/edgar/searchedgar/companysearch

# Image Analysis
https://fotoforensics.com/
https://29a.ch/photo-forensics/

# Flight Tracking
https://globe.adsbexchange.com/
https://www.flightradar24.com/

Browser Extensions

# Search Enhancement
- Selection Search
- Search by Image
- Multiple Search

# Privacy
- uBlock Origin
- Privacy Badger
- HTTPS Everywhere (retired; use the browser's built-in HTTPS-Only mode)

# Investigation
- EXIF Viewer
- Wappalyzer
- FakerFace
