Getting Started with MITRE ATT&CK: Fetching and Processing Data Like a Pro

Hey there, fellow threat hunters! 👋 Today, we're diving into something that every security professional should have in their toolkit - working with MITRE ATT&CK data programmatically. If you've been manually browsing the MITRE website to look up techniques, it's time to level up your game!

What's MITRE ATT&CK Anyway?

Before we get our hands dirty with code, let's quickly understand what we're dealing with. MITRE ATT&CK is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations. Think of it as the "bad guys' playbook" that we use to improve our defenses.

For more information you can look at our MITRE ATT&CK Fundamentals.

Project Setup

First things first, we'll need a few ingredients for our cyber-soup:

  • Python (because we're not savages 😉)
  • The mitreattack-python library
  • Basic understanding of JSON (or at least the ability to pretend you do)
  • Coffee ☕ (optional but highly recommended)

Let's set up our project structure:

mitre_project/
├── main.py
├── config.py
├── loader.py
└── cache/
    └── enterprise-attack.json

The Code Breakdown

Our project keeps things neat and organized. Let's break down each component and see what makes it tick.

config.py - The Settings Master

This is where we keep all our configuration settings neat and tidy:

import os
import logging

# Paths
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
CACHE_DIR = os.path.join(BASE_DIR, 'cache')

# Cache File paths
TECH_PATH = os.path.join(CACHE_DIR, 'all_attack_techniques.json')

def setup_logging():
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[logging.StreamHandler()]
    )

The config file keeps everything organized and makes sure we're consistent with where we store our data. It also sets up our logging configuration because, let's face it, print statements are so 2010.
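To make the logging setup a bit more concrete, here is a minimal standalone sketch of how a module might use it. It repeats setup_logging so it runs on its own; the logger name "loader" and the two messages are illustrative assumptions, not part of the project.

```python
import logging


def setup_logging():
    # Same configuration as config.py, repeated so this sketch runs standalone.
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[logging.StreamHandler()]
    )


setup_logging()

# A named logger per module fills in the %(name)s field of the format string.
# The name "loader" is illustrative only.
logger = logging.getLogger("loader")
logger.info("Logging is configured")
logger.debug("DEBUG messages are below the INFO threshold set above")
```

In a real module you would typically call logging.getLogger(__name__) instead of hard-coding the name, so each log line shows which file it came from.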

loader.py - The Data Fetcher

Here's where the real magic starts. Our loader downloads the Enterprise ATT&CK STIX bundle with requests, caches it on disk, and hands it to the official mitreattack-python library for processing:

import json
import logging
import os

import requests
from mitreattack.stix20 import MitreAttackData

logger = logging.getLogger(__name__)

def load_attack_data(use_cache: bool = True) -> MitreAttackData:
    """Initialize MitreAttackData with STIX data"""
    stix_path = os.path.join(os.path.dirname(__file__), 'cache', 'enterprise-attack.json')

    # Create the cache directory on first run
    os.makedirs(os.path.dirname(stix_path), exist_ok=True)

    if os.path.exists(stix_path) and use_cache:
        logger.info("Loading STIX data from cache")
    else:
        logger.info("Downloading latest STIX data")
        url = "https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json"
        # A timeout keeps the script from hanging on a stalled connection
        response = requests.get(url, timeout=60)
        response.raise_for_status()

        with open(stix_path, 'w', encoding='utf-8') as f:
            json.dump(response.json(), f, ensure_ascii=False, indent=2)

    # Parse the STIX bundle on disk into a queryable MitreAttackData object
    return MitreAttackData(stix_filepath=stix_path)

The loader handles several crucial tasks:

  • Fetching the latest STIX data from MITRE's GitHub repository
  • Managing local caching (because waiting for downloads is like watching paint dry)
  • Converting STIX objects into Python-friendly dictionaries
  • Extracting technique information with proper references
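On the caching point, one possible refinement is to treat an old cache file as stale so the data gets refreshed now and then. The sketch below is an assumption on our side, not part of the loader: the helper name cache_is_fresh and the 30-day default are hypothetical.

```python
import os
import time


def cache_is_fresh(path: str, max_age_days: float = 30.0) -> bool:
    """Return True if `path` exists and was modified within `max_age_days`."""
    if not os.path.exists(path):
        return False
    age_seconds = time.time() - os.path.getmtime(path)
    return age_seconds <= max_age_days * 24 * 60 * 60


# Hypothetical usage: the path mirrors the project layout shown earlier.
print(cache_is_fresh(os.path.join("cache", "enterprise-attack.json")))
```

The result could then be passed straight through as load_attack_data(use_cache=cache_is_fresh(stix_path)), which keeps the loader itself unchanged.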

Here's how we handle technique extraction:

def get_techniques(attack: MitreAttackData) -> list:
    """Get all techniques from MITRE ATT&CK"""
    techniques = []
    for technique in attack.get_techniques():
        technique_id = None
        external_references = []

        if hasattr(technique, 'external_references'):
            external_references = make_json_serializable(technique.external_references)
            # The ATT&CK ID (e.g. T1566) lives in the 'mitre-attack' reference
            for ref in external_references:
                if ref.get('source_name') == 'mitre-attack':
                    technique_id = ref.get('external_id')
                    break

        # Skip objects that carry no ATT&CK ID
        if technique_id:
            techniques.append({
                "technique_id": technique_id,
                "id": technique.id,  # STIX ID (attack-pattern--<uuid>)
                "name": technique.name,
                # description is optional in STIX, so fall back to an empty string
                "description": getattr(technique, "description", ""),
                "external_references": external_references,
                "groups": [],  # placeholders; relationships are not mapped here
                "mitigations": []
            })
    return techniques
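Once you have that list of dictionaries, a couple of small helpers make it much easier to work with. The following is a rough sketch using hand-written sample records in the same shape; the helper names index_by_id and group_subtechniques are hypothetical and not part of the project.

```python
from collections import defaultdict

# Hand-written sample records in the shape produced by get_techniques().
sample_techniques = [
    {"technique_id": "T1059", "name": "Command and Scripting Interpreter"},
    {"technique_id": "T1059.001", "name": "PowerShell"},
    {"technique_id": "T1566", "name": "Phishing"},
]


def index_by_id(techniques: list) -> dict:
    """Map each ATT&CK ID (e.g. 'T1059') to its technique dictionary."""
    return {t["technique_id"]: t for t in techniques}


def group_subtechniques(techniques: list) -> dict:
    """Map each parent technique ID to the IDs of its sub-techniques."""
    children = defaultdict(list)
    for t in techniques:
        parent, dot, _ = t["technique_id"].partition(".")
        if dot:  # IDs such as 'T1059.001' contain a dot; parent IDs do not
            children[parent].append(t["technique_id"])
    return dict(children)


by_id = index_by_id(sample_techniques)
subs = group_subtechniques(sample_techniques)
print(by_id["T1566"]["name"])
print(subs)
```

A dictionary keyed by ATT&CK ID turns lookups into a single step, which pays off quickly when you are cross-referencing detections or reports against hundreds of techniques.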

Making It All Work Together

In main.py, we bring everything together:

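import logging

from config import setup_logging
from loader import load_all_data  # assumed to live in loader.py; not shown in this post

if __name__ == "__main__":
    # Set up logging
    setup_logging()
    logging.getLogger().setLevel(logging.INFO)

    # Load all data including the attack object
    data = load_all_data()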
When you run this, you'll see something like:

2024-12-22 14:07:29,945 - loader - INFO - Loading STIX data from cache
2024-12-22 14:07:35,539 - __main__ - INFO - Loaded 799 techniques
2024-12-22 14:07:35,539 - __main__ - INFO - Loaded 174 groups
2024-12-22 14:07:35,539 - __main__ - INFO - Loaded 268 mitigations
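The load_all_data function itself isn't shown in this post, so here is a hedged sketch of how count lines like these could be produced. The helper name log_counts is hypothetical; get_techniques, get_groups, and get_mitigations are methods of MitreAttackData, and the FakeAttack class is only a stand-in so the snippet runs without the library or any downloaded data.

```python
import logging

logger = logging.getLogger(__name__)


def log_counts(attack) -> dict:
    """Count techniques, groups, and mitigations and log one line per type.

    `attack` is expected to be a MitreAttackData instance, or any object
    exposing the same three getter methods.
    """
    counts = {
        "techniques": len(attack.get_techniques()),
        "groups": len(attack.get_groups()),
        "mitigations": len(attack.get_mitigations()),
    }
    for label, count in counts.items():
        logger.info("Loaded %d %s", count, label)
    return counts


class FakeAttack:
    """Tiny stand-in so the sketch runs without downloading STIX data."""

    def get_techniques(self):
        return ["t1", "t2", "t3"]

    def get_groups(self):
        return ["g1"]

    def get_mitigations(self):
        return ["m1", "m2"]


print(log_counts(FakeAttack()))
```

With real data you would pass in the object returned by load_attack_data() instead of the stand-in; the numbers will vary with the ATT&CK release you have cached.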

Error Handling

One thing we've learned the hard way: the library hands back STIX objects rather than plain dictionaries, and json.dump chokes on them. Here's how we handle that:

def make_json_serializable(obj):
    """Convert STIX objects to dictionaries"""
    if hasattr(obj, 'serialize'):
        return json.loads(obj.serialize())
    elif isinstance(obj, dict):
        return {k: make_json_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [make_json_serializable(i) for i in obj]
    return obj

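This function walks nested dictionaries and lists recursively and converts any object that has a serialize() method into plain, JSON-compatible data, so the result can be handed to json.dump no matter how deeply MITRE nests things.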
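To see the helper in action without loading any ATT&CK data, here is a small standalone sketch. The function is repeated so the snippet runs on its own, and FakeReference is a hypothetical stand-in for a STIX object (anything with a serialize() method that returns a JSON string).

```python
import json


def make_json_serializable(obj):
    """Same helper as above, repeated so this sketch runs standalone."""
    if hasattr(obj, 'serialize'):
        return json.loads(obj.serialize())
    elif isinstance(obj, dict):
        return {k: make_json_serializable(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [make_json_serializable(i) for i in obj]
    return obj


class FakeReference:
    """Stand-in for a STIX object: it only needs a serialize() method."""

    def __init__(self, source_name, external_id):
        self.source_name = source_name
        self.external_id = external_id

    def serialize(self):
        return json.dumps(
            {"source_name": self.source_name, "external_id": self.external_id}
        )


nested = {"refs": [FakeReference("mitre-attack", "T1566")], "count": 1}
clean = make_json_serializable(nested)

# json.dumps(nested) would raise TypeError; the cleaned copy serializes fine.
print(json.dumps(clean))
```

Note that the helper only descends into dicts and lists, so objects tucked inside other containers such as tuples are returned unchanged.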

Get The Code

Want to dive right in? The complete code is right here in this project. Just:

# Clone the repository
git clone [your-repo-url]
cd mitre

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install requirements
pip install -r requirements.txt

Troubleshooting Common Issues

Here are some common gotchas you might run into:

  1. "Module not found" errors: Make sure you've activated your virtual environment and installed all requirements inside it
  2. JSON serialization errors: Pass STIX objects through our make_json_serializable function before handing them to json.dump
  3. Cache issues: Delete the cache directory, or call load_attack_data(use_cache=False), and let the script download fresh data
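For the cache gotcha in particular, an interrupted download can leave a truncated file behind. One way to catch that before parsing is sketched below; the helper name cache_looks_valid is hypothetical, and the check relies only on the fact that a STIX bundle is a JSON object with type "bundle" and an "objects" list.

```python
import json
import os


def cache_looks_valid(path: str) -> bool:
    """Return True if `path` holds parseable JSON shaped like a STIX bundle."""
    if not os.path.isfile(path):
        return False
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (json.JSONDecodeError, UnicodeDecodeError):
        # Truncated or otherwise corrupted file
        return False
    # A STIX bundle is a JSON object with type "bundle" and an "objects" list.
    return (
        isinstance(data, dict)
        and data.get("type") == "bundle"
        and isinstance(data.get("objects"), list)
    )


# Hypothetical usage: the path mirrors the project layout shown earlier.
print(cache_looks_valid(os.path.join("cache", "enterprise-attack.json")))
```

If the check fails, calling load_attack_data(use_cache=False) forces a fresh download instead of tripping over the broken file.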

What's Next?

In our next post, we'll dive into how to map relationships between different MITRE ATT&CK components. We'll explore how to connect techniques with the groups that use them and the mitigations that defend against them.

Until then, happy hunting!

References

  • MITRE ATT&CK
  • mitreattack-python
  • CyberChef
  • NIST CSF
