2026-03-11 23:08:57 +05:30
2026-03-11 21:18:24 +05:30
2026-03-11 21:18:24 +05:30
2026-03-11 23:08:57 +05:30
2026-03-11 21:18:24 +05:30
2026-03-11 23:08:57 +05:30
2026-03-11 23:08:57 +05:30
2026-03-11 21:18:24 +05:30
2026-03-11 21:18:24 +05:30
2026-03-11 23:08:57 +05:30
2026-03-11 23:08:57 +05:30
2026-03-11 23:08:57 +05:30

Internet Domain Database

This project maintains two separate tables for internet domain data:

  • domain_root: Top-Level Domains (TLDs) from IANA
  • domain_suffix: Public Suffix List from Mozilla

Database Connection

  • Host: l2
  • Port: 3306
  • User: root
  • Database: sp_spider

Table Structure

domain_root

Contains IANA TLD data with unique root domains and soft delete functionality.

CREATE TABLE domain_root (
    id INT AUTO_INCREMENT PRIMARY KEY,
    root VARCHAR(63) NOT NULL UNIQUE,
    removed BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_root (root),
    INDEX idx_removed (removed)
);

domain_suffix

Contains Public Suffix List data with unique suffixes and soft delete functionality.

CREATE TABLE domain_suffix (
    id INT AUTO_INCREMENT PRIMARY KEY,
    suffix VARCHAR(255) NOT NULL UNIQUE,
    removed BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_suffix (suffix),
    INDEX idx_removed (removed)
);

Setup

  1. Install dependencies:
pip install -r requirements.txt
  1. Create the tables:
mariadb -h l2 -u root -p0000 --ssl=FALSE sp_spider < create_separate_tables.sql
  1. Add the removed column (for existing installations):
mariadb -h l2 -u root -p0000 --ssl=FALSE sp_spider < add_removed_column.sql
  1. Populate the tables:
python populate_separate_tables.py 0000
  1. Update tables with soft delete functionality:
python update.py 0000

Data Sources

Usage

Query active TLDs:

SELECT * FROM domain_root WHERE removed = FALSE AND root = 'com';

Query removed TLDs:

SELECT * FROM domain_root WHERE removed = TRUE;

Query active suffixes:

SELECT * FROM domain_suffix WHERE removed = FALSE AND suffix LIKE '%.com';

Query removed suffixes:

SELECT * FROM domain_suffix WHERE removed = TRUE;

Get statistics:

SELECT 
    'domain_root' as table_name,
    COUNT(*) as total,
    SUM(CASE WHEN removed = FALSE THEN 1 ELSE 0 END) as active,
    SUM(CASE WHEN removed = TRUE THEN 1 ELSE 0 END) as removed
FROM domain_root
UNION ALL
SELECT 
    'domain_suffix' as table_name,
    COUNT(*) as total,
    SUM(CASE WHEN removed = FALSE THEN 1 ELSE 0 END) as active,
    SUM(CASE WHEN removed = TRUE THEN 1 ELSE 0 END) as removed
FROM domain_suffix;

Project Files

  • create_separate_tables.sql - Table creation script
  • add_removed_column.sql - Script to add removed column for soft delete functionality
  • populate_separate_tables.py - Initial data population script
  • update.py - Update script with soft delete functionality
  • requirements.txt - Python dependencies
  • README.md - This documentation

Soft Delete Functionality

The update.py script provides soft delete functionality that:

  1. Marks entries as removed: Sets removed = TRUE for entries no longer found in source data
  2. Adds new entries: Inserts new entries from source with removed = FALSE
  3. Restores entries: Sets removed = FALSE for previously removed entries that reappear in source
  4. Provides statistics: Shows counts of active, removed, new, and restored entries

Update Process

The update script:

  • Fetches latest data from IANA TLD and Public Suffix List
  • Compares with current database entries
  • Performs batch updates for efficiency
  • Handles duplicate entries gracefully with INSERT IGNORE
  • Updates updated_at timestamp for all changes

Run the update script periodically to keep the database synchronized with source data.

Description
No description provided
Readme 44 KiB