IoT Workshop #2: ออกแบบ Database ให้ไม่งง

ถามจริงๆ น้องๆ เคยสงสัยไหมว่าทำไม IoT system ถึงต้องใช้ database หลายตัว? ใช้ MySQL ตัวเดียวไม่ได้เหรอ? คำตอบคือ “ได้ แต่จะเจ็บปวดมาก” 555 — มาลุยกันเลยดีกว่า เราจะออกแบบ data layer ทั้งหมดตั้งแต่ต้น พร้อม WHY ก่อน HOW เสมอ!

WHY ก่อน: ทำไมต้องใช้ 2 Databases?

ลองนึกภาพแบบนี้นะ — ข้อมูลใน IoT system มี 2 แบบที่ ต่างกันโดยสิ้นเชิง:

อุปมาอุปไมย: ลองนึกถึงสมุดโทรศัพท์ VS ไดอารี่บันทึกประจำวัน

สมุดโทรศัพท์ (MongoDB) — ข้อมูล device, user, config — อ่านได้, แก้ได้, ลบได้, มี structure ซับซ้อน

ไดอารี่ (InfluxDB) — sensor readings — เขียนเพิ่มทุกวินาที, ย้อนดูตามช่วงเวลา, ไม่ค่อยแก้ไข

ถ้าเอา “ไดอารี่” ไปเก็บใน “สมุดโทรศัพท์” เดียวกัน เดี๋ยวสมุดโทรศัพท์จะหนาจนยกไม่ขึ้น (และ query ช้าสัสมาก)

( ╥ω╥ ) ← สภาพ DBA ที่ยอมเก็บ time-series ใน MySQL

Database	ใช้ทำอะไร	ลักษณะข้อมูล
MongoDB	Device registry, users, configs, alerts	Document-based, CRUD-heavy, schema ซับซ้อนได้
InfluxDB	Sensor readings, metrics	Append-only, เรียงตามเวลา, query เร็วมาก

┌──────────────────────────────────────────────┐
│                Data Layer                     │
│                                              │
│  ┌─────────────────┐  ┌──────────────────┐   │
│  │    MongoDB       │  │    InfluxDB      │   │
│  │  (สมุดโทรศัพท์) │  │    (ไดอารี่)     │   │
│  │                 │  │                  │   │
│  │  - devices      │  │  - sensor_data   │   │
│  │  - users        │  │  - device_status │   │
│  │  - device_groups│  │  - system_metrics│   │
│  │  - alert_rules  │  │                  │   │
│  │  - alert_history│  │  Retention:      │   │
│  │  - commands_log │  │  - raw: 7 days   │   │
│  │  - configs      │  │  - hourly: 90d   │   │
│  │                 │  │  - daily: 1 year │   │
│  └─────────────────┘  └──────────────────┘   │
└──────────────────────────────────────────────┘

ภาพรวม Data Model ทั้งหมด

ก่อนลงรายละเอียด ขอให้เห็น big picture ก่อนนะ — collections ทั้งหมดมันเชื่อมกันยังไง:

Mermaid Diagram

โอเค ตอนนี้เห็นภาพแล้ว มาลงรายละเอียดแต่ละ collection กัน!

MongoDB Collections

1. `devices` Collection — หัวใจของระบบ

นี่คือ collection ที่สำคัญที่สุด เป็น “ทะเบียนบ้าน” ของ device ทุกตัว เก็บทั้ง hardware info, ตำแหน่ง, status ล่าสุด และ config:

// devices collection
{
  _id: ObjectId("..."),
  device_id: "sensor-temp-001",        // Unique device identifier
  name: "Temperature Sensor #1",
  description: "ห้องเซิร์ฟเวอร์ ชั้น 3",
  type: "temperature_humidity",         // sensor, actuator, gateway

  // Hardware info
  hardware: {
    model: "DHT22",
    firmware_version: "2.1.0",
    mac_address: "AA:BB:CC:DD:EE:01"
  },

  // Location
  location: {
    building: "อาคาร A",
    floor: 3,
    room: "Server Room",
    coordinates: {
      lat: 13.7563,
      lng: 100.5018
    }
  },

  // Status
  status: "online",                     // online, offline, error, maintenance
  last_seen: ISODate("2026-03-26T10:30:00Z"),
  last_data: {
    temperature: 25.5,
    humidity: 60.2,
    timestamp: ISODate("2026-03-26T10:30:00Z")
  },

  // Configuration
  config: {
    report_interval: 30,               // seconds
    thresholds: {
      temperature_max: 40,
      temperature_min: 10,
      humidity_max: 80,
      humidity_min: 20
    }
  },

  // Metadata
  group_id: ObjectId("..."),            // Reference to device_groups
  tags: ["server-room", "critical", "floor-3"],
  enabled: true,

  created_at: ISODate("2026-01-15T08:00:00Z"),
  updated_at: ISODate("2026-03-26T10:30:00Z"),
  created_by: ObjectId("...")           // Reference to users
}

// Indexes
db.devices.createIndex({ "device_id": 1 }, { unique: true })
db.devices.createIndex({ "status": 1, "type": 1 })
db.devices.createIndex({ "group_id": 1 })
db.devices.createIndex({ "tags": 1 })
db.devices.createIndex({ "last_seen": 1 })
db.devices.createIndex({ "location.building": 1, "location.floor": 1 })

Tips จากพี่: เราเก็บ last_data ไว้ใน device document เพื่อให้ดึง “ค่าล่าสุด” ได้โดยไม่ต้อง query InfluxDB ทุกครั้ง — เหมือนจำ “ค่าล่าสุดที่วัดได้” ไว้บนหน้าปก แทนที่จะพลิกไปหาในไดอารี่ทุกที

2. `users` Collection — จัดการคนที่ใช้ระบบ

// users collection
{
  _id: ObjectId("..."),
  email: "[email protected]",
  username: "admin",
  password_hash: "$2a$12$...",          // bcrypt hash

  profile: {
    first_name: "สมชาย",
    last_name: "ดีงาม",
    avatar_url: "/avatars/admin.png",
    phone: "+66-81-234-5678"
  },

  role: "admin",                        // admin, operator, viewer
  permissions: [
    "devices:read", "devices:write", "devices:delete",
    "users:read", "users:write",
    "alerts:read", "alerts:write",
    "commands:execute"
  ],

  preferences: {
    notification_email: true,
    notification_push: true,
    notification_sms: false,
    theme: "dark",
    language: "th",
    dashboard_layout: "grid"
  },

  // Security
  refresh_token: "...",
  last_login: ISODate("2026-03-26T08:00:00Z"),
  login_attempts: 0,
  locked_until: null,

  enabled: true,
  created_at: ISODate("2026-01-01T00:00:00Z"),
  updated_at: ISODate("2026-03-26T08:00:00Z")
}

// Indexes
db.users.createIndex({ "email": 1 }, { unique: true })
db.users.createIndex({ "username": 1 }, { unique: true })
db.users.createIndex({ "role": 1 })

สังเกตว่าเราเก็บ permissions เป็น array — ทำให้ fine-grained ได้ ไม่ต้องแบ่งแค่ role อย่างเดียว คนคนเดียวอาจดู device ได้แต่สั่ง command ไม่ได้ก็ได้

3. `device_groups` Collection — จัดกลุ่ม device

เหมือน “folder” สำหรับจัด device ให้เป็นระเบียบ รองรับ nested groups ด้วย (group ใน group ได้):

// device_groups collection
{
  _id: ObjectId("..."),
  name: "Server Room Sensors",
  description: "เซ็นเซอร์ทั้งหมดในห้องเซิร์ฟเวอร์",

  parent_id: null,                      // For nested groups
  icon: "server",
  color: "#FF6B35",

  // Aggregated stats (updated periodically)
  stats: {
    total_devices: 12,
    online_devices: 11,
    active_alerts: 2
  },

  created_by: ObjectId("..."),
  created_at: ISODate("2026-01-15T08:00:00Z"),
  updated_at: ISODate("2026-03-26T10:00:00Z")
}

// Indexes
db.device_groups.createIndex({ "name": 1 }, { unique: true })
db.device_groups.createIndex({ "parent_id": 1 })

4. `alert_rules` Collection — กฎการแจ้งเตือน

นี่คือ “หน้าที่เจ้าหน้าที่รักษาความปลอดภัย” — บอกว่า “ถ้าอุณหภูมิเกิน 40°C ให้ส่ง email หาแอดมินทันที”:

// alert_rules collection
{
  _id: ObjectId("..."),
  name: "High Temperature Alert",
  description: "แจ้งเตือนเมื่ออุณหภูมิเกิน 40°C",

  // Conditions
  condition: {
    metric: "temperature",
    operator: "gt",                     // gt, gte, lt, lte, eq, neq
    value: 40,
    duration: 60,                       // seconds - sustained for
    device_filter: {
      type: "temperature_humidity",
      tags: ["critical"]
    }
  },

  severity: "critical",                 // info, warning, critical

  // Actions
  actions: [
    { type: "webhook", url: "http://api:3000/api/v1/alerts/webhook" },
    { type: "email", recipients: ["[email protected]"] },
    { type: "push_notification", topic: "critical-alerts" }
  ],

  // Cooldown
  cooldown_minutes: 15,                 // Don't re-alert within 15 min
  last_triggered: ISODate("2026-03-25T14:30:00Z"),

  enabled: true,
  created_by: ObjectId("..."),
  created_at: ISODate("2026-02-01T00:00:00Z"),
  updated_at: ISODate("2026-03-20T00:00:00Z")
}

// Indexes
db.alert_rules.createIndex({ "enabled": 1, "severity": 1 })
db.alert_rules.createIndex({ "condition.metric": 1 })

cooldown_minutes สำคัญมาก! ถ้าไม่มี cooldown อุณหภูมิสูงแค่ 1 นาที อาจได้รับ email 2 ฉบับต่อนาที — แบบนั้นไม่มีใครอยากเปิด notification แน่ๆ 😅

5. `alert_history` Collection — ประวัติการแจ้งเตือน

เก็บว่าเคย alert อะไรไปบ้าง ใครรับทราบแล้ว แก้ไขแล้วหรือยัง:

// alert_history collection
{
  _id: ObjectId("..."),
  rule_id: ObjectId("..."),
  device_id: "sensor-temp-001",

  severity: "critical",
  title: "High Temperature Alert",
  message: "Temperature 42.3°C exceeded threshold 40°C on sensor-temp-001",

  // Trigger context
  trigger_value: 42.3,
  threshold_value: 40,
  metric: "temperature",

  // Status tracking
  status: "acknowledged",               // triggered, acknowledged, resolved
  acknowledged_by: ObjectId("..."),
  acknowledged_at: ISODate("2026-03-26T10:35:00Z"),
  resolved_at: null,

  // Actions taken
  actions_executed: [
    { type: "webhook", success: true, timestamp: ISODate("2026-03-26T10:30:05Z") },
    { type: "email", success: true, timestamp: ISODate("2026-03-26T10:30:06Z") }
  ],

  triggered_at: ISODate("2026-03-26T10:30:00Z"),
  created_at: ISODate("2026-03-26T10:30:00Z")
}

// Indexes
db.alert_history.createIndex({ "triggered_at": -1 })
db.alert_history.createIndex({ "device_id": 1, "triggered_at": -1 })
db.alert_history.createIndex({ "status": 1, "severity": 1 })
db.alert_history.createIndex({ "rule_id": 1 })
// TTL index - auto-delete after 90 days (ไม่ต้อง cron job เลย!)
db.alert_history.createIndex({ "created_at": 1 }, { expireAfterSeconds: 7776000 })

TTL Index เจ๋งมาก — MongoDB จะลบ record อัตโนมัติหลัง 90 วัน ไม่ต้องเขียน cron job เองเลย เหมือนตั้งนาฬิกาปลุกให้ MongoDB ช่วยเก็บขยะให้

6. `commands_log` Collection — บันทึกคำสั่งที่ส่งไป

// commands_log collection
{
  _id: ObjectId("..."),
  device_id: "actuator-relay-001",

  command: {
    action: "toggle",
    payload: { state: "on" }
  },

  // Result
  status: "delivered",                  // pending, delivered, executed, failed, timeout
  response: {
    success: true,
    message: "Relay turned ON",
    executed_at: ISODate("2026-03-26T10:31:00Z")
  },

  issued_by: ObjectId("..."),
  issued_at: ISODate("2026-03-26T10:30:55Z"),
  expires_at: ISODate("2026-03-26T10:31:55Z")   // Command TTL
}

// Indexes
db.commands_log.createIndex({ "device_id": 1, "issued_at": -1 })
db.commands_log.createIndex({ "status": 1 })
db.commands_log.createIndex({ "issued_by": 1, "issued_at": -1 })

InfluxDB Schema Design

WHY InfluxDB? ทำไมไม่ใช้ MongoDB เก็บ sensor data เลย?

อุปมา: เหมือนถามว่า “ทำไมต้องใช้ Excel ทำ pivot table ทั้งที่ Word ก็พิมพ์ตารางได้?” — ทำได้ แต่มันไม่ได้ถูก design มาเพื่องานนั้น

InfluxDB ถูก optimize มาเพื่อ time-series โดยเฉพาะ — query ข้อมูล 30 วันย้อนหลัง ทีละ 1 นาที บน device 100 ตัว ทำได้ใน milliseconds

Database & Retention Policies

ก่อนเข้าใจ retention policy ต้องรู้ก่อนว่า — ข้อมูล sensor ทุก 30 วินาที ถ้าเก็บไว้ 1 ปี ปริมาณ storage มหาศาลมาก เราเลย “ย่อย” ข้อมูลเก่าให้เล็กลง:

-- Create database
CREATE DATABASE iot_workshop

-- Retention policies
CREATE RETENTION POLICY "raw" ON "iot_workshop" DURATION 7d REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "hourly" ON "iot_workshop" DURATION 90d REPLICATION 1
CREATE RETENTION POLICY "daily" ON "iot_workshop" DURATION 365d REPLICATION 1

แปลเป็นภาษาคน:

raw (7 วัน) — ข้อมูลดิบทุก 30 วินาที เก็บไว้ 1 อาทิตย์
hourly (90 วัน) — ค่าเฉลี่ยรายชั่วโมง เก็บไว้ 3 เดือน
daily (1 ปี) — ค่าเฉลี่ยรายวัน เก็บไว้ 1 ปี

Measurement: `sensor_data`

ใน InfluxDB ไม่เรียก “collection” แต่เรียก measurement — คิดซะว่าคือ “ชื่อตาราง” แล้วกัน:

sensor_data
├── Tags (indexed, string only) ← ใช้ filter, GROUP BY
│   ├── device_id    = "sensor-temp-001"
│   ├── device_type  = "temperature_humidity"
│   ├── location     = "building-a-floor-3"
│   └── group        = "server-room"
│
├── Fields (not indexed, various types) ← ค่าจริงที่วัดได้
│   ├── temperature  = 25.5          (float)
│   ├── humidity     = 60.2          (float)
│   ├── pressure     = 1013.25       (float)
│   ├── battery      = 85            (integer)
│   └── rssi         = -67           (integer)
│
└── Timestamp = 2026-03-26T10:30:00Z

Tags vs Fields — นี่คือจุดที่คนสับสนบ่อยที่สุด! Tags เปรียบเหมือน “หมวดหมู่” ที่ index แล้ว (เหมือน category ใน database) ส่วน Fields คือ “ค่าที่วัดได้จริง” ถ้าเอา temperature ไปเป็น tag ระบบจะพัง เพราะ cardinality สูงมาก

Write example (Line Protocol):

sensor_data,device_id=sensor-temp-001,device_type=temperature_humidity,location=building-a-floor-3 temperature=25.5,humidity=60.2,battery=85 1711443000000000000

Measurement: `device_status`

device_status
├── Tags
│   ├── device_id    = "sensor-temp-001"
│   └── device_type  = "temperature_humidity"
│
├── Fields
│   ├── online       = true          (boolean)
│   ├── uptime       = 86400         (integer, seconds)
│   ├── cpu_usage    = 12.5          (float, %)
│   ├── memory_usage = 45.2          (float, %)
│   └── disk_usage   = 30.1          (float, %)
│
└── Timestamp

Measurement: `system_metrics`

system_metrics
├── Tags
│   ├── service      = "api"         (api, mqtt, telegraf)
│   └── instance     = "api-01"
│
├── Fields
│   ├── requests_total     = 15234   (integer)
│   ├── requests_per_sec   = 42.5    (float)
│   ├── error_rate         = 0.02    (float)
│   ├── avg_response_ms    = 12.3    (float)
│   └── active_connections = 156     (integer)
│
└── Timestamp

Continuous Queries (Downsampling) — ระบบย่อยข้อมูลอัตโนมัติ

นี่คือ magic ของ InfluxDB — มัน “ย่อย” ข้อมูล raw ให้เป็นรายชั่วโมงและรายวันโดยอัตโนมัติ เหมือนมี intern ที่คอยสรุปรายงานให้ทุกชั่วโมงโดยไม่บ่น:

-- Downsample to hourly averages
CREATE CONTINUOUS QUERY "cq_sensor_hourly" ON "iot_workshop"
BEGIN
  SELECT
    mean("temperature") AS "temperature",
    mean("humidity") AS "humidity",
    mean("pressure") AS "pressure",
    min("temperature") AS "temperature_min",
    max("temperature") AS "temperature_max"
  INTO "hourly"."sensor_data"
  FROM "raw"."sensor_data"
  GROUP BY time(1h), "device_id", "device_type", "location"
END

-- Downsample to daily averages
CREATE CONTINUOUS QUERY "cq_sensor_daily" ON "iot_workshop"
BEGIN
  SELECT
    mean("temperature") AS "temperature",
    mean("humidity") AS "humidity",
    min("temperature") AS "temperature_min",
    max("temperature") AS "temperature_max",
    count("temperature") AS "readings_count"
  INTO "daily"."sensor_data"
  FROM "hourly"."sensor_data"
  GROUP BY time(1d), "device_id", "device_type", "location"
END

Data Flow — ข้อมูลไหลยังไงตั้งแต่ sensor จนถึง database?

Mermaid Diagram

เห็นไหมว่า Telegraf กับ Go Backend ทำงานคู่ขนานกัน — Telegraf เป็น “สายพาน” ที่เอาข้อมูลเข้า InfluxDB ส่วน Go Backend เป็น “คนที่อ่านข้อมูลแล้วทำงาน” (update status, broadcast)

Query Patterns — ตัวอย่าง Queries ที่ใช้บ่อย

MongoDB Queries

// Get all online devices in a group
db.devices.find({
  group_id: ObjectId("..."),
  status: "online",
  enabled: true
}).sort({ name: 1 })

// Get devices with active alerts (aggregate pipeline)
db.alert_history.aggregate([
  { $match: { status: "triggered" } },
  { $group: { _id: "$device_id", alert_count: { $sum: 1 } } },
  { $sort: { alert_count: -1 } },
  { $limit: 10 }
])

// Device search with text
db.devices.find({
  $or: [
    { name: { $regex: "server", $options: "i" } },
    { "location.room": { $regex: "server", $options: "i" } },
    { tags: "server-room" }
  ]
})

InfluxDB Queries

-- Last reading per device
SELECT last("temperature"), last("humidity")
FROM "sensor_data"
WHERE time > now() - 1h
GROUP BY "device_id"

-- Hourly average for a specific device
SELECT mean("temperature") AS "avg_temp",
       min("temperature") AS "min_temp",
       max("temperature") AS "max_temp"
FROM "sensor_data"
WHERE "device_id" = 'sensor-temp-001'
  AND time > now() - 24h
GROUP BY time(1h)

-- Devices with readings above threshold
SELECT "device_id", "temperature"
FROM "sensor_data"
WHERE "temperature" > 40
  AND time > now() - 1h
GROUP BY "device_id"

Migration & Seed Data Plan

Step 1: MongoDB Initialization

// scripts/seed-mongo.js
// Create collections with validation (บังคับให้ข้อมูลถูก format)
db.createCollection("devices", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["device_id", "name", "type", "status"],
      properties: {
        device_id: { bsonType: "string" },
        name: { bsonType: "string" },
        type: { enum: ["temperature_humidity", "motion", "relay", "gateway"] },
        status: { enum: ["online", "offline", "error", "maintenance"] }
      }
    }
  }
})

// Seed 20 sample devices
// Seed 3 user accounts (admin, operator, viewer)
// Seed 3 device groups
// Seed 5 alert rules

Schema validation ใน MongoDB เหมือน “กฎของหอพัก” — ต้องกรอกชื่อ, เลขห้อง, ประเภทห้อง ถ้าไม่ครบไม่ได้เข้า เพื่อป้องกัน garbage data เข้ามาตั้งแต่ต้น

Step 2: InfluxDB Initialization

# scripts/seed-influx.sh
# Create database, retention policies, continuous queries
# Generate 24h of sample sensor data (1 reading per 30 seconds)

สรุป — เราได้ออกแบบอะไรไปบ้าง?

ใน workshop #2 นี้เราลุยกันไปเยอะมาก:

MongoDB Schema — 6 collections พร้อม indexes ที่ optimize แล้ว สำหรับ device management, users, alerts และ commands
InfluxDB Schema — 3 measurements พร้อม retention policies 3 tier และ continuous queries สำหรับ downsampling อัตโนมัติ
Data Flow — เข้าใจว่าข้อมูลไหลจาก sensor → MQTT → Telegraf/Go → databases → Kapacitor ยังไง
Query Patterns — ตัวอย่าง queries ทั้ง MongoDB และ InfluxDB ที่จะใช้จริงใน workshop
Seed Data Plan — พร้อมสร้าง development environment ที่มีข้อมูลทดสอบ

(ง •̀_•́)ง Database design เสร็จแล้ว! ตอนนี้น้องๆ ควรเห็นภาพชัดเจนว่า data ทุก type อยู่ที่ไหน และเชื่อมกันยังไง — นั่นคือ foundation ที่แข็งแกร่งก่อนจะลงมือ code จริง

Navigation:

Prev: #1 System Architecture
Next: #3 Project Setup & DevOps