- status_monitor: add availability_managed set; _monitor_loop skips devices
in this set so the LWT/availability topic is the sole online/offline source
- device_manager: register device with status_monitor.set_availability_managed()
so the monitor actually skips them (previously the monitor had no knowledge
of DeviceManager.availability_managed)
- mqtt_bridge: remove blanket 'reset all devices to offline' on bridge restart;
this was causing a race condition where the cron reset state AFTER the bridge
had already sent device_online events via retained MQTT messages;
stale running session cleanup is kept (still needed)
- direct_session devices now use availability_topic (LWT) exclusively
for online/offline state - timeout monitor no longer interferes
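The availability-managed skip can be sketched like this (class and method names follow the commit; the internals are assumptions, not the actual bridge code):

```python
import time

class DeviceStatusMonitor:
    """Sketch: devices in availability_managed bypass the timeout monitor."""

    def __init__(self, timeout_s: float = 30.0):
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}   # device_id -> monotonic timestamp
        self.availability_managed: set[str] = set()

    def set_availability_managed(self, device_id: str) -> None:
        # Called by device_manager when the device is registered.
        self.availability_managed.add(device_id)

    def update_last_seen(self, device_id: str) -> None:
        if device_id in self.availability_managed:
            return  # the LWT/availability topic is the sole online/offline source
        self.last_seen[device_id] = time.monotonic()

    def check_timeouts(self) -> list[str]:
        # One pass of _monitor_loop: collect devices that went quiet.
        now = time.monotonic()
        return [
            device_id
            for device_id, seen in self.last_seen.items()
            if device_id not in self.availability_managed
            and now - seen > self.timeout_s
        ]
```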
- Added availability_managed set: devices in this set bypass
update_last_seen() and are ignored by the timeout monitor
- Added heartbeat_topics set: heartbeat messages return early before
the session parser, eliminating direct_session_missing_fields warnings
- Added mark_online_silent() to DeviceStatusMonitor: updates state
without emitting a duplicate device_online event
- registry.py: added availability_topic + status_topic params for
direct_session parser type
- server.py: set last_config_update from file mtime on load_persisted_config
- mqtt_bridge.py: auto push config + reset device states when bridge
comes back from offline (prevents stale state in Odoo after restart)
When _process_event raised a DB exception (e.g. constraint violation),
PostgreSQL put the whole transaction into ABORTED state. Any subsequent
ORM call (event.mark_processed, Odoo's own flush) then raised
psycopg2.errors.InFailedSqlTransaction, masking the real error.
Fix: wrap _process_event in a database savepoint inside receive_iot_event.
A processing failure now only rolls back the session/device side-effects;
the ows.iot.event record stays committed and the error is stored in
processing_error. The transaction itself remains valid for Odoo's flush.
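Odoo's cursor provides a savepoint() context manager for this; the underlying SQL pattern can be demonstrated with sqlite3 as a stand-in for PostgreSQL (table names here are illustrative, not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manual transaction control
cur = conn.cursor()
cur.execute("CREATE TABLE iot_event (payload TEXT, processing_error TEXT)")
cur.execute("CREATE TABLE session (id INTEGER CHECK (id > 0))")

cur.execute("BEGIN")
cur.execute("INSERT INTO iot_event (payload) VALUES ('session_complete')")
try:
    cur.execute("SAVEPOINT process_event")
    cur.execute("INSERT INTO session (id) VALUES (-1)")  # constraint violation
    cur.execute("RELEASE SAVEPOINT process_event")
except sqlite3.IntegrityError as exc:
    # Roll back only the session/device side-effects, keep the event record.
    cur.execute("ROLLBACK TO SAVEPOINT process_event")
    cur.execute("UPDATE iot_event SET processing_error = ?", (str(exc),))
cur.execute("COMMIT")  # the outer transaction is still valid
```

Without the savepoint, the failed INSERT would poison the whole transaction and every later statement would fail, which is exactly the InFailedSqlTransaction symptom on PostgreSQL.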
NameError: name 'fields' is not defined (from 20:12 onwards in the logs)
Cause: iot_api.py is a controller, not a model – 'fields' was never
imported. The UTC fix mistakenly used fields.Datetime.to_datetime()
and fields.Datetime.to_string().
Fix:
- Added 'fields' and 'timedelta' to the top-level imports
- fields.Datetime.to_datetime(event.timestamp) → event.timestamp
(ORM datetime fields are already datetime objects in the controller)
- fields.Datetime.to_string(dt) → dt.strftime('%Y-%m-%d %H:%M:%S')
- Removed the inline 'from datetime import timedelta as _td'
_logger.info('msg', session_id=...) is structlog syntax.
The stdlib Python logger raises TypeError on it → Odoo rollback →
session_complete events were never committed and ended up in the
bridge retry queue (endless loop).
Fix: switched to the _logger.info('msg %s %s', val1, val2) format.
Problem: lasercutter sessions were skipped in get_pos_session_suggestions
because start_time > end_time.
Bug 1 – iot_api.py (session_complete):
session_start_time in the device payload is local time without tz info.
It was naively stored as UTC → start_time ended up 1h after end_time.
Fix: start_time = end_time - session_seconds (fully UTC-based).
session_start_time from the payload is no longer used.
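A minimal sketch of the corrected computation (the function name is illustrative):

```python
from datetime import datetime, timedelta, timezone

def compute_session_window(session_seconds: float) -> tuple[datetime, datetime]:
    """Derive start_time from end_time and the reported duration, fully
    UTC-based. The device's session_start_time (local time without tz
    info) is deliberately ignored."""
    end_time = datetime.now(timezone.utc)
    start_time = end_time - timedelta(seconds=session_seconds)
    return start_time, end_time
```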
Bug 2 – machine_time_control_button.js:
toServerDatetime(new Date()) used getHours() (local time).
Odoo expects UTC strings → attendanceEnd was 1h too large.
Fix: switched to getUTCHours() / getUTCDate() etc.
The lasercutter (direct_session) sends plain-text 'online'/'offline' on
<device-id>/availability – until now these messages were silently dropped.
Changes:
- mqtt_client.py: pass non-JSON payloads through as {'_raw': text}
(instead of dropping them entirely on JSONDecodeError)
- device_manager._add_device: direct_session devices additionally subscribe
to <device_id>/availability; the entry is added to device_map
- device_manager._remove_device: removes ALL topics of a device
(previously only the first one found – a bug for devices with multiple topics)
- device_manager.route_message: {'_raw': 'online'/'offline'} produces a
device_online / device_offline event in the queue (case-insensitive)
- 15 new unit tests in test_availability_pipeline.py (102/102 green)
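The passthrough and routing steps can be sketched as follows (function names follow the commit; the details are assumptions):

```python
import json

def decode_payload(raw: bytes) -> dict:
    """Pass non-JSON payloads through as {'_raw': text} instead of
    dropping them on JSONDecodeError."""
    text = raw.decode("utf-8", errors="replace").strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return {"_raw": text}
    return data if isinstance(data, dict) else {"_raw": text}

def route_availability(device_id: str, payload: dict):
    """Map plain 'online'/'offline' availability messages to queue events
    (case-insensitive); everything else is left to the session parser."""
    state = payload.get("_raw", "").strip().lower()
    if state == "online":
        return ("device_online", device_id)
    if state == "offline":
        return ("device_offline", device_id)
    return None
```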
session_config is no longer sent via the API/YAML.
parser_config is now the only config format between Odoo and the bridge.
Changes:
- api/models.py: DeviceConfig.session_config → parser_config: dict
SessionConfig remains as an internal model for SessionDetector
DeviceConfig.to_session_config() extracts values with defaults
- config/schema.py: DeviceConfig converted the same way + to_session_config()
- config/loader.py: reads parser_config from YAML, with a fallback for legacy
session_config (backward compatibility for existing config-active.yaml)
- core/device_manager.py: device.session_config → device.to_session_config()
- core/service_manager.py: removed session_config references
- Odoo _build_bridge_config: sends parser_config directly (+ heartbeat)
- Odoo iot_api.py: converted the same way
- Tests: all SessionConfig fixtures → parser_config dicts
63/63 passing
write() called _build_bridge_config() before the ORM cache
was cleared – self.search() still read the old (cached) values.
Result: a parser_type change was not carried over into
config-active.yaml.
Fix: flush_all() + invalidate_all() before the push forces
_build_bridge_config() to read the freshly written values.
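The actual fix calls Odoo's env.flush_all() and env.invalidate_all(); this toy model (not Odoo's real internals) only illustrates why the order matters:

```python
class ToyModel:
    """Toy stand-in for an ORM with buffered writes and a read cache."""

    def __init__(self):
        self._db = {"parser_type": "shelly_pm"}
        self._cache = dict(self._db)   # what search()/reads see
        self._pending = {}             # writes buffered by write()

    def write(self, vals):
        self._pending.update(vals)     # cache is NOT refreshed here

    def flush_all(self):
        self._db.update(self._pending)  # push buffered writes to the DB
        self._pending.clear()

    def invalidate_all(self):
        self._cache = dict(self._db)    # drop stale cached values

    def search_value(self, field_name):
        return self._cache[field_name]
```

Reading right after write() returns the stale cached value; flushing and invalidating first makes the subsequent read see the new one.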
The previous fix was wrong: 'only fill when the field is empty' means
that after shelly_pm → dummy → shelly_pm the dummy config is left
behind. Every parser has its own parameters; a foreign config is
worthless for the new parser.
Correct: a type change = always fresh defaults of the new parser.
Previously the registry defaults were overwritten on every parser_type
change – customized values were lost.
New: defaults are only filled in when parser_config is empty ('{}'),
e.g. for newly created devices or after clearing the field manually.
The ace widget expects a string, but fields.Json returns a
Python dict. JS calls .toString() on it → '[object Object]'.
Changes:
- mqtt_device.py: fields.Json → fields.Text(default='{}')
- _onchange_parser_type: json.dumps(defaults, indent=2) instead of a dict
- get_parser_config_dict: json.loads() with error handling
- iot_api.py: json.loads(device.parser_config or '{}')
- DB: parser_config jsonb → text (USING parser_config::text)
Demonstrates the full workflow for a new parser with DIFFERENT
parameters than shelly_pm (flat wide table, bridge as SoT).
Bridge (iot_bridge):
- parsers/dummy_parser.py: completely new – reads 'pulses' (int) instead of
'value', returns apower=float(pulses) for the SessionDetector
- parsers/registry.py: dummy_generic parameters replaced:
standby_threshold_w/working_threshold_w → pulse_count, pulse_debounce_ms,
reset_interval_min (each with an odoo_field mapping)
Odoo (open_workshop_mqtt):
- mqtt_device.py: 3 new fields dummy_pulse_count, dummy_pulse_debounce_ms,
dummy_reset_interval_min (flat wide table, NULL for other parsers)
- mqtt_device.py: _compute_strategy_config depends + elif for dummy_generic
now produces a pulse-specific JSON config dict
- mqtt_device.py: _onchange_parser_type sets pulse defaults instead of
shelly thresholds
- mqtt_device_views.xml: invisible conditions switched to parser_type not in [...]
(scales correctly with 3+ parsers)
- mqtt_device_views.xml: inline hint divs removed from the power-threshold
group (unnecessary)
- mqtt_device_views.xml: new group '🔢 Puls-Konfiguration' with the 3 fields
DB: 3 new columns created in ows_mqtt_device (odoo -u succeeded)
Tests: 63/63 green
- mqtt_device.py: introduced the _PARSER_SELECTION constant; removed tasmota
and generic (had no bridge implementation); extended depends in
_compute_strategy_config with parser_type; _onchange now uses
topic_hint from the registry (<device>/status/pm1:0)
- mqtt_device_views.xml: parser-specific info block for shelly_pm
(topic_hint, description) with invisible='parser_type != shelly_pm'
- Existing DB devices: all 3 already on shelly_pm → no migration needed
- GET /parsers: up and returns the full schema response
- Add ARCHITECTURE.md with component overview and runtime data flow
- Add DEVELOPMENT.md with local setup, testing, and debugging workflows
- Update README.md links and development documentation references
- Fix outdated API models reference in related docs section
- Update optimization plan status for Phase 4.3 progress
- Add optional Odoo circuit-breaker for transient failures
- Unify timeout handling in Odoo and MQTT clients
- Improve transient error classification (timeout/connection/5xx/429)
- Add focused unit tests for recovery and circuit-breaker behavior
- Mark Phase 3.3 tasks as completed in optimization plan
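For reference, a minimal circuit-breaker shape (thresholds and names are assumptions, not the bridge's actual implementation):

```python
import time

class CircuitBreaker:
    """Opens after N consecutive transient failures; allows a probe
    request again after a cooldown (half-open)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at: float | None = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: normal operation
        if time.monotonic() - self.opened_at >= self.reset_timeout_s:
            return True  # half-open: let one probe through
        return False     # open: fail fast

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()
```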
- Include device_status_timeout_s in Odoo bridge payload
- Resolve status monitor timeout robustly with backward-compatible fallbacks
- Update running status monitor timeout on POST /config without restart
- Keep compatibility for legacy/local configs without device_status_timeout_s
Result: shaperorigin uses configured 90s timeout for online/offline monitor, preventing 30s flapping.
Problem: Device Status Monitor was using a hardcoded 30-second global timeout
for marking devices offline, independent of the configurable message_timeout_s.
This caused alternating offline/online events for devices with power=0 that
don't send frequent MQTT messages.
Solution: drive both checks from the same configured value:
1. Session detection: message_timeout_s
2. Device status monitoring: device_status_timeout_s (kept in sync with
message_timeout_s)
Implementation:
- Add device_status_timeout_s field to api/models.py DeviceConfig (default: 120s)
- Update Odoo iot_api.py to include device_status_timeout_s in config response
(synchronized with message_timeout_s from device strategy config)
- Update Bridge service_manager.py to use device_status_timeout_s when
initializing DeviceStatusMonitor (fallback to global config if not provided)
Result:
- Single configurable timeout per device in Odoo
- Both checks (session + device status) use same value
- Backward compatible (defaults to 120s if not provided)
- Solves alternating offline/online events for low-power/idle devices
Validation:
- mypy: 0 errors across 47 files
- API model test: device_status_timeout_s field functional
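The backward-compatible timeout resolution reduces to a lookup chain like this (sketch; field and default values follow the commit):

```python
def resolve_status_timeout(device_cfg: dict, global_default: float = 30.0) -> float:
    """Prefer device_status_timeout_s, fall back to message_timeout_s,
    then to the global default for legacy/local configs."""
    for key in ("device_status_timeout_s", "message_timeout_s"):
        value = device_cfg.get(key)
        if value is not None:
            return float(value)
    return global_default
```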
Phase 3.1: Type Safety
- Add bridge_types.py for shared type aliases (EventDict, PowerWatts, Timestamp, DeviceID)
- Define protocols for callbacks and message parsers
- Strict type annotations on all core modules (session_detector, event_queue, device_manager)
- Fix Optional handling and type guards throughout codebase
- Achieve full mypy compliance: 0 errors across 47 source files
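The alias-plus-protocol approach can be illustrated as follows (the exact definitions in bridge_types.py may differ):

```python
from typing import Optional, Protocol, runtime_checkable

# Aliases as named in the commit (concrete types are assumptions):
DeviceID = str
PowerWatts = float
Timestamp = float
EventDict = dict

@runtime_checkable
class MessageParser(Protocol):
    """Structural type every message parser must satisfy."""
    def parse(self, device_id: DeviceID, payload: bytes) -> Optional[EventDict]: ...

class ShellyParser:
    """Any class with a matching parse() satisfies the protocol."""
    def parse(self, device_id: str, payload: bytes) -> Optional[dict]:
        return {"device_id": device_id, "apower": 0.0}
```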
Phase 3.2: Logging Unification
- Migrate from stdlib logging to pure structlog across all runtime modules
- Convert all logs to structured event+fields format (snake_case event names)
- Remove f-string and printf-style logger calls
- Add contextvars support for per-request correlation
- Implement FastAPI middleware to bind request_id, http_method, http_path
- Propagate X-Request-ID header in responses
- Remove stdlib logging imports except setup layer (utils/logging.py)
- Ensure log-level consistency across all modules
Files Modified:
- iot_bridge/bridge_types.py (new) - Central type definitions
- iot_bridge/core/* - Type safety and logging unification
- iot_bridge/clients/* - Structured logging with request context
- iot_bridge/parsers/* - Type-safe parsing with structured logs
- iot_bridge/utils/logging.py - Pure structlog setup with contextvars
- iot_bridge/api/server.py - Added request correlation middleware
- iot_bridge/tests/* - Test fixtures updated for type safety
- iot_bridge/OPTIMIZATION_PLAN.md - Phase 3 status updated
Validation:
- mypy . → 0 errors (47 files)
- All unit tests pass
- Runtime behavior unchanged
- API response headers include X-Request-ID
Implemented Phase 2.4 (Dependency Injection Pattern):
- Added new dependencies module with DI container and runtime context
- RuntimeContainer for injectable factories
- RuntimeContext for resolved runtime objects
- create_service_manager() factory
- build_runtime_context() composition root
- Refactored main.py to use dependency container wiring
- Main orchestration now resolves runtime via DI factories
- Reduced direct constructor coupling in entrypoint
- Added unit tests for DI behavior with mocked dependencies
- Verifies factory injection for service manager creation
- Verifies runtime composition uses injected callables
- Updated optimization plan checkboxes for Phase 2.4
Validation:
- py_compile passed for new/changed files
- tests/unit/test_dependencies.py passed
- regression test test_event_queue::test_enqueue passed
Notes:
- Keeps existing runtime behavior unchanged
- Establishes clear composition root for future testability improvements
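The composition-root shape, reduced to a sketch (names follow the commit; internals are assumptions):

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class RuntimeContainer:
    """Injectable factories; tests can swap in fakes."""
    make_event_queue: Callable[[], Any]
    make_service_manager: Callable[[Any], Any]

@dataclass
class RuntimeContext:
    """Resolved runtime objects produced by the composition root."""
    event_queue: Any
    service_manager: Any

def build_runtime_context(container: RuntimeContainer) -> RuntimeContext:
    # Composition root: the only place constructors are wired together.
    queue = container.make_event_queue()
    manager = container.make_service_manager(queue)
    return RuntimeContext(event_queue=queue, service_manager=manager)

# In a unit test, inject trivial fakes instead of real constructors:
ctx = build_runtime_context(RuntimeContainer(
    make_event_queue=list,
    make_service_manager=lambda q: {"queue": q},
))
```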
- Set overall status to in-progress
- Mark Phase 2 as partially completed
- Add implemented Phase 2 commit references
- Check off completed tasks for 2.1, 2.2 and 2.3
- Update timeline table (Phase 0/1 done, Phase 2 partial)
Note: Remaining open checkboxes now reflect work still pending (tests, DI, env hierarchy docs).
- Recreate MQTT client on reconnect to apply broker/auth/TLS changes reliably
- Restart loop on new MQTT client instance after reconnect
- Track loop lifecycle to avoid stale client state
- Include MQTT section in initial ConfigServer current_config state
- Keep /config response consistent with persisted /data/config-active.yaml after restart
Result:
- Broker switches via Odoo push now connect reliably (including TLS/non-TLS changes)
- Bridge startup + persisted config reload now exposes mqtt data correctly via GET /config
- Event flow MQTT -> Bridge -> Odoo remains stable after container restarts
All exception classes now have meaningful __init__ methods:
- ConfigurationError: path parameter for the config file path
- ConfigValidationError: field + value for invalid fields
- ConnectionError: service parameter (mqtt/odoo)
- MQTTConnectionError: broker + port parameters
- DeviceError: device_id parameter
- ValidationError: field + value for validation errors
Before: the classes only contained 'pass' (technically correct, but not very useful)
After: structured error-context capture with dedicated attributes
Example:
# Old: raise ConfigurationError('File not found', details={'path': ...})
# New: raise ConfigurationError('File not found', path='/etc/config.yaml')
Adjusted:
- config/loader.py: uses the new path parameter instead of the details dict
- all existing call sites remain compatible (backward-compatible)
Test: python3 -c 'from exceptions import *; e = MQTTConnectionError(...)'