WhatsApp Evolution heartbeat (Group 6.1): detection + incident + admin alert

Detects a disconnected phone before message sends start failing silently.

Database (migration 20260423000002):
- Table whatsapp_connection_incidents (tenant_id, channel_id, kind,
  started_at, resolved_at, duration_seconds, notified_at, details).
  A partial UNIQUE index guarantees at most 1 open incident per channel.
- RPCs whatsapp_heartbeat_open_incident (idempotent),
  whatsapp_heartbeat_resolve_open_incidents and
  whatsapp_heartbeat_mark_notified. EXECUTE is granted to service_role only.
- RLS: tenant members read their own tenant's incidents; saas_admin reads all.
- ALTER notifications.type to accept 'system_alert' (used by the alert).

Edge function whatsapp-heartbeat-check:
- Scans active notification_channels with provider=evolution_api.
- GET {api_url}/instance/connectionState/{instance} (8s timeout; rewrites
  localhost → host.docker.internal for containers).
- Maps state to connection_status (open/connecting/qr_pending/
  disconnected/error) and persists it along with last_health_check.
- Threshold logic: sets first_unhealthy_at in metadata on the first
  failure; only opens an incident after heartbeat_threshold_minutes (default 5).
- Notifies the tenant's active admins (clinic_admin/tenant_admin) via an
  insert into notifications. Anti-spam: notifies only once per incident.
- Accepts ?channel_id=X for an on-demand check of a single channel
  (one specific tenant).
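
The threshold logic above can be sketched as a pure decision function. This is a minimal illustration, not the edge function's actual code: only first_unhealthy_at, heartbeat_threshold_minutes and the connection_status values come from this commit; the function name, the types, and the inclusive boundary (>= threshold) are assumptions.

```typescript
// Hypothetical sketch of the threshold decision; names below are illustrative.
type ConnectionStatus = "open" | "connecting" | "qr_pending" | "disconnected" | "error";

interface ChannelMetadata {
  first_unhealthy_at?: string | null; // ISO timestamp of the first failed check
  heartbeat_threshold_minutes?: number; // falls back to 5 when absent
}

interface HeartbeatDecision {
  action: "resolve" | "wait" | "open_incident";
  metadata: ChannelMetadata; // metadata to persist back on the channel
}

const DEFAULT_THRESHOLD_MINUTES = 5;

function decideHeartbeatAction(
  status: ConnectionStatus,
  metadata: ChannelMetadata,
  now: Date,
): HeartbeatDecision {
  // Healthy: clear the marker so the next outage starts a fresh window.
  if (status === "open") {
    return { action: "resolve", metadata: { ...metadata, first_unhealthy_at: null } };
  }

  // Unhealthy: keep the original marker, or set it on the first failure.
  const firstUnhealthyAt = metadata.first_unhealthy_at ?? now.toISOString();
  const nextMetadata: ChannelMetadata = { ...metadata, first_unhealthy_at: firstUnhealthyAt };

  const thresholdMs =
    (metadata.heartbeat_threshold_minutes ?? DEFAULT_THRESHOLD_MINUTES) * 60_000;
  const unhealthyForMs = now.getTime() - new Date(firstUnhealthyAt).getTime();

  // Assumption: the boundary is inclusive (opens once the threshold has elapsed).
  return {
    action: unhealthyForMs >= thresholdMs ? "open_incident" : "wait",
    metadata: nextMetadata,
  };
}
```

Keeping the decision free of I/O means the same function could serve both the periodic run and the ?channel_id=X on-demand check, with the caller mapping "open_incident" and "resolve" to the corresponding RPCs.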

Tenant UI (/configuracoes/whatsapp-pessoal):
- New card "Monitoramento de conexão" with an alerts_enabled toggle and an
  InputNumber for the threshold (1-60 min). Persisted in
  notification_channels.metadata.
- History for the last 7 days: kind (colored tag), open/resolved,
  start → end, formatted duration (Ns / Xmin Ys / Nh Xmin).
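
The three duration shapes listed above (Ns / Xmin Ys / Nh Xmin) could be produced by a helper like the following. This is a sketch under assumptions: the function name and the exact cutoffs (under 60 s, under 1 h, otherwise hours) are illustrative, and the UI's real helper may differ.

```typescript
// Hypothetical formatter for incident duration_seconds; name and cutoffs are assumed.
function formatDuration(totalSeconds: number): string {
  // Clamp negatives and drop fractions so the output stays well-formed.
  const s = Math.max(0, Math.floor(totalSeconds));

  if (s < 60) {
    return `${s}s`; // "Ns"
  }
  if (s < 3600) {
    return `${Math.floor(s / 60)}min ${s % 60}s`; // "Xmin Ys"
  }
  // Seconds are dropped once the duration reaches an hour.
  return `${Math.floor(s / 3600)}h ${Math.floor((s % 3600) / 60)}min`; // "Nh Xmin"
}

console.log(formatDuration(45)); // 45s
console.log(formatDuration(125)); // 2min 5s
console.log(formatDuration(3725)); // 1h 2min
```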

SaaS UI (/saas/whatsapp):
- Badge "N incidents abertos" in the header when there is at least one open incident.
- Button "Verificar tudo agora" invokes the edge function and refreshes the list.
- Enriched table: the Status column gains an "Incident aberto" pill, plus
  new columns Última check and Incidents 7d (shown in orange if > 0).

Cron template at the end of the migration (commented out; uncomment
cron.schedule to enable the periodic check every 2 minutes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Leonardo
2026-04-23 07:49:09 -03:00
parent f76a2e3033
commit e1f756ea82
4 changed files with 832 additions and 13 deletions
@@ -0,0 +1,272 @@
-- ==========================================================================
-- Agencia PSI - Migration: WhatsApp connection heartbeat (Group 6.1)
-- ==========================================================================
-- Created by: Leonardo Nohama
-- Date: 2026-04-23 · Sao Carlos/SP, Brazil
--
-- Context: notification_channels already has connection_status and
-- last_health_check, but there is no incident tracking (when the connection
-- dropped, how long it stayed down, whether the admin was already alerted).
-- This migration creates an incidents table plus RPC helpers.
--
-- Model:
-- - notification_channels.connection_status (already exists) = current state
-- - notification_channels.last_health_check (already exists) = last check
-- - notification_channels.metadata (already exists) = config (threshold, toggles)
--
-- New:
-- - whatsapp_connection_incidents = degradation events (open/close)
-- - RPC open_incident (idempotent: reuses the open incident if one exists)
-- - RPC resolve_open_incidents (closes all open incidents for the channel)
-- - RPC mark_notified (records that the admins were alerted)
--
-- Expected flow (edge function whatsapp-heartbeat-check):
-- 1. List active evolution_api channels
-- 2. For each: GET /instance/connectionState/{instance}
-- 3. Update connection_status + last_health_check
-- 4. If state != 'open' for > threshold minutes → open_incident (if none is open)
--    and send a notification to the tenant's admins
-- 5. If state == 'open' → resolve_open_incidents (if any is open)
-- ==========================================================================
-- ---------------------------------------------------------------------------
-- Table: whatsapp_connection_incidents
-- ---------------------------------------------------------------------------
CREATE TABLE IF NOT EXISTS public.whatsapp_connection_incidents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tenant_id UUID NOT NULL REFERENCES public.tenants(id) ON DELETE CASCADE,
  channel_id UUID NOT NULL REFERENCES public.notification_channels(id) ON DELETE CASCADE,
  provider TEXT NOT NULL CHECK (provider IN ('evolution_api', 'twilio')),
  -- Incident type (snapshot of the connection_status that opened it)
  kind TEXT NOT NULL CHECK (kind IN ('disconnected', 'error', 'qr_pending', 'connecting', 'unknown')),
  -- Snapshots for auditing
  last_state TEXT, -- state when the incident was opened
  details JSONB, -- raw provider payload
  started_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  resolved_at TIMESTAMPTZ, -- NULL while open
  -- Downtime in seconds (filled in on resolve)
  duration_seconds INT,
  -- Notification control (anti-spam)
  notified_at TIMESTAMPTZ, -- when the notification was sent to the admins
  notification_count INT NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- updated_at trigger
DROP TRIGGER IF EXISTS trg_wa_incidents_updated_at ON public.whatsapp_connection_incidents;
CREATE TRIGGER trg_wa_incidents_updated_at
  BEFORE UPDATE ON public.whatsapp_connection_incidents
  FOR EACH ROW EXECUTE FUNCTION public.set_updated_at();
-- Only 1 open incident per channel (business constraint)
CREATE UNIQUE INDEX IF NOT EXISTS uq_wa_incidents_open_per_channel
  ON public.whatsapp_connection_incidents (channel_id)
  WHERE resolved_at IS NULL;
CREATE INDEX IF NOT EXISTS idx_wa_incidents_tenant_started
  ON public.whatsapp_connection_incidents (tenant_id, started_at DESC);
CREATE INDEX IF NOT EXISTS idx_wa_incidents_open
  ON public.whatsapp_connection_incidents (resolved_at)
  WHERE resolved_at IS NULL;
COMMENT ON TABLE public.whatsapp_connection_incidents IS
  'Eventos de degradacao de conexao WhatsApp (evolution/twilio). Max 1 aberto por channel (UNIQUE parcial).';
-- ---------------------------------------------------------------------------
-- RPC: whatsapp_heartbeat_open_incident
-- Idempotent: if the channel already has an open incident, returns the existing one.
-- ---------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION public.whatsapp_heartbeat_open_incident(
  p_channel_id UUID,
  p_kind TEXT,
  p_last_state TEXT DEFAULT NULL,
  p_details JSONB DEFAULT NULL
)
RETURNS UUID
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public
AS $$
DECLARE
  v_tenant_id UUID;
  v_provider TEXT;
  v_existing_id UUID;
  v_new_id UUID;
BEGIN
  -- Look up the channel's tenant/provider
  SELECT tenant_id, provider INTO v_tenant_id, v_provider
  FROM public.notification_channels
  WHERE id = p_channel_id
    AND deleted_at IS NULL;
  IF NOT FOUND THEN
    RAISE EXCEPTION 'channel_not_found';
  END IF;
  IF p_kind NOT IN ('disconnected', 'error', 'qr_pending', 'connecting', 'unknown') THEN
    RAISE EXCEPTION 'invalid_kind: %', p_kind;
  END IF;
  -- Already open? Return the same id (idempotent)
  SELECT id INTO v_existing_id
  FROM public.whatsapp_connection_incidents
  WHERE channel_id = p_channel_id
    AND resolved_at IS NULL;
  IF FOUND THEN
    -- Refresh the existing incident with the latest details
    UPDATE public.whatsapp_connection_incidents
    SET last_state = COALESCE(p_last_state, last_state),
        details = COALESCE(p_details, details),
        kind = p_kind -- may change, e.g. qr_pending → disconnected
    WHERE id = v_existing_id;
    RETURN v_existing_id;
  END IF;
  -- Open a new incident
  INSERT INTO public.whatsapp_connection_incidents
    (tenant_id, channel_id, provider, kind, last_state, details)
  VALUES
    (v_tenant_id, p_channel_id, v_provider, p_kind, p_last_state, p_details)
  RETURNING id INTO v_new_id;
  RETURN v_new_id;
END;
$$;
REVOKE ALL ON FUNCTION public.whatsapp_heartbeat_open_incident(UUID, TEXT, TEXT, JSONB) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION public.whatsapp_heartbeat_open_incident(UUID, TEXT, TEXT, JSONB) TO service_role;
-- ---------------------------------------------------------------------------
-- RPC: whatsapp_heartbeat_resolve_open_incidents
-- Closes all open incidents for a channel. Returns how many were closed.
-- ---------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION public.whatsapp_heartbeat_resolve_open_incidents(
  p_channel_id UUID
)
RETURNS INT
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public
AS $$
DECLARE
  v_count INT := 0;
BEGIN
  UPDATE public.whatsapp_connection_incidents
  SET resolved_at = now(),
      duration_seconds = EXTRACT(EPOCH FROM (now() - started_at))::INT
  WHERE channel_id = p_channel_id
    AND resolved_at IS NULL;
  GET DIAGNOSTICS v_count = ROW_COUNT;
  RETURN v_count;
END;
$$;
REVOKE ALL ON FUNCTION public.whatsapp_heartbeat_resolve_open_incidents(UUID) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION public.whatsapp_heartbeat_resolve_open_incidents(UUID) TO service_role;
-- ---------------------------------------------------------------------------
-- RPC: whatsapp_heartbeat_mark_notified
-- Marks an incident as notified (anti-spam for alerts).
-- ---------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION public.whatsapp_heartbeat_mark_notified(
  p_incident_id UUID
)
RETURNS VOID
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public
AS $$
BEGIN
  UPDATE public.whatsapp_connection_incidents
  SET notified_at = now(),
      notification_count = notification_count + 1
  WHERE id = p_incident_id;
END;
$$;
REVOKE ALL ON FUNCTION public.whatsapp_heartbeat_mark_notified(UUID) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION public.whatsapp_heartbeat_mark_notified(UUID) TO service_role;
-- ---------------------------------------------------------------------------
-- RLS
-- ---------------------------------------------------------------------------
ALTER TABLE public.whatsapp_connection_incidents ENABLE ROW LEVEL SECURITY;
DROP POLICY IF EXISTS "wa_incidents: select membros/admin" ON public.whatsapp_connection_incidents;
CREATE POLICY "wa_incidents: select membros/admin"
  ON public.whatsapp_connection_incidents
  FOR SELECT
  TO authenticated
  USING (
    public.is_saas_admin()
    OR EXISTS (
      SELECT 1 FROM public.tenant_members tm
      WHERE tm.tenant_id = whatsapp_connection_incidents.tenant_id
        AND tm.user_id = auth.uid()
        AND tm.status = 'active'
    )
  );
-- Writes only via service_role (edge function cron)
DROP POLICY IF EXISTS "wa_incidents: write service_role" ON public.whatsapp_connection_incidents;
CREATE POLICY "wa_incidents: write service_role"
  ON public.whatsapp_connection_incidents
  FOR ALL
  TO service_role
  USING (true)
  WITH CHECK (true);
-- ---------------------------------------------------------------------------
-- Extend notifications.type to accept 'system_alert' (used by the heartbeat)
-- ---------------------------------------------------------------------------
ALTER TABLE public.notifications
  DROP CONSTRAINT IF EXISTS notifications_type_check;
ALTER TABLE public.notifications
  ADD CONSTRAINT notifications_type_check
  CHECK (type = ANY (ARRAY[
    'new_scheduling'::text,
    'new_patient'::text,
    'recurrence_alert'::text,
    'session_status'::text,
    'inbound_message'::text,
    'system_alert'::text
  ]));
-- ---------------------------------------------------------------------------
-- Cron job (TEMPLATE: uncomment to enable)
-- ---------------------------------------------------------------------------
-- Checks the heartbeat of all active Evolution channels every 2 minutes.
-- The number of minutes a channel must stay down before an incident is opened
-- lives in notification_channels.metadata.heartbeat_threshold_minutes (default 5).
--
-- SELECT cron.schedule(
--   'whatsapp-heartbeat-every-2min',
--   '*/2 * * * *',
--   $$
--   SELECT net.http_post(
--     url := current_setting('app.settings.supabase_url') || '/functions/v1/whatsapp-heartbeat-check',
--     headers := jsonb_build_object(
--       'Authorization', 'Bearer ' || current_setting('app.settings.service_role_key'),
--       'Content-Type', 'application/json'
--     ),
--     body := '{}'::jsonb
--   );
--   $$
-- );
--
-- To disable: SELECT cron.unschedule('whatsapp-heartbeat-every-2min');