Change network in docker-compose(api, mongo and redis), add mongo-exp… #2

helioelias · 2023-07-14T17:53:13Z

Change network in docker-compose Evolution-API to work in Ubuntu Server and Linux Mint
Add mongo-express to admin in interface for MongoDB
Add rebrow to look Redis store

…ress and rebrow, tools for maintenance and visualize data

Develop

Fix/business api

Critical bug fix for auto-restart system deadlock. Problem: - isAutoRestarting flag gets stuck at true when: 1. ForceRestart triggers due to proxy failure 2. connectToWhatsapp() creates client but doesn't reach 'open' (proxy still down) 3. Flag never resets, blocking ALL future reconnect attempts 4. Instance stuck in 'connecting' state permanently 5. Logs show: 'Skipping auto-reconnect (auto-restart in progress)' for 14+ minutes Root Cause: - forceRestart() sets isAutoRestarting=true - connectToWhatsapp() returns immediately (doesn't wait for 'open') - No timeout to reset flag if connection fails - Flag only resets on 'open' (success) or exception Solution (Double Safety Net): Fix EvolutionAPI#1 - Safety timeout in forceRestart() (30s): - After connectToWhatsapp(), set 30s timeout - If state != 'open', reset isAutoRestarting flag - Sends INSTANCE_STUCK webhook for monitoring - Primary recovery mechanism Fix EvolutionAPI#2 - Health check backup (60s): - Detects stuck isAutoRestarting flag > 60s - Forcefully resets all restart-related flags - Secondary safety net if Fix EvolutionAPI#1 fails - Runs every 60s via health check Benefits: - Recovery in 30s instead of permanent deadlock - Emulates manual restart behavior (which always works) - Webhook monitoring for stuck flags - Fail-safe with dual timeout protection - No changes to core reconnection logic

Comprehensive optimization of auto-restart and health check system. Resolved all identified issues including memory leaks, race conditions, performance bottlenecks, and edge cases. CRITICAL FIXES (Deploy ASAP): FIX EvolutionAPI#1: Safety Timeout Memory Leak - Save safetyTimeout reference to allow cancellation - Cancel timeout on connection 'open', logout, and exception - Prevents accumulation of uncancelled timeouts - Impact: Eliminates memory leak (100 restart = 100 timeout leak) FIX EvolutionAPI#2: Max ForceRestart Attempts + Rate Limiting - Track forceRestartAttempts (max 5) - Min 5s interval between force restarts - Send INSTANCE_STUCK webhook when max reached - Reset counter on successful 'open' - Impact: Prevents infinite restart loop, alerts unrecoverable instances FIX EvolutionAPI#3: Database Fallback in PerformHealthCheck - Wrap DB query in try-catch - Safe fallback: skip force restart if DB down - Use cached ownerJid when available - Impact: System continues functioning with DB issues HIGH PRIORITY FIXES: FIX EvolutionAPI#4: Health Check Jitter (Anti-Thundering Herd) - Random jitter ±10s on health check interval - Distributes load over 50-70s window instead of 60s spike - Impact: Prevents 100 instances all checking simultaneously FIX EvolutionAPI#5: Stop Health Check During Connecting - stopHealthCheck() when entering 'connecting' state - Avoids wasted resources and potential conflicts - Impact: Cleaner state transitions, less overhead FIX EvolutionAPI#6: Reset ownerJid on Logout - Update DB to set ownerJid=null on logout - Allows safe instance name reuse - Impact: Health check won't trigger on new QR scan for reused name MEDIUM PRIORITY FIXES: FIX EvolutionAPI#7: LoadProxy Mutex - Simple mutex lock to prevent concurrent loadProxy() calls - Retry with 100ms delay if lock held - Impact: Prevents proxy config corruption from race conditions FIX EvolutionAPI#8: Proxy Test Cache + ownerJid Cache - Cache proxy test results for 2 minutes - Cache ownerJid in memory to avoid DB queries - Impact: Reduces external API calls and DB load by ~90% FIX EvolutionAPI#9: Await ConnectionUpdate Events - Add await to connectionUpdate() call in eventHandler - Sequentializes connection events - Impact: Prevents race conditions on rapid state changes FIX EvolutionAPI#11: Conditional Logging - Log health check only on state changes or milestones - Impact: Reduces log spam from 1000 log/min to ~10 log/min CONSISTENCY FIXES: FIX EvolutionAPI#15: Flag Consistency - Set isAutoRestartTriggered in forceRestart() (was missing) - Consistent with autoRestart() behavior - Impact: Correct flag coordination TOTALS: - 2 files modified - ~180 lines added/modified - 15 bugs/issues fixed - 1 CRITICAL memory leak eliminated - 3 HIGH severity issues resolved - 9 MEDIUM severity improvements - 2 LOW priority optimizations BENEFITS: - No more permanent deadlocks (30s recovery max) - No memory leaks from uncancelled timeouts - Handles DB/Redis failures gracefully - Scales better with many instances (jitter, cache, rate limiting) - Comprehensive webhook monitoring for stuck instances - Alerts when instances are unrecoverable - Better log management (less spam) - Production-ready for high-load scenarios

…ents permanent stuck CRITICAL BUG FOUND: - Instance was stuck in 'connecting' state for 9+ hours this morning - wasOpenBeforeReconnect flag was lost during forceRestart() safety timeout - Timer auto-restart couldn't start → permanent stuck state - Manual server restart required to recover ROOT CAUSE: 4 locations in code were resetting/losing wasOpenBeforeReconnect flag: 1. forceRestart() safety timeout (line 1338-1342) 2. forceRestart() catch block (line 1359-1362) 3. Health check safety net (line 1051-1054) 4. autoRestart() catch block (line 880-883) IMPACT: When these code paths executed, wasOpenBeforeReconnect was reset to false. Next reconnection attempt → timer check fails → no auto-restart → stuck forever. SOLUTION: Add explicit comments in all 4 locations to preserve the flag: - Safety timeout: Do NOT reset wasOpenBeforeReconnect - Catch blocks: Do NOT reset wasOpenBeforeReconnect - Health check: Do NOT reset wasOpenBeforeReconnect This ensures the flag is ALWAYS preserved across: - Timeout scenarios - Exception scenarios - Safety net scenarios VERIFICATION: - Test scenario EvolutionAPI#2 (408 timeout): ✅ Passed, reconnected in 4s - Instance recovered immediately after server restart - Flag preservation logic now consistent across all paths FILES MODIFIED: - src/api/integrations/channel/whatsapp/whatsapp.baileys.service.ts FIXES: - Bug EvolutionAPI#1: forceRestart() safety timeout preserves flag - Bug EvolutionAPI#2: forceRestart() catch preserves flag - Bug EvolutionAPI#3: Health check preserves flag - Bug EvolutionAPI#4: autoRestart() catch preserves flag 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

FIX #0: Set wasOpenBeforeReconnect=true in forceRestart() when restarting from 'open' state - This was the main cause of today's blocking - the flag was being reset in 'open' handler - Now properly captures state before cleanup to allow auto-restart timer FIX EvolutionAPI#1: Add finally blocks to autoRestart() and forceRestart() - Ensures isRestartInProgress is always reset even on uncaught exceptions - Prevents deadlock scenarios where flag remains stuck FIX EvolutionAPI#2: Verify createClient() success - Throws error if client is null after createClient() completes - Prevents silent failures that could cause infinite loops FIX EvolutionAPI#4: Cancel existing timers in forceRestart() - Clears connectingTimer and safetyTimeout before setting flags - Prevents race conditions between timer execution and restart FIX EvolutionAPI#6: Prevent infinite loop in safety timeout - Sets isRestartInProgress=true BEFORE forcing close - This prevents connectionUpdate('close') from calling connectToWhatsapp() - Explicitly calls autoRestart() after delay instead of relying on close handler 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

FIX EvolutionAPI#1 - Memory Leak Event Listeners: - Add eventProcessUnsubscribe field to store ev.process() return value - Save unsubscribe function in eventHandler() - Call unsubscribe in cleanupClient() BEFORE client.end() - Prevents accumulation of stale listeners on each restart FIX EvolutionAPI#2 - Graceful Shutdown Parallel: - Change from sequential for-loop to Promise.allSettled() - Increase per-instance timeout from 5s to 10s - Skip instances in 'connecting' state - Add summary logging of shutdown results - All instances now close concurrently instead of sequentially 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

helioelias added 2 commits July 14, 2023 17:35

Change network in docker-compose(api, mongo and redis), add mongo-exp…

4453dff

…ress and rebrow, tools for maintenance and visualize data

Change network in docker-compose(api, mongo and redis), add mongo-exp…

8c4f299

…ress and rebrow, tools for maintenance and visualize data

helioelias closed this Jul 14, 2023

helioelias deleted the feature/mongo-express-rebrow branch July 14, 2023 18:00

mordocks mentioned this pull request May 31, 2024

[PT][BUG] EvoAPI não cria caixa de entrada no Chatwoot. #545

Closed

DavidsonGomes pushed a commit that referenced this pull request May 15, 2025

Merge pull request #2 from EvolutionAPI/develop

4f2b0c4

Develop

ricaelchiquetti pushed a commit to ricaelchiquetti/evolution that referenced this pull request Sep 19, 2025

Merge pull request EvolutionAPI#2 from ricaelchiquetti/fix/business_api

ed4c886

Fix/business api

jdinix mentioned this pull request Oct 14, 2025

QRCODE nao gerando, mensagens nao enviando e nem recebendo #2084

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Change network in docker-compose(api, mongo and redis), add mongo-exp… #2

Change network in docker-compose(api, mongo and redis), add mongo-exp… #2

Uh oh!

helioelias commented Jul 14, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Change network in docker-compose(api, mongo and redis), add mongo-exp… #2

Change network in docker-compose(api, mongo and redis), add mongo-exp… #2

Uh oh!

Conversation

helioelias commented Jul 14, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant