Feature/mongo express rebrow #4

helioelias · 2023-07-14T18:07:08Z

Add mongo-express for MongoDb
Add rebrow for Redis
Add docker-compose-full include all services in one docker-compose file
Remove and ajust networks on docker-compose file

…ress and rebrow, tools for maintenance and visualize data

DavidsonGomes

Very good, let's add develop for the next version

…baileys_7 fix: ajustar a manipulação do remoteJid na mensagem

Comprehensive optimization of auto-restart and health check system. Resolved all identified issues including memory leaks, race conditions, performance bottlenecks, and edge cases. CRITICAL FIXES (Deploy ASAP): FIX EvolutionAPI#1: Safety Timeout Memory Leak - Save safetyTimeout reference to allow cancellation - Cancel timeout on connection 'open', logout, and exception - Prevents accumulation of uncancelled timeouts - Impact: Eliminates memory leak (100 restart = 100 timeout leak) FIX EvolutionAPI#2: Max ForceRestart Attempts + Rate Limiting - Track forceRestartAttempts (max 5) - Min 5s interval between force restarts - Send INSTANCE_STUCK webhook when max reached - Reset counter on successful 'open' - Impact: Prevents infinite restart loop, alerts unrecoverable instances FIX EvolutionAPI#3: Database Fallback in PerformHealthCheck - Wrap DB query in try-catch - Safe fallback: skip force restart if DB down - Use cached ownerJid when available - Impact: System continues functioning with DB issues HIGH PRIORITY FIXES: FIX EvolutionAPI#4: Health Check Jitter (Anti-Thundering Herd) - Random jitter ±10s on health check interval - Distributes load over 50-70s window instead of 60s spike - Impact: Prevents 100 instances all checking simultaneously FIX EvolutionAPI#5: Stop Health Check During Connecting - stopHealthCheck() when entering 'connecting' state - Avoids wasted resources and potential conflicts - Impact: Cleaner state transitions, less overhead FIX EvolutionAPI#6: Reset ownerJid on Logout - Update DB to set ownerJid=null on logout - Allows safe instance name reuse - Impact: Health check won't trigger on new QR scan for reused name MEDIUM PRIORITY FIXES: FIX EvolutionAPI#7: LoadProxy Mutex - Simple mutex lock to prevent concurrent loadProxy() calls - Retry with 100ms delay if lock held - Impact: Prevents proxy config corruption from race conditions FIX EvolutionAPI#8: Proxy Test Cache + ownerJid Cache - Cache proxy test results for 2 minutes - Cache ownerJid in memory to avoid DB queries - Impact: Reduces external API calls and DB load by ~90% FIX EvolutionAPI#9: Await ConnectionUpdate Events - Add await to connectionUpdate() call in eventHandler - Sequentializes connection events - Impact: Prevents race conditions on rapid state changes FIX EvolutionAPI#11: Conditional Logging - Log health check only on state changes or milestones - Impact: Reduces log spam from 1000 log/min to ~10 log/min CONSISTENCY FIXES: FIX EvolutionAPI#15: Flag Consistency - Set isAutoRestartTriggered in forceRestart() (was missing) - Consistent with autoRestart() behavior - Impact: Correct flag coordination TOTALS: - 2 files modified - ~180 lines added/modified - 15 bugs/issues fixed - 1 CRITICAL memory leak eliminated - 3 HIGH severity issues resolved - 9 MEDIUM severity improvements - 2 LOW priority optimizations BENEFITS: - No more permanent deadlocks (30s recovery max) - No memory leaks from uncancelled timeouts - Handles DB/Redis failures gracefully - Scales better with many instances (jitter, cache, rate limiting) - Comprehensive webhook monitoring for stuck instances - Alerts when instances are unrecoverable - Better log management (less spam) - Production-ready for high-load scenarios

…ents permanent stuck CRITICAL BUG FOUND: - Instance was stuck in 'connecting' state for 9+ hours this morning - wasOpenBeforeReconnect flag was lost during forceRestart() safety timeout - Timer auto-restart couldn't start → permanent stuck state - Manual server restart required to recover ROOT CAUSE: 4 locations in code were resetting/losing wasOpenBeforeReconnect flag: 1. forceRestart() safety timeout (line 1338-1342) 2. forceRestart() catch block (line 1359-1362) 3. Health check safety net (line 1051-1054) 4. autoRestart() catch block (line 880-883) IMPACT: When these code paths executed, wasOpenBeforeReconnect was reset to false. Next reconnection attempt → timer check fails → no auto-restart → stuck forever. SOLUTION: Add explicit comments in all 4 locations to preserve the flag: - Safety timeout: Do NOT reset wasOpenBeforeReconnect - Catch blocks: Do NOT reset wasOpenBeforeReconnect - Health check: Do NOT reset wasOpenBeforeReconnect This ensures the flag is ALWAYS preserved across: - Timeout scenarios - Exception scenarios - Safety net scenarios VERIFICATION: - Test scenario EvolutionAPI#2 (408 timeout): ✅ Passed, reconnected in 4s - Instance recovered immediately after server restart - Flag preservation logic now consistent across all paths FILES MODIFIED: - src/api/integrations/channel/whatsapp/whatsapp.baileys.service.ts FIXES: - Bug EvolutionAPI#1: forceRestart() safety timeout preserves flag - Bug EvolutionAPI#2: forceRestart() catch preserves flag - Bug EvolutionAPI#3: Health check preserves flag - Bug EvolutionAPI#4: autoRestart() catch preserves flag 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

FIX #0: Set wasOpenBeforeReconnect=true in forceRestart() when restarting from 'open' state - This was the main cause of today's blocking - the flag was being reset in 'open' handler - Now properly captures state before cleanup to allow auto-restart timer FIX EvolutionAPI#1: Add finally blocks to autoRestart() and forceRestart() - Ensures isRestartInProgress is always reset even on uncaught exceptions - Prevents deadlock scenarios where flag remains stuck FIX EvolutionAPI#2: Verify createClient() success - Throws error if client is null after createClient() completes - Prevents silent failures that could cause infinite loops FIX EvolutionAPI#4: Cancel existing timers in forceRestart() - Clears connectingTimer and safetyTimeout before setting flags - Prevents race conditions between timer execution and restart FIX EvolutionAPI#6: Prevent infinite loop in safety timeout - Sets isRestartInProgress=true BEFORE forcing close - This prevents connectionUpdate('close') from calling connectToWhatsapp() - Explicitly calls autoRestart() after delay instead of relying on close handler 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

helioelias added 3 commits July 14, 2023 17:35

Change network in docker-compose(api, mongo and redis), add mongo-exp…

4453dff

…ress and rebrow, tools for maintenance and visualize data

Change network in docker-compose(api, mongo and redis), add mongo-exp…

8c4f299

…ress and rebrow, tools for maintenance and visualize data

docker-compose full services

666023d

DavidsonGomes approved these changes Jul 14, 2023

View reviewed changes

DavidsonGomes changed the base branch from main to develop July 14, 2023 18:18

DavidsonGomes merged commit 0fc160f into EvolutionAPI:develop Jul 14, 2023

ricaelchiquetti pushed a commit to ricaelchiquetti/evolution that referenced this pull request Oct 14, 2025

Merge pull request EvolutionAPI#4 from ricaelchiquetti/fix/evolution_…

f0c6300

…baileys_7 fix: ajustar a manipulação do remoteJid na mensagem

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feature/mongo express rebrow #4

Feature/mongo express rebrow #4

Uh oh!

helioelias commented Jul 14, 2023

Uh oh!

DavidsonGomes left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Feature/mongo express rebrow #4

Feature/mongo express rebrow #4

Uh oh!

Conversation

helioelias commented Jul 14, 2023

Uh oh!

DavidsonGomes left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants