
Conversation

icecrasher321 (Collaborator) commented Dec 22, 2025

Summary

  • Memory + streaming fixes: wrap the streaming response so memory is persisted once accumulation completes (see the sketch after this list)
  • Memory should not be block-scoped; it is now managed via conversation ID
  • Move cost calculation to the same level as token count calculation; this fixes bugs where client-side execution did not track cost for chat
  • Type providers using official SDK types and remove fallbacks that failed silently when hit
  • Support for parallel tool calls
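
As a rough illustration of the first item, here is a minimal sketch of wrapping a streaming response so the accumulated assistant message is persisted only once the stream ends. wrapStreamForPersistence and persistToMemory are illustrative names, not the PR's actual implementation:

    // Buffer chunks while re-emitting them, then flush the full assistant
    // message to memory once the source stream completes.
    function wrapStreamForPersistence(
      source: ReadableStream<string>,
      persistToMemory: (content: string) => Promise<void>
    ): ReadableStream<string> {
      let accumulated = ''
      const reader = source.getReader()
      return new ReadableStream<string>({
        async pull(controller) {
          const { done, value } = await reader.read()
          if (done) {
            // Persist only after the full message has been accumulated.
            await persistToMemory(accumulated)
            controller.close()
            return
          }
          accumulated += value
          controller.enqueue(value)
        },
        cancel(reason) {
          return reader.cancel(reason)
        },
      })
    }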

Type of Change

  • Bug fix
  • New feature

Testing

Tested manually with OpenAI, Anthropic, and Gemini, in both streaming and non-streaming modes.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)


vercel bot commented Dec 22, 2025

The latest updates on your projects.

Project    Deployment    Review              Updated (UTC)
docs       Ready         Preview, Comment    Dec 22, 2025 10:57pm

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 22, 2025

Greptile Summary

This PR refactors memory management, improves provider type safety, and consolidates cost calculation logic. The changes address three key areas:

Memory Management Improvements

  • Changed memory scoping from workflowId to workspaceId for conversation-level persistence
  • Wrapped streaming responses to persist assistant messages after stream completion
  • Used atomic PostgreSQL array concatenation (||) for race-safe appends (sketched after this list)
  • Migration backfills workspaceId from workflows and deletes orphaned records
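
A hedged sketch of what the race-safe append can look like with Drizzle. The table and column names are assumptions rather than the actual schema, and db stands in for an initialized Drizzle Postgres client:

    import { sql, type SQL } from 'drizzle-orm'

    // Appending via jsonb concatenation (||) inside a single UPDATE keeps the
    // read-modify-write on the database side, so concurrent appends interleave
    // instead of overwriting one another.
    async function appendMessage(
      db: { execute: (query: SQL) => Promise<unknown> },
      workspaceId: string,
      key: string,
      message: unknown
    ): Promise<void> {
      await db.execute(sql`
        UPDATE memory
        SET data = data || ${JSON.stringify([message])}::jsonb
        WHERE workspace_id = ${workspaceId} AND key = ${key}
      `)
    }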

Provider Type Safety

  • Added explicit TypeScript types from official SDKs (ChatCompletionCreateParamsStreaming, RawMessageStreamEvent, etc.), as sketched after this list
  • Removed fallback code for legacy response formats (Google's content.function_call)
  • Enhanced streaming utilities to track token usage from provider events
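
A minimal sketch of the typed-request style with the OpenAI SDK. The model name is a placeholder, and stream_options.include_usage is shown because it makes the provider emit token usage on the final stream event:

    import OpenAI from 'openai'
    import type { ChatCompletionCreateParamsStreaming } from 'openai/resources/chat/completions'

    const client = new OpenAI()

    async function streamChat(prompt: string): Promise<void> {
      // Typing the params with the SDK's own type surfaces unsupported or
      // misspelled fields at compile time instead of via silent fallbacks.
      const params: ChatCompletionCreateParamsStreaming = {
        model: 'gpt-4o', // placeholder model name
        stream: true,
        stream_options: { include_usage: true },
        messages: [{ role: 'user', content: prompt }],
      }
      const stream = await client.chat.completions.create(params)
      for await (const chunk of stream) {
        process.stdout.write(chunk.choices[0]?.delta?.content ?? '')
        if (chunk.usage) {
          // With include_usage, the final chunk carries token counts.
          console.log(`\ntokens: ${chunk.usage.total_tokens}`)
        }
      }
    }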

Cost Calculation Consolidation

  • Moved cost calculation from logger to provider layer for consistency
  • Cost is now calculated in real time during streaming using onComplete callbacks (sketched after this list)
  • Both streaming and non-streaming paths now compute costs at the same level
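
To make the "same level" point concrete, a sketch with placeholder pricing; the rates and the Usage shape are assumptions, not the repository's actual model table:

    interface Usage {
      promptTokens: number
      completionTokens: number
    }

    // Placeholder USD rates per million tokens (assumed, not real pricing).
    const PRICE_PER_1M = { input: 2.5, output: 10 }

    function calculateCost(usage: Usage): number {
      return (
        (usage.promptTokens / 1_000_000) * PRICE_PER_1M.input +
        (usage.completionTokens / 1_000_000) * PRICE_PER_1M.output
      )
    }

    // Both paths funnel through the same helper: streaming calls it from the
    // stream's onComplete callback, non-streaming calls it directly.
    function onComplete(usage: Usage): { usage: Usage; cost: number } {
      return { usage, cost: calculateCost(usage) }
    }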

Issues Found

  • Memory seeding in agent-handler.ts:714 has race condition potential when concurrent requests use the same conversationId
  • Sliding-window memory modes use a fetch-modify-write pattern that could lose messages during concurrent appends (illustrated after this list)
  • Cost tracking happens in the streaming callback before the response completes, so interrupted streams may record costs for incomplete responses
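
A minimal sketch of the fetch-modify-write hazard behind the sliding-window finding; the function names are illustrative, not the handler's actual code:

    // Two concurrent calls can both read the same snapshot in step 1; the
    // later write in step 3 then silently drops the other call's message.
    async function appendWithSlidingWindow(
      fetchMessages: () => Promise<string[]>,
      saveMessages: (messages: string[]) => Promise<void>,
      message: string,
      windowSize: number
    ): Promise<void> {
      const existing = await fetchMessages()       // 1. read snapshot
      const next = [...existing, message]          // 2. modify locally
      await saveMessages(next.slice(-windowSize))  // 3. write back, racing step 1
    }

Serializing writers per conversation (for example with a transaction-scoped row lock) would close this window, at some cost in throughput.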

Confidence Score: 4/5

  • Safe to merge with minor race condition risks in high-concurrency scenarios
  • The PR implements solid architectural improvements with proper type safety and atomic database operations. However, memory seeding and sliding window modes have race condition vulnerabilities when multiple requests use the same conversationId concurrently. These are edge cases that may not surface in typical usage but could cause message loss or duplication under load. The provider type safety improvements and cost calculation consolidation are well-executed.
  • Focus on apps/sim/executor/handlers/agent/agent-handler.ts (seeding race condition) and apps/sim/executor/handlers/agent/memory.ts (sliding-window race condition)

Important Files Changed

apps/sim/executor/handlers/agent/memory.ts
    Refactored memory from workflow-scoped to workspace-scoped with atomic PostgreSQL operations; sliding-window modes still carry race condition risk
apps/sim/executor/handlers/agent/agent-handler.ts
    Wrapped streaming responses for memory persistence and moved memory seeding logic; seeding has a potential race condition with concurrent requests
apps/sim/providers/openai/index.ts
    Added proper TypeScript types from the OpenAI SDK and moved cost calculation into the streaming callback for real-time tracking
apps/sim/providers/anthropic/utils.ts
    Enhanced streaming to track token usage from Anthropic SDK events and invoke the callback with accumulated content and usage metrics
packages/db/schema.ts
    Changed the memory table from workflowId to workspaceId scoping, with a unique constraint on workspace+key
packages/db/migrations/0130_bored_master_chief.sql
    Migration backfills workspaceId from workflows and deletes orphaned memory records that couldn't be resolved
apps/sim/providers/google/utils.ts
    Removed the legacy function_call format fallback; now only supports functionCall in parts

Sequence Diagram

sequenceDiagram
    participant Client
    participant AgentHandler
    participant Provider
    participant MemoryService
    participant Database

    Client->>AgentHandler: execute(block, inputs)
    AgentHandler->>AgentHandler: buildMessages()
    
    alt Memory Enabled
        AgentHandler->>MemoryService: fetchMemoryMessages()
        MemoryService->>Database: SELECT data FROM memory
        Database-->>MemoryService: existing messages
        
        alt First Run (no existing)
            AgentHandler->>MemoryService: seedMemory(conversationMessages)
            MemoryService->>Database: INSERT memory record
        else Existing Memory
            alt New User Message
                AgentHandler->>MemoryService: appendToMemory(userMessage)
                MemoryService->>Database: UPDATE with array concat
            end
        end
    end

    AgentHandler->>Provider: executeProviderRequest()
    
    alt Streaming Response
        Provider-->>AgentHandler: StreamingExecution
        
        alt Memory Enabled
            AgentHandler->>MemoryService: wrapStreamForPersistence()
            MemoryService-->>AgentHandler: wrapped stream
        end
        
        AgentHandler-->>Client: stream chunks
        
        Note over MemoryService: Stream accumulates content
        
        MemoryService->>MemoryService: flush() callback
        MemoryService->>Database: UPDATE memory (append assistant message)
        MemoryService->>Provider: calculate cost from usage
        
    else Non-Streaming Response
        Provider-->>AgentHandler: BlockOutput
        Provider->>Provider: calculate cost from tokens
        
        alt Memory Enabled
            AgentHandler->>MemoryService: persistResponseToMemory()
            MemoryService->>Database: UPDATE memory (append assistant)
        end
        
        AgentHandler-->>Client: complete response
    end

icecrasher321 (Collaborator, Author) commented:

@greptile

greptile-apps bot (Contributor) left a comment:

39 files reviewed, 1 comment


icecrasher321 (Collaborator, Author) commented:

@greptile

greptile-apps bot (Contributor) left a comment:

52 files reviewed, 4 comments


icecrasher321 (Collaborator, Author) commented:

@greptile

greptile-apps bot (Contributor) left a comment:

Additional Comments (1)

  1. apps/sim/providers/openai/utils.ts, lines 37-39

    Logic: the code warns if usage data is missing, but still calls onComplete with zeros.

    If OpenAI doesn't return usage tokens (which can happen in some edge cases or on errors), the calculated cost will be zero. This silently underreports costs rather than failing visibly.

    Consider whether this should throw or fall back to token estimation (one possible shape is sketched below). Is zero cost acceptable for streams without usage data, or should you estimate tokens from the accumulated content?

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
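
A hedged sketch of the estimation fallback the comment floats; the roughly-four-characters-per-token heuristic is a crude assumption, not an official tokenizer:

    // Fall back to estimating completion tokens from accumulated text when
    // the provider reported no usage data.
    function estimateTokens(text: string): number {
      return Math.ceil(text.length / 4) // ~4 chars per token, rough heuristic
    }

    function resolveUsage(
      reported: { promptTokens: number; completionTokens: number } | undefined,
      accumulatedContent: string
    ): { promptTokens: number; completionTokens: number } {
      if (reported && (reported.promptTokens > 0 || reported.completionTokens > 0)) {
        return reported
      }
      console.warn('No usage data from provider; estimating from accumulated content')
      return { promptTokens: 0, completionTokens: estimateTokens(accumulatedContent) }
    }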

54 files reviewed, 12 comments


waleedlatif1 (Collaborator) commented:

@greptile

greptile-apps bot (Contributor) left a comment:

115 files reviewed, 6 comments

