Skip to content

fix(api): fix setTargetBlockTemperature() race condition#21278

Open
mjhuff wants to merge 1 commit intoedgefrom
api_fix-set-target-block-temp-race-condition
Open

fix(api): fix setTargetBlockTemperature() race condition#21278
mjhuff wants to merge 1 commit intoedgefrom
api_fix-set-target-block-temp-race-condition

Conversation

@mjhuff
Copy link
Copy Markdown
Contributor

@mjhuff mjhuff commented Apr 15, 2026

Closes RESC-564

Overview

A race condition causes the PE waitForBlockTemperature() to read stale thermocycler state before the PE setTargetBlockTemperature()'s background task has updated it. This is most visible when calling the PAPI set_block_temperature() twice at the same temperature, first without a hold, then with one, causing the hold to be skipped entirely.

For example

tc.set_block_temperature(temperature=37) 
tc.set_block_temperature(temperature=37, hold_time_minutes=1)

When the PAPI set_block_temperature() is called, it dispatches two PE commands:

  1. setTargetBlockTemperature, which spawns a background task and returns immediately.
  2. waitForBlockTemperature, which runs concurrently, reading TC state.

The explicit wait can read the hold_time value before the background task has sent the new M104 command to the firmware, causing it to see stale data (ex., hold_time=0) from the previous command and exit immediately.

To fix, we can add a wait_for_lock_release() to TaskHandler and utilize it in waitForBlockTemperature() before reading thermocycler state. This ensures the background task from setTargetBlockTemperature() has completed sending the M104 command before the explicit wait reads the hold time.

Test Plan and Hands on Testing

Ran the following minimal protocol on edge and then on this branch to confirm the fix:

from opentrons import protocol_api

metadata = {
    "protocolName": "TC Hold Bug",
}

requirements = {"robotType": "Flex", "apiLevel": "2.28"}


def run(protocol: protocol_api.ProtocolContext) -> None:
    tc = protocol.load_module("thermocyclerModuleV2", "B1")
    tc.open_lid()

    tc.set_block_temperature(temperature=37.0)

    # Hold time should be respected.
    tc.set_block_temperature(temperature=37.0, hold_time_minutes=1)

    tc.deactivate_block()

Changelog

  • Fixed a race condition in set_block_temperature() that may cause the thermocycler not to respect the hold_time value before proceeding to the next protocol command.

Review requests

  • Are we happy with the solution's layer of abstraction? I think it's either here in PE land or in hardware land, and it seems more correct to place the lock release logic here, but I'm open to thoughts.

Risk assessment

  • low, in theory. the lock is ephemeral, but perhaps TC logic elsewhere is inadvertently dependent on this bug existing for proper functionality.

Copy link
Copy Markdown
Member

@sfoster1 sfoster1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, that's gross. Good catch!

Copy link
Copy Markdown
Contributor

@rclarke0 rclarke0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, I think this placement makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants