This tutorial introduces how to define a trainable workflow.

AgentJet provides two convenient and mutually compatible ways to wrap your Workflow:

  • Simple: Emphasizes simplicity, ease of use, and readability
  • Advanced: Emphasizes flexibility, controllability, and extensibility

In this article, we use the AgentScope framework for demonstration.

Simple Practice

Simple Practice Abstract

  • Simply set the model argument of your AgentScope ReActAgent to tuner.as_agentscope_model() when initializing the agent.
  • Wrap your code in class MyWorkflow(Workflow) and your agent is ready to be tuned.

1. When to Use This Simple Practice

Choose Simple Practice If You...

  • Know exactly which agents should be trained, or the number of agents is small
  • Already finished basic debugging of your workflow
  • Do not need to change which agents are trained on the fly

2. Convert Your Workflow to AgentJet Trainable Workflow

The very first step is to create a class as a container to wrap your code:

from ajet import AjetTuner, Workflow, WorkflowOutput, WorkflowTask
class MyWorkflow(Workflow):
    async def execute(self, workflow_task: WorkflowTask, tuner: AjetTuner) -> WorkflowOutput:
        # ... your ReActAgent workflow here 🛩️ ...
        return WorkflowOutput(reward=..., metadata={...})

Next, use the tuner argument: replace your original model with the result of tuner.as_agentscope_model():

# Before:
model = DashScopeChatModel(model_name="qwen-max", stream=False)  # 🛩️ change here
agent_instance = ReActAgent(
   name="Friday",
   sys_prompt="You are a helpful assistant",
   model=model,
   formatter=DashScopeChatFormatter(),
)

# After:
model = tuner.as_agentscope_model()  # 🛩️ change here
agent_instance = ReActAgent(
   name="Friday",
   sys_prompt="You are a helpful assistant",
   model=model,
   formatter=DashScopeChatFormatter(),
)

AjetTuner

AjetTuner also provides the .as_raw_openai_sdk_client() and .as_oai_baseurl_apikey() methods, but .as_agentscope_model() is the most convenient option for an AgentScope agent workflow.

3. Code Example
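
Putting the two steps together, here is a minimal end-to-end sketch of the Simple Practice. The AgentScope import paths and calling convention (await agent(Msg(...))) assume AgentScope >= 1.0, and the task fields (metadata["query"], metadata["answer"]) and the substring-match reward are illustrative assumptions rather than AgentJet requirements.

from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeChatFormatter
from agentscope.message import Msg

from ajet import AjetTuner, Workflow, WorkflowOutput, WorkflowTask


class MyWorkflow(Workflow):
    async def execute(self, workflow_task: WorkflowTask, tuner: AjetTuner) -> WorkflowOutput:
        # the tuner-provided model takes the place of DashScopeChatModel
        agent = ReActAgent(
            name="Friday",
            sys_prompt="You are a helpful assistant",
            model=tuner.as_agentscope_model(),
            formatter=DashScopeChatFormatter(),
        )

        # illustrative task fields; adapt to how your tasks actually carry data
        query = workflow_task.task.metadata["query"]
        reference = workflow_task.task.metadata["answer"]

        reply = await agent(Msg("user", query, "user"))
        answer_text = reply.get_text_content() or ""

        # illustrative reward: substring match against the reference answer
        reward = 1.0 if reference in answer_text else 0.0
        return WorkflowOutput(reward=reward, metadata={"answer": answer_text})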

Advanced Practice

Advanced Practice Abstract

  • The tuner.as_agentscope_model() method has additional optional parameters; complete them to tell AgentJet the identity of each agent.
  • The ajet.Workflow class has an optional attribute trainable_targets; assign it manually to narrow down which agents are tuned.

1. When to Use Advanced Practice

When designing a multi-agent collaborative workflow where each agent plays a different role, AgentJet provides enhanced training and debugging capabilities.

Multi-Agent Benefits

With a multi-agent setup, you can:

  • Precisely control which agents are fine-tuned
  • Explicitly define the default model for agents not being trained
  • Switch trainable targets on the fly without modifying source code

2. How to Upgrade to the Advanced Practice

It is simple: only two additional points need to be taken care of:

i. .as_agentscope_model() has three optional parameters; complete them for each agent.

Parameter     Explanation
agent_name    The name of this agent
target_tag    A tag that marks the agent's category
debug_model   The model used when this agent is not being tuned

model_for_an_agent = tuner.as_agentscope_model(
    agent_name="AgentFriday",    # the name of this agent
    target_tag="Agent_Type_1",                # `target_tag in self.trainable_targets` means we train this agent, otherwise we do not train this agent.
    debug_model=OpenAIChatModel(
        model_name="Qwen/Qwen3-235B-A22B-Instruct-2507",
        stream=False,
        api_key="api_key",
    ),      # the model used when this agent is not in `self.trainable_targets`
)

ii. Workflow has an optional attribute called trainable_targets; configure it.

trainable_targets value                                 Explanation
trainable_targets = None                                All agents using as_agentscope_model will be trained
trainable_targets = ["Agent_Type_1", "Agent_Type_2"]    Agents with target_tag="Agent_Type_1", target_tag="Agent_Type_2", ... will be trained
trainable_targets = []                                  Illegal; no agents are trained

Scenario                               Model used
target_tag in trainable_targets        Trainable model
target_tag NOT in trainable_targets    Registered debug_model
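
For example, to tune only the agents tagged "Agent_Type_1" and leave every other agent on its debug_model, declare the attribute on your workflow class. This minimal sketch assumes, as the full example below suggests, that Field comes from pydantic:

from typing import List

from pydantic import Field  # assumed origin of `Field`, matching the example below
from ajet import Workflow


class MyMultiAgentWorkflow(Workflow):
    # only agents registered with target_tag="Agent_Type_1" are trained;
    # all others fall back to their registered debug_model
    trainable_targets: List[str] | None = Field(
        default=["Agent_Type_1"],
        description="List of agent tags to be fine-tuned.",
    )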

Warning

Regardless of target_tag differences, all agents share a single model instance (one set of model weights plays the different roles; the model receives different perceptions depending on the role it is playing).

3. Multi-Agent Example

Here's a complete example with multiple agent roles (Werewolves game):

from typing import List

import numpy as np
from loguru import logger
from pydantic import Field

# the AgentScope imports below assume the AgentScope >= 1.0 module layout
from agentscope.agent import ReActAgent
from agentscope.formatter import DashScopeMultiAgentFormatter, OpenAIMultiAgentFormatter
from agentscope.model import OpenAIChatModel

from ajet import AjetTuner, Workflow, WorkflowOutput, WorkflowTask

# `get_official_agent_prompt`, `werewolves_game` and `BadGuyException` come from
# the example's Werewolves game implementation and are omitted here.


class ExampleWerewolves(Workflow):
    trainable_targets: List[str] | None = Field(default=["werewolf"], description="List of agents to be fine-tuned.")

    async def execute(self, workflow_task: WorkflowTask, tuner: AjetTuner) -> WorkflowOutput:

        # ensure trainable targets is legal
        assert self.trainable_targets is not None, "trainable_targets cannot be None in ExampleWerewolves (because we want to demonstrate an explicit multi-agent case)."

        # bad guys and good guys cannot be trained simultaneously
        # (because mix-cooperation-competition MARL needs too many advanced techniques to be displayed here)
        if "werewolf" in self.trainable_targets:
            assert len(self.trainable_targets) == 1, "Cannot train hostile roles simultaneously."
        else:
            assert len(self.trainable_targets) != 0, "No trainable targets specified."

        # make and shuffle roles (fix random seed for reproducibility)
        roles = ["werewolf"] * 3 + ["villager"] * 3 + ["seer", "witch", "hunter"]
        task_id = workflow_task.task.metadata["random_number"]
        np.random.seed(int(task_id))
        np.random.shuffle(roles)

        # initialize agents
        players = []
        for i, role in enumerate(roles):
            default_model = OpenAIChatModel(
                model_name="/mnt/data_cpfs/model_cache/modelscope/hub/Qwen/Qwen/Qwen3-235B-A22B-Instruct-2507/",
                stream=False,
                client_args={"base_url": "http://22.17.52.4:2888/v1"},
                api_key="no_api_key",
                generate_kwargs={"temperature": 0.01},
            )
            model_for_this_agent = tuner.as_agentscope_model(
                agent_name=f"Player{i + 1}",    # the name of this agent
                target_tag=role,                # `target_tag in self.trainable_targets` means we train this agent, otherwise we do not train this agent.
                debug_model=default_model,      # the model used when this agent is not in `self.trainable_targets`
            )
            agent = ReActAgent(
                name=f"Player{i + 1}",
                sys_prompt=get_official_agent_prompt(f"Player{i + 1}"),
                model=model_for_this_agent,
                formatter=DashScopeMultiAgentFormatter()
                    if role in self.trainable_targets
                    else OpenAIMultiAgentFormatter(),
                max_iters=3 if role in self.trainable_targets else 5,
            )
            # agent.set_console_output_enabled(False)
            players += [agent]

        # reward condition
        try:
            good_guy_win = await werewolves_game(players, roles)
            raw_reward = 0
            is_success = False
            if (good_guy_win and self.trainable_targets[0] != "werewolf") or (
                not good_guy_win and self.trainable_targets[0] == "werewolf"
            ):
                raw_reward = 1
                is_success = True
            logger.warning(f"Raw reward: {raw_reward}")
            logger.warning(f"Is success: {is_success}")
        except Exception:
            # `BadGuyException` and any other failure get the same treatment:
            # the game cannot continue, so we punish the trainable agents,
            # whatever the cause (although they may be innocent).
            logger.bind(exception=True).exception(
                "Error during game execution. Game cannot continue; punishing trainable agents."
            )
            raw_reward = -0.1
            is_success = False

        return WorkflowOutput(reward=raw_reward, is_success=is_success)

Configuration Flexibility

In this example:

  • role describes an agent's in-game identity (werewolf, villager, etc.)
  • default_model (passed as debug_model) defines the model used when the role is not being trained
  • You can flexibly switch training targets by modifying trainable_targets, as sketched below
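
For instance, assuming Workflow behaves like a pydantic model (as the Field declaration suggests) and can be instantiated directly with keyword overrides, the same game can be retargeted without editing its logic; the instantiation below is a sketch of that idea, not a prescribed AgentJet entry point:

# default: train the werewolves (trainable_targets=["werewolf"])
wolf_side_run = ExampleWerewolves()

# retarget training to the good guys without changing any game code
good_side_run = ExampleWerewolves(
    trainable_targets=["villager", "seer", "witch", "hunter"]
)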

TinkerJet

Wrap and train your agent on a machine without a GPU.

Work in progress; coming soon.

Next Steps