feat(sdk): allow custom factors via parseBoxToScreenCoords api #162

Merged
ycjcl868 merged 7 commits into main from refactor/coords
Feb 28, 2025

@ulivz (Collaborator) commented Feb 27, 2025

Summary

The SDK already supports custom factors in the input stage via a custom model:

import { AzureOpenAI } from 'openai';
import { ChatCompletionMessageParam } from 'openai/resources/chat/completions';
import { UITarsModel } from '@ui-tars/sdk/core';

export class MyCustomModel extends UITarsModel {
  private readonly openai: AzureOpenAI;

  // Custom [width, height] factors used in the input stage.
  override get factors(): [number, number] {
    return [1366, 768];
  }

  // ...
}

The action processing stage (Operator) should also support custom factors:

import { Operator, parseBoxToScreenCoords } from '@ui-tars/sdk/core';

export class MyCustomOperator extends Operator {
  public async execute(params: ExecuteParams): Promise<ExecuteOutput> {
    const coords = parseBoxToScreenCoords({
      boxStr: params.startBoxStr,
      screenWidth: <screenWidth>,
      screenHeight: <screenHeight>,
      factors: [1366, 768],
    });
  }
}

This pull request also refactors the factors-related code.
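
To make the role of factors concrete, here is a minimal, self-contained sketch of how a box string could be mapped to screen pixels. It is not the SDK's implementation of parseBoxToScreenCoords: the function name sketchBoxToScreenCoords, the default of [1000, 1000], and the assumption that the box values are expressed in factor space (0..factor) are all hypothetical and only there for illustration.

```typescript
// Hypothetical sketch, NOT the SDK's parseBoxToScreenCoords.
// Assumption: box values are expressed in "factor space", i.e. x ranges
// over 0..factors[0] and y over 0..factors[1].
type Factors = [number, number];

interface BoxToScreenParams {
  boxStr: string;
  screenWidth: number;
  screenHeight: number;
  factors?: Factors;
}

// Assumed default for this sketch only.
const ASSUMED_DEFAULT_FACTORS: Factors = [1000, 1000];

function sketchBoxToScreenCoords(
  params: BoxToScreenParams,
): { x: number; y: number } {
  const {
    boxStr,
    screenWidth,
    screenHeight,
    factors = ASSUMED_DEFAULT_FACTORS,
  } = params;
  const [widthFactor, heightFactor] = factors;

  // Strip brackets/parentheses and parse the comma-separated numbers.
  const nums = boxStr
    .replace(/[\[\]()]/g, '')
    .split(',')
    .map((part) => Number.parseFloat(part.trim()));

  if (nums.length < 2 || nums.some((n) => Number.isNaN(n))) {
    throw new Error(`Cannot parse box string: ${boxStr}`);
  }

  // A point "[x, y]" is treated as a zero-size box "[x, y, x, y]".
  const [x1, y1, x2 = x1, y2 = y1] = nums;

  // Take the box center, normalize by the factors, scale to the screen.
  const centerX = (x1 + x2) / 2;
  const centerY = (y1 + y2) / 2;

  return {
    x: Math.round((centerX / widthFactor) * screenWidth),
    y: Math.round((centerY / heightFactor) * screenHeight),
  };
}
```

Under these assumptions, a box centered at (683, 384) with factors [1366, 768] lands in the middle of the screen regardless of resolution, which is why the Model and the Operator need to agree on the same factors.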

@ycjcl868 (Collaborator)

Factors may end up scattered in various places. I suggest passing factors into execute, because the operator should not have to concern itself with factors.

codecov bot commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 24.93%. Comparing base (2845023) to head (bfe1b19).
Report is 1 commits behind head on main.

Files with missing lines         Patch %   Lines
src/main/shared/setOfMarks.ts    50.00%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #162      +/-   ##
==========================================
+ Coverage   24.75%   24.93%   +0.18%     
==========================================
  Files          69       67       -2     
  Lines        1515     1516       +1     
  Branches      237      238       +1     
==========================================
+ Hits          375      378       +3     
+ Misses       1096     1094       -2     
  Partials       44       44              


@ycjcl868 ycjcl868 merged commit 3327d9d into main Feb 28, 2025
10 checks passed
@ycjcl868 ycjcl868 deleted the refactor/coords branch February 28, 2025 03:45