When you interact with UI elements in AskUI, the following practices help keep automation workflows robust and reliable.

Action Selection

Choose the Right Action

  • Use the simplest action that accomplishes your goal
    • Prefer click() over complex mouse operations when possible
    • Use type() for text input instead of individual keyboard events
    • Choose native methods over tool-based approaches when available

Consider Element State

  • Verify element visibility before interacting
  • Check if elements are enabled before clicking
  • Account for loading states and dynamic content
  • Handle overlapping elements by being specific with locators
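
The state checks above can be folded into a small guard. This is a minimal sketch rather than part of AskUI itself: `click_when_ready` is a hypothetical helper, and it assumes that `agent.get` accepts `response_schema=bool` in the same way the examples in this guide pass `str`.

```python
def click_when_ready(agent, element: str) -> bool:
    """Click `element` only if the agent reports it as visible and enabled.

    Hypothetical helper: `agent` is any object with `get()` and `click()`
    methods shaped like the ones used elsewhere in this guide.
    """
    question = f"Is the {element} visible and enabled?"
    # Assumption: a bool response schema yields a plain True/False answer.
    ready = agent.get(question, response_schema=bool)
    if ready:
        agent.click(element)
    return bool(ready)
```

Returning the result lets the caller decide whether to wait and retry or to fail the step when the element is not ready.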

Keyboard Operations

Modifier Key Management

  • Always release modifier keys after use to prevent stuck keys
  • Use context managers when possible for automatic cleanup
  • Combine modifiers properly for complex shortcuts
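
from askui import VisionAgent

# Good: the context manager cleans up when the block exits
with VisionAgent() as agent:
    agent.keyboard("a", modifier_keys=["control"])

    # Also good: explicit key management, with the release guaranteed
    agent.key_down("control")
    try:
        agent.keyboard("a")
    finally:
        agent.key_up("control")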
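
Explicit key management can itself be wrapped in a context manager so that the release cannot be forgotten. The sketch below assumes only the `key_down` and `key_up` methods shown in this section; `held_modifier` is a hypothetical helper name, not an AskUI function.

```python
from contextlib import contextmanager


@contextmanager
def held_modifier(agent, key: str):
    """Hold `key` for the duration of the block, then always release it."""
    agent.key_down(key)
    try:
        yield agent
    finally:
        # Runs even if the body raises, so the key cannot stay stuck.
        agent.key_up(key)


# Intended usage (assumes an open agent):
#     with held_modifier(agent, "control"):
#         agent.keyboard("a")
```

Because the release sits in a `finally` clause, an exception raised inside the block still leaves the keyboard in a clean state.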

Key Combination Best Practices

  • Use standard shortcuts that work across platforms
  • Allow time between keystrokes for complex sequences
  • Test keyboard operations across different OS environments
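
One way to allow time between keystrokes is to drive a sequence through a small helper that pauses between steps. This is a sketch under assumptions: `press_sequence` and its `pause_s` default are illustrative, and the helper relies only on the `agent.keyboard(key, modifier_keys=...)` call used in this guide.

```python
import time


def press_sequence(agent, steps, pause_s=0.2, sleep=time.sleep):
    """Press each (key, modifiers) step in order, pausing between steps.

    `steps` is an iterable of (key, modifiers) pairs; `modifiers` may be
    empty. `sleep` is injectable so the pause can be replaced in tests.
    """
    for index, (key, modifiers) in enumerate(steps):
        if index > 0:
            sleep(pause_s)  # give the application time to react
        if modifiers:
            agent.keyboard(key, modifier_keys=list(modifiers))
        else:
            agent.keyboard(key)
```

A suitable pause depends on the application under test, so treat the default as a starting point and tune it per environment.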

Tool Usage Guidelines

Tool Selection

  1. OS Tools: Use for low-level system operations
  2. Browser Tools: Use for web-specific tasks
  3. Clipboard Tools: Use for cross-application data transfer
  4. Native Methods: Prefer these for standard interactions

Tool Combination

# Example: Combining tools effectively
from askui import VisionAgent

with VisionAgent() as agent:
    # Copy from one application: select all, then copy
    agent.click("data field")
    agent.keyboard("a", modifier_keys=["control"])
    agent.keyboard("c", modifier_keys=["control"])

    # Switch application (Alt+Tab on Windows/Linux) and paste
    agent.keyboard("tab", modifier_keys=["alt"])
    agent.click("input field")
    agent.keyboard("v", modifier_keys=["control"])

Information Extraction

Query Specificity

  • Be specific in your questions to get accurate results
  • Use structured schemas for complex data extraction
  • Validate extracted data before using it

from askui import ResponseSchemaBase, VisionAgent

class ProductInfo(ResponseSchemaBase):
    name: str
    price: float
    in_stock: bool

with VisionAgent() as agent:
    # Specific query with a structured schema
    product = agent.get(
        "What is the product name, price, and stock status?",
        response_schema=ProductInfo,
    )
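
Validating extracted data can be as simple as a few sanity checks before the values are used. The sketch below uses a plain dataclass, `Product`, as a stand-in with the same three fields as `ProductInfo`, so it runs without AskUI installed; the specific rules (non-empty name, non-negative price) are illustrative assumptions, not requirements of the library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    """Stand-in with the same fields as the ProductInfo schema."""
    name: str
    price: float
    in_stock: bool


def validation_errors(product) -> List[str]:
    """Return a list of problems found in an extracted product record."""
    errors = []
    if not product.name.strip():
        errors.append("name is empty")
    if product.price < 0:
        errors.append(f"price is negative: {product.price}")
    return errors
```

An empty list means the record passed every check; otherwise the messages can be logged or used to trigger a second extraction attempt.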

Performance Considerations

  • Cache extracted information when used multiple times
  • Batch queries when extracting multiple data points
  • Use appropriate timeouts for extraction operations
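
This guide does not name a timeout parameter for `agent.get`, so the sketch below enforces one from the outside with the standard library. `get_with_timeout` is a hypothetical helper. Note that it stops waiting after `timeout_s` seconds but does not cancel the underlying call, which keeps running in its worker thread until it finishes.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def get_with_timeout(agent, question, schema, timeout_s: float, default=None):
    """Run `agent.get` in a worker thread and give up after `timeout_s`."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(agent.get, question, response_schema=schema)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The extraction took too long; fall back to the default value.
        return default
    finally:
        # Do not block here on a call that may still be running.
        executor.shutdown(wait=False)
```

If the library offers a native timeout setting, prefer that over this wrapper, since a native timeout can actually stop the work.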

Timing and Synchronization

Wait Strategies

  1. Implicit waits: Built into AskUI operations
  2. Explicit waits: Use when needed for dynamic content
  3. Smart waits: Wait for specific conditions
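
# Retry the click until the element appears (up to 10 attempts)
for _ in range(10):
    try:
        agent.click("Dynamic Button")
        break
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        agent.wait(1)  # pause one second before retrying
else:
    # The loop finished without a successful click
    raise TimeoutError("Dynamic Button did not appear after 10 attempts")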
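
A smart wait polls a condition until it holds or a deadline passes. The following is a generic sketch, independent of AskUI: `wait_until` is a hypothetical helper, and the condition is any zero-argument callable, for example a lambda that asks the agent whether an element is visible (assuming a `bool` response schema is supported).

```python
import time


def wait_until(condition, timeout_s=10.0, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll `condition` until it returns a truthy value or time runs out.

    Returns True as soon as the condition holds, False on timeout.
    `clock` and `sleep` are injectable to keep the helper testable.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Intended usage (assumes an open agent and bool schema support):
#     wait_until(lambda: agent.get("Is the Save button visible?",
#                                  response_schema=bool))
```

The condition is always checked at least once, so a timeout of zero still performs a single check.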

Performance Optimization

  • Minimize unnecessary waits
  • Use element detection to verify readiness
  • Batch operations when possible
  • Monitor execution time for optimization opportunities
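
Monitoring execution time needs little more than a timer around each step. The sketch below is one possible shape, built on the standard library; `timed`, `timings`, and `slowest` are hypothetical names introduced for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Maps a step label to the list of durations recorded for it (seconds).
timings = defaultdict(list)


@contextmanager
def timed(label, clock=time.perf_counter):
    """Record how long the enclosed block takes under `label`."""
    start = clock()
    try:
        yield
    finally:
        timings[label].append(clock() - start)


def slowest(n=3):
    """Return the `n` labels with the largest total recorded time."""
    totals = {label: sum(values) for label, values in timings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

Wrapping a step as `with timed("login"): ...` and reviewing `slowest()` after a run shows where waits or extractions dominate.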

Data Extraction Performance

When extracting data from UI elements, consider these optimization strategies:

Batch Extraction

Instead of making multiple calls, extract related fields in a single structured query:

# Inefficient
name = agent.get("What is the username?", response_schema=str)
email = agent.get("What is the email?", response_schema=str)
role = agent.get("What is the role?", response_schema=str)

# Efficient - single extraction
class UserInfo(ResponseSchemaBase):
    name: str
    email: str
    role: str

user = agent.get("Extract user information", response_schema=UserInfo)

Caching Results

from typing import List

class DataCache:
    """Memoize extraction results by key so each query runs only once."""

    def __init__(self):
        self._cache = {}

    def get_or_extract(self, agent, key, question, schema):
        # Query the agent only on a cache miss
        if key not in self._cache:
            self._cache[key] = agent.get(question, response_schema=schema)
        return self._cache[key]

cache = DataCache()
# Reuse expensive extractions
table_data = cache.get_or_extract(
    agent,
    "main_table",
    "Extract the main data table",
    List[dict],
)

Cross-Platform Considerations

Platform-Specific Handling

import platform

from askui import VisionAgent

with VisionAgent() as agent:
    # "Select all" uses Command on macOS and Control elsewhere
    if platform.system() == "Darwin":  # macOS
        agent.keyboard("a", modifier_keys=["command"])
    else:  # Windows/Linux
        agent.keyboard("a", modifier_keys=["control"])
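
When many shortcuts need the same branching, the platform check can live in one place. This is a small sketch: `primary_modifier` and `select_all` are hypothetical helpers, and the modifier names are the ones used in this section.

```python
import platform


def primary_modifier(system=None) -> str:
    """Return the main shortcut modifier for the given or current OS."""
    system = system or platform.system()
    return "command" if system == "Darwin" else "control"


def select_all(agent, system=None) -> None:
    """Send the platform's "select all" shortcut through the agent."""
    agent.keyboard("a", modifier_keys=[primary_modifier(system)])
```

Passing `system` explicitly makes it possible to exercise both branches on a single machine.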

Resolution Independence

  • Use relative positioning instead of absolute coordinates
  • Test across different screen resolutions
  • Account for scaling factors on high-DPI displays
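
Where coordinates cannot be avoided, storing them as fractions of the screen keeps them valid across resolutions. The sketch below shows the arithmetic only; `to_pixels` is a hypothetical helper, and it assumes `width` and `height` are given in logical units with `scale` as the display scaling factor.

```python
def to_pixels(rel_x, rel_y, width, height, scale=1.0):
    """Convert a relative position (0.0 to 1.0) to physical pixels.

    `width` and `height` are the logical screen size; `scale` is the
    display scaling factor (for example 2.0 on many high-DPI screens).
    """
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        raise ValueError("relative coordinates must be between 0.0 and 1.0")
    return round(rel_x * width * scale), round(rel_y * height * scale)
```

For example, the center of a 1920 by 1080 logical screen is `to_pixels(0.5, 0.5, 1920, 1080)`, which gives `(960, 540)`, and the same call with `scale=2.0` gives `(1920, 1080)`.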

Debugging and Troubleshooting

Logging Best Practices

from askui import VisionAgent

with VisionAgent(log_level="DEBUG") as agent:
    # Detailed logging for troubleshooting
    agent.click("problematic button")

Visual Debugging

  • Take screenshots before and after critical actions
  • Use annotation features to verify element detection
  • Enable visual feedback during development
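
Capturing the screen before and after a critical action can be done with a thin wrapper. This guide does not name a screenshot API, so the sketch takes one as a parameter: `capture` is a hypothetical callable that saves a screenshot under the given name, and `with_snapshots` is likewise an illustrative helper.

```python
def with_snapshots(action, capture, label):
    """Run `action`, capturing the screen before and after it.

    `capture(name)` is supplied by the caller and is expected to save a
    screenshot under `name`. The "after" capture also runs when the
    action raises, which is when it tends to be most useful.
    """
    capture(f"{label}-before")
    try:
        return action()
    finally:
        capture(f"{label}-after")
```

A call such as `with_snapshots(lambda: agent.click("Submit"), save_screenshot, "submit")` would then leave a before and after image for that step, with `save_screenshot` standing in for whatever capture function is available.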

Summary

Following these best practices ensures:
  1. Reliability: Actions work consistently across environments
  2. Maintainability: Code is easy to understand and modify
  3. Performance: Operations execute efficiently
  4. Robustness: Automation handles edge cases gracefully

For practical implementation details, see the how-to guide on interacting with elements.