RPALite

Python Programming Guide

Introduction
Platform Support
OCR Engine Configuration
Installation
Basic Usage
Advanced Features
Troubleshooting
Creating an RPALite Object
Application Operations
Simulating Mouse Actions
Keyboard/Text Operations
Clipboard Operations
- Getting Clipboard Text
- Copying Text to Clipboard
Image Operations
Control Operations
Window Operations
- Finding Windows by Title
Screen Recording
- Starting Screen Recording
- Stopping Screen Recording
Global Operations

Introduction

This guide provides detailed information on how to use RPALite with Python. RPALite is an open-source RPA (Robotic Process Automation) library that allows you to automate various tasks through Python.

Platform Support

RPALite currently supports the following platforms:

Windows: Full automation support including UI controls
macOS (Under Development): Basic automation support is implemented with some limitations

OCR Engine Configuration

RPALite supports two OCR engines:

EasyOCR (Default)
- Supports more languages out of the box
- Better for general purpose OCR
- Larger model size
PaddleOCR
- Better performance for Chinese text recognition
- Smaller model size
- Faster inference speed

You can configure the OCR engine when initializing RPALite:

# Use EasyOCR (default)
rpalite = RPALite(ocr_engine="easyocr")

# Use PaddleOCR
rpalite = RPALite(ocr_engine="paddleocr")

Installation

To install RPALite, use pip:

pip install RPALite

Basic Usage

Here’s a simple example of using RPALite with Python:

from RPALite import RPALite

# Initialize RPALite
rpalite = RPALite()

# Show desktop
rpalite.show_desktop()

# Run Notepad and input text
rpalite.run_command("notepad.exe")
rpalite.input_text("Hello from RPALite!")

# Find and close Notepad
app = rpalite.find_application(".*Notepad")
rpalite.close_app(app)

Advanced Features

RPALite provides many advanced features including:

Image recognition
OCR (Optical Character Recognition) with support for multiple engines
Window management
Clipboard operations
Keyboard and mouse control
Screen recording
Control finding and automation
Waiting mechanisms for text and images

Troubleshooting

If you encounter any issues:

Ensure you have the required permissions
Check the log files for error messages
Verify that all dependencies are properly installed
For Windows, ensure you have administrative privileges if required

Using RPALite in Python

You can directly navigate to each section via the following links in the Table of Contents.

Creating an RPALite Object

Since RPALite is declared as a class, you need to create an RPALite object before performing any operations:

from RPALite import RPALite
rpalite = RPALite()

The constructor of RPALite includes multiple optional parameters:

debug_mode: Boolean, default value is False. If set to True, RPALite will output debug information and mark elements in images during operations that require image recognition.
ocr_engine: String, default value is “easyocr”. Specifies which OCR engine to use (either “easyocr” or “paddleocr”).
step_pause_interval: Integer. Represents the waiting time after each simulated action. Default value is 3 seconds. This value cannot be set to 0, mainly because the Windows system or the program being operated on also needs some time to respond after simulating mouse or keyboard actions; otherwise, there would be a high likelihood of issues occurring.
languages: List of strings indicating which languages RPALite will use for OCR recognition. The default value is ["en"] (English). You can specify other languages by passing in their language codes to enable input in those languages. For a list of supported languages, refer to the EasyOCR documentation’s language list.

In subsequent examples in this document, assume that the rpalite object has already been created.

Application Operations

Launching an Application

You can launch an application using run_command:

rpalite.run_command("notepad.exe")

The run_command function has two parameters:

command: String representing the command to start the application, which could be the path to an executable file or a command that the operating system can execute.
noblock: Optional boolean parameter, default value is True, meaning RPALite does not wait for the application to finish launching but returns immediately. If set to False, RPALite waits for the application to fully launch before returning from run_command.

Finding an Application

You can find an application using the following code:

app = rpalite.find_application(".*Notepad")

The find_application function supports finding applications through the following parameters:

title: String representing the regular expression that the application title should match,
class_name: String representing the class name of the application. To find the class name of an application, you can use the Accessibility Insights for Windows tool.

Closing an Application

After obtaining an application instance using the find_application function, you can close the application using the following code:

app = rpalite.find_application(".*Notepad")
rpalite.close_app(app)

You can also force quit an application by setting the force_quit parameter to True:

rpalite.close_app(app, force_quit=True)

Maximizing a Window

You can maximize an application window using the following code:

app = rpalite.find_application(".*Notepad")
rpalite.maximize_window(app)

If you want to maximize a specific window within the application, you can specify a window title pattern:

rpalite.maximize_window(app, window_title_pattern="Document - Notepad")

Simulating Mouse Actions

RPALite supports various mouse simulation operations, such as clicking on text, images, or coordinates.

Getting Current Cursor Position

position = rpalite.get_cursor_position()
print(f"Current mouse position: {position}")

The returned coordinates are in tuple form (x, y), for example (10, 20) represents an X-coordinate of 10 and a Y-coordinate of 20. Note that these coordinates are relative to the top-left corner of the screen.

Moving Mouse to Specified Position

rpalite.mouse_move(10, 20)

Parameters are the X-coordinate x and Y-coordinate y. The top-left corner of the screen is (0, 0).

Moving Mouse to Text

rpalite.move_mouse_to_the_middle_of_text("Text to move to")

This function will move the mouse cursor to the center of the specified text on screen.

Clicking by Coordinates

rpalite.click_by_position(10, 20)

The first parameter is the X-coordinate x, and the second parameter is the Y-coordinate y. The top-left corner of the screen is (0, 0).

You can also specify button and double click parameters:

# Right click
rpalite.click_by_position(10, 20, button='right')

# Double left click
rpalite.click_by_position(10, 20, double_click=True)

Clicking Text

You can click on text using the following code:

rpalite.click_by_text("Text to click")

You can also specify the button (left or right) and whether to double-click:

# Right click on text
rpalite.click_by_text("Text to click", button='right')

# Double left click on text
rpalite.click_by_text("Text to click", double_click=True)

Clicking an Image

You can click an image using the following code:

rpalite.click_by_image("path/to/image.png")

You can also specify the button (left or right) and whether to double-click:

# Right click on image
rpalite.click_by_image("path/to/image.png", button='right')

# Double left click on image
rpalite.click_by_image("path/to/image.png", double_click=True)

RPALite uses OpenCV to locate the corresponding image on the screen, and if found, clicks at the center of the image.

Mouse Press and Release

You can simulate pressing and releasing mouse buttons separately:

# Press left mouse button
rpalite.mouse_press(button='left')

# Move mouse while button is pressed (for drag and drop)
rpalite.mouse_move(100, 200)

# Release left mouse button
rpalite.mouse_release(button='left')

Scrolling

You can scroll the mouse wheel using the following code:

# Scroll up 3 times
rpalite.scroll(3)

# Scroll down 2 times
rpalite.scroll(-2)

# Scroll with custom sleep time after
rpalite.scroll(1, sleep=1)

Keyboard/Text Operations

Entering Text at Current Cursor Position

You can enter a piece of text using the following code:

rpalite.input_text("This is a demo using RPALite.\n")

As shown above, the input_text function does not automatically insert line breaks, so you need to add them yourself. If you need to enter text at a specific location, first use the mouse_move function to move to the specified location, then enter the text.

You can also specify how long to wait after inputting text:

rpalite.input_text("This is a demo using RPALite.\n", seconds=5)

Retrieving Field Value

value = rpalite.get_text_field_value("Field name")
print(f"Value of field: {value}")

RPALite uses OCR and AI image technology to recognize the corresponding fields and their values. Since this recognition is not always accurate, there may be errors or mistakes with this function. Adjustments based on actual usage are needed.

Inputting Text by Field Name

rpalite.enter_in_field("Field name", "New value")

The enter_in_field function has two parameters:

field_name: String representing the name of the field,
text: String representing the text to be entered.

RPALite uses OCR and AI image technology to recognize the position of the corresponding field and text box. Similarly, there is a possibility of errors or mistakes. Adjustments based on actual usage are needed.

Simulating Key Presses

You can simulate pressing a key on the keyboard using the following code:

rpalite.send_keys("{VK_LWIN down}D{VK_LWIN up}")

For Windows, it uses pywinauto’s send_keys format. For macOS, it converts the keys to keyboard module format.

Examples of key formats:

"Hello World" - types the text
"^c" - Control+C
"%{F4}" - Alt+F4
"{ENTER}" - Press Enter key
"+(abc)" - Shift+ABC (uppercase)

Validating Text Existence

rpalite.validate_text_exists("Text to check")

You may notice that the validate_text_exists function does not return a value. This is because if the text does not exist, the function will directly throw an AssertionError exception.

You can disable throwing exceptions by setting throw_exception_when_failed to False:

result = rpalite.validate_text_exists("Text to check", throw_exception_when_failed=False)

RPALite uses OCR technology to identify text, which is not always accurate and only recognizes single-line text, so this function might sometimes produce errors or mistakes and cannot recognize multi-line text. Adjustments based on actual usage are needed.

Getting Coordinates of Text

positions = rpalite.find_text_positions("Text to find")
print(f"Text positions: {positions}")
print(f"First matched text position: {positions[0]}")

Note that the find_text_positions function returns a list representing the locations of the text on the screen. Each item in the list is a tuple structured as (x, y, width, height), indicating the position of the text on the screen. x and y represent the coordinates of the top-left corner of the text, while width and height represent the width and height of the recognized text.

You can use exact matching to improve accuracy:

positions = rpalite.find_text_positions("Text to find", exact_match=True)

Waiting for Text to Appear

You can wait for text to appear on the screen with a timeout:

position = rpalite.wait_until_text_shown("Text to wait for", timeout=30)

This will wait for up to 30 seconds for the text to appear, and will return the position of the text if found, or raise an AssertionError if not found within the timeout.

Waiting for Text to Disappear

Similarly, you can wait for text to disappear from the screen:

rpalite.wait_until_text_disappears("Text to wait for disappearing", timeout=30)

Clipboard Operations

Getting Clipboard Text

text = rpalite.get_clipboard_text()
print(f"Clipboard content: {text}")

Copying Text to Clipboard

rpalite.copy_text_to_clipboard("This is a demo using RPALite.")

Image Operations

Finding an Image

You can find an image on the screen:

location = rpalite.find_image_location("path/to/image.png")

Or use a PIL Image object directly:

from PIL import Image
img = Image.open("path/to/image.png")
location = rpalite.find_image_location(img)

You can also search within another image:

location = rpalite.find_image_location("path/to/needle.png", "path/to/haystack.png")

Finding All Image Instances

To find all instances of an image on screen:

locations = rpalite.find_all_image_locations("path/to/image.png")
for loc in locations:
    print(f"Found image at: {loc}")

If no matches are found, this function will return an empty list, not None.

Waiting for an Image

You can wait for an image to appear on screen:

location = rpalite.wait_until_image_shown("path/to/image.png", timeout=30)

Control Operations

Finding Controls by Label

control = rpalite.find_control_by_label("Label text")
print(f"Control position: {control}")

Finding Controls Near Text

control = rpalite.find_control_near_text("Text near control")
print(f"Control position: {control}")

Clicking Controls by Label

rpalite.click_control_by_label("Button label")

With right-click or double-click:

rpalite.click_control_by_label("Button label", button="right", double_click=True)

Finding Controls by Automation ID

For Windows applications, you can find controls using their automation properties:

app = rpalite.find_application("Notepad")
control = rpalite.find_control(app, class_name="Edit", title="Text Editor")

You can then click on a specific part of the control:

rpalite.click_control(app, class_name="Edit", click_position="center")

Click position options include ‘center’, ‘center-left’, ‘center-right’, ‘left’, and ‘right’.

Window Operations

Finding Windows by Title

windows = rpalite.find_windows_by_title("Window Title")

Screen Recording

Starting Screen Recording

You can record the screen to an AVI file:

video_path = rpalite.start_screen_recording("output.avi")

If you don’t specify a file path, RPALite will create a random file in a temporary directory:

video_path = rpalite.start_screen_recording()
print(f"Recording to: {video_path}")

You can also specify the frames per second:

video_path = rpalite.start_screen_recording(fps=30)

Stopping Screen Recording

final_path = rpalite.stop_screen_recording()
print(f"Recording saved to: {final_path}")

Global Operations

Sleeping

rpalite.sleep(5)

The sleep function accepts an integer parameter indicating how many seconds RPALite should sleep. This parameter is optional, with a default value of the step_pause_interval property of the rpalite object.

We previously mentioned that this value cannot be set to 0, because the Windows system or the program being controlled also need some time to respond after simulating mouse or keyboard actions; otherwise, the likelihood of issues would increase significantly. If you set this parameter to 0, RPALite uses the value of the step_pause_interval attribute. If the step_pause_interval attribute of RPALite is set to 0, RPALite skips the sleep operation.

Show Desktop

rpalite.show_desktop()

Getting Screen Size

size = rpalite.get_screen_size()
print(f"Screen size: {size}")

The get_screen_size function returns a tuple indicating the dimensions of the screen. For example, (1920, 1080) indicates a screen width of 1920 pixels and a height of 1080 pixels.

Taking Screenshot

pil_image = rpalite.take_screenshot()

The take_screenshot function returns a PIL image object representing the current screenshot. It has two optional parameters:

all_screens: Boolean, default value is False, meaning only the current screen is captured. If set to True, all screens are captured. This parameter is useful in multi-monitor environments.
filename: String indicating the path where the screenshot file should be saved. If this parameter is specified, RPALite saves the screenshot to the specified file. If this string is None, RPALite does not save the screenshot.

# Take screenshot and save to file
rpalite.take_screenshot(filename="screenshot.png")

# Capture all screens
rpalite.take_screenshot(all_screens=True)

Generic Locator

RPALite provides a generic locate function that can find objects in different ways:

# Locate by text
position = rpalite.locate("OK Button")

# Locate by image path
position = rpalite.locate("image:path/to/image.png")

# Locate by automation ID (Windows only)
app = rpalite.find_application("Notepad")
position = rpalite.locate("automateId:EditControl", app=app)

You can also use the general-purpose click function that works with these locators:

# Click on text
rpalite.click("OK Button")

# Click on image
rpalite.click("image:path/to/image.png")

# Click by automation ID
rpalite.click("automateId:EditControl", app=app)

This site is open source. Improve this page.