I suck at gaming but I can code

I played Final Fantasy X a lot on PS2 when I was in high school. Great RPG if you ask me, but it's riddled with punishing little mini-games that you have to complete in order to unlock certain items that ultimately make your characters stronger. One of them, arguably the most annoying, was remarkably efficient at making me rage quit: the lightning bolt dodging mini-game of the Thunder Plains. Today, I'm going to shamelessly brag about my way of cheating through this sadistic mini-game on the HD remaster PC version, by coding a Python bot.

I did admit upfront that it is cheating, and although I can find comfort in the fact that I made it legit once a long time ago, the jeering FFX purists among you will certainly hiss at this article in contempt. Meh! At least they’ll be learning the most basic form of game hacking, and as Sir Francis Bacon would have put it, knowledge itself is power.

The problem to solve

Each character has an ultimate weapon allowing them, among other things, to break the 9999 damage limit, which is critical if you are to take on the optional late-game super-bosses. But finding the ultimate weapon isn't enough: you have to unlock its powers using two associated items, a crest and a sigil. The crests are usually easy to find, but the sigils require you to complete an arduous task. In particular, the Venus sigil associated with Lulu's celestial weapon can only be obtained by successfully dodging 200 lightning strikes in a row in the Thunder Plains.

The Thunder Plains, courtesy of finalfantasy.fandom.com

The Thunder Plains are a barren, inhospitable landscape north of Guadosalam, where an unending thunderstorm threatens to obliterate the wandering player with frequent, unpredictable lightning strikes. The strikes happen seemingly at random, each shortly preceded by a white screen flash. When you see the flash, you need to quickly hit the action key in order to dodge the impending lightning bolt, and the timing is quite tight. Each time you manage to avoid being struck, a hidden counter is incremented; when you get struck, the counter is reset to zero. You read that right: you're not told how many successful dodges you've made, so you'd better not miscount… If you pull off 200 consecutive dodges, you win the sigil. These are the rules of the nightmarish mini-game we're having a crack at. There is in fact a specific spot where the lightning bolts become quite predictable and thus easier to dodge (see [1]), but if we are to cheat (ahem, make our lives easier), at least we'll do it in the nerdiest way possible.

Tooling up

The astute reader will have understood that we'll have to programmatically detect the white screen flash and send an input right after. So essentially, we need two components: some way of capturing the display, and an interface to send keystrokes. As an irksome GNU/Linux user, I'll be using the python3-xlib module for screen captures; you could use the PIL.ImageGrab module as well (as we'll be using PIL anyway), opencv-python if you're fancy, or win32gui if you're still generously sending money to Microsoft (see [2] for code examples). The keyboard input part will be handled with pynput. There are alternatives like the pyautogui and keyboard modules, but none of them worked for me.
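For instance, if you'd rather not touch Xlib directly, a capture routine based on PIL.ImageGrab could look like the following sketch (ImageGrab works out of the box on Windows and macOS; on Linux, support depends on your Pillow version and build). It mirrors the interface of the sample_screen() function we'll write below:

import PIL.ImageGrab

def sample_screen_grab(x, y, w, h):
    # bbox is (left, top, right, bottom); convert to HSV to match
    # what the rest of the bot expects
    image = PIL.ImageGrab.grab(bbox=(x, y, x + w, y + h))
    return image.convert("HSV")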

Install the stuff you need, read about the various APIs if you intend to use different modules, and carry on.

Writing the bot

First, we create a new Python script and import everything we need:

#!/usr/bin/python3
import sys
import signal
import time
import Xlib.display
import Xlib.X  # for the ZPixmap constant used in sample_screen()
import PIL.Image
import PIL.ImageStat
from pynput import keyboard

Screen flash detection

This is the first sub-problem we need to solve. My strategy was to sample the screen regularly, analyze the pixel lightness, and update a basic finite state machine in order to detect a transition from a dark to a light screen. As the sampling procedure will be repeated quite frequently, only a small portion of the screen is captured so as to minimize blitting time. In fact, I'm only sampling pixels along a line starting from the left edge of the screen at mid-height, whose length is half the screen width. This line will henceforth be called the probe.

The probe is shown in red.

Let’s write a generic function to obtain the pixel color values of an arbitrary rectangular portion of the screen:

def sample_screen(x, y, w, h):
    # Capture screen
    root = Xlib.display.Display().screen().root
    image = root.get_image(x, y, w, h, Xlib.X.ZPixmap, 0xffffffff)
    image_raw = image.data

    # Sometimes the image data comes back as a str instead of bytes
    # (a python3-xlib quirk); normalize it to bytes
    if isinstance(image_raw, str):
        image_raw = image_raw.encode()

    # Convert raw data to a PIL image in HSV format
    return PIL.Image.frombytes("RGB", (w, h), image_raw, "raw", "BGRX").convert("HSV")

Almost straightforward. First, we get a handle to the root window, then we use it to get the pixel values inside our box (starting at screen coordinates (x, y), of width w and height h). Now, a problem I encountered, and it certainly is just an artefact of python3-xlib: sometimes the underlying type of the raw image image.data will be bytes as expected, but sometimes it will be a string, no idea why. In the latter case, we need to convert to bytes, hence the conditional cast. If you're using another module, you probably don't have to worry about this. As an additional step, we convert this raw image to a PIL image in HSV color space, which is better suited for evaluating pixel lightness later on. Ideally, you'd want to convert to HSL straight away, but PIL does not support HSL conversion (as far as I know), so we'll handle that in the next step.
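A side note on performance: sample_screen() opens a fresh connection to the X server on every call, and we'll be calling it many times per second. If that bothers you, a variant that caches the handle once at module level could look like this (my own tweak, reusing the imports from the top of the script; the original function works fine as is):

# Open the X connection once and reuse the root window across calls
_DISPLAY = Xlib.display.Display()
_ROOT = _DISPLAY.screen().root

def sample_screen_cached(x, y, w, h):
    image = _ROOT.get_image(x, y, w, h, Xlib.X.ZPixmap, 0xffffffff)
    image_raw = image.data
    if isinstance(image_raw, str):
        image_raw = image_raw.encode()
    return PIL.Image.frombytes("RGB", (w, h), image_raw, "raw", "BGRX").convert("HSV")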

Now, we write a function to get a statistic measure of the lightness along the probe:

def probe_lightness():
    # Sample screen along probe
    screen = Xlib.display.Display().screen()
    width = screen.width_in_pixels
    height = screen.height_in_pixels
    image_hsv = sample_screen(0, height//2, width//2, 1)

    # Get normalized median HSV color
    median_color = PIL.ImageStat.Stat(image_hsv).median
    median_color = [x/255 for x in median_color]

    # Calculate lightness
    return median_color[2]*(1-median_color[1]/2)

We sample the screen along the probe and use PIL.ImageStat to get the median HSV color value. I prefer using a median rather than a mean as it's more robust to local color variations. Now, from this single color, we can calculate and return a normalized lightness value using the standard HSV-to-HSL conversion formula:

L = V(1 - S/2)    (1)

L is the lightness we’re after, V is the value channel, and S is the saturation channel. The closer the color is to black, the lower L is. The closer it is to white, the higher. This formula works for normalized values of V and S (between 0 and 1), that’s why we have to divide each channel by 255 before the calculation.
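If you want to convince yourself that the formula is right, the standard colorsys module can serve as a reference, since rgb_to_hls() returns the lightness directly. A quick sanity check:

import colorsys

def check(r, g, b):
    # Compare our formula against colorsys on normalized RGB values
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    _, l_ref, _ = colorsys.rgb_to_hls(r, g, b)
    l = v * (1 - s / 2)
    print(f"RGB({r},{g},{b}): formula={l:.3f}, colorsys={l_ref:.3f}")

check(1, 1, 1)  # white: both give 1.000
check(1, 0, 0)  # pure red: both give 0.500
check(0, 0, 0)  # black: both give 0.000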

Let’s write a little program to test this:

def handler(signum, frame):
    print("Exiting now")
    sys.exit(1)


def main(argv):
    signal.signal(signal.SIGINT, handler)
    
    state = False
    old_state = False
    while True:
        L = probe_lightness()
        state = L > 0.75
        if state and not old_state:
            print("Transition detected")
            
        old_state = state
        time.sleep(0.05)


if __name__ == '__main__':
    main(sys.argv[1:])

We're using an infinite loop to sample the screen regularly, so we register a signal handler to catch the SIGINT signal emitted when hitting Ctrl+C in the terminal, in order to halt execution properly. On each iteration, we measure the lightness along the probe by calling our probe_lightness() function. If the measured lightness is above the detection threshold (0.75 here), the state boolean flag is set to True, and to False otherwise. Then, we compare this flag to the value it had in the previous iteration, which is saved in the old_state flag. If the state was False in the previous iteration and is True in the current one, we have transitioned from a dark screen to a light screen: we detected a white screen flash.
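As a side note, if you prefer to keep the loop body minimal, this transition test can be packaged into a small rising-edge detector (my own refactor, behaviorally identical):

class RisingEdge:
    """Detects a False -> True transition between successive updates."""
    def __init__(self):
        self.old_state = False

    def update(self, state):
        rising = state and not self.old_state
        self.old_state = state
        return rising

# Usage in the loop:
#   edge = RisingEdge()
#   if edge.update(probe_lightness() > 0.75):
#       print("Transition detected")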

You can prepare a dark background on your desktop, and test this program by dragging a white window to the probe’s location. It should print “Transition detected” whenever the white window enters the probe area. This should confirm that the detection code is working. Now, science is all about the data! Let’s run around in circles in the Thunder Plains, sample the lightness at regular intervals, and plot it:

Lightness plot while wandering in the Thunder Plains

I'll spare you the details of how to do this, as it's pretty trivial. The big spikes correspond to the white flashes preceding lightning strikes. Given this, we're justified in leaving the detection threshold at 0.75: it won't give us spurious detections, as the noise floor sits well below it. All that's left to do is to send the appropriate keystroke whenever such a transition is detected.
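For the curious, the sampling and plotting could go along these lines (a sketch assuming matplotlib is installed; the duration and sampling period are arbitrary):

import time
import matplotlib.pyplot as plt

# Record the lightness along the probe for ~30 seconds
samples = []
t_end = time.time() + 30
while time.time() < t_end:
    samples.append(probe_lightness())
    time.sleep(0.05)

plt.plot(samples)
plt.axhline(0.75, color="r", linestyle="--", label="detection threshold")
plt.xlabel("sample #")
plt.ylabel("lightness")
plt.legend()
plt.show()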

Response

We'll use a keyboard.Controller object to send keyboard events. We'd also like to be able to toggle the bot on and off during the game, to avoid untimely triggering of the response code in scenes where it isn't relevant. Let's write a small helper function that simulates pressing a key and releasing it a bit later:

def hit(kb, key, interval):
    kb.press(key)
    time.sleep(interval)
    kb.release(key)

Here, kb is intended to be a keyboard.Controller object that we’ll provide later on. Let’s modify our test main() function from the previous section to add the required features:

active = False
def main(argv):

    # Toggle bot by pressing the '!' key
    def on_press(key):
        if hasattr(key, 'char'):
            if key.char == '!':
                global active
                active = not active
                print(f'Active: {active}')

    listener = keyboard.Listener(on_press=on_press)
    listener.start()

    # Declare our keyboard interface
    kb = keyboard.Controller()
    signal.signal(signal.SIGINT, handler)
    
    # ...

Essentially, we installed a keyboard listener that toggles a global active flag when the ! key is pressed. A key object as returned by a pynput keyboard listener is not guaranteed to possess a char attribute: keys that don't correspond to characters, like the arrow keys, space, shift and so on, don't have one. So we need to test for the existence of this attribute before doing anything with it, hence the hasattr() shenanigan. Next, we declared the keyboard interface to be used with our hit() function further down. Let's press on:

    # ...
    state = False
    old_state = False
    count = 0
    while True:
        if active:
            # Perception
            L = probe_lightness()
            # Decision
            state = L > 0.75
            if state and not old_state:
                # Action
                count += 1
                hit(kb, 'e', 0.15)
                print(f'Dodging lightning strike #{count}')
                
            old_state = state
        time.sleep(0.05)

That's basically the same main loop as before, except that we only attempt detection when the active flag is set to True. When a state transition is detected, we call the hit() function to simulate pressing the action key (it's bound to E on my keyboard), and we increment a counter to keep track of how many lightning bolts we've dodged. And that's it! Now we can launch the game, go to the Thunder Plains, and complete the most annoying mini-game ever automatically, while doing more important stuff, like drinking coffee and smoking cigarettes (don't smoke, kids).
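One small caveat: if you toggle the bot off during a flash and back on later, old_state keeps whatever value it last had, and the first detection after re-enabling could be off. A defensive variant of the loop tail (my own addition) resets the flags while inactive:

    while True:
        if active:
            # ... perception / decision / action, as above ...
            old_state = state
        else:
            # Reset the edge detector while inactive, so re-enabling the
            # bot never reacts to a stale state from before the toggle
            state = False
            old_state = False
        time.sleep(0.05)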

Testing the bot

You can see it in action in the following video:

So it does work, and dodging 200 lightning bolts has become a waiting game.

Now, you can get a totally undeserved achievement.

Conclusion

Perception-decision-action loops like the one we implemented are of central importance in robotics and intelligent system design, and many bots can be written in this fashion. Back in the days of Counter-Strike 1.6, I remember coding an aim bot that would scan the whole screen from top to bottom, detect the blue color of a counter-terrorist helmet, and respond by quickly moving the cursor and firing (don't do this online, kids, and don't smoke). One advantage of such a method is that, unlike a cheat engine, it's completely non-intrusive: you're just reading the screen and driving mouse and keyboard inputs, like a human player would. And with the right amount of randomness added to the response, it could even throw off some heuristic cheat detection systems.
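For illustration, a humanized variant of our hit() helper could look like this (the jitter bounds below are made-up values, tune to taste):

import time
import random

def hit_humanized(kb, key, interval):
    # React after a small random delay and hold the key for a jittered
    # duration, so the inputs look less machine-like
    time.sleep(random.uniform(0.02, 0.08))
    kb.press(key)
    time.sleep(interval + random.uniform(-0.03, 0.03))
    kb.release(key)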

Throw in some machine learning, and you could very well make bots that solve more complex tasks, like driving a car in Trackmania (see [3] and [4]), for example!

Acknowledgements

Many thanks to
for the awesome illustration!

Sources



