I suck at gaming but I can code
I did admit upfront that it is cheating, and although I can find comfort in the fact that I made it legit once a long time ago, the jeering FFX purists among you will certainly hiss at this article in contempt. Meh! At least they’ll be learning the most basic form of game hacking, and as Sir Francis Bacon would have put it, knowledge itself is power.
The problem to solve
Each character has an ultimate weapon allowing them, among other things, to break the 9999 damage limit, which is critical if you are to engage the optional late-game superbosses. But finding the ultimate weapon isn’t enough: you have to unlock its powers with two associated items, a crest and a sigil. The crests are usually easy to find, but the sigils require you to complete an arduous task. In particular, the Venus sigil associated with Lulu’s celestial weapon can only be obtained by successfully dodging 200 lightning strikes in a row in the Thunder Plains.
The Thunder Plains are a barren, inhospitable landscape north of Guadosalam, where an unending thunderstorm threatens to obliterate the wandering player with frequent, unpredictable lightning strikes. The strikes happen seemingly at random, shortly preceded by a white screen flash. When you see the flash, you need to quickly hit the action key in order to dodge the impending lightning bolt. The timing is quite tight. Each time you manage to avoid being struck, a hidden counter is incremented. When you get struck, the counter is reset to zero. You read that right: you’re not told how many successful dodges you’ve racked up, so you’d better not miscount… If you succeed in doing 200 consecutive dodges, you win the sigil. These are the rules of the nightmarish mini-game we’re having a crack at. There is in fact a specific spot where the lightning bolts become quite predictable and thus easier to dodge (see [1]), but if we’re going to cheat to make our lives easier, at least we’ll do it in the nerdiest way possible.
Tooling up
The astute reader will have understood that we need to programmatically detect the white screen flash and send an input right after. So essentially, we need two components: some way of capturing the display, and an interface to send keystrokes. As an irksome GNU/Linux user, I’ll be using the python3-xlib module for screen captures; you could use the PIL.ImageGrab module as well (as we’ll be using PIL anyway), opencv-python if you’re fancy, or win32gui if you’re still generously sending money to Microsoft (see [2] for code examples). The keyboard input part will be handled with pynput. There are alternatives like the pyautogui and keyboard modules, but none of them worked for me.
Install the stuff you need (with pip, the relevant packages should be python-xlib, Pillow and pynput), read about the various APIs if you intend to use different modules, and carry on.
Writing the bot
First, we create a new Python script and import everything we need:
#!/usr/bin/python3
import sys
import signal
import time
import Xlib.display
import Xlib.X  # for the ZPixmap constant used in sample_screen()
import PIL.Image
import PIL.ImageStat
from pynput import keyboard
Screen flash detection
This is the first sub-problem we need to solve. My strategy was to sample the screen at regular intervals, evaluate the pixel lightness, and update a basic finite state machine in order to detect a transition from a dark to a light screen. As the sampling procedure will be repeated quite frequently, only a small portion of the screen is captured, so as to minimize blitting time. In fact, I’m only sampling pixels along a line starting from the left edge of the screen at mid-height, whose length is half the screen width. This line will henceforth be called the probe.
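To make the idea concrete, here is a toy illustration of that dark-to-light edge detection (the lightness samples are made up, and the 0.75 threshold is the one we’ll settle on later):

old_state = False
for L in [0.10, 0.15, 0.92, 0.95, 0.20]:  # made-up lightness samples
    state = L > 0.75                      # "screen is bright" flag
    if state and not old_state:
        print("flash detected")           # fires once, on the dark-to-light edge
    old_state = state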
Let’s write a generic function to obtain the pixel color values of an arbitrary rectangular portion of the screen:
def sample_screen(x, y, w, h):
    # Capture screen
    root = Xlib.display.Display().screen().root
    image = root.get_image(x, y, w, h, Xlib.X.ZPixmap, 0xffffffff)
    image_raw = image.data
    # Sometimes, image data will be a string, I don't get why, but we always want bytes
    if isinstance(image_raw, str):
        image_raw = str.encode(image_raw)
    # Convert raw data to a PIL image in HSV format
    return PIL.Image.frombytes("RGB", (w, h), image_raw, "raw", "BGRX").convert("HSV")
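As a quick sanity check, you can grab a one-pixel-tall strip and inspect it (the coordinates below are hypothetical and assume a 1920×1080 display, adjust them to yours):

strip = sample_screen(0, 540, 960, 1)
print(strip.size, strip.mode)  # expect (960, 1) and 'HSV'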
Almost straightforward. First, we get a handle to the root window, then we use it to get the pixel values inside our box (starting at screen coordinates (x, y), of width w and height h). Now, a problem I encountered, and it certainly is just an artefact of python3-xlib, is that sometimes the underlying type of the raw image image.data will be bytes as expected, but sometimes it will be a string, no idea why. In the latter case, we need to convert to bytes, hence the conditional cast. If you’re using another module, you probably don’t have to worry about this. As an additional step, we convert this raw image format to a PIL image in HSV color space, which is more suited to evaluating pixel lightness later on. Ideally, you’d want to convert to HSL straight away, but PIL does not support HSL conversion (as far as I know), so we’ll have to do this in the next step.
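For instance, if you went with PIL.ImageGrab instead, a rough equivalent of sample_screen() might look like the sketch below (untested on my side, but ImageGrab.grab() hands you a PIL image directly, so the bytes quirk disappears):

import PIL.ImageGrab

def sample_screen_imagegrab(x, y, w, h):
    # bbox is (left, upper, right, lower); convert to HSV as before
    return PIL.ImageGrab.grab(bbox=(x, y, x + w, y + h)).convert("HSV")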
Now, we write a function to get a statistical measure of the lightness along the probe:
def probe_lightness():
    # Sample screen along probe
    screen = Xlib.display.Display().screen()
    width = screen.width_in_pixels
    height = screen.height_in_pixels
    image_hsv = sample_screen(0, height//2, width//2, 1)
    # Get normalized median HSV color
    median_color = PIL.ImageStat.Stat(image_hsv).median
    median_color = [x/255 for x in median_color]
    # Calculate lightness
    return median_color[2]*(1-median_color[1]/2)
We sample the screen along the probe, and use PIL.ImageStat to get the median HSV color value. I prefer using a median rather than a mean as it’s more robust to local color variations. Now, from this single color, we can calculate and return a normalized lightness value, using the standard HSV-to-lightness conversion formula: L = V × (1 − S/2). This is exactly what the return statement above computes.
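A quick numeric sanity check of that formula, with S and V normalized to [0, 1]:

def lightness(s, v):
    return v * (1 - s / 2)

print(lightness(0.0, 1.0))  # pure white           -> 1.0
print(lightness(1.0, 1.0))  # saturated bright hue -> 0.5
print(lightness(0.0, 0.0))  # black                -> 0.0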
Let’s write a little program to test this:
def handler(signum, frame):
    print("Exiting now")
    exit(1)

def main(argv):
    signal.signal(signal.SIGINT, handler)
    state = False
    old_state = False
    while True:
        L = probe_lightness()
        state = L > 0.75
        if state and not old_state:
            print("Transition detected")
        old_state = state
        time.sleep(0.05)

if __name__ == '__main__':
    main(sys.argv[1:])
We’re using an infinite loop to sample the screen regularly. Thus, we should register a signal handler so as to capture the SIGINT signal emitted when hitting Ctrl+C in the terminal, to halt execution properly. Each iteration, we measure the lightness along the probe by calling our probe_lightness() function. If the measured lightness is above the detection threshold (0.75 here), the state boolean flag is set to True, and to False otherwise. Now, we need to compare this flag to the value it had in the previous iteration, which is saved in the old_state flag. If in the previous iteration the state was False and it is True in the current iteration, it means that we have transitioned from a dark screen to a light screen, i.e. that we detected a white screen flash.
You can prepare a dark background on your desktop, and test this program by dragging a white window to the probe’s location. It should print “Transition detected” whenever the white window enters the probe area. This should confirm that the detection code is working. Now, science is all about the data! Let’s run around in circles in the Thunder Plains, sample the lightness at regular intervals, and plot it:
I’ll spare you the details of how to do this, it’s pretty trivial. The big spikes correspond to the white flashes right before a lightning strike hits. Given this, we’re justified in leaving the detection threshold at 0.75: it won’t give us spurious detections, as the noise floor sits well below it. All that is left to do is to send the appropriate keystroke whenever such a transition is detected.
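If you want to reproduce such a plot yourself, a rough logging sketch (hypothetical helper, assuming probe_lightness() from above and matplotlib installed) could look like this:

import matplotlib.pyplot as plt

def record_lightness(duration=120, period=0.05):
    # Sample the probe lightness at regular intervals for a while
    samples = []
    t_end = time.time() + duration
    while time.time() < t_end:
        samples.append(probe_lightness())
        time.sleep(period)
    return samples

samples = record_lightness()
plt.plot(samples)
plt.axhline(0.75, linestyle="--", label="detection threshold")
plt.xlabel("sample #")
plt.ylabel("probe lightness")
plt.legend()
plt.show()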
Response
We’ll use a keyboard.Controller object to send keyboard events. Also, we would like to be able to toggle the bot on and off during the game, which will avoid untimely triggering of the response code in the scenes where this system is not relevant. Let’s write a small helper function that simulates pressing a key and releasing it a bit later:
def hit(kb, key, interval):
    kb.press(key)
    time.sleep(interval)
    kb.release(key)
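For instance, a single 150 ms tap of the E key would be simulated like this (using the same values we’ll end up passing in the main loop):

kb = keyboard.Controller()
hit(kb, 'e', 0.15)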
Here, kb is intended to be a keyboard.Controller object that we’ll provide later on. Let’s modify our test main() function from the previous section to add the required features:
active = False

def main(argv):
    # Toggle bot by pressing the '!' key
    def on_press(key):
        if hasattr(key, 'char'):
            if key.char == '!':
                global active
                active = not active
                print(f'Active: {active}')
    listener = keyboard.Listener(on_press=on_press)
    listener.start()
    # Declare our keyboard interface
    kb = keyboard.Controller()
    signal.signal(signal.SIGINT, handler)
    # ...
Essentially, we installed a keyboard listener that toggles a global active flag when the ! key is pressed. A key object as returned by a pynput keyboard listener is not guaranteed to possess a char property: keys that don’t correspond to characters, like the arrow keys, space, shift and so on, don’t have one. So we need to test for the existence of such a property before we do anything with it, hence the hasattr() shenanigans. Next, we declared the keyboard interface to be used with our hit() function further down. Let’s press on:
    # ...
    state = False
    old_state = False
    count = 0
    while True:
        if active:
            # Perception
            L = probe_lightness()
            # Decision
            state = L > 0.75
            if state and not old_state:
                # Action
                count += 1
                hit(kb, 'e', 0.15)
                print(f'Dodging lightning strike #{count}')
            old_state = state
        time.sleep(0.05)
That’s basically the same main loop as before, except that we only try to perform detection when the active flag is set to True. When a state transition has been detected, we call the hit() function to simulate pressing the action key (it’s configured to E on my keyboard), and we also increment a counter to facilitate keeping track of the lightning bolts we dodged. And that’s it! Now we can launch the game, go to the Thunder Plains, and complete the most annoying mini-game ever automatically, while doing more important stuff, like drinking coffee and smoking cigarettes (don’t smoke, kids).
Testing the bot
You can see it in action in the following video:
So it does work, and dodging 200 lightning bolts has become a waiting game.
Conclusion
Perception-Decision-Action loops like the one we implemented are of central importance in robotics and intelligent system design. Many bots can be written in this fashion. Back in the days of Counter-Strike 1.6, I remember coding an aim bot that would scan the whole screen from top to bottom, detect the blue color of a counter-terrorist helmet, and respond by quickly moving the cursor and firing (don’t do this online, kids, and don’t smoke). One advantage of such a method is that, unlike a cheat engine, it’s completely non-intrusive: you’re just reading the screen and driving mouse and keyboard inputs, like a human player would. And with the right amount of randomness added to the response, it could even throw off some heuristic cheat detection systems.
Throw in some machine learning, and you could very well make bots that solve more complex tasks, like driving a car in Trackmania, for example (see [3] and [4])!
Sources
- [1] gaming.stackexchange thread about the mini-game
- [2] Multiple implementations to retrieve screen pixel colors
- [3] Yosh’s first YouTube video about his Trackmania bot
- [4] Yosh’s second YouTube video about his Trackmania bot