PokerStars hand history: Use Python to extract meaningful data

PokerStars provides, on request, a text file of your hand histories. This offers an opportunity to read the file into Python and extract the meaningful parts.

Firstly, the games…

I played some Spin & Go games on PokerStars. These are three-player, winner-takes-all games played for a randomly “spun” jackpot (usually 2x the buy-in, although it can be more if you’re lucky). Games are usually fast, but they’re still tournaments with a 1st, 2nd and 3rd place, so I thought they would give some interesting, varied data to analyse.

The references

To put at the top of the code:

path = "C:\\Users\\~"
fileName = "tourns"
name = "MBloxz"

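path is the folder where you’ve stored the file. Remember the double backslashes (\\): with a single \, Python treats it as the start of an escape sequence. Because the code below joins path and fileName directly, the folder path needs to end with a trailing \\. fileName is what you’ve called the file, without its extension (for me the file was tourns.txt), and name is your PokerStars ID.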
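As a small illustration of the escaping point, a raw string or os.path.join can save typing the doubled backslashes by hand. This is only a sketch: the folder name below is a made-up placeholder, not the real location of the file.

```python
import os

# Hypothetical folder, used only for illustration.
escaped = "C:\\poker\\histories"
raw = r"C:\poker\histories"  # raw string: backslashes are taken literally

# Both spellings produce exactly the same string.
same = (escaped == raw)

# os.path.join adds the platform's separator between folder and file name,
# so a missing trailing backslash cannot glue the two together.
full_path = os.path.join(raw, "tourns" + ".txt")
print(same, full_path)
```

Either spelling works with open(); the raw string is simply harder to get wrong.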

Opening the file and calling the function

The following code goes at the end of your file (i.e. under the extractTournChips function I’m just about to show you).

with open(path+fileName+".txt") as raw_file:
    raw_content = raw_file.readlines()
    content = [x.strip() for x in raw_content]
    finalChipCounts = extractTournChips(content)

We open the raw text file, read in its lines, strip the leading and trailing whitespace from each one, and pass the result to the function that will extract the chip counts.
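To see what the stripping step does, here is a minimal sketch that swaps the real file for an in-memory stand-in (the three lines of sample text are illustrative, not copied from a real history):

```python
import io

# A tiny stand-in for the hand history file (contents are illustrative).
fake_file = io.StringIO(
    "Seat 2: MBloxz (1460 in chips) \n"
    "\n"
    "  *** HOLE CARDS ***\n"
)

raw_content = fake_file.readlines()
content = [x.strip() for x in raw_content]

# Newlines and surrounding spaces are gone; an empty line becomes ''.
print(content)
```

Note that empty lines are kept as empty strings rather than dropped. That is harmless here, because the extraction function skips any line with too few words.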

The juicy extraction function….

For now, let’s try to extract how our chips fluctuated over time in the different Spin & Gos.

The idea is to write a function that takes the lines of the raw file as input and builds a dictionary with tournament numbers as keys; the value for each tournament is the list of our chip counts (integers) over time.

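def extractTournChips(rawlines):
    tournChips = {}
    tournName = ""
    for line in rawlines:
        words = line.split()
        if len(words) > 5:
            # A new hand starts: remember which tournament it belongs to
            if words[1] == 'Hand':
                tournName = words[4]
            # Our own seat line: read the chip count at the start of the hand
            if words[5] == "chips)" and words[2] == name:
                chips = int(words[3][1:])
                if tournName in tournChips.keys():
                    tournChips[tournName].append(chips)
                else:
                    tournChips.update({tournName: [chips]})
    # The file lists the newest hand first, so put each list in chronological order
    for c in tournChips.values():
        c.reverse()
    return tournChips.items()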

We read through each line, split it into its words, and only examine lines with more than five words, since the checks below index up to the sixth word.

Getting the tournament ID

To get the information needed, one has to look very closely at the format and wording PokerStars uses in the hand history text. The first check is whether the line tells us a new hand is starting. In the file this appears as:

PokerStars Hand #168442729678: Tournament #1824236499, …

so we can check word 2 to see if it’s “Hand”.

if words[1] == 'Hand':
    tournName = words[4]

If it is, we set tournName to the 5th word, the tournament number. (Note: I was multi-tabling, so the hand histories jump back and forth between tournaments, which makes it necessary to record where each hand was played.)
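Splitting the sample header by hand shows what ends up at each index. A small sketch, using the header quoted above cut off after the tournament number; the tidy-up in the last line is an optional extra, not part of the function:

```python
# Header in the format quoted above, cut off after the tournament number.
header = "PokerStars Hand #168442729678: Tournament #1824236499,"
words = header.split()

is_new_hand = len(words) > 1 and words[1] == "Hand"
tourn_key = words[4]                  # keeps the '#' and the trailing comma
tourn_number = tourn_key.strip("#,")  # optional tidy-up, digits only

print(words)
print(is_new_hand, tourn_key, tourn_number)
```

The extra punctuation does no harm when the word is only used as a dictionary key, since it is the same for every hand of a given tournament.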

Getting chip counts

The second check is to see if the line we are evaluating is the line stating our chip count at the start of that hand. The format for this is:

Seat 2: MBloxz (1460 in chips)

so we can check that the 3rd word is our name and the 6th word is “chips)” to know the line refers to our chip count. We can then extract the count from the 4th word by removing the opening bracket and converting the string to an integer.

if words[5] == "chips)" and words[2] == name:
    chips = int(words[3][1:])
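The same check can be tried on its own against a few lines. In this sketch the helper name parse_own_stack is made up for illustration, and it assumes the player name contains no spaces (a name with spaces would shift the word positions):

```python
def parse_own_stack(line, player):
    """Return the player's chip count if `line` is their seat line, else None."""
    words = line.split()
    if len(words) > 5 and words[2] == player and words[5] == "chips)":
        return int(words[3][1:])  # '(1460' -> 1460
    return None


print(parse_own_stack("Seat 2: MBloxz (1460 in chips)", "MBloxz"))
print(parse_own_stack("Seat 1: SomeoneElse (40 in chips)", "MBloxz"))
print(parse_own_stack("*** HOLE CARDS ***", "MBloxz"))
```

Returning None for anything that is not our seat line keeps the calling loop simple: it only needs to act when a number comes back.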

We then update the dictionary:

if tournName in tournChips.keys():
    tournChips[tournName].append(chips)
else:
    tournChips.update({tournName: [chips]})

If we’ve already stored a hand from this tournament, we append the new count to its list. If the hand is from a new tournament, we make a new entry in the dictionary with the tournament name as the key.
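The append-or-create pattern described here can also be written in one line with dict.setdefault. This is an alternative spelling, not the code used in this post, and the tournament keys and chip counts are made-up sample data:

```python
tourn_chips = {}

# Made-up (tournament, chips) observations in the order they are read.
observations = [("#111,", 500), ("#222,", 500), ("#111,", 430)]

for tourn, chips in observations:
    # setdefault returns the existing list, or stores and returns a new one.
    tourn_chips.setdefault(tourn, []).append(chips)

print(tourn_chips)
```

Both versions behave the same; the explicit if/else is arguably easier to read when you are new to dictionaries.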

Reordering counts

Finally, PokerStars orders the file with the most recently played hand first, i.e. as we read the file we’re storing chip counts in reverse order. To make them chronological again, after building the dictionary we reverse each list in place:

for c in tournChips.values():
    c.reverse()
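One detail worth checking is that the reversal has to happen in place. A short sketch with made-up chip counts shows the difference between list.reverse() and rebinding the loop variable to a reversed copy:

```python
# Made-up data, newest hand first.
tourn_chips = {"#111,": [0, 430, 500], "#222,": [1500, 900, 500]}

# Rebinding the loop variable builds a new list and leaves the dictionary alone.
for counts in tourn_chips.values():
    counts = counts[::-1]
unchanged = tourn_chips["#111,"][:]

# list.reverse() mutates the list the dictionary already holds.
for counts in tourn_chips.values():
    counts.reverse()

print(unchanged, tourn_chips)
```

After the second loop every list runs from the first hand to the last, which is the order wanted for plotting.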

The finish line…

We are now done. To show the result visually, I imported the matplotlib library and passed “finalChipCounts” into the following function (I also added some extra stats; I’ll leave it to you to replicate these).

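import matplotlib.pyplot as plt

def showGraphics(fcc):
    for c in fcc:
        plt.plot(c[1])  # c is a (tournament, chip counts) pair; plot the counts (assumed loop body)
    plt.plot([750]*50, color='red', linestyle='dashed', linewidth=0.5)
    plt.show()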



This shows chip count over time. In my own code I also included win ratios and calculated the mean chip count at each hand (dashed line).
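One possible way to get a mean chip count at each hand is sketched below. The calculation used for the plot is not shown in this post, so this is an assumption: the function name mean_chips_per_hand is made up, the sample lists are invented, and the mean at each hand is taken only over tournaments that lasted that long.

```python
from itertools import zip_longest


def mean_chips_per_hand(chip_lists):
    """Mean chip count at hand 1, 2, 3, ... across tournaments of unequal length."""
    means = []
    # zip_longest pads the shorter tournaments with None.
    for column in zip_longest(*chip_lists):
        present = [c for c in column if c is not None]
        means.append(sum(present) / len(present))
    return means


# Made-up chip counts for two tournaments, in chronological order.
example = [[500, 430, 0], [500, 900, 1500, 1500]]
means = mean_chips_per_hand(example)
print(means)
```

With the dictionary from earlier, the input would be the lists of chip counts, for example [counts for _, counts in finalChipCounts].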



That’s it for now, but with this knowledge and structure in place it’s only a matter of scouring the file format for other information to extract. For example, you can extract how often you place 1st, 2nd or 3rd, or start to get an idea of which hands you tend to win or lose with.

