Wordle Strategy



Wordle is a very popular word game that was purchased by the New York Times. The object is to figure out the five-letter word of the day within six guesses. You are not given any hints up front. After each guess, a letter turns green if it is in the right spot, yellow if it is in the word but in the wrong spot, and grey if it is not in the word. There are many strategies for solving Wordle.
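To make the colour rules concrete, here is a minimal sketch of how a guess could be scored. The function name `score_guess` and the G/Y/- output format are my own choices for illustration. The handling of repeated letters (a letter only turns yellow as many times as it is still unmatched in the answer) is an assumption about how the game behaves and is not described above.

```python
from collections import Counter

def score_guess(guess: str, answer: str) -> str:
    """Return one character per letter: G (green), Y (yellow), - (grey)."""
    result = ["-"] * len(guess)

    # Answer letters that were not matched in place can still produce yellows.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)

    # First pass: greens (right letter, right spot).
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"

    # Second pass: yellows (right letter, wrong spot), limited by what is left.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1

    return "".join(result)

print(score_guess("crane", "react"))
```

Since the strategy in this post only uses guesses without repeated letters, the repeated-letter rule rarely matters for the first four guesses.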



Some strategies involve using all vowels first. Some strategies try to use the Wheel of Fortune letters (R, S, T, L, N, E). As a data nerd, I chose to try to solve it with analytics.

The method was to pick 4 words that together contain the 20 most common letters, then use the last 2 guesses to solve the puzzle.

The first step was to download a list of English words from GitHub (370k words, https://github.com/dwyl/english-words). Next, the list was filtered down to 5-letter words (15,918 words). Then, words with repeated letters were removed (leaving 10,171 words). Finally, anagrams (words formed by rearranging the letters of another word) were removed. This produced a list of 5,976 words.
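The filtering steps can be sketched roughly like this. The function names and the file path are hypothetical, and keeping the first word seen from each set of anagrams is an assumption (the exact rule used for removing anagrams is not spelled out above).

```python
def load_words(path):
    """Read one word per line from a text file (path is hypothetical)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def filter_words(words):
    """Keep 5-letter words with no repeated letters, one per anagram set."""
    seen_letter_sets = set()
    kept = []
    for word in words:
        w = word.strip().lower()
        # Step 1: only plain 5-letter words.
        if len(w) != 5 or not (w.isascii() and w.isalpha()):
            continue
        # Step 2: drop words with repeated letters.
        if len(set(w)) != 5:
            continue
        # Step 3: anagrams share the same sorted letters; keep the first one seen.
        key = "".join(sorted(w))
        if key in seen_letter_sets:
            continue
        seen_letter_sets.add(key)
        kept.append(w)
    return kept

print(filter_words(["apple", "bread", "beard", "Crane", "hi"]))
```

With a downloaded word list, usage would look something like `filter_words(load_words("words.txt"))`, where `words.txt` stands in for whatever file was downloaded.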

Counting how many times each letter appears produced the following:

{'a': 8393, 'e': 7801, 's': 6538, 'o': 5220, 'r': 5144, 'i': 5068, 'l': 4247, 't': 4190, 'n': 4044, 'u': 3362, 'd': 2812, 'c': 2745, 'y': 2522, 'm': 2495, 'p': 2300, 'h': 2285, 'b': 2090, 'g': 1972, 'k': 1744, 'f': 1239, 'w': 1172, 'v': 879, 'z': 475, 'j': 377, 'x': 362, 'q': 140}
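A count like this can be produced with `collections.Counter`. This is a small sketch with made-up helper names (`letter_counts`, `top_letters`). It counts every occurrence of a letter, which is an assumption about how the numbers above were tallied.

```python
from collections import Counter

def letter_counts(words):
    """Count every occurrence of each letter across all words."""
    counts = Counter()
    for word in words:
        counts.update(word)
    return counts

def top_letters(words, n=20):
    """Return the n most frequent letters, most common first."""
    return [letter for letter, _ in letter_counts(words).most_common(n)]

demo = ["bread", "crane", "stare"]
print(dict(letter_counts(demo).most_common()))
print(top_letters(demo, 3))
```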

A quick note: by these counts, Wheel of Fortune should use A instead of E, but that could be because we are limiting the list to 5-letter words.

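We will ignore W, V, Z, J, X, and Q and then find groups of 4 words that together contain the top 20 letters. Since 4 words of 5 letters make exactly 20 letters, each of the top 20 letters appears exactly once in a group. This produces thousands of groups. Here are some: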
abdom ceint flusk gryph
abdom celts funky griph
abdom cents fluky griph
abdom ceryl fight punks
abdom ceryl funks pight
abdom certy finks gulph
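One way to search for such groups is a small backtracking search: always pick the alphabetically first letter that is still unused, and try every word that contains it and fits in the letters that are left. This is a sketch of that idea, not necessarily the search used for the lists above. `find_groups` and `TOP20` are names I made up, and `TOP20` is just the 20 most frequent letters from the counts.

```python
from collections import defaultdict

TOP20 = "aesoriltnudcymphbgkf"  # top 20 letters from the counts above

def find_groups(words, letters=TOP20):
    """Yield tuples of words that together use each letter exactly once."""
    target = frozenset(letters)

    # Index usable words by their alphabetically first letter. A word that
    # covers the first unused letter must have that letter as its minimum,
    # because all of its other letters must be unused too.
    by_first = defaultdict(list)
    for w in words:
        s = frozenset(w)
        if len(s) == len(w) and s <= target:
            by_first[min(s)].append((w, s))

    def search(remaining, chosen):
        if not remaining:
            yield tuple(chosen)
            return
        first = min(remaining)
        for w, s in by_first.get(first, []):
            if s <= remaining:
                yield from search(remaining - s, chosen + [w])

    yield from search(target, [])

words = ["abdom", "ceint", "flusk", "gryph", "celts", "funky", "griph", "zebra"]
for group in find_groups(words):
    print(" ".join(group))
```

Because the first unused letter is always covered next, each group is found once rather than once per ordering of its words. A group can then be checked against the game's accepted guesses, for example with `all(w in allowed for w in group)`, where `allowed` is a hypothetical set of valid guess words.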

We are only interested in groups in which every word is a valid guess in the game.

Before the New York Times bought Wordle, all past and future puzzle answers were viewable in the game's JavaScript. This answer list was checked against the top 20 letters from above to measure the effectiveness of the strategy.

How well it worked:
83.2% of puzzles have all letters revealed, so you just need to unscramble them.
The remaining 16.8% are missing only one letter.
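A check like this can be sketched by counting, for each answer, how many of its distinct letters fall outside the 20 letters covered by the four guesses. The function name `missing_letter_share` is made up, and the four sample words are only there to show the output format. They are not the real answer list and do not reproduce the percentages above.

```python
from collections import Counter

TOP20 = "aesoriltnudcymphbgkf"  # top 20 letters from the counts above

def missing_letter_share(answers, letters=TOP20):
    """Map 'distinct answer letters not covered' to the share of answers."""
    covered = set(letters)
    tally = Counter(len(set(word) - covered) for word in answers)
    total = len(answers)
    return {missing: count / total for missing, count in sorted(tally.items())}

# Illustrative words only, not the actual Wordle answer list.
sample = ["crane", "vivid", "proxy", "waltz"]
print(missing_letter_share(sample))
```

A share under key 0 corresponds to "all letters revealed", and key 1 to "missing one letter".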