The Hidden Influence Behind Our Favorite Dungeons

Over the past several World of Warcraft (WoW) expansions, one of the most frequently discussed topics among Mythic+ enthusiasts has been our favorite and least favorite dungeons. But what makes a great Mythic+ dungeon, and why? What makes some of us rejoice at seeing a Halls of Atonement keystone in our bags, yet Alt+F4 at the prospect of entering Spires of Ascension?

After gathering data from over a thousand WoW players, we have uncovered a link between our favorite dungeons and instance design, but it may not be what you think. Read on for our findings!


Given the perennial discussion surrounding dungeon design in WoW, we wanted to investigate whether there were predictable reasons, beyond personal preference or individual bias, that make a Mythic+ dungeon popular among players.

Using data from a survey and accumulated player profile data on Raider.IO, we conducted a linear regression analysis to investigate relationships between the popularity of specific dungeons across the past three expansions, and factors such as a dungeon’s timer, difficulty, and individual player skill.

Whether due to specific loot drops, dungeon speed, or mandatory requirements for certain classes, everybody has their own individual preferences for Mythic+. Despite what people reported as the reasons behind their favorite dungeons, an underlying theme began to appear from the survey results: the length of the dungeon timer. Although many people cited reasons like finicky / “unfair” mechanics, the potential for big pulls, and openness (referring to camera freedom and dungeon layout), dungeons with short overall timers quickly emerged as crowd favorites.

In this study, we will describe how the data was collected, explain the methodology we applied, and examine the variables (and adjustments to the variables). Then, we will present the results while addressing their limitations.


The data used for this study was extracted from a survey conducted by Raider.IO to identify the relationships between dungeon timers and the respondents’ favorite dungeons. Survey responses were collected through a Google Form posted on the Raider.IO Twitter, Instagram, Facebook, and Discord server.

The survey requested the following information:

Favorite Mythic+ Dungeons Survey
1. What are the highest level keys you’ve completed in any Mythic+ season(s) including Shadowlands, BFA, and Legion?

  • Below 15s
  • +15-19
  • +20-24
  • +25s and up

2. IF you played during Legion, what was your favorite Legion Mythic+ dungeon?

  • Black Rook Hold, Eye of Azshara, Darkheart Thicket, Vault of the Wardens, Neltharion’s Lair, Court of Stars, Cathedral of Eternal Night, Seat of the Triumvirate, The Arcway, Halls of Valor, Maw of Souls, Upper Karazhan, Lower Karazhan, Didn’t Play / No Opinion

3. IF you played during BFA, what was your favorite BFA Mythic+ dungeon?

  • Freehold, Underrot, Tol Dagor, Atal’Dazar, Kings’ Rest, Siege of Boralus, The MOTHERLODE!!!, Shrine of the Storm, Temple of Sethraliss, Waycrest Manor, Mechagon: Junkyard, Mechagon: Workshop, Didn’t Play / No Opinion

4. IF you played during Shadowlands, what was your favorite Shadowlands Mythic+ dungeon?

  • Spires of Ascension, Necrotic Wake, De Other Side, Mists of Tirna Scithe, Theater of Pain, Plaguefall, Sanguine Depths, Halls of Atonement, Tazavesh: Streets of Wonder, Tazavesh: Soleah’s Gambit, Didn’t Play / No Opinion

5. Any additional thoughts?


Due to concerns about whether people could accurately recall specific key level completions from years ago, the first question of the survey, regarding the respondent’s highest completed key level, was not divided by expansion. The framing of the three dungeon questions as “what was your favorite (expansion’s) Mythic+ dungeon?” was intended to capture the respondent’s favorite dungeon. When crafting the survey questions, we considered using the word “best” instead of the word “favorite.” However, we ultimately ruled out the term “best,” as some respondents might associate “best” with the kind of loot that the dungeon dropped, or with whether the key was the easiest to complete on the way to a higher IO score.


The survey received 1,206 total responses where:

  • In Legion, 36.7% of respondents reported “Didn’t Play / No Opinion”
  • In BFA, 12.3% of respondents reported “Didn’t Play / No Opinion”
  • In Shadowlands, 0.6% of respondents reported “Didn’t Play / No Opinion”

This data point was important, as we wanted to ensure that survey participants had played in each given expansion that they were responding to.

According to the survey results, the breakdown of highest key levels completed across all expansions showed the following:

  • +25s and up: 27.7%
  • +20-24: 55.4%
  • +15-19: 15.6%
  • Below 15s: 1.3%

If we consider the actual breakdown of players who achieve each category of highest key completed, the higher-level keys are overrepresented in these results. This will be elaborated upon further in the discussion section.


The dependent variable of the following study is the respondents’ favorite dungeon.

The notable independent variable we’re going to examine in this study is the dungeon timer, meaning the maximum time allotted to complete a Mythic+ dungeon on time. There were additional independent variables, such as difficulty rating, number of runs completed, and the respondents’ highest level keys completed. However, the focus of our analysis is on the relationship between the respondents’ favorite dungeon and dungeon timer.

Dependent Variable: Favorite Mythic+ Dungeon
Independent Variable: Dungeon Timer


To analyze the effect of our independent variable on the respondents’ favorite key, a simple (single-variable) linear regression was used. The basic empirical strategy is to compare all of the received responses and test whether the dungeon timer has any effect on the dependent variable.


A simple linear regression was used because of the type of information being analyzed. The information collected in the survey is used to look for independent factors associated with the respondents’ favorite key. Using this regression model, we can see the changes in our dependent variable based on varying values of our independent variable for a single time period.

The simple linear regression equation used to capture this information is set up as follows:

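𝘠 = α + β𝘵 + ε


𝘠: The reported variable indicating the average respondent's favorite key*
α: Intercept (the predicted value of 𝘠 when 𝘵 is zero)
β: Slope (the change in 𝘠 for each additional minute of dungeon timer)
𝘵: A variable indicating dungeon timer length per dungeon
ε: Error term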

*1-10 with 1 being least popular and 10 being most popular
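
As a rough illustration of the model above, the following Python sketch fits an ordinary least squares line and reports the intercept, slope, and R². The function name `fit_line` and the sample numbers are assumptions made for this example; they are not the survey data or the tooling used for this study.

```python
from statistics import mean


def fit_line(timers, ranks):
    """Ordinary least squares fit of rank = alpha + beta * timer."""
    if len(timers) != len(ranks) or len(timers) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    t_bar, y_bar = mean(timers), mean(ranks)
    # Squared deviations of the timers, and their co-deviation with rank.
    sxx = sum((t - t_bar) ** 2 for t in timers)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(timers, ranks))
    beta = sxy / sxx
    alpha = y_bar - beta * t_bar
    # R^2: share of the variation in rank captured by the fitted line.
    ss_tot = sum((y - y_bar) ** 2 for y in ranks)
    ss_res = sum((y - (alpha + beta * t)) ** 2 for t, y in zip(timers, ranks))
    r_squared = 1 - ss_res / ss_tot
    return alpha, beta, r_squared


# Hypothetical timers (minutes) and survey ranks, NOT the survey data.
example_timers = [30, 32, 34, 36, 38, 40]
example_ranks = [1, 3, 2, 4, 6, 5]
alpha, beta, r_squared = fit_line(example_timers, example_ranks)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, R^2={r_squared:.3f}")
```

The slope `beta` plays the role of the coefficient on 𝘵, and `alpha` is the intercept.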


Note: The X axis and Y axis have been flipped for better display

The results above are obtained when we regress survey rank on the Legion dungeon timers. This model produces a p-value of 0.1205. In other words, if there were no real relationship between timer and rank, we would still expect to see an association at least this strong about 12.05% of the time, which does not quite meet minimum significance standards.
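
A p-value can be hard to picture, so here is a minimal sketch of a permutation test. Note that this is a different technique from the regression p-values reported in this article; it is included only to build intuition. It shuffles the ranks many times and counts how often a shuffled dataset produces a slope at least as steep as the observed one. The function names, the number of shuffles, and the seed are all assumptions for this example.

```python
import random
from statistics import mean


def slope(xs, ys):
    """Least squares slope of ys against xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx


def permutation_p_value(xs, ys, n_shuffles=5000, seed=0):
    """Share of shuffles whose |slope| is at least the observed |slope|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in ties.
        if abs(slope(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Hypothetical data: ranks that rise perfectly with the timer.
strong_p = permutation_p_value(list(range(10)), list(range(10)))
print(strong_p)
```

A small result means shuffled data rarely looks as strongly related as the real data, which is the same idea a regression p-value expresses.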

For BFA, when survey rank is regressed on dungeon timers, we receive a p-value of 0.3035, meaning an association this strong would arise about 30.35% of the time even if timers had no effect. This is far from any significance level.

For Shadowlands, when we regress survey rank on dungeon timers, we receive a p-value of 0.0097, indicating a statistically significant relationship between key preference and a shorter key timer. If timers had no effect, an association this strong would arise less than 1% of the time. This can be represented by the equation Ŷ = 0.5180X - 13.41, meaning that for every 1-minute increase in dungeon timer, the survey rank decreases in popularity by 0.5180.
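
To make the Shadowlands equation concrete, the sketch below plugs a few timers into Ŷ = 0.5180X - 13.41. The timer values are illustrative inputs rather than actual dungeon timers, and the function name `predicted_rank` is an assumption. Following the interpretation above, a larger predicted value is read as a less popular dungeon.

```python
def predicted_rank(timer_minutes, slope=0.5180, intercept=-13.41):
    """Predicted survey rank from the fitted Shadowlands line."""
    return slope * timer_minutes + intercept


# Illustrative timers in minutes, not taken from the dungeon data.
for timer in (30, 31, 40):
    print(timer, round(predicted_rank(timer), 2))

# Adding one minute always shifts the prediction by the slope.
one_minute_shift = predicted_rank(31) - predicted_rank(30)
```

Each extra minute moves the prediction by 0.5180, regardless of the starting timer, which is what a constant slope means.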

To investigate further, we combined data from all three of the expansions and ran the same regression of dungeon timer on ranking.

*This graph sorts the dungeons from left to right, indicating most popular to least popular

Note: The X axis and Y axis have been flipped for better display

Because the surveys for dungeon popularity were in a ranking order of 1-10, 1-10, and 1-11, we had to determine a way to assign an accurate ranking for each dungeon across expansions with the appropriate weighting. Therefore, we summed all votes within each expansion and assigned each dungeon a number representing its percentage of that expansion's votes. We then used these adjusted vote totals to re-create a ranked list from 1-31. We regressed these aggregated dungeon rankings on their respective timers, which resulted in a p-value of 0.0035. This clears even the strictest conventional significance level (0.01), providing strong evidence of a relationship between dungeon timer and survey ranking across all expansions. The regression equation Ŷ = 0.9247X - 17.50 indicates that, for every 1-minute increase in dungeon timer, the survey rank will decrease in popularity by 0.9247. The R² result of 0.2743 means that dungeon timer explains 27.43% of the variation in survey ranking.
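
The aggregation step described above can be sketched roughly as follows. The vote counts and dungeon names are placeholders, not the survey results, and the helper names `vote_shares` and `combined_ranking` are assumptions for this example.

```python
def vote_shares(votes_by_expansion):
    """Convert raw votes into each dungeon's percentage of its expansion's votes."""
    shares = {}
    for expansion, votes in votes_by_expansion.items():
        total = sum(votes.values())
        for dungeon, count in votes.items():
            shares[(expansion, dungeon)] = 100.0 * count / total
    return shares


def combined_ranking(shares):
    """Rank every dungeon across expansions, 1 = largest vote share."""
    ordered = sorted(shares, key=shares.get, reverse=True)
    return {key: position for position, key in enumerate(ordered, start=1)}


# Hypothetical vote counts with placeholder names, NOT the survey data.
example_votes = {
    "Expansion A": {"Dungeon A1": 50, "Dungeon A2": 30, "Dungeon A3": 20},
    "Expansion B": {"Dungeon B1": 240, "Dungeon B2": 160},
}
example_shares = vote_shares(example_votes)
example_positions = combined_ranking(example_shares)
for key, position in sorted(example_positions.items(), key=lambda item: item[1]):
    print(position, key, round(example_shares[key], 1))
```

Working with shares rather than raw counts keeps an expansion with more voters from dominating the combined list.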

Additionally, we wanted to see if there was a relationship between dungeon difficulty and survey ranking. It is hard to determine what exactly a “difficult” dungeon entails, so we tested out a few variables. Since the highest key levels differed across expansions, we had to adjust the category of what a “high key” was depending on the patch. The following classifications were used per expansion:

Expansion            Keystone Level Range   Season
Legion               20+                    Season 3 (Patch 7.2.5)
Battle for Azeroth   20+                    Season 3 (Patch 8.2)
Shadowlands          25+                    Season 3 (Patch 9.2)

First, we took the top 60 profiles from each of these patches, averaged each player's highest timed key for each dungeon, and assigned that number as the related dungeon's difficulty level. We then regressed survey ranking on the key averages.
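
As a sketch of this first difficulty proxy, the snippet below averages each player's highest timed key per dungeon. The profile structure, player names, dungeon names, and key levels are all hypothetical; this does not reflect how Raider.IO profile data is actually stored or queried.

```python
from collections import defaultdict
from statistics import mean


def average_highest_key(profiles):
    """Average, per dungeon, each player's highest timed key level."""
    levels = defaultdict(list)
    for highest_timed in profiles.values():
        for dungeon, key_level in highest_timed.items():
            levels[dungeon].append(key_level)
    return {dungeon: mean(values) for dungeon, values in levels.items()}


# Hypothetical profiles: player -> {dungeon: highest timed key level}.
example_profiles = {
    "player_1": {"Dungeon X": 28, "Dungeon Y": 26},
    "player_2": {"Dungeon X": 30, "Dungeon Y": 25},
    "player_3": {"Dungeon X": 29},
}
difficulty = average_highest_key(example_profiles)
print(difficulty)
```

Under this proxy, a lower average suggests a dungeon that top players found harder to push.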

A possible problem with using the top 60 profiles is that players at the highest level often use strategies that the rest of the player base cannot reasonably execute, even those who qualify for the high-level key category. For example, the Tol Dagor cannon pulls or the Plaguefall “Plagueborer” strategies were not employed by every player in the high-level key category. Another way we tried to represent the “difficulty” of keys by dungeon was to regress survey ranking on the number of timed runs completed at or above the “high key” threshold.

We ran regressions on these variables within each expansion, as well as on normalized versions to allow cross-expansion comparison, but all such regressions yielded statistically insignificant results.


Overall, the findings of these calculations reveal that the model used to predict survey rankings from dungeon timers is significant when all expansions are combined. As stated earlier, the results differ by expansion, and the “difficulty” metric proved to be insignificant.

The first topic to address is the framing of the survey, which was touched upon in the earlier data section. Framing is one of the most important factors when trying to collect data from respondents. There is always the possibility of receiving results that aren’t representative of what a population may truly feel, due to factors like survey takers misunderstanding some questions, regional differences of word meanings, etc.

The first question about “highest level of keys” was used to measure the skill level of the responding player. We considered asking for the highest key level completed per expansion, but we were concerned that people might exaggerate or misremember their highest key levels from past expansions. Before ultimately deciding on this framing, we asked a small sample of high-end players if they remembered what their highest key levels completed were for the seasons listed.

Additionally, it's quite clear that respondents in the 20+ and 25+ key categories are overrepresented. Although we don’t have an exact curve of the entirety of the WoW player base and what percent of players are able to complete what level of content, the number of people who are able to time +20s is a small percentage of the overall player base. In our survey, a whopping 83.1% reported being able to complete 20+ keys (this includes 25+).

As to why respondents in the 20+ and 25+ key categories are overrepresented in our findings, we posit two possible explanations. The first relates to where the survey was conducted. The survey responses were taken from the Raider.IO social media accounts, with the majority of responses coming from Twitter. Across the entirety of WoW players, the vast majority do not have social media accounts that follow popular WoW-oriented accounts. The responses were drawn from a limited pool of players who have social media, follow the Raider.IO account, and opted in to taking the survey. We can assume that, because Raider.IO is strongly associated with dungeon metrics and scoring, the average skill level of the players following and interacting with the Raider.IO Twitter account is likely to be higher than that of the general WoW player base.

The second explanation relates to the nature of survey responses in general. Even when surveys are conducted completely privately, there is a known pattern called the social-desirability bias, where respondents tend to over-report or overestimate themselves. Examples of this can include knowledge of policy, general intelligence, physical capability, etc. However, this is not too much of a concern because, across all categories of player skill level, the reported favorite key ranking per expansion is the same!

Next, we would like to explain the possible reasons why the combined data for all expansions are significant as a whole, but not for some expansions on an individual level.


For Legion, we recall that the regression gave us a p-value of 0.1205. One of the big indicators is that Vault of the Wardens scored the lowest on the popularity scale, yet was tied for the third-shortest timer. Additionally, Halls of Valor and The Arcway have the longest timers across all expansions at 45 minutes; however, among Legion dungeons, they ranked 5th and 7th for popularity, respectively. Despite these outliers, the Legion breakdown still pointed toward a relationship, though not a statistically significant one. In other words, an association this strong would appear about 12.05% of the time even if timers had no effect, which doesn’t meet the minimum standards of statistical significance.


For Battle for Azeroth, we recall the regression gave us a p-value of 0.3035. Tol Dagor and Siege of Boralus took the bottom two places in popularity; however, they are both mid-length dungeons with a timer of 36 minutes. Outside of these, we see dungeons like Atal’Dazar and The Underrot ranking highest in popularity whilst having the shortest timers. Kings’ Rest and Shrine of the Storm were the longest BFA dungeons with timers of 42 minutes, ranking 6th and 8th, respectively. Siege of Boralus and Tol Dagor have the highest residuals, meaning they are furthest from the trendline.

A reason for this may be that both dungeons had commonly known issues with bugs and gimmicks. Tol Dagor was a multi-level prison (in all senses of the word) where we ascended vertically with enemies on each floor. One of the biggest issues in Tol Dagor was inadvertently pulling mobs through the floor above or below, which often assured a wipe. This dungeon interacted dangerously with a few Mythic+ affixes because of the close quarters and enemy types, similar to Grimrail Depot. For example, the Sanguine and Bolstering affixes were particularly difficult to navigate in this instance. Furthermore, the dungeon featured cannons that were incredibly powerful, but with limited ammunition. If players died or missed cannon shots, they would lose a significant amount of time with no second chances. This unforgiving portion of the dungeon was situated near the end, meaning that players had often invested a great deal of time by that point, only to have the key depleted unsatisfyingly.

Players also had similar complaints about Siege of Boralus due to gimmicks. For example, Ashvane Spotters were used to kill much of the trash with friendly fire in the latter part of the dungeon (especially on Bolstering weeks). This was a highly technical strategy with many things that could go wrong, such as the Spotters having finicky meleeing patterns or failing to properly place their Sighted Artillery ability. Additionally, the final boss had some RNG with the spawn locations of tentacles, and the boss could melee players as they crossed a bridge (which was not an intentional design). One additional thing to note is that, for players of the Alliance faction, there was almost one minute of additional NPC roleplay (RP), causing an imbalance in this dungeon between the two factions. There is a possibility that, if these two dungeons didn’t have relatively serious design or RNG issues, they would not have been ranked so low in popularity in relation to their timers. This could suggest that there is a relationship between BFA dungeon timer and popularity, unless a dungeon has serious enough flaws.
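
The residuals mentioned for Tol Dagor and Siege of Boralus are simply the gap between a dungeon's actual rank and the rank the trendline predicts for its timer. The sketch below shows the calculation with placeholder dungeons and a made-up trendline; none of these numbers come from the BFA regression, and the function name `residuals` is an assumption.

```python
def residuals(points, slope, intercept):
    """Observed rank minus the rank predicted by the trendline, per dungeon."""
    return {
        name: rank - (slope * timer + intercept)
        for name, (timer, rank) in points.items()
    }


# Hypothetical (timer in minutes, survey rank) pairs and trendline.
example_points = {
    "Dungeon P": (30, 1),
    "Dungeon Q": (36, 10),
    "Dungeon R": (42, 6),
}
example_residuals = residuals(example_points, slope=0.5, intercept=-14)

# The dungeon furthest from the trendline has the largest absolute residual.
furthest = max(example_residuals, key=lambda name: abs(example_residuals[name]))
print(furthest, example_residuals[furthest])
```

In this made-up case, the mid-length "Dungeon Q" ranks far worse than its timer alone would predict, which mirrors the pattern described for the two BFA outliers.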


In Shadowlands, the regression provided us with a p-value of 0.0097. This indicates that there is indeed a significant relationship. Surprisingly, Sanguine Depths scored 5th in popularity despite having the second longest dungeon timer. Regardless, the results from the methods used were quite conclusive.


It's valuable to compare all the dungeons across the three expansions to create a larger dataset, allowing us to draw more comprehensive conclusions. Doing this also minimized the impact of a few dungeons, such as Tol Dagor and The Arcway, that deviate the furthest from the trendline, while giving us more confidence in the direction of the effect. The concern with this approach is ensuring that the data is comparable between the three expansions, which is why we normalized the votes to account for fewer people voting for Legion and BFA dungeons than for Shadowlands dungeons. Although the comparison between expansions is still imperfect, we believe this is a sufficiently robust metric to facilitate cross-expansion comparison. When using our aggregate method explained in the results section, we received a p-value of 0.0035. This means that, if there were no relationship between timers and favorite keys, a result this strong would occur only about 0.35% of the time, which clears even the strictest conventional significance level. Because all expansions were normalized against each other, the larger dataset should give us a more accurate result and account for dungeons with higher residuals.

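However, we want to be clear that we’re not inferring that subtracting one minute from Mists of Tirna Scithe would increase the dungeon’s popularity. The regression describes an association across dungeons, not the effect of changing any single dungeon's timer.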


To summarize, our methodology for this study indicates that there is a statistically significant relationship between dungeon timer and survey ranking. While survey responses sometimes yield imperfect results, we expect that, if all players' preferences and skill levels were accurately recorded, the conclusion would be the same: there is a significant relationship between the variables.

In our survey, we included an "any additional thoughts" question where players often chose to justify their reasoning behind their favorite dungeons. In these optional responses, players almost never mentioned the length of a dungeon, citing instead factors such as dungeon linearity, AoE friendliness, boss design, “unfair” mechanics, and camera control freedom (visibility). Despite players rarely emphasizing dungeon timer as a factor for their favorite dungeons, there is statistically significant quantitative evidence of a link between timer and preference.


About the Author

Krista is a longtime Twitch Partner and Cutting Edge player with a love for Economics. She looks for any excuse to put her degree to use, including sifting through World of Warcraft data! You can also find her commentating for the Race to World First, WoW Esports productions, and other events.