Leaderboard ranking with Firebase

Question

I have a project in which I need to display a leaderboard of the top 20, and if the user is not in the leaderboard, they should appear in 21st place with their current ranking. Is there an efficient way to do this? I am using Cloud Firestore as a database. I believe it was a mistake to choose it instead of MongoDB but I

Accepted Answer

Finding an arbitrary player's rank in a leaderboard, in a manner that scales, is a common hard problem with databases. There are a few factors that will drive the solution you'll need to pick, such as:

- Total number of players
- Rate at which individual players add scores
- Rate at which new scores are added (concurrent players * above)
- Score range: bounded or unbounded
- Score distribution (uniform, or are there 'hot scores')

Simplistic approach

The typical simplistic approach is to count all players with a higher score, e.g. SELECT count(id) FROM players WHERE score > {playerScore}.

This method works at low scale, but as your player base grows, it quickly becomes both slow and resource-expensive (both in MongoDB and Cloud Firestore).

Cloud Firestore doesn't natively support count as it's a non-scalable operation. You'll need to implement it on the client side by simply counting the returned documents. Alternatively, you could use Cloud Functions for Firebase to do the aggregation on the server side to avoid the extra bandwidth of returning documents.

Periodic Update

Rather than giving players a live ranking, change it to only update every so often, such as every hour. For example, if you look at Stack Overflow's rankings, they are only updated daily.

For this approach, you could schedule a function, or schedule App Engine if it takes longer than 540 seconds to run. The function would write out the player list in a ladder collection with a new rank field populated with each player's rank. When a player views the ladder now, you can easily get the top X + the player's own rank in O(X) time.

Better yet, you could further optimize and explicitly write out the top X as a single document as well, so to retrieve the ladder you only need to read 2 documents, top-X & player, saving money and making it faster.

This approach would really work for any number of players and any write rate since it's done out of band.
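
The periodic job described above can be sketched in plain Python, with in-memory dicts standing in for the ladder collection and the top-X document. This is only an illustration under assumptions: the names Player, build_ladder and view_leaderboard are hypothetical, no Firestore calls are made, and tied players are given the same rank (the answer does not say how ties should be handled).

```python
from dataclasses import dataclass


@dataclass
class Player:
    player_id: str
    score: int


def build_ladder(players, top_n=20):
    """Return (ladder, top): per-player rank documents and the top-N document."""
    ordered = sorted(players, key=lambda p: p.score, reverse=True)
    ladder = {}
    rank = 0
    prev_score = None
    for position, player in enumerate(ordered, start=1):
        # Assumption: players with equal scores share the same rank.
        if player.score != prev_score:
            rank = position
            prev_score = player.score
        ladder[player.player_id] = {"score": player.score, "rank": rank}
    # The top-N rows are kept together so a view needs only two reads.
    top = [{"player_id": p.player_id, **ladder[p.player_id]} for p in ordered[:top_n]]
    return ladder, top


def view_leaderboard(ladder, top, player_id):
    """Top rows, plus the viewer's own row when they are outside the top."""
    rows = list(top)
    if all(row["player_id"] != player_id for row in rows):
        rows.append({"player_id": player_id, **ladder[player_id]})
    return rows
```

With top_n=20 this gives the layout from the question: 20 rows, plus a 21st row showing the viewer's real rank when they are outside the top 20.
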
You might need to adjust the frequency as you grow, though, depending on your willingness to pay. 30K players each hour would be $0.072 per hour ($1.73 per day) unless you did optimizations (e.g., ignore all 0-score players since you know they are tied for last).

Inverted Index

In this method, we'll create somewhat of an inverted index. This method works if there is a bounded score range that is significantly smaller than the number of players (e.g., 0-999 scores vs 30K players). It could also work for an unbounded score range where the number of unique scores was still significantly smaller than the number of players.

Using a separate collection called 'scores', you have a document for each individual score (non-existent if no one has that score) with a field called player_count.

When a player gets a new total score, you'll do 1-2 writes in the scores collection. One write is to +1 to player_count for their new score and, if it isn't their first time, another is to -1 to player_count for their old score. This approach works for both "Your latest score is your current score" and "Your highest score is your current score" style ladders.

Finding out a player's exact rank is as easy as something like SELECT sum(player_count)+1 FROM scores WHERE score > {playerScore}.

Since Cloud Firestore doesn't support sum(), you'd do the above but sum on the client side. The +1 is because the sum is the number of players above that player, so adding 1 gives you that player's rank.

Using this approach, you'll need to read a maximum of 999 documents, averaging 500ish, to get a player's rank, although in practice this will be less if you delete scores that have zero players.

The write rate of new scores is important to understand, as you'll only be able to update an individual score once every 2 seconds* on average, which for a perfectly distributed score range from 0-999 would mean 500 new scores/second**.
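
The bookkeeping and rank lookup for the inverted index can be sketched as follows. It is a minimal in-memory model, not Firestore code: a plain dict maps each score to its player_count, and the function names record_score and rank_of are hypothetical.

```python
def record_score(scores, new_score, old_score=None):
    """Apply the 1-2 counter updates for a player's new total score."""
    if old_score is not None:
        remaining = scores.get(old_score, 0) - 1
        if remaining > 0:
            scores[old_score] = remaining
        else:
            # Delete score documents with zero players to save reads later.
            scores.pop(old_score, None)
    scores[new_score] = scores.get(new_score, 0) + 1


def rank_of(scores, player_score):
    """Rank is the number of players with a strictly higher score, plus one."""
    players_above = sum(
        count for score, count in scores.items() if score > player_score
    )
    return players_above + 1
```

In a real deployment the two counter updates would presumably need to be applied atomically (for example in a transaction or batched write) so that concurrent score changes do not lose increments; the sketch ignores concurrency entirely.
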
You can increase this write rate by using distributed counters for each score.

* Only 1 new score per 2 seconds since each score generates 2 writes
** Assuming an average game time of 2 minutes, 500 new scores/second could support 60,000 concurrent players without distributed counters. If you're using a "Highest score is your current score" ladder, this will be much higher in practice.

Sharded N-ary Tree

This is by far the hardest approach, but it could allow you to have both faster and real-time ranking positions for all players. It can be thought of as a read-optimized version of the Inverted Index approach above, whereas the Inverted Index approach is a write-optimized version of this.

You can follow this related article, 'Fast and Reliable Ranking in Datastore', for a general approach that is applicable. For this approach, you'll want to have a bounded score (it's possible with unbounded, but will require changes from the below).

I wouldn't recommend this approach, as you'll need to do distributed counters for the top-level nodes for any ladder with semi-frequent updates, which would likely negate the read-time benefits.

Final thoughts

Depending on how often you display the leaderboard for players, you could combine approaches to optimize this a lot more.

Combining 'Inverted Index' with 'Periodic Update' at a shorter time frame can give you O(1) ranking access for all players.

As long as the leaderboard is viewed more than 4 times, across all players, over the duration of the 'Periodic Update', you'll save money and have a faster leaderboard.

Essentially, each period, say every 5-15 minutes, you read all documents from scores in descending order. Using this, keep a running total of player_count. Re-write each score into a new collection called scores_ranking with a new field players_above.
This new field contains the running total excluding the current score's player_count.

To get a player's rank, all you need to do now is read the document for the player's score from scores_ranking; their rank is players_above + 1.
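
The periodic rebuild of scores_ranking can be sketched like this, again with plain dicts in place of collections. The function names build_scores_ranking and rank_from_ranking are hypothetical; the field names player_count and players_above follow the description above.

```python
def build_scores_ranking(scores):
    """Build scores_ranking from a mapping of score -> player_count."""
    scores_ranking = {}
    running_total = 0
    # Walk the scores in descending order, as the periodic job would.
    for score in sorted(scores, reverse=True):
        player_count = scores[score]
        scores_ranking[score] = {
            "player_count": player_count,
            # Running total of players so far, excluding this score's players.
            "players_above": running_total,
        }
        running_total += player_count
    return scores_ranking


def rank_from_ranking(scores_ranking, player_score):
    """Single-document lookup: rank is players_above + 1."""
    return scores_ranking[player_score]["players_above"] + 1
```

One thing the sketch leaves open: a score that first appears after the last rebuild has no entry in scores_ranking yet, so a real implementation would need some fallback for that case.
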
