Scalable Social Games, Robert Zubek of Zynga (liveblog)

Social games are interesting from an engineering point of view since they live at the intersection of games and the web. We spend time thinking about making games fun, making players want to come back. We know those engineering challenges, but the web introduces its own set, especially around users arriving somewhat unpredictably, and effects where huge populations come in suddenly. SNSes are a great example of this, with spreading network effects and unpredictable traffic fluctuations.

At Zynga we have 65m daily players, 225m monthly. And usage can vary drastically: Roller Coaster Kingdom gained 1m DAUs in one weekend, going from 700k to 1.7m. Another example, Fishville grew from 0 to 6m DAUs in one week. Huge scalability challenges. And finally, Farmville grew 25m DAUs in five months. The cliff is not as steep, but the order-of-magnitude difference adds its own challenge.

Talk outline: introducing game developers to best practices for web development. Maybe you come from consoles or mobile or whatever; the web world introduces its own set of challenges and also a whole set of solutions that are already developed that we steal, or, uh, learn from. :) If you are already an experienced web developer, you may know this stuff already.

Two server approaches and two client approaches, which combine into three major types:

1. Web server stack + HTML: Mafia Wars, Vampires, etc.
2. Web server stack + Flash: Farmville, Fishville, Cafe World
3. Web + MMO stack + Flash: YoVille, Zynga Poker, Roller Coaster Kingdom

Web stack: based on LAMP, logic in PHP, comms over HTTP. Very well understood protocol, limitations well known.

Mixed stack: game logic in an MMO server (e.g. in Java), web stack for everything else. Used when web-stack limitations get in the way of the game. Use the web side for the SNS pieces.

Fishville:

DB servers –> web stack
Cache and queue servers –> web stack

YoVille:

DB –> MMO
Cache –> Web & CDN

Why a web stack? HTTP is very scalable: requests are very short lived, and it is easy to load balance. Each request is atomic and stateless, so it is easy to add more servers. But there are limitations, especially for games: server-initiated actions (NPCs running around, a monster reacting when you come too close…) are hard to do over HTTP, since it is request/response. There are some tricks, like the long poll, but fundamentally this makes it harder to scale. Load balancers will get unhappy with you, and you can saturate a connection.

The other thing is storing state between requests. This is more particular to game dev than to the web. Say you are playing Farmville and collecting apples. You do a bunch of actions which result in many different requests to the servers, but we want to make sure that only the first click gives you an apple, so you cannot click a dozen times on one tree. Which means stored state and validation. If you have clients talking to many web servers, you cannot use the DB for this; the poor thing will fall over. If you can guarantee that the client only talks to one web server, you can store the state there and save to the DB later. But this is tricky to do: ensuring that session affinity cannot be broken, even in the presence of malicious clients and browsers, is hard.

So instead, you can wrap the DB servers in a faster caching layer that keeps you from hitting the DB all the time, such as a network-attached cache. This works much better.
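Roughly, the read path looks like this (a sketch only; key names and connections are hypothetical, assuming the PHP Memcached extension and PDO):

```php
<?php
// Rough cache-aside sketch: read player state from the cache first and only
// fall back to the database on a miss.
function loadPlayer(Memcached $cache, PDO $db, int $playerId): array {
    $key = "player:{$playerId}";
    $player = $cache->get($key);
    if ($player !== false) {
        return $player;                      // cache hit, no DB round trip
    }
    $stmt = $db->prepare('SELECT * FROM players WHERE id = ?');
    $stmt->execute([$playerId]);
    $player = $stmt->fetch(PDO::FETCH_ASSOC);
    $cache->set($key, $player, 300);         // keep it warm for five minutes
    return $player;
}
```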

MMO servers… minimally MMO. Persistent socket connection per client, live game support such as chat and server-side push. Keeps game state in memory. We know when a player logs in and can load from the DB then… session affinity by default. Very different from the web! We can't do the easy load balancing like on the web.

Why do them? They are harder to scale out because of load balancing. But you get stuff like server-side push, live events, and lots of game state in memory.

Diagram:

DB servers, maybe with less caching wrapping them, talk to both the web server and the MMO server, and those both talk to the client.

On the client side, things are simpler.

Flash allows high production quality games, game logic on the client, and can keep an open socket. You can talk any protocol you want.

HTML+AJAX: the game is “just” a web page, minimal system requirements, but limited graphics.

SNS integration. “Easy,” but not really a scaling issue. Call the host network to get friends, etc. You do run into latency and scaling problems; as you grow larger you need to build your infrastructure so it can support graceful performance degradation in the face of network issues. Networks provide REST APIs and sometimes client libraries.

Architectures:

Data is shared across all three of these architectures: database, cache, etc.

Part II: Scaling solutions

aka not blowing up as you grow.

Two approaches: scaling up or scaling out.

Up means that as you hit processor or IO limits, you get a better box.
Out means that you add more boxes.

The difference is largely architectural. When scaling up, you do not need to change code or design. But to scale out you need an architecture that works that way. Zynga chooses scaling out, a huge win for us. At some point you cannot get a box big enough fast enough. Much easier to add small boxes, but you need the app to have architectural support for it.

Roller Coaster Kingdom gained a lot of players quickly. We started with one database, hit 500k DAUs in a week, and bottlenecked. Short term we scaled up, but switched to scaling out next.

Databases, very exciting. The first thing to fall over. Several ways to scale them. Terms here are specific to MySQL, but the concepts are the same for other systems:

Everyone starts out with one database, which is great. But you need to keep track of two things. One, the limit on queries per second: do benchmarking using standard tools like SuperSmack. You want to know your q/s ceiling and, beyond that, how it will perform; there are optimizations you can use to move it. And two, you need to know the players' query profile in terms of inserts, selects, and updates, and the average queries per second per player. It tends to follow your DAU number, which is nice because then you can project q/s and know when you will reach capacity.
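Back-of-the-envelope version of that projection (all numbers made up for illustration):

```php
<?php
// Illustrative capacity projection: given a benchmarked queries/sec ceiling
// and an average query rate per player, estimate how much DAU growth you
// can absorb before the DB saturates. Every value here is hypothetical.
$qpsCeiling       = 10000;    // measured q/s ceiling from benchmarking
$queriesPerPlayer = 0.005;    // average queries/sec per daily active user
$currentDau       = 500000;
$weeklyGrowth     = 1.25;     // 25% DAU growth per week

$dauAtCeiling = $qpsCeiling / $queriesPerPlayer;
$weeksLeft    = log($dauAtCeiling / $currentDau) / log($weeklyGrowth);
printf("Ceiling around %d DAU, roughly %.1f weeks away\n", $dauAtCeiling, $weeksLeft);
```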

If your app grows then you will need to scale out.

Approach one: replicating data to read-only slaves. Works well for blogs and web properties but not games, because games have a higher modification profile, so your master is still a bottleneck. But useful for redundancy.

Approach two: multiple masters. Better because writes are split, but now you have consistency-resolution problems, which can be dealt with but increase CPU load.

Approach three, and the best: push the resolution logic up to the app layer, a standard sharding approach. The app knows which data goes to which DB.

Partition data two ways:

Vertical, by table: easy, but does not scale with DAUs. Move players to a different box from items.

Horizontal, by row: harder to do but gives the best results. Different rows live on different DBs, so you need a good mapping from row to DB. Stripe rows across different boxes, e.g. primary key modulo the number of DBs. Do it on an immutable property of the row. A logical RAID 0. Nice side effect when you need to increase capacity: to scale out a shard, you add read-only slaves, sync them, then shut down, cut replication, and hook it back up. Instant double capacity.
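A minimal sketch of that modulo mapping (hypothetical setup; assumes an array of PDO connections, one per shard, and sharding on the immutable player id):

```php
<?php
// Map a row to a database by taking the primary key modulo the number of
// shards. The key must be an immutable property of the row.
function shardFor(int $playerId, array $shards): PDO {
    return $shards[$playerId % count($shards)];
}

// Given $shards (array of per-shard PDO connections) and $playerId, every
// query for this player always lands on the same shard.
$db   = shardFor($playerId, $shards);
$stmt = $db->prepare('SELECT * FROM players WHERE id = ?');
$stmt->execute([$playerId]);
```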

More clever schemes exist, with layers that check where each query should go… but the nice thing about this approach is how straightforward it is. No automatic replication, no magic; robust and easy to maintain.

YoVille: partitioning both ways; lots of joins had to be broken. Data patterns had to be redesigned; with sharding you need the shard id with every query. Data replication had trouble catching up with high-volume usage. In a sharded world you cannot do joins across shards easily; there are solutions but they are expensive. Instead, do multiple selects or denormalize your data. Say you have a catalog of items and an inventory, and you want to match them. If the catalog is small enough, just keep it in memory.
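For the catalog/inventory case, a sketch of what that looks like (hypothetical schema and variable names):

```php
<?php
// Working around a cross-shard join: fetch the inventory rows from the
// player's shard, then resolve item details from a small catalog kept in
// memory instead of joining against a catalog table.
function loadInventory(PDO $shardDb, int $playerId, array $itemCatalog): array {
    $stmt = $shardDb->prepare('SELECT item_id, quantity FROM inventory WHERE player_id = ?');
    $stmt->execute([$playerId]);

    $inventory = [];
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $inventory[] = [
            'item'     => $itemCatalog[$row['item_id']],   // in-memory catalog lookup
            'quantity' => $row['quantity'],
        ];
    }
    return $inventory;
}
```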

Skip transactions and foreign key constraints. It is easier to push this to the app layer. The more you keep in RAM, the less you will need to do this.

Caching.

If we don't have to talk to the DB, let's skip it. Spend memory to buy speed. Most popular right now is memcached, a network-attached RAM cache. Not just for caching queries but for storing shared game state as well, such as the apple-picking example. It stores simple key-value pairs. Put structured game data there, and mutexes for actions that span servers. Caveat: it is an LRU (least recently used) cache, not a DB. There is no persistence! If you put too much data in it, it will start dropping old values, so you need to make sure you have written the data to the DB.
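A rough sketch of the mutex idea (hypothetical keys, assuming the PHP Memcached extension, whose add() is atomic and fails if the key already exists):

```php
<?php
// Cross-server mutex on memcached: only one web server at a time gets to
// modify this player's cached state, no matter which server the click hit.
function pickApple(Memcached $cache, int $playerId): bool {
    $lock = "lock:player:{$playerId}";
    if (!$cache->add($lock, 1, 30)) {          // 30s TTL so a crash can't wedge it
        return false;                          // another server holds the lock
    }
    $state = $cache->get("player:{$playerId}");
    $state['apples'] += 1;                     // credit the picked apple
    $cache->set("player:{$playerId}", $state, 600);
    $cache->delete($lock);
    // memcached is an LRU cache, not a store: the updated state still has
    // to be written back to the DB before it can get evicted.
    return true;
}
```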

Because it is so foundational, you can shard it just like the DB: different keys on different servers, or shard it vertically or horizontally.

Game servers.

The web server part is very well known: load balance. The preferred approach is to load balance with a proxy first. This is nice from a security standpoint… but it is a single point of failure, and there are capacity limits since the proxy has a maximum number of connections.

If you hit those limits, you load balance the load balancers by putting DNS load balancing in front of them. It doesn't matter that DNS propagation takes a while.

The other thing that is useful is redirecting media traffic away from the game servers: SWFs are big, audio is big, so do not serve them from the same place as game comms or you will spend all your capacity on media files. Push them through a CDN, and if you are on the cloud already you can store them there instead. A CDN makes it fast, since the assets are close to the users. Another possibility is to use lightweight web servers that only serve media files. But essentially, you want the big server bank to only talk game data, not serve files. You gain several orders of magnitude in performance by doing this.

MMO servers, the unusual part of the setup! Scaling is easiest when servers do not need to talk to each other. DBs can shard, memcached can shard, web servers can be load-balanced farms, and MMOs? Well, our approach is to shard them like the other servers.

Remove any knowledge they have about each other and push complexity up or down. Moving it up means load balancing it somehow. Minimize inter-server comms; all participants in a live event should be on the same server. Scaling out means no direct sharing; sharing through third parties is OK, e.g. a separate service for that event traffic.

Do not let players choose their connections. Poor man's load balancing: if a server gets hot, remove it from the LB pool; if enough servers get hot, add more instances and send new connections there. Not quite true load balancing, which limits scalability.
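A sketch of what that assignment might look like (hypothetical hostnames and pool handling):

```php
<?php
// "Poor man's load balancing": new connections get an MMO server from the
// pool, skipping any server currently flagged as hot; existing players stay
// where they are, so a hot server cools down on its own.
function assignServer(array $pool, array $hot): string {
    $healthy = array_values(array_diff($pool, $hot));
    return $healthy[array_rand($healthy)];
}

$pool   = ['mmo1.internal', 'mmo2.internal', 'mmo3.internal'];
$hot    = ['mmo2.internal'];        // e.g. flagged by monitoring
$server = assignServer($pool, $hot);
```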

In deployment, downtime = lost revenue. On the web you just copy over PHP files. Socket servers are harder: how do you deploy with zero downtime? Ideally you set up shadow new servers and slowly transition players over. This can be difficult, with versioning issues.

For this reason, this is all harder than with web servers.

Capacity planning.

We believe in scaling out, but demand can change fast. How do you provision enough servers? Different logistics: do you provision physical servers or go to the cloud? If you have your own machines, you have more choice and control but higher fixed costs. With the cloud, lower costs and faster provisioning, but you cannot control CPU, IO is virtualized, etc. On the cloud it is easier to scale out than up.

For a legion of servers you need a custom dashboard for health, Munin for server monitoring graphs, and Nagios for alerts. The first level of drilldown is graphs for every server family separately, so you can isolate a problem to a given layer in the system. Once you know memcached usage spiked, then you can drill down to particular machines…

Nagios: SMS alerts for server load, e.g. CPU load exceeds 4, or a test account fails to connect after 3 retries.

Put alerts on business stats too! DAUs dropping below the daily average, for example. Sometimes they react faster than server stats.
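A sketch of such a check (thresholds and the metric helpers are hypothetical):

```php
<?php
// Alert on a business stat: page someone if current DAU falls well below
// the trailing daily average. The helper functions and the 80% threshold
// are made up for illustration.
$currentDau  = fetchCurrentDau();          // hypothetical metrics lookup
$trailingAvg = fetchTrailingDauAverage();  // hypothetical, e.g. 7-day average
if ($currentDau < 0.8 * $trailingAvg) {
    sendAlert("DAU {$currentDau} is below 80% of trailing average {$trailingAvg}");
}
```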

If you are deployed in the cloud, network problems are more common; machines dropping off the net or restarting is routine. Reduce single points of failure and program defensively. This includes the game side.

Q&A:

q: why MySQL? Other DBs are better for scaling.
a: there are other DBs that have been around longer and have bigger communities, but we don't use the features those large DBs offer. Looking back at the sharding slides: we don't even do things like transactions. It is easier to move that complexity to the app layer. Once you are on that path, it is a good solution.

q: did you benchmark, that sort of thing, for the different DBs?
a: yes, of course.

q: and for data integrity, if you threw out foreign key constraints, that sounds scary! Is it kind of a nightmare?
a: No, not too bad at all, actually. Especially if you do not hit the DB all the time, you find you don't get into those dangerous situations as often.

q: is the task when you add more tables… is it as complex?
a: not too bad, it has worked well.

q: assuming browsers pick it up, are you guys looking into WebGL?
a: many technologies are interesting: 3D in the browser, Silverlight. I would be interested in using them personally… once they achieve high market penetration.

q: why Flash?
a: everyone has it. Very pragmatic approach.

q: Do you back up DBs?
a: of course

q: and how?
a: once you go with the cloud and Amazon, you have to use that approach… we have a number of redundant backup solutions.

q: I guess many joins are across friends… they have to talk to multiple shards. Do you try to put friends on the same shard?
a: no, everyone has different friends.

q: on SNS integration, did you run into issues with PHP not supporting async, with delays from answers from the SNS, running out of threads?
a: you will encounter delays with SNS comms; it is just part of the overall infrastructure, and it could be anything, not just PHP. You have to program around it, and find good solutions for dealing with it when it happens, because it will.

q: So you don't switch away from PHP, you just delay the process?
a: we did encounter a number of places where we had to dig deep into PHP in order to make it work well at that scale.

q: did you patch PHP?
a: we, uh… yes.

q: what are your feelings on tools like the NoSQL sort of thing?
a: we look into those actively; once the tech matures, it will be a very good candidate for this sort of thing. But not currently implemented.

q: on sharding, you said use modulo to distribute load. Once you have found a bottleneck, how do you prepare the data to be moved from one shard to another?
a: You don't move people between shards. You just copy a shard onto two machines, so both have redundant data, and then remove the redundant halves.

q: on partitioning, partitioning into two tables. Say item trading that goes across two DBs, don't transactions break? Changing ownership on two different DBs?
a: you need a guarantee across multiple DBs: put the data in the memcached layer, lock it, then do the writes; or put it in the app layer and implement “transactions lite.”

q: being on the cloud, did you have to drop a service approach and have each PHP layer write directly to the DB instead of using a service layer? Say for an MMO, achievements or presence services. Do you keep the service layer as a web service, or write directly to the DB? Your service call time can add latency… even on the cloud.
a: Yes, you want this to be nicely modular… we end up not putting it on different machines. It runs on the same box as the game logic so there is no network traffic and no separate layer in between. So modular, but not in terms of network topology.
