Here at Rustici Software, we’ve spent a lot of time thinking about how one scales SCORM Engine and Content Controller. As you’re building out your environment, there are some key bits to keep at the front of your mind:
- The primary limiter in large SCORM Engine and Content Controller deployments is database capacity. On the database server, CPU and disk IOPS are the two most important bottlenecks you’ll face.
- Front-end application server capacity is easy to scale - just add servers behind a load balancer. Scaling the database backend is harder - you have to make the master box bigger.
- SCORM Engine is write-heavy - the RecordResults call has to update a lot of tables with information about the users’ interactions with a course, and it is the single heaviest user of disk I/O on a SCORM Engine database server.
- Read-after-write consistency is an issue with SCORM Engine - very often, your application will want to read back data immediately after it has been written to the database so that it can make decisions based upon a user’s interactions with a course. For many folks, the 1-2 second latency inherent in reading from a read-only database server isn’t acceptable. This means that scaling reads horizontally is much more challenging than with an ordinary web application that can tolerate a second or two’s worth of read-after-write latency.
The problem that most of our customers encounter (and that we’ve run into as well) is that when the capacity of the database server is maxed out, they’re forced to upsize the box upon which it runs or shard the database out by tenant - scaling out reads is often of limited utility because of the read-after-write consistency problem mentioned above.
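To make the read-after-write problem concrete, here is a minimal sketch in Python. It is a toy simulation, not SCORM Engine code: the `Primary` and `LaggingReplica` classes, the key name, and the 1.5 second lag are all assumptions chosen to mirror the 1-2 second latency described above. Time is passed in explicitly so the behavior is deterministic.

```python
class Primary:
    """Stand-in for the master database: writes are visible here immediately."""

    def __init__(self):
        self.rows = {}
        self.log = []  # (commit_time, key, value) entries shipped to replicas

    def write(self, key, value, now):
        self.rows[key] = value
        self.log.append((now, key, value))

    def read(self, key):
        return self.rows.get(key)


class LaggingReplica:
    """Stand-in for a read-only replica that applies changes after a delay."""

    def __init__(self, primary, lag_seconds):
        self.primary = primary
        self.lag_seconds = lag_seconds

    def read(self, key, now):
        value = None
        # Only changes committed at least lag_seconds ago are visible here.
        for committed_at, logged_key, logged_value in self.primary.log:
            if logged_key == key and now - committed_at >= self.lag_seconds:
                value = logged_value
        return value


primary = Primary()
replica = LaggingReplica(primary, lag_seconds=1.5)  # assumed lag, for illustration

# A learner finishes a course at t=0.0 and the result is written to the master.
primary.write("registration-42:completion", "completed", now=0.0)

from_primary = primary.read("registration-42:completion")
stale = replica.read("registration-42:completion", now=0.1)      # replica is behind
caught_up = replica.read("registration-42:completion", now=2.0)  # replica caught up
```

In this sketch the read at `now=0.1` comes back empty even though the write has already succeeded on the master. That is the situation an application runs into when it makes a decision right after a learner’s results are recorded.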
Scaling Application Servers
SCORM Engine and Content Controller application servers scale horizontally without much effort. Just add more of ‘em to your load balancer and off you go - no need to worry about session stickiness. The biggest gotcha you’ll need to consider is shared file storage - all of those SCORM and xAPI courses have to live somewhere that’s accessible to all of the application servers, all the time. We’ve seen folks have success using the following shared storage techniques:
- Windows File Server shares (on Windows servers only - SMB shares on Linux not so much.)
- iSCSI targets
- Amazon S3 (which we use in our Managed Hosted environment and support)
- Azure Cloud Storage (which we don’t support, though we’ve seen folks build out their own implementations. Not for the faint of heart.)
- NFS shares on Linux (we’re not big fans, but it can be made to work.)
S3 File Storage
S3 is the obvious solution if you’re operating in an AWS environment, but there’s another layer involved - to use S3 and still protect your content from public access, you need to have a layer of authentication in front of it. We solve this by using AWS’ CloudFront CDN to proxy all connections to S3. Configuring CloudFront and S3 to accomplish this is beyond the scope of this humble article, but if you’re really interested, please give us a call and we can talk about all the gory details. We’ve got Java-only libraries that will handle this for you, but they’re not included as part of Engine’s core (yet). .NET support for S3 will be forthcoming once we move this stuff into Engine’s core.
Sharding by tenant
If your application is using SCORM Engine’s tenancy features and you think that you may need to shard your databases in order to scale horizontally, you should consider using Engine’s per-tenant database connection pool features. These features allow you to define tenant-specific database connection strings. For example, some of our clients use this feature to provision separate databases for important, high-volume clients so that they can place these clients on their own dedicated database servers.
For an overview of how to use Tenancy settings within Engine, please see http://rustici-docs.s3.amazonaws.com/engine/2015.1.x/architecture-api.html#tenancy
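The idea behind sharding by tenant can be sketched as a simple lookup from tenant name to connection string. This is an illustration only and does not show Engine’s actual configuration syntax (see the Tenancy documentation for that). The tenant names, host names, and the `connection_string_for` helper are all hypothetical.

```python
# Hypothetical default: every tenant without an override shares this database.
DEFAULT_CONNECTION = "postgresql://engine@db-shared.example.internal/engine"

# Hypothetical overrides: high-volume tenants get a dedicated database server.
TENANT_CONNECTIONS = {
    "bigcorp": "postgresql://engine@db-bigcorp.example.internal/engine",
    "megauniversity": "postgresql://engine@db-megauni.example.internal/engine",
}


def connection_string_for(tenant):
    """Return the tenant's dedicated connection string, or the shared default."""
    return TENANT_CONNECTIONS.get(tenant, DEFAULT_CONNECTION)


dedicated = connection_string_for("bigcorp")
shared = connection_string_for("smallco")
```

The point of the mapping is that moving a busy tenant onto its own database server becomes a configuration change for that one tenant, and everyone else stays on the shared database.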
When you’re looking at the actual hardware for SCORM Engine, the following baseline may be useful. This is the smallest possible setup that we would run for a production environment.
Application server:
- Windows Server 2012 Datacenter Edition or Ubuntu 14.04 LTS
- AWS EC2 t2.medium instance
- 2 vCPUs (Intel Xeon E5-2676 v3 @2.40GHz)
- 4 GB RAM
- .NET Framework 4.5 or Oracle Java 8
Database server:
- SQL Server Standard 12.x, MySQL 5.6, or Postgres 9.3.5+
- AWS RDS t2.medium instance
- 2 vCPUs (Intel Xeon E5-2676 v3 @2.40GHz)
- 4 GB RAM
- Burstable to ~150 disk IOPS (IO operations per second)
Obviously a proper production setup is going to have multiple application servers and redundant database servers to provide high availability. We are especially big fans of AWS’ Elastic Load Balancers and Multi-AZ database servers for high-availability setups that are also low-maintenance.
Keep in mind that as you add application servers, your master database server is going to be forced to scale upwards. If you’re running on AWS or another cloud platform that allows you to upsize instances, this isn’t a big deal. If you’re running on your own iron, this might be a HUGE deal, and you’ll need to plan very carefully for growth. Please talk to us at firstname.lastname@example.org if this is an issue for you - we can help you plan your database environment so that you don’t get stuck with too much or too little capacity.
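For a rough feel of when a database server’s disk becomes the limit, a back-of-the-envelope estimate like the one below can help. It is a sketch under stated assumptions, not a sizing formula from Rustici: the number of commits per learner per minute and the number of disk I/O operations per commit are made-up inputs that you would replace with figures measured in your own environment. Only the ~150 IOPS budget comes from the baseline above.

```python
import math


def estimated_write_iops(concurrent_learners, commits_per_learner_per_minute, ios_per_commit):
    """Estimate sustained disk IOPS generated by learners saving their results."""
    return concurrent_learners * commits_per_learner_per_minute * ios_per_commit / 60


def max_concurrent_learners(iops_budget, commits_per_learner_per_minute, ios_per_commit):
    """Largest learner count whose estimated write load fits inside the IOPS budget."""
    return math.floor(iops_budget * 60 / (commits_per_learner_per_minute * ios_per_commit))


# Assumed inputs, for illustration only: measure these in your own deployment.
COMMITS_PER_LEARNER_PER_MINUTE = 1
IOS_PER_COMMIT = 10

# Budget taken from the baseline database server above (~150 burstable IOPS).
IOPS_BUDGET = 150

learner_ceiling = max_concurrent_learners(
    IOPS_BUDGET, COMMITS_PER_LEARNER_PER_MINUTE, IOS_PER_COMMIT
)
load_at_ceiling = estimated_write_iops(
    learner_ceiling, COMMITS_PER_LEARNER_PER_MINUTE, IOS_PER_COMMIT
)
```

With those made-up inputs, the 150 IOPS budget works out to 900 concurrent learners. The estimate ignores reads, caching, and burst credits, so treat it as a starting point for a conversation about capacity and not as a hard limit.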