SCORM Cloud: Safe, Secure and Available
We keep you running
Nothing is foolproof when it comes to guaranteeing uptime. We could quote a crazy uptime percentage guarantee here, but instead, we will simply let the daily uptime stats speak for themselves. Want to see the current uptime status? Head over to our dedicated uptime page. If you have immediate questions or concerns about our uptime, please contact us or Tweet us @SCORMCloud.
Rustici Software has implemented a two-pronged approach to infrastructure and application security. We utilize the SANS 20 Critical Controls (20 CC) framework as our set of guiding principles for infrastructure and network security management. The 20 Critical Controls encompass all aspects of network management, covering topics ranging from encryption to protecting against social engineering attacks.
On the application security side, we have implemented the Open Web Application Security Project (OWASP) framework. In addition to using OWASP to guide our development and coding practices, we regularly pen test our application in a staging environment to ensure that new code changes do not introduce vulnerabilities.
Backups and Disaster Recovery
All the backups in the world are useless if you can’t use them to recover data and infrastructure. We’ve designed our backup and disaster recovery strategy around two metrics:
If there’s a disaster, how fast can we recover?
Have we done a test restore this week? If not, do one now.
To this end, we’ve taken the following measures:
Our Web and Database tiers are split between two physical availability zones. In the event that one of these zones is taken offline, SCORM Cloud can fail over to the second zone within minutes with no data loss.
SCORM Cloud is served primarily from AWS' US-East-1 datacenters. We have designed our configuration management systems so that we can stand up SCORM Cloud in the US-West-2 datacenter in Oregon within 2 hours of a disaster that takes the entire US-East-1 datacenter group offline. In this context, a disaster means a massive hurricane that cuts power to all of their datacenters, an earthquake, an alien invasion of Northern Virginia (because why not), or some other equally unlikely event that we really, really hope never happens, because it would be awful for everyone.
We maintain backups of our MySQL databases in AWS datacenters on both the East and West Coasts, and we test restore these databases to a DR environment weekly.
We maintain active read-replica databases in the US-West-2 datacenter. In the event of a disaster that takes out multiple US-East-1 availability zones, we still have up-to-the-minute data in US-West-2.
We maintain content in versioned S3 buckets with a 60-day expiration period.
We synchronize our production S3 buckets to backup buckets with US-West-2 endpoints.
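The content retention rule above can be expressed as an S3 lifecycle policy along these lines. This is an illustrative sketch (the rule ID is hypothetical and the filter is simplified), not our actual bucket configuration:

```json
{
  "Rules": [
    {
      "ID": "expire-old-content-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 60 }
    }
  ]
}
```

A policy in this shape is applied to a versioned bucket with `aws s3api put-bucket-lifecycle-configuration`, after which S3 ages out old object versions automatically.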
Every time we deploy a new version of SCORM Cloud, we’re also running a test of our DR systems. Web instances are built from scratch. API and interface tests are run and validated. Databases are restored from snapshots. CloudFormation scripts rebuild infrastructure. If it fails, we fix it. Fortunately, it doesn’t fail often 🙂
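To give a feel for what infrastructure-as-code looks like in this style, here is a minimal CloudFormation sketch of a web tier spread across two availability zones. All resource names, subnet IDs, and AMI IDs are placeholders, not our production templates:

```yaml
Resources:
  WebServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "2"
      MaxSize: "6"
      VPCZoneIdentifier:
        - subnet-aaaa1111   # availability zone A
        - subnet-bbbb2222   # availability zone B
      LaunchConfigurationName: !Ref WebLaunchConfig
  WebLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0abc1234def567890   # freshly built master AMI
      InstanceType: m5.large
```

Because the whole stack is declared this way, rebuilding it in a different region is a matter of re-running the template, not re-doing manual setup.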
SCORM Cloud utilizes a Content Delivery Network (CDN), Amazon CloudFront. This offers drastic performance improvements for learners launching course content, especially for learners not located in North America. Learn more about SCORM Cloud CDN here.
On the infrastructure side, we utilize Amazon Web Services for every aspect of SCORM Cloud. We lean on AWS’ expertise in datacenter and infrastructure management so that we can concentrate on what we do best – developing great software.
All of our infrastructure is segmented using AWS’ Virtual Private Cloud (VPC) features. We use VPCs to keep our data and web servers carefully segregated, and to further separate our development and staging environments from production.
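A stripped-down sketch of that segmentation, in CloudFormation terms (CIDR blocks and names are hypothetical, not our actual network layout):

```yaml
Resources:
  ProdVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  WebSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref ProdVPC
      CidrBlock: 10.0.1.0/24    # public-facing web tier
      AvailabilityZone: us-east-1a
  DataSubnetA:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref ProdVPC
      CidrBlock: 10.0.101.0/24  # private data tier, no direct internet route
      AvailabilityZone: us-east-1a
```

The point of the split is that database instances live in subnets that are simply unreachable from the public internet; only the web tier can talk to them.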
We use Amazon S3 for content storage. SCORM Cloud authenticates and proxies all client connections to S3.
We use Amazon RDS for MySQL database services. Our RDS instances are in a multi-availability zone configuration, so the loss of a single geographic location or database server maintenance won’t impact SCORM Cloud’s availability. We also replicate all MySQL databases to RDS instances in the AWS US-West-2 datacenter in Oregon, and maintain database backups in both US-East-1 and US-West-2 datacenters.
Our web servers are located in multiple physical availability zones in the US-East-1 datacenter.
We make extensive use of AWS’ CloudTrail and CloudWatch features to monitor our systems’ availability and to audit access. We have real-time alerts and change management configured that give us visibility into all changes to the environment.
We use AWS’ IAM identity management features to facilitate and audit access to all AWS resources.
For a complete overview of Amazon Web Services' security practices and regulatory compliance information, please see AWS' security and compliance documentation.
Monitoring and Availability
We have partnered with the good folks at New Relic (who are awesome) to provide JVM-level (Java Virtual Machine) monitoring of SCORM Cloud. New Relic gives us transaction-level visibility into every function of SCORM Cloud, and allows us to resolve problems before our customers even notice them.
As mentioned earlier, we make extensive use of AWS’ CloudWatch features to monitor our systems at an instance level, and to alert us when problems crop up.
We also have a tertiary integration we built with Pingdom that alerts us to system-wide outages via phone calls, Slack messages, texts, emails, and carrier pigeons (well, not that last item, but if we thought it would help, we would do it). Want to see real-time status? Head over to our uptime page.
We built out our SIEM system atop the ELK (Elasticsearch, Logstash and Kibana) Stack. All of our production instances forward their log data to Logstash, which parses the input, tags it according to rules, and then passes it along to our Elasticsearch cluster for indexing. We have built extensive dashboards in Kibana that give us visibility into the log data. Among other things, we use the ELK stack for monitoring remote administrative access, for per-instance change-control, and for monitoring application-level authentication events.
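A simplified sketch of a Logstash pipeline in this shape (ports, field names, and tags here are illustrative, not our actual rules):

```conf
input {
  beats { port => 5044 }            # production instances ship logs here
}
filter {
  if [message] =~ /sshd/ {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:ts} %{HOSTNAME:host} sshd\[%{NUMBER:pid}\]: %{GREEDYDATA:ssh_event}" }
    }
    mutate { add_tag => ["ssh_auth"] }  # tagged events feed the remote-access dashboard
  }
}
output {
  elasticsearch { hosts => ["es-cluster:9200"] }
}
```

Tagging at ingest time is what makes the Kibana side cheap: dashboards and alerts can filter on a tag like `ssh_auth` instead of re-parsing raw log lines.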
We tightly restrict access to our production databases and content stores: only a core group of developers and dev/ops folks have access to the databases and S3 buckets where the content resides.
Remote access to all production resources occurs via a restricted network here at Rustici Software. All access to our production resources occurs over an IPSec VPN connection.
Administrative access to production systems is extensively logged and audited. Because we use immutable master AMIs for our production systems (see the Configuration Management section for details), monitoring remote SSH access to these instances is easy: if someone has made an SSH connection, then something's amiss, and the Dev/Ops team knows about it immediately via our SIEM system.
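The detection logic described above (on these instances, any accepted SSH login is unexpected) can be sketched in a few lines of Python. This is an illustrative stand-in for a SIEM rule, not our actual implementation:

```python
import re

# On immutable production AMIs, any accepted SSH login is unexpected,
# so a single matching log line is enough to raise an alert.
SSH_ACCEPT = re.compile(
    r"sshd\[\d+\]: Accepted (?:password|publickey) for (\S+) from (\S+)"
)

def check_line(line):
    """Return (user, source_ip) if the line records an SSH login, else None."""
    m = SSH_ACCEPT.search(line)
    return (m.group(1), m.group(2)) if m else None

def alerts(log_lines):
    """Yield an alert string for every accepted SSH connection in the logs."""
    for line in log_lines:
        hit = check_line(line)
        if hit:
            yield f"ALERT: ssh login by {hit[0]} from {hit[1]}"
```

In practice a rule like this lives in the SIEM rather than a script, but the shape is the same: match one event class, alert on every occurrence, no thresholds needed.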
We make extensive use of AWS’ CloudTrail system to log API and console access to AWS resources. AWS access is controlled: each person that is able to log in is granted an explicit set of rights necessary to do what they need to do, and no more. We do not use AWS root keys for access at any time – all access is managed via AWS IAM services. Each administrator accesses the system using a unique set of credentials, keypairs, or API keys.
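As an illustration of the least-privilege grants described above, here is an IAM policy scoped to read-only access on a single content bucket. The bucket name and statement ID are hypothetical:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyContentBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-content-bucket",
        "arn:aws:s3:::example-content-bucket/*"
      ]
    }
  ]
}
```

Attached to an individual IAM user or role, a policy like this grants exactly the listed actions on exactly the listed bucket and nothing else; anything not explicitly allowed is denied by default.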
We use Tenable’s Nessus Professional Vulnerability Scanner to monitor all of our systems for vulnerabilities. Nessus provides up-to-the-minute vulnerability intelligence that we use to guide our patch management process and our infrastructure audit process.
We use Burp Suite Professional (which is most excellent, despite its odd name) to perform automated application-level scans of our systems against the OWASP Top 10 common web application vulnerabilities.
SCORM Cloud utilizes a Blue/Green production infrastructure, which is to say that at any given time our production stack is running on one of two environments. When it is time to update the production environment, we build a fresh set of Amazon Machine Images (AMIs) and deploy them to whichever stack is inactive. We build our AMIs using a pipeline consisting of Ansible, Packer, Jenkins, and StackStorm. This build pipeline allows us to represent all aspects of the systems' configuration as code, and removes the potential for human error from the build and deployment process. The entire production stack can be rebuilt from scratch, tested, and deployed in under two hours.
By using immutable master AMIs in this fashion, we gain a major security advantage: even if an instance were to be compromised, we can redeploy the system with absolute assurance that the compromise is no longer viable – the compromised instance is immediately destroyed and replaced with an instance built from trusted sources.
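The AMI build step of a pipeline like this can be sketched as a minimal Packer template (shown in Packer's legacy JSON format; the region, AMI IDs, and playbook path are placeholders, not our actual pipeline):

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "instance_type": "m5.large",
      "source_ami": "ami-0abc1234def567890",
      "ssh_username": "ec2-user",
      "ami_name": "scormcloud-web-{{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "ansible",
      "playbook_file": "./playbooks/web.yml"
    }
  ]
}
```

Packer launches a temporary instance from the source AMI, runs the Ansible playbook against it, snapshots the result as a new timestamped AMI, and tears the instance down; the running fleet is then replaced wholesale rather than patched in place.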
Protecting your privacy is essential to us. Rustici Software will not give, sell, rent or loan any personal information to any third party, unless:
It is necessary to share information in order to investigate, prevent, or take action regarding illegal activities, suspected fraud, situations involving potential threats to the physical safety of any person, violations of Terms of Service of the specific application, or as otherwise required by law.
We are Privacy Shield certified. Click here to view our certification.
When it hits the fan
We’ve built SCORM Cloud from the ground up to be fault tolerant with no single point of failure (props to Amazon for making that easy). Sometimes, though, the unexpected happens. If you want to keep up with our progress during an outage, you can follow us on Twitter @SCORMCloud.
If you have any questions, please contact us at email@example.com