March 8
Hi everyone! Hope you’re doing well 🙂
Made a lot of progress on Deserted Chateau in the last two weeks, details below.
Multiple environments setup
As part of getting the site ready for testing, launch and post-launch development, I have created three distinct environments for Deserted Chateau’s operation within AWS:
- Live environment: all the live code and data.
- Test environment: more akin to a staging environment, but also used for testing things that can only be tested on cloud infrastructure. Low-powered webserver, Redis instance, DB instance, CDN distributions, etc.
- Localhost environment: while a lot of this is what it says - on localhost - there is also some cloud infrastructure behind it: separate CDNs for localhost and test development, separate credentials, etc.
All of these have separate user roles and access credentials for each AWS service, so there has been a huge amount of setup work; most of it is done now. Not only will the separation make testing and development a bit easier, it also helps prevent test credentials from accidentally causing live changes: test users only have access to the test environment and its cloud infrastructure, so any testing I do is safe from accidentally messing with live stuff.
AWS cleanup
As part of the above environment setup, I also cleaned up all of the previous credentials, roles, policies etc that were created during various rounds of testing. There is a *lot* to manage, and old unused functions and policies sitting around in the AWS account make it harder to keep tabs on everything, so keeping only the relevant items in there helps.
Refactoring: credentials, part 1
Previously, credentials for AWS services were stored in PHP configuration files in the codebase (gitignored, so they never made their way into version control). This presented a few problems: every webserver's codebase has to be updated whenever the credentials change, and if for some reason I lost my local copy of the source code, I'd have to recreate the credentials from scratch, since for obvious reasons they aren't in version control.
Instead of this system, I've now moved all of those credentials into AWS Systems Manager Parameter Store. I tried AWS Secrets Manager first, but it's cumbersome and comes with significant costs; Parameter Store is equivalent in security terms anyway by AWS's own admission, and it's a lot easier to use. This way I only need to update values in the Parameter Store when credentials change, instead of changing the source code on each webserver instance.
It also makes deploying the code a lot easier: since the test/live/localhost environments no longer each need different credential values baked into PHP, I don't have to set those manually when deploying a server - it simply fetches the appropriate values from the Parameter Store. All I have to change is one config line in the PHP code that says which environment the server is running in. (I should probably refactor that into a configuration file outside the code, just for organisational reasons, but it's no big deal.)
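For anyone curious, here's roughly what that lookup looks like with the AWS SDK for PHP - just a minimal sketch, with hypothetical parameter paths, region and helper names rather than the real ones:

```php
<?php
// Minimal sketch: fetch environment-scoped credentials from Parameter Store
// with the AWS SDK for PHP v3. Paths, region and names are hypothetical.
require 'vendor/autoload.php';

use Aws\Ssm\SsmClient;

const APP_ENV = 'test'; // the one config line: 'live', 'test' or 'localhost'

$ssm = new SsmClient([
    'region'  => 'eu-west-1', // assumed region
    'version' => 'latest',
]);

function getCredential(SsmClient $ssm, string $name): string
{
    $result = $ssm->getParameter([
        'Name'           => '/deserted-chateau/' . APP_ENV . '/' . $name,
        'WithDecryption' => true, // decrypt SecureString parameters via KMS
    ]);

    return $result['Parameter']['Value'];
}

$dbPassword = getCredential($ssm, 'db_password');
```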
Refactoring: credentials, part 2
Managing credentials on AWS is something of a mammoth task if you want to do it properly; there's an absolute *ton* of different systems and rules. For the sake of long-term maintenance I'm doing it the proper way, so I've been implementing STS in my serverless (Lambda) functions, and possibly on the webservers as well.
STS is basically AWS's version of an OAuth-style system: you "assume" a role and get temporary credentials, instead of using hardcoded credentials or credentials from e.g. the Parameter Store. This means not having to manage any credentials at all in these cases (they expire on their own) and should make both maintenance and security that little bit easier. Getting STS to work with Lambda functions was something of a challenge, but it's working; I just need to do some integration work now.
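Here's a rough sketch of what role assumption looks like with the AWS SDK for PHP (the role ARN and session name are made-up placeholders, not my actual setup):

```php
<?php
// Minimal sketch: assume a role via STS and use the temporary credentials
// with another AWS client. The role ARN and session name are hypothetical.
require 'vendor/autoload.php';

use Aws\Sts\StsClient;
use Aws\S3\S3Client;

$sts = new StsClient(['region' => 'eu-west-1', 'version' => 'latest']);

$assumed = $sts->assumeRole([
    'RoleArn'         => 'arn:aws:iam::123456789012:role/chateau-test-worker',
    'RoleSessionName' => 'webserver-session',
]);

// Temporary credentials: they expire on their own, so nothing to rotate.
$credentials = [
    'key'    => $assumed['Credentials']['AccessKeyId'],
    'secret' => $assumed['Credentials']['SecretAccessKey'],
    'token'  => $assumed['Credentials']['SessionToken'],
];

// Use them with any other AWS client, e.g. S3:
$s3 = new S3Client([
    'region'      => 'eu-west-1',
    'version'     => 'latest',
    'credentials' => $credentials,
]);
```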
The problem here is twofold. Firstly, you can't run a local test environment with STS, at least not easily; I'll need to investigate exactly how to do that. Secondly, Amazon Lightsail is technically separate from a lot of other AWS services - the way STS and credentials are set up on AWS makes Lightsail a little harder to configure, and it might involve using some credentials no matter what I do. Either way, it's something to investigate and finalise, then move on.
Refactoring: configurations
Refactoring the configuration items was driven by similar reasons to the credentials, but handled differently: instead of the Parameter Store, configurations are now stored in the database for the given environment and loaded by the webservers - effectively moving the configuration from PHP files into a database table. This makes it easier to change the configuration for a given environment, since it means changing the database values once rather than going into every webserver and changing each one.
It's also easier to keep the live/localhost/test configurations intact, instead of e.g. maintaining separate PHP files for each, which would be cumbersome and inefficient.
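As a rough illustration, loading that table at startup boils down to something like this sketch (the table and column names are hypothetical, not the actual schema):

```php
<?php
// Minimal sketch: load this environment's configuration from a database
// table at startup. The table and column names are hypothetical.
$pdo = new PDO('mysql:host=localhost;dbname=chateau', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Each environment's database carries its own copy of the table,
// so changing a value once changes it for every webserver.
$stmt = $pdo->query('SELECT name, value FROM config');
$config = $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // name => value map

echo $config['uploads_enabled'] ?? '(not set)';
```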
Serverless code: database calls
I've implemented a serverless function for running database queries that are likely to take more than a few seconds - e.g. inserting notifications for a new artwork. Getting this to work with half-decent error handling turned out to be a major pain: some AWS services are accessed from within a VPC and some are not, so it took a bit of research to be able to reach them all.
As a simple explanation: things like RDS database servers are not accessible from the public internet - you have to be inside your own VPC to reach them. Other services like DynamoDB (a NoSQL database) are accessed via public HTTPS endpoints, which a function locked inside a VPC can't reach without extra plumbing... leading to a kind of catch-22. There was a solution - VPC endpoints - but it took a bit of time to find it.
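For context, here's a minimal sketch of how the webserver side can hand the slow work off to the serverless function asynchronously - the function name and payload shape are hypothetical, not the real ones:

```php
<?php
// Minimal sketch: the webserver hands a slow database job to a Lambda
// function asynchronously, so the web request doesn't wait on it.
// The function name and payload shape are hypothetical.
require 'vendor/autoload.php';

use Aws\Lambda\LambdaClient;

$lambda = new LambdaClient(['region' => 'eu-west-1', 'version' => 'latest']);

$lambda->invoke([
    'FunctionName'   => 'chateau-db-worker',
    'InvocationType' => 'Event', // async invoke: fire and forget
    'Payload'        => json_encode([
        'task'       => 'insert_artwork_notifications',
        'artwork_id' => 12345,
    ]),
]);
```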
Database improvements: stored procedures
I've converted a lot of the queries in the back-end code into stored procedures instead of direct executions within the code. For one, it should make managing the queries easier - queries embedded in PHP strings aren't the easiest to read - and for operations that need multiple queries, a stored procedure cuts down on round-trip overhead and is more efficient.
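As a sketch of the pattern (with a hypothetical procedure standing in for the real ones), the PHP side then reduces to a single CALL:

```php
<?php
// Minimal sketch: call a stored procedure from PHP via PDO instead of
// embedding multi-statement SQL in strings. The procedure is hypothetical.
//
// Defined once in the database, e.g.:
//   CREATE PROCEDURE insert_artwork_notifications(IN p_artwork_id INT)
//   BEGIN
//     INSERT INTO notifications (user_id, artwork_id)
//     SELECT follower_id, p_artwork_id FROM followers
//     WHERE artist_id = (SELECT artist_id FROM artworks WHERE id = p_artwork_id);
//   END
$pdo = new PDO('mysql:host=localhost;dbname=chateau', 'user', 'password');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('CALL insert_artwork_notifications(:artwork_id)');
$stmt->execute(['artwork_id' => 12345]);
```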
Database improvements: RDS changes
While I was testing, I found an interesting optimisation on AWS RDS: when you create a database, you have a choice between two general-purpose SSD storage types, gp2 and gp3. After some research I found that gp3 is basically way better in terms of max throughput and cost efficiency, even though gp2 is for some reason still selected as the default. gp2 scales its baseline performance with volume size (3 IOPS per GB), while gp3 gives you a 3,000 IOPS / 125 MB/s baseline regardless of size - so on a small volume you can get something like 20x more throughput for the same cost.