Solutions

A brief overview of some of the solutions I came up with for challenges that appeared over the course of my career.

| Challenge | Solution |
| --- | --- |
| Takeover of a weak and undocumented K8s infrastructure (1.0) | Re-created the YAML manifests with updated syntax for K8s 1.13 and deployed a managed control plane (via AWS EKS) |
| Adding a daily cron job | Used AWS Lambda functions with CloudWatch Events triggers, which allowed the cron job to run without keeping a web server running all the time |
| Too many load balancers on AWS were increasing the costs | Installed an NGINX-based Ingress router on K8s, which ended up removing 90% of the ELBs |
| Extracting, parsing and merging meaningful NGINX logs from a farm of locked-down, secured CDN machines (partner) | Created a lightweight Rsyslog-client Docker container with a volume mount of the nginx logs path (the only access allowed, for security reasons), which forwarded the logs from all farm machines to a centralized Rsyslog server deployed on our infrastructure (over TCP to prevent data loss) |
| Some clients were abusing endpoints | Installed connection-limiter and blacklist middleware on the Node.js Express-based APIs, which returned 429 status codes to those clients |
| Kubernetes liveness and health probes were failing on containers that did not run a consumable web service | Created a lightweight Node.js liveness probe (listening on any endpoint and returning OS stats), managed by a supervisor-based container that also controlled the main process |
| Up to 2 million messages per day were causing trouble on SQS | Deployed and configured a RabbitMQ deployment with 4GB of RAM via AMQP Cloud, which was able to process up to 8 million messages per day |
| An African partner wanted the mobile betting web app to work on Nokias and similar Opera Mini-based smartphones in low-speed locations | Created a small, lightweight JavaScript framework (15 KB) that provided basic DOM management, event handling for touch events and XMLHttpRequest support |
| A large casino app was becoming a bit legacy (based on MarionetteJS and Backbone) and not keeping up with modern JavaScript frameworks | Due to the size of the codebase, a rewrite was ruled out, so we added RxJS (to allow event buses and observables) and wrapped most of the components in ES6 to make the outer logic future-proof |
| A 250GB database was filling up in 15 days | Split the collections on a daily basis (e.g. col-010119, col-020119), which allowed us to run a scheduled cleanup job every day to dump, upload to S3 and drop collections older than 15 days, so there was no need to pay for a managed database larger than 250GB |
| 20GB collections were a bit slow to query | Split collections by day, created efficient indexes on key properties, and replaced find()-based queries with optimized aggregations |
| No unit tests on some of the backend apps | Created a TypeScript boilerplate and converted most of the backend code of those apps to TypeScript; in doing so, most of the errors were automatically identified and then fixed |
| No docs on some of the apps, although they were required for integration | Installed automated Swagger documentation generation for the Express.js-based apps and forked Swagger-UI to allow our partners to study the endpoints and identify payloads and security requirements |
| Simple authentication required | Prepared and installed an OAuth 2.0-based authentication server (on-premises) |
| Lightweight authentication required for partner access | Created a simple Web Tokens-based authentication server and client library (via Bearer token) |
| Client code base was becoming a bit large (more than the recommended 300 KB per asset) | Split the application into asset parts (sportsbook.min.xx.js, casino.min.yy.js, account.min.zz.js, jackpot.min.zz.js), which kept each asset under 300 KB and allowed the client browser to load one asset at a time depending on which part of the website the user was visiting |
| Required a screenshot tool to export high-quality banners from HTML | Created a custom deployment of Google's Puppeteer, which allowed HQ screenshots to be taken in under a second (a single container could handle at least 10 requests for images) |
| Required a fractional-level asset load balancer | Created a custom Node.js load balancer based on numerical analysis, using rejection sampling, to provide a JavaScript client-side percentage-based load balancer and reverse proxy (to embed assets) |
| Cost reduction | Reduced AWS infrastructure costs (using serverless functions, Ingress routers, fewer EC2 machines, fewer ELBs, more efficient auto scaling groups, vertical scaling and horizontal pod scaling) |
| Mid-term cost projection and estimation | Calculated total cost per user for 3, 6 and 12 months (including database, queues, infrastructure, load balancers, DNS) |
| Telemetry and microservices issue tracking | Used a correlation ID to trace microservice calls via LogEntries logs |
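
To illustrate the connection limiter and blacklist middleware mentioned above, here is a minimal sketch of the idea, not the code that ran in production. It assumes a fixed time window per client IP; `createLimiter`, its options and the injectable `now` clock are hypothetical names chosen for this example.

```javascript
// Fixed-window request limiter plus blocklist, written as Express-style
// middleware (req, res, next). All names and defaults are hypothetical.
function createLimiter({
  windowMs = 60000,
  maxRequests = 100,
  blocklist = [],
  now = Date.now,
} = {}) {
  const blocked = new Set(blocklist);
  // client key -> { windowStart, count }; never pruned in this sketch
  const hits = new Map();

  return function limiter(req, res, next) {
    const key = req.ip;
    const t = now();

    // Start a new window for unknown clients or when the old one expired.
    let entry = hits.get(key);
    if (!entry || t - entry.windowStart >= windowMs) {
      entry = { windowStart: t, count: 0 };
      hits.set(key, entry);
    }
    entry.count += 1;

    // Blocked clients and clients over the limit both receive a 429.
    if (blocked.has(key) || entry.count > maxRequests) {
      res.statusCode = 429;
      res.end("Too Many Requests");
      return;
    }
    next();
  };
}
```

With Express this would be mounted as `app.use(createLimiter({ maxRequests: 100 }))`. A real deployment with several API instances would need a shared store instead of the in-memory `Map`.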
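
The Node.js liveness probe mentioned above could look roughly like the sketch below. It answers every path with a 200 and a few OS statistics, so Kubernetes has something to ping even when the main process exposes no web service. The function names and the choice of statistics are assumptions made for this example.

```javascript
const http = require("http");
const os = require("os");

// Collects a few OS-level statistics to return to the probe caller.
function collectStats() {
  return {
    uptimeSeconds: os.uptime(),
    loadAverage: os.loadavg(),
    freeMemoryBytes: os.freemem(),
    totalMemoryBytes: os.totalmem(),
  };
}

// Builds (but does not start) an HTTP server that replies on any endpoint.
function createProbeServer() {
  return http.createServer((req, res) => {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(collectStats()));
  });
}
```

Starting it is a matter of calling `createProbeServer().listen(8080)`, where the port is an arbitrary choice that has to match the probe definition in the pod spec.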
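
The daily collection split and the 15-day cleanup mentioned above rely on being able to map a date to a collection name and back. The sketch below shows one way to do that. It assumes that names such as col-010119 follow a DDMMYY pattern in UTC and that all years are in the 2000s; the helper names are hypothetical, and the dump, S3 upload and drop steps are left out.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Builds the daily collection name, assuming a col-DDMMYY pattern (UTC).
function collectionNameFor(date) {
  const dd = String(date.getUTCDate()).padStart(2, "0");
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const yy = String(date.getUTCFullYear() % 100).padStart(2, "0");
  return `col-${dd}${mm}${yy}`;
}

// Parses col-DDMMYY back into a UTC date; returns null for other names.
function dateFromCollectionName(name) {
  const match = /^col-(\d{2})(\d{2})(\d{2})$/.exec(name);
  if (!match) return null;
  const [, dd, mm, yy] = match;
  // Assumption: two-digit years belong to the 2000s.
  return new Date(Date.UTC(2000 + Number(yy), Number(mm) - 1, Number(dd)));
}

// Returns the collections that are more than retentionDays older than today.
function expiredCollections(names, today, retentionDays = 15) {
  const todayUtc = Date.UTC(
    today.getUTCFullYear(),
    today.getUTCMonth(),
    today.getUTCDate()
  );
  return names.filter((name) => {
    const date = dateFromCollectionName(name);
    if (date === null) return false; // ignore unrelated collections
    return (todayUtc - date.getTime()) / DAY_MS > retentionDays;
  });
}
```

A scheduled job would call `expiredCollections` with the current collection list, then dump, upload and drop each returned name.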
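
For the percentage-based load balancer mentioned above, rejection sampling can be sketched as follows: pick a candidate target uniformly at random, accept it with a probability proportional to its weight, and retry otherwise. This is an illustration of the technique rather than the original implementation; `pickTarget`, the `weight` field and the injectable `random` function are assumptions.

```javascript
// Picks one target so that, over many calls, each target is chosen in
// proportion to its weight. Uses rejection sampling: a uniformly chosen
// candidate is accepted with probability weight / maxWeight.
function pickTarget(targets, random = Math.random) {
  const maxWeight = Math.max(...targets.map((t) => t.weight));
  if (!(maxWeight > 0)) {
    throw new Error("at least one target needs a positive weight");
  }
  for (;;) {
    // Proposal step: every target is equally likely to be proposed.
    const candidate = targets[Math.floor(random() * targets.length)];
    // Acceptance step: heavier targets are accepted more often.
    if (random() * maxWeight < candidate.weight) {
      return candidate;
    }
  }
}
```

For example, `pickTarget([{ name: "cdn-a", weight: 90 }, { name: "cdn-b", weight: 10 }])` would route about nine out of ten asset requests to the hypothetical `cdn-a`. The weights do not need to sum to 100, which makes fractional splits straightforward.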
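
The correlation ID approach mentioned above can be sketched as Express-style middleware that reuses an incoming ID or creates a new one, and then attaches it to every log line and downstream call. The header name `x-correlation-id` and the helper names are assumptions for this example; the source does not say which header or log format was used.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical header name; the original services may have used another one.
const HEADER = "x-correlation-id";

// Reuses the caller's correlation ID, or generates one at the entry point.
function correlationId(req, res, next) {
  const incoming = req.headers[HEADER];
  const id =
    typeof incoming === "string" && incoming.length > 0
      ? incoming
      : randomUUID();
  req.correlationId = id;
  res.setHeader(HEADER, id);
  next();
}

// Formats a log line that carries the ID, so logs can be searched by it.
function logLine(req, message) {
  return JSON.stringify({ correlationId: req.correlationId, message });
}

// Headers to forward when this service calls another microservice.
function outgoingHeaders(req) {
  return { [HEADER]: req.correlationId };
}
```

Searching the centralized logs for a single ID then returns the entries from every service that handled the same request.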
--