My Learnings from Five Years at Freeletics
It has already been more than a month since I left my last job at Freeletics. I am still enjoying the last of my days off before I start my new job very soon. Since leaving, I have been planning to write about my experience and learnings at this unique company, and finally the time has come.
A little bit about Freeletics
Freeletics started in 2013 as a bodyweight fitness app and expanded into Gym, Running, Nutrition and Mindset in the following years.
Freeletics has been a unique place, with amazing people and a great atmosphere. Of course, that took a big hit with Corona and working from home. I am saying this because it was always exciting for us to meet at the office, either to train before or after work or, of course, to have lunch with colleagues. I can share these two videos with you: the first tells you about the culture, and the second is an interview with me.
I joined Freeletics back in March 2016 as a Backend Engineer. The goal was to work on the Backend, which is written in Ruby on Rails, the Web Framework I am most familiar with, using the language I love the most: Ruby. Even though some other languages are working hard to win this title, Ruby is still number 1 at the time of writing.
In this post I want to focus on the technical side of my job. I joined at a time when our Devops team had 2 members. Only 1 of them was a full-time engineer, and he was planning to leave in 2 months. With the lack of OPS resources, I showed my interest in learning, and I shared what I learned about AWS / Kubernetes / Docker with the rest of the Engineers, to unblock them from time to time when they had issues.
I didn’t do this with the intent to move to Devops, or to get promoted to Senior Backend Engineer later. It was a genuine interest in learning about Infrastructure, also because I was dealing with it myself to host my side project ChessDuo. But yeah, it did indeed help me get promoted to Senior Backend Engineer later. I am not complaining.
My manager(s) noticed this, and offered me the chance to work for a few months as a part-time and then full-time Devops Engineer, to unblock the team. I accepted the challenge. It was not easy for me, as it was the first time in my life that I worked at a company with a dedicated Devops team, and I was supposed to be part of it. In fact, it was the first time in my life that I didn’t do Full-Stack Web Development. It was a temporary change, without even an official change in title.
The company managed to grow the OPS team again, and I moved back “home” to Ruby on Rails programming. I worked for almost 2 years on some interesting features and different projects. But the same thing happened again :D The OPS team was leaving. So we decided to make it official, and I moved back to the OPS team full-time. My title was changed to Senior Technical Operations Engineer, and I kept doing this until I left at the end of February 2021.
In this post, I want to share some of the interesting stuff I learned while doing Devops at Freeletics, hoping it can be of benefit to someone, or some manager, some day:
My Personal Learnings
1. Engineering Time is also a variable in the Infrastructure Costs Equation.
When you have two options for hosting some service, you need to consider the Engineers’ time as well. Since salaries vary between countries, this can make some questions harder to answer, such as “Is it better to use RDS or to run my own EC2 instance with PostgreSQL on it?”. Hosting the database on an EC2 instance can be more exciting for Engineers. It is challenging and, on the bill at least, cheaper than pressing a button to start an RDS Instance. But did you consider the Engineers’ time before considering giving up on the Managed Services? Did you consider what can go wrong?
At Freeletics, we made the right choice from the beginning. Even before I joined. We used RDS and moved to RDS Aurora later. No way our small OPS team could have managed this.
We live in exciting times. You press a button and you get a managed service that does what you want. Don’t invest so much in reinventing the wheel. It might be tempting to build stuff yourself, but your time is precious. Embrace the cloud. It can save you time and also money, and in the end time is money.
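To make the equation from the start of this section a bit more concrete, here is a minimal sketch in Python. Every number in it (the bills, the hours, the hourly rate) is made up for illustration and is not Freeletics data; the names HostingOption and monthly_total are my own as well. The only point is that engineering hours belong in the total next to the cloud bill.

```python
from dataclasses import dataclass


@dataclass
class HostingOption:
    name: str
    monthly_bill: float     # what the cloud provider charges per month
    engineer_hours: float   # hours per month spent operating the service


def monthly_total(option: HostingOption, hourly_rate: float) -> float:
    """Cloud bill plus the cost of the engineering time spent on operations."""
    return option.monthly_bill + option.engineer_hours * hourly_rate


# Hypothetical numbers, chosen only to show the shape of the comparison.
managed = HostingOption("RDS", monthly_bill=600.0, engineer_hours=2.0)
self_hosted = HostingOption("PostgreSQL on EC2", monthly_bill=350.0, engineer_hours=20.0)
hourly_rate = 60.0  # assumed cost of one engineering hour

for option in (managed, self_hosted):
    print(f"{option.name}: {monthly_total(option, hourly_rate):.2f} per month")
```

With a different hourly rate or fewer operating hours the result can flip, which is exactly why the answer depends on where your team is and how much time the service really takes.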
2. But sometimes, managed services won’t give you what you want.
That’s the other side of the same coin. It is up to you to decide.
Freeletics had been running the Production Backend on Kubernetes since 2015, even before I joined. I don’t know what managed Kubernetes offerings the cloud providers had back then, but we didn’t move to any of them. AWS, Google, DigitalOcean and others all offer Managed Kubernetes Clusters these days. This of course might be a tempting and good choice if you don’t have the time and resources to provision the clusters yourself. See (1). But we decided to stick with Kops, an Open Source tool to manage and provision Kubernetes Clusters. It mainly supports AWS; other cloud providers are supported too, but not in a stable version. We chose it simply because with Kops we could always stay on the most recent version of Kubernetes available. For us, this was part of the equation.
Version support might not be a problem for Databases (MySQL, PostgreSQL). That’s why going for AWS RDS still sounds like a great choice for them: these databases are supported for a longer time. Kubernetes, on the other hand, moves fast and ends support for its versions really quickly. Kubernetes is a different ecosystem these days. There are a lot of breaking changes from time to time, and we decided to ride the front of the wave. So Kops was a more tempting choice for us than a managed Kubernetes Cluster. It took more of our time, of course. But we managed.
I am not saying that managing your own Kubernetes Cluster is always better than a hosted, managed Kubernetes Cluster. That’s up to you to decide. I just wanted to explain why we decided the way we did.
3. Everything fails, all the time ~ Werner Vogels. Amazon CTO
I think we have experienced a failure in every part of the system: Load Balancer, DNS, Application Instances (Kubernetes) and Databases. You always need to be ready for it, with instructions that allow Engineers to react to a failure at any time.
Try to invest in Monitoring and Alerting early on. The data that you collect from Monitoring can be priceless in case of a failure, even if you don’t need it at the time you decide to set up the Monitoring System.
Freeletics is a fitness app. As for every fitness company in the world, the new year is a challenging time because of the increased traffic. We experience a big increase in traffic and sign-ups, and that’s no surprise: the New Year’s Resolution wheel is rolling. But this traffic is normal. People sign up, but do not necessarily use the product all the time. And we are always ready for it.
March 2020 was a different story. We were busy with an internal company hackathon, as we were transitioning to working from home, when we started to notice an unusual increase in traffic. But this time, people were signing up and actually using the product. We had never seen such an amount of traffic on the system, at least during my 5 years at the company. It was insane.
The monitoring systems could not have been more useful back then. With the data we collected from New Relic and our internal Prometheus/Grafana, we were able to optimize the system and handle the increased traffic. But of course, we struggled a few times, because we were not 100% ready. This period deserves its own post to describe what we did. I will leave it to some other Freeletics Engineer to write about it :) It was a tough period for us. We learned a lot, and we were happy to serve users who couldn’t train at an actual Gym anymore.
4. Infrastructure as Code is Cool
As I was learning and doing Devops at Freeletics, I learned so many things for the first time in the context of a Production System: Kubernetes / Docker / AWS Aurora / Helm / Kops. I was even giving some Docker Workshops for the company’s engineers and some external guests in Munich. It was fun. But if you ask me which tool I found the most exciting, it was definitely Terraform. And it was not so much the tool as the concept of Infrastructure as Code itself.
When I was a software engineer, I used to tell the computer what to do, and how to do it. Then when I started the transition to Devops, I missed writing code in the beginning. People would approach me with a request for some resources, I would press a button on the AWS Console, and they would get the resource. That was until I was introduced to Terraform to manage the infrastructure as code.
You don’t tell Terraform how to create stuff. You describe what you want in code, and it handles this. That was cool.
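The difference between “how” and “what” can be shown with a toy sketch in Python. This is not how Terraform is implemented and it does not use any Terraform API; the plan function and the resource names are made up for illustration. It only shows the declarative idea: you write down the desired state, and a tool compares it with the actual state and works out the actions.

```python
def plan(desired: dict, actual: dict) -> list:
    """Compare desired and actual state and return the actions needed."""
    actions = []
    for name, config in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != config:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical resources: what we want versus what currently exists.
desired = {
    "repo-api": {"private": True},
    "repo-docs": {"private": False},
}
actual = {
    "repo-api": {"private": False},
    "repo-old": {"private": True},
}

print(plan(desired, actual))
# [('update', 'repo-api'), ('create', 'repo-docs'), ('delete', 'repo-old')]
```

Nobody wrote “create this, then delete that”. The actions fall out of the difference between the two descriptions, and when the two already match, the plan is empty.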
But here is the catch: while Kubernetes, Docker and Helm are basically Devops tools, Terraform is not necessarily one. You can use it to manage different kinds of resources, not only infrastructure. You can use it to manage your GitHub Repos, for example. Basically, it can talk to any API that handles resources. Not every software engineer should learn Kubernetes or Docker. But every software engineer should learn Terraform. This might sound like a bold statement. My handle on Twitter is @OmarQunsul, let’s argue there :P
5. Migrating Databases is a Big Challenge
Back in 2018 we were hosting one of our biggest Backend Databases on RDS with PostgreSQL 9.3. Support for this version was ending soon, so we decided to move our database to Aurora RDS, running PostgreSQL 9.6. Easier said than done. Our first choice was AWS DMS. But it didn’t work for us, simply because we were using JSON columns in our database. The service might have been updated since then to handle this, but at that time we needed a solution. Bucardo was our next choice, and it didn’t disappoint.
Migrating databases requires a lot of coordination, planning and testing. You need to make sure everything goes right before you flip the switch and move the traffic to the new database.
One of the things that we didn’t have the chance to test was AWS RDS Proxy. It was introduced recently, and it could have been of great benefit for our database migration projects.
This deserves a longer post as well. If someone from Freeletics is reading this, please let me know if I am allowed to write about it :D Assuming I haven’t already told any secrets beyond what is written in the Engineering Blog, which is totally worth reading.
6. User management can be a pain in the a$$, try to make it simpler
Our strategy was always the same: we don’t give permissions to users directly. We assign users to groups, and we manage permissions on the groups. That simply made our life much easier. That was it!
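As a minimal sketch of that strategy in Python: the group names, user names and permission strings below are invented for illustration and are not our real setup. Permissions live only on groups, and a user’s permissions are simply the union of the permissions of the groups they belong to.

```python
# Permissions are attached to groups only (hypothetical names).
GROUP_PERMISSIONS = {
    "backend": {"read_logs", "deploy_staging"},
    "ops": {"read_logs", "deploy_staging", "deploy_production"},
}

# Users are only ever assigned to groups, never to permissions.
USER_GROUPS = {
    "alice": {"backend"},
    "bob": {"backend", "ops"},
}


def permissions_for(user: str) -> set:
    """Union of the permissions of every group the user belongs to."""
    permissions = set()
    for group in USER_GROUPS.get(user, set()):
        permissions |= GROUP_PERMISSIONS.get(group, set())
    return permissions


def can(user: str, permission: str) -> bool:
    return permission in permissions_for(user)


print(can("alice", "deploy_production"))  # False
print(can("bob", "deploy_production"))    # True
```

When someone changes teams, you change one group membership instead of hunting down individual permissions.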
We didn’t really invest in making this better, and I wish we had done so earlier.
I am aware of some solutions, such as HashiCorp Boundary, and I am planning to learn them on my own later.
Freeletics was a unique era of my life: working on a unique product, with an amazing team. 17th February was my last day at Freeletics, and it was a really hard day for me, saying goodbye to everyone, unfortunately remotely.
I hope you enjoyed this post.
I am looking forward to starting a new chapter of my life.