If communities are the heart of a project, contributors are the lifeblood that keeps open source development pumping away. This time we introduce Aleksandr Sorokoumov. If his volunteer work on the Apache Cassandra project inspires you to get involved, try reading our guide ‘Contributing to Cassandra’.
About Our Contributor
Based in Germany, Aleksandr Sorokoumov is a software engineer at Confluent, where he works on ksqlDB, a database purpose-built for stream processing applications. Previously, he was employed at DataStax working on DSE storage engine and AstraDB, a cloud-native database built on top of Apache Cassandra. As well as contributing code to the project, he’s also begun contributing technical write-ups to share some of his hard-fought-for experience with users and contributors.
Contributor Questions
Question: What are you currently working on, or have worked on, in the past for the Apache Cassandra project?
Answer: I am currently working on an article about the internals of CommitLog. This is a component in Apache Cassandra responsible for durability. I worked on several projects at my previous job that required an in-depth understanding of how CommitLog works. It took me some time to find out the details as I had to dig into old JIRA issues and ask experts clarifying questions. I believe this information will benefit DBAs and new contributors; therefore, I felt it would be good to share it.
Previously, I worked on improving serialization efficiency (CASSANDRA-15215). VInt (or Varints) is an encoding used to serialize integers, so that smaller numbers occupy less space. With my patch, the serialization throughput increased by up to 30%.
Question: What’s been the most rewarding aspect of being part of the Apache Cassandra community?
Answer: There are two important aspects I would like to mention. First of all, it is interaction with the community members. People working on Apache Cassandra are among the most supportive and experienced professionals I have had a chance to work with. Collaborating with them creates a ton of learning opportunities. The second aspect is the impact of the work you do. Apache Cassandra is a database running at a massive scale that powers the infrastructure of many well-known companies. It just feels great to know that my work is deployed at that scale.
Question: For someone reading this and can help, what types of contributions or support do you think the Cassandra community needs, or is there anything they can do to help you?
Answer: In my opinion, one of the areas the Cassandra community will always appreciate help is work on quality. Cassandra 4.0 was a great breakthrough for the project in paying back the technical debt.
With many upcoming exciting features in-flight, it takes constant effort from everyone to maintain the high bar. In my opinion, fixing a failing test is one of the best ways to get started. Such issues are often well-scoped, and the community always welcomes patches in this area.
Note: If you’d like to try fixing some failing tests, you can find unassigned tickets on this kanban board.
Question: What areas of interest and fields are you passionate about in your career?
Answer: I am passionate about distributed databases and streaming systems. One of the aspects I love about these fields is the bi-directional feedback loop between industry and academia. On the one hand, concepts behind systems like Apache Cassandra were first described in research publications. For example, Transient Replication implements an algorithm published in a paper in 1986. How cool is that? On the other hand, the communities working on production systems contribute back new theoretical approaches. A recent example is Accord, which is a leaderless protocol for distributed transactions.
Successful projects in these areas often live for a long time and have a massive impact on the entire industry. Apache Cassandra is 13 years old, an impressive age for a piece of software, yet we have an energetic community and so many exciting upcoming features.
Question: How were you introduced to open source software development, and what was your first contribution to an open source project?
Answer: I was fortunate to participate in Google Summer of Code in 2014 during my Master’s studies. I worked on Incanter, a data-science library for Clojure and its integration with core.matrix - a pluggable backend for array programming.
Question: What advice would you give to someone getting started with open source projects?
Answer: Don’t be shy to reach out to the community. I know from personal experience that it might feel scary at first to ask questions in public mediums, such as mailing lists or Slack. However, there are many fantastic open source communities, such as Apache Cassandra, where people will be happy to help you find a starting issue, answer questions, and review patches.
Question: What do you like to do with your spare time when you’re not volunteering on the project?
Answer: I like to learn new skills in my free time. I am currently learning how to play an electric guitar.
How to contribute
As well as the diving into failing tests, as mentioned above, a good starting point for anyone wanting to help with code is our unassigned “Starter Tickets” for versions 4.0.x and 4.1.x, which you’ll find listed on this kanban board. Just remember to assign yourself to the ticket and acknowledge the status, such as ‘Work in Progress’ and ‘Needs Committer/Patch Available’ when you submit your patch.
You can also introduce yourself to active developers on the ASF Slack in the #cassandra-dev Slack channel. If you use @cassandra_mentors
, that will put you in touch with our Cassandra mentors.