Over the past few years, I've organized dozens of events to help developers learn about technologies, network with other professionals, and get excited about the latest developments in the field. Along the way I came to a clear conclusion: communities are the backbone of our industry, and working with them is what I ought to be doing. Once I had a clear idea of my professional mission, I wanted to find a company that not only fit my skills and aspirations but would also be thrilled about building a community. Today, I am joining Bytewax as Director of DevRel and Operations. Here are a few reasons why I am enthusiastic about it.
All things Data Streaming
So much has been said about the importance of data and how the ability to quickly extract valuable insights from it can give businesses a significant competitive advantage. To achieve that, companies are increasingly adopting stream processing architectures to power data-driven services that can respond to customer demand in real time. If you are skeptical about data streaming, listen to the insightful Current22 panel "If Streaming Is the Answer, Why Are We Still Doing Batch?". It covers the reasoning behind batches and streams and when to use which. The topic is so hot that I don't know how I'd deal with the FOMO if I never had a chance to work with data-streaming apps!
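To make the batch-versus-stream distinction concrete, here is a minimal sketch in plain Python. It is only an illustration: the function names and sample readings are made up for this example, and it uses nothing beyond the standard library (no streaming framework at all). The batch version needs the whole dataset before it can answer, while the streaming version produces an updated answer as each value arrives.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: wait for the full dataset, then compute once."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Streaming style: emit an updated average as each reading arrives."""
    total = 0.0
    count = 0
    for value in readings:
        total += value
        count += 1
        yield total / count


# Hypothetical sample data, standing in for events arriving over time.
readings = [10.0, 20.0, 30.0]

batch_result = batch_average(readings)                    # one answer, at the end
stream_results = list(streaming_average(iter(readings)))  # one answer per event
```

Both approaches end at the same final value. The difference is that the streaming version has something useful to say after the very first event, which is the property real-time services depend on.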
Python era
We are living in "The Era of Python": GitHub's Octoverse, the TIOBE Programming Community Index, and the Stack Overflow survey all show growing interest in Python, in large part due to its versatility in everything from development to education to machine learning and data science. The number of users on GitHub is expected to reach 100 million soon. The majority of them will be new to the field, and Python is a top choice for newcomers, especially for Data Scientists and Data Engineers. Frankly, I've been curious about Python for quite some time and am thrilled to be a part of Bytewax for that reason, too.
Data and Python
The Python community has created a wealth of libraries that make it an ideal language for working with data. These libraries, such as NumPy, pandas, scikit-learn, and many more, provide a range of tools for collecting, cleaning, and analyzing data, as well as for building machine learning models and visualizing results. With such a rich ecosystem, all of these tasks can be done quickly and easily in Python.
The current state of Data Streaming
Widely used stream processing frameworks and platforms, such as Kafka, Pulsar, Flink, and Spark, are JVM-based and were designed to be a part of a homogeneous and monolithic development environment. Some tasks, such as deployment, setting up data sharing, or rebalancing streams after a configuration update, require significant knowledge of the Java ecosystem. Some stream processing platforms have added limited support for Python, but JVM languages are still the only first-class citizens. This can make it challenging for companies whose engineers are not familiar with the JVM to adopt these platforms effectively. There are also performance and operational costs that can limit how much you get out of JVM-based stream processing: fat JARs, cold-start times, you name it.
Data Streaming 💛 Python = Bytewax
The absence of Python-native tooling makes real-time data stream processing difficult, costly, and error-prone. At Bytewax, we are building a framework that enables Data Scientists and Data Engineers to quickly and effectively build applications with data streams. You'll be able to navigate data streams and create the customizations you need without hiring highly skilled architects just to support streaming platforms.
Focusing on your experience
As I said at the beginning, I am convinced that communities are the backbone of our industry. Python's rich ecosystem shows how incredible its community is and how much people care about sharing. Working with such communities is a privilege and always rewarding. I aim to give the community's voice a strong presence within Bytewax and ensure your needs are heard and addressed. Bytewax is an open-source project, and we are dedicated to making it accessible to everyone. We are just starting out on our journey, and we would love for you to join us in building the next generation of Python-native data streaming frameworks. If you have feedback, ideas, or suggestions, or if you would like to become a contributor, please don't hesitate to reach out. We will engage with you online and offline through conferences, meetups, user groups, our Slack, our GitHub, and social media (Twitter, LinkedIn).
On a personal note, I look forward to connecting with you over coffee and learning about a feature you wish existed or a data engineering project of yours that would benefit from partnering with us! Don't hesitate to reach out to me directly on Twitter @oli_kitty or on LinkedIn, and see you soon in the community!
I'm thrilled I get to do this work, and delighted to get started doing it with Bytewax. Let's go!