
Pablo Galindo Salgado is one of five members of the Python Steering Council, which plays a key role in the development of the Python programming language. He served as the release manager for the recently released Python 3.11, and he’s part of the team spearheading the Faster CPython project.

If that weren’t enough, Pablo Galindo Salgado is also a software engineer on Bloomberg’s Python Infrastructure & Tooling team, where he helps drive the use of Python by thousands of the company’s software engineers. A few years ago, we spoke with him about the evolution of Python; in this new interview, he breaks down everything from Python’s adoption by data scientists to Bloomberg’s recent collaboration with Microsoft to boost the language’s speed even further.

A few years ago, it seemed like Python was supplanting R as the programming language of choice for some data-science and academic shops. Python is extremely versatile, of course, while R is primarily focused on data analytics/statistical analysis. How is Python continuing to evolve in the context of data science?

One of the things that makes Python exceptional for data science and statistical analysis is its rich and vibrant ecosystem of open source libraries, such as NumPy, SciPy, and Pandas. Not only do these libraries keep adding new features and capabilities, but new, more specialized libraries also appear every day to solve specific problems that arise as the technology evolves. One example of this is the vast ecosystem of machine learning libraries that have been developed (on top of the previous data libraries) in the past few years.

From the perspective of the Python development team, we are always working to help library developers better integrate with the interpreter and to provide new language features for many different tasks, including data science. One example of this is the matrix multiplication operator (@), which we added in Python 3.5. This operator allows users to perform matrix multiplication using the @ symbol, making it easier to write concise and readable code for linear algebra tasks.
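As a rough illustration of the @ operator mentioned above, here is a minimal sketch in plain Python. The `Matrix` class is hypothetical and exists only for this example; in practice, libraries such as NumPy provide their own array types that support @. The sketch assumes only that Python calls an object's `__matmul__` method when it evaluates `a @ b`.

```python
class Matrix:
    """Tiny illustrative matrix type (hypothetical, not from any library)."""

    def __init__(self, rows):
        # Store a copy of the rows as a list of lists.
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Python calls this method to evaluate `self @ other`.
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("inner dimensions must match")
        # Transpose `other` so each of its columns can be paired with a row.
        columns = list(zip(*other.rows))
        return Matrix(
            [
                [sum(x * y for x, y in zip(row, col)) for col in columns]
                for row in self.rows
            ]
        )

    def __repr__(self):
        return f"Matrix({self.rows})"


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])

# Reads like the math, instead of a nested call such as multiply(a, b).
product = a @ b
print(product)  # Matrix([[19, 22], [43, 50]])
```

The benefit shows most in chained expressions: something like `a @ b @ c` tends to be easier to read than the equivalent nested function calls.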

As new challenges arise, the community always collaborates to converge organically towards a solution, be it by creating new libraries, improving existing ones or, sometimes, by changing the language.

How is Python currently impacting the finance realm? It’s used extensively at Bloomberg, but on an industry level, how is it changing how apps/services are developed?

The finance realm has always been intimately connected with Python’s data science use cases. Historically, an overwhelming majority of Python usage in finance has been for data science and data analysis tasks. However, over the past five years or so, we have seen many different financial companies (including Bloomberg, as well as other companies like hedge funds and banks) using Python for many different tasks that were previously only being developed using other languages.

These tasks include web services, database access layers, batch job processing, task scheduling, and many others related to risk management, portfolio optimization, and data processing for many different kinds of financial instruments. Another area where Python has gained a lot of traction is the building of in-house development tools.

Furthermore, Python has a large and active community of users, which means there are many open source libraries and tools available for financial tasks. This makes it easier and faster for developers to build financial applications and services using Python compared to other languages, which has been a key element in the success of Python in many different areas, including financial services. This is especially remarkable because companies in the financial services sector have historically been conservative and somewhat reticent to introduce dynamic languages to their tech stack.

What sparked the Bloomberg and Microsoft collaboration over Python, and what’s the ultimate goal?

When Guido van Rossum joined Microsoft in 2020, he decided that he wanted to dedicate the majority of his time at the company towards making Python faster. To achieve that, he started a small team within Microsoft with the aim of developing new strategies to make CPython (the default Python implementation) faster. Improving the performance of the Python interpreter is something that has always excited me, and I have spent quite a lot of time working towards this goal in the past. So, shortly after this team was started at Microsoft, and after some conversations, Bloomberg allowed me to spend half my time collaborating with the “Faster CPython” team. Since then, the team has grown to include both new Microsoft employees and additional collaborators from the Python core team and other companies.

The ultimate goal of this collaboration is to make Python faster for everyone and for all use cases.

Python 3.12 (alpha 3) was released on 12/14. What’s the long-term direction of Python with each new release? What do Python developers need to stay aware of as they think about the long-term future of the language?

Python 3.12 doesn’t have any new modules yet, and only some new features. This is because it is still very early in its development cycle, so many of the big changes have not been made yet. You can check [this site] at any point to see what will be included in the new release.

In general, the new features added to new releases are proposed as Python Enhancement Proposals (PEPs) that are discussed in the context of every individual release. These proposals are normally implemented in the release they target or sometimes the next one. Of course, this does not mean there aren’t projects that span many releases, but these do not generally have a defined roadmap.

Some of the things we are working on related to the long-term direction of the language include the Faster CPython project (to make Python faster), an effort to make CPython compatible with WebAssembly so Python can be used better in web browsers and compatible platforms, an effort to improve the UX and provide better error messages, and a general effort to improve the typing language. But what is included in each release is generally not planned beforehand; it is decided organically as the release approaches the feature freeze point. For more information regarding Python’s release cadence, you can check out PEP 602.

Regarding what Python developers should be aware of as they think about the long-term future of the language, one of the points that keeps recurring relates to compatibility. In the release team, we take backward compatibility very seriously, but Python is in a very interesting position compared to other languages because we expose an overwhelming number of internal interfaces that library developers can use to integrate with the language, especially when creating native compiled extensions in languages such as C, C++ or Rust. Some of these internal interfaces are not officially exposed or are considered “private” or “unstable” (in the sense that they can change in new releases of Python). This situation regularly forces many libraries that are leveraging some of the trickier interfaces to be changed when a new release of the interpreter is made available. As the dependency graph of libraries is complex and dense, this means that upgrading a given project to the newest version of Python may be a non-trivial task because all the dependencies and dependencies of dependencies must be upgraded first—and sometimes there are many challenges along the way.

Of course, there is a fine balance between offering some of these intimate interfaces that can be leveraged for performance and offering guarantees related to backward compatibility. The core team is very aware of these challenges, and we are trying to move towards a better-defined interface surface with stronger guarantees. Simultaneously, we are trying to encourage projects to test the compatibility of their libraries and applications with new releases of the interpreter earlier. This way, we can shorten the feedback loop between them and us in order to help the whole ecosystem be ready sooner when a new release is published.

In general, this is a very challenging problem that equally involves both technical and human aspects. But it is a problem we are slowly, but steadily, solving.