For the past few years, the narrative around the R programming language, a staple of data science, has remained much the same: although academia and specialized data-science firms relied on R, Python was rapidly eclipsing it as the language of choice for all things data-related.
However, the latest update of the TIOBE Index suggests something incredible: news of R’s demise has been premature, and the language might be making a bit of a comeback. Specifically, R has jumped to eighth place on the Index, up from 20th place a year ago.
What’s behind this surprising rise? “There are 2 trends that might boost the R language: 1) the days of commercial statistical languages and packages such as SAS, Stata and SPSS are over,” TIOBE wrote in a note accompanying the data. “Universities and research institutes embrace Python and R for their [statistical] analyses, 2) lots of statistics and data mining need to be done to find a vaccine for the COVID-19 virus.”
To generate its rankings, TIOBE draws on data from a variety of aggregators and search engines, including Google, Wikipedia, YouTube, and Amazon. For a language to rank, it must be Turing complete, have its own Wikipedia entry, and earn more than 5,000 hits for the query +"<language> programming" on Google. That methodology has attracted its share of critics over the years, who argue that the rankings are more a measure of these languages’ “buzz” (and SEO juice) than actual usage.
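As a rough illustration, the three eligibility criteria described above can be written as a simple check. This is only a sketch of the rule as this article states it, not TIOBE's actual code; the `Language` class, its field names, and the hit counts in the usage note are hypothetical.

```python
from dataclasses import dataclass

# Threshold taken from the article: a language needs MORE than 5,000 hits.
MIN_HITS = 5_000


@dataclass
class Language:
    """Hypothetical record of the facts the eligibility rule depends on."""
    name: str
    turing_complete: bool
    has_wikipedia_entry: bool
    google_hits: int  # hit count for the search query; supplied by the caller


def search_query(name: str) -> str:
    """Build the query string in the form the article describes."""
    return f'+"{name} programming"'


def qualifies(lang: Language) -> bool:
    """Return True only if all three criteria from the article are met."""
    return (
        lang.turing_complete
        and lang.has_wikipedia_entry
        and lang.google_hits > MIN_HITS
    )
```

For example, `qualifies(Language("R", True, True, 12_000))` returns `True`, while a language with exactly 5,000 hits would not qualify. The hit counts here are made-up placeholders, not real search data.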
That being said, TIOBE is a useful way to monitor languages that are potentially on the rise (and fall). And for a long time, it seemed that R was falling. Way back in 2018, for example, a KDnuggets poll of technologists who used both R and Python showed a slow decline in R usage in favor of Python. At around the same time, a separate survey from Burtch Works revealed that Python use among analytics professionals grew from 53 percent to 69 percent over a two-year period, even as the R user base shrank by nearly a third.
“R has issues with scalability,” Enriko Aryanto, the CTO and a co-founder of the Redwood City, Calif.-based QuanticMind, a data platform for intelligent marketing, told Dice at the time. “It’s a single-threaded language that runs in RAM, so it’s memory-constrained, while Python has full support for multi-threading and doesn’t have memory issues. When choosing a language, it all comes down to choosing what’s best to solve your problem.”
Meanwhile, Python continued to slither its way deeper into the data-science arena. “Behind Python’s growth is a speedily-expanding community of data science professionals and hobbyists—and the tools and frameworks they use every day,” GitHub stated during its 2019 edition of the State of the Octoverse. “These include the many core data science packages powered by Python that are both lowering the barriers to data science work and proving foundational to projects in academia and companies alike.”
But while Python remains immensely popular, both in data science and programming as a whole, it seems you shouldn’t count out R just yet; it’s clearly still drawing attention and usage—whether or not COVID-19 has anything to do with it.
In the meantime, if you’re new to Python and want to learn its ways, check out Python.org, which offers lots of documentation, including a useful beginner’s guide to programming in it. Once you’ve learned some core concepts, focus on writing faster code (via functions, lists, and more), debugging, and other more advanced skills. Microsoft also has a video series, “Python for Beginners,” with 44 short videos (most under five minutes in length; none longer than 13 minutes). It recently added even more content, including “More Python for Beginners” (20 videos), which covers key concepts such as managing a file system and asynchronous operations, and “Even More Python for Beginners: Data Tools” (31 videos).