There is a better way to manage dependencies, package, and publish Python projects.

Surprised to see you are still using virtualenv instead of Poetry (photo by Andrea Piacquadio from Pexels)

I was daunted by the complexities of projects when I started my data science career. We were using virtualenv in all our Python projects.

I was impressed by the Node Package Manager (npm) and always wondered why we didn’t have one like that in Python.

I was yearning for a single tool to maintain isolated environments, manage dev and production dependencies, and handle packaging and publishing.

Thankfully, we have Poetry now.

In a nutshell, Poetry is a tool for dependency management and packaging in Python.

But this official definition is incomplete because I found Poetry does more than manage dependencies and packaging.

Poetry…


Skimpy makes it incredibly easy to summarize datasets in notebooks and terminals.

Summarize a data set — Photo by Lukas from Pexels

Describe was the first function I tried on any new dataset. But I’ve found a better one now.

I replaced it with Skimpy. It’s a small Python package that shows some extended summary results for a dataset. You can also run it in a terminal window without entering a Python shell.

You can install it from PyPI using the following command.

pip install skimpy

Why Skimpy?

In a previous post, I shared three Python exploratory data analysis tools. With them, you can generate more complete reports about your datasets in the blink of an eye.

But what if you need a simpler…


Plotly Dash apps are the fastest way to build production-grade dashboards in Python.

Screenshot of the dashboard created in Python — by the Author.

I don’t have to convince you why we need an interactive dashboard. But what most people don’t know is that they don’t have to buy expensive licenses for Tableau or Power BI. You don’t have to enroll in a JavaScript course either.

Dash apps allow you to build interactive dashboards purely in Python. Interestingly, they can reach heights that popular BI platforms cannot. Also, you can host them on your own servers and on your own terms.

Why Dash? Why not Tableau, Power BI, or some JavaScript library?

BI platforms such as Tableau and Power BI do a fantastic job. They allow even non-technical managers to do data exploration themselves. …


Data Science

Everything we need to start data science is affordable and accessible

Democratizing data science — Image created by the Author.

I believe the first profound attempt was in 1985. A revolutionary piece of software changed the way we think about data. It allowed ordinary people to do extraordinary data analyses. We call it Excel, initially developed by Microsoft for the Macintosh.

Since then, the field of data science has evolved and become accessible to everyone.

  • Access to knowledge has improved phenomenally. If you’ve been listening to data science-related interviews, you may have noticed that one in every ten mentions Andrew Ng’s machine learning course. It’s a free online resource available to anyone aspiring to be a data scientist.
  • Affordable Infrastructure — training a model…


Making sense of data is common sense. Coding skills aren’t the superpower of a data scientist or data engineer.

Data scientists and Data Engineers need no coding skills. — Photo by Lukas from Pexels

If you dream of becoming a data scientist or a data engineer, you probably see a black screen full of code in that dream. Polishing your coding skills may be the popular advice you get on this journey. Yet, surprisingly, data science has nothing to do with programming.

Data science is the process of making sense of a raw collection of records. A programming language is only a tool. It’s like a container for cooking your meals. But the container itself is not the meal.

Some people lose interest in data science because they aren’t good at programming. They couldn’t get…


I am migrating all my ETL work from Airflow to this super-cool framework

Illustration from Undraw

I was a big fan of Apache Airflow. Even today, I don’t have many complaints about it. But the newer tool, Prefect, amazed me in many ways, and I can’t help but migrate everything to it.

Prefect, like Airflow, is a workflow automation tool. You can orchestrate individual tasks to do more complex work. You can manage task dependencies, retry tasks when they fail, schedule them, and so on.
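
To make the idea concrete, here is a minimal sketch of what an orchestrator does: run tasks in dependency order and retry the ones that fail. It uses only the Python standard library and is not Prefect’s or Airflow’s API. The names run_workflow and run_with_retries, and the three-step job, are hypothetical and exist only for illustration. Scheduling is left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_with_retries(task, inputs, max_retries=2):
    """Call task(*inputs); retry up to max_retries times if it raises."""
    for attempt in range(max_retries + 1):
        try:
            return task(*inputs)
        except Exception:
            if attempt == max_retries:
                raise  # out of retries, surface the error


def run_workflow(tasks, dependencies, max_retries=2):
    """Run tasks in dependency order and return each task's result by name.

    tasks: task name -> callable
    dependencies: task name -> list of upstream task names; their results
                  are passed to the task as positional arguments, in order
    """
    results = {}
    graph = {name: dependencies.get(name, []) for name in tasks}
    # static_order() yields every task after all of its upstream tasks.
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[upstream] for upstream in dependencies.get(name, [])]
        results[name] = run_with_retries(tasks[name], inputs, max_retries)
    return results


# Hypothetical three-step ETL job; "transform" fails once to exercise the retry.
attempts = {"transform": 0}


def extract():
    return [1, 2, 3]


def transform(rows):
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise RuntimeError("simulated transient failure")
    return [row * 10 for row in rows]


def load(rows):
    return sum(rows)


results = run_workflow(
    tasks={"extract": extract, "transform": transform, "load": load},
    dependencies={"transform": ["extract"], "load": ["transform"]},
)
print(results["load"])
```

A real workflow tool adds a lot on top of this loop, such as scheduling, logging, and a UI, but the core contract is the same: declare the tasks and how they depend on each other, and let the tool decide the order and handle the failures.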

I believe workflow management is the backbone of every data science project. Even small projects can benefit remarkably from a tool like Prefect. It eliminates a significant part of repetitive tasks. …


Cut down your data exploration time to one-tenth of its original duration using these Python exploratory data analysis tools.

Image by Daniel Hannah from Pixabay

I remember the good old college days when we spent weeks analyzing survey data in SPSS. It’s interesting to see how far we’ve come from that point.

Today, we can do all of that and a lot more with a single command before you even blink.

That’s a remarkable improvement!

This short article will share three impressive Python libraries for exploratory data analysis (EDA). Not a Python pro? Don’t worry! You can benefit from these tools even if you know nothing about Python.

They could save you weeks of data exploration and improve its quality. …


Hi Bex T., thanks for the great question. It's indeed simple.

Here's how to do it on Linux. I apologize; I don't use Windows often. But I believe the method could be similar.

Create a file and name it the way you need the API. For illustration, I've created one called hello. Note that it doesn't have any file type extension such as .py or .sh.

The following is what its content should look like.

#! /usr/bin/python
print("hello world")

Note the first line #! /usr/bin/python . This line will tell the OS which executor to use when running the script…
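
The file described above can also be created from Python itself. This is only a sketch under assumptions: the helper name write_executable_script is hypothetical, the file is written to a throwaway temporary directory, and marking the file as executable is my assumption about the usual next step on Linux, since the reply is cut off here.

```python
import os
import stat
import tempfile


def write_executable_script(path, body, interpreter="/usr/bin/python"):
    """Write a script that starts with a shebang line and mark it executable."""
    with open(path, "w") as f:
        f.write(f"#! {interpreter}\n{body}\n")
    # Add the execute bit for user, group, and others (similar to `chmod a+x`).
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


# Build the extension-less "hello" file in a temporary directory.
workdir = tempfile.mkdtemp()
script_path = write_executable_script(
    os.path.join(workdir, "hello"), 'print("hello world")'
)

# Read back the shebang line and check the user's execute bit.
with open(script_path) as f:
    first_line = f.readline().rstrip("\n")
is_executable = bool(os.stat(script_path).st_mode & stat.S_IXUSR)
print(first_line, is_executable)
```

Once the execute bit is set, a Linux shell can typically run the file by its path, provided the interpreter named on the first line exists on that machine.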


Overcoming Python’s limitations and using it for heavy data analytics and machine learning via web requests.

Photo by Miguel Á. Padriñán from Pexels

In a previous article, I wrote about the limitations of using Python web apps for analytics projects. Some of the points kindled readers' curiosity and inspired me to write another story to complement it.

The central question of this article is, “if Python has severe drawbacks because of its sync behavior, how do platforms such as Instagram and Spotify use it to serve millions around the world?”

While I have no official information from these platforms (or similar ones), I have insights into handling such massive requests from my experience.

Here in this article, I prepared a demo to show…

Thuwarakesh Murallie

Data scientist @ Stax, Inc and 2X top writer on Medium. linkedin.com/in/thuwarakesh
