Creating a Python Project in the Year 2024

Published on: 2024-9-16

Working with a data-heavy product means I often need to quickly run experiments, whether to validate a new feature or to run some load on production-grade infrastructure.

Wide Angle Analytics is backed exclusively by type-safe Scala. We chose Scala to move fast with the assurance of a great language and ecosystem to deliver correct and efficient solutions.

However, Scala can feel sluggish for quick experiments.

Scala Alternative

Luckily, there is Perl... just kidding. I love Perl. Python, we are talking about Python. You can't talk about data science, data engineering, or even AI without mentioning Python.

This super expressive, albeit a tad wonky, language convinced hordes of C-style syntax enthusiasts that yes, we can trust braceless code. Heck, even Scala 3 supports braceless code style these days.

Ok, so Python. I would use Python to write experiments. This is often code that is not necessarily throwaway but will not make it into production at Wide Angle. So, tests are less important.

Python Developers

From what I experienced in my career, there are three types of Python developers. I am sure there are more variations, but we are talking about my anecdotal experience. If you don't agree, please take it to Hacker News and rage there.

Script Kiddy

The first type of developer is someone who uses Python as a glorified Bash. Hack and slash, but it gets the job done. All the dependencies live in the global installation, there is just one file, the script.py, and often the script has one main function, guarded by if __name__ == "__main__", and lots of if/else blocks.

Package Developer

Unlike the script kiddy, this kind of developer uses a virtual environment, defines dependencies, and maybe even pushes the package to Git.

A few files, core dependencies, and Bob's your uncle, you got yourself a decent sandbox to run code.

Application Developer

Lastly, you have your serious Application Developers. All you Django aficionados, yes you! This is where I am out of my depth. Personally, I have never delivered a substantial Python project. I saw large codebases and they were intimidating. Python was never my cup of tea for bigger systems.

If I wanted to feel elitist, I would make a snarky remark about how Python is less safe than Scala, or even gasp, Java. But the truth is, Scala was always more familiar and felt easier due to preexisting knowledge.

So, no fault to Python itself, I can't say what a real Python application developer is like.

Ad-Hoc Scripting in Python

I, like many readers who have reached this far, started as your typical Script Kiddy, occasionally nuking local Linux installations by doing some horrible seppuku with system-wide Python installations.

With time, I grew the patience and practice to always build my application package and use that, instead of a hodgepodge of individual scripts.

Besides not destroying your local operating system, here are some benefits to this approach:

  • You have full control over your development Python version and the packages it uses.
  • When necessary, you can pull external packages, in a specific version, without affecting other projects.
  • And finally, you end up with a repeatable build package that you can drop into Git and share with colleagues or the community.

Building a Python Package in 2024

Ok, you are convinced, you go to Google or ChatGPT and ask for instructions on how to create your new Python project/package. Chances are you will get some outdated or completely broken (🙄 AI) code snippets that you will paste into your code editor and end up being thoroughly disappointed.

So, if you are reading this guide today, this is how to do it today. Like everything, the information shared here will get outdated soon. You have been warned.

The Prerequisites

  1. Python 3
  2. Pip

Create your new project:

/analyzer
  /README.md

Next, create and activate your virtual environment:

$ python3 -m venv .venv
$ source .venv/bin/activate

With that, you are in your sandbox. Whatever you install via pip will be localized to your project only.
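If you are ever unsure whether the sandbox is actually active, you can ask Python itself. Here is a minimal sketch, assuming the environment was created with venv as above. The name in_virtualenv is made up for this example; the check relies on sys.prefix pointing at the environment while sys.base_prefix keeps pointing at the base installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If this reports False right after activation, the activate script most likely did not run in your current shell.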

Core Project Structure

Now, it is time to lay the foundations for our project structure:

/analyzer
  /README.md
  /.venv
  /pyproject.toml
  /setup.py
  /runner
    /app.py

The pyproject.toml is a package configuration file. Here we will define a build tool, some basic project information, etc.

We are using a boilerplate setup.py only for the sake of legacy tooling. It is a very small file:

from setuptools import setup

setup()

And lastly, our code. That lives in analyzer/runner/app.py and of course in many more files that we will later reference.

With that said, let's create our TOML build file:

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "waa-analyzer"
authors = [
    {name = "Jarek Rozanski", email = "jarek@wideangle.co"},
]
description = "Data experiment for Wide Angle Analytics"
readme = "README.md"
requires-python = ">=3.12"
keywords = ["web-analytics", "analytics"]
dependencies = [
    "elasticsearch",
]
version = "0.0.1"

[project.scripts]
analyzer = "runner.app:run"

[tool.setuptools]
packages = ["runner"]

The above TOML file defines a package with the following features:

  1. The package/project is called waa-analyzer, and it installs a script named analyzer.
  2. When you run the analyzer script, it will trigger the function run defined in the app.py file, in the runner package.
  3. At runtime, the package depends on elasticsearch.

Pretty neat.
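If the runner.app:run notation looks like magic, here is roughly what happens to such a string. This is a simplified sketch, not the code that pip or setuptools actually generates. The helper resolve_entry_point is a hypothetical name, and json:dumps stands in for runner.app:run so the example runs without our project installed.

```python
from importlib import import_module
from typing import Callable


def resolve_entry_point(spec: str) -> Callable:
    """Turn a 'package.module:function' string into the function object."""
    # Everything before the colon is an importable module path,
    # everything after it is an attribute looked up on that module.
    module_path, _, attr = spec.partition(":")
    module = import_module(module_path)
    return getattr(module, attr)


if __name__ == "__main__":
    # "json:dumps" is a stand-in for "runner.app:run" (assumption for the demo).
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"status": "ok"}))
```

The part before the colon has to be importable, which is why the packages list in the [tool.setuptools] section must include runner.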

Sample runner application code:

def run():
    print("Hello, World")
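
Once experiments grow, you will probably want command-line arguments. Here is a minimal sketch of how app.py could evolve, using argparse from the standard library. The --name flag is hypothetical and only there for illustration. The optional argv parameter is my own addition, so the function can be called from other code as well as from the analyzer script.

```python
import argparse


def run(argv=None):
    """Entry point referenced as runner.app:run in pyproject.toml."""
    parser = argparse.ArgumentParser(prog="analyzer")
    # --name is a hypothetical flag, added only for illustration.
    parser.add_argument("--name", default="World", help="who to greet")
    # argv=None makes argparse read the real command line (sys.argv[1:]).
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}")
    # Returning 0 signals success if the caller uses the value as an exit code.
    return 0
```

The script entry calls run() without arguments, so the real command line is parsed, while code in another module can pass a list such as ["--name", "Scala"].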

Tip

If you plan on pushing your code to a repository, make sure to create a .gitignore file and exclude .venv from tracked files. You don't want these files in your source control.

Build It

First, install it in development (editable) mode, so code changes take effect without reinstalling and it is easier to test and change:

$ python3 -m pip install -e .

And assuming all worked out...

Run It

$ analyzer
Hello, World

This works because we declared the function run in runner.app as a script in pyproject.toml. Installing the package created an analyzer executable inside the virtual environment, which is on your path while the environment is active. If you open a new shell, you will need to activate the virtual environment again.
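
If the command is not found, it helps to check where, if anywhere, it resolves on your path. Here is a small sketch using only the standard library; locate_script is a hypothetical helper name.

```python
import shutil
import sys
from typing import Optional


def locate_script(name: str) -> Optional[str]:
    """Return the full path of a command on PATH, or None if absent."""
    return shutil.which(name)


if __name__ == "__main__":
    path = locate_script("analyzer")
    if path is None:
        print("analyzer not found; is the virtual environment active?")
    else:
        print("analyzer resolves to:", path)
        print("environment prefix:  ", sys.prefix)
```

With the environment active, the reported path should sit under the environment prefix.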

That's It

And that's how you build a quick Python project, with dependencies and repeatable builds, in 2024.

Go and learn some Python for the greater good 😀

Author: Jarek Rozanski

Jarek Rozanski is the Founder of Wide Angle Analytics. After a successful career in investment banking and financial services, he decided to explore the world of start-ups and eventually start his own. Privacy, one of our basic human rights, needs strong protection according to Jarek.