talks/2024-08-14_boston-python-presentation-night_rockylinux/index.md
2024-08-14 11:18:10 -04:00

---
theme: default
class:
- lead
- invert
header: Python and Rocky Linux
footer: Rocky Enterprise Software Foundation
---
# Python and Rocky Linux
<div data-marpit-fragment>
<h4>(a love story in three parts)</h4>
</div>
---
<!-- paginate: true -->
<!-- footer: "" -->
<!-- header: "" -->
## Who am I?
<!--
My name is Neil Hanlon, and I'm one of the founders of Rocky Linux, serving as the infrastructure team lead.
I work at CIQ in the Open Source Program Office, where I primarily focus on the Rocky Linux community and infrastructure, striving to make it the best it can be.
I'm also a Fedora packager, a contributor to OpenStack-Ansible, and in a former life, I was a network engineer.
-->
![bg right:40%](bg.png)
<div data-marpit-fragment>
### "Professionally"
- Rocky Linux cofounder & infra lead
- Work @ CIQ in OSPO
- Fedora/EPEL Packager/contributor
- OpenStack-Ansible reviewer/contributor
- Menace to society (IPv6 Zealot)
</div>
<div data-marpit-fragment>
### "Unprofessionally"
- Plays guitar and trumpet
- Tinkerer, HAM (KC1UYE)
- pretend electrician
</div>
---
<!-- header: Python and Rocky Linux -->
<!-- footer: Rocky Enterprise Software Foundation -->
# What is Rocky Linux?
<!--
So.. you may be asking yourself.. What the heck _is_ Rocky Linux?
-->
---
<!--
Rocky Linux was founded in December 2020 in response to the CentOS project's shift in focus from CentOS Linux to CentOS Stream, which no longer rebuilds RHEL but tracks ahead of its next release.
RHEL is widely used in the industry for mission-critical enterprise services, supporting architectures from x86_64 and aarch64 to powerpc and s390x mainframes.
Rocky Linux exists to fill the gap left by CentOS Linux's absence, offering a free rebuild of RHEL that aims to be entirely compatible with the upstream distribution. Some of those gaps have Python-shaped holes.
-->
# Rocky Linux is a community-driven Enterprise Linux distribution: stable enough for the largest enterprise to rely on, and community-driven to ensure it stays accessible to all
---
<!-- dfooter: (ref: https://x.com/carlwgeorge/status/1439724296742576130/) -->
<!-- footer: "" -->
<!-- header: "" -->
<!--
Let's discuss what Rocky Linux is and its lifecycle, particularly in the context of Enterprise Linux.
After the abrupt end-of-life for CentOS Linux 8 and the shift in focus to CentOS Stream, Rocky Linux emerged to fill the gap.
Rocky is built from a minimized subset of the Fedora package set.
A new major release occurs every three years, which corresponds to roughly six Fedora releases.
Each major release is supported for 10 years; the first half with full support and the latter half with 'Maintenance' support.
We also leverage community-supported addon repositories like EPEL and rpmfusion to include many Python packages that are not in the base OS.
-->
## Enterprise Linux?
<div data-marpit-fragment>
<img src="E_rTivEXMAg6Tjh.png" style="display: block; width: 76%; margin: 0 auto 0.5em auto;" />
</div>
* CentOS Linux 8 EOL-ed early; focus shifted to CentOS Stream
* Built from minimal subset of Fedora package set
* New major release every 3 years (about six Fedora releases)
* Each major release supported for 10 years
  - First half: Full support
  - Second half: 'Maintenance' support
* Community-supported addon repositories like EPEL and rpmfusion
  - Many Python packages not included in the base OS
---
<!-- header: Python and Rocky Linux -->
<!-- footer: Rocky Enterprise Software Foundation -->
<marquee direction=right><h1>:snake:</h1></marquee>
---
# Part I - The Operating System
<!--
Let's start with Part I: The Operating System.
In any modern Linux distribution, you'll find software written in a myriad of languages such as Java, Rust, C, C++, Ruby, Node.js, and more. However, core components of Fedora-based distributions, including Rocky Linux, often rely heavily on Python.
While the list isn't exhaustive, I've selected a few key components that illustrate just how important Python is to the core of our system.
-->
<div data-marpit-fragment>
* platform-python
* dnf / rpm
* anaconda (no, not that anaconda)
* cockpit-project
</div>
---
<!--
Let's discuss Cockpit, a web-based graphical interface designed for managing servers.
Cockpit uses a mix of Python and JavaScript to provide a user-friendly experience.
It's available for a variety of distributions, including Debian and Fedora.
Think of it as a modern, more user-friendly alternative to Webmin.
Cockpit allows you to manage various server aspects such as networking, storage, logs, containers, VMs, and more.
-->
## Cockpit
* Web-based graphical interface for servers
* Built with Python and JavaScript
* Available for Debian and Fedora
* Think "Webmin but less bad"
* Manages networking, storage, logs, containers, VMs, etc.
<!-- @TODO add image -->
---
<!--
Next, let's talk about Anaconda, also known as pyanaconda.
Anaconda is the installation program used by Fedora, Red Hat, and others.
It allows for both GUI (via VNC) and TUI-based interactions during the installation process.
Additionally, Anaconda can process kickstart files for automated, unattended installations.
It uses Gtk.Builder and Glade for customization and modifications of the UI elements.
The UI is currently undergoing rewrites to smooth out some rough edges, likely to be seen in Fedora 42.
Anaconda leverages the blivet library for disk setup and supports various addons like ostree and openscap.
-->
## Anaconda (pyanaconda)
* Installation program for Fedora, Red Hat, others
* GUI (VNC) or TUI based interaction
* Supports kickstart files for unattended installations
* Uses Gtk.Builder / Glade for customization
* UI being rewritten for fewer rough edges
  - Expected in Fedora 42
* Blivet: Storage library for disk setup
* Addons: ostree, openscap, etc.
<!-- @TODO add image -->
---
<!--
Now let's break down DNF, short for "Dandified Yum."
It serves as the package manager for distributions like Fedora, RHEL, SUSE, and a few others.
DNF is essentially a rewrite of `yum`, which itself was a rewrite of `yup`, and so on.
It's a high-level tool that sits on top of `rpm` to manage packages.
Introduced in 2013, dnf4 used a combination of Python and C/C++.
For context, dnf5, which came out in 2018, is written in C++ and is currently used only in Fedora.
We'll focus on the more widely used dnf4 for now.
-->
## DNF ("Dandified Yum")
* Package manager for Fedora, RHEL, SUSE, and others
* Rewritten version of `yum`, which was a rewrite of `yup`
* Layer on top of `rpm` for package management
* DNF4 (2013): Python + C/C++
  - We don't talk about DNF 1-3...
* DNF5 (2018): C++
  - Currently only in Fedora
---
<!--
Now, let's discuss DNF4 and its relationship with Python.
There is quite a bit of history here, much of which can be quite dry.
One significant change was moving to an external dependency solver, making the process faster and less memory-intensive.
The transition from Python 2 to 3 was particularly challenging during this period.
Several libraries play a role in DNF's functionality:
- libdnf: Provides a high-level API for DNF and is written in C/C++.
- libsolv: A package satisfiability solver also written in C.
- librepo: A libcurl wrapper used for fetching Linux repository metadata and packages, with Python bindings available.
- libcomps: An alternative to the slower python yum.comps library, also with Python bindings.
-->
### DNF4 and Python
* Lots of probably boring history
* External dependency solver: faster, less memory usage
* Python 2/3 transition was (and is) challenging
* libdnf: High-level API for DNF (C/C++)
* libsolv: Package satisfiability solver (C)
* librepo: libcurl wrapper for fetching repo metadata and packages (C/Python bindings)
* libcomps: Alternative to slower python yum.comps library (C/Python bindings)
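
---

To make the Python side of DNF4 a bit more concrete, here is a minimal sketch that lists installed packages matching a name glob through the `dnf` Python module. The helper name `installed_names` and the `python3*` default are made up for illustration, and the function returns `None` when the `dnf` module is not importable (for example inside a virtualenv or on a non-EL system).

```python
def installed_names(pattern="python3*"):
    """Return sorted names of installed packages matching a glob.

    Hypothetical helper for illustration only. Returns None when the
    dnf Python module is not importable.
    """
    try:
        import dnf  # provided by the distribution's dnf4 packages, not by pip
    except ImportError:
        return None

    with dnf.Base() as base:
        # Load only the local RPM database; skip remote repository metadata.
        base.fill_sack(load_system_repo=True, load_available_repos=False)
        query = base.sack.query().installed().filter(name__glob=pattern)
        return sorted({pkg.name for pkg in query})


if __name__ == "__main__":
    names = installed_names()
    if names is None:
        print("dnf module not importable from this interpreter")
    else:
        print("\n".join(names))
```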
---
<!-- header: "" -->
<!-- footer: "" -->
## Platform-Python
<!--
Let's talk about platform-python, a critical component for the stability of any EL-based distribution, including Rocky Linux.
Every system needs a stable version of Python to rely on.
For Rocky 8, this version is Python 3.6.8, and for Rocky 9, it's Python 3.9.18.
Both are accessible via `/usr/libexec/platform-python`.
Next, we'll look at some common pitfalls when using Python on EL systems.
-->
* System must have a stable Python version
* Rocky 8: Python 3.6.8
* Rocky 9: Python 3.9.18
* System python always located at `/usr/libexec/platform-python`
<div data-marpit-fragment>
## Python on EL - Pitfalls
* Tools like Ansible may get confused
  - May need to force `ansible_python_interpreter`
* Many Python packages/modules built against specific Python versions
* RPM modularity exacerbates the issue
* Many end up using pip for package installation :(
* No solid solution yet
</div>
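
---

A quick way to see which Python a script is actually running under is to compare the running interpreter against the platform-python path. This is only a sketch: the helper name `interpreter_report` is hypothetical, and the only path assumed is the `/usr/libexec/platform-python` location mentioned on the previous slide. It sticks to syntax that works on Python 3.6.

```python
import os
import sys

# Path taken from the Platform-Python slide; it may not exist on non-EL systems.
PLATFORM_PYTHON = "/usr/libexec/platform-python"


def interpreter_report(platform_python=PLATFORM_PYTHON):
    """Describe the running interpreter relative to platform-python."""
    running = os.path.realpath(sys.executable)
    present = os.path.exists(platform_python)
    return {
        "running": running,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "platform_python_present": present,
        # Compare resolved paths, since platform-python is often a symlink.
        "running_is_platform_python": (
            present and os.path.realpath(platform_python) == running
        ),
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print("{}: {}".format(key, value))
```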
---
<!-- header: Python and Rocky Linux -->
<!-- footer: Rocky Enterprise Software Foundation -->
## Using / Developing with Python on Rocky
<!--
Let's now discuss the experience of using and developing with Python on Rocky Linux.
How hard is it really to get started and be productive with Python on our platform?
The short answer is, like most things: it depends.
While there are hurdles, such as non-standard repositories and the intricacies of building RPM packages, tools like pkgs.org can help find what you need. For many, pip remains a go-to resource for managing Python packages.
-->
<div data-marpit-fragment>
### How Hard Is It?
* It depends!
* More challenging and frustrating than ideal
* Use pkgs.org for finding non-standard repos (EPEL, rpmfusion)
* Building RPMs isn't too hard, but pip is often your best friend
</div>
<!-- @TODO add image -->
---
### Python on Rocky
<!--
Next, let's take a closer look at how Python is handled in Rocky 8 and Rocky 9.
Understanding the differences between these versions can help you navigate the complexities of developing with Python on Rocky Linux.
In Rocky 8, you have both modular and non-modular Python versions. Platform-python is non-modular and is crucial for dnf.
However, most Python packages and modules are built against the default Python version.
In Rocky 9, platform-python is synonymous with Python3 and is required for dnf. Instead of modules, Python is now provided as separately named packages, like python3.11 and python3.12, which can be installed in parallel.
-->
#### Rocky 8
* Modular and non-modular Python versions
* Platform-python (non-modular) required for dnf
* Most packages/modules built against default Python
* Modular python not modularized "correctly"
#### Rocky 9
* Platform-python == Python3
  - Required for dnf
* Python provided as separate packages (python3.11, python3.12)
  - Installable in parallel
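
---

Because the extra interpreters are installed side by side, a script can check which ones are actually on `PATH` before choosing one. A minimal sketch, assuming the executables are named after their packages (`python3.11`, `python3.12`); the helper name `find_interpreters` is made up for illustration.

```python
import shutil


def find_interpreters(names=("python3", "python3.11", "python3.12")):
    """Map each interpreter name to its path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        print("{:<12} {}".format(name, path or "not installed"))
```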
---
<!-- footer: (ref: https://www.rfc-editor.org/rfc/rfc1925)-->
# Part B - Modularity
![](rfc1925-5.png)
<!--
Up next, we'll dive into RPM modularity. This concept was introduced to address the growing complexity and diversity of software requirements in modern enterprise environments.
It aims to provide a way to customize and manage different versions of software components within the same distribution.
By allowing users to select and combine different modules, we hoped to offer greater flexibility and control over their systems.
However, as we will discuss, implementing RPM modularity brought its own set of challenges.
-->
---
## Really Quick Backstory, I Promise
<!--
In the beginning, our installation base was like managing a small garden with a few servers, carefully maintained within a controlled environment. Development cycles were long, and software ran on system-installed packages.
Today, it's like managing an industrialized farm. The sheer scale makes it impossible to manage servers individually. Development cycles are now measured in days, not months or quarters, with parallel systems having disjoint dependencies.
For an enterprise distribution promising 10 years of support, this creates a vicious cycle: fewer, outdated packages and missing dependencies lead to a bad developer experience, fewer developers, fewer users, and less funding.
Containers might seem like a solution but often just shift the problem without solving it. They make the enterprise distribution merely a base for running different containerized environments.
The idea of modularity—allowing users to customize their OS components—emerged but proved problematic. Modules were rarely isolated, needed individual bootstrapping per stream, and required extensive testing.
-->
* Before: Carefully managed servers like a backyard garden
* Now: Fully industrialized farm with automated preparation, planting, and harvesting
* Cattle vs. Pets analogy
* Development cycles now measured in days
* EL releases need to be supported for 10 years, but new tech evolves rapidly (e.g., new JS frameworks)
* Diverse tech stack: Python, Postgres, Node.js, Nginx, PHP, Ruby (some evolving faster than others)
* "It is easier to move a problem around than it is to solve it."
* What if users could compose their "own" OS?
---
<!-- footer: (source: https://youtu.be/F5SWz3yPXjo ; The Self Abolition of Enterprise Linux Distributions; Dan Čermák, SUSE) -->
![bg](self-abolition.png)
<!--
Imagine the most cyclical digraph you can think of. Now make it ten times worse.
This slide illustrates the seemingly endless loop of problems and solutions that come with maintaining an enterprise Linux distribution.
We have to juggle software velocity, modularity complexities, and the ever-evolving landscape of dependencies and packages.
As we discussed earlier with RPM modularity, this can lead to a multiplicative number of challenges that make maintenance highly complex.
Let's dig deeper into how these issues create a vicious cycle and what possible strategies we can employ to mitigate these challenges.
-->
---
## So We're Doomed?
<!--
So, are we doomed?
We face challenges in packaging and maintaining all the software needed for a modern distro, but we can focus on maintaining the most essential components.
Software development velocity isn't going to slow down, and as humans, we can only get so fast.
Our best strategy is to focus on improving our tooling to enhance efficiency and automation.
For example, here's a diagram from Dan Čermák of SUSE showing how we could have a tight, iterative PyPI-to-RPM build loop.
-->
<div data-marpit-fragment style="display: inline-flex; width: 50%; align-self: end">
<img src=well-yes-no.png width=500px align=right />
</div>
<div style="display: inline-flex; width: 50%; align-self: start; margin-top: -8em;">
* Can't package everything
* **Can** maintain the important stuff
* Software velocity isn't slowing down
* We humans are only so fast
* Focus on improving tooling
</div>
<div data-marpit-fragment style="display: inline-flex; width: 50%; align-self: end; margin-top: -13em;">
<img src=python-rpm-loop.png width=500px align=right />
</div>
---
<!-- footer: Rocky Enterprise Software Foundation -->
# Part 3 - Empanadas and friends
---
<!--
The tools that we use to build Rocky Linux are heavily influenced by Fedora's tools, many of which are written in Python due to its approachability and versatility.
Initially, we adopted various Fedora tools such as koji, mock, MBS, pungi, and lorax to compose the OS.
- Pungi serves as the dependency solver and compose maker.
- Lorax helps in creating images (ISOs) and can perform other custom tasks.
Even though some of these tools have undergone updates or replacements, they remain fundamental to our processes.
To gain better control over our release builds, we created distrobuild as a wrapper around koji and MBS.
We've also been using PV2 for automating upstream imports during our transition from Go to Python with our build system (Peridot).
This is still a big WIP, but there are good things on the horizon.
Additionally, Apollo is our errata feed suite, providing UI and workflows to fetch updates. However, it still requires significant improvements.
-->
## Building Rocky Linux
* Leveraging Fedora tools: koji, mock, MBS, pungi, lorax
* Distrobuild: enhances koji/MBS control
* PV2: used for automating upstream imports
* Recognizing our challenges: "It's us, we're the problem"
* Apollo: errata publisher requiring improvements
---
## empanadas (git.resf.org/sig_core/toolkit)
<!--
Empanadas is central to how we handle release engineering and image building in Rocky Linux.
Originally, our toolkit started as a series of bash scripts.
Over time, we consolidated these scripts into Empanadas, a Python CLI.
The CLI relies on packages like click and rpm python, with a substantial amount of `subprocess.run()` calls for executing shell commands.
There's an excessive and somewhat embarrassing reliance on `subprocess.run()`.
I don't consider myself a particularly skilled Python developer, and there are ongoing refactoring efforts to improve the codebase.
-->
<div style="display: inline-flex; width: 300px; align-self: end; margin-top: 6em;">
<img src=empanadas.png width=300px align=right />
</div>
<div style="display: inline-flex; width: 100%; align-self: start; margin-top: -14em;">
* empanadas is **the** way Rocky does release engineering and image building
* toolkit began as a bunch of (bash) scripts
* later consolidated into empanadas, a Python CLI
* click, rpm python, ... `subprocess.run()`
* like an embarrassing amount of `subprocess.run()`
* I don't claim to be a good Python dev
* some refactoring efforts going on
</div>
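
---

For anyone curious what the `subprocess.run()` pattern looks like, here is a small sketch of a wrapper around it. This is not taken from empanadas; the helper name `run` is hypothetical. It passes arguments as a list (no shell), fails loudly on a non-zero exit, and uses `universal_newlines=True` rather than `text=True` so it also works on Rocky 8's Python 3.6.

```python
import subprocess
import sys


def run(cmd):
    """Run a command given as a list of arguments and return its stdout.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        cmd,
        check=True,               # raise instead of silently ignoring failures
        stdout=subprocess.PIPE,   # capture output (Python 3.6 compatible)
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Use the current interpreter so the example runs anywhere.
    print(run([sys.executable, "-c", "print('hello from a subprocess')"]))
```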
---
<!--
Several key tools and services are integral to our operations in Rocky Linux.
Mirrormanager helps us manage our network of mirror servers for efficient distribution.
Mailman and Hyperkitty are essential for our mailing list and discussion management.
We also use KIWI for creating and managing appliance images.
There are other important tools and services that contribute to our processes, though I might not recall all of them off the top of my head.
-->
## Friends
* mirrormanager
* mailman/hyperkitty
* kiwi
* others I've surely forgotten
---
<!--
What does it all mean?
Python plays a fundamental role not only in Fedora, CentOS, and Rocky Linux distributions but also in the processes involved in building and distributing these OSes.
Python's approachability makes it well-known among "sysadmin-adjacent" individuals, even if they don't always write Pythonic code.
Lastly, we can always refer to RFC-1925 for answers to any challenging questions even when they're not related to networking.
-->
# What does it all mean?
* Python is core to Fedora, CentOS, and Rocky Linux distributions
* Tremendous amount of Python used in building and distributing the OS
* Python is approachable for "sysadmin-adjacent" roles, even if the code isn't always Pythonic
* RFC-1925 always has an answer to any question
---
# Thank You!
---
# Q & A
---
<div style="display: inline-flex; align-self: end; margin-top: 2em;">
<div style="width: 60%">
## Join us on Mattermost! https://chat.rockylinux.org
</div>
<img src=chat.rockylinux.org.png width=200px align=right style="margin-left: -2em; display: block;" />
</div>
<div style="display: inline-flex; width: 100%; align-self: start; margin-top: -14em;">
- neil@shrug.pw / neil@resf.org
- @kneel.bsky.social
- [thepotato.tech](https://thepotato.tech)
</div>
---