Using private Python repositories with Aria Automation Orchestrator

In my last post, I covered an issue that can occur when using the urllib3 Python package with Aria Automation Orchestrator due to the version of OpenSSL within the Orchestrator Polyglot functionality. In this post, I will cover how the Manage Repositories feature in Orchestrator 8.11.2 and higher can be used with a private Python repository to manage dependencies when you cannot connect Orchestrator to the internet or when you need to use private packages.

I will be using devpi to host my Python repository. It is also possible to use alternative solutions such as GitLab, and I hope to blog about this in a future post.

Background

This post is based on a use case I was recently working on as part of VMware Hands-On Labs, preparing an exercise for inclusion in the Cloud Management labs that will be unveiled this week at VMware Explore US in August 2023.

The exercise uses Aria Automation Orchestrator (formerly vRO) with the NSX Python SDK to automate the creation of an NSX segment. In earlier releases of Orchestrator, this was possible by creating a zip file of the script and all its dependencies and importing it into Orchestrator. This, however, no longer worked for my exercise due to a mismatch in the version of OpenSSL, and I needed to be able to specify an exact version of the module to use. As Hands-On Labs run pre-created ‘pods’ in an isolated environment, it is impossible to connect to the internet when a user launches the lab. Any configuration required for the exercise must be stored locally, preconfigured, and available offline.

There was a further complication in that the NSX Python SDK modules are not available in the public PyPI repository used by default in Orchestrator; they are published on GitHub. This means that even though it was possible to connect Orchestrator to the internet during the build phase of the pod, I could not simply download the modules needed from PyPI and pre-stage them. I needed to use a private repository where I could host the NSX Python SDK modules and any dependencies.

Not being an experienced user of Python, I didn’t know much about what would be required, and I found devpi mentioned in a blog post in the context of setting up a local server an employee could use for development work while they were away from home and offline from their work systems. It met my needs for uploading and hosting private packages and making them accessible in the same format used by Pypi, meaning Orchestrator would be able to consume them. It is also Open Source and therefore doesn’t have a license fee associated with it which was a requirement for its use in HOL.

Lab Configuration

In my scenario, I used the following configuration:

Aria Automation Orchestrator 8.12 (embedded instance single node deployment)
2 x Ubuntu 22.04 VMs (devpi server and client)
NSX Python SDK from GitHub

I chose Ubuntu 22.04 as it provides Python 3.10.6, aligning with the Python 3.10 that Orchestrator provides and because a lot of the devpi content was based on Ubuntu, and I’m not a Linux expert.

It’s also worth mentioning while I used two VMs to act as server and client, it is entirely possible to use a single machine as a combined server/client setup. I didn’t configure my server to run as a background service or process since it would only be used for a single event over a short time. The client VM gave me a way to test that the devpi server and contents were accessible remotely from a simple Linux command, as I initially had an issue trying to access it in Orchestrator due to a misconfiguration on the devpi server.

I created a new user account (admin01) during the installation of Ubuntu on both my VMs and enabled OpenSSH; apart from those two things and setting up my network connection, the installation followed the default settings.

Devpi Server Installation

The installation of Devpi is simple. First, while logged in as your new user (I named mine admin01 for simplicity), install pip for Python3 on the VM that will be your devpi server using the command:

sudo apt install python3-pip

Now install the devpi-server package using the command:

sudo pip3 install -U devpi-server

This next step is optional. Devpi offers a web interface for the server that users can browse to explore the registry contents. It is not required for the Orchestrator integration, but as someone who comes from a Windows background, I find GUIs easier than command-line interfaces for things I am unfamiliar with. To install the web interface, run the command:

sudo pip3 install -U devpi-web

Now we need to initialise the devpi server instance ready to apply our custom configuration. If you intend to run devpi as a service in the background, there is some additional configuration to be performed. Refer to the devpi documentation for more information on creating the configuration files etc. Use the following commands to complete the initialisation and to start the server instance. In the example command I ran, I configured my devpi server instance to listen on the devpi-01a.corp.local FQDN using port 3141 and HTTP protocol:

devpi-init
devpi-server --host devpi-01a.corp.local --port 3141

Once the server start-up process completes, it will begin caching the pypi.org registry contents to its local index. The server will print to the screen regularly and will state it is committing 2500 new documents to the search index and then indexer queue size ~ 205. The queue size number will fluctuate as it searches the remote registry and updates the local cache. In my lab, this process took 30+ minutes to complete. You can leave it running through to completion; no user interaction is required. The same process will run each time you start the devpi instance using the devpi-server command; however, it only processes changes since it last ran, so you won't need to recache all packages again. An example of this delta sync is shown in the image below.

Leave the ssh session open so the server instance continues to run; you don’t need to do anything else with it now.

If you installed the optional web interface package, you can now use a browser and go to http://devpi-01a.corp.local:3141 to open the webpage for the server. If the server has started up, you should see a basic devpi webpage. You can browse into the devpi > root tree structure to see the packages being synced into the pypi index from pypi.org. Seeing this index fill as the caching process runs in our ssh session confirms the server is up and running.
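If you skipped the web interface, or want to confirm the server is reachable by its FQDN rather than only on localhost, a TCP-level check is enough. The following is a minimal sketch, not part of devpi; the function name accepts_connections is my own, and the host and port are the lab values used in this post.

```python
import socket


def accepts_connections(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries to connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run from the client VM, accepts_connections("devpi-01a.corp.local", 3141) returning False would suggest the server is not listening on that name or port, which is the kind of misconfiguration I hit in my lab.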

Devpi Client Installation

The installation of the Devpi client is optional and also straightforward.

We start by logging in to the VM with our admin01 user (see lab configuration section) and installing pip, just as we did for our Devpi server:

sudo apt install python3-pip

Next, we install the devpi-client package using the command:

sudo pip3 install -U devpi-client

The Devpi client is now installed. To configure it to use our Devpi server instance, we run the command:

devpi use http://devpi-01a.corp.local:3141

Devpi Sub-registry Configuration

To be able to use Devpi to host private packages, we will create a new sub-registry on our Devpi server. We will do this so that the NSX SDK packages can be uploaded and hosted locally without needing to be uploaded to pypi. The registry root/pypi created when we started our Devpi instance is read-only; we cannot upload custom packages to it locally; they can only be pulled from pypi.org.

Before we create the sub-registry, we will create a local user account that can be associated with it. To do this, we run the command:

devpi user -c testuser password=password123

This creates a user within devpi named testuser with the password set to password123.

Next, we log in with this new user account:

devpi login testuser --password=password123

To create our new sub-registry, we run the following command:

devpi index -c dev bases=root/pypi

This creates the new registry index named dev, with the base registry of root/pypi. With this base registry, devpi serves packages from the dev registry first and falls back to the root/pypi registry when a package is not found in dev. This reduces our overhead, as we do not need to copy the public packages we depend on into the new dev registry as well as the pypi registry. If you now browse the devpi server web page, you should see the new sub-registry listed.
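The fallback from dev to root/pypi can be pictured with a small model. This is only a simplified sketch of the lookup described above, not devpi's actual implementation; the package names placed in each index are illustrative, and find_index is a hypothetical helper.

```python
from typing import Dict, List, Optional, Set

# Illustrative data only: which index holds which package names.
INDEX_CONTENTS: Dict[str, Set[str]] = {
    "testuser/dev": {"nsx-policy-python-sdk", "vapi-runtime"},
    "root/pypi": {"requests", "six"},
}

# Each index lists the bases it falls back to, in order.
INDEX_BASES: Dict[str, List[str]] = {
    "testuser/dev": ["root/pypi"],
    "root/pypi": [],
}


def find_index(package: str, index: str) -> Optional[str]:
    """Return the first index in the inheritance chain that holds `package`."""
    if package in INDEX_CONTENTS.get(index, set()):
        return index
    for base in INDEX_BASES.get(index, []):
        found = find_index(package, base)
        if found is not None:
            return found
    return None
```

In this model, asking testuser/dev for requests is answered by root/pypi, while asking root/pypi for vapi-runtime finds nothing, because inheritance only runs from the sub-registry to its base.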

Finally, we configure our devpi client to use this new dev sub-registry by running the command:

devpi use testuser/dev

This completes the initial configuration of devpi. In the next section, we will focus on getting our NSX SDK packages from GitHub and uploading them to our dev sub-registry in devpi.

NSX SDK Package Upload

As mentioned in the background section, the NSX SDK packages we want to use within Aria Automation Orchestrator are hosted on GitHub. We will start our process of getting them hosted on our devpi server by copying them to our devpi client VM. To do this, we will create a new directory and then clone the GitHub repo by running the commands:

mkdir vsphere-automation-sdk-python
cd vsphere-automation-sdk-python/
git clone https://github.com/vmware/vsphere-automation-sdk-python.git

Once cloned, we navigate into the lib subdirectory of the cloned repo using the commands:

cd vsphere-automation-sdk-python/
cd lib

Now we upload the packages from the SDK we want using the following command as an example:

devpi upload nsx-policy-python-sdk/nsx_policy_python_sdk-4.1.0.1.0-py2.py3-none-any.whl

If you have any problems uploading by specifying the subdirectory as part of the path, you can navigate to the subdirectory first and then specify the .whl file name. As our packages are already in wheel format, as designated by the .whl file extension, we don’t need to perform any build or packaging commands first; we can upload the already processed files.

Repeat the previous command for the nsx-python-sdk, vapi-common-client, and vapi-runtime modules, each of which is in its own subdirectory of the lib directory.
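Rather than typing each upload by hand, the commands can be generated. The sketch below assumes the subdirectory names under lib match the module names mentioned in this post; upload_commands is a hypothetical helper that only builds the argument lists and does not run anything.

```python
from pathlib import Path
from typing import List, Sequence

# Assumption: these subdirectory names exist under lib/ in the cloned repo.
SDK_PACKAGES = (
    "nsx-policy-python-sdk",
    "nsx-python-sdk",
    "vapi-common-client",
    "vapi-runtime",
)


def upload_commands(lib_dir: Path, packages: Sequence[str] = SDK_PACKAGES) -> List[List[str]]:
    """Build one `devpi upload <wheel>` argument list per wheel found."""
    commands: List[List[str]] = []
    for package in packages:
        package_dir = lib_dir / package
        if not package_dir.is_dir():
            continue  # Skip names that are not present in this checkout.
        for wheel in sorted(package_dir.glob("*.whl")):
            commands.append(["devpi", "upload", str(wheel)])
    return commands
```

Each returned list could then be passed to subprocess.run with check=True from the lib directory, once the devpi client is logged in and pointed at the dev sub-registry.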

If you now refresh the web page for the dev sub-registry, you will see the uploaded packages (note that I uploaded extra packages in my lab while working out the minimal set required for my use case).

We can test our new sub-registry and uploaded packages by running a test installation on our devpi client VM using the command:

python3 -m pip --disable-pip-version-check install --extra-index-url http://devpi-01a.corp.local:3141/testuser/dev/+simple/ --trusted-host devpi-01a.corp.local vapi-runtime

Notice we provide the --extra-index-url flag and the --trusted-host flag with our devpi server FQDN. The first tells pip to search our private registry in addition to the public pypi.org registry, so packages that do not exist on pypi.org can still be found. The second tells pip to trust the devpi server even though it is served over plain HTTP rather than HTTPS with a valid certificate.
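Because the trusted host is just the host name portion of the index URL, the two flags can be derived from a single value. This is a minimal sketch; pip_install_args is a hypothetical helper that rebuilds the same command shown above as an argument list.

```python
from typing import List
from urllib.parse import urlparse


def pip_install_args(package: str, extra_index_url: str) -> List[str]:
    """Build pip arguments that add a private index and trust its host."""
    host = urlparse(extra_index_url).hostname
    if host is None:
        raise ValueError(f"no host name in {extra_index_url!r}")
    return [
        "python3", "-m", "pip", "--disable-pip-version-check", "install",
        "--extra-index-url", extra_index_url,
        "--trusted-host", host,
        package,
    ]
```

Calling it with vapi-runtime and http://devpi-01a.corp.local:3141/testuser/dev/+simple/ produces the arguments for the test installation, ready to hand to subprocess.run.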

This completes the setup of the repository for private Python packages. We can now configure the repository into Aria Automation Orchestrator to use the NSX SDK packages in Orchestrator workflows and actions.

Aria Automation Orchestrator Configuration

With our devpi server up and running, the final stage of the configuration is to add a connection from Aria Automation Orchestrator to the devpi server so that it can access the packages we have uploaded. We do this in two stages. First, we configure a new repository to point to our devpi instance, and then we create an Environment within Orchestrator to specify which packages, with their versions and locations, we want to use.

The repository feature in Aria Automation Orchestrator allows us to connect to additional repositories besides pypi.org to download packages. This instructs Orchestrator to use the --extra-index-url flag when downloading packages. It is important to note that Orchestrator will always try to download packages from pypi.org first, even when a custom repository is added and even if there is no internet connectivity. This is because Orchestrator relies on the pip behaviour provided by the --extra-index-url flag, which adds a repository alongside pypi.org rather than replacing it, and this cannot be overridden.

To create a repository, we log in to Aria Automation Orchestrator as a user with administrator rights. Under the Assets section of the menu, we select Environments and then click on the Manage Repositories link next to the NEW ENVIRONMENT button.

Click on the button to add a repository, and then enter the details of our repository. Give the repository a name; I used the FQDN of my devpi server for simplicity. I set the runtime environment to Python 3.10 to match the version of the packages and code I would be using in my action. The location is the URL to be used for the repository, so for my lab, it was http://devpi-01a.corp.local:3141/testuser/dev/+simple/ which is the same URL used when accessing the web interface of the devpi server. The devpi server uses HTTP, so we can leave the authentication box unselected. The image below shows the configuration I used in my lab:

Next, we need to create an Environment that can use our repository. To create the Environment, we use the NEW ENVIRONMENT button on the Environments page. I named it nsx_python to reflect its use of the NSX Python SDK files. Most of the configuration we need to set is on the definition tab. Here we specify the runtime environment we want to use with the Environment, again selecting Python 3.10 (this can be manually overridden at the workflow/action level if needed). I increased the memory limit to 1024MB in my lab; this is optional and depends on the resources your workflows/actions will require. Then we add each of the dependencies we have for our workflows/actions, which in the NSX SDK use case is a mix of our private packages and some publicly available packages. Clicking on the + ADD button, we are presented with a dialog box prompting us for the name of the package, the version and the repository we want to use. The image below shows the values used for the nsx-python-sdk package as an example:

We need to add an entry for each package we want to use with the version we want to use. Once completed for the NSX SDK, we have a list that looks like the image below. Notice how we are including public packages like requests and specifying the version to use, which in some cases is not the most recent version:

The final part of the Environment configuration is to include any Environment Variables we want to use. As our Devpi server is running over HTTP, we will include the trusted hosts variable to add this flag to the pip command when it runs.

After saving the Environment, we can open the Download Logs tab and watch the logs as our packages are downloaded. It will take a few minutes to download and install all listed dependencies.

The configuration of the Environment and Repository is now complete, and it is ready for use within workflows/actions. To use it just select the Environment as the runtime environment inside the action or scripting element of a workflow:

Notes about the Repositories feature

Whilst working on this use case, I encountered a few frustrating or interesting features with repositories and Environments.

In the version of Aria Automation Orchestrator I am using in my lab, there is a misconfiguration in the product where all Warning level events in the download logs are flagged as errors. This is due to be resolved in a later release.
Packages are downloaded only when the Environment is created or the list of Dependencies is amended. Changing the version of the Environment or clicking SAVE without making changes will not refresh the contents. There is also no refresh or update button to force the download of packages again.
All packages are downloaded when a download is initiated within an Environment. There is no option to perform a delta download. For the HOL example I shared in this blog post, we temporarily enabled internet access for Aria Automation Orchestrator to allow the creation of an environment and repository in devpi. Once we had downloaded the packages, we deleted the devpi server to release resources to other VMs. After testing, we noticed a missing package and amended the environment to add the package as a dependency. We expected this to try to download just the missing package, which was available on pypi.org. Instead, it attempted to download all of the listed dependencies and effectively overwrote the existing configuration and downloaded packages, meaning our code then failed.
When using an Environment in Aria Automation Orchestrator, I have found that the configuration is cached and it does not always pick up changes, such as the download of additional packages, automatically. Incrementing the version number of the Environment seems to force an update and resolves the issue. An example of the error seen when this occurs using the NSX SDK packages from this blog post is that Aria Automation Orchestrator reports that the module requests could not be found even though you have seen it successfully downloaded in the Download Logs of the Environment.
When performing the download of dependencies, Orchestrator will mark the status of the Environment as Up To Date even if it experienced errors during the download. For example, the first time I configured my environment using my devpi repository, the devpi server was not accepting connections on its FQDN, only on the localhost URL. I did not know this and configured Aria Automation Orchestrator to use the server. Even though it was unable to connect to devpi and could not download the NSX SDK packages, the status was still marked as Up To Date once the downloads ran out of retries.
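Since the Up To Date status cannot be relied on, one way to check what an Environment really contains is a small diagnostic action that runs inside it. The sketch below assumes the usual handler(context, inputs) entry point for Orchestrator Python actions and a hypothetical input named packages; the default package names are only examples from this post.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def handler(context, inputs):
    # Assumption: the action has an input called "packages" (a list of names).
    wanted = inputs.get("packages", ["requests", "vapi-runtime"])
    report = installed_versions(wanted)
    missing = sorted(name for name, version in report.items() if version is None)
    if missing:
        # Fail loudly so a silently incomplete download is visible.
        raise RuntimeError("Missing from Environment: " + ", ".join(missing))
    return report
```

If the action fails with a missing package that the Download Logs claim was installed, that points to the caching behaviour described above, and incrementing the Environment version is worth trying.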

2 thoughts on “Using private Python repositories with Aria Automation Orchestrator”

Tony October 10, 2023 — 8:50 pm

Really helpful blog post. Did you ever find a way of stopping the constant attempts to access the public pypi repo first, or at least reducing the retry attempts? I'm finding it very frustrating that I have to wait a significant amount of time for packages to fail to be found publicly before being located on our internal repo.

1. (Post author)
  
  Katherine Skilling October 11, 2023 — 9:01 am
  
  No I didn’t find a way to stop it. From what I was told it is the default behaviour of consuming Python repos and not something Aria Automation is in control of. I’m not a Python expert so I didn’t argue with that statement 🙂
