@ChristinaLK ChristinaLK commented Apr 29, 2025

@ChristinaLK

I ALSO include Andrew's new tool, so we should, uh, make sure that's live before we merge.

Not sure how this fits in this container PR. A leftover file from a previous branch?


This guide describes the general process for creating an Apptainer container.
Specifically, we discuss the components of the "definition file" and how that file is used to construct or "build" the container itself.
Specifically, we discuss the components of the "definition file" and how that file is used to construct or "build" the container itself. For a more step-by-step description of

Suggested change
Specifically, we discuss the components of the "definition file" and how that file is used to construct or "build" the container itself. For a more step-by-step description of
Specifically, we discuss the components of the "definition file" and how that file is used to construct or "build" the container itself. **Use this guide as a reference to help customize the contents of your container.** For a step-by-step tutorial of

add `chmod 777 /tmp` in guide
condor_submit -i build.sub
```
{:.term}
TBD: graphic

We'll need a graphic here before we merge.

1. **Start an interactive job for building.** We require that you build containers while in an interactive build job.
1. **Build the container.** To build a container, Apptainer uses the instructions in the `.def` file to create a `.sif` file. The `.sif` file is the compressed collection of all the files that comprise the container.
1. **(Optional): Test the container.** Once the image (`.sif` file) is created, it is important to test it to make sure you have all software, packages, and libraries installed correctly.
1. **Move the container to a persistent location.** We recommend placing the image file into your `/staging` folder

Suggested change
1. **Move the container to a persistent location.** We recommend placing the image file into your `/staging` folder
1. **Move the container to a persistent location.** We recommend placing the image file into your `/staging` folder.
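
The build, test, and move steps above could be sketched as one shell function to run inside the interactive build job. This is only an illustration: the function name `build_and_stage`, the smoke-test command, and the destination directory are assumptions, not part of the guide under review.

```shell
# Hypothetical helper: build a .sif from a .def, smoke-test it, then move it.
# Usage: build_and_stage <definition file> <destination directory>
build_and_stage() {
    local def_file="$1"
    local dest_dir="$2"
    # image.def -> image.sif
    local sif_file="${def_file%.def}.sif"

    # Step 2: build the container image from the definition file.
    apptainer build "$sif_file" "$def_file" || return 1

    # Step 3 (optional): run a trivial command inside the image as a smoke test.
    apptainer exec "$sif_file" cat /etc/os-release || return 1

    # Step 4: move the image to a persistent location.
    mv "$sif_file" "$dest_dir/"
}
```

For example, `build_and_stage image.def /staging/yournetid`, where `yournetid` is a placeholder for your own `/staging` folder.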


```
exit
chtc-submit-apptainer-build -build image.def

This command currently does not exist in $PATH

This is my winter break project, I hope.

For now, we should provide a build.sub file.
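
For reference, a minimal sketch of what a stopgap `build.sub` could contain, written as a shell heredoc so it can be pasted into a terminal. The resource values, the log file name, and the `+IsBuildJob` attribute are assumptions for illustration and have not been confirmed in this thread.

```shell
# Write an illustrative build.sub into the current directory.
cat > build.sub <<'EOF'
# build.sub
# Submit file for an interactive Apptainer build job (illustrative values)

universe = vanilla
log = build.log

# Definition file to transfer into the interactive job
transfer_input_files = image.def

# Assumed: attribute that marks this as a build job
+IsBuildJob = true

request_cpus = 4
request_memory = 16GB
# Increase this if the build runs out of disk space
request_disk = 16GB

queue
EOF
```

The interactive job would then be started with `condor_submit -i build.sub`, as in the diff above.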

{:.tip-header}
>
> Apptainer `.sif` files can be fairly large, especially if you have a complex software stack.
> If your interactive job abruptly fails during the build step, you may need to increase the value of `request_disk` in the submit file generated by `chtc-submit-apptainer-build`

Suggested change
> If your interactive job abruptly fails during the build step, you may need to increase the value of `request_disk` in the submit file generated by `chtc-submit-apptainer-build`
> If your interactive job abruptly fails during the build step, you may need to increase the value of `request_disk` in the submit file generated by `chtc-submit-apptainer-build`.

Comment on lines +189 to +195
> If the build command fails, examine the output for error messages that may explain why the build was unsuccessful.
Typically there is an issue with a package installation, such as a typo or a missing but required dependency.
Sometimes there will be an error during an earlier package installation that doesn't immediately cause the container build to fail.
But, when you test the container, you may notice an issue with the package.
>
> If you are having trouble finding the error message, edit the definition file and remove (or comment out) the installation commands that come after the package in question.
Then rebuild the image, and now the relevant error messages should be near the end of the build output.

Suggested change
> If the build command fails, examine the output for error messages that may explain why the build was unsuccessful.
Typically there is an issue with a package installation, such as a typo or a missing but required dependency.
Sometimes there will be an error during an earlier package installation that doesn't immediately cause the container build to fail.
But, when you test the container, you may notice an issue with the package.
>
> If you are having trouble finding the error message, edit the definition file and remove (or comment out) the installation commands that come after the package in question.
Then rebuild the image, and now the relevant error messages should be near the end of the build output.
> If the build command fails, examine the output for error messages that may explain why the build was unsuccessful.
> Common errors include:
> * Typos
> * Missing dependencies
> Search the internet for the error message as a starting point for troubleshooting software installation. You can also [reach out](get-help) to us for help.

1. Follow the instructions in [our guide](conda-installation.html#option-b-create-your-own-portable-copy)
Create your own portable copy of your Conda packages:

> This approach may be sensitive to the operating system of the execution point.

Suggested change
> This approach may be sensitive to the operating system of the execution point.
> This approach is sensitive to the operating system of the execution point.

I suggest deleting the "More information" section (up to the "executable" section)

Honestly this page is super long... may need a trim in another PR later

highlighter: none
layout: guide
title: Running HTC Jobs Using Docker Containers
title: Use Custom Software in Jobs Using Docker

I think the previous title is more informative - the changed title is vague and could possibly confuse a reader into thinking this guide is about building Docker containers

@xamberl xamberl left a comment

The general structure LGTM! There are a couple of comments/suggestions I added, but I think it should be good to merge. We can then go in with a fine-tooth comb on each software-related page.

@aowen-uwmad aowen-uwmad self-assigned this Dec 4, 2025

@aowen-uwmad aowen-uwmad left a comment

Overall, I like the changes you've made. Once the specific comments in my and Amber's reviews are addressed, this is good to go.

apt install -y gcc make wget
```

> The `chmod 777 /tmp` is a specific workaround for building containers on the HTC system. Do not use this line if you are using Apptainer to build containers on a different system.

Suggested change
> The `chmod 777 /tmp` is a specific workaround for building containers on the HTC system. Do not use this line if you are using Apptainer to build containers on a different system.
> The `chmod 777 /tmp` is a specific workaround for building containers on the HTC system at CHTC. **Do not use this** line if you are using Apptainer to build containers **on a different system**.
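
To show where that line sits, here is a minimal sketch of a definition file containing the workaround, written as a shell heredoc. The base image (`ubuntu:22.04`) and the file name `image.def` are assumptions; only the `chmod 777 /tmp` line and the `apt install` line come from the diff above.

```shell
# Write an illustrative definition file into the current directory.
cat > image.def <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Workaround specific to building on the HTC system at CHTC;
    # omit this line when building on a different system.
    chmod 777 /tmp

    apt update -y
    apt install -y gcc make wget
EOF
```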


```
exit
chtc-submit-apptainer-build -build image.def
This is my winter break project, I hope.

For now, we should provide a build.sub file.

Comment on lines +181 to +184
> Apptainer `.sif` files can be fairly large, especially if you have a complex software stack.
> If your interactive job abruptly fails during the build step, you may need to increase the value of `request_disk` in the submit file generated by `chtc-submit-apptainer-build`
> In this case, the `.log` file should have a message about the reason the interactive job was interrupted.
{:.tip}

Suggested change
> Apptainer `.sif` files can be fairly large, especially if you have a complex software stack.
> If your interactive job abruptly fails during the build step, you may need to increase the value of `request_disk` in the submit file generated by `chtc-submit-apptainer-build`
> In this case, the `.log` file should have a message about the reason the interactive job was interrupted.
{:.tip}
> Your interactive job may fail abruptly during the build step.
> The most common reason is that your interactive job ran out of disk space.
> **Try increasing the value of `request_disk` in the submit file.**
>
> You can see the exact reason the job was interrupted by running `condor_q -hold` or by looking in the `.log`.
{:.tip}

To run a job using a Docker container, you will need access to a Docker container
image that has been built and placed onto the
[DockerHub](https://hub.docker.com/) website or another Docker registry service
like [https://quay.io/] or a GitLab registry.

Suggested change
like [https://quay.io/] or a GitLab registry.
like [https://quay.io/](https://quay.io/) or a GitLab registry.

## Quickstart: Python

### Option A (recommended)
### Option A - Own Container (recommended)

Suggested change
### Option A - Own Container (recommended)
### Option A - Your Container (recommended)

Doesn't match the bit in the R card.

1. Find an existing container
1. Build your own container image

If building your own container, the process will include:

Suggested change
If building your own container, the process will include:
#### Find an existing container image
TBD: guidance on finding useful, trusted container images online.
#### Build your own container image
To build your own container, the process will include:

@aowen-uwmad

Once the updated guides are live on the website, I think we should ask Patricia Tran to do a once-over to look for any additional improvements. But this has been lingering long enough that it shouldn't be a blocker.
