Flux Terraform AMI

Terraform module to create Amazon Machine Images (AMIs) for Flux Framework using HashiCorp Packer and AWS CodeBuild. We are mirroring functionality from GoogleCloudPlatform/scientific-computing-examples. Thank you Google, we love you!

Usage

Build Images with Packer

Let's first go into build-images to use packer to build our images. You'll need to export your AWS credentials in the environment:

export AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
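
Since a build takes a long time, it can help to confirm that both variables are actually set before starting it. The following is a minimal sketch, not part of this repository: `check_aws_env` is a hypothetical helper name, and it only checks that the variables are non-empty, not that the credentials are valid.

```shell
# Hypothetical helper (not part of this repository): return non-zero if
# either of the two credential variables exported above is unset or empty.
check_aws_env() {
  status=0
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ]; then
    echo "AWS_ACCESS_KEY_ID is not set" >&2
    status=1
  fi
  if [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS_SECRET_ACCESS_KEY is not set" >&2
    status=1
  fi
  return "$status"
}

# Example use before a build:
#   check_aws_env && make
```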

You'll first need to install Packer. You can then use the Makefile there to build all (or a select set of) images.

$ cd ./build-images
$ make
# this builds a shared node setup
$ make node

Note that the build takes about 50 minutes (which is why we use an AMI and don't build Flux on the fly!).

==> flux-compute.amazon-ebs.flux-compute: Deleting temporary keypair...
Build 'flux-compute.amazon-ebs.flux-compute' finished after 50 minutes 39 seconds.

==> Wait completed after 50 minutes 39 seconds

==> Builds finished. The artifacts of successful builds are:
--> flux-compute.amazon-ebs.flux-compute: AMIs were created:
us-east-1: ami-0ff535566e7c13e8c

make[1]: Leaving directory '/home/vanessa/Desktop/Code/flux/terraform-ami/build-images/node'
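
If you want to pick the AMI ID out of a saved build log rather than copying it by hand, one option is to match on the `ami-` prefix shown in the output above. This is a sketch under assumptions: `extract_ami_ids` is a hypothetical helper name, `build.log` is a hypothetical file you would create yourself, and the pattern assumes IDs look like the one in the sample output.

```shell
# Hypothetical helper (not part of this repository): read Packer output on
# stdin and print each distinct AMI ID it mentions, one per line.
extract_ami_ids() {
  grep -oE 'ami-[0-9a-f]+' | sort -u
}

# Example use, assuming the build output was saved to build.log:
#   make node 2>&1 | tee build.log
#   extract_ami_ids < build.log
```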

A previous design (building separate images for login, compute, and manager) was started but not finished, in favor of the simpler design. It's included in build-images/multi for those interested.

Deploy with Terraform

Once you have images, choose a directory under examples to deploy from:

$ cd examples/autoscale

And then init and build:

$ make init
$ make fmt
$ make validate
$ make build

All of these steps can also be run at once with a plain make:

$ make

You can then shell into any node and check the status of Flux. I usually grab the instance name via "Connect" in the AWS console, but you could likely use the AWS CLI for this too.

$ ssh -o 'IdentitiesOnly yes' -i "mykey.pem" rocky@ec2-xx-xxx-xx-xxx.compute-1.amazonaws.com
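
As a rough sketch of the command line alternative mentioned above, the AWS CLI can print the public DNS names of running instances. `list_running_hosts` is a hypothetical helper name. It assumes the `aws` command is installed and a default region is configured, and it lists every running instance in that region, not only the Flux nodes, so you may want to add your own filters.

```shell
# Hypothetical helper (not part of this repository): print the public DNS
# names of all running EC2 instances in the configured region.
list_running_hosts() {
  aws ec2 describe-instances \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].PublicDnsName" \
    --output text
}

# Example use: pick one of the printed names as the ssh target above.
#   list_running_hosts
```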

Check the cluster status, the overlay status, and try running a job:

$ flux resource list
     STATE NNODES   NCORES NODELIST
      free      2        2 i-012fe4a110e14da1b.ec2.internal,i-0354d878a3fd6b017.ec2.internal
 allocated      0        0 
      down      0        0 
[rocky@i-012fe4a110e14da1b ~]$ flux run -N 2 hostname
i-012fe4a110e14da1b.ec2.internal
i-0354d878a3fd6b017.ec2.internal

You can look at the startup script logs like this if you need to debug.

$ cat /var/log/cloud-init-output.log
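
The log can be long, so when debugging it may be quicker to show only the lines that mention an error or failure. This is a sketch, not something the repository provides: `scan_startup_log` is a hypothetical helper name, and the search terms are a guess at what is worth looking for.

```shell
# Hypothetical helper (not part of this repository): print lines that mention
# "error" or "fail" (case-insensitive), prefixed with their line numbers.
# Defaults to the log path used above; pass another file to override it.
# Returns non-zero when nothing matches, like grep itself.
scan_startup_log() {
  log_file="${1:-/var/log/cloud-init-output.log}"
  grep -inE 'error|fail' "$log_file"
}

# Example use on a node:
#   scan_startup_log
```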

That's it. Enjoy!

License

HPCIC DevTools is distributed under the terms of the MIT license. All new contributions must be made under this license.

See LICENSE, COPYRIGHT, and NOTICE for details.

SPDX-License-Identifier: (MIT)

LLNL-CODE-842614