Build a complete CI/CD Pipeline and its infrastructure with AWS — Jenkins — Bitbucket — Docker — Terraform → Part 5

Kevin De Notariis
Jun 11, 2021 · 15 min read
Technologies used: Terraform, Bitbucket, Docker, AWS, Jenkins

In this fifth part of the tutorial, we are going to implement the user data of both the Web App instance and the Jenkins instance. For the Jenkins instance, this will require creating various scripts which will be uploaded to the correct S3 bucket and then pulled down and run from the user data, to avoid the 16k character limit on user data. For more information regarding this topic, you can check this article.

Part 1 (here)→ Set up the project by downloading and looking at the Web App which will be used to test our infrastructure and pipeline. Here we also create & test suitable Dockerfiles for our project and upload everything to Bitbucket.

Part 2 (here)→ Set up Slack and create a Bot which will be used by Jenkins to send notifications on the progression/status of the pipeline.

Part 3 (here)→ Create the first part of the AWS Infrastructure with Terraform. Here we will create the EC2 instances / SSH keys and the actual Network infrastructure plus the basis for the IAM roles.

Part 4 (here)→ Create the second part of the AWS Infrastructure with Terraform. We are going to create the S3 buckets, the ECR repositories and complete the definition of the IAM roles by adding the correct policies.

Part 5 (current article) → Complete the configuration of the Jenkins and Web App instances by implementing the correct user data.

Part 6 (here) → Implement the Pipeline in a Jenkinsfile, try out the pipeline, see how everything fits together and lay down some final comments.

Let’s get started!

Web App User Data

We already created a user_data.sh in Terraform/application-server/ . Let’s open it in VSCode.

The Web App instance needs to pull down the image from the ECR repository (its URL will be in the variable repository-url ) and run it.

So, in that user_data.sh we are going to put the following:
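Roughly, it will look something like this (a sketch, reconstructed from the steps described below — the heredoc is just one way to write the start-website file, and ${repository_url} is the variable passed in from Terraform):

#!/bin/bash
sudo yum update -y

# Install, start and enable Docker (Amazon Linux ships it via amazon-linux-extras)
sudo amazon-linux-extras install docker -y
sudo service docker start
sudo systemctl enable docker

# Create the start-website script: log into ECR, pull the image and run it
sudo cat <<EOF > start-website
#!/bin/bash
/bin/sh -e -c 'echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}'
sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release
EOF

# Place it among the per-boot scripts, mark it as executable and run it once now
sudo mv start-website /var/lib/cloud/scripts/per-boot/start-website
sudo chmod +x /var/lib/cloud/scripts/per-boot/start-website
sudo /var/lib/cloud/scripts/per-boot/start-website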

Let’s analyze it:

  • First we yum update (remember that we are on an Amazon Linux machine, which is based on a CentOS-like distribution, so we do not have apt but we do have yum );
  • Then we install Docker, which on these Amazon Linux instances comes from amazon-linux-extras .
  • Afterwards we start Docker and enable it so that it will start automatically on reboot.
  • The next part is the core functionality of the script. What does it do? It creates a file start-website where it will put the following code:
/bin/sh -e -c 'echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}'
sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release

Now, the first line is an elaborate way to log into the AWS ECR repository without writing the password to stdout. This is a security precaution: it avoids leaving the password in history logs. The important command is the following:

echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}

the \$(aws ecr get-login-password --region us-east-1) will retrieve the password to log into the ECR. Since we are in an Amazon Linux instance, we already have the AWS CLI available, with the instance profile credentials already set up. After retrieving this password, we pipe it into docker login -u AWS --password-stdin , where --password-stdin tells the login command to take the password from stdin (namely the value we are piping), and with ${repository_url} we specify the repository to log into.

The next two lines:

sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release

will pull the image ${repository_url}:release from the repository and run a container from it, mapping the container’s internal port 8000 to port 80 of the instance. This will start the Web App and serve it on port 80 of this instance.

  • Once we have created this file, we need to move it to the folder dedicated to scripts that need to run at every boot. This folder (managed by cloud-init on these AWS instances) is located at /var/lib/cloud/scripts/per-boot/ .
  • After that, we mark that script as an executable.
  • Finally we run it once manually, since the per-boot scripts are only executed at boot time and we would otherwise have to reboot the freshly created instance for the Web App to start.

Jenkins User Data

The Jenkins user data will be much more elaborate. If you have never used Jenkins it may be hard to understand the reasons for some choices; I will however try to explain everything as clearly as possible.

Let’s modify the jenkins-server/main.tf , and in the user_data we are going to put the following:

user_data = templatefile(
  "${path.module}/user_data.sh",
  {
    repository_url         = var.repository-url,
    repository_test_url    = var.repository-test-url,
    repository_staging_url = var.repository-staging-url,
    instance_id            = var.instance-id,
    bucket_logs_name       = var.bucket-logs-name,
    public_dns             = var.public-dns,
    admin_username         = var.admin-username,
    admin_password         = var.admin-password,
    admin_fullname         = var.admin-fullname,
    admin_email            = var.admin-email,
    remote_repo            = var.remote-repo,
    job_name               = var.job-name,
    job_id                 = var.job-id,
    bucket_config_name     = var.bucket-config-name
  }
)

We are rendering a user_data.sh template file (which we are going to create soon) and passing it all these variables. The jenkins-server/main.tf should look like:

Alright, in jenkins-server/ let’s create the user_data.sh file which we are going to populate below.

creating the user_data.sh file in jenkins-server

Now, the first part of the user_data.sh will look like:
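Something along these lines (a sketch — the Jenkins repository URL, the JDK package and the Arachni release are the ones I would expect on Amazon Linux 2 at the time of writing, so double-check them if you follow along later):

#!/bin/bash
sudo yum update -y
sudo yum install -y git

# Install Jenkins (CentOS/Red Hat style repository) and the JDK it needs
sudo wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
sudo yum install -y java-1.8.0-openjdk-devel jenkins

# Docker, as in the Web App user data
sudo amazon-linux-extras install docker -y

# Start and enable both services
sudo systemctl start jenkins && sudo systemctl enable jenkins
sudo systemctl start docker && sudo systemctl enable docker

# Allow ec2-user and jenkins to talk to the Docker daemon
sudo usermod -a -G docker ec2-user
sudo usermod -a -G docker jenkins

# Folder for pipeline helpers inside the Jenkins home
sudo mkdir /var/lib/jenkins/opt
sudo chown jenkins:jenkins /var/lib/jenkins/opt

# Arachni, used later in the pipeline for security scans
wget https://github.com/Arachni/arachni/releases/download/v1.5.1/arachni-1.5.1-0.5.12-linux-x86_64.tar.gz
tar -xzf arachni-1.5.1-0.5.12-linux-x86_64.tar.gz
sudo mv arachni-1.5.1-0.5.12 /var/lib/jenkins/opt/arachni

# Persist the variables the pipeline will need (values come from templatefile)
echo "${repository_url}" | sudo tee /var/lib/jenkins/opt/repository_url
echo "${repository_test_url}" | sudo tee /var/lib/jenkins/opt/repository_test_url
echo "${repository_staging_url}" | sudo tee /var/lib/jenkins/opt/repository_staging_url
echo "${instance_id}" | sudo tee /var/lib/jenkins/opt/instance_id
echo "${bucket_logs_name}" | sudo tee /var/lib/jenkins/opt/bucket_logs_name

sudo chown -R jenkins:jenkins /var/lib/jenkins/opt
sleep 60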

Okay, that’s scary! Let’s break it down:

  • Firstly we yum update and then we install git . The -y option allows the installation to proceed without prompting the user (since there will be no user to prompt).
  • Afterwards we install Jenkins. Since Jenkins is built on Java, we also install the JDK. The commands we run to install Jenkins are suitable for CentOS-style distributions; if you are hosting Jenkins on another AMI you need to use the correct syntax (you can check it here ).
  • As in the Web App user data, we install Docker from amazon-linux-extras .
  • Then we start and enable both Jenkins and Docker.
  • After that, we need to make sure that Jenkins and the ec2-user are allowed to use Docker. For that reason we use the command usermod -a -G docker ec2-user and usermod -a -G docker jenkins .
  • Below we create a folder opt in the jenkins $HOME , namely in /var/lib/jenkins and we change the ownership and group to give it to the jenkins user. This folder will contain useful files and programs that will be used in the Pipeline.
  • Then we download, extract and move Arachni which will be used in the pipeline to make some security checks.
  • In order to have available some variables needed in the pipeline, we save them out to files which we place in the opt folder we created before. These variables are: repository_url , repository_test_url , repository_staging_url , instance_id and bucket_logs_name , and they need to be saved in the Jenkins instance since the pipeline needs them to work properly.
  • Finally we change the ownership and group of all the new files in the opt directory. This is necessary since the operations we are doing in this script are made by the super user and the Jenkins user may not have access to them if we do not change these flags. At the end we sleep for 60 seconds to make sure that everything is set up and ready for the next stage.

Below that code we will have the second part:
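Roughly (a sketch — the exact set of exported variables and the execution order follow the description below):

# Export the variables needed by the configuration scripts (values from templatefile)
export admin_username='${admin_username}'
export admin_password='${admin_password}'
export admin_fullname='${admin_fullname}'
export admin_email='${admin_email}'
export public_dns='${public_dns}'
export remote_repo='${remote_repo}'
export job_name='${job_name}'
export job_id='${job_id}'

# Pull the configuration scripts down from the S3 bucket and mark them as executable
sudo aws s3 cp s3://${bucket_config_name}/ ./ --recursive
sudo chmod +x ./*.sh

# Run them in order; the sleep gives Jenkins time to install the plugins
./create_admin_user.sh
./download_install_plugins.sh
sudo sleep 120
./confirm_url.sh
./create_credentials.sh
python -c "import sys;import json;print(json.loads(raw_input())['credentials'][0]['id'])" <<< $(./get_credentials_id.sh) > credentials_id
./create_multibranch_pipeline.sh

# Clean up and reboot
sudo rm ./*.sh credentials_id
sudo reboot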

  • Since we will run some scripts which require some variables, we export all the necessary variables. In this way, they will be globally available to every subshell.
  • After that, we download the script files from the S3 bucket. The command sudo aws s3 cp s3://${bucket_config_name}/ ./ --recursive will take every object in the bucket named ${bucket_config_name} and put it in the ‘current’ directory ./ (which I believe will be the root directory / , but we actually don’t care). Then, to be able to run them, we mark all of them as executable.
  • Then we execute them! Between the plugin-download script and the confirm script there is a sudo sleep 120 . This gives Jenkins time to download and install the plugins before going on with the other scripts.
  • Now, the script create_credentials.sh will create the Jenkins credentials needed to access Bitbucket (we will store there the Bitbucket private key). These credentials will have an ID which is needed in the next script, create_multibranch_pipeline.sh . In order to retrieve this ID we make use of the script get_credentials_id.sh . This script will make a GET request to the Jenkins API, which will return a JSON with the credentials. The code python -c "import sys;import json;print(json.loads(......)) > credentials_id is used to parse that JSON, extract the ID and output it to a file credentials_id , so that create_multibranch_pipeline.sh can easily read it to get the ID.
  • At the end, we remove all these configuration files (and the credentials_id ) and reboot the instance.

Create Jenkins Configuration Files

Since in the above user_data we reference a bunch of scripts, we need to also create them. As we mentioned before, these scripts need to actually be uploaded to the S3 bucket <first_name><last_name>-jenkins-config .

In the Terraform root directory create a jenkins-config folder:

screen of the Windows terminal running commands

And create the following 6 scripts:

  • confirm_url.sh
  • create_admin_user.sh
  • create_credentials.sh
  • create_multibranch_pipeline.sh
  • download_install_plugins.sh
  • get_credentials_id.sh
creating the confirm_url.sh, create_admin_user.sh, create_credentials.sh, create_multibranch_pipeline.sh, download_install_plugins.sh and get_credentials_id.sh files

Now we need to upload all these files to the jenkins-config bucket. In order to do that, we add a new resource of type aws_s3_bucket_object which will upload all files in the folder jenkins-config to that bucket. In the s3.tf file, add the following resource:
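Something like the following should do (a sketch — I am assuming the bucket was declared in s3.tf as aws_s3_bucket.jenkins-config in the previous part, so adapt the reference to your actual resource name):

resource "aws_s3_bucket_object" "jenkins-config" {
  for_each = fileset("jenkins-config/", "*.sh")

  bucket = aws_s3_bucket.jenkins-config.id
  key    = each.value
  source = "jenkins-config/${each.value}"
  etag   = filemd5("jenkins-config/${each.value}")
}

The for_each over fileset makes Terraform upload every .sh file in the folder, and the etag ensures the objects are re-uploaded whenever a script changes.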

Awesome, the last thing left to do is to implement the scripts.

Part of the code that we are going to write here can be found in this article.

create_admin_user.sh

This script will create the Jenkins admin user with the credentials that we put earlier in terraform.tfvars . All these scripts will be quite messy, since we have to deal with CSRF protection tokens and the data we are actually POSTing will be URL-encoded. To summarize, this is what we are doing in this script (a sketch of the full script follows the list):

  • Take the default initial password that Jenkins provides us and store it in the variable old_password ;
  • Create URL encoded versions of some variables that we need to pass in the POST request body;
  • Request a CSRF crumb token and store the corresponding Cookie in the cookie_jar temporary file.
  • Make the POST request to setupWizard/createAdminUser , noting the --cookie $cookie_jar and the -H "$full_crumb" where we specify the cookie and the crumb taken from the previous step. We also make the POST request with -u "admin:$old_password" , authenticating ourselves with the initial password furnished by Jenkins.
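Putting it together, the script might look roughly like this (a sketch — the URL-encoding helper uses Python 2’s urllib, and the request body mirrors the decoded form shown below, leaving out the duplicated json parameter and the $redact field):

#!/bin/bash
jenkins_url="http://localhost:8080"

# Jenkins' initial admin password, generated at first start
old_password=$(sudo cat /var/lib/jenkins/secrets/initialAdminPassword)

# URL-encode the values we will send in the form body
username_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$admin_username")
password_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$admin_password")
fullname_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$admin_fullname")
email_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$admin_email")

# Grab a CSRF crumb and keep the session cookie that comes with it
cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "admin:$old_password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# Create the admin user
curl -s -X POST -u "admin:$old_password" "$jenkins_url/setupWizard/createAdminUser" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  --data "username=$username_URLEncoded&password1=$password_URLEncoded&password2=$password_URLEncoded&fullname=$fullname_URLEncoded&email=$email_URLEncoded&Jenkins-Crumb=$only_crumb&core:apply=&Submit=Save" \
  --data-urlencode "json={\"username\": \"$admin_username\", \"password1\": \"$admin_password\", \"password2\": \"$admin_password\", \"fullname\": \"$admin_fullname\", \"email\": \"$admin_email\", \"Jenkins-Crumb\": \"$only_crumb\"}"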

If we URL-decode the --data-raw we see that it contains the following:

username=$username_URLEncoded&password1=$password_URLEncoded&password2=$password_URLEncoded&fullname=$fullname_URLEncoded&email=$email_URLEncoded&Jenkins-Crumb=$only_crumb&json={"username": "$username_URLEncoded", "password1": "$password_URLEncoded", "$redact": ["password1", "password2"], "password2": "$password_URLEncoded", "fullname": "$fullname_URLEncoded", "email": "$email_URLEncoded", "Jenkins-Crumb": "$only_crumb"}&core:apply=&Submit=Save&json={"username": "$username_URLEncoded", "password1": "$password_URLEncoded", "$redact": ["password1", "password2"], "password2": "$password_URLEncoded", "fullname": "$fullname_URLEncoded", "email": "$email_URLEncoded", "Jenkins-Crumb": "$only_crumb"}

Namely we provide a bunch of information to the Jenkins API:

  • Username
  • Password (new one)
  • Fullname
  • email
  • Crumb Token

With this script, our Jenkins admin account will be created.

download_install_plugins.sh

In each script we will grab a new CSRF crumb token. In principle we could save that token and reuse it in all the subsequent requests; however, I prefer to make the scripts independent of each other and repeat the token-grabbing part. Note that here we use -u "$user:$password" to make the request, using the new user and password we created with the previous script.

In the POST request we see that we specify some plugins to install. The defaults would be:

'plugins':['cloudbees-folder','antisamy-markup-formatter','build-timeout','credentials-binding','timestamper','ws-cleanup','ant','gradle','workflow-aggregator','github-branch-source','pipeline-github-lib','pipeline-stage-view','git','ssh-slaves','matrix-auth','pam-auth','ldap','email-ext','mailer']

However, for our use case, we also install the following plugins (a sketch of the full script follows this list):

  • Bitbucket → To allow us to correctly have a webhook which will be triggered by Bitbucket once a push on the remote repo has been performed.
  • Docker-Workflow → Which will allow us to use some docker commands in the Jenkinsfile of our Multibranch pipeline.
  • Blueocean → Which will provide us with a beautiful UI design.
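For reference, a sketch of what this script can look like. The /pluginManager/installPlugins endpoint and the JSON body shape are the ones I would expect the setup wizard itself to send, so treat them as an assumption and verify the request in your browser’s developer tools:

#!/bin/bash
jenkins_url="http://localhost:8080"
user="$admin_username"
password="$admin_password"

# Grab a fresh CSRF crumb, this time authenticating with the new admin user
cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "$user:$password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# Ask Jenkins to install the default suggested plugins plus bitbucket,
# docker-workflow and blueocean
curl -s -X POST -u "$user:$password" "$jenkins_url/pluginManager/installPlugins" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  -H "Content-Type: application/json" \
  --data-raw '{"dynamicLoad":true,"plugins":["cloudbees-folder","antisamy-markup-formatter","build-timeout","credentials-binding","timestamper","ws-cleanup","ant","gradle","workflow-aggregator","github-branch-source","pipeline-github-lib","pipeline-stage-view","git","ssh-slaves","matrix-auth","pam-auth","ldap","email-ext","mailer","bitbucket","docker-workflow","blueocean"]}'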

confirm_url.sh

First thing here, we URL-encode the public DNS name of the Jenkins instance (stored in the $url variable), which will be the one we use to access the Jenkins server from the Web. Then we grab the cookie and token and make the POST request to /setupWizard/configureInstance . In the body of the request, we pass the rootUrl as the DNS name mentioned before, along with the crumb.
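A rough sketch of this script (the real request body, captured from the browser, may carry more fields; here I keep only the essential ones):

#!/bin/bash
jenkins_url="http://localhost:8080"

# The public DNS name of the instance becomes the Jenkins root URL
url="http://$public_dns:8080/"
url_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$url")

cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# Confirm the instance root URL
curl -s -X POST -u "$admin_username:$admin_password" "$jenkins_url/setupWizard/configureInstance" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  --data "rootUrl=$url_URLEncoded&Jenkins-Crumb=$only_crumb" \
  --data-urlencode "json={\"rootUrl\": \"$url\", \"Jenkins-Crumb\": \"$only_crumb\"}"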

create_credentials.sh

At this point, we need to ‘create’ the credentials to allow Jenkins to access the Bitbucket repo. We will create Jenkins credentials storing the SSH private key pulled down from the AWS Secrets Manager.

This script will look like:
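Roughly, something like this (a sketch — the JSON body uses the standard SSH Credentials plugin class names, and I assume the secret was stored as a JSON object containing a private key; double-check the exact form your Jenkins version expects):

#!/bin/bash
jenkins_url="http://localhost:8080"

# Pull the SSH private key out of AWS Secrets Manager and write it to a temp file
python -c "import sys;import json;print(json.loads(json.loads(raw_input())['SecretString'])['private'])" <<< $(aws secretsmanager get-secret-value --secret-id simple-web-app --region us-east-1) > ssh_tmp

# Turn the multi-line key into a single line with literal \n separators
ssh_private_key=$(awk -v ORS='\\n' '1' ssh_tmp)

# Usual crumb + cookie dance
cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# Store the key as a global SSH credential with id "Git"
curl -s -X POST -u "$admin_username:$admin_password" "$jenkins_url/credentials/store/system/domain/_/createCredentials" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  --data-urlencode "json={\"\": \"0\", \"credentials\": {\"scope\": \"GLOBAL\", \"id\": \"Git\", \"username\": \"git\", \"description\": \"Bitbucket SSH key\", \"privateKeySource\": {\"stapler-class\": \"com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey\$DirectEntryPrivateKeySource\", \"privateKey\": \"$ssh_private_key\"}, \"stapler-class\": \"com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey\", \"\$class\": \"com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey\"}}"

rm ssh_tmp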

  • The first command will take the secret from AWS Secrets Manager, parse it with a Python one-liner and output it to an auxiliary file ssh_tmp . In particular, the command $(aws secretsmanager get-secret-value --secret-id simple-web-app --region us-east-1) retrieves the simple-web-app secret and pipes it into the Python snippet import sys;import json;print(json.loads(json.loads(raw_input())['SecretString'])['private']) , which parses the JSON returned by the AWS API call and extracts the private key. The key extracted that way is then put in the ssh_tmp file.
  • With the line ssh_private_key=$(awk -v ORS='\\n' '1' ssh_tmp) we replace the key’s newlines with literal \n sequences (so it can be embedded in the JSON body) and place everything in the variable ssh_private_key .
  • Afterwards, as always, we retrieve the crumb and the cookie.
  • Finally we make the POST request to /credentials/store/system/domain/_/createCredentials . There are various ways to save credentials in Jenkins (you can look here) and in our case we are defining ours to be global. We name them Git and we provide the ssh_private_key variable content to the privateKey field in the json body.

Cool! In order to use them, namely to tell the multibranch pipeline which credentials it needs to use, we must extract the ID that jenkins generated for these credentials. This is done by the next script.

get_credentials_id.sh

As always we grab the crumb and the cookie, and then we make the GET request to /credentials/store/system/domain/_/api/json?tree=credentials[id] .
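A minimal sketch of this script (it just prints the JSON so the caller can parse it):

#!/bin/bash
jenkins_url="http://localhost:8080"

cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# -g disables curl's URL globbing so the [id] part is sent as-is
curl -s -g -u "$admin_username:$admin_password" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  "$jenkins_url/credentials/store/system/domain/_/api/json?tree=credentials[id]"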

This script will return a JSON with a given structure. This JSON is parsed by the code in /jenkins-server/user_data.sh (before the ./create_multibranch_pipeline.sh line):

python -c "import sys;import json;print(json.loads(raw_input())['credentials'][0]['id'])" <<< $(./get_credentials_id.sh) > credentials_id

This python command will extract the ID from the JSON coming out from the GET request above and then write that ID in the credentials_id file, which will be used by the next script.

create_multibranch_pipeline.sh

  • First we grab the ID of the credentials stored in the credentials_id file.
  • We URL-encode the Job name and the remote URL of the bitbucket repo.
  • Grab the token and the cookie.
  • Create the Job with a POST request to /createItem passing as the name of the Job the content of the variable jobName . We also specify that it needs to be a multibranch pipeline.
  • Finally we configure the actual Pipeline by making a POST request to /job/$jobName_URLEncoded/configSubmit . In the raw data we specify the remote repository and the credentials using our variables. We also furnish a required Job ID (in the variable jobID ) that we previously generated using the Terraform resource random_id in the random.tf file. A partial sketch follows.
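A partial sketch of this script: it shows the createItem call, while the configSubmit request is best captured from the browser’s developer tools as described below (the WorkflowMultiBranchProject mode class is the standard one for multibranch pipelines):

#!/bin/bash
jenkins_url="http://localhost:8080"

# Credentials ID produced by get_credentials_id.sh (written by the user data)
credentials_id=$(cat credentials_id)

# URL-encode the job name and the Bitbucket remote
jobName_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$job_name")
remote_URLEncoded=$(python -c "import urllib;print(urllib.quote(raw_input()))" <<< "$remote_repo")

cookie_jar="$(mktemp)"
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" "$jenkins_url/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,%22:%22,//crumb)")
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# 1) Create an empty multibranch pipeline job
curl -s -X POST -u "$admin_username:$admin_password" "$jenkins_url/createItem" \
  --cookie "$cookie_jar" \
  -H "$full_crumb" \
  --data "name=$jobName_URLEncoded" \
  --data "mode=org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject" \
  --data "Jenkins-Crumb=$only_crumb" \
  --data-urlencode "json={\"name\": \"$job_name\", \"mode\": \"org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject\"}"

# 2) Configure the branch source: POST to $jenkins_url/job/$jobName_URLEncoded/configSubmit
#    with a --data-raw body containing $remote_URLEncoded, $credentials_id and $job_id
#    (omitted here; capture it from the browser as explained below).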

These scripts might seem a bit cumbersome, in particular when considering the --data-raw content. These POST requests were taken directly from ‘Chrome Developer tools > Network’ while performing these steps manually on the Jenkins instance from the browser.

I was able to use the --data-urlencode option of cURL for some of them, and I also tried to use an application/json header to pass JSON directly and make the requests more readable. However, I wasn’t able to get that to work correctly, so if you find a way to make the above requests without using --data-raw , but instead using a more readable format, please let me know! 😉

Sanity Check

Alright, we should have completed the Jenkins configuration and the application user data, let’s see whether everything works fine!

Open up the terminal in the Terraform directory and:

screen of the Windows terminal running commands
terraform init

Then:

terraform apply

Confirming the changes.

Now, the Jenkins instance may need a couple of minutes to complete its setup. So, if the terraform apply concludes correctly, just wait a few more minutes for the Jenkins configuration to finish.

First of all, we can check the AWS S3 service to see whether the Jenkins config files have been correctly uploaded.
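If you have the AWS CLI configured on your machine, the same check can also be done from the terminal (replace the bucket name with yours):

aws s3 ls s3://<first_name><last_name>-jenkins-config/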

Click on the <first_name><last_name>-jenkins-config S3 bucket:

Screen showing the bucket to enter

And we should have the 6 scripts:

Screen showing the scripts correctly uploaded in the S3 bucket

Wonderful!

Let’s now try to log in to the Jenkins instance from the Web with the credentials we provided in the terraform.tfvars :

Screen showing where we have defined the jenkins admin credentials

To grab the Jenkins URL, let’s go to the AWS EC2 service, click on instances on the left side and then click on the jenkins instance:

Screen showing the instance to enter

And then copy the public dns name:

Screen showing where to find the dns name of the instance

Open a new web page and navigate to:

http://<your_public_dns_name>:8080

So in my case, I will navigate to (http not https):

Screen showing my url to navigate to

After hitting enter, you should be prompted with the Jenkins Login:

Screen showing the jenkins login page

This is a good sign; it means at least that the admin account creation completed correctly. Put your credentials there and, after signing in, you should be redirected to the dashboard:

Screen showing the jenkins dashboard with the pipeline created

Cool, it seems that our multibranch pipeline has been created correctly. On the left, we should see the Open Blue Ocean button, which confirms that the plugins have been correctly downloaded and installed. We might also check whether the credentials have been saved. On the left, click on Manage Jenkins :

Screen showing to click the ‘Manage Jenkins’ button

and then Manage Credentials :

Screen displaying to click ‘Manage Credentials’ button

Here, we should see our Git (bitbucket) credentials:

Screen showing the Jenkins credentials we created

Alright, everything seems to work fine. Let’s try to SSH into the machine and look around.

Follow the same steps as in the Part 3 of this tutorial. Namely, go to the AWS EC2 service and into the Jenkins instance. Click the Connect button and grab the ssh example, it should be something like:

ssh -i "jenkins.pem" ec2-user@ec2-35-168-176-186.compute-1.amazonaws.com

Copy this into the terminal (making sure you are in the same directory as jenkins.key ), change .pem to .key in the above command, and run it:

screen of the Windows terminal running commands
ssh -i "jenkins.key" ec2-user@<your_dns_name>

If you get the following error:

Screen showing the output of the ssh command when there is already a key associated to that instance

You need to open your $home\.ssh\known_hosts (if you are on Windows) or ~/.ssh/known_hosts (if you are on Linux) and delete the line that the output suggests:

Screen showing where to find the line to delete from the known_hosts file

So on Windows you can open that file in VSCode and delete the line:

screen of the Windows terminal running commands
code $home\.ssh\known_hosts

Once we have correctly SSHed into the machine, we can ‘cat out’ the content of the cloud-init-output log file to see whether everything went fine:

screen of the Windows terminal running commands
sudo cat /var/log/cloud-init-output.log

(you can then exit the machine with exit )

Nice, at this point we can add, commit and push our changes to Bitbucket. Let’s go into the simple-web-app folder and:

screen of the Windows terminal running commands
git add .
screen of the Windows terminal running commands
git commit -a -m "Implemented the user data of the Instances"

And finally:

screen of the Windows terminal running commands
git push

Perfect! This fifth part of the tutorial ends here, in the next one we are going to complete our journey by implementing the Jenkinsfile with the pipeline and by doing some tests with a couple of final considerations!

Cheers!

Kevin
