Build a complete CI/CD Pipeline and its infrastructure with AWS — Jenkins — Bitbucket — Docker — Terraform → Part 5
In this fifth part of the tutorial, we are going to implement the user data of both the Web App instance and the Jenkins instance. For the Jenkins instance, this will require creating various scripts which will be uploaded to the correct S3 bucket and then pulled down and run from the user data, to avoid the 16k character limit. For more information regarding this topic, you can check this article.
Part 1 (here)→ Set up the project by downloading and looking at the Web App which will be used to test our infrastructure and pipeline. Here we also create & test suitable Dockerfiles for our project and upload everything to Bitbucket.
Part 2 (here)→ Set up Slack and create a Bot which will be used by Jenkins to send notifications on the progression/status of the pipeline.
Part 3 (here)→ Create the first part of the AWS Infrastructure with Terraform. Here we will create the EC2 instances / SSH keys and the actual Network infrastructure plus the basis for the IAM roles.
Part 4 (here)→ Create the second part of the AWS Infrastructure with Terraform. We are going to create the S3 buckets, the ECR repositories and complete the definition of the IAM roles by adding the correct policies.
Part 5 (current article) → Complete the configuration of the Jenkins and Web App instances by implementing the correct user data.
Part 6 (here)→ Implement the Pipeline in a Jenkinsfile and try out the pipeline, see how everything fits together and lay down some final comments.
Let’s get started!
Web App User Data
We already created a `user_data.sh` in `Terraform/application-server/`. Let's open it in VSCode.
The Web App instance needs to pull down the image from the ECR repository (its URL will be in the variable `repository-url`) and run it. So, in that `user_data.sh` we are going to put the following:
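Something along these lines — a minimal sketch reconstructed from the analysis below, so the exact file names and paths are assumptions:

```bash
#!/bin/bash
# Sketch of Terraform/application-server/user_data.sh; ${repository_url} is
# interpolated by Terraform's templatefile() before the instance ever sees it.
sudo yum update -y

# Docker ships in the amazon-linux-extras repository on Amazon Linux 2.
sudo amazon-linux-extras install docker -y
sudo systemctl start docker
sudo systemctl enable docker

# Write the start-website script that logs into ECR, pulls the image and runs it.
# The unquoted heredoc lets \$ survive as $ so the command substitution runs at boot.
sudo tee /home/ec2-user/start-website > /dev/null <<EOF
/bin/sh -e -c 'echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}'
sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release
EOF

# Move it among the cloud-init per-boot scripts so it runs on every reboot,
# mark it executable and run it once right now.
sudo mv /home/ec2-user/start-website /var/lib/cloud/scripts/per-boot/start-website
sudo chmod +x /var/lib/cloud/scripts/per-boot/start-website
sudo /var/lib/cloud/scripts/per-boot/start-website
```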
Let’s analyze it:
- First we `yum update` (remember that we are on an Amazon Linux machine, which is based on the CentOS distribution, so we do not have `apt` but we do have `yum`);
- Then we install Docker, which on these Amazon Linux instances comes from `amazon-linux-extras`;
- Afterwards we `start` Docker and we `enable` it so that it will start automatically on reboot;
- The next part is the core functionality of the script. What does it do? It creates a file `start-website` where it will put the following code:
```
/bin/sh -e -c 'echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}'
sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release
```
Now, the first line is an elaborate way to log into the AWS ECR repository without writing the password to stdout. This is a security precaution: it avoids leaving the password behind in history logs. The important command is the following:

```
echo \$(aws ecr get-login-password --region us-east-1) | docker login -u AWS --password-stdin ${repository_url}
```
Here, `\$(aws ecr get-login-password --region us-east-1)` retrieves the password to log into the ECR. Since we are on an Amazon Linux instance, the AWS CLI is already available with the profile configuration already set. After retrieving this password, we pipe it into `docker login -u AWS --password-stdin`, where `--password-stdin` tells the login command to take the password from stdin (namely the value we are piping), and with `${repository_url}` we specify the repository to log into.
The next two lines:
```
sudo docker pull ${repository_url}:release
sudo docker run -p 80:8000 ${repository_url}:release
```
will pull the image `${repository_url}:release` from the repository and run a container from it. It maps Docker's internal port `8000` to port `80` of the instance. This command will then start the Web App and serve it on port `80` of this instance.
- Once we have created this file, we need to move it to the folder dedicated to scripts that have to run at boot time. On AWS instances this folder (managed by cloud-init) is located at `/var/lib/cloud/scripts/per-boot/`;
- After that, we mark the script as executable;
- Finally we actually run it, since otherwise we would need to reboot the instance to run the script (the first time the instance is created).
Jenkins User Data
The Jenkins user data will be much more elaborate. If you have never used Jenkins it will be hard to understand the reasons for some choices; I will however try to explain everything as understandably as possible.
Let's modify `jenkins-server/main.tf`, and in the `user_data` we are going to put the following:
```
user_data = templatefile("${path.module}/user_data.sh", {
  repository_url         = var.repository-url,
  repository_test_url    = var.repository-test-url,
  repository_staging_url = var.repository-staging-url,
  instance_id            = var.instance-id,
  bucket_logs_name       = var.bucket-logs-name,
  public_dns             = var.public-dns,
  admin_username         = var.admin-username,
  admin_password         = var.admin-password,
  admin_fullname         = var.admin-fullname,
  admin_email            = var.admin-email,
  remote_repo            = var.remote-repo,
  job_name               = var.job-name,
  job_id                 = var.job-id,
  bucket_config_name     = var.bucket-config-name
})
```
We are pulling in a `user_data.sh` file (which we are going to create soon) and assigning all these variables. The `jenkins-server/main.tf` should look like:
Alright, in `jenkins-server/` let's create the `user_data.sh` file, which we are going to populate below.
Now, the first part of the `user_data.sh` will look like:
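Roughly along these lines — a hedged sketch assembled from the breakdown below; the Jenkins repo URLs, the Arachni version and the exact file names are assumptions, not the author's exact values:

```bash
#!/bin/bash
# First half of jenkins-server/user_data.sh (sketch).
sudo yum update -y
sudo yum install git wget -y

# Jenkins is built on Java: install the JDK and Jenkins itself (CentOS-style repo).
sudo wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
sudo rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key
sudo yum install java-1.8.0-openjdk-devel jenkins -y

# Docker from amazon-linux-extras, as on the Web App instance.
sudo amazon-linux-extras install docker -y

# Start both services and enable them so they come back after a reboot.
sudo systemctl start jenkins && sudo systemctl enable jenkins
sudo systemctl start docker && sudo systemctl enable docker

# Let both the ec2-user and the jenkins user talk to the Docker daemon.
sudo usermod -a -G docker ec2-user
sudo usermod -a -G docker jenkins

# Folder for files/programs used by the pipeline, owned by the jenkins user.
sudo mkdir /var/lib/jenkins/opt
sudo chown jenkins:jenkins /var/lib/jenkins/opt

# Arachni is used by the pipeline for security checks (version is an assumption).
wget -q https://github.com/Arachni/arachni/releases/download/v1.5.1/arachni-1.5.1-0.5.12-linux-x86_64.tar.gz
tar -xzf arachni-1.5.1-0.5.12-linux-x86_64.tar.gz
sudo mv arachni-1.5.1-0.5.12 /var/lib/jenkins/opt/arachni

# Persist the variables the pipeline needs; ${...} is filled in by templatefile().
echo "${repository_url}"         | sudo tee /var/lib/jenkins/opt/repository_url
echo "${repository_test_url}"    | sudo tee /var/lib/jenkins/opt/repository_test_url
echo "${repository_staging_url}" | sudo tee /var/lib/jenkins/opt/repository_staging_url
echo "${instance_id}"            | sudo tee /var/lib/jenkins/opt/instance_id
echo "${bucket_logs_name}"       | sudo tee /var/lib/jenkins/opt/bucket_logs_name

# Everything above ran as root, so hand opt/ back to the jenkins user and wait a bit.
sudo chown -R jenkins:jenkins /var/lib/jenkins/opt
sleep 60
```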
Okay, that's scary! Let's break it down:
- Firstly we `yum update` and then we install `git`. The `-y` option allows the installation to proceed without prompting the user (since there will be no user to prompt);
- Afterwards we install Jenkins. Since Jenkins is built on Java, we also install the JDK. The commands we run to install Jenkins are suitable for CentOS distributions; if you are hosting Jenkins on another AMI you need to use the correct syntax (you can check it here);
- As in the Web App user data, we install Docker from `amazon-linux-extras`;
- Then we start and enable both Jenkins and Docker;
- After that, we need to make sure that Jenkins and the `ec2-user` are allowed to use Docker. For that reason we use the commands `usermod -a -G docker ec2-user` and `usermod -a -G docker jenkins`;
- Below that, we create a folder `opt` in the Jenkins `$HOME`, namely in `/var/lib/jenkins`, and we change its owner and group to the jenkins user. This folder will contain useful files and programs that will be used in the Pipeline;
- Then we download, extract and move Arachni, which will be used in the pipeline to run some security checks;
- In order to have available some variables needed in the pipeline, we save them out to files which we place in the `opt` folder we created before. These variables are `repository_url`, `repository_test_url`, `repository_staging_url`, `instance_id` and `bucket_logs_name`, and they need to be saved on the Jenkins instance since the pipeline needs them to work properly;
- Finally we change the owner and group of all the new files in the `opt` directory. This is necessary since the operations in this script are performed by the super user, and the Jenkins user may not have access to them if we do not change these flags. At the end we `sleep` for 60 seconds to make sure that everything is set up and ready for the next stage.
Below that code we will have the second part:
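For reference, a hedged sketch of this second half, following the steps described below (the exported variable names and the script order are assumptions based on that description):

```bash
# Export the templated values so the downloaded scripts can read them.
export public_dns="${public_dns}"
export admin_username="${admin_username}"
export admin_password="${admin_password}"
export admin_fullname="${admin_fullname}"
export admin_email="${admin_email}"
export remote_repo="${remote_repo}"
export job_name="${job_name}"
export job_id="${job_id}"

# Pull the configuration scripts down from the S3 bucket and make them executable.
sudo aws s3 cp s3://${bucket_config_name}/ ./ --recursive
sudo chmod +x ./*.sh

# Run them in order; the sleep gives Jenkins time to finish installing plugins.
./create_admin_user.sh
./download_install_plugins.sh
sudo sleep 120
./confirm_url.sh
./create_credentials.sh

# Extract the ID Jenkins assigned to the new credentials for the next script.
python -c "import sys;import json;print(json.loads(raw_input())['credentials'][0]['id'])" <<< $(./get_credentials_id.sh) > credentials_id
./create_multibranch_pipeline.sh

# Clean up the configuration files (and the credentials_id) and reboot.
sudo rm ./*.sh credentials_id
sudo reboot
```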
- Since we will run some scripts which require some variables, we `export` all the necessary variables. In this way, they will be globally available to every subshell;
- After that, we download the script files from the S3 bucket. The command `sudo aws s3 cp s3://${bucket_config_name}/ ./ --recursive` will take every object in the bucket named `${bucket_config_name}` and put it in the 'current' directory `./` (this I believe will be the root directory `/`, but we don't actually care). Clearly, to be able to run them, we mark all of them as executable;
- Then we execute them! Between the download and the confirm script, there is a `sudo sleep 120`. This will allow Jenkins to download and install the plugins before going on with the other scripts;
- Now, the script `create_credentials.sh` will create the Jenkins credentials needed to access Bitbucket (we will store there the Bitbucket private key). These credentials will have an ID which is needed in the next script, `create_multibranch_pipeline.sh`. In order to retrieve this ID we make use of the script `get_credentials_id.sh`: it makes a GET request to the Jenkins API, which returns a JSON with the credentials. The code `python -c "import sys;import json;print(json.loads(......))" > credentials_id` is used to parse that JSON, extract the ID and output it to a file `credentials_id`, so that `create_multibranch_pipeline.sh` can easily read it to get the ID;
- At the end, we remove all these configuration files (and the `credentials_id`) and reboot the instance.
Create Jenkins Configuration Files
Since in the above `user_data` we reference a bunch of scripts, we also need to create them. As we mentioned before, these scripts need to be uploaded to the S3 bucket `<first_name><last_name>-jenkins-config`.
In the Terraform root directory create a `jenkins-config` folder:
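For instance, from inside the Terraform directory:

```bash
mkdir jenkins-config
```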
And create the following 6 scripts:
confirm_url.sh
create_admin_user.sh
create_credentials.sh
create_multibranch_pipeline.sh
download_install_plugins.sh
get_credentials_id.sh
Now we need to upload all these files to the `jenkins-config` bucket. In order to do that, we add a new resource of type `aws_s3_bucket_object` which will upload all files in the folder `jenkins-config` to that bucket. In the `s3.tf` file, add the following resource:
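Something like the following sketch — the bucket resource name `aws_s3_bucket.jenkins-config` is an assumption carried over from Part 4, so adapt it to whatever you called yours:

```hcl
resource "aws_s3_bucket_object" "jenkins-config-scripts" {
  # One object per script found in the jenkins-config/ folder.
  for_each = fileset("jenkins-config/", "*.sh")

  bucket = aws_s3_bucket.jenkins-config.id # assumed resource name from Part 4
  key    = each.value
  source = "jenkins-config/${each.value}"
  etag   = filemd5("jenkins-config/${each.value}")
}
```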
Awesome, the last thing left to do is to implement the scripts.
Part of the code that we are going to write here can be found in this article.
create_admin_user.sh
This script will create the Jenkins admin user with the credentials that we put earlier in the `terraform.tfvars`. All these scripts will be quite messy since we have to deal with CSRF protection tokens, and the data we are POSTing has to be URL-encoded. To summarize, this is what we are doing in this script:
- Take the default initial password that Jenkins provides us and store it in the variable `old_password`;
- Create URL-encoded versions of some variables that we need to pass in the POST request body;
- Request a CSRF crumb token and store the corresponding cookie in the `cookie_jar` temporary file;
- Make the POST request to `setupWizard/createAdminUser`, noting the `--cookie $cookie_jar` and the `-H "$full_crumb"` where we specify the cookie and the crumb taken from the previous step. We also make the POST request with `-u "admin:$old_password"`, authenticating ourselves with the password furnished by Jenkins.
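Putting these steps together, a minimal sketch of `create_admin_user.sh` could look like this. The crumb endpoint and the initial-password path are standard Jenkins locations; the exact `--data-raw` body is elided here and should match the URL-decoded form shown just below:

```bash
#!/bin/bash
# Sketch of create_admin_user.sh; the $admin_* variables are exported by user_data.sh.
old_password=$(sudo cat /var/lib/jenkins/secrets/initialAdminPassword)

# URL-encode the values that go into the request body (python 2 on Amazon Linux).
username_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$admin_username'''))")
password_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$admin_password'''))")
fullname_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$admin_fullname'''))")
email_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$admin_email'''))")

# Ask Jenkins for a CSRF crumb, keeping the session cookie in a temporary cookie jar.
cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "admin:$old_password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

# POST the new admin user to the setup wizard, authenticating with the initial password.
# The "..." stands for the rest of the URL-decoded body shown below this sketch.
curl -s -X POST -u "admin:$old_password" --cookie "$cookie_jar" -H "$full_crumb" \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-raw "username=$username_URLEncoded&password1=$password_URLEncoded&password2=$password_URLEncoded&fullname=$fullname_URLEncoded&email=$email_URLEncoded&..." \
  'http://localhost:8080/setupWizard/createAdminUser'
```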
If we URL-decode the `--data-raw` we see that it contains the following:
```
username=$username_URLEncoded&password1=$password_URLEncoded&password2=$password_URLEncoded&fullname=$fullname_URLEncoded&email=$email_URLEncoded&Jenkins-Crumb=$only_crumb&json={"username": "$username_URLEncoded", "password1": "$password_URLEncoded", "$redact": ["password1", "password2"], "password2": "$password_URLEncoded", "fullname": "$fullname_URLEncoded", "email": "$email_URLEncoded", "Jenkins-Crumb": "$only_crumb"}&core:apply=&Submit=Save&json={"username": "$username_URLEncoded", "password1": "$password_URLEncoded", "$redact": ["password1", "password2"], "password2": "$password_URLEncoded", "fullname": "$fullname_URLEncoded", "email": "$email_URLEncoded", "Jenkins-Crumb": "$only_crumb"}
```
Namely we provide a bunch of information to the Jenkins API:
- Username
- Password (new one)
- Fullname
- Crumb Token
With this script, our Jenkins admin account will be created.
download_install_plugins.sh
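As an illustration only, here is a hedged sketch that installs the extra plugins through Jenkins' documented `pluginManager/installNecessaryPlugins` endpoint; the actual script instead replays the setup wizard's browser-captured request with the full plugin list:

```bash
#!/bin/bash
# Alternative sketch of download_install_plugins.sh using installNecessaryPlugins.
user=$admin_username
password=$admin_password

# Crumb and cookie, as in every other script.
cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "$user:$password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

# Install the extra plugins we need on top of the suggested ones.
for plugin in bitbucket docker-workflow blueocean; do
  curl -s -X POST -u "$user:$password" --cookie "$cookie_jar" -H "$full_crumb" \
    -H 'Content-Type: text/xml' \
    -d "<jenkins><install plugin=\"$plugin@latest\"/></jenkins>" \
    'http://localhost:8080/pluginManager/installNecessaryPlugins'
done
```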
In each script we will grab a new CSRF crumb token. I guess we could in principle save that token and reuse it in all the subsequent requests; however, I prefer to make the scripts independent of each other and repeat the token-grabbing part. Note that here we use `-u "$user:$password"` to make the request, using the new user and password we created with the previous script.
In the POST request we see that we specify some plugins to install. The defaults would be:

```
'plugins':['cloudbees-folder','antisamy-markup-formatter','build-timeout','credentials-binding','timestamper','ws-cleanup','ant','gradle','workflow-aggregator','github-branch-source','pipeline-github-lib','pipeline-stage-view','git','ssh-slaves','matrix-auth','pam-auth','ldap','email-ext','mailer']
```
However, for our use case, we also install the following plugins:
- Bitbucket → To allow us to correctly have a webhook which will be triggered by Bitbucket once a push on the remote repo has been performed.
- Docker-Workflow → Which will allow us to use some docker commands in the Jenkinsfile of our Multibranch pipeline.
- Blueocean → Which will provide us with a beautiful UI design.
confirm_url.sh
First we URL-encode the public DNS name of the Jenkins instance (stored in the `$url` variable), which is the one we will use to access the Jenkins server from the Web. Then we grab the cookie and the token and make the POST request to `/setupWizard/configureInstance`. In the body of the request, we pass the `rootUrl` as the DNS name mentioned before, along with the crumb.
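A minimal sketch of what `confirm_url.sh` might look like; the real request body also carries a `json` duplicate of these fields, which is omitted here:

```bash
#!/bin/bash
# Sketch of confirm_url.sh; $public_dns, $admin_username and $admin_password
# are exported by user_data.sh.
url="http://$public_dns:8080/"
url_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$url'''))")

cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')
only_crumb=$(echo "$full_crumb" | cut -d: -f2)

# Tell the setup wizard which root URL Jenkins should answer on.
curl -s -X POST -u "$admin_username:$admin_password" --cookie "$cookie_jar" -H "$full_crumb" \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-raw "rootUrl=$url_URLEncoded&Jenkins-Crumb=$only_crumb" \
  'http://localhost:8080/setupWizard/configureInstance'
```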
create_credentials.sh
At this point, we need to ‘create’ the credentials to allow Jenkins to access the Bitbucket repo. We will create Jenkins credentials storing the SSH private key pulled down from the AWS Secrets Manager.
This script will look like:
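Roughly like this — a hedged sketch in which the credential class names in the JSON are assumptions based on the SSH Credentials plugin, so treat the captured request described below as authoritative:

```bash
#!/bin/bash
# Sketch of create_credentials.sh.
# 1. Pull the SSH private key out of AWS Secrets Manager and into ssh_tmp.
python -c "import sys;import json;print(json.loads(json.loads(raw_input())['SecretString'])['private'])" \
  <<< $(aws secretsmanager get-secret-value --secret-id simple-web-app --region us-east-1) > ssh_tmp

# 2. Re-encode the key's newlines as literal \n so it can live inside a JSON string.
ssh_private_key=$(awk -v ORS='\\n' '1' ssh_tmp)

# 3. Crumb and cookie, as in the other scripts.
cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

# 4. Create a global SSH-key credential named "Git" (class names are assumptions).
curl -s -X POST -u "$admin_username:$admin_password" --cookie "$cookie_jar" -H "$full_crumb" \
  'http://localhost:8080/credentials/store/system/domain/_/createCredentials' \
  --data-urlencode 'json={
    "": "0",
    "credentials": {
      "scope": "GLOBAL",
      "id": "Git",
      "username": "git",
      "description": "Bitbucket SSH key",
      "privateKeySource": {
        "stapler-class": "com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey$DirectEntryPrivateKeySource",
        "privateKey": "'"$ssh_private_key"'"
      },
      "stapler-class": "com.cloudbees.jenkins.plugins.sshcredentials.impl.BasicSSHUserPrivateKey"
    }
  }'

rm ssh_tmp
```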
- The first command takes the secret from AWS Secrets Manager, parses it with a Python one-liner and outputs it to an `ssh_tmp` auxiliary file. In particular, `$(aws secretsmanager get-secret-value --secret-id simple-web-app --region us-east-1)` retrieves the `simple-web-app` secret and feeds it to the Python script `import sys;import json;print(json.loads(json.loads(raw_input())['SecretString'])['private'])`, which parses the JSON returned by the AWS API call and extracts the private key. The key extracted this way is then put in the `ssh_tmp` file;
- With the line `ssh_private_key=$(awk -v ORS='\\n' '1' ssh_tmp)` we correctly encode the newlines in the key and place everything in the variable `ssh_private_key`;
- Afterwards, as always, we retrieve the crumb and the cookie;
- Finally we make the POST request to `/credentials/store/system/domain/_/createCredentials`. There are various ways to save credentials in Jenkins (you can look here) and in our case we define ours to be global. We name them `Git` and we provide the `ssh_private_key` variable content to the `privateKey` field in the JSON body.
Cool! In order to use them, namely to tell the multibranch pipeline which credentials it needs to use, we must extract the ID that jenkins generated for these credentials. This is done by the next script.
get_credentials_id.sh
As always we grab the crumb and the cookie, and then we make the GET request to `/credentials/store/system/domain/_/api/json?tree=credentials[id]`.
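A minimal sketch of this script — it simply prints the JSON response so the caller can parse it (note curl's `-g`, which stops it from interpreting the square brackets):

```bash
#!/bin/bash
# Sketch of get_credentials_id.sh.
cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

curl -s -g -u "$admin_username:$admin_password" --cookie "$cookie_jar" -H "$full_crumb" \
  'http://localhost:8080/credentials/store/system/domain/_/api/json?tree=credentials[id]'
```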
This script will return a JSON with a given structure. This JSON is parsed by the code in `/jenkins-server/user_data.sh` (before the `./create_multibranch_pipeline.sh` line):

```bash
python -c "import sys;import json;print(json.loads(raw_input())['credentials'][0]['id'])" <<< $(./get_credentials_id.sh) > credentials_id
```
This Python command extracts the ID from the JSON produced by the GET request above and writes it to the `credentials_id` file, which will be used by the next script.
create_multibranch_pipeline.sh
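A hedged sketch of the first half of this script; the `createItem` form fields mirror what the Jenkins UI posts (it also sends a `json` duplicate of name/mode), and the final `configSubmit` request is omitted because, as noted below, the real script replays a browser-captured body:

```bash
#!/bin/bash
# Sketch of create_multibranch_pipeline.sh.
credentials_id=$(cat credentials_id)
jobName_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$job_name'''))")
remote_repo_URLEncoded=$(python -c "import urllib;print(urllib.quote('''$remote_repo'''))")

cookie_jar=$(mktemp)
full_crumb=$(curl -s -u "$admin_username:$admin_password" --cookie-jar "$cookie_jar" \
  'http://localhost:8080/crumbIssuer/api/xml?xpath=concat(//crumbRequestField,":",//crumb)')

# Create an empty multibranch pipeline job with the given name.
curl -s -X POST -u "$admin_username:$admin_password" --cookie "$cookie_jar" -H "$full_crumb" \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-raw "name=$jobName_URLEncoded&mode=org.jenkinsci.plugins.workflow.multibranch.WorkflowMultiBranchProject&Submit=OK" \
  'http://localhost:8080/createItem'

# A second POST to /job/$jobName_URLEncoded/configSubmit then wires up the remote
# repository ($remote_repo_URLEncoded), the credentials ($credentials_id) and the
# job ID ($job_id); its raw body is the captured request discussed below.
```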
- First we grab the ID of the credentials stored in the `credentials_id` file;
- We URL-encode the Job name and the remote URL of the Bitbucket repo;
- We grab the token and the cookie;
- We create the Job with a POST request to `/createItem`, passing as the name of the Job the content of the variable `jobName`. We also specify that it needs to be a multibranch pipeline;
- Finally we configure the actual Pipeline by making a POST request to `/job/$jobName_URLEncoded/configSubmit`. In the raw data we specify the remote repository and the credentials using our variables. We also furnish a required Job ID (in the variable `jobID`) that we previously generated using the Terraform resource `random_id` in the `random.tf` file.
These scripts might seem a bit cumbersome, in particular when considering the `--data-raw` content. These POST requests have been taken directly from 'Chrome Developer tools > Network' while these steps were performed on a Jenkins instance from the browser.
I was able to use the `--data-urlencode` option of cURL for some of them, and I also tried to use the `application/json` header to pass JSON directly and make the requests more readable. However, I wasn't able to do that correctly, so if you find a way to make the above requests without using `--data-raw` but with a more readable format, please let me know! 😉
Sanity Check
Alright, we should have completed the Jenkins configuration and the application user data; let's see whether everything works fine!
Open up the terminal in the `Terraform` directory and:
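Presumably a plan first, to review the changes (the exact command here is an assumption):

```bash
terraform plan
```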
Then:
terraform apply
Confirm the changes when prompted.
Now, the Jenkins instance may need a couple of minutes to complete its setup. So, if the terraform apply concludes correctly, just wait a few more minutes for the Jenkins configuration to finish properly.
First of all, we can check the AWS S3 service to see whether the Jenkins config files have been correctly uploaded.
Click on the `<first_name><last_name>-jenkins-config` S3 bucket:
And we should have the 6 scripts:
Wonderful!
Let's now try to log into the Jenkins instance from the Web with the credentials we provided in the `terraform.tfvars`:
To grab the Jenkins URL, let's go to the AWS EC2 service, click on Instances on the left side and then click on the Jenkins instance:

And then copy the public DNS name:
Open a new web page and navigate to:
http://<your_public_dns_name>:8080
So in my case, I will navigate to (http not https):
After hitting enter, you should be prompted with the Jenkins Login:
This is a good sign: at least the admin account creation has completed correctly. Put your credentials there and, after signing in, you should be redirected to the dashboard:
Cool, it seems that our multibranch pipeline has been created correctly. On the left, we should see the Open Blue Ocean button, which confirms that the plugins have been correctly downloaded and installed. We can also check whether the credentials have been saved. On the left, click on Manage Jenkins:

and then Manage Credentials:
Here, we should see our `Git` (Bitbucket) credentials:
Alright, everything seems to work fine. Let's try to SSH into the machine and look around.
Follow the same steps as in Part 3 of this tutorial. Namely, go to the AWS EC2 service and into the Jenkins instance. Click the Connect button and grab the `ssh` example; it should be something like:
```
ssh -i "jenkins.pem" ec2-user@ec2-35-168-176-186.compute-1.amazonaws.com
```
Copy this into the terminal (making sure you are in the same directory as the `jenkins.key`) and, after changing `.pem` to `.key` in the above command, run it:
If you get the following error:
You need to open your `$home\.ssh\known_hosts` (if you are on Windows) or `~/.ssh/known_hosts` (if you are on Linux), and delete the line that the output suggests:

So on Windows you can open that file in VSCode and delete the line:
Once we have correctly SSHed into the machine, we can 'cat out' the content of the cloud-init-output log file to see whether everything went fine:
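On Amazon Linux the log lives at `/var/log/cloud-init-output.log`:

```bash
cat /var/log/cloud-init-output.log
```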
(you can then exit the machine with `exit`)
Nice, at this point we can `add`, `commit` and `push` our changes to Bitbucket. Let's go into the `simple-web-app` folder and:
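Presumably something along these lines (the commit message is just illustrative):

```bash
git add .
git commit -m "Add instance user data and Jenkins configuration scripts"
```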
And finally:
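Likely just the push:

```bash
git push
```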
Perfect! This fifth part of the tutorial ends here; in the next one we are going to complete our journey by implementing the Jenkinsfile with the pipeline and doing some tests, along with a couple of final considerations!
Cheers!
Kevin