Automated Sensitive Secrets Monitoring with Gitleaks and GitHub Actions

Sensitive secrets that are accidentally exposed or hardcoded in scripts and code are a major concern, and often a blind spot, for the internal security team of any organization. Even when an org's repositories are private, an insider equipped with the right set of regexes needs less than a minute to get hold of sensitive secrets (e.g. Slack tokens, prod database creds) and, with them, the ability to wreak havoc on systems the insider was never given access to.

In this post we will see how an end-to-end sensitive secrets monitoring system can be built for an organization that uses GitHub as the version control system in its SDLC.

Quick overview of the desired workflow : 

We will be using gitleaks (an awesome and flexible secrets detection tool, https://github.com/zricethezav/gitleaks ) to detect any secret patterns that might be interesting to us. We will also be using GitHub Actions rather than another build system, for a couple of reasons. One of its strongest points is its tight integration with GitHub, which keeps integration overhead low. GitHub Actions also gives us more control over which GitHub events trigger a scan.

Brief overview of the desired workflow :

  1. We will clone the gitleaks GitHub Action repository at https://github.com/zricethezav/gitleaks-action/tree/v1.1.3 , modify a bit of its code and host it as a private repository in our own organization. This lets us adapt the existing gitleaks-action to our needs. We will also host the gitleaks rules file in the same repo, for a couple of reasons:
    1. We want a single, centralized rules file for all the repositories we monitor in the organization. This also lets us modify the rules in one central location (adding whitelisting patterns, whitelisting paths, etc.).
    2. We do not want to pick up the latest version of gitleaks automatically, since that might break the workflow (we pin version v6.1.0).
  2. Gitleaks is configured as part of a GitHub Actions workflow in every repository we want to monitor for sensitive secret patterns. This step uses the private GitHub Action we created in step 1.
  3. The configured GitHub Actions workflow triggers a scan for secrets when a developer commits new code to the release/master branch (push event) or raises a new pull request against the master branch (PR event).

Step 1 :

Clone the repository hosted at https://github.com/zricethezav/gitleaks-action/tree/v1.1.3 , create a private repository with the desired name (e.g. github-actions) under your company org and host the cloned code there (thanks to the relaxed license policy of https://github.com/zricethezav/ 😀).

Step 2 : 

In the private ‘github-actions’ repository created in step 1, modify the existing Dockerfile to pin the gitleaks release version.

FROM zricethezav/gitleaks:v6.1.0

LABEL "com.github.actions.name"="gitleaks-action"
LABEL "com.github.actions.description"="runs gitleaks on push and pull request events"
LABEL "com.github.actions.icon"="shield"
LABEL "com.github.actions.color"="purple"
LABEL "repository"="https://github.com/zricethezav/gitleaks-action"

ADD entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

Step 3 :

Gitleaks uses a TOML-formatted file for defining rules. Create a rules file with the desired name (e.g. rules.toml) at the root of the newly created ‘github-actions’ repo (created in step 1). Gitleaks uses this rules file to detect sensitive patterns. The gitleaks project has excellent documentation on how rules are defined (see the Configuration section of its README), and the project also provides a couple of example configurations.

Example rules.toml :

[[rules]]
	description = "API Key"
	regex = '''[a-zA-Z_0-9]*(?i)(api_key|token|apikey|db_pass|db_password)["']*(\s)*[=:](\s)*["'][a-zA-Z+0-9_/!+=]+['"]'''
	tags = ["key", "APIKEY"]
	[rules.allowlist]
		description = "ignore non sensitive key"
		regexes = [
			'''(?i)(TEST_TOKEN|DUMMY_KEY)["'a-z\sA-Z0-9_=]*'''
		]
	[[rules.Entropies]]
		Min = "4.3"
		Max = "8.0"

[[rules]]
	description = "Slack"
	regex = '''xox[baprs]-([0-9a-zA-Z]{10,48})?'''
	tags = ["key", "Slack"]

[allowlist]
	description = "global allowlist. whitelisting paths"
	paths = ['''test/*''']

In the rules file we define the individual regex patterns that are matched against GitHub commits to detect sensitive secrets.

In the above example, we have defined two rules. The first is one of my custom rules; it contains a regex that detects sensitive key-value patterns whose key contains one of the substrings api_key, token, apikey, db_pass or db_password. You can craft and test regex rules at https://regex101.com/ (select golang for testing gitleaks rules).
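For a quick local sanity check of the first rule, here is a minimal Python sketch. It is only an approximation: gitleaks uses Go regular expressions, so the pattern is adapted for Python's re module (the inline (?i) is replaced by re.IGNORECASE, and the character ranges are written as A-Z). The classify helper and the sample lines are hypothetical, and testing the allowlist regex against the matched text is an assumption about how gitleaks applies it.

```python
import re

# Python adaptation of the "API Key" rule: re.IGNORECASE stands in for
# the inline (?i) flag used in the Go pattern.
RULE = re.compile(
    r"""[a-zA-Z_0-9]*(api_key|token|apikey|db_pass|db_password)["']*(\s)*[=:](\s)*["'][a-zA-Z+0-9_/!+=]+['"]""",
    re.IGNORECASE,
)

# Python adaptation of the rule-level allowlist regex.
ALLOW = re.compile(r"""(TEST_TOKEN|DUMMY_KEY)["'a-z\sA-Z0-9_=]*""", re.IGNORECASE)


def classify(line: str) -> str:
    """Return 'leak', 'allowlisted' or 'clean' for a single line of text."""
    match = RULE.search(line)
    if match is None:
        return "clean"
    # Assumption: the allowlist is checked against the text the rule matched.
    if ALLOW.search(match.group(0)):
        return "allowlisted"
    return "leak"


if __name__ == "__main__":
    samples = [
        'passsecretA23token = "dasojdhaskj12111AcX0908123lkjsalkdjlajdsaw281238dsadaAsda1"',
        "db_password: 'hunter2!'",
        'TEST_TOKEN = "abc123"',
        'name = "bob"',
    ]
    for sample in samples:
        print(classify(sample), "<-", sample)
```

This only exercises the regex part of the rule; the entropy bounds are not applied here, and the final word on what gitleaks reports is always a real gitleaks run with the rules file.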

We can also whitelist certain detected patterns per rule, as shown above under the rules.allowlist section. In addition, we can define Shannon entropy bounds per rule under the rules.Entropies section. Defining entropy thresholds further refines the resulting detections and produces fewer false positives. This has a downside: the higher the minimum entropy, the fewer short or common secrets are detected. You can calculate the Shannon entropy of a string at https://www.shannonentropy.netmark.pl/ .
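If you would rather compute the entropy locally, the standard Shannon entropy formula (bits per character) is easy to sketch in Python. The function names below are hypothetical, the default bounds simply mirror the Min/Max values from the example rule, and exactly which part of a line gitleaks feeds into its own entropy calculation is not covered here.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def within_bounds(text: str, minimum: float = 4.3, maximum: float = 8.0) -> bool:
    """True if the entropy of `text` falls inside [minimum, maximum]."""
    return minimum <= shannon_entropy(text) <= maximum


if __name__ == "__main__":
    samples = (
        "aaaaaaaa",
        "abcd",
        "dasojdhaskj12111AcX0908123lkjsalkdjlajdsaw281238dsadaAsda1",
    )
    for sample in samples:
        print(f"{shannon_entropy(sample):.2f}  {within_bounds(sample)}  {sample}")
```

A string made of one repeated character has entropy 0, and a string of four distinct characters has entropy 2, so short or repetitive values fall well below a minimum of 4.3. That is the trade-off described above.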

The second rule detects whether the scanned commit contains Slack user/bot tokens. We can also define a global allowlist, which applies to all the repositories being monitored for secrets. In this case, if any test/dummy secrets are defined under the test/ path of any repository, gitleaks skips scanning that folder.
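One thing worth double-checking in the global allowlist is how broadly a path pattern matches. The sketch below assumes that gitleaks treats each paths entry as a regular expression and searches for it anywhere in the file path (an assumption about gitleaks' behaviour, not its actual code). Under that assumption, test/* means "test followed by zero or more slashes", which is wider than the glob it resembles. The helper name, the sample paths and the stricter ^test/ alternative are all hypothetical.

```python
import re


def path_allowlisted(path: str, patterns: list[str]) -> bool:
    """True if any allowlist pattern is found anywhere in `path`."""
    return any(re.search(pattern, path) for pattern in patterns)


if __name__ == "__main__":
    broad = ["test/*"]    # pattern from the example rules file
    strict = ["^test/"]   # hypothetical alternative, anchored to the repo root

    for path in ("test/fixtures/keys.txt", "src/latest_build.py", "src/config.py"):
        print(
            f"{path:28} broad={path_allowlisted(path, broad)} "
            f"strict={path_allowlisted(path, strict)}"
        )
```

Read as a regex, the broad pattern also matches src/latest_build.py, because "latest" contains "test". If that is how your gitleaks version behaves, an anchored pattern keeps the allowlist limited to the test/ folder.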

Step 4 :

In the private ‘github-actions’ repository created in step 1, modify the existing entrypoint.sh file as follows. This is done so that the centralized rules file (rules.toml) is used.

#!/bin/bash

# modified: use the centralized rules file, checked out with this action under ./gitleaksaction
CONFIG="--config ./gitleaksaction/rules.toml"

echo running gitleaks "$(gitleaks --version) with the following command :👇"

DONATE_MSG="👋 maintaining gitleaks takes a lot of work so consider sponsoring me or donating a little something\n\e[36mhttps://github.com/sponsors/zricethezav\n\e[36mhttps://www.paypal.me/zricethezav\n"

if [ "$GITHUB_EVENT_NAME" = "push" ]
then
  echo gitleaks --pretty --repo-path=$GITHUB_WORKSPACE --verbose --redact --commit=$GITHUB_SHA $CONFIG
  gitleaks --pretty --repo-path=$GITHUB_WORKSPACE --verbose --redact --commit=$GITHUB_SHA $CONFIG
elif [ "$GITHUB_EVENT_NAME" = "pull_request" ]
then 
  git --git-dir="$GITHUB_WORKSPACE/.git" log --left-right --cherry-pick --pretty=format:"%H" remotes/origin/$GITHUB_BASE_REF...remotes/origin/$GITHUB_HEAD_REF > commit_list.txt
  echo gitleaks --pretty --repo-path=$GITHUB_WORKSPACE --verbose --redact --commits-file=commit_list.txt $CONFIG
  gitleaks --pretty --repo-path=$GITHUB_WORKSPACE --verbose --redact --commits-file=commit_list.txt $CONFIG
fi

# $? holds the exit status of the gitleaks run above (1 means leaks were found)
if [ $? -eq 1 ]
then
  echo -e "\e[31m🛑 STOP! Gitleaks encountered leaks"
  echo "----------------------------------"
  echo -e "$DONATE_MSG"
  exit 1
else
  echo -e "\e[32m✅ SUCCESS! Your code is good to go!"
  echo "------------------------------------"
  echo -e "$DONATE_MSG"
fi

Step 5 :

The final step is to add the GitHub Actions workflow file to every repository in the organization that needs to be monitored for secrets.

GitHub Actions workflow files are defined under <repo_root>/.github/workflows/<file>.yaml. Add the following workflow file (gitleaks.yaml) to each monitored repository of the org (<monitored_repo>/.github/workflows/gitleaks.yaml).

name: gitleaks
on:
  pull_request:
    branches: 
      - master
  push:
    branches: 
      - master
jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
    - name: Code checkout
      uses: actions/checkout@v2
      with:
        fetch-depth: '0'  
    - name: gitleaks-action checkout(private)
      uses: actions/checkout@v2
      with:
        repository: <yourOrgChangeThis>/gitleaks-action # private repo from step 1; use the name you gave it
        ref: master 
        token: ${{ secrets.GIT_ACTION_TOKEN }} # stored in GitHub secrets
        path: ./gitleaksaction
    - name: run gitleaks action
      uses: ./gitleaksaction

This workflow file does a couple of things.

It triggers only on two events: a push (direct commit) to the master branch and a pull request against the master branch (defined under the ‘on’ section of the workflow file).

It uses the default ubuntu-latest runner for running this job.

Finally, the workflow job checks out the current repository, which triggered the scan, and the private gitleaks-action repository that was created and modified in the earlier steps. Since the action is hosted in a private repository of the organization, we have to define a GitHub secret at org level (secrets.GIT_ACTION_TOKEN) holding a GitHub access token. The gitleaks action is then run inside the runner.

Step 6 :

Define GIT_ACTION_TOKEN in the org-level GitHub secrets, using an access token with the necessary permissions (read and checkout permissions).

Test it :

Test and verify by committing directly to the master branch, or by raising a PR, with a change that contains a sensitive secret matching the regex pattern.

Example test string pattern :     passsecretA23token = "dasojdhaskj12111AcX0908123lkjsalkdjlajdsaw281238dsadaAsda1"

If the branch is protected, the failed GitHub Actions check prevents the feature branch from being merged. (You can also check the result manually under the Actions tab of the repository.)

Execution Flow Diagram : 

Results : 

How to remove the valid detected secrets?!

Consider the leaked sensitive keys compromised and rotate the secrets.

Use BFG Repo-Cleaner to remove the secrets: https://rtyley.github.io/bfg-repo-cleaner/ . This is not completely effective, as the read-only PR heads containing the secrets are not removed (only the GitHub support team can help with this).

That’s it for now. Just want to keep this blog alive. Shubha Dina.