How to export GitHub pull requests

I recently needed to export GitHub pull requests from a repository to a CSV so that I could do some analysis.

I wasn’t able to find a simple way to do this in the GitHub UI. When I searched, I found several tools, but the API seemed quite simple, so I just wrote a script that dumps all pull requests from a GitHub repository to a CSV.

#!/bin/bash

# This script requires jq to be installed and available in the path.

TOKEN=$1
ORG=$2
REPO=$3
OUTPUT_RAW=""

get_pull_requests() {
	curl -s --location --request GET "https://api.github.com/repos/$ORG/$REPO/pulls?state=all&per_page=40&page=$1" \
		--header "Authorization: token $TOKEN" \
		--header "Accept: application/vnd.github+json"
}

get_raw_output() {
	# jq requires an explicit filter, so pass '.' to re-serialize the response.
	printf '%s' "$1" | jq -r '.'
}

i=1
while [ "$OUTPUT_RAW" != "[]" ] ; do
	OUTPUT=$( get_pull_requests "$i" )
	OUTPUT_RAW=$( get_raw_output "$OUTPUT" )

	i=$((i+1))

	printf '%s' "$OUTPUT" | jq -r '.[] | [ .created_at, .html_url, .user.login, .title ] | @csv'
done

To use the script, you’ll need to have jq installed. On a Mac, you can install it with brew install jq.

The only other prerequisite you’ll need to export pull requests from GitHub is a personal access token.

From there, you should just need to run the script with something like the following to export all of the pull requests for a given repository:

sh github_pulls_export.sh TOKEN ORG REPO

If you’d like to change what data gets exported, simply change the fields that are pulled in this section:

[ .created_at, .html_url, .user.login, .title ]
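
For instance, you could also export each pull request’s state and merge date. Here is a sketch using sample data shaped like the GitHub pulls API response (state and merged_at are standard fields on pull request objects; the values below are made up):

```shell
# Sample data shaped like the GitHub pulls API response (hypothetical values).
SAMPLE='[{"created_at":"2023-01-05T00:00:00Z","html_url":"https://github.com/org/repo/pull/1","user":{"login":"octocat"},"title":"Fix bug","state":"closed","merged_at":"2023-01-06T00:00:00Z"}]'

# Extended field list: adds .state and .merged_at to each CSV row.
printf '%s' "$SAMPLE" | jq -r '.[] | [ .created_at, .html_url, .user.login, .title, .state, .merged_at ] | @csv'
```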

You can modify that printf line to print the full response and see which fields are available to pull from.
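
One way to do that is to ask jq for the keys of the first object in the response. A sketch, with sample data standing in for the script’s $OUTPUT variable:

```shell
# Sample data standing in for the script's $OUTPUT (hypothetical values).
SAMPLE='[{"created_at":"2023-01-05T00:00:00Z","html_url":"https://github.com/org/repo/pull/1","user":{"login":"octocat"},"title":"Fix bug"}]'

# List the top-level field names of the first pull request, one per line.
# jq's keys builtin returns the names sorted alphabetically.
printf '%s' "$SAMPLE" | jq -r '.[0] | keys[]'
```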

Responses

  1. robin Avatar

    Hello and thanks for setting up the script in the first place. When I am running it now,it gives the below error:
    jq: error (at <stdin>:3): Cannot index string with string "created_at".

    1. Eric Binnion Avatar

      In my testing, this can happen when a bad response is returned. Here’s an updated version of the script that adds a basic HTTP status code check. Let me know if this helps.

      #!/bin/bash
      
      # This script requires jq to be installed and available in the path.
      
      TOKEN=$1
      ORG=$2
      REPO=$3
      OUTPUT_RAW=""
      
      get_pull_requests() {
          curl -s -w "\n%{http_code}" --location --request GET "https://api.github.com/repos/$ORG/$REPO/pulls?state=all&per_page=40&page=$1" \
              --header "Authorization: token $TOKEN" \
              --header "Accept: application/vnd.github+json"
      }
      
      get_raw_output() {
          # jq requires an explicit filter, so pass '.' to re-serialize the response.
          printf '%s' "$1" | jq -r '.'
      }
      
      i=1
      while [ "$OUTPUT_RAW" != "[]" ] ; do
          OUTPUT=$( get_pull_requests "$i" )
      
          HTTP_CODE=$(printf "%s" "$OUTPUT" | tail -n1)
          BODY=$(printf "%s" "$OUTPUT" | sed '$d')
      
          if [ "$HTTP_CODE" -lt 200 ] || [ "$HTTP_CODE" -ge 300 ]; then
              echo "HTTP code: $HTTP_CODE" >&2
              echo "$BODY" >&2
              exit 1
          fi
      
          printf '%s' "$BODY" | jq -r '.[] | [ .created_at, .html_url, .user.login, .title ] | @csv'
      
          OUTPUT_RAW=$( get_raw_output "$BODY" )
          i=$((i+1))
      done
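
      For reference, the failure above is easy to reproduce: on an auth failure, GitHub returns a JSON object rather than an array, and indexing its string values with .created_at triggers exactly that jq error. A sketch with a hypothetical error body:

      ```shell
      # On auth failures GitHub responds with an object, e.g. {"message":"Bad credentials"}.
      # Piping that through the CSV filter reproduces the error above.
      printf '%s' '{"message":"Bad credentials"}' \
          | jq -r '.[] | [ .created_at, .html_url, .user.login, .title ] | @csv' \
          || true  # jq fails: Cannot index string with string "created_at"
      ```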
      
