@epicserve
Last active December 22, 2015 10:09
List the top pages for a website using Google Analytics
"""
List the top pages for a website using Google Analytics. This is a base
example that you could build on to create lists for your website, such as
"Top Articles This Week".

Installation
------------

1. Install the required Python libraries::

       pip install google-api-python-client==1.2
       pip install PyOpenSSL==0.13.1

2. Go to https://code.google.com/apis/console/ to get your API credentials.

   - Click create project
   - Click the on/off button next to "Analytics API" to turn the service on
   - Click on the "API Access" tab
   - Click the "Create an OAuth 2.0 client ID" button
   - Add a "Product name" and "Home Page URL" and then click next
   - Select "Service Account"
   - Click "Create client ID"
   - Click the "Download private key" button
   - Rename the file to google-api-private-key.p12 and move it into the same
     directory as this script

3. Make note of your service account email address
   (e.g. 1212232334344545@developer.gserviceaccount.com).
   Log in to your Google Analytics account and give your service account
   email address read access to the web site property you want to use this
   script for.

Helpful Links
-------------

- `API Console <https://code.google.com/apis/console>`_
- `Service Accounts <https://developers.google.com/accounts/docs/OAuth2ServiceAccount>`_
- `Using OAuth 2.0 for Server to Server Applications <https://developers.google.com/accounts/docs/OAuth2ServiceAccount>`_
- `Service Account Example <https://code.google.com/p/google-api-python-client/source/browse/samples/service_account/tasks.py>`_

Usage
-----

Before you run the script, you'll need the profile ID for the website
you want to get top pages for. Log in to Google Analytics and view the
analytics for that website. The profile ID is the number after the "p"
in the URL.

If the URL is
https://www.google.com/analytics/web/?hl=en&pli=1#report/visitors-overview/a1212121w23232323p34343434/,
then your profile ID is 34343434.

Run the script::

    python list_ga_top_pages.py --profile_id 34343434 \
        --service_account_email 1212232334344545@developer.gserviceaccount.com \
        --start_date '2013-09-01' --end_date '2013-09-05' --filter '^/news/201*' --max_results 20

"""

from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials
import argparse
import httplib2

__author__ = "epicserve@gmail.com (Brent O'Connor)"


def get_ga_service(
        service_account_email,
        pk_file_path,
        scope='https://www.googleapis.com/auth/analytics.readonly'):
    """Build an authorized Analytics v3 service for a service account."""
    # Load the private key, which is stored in PKCS 12 format.
    with open(pk_file_path, 'rb') as f:
        key = f.read()

    # Pass the caller's scope through instead of repeating the default.
    credentials = SignedJwtAssertionCredentials(
        service_account_email,
        key,
        scope=scope)
    http = credentials.authorize(httplib2.Http())
    return build("analytics", "v3", http=http)
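

# A possible addition, not part of the original script: --start_date and
# --end_date are documented as YYYY-MM-DD but are passed to the API
# unchecked. The hypothetical helper below (valid_date is an assumed name)
# could be given to argparse as ``type=valid_date`` so that malformed dates
# are rejected before any request is made. Note that it would also reject
# any value that is not a calendar date, so treat it as a sketch only.
import argparse
import datetime


def valid_date(value):
    """Return ``value`` normalized to YYYY-MM-DD, or raise an argparse error."""
    try:
        parsed = datetime.datetime.strptime(value, '%Y-%m-%d')
    except ValueError:
        raise argparse.ArgumentTypeError(
            'expected YYYY-MM-DD, got {0!r}'.format(value))
    return parsed.strftime('%Y-%m-%d')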


def get_top_pages(service, profile_id, start_date, end_date, filter='^/*', max_results=50):
    """Return page paths matching the regex ``filter``, most viewed first."""
    return service.data().ga().get(
        ids='ga:' + str(profile_id),
        start_date=start_date,
        end_date=end_date,
        metrics='ga:pageviews',
        dimensions='ga:pagePath',
        sort='-ga:pageviews',
        filters='ga:pagePath=~' + filter,
        start_index='1',
        max_results=max_results).execute()


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='List the top pages for a website.')
    parser.add_argument('-pid', '--profile_id', type=int, required=True)
    parser.add_argument('--service_account_email', required=True)
    parser.add_argument('--start_date', required=True, help='YYYY-MM-DD')
    parser.add_argument('--end_date', required=True, help='YYYY-MM-DD')
    parser.add_argument('--filter', default='^/*',
                        help='regular expression matched against the page path, e.g. ^/news/20*')
    parser.add_argument('--max_results', default=50, type=int)
    parser.add_argument('--pk_file_path', default='google-api-private-key.p12')
    args = parser.parse_args()

    service = get_ga_service(args.service_account_email, args.pk_file_path)
    results = get_top_pages(
        service, args.profile_id,
        start_date=args.start_date, end_date=args.end_date,
        filter=args.filter, max_results=args.max_results)

    # Fall back to an empty list in case the response has no 'rows' key.
    for row in results.get('rows', []):
        print('{0:<152}{1:>5}'.format(*row))
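

# A sketch of one way to post-process the response for a list such as
# "Top Articles This Week". It assumes each row is a [pagePath, pageviews]
# pair of strings, as the print loop above implies. rows_to_tuples is a
# hypothetical name, not part of the original script or of the API client.
def rows_to_tuples(results):
    """Convert the 'rows' of a response dict into (path, pageviews) tuples."""
    return [(path, int(views)) for path, views in results.get('rows', [])]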