Python in Digital Analytics: Automating Facebook metric reports
I am very happy to come back to this topic, having started this blog by writing about Facebook metrics 6 months ago.
Just a little side note: I used to code a lot when I was 16, but I never found it much fun and gave it up once I got to college. Ironically, this time I have had a lot of joy over the past few months learning Python to connect to the Facebook API for “insights”.
The business case here is straightforward: we need to understand how we are doing in the social media channel from a metrics perspective. As I discussed in my previous post, the data export feature of Facebook Business Manager is simply not good enough.
Using Python, I was able to call Facebook’s Graph API directly and retrieve the metrics (impressions, likes, shares, comments, link clicks etc.) of each individual post, going back as far as 2 years, into a CSV file. Then I can load it into Microsoft Power BI and do all kinds of analysis and visualization.
What I like most is that, with this Python code in place, I can bypass the traditional “Extract, Transform and Load” process of a data warehouse. As a proud “full stack” analyst, I want my “data to insights” journey to be as simple as possible. Also, I can’t afford the typical 1-day delay of data warehousing, as a social media post can go viral and suddenly receive 100 “likes” in 4 hours.
Now let me explain how it works in a slightly geekier way…
First and foremost, you must be able to “walk” before you “run”.
“Page analyst” access to the page you’d like to retrieve metrics from is a prerequisite. You must be able to download the Facebook metrics manually before moving to an automated solution.
Then you should familiarize yourself with the metrics provided by Facebook’s Graph API. My previous blog post can be a good starting point.
Next we get to the actual coding. I built two functions in my program:
- The first one gets all the posts within a “Facebook page” along with their main attributes, including created time, unique post ID, link URL, message text etc.
[code language="python"]
# Relies on names defined in the main program: key (the access token),
# startDate (a datetime cut-off) and non_bmp_map (see the tricks further down).
def get_post():
    page_name = "Your Post Name"  # Placeholder: replace with the name or ID of your page
    url_post = ("https://graph.facebook.com/v2.9/" + page_name
                + "/?fields=posts.limit(99)%7Btype%2Ccreated_time%2Cid%2Cpermalink_url%2Cmessage%7D"
                + "&access_token=" + key)
    rows_post = [["id", "post_type", "created_time", "permalink_url", "message",
                  "impression", "like", "comment", "share", "click"]]
    # Cap the number of calls to the Facebook API at 20.
    # Maximum number of posts the program can handle is 99 * 20 = 1980
    for count in range(20):
        response_post = requests.get(url_post)
        response_post.raise_for_status()
        FBPost = json.loads(response_post.text)
        # Only the first response nests the posts under "posts"; the "next" pages do not
        page = FBPost["posts"] if count == 0 else FBPost
        for item in page["data"]:
            temp_post_type = item.get("type", "")
            temp_createdtime = item.get("created_time", "")
            temp_id = item.get("id", "")
            temp_permlink = item.get("permalink_url", "")
            temp_message = item.get("message", "")
            temp_message = temp_message.translate(non_bmp_map)  # Use a translation table to replace non-BMP characters
            temp = [temp_id, temp_post_type, temp_createdtime, temp_permlink, temp_message]
            print(temp)
            dateObject = datetime.strptime(temp_createdtime, '%Y-%m-%dT%H:%M:%S+0000')  # Parse the time format (from Stack Overflow)
            if dateObject < startDate:  # Stop once the posts are older than the cut-off
                return rows_post
            rows_post.append(temp)
        url_post = page.get("paging", {}).get("next")  # Use the "next" field as the new URL
        if not url_post:  # No "next" link means there are no more pages
            return rows_post
    return rows_post
[/code]
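The paging logic above can also be pulled apart from the HTTP call, which makes it possible to try it out without an access token. What follows is only a minimal sketch and not the code from my program: `iter_posts` and its `fetch` parameter are hypothetical names, `fetch` stands in for whatever performs `requests.get(url)` and returns the decoded JSON, and the canned responses are made up.

```python
def iter_posts(first_url, fetch, max_calls=20):
    """Yield post dicts page by page, following the "next" link.

    fetch is a hypothetical callable: it takes a URL and returns decoded JSON.
    """
    url = first_url
    for call in range(max_calls):
        body = fetch(url)
        # Only the first response wraps the posts in a "posts" object
        page = body["posts"] if call == 0 else body
        for item in page.get("data", []):
            yield item
        url = page.get("paging", {}).get("next")
        if not url:  # no further pages
            return


# Usage with canned (made-up) responses instead of real API calls
fake_pages = {
    "page-1": {"posts": {"data": [{"id": "1"}, {"id": "2"}],
                         "paging": {"next": "page-2"}}},
    "page-2": {"data": [{"id": "3"}], "paging": {}},
}
post_ids = [post["id"] for post in iter_posts("page-1", fake_pages.get)]
print(post_ids)  # ['1', '2', '3']
```

Passing the fetcher in as a parameter is just one design choice; it keeps the loop testable, and the real program could pass a small function that wraps `requests.get`.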
- The other one retrieves all the required metrics for a specific post. The Facebook API allows you to request multiple metrics and bundles them together in one call. Here I used three: Impressions (post_impressions), Engagement (post_story_adds_by_action_type) and Clicks (post_consumptions_by_type_unique).
[code language="python"]
def get_post_metrics(post_id):
    metric1 = 'post_impressions'
    metric2 = 'post_story_adds_by_action_type'
    metric3 = 'post_consumptions_by_type_unique'
    url = 'https://graph.facebook.com/v2.9/' + post_id + '/insights/' + metric1 + ',' + metric2 + ',' + metric3 + '?access_token=' + key
    response = requests.get(url)
    response.raise_for_status()
    post_metrics = json.loads(response.text)
    impression = 0
    like = 0
    comment = 0
    share = 0
    click = 0
    fb_impression = post_metrics['data'][0]  # Parse impression metric
    if fb_impression['name'] == 'post_impressions':
        impression = fb_impression['values'][0].get("value", 0)
    fb_engagement = post_metrics['data'][1]  # Parse engagement metrics: like, share, comments
    if fb_engagement['name'] == 'post_story_adds_by_action_type':
        like = fb_engagement['values'][0]['value'].get("like", 0)
        comment = fb_engagement['values'][0]['value'].get("comment", 0)
        share = fb_engagement['values'][0]['value'].get("share", 0)
    fb_click = post_metrics['data'][2]  # Parse clicks metric
    if fb_click['name'] == 'post_consumptions_by_type_unique':
        click = fb_click['values'][0]['value'].get("link clicks", 0)
    return [impression, like, comment, share, click]
[/code]
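The function above reads the three metrics by their position in the response. A variant that looks each metric up by name would not depend on the order, or on all three being present. The sketch below is an assumption-laden illustration rather than what I ran: `parse_metrics` is a hypothetical helper, and the sample payload is made up, with only its shape (`data`, `name`, `values[0]["value"]`) taken from the code above.

```python
def parse_metrics(post_metrics):
    """Return [impression, like, comment, share, click] from an insights response."""
    result = {"impression": 0, "like": 0, "comment": 0, "share": 0, "click": 0}
    for metric in post_metrics.get("data", []):
        value = metric["values"][0].get("value", 0)
        if metric["name"] == "post_impressions":
            result["impression"] = value
        elif metric["name"] == "post_story_adds_by_action_type":
            for action in ("like", "comment", "share"):
                result[action] = value.get(action, 0)
        elif metric["name"] == "post_consumptions_by_type_unique":
            result["click"] = value.get("link clicks", 0)
    return [result[k] for k in ("impression", "like", "comment", "share", "click")]


# Made-up payload for illustration: no comments, and the clicks metric is missing
sample = {
    "data": [
        {"name": "post_story_adds_by_action_type",
         "values": [{"value": {"like": 30, "share": 4}}]},
        {"name": "post_impressions", "values": [{"value": 1200}]},
    ]
}
print(parse_metrics(sample))  # [1200, 30, 0, 4, 0]
```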
In the main program, you just need to get a key from the Facebook Graph API Explorer and use a simple loop to go through all the posts. The libraries I used are:
[code language="python"]
import json, requests, sys, csv
[/code]
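For readers who want to see the “simple loop”, here is a rough sketch of how the rows and the metrics could be joined and written out. It is not my exact main program: `merge_metrics` and `write_csv` are hypothetical helper names, the output file name is an assumption, and a fake metrics function replaces the real API call so that the snippet runs without a token.

```python
import csv
import os
import tempfile


def merge_metrics(rows_post, get_metrics):
    """Append the five metric values to each post row; the header row is kept as is."""
    header, posts = rows_post[0], rows_post[1:]
    return [header] + [post + get_metrics(post[0]) for post in posts]


def write_csv(rows, path):
    """Write all rows to a UTF-8 CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


# Made-up data in the shape returned by get_post(): a header row, then one row per post
rows_post = [
    ["id", "post_type", "created_time", "permalink_url", "message",
     "impression", "like", "comment", "share", "click"],
    ["123_456", "link", "2017-06-01T12:30:00+0000", "https://example.com/p/1", "Hello"],
]


def fake_metrics(post_id):
    # Stand-in for get_post_metrics(post_id), so no API call is made here
    return [1200, 30, 2, 4, 15]


rows_output = merge_metrics(rows_post, fake_metrics)
csv_path = os.path.join(tempfile.gettempdir(), "facebook_posts.csv")  # assumed file name
write_csv(rows_output, csv_path)
```

In the real program, `get_post()` would supply `rows_post` and `get_post_metrics` would be passed in place of `fake_metrics`.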
Last but not least, here are some tricks and roadblocks I came across on my journey:
- Parsing “created_time” from Facebook’s API: As a first-time Python learner, I struggled to turn the string coming out of the JSON data structure into a Python-compatible time. This Stack Overflow post really helped me get up and running.
[code language="python"]
from datetime import datetime, timedelta
dateObject = datetime.strptime(temp_createdtime, '%Y-%m-%dT%H:%M:%S+0000')
[/code]
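To make the parsing step concrete, here is a small sketch with a made-up timestamp. The way `startDate` is computed is an assumption for illustration (a fixed reference date minus roughly two years); the post does not show how the real program sets it.

```python
from datetime import datetime, timedelta


def parse_created_time(created_time):
    """Turn a "created_time" string such as 2017-06-01T12:30:00+0000 into a datetime."""
    return datetime.strptime(created_time, "%Y-%m-%dT%H:%M:%S+0000")


created = parse_created_time("2017-06-01T12:30:00+0000")  # made-up sample value
startDate = datetime(2017, 6, 1) - timedelta(days=730)    # assumed cut-off, about 2 years back

print(created)              # 2017-06-01 12:30:00
print(created < startDate)  # False, so this post would be kept
```

The literal `+0000` in the format string means the result is a naive datetime, which is fine as long as it is only compared with other naive datetimes such as `startDate`.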
- Handling a Unicode (non-BMP) issue with unusual characters in the actual Facebook message text: Facebook really surprised me. Python kept giving me an error message like the one below.
[code language="python"]
Traceback (most recent call last):
  File "C:\Python36\Facebook-V2.py", line 117, in <module>
    rows_output = get_post()
  File "C:\Python36\Facebook-V2.py", line 51, in get_post
    print(temp)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 140-140: Non-BMP character not supported in Tk
[/code]
Researching on Stack Overflow helped me again. I ended up adding these two lines to fix the error.
[code language="python"]
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
temp_message = temp_message.translate(non_bmp_map)  # Use a translation table to replace non-BMP characters
[/code]
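As a quick illustration of what those two lines do, here is a sketch with a made-up message. Every character above U+FFFF is swapped for the replacement character U+FFFD, while accented letters and other characters inside the BMP are left alone.

```python
import sys

# Map every code point outside the Basic Multilingual Plane to U+FFFD
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

message = "Great launch \U0001F680 caf\u00e9"  # made-up text with a rocket emoji and an accented e
cleaned = message.translate(non_bmp_map)

print(cleaned == "Great launch \ufffd caf\u00e9")  # True
print(len(message) == len(cleaned))                # True: one character in, one character out
```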
To wrap up, I am a big believer that analysts should be able to write code to get the data right, whether that means using SQL to query a database or Python to query an API like I did here.
I hope this will be a new starting point for me, as Facebook is just one of the social media platforms, and I am planning to connect to at least LinkedIn and Twitter as well. I am hoping to post some new learnings in the next 6 months.
samhwwong
Great post Tom! Nice to see how you used Python to improve your workflow.
Tom Tao
Thanks Sam. Looking forward to learning from each other on Python in the future.