Using Pandas to Convert JSON to Dataframes
Using Pandas to Convert JSON to Dataframes
Situation: We are pulling in JSON data and need to convert it to a dataframe to load to a relational database or for further analysis.
It’s likely the JSON is nested with at least two levels. Here the requests
library makes an http request to an API endpoint. A function run_query(q)
is written to return the request in JSON data.
import requests
import json
import pandas as pd
def run_query(q):
request = requests.post('https://api-endpoint'
'',
json={'query': query})
if request.status_code == 200:
return request.json()
else:
raise Exception('Query failed. return code is {}. {}'.format(
request.status_code, query))
Then we need to unnest the JSON into a list of dictionaries before converting into dataframe. THe flow is JSON -> get Items -> turn into List -> dig down into List of Dictionaries -> convert to dataframe:
# returns JSON
result = run_query(query)
# get Items
result_items = result.items()
# turn into List
result_list = list(result_items)
# dig down into List of Dictionaries (2 levels)
lst_of_dict = result_list[0][1].get('transferBanks')
# convert to data frame
df = pd.json_normalize(lst_of_dict)
For more content on data science, R, and Python find me on Twitter.