Today’s guest blog comes from Kalev H. Leetaru, a fellow and adjunct faculty in the Edmund A. Walsh School of Foreign Service at Georgetown University in Washington DC. His award-winning work centers on the application of high-performance computing and “big data” to grand challenge problems.

The entire quarter-billion-record GDELT Event Database is now available as a public dataset in Google BigQuery.

BigQuery is Google’s powerful cloud-based analytical database service, designed for the largest datasets on the planet. It allows users to run fast, SQL-like queries against multi-terabyte datasets in seconds. Scalable and easy to use, BigQuery gives you real-time insights about your data. With GDELT available in BigQuery, you can now get real-time insights about global human society and the planet itself!

You can take it for a spin here. (If it's your first time, you'll need to sign up and create a Google project, but no credit card or commitment is needed.)

The GDELT Project pushes the boundaries of “big data,” weighing in at over a quarter-billion rows with 59 fields for each record, spanning the geography of the entire planet, and covering a time horizon of more than 35 years. The GDELT Project is the largest open-access database on human society in existence. Its archives contain nearly 400M latitude/longitude geographic coordinates spanning over 12,900 days, making it one of the largest open-access spatio-temporal datasets as well.

From the very beginning, one of the greatest challenges in working with GDELT has been how to interact with a dataset of this magnitude. Few traditional relational database servers offer real-time querying or analytics on data of this complexity, and even simple queries would normally require enormous attention to data access patterns and intricate multi-column indexing to make them possible. Traditional database servers require the creation of indexes over the most-accessed columns to speed queries, meaning one has to anticipate a priori how users are going to interact with a dataset.

One of the things we’ve learned from working with GDELT users is just how differently each of you needs to query and analyze GDELT. The sheer variety of access patterns and the number of permutations of fields that are collected together into queries makes the traditional model of creating a small set of indexes impossible. One of the most exciting aspects of having GDELT available in BigQuery is that it doesn’t have the concept of creating explicit indexes over specific columns – instead you can bring together any ad-hoc combination of columns and query complexity and it still returns in just a few seconds. This means that no matter how you access GDELT, what columns you look across, what kinds of operators you use, or the complexity of your query, you will still see results in near real time.
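To make that concrete, here is a sketch of the kind of ad-hoc, multi-column query that would normally demand careful indexing but that BigQuery handles as-is. The specific thresholds are arbitrary illustrations, and `[]` stands in for the GDELT events table reference, as in the queries later in this post:

```sql
-- Illustrative ad-hoc filter across several unrelated columns:
-- heavily-covered, strongly negative-tone events from 2013.
SELECT SQLDATE, Actor1Name, Actor2Name, EventCode, NumArticles, AvgTone
FROM []
WHERE Year = 2013
  AND NumArticles > 100
  AND AvgTone < -8
ORDER BY NumArticles DESC
LIMIT 20
```

No index on NumArticles or AvgTone has to exist ahead of time for a query like this to come back in seconds.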

For us, the most groundbreaking part of having GDELT in BigQuery is that it not only opens the door to fast, complex querying and extraction of data, but also, for the first time, allows real-world analyses to be run entirely in the database. Imagine computing the most significant conflict interaction in the world by month over the past 35 years, or performing cross-tabbed correlation over different classes of relationships between a set of countries. Such queries can be run entirely inside of BigQuery and return in just a handful of seconds. This enables you to test “what if” hypotheses on global-scale trends in near real time.
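As a rough sketch of the first example, a query along the following lines ranks country pairs by their volume of material-conflict events (GDELT’s QuadClass 4) and keeps the top pair for each month; again, `[]` stands in for the GDELT events table reference, as in the queries later in this post:

```sql
-- Top material-conflict country pairing per month (sketch).
SELECT MonthYear, Actor1CountryCode, Actor2CountryCode, Count FROM (
  SELECT MonthYear, Actor1CountryCode, Actor2CountryCode, COUNT(1) Count,
         RANK() OVER(PARTITION BY MonthYear ORDER BY Count DESC) rank
  FROM []
  WHERE QuadClass = 4
    AND Actor1CountryCode != '' AND Actor2CountryCode != ''
    AND Actor1CountryCode != Actor2CountryCode
  GROUP BY MonthYear, Actor1CountryCode, Actor2CountryCode
)
WHERE rank = 1
ORDER BY MonthYear
```

The aggregation, windowed ranking, and filtering all happen inside BigQuery itself – only the final monthly winners come back to you.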

On the technical side, BigQuery is completely turnkey: you just hand it your data and start querying that data – that’s all there is to it. While you could spin up a whole cluster of virtual machines somewhere in the cloud to run your own distributed clustered database service, you would end up spending a good deal of your time being a systems administrator to keep the cluster working, and it wouldn’t support BigQuery’s unique capabilities. BigQuery eliminates all of this, so you can focus on using your data rather than spending your days running servers.

We automatically update the public dataset copy of GDELT in BigQuery every morning by 5AM ET, so you don’t even have to worry about updates – the BigQuery copy always has the latest global events. In a few weeks when GDELT unveils its move from daily updates to updating every 15 minutes, we’ll be taking advantage of BigQuery’s new stream updating capability to ensure the data reflects the state of the world moment-by-moment.

Check out the GDELT blog for future posts where we will showcase how to harness some of BigQuery’s power to perform some pretty incredible analyses, all of them running entirely in the database system itself. For example, we’re particularly excited about the ability to use features like BigQuery’s new Pearson correlation support to search for patterns across the entire quarter-billion-record dataset in just seconds. And we can’t wait to see what you do with it.
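As a hedged sketch of what that might look like, legacy SQL’s CORR() aggregate can compare, say, the monthly protest counts (CAMEO root code 14) of two countries. The country codes here are just examples, and `[]` again stands in for the GDELT events table reference, as in the queries below:

```sql
-- Pearson correlation between two countries' monthly protest counts (sketch).
SELECT CORR(a.c, b.c) protest_correlation
FROM (
  SELECT MonthYear, COUNT(1) c FROM []
  WHERE ActionGeo_CountryCode = 'UP' AND EventRootCode = '14'
  GROUP BY MonthYear
) a
JOIN (
  SELECT MonthYear, COUNT(1) c FROM []
  WHERE ActionGeo_CountryCode = 'RS' AND EventRootCode = '14'
  GROUP BY MonthYear
) b
ON a.MonthYear = b.MonthYear
```

A single number comes back summarizing how closely the two countries’ protest timelines track each other – with no data ever leaving the database.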

Give it a try here. (If it's your first time, you'll need to sign up and create a Google project, but no credit card or commitment is needed.)

Then copy and paste the query below into the query box and click the bright red “Run Query” button:

SELECT Year, Actor1Name, Actor2Name, Count FROM (
SELECT Actor1Name, Actor2Name, Year, COUNT(*) Count, RANK() OVER(PARTITION BY Year ORDER BY Count DESC) rank
FROM
(SELECT Actor1Name, Actor2Name, Year FROM [] WHERE Actor1Name < Actor2Name AND Actor1CountryCode != '' AND Actor2CountryCode != '' AND Actor1CountryCode != Actor2CountryCode),
(SELECT Actor2Name Actor1Name, Actor1Name Actor2Name, Year FROM [] WHERE Actor1Name > Actor2Name AND Actor1CountryCode != '' AND Actor2CountryCode != '' AND Actor1CountryCode != Actor2CountryCode),
WHERE Actor1Name IS NOT NULL
AND Actor2Name IS NOT NULL
GROUP EACH BY 1, 2, 3
HAVING Count > 100
)
WHERE rank=1
ORDER BY Year

You just processed more than 250 million records detailing worldwide events from the last 35 years and discovered the top defining relationship for each year – all of this in around 6 seconds and for free (BigQuery gives its users a free monthly querying quota). Even more excitingly, you did all of this entirely in BigQuery, showcasing how you can run real analyses entirely in the database itself!

Now try copying and pasting this query and click the “Run Query” button:

SELECT MonthYear MonthYear, INTEGER(norm*100000)/1000 Percent
FROM (
SELECT ActionGeo_CountryCode, EventRootCode, MonthYear, COUNT(1) AS c, RATIO_TO_REPORT(c) OVER(PARTITION BY MonthYear ORDER BY c DESC) norm FROM []
GROUP BY ActionGeo_CountryCode, EventRootCode, MonthYear
)
WHERE ActionGeo_CountryCode='UP' AND EventRootCode='14'
ORDER BY ActionGeo_CountryCode, EventRootCode, MonthYear;

Congratulations, you just scanned more than a quarter-billion records to compile every protest in Ukraine that GDELT found in the world’s news media, by month, from 1979 to the present. Even more powerfully, BigQuery has converted the raw count of protests per month into a normalized “intensity” measure that accounts for the fact that there is far more news media today in 2014 than there was in 1979 and that machine processing of news isn’t perfect.

You can click “Download as CSV” at the top right of the query results table, open the file in your favorite spreadsheet program, and graph it as a timeline. You can instantly see the “Revolutions of 1989,” the violent “Ukraine without Kuchma” protests of March 2001, the “Orange Revolution” of November 2004, and the Euromaidan protests running from November 2013 to the present.

Let me know if you come up with a particularly cool query you want to share with the world and we might just add it to the GDELT Blog. We can’t wait to see what you do with GDELT and BigQuery.