@michael-erasmus
Created October 30, 2014 14:01
extract_actions_taken.pig
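-- Only read documents whose date is later than $MAX_DATE.
set mongo.input.query {"date":{"\$gt":{"\$date":$MAX_DATE}}}
-- Do not compute input splits; read the collection as a single split.
set mongo.input.split.create_input_splits false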
actions_taken =
    LOAD '$BUFFER_METRICS_MONGO_URI.event.seamless.actions_taken'
    USING com.mongodb.hadoop.pig.MongoLoader(
        'user_id:chararray,
        visitor_id:chararray,
        client_id:chararray,
        last_modified:chararray,
        date:chararray,
        user_joined_at:chararray,
        value:bag{t:tuple()},
        extra_data:chararray'
    );
extract = foreach actions_taken generate
    user_id,
    visitor_id,
    client_id,
    last_modified,
    user_joined_at,
    date,
    value,
    extra_data
;
-- Remove any previous output first; store fails if the target path already exists.
rmf $OUTPUT_PATH/extract-actions-taken;
store extract into '$OUTPUT_PATH/extract-actions-taken' using PigStorage();