import boto3


def getParameter(param_name):
    # Look up the named parameter in AWS Systems Manager Parameter Store and return its value.
    return boto3.client('ssm').get_parameter(Name=param_name)['Parameter']['Value']
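A possible variant of the helper above, shown only as a sketch: it reuses a single SSM client across calls and passes `WithDecryption` so that `SecureString` parameters can be read. The names `get_parameter_value`, `_default_client`, and the `decrypt` and `client` arguments are assumptions introduced for illustration and are not part of the original snippet.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def _default_client():
    # Hypothetical helper: create the SSM client once and reuse it.
    # boto3 is imported lazily so this module loads even where boto3 is absent.
    import boto3
    return boto3.client('ssm')


def get_parameter_value(param_name, decrypt=False, client=None):
    """Return the value of an SSM parameter.

    `client` can be injected (for example, a stub in tests); otherwise a
    cached boto3 SSM client is used.
    """
    ssm = client if client is not None else _default_client()
    response = ssm.get_parameter(Name=param_name, WithDecryption=decrypt)
    return response['Parameter']['Value']
```

Reusing the client avoids rebuilding it on every lookup, and the injectable `client` argument lets the function be exercised without AWS credentials.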
# Data file is protein sequence database psd7003.xml (683MB) from http://www.cs.washington.edu/research/xmldatasets/www/repository.html#pir
# Structure is list of ~260,000 <ProteinEntry> elements
# Goal is to read data file, and process each <ProteinEntry> onto its own line of output
# This simulates filtering/splitting a large list of similar XML elements

$ time ruby xmlsplit.rb test.xml > output_rb
real	2m35.965s
user	2m28.158s
sys	0m3.340s
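The contents of xmlsplit.rb are not shown here. As a rough illustration of the goal stated in the comments (one `<ProteinEntry>` per output line), the following is a minimal streaming sketch in Python using the standard library's `xml.etree.ElementTree.iterparse`. It is not the script that was timed above. The function name `split_entries` is hypothetical, and the sketch assumes that entries are not nested inside one another and that collapsing whitespace inside an entry is acceptable.

```python
import xml.etree.ElementTree as ET


def split_entries(source, out, tag='ProteinEntry'):
    """Write each <tag> element from `source` to `out` on one line; return the count."""
    count = 0
    context = ET.iterparse(source, events=('start', 'end'))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == 'end' and elem.tag == tag:
            xml = ET.tostring(elem, encoding='unicode')
            # Collapse every whitespace run (including newlines) to one space.
            # Note: this also normalizes whitespace inside text content.
            out.write(' '.join(xml.split()) + '\n')
            count += 1
            # Drop already-processed children so memory use stays bounded.
            root.clear()
    return count
```

A call such as `split_entries('test.xml', sys.stdout)` (after `import sys`) would stream the file instead of loading all 683MB into memory at once.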