@biomadeira
Last active September 8, 2015 15:28
Modified Perl script for updating BioLiP
#!/usr/bin/perl -w
use strict;
# BIOLIP from Zhang Lab
# http://zhanglab.ccmb.med.umich.edu/BioLiP/
# Folder Structure
# ================
# NOBACK/
#   DB/
#     biolip/
#       database/
#       biolip.txt
# Get old datasets:
# wget http://zhanglab.ccmb.med.umich.edu/BioLiP/download/BioLiP.tar.bz2
# tar -xjvf BioLiP.tar.bz2
# mv BioLiP_2013-03-6.txt biolip_all_data.txt
# then run (modified) download_all_sets.pl to update and then
# cat database/biolip_all_data.txt | cut -f 1,2,5,6 > biolip.txt
# `mkdir -p BioLiP_updated_set`;
#`rm -fr BioLiP_updated_set/*`;
chdir "database" or die "Cannot chdir to database: $!";
# Remove any stale index so wget fetches a fresh copy.
unlink "weekly.html";
my $head= "http://zhanglab.ccmb.med.umich.edu/BioLiP/weekly";
my $address="http://zhanglab.ccmb.med.umich.edu/BioLiP/weekly.html";
system("wget -o log -c $address") == 0 or die "wget failed for $address (status $?)";
my @rst=`cat weekly.html`;
my @all=();
foreach my $r (@rst)
{
    if ($r =~ /\<tr\>\<td\>(\S+)\<\/td\>/)
    {
        # print "$1\n";
        push(@all, $1);
    }
}
my $tot=@all;
print "\n====================================================\n";
print "In total, there are $tot weeks to update.\n\n";
my $annotation="biolip_all_data.txt";
# open(OUT, ">$annotation");
# close(OUT);
# my $annotation1="BioLiP_UP_nr.txt";
# open(OUT, ">$annotation1");
# close(OUT);
foreach my $r (@all)
{
    my $rec="receptor_$r.tar.bz2";
    my $rec1="receptor1_$r.tar.bz2";
    my $lig="ligand_$r.tar.bz2";
    my $ano="BioLiP_$r.txt";
    my $ano1="BioLiP_$r\_nr.txt";
    if (-s $ano)
    {
        print "The week $r was updated before, skip this one\n";
    }
    else
    {
        print "Download redundant set for the week $r...\n";
        # system("wget -o log -c $head/$rec") == 0 or die "System call failed: $?";
        # system("tar -xvf $rec >log")== 0 or die "System call failed: $?";
        # system("wget -o log -c $head/$rec1") == 0 or die "System call failed: $?";
        # system("tar -xvf $rec1>log")== 0 or die "System call failed: $?";
        # system("wget -o log -c $head/$lig") == 0 or die "System call failed: $?";
        # system("tar -xvf $lig >log")== 0 or die "System call failed: $?";
        system("wget -o log -c $head/$ano") == 0 or die "wget failed for $head/$ano (status $?)";
        # print "Download non-redundant set for the week $r...\n";
        # $rec="receptor_$r\_nr.tar.bz2";
        # $rec1="receptor1_$r\_nr.tar.bz2";
        # $lig="ligand_$r\_nr.tar.bz2";
        # system("wget -o log -c $head/$rec") == 0 or die "System call failed: $?";
        # system("tar -xvf $rec >log")== 0 or die "System call failed: $?";
        # system("wget -o log -c $head/$rec1") == 0 or die "System call failed: $?";
        # system("tar -xvf $rec1>log")== 0 or die "System call failed: $?";
        # system("wget -o log -c $head/$lig") == 0 or die "System call failed: $?";
        # system("tar -xvf $lig >log")== 0 or die "System call failed: $?";
        # system("wget -o log -c $head/$ano1") == 0 or die "System call failed: $?";

        # Append only newly downloaded weeks. $annotation is no longer truncated
        # at startup, so appending skipped weeks would duplicate their entries
        # on every re-run.
        system("cat $ano >> $annotation") == 0 or die "Failed to append $ano to $annotation (status $?)";
    }
    # `cat $ano1 >> $annotation1`;
    #last;
}
print "Cheers! All updates are done.\n";
print "====================================================\n\n";
# print "Please download the old sets manually at
# http://zhanglab.ccmb.med.umich.edu/BioLiP/download.html
# Please read http://zhanglab.ccmb.med.umich.edu/BioLiP/download/readme.txt
# about explanation of the annotation file.
# ";
# print "Please feel free to contact me (Jianyi, yangji\@umich.edu) if you have any problems with BioLiP.\n
# Thanks for using the BioLiP database!
# ---------------------------------------------
# Please cite the following paper if you use BioLiP in your projects:
# Jianyi Yang, Ambrish Roy, Yang Zhang, BioLiP: a semi-manually curated database for biologically
# relevant ligand-protein interactions, Nucleic Acids Research, 41:D1096-D1103, 2013.
# ";
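
The script above shells out to `cat` both to read weekly.html and to append each weekly file. As a minimal sketch, assuming the same one-row-per-line table layout, the two steps could be done in plain Perl instead. The helper names `extract_weeks` and `append_file` are hypothetical, and the sample rows are made up purely for illustration. The pattern here is slightly stricter than the script's `\S+`: it stops at the first `<`, so a row with several cells on one line still yields only the first cell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect the first cell of every table row,
# assuming each row sits on its own line as the script above expects.
sub extract_weeks {
    my ($html) = @_;
    my @weeks;
    foreach my $line (split /\n/, $html) {
        # [^<\s]+ stops at the first tag or whitespace character.
        push @weeks, $1 if $line =~ m{<tr><td>([^<\s]+)</td>};
    }
    return @weeks;
}

# Hypothetical helper: append the contents of $src to $dest
# using file handles rather than an external `cat`.
sub append_file {
    my ($src, $dest) = @_;
    open(my $in,  '<',  $src)  or die "Cannot read $src: $!";
    open(my $out, '>>', $dest) or die "Cannot append to $dest: $!";
    while (my $line = <$in>) {
        print {$out} $line;
    }
    close($in);
    close($out) or die "Cannot close $dest: $!";
    return;
}

# Made-up sample input; the real weekly.html may look different.
my $sample = join "\n",
    '<tr><td>2015-09-04</td><td>12</td></tr>',
    '<p>not a table row</p>',
    '<tr><td>2015-08-28</td></tr>';

my @weeks = extract_weeks($sample);
print "Found ", scalar(@weeks), " weeks: @weeks\n";
```

In the update loop, `append_file($ano, $annotation)` would then stand in for the `cat $ano >> $annotation` call, and any I/O failure would stop the run with the file name and system error in the message.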