They want you to drop in your .gz files and let their scripts do everything.
I'm fine with that because I'm currently on a time crunch to submit an abstract for PAG. I had to re-gzip the files because I forgot to use the -c flag the first time.
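(As a reminder to myself: gzip normally replaces the input file with the .gz; with -c it writes the compressed data to stdout instead, so you can redirect it and keep the original. Filenames below are just placeholders.)

    # -c keeps the original fastq and writes the compressed copy via stdout
    gzip -c sample_R1.fastq > sample_R1.fastq.gz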
Then I re-added the soft links in the relevant dirs so they now point to the .gz files. You have to delete the previous links first, btw.
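Roughly like this, with placeholder names and paths; the old links have to go first or ln will refuse to overwrite them:

    # Drop the stale links, then point fresh ones at the .gz files
    rm reads_R1.fastq reads_R2.fastq
    ln -s /path/to/data/sample_R1.fastq.gz reads_R1.fastq.gz
    ln -s /path/to/data/sample_R2.fastq.gz reads_R2.fastq.gz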
I ran some trial commands from the dev node to see if things would work now. Everything seemed OK, so I edited their bash script to run on PBS. It broke because I didn't load the perl module.
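On clusters that use Environment Modules it's worth checking interactively what's installed before burning queue time; module names vary per cluster, so these are just the commands I'd sanity-check with:

    module avail perl    # list the Perl builds installed on the cluster
    module load perl     # or a specific version, depending on what avail shows
    perl -v              # confirm which Perl is now on PATH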
I'll try again.
I had to figure out how to install the required Perl module. I tried several options, including perlbrew and local::lib, which are supposed to let you run locally installed Perl modules; that didn't seem to work, though, and I was getting confused. In the end I asked my admin to install the module that was needed, Math::Random::MT::Auto.
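For the record, a typical local::lib/cpanm attempt looks roughly like this (assuming cpanm is available on the system; this is the route that did not pan out for me):

    # Install into ~/perl5 without root
    cpanm --local-lib ~/perl5 Math::Random::MT::Auto
    # Point perl at the locally installed modules for this shell session
    eval "$(perl -I ~/perl5/lib/perl5 -Mlocal::lib)"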
Once that was set up, everything finally seemed to run.
And so, to recap:
1) Change the config file, then run.
2) I modified all the "Run_all_bats.sh" scripts to work on PBS. To do that, I added the relevant walltime and compute resource requests at the top of each script, and made sure to load the modules the script needs: FastX, Perl, BWA, and SAMTools. A sketch of the header is below.
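Roughly what the top of each script ended up looking like; the resource numbers are placeholders, and the exact module names will differ from cluster to cluster:

    #!/bin/bash
    #PBS -l walltime=04:00:00
    #PBS -l nodes=1:ppn=4
    #PBS -l mem=16gb

    # Load the tools the pipeline calls
    module load FastX
    module load perl
    module load bwa
    module load SAMTools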
Finally, if a "Run_all_bats.sh" script needed a variable (i.e., to work on one of the bulks), I changed the code so that wherever the original script used $1 it now uses $var. The script can then be submitted along these lines:
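    # '#' is a placeholder for the value described below
    qsub -v var=# Run_all_bats.sh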
Here -v passes the listed variables into the script's environment, and var=# is the variable being passed with the value # (in this case 9 for the Reference, 0 for Bulk A, or 1 for Bulk B).
For example, submitting the Bulk A run:
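    # var=0 selects Bulk A
    qsub -v var=0 Run_all_bats.sh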
Unfortunately, after several attempts I can't seem to get this pipeline to work for me.
The problems come down to Perl version issues, and since we are running this on the HPCC it is hard for me to fiddle around with Perl versions.