@surhudm
Last active November 26, 2021 10:26

Revisions

  1. surhudm revised this gist Nov 14, 2021. 1 changed file with 34 additions and 3 deletions.
    37 changes: 34 additions & 3 deletions LSST_pipeline_setup.md
    Original file line number Diff line number Diff line change
    @@ -79,7 +79,40 @@ butler register-instrument $DIR lsst.obs.subaru.HyperSuprimeCam

    ## Data finally!

    If you have access to gen3-shared-repo-admin tools, then skip this and go down one section:

    Let us now ingest some raw data from the directory `$HOME/Subaru_rawdata`. Depending on the size of your data, this can take a very long time, so the command runs in the background and writes to a log file.
    ```
    butler --progress ingest-raws $DIR $HOME/Subaru_rawdata -t direct > rawingest.log 2>&1 &
    ```
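
    A note on the log redirection: a POSIX shell processes redirections from left to right, so the log file captures both stdout and stderr only when `> file` comes before `2>&1`. The sketch below illustrates this; `noisy` is a hypothetical stand-in for a long-running butler command and is not part of the LSST stack.

    ```shell
    # `noisy` is a hypothetical stand-in that writes one line to each stream.
    noisy() {
        echo "to stdout"
        echo "to stderr" >&2
    }

    log=$(mktemp)

    # stdout is pointed at the log first, then stderr follows it: both lines are logged.
    noisy > "$log" 2>&1
    both_lines=$(grep -c "to" "$log")

    # Reversed order: stderr is duplicated onto the old stdout (captured here in
    # `leaked`) before stdout is moved to the log, so the log gets stdout only.
    leaked=$(noisy 2>&1 > "$log")
    only_stdout=$(grep -c "to" "$log")

    rm -f "$log"
    ```

    With the first form, `tail -f rawingest.log` shows errors as well as progress while the ingestion runs.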
    Define each exposure as a single visit using the next command:
    ```
    butler define-visits $DIR HSC
    ```

    Since the raws have already been ingested, they do not need to be ingested again during the gen2 conversion. To skip them, create a file called `skipraws.py`.

    Add the following to this file:
    ```
    # skipraws.py file
    import lsst.obs.base.gen2to3.convertRepo
    assert type(config)==lsst.obs.base.gen2to3.convertRepo.ConvertRepoConfig, 'config is of type %s.%s instead of lsst.obs.base.gen2to3.convertRepo.ConvertRepoConfig' % (type(config).__module__, type(config).__name__)
    config.datasetIgnorePatterns=["raw"]
    config.doMakeUmbrellaCollection=False
    config.doExpandDataIds=False
    ```

    If you have a gen2 root directory that holds your previous processing, say from HSCpipe, you can use it here to ingest the skymaps, reference catalogs and calibrations with the next command:
    ```
    GEN2ROOT=$HOME/gen2root
    butler --progress convert $DIR --gen2root $GEN2ROOT -C skipraws.py -t direct > gen2convert.log 2>&1 &
    ```

    If you do not have one, you can download the calibration data from https://www.subarutelescope.org/Observing/Instruments/HSC/calib_data.html. You then need to create a gen2 repository and ingest these calibrations into it, following the instructions at https://hsc.mtk.nao.ac.jp/pipedoc/pipedoc_8_e/ . Specifically, initialize the repository, use `ingestRaws` to ingest a couple of exposures, and then ingest all the calibrations (`CALIB`, `SKY`, `FLAT`, `BIAS`, `DARK`) following the procedure described there. Once this is done, you can get your skymaps, refcats and calibrations from this gen2 repository using the command above.

    After this you can also inherit any of your reruns one after the other. For more complicated rerun ingestion you should take a look at the script `convert.py` available here https://github.com/lsst/obs_base/blob/master/python/lsst/obs/base/script/convert.py and play around with it.

    ## Advanced repositories: gen3 shared repo admin tools

    @@ -88,6 +121,4 @@ Next we can use some butler-admin tools to do some ingestion of data in to this
    mkdir $HOME/github
    cd $HOME/github
    git clone git@github.com:lsst-dm/gen3_shared_repo_admin.git
    ```


    ```
  2. surhudm created this gist Nov 14, 2021.
    93 changes: 93 additions & 0 deletions LSST_pipeline_setup.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,93 @@
    Here I describe my experience in setting up the LSST pipeline Gen3 butler on an IUCAA server.

    ## Set up the LSST stack
    ```
    mkdir -p lsst_stack
    cd lsst_stack
    curl -OL https://raw.githubusercontent.com/lsst/lsst/master/scripts/newinstall.sh
    bash newinstall.sh -ct
    source loadLSST.bash
    eups distrib install -t v23_0_0_rc2
    curl -sSL https://raw.githubusercontent.com/lsst/shebangtron/master/shebangtron | python
    setup lsst_distrib
    ```

    ## Set up postgresql

    The LSST gen3 pipeline requires a database. This can be an sqlite3 database, although that may not be suited to heavy processing; creating one is as simple as creating an empty file with that name. Here I instead describe the setup of a postgresql server, which does not require root access.

    - Download the latest postgresql server from: https://www.postgresql.org/
    - Untar the file and change the directory to the untarred one

    ```
    ./configure --prefix=$HOME
    make -j 20
    make install
    cd contrib
    make install
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/lib
    ```
    Now initialize the database; change `~/gen3_db` to the location where you want your database to reside:
    ```
    $HOME/bin/initdb -D ~/gen3_db
    ```

    Open the file `~/gen3_db/postgresql.conf` and set `listen_addresses` to the address or addresses the server should listen on. You can also raise `max_connections` here:
    ```
    listen_addresses = '*'
    max_connections = 1200
    ```
    In `pg_hba.conf`, add the following line (this assumes your infiniband network is on `192.168.1.XXX` addresses; otherwise change it appropriately):
    ```
    host all all 192.168.1.0/24 md5
    ```

    Now start the postgresql server:
    ```
    $HOME/bin/pg_ctl -D ~/gen3_db -l logfile start
    ```
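
    Before creating the database, it can help to confirm that the server is actually accepting connections. The following is a minimal sketch: `wait_until_ready` is a hypothetical helper (not part of postgresql) that retries any probe command, and the usage line assumes `pg_isready` was installed into `$HOME/bin` by the `make install` step.

    ```shell
    # wait_until_ready ATTEMPTS COMMAND [ARGS...]  (hypothetical helper)
    # Runs COMMAND up to ATTEMPTS times, one second apart, until it succeeds.
    wait_until_ready() {
        attempts=$1
        shift
        i=0
        while [ "$i" -lt "$attempts" ]; do
            if "$@" > /dev/null 2>&1; then
                return 0
            fi
            i=$((i + 1))
            sleep 1
        done
        return 1
    }

    # Usage (assumed install location, adjust the host as needed):
    # wait_until_ready 30 "$HOME/bin/pg_isready" -h localhost || tail logfile
    ```

    If the probe never succeeds, the `logfile` passed to `pg_ctl` is the first place to look.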

    First create the database, then open it in `psql`, add the `btree_gist` extension and set a password:
    ```
    createdb gen3
    ~/bin/psql gen3
    gen3=# CREATE EXTENSION btree_gist;
    gen3=# \password
    ```

    Now that this has been set up, you can change every `trust` authentication entry in `~/gen3_db/pg_hba.conf` to `md5`, so that all access is password based. You can supply the password through the environment variable `$PGPASSWORD` or write it in clear text in a `$HOME/.pgpass` file.
    ```
    export PGPASSWORD=YourPassword
    ```
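
    For the `.pgpass` alternative, each line of the file has the form `hostname:port:database:username:password`, and on Unix the file is ignored unless only its owner can access it. A minimal sketch follows; `write_pgpass` is a hypothetical helper, and the values in the usage line are placeholders, not real credentials.

    ```shell
    # write_pgpass FILE HOST PORT DATABASE USER PASSWORD  (hypothetical helper)
    write_pgpass() {
        pgpass_file=$1
        # Append one entry: hostname:port:database:username:password
        printf '%s:%s:%s:%s:%s\n' "$2" "$3" "$4" "$5" "$6" >> "$pgpass_file"
        # Restrict access to the owner, otherwise the file is ignored.
        chmod 600 "$pgpass_file"
    }

    # Usage (placeholders, adjust to your server):
    # write_pgpass "$HOME/.pgpass" server_ip_address 5432 gen3 username YourPassword
    ```

    A password containing `:` or `\` has to be escaped with a backslash in this file, which the sketch does not handle.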

    ## Create a gen3 repository

    Now let us create a space for the gen3 repository.
    ```
    mkdir $HOME/gen3_repo
    ```
    Set up butler and register the instrument (in our case Subaru HSC).

    ```
    echo "registry:" > reg.yaml
    echo " db: postgresql://username@server_ip_address/gen3" >> reg.yaml
    DIR=$HOME/gen3_repo
    butler create $DIR --seed-config reg.yaml --override
    butler register-instrument $DIR lsst.obs.subaru.HyperSuprimeCam
    ```
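
    The seed configuration is a small YAML file whose `registry.db` entry is the connection URL of the database. As a sketch, it can be generated from variables instead of being typed by hand; `write_registry_config` is a hypothetical helper, and the user, host and database names in the usage line are the same placeholders as in the commands above.

    ```shell
    # write_registry_config FILE USER HOST DATABASE  (hypothetical helper)
    # Writes a two-line YAML file with the postgresql connection URL.
    write_registry_config() {
        printf 'registry:\n  db: postgresql://%s@%s/%s\n' "$2" "$3" "$4" > "$1"
    }

    # Usage (placeholders, adjust to your server):
    # write_registry_config reg.yaml username server_ip_address gen3
    ```

    The password is deliberately left out of the URL; it is picked up from `$PGPASSWORD` or `$HOME/.pgpass` as described in the previous section.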

    ## Data finally!



    ## Advanced repositories: gen3 shared repo admin tools

    Next we can use some butler-admin tools to ingest data into this repository. The GitHub repository that hosts these tools is, however, invitation only at the moment.
    ```
    mkdir $HOME/github
    cd $HOME/github
    git clone git@github.com:lsst-dm/gen3_shared_repo_admin.git
    ```