NOTS Computing for Matching Entrepreneurs to VCs

Project Information

  Has title: NOTS Computing for Matching Entrepreneurs to VCs
  Has owner: Wei Wu
  Has start date: 2018-07-09
  Has keywords: NOTS, Matlab
  Has project status: Active
  Is dependent on: Estimating Unobserved Complementarities between Entrepreneurs and Venture Capitalists Matlab Code, Parallelize msmf corr coeff.m
  Has image: NOTS.png
  Has sponsor: McNair Center
  Has project output: Tool, How-to

When in doubt, consult the CRC's full documentation: https://docs.rice.edu/confluence/display/CD/Getting+Started+on+NOTS

Synopsis

Summer 2018. We are using NOTS (Night Owls Time-Sharing Service), a computing cluster run by Rice's Center for Research Computing (CRC), to run the Matlab code for Matching Entrepreneurs to VCs. This page documents how to log onto and use NOTS. Currently only Wei and Ed have access to NOTS.

Getting Started

SSH to NOTS with your Rice NetID. For example, from a Linux/UNIX machine:

ssh -Y (username)@nots.rice.edu
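
For instance, with the placeholder NetID jdoe (the -Y flag enables trusted X11 forwarding, so graphical programs on NOTS can display locally):

jdoe@local:~$ ssh -Y jdoe@nots.rice.edu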

To check what software is available, type

module spider
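
Run with no arguments, module spider lists everything; if the module system is Lmod (as on the CRC clusters), it also accepts a package name to narrow the search, for example:

module spider MATLAB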

To see what modules are loaded,

module list

Currently, MATLAB 2015a is installed on NOTS. To load a module such as MATLAB 2015a,

module load MATLAB/2015a

To load this module by default at login,

module save

To unload all the modules,

module purge
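
Putting these together, a typical first session might look like this (jdoe is a placeholder NetID; matlab -nodisplay starts MATLAB without a GUI, as used later on this page):

ssh -Y jdoe@nots.rice.edu    # log in to NOTS
module load MATLAB/2015a     # load MATLAB
module list                  # confirm the module is loaded
matlab -nodisplay            # start MATLAB on the login node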

Files Placement

Currently all the readjusted code for matching entrepreneurs to VCs is stored under /projects/fox/work.

Transferring files

We have access to some of the directories on NOTS. In most cases, we will work either within our $HOME directory (4 GB storage quota):

/home/*username*

or within $PROJECT under the group of Prof. Jeremy Fox (100 GB storage quota):

/projects/fox

To transfer some files from your local Linux/UNIX machine to NOTS, use the following Secure Copy command on your local terminal:

scp some_file.dat *.incl *.txt (your_login_name)@nots.rice.edu:

This will put the files into your $HOME directory on NOTS.
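
Note that the trailing colon in the scp target defaults to your $HOME directory. To copy a whole directory into the shared project space instead, the standard recursive form of scp works (the directory name here is a placeholder):

scp -r my_code_dir (your_login_name)@nots.rice.edu:/projects/fox/work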

Set up Matlab Parallel Computing Toolbox (PCT) on NOTS

Reference: https://docs.rice.edu/confluence/display/CD/Set+up+MATLAB+Parallel+Computing+Toolbox+from+a+cluster+login+node+on+DAVINCI

Setting Up Passwordless SSH (SSH Keys) on the Clusters (copied/modified from https://docs.rice.edu/confluence/display/CD/Setting+Up+Passwordless+SSH+%28SSH+Keys%29+on+the+Clusters)

Passwordless SSH is required on the Shared Computing Resources if you need to run MPI jobs using srun, or need to use other specialized software which uses SSH for communication between nodes. The srun command spawns copies of your executable on all of the nodes allocated to you by SLURM. It communicates with these nodes via SSH, so SSH must be configured with host keys (passwordless SSH) for your account. This section describes how to enable passwordless SSH on these systems.

  • The first step in establishing passwordless SSH is to create your public host keys. Log in to the cluster and run the ssh-keygen command. Accept all of the default values and do not enter a passphrase.

ssh-keygen -t rsa

  • After you have created your public host key above, append the contents of ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys. This will enable mpirun to log in from one compute node to another using SSH without a password.

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  • To avoid ssh prompts when automatically logging into compute nodes allocated by the scheduler, configure ssh to not use strict host key checking. Create the file ~/.ssh/config as shown below.

nano ~/.ssh/config

Host *
   StrictHostKeyChecking no
   UserKnownHostsFile /dev/null
   LogLevel QUIET
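
Since ssh ignores key and config files whose permissions are too open, it is worth tightening them after creating these files (a standard precaution, not part of the CRC instructions):

chmod 600 ~/.ssh/config ~/.ssh/authorized_keys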

Import a cluster configuration on PCT

Open Matlab:

matlab -nodisplay

This step only has to be performed once. Once imported, the profile will persist across MATLAB sessions. For each cluster, the profile includes settings for all available queues. The following command, run on NOTS, imports the profiles and sets Commons as the default:

configCluster
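
To confirm the import worked, you can list the installed profiles and the current default from the MATLAB prompt (a quick sanity check using standard PCT calls; the profile names depend on what configCluster installed):

parallel.clusterProfiles        % list all available cluster profiles
parallel.defaultClusterProfile  % show which profile is the default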

Wall Time

Specifying job time (walltime) is required before submitting a job via MATLAB or validating a cluster parallel profile. Please be as accurate as possible to minimize your wait time in the queue. For example, to set the walltime to one day and one hour:

ClusterInfo.setWallTime('1-01:00:00')
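
The walltime string uses SLURM's days-hours:minutes:seconds format, so '1-01:00:00' is 25 hours. A shorter request, say twelve hours, would use the same call:

ClusterInfo.setWallTime('12:00:00')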

Optimal use of the Cluster Configuration

For performance and file-quota reasons, please keep your workspace in $SHARED_SCRATCH/your-userid/. To do this, first create a workspace directory, then modify your cluster configuration to point at it. To change the location for your default profile, run something like the following in MATLAB:

workdir = [getenv('SHARED_SCRATCH') filesep getenv('USER') filesep 'MdcsDataLocation'];
mkdir(workdir);                          % create the scratch workspace directory

pc = parcluster;                         % load the default cluster profile
set(pc, 'JobStorageLocation', workdir);  % store PCT job data in scratch
pc.saveProfile;                          % save the change back to the profile

Submitting jobs through Matlab PCT

For details and further examples, see: https://docs.rice.edu/confluence/display/CD/MATLAB+Parallel+Computing+Toolbox+testing+on+the+Shared+Computing+Clusters
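
As a minimal sketch of what a PCT submission can look like (myfun and x are placeholders, and the imported NOTS profile is assumed to be the default; see the CRC page above for the authoritative examples):

c = parcluster;                           % use the imported (default) cluster profile
j = batch(c, @myfun, 1, {x}, 'Pool', 7);  % run myfun(x) with a pool of 7 extra workers
wait(j);                                  % block until the job completes
out = fetchOutputs(j);                    % retrieve the function's outputs
delete(j);                                % clean up the job's data on disk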