NOTS Computing for Matching Entrepreneurs to VCs

Project
Project Information
Has title NOTS Computing for Matching Entrepreneurs to VCs
Has owner Wei Wu
Has start date 2018-07-09
Has deadline date
Has keywords NOTS, Matlab
Has project status Active
Is dependent on Estimating Unobserved Complementarities between Entrepreneurs and Venture Capitalists Matlab Code, Parallelize msmf corr coeff.m
Has sponsor McNair Center
Has project output Tool, How-to
Copyright © 2019 edegan.com. All Rights Reserved.

When in doubt, consult the full documentation by CRC: https://docs.rice.edu/confluence/display/CD/Getting+Started+on+NOTS

Synopsis

Summer 2018. We are using NOTS (Night Owls Time-Sharing Service), a computing cluster run by Rice's CRC (Center for Research Computing), to run the Matlab code for Matching Entrepreneurs to VCs. This page documents how to use NOTS. Currently only Wei and Ed have access to NOTS.

Getting Started

SSH to NOTS with your NetID. For example, from a Linux/UNIX machine:

ssh -Y (username)@nots.rice.edu

To check what software is available, type

module spider

To see what modules are loaded,

module list

Currently, Matlab 2015a is installed on NOTS. To load a software module such as Matlab 2015a,

module load MATLAB/2015a

To load this module by default at login,

module save

To unload all the modules,

module purge
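The module commands above can be combined into a small shell function for use in a login or job script. This is only a sketch: the function name `load_matlab` is hypothetical, and it assumes the `module` command is provided by the cluster environment, as it is on a NOTS login node.

```shell
# Hypothetical helper: start from a clean module environment and load MATLAB.
# The version argument defaults to the 2015a release named above.
load_matlab() {
    local version="${1:-2015a}"
    # `module` only exists on systems with environment modules installed.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; are you on a NOTS node?" >&2
        return 1
    fi
    # Unload everything first so that only the requested module is active.
    module purge && module load "MATLAB/${version}"
}
```

Calling `load_matlab` with no argument loads MATLAB/2015a; on a machine without `module` it prints an error and returns 1 instead of failing later in the script.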

File Placement

Currently all the readjusted code for matching entrepreneurs to VCs is stored under /projects/fox/work

Transferring files

We have access to some of the directories on NOTS. In most cases, we will either work within our $HOME directory (4 GB storage quota):

/home/*username*

or within $PROJECT under the group of Prof. Jeremy Fox (100 GB storage quota):

/projects/fox

To transfer some files from your local Linux/UNIX machine to NOTS, use the following Secure Copy command on your local terminal:

scp some_file.dat *.incl *.txt (your_login_name)@nots.rice.edu:

This will put the files into your $HOME directory on NOTS, because no path follows the colon.
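To copy into another directory, such as the project directory, give scp an explicit remote path after the colon. The following is a minimal sketch: `nots_put` and the `SCP` override variable are hypothetical names introduced here for illustration, and the NetID and remote directory are supplied by the caller.

```shell
# Hypothetical helper: copy files or directories to a directory on NOTS.
# Usage: nots_put <netid> <remote_dir> <file>...
# Set SCP="echo scp" for a dry run that only prints the command.
nots_put() {
    if [ "$#" -lt 3 ]; then
        echo "usage: nots_put <netid> <remote_dir> <file>..." >&2
        return 2
    fi
    local netid="$1" remote_dir="$2"
    shift 2
    # -r copies directories recursively; it is harmless for plain files.
    # ${SCP:-scp} is left unquoted on purpose so "echo scp" splits into words.
    ${SCP:-scp} -r "$@" "${netid}@nots.rice.edu:${remote_dir}/"
}
```

For example, `SCP="echo scp" nots_put abc1 /projects/fox/work a.m b.m` prints `scp -r a.m b.m abc1@nots.rice.edu:/projects/fox/work/` without transferring anything (abc1 is a placeholder NetID).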

Set up Matlab Parallel Computing Toolbox (PCT) on NOTS

Reference: https://docs.rice.edu/confluence/display/CD/Set+up+MATLAB+Parallel+Computing+Toolbox+from+a+cluster+login+node+on+DAVINCI

Setting Up Passwordless SSH (SSH Keys) on the Clusters (copied/modified from this)

Passwordless SSH is required on the Shared Computing Resources if you need to run MPI jobs using srun, or need to use other specialized software which uses SSH for communication between nodes. The srun command spawns copies of your executable on all of the nodes allocated to you by SLURM. It will communicate with these nodes via SSH so it is necessary that SSH is configured with SSH host keys (passwordless SSH) for your account. This document describes how to enable passwordless SSH on these systems.

  • The first step in establishing passwordless SSH is to create your SSH key pair. Log in to the cluster and run the ssh-keygen command. Accept all of the default values and do not enter a passphrase.
ssh-keygen -t rsa
  • After you have created your keys above, append the contents of ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys. This will enable mpirun to log in from one compute node to another using SSH without a password.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • To avoid ssh prompts when automatically logging into compute nodes allocated by the scheduler, configure ssh to not use strict host key checking. Create the file ~/.ssh/config as shown below.
nano ~/.ssh/config

Host *
   StrictHostKeyChecking no
   UserKnownHostsFile /dev/null
   LogLevel QUIET
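Running the append step more than once leaves duplicate lines in authorized_keys. The sketch below shows one way to make that step repeatable; `append_key_once` is a hypothetical helper, not part of the original instructions, and it assumes the public key file contains a single line, as a file produced by ssh-keygen does.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only
# if that exact line is not already present, then tighten permissions.
# Usage: append_key_once [pubkey_file] [authorized_keys_file]
append_key_once() {
    local pub="${1:-$HOME/.ssh/id_rsa.pub}"
    local auth="${2:-$HOME/.ssh/authorized_keys}"
    [ -r "$pub" ] || { echo "cannot read $pub" >&2; return 1; }
    touch "$auth"
    # -F: fixed string, -x: whole-line match, -q: no output
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # With its default StrictModes setting, sshd rejects key files and
    # directories that other users can write to.
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}
```

Called with no arguments after ssh-keygen, it operates on the default files under ~/.ssh and can be rerun safely.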

Import a cluster configuration on PCT

Open Matlab:

matlab -nodisplay

This step only has to be performed once. Once imported, the profile will persist through multiple MATLAB runs. For each cluster the profile includes cluster settings for all available queues. The following command, run in MATLAB on NOTS, will import the profiles and set Commons as the default:

configCluster

Wall Time

You must specify a job time (walltime) before submitting a job via MATLAB or validating a cluster parallel profile. Be as accurate as possible to minimize your wait time. For example, to set the walltime to one day and one hour:

ClusterInfo.setWallTime('1-01:00:00')
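The walltime string in the example has the form days-hours:minutes:seconds. As an illustration of that layout, the following sketch builds such a string from separate numbers; `format_walltime` is a hypothetical helper and is not part of MATLAB or the cluster tools.

```shell
# Hypothetical helper: format a duration as D-HH:MM:SS.
# Usage: format_walltime <days> <hours> <minutes> [seconds]
# Pass plain integers without leading zeros (printf reads 08 as bad octal).
format_walltime() {
    if [ "$#" -lt 3 ]; then
        echo "usage: format_walltime <days> <hours> <minutes> [seconds]" >&2
        return 2
    fi
    # Days are printed as-is; the other fields are zero-padded to two digits.
    printf '%d-%02d:%02d:%02d\n' "$1" "$2" "$3" "${4:-0}"
}
```

`format_walltime 1 1 0` prints `1-01:00:00`, the value used in the example above.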

Optimal use of the Cluster Configuration

For performance and file quota reasons, please have your workspace in $SHARED_SCRATCH/your-userid/. In order to do this, you must first create a workspace directory and then modify your cluster configuration to use the new workspace configuration. To change the location for your default profile you can run something like the following in MATLAB:

workdir = [getenv('SHARED_SCRATCH') filesep getenv('USER') filesep 'MdcsDataLocation'];
mkdir(workdir);

pc = parcluster;
set(pc, 'JobStorageLocation', workdir);
pc.saveProfile;
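The workspace directory can also be created from the shell before starting MATLAB. This is a minimal sketch, assuming SHARED_SCRATCH and USER are set in the login environment as the MATLAB snippet above expects; `make_workdir` is a hypothetical name.

```shell
# Hypothetical helper: create the job storage directory and print its path.
# Mirrors the MATLAB snippet: $SHARED_SCRATCH/$USER/MdcsDataLocation
make_workdir() {
    # Refuse to continue if either variable is missing, so the directory
    # is never created under an unintended path such as /MdcsDataLocation.
    if [ -z "${SHARED_SCRATCH:-}" ] || [ -z "${USER:-}" ]; then
        echo "SHARED_SCRATCH and USER must be set" >&2
        return 1
    fi
    local workdir="${SHARED_SCRATCH}/${USER}/MdcsDataLocation"
    # -p creates parent directories and succeeds if the directory exists.
    mkdir -p "$workdir" && printf '%s\n' "$workdir"
}
```

Unlike MATLAB's mkdir, which warns when the directory already exists, `mkdir -p` is silent on repeat runs.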

Submitting jobs through Matlab PCT

https://docs.rice.edu/confluence/display/CD/MATLAB+Parallel+Computing+Toolbox+testing+on+the+Shared+Computing+Clusters